Alex Welch
03/07/2021, 6:40 AM
[...] Docker Storage with an ECSRun Config. What I am looking to make happen is to have the GitHub repo cloned into the Docker container so that my flow has access to various files (Jupyter notebooks primarily).
I have been trying solutions for a number of days and I am currently stuck on
Error while fetching server API version: {0}'.format(e)
docker.errors.DockerException: Error while fetching server API version: ('Connection aborted.', FileNotFoundError(2, 'No such file or direct
This would indicate that my Prefect image does not have access to the Docker daemon, but I can’t figure out what I am doing wrong.
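A quick way to confirm that diagnosis from inside the container is to check whether the Docker daemon's Unix socket is visible at all. The sketch below is illustrative only: it assumes the default Linux socket path `/var/run/docker.sock` (a `DOCKER_HOST` setting can point the client elsewhere), and the helper name `docker_socket_available` is hypothetical, not part of Prefect or the Docker SDK.

```python
import os
import stat

# Default Unix socket of the Docker daemon on Linux (an assumption here;
# DOCKER_HOST can redirect the client to a different endpoint).
DEFAULT_DOCKER_SOCKET = "/var/run/docker.sock"


def docker_socket_available(path: str = DEFAULT_DOCKER_SOCKET) -> bool:
    """Return True if `path` exists and is a Unix socket."""
    try:
        mode = os.stat(path).st_mode
    except OSError:
        # Covers FileNotFoundError (nothing mounted there) and permission errors.
        return False
    return stat.S_ISSOCK(mode)
```

If `docker_socket_available()` returns False inside the container, a `FileNotFoundError` like the one in the traceback is the expected outcome whenever something tries to reach the daemon through that default socket.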
I have prefect backend cloud set. Below are my files.
Alex Welch
03/07/2021, 6:41 AM
FROM prefecthq/prefect
FROM docker.pkg.github.com/<company>/<repo>/data-image:latest
# Prefect Config
ARG GH_BRANCH=master
ARG GH_TOKEN
ARG PREFECT_DEPS="jupyter,aws"
ARG PREFECT_HOME=/usr/local/
ARG PREFECT_VERSION=0.14.10
ENV GH_TOKEN=$GH_TOKEN
# Copy in required files
COPY requirements.txt ./
# Install Python Requirements
RUN pip install -U pip
RUN pip install prefect[${PREFECT_DEPS}]==${PREFECT_VERSION}
# Install VIM and Bash completion
RUN apt-get update
RUN apt-get install -y vim
RUN apt-get install -y bash-completion
# Cloning the master branch
WORKDIR ${PREFECT_HOME}
RUN git clone --branch ${GH_BRANCH} https://${GH_TOKEN}@github.com/<company>/<repo>
# Renaming the directory for convenience
RUN cp -r "${PREFECT_HOME}data-team-pipeline" "${PREFECT_HOME}pipeline"
WORKDIR "${PREFECT_HOME}pipeline"
Docker Compose
I’m using this to run and test the flows locally
version: "3.3"
services:
  prefect:
    image: prefect-image:latest
    restart: always
    command: bash -c "prefect auth login -t <AUTH_TOKEN> && /bin/bash"
    working_dir: /usr/local/pipeline/flows/
    environment:
      PREFECT__CONTEXT__SECRETS__GH_TOKEN: ${GH_TOKEN}
      PREFECT__CONTEXT__SECRETS__GH_USER: ${GH_USER}
      WORKENV: dev
    volumes:
      - type: bind
        source: .
        target: /usr/local/pipeline
      - type: bind
        source: ${HOME}/.aws
        target: /root/.aws
      - type: bind
        source: ${HOME}/.prefect
        target: /root/.prefect
Alex Welch
03/07/2021, 6:41 AM
import sys
import os

sys.path.insert(0, os.path.abspath('..'))

import prefect
from prefect import task, Flow
from prefect.tasks.jupyter.jupyter import ExecuteNotebook
from flows.prefect_utils import (
    RUN_CONFIG,
    STORAGE
)


@task
def query_snowflake():
    logger = prefect.context.get("logger")
    logger.info("Running Notebook 01_Raw_Data_Snowflake_Query")
    # Instantiating a task inside another task does not execute it; call .run() explicitly.
    ExecuteNotebook(path='/usr/local/pipeline/flows/01_Raw_Data_Snowflake_Query.ipynb').run()


with Flow("attribution") as flow:
    query_snowflake()

flow.storage = STORAGE
flow.run_config = RUN_CONFIG
flow.register(project_name="tutorial")
Prefect Utils File
import os
import logging
import sys
from typing import Tuple

from prefect.run_configs import ECSRun
from prefect.storage import S3
from prefect.storage import GitHub
from prefect.storage import Docker
from prefect.client import Secret
from prefect.schedules import CronSchedule

logging.basicConfig(level=logging.INFO, stream=sys.stdout)
logger = logging.getLogger(__name__)

work_env = os.getenv("WORKENV")

# Resolved locally from the PREFECT__CONTEXT__SECRETS__GH_TOKEN environment variable
GH_TOKEN = Secret("GH_TOKEN").get()
PREFECT_ENV_VARS = {
    "GH_TOKEN": GH_TOKEN
}
DOCKER_REGISTRY = "docker.pkg.github.com/<company>/<repo>/"
PREFECT_DATA_IMAGE = "docker.pkg.github.com/<company>/<repo>/prefect-image:1.0.0"

# if work_env == 'dev':
TASK_ARN = "<ECS_TASK_ARN>"
RUN_CONFIG = ECSRun(
    labels=['s3-flow-storage'],
    task_role_arn=TASK_ARN,
    image='prefecthq/prefect:latest',
    memory=512,
    cpu=256,
)
# Docker storage builds an image at registration time, which needs a reachable Docker daemon
STORAGE = Docker(
    registry_url=DOCKER_REGISTRY,
    base_image=PREFECT_DATA_IMAGE,
    env_vars=PREFECT_ENV_VARS,
)
Alex Welch
03/07/2021, 6:41 AM
[...] Docker Storage. But that doesn’t help me if I need to reference these other files.
Chris White
Alex Welch
03/08/2021, 2:10 AM
docker build -f Dockerfile --no-cache --build-arg GH_TOKEN=${GH_TOKEN} -t ${IMAGE} .
The Docker Compose file is just running it.
Alex Welch
03/08/2021, 2:27 AM
Alex Welch
03/08/2021, 2:27 AM
docker run \
  --workdir /usr/local/pipeline/flows/ \
  -e PREFECT__CONTEXT__SECRETS__GH_TOKEN=$GH_TOKEN \
  -e PREFECT__CONTEXT__SECRETS__GH_USER=$GH_USER \
  -e WORKENV=dev \
  --mount type=bind,source=$(pwd),target=/usr/local/pipeline \
  --mount type=bind,source=${HOME}/.prefect,target=/root/.prefect -it prefect-image
Alex Welch
03/08/2021, 2:30 AM
[...] prefecthq/prefect and received the same error
Alex Welch
03/08/2021, 2:39 AM
Alex Welch
03/08/2021, 2:57 AM
[...] Docker Storage not work when using the Prefect Docker container to run flows
Chris White
Alex Welch
03/08/2021, 2:59 AM
Alex Welch
03/08/2021, 2:59 AM
Alex Welch
03/08/2021, 2:59 AM
Alex Welch
03/08/2021, 3:00 AM
[...] Docker Storage when running it this way, but am I right in my understanding that that’s the issue?
Chris White
Alex Welch
03/08/2021, 3:01 AM
Alex Welch
03/08/2021, 3:01 AM
Alex Welch
03/08/2021, 3:02 AM
Alex Welch
03/08/2021, 3:03 AM
[...] S3 Storage in dev and then Docker Storage in prod? Or is what I’m trying to do not possible?
Alex Welch
03/08/2021, 3:03 AM
[...] data-image, which comes prepackaged with all of the packages we use. And I’d like to use that in dev > CI > prod.
Chris White
[...] data-image and with Prefect included; the thing you’re getting tripped up on, though, is that registering a flow with Docker storage requires building an image that your flow is placed into, and you can’t build a Docker image from within a Docker container. This means you’ll need to call flow.register from a non-Docker process.
We are starting to recommend that folks use other storage types (e.g. S3) along with a fixed image that you build independently to avoid these sorts of complications; check out this newly published doc that covers some of these patterns: https://docs.prefect.io/orchestration/flow_config/docker.html
Alex Welch
03/08/2021, 3:25 AM
Chris White
Alex Welch
03/08/2021, 4:07 AM
Alex Welch
03/08/2021, 4:07 AM
Chris White
Alex Welch
03/08/2021, 4:57 AM
Chris White
[...] image="address-to-registry-and-image" to your Run Config (all types except Local and Universal should accept this kwarg)
Alex Welch
03/08/2021, 6:37 PM
Chris White
Alex Welch
03/08/2021, 9:33 PM
Chris White
[...] image on your run config means you’re referencing an already built image, so the step of building the image happens externally to Prefect.
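The pattern Chris describes (non-Docker storage plus a fixed, independently built image on the run config) might look roughly like the sketch below. It assumes Prefect 0.14.x, the version pinned in the Dockerfile earlier in the thread. The helper names `image_reference` and `configure_flow` and the bucket name in the usage note are hypothetical; the labels, CPU, and memory values are copied from the utils file above.

```python
def image_reference(registry: str, name: str, tag: str) -> str:
    """Join registry, image name, and tag into a full image reference."""
    return f"{registry.rstrip('/')}/{name}:{tag}"


def configure_flow(flow, bucket: str, image: str, task_role_arn: str):
    """Attach S3 storage and an ECSRun config that points at a pre-built image.

    Assumes Prefect 0.14.x; `flow` is a prefect.Flow instance.
    """
    # Imported lazily so this module can be loaded without Prefect installed.
    from prefect.run_configs import ECSRun
    from prefect.storage import S3

    # S3 storage uploads the flow itself, so registration needs no Docker build.
    flow.storage = S3(bucket=bucket)
    # `image` must already exist in the registry; it is built outside of Prefect.
    flow.run_config = ECSRun(
        image=image,
        task_role_arn=task_role_arn,
        labels=["s3-flow-storage"],
        cpu=256,
        memory=512,
    )
    return flow
```

A call such as `configure_flow(flow, "my-flow-bucket", image_reference("docker.pkg.github.com/<company>/<repo>/", "prefect-image", "1.0.0"), TASK_ARN)` followed by `flow.register(project_name="tutorial")` should then work from inside a container, since nothing in that path needs the Docker daemon. The image itself still has to contain Prefect and the notebooks the flow reads.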