Thread
#prefect-community

    Alex Welch

    1 year ago
    Hi all, I am really struggling to get this to work. I am trying to use Docker Storage with an ECSRun config. What I am looking to make happen is to have the GitHub repo cloned into the Docker container so that my flow has access to various files (Jupyter notebooks primarily). I have been trying solutions for a number of days and I am currently stuck on:
    Error while fetching server API version: {0}'.format(e)
    docker.errors.DockerException: Error while fetching server API version: ('Connection aborted.', FileNotFoundError(2, 'No such file or direct
    This would indicate that my Prefect image does not have access to the Docker daemon, but I can't figure out what I am doing wrong. I have prefect backend cloud set. And below are my files.
    Dockerfile
    FROM prefecthq/prefect
    FROM docker.pkg.github.com/<company>/<repo>/data-image:latest
    
    # Prefect Config
    ARG GH_BRANCH=master
    ARG GH_TOKEN
    ARG PREFECT_DEPS="jupyter,aws"
    ARG PREFECT_HOME=/usr/local/
    ARG PREFECT_VERSION=0.14.10
    
    ENV GH_TOKEN=$GH_TOKEN
    
    # Copy in required files
    COPY requirements.txt ./
    
    # Install Python Requirements
    RUN pip install -U pip
    RUN pip install prefect[${PREFECT_DEPS}]==${PREFECT_VERSION}
    
    # Install VIM and Bash completion
    RUN apt-get update
    RUN apt-get install -y vim
    RUN apt-get install -y bash-completion
    
    # Cloning the master branch
    WORKDIR ${PREFECT_HOME}
    RUN git clone --branch ${GH_BRANCH} https://${GH_TOKEN}@github.com/<company>/<repo>
    
    # Renaming the directory for convenience 
    RUN cp -r "${PREFECT_HOME}data-team-pipeline" "${PREFECT_HOME}pipeline"
    
    WORKDIR "${PREFECT_HOME}pipeline"
    Docker Compose
    I'm using this to run and test the flows locally
    version: "3.3"
    services:
      prefect:
        image: prefect-image:latest
        restart: always
        command: bash -c "prefect auth login -t <AUTH_TOKEN> && /bin/bash"
        working_dir: /usr/local/pipeline/flows/
        environment:
          PREFECT__CONTEXT__SECRETS__GH_TOKEN: ${GH_TOKEN}
          PREFECT__CONTEXT__SECRETS__GH_USER: ${GH_USER} 
          WORKENV: dev
        volumes:
          - type: bind
            source: .
            target: /usr/local/pipeline
          
          - type: bind
            source: ${HOME}/.aws
            target: /root/.aws
    
          - type: bind
            source: ${HOME}/.prefect
            target: /root/.prefect
    Flow
    import sys
    import os
    sys.path.insert(0, os.path.abspath('..'))
    
    import prefect
    from prefect import task, Flow
    from prefect.tasks.jupyter.jupyter import ExecuteNotebook
    
    from flows.prefect_utils import (
      RUN_CONFIG,
      STORAGE
    )
    
    @task
    def query_snowflake():
      logger = prefect.context.get("logger")
      logger.info("Running Notebook 01_Raw_Data_Snowflake_Query")
      # ExecuteNotebook is a Task, so call .run() to actually execute the notebook
      ExecuteNotebook(path='/usr/local/pipeline/flows/01_Raw_Data_Snowflake_Query.ipynb').run()
    
    with Flow("attribution") as flow:
      query_snowflake()
    
    flow.storage = STORAGE
    flow.run_config = RUN_CONFIG
    flow.register(project_name="tutorial")
    Prefect Utils File
    import os
    import logging
    import sys
    from typing import Tuple
    from prefect.run_configs import ECSRun
    from prefect.storage import S3
    from prefect.storage import GitHub
    from prefect.storage import Docker
    
    from prefect.client import Secret
    from prefect.schedules import CronSchedule
    
    logging.basicConfig(level=logging.INFO, stream=sys.stdout)
    logger = logging.getLogger(__name__)
    
    work_env = os.getenv("WORKENV")
    GH_TOKEN = Secret("GH_TOKEN").get()
    
    PREFECT_ENV_VARS = {
      "GH_TOKEN": GH_TOKEN
    }
    
    DOCKER_REGISTRY = "docker.pkg.github.com/<company>/<repo>/"
    PREFECT_DATA_IMAGE = "docker.pkg.github.com/<company>/<repo>/prefect-image:1.0.0"
    
    # if work_env == 'dev':
    TASK_ARN = "<ECS_TASK_ARN>"
    RUN_CONFIG = ECSRun(labels=['s3-flow-storage'],
                          task_role_arn=TASK_ARN,
                          image='prefecthq/prefect:latest',
                          memory=512,
                          cpu=256
                        )
    STORAGE = Docker(
        registry_url=DOCKER_REGISTRY,
        base_image=PREFECT_DATA_IMAGE,
        env_vars=PREFECT_ENV_VARS
    )
    I can get other flows working that don't involve Docker Storage, but that doesn't help me if I need to reference these other files.
    Chris White

    1 year ago
    This is almost always caused by a poorly configured Docker daemon, independent of Prefect - I recommend trying to build that Dockerfile using the Docker CLI directly and debugging from there

    Alex Welch

    1 year ago
    I am building it from the CLI with
    docker build -f Dockerfile --no-cache --build-arg GH_TOKEN=${GH_TOKEN} -t ${IMAGE} .
    The Docker compose is just running it.
    @Chris White I went and ran the below as well and am receiving the same error
    docker run \
    --workdir /usr/local/pipeline/flows/ \
    -e PREFECT__CONTEXT__SECRETS__GH_TOKEN=$GH_TOKEN \
    -e PREFECT__CONTEXT__SECRETS__GH_USER=$GH_USER \
    -e WORKENV=dev \
    --mount type=bind,source=$(pwd),target=/usr/local/pipeline \
    --mount type=bind,source=${HOME}/.prefect,target=/root/.prefect -it prefect-image
    I also tried it with the official prefecthq/prefect image and received the same error
    do I need to expose a certain port?
    It seems like the issue has something to do with not being able to find the local Docker daemon. But it only happens when I use Docker Storage. Does Docker Storage not work when using the Prefect Docker container to run flows?
    Chris White

    1 year ago
    Oh you’re running this from within a Docker container? Yea that’s the issue - you can’t run Docker within docker, or at least Docker recommends against it

    Alex Welch

    1 year ago
    to run my flows
    I didn't see anything in the docs that said you can't use Docker Storage when running it this way, but am I right in my understanding that that's the issue?
    Chris White

    1 year ago
    Yea, this is unrelated to Prefect - you can't access a Docker daemon from within a Docker container; those Prefect images are intended to be used as a base image for your flows

    Alex Welch

    1 year ago
    what do you mean by that? Maybe I am not understanding how this workflow works
    my thought process was that I build a docker image using Prefect as the base and then include an additional image I have already created
    then run the flows from there
    would I want to be running something like S3 Storage in dev and then Docker Storage in prod? Or is what I'm trying to do not possible?
    I'm trying to do this because we have an internal image, data-image, which comes prepackaged with all of the packages we use. And I'd like to use that in dev > CI > prod
    Chris White

    1 year ago
    Yea, your end goal is valid - you can use Docker storage built on top of your data-image and with Prefect included; the thing you're getting tripped up on though is that registering a flow with Docker storage requires building an image that your flow is placed into, and you can't build a Docker image from within a Docker container. This means you'll need to call flow.register from a non-Docker process. We are starting to recommend that folks use other storage types (e.g. S3) along with a fixed image that you build independently to avoid these sorts of complications - check out this newly published doc that covers some of these patterns: https://docs.prefect.io/orchestration/flow_config/docker.html
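    For example, a rough sketch of that pattern - flow definition stored in S3, flow run executed inside a prebuilt image - might look something like this (the bucket name is a placeholder and the image address assumes your prebuilt data image):
    from prefect import Flow, task
    from prefect.run_configs import ECSRun
    from prefect.storage import S3

    @task
    def say_hello():
        print("hello")

    with Flow("attribution") as flow:
        say_hello()

    # The flow definition itself is stored in S3...
    flow.storage = S3(bucket="my-prefect-flows")

    # ...while the ECS task runs inside the prebuilt image that already
    # contains the repo files and dependencies.
    flow.run_config = ECSRun(
        labels=["s3-flow-storage"],
        image="docker.pkg.github.com/<company>/<repo>/data-image:latest",
    )

    # Register from a non-Docker process (e.g. a laptop or CI); this only
    # uploads the flow to S3, so no docker build is involved.
    flow.register(project_name="tutorial")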

    Alex Welch

    1 year ago
    ok yeah, I think what I'll end up doing is setting up a conda environment that the team opens into, with Prefect and the like installed, so that they can test the flows. The idea was to keep everything standardized. But let me ask: in the docs it says that you recommend relying on a different storage mechanism. Is there any way to achieve what I am trying to do other than Docker storage? i.e. we have a GitHub repo with all of our code and a team Docker image with all the dependencies built in. Then, within that same GitHub repo are files we want to run (i.e. Jupyter notebooks).
    Chris White

    1 year ago
    Yea you can still build Docker images that your flows run in that contain all the relevant files and dependencies, and store your flow separate from that
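    A rough sketch of that, assuming GitHub storage for the flow file plus the team's data image on the run config (the repo, flow path, and image address below are placeholders):
    from prefect.run_configs import ECSRun
    from prefect.storage import GitHub

    # The flow source is pulled from the GitHub repo at run time
    # (authenticated with a GitHub access token secret).
    STORAGE = GitHub(
        repo="<company>/<repo>",
        path="flows/attribution.py",
    )

    # The container that actually executes the flow is the prebuilt
    # data image, which already has the notebooks and dependencies baked in.
    RUN_CONFIG = ECSRun(
        image="docker.pkg.github.com/<company>/<repo>/data-image:latest",
    )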

    Alex Welch

    1 year ago
    Is that the Docker Runner then?
    instead of ECS?
    Chris White

    1 year ago
    All agents other than the local agent support both Docker storage and running flows within Docker images, so you aren’t really constrained there either way

    Alex Welch

    1 year ago
    so let's say I have ECS up and running as the agent. How do I run the flow in the image?
    Chris White

    1 year ago
    Sorry just seeing this - you can provide an image="address-to-registry-and-image" to your Run Config (all types except Local and Universal should accept this kwarg)
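    For instance, something like this on the existing ECS run config (the ARN and image address are placeholders):
    from prefect.run_configs import ECSRun

    RUN_CONFIG = ECSRun(
        task_role_arn="<ECS_TASK_ARN>",
        image="docker.pkg.github.com/<company>/<repo>/data-image:latest",
        memory=512,
        cpu=256,
    )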

    Alex Welch

    1 year ago
    interesting. and that will know to pull that image down and run the flows inside it?
    Chris White

    1 year ago
    yup yup

    Alex Welch

    1 year ago
    one last question... this has been so helpful. Does it build the image on every run?
    Chris White

    1 year ago
    Glad I could help - specifying an image on your run config means you're referencing an already built image, so the step of building the image happens externally to Prefect