Darragh

    2 years ago
    Apologies in advance for this, it’s probably going to sound extremely basic, but it’s something I’m having trouble wrapping my mental model around. I have Prefect installed on an EC2 instance that I want to use to run all my flows. I also have a stupidly basic HelloWorld flow that does nothing, but it’s still a Flow. All good so far! Here’s where my trouble starts - I want to take my flows and deploy/register them to my Prefect instance on EC2 as part of my CI/CD process, but I’m having trouble understanding the procedure for this, even after reading the Storage and Execution Environment sections of the docs. So what I’d like to find out is:
    • How do I get my Python flow files in GitLab registered to AWS Prefect?
    • Do they have to be in Docker storage? What does the Docker container actually contain? It looked like it was pulling a lot of dependencies when I built it.
    • Is it possible to just bundle the flow files and have them imported by my instance?
    • The execution environment: is the LocalEnvironment suitable for what I’m looking for, just to get it started?
    And probably most importantly, is there a guideline doc on how to deploy/register flows using a non-Cloud Prefect instance? 🙂 Thanks!
    Zachary Hughes

    2 years ago
    Hi @Darragh, good questions! Let me try to take them one at a time:
    • To register a flow to the Prefect Server you have hosted on AWS, you'll need to do two things: update your `~/.prefect/config.toml` file to include the API address of your Prefect Server, and ensure you've run `prefect backend server`. Updating your `config.toml` to look something like the below should do the trick:
    [server]
        [server.ui]
        graphql_url = "YOUR_MACHINES_PUBLIC_IP:4200/graphql"
    • You don't need to use Docker storage, but I'd strongly recommend it for ease of use. Using Docker storage builds an image with the dependencies Prefect needs to run, in addition to any dependencies you might have specified yourself.
    • If you'd rather bundle your flow files and use local storage, you can do that. It's worth noting that if you do register a flow locally, it'll be tagged with tags specific to the machine on which it was registered. You'll need to ensure that the agent you spin up to retrieve that flow has matching tags, otherwise the agent will not retrieve that flow.
    And to your last question, the nice thing about using Server is that it's quite similar to Cloud, so this guide should do the trick, but let me know if you have any additional questions: https://docs.prefect.io/orchestration/tutorial/first.html#write-flow
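The tag matching described above can be pictured as a subset check. This is a minimal sketch in plain Python, not Prefect's actual implementation: it assumes an agent retrieves a flow only when every tag (label) on the flow is also present on the agent. The function name and the hostnames are hypothetical.

```python
def agent_can_pick_up(flow_labels, agent_labels):
    """Return True when every label on the flow is also on the agent (assumed rule)."""
    return set(flow_labels) <= set(agent_labels)


# Hypothetical labels: a flow registered with local storage on a CI machine
# carries a label specific to that machine.
flow_labels = ["ci-runner-01"]

# An agent started elsewhere without that label would skip the flow.
print(agent_can_pick_up(flow_labels, ["ec2-prefect-host"]))

# An agent started with the matching label would retrieve it.
print(agent_can_pick_up(flow_labels, ["ec2-prefect-host", "ci-runner-01"]))
```

If the flow is registered from a throwaway CI container, this is the kind of mismatch to watch for between the registering machine and the agent on the EC2 instance.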
    Darragh

    2 years ago
    Thanks @Zachary Hughes I think I see now - so for example, if I’m building my Dockerized flow, either locally or in my GitLab job, I need to make sure there’s a `~/.prefect/config.toml` file present, is that right? And if that is right, what are the implications for my build jobs? Does Prefect need to be available to the GitLab container in order for automated CI/CD to work for this? I think there’s still a couple of items I’m not 100% on, but I’m slowly getting there!
    Zachary Hughes

    2 years ago
    Awesome, let's get you to 100%! 💪 The reason you'll want to specify that config is that it tells Prefect where to register your flow. By default it goes to `localhost`, so since you're trying to register to a remote server, you'll need to give it directions. If I'm understanding what you're trying to do properly, then you'll want Prefect available in your CI/CD container to streamline the registration process. So once you've gotten Prefect into your container so you can register and pointed Prefect at the correct API, you should be good to go.
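Since a CI container starts without any Prefect config, one way to "give it directions" is to write the file at the start of the job. The following is a rough sketch using only the standard library; the helper name `write_prefect_config` and its `home` parameter are hypothetical, and the file contents simply mirror the snippet suggested earlier in this thread.

```python
import tempfile
from pathlib import Path


def write_prefect_config(api_url, home=None):
    """Write a minimal config.toml under <home>/.prefect and return its path."""
    home = Path(home) if home is not None else Path.home()
    config_path = home / ".prefect" / "config.toml"
    config_path.parent.mkdir(parents=True, exist_ok=True)
    config_path.write_text(
        "[server]\n"
        "    [server.ui]\n"
        f'    graphql_url = "{api_url}"\n'
    )
    return config_path


# Demonstration against a temporary directory so nothing real is overwritten.
with tempfile.TemporaryDirectory() as tmp:
    path = write_prefect_config("YOUR_MACHINES_PUBLIC_IP:4200/graphql", home=tmp)
    print(path.read_text())
```

In a real job the `home` argument would be left out so the file lands in `~/.prefect/config.toml` before `python $FLOW_FILE` runs.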
    Darragh

    2 years ago
    Ah, ok. That may cause some problems then 🙂 GitLab jobs all run as standalone containers, so, making sure I have it right, I need to take either:
    • a pre-built container with Prefect in it [is that what the `prefecthq/prefect` image is?] and then run `python $FLOW_FILE` to get it to build
    • a D-in-D container, install Prefect using `pip install prefect`, and then build using the same `python $FLOW_FILE` as above
    Both of those would need the `~/.prefect/config.toml` to be updated to point at my AWS Prefect server. Am I right so far?
    Zachary Hughes

    2 years ago
    Full disclosure: I've never used GitLab. But both of these sound like viable options, or if the container you're using has access to pip, you should be able to install Prefect there. Also, if it's easier for you configuration-wise, you can export whatever config options you need to set (like the API address) as environment variables, as described in this link: https://docs.prefect.io/core/concepts/configuration.html#environment-variables
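As a rough illustration of the environment-variable route: the linked page describes variables prefixed with `PREFECT__`, with double underscores separating config sections. The sketch below derives such a name from a dotted config key and builds an environment for a child process. The helper name is hypothetical, and the key is the one from the `config.toml` snippet earlier in the thread.

```python
import os


def prefect_env_var(dotted_key):
    """Turn a dotted config key into a PREFECT__-style environment variable name."""
    return "PREFECT__" + "__".join(part.upper() for part in dotted_key.split("."))


name = prefect_env_var("server.ui.graphql_url")
print(name)

# Copy of the current environment with the override added; this could be passed
# as env= to subprocess.run when invoking the flow file from a CI script.
env = dict(os.environ)
env[name] = "YOUR_MACHINES_PUBLIC_IP:4200/graphql"
```

In a GitLab job the same variable could presumably be set in the job's variables instead, which avoids touching `~/.prefect/config.toml` at all.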
    Darragh

    2 years ago
    I’m not too worried about the config, I can get around that, it’s having to do the dreaded Docker in Docker that’s going to kill me 🙂 Thanks for the help though, if I get it working I’ll let you know!
    Zachary Hughes

    2 years ago
    Sounds good, let me know if there's anything else I can do to help!
    Pedro Machado

    2 years ago
    Hi Darragh. I'd be interested in learning the outcome of your experiments!
    Darragh

    2 years ago
    So far it’s not going that well 🙂 My problem is that I need to be able to install Prefect inside a Docker container. I don’t really care what the base image is at this point, but basically all I really want is:
    • A container with Prefect installed [is the full installation necessary for building a Docker flow?]
    • Ability to build a Prefect flow into Docker storage
    Tyler Wanner

    2 years ago
    for a CI setup you should only need to run `python FLOW_FILE` in a prefect image. This should work locally fine with a Docker storage object if you mount the Docker daemon to the builder container (prefect image) as suggested by Scott. However, it may not work in GitLab at the moment. I’ve no experience with GitLab but a lot of CI systems use remote Docker daemons, which requires the use of $DOCKER_HOST under the hood… This value is not respected by the Docker storage atm, but a fix is in and ready to be released with 0.11.0
    for reference, scott (zelenka)’s response below was “d-in-d is always a headache, but this pattern works for me (on debian based images):
    docker run -v $PWD:$PWD -w $PWD -v /var/run/docker.sock:/var/run/docker.sock IMAGE_NAME command
    By mapping the $PWD volume, and docker.sock, it allows the Docker instance inside the container to behave as if it was running natively on the Docker host.”
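The socket-mounting pattern quoted above can be assembled programmatically if the CI step is driven from a script. This is only a sketch: the helper name, the `flow.py` file name, and the `/builds/my-project` directory are hypothetical, while `prefecthq/prefect` is the image name mentioned earlier in the thread.

```python
import os
import shlex


def docker_run_command(image, command, workdir=None):
    """Build a `docker run` argument list that mounts the working directory
    and the host's Docker socket into the container."""
    workdir = workdir or os.getcwd()
    return [
        "docker", "run",
        "-v", f"{workdir}:{workdir}",                       # same path inside and outside
        "-w", workdir,                                      # start in that directory
        "-v", "/var/run/docker.sock:/var/run/docker.sock",  # reuse the host daemon
        image,
        *command,
    ]


cmd = docker_run_command(
    "prefecthq/prefect",
    ["python", "flow.py"],
    workdir="/builds/my-project",
)
print(shlex.join(cmd))
```

The list form can be handed directly to `subprocess.run(cmd, check=True)`. As noted above, this relies on a local Docker socket, so it may not help on CI systems that talk to a remote daemon through `$DOCKER_HOST`.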
    Darragh

    2 years ago
    Question actually - you mentioned “run `python FLOW_FILE` in a prefect image” - is there such a thing as a pre-packaged Prefect image for this kind of purpose?