
Tom Manterfield

04/21/2022, 2:42 PM
Hello everyone! Is there a way to create a full Prefect 2.0 setup entirely in code/config without CLI interactions? Ideally, when creating work queues, agents, deployments etc. I can just whack it all in a YAML file or similar and then have the same setup created every single time I apply that config. All I can see in the docs is a series of CLI commands for all of this, which is great for local dev and interacting on an ad-hoc basis, but I couldn’t run like that across multiple envs. I have searched (and searched and searched) but am coming up with nothing. Hopefully I’m just missing the obvious! I’m deploying to Kubernetes, if that info makes a difference.

Kevin Kho

04/21/2022, 2:46 PM
I am pretty sure the answer here is not really, but let me confirm with some engineers. Will elaborate more when I get a response in a bit.
🙏 1

Tom Manterfield

04/21/2022, 2:57 PM
Any reliable in-code way to configure would be fine. Simple declarative formats would be ideal but the goal is I can run one or two commands as part of deployment and have a system with all the work queues, agents, etc needed.

Kevin Kho

04/21/2022, 3:56 PM
This type of thing is something we’re interested in, but it will come after the schemas are more final. In the meantime though, the CLI commands call the Python API under the hood, and you could interact with those classes directly instead if you want to avoid the CLI. Largely undocumented for now, though.
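For example, something along these lines should do roughly what prefect work-queue create does from Python (an untested sketch — get_client and create_work_queue are the internal calls the CLI uses today, but the exact signatures may still shift between beta releases, and the queue names/tags here are just placeholders):

import asyncio

from prefect.client import get_client


async def create_work_queues():
    # Placeholder names/tags; equivalent to `prefect work-queue create -t <tag> <name>`
    queues = {
        "etl-jobs": ["etl"],
        "ml-jobs": ["ml"],
    }
    async with get_client() as client:
        for name, tags in queues.items():
            # Raises if a queue with the same name already exists
            queue_id = await client.create_work_queue(name=name, tags=tags)
            print(f"Created work queue {name!r} ({queue_id})")


if __name__ == "__main__":
    asyncio.run(create_work_queues())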

Tom Manterfield

04/21/2022, 4:01 PM
No problem and thanks very much for getting back to me. I’m going to have a bit more of a play around to get some more definite questions. Right now I just get an overall impression of not being able to reliably deploy Prefect 2.0 without lots of scripting/manual steps. Even if I add those to CI/CD flows there’s still all the logic around checking for failure and retrying… ironically one of Prefect’s main strengths.

Kevin Kho

04/21/2022, 4:04 PM
That’s definitely understandable! If you need something more immediate, Prefect 1.0 is of course production-ready. But I understand if you want to wait a bit for Prefect 2.0 to solidify.

Tom Manterfield

04/21/2022, 4:15 PM
I’m going to dig in a bit and see if any of it is solvable on my side. I’d imagine some of the things I see as issues up front won’t be in practice. If you have an API for all of this internally I might be able to work backwards from that and create a Kubernetes operator.

Alexander Butler

04/22/2022, 2:36 PM
I managed to do it @Tom Manterfield using CLI commands, but embedding them all in the Docker image’s build instructions. It makes the entire Prefect setup automated, with image builds triggered by CI/CD. Would that solve your use case?
:upvote: 1

Tom Manterfield

04/22/2022, 4:44 PM
It would technically work, but I’m not sure it gives me much over doing it directly in CI/CD and scripting it. If I were going to script it, I think I’d do it as a job on Kubernetes and run it post-deployment.
@Alexander Butler I’d be interested to see how you scripted the storage setup though. That seems to be the worst part to script, as I can’t see any way to pass the args in one go/non-interactively.

Alexander Butler

04/22/2022, 6:35 PM
FROM python:3.9-slim as base

ENV PYTHONFAULTHANDLER=1 \
    PYTHONHASHSEED=random \
    PYTHONUNBUFFERED=1

# curl is needed further down to fetch the Google Cloud apt key
RUN apt-get update && apt-get install -y gcc libffi-dev g++ curl
WORKDIR /app

FROM base as staging

ENV PIP_DEFAULT_TIMEOUT=100 \
    PIP_DISABLE_PIP_VERSION_CHECK=1 \
    PIP_NO_CACHE_DIR=1 \
    POETRY_VERSION=1.1.13

RUN pip install "poetry==$POETRY_VERSION"
RUN python -m venv /venv

COPY pyproject.toml poetry.lock ./
RUN . /venv/bin/activate && poetry install --no-dev --no-root

COPY . .
RUN . /venv/bin/activate && poetry build

FROM base as output

COPY --from=staging /venv /venv
COPY --from=staging /app/dist .
COPY docker-entrypoint.sh ./
COPY src/production_*.py ./
COPY src/deploy-production.sh ./

# USE GCLOUD FOR DOCKER OPS
RUN echo "deb [signed-by=/usr/share/keyrings/cloud.google.gpg] <http://packages.cloud.google.com/apt> cloud-sdk main" | tee -a /etc/apt/sources.list.d/google-cloud-sdk.list && curl <https://packages.cloud.google.com/apt/doc/apt-key.gpg> | tee /usr/share/keyrings/cloud.google.gpg && apt-get update -y && apt-get install google-cloud-sdk -y
RUN yes | gcloud auth configure-docker us-west1-docker.pkg.dev

# ENSURE WORK-QUEUES HAVE WORKERS IN ENTRYPOINT
RUN . /venv/bin/activate && \
        pip install *.whl && \
        prefect work-queue create -t pipeline pipeline-jobs && \
        prefect work-queue create -t reverse-etl reverse-etl-jobs && \
        prefect work-queue create -t ml machine-learning-jobs && \
        prefect work-queue create -t audit audit-jobs && \
        chmod +x ./deploy-production.sh && \
        chmod +x ./docker-entrypoint.sh && \
        ./deploy-production.sh

EXPOSE 4200

# START AGENTS + SERVER (run with --network="host")
CMD ["./docker-entrypoint.sh"]
./deploy-production.sh
#!/bin/sh
# Register every production_*.py deployment spec file with the Prefect API
for deployments in ./production_*.py; do
    prefect deployment create "$deployments"
    echo "Deployments loaded from $deployments"
done
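Each production_*.py is just a regular Python file holding one or more deployment specs that prefect deployment create picks up. Roughly like this — a simplified, made-up example, and the exact DeploymentSpec fields have been shifting between beta releases:

from prefect import flow
from prefect.deployments import DeploymentSpec


@flow
def pipeline_flow():
    # Made-up placeholder flow
    print("running the pipeline")


# The tag routes runs of this deployment to the pipeline-jobs work queue created above
DeploymentSpec(
    flow=pipeline_flow,
    name="nightly",
    tags=["pipeline"],
)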
./docker-entrypoint.sh (run on every container startup)
#!/bin/sh
. /venv/bin/activate

# Initialize Prefect Agents 
prefect agent start pipeline-jobs \
    2> errfile-pipelines &
prefect agent start reverse-etl-jobs \
    2> errfile-reverse-etl &
prefect agent start machine-learning-jobs \
    2> errfile-machine-learning &
prefect agent start audit-jobs \
    2> errfile-audit &

# Initialize Prefect Server
prefect orion start --host 0.0.0.0
I use the default transient storage. Any important state stored outside Prefect is handled within the flows. We don’t care about a long history of flow runs beyond the container’s lifetime between upgrades. 🤷

Tom Manterfield

04/22/2022, 7:34 PM
Ah, I think you must also be using the local SQLite DB, right?
Thanks very much for sharing btw. I don’t think I can follow this exact pattern but it’s great to start from a working example. 🙏