Emile Piccoli

07/05/2023, 9:07 AM
Hi, we came to prefect in search of an orchestration framework. Our problem is pretty simple still, but it could grow in complexity, hence why we are exploring prefect. In essence, what we would need is:
• to schedule execution of multiple flow runs
• to have tasks defined (with dependencies between them) on each flow run

Since we would like to deploy this on Azure container group, I could think that minimally we would need this setup:
• one container instance for the prefect server
• one container instance for the prefect agent (defaulting to local run, without external "workers")

The question then is about the choice of database. Our setup is simple, so sqlite would suffice in theory, but the question is then where is this database file stored, because it might be beneficial to store it outside the container filesystem (and mount a volume for instance), so that it would be persistent to the rest of the infrastructure.

Does anyone have recommendation / tried to achieve something simple with prefect (but in production)? At the moment the prefect server would really just service one application, so I am wondering if this is a stretch of how to use the prefect framework. We were looking for something simpler than airflow, but it looks like this might not necessarily be the case. Thanks in advance for any feedback!

Christopher Boyd

07/05/2023, 4:42 PM
Hi Emile - are you looking at doing self-hosted, or prefect cloud with a free workspace for this? That would dictate the necessity for sqlite, but yes, sqlite would be local to the installation
so it would co-exist on the agent / worker
I also have some setup steps (unfortunately they are not merged just yet, but maybe still good to follow) - https://github.com/PrefectHQ/prefect-recipes/tree/azure-deployments/devops/infrastructure-as-code/azure/prefect-worker-on-aci and https://github.com/PrefectHQ/prefect-recipes/tree/azure-deployments/azure/prefect-2/prefect-worker-on-aci The first is the example you mentioned (running local agent in the container group), and there is an example of doing it as worker as well. These examples show using blocks, and/or remote/private repositories / registries, but this is just to demonstrate the possibility, not necessarily that they need to be followed exactly
To answer your question directly - yes, the sqlite database is stored by default in ~/home/prefect/.prefect
it could be beneficial to make that a separate volume mount so it could be persisted, but in going that route, would AKS be more viable (even as a smaller instance) than ACI?
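A minimal sketch of what the separate volume mount could look like on the configuration side, assuming a mount point of /mnt/prefect-data and a file name of prefect.db (both hypothetical). It only builds the SQLite URL that Prefect 2 reads from the PREFECT_API_DATABASE_CONNECTION_URL setting; older 2.x releases used an ORION-prefixed setting name, so the name should be checked against the installed version.

```python
from pathlib import PurePosixPath


def sqlite_connection_url(mount_dir: str, filename: str = "prefect.db") -> str:
    """Build an async SQLite URL for a database file on a mounted volume."""
    db_path = PurePosixPath(mount_dir) / filename
    if not db_path.is_absolute():
        raise ValueError("mount_dir must be an absolute path inside the container")
    # Three slashes introduce the path; an absolute path contributes a fourth.
    return f"sqlite+aiosqlite:///{db_path}"


# Hypothetical mount point for the persistent volume.
url = sqlite_connection_url("/mnt/prefect-data")
print(f"PREFECT_API_DATABASE_CONNECTION_URL={url}")
```

The printed line can be set as an environment variable on the server container. ACI volume mounts are typically Azure Files shares, and SQLite's documentation cautions against network filesystems, so file locking is worth testing before relying on this in production.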

Emile Piccoli

07/09/2023, 12:37 PM
Hi Christopher! Thank you for your replies. Yes, we would be looking into our own deployment on Azure. Why would that "dictate the location for sqlite"?

So finally last week I tested locally with docker compose creating the following 2-container setup:
- a docker container running the server, based only on the image `prefecthq/prefect:2-python3.10` with entrypoint: `["prefect", "orion", "start"]`
- a docker container running the agent, based on the same image `prefecthq/prefect:2-python3.10`, but with some extra care that feels in part "cheating":
  - we add the flow and task files to the base image
  - we use a shell script as entrypoint, so that we can run multiple commands, namely 1) run a deployment python script (which calls `Deployment.build_from_flow()`) 2) runs the agent

This setup (with only these two containers) was defaulting to a SQLite database, and I could mount the file outside of the containers as you mentioned its location. I see no issue in theory to deploy this pattern on Azure Container Group, with 2 containers within the Container Group. However, it does feel a bit "hacky". I'd be glad to have your feedback.

PS: if anyone is interested, I could also post the exact configuration for reference.
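For reference, the two-step agent entrypoint described above could also be written in Python instead of shell. This is a rough sketch, not the exact configuration used: the script path /opt/flows/deploy.py and the work queue name default are hypothetical, and the `--run` flag is an illustrative convention so that the script only prints the commands unless asked to execute them.

```python
import shlex
import subprocess
import sys


def entrypoint_commands(deploy_script: str, work_queue: str) -> list:
    """Return the commands the agent container runs, in order."""
    return [
        # 1) Register the deployment (the script calls Deployment.build_from_flow()).
        [sys.executable, deploy_script],
        # 2) Start the Prefect 2 agent, which blocks and polls the work queue.
        ["prefect", "agent", "start", "-q", work_queue],
    ]


def run_all(commands: list) -> None:
    """Run each command in turn, stopping at the first failure."""
    for command in commands:
        # check=True raises if a step exits non-zero, so the container stops.
        subprocess.run(command, check=True)


# Placeholder values; adjust to the actual image layout and queue name.
commands = entrypoint_commands("/opt/flows/deploy.py", "default")

if "--run" in sys.argv[1:]:
    run_all(commands)
else:
    # Dry run: show what would be executed.
    for command in commands:
        print(shlex.join(command))
```

In this layout the agent container also needs PREFECT_API_URL pointing at the server container's /api endpoint, otherwise the deployment script and the agent will not talk to the shared server.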
@Christopher Boyd one additional question about the links you shared. If I understand the introduction right (at https://github.com/PrefectHQ/prefect-recipes/tree/azure-deployments/azure/prefect-2/prefect-worker-on-aci), the goal is to have workers deployed as Azure Container Groups. I see the relevance of this for a prefect deployment where flows are deployed dynamically. However, at least for the present case, I was looking into prefect just as a simple way to use some framework to handle dependencies between tasks, retries and such simple features, but the "work" that needs to be done will be mostly static, except after updates of the flows. This is the reason why I was asking if it was not a stretch to use Prefect at all in this simple scenario.