
Adam

04/13/2023, 8:56 PM
Hey all, I am trying to build a new project with a Docker build step and a git pull step in the prefect.yaml. I am pretty sure I have that part right, but I am getting stuck on what the Dockerfile itself should look like. I want to run on Python 3.10 and create a virtual env to run all the code in. I am testing locally first, with the plan to move to an ECS task later. If anyone could help me figure out the Dockerfile build I would be very grateful! Feel free to DM, thanks!

Chris White

04/13/2023, 9:37 PM
Hey Adam! Is there a reason you want to use a virtual environment within Docker? One of the benefits of Docker is that it fully encapsulates your runtime environment, so adding a virtual environment introduces extra complexity that should be avoided if possible. As a concrete suggestion, if you set
dockerfile: auto
on the Docker build step, it will create a Dockerfile and build the image for you. Prefect persists this Dockerfile within the container, so you can inspect it, copy it, and customize it further.

Adam

04/14/2023, 12:51 PM
Hi Chris, thanks for jumping in to help! I guess I am getting confused about what is happening under the hood. One automation I am working on is a standard ETL running Stitch, then dbt, then Hightouch. It needs to run on a VM, and my team recommended ECS for the agent so it will be available at run time. The second project I am working on is a detailed Python script to run data and put it somewhere. These have specifications and was planning on containerizing all the requirements, creating a virtual env and running that virtually. Before projects came along I was going to use the git storage and ECS infra block, meaning I needed a custom Dockerfile/image. Now that projects exist, what I was doing initially is less clear.
Does that make sense?
Another thing I can't seem to grasp: the scheduler only works when an agent is live. If I use ECS, how does the agent get activated on ECS so the deployment will run at the time set in the Prefect UI?

Chris White

04/17/2023, 4:49 AM
Hey Adam - I think the overall goal makes sense, but this line still confuses me:
These have specifications and was planning on containerizing all the requirements, creating a virtual env and running that virtually
I still don't see why you'd need a virtual environment if the container itself is purpose-built to have your runtime requirements already present; ultimately it's OK if you want to do both, there's just increased risk for weird issues
On the scheduler: it is always running regardless of agents or workers. The Prefect scheduler is unique in that its job is to create runs in a
Scheduled
state but not to execute those runs - that's where the agent / worker comes in. It pulls runs that are ready for execution off a queue created by the scheduler. At this exact moment, you'll need to run your ECS agent in a VM somewhere 24/7 but we totally recognize that this defeats part of the value of ECS and will be offering a "managed worker" service in the next few weeks that submits jobs to your ECS instance
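To make the scheduler / worker split concrete, here is a toy sketch in plain Python. It is not Prefect's implementation or API: `Run`, `ToyScheduler`, and `ToyWorker` are made-up names, and the `"Completed"` state is just a stand-in for "the worker executed it". The only point is that the scheduler creates runs and the worker is what moves them forward.

```python
from collections import deque
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Run:
    # Hypothetical record of one scheduled run (not a Prefect class).
    name: str
    scheduled_for: datetime
    state: str = "Scheduled"


class ToyScheduler:
    """Creates runs in a Scheduled state and queues them; never executes them."""

    def __init__(self):
        self.queue = deque()

    def schedule(self, name, when):
        run = Run(name, when)
        self.queue.append(run)
        return run


class ToyWorker:
    """Pulls runs that are due off the scheduler's queue and 'executes' them."""

    def poll(self, scheduler, now):
        executed = []
        not_due = deque()
        while scheduler.queue:
            run = scheduler.queue.popleft()
            if run.scheduled_for <= now:
                run.state = "Completed"  # stand-in for real execution
                executed.append(run)
            else:
                not_due.append(run)
        scheduler.queue = not_due  # runs that are not due stay queued
        return executed


start = datetime(2023, 4, 17, 9, 0)
scheduler = ToyScheduler()
etl_run = scheduler.schedule("etl", start)

# With no worker polling, the run just sits in the Scheduled state.
state_without_worker = etl_run.state

# Once a worker polls after the scheduled time, the run gets executed.
finished = ToyWorker().poll(scheduler, start + timedelta(minutes=1))
```

In this sketch the run is created on time either way; whether it executes depends entirely on something calling `poll`, which is why the agent has to be up somewhere.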

Adam

04/17/2023, 12:56 PM
I learned that virtual envs are best practice even on VMs. I guess it's a way to ensure you won't run into package discrepancies by accident, even in a container. I thought about it the way you did, tbh. As for ECS, it always confused me, like a chicken-and-egg problem, that the agent needed to be running for the job in the scheduler to actually execute. If it was not a 24/7 VM, you would have to time the agent to come up right when Prefect launched... This almost defeats the purpose, as it adds a new point of failure that would be hard to monitor. On a 24/7 VM you can just make an automation to detect the agent going down.
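For the "detect the agent going down" part, a rough sketch of the kind of check I have in mind, assuming the agent's last check-in time is recorded somewhere I can read it. The function name, the 5-minute threshold, and how the heartbeat gets recorded are all my own assumptions, not anything Prefect provides:

```python
from datetime import datetime, timedelta


def agent_is_down(last_heartbeat, now, max_silence=timedelta(minutes=5)):
    """Return True when the agent has been silent for longer than max_silence.

    Hypothetical helper: where last_heartbeat comes from is up to the
    monitoring setup and is not shown here.
    """
    return now - last_heartbeat > max_silence


now = datetime(2023, 4, 17, 13, 0)
recent = agent_is_down(now - timedelta(minutes=2), now)   # agent checked in recently
stale = agent_is_down(now - timedelta(minutes=30), now)   # agent has gone quiet
```

An alert would then fire whenever the check returns True.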