
Preston Marshall

06/09/2020, 1:56 PM
I think Prefect could be poised to be the next big data engineering tool. One gap I see, though, is that it can really only operate "serverless" on AWS with Fargate, which I'm not sure even applies. On GCP, Cloud Run seems like it could be a good deploy target for tasks that take less than 15 minutes. You'd need something to supervise the requests, or you could use Cloud Pub/Sub. Any interest in this from the community? I'd really like to not worry about running a full-blown k8s cluster to run Prefect. Airflow, on the other hand, is provided as a hosted service by Google.

Joe Schmid

06/09/2020, 1:59 PM
Hi @Preston Marshall, we've been running Prefect in Fargate for the last several months and have had a good experience. We even use Prefect's DaskCloudProviderEnvironment to dynamically create a distributed Dask cluster on Fargate that scales uniquely for specific Flows and parameter sets, e.g. for a large data engineering Flow that benefits from parallelism, that environment spins up a Dask cluster with 10 workers, etc.

Preston Marshall

06/09/2020, 2:00 PM
Yeah, the Fargate experience seems quite nice. I am wondering about things like Azure and GCP. I am looking at the Executor class now and it seems like it could be simple enough to implement a custom one.
Fargate basically just gives you the ability to run compute indefinitely with a docker image, right?

Jim Crist-Harif

06/09/2020, 2:02 PM
I wouldn't do this as an Executor, this would be more an Environment and/or Agent IMO:
• Executor - where to run tasks in a flow
• Environment - spec for deploying a flow (e.g. in a fargate task)
• Agent - process that watches the prefect api and kicks off flow runs using the Environment.
:upvote: 2
But yeah, we're happy to support other platforms for these.
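
Editor's sketch: the split between Executor, Environment, and Agent described above can be pictured with a toy model in plain Python. None of the class names below are Prefect's real classes; ToyExecutor, ToyEnvironment, and ToyAgent are hypothetical stand-ins that only mirror the three responsibilities, and the image name is made up.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


class ToyExecutor:
    """Decides where each task in a flow runs; here, inline in this process."""

    def submit(self, fn: Callable[[], object]) -> object:
        return fn()


@dataclass
class ToyEnvironment:
    """Spec for deploying one flow run, e.g. 'start a container from this image'."""

    image: str  # hypothetical image name, never pulled in this sketch
    executor: ToyExecutor = field(default_factory=ToyExecutor)

    def run(self, tasks: List[Callable[[], object]]) -> List[object]:
        # A real environment would launch infrastructure from self.image first.
        return [self.executor.submit(task) for task in tasks]


@dataclass
class ToyAgent:
    """Looks at the flow runs that are ready and starts each via its environment."""

    environments: Dict[str, ToyEnvironment]

    def poll_once(
        self, ready: Dict[str, List[Callable[[], object]]]
    ) -> Dict[str, List[object]]:
        return {name: self.environments[name].run(tasks) for name, tasks in ready.items()}


agent = ToyAgent({"etl": ToyEnvironment(image="example/etl:latest")})
results = agent.poll_once({"etl": [lambda: 1 + 1, lambda: "done"]})
print(results)
```

Under this reading, supporting a new platform such as Cloud Run would mean a new environment and/or agent, while the executor could stay as it is.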

Joe Schmid

06/09/2020, 2:03 PM
> Fargate basically just gives you the ability to run compute indefinitely with a docker image, right?
Yes. Maybe another way to say it would be "run containers without allocating compute resources, i.e. EC2 instances."

Preston Marshall

06/09/2020, 2:10 PM
Does EC2 have the ability to launch a Docker container with their API? I am wondering if spinning up an instance group (the GCP analog of an AWS auto-scaling group) with a Docker container instance would be a better fit. GCP offers this via their API, and I think it may be the closest thing to Fargate. It wouldn't be subject to timeouts either, and the IG could handle scaling up, restarting dead instances (though I've never seen that in years of GCP), etc. I will look into those contracts.

An Hoang

06/09/2020, 2:11 PM
Thank you @Jim Crist-Harif for clarifying the difference between those terms! Maybe I missed it in the docs, but I was very confused.

Joe Schmid

06/09/2020, 2:19 PM
> does EC2 have the ability to launch a docker container with their API?
Yes, you can use ECS (Elastic Container Service) in two modes:
1. Fargate -- serverless, i.e. AWS manages resources for you
2. EC2 -- the traditional ECS approach where containers run on EC2 instances that you create, potentially using auto-scaling groups.
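
Editor's sketch: in API terms, the two modes above mostly come down to the launchType passed when starting a task. The helper below only builds the keyword arguments one might hand to boto3's ECS run_task call; it does not call AWS. The helper name, cluster name, task definition, and subnet ID are hypothetical, and the public-IP setting is an assumption about the network setup.

```python
from typing import Any, Dict, List, Optional


def run_task_kwargs(
    cluster: str,
    task_definition: str,
    launch_type: str,
    subnets: Optional[List[str]] = None,
) -> Dict[str, Any]:
    """Build keyword arguments for an ECS run_task request (hypothetical helper)."""
    if launch_type not in ("FARGATE", "EC2"):
        raise ValueError(f"unsupported launch type: {launch_type}")

    kwargs: Dict[str, Any] = {
        "cluster": cluster,
        "taskDefinition": task_definition,
        "launchType": launch_type,
    }
    if launch_type == "FARGATE":
        # Fargate tasks use awsvpc networking, so subnets have to be supplied.
        if not subnets:
            raise ValueError("Fargate tasks need at least one subnet")
        kwargs["networkConfiguration"] = {
            "awsvpcConfiguration": {
                "subnets": list(subnets),
                "assignPublicIp": "ENABLED",  # assumption: task needs outbound internet
            }
        }
    return kwargs


fargate_kwargs = run_task_kwargs("demo-cluster", "demo-task:1", "FARGATE", ["subnet-0example"])
ec2_kwargs = run_task_kwargs("demo-cluster", "demo-task:1", "EC2")
print(fargate_kwargs["launchType"], ec2_kwargs["launchType"])
# With credentials configured, something like
# boto3.client("ecs").run_task(**fargate_kwargs) would submit the request.
```

In the EC2 case the cluster must already have container instances registered, which is the part Fargate takes off your hands.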

Pedro Machado

06/09/2020, 3:38 PM
I am still in evaluation mode and also interested in exploring DevOps-light ways to run Prefect. More serverless options are definitely welcome. I'd like to avoid running a full Kubernetes cluster for a small deployment. I'm glad to hear that the Fargate option works well. I plan to try it. Does it take long to start a new cluster with DaskCloudProviderEnvironment? Can the cluster be created in advance and just scale up when needed? Can the cluster be scaled down to zero?
I've been wondering if AWS Lambda or similar could be used to run an agent for infrequent flows. I was envisioning a function that would start every so often, run the agent, and go back to sleep if no work needs to be done. Would this make sense at all, or are the time limits a deal killer? What other cloud services could be a good fit for a serverless execution environment that doesn't require a lot of administration?

Joe Schmid

06/09/2020, 3:58 PM
> Does it take long to start a new cluster with DaskCloudProviderEnvironment?
There is startup latency with Fargate in general (nothing to do with Prefect) and when using DaskCloudProviderEnvironment with Fargate. This makes sense: the whole point of serverless is to avoid pre-allocating compute resources and let the platform allocate them, so you're trading off some startup latency for dynamic allocation. However, for many scheduled Flows startup latency would be a non-issue, e.g. when running a nightly scheduled Flow.
> Can the cluster be created in advance and just scale up when needed? Can the cluster be scaled down to zero?
Yes and yes. In addition to running Prefect on Fargate, we also run on a Kubernetes cluster in this mode, i.e. cluster is created in advance then scaled up and down as needed. Like anything stateful, this comes with lifecycle management issues, e.g. if we have a new version of a docker image with updated Flows we need to update the existing cluster -- but if Flow(s) are currently running we need to wait for them to finish & gracefully update the cluster with new docker image versions, etc. The good news is there are plenty of options, e.g. Fargate, k8s, etc. so that you can make choices to optimize for Flow startup latency vs. lifecycle management, etc.
> I've been wondering if AWS Lambda or similar could be used to run an agent for infrequent flows.
We find the Prefect Fargate Agent to be incredibly lightweight. We run it as an ECS Service (long-running task) with Fargate using the lowest resources possible (1/4 vCPU & 512MB RAM), which costs about $9 per month running continuously. (All the Agent does is poll for Flows that are ready to run and launch an ECS task for each Flow run.) At that rate, I'd say it isn't even worth trying to optimize cost by running it periodically in Lambda, etc.
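
Editor's sketch: the figure of about $9 per month can be sanity-checked with simple arithmetic. The hourly rates below are an assumption (on-demand Linux Fargate pricing in us-east-1 around the time of this thread); check the current AWS pricing page for your region before relying on them.

```python
# Assumed on-demand Fargate rates (us-east-1, mid-2020); not taken from the thread.
VCPU_HOUR_USD = 0.04048
GB_HOUR_USD = 0.004445
HOURS_PER_MONTH = 730  # average hours in a month


def fargate_monthly_cost(vcpu: float, memory_gb: float, hours: float = HOURS_PER_MONTH) -> float:
    """Cost in USD of one Fargate task running continuously for `hours`."""
    return (vcpu * VCPU_HOUR_USD + memory_gb * GB_HOUR_USD) * hours


# Smallest size mentioned above: 1/4 vCPU and 512MB RAM.
cost = fargate_monthly_cost(vcpu=0.25, memory_gb=0.5)
print(f"${cost:.2f} per month")
```

With these assumed rates the result lands at roughly $9, in line with the number quoted above.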

Pedro Machado

06/09/2020, 4:28 PM
Great, Joe. Thanks for the answers. Question regarding updating flows. Are the images not created when the flows are registered and then pulled when a new task is run?

Jim Crist-Harif

06/09/2020, 4:33 PM
Currently flows running with an image have unique images per flow, so yes that's accurate. In the upcoming release we've refactored our Storage classes to support storing the flow information on e.g. S3, and having a static image shared by all flows.
👍 1

Pedro Machado

06/09/2020, 4:46 PM
Hi Jim. This sounds like a great improvement!