Prefect Community #prefect-aws


Christopher

11/30/2022, 2:38 PM

I appreciate this might be a fairly significant engineering effort, so not something I'm holding my breath for, but is there any appetite for Lambda-based task execution? We're probably heading for ECS tasks as a serverless execution environment, but they're kind of slow to spin up, and for short-lived tasks it feels like Lambda would be at the right sort of level

✅ 1

Anna Geller

11/30/2022, 3:21 PM

Lambda is very limited in what you can run on it: the max runtime is 15 min. It's free and great for event-driven execution of small actions, but it's not great as a runtime environment. If you would like to take a stab at contributing an infra block to the prefect-aws repo, we would most likely accept it and help you get it released. Using Prefect with the Serverless Framework for Lambda works pretty seamlessly without infra blocks: https://medium.com/the-prefect-blog/event-driven-data-pipelines-with-aws-lambda-prefect-and-github-actions-b3d9f84b1309

Anna Geller

11/30/2022, 3:21 PM

(almost free)

Christopher

11/30/2022, 3:24 PM

Yes, I appreciate it wouldn't be a good general-purpose runner, but many of our tasks are actually quite short-lived (especially with the prefect 2 pricing model encouraging smaller tasks), and it would be nice to be able to use prefect for orchestrating and monitoring these as well as our bigger batch tasks

Christopher

11/30/2022, 3:25 PM

but perhaps there are other tools better suited for orchestration of small jobs

Zanie

11/30/2022, 3:59 PM

You may be able to use Lambda with subprocess infrastructure and an agent with the --run-once flag
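A rough sketch of what that suggestion could look like, assuming the Prefect 2 CLI is packaged into the Lambda image. The handler name, the WORK_QUEUE environment variable, and the "default" queue are hypothetical; only the --run-once flag comes from the thread, and the -q work queue option is the usual Prefect 2 agent option.

```python
import os
import subprocess


def build_agent_command(work_queue: str) -> list[str]:
    """Build the CLI call for a single agent polling pass."""
    # --run-once makes the agent check the queue once and then exit,
    # instead of polling forever.
    return ["prefect", "agent", "start", "-q", work_queue, "--run-once"]


def handler(event, context):
    """Hypothetical Lambda entry point: one agent pass per invocation."""
    # WORK_QUEUE is an assumed environment variable name, not a Prefect setting.
    work_queue = os.environ.get("WORK_QUEUE", "default")
    result = subprocess.run(
        build_agent_command(work_queue),
        capture_output=True,
        text=True,
    )
    # Return a small summary so the invocation result is easy to inspect.
    return {"returncode": result.returncode, "stderr_tail": result.stderr[-1000:]}
```

Something like a scheduled trigger would then invoke the handler at whatever interval the flows need; that part is not shown here.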

Zanie

11/30/2022, 4:00 PM

You’d still have trouble with the 15 min limit, though, if there are a lot of scheduled runs. We may be able to add another flag to the agent to stop running new flows after some time.
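The idea of "stop running new flows after some time" can be illustrated without any Prefect code. The following is a minimal sketch, not an existing agent flag: run_within_budget and its parameters are made-up names, and the jobs are plain callables standing in for flow runs.

```python
import time


def run_within_budget(jobs, budget_seconds, safety_margin_seconds, clock=time.monotonic):
    """Run jobs in order, but stop starting new ones near the deadline."""
    deadline = clock() + budget_seconds
    completed = []
    for job in jobs:
        # Do not start a job unless at least the safety margin remains.
        if deadline - clock() < safety_margin_seconds:
            break
        completed.append(job())
    return completed
```

Inside a Lambda handler the budget could be taken from context.get_remaining_time_in_millis() rather than a fixed number. Note that this only prevents starting new work late; a job that is already running can still hit the 15 min limit.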

Christopher

11/30/2022, 4:01 PM

Sorry it's possible I'm not using the right terminology -- I imagine that the agent would probably be on long-lived infrastructure. And then each flow run would be on a new lambda invocation

Christopher

11/30/2022, 4:02 PM

oh but you mean one immediate option might be to run the agent itself on lambda and let it pick up tasks to run locally

Zanie

11/30/2022, 4:02 PM

👍 yep

Zanie

11/30/2022, 4:03 PM

If you’re interested in contributing a block that submits runs to lambda, we’re definitely interested.

🙌 1

Christopher

11/30/2022, 4:03 PM

Cool, well I'm not going to make any commitments but if it starts to look like that would make sense for us then I'll reach back out

Zanie

11/30/2022, 4:04 PM

Sweet! We’ll probably do it eventually too.

Mike Grabbe

11/30/2022, 4:06 PM

@Christopher, have you thought about setting up ECS on EC2 for lower latency task runs?

❤️ 1

Christopher

11/30/2022, 4:06 PM

No, in fact my ECS experience is limited to Fargate. Is that something I should look into?

Mike Grabbe

11/30/2022, 4:07 PM

well, it's an option to consider. I haven't dug too deep into it either

Mike Grabbe

11/30/2022, 4:09 PM

we're just getting started with ECS Fargate task runs, and the startup time is frustrating for development runs. I may try to set up ECS on EC2 and see how that works for tasks that don't need much in terms of resourcing

Christopher

11/30/2022, 4:10 PM

Please report back if you find it transformative!

Christopher

11/30/2022, 4:10 PM

Arguably at that point I suppose it might be more efficient to just run the agent on EC2 and use local execution rather than spinning anything up at all

💯 1

Mike Grabbe

11/30/2022, 4:11 PM

yes, that would be a simpler way to start out

gratitude thank you 1

Anna Geller

11/30/2022, 4:49 PM

who knows, maybe AWS will surprise us and will announce some cool caching on Fargate this week during re:Invent 😄 would be sweet

Mike Grabbe

11/30/2022, 4:50 PM

That would be amazing. We would definitely pay for that feature

Christopher

11/30/2022, 4:50 PM

true, if they can do it for Lambda, they might do it for Fargate too!

💯 1

Anna Geller

11/30/2022, 4:50 PM

There is one trick to slightly reduce the latency of the ECSTask block: use a custom task definition ARN. This way Prefect doesn't have to register a new task definition and deregister it, so there is slightly less latency. From my QA it was max 30s, which seems acceptable for batch processes and even near-real-time
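For readers trying this, a minimal sketch of pointing the block at a pre-registered task definition, assuming the prefect-aws ECSTask block and its task_definition_arn field. The helper name, the ARN check, and the example ARN and block name in the comments are illustrative placeholders, not values from this thread.

```python
def build_ecs_task_block(task_definition_arn: str):
    """Return an ECSTask block that reuses an already-registered task definition."""
    # Basic sanity check on the ARN shape before touching any library code.
    if (
        not task_definition_arn.startswith("arn:aws:ecs:")
        or ":task-definition/" not in task_definition_arn
    ):
        raise ValueError(f"not an ECS task definition ARN: {task_definition_arn!r}")

    # Imported lazily so the check above works without prefect-aws installed.
    from prefect_aws.ecs import ECSTask

    # With an ARN supplied, the block is expected to reuse that definition
    # rather than registering and deregistering one per run (per the thread).
    return ECSTask(task_definition_arn=task_definition_arn)


# Example usage (placeholder ARN and block name):
# block = build_ecs_task_block(
#     "arn:aws:ecs:us-east-1:123456789012:task-definition/prefect-flows:1"
# )
# block.save("ecs-prebuilt", overwrite=True)
```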

👀 2

Anna Geller

11/30/2022, 4:51 PM

ditto. If AWS doesn't do it, Modal Labs will show them how to use caching to make serverless almost latency-free

Jessica Smith

01/21/2023, 2:47 PM

@Christopher have you looked into this further? I found this thread because my team is in the same place and I'm thinking about trying to contribute a lambda infra block. If you have any ideas or hit any roadblocks I'd be happy to hear them

Christopher

01/21/2023, 2:49 PM

We didn't, I'm afraid; we got distracted by other fires to put out! I still think it's interesting though

👍 1
