Just saw the announcement of the Prefect scheduler...
# prefect-community
a
Just saw the announcement of the Prefect scheduler! That is awesome! Congrats on getting that out! Just so I’m understanding it, is the main use of it as a “trial” type of service of the Cloud offering? Or would it be used to help roll your own infra and work with Prefect? (Tried to read through as much as I could, site isn’t very mobile friendly)
👊 1
marvin 1
j
Sorry for the mobile difficulties Alex, still working out the kinks!
We’ll have full details rolling out soon, but Scheduler is really the fulfillment of one of our major goals: providing a free workflow management system that’s nonetheless backed by the same production infrastructure as Cloud. As a product, it’s aimed at individuals who don’t need (or want) the full Cloud platform, but who could definitely use some of Cloud’s features like a UI, scheduler, API, etc. to deploy their Core flows.
💯 1
😎 1
a
Love it!! I think this is an amazing step to get people’s proofs of concept done! I’m sure it’ll be included in the details that’ll be rolling out soon, but any idea on what kind of specs we could expect from the Scheduler offering? Is it gonna be on a resource/time scale? So if someone has a flow that runs hourly and takes 512 MB of RAM and 1 CPU vs someone who has one flow that takes 16 GB of RAM and 8 CPUs but only runs monthly?
j
So we expect that for some users, it will indeed replace the need for a formal Cloud trial in the sense of seeing how easy it can be to take a Core flow and deploy to Cloud, but it is intended to sit next to Cloud as an individual-focused product, not a replacement.
You’re teasing out all the cool details on a quiet Saturday afternoon 😉 Scheduler is tied to another thing we’ve been quiet about, honestly because we were getting some very interesting patents on it — we call it our hybrid execution engine. It allows you to keep code, data, and execution private while Prefect manages the orchestration. You can run your flows wherever you want — your laptop, kubernetes cluster, any cloud provider’s free tier, somewhere we host for you — we’re completely agnostic. It’s like “on-prem lite” for everyone, except you don’t have to do anything but kick off a Prefect agent from your CLI.
And we’ve been sneaking it all into Core over the last month or two, so we hope once we announce it, users will take up the baton and help Prefect flows run on any environment imaginable
But to your question — Scheduler will allow up to 5,000 successful task runs per month (of unlimited time, resources, and concurrency, thanks to the hybrid model). Failures don’t count against the total.
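To put that quota in terms of the hourly-vs-monthly example: a rough back-of-the-envelope sketch, assuming a 30-day month and that every task run succeeds (so every run counts). The function names here are made up for illustration and are not part of Prefect.

```python
# Quota stated in the message above: successful task runs per month.
MONTHLY_QUOTA = 5_000

# Assumption: a 30-day month, so an hourly schedule fires 24 * 30 times.
HOURLY_FLOW_RUNS = 24 * 30
MONTHLY_FLOW_RUNS = 1


def counted_task_runs(tasks_per_flow: int, flow_runs_per_month: int) -> int:
    """Task runs counted against the quota if every task succeeds."""
    return tasks_per_flow * flow_runs_per_month


def fits_quota(tasks_per_flow: int, flow_runs_per_month: int) -> bool:
    """True if the schedule stays within the monthly quota."""
    return counted_task_runs(tasks_per_flow, flow_runs_per_month) <= MONTHLY_QUOTA


# Largest flow (in tasks) that can run hourly all month within the quota.
max_tasks_hourly = MONTHLY_QUOTA // HOURLY_FLOW_RUNS

# RAM and CPU never appear in the calculation: only successful task runs count.
print(max_tasks_hourly)
print(fits_quota(5_000, MONTHLY_FLOW_RUNS))
```

So under these assumptions the small hourly flow is the one that has to watch the limit, while the 16 GB monthly flow barely touches it.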
a
I love all of the words you just said. Great work as always!!
👊 2
j
Thanks @Alex Cano! If you want to join the waitlist we’ll make sure you’re one of the first
a
Just submitted the email address! I need to make a more serious pet project now that I know this functionality is on its way!
💯 1
marvin 1
🎉 1
Actually another question on this! I’m assuming this follows the at least once processing model? Or is that more dependent on each individual executor more than the scheduler portion? The thing I’d want to make sure is that all execution options have the same processing guarantees so any kind of defensive code level checks don’t need to change. Edit: I know I at least remember reading up on how the Dask executor works and how if the scheduler dies, basically everything falls apart and can’t recover intelligently. I was just wondering if you solved that problem somehow or are accepting that’s a problem of using Dask as an execution engine.
j
Excellent question. Prefect Cloud is an “at most once” model, because we can’t assume that tasks can be safely run more than once (and we don’t require users to write idempotent code, because we think that’s unrealistic and just moves burden from workflow manager to workflow author)
This means that failures in execution CAN have adverse effects
because Cloud will not resubmit a run automatically that started and did not finish
(By the way, this doesn’t include things like manual or automatic retries — if you want a task to retry of course it will!)
We’re exploring allowing users to indicate that tasks can safely be rerun, but we’ve started with the safest/most conservative approach for most production workloads
🙌 1
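The difference between the two models comes down to one decision: what to do with a run that started but never reported finishing. A minimal sketch of that decision follows. It is not Prefect's implementation; `RunState` and `should_resubmit` are hypothetical names, and treating a never-started run as safe to submit is an assumption.

```python
from enum import Enum


class RunState(Enum):
    PENDING = "pending"    # submitted, never began executing
    STARTED = "started"    # began executing, outcome unknown
    FINISHED = "finished"  # completed and reported back


def should_resubmit(state: RunState, model: str) -> bool:
    """Decide whether to automatically resubmit a run after losing contact with it."""
    if state is RunState.FINISHED:
        return False  # nothing left to do under either model
    if state is RunState.PENDING:
        return True   # assumption: no side effects can have happened yet
    # STARTED but not finished: side effects may or may not have occurred.
    if model == "at_most_once":
        return False  # never risk a second execution
    if model == "at_least_once":
        return True   # accept a possible duplicate execution
    raise ValueError(f"unknown model: {model}")


print(should_resubmit(RunState.STARTED, "at_most_once"))
print(should_resubmit(RunState.STARTED, "at_least_once"))
```

Explicit retries are a separate path: they apply to a task that reported a failed state, not to one whose outcome is unknown.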
a
Ah gotcha. I’d be curious if there can be an extension to the task that could take a function that returns True/False that is essentially an indicator of whether to resubmit
j
^^ that’s an interesting API
cc @Chris White
This is a whole topic of Cloud we call “Zombies” — tasks that failed not because they entered a failed state, but because we detected that they aren’t actually alive
Ah, I was going to link you to an issue tracking this feature but it’s in our Cloud repo. Suffice to say it’s being evaluated!
a
My thought for the True/False option is if you can externally validate your task, like say submitting an API request. If it’s an API you control, it’s very easy to double check whether it was “committed”, you know?
j
Sometimes, when your task becomes a zombie it means we no longer have access to the execution environment (for any reason). So it may be simpler to True/False indicate whether a task is safe to resubmit or not at all, and then incorporate the logic you’ve described within the task itself.
but these are all excellent points for us to incorporate
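A rough sketch of how those two ideas could fit together: a plain True/False flag on the task, with the "was it already committed?" check living inside the task body. Everything here is hypothetical (`TaskSpec`, `safe_to_resubmit`, `handle_zombie`, and the in-memory `committed` set standing in for an API you control); none of it is an existing Prefect API.

```python
from dataclasses import dataclass
from typing import Callable, Set


@dataclass
class TaskSpec:
    name: str
    run: Callable[[], str]
    safe_to_resubmit: bool = False  # conservative default: never rerun


def handle_zombie(task: TaskSpec) -> str:
    """What a hypothetical orchestrator does once it decides a task is a zombie."""
    if task.safe_to_resubmit:
        return task.run()  # the author has vouched that a rerun is safe
    return "not resubmitted"


# Stand-in for an external API you control and can query for committed work.
committed: Set[str] = set()


def post_order() -> str:
    """Task body that validates externally before repeating its side effect."""
    order_id = "order-42"
    if order_id in committed:
        return "already committed"  # earlier attempt got through, skip
    committed.add(order_id)
    return "committed"


task = TaskSpec("post_order", post_order, safe_to_resubmit=True)
first = task.run()            # original attempt
second = handle_zombie(task)  # rerun after the task is flagged as a zombie
print(first, second)
```

The orchestrator only reads the flag, so it never needs access to the lost execution environment; the validation runs in whatever environment picks up the resubmitted task.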
a
I’ve always thought zombies were interesting... since that part really comes into how bad it would be if at least once actually comes into play. Would really be interested to see if you guys do something significantly different than say, how Airflow handles them
👍 1
Ah yeah I definitely like that more. Executing code on every failure would be a bit much. Anyway, I love what you guys are doing with Prefect in general!!
j
Thank you Alex! Conversations like this are awesome for driving everything forward