Hello I am a software developer from Israel Been reading and Prefect Community #introductions

Tomer Cagan

02/24/2022, 12:25 PM

Hello, I am a software developer from Israel. I've been reading about and playing some with Prefect and it looks like a really great product! I am working with an algorithmic research team and we are looking for a platform to run our experiments. We are currently using an in-house system, but we experience a lot of friction using it and are looking for an alternative. (more inside this thread)

👋 12

Anna Geller

02/24/2022, 12:29 PM

Welcome to the community @Tomer Cagan! 👋

👋 1

Tomer Cagan

02/24/2022, 12:31 PM

I have seen and understand how Prefect makes sense for data pipelines (creating new data) and was also very impressed with the data science stream. Our use case is slightly different - our focus is changing the code of the algorithm (the data is more or less fixed). We are looking for a platform that can help us with experiments, which involve continually running new versions of the code. The few high-level requirements we have:
• Parallel execution at different "depths"
  ◦ In many instances we need to start many tasks in parallel, collect the data and continue with processing
  ◦ This fan-out / fan-in can occur multiple times - i.e. one task can spawn many sub-tasks, which in turn can spawn more sub-tasks
  ◦ We would be happy to be able to cancel tasks after some tasks have completed (sort of find a winner task and cancel the rest)
• Still related to parallel, but I put it separately for emphasis - one of the more important requirements we have is to be able to do recursive task chains. This is kind of a divide-and-conquer approach where a task tries to solve part of the algorithm and, if it fails, splits the problem into smaller parts and tries to process each separately. The subprocesses can in turn divide further until the problem is small enough to solve. Then the results are percolated up and assembled.
• Nice to have: branch conditionally on the fan-out/fan-in results and then continue to process further.
I am not sure whether Prefect fits the bill here or not. It does cover a lot and I guess some workarounds can be used (e.g. sub-flows? can a flow call itself?). I was also wondering about the combination of Prefect and Dask - but I am talking about running Dask code within a Prefect task / flow. I will mention that our team has varying levels of coding experience (that's one of the appeals of the flows/tasks paradigm in Prefect).
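To make the recursive fan-out / fan-in and the "find a winner and cancel the rest" requirements concrete, here is a minimal sketch in plain Python with asyncio. It is not Prefect code and says nothing about how Prefect would schedule this; `try_solve`, `solve`, `first_winner`, `candidate`, and the "chunks of length 2 or less are solvable" rule are all hypothetical stand-ins for illustration.

```python
import asyncio


async def try_solve(chunk):
    """Hypothetical solver: succeeds only on small chunks, else returns None."""
    await asyncio.sleep(0)  # stand-in for real work
    if len(chunk) <= 2:
        return sum(chunk)
    return None


async def solve(chunk):
    """Try the whole chunk; on failure split it, fan out, then fan in."""
    result = await try_solve(chunk)
    if result is not None:
        return result
    mid = len(chunk) // 2
    # Fan out: both halves run concurrently and may themselves split again.
    left, right = await asyncio.gather(solve(chunk[:mid]), solve(chunk[mid:]))
    # Fan in: assemble the partial results on the way back up.
    return left + right


async def first_winner(coros):
    """Run candidates concurrently, keep the first to finish, cancel the rest."""
    tasks = [asyncio.ensure_future(c) for c in coros]
    done, pending = await asyncio.wait(tasks, return_when=asyncio.FIRST_COMPLETED)
    for task in pending:
        task.cancel()
    # Wait for the cancellations to be processed before returning.
    await asyncio.gather(*pending, return_exceptions=True)
    return next(iter(done)).result()


async def candidate(name, delay):
    """Hypothetical competing task that finishes after `delay` seconds."""
    await asyncio.sleep(delay)
    return name


def run_demo():
    total = asyncio.run(solve(list(range(10))))
    winner = asyncio.run(
        first_winner([candidate("slow", 0.2), candidate("fast", 0.01)])
    )
    return total, winner
```

Calling `run_demo()` splits `range(10)` recursively until every piece is small enough for the toy solver, then sums the pieces back together, and separately races two candidates and cancels the slower one. Whatever platform is chosen would need to express these same two shapes.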

Tomer Cagan

02/24/2022, 12:32 PM

I am not sure how to proceed here - first off, am I even in the right place? 😅 Should I ask specific separate questions in the community thread?

Anna Geller

02/24/2022, 12:46 PM

Thanks for sharing - sounds like a good use case for Orion. Would be great if you could repost it on Discourse in the welcome topic. Subflows are first-class objects in Orion, so that’s definitely supported. To run flows on Dask, you can configure a DaskTaskRunner on your flow. You’re definitely in the right place. Feel free to continue the discussion here or open a separate thread for specific questions (perhaps it’s easier to discuss each specific question/topic separately).

Chris Reuter

02/24/2022, 1:00 PM

Welcome to the community @Tomer Cagan!

Tomer Cagan

02/24/2022, 1:12 PM

I assumed it would be a better fit for Orion, with dynamic flows and such. My concern there is the maturity level of that version. I read in the FAQs that you do not recommend using it in production, and while our team is not large and deals with research, I still want to make sure the solution we use is stable enough.

Anna Geller

02/24/2022, 1:31 PM

I’d say that Orion is ready for local development and research. We don’t recommend using it in production yet, in the sense that you shouldn’t schedule your mission-critical workloads there (for now). But from what you described, it seems you mostly need a framework to locally execute flows for your research, and for that you can definitely use Orion already.

Tomer Cagan

02/24/2022, 1:55 PM

Actually, I haven't gotten around to mentioning it, but we have a few kinds of runs:
• Automated scheduled runs of the "mainline" of our code. These are scheduled daily/weekly runs that we use to track progress.
• General development runs, in which a researcher can test his/her changes to the code, usually on small samples, and ensure everything works as expected. This is where low-level debugging would happen. For this, Prefect local execution seems perfect (unlike other tools that make it harder to run and debug locally).
• Ad-hoc scaled-out (or scaled-up) runs, in which code that was just developed (not necessarily merged to mainline, sometimes not even committed) can be run at a larger scale on a larger set of samples. We most likely want this to happen in k8s. I believe that with the available storage options, mainly Docker, that would also be something Prefect supports.

Tomer Cagan

02/24/2022, 1:56 PM

So there is also an aspect of scheduled, important workloads (ones that should not fail most of the time).

Anna Geller

02/24/2022, 2:04 PM

It’s up to you whether you want to start directly with Orion or start with Prefect 1.0 and Prefect Cloud first. Either way, the migration process shouldn’t be too difficult. Keep us posted if you have any questions along the way.

Tomer Cagan

02/24/2022, 2:38 PM

Is there an estimate when Orion would come out of technical preview?

Anna Geller

02/24/2022, 2:42 PM

Early Q2 was a rough ETA, but it would be best if you keep watching the #CKNSX5WG3 channel in the next weeks and months to stay up to date with the latest releases. You can also click the 🔔 icon on this Discourse tag https://discourse.prefect.io/tag/release-notes

Kevin Kho

02/24/2022, 2:45 PM

Hi @Tomer Cagan! Welcome!
