# ask-community
t
Hello, can anyone point me to any kind of comparison between Prefect Core and Prefect Orion? And maybe also give some clarity about which of them is run when we go the Cloud path?
a
I think this FAQ item in the docs provides the best answer to your question: https://orion-docs.prefect.io/faq/#relationship-with-other-prefect-products
t
@Anna Geller thanks! so - i gather from this that once Orion is released it will also replace the flavor in the cloud itself?
a
It’s hard to give any details at this point, but Orion will eventually be compatible with our commercial platform
t
Thanks, I’m just building a big comparison sheet for my team between different tools, so I’m trying to understand better what’s going to happen 🙂 And while on the topic: despite Prefect being a general data-oriented orchestrator, would you say there are any features in it that can be particularly useful for data science, machine learning, etc.? (I’m trying to figure it out myself from the docs, but maybe there’s some stuff I’m missing.)
a
@Tom Klein Yes, there are a lot of features that make Prefect the best orchestrator for ML use cases! Features I find particularly relevant here:
• Dynamic mapping and parametrization.
• The hybrid execution model, which allows you to orchestrate data science workflows that process sensitive data, because Prefect never receives your code or data.
• In contrast to many other tools on the market, Prefect is able to pass data between tasks in a first-class way. Other tools “cheat” by, e.g., pushing this data in one task and pulling it in the next one, causing weird dependency problems when something goes wrong.
• Other workflow orchestrators require you to build an n:n set of data connectors simply because they are not able to pass data between tasks. The impact of this is that you need a connector for everything: S3toRedshift, PostgresToRedshift, GCStoRedshift, and so on. In Prefect, you don’t need that. You can have one task that accepts data as input (passed from a task that extracts data from an arbitrary source; here, it can be S3, GCS, or Postgres) and loads it, e.g., to Redshift. This design significantly reduces boilerplate code and makes your code much more adaptable to unknown future changes (say, in the future, you may need yet another data source that needs to be loaded to Redshift).
• Scheduling is decoupled from execution, so you don’t need to have a scheduler running to execute your flow. This allows you to run, e.g., ML training jobs or data science experimentation at any time for any reason without having to schedule such flow runs.
• Many tools have a limited number of states, such as Scheduled, Running, Success, Failed. Therefore, individual tasks are only allowed to report a very minimal set of information about their state. In contrast, Prefect has a rich state ecosystem that allows you to react to various conditions, e.g.:
◦ take action such as sending a notification when a task is Retrying,
◦ or store an exception of a failed task as its Result and take some action based on a specific exception.
I think this talk from our CEO can be useful to watch because Prefect was built to make orchestration compatible with Data Science use cases:

https://www.youtube.com/watch?v=TlawR_gi8-Y

this page can also be helpful for your comparison: https://docs.prefect.io/core/about_prefect/why-prefect.html
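To make the connector point above a bit more concrete, here is a rough plain-Python sketch. It deliberately does not use the Prefect API: every function name is hypothetical, the extractors return hard-coded rows, and the "warehouse" is just an in-memory list standing in for Redshift. The idea it illustrates is that when data can be passed from one step to the next, a single load step serves any number of sources, so adding a source means adding one extractor rather than one more source-to-destination connector.

```python
from typing import Callable, Dict, Iterable, List

Row = Dict[str, object]


# Hypothetical extractors: each one returns rows from a different source.
# Real versions would read from S3, Postgres, or GCS; these are stand-ins.
def extract_from_s3() -> List[Row]:
    return [{"id": 1, "source": "s3"}]


def extract_from_postgres() -> List[Row]:
    return [{"id": 2, "source": "postgres"}]


def extract_from_gcs() -> List[Row]:
    return [{"id": 3, "source": "gcs"}]


def load_to_warehouse(rows: Iterable[Row], warehouse: List[Row]) -> int:
    """Single loader: it only sees the rows it is handed, not where they came from."""
    batch = list(rows)
    warehouse.extend(batch)
    return len(batch)


def run_pipeline(extractors: List[Callable[[], List[Row]]], warehouse: List[Row]) -> int:
    """Pass each extractor's output straight into the one shared loader."""
    return sum(load_to_warehouse(extract(), warehouse) for extract in extractors)


# In-memory list standing in for the destination table (an assumption for this sketch).
warehouse: List[Row] = []
loaded = run_pipeline(
    [extract_from_s3, extract_from_postgres, extract_from_gcs], warehouse
)
print(loaded, [row["source"] for row in warehouse])
```

In an orchestrator that passes data between tasks, each of these functions would roughly correspond to one task, with the extractor's return value handed to the loader as an input.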
t
Thanks!