# prefect-community
Hello Prefects, so Metaflow (by Netflix) was recently released. I skimmed the docs a bit and there are some similarities with Prefect (like having a DAG), as well as some completely different approaches (like how flows are defined with the `next` and `join` member functions of `FlowSpec`). I was wondering what you guys at Prefect think. Competition is always good (IMO) for driving improvements on both sides. Are there any features that might be interesting to port to Prefect?
Hi @David Ojeda! I'm going to cheat a bit and steal @Chris White's answer to a similar question:
• "While Metaflow does have some similarities with Core, it has none with Cloud (no API / no management layer / low visibility / etc.)
• Metaflow appears to be heavily focused on versioning / checkpointing machine learning model builds
• Related to the above point, steps in Metaflow appear to be more of an organizational tool, whereas Prefect Tasks are first class objects. So for example, you can't have a task which only runs when an upstream dependency fails in Metaflow, as a task failure just stops the flow in Metaflow
• No scheduling (so for example you could schedule your Metaflow instances via Prefect)
• Metaflow has more restrictions on the types of data that can be exchanged between steps, and any data exchange is not tracked as a dependency
• appears to only support AWS deployments? (also no dask support in Metaflow as far as I can tell)
I'm sure there are other similarities / differences but those were my initial takeaways from playing around a bit. I would love it if others chimed in with any other observations they find!"
upvote 3
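To make the "task which only runs when an upstream dependency fails" point concrete, here is a minimal pure-Python sketch of the idea. It is a toy runner written for illustration only, not Prefect's or Metaflow's implementation; `run_flow`, the state strings, and the example task names are all hypothetical, and the trigger names are simply chosen to echo Prefect's trigger concept.

```python
from typing import Callable, Dict, List

# Hypothetical trigger functions: each one receives the states of the
# upstream tasks and decides whether the downstream task should run.
def all_successful(upstream_states: List[str]) -> bool:
    return all(state == "success" for state in upstream_states)

def any_failed(upstream_states: List[str]) -> bool:
    return any(state == "failed" for state in upstream_states)

def run_flow(
    tasks: Dict[str, Callable[[], None]],
    upstream: Dict[str, List[str]],
    triggers: Dict[str, Callable[[List[str]], bool]],
) -> Dict[str, str]:
    """Run tasks in insertion order (assumed to be topologically sorted)."""
    states: Dict[str, str] = {}
    for name, fn in tasks.items():
        upstream_states = [states[u] for u in upstream.get(name, [])]
        trigger = triggers.get(name, all_successful)  # default trigger
        if not trigger(upstream_states):
            states[name] = "skipped"
            continue
        try:
            fn()
            states[name] = "success"
        except Exception:
            # A failure is recorded as a state instead of stopping the flow.
            states[name] = "failed"
    return states

def extract() -> None:
    raise RuntimeError("source unavailable")

def load() -> None:
    pass

def alert() -> None:
    pass

states = run_flow(
    tasks={"extract": extract, "load": load, "alert": alert},
    upstream={"load": ["extract"], "alert": ["extract"]},
    triggers={"alert": any_failed},  # alert runs only if extract failed
)
print(states)
```

Here `extract` fails, `load` is skipped by the default trigger, and `alert` still runs because its trigger fires on upstream failure. A runner that stops at the first failure has no way to express the `alert` task.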
My quick analysis is that Metaflow is, in my opinion, more a packager than a DAG automation tool, although it has a few DAG features. Also, its approach with conda and version bundling per step is in my opinion not great, as it spreads imports all over the Python file (not compliant with PEP 8). I think it's OK for some very simple DAGs (mostly linear sequences of steps), but it doesn't have most of the features that make prefect-core stand out. I have been following the conda discussion quite a bit, and conda manipulation at this level sounds great, but in reality cloning, building and switching environments is a costly operation. In most cases it makes more sense just to rationalize the conda packages at the beginning of your flow, especially if the granularity of the tasks is very small ... There are probably more differences, but after a more detailed analysis the two tools look very different to me. I mean, it's very simple to add a `@task` or a decorator to your library, but that does not mean that you have the same sort of sophistication as full-fledged DAG management libraries. Just check the code of the two packages on GitHub to verify this statement. That said, you may use Metaflow if you like the paradigm and if you think it's a good fit for your pipeline, but I would recommend Prefect for anything else...
👍 1
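As a rough illustration of the "very simple to add a `@task` decorator" remark, a decorator that merely registers functions fits in a few lines. This is a hypothetical toy (the `REGISTRY` list and this `task` function are made up here and are not Prefect's `@task`); it has no dependency tracking, states, retries, triggers, or scheduling, which is where the real work in a DAG library lies.

```python
from typing import Callable, List

REGISTRY: List[str] = []  # hypothetical global list of registered task names

def task(fn: Callable) -> Callable:
    """Toy decorator: records the function name and returns it unchanged."""
    REGISTRY.append(fn.__name__)
    return fn

@task
def clean() -> str:
    return "cleaned"

@task
def train() -> str:
    return "trained"

# The decorator only collected names; nothing here knows that `train`
# should run after `clean`, or what to do if `clean` fails.
print(REGISTRY)
```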