# prefect-community
c
Guys, we need to evaluate Airflow vs Prefect vs Dagster. Which one should we go for?
m
Really depends on your use case, needs, and team skills, I would say… I don’t know Dagster too well. Airflow is a well-established tool, but it was also built in the Hadoop era of long-running jobs that needed to be stitched together and scheduled, so it doesn’t always fit modern standards (again, depends on your use case). Prefect can be seen as a modernised version of Airflow that solves many of its issues.
upvote 3
Anyway, that's just my two cents…
c
The major issue we might face is communicating data between tasks, as we are heavily using dbt for transformation, and basically the complete lineage part.
a
@chris evans If you want to see how to use Prefect together with Monte Carlo data lineage (and dbt), I'll be doing a demo about that together with @alex tomorrow at 2p Eastern. You can sign up for that live workshop here.
❤️ 1
m
@chris evans in that case, Airflow is off the table as it doesn’t support data communication between tasks. Both Prefect and Dagster support this. The immediate advantage of Prefect over Dagster is that it has a Cloud offering (it’s on the roadmap for Dagster, but I’m not sure if it’s already there).
upvote 3
c
Thanks for the responses, guys. I'd love to see more information on how we can use dbt, Databricks, and Prefect as the orchestrator.
a
Check out this blog post for Databricks: https://towardsdatascience.com/tutorial-integrating-prefect-databricks-af426d8edf5c and for dbt, we have several blog posts about it listed here
c
Thanks @Anna Geller
👍 1
@Anna Geller I just missed asking about this: does Airbyte support the data ingestion part? Let's say we have a MySQL prod DB and I want to run a bunch of Airbyte pipelines (using Spark for larger data processing) and dump the data in partitioned Parquet format into one of our S3 buckets. Can I do this using Airbyte?
Because in my past experience, what I did was basically enable binlogs, and by reading the CDC sequence we captured all new or updated rows using NiFi, while Spark-on-EMR applications read this data and dumped it into S3. Can Airbyte read all the new/updated data and dump it into S3 by running some Spark? Please let me know @Anna Geller
a
not sure you are in the right Slack 😄 you are asking this in the Prefect Community. Do you know that Airbyte has a Slack community as well?
c
shoot!
Sorry :P my bad
a
no worries, all good 🙂 We do have an Airbyte integration, so if you have any questions about how to integrate your specific Airbyte sync with Prefect, we can definitely help. But for a deep dive on Airbyte, their Slack can help more.
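For reference, triggering an Airbyte connection sync from a Prefect task usually comes down to one call against Airbyte's API. A minimal stdlib-only sketch, assuming a default local Airbyte deployment (the base URL, port, and connection ID below are placeholders, not real values):

```python
import json
import urllib.request

# Assumption: Airbyte's API is reachable at its default local address.
AIRBYTE_API = "http://localhost:8000/api/v1"

def sync_payload(connection_id: str) -> bytes:
    """Build the JSON body Airbyte expects for a sync request."""
    return json.dumps({"connectionId": connection_id}).encode()

def trigger_sync(connection_id: str) -> dict:
    """POST to the connections/sync endpoint to kick off a sync run."""
    req = urllib.request.Request(
        f"{AIRBYTE_API}/connections/sync",
        data=sync_payload(connection_id),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())
```

In practice you would wrap `trigger_sync` in a Prefect `@task` (or use the Prefect–Airbyte integration directly) so the sync participates in your flow's scheduling and retries.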