
Ajith Kumara Beragala Acharige Lal

12/22/2020, 4:04 PM
Hi Prefect experts, I want to try out Prefect in our existing ETL pipeline for scheduling and Spark job management. I know Prefect is well suited to Python-based scripts (e.g. PySpark); would it support Spark Scala/Java jobs as well? Our ETL is mainly built with Scala Spark jobs. Are there any examples or documents related to this? Thank you in advance! 🙏 (Sorry if this is a duplicate question.)

Jim Crist-Harif

12/22/2020, 4:08 PM
Hi Ajith, Prefect can run jobs of any type. For launching things other than Python tasks, you'd need some way to kick off a Spark job from Python or using a shell script (and our ShellTask). This might call spark-submit or something else.
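
A minimal sketch of what such a wrapper might look like, using only the Python standard library. The helper names (build_spark_submit, run_spark_job), the class name, the jar path, and the job arguments are all hypothetical and only for illustration; the --class and --master options are standard spark-submit flags.

```python
import shlex
import subprocess
from typing import List, Optional


def build_spark_submit(
    main_class: str,
    jar_path: str,
    master: str = "yarn",
    app_args: Optional[List[str]] = None,
) -> List[str]:
    """Build the argument list for a spark-submit call that launches a Scala/Java job."""
    cmd = ["spark-submit", "--class", main_class, "--master", master, jar_path]
    # Anything after the jar path is passed through to the job's main method.
    cmd.extend(app_args or [])
    return cmd


def run_spark_job(cmd: List[str]) -> None:
    """Run the command and raise CalledProcessError if it exits with a nonzero code."""
    subprocess.run(cmd, check=True)


# Hypothetical class name, jar path, and arguments (assumptions, not from the thread).
cmd = build_spark_submit(
    "com.example.etl.DailyLoad",
    "/opt/jobs/etl-assembly.jar",
    app_args=["--date", "2020-12-22"],
)

# Print the shell form of the command instead of running it here.
print(shlex.join(cmd))
```

The string printed by shlex.join could probably be used as the command for a ShellTask, or run_spark_job could be called from inside a regular Python task, so that a failed Spark job surfaces as a failed task.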

Ajith Kumara Beragala Acharige Lal

12/22/2020, 4:13 PM
Thank you @Jim Crist-Harif for the quick response! If I understood you correctly, I need some kind of wrapper that can trigger Scala Spark jobs from Prefect calls? Is my understanding correct?

Jim Crist-Harif

12/22/2020, 4:14 PM
Yes, that's correct.

Ajith Kumara Beragala Acharige Lal

12/22/2020, 4:14 PM
alright! thank you very much!