An Hoang

08/29/2019, 12:48 PM
Quick question: if I build a flow that takes all the CSVs in a directory and generates a report, is there any way Prefect could detect a new CSV being added and rerun the pipeline? Or is scheduling it to run every couple of hours the way to go (not reactive)? Also, how would you structure and link a parameter file with each CSV (some CSVs may share the same parameter file) so the Flow is parameterized to process each one a certain way?

Jeremiah

08/29/2019, 12:56 PM
That is a good question. Today, Flows are kicked off on an ad-hoc basis or via schedule. We have published a draft spec for a new kind of event-driven flow that we’re calling a “Listener Flow” which would be perfect for your situation, and we’ll be taking that up probably next quarter. https://docs.prefect.io/guide/PINs/PIN-08-Listener-Flows.html
One way to work around this at the moment would be to create a flow with a single task that contains a while loop and launches a second flow any time it finds a new file, but that may be overkill for what you’re trying to achieve.
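A minimal sketch of the polling loop described above, using only the Python standard library. The names `find_new_csvs`, `watch`, `on_new`, `poll_seconds`, and `max_polls` are hypothetical and not part of Prefect; `on_new` stands in for whatever launches the second flow.

```python
import time
from pathlib import Path
from typing import Callable, List, Optional, Set


def find_new_csvs(directory: Path, seen: Set[Path]) -> List[Path]:
    """Return CSV files not seen on earlier calls, and record them in `seen`."""
    current = sorted(p for p in directory.glob("*.csv") if p.is_file())
    new = [p for p in current if p not in seen]
    seen.update(new)
    return new


def watch(
    directory: Path,
    on_new: Callable[[Path], None],
    poll_seconds: float = 60.0,
    max_polls: Optional[int] = None,
) -> None:
    """Poll `directory` and call `on_new` once for each newly found CSV.

    `on_new` is a placeholder (assumption) for the code that would
    launch the second flow. `max_polls=None` loops forever.
    """
    seen: Set[Path] = set()
    polls = 0
    while max_polls is None or polls < max_polls:
        for path in find_new_csvs(directory, seen):
            on_new(path)
        polls += 1
        # Sleep only if another poll is coming.
        if max_polls is None or polls < max_polls:
            time.sleep(poll_seconds)
```

This version tracks files by path only, so a CSV that is overwritten in place would not trigger a rerun; comparing modification times would be one way to cover that case.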

An Hoang

08/29/2019, 1:00 PM
Thanks @Jeremiah! I heard a bit about the Event Driven Flow from the data engineering podcast, and then also the "environment for each task" feature sounds really cool. Excited to see!

Jeremiah

08/29/2019, 1:03 PM
Lots and lots and lots coming down the pipe 🙂