
Andrey Alekseev

06/18/2020, 8:40 AM
Hey, guys! I'm not sure I understand this. I'm sure that Prefect can work with loops, but I don't understand how Prefect can help me organize my current pipeline. I work on time series forecasting, and for some problems I need to iteratively predict future timestamps, like in the pic. Can I make Prefect run that loop while keeping my feature generator functions and my model as separate tasks?

emre

06/18/2020, 12:57 PM
AFAIK there is no way, and I doubt there will be. Prefect, like most other workflow orchestrators, is built around the concept of a Directed Acyclic Graph (DAG). In your case, separating generation and prediction and then feeding the prediction back into generation breaks the acyclic property: for both of these tasks, the upstream task is also the downstream task. The hierarchy between the tasks therefore becomes ambiguous, which Prefect does not support. Two ideas: you can either define a set number of feature and predict tasks and chain them together, effectively unrolling the loop, or you can combine the generation and prediction tasks into one task and loop that task using the LOOP signal. Personally, I prefer task looping. Given how co-dependent feature and predict are, separating them isn’t worth the trouble. https://docs.prefect.io/core/examples/task_looping.html
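To make the task-looping idea concrete, here is a minimal, library-free sketch of the mechanism. It is not Prefect's implementation: `LoopSignal`, `run_with_looping`, `forecast_step`, `make_features`, and `predict` are hypothetical names, and the "model" is a toy mean over the last three values. In Prefect itself the signal would be the LOOP signal described in the linked task-looping docs, and the runner would be Prefect's own task runner.

```python
from typing import Callable, List, Optional


class LoopSignal(Exception):
    """Hypothetical stand-in for a 'run this task again' signal."""

    def __init__(self, result: dict):
        super().__init__("loop")
        self.result = result  # state carried into the next iteration


def make_features(series: List[float], n_lags: int = 3) -> List[float]:
    # Toy feature generator: the last n_lags values of the series.
    return series[-n_lags:]


def predict(features: List[float]) -> float:
    # Toy model: the mean of the lag features.
    return sum(features) / len(features)


def forecast_step(history: List[float], horizon: int,
                  loop_state: Optional[dict] = None) -> List[float]:
    # One iteration of the combined feature + predict task.
    state = loop_state or {"series": list(history), "steps_done": 0}
    if state["steps_done"] >= horizon:
        # Done: return only the newly predicted values.
        return state["series"][len(history):]
    next_value = predict(make_features(state["series"]))
    # Ask the runner to call this task again with the extended series.
    raise LoopSignal({
        "series": state["series"] + [next_value],
        "steps_done": state["steps_done"] + 1,
    })


def run_with_looping(task: Callable[..., List[float]], *args) -> List[float]:
    # Minimal runner: re-invoke the task until it returns instead of looping.
    loop_state = None
    while True:
        try:
            return task(*args, loop_state=loop_state)
        except LoopSignal as signal:
            loop_state = signal.result


forecast = run_with_looping(forecast_step, [1.0, 2.0, 3.0], 2)
```

Each prediction is appended to the series before the next iteration, so the feature generator always sees its own earlier outputs. That feedback is the cycle the DAG cannot express between two separate tasks, which is why it lives inside a single task here.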
👀 1

Jeremiah

06/18/2020, 1:08 PM
Never say never @emre ;) but it’s true that, at this time, Prefect only supports DAGs. We supply the LOOP construct to loop inside a specific task, just as you described.
@Andrey Alekseev I spent much of my career as an ML researcher focused on time series (usually in finance), so your question is actually very near to my heart. Prefect’s predecessor (“Scarecrow”) was an ML orchestrator that DID support cyclical graphs. As @emre said, most of the current crop of workflow managers only support DAGs because, to be honest, it’s a lot easier. However, the cyclical use case is one I’ve had in mind because it is evident that the ML world is rapidly moving in that direction (whether through time series analytics or iterative approaches like those at the heart of GANs), and I intend for Prefect to be a full partner in that workflow. Stay tuned!
👀 2

Andrey Alekseev

06/18/2020, 1:18 PM
Thank you!
👍 1

emre

06/18/2020, 1:18 PM
@Jeremiah Hell yeah, why stop at DAGs? Let's write a dynamic-heap orchestrator. Jokes aside, cyclical graph support sounds amazing, even if only for a subset of a bigger DAG.
:marvin: 2

Jeremiah

06/18/2020, 1:20 PM
In the meantime, if LOOP signals inside a single task don’t work for you, you can simulate Prefect-governed looping by using Prefect’s orchestration backend. Your feature generation and training tasks could live inside a separate Flow, and you could have a task kick off that external flow via the API as many times as you want. This would essentially amount to a small Python function looping continuously inside Orchestrator-Flow that kicks off the Feature-and-Train flow via the API with a parameter containing the data, retrieves its result (also via the API), and loops appropriately. You would get all the benefits of Prefect introspection and retries, while also creating a cyclical dependency. I wouldn’t say this is something we see especially often (@emre’s approach is more common), but whatever meets your unit of work is the best option.
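The shape of that orchestrator loop can be sketched in plain Python. Everything here is an assumption for illustration: `orchestrate` stands for the looping function inside Orchestrator-Flow, and `run_child_flow` is a hypothetical callable that would wrap the real API calls (start a Feature-and-Train run with parameters, wait for it, fetch its result). A local fake takes its place so the sketch runs without a backend.

```python
from typing import Callable, Dict, List


def orchestrate(history: List[float], horizon: int,
                run_child_flow: Callable[[Dict], Dict]) -> List[float]:
    # Looping function that would live inside Orchestrator-Flow.
    series = list(history)
    for _ in range(horizon):
        # Hypothetical: one Feature-and-Train flow run per forecast step,
        # with the current series passed in as a parameter.
        result = run_child_flow({"series": series})
        # Feed the prediction back in for the next run.
        series.append(result["prediction"])
    return series[len(history):]


def fake_feature_and_train_flow(parameters: Dict) -> Dict:
    # Local stand-in for the child flow: mean of the last three values.
    window = parameters["series"][-3:]
    return {"prediction": sum(window) / len(window)}


predictions = orchestrate([1.0, 2.0, 3.0], 2, fake_feature_and_train_flow)
```

Passing the flow runner in as a callable keeps the loop logic testable on its own. In a real deployment that callable is where the API calls, polling, and error handling would go.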

Andrey Alekseev

06/18/2020, 2:20 PM
Cool! Thanks! I'll definitely try both)