Question regarding task scheduling and execution. ...
# ask-community
i
Question regarding task scheduling and execution. I’m using
RayTaskScheduler
, but it could be applied to any, I suppose. Assume there is only one worker (node, process, or whatever fits the
Scheduler
class), and the computation graph consists of multiple identical branches (
A -> B -> C
x3). Is there a way to tell Prefect to prefer to go DFS and complete one such branch, before attempting to continue to the other branches? I’m aware of the
wait_for
argument, is there anything else one can do?
n
something like this?
Copy code
In [4]: from prefect import flow, task

In [5]: @task
   ...: def a():
   ...:     pass
   ...:

In [6]: @task
   ...: def b():
   ...:     pass
   ...:

In [7]: @task
   ...: def c():
   ...:     pass
   ...:

In [8]: @flow
   ...: def branch():
   ...:     a_result = a()
   ...:     b_result = b(wait_for=[a_result])
   ...:     c(wait_for=[b_result])
   ...:

In [9]: @flow
   ...: def foo():
   ...:     for _ in range(3):
   ...:         branch()
   ...:

In [10]: foo()
where you'd want to use
.submit()
to leverage your task runner, but otherwise should be the same, and if you pass data between a, b and c then you dont need to explicitly use
wait_for
i
Thanks @Nate! Sub-flows seem like the way to go for now, at least while creating this new flow. If I understand correctly, a call to a subflow is a blocking one, so this strictly forces a one-at-a-time behaviour. The end goal would be to distribute each branch onto a different worker node, let them run in parallel, and as a worker becomes free (finished a branch), it would pick up a new one.
n
if you're interested in a multi-node setup then I'd check this out
🙌 1
branch
could instead be its own deployment that you could create flow runs for with
run_deployment
and then your prefect worker would pick up those flow runs and execute them
👀 1
i
Dang, I was hoping to avoid the asyncio world, but that seems to be the way to go! On a completely tangent note — and I’ve asked this from the Ray community as well, figured why not here — my future “subflow” runs very nicely in Prefect using the RayTaskScheduler. If I try to compose the subflow as a Ray actor, I get OOM. Do you have any idea how those might be so fundamentally different that I’ll get an OOM in one and not the other?
And thanks, the pokemon example is great (and nostalgic)!
👍 1