Arsenii 04/16/2020, 4:32 AM
, and then maps 5 tasks over that list. Pretty straightforward, until the part where objects include information about what other objects (A dash) they depend on... And those objects have to be processed and mapped over the same tasks as . This can go several layers deep, with dependencies that need to be taken care of dynamically. The most naive solution is just to insert the dependencies at the beginning of the original list, making a kind of priority queue, and map the tasks over it. However, this would not work with a DaskExecutor, since everything runs in parallel. What I guess I need here is "sub-flows" that can be mapped over a list of lists. There seems to be some discussion of this at https://github.com/PrefectHQ/prefect/issues/1745 , but since it's far from release, do y'all think something similar can be hacked together now? Thanks!!
Jeremiah 04/16/2020, 11:52 AM
2. a second task maps over that list, and for each item in retrieves and returns all other items A, A', A'', ... that are required to resolve it. This could be done recursively or with Prefect’s operator. The result of this map operation is a list of lists, where each sub-item contains all dependencies for the original objects
3. a third task maps over the second task. Its input is a list of [A, A', A'', ...] dependencies and it returns a final result.
4. The output of the third task is therefore a list of processed items, including data from dynamically discovered dependencies.
In general, Prefect will work best when you know the logical graph structure in advance. Therefore, a broad strategy for effective flows with dynamic dependencies is to include tasks in your graph that load the dynamic data, but which allow you to know the graph structure ahead of time.
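A minimal plain-Python sketch of steps 2 to 4, to make the "list of lists" shape concrete. It does not use Prefect at all: each list comprehension only stands in for a mapped task, and `DEPS`, `resolve`, and `process` are hypothetical names invented for illustration, not anything from the thread or the library.

```python
# Hypothetical dependency data (an assumption): each object maps to the
# objects it depends on. A real flow would discover this at run time.
DEPS = {
    "A": ["A'"],
    "A'": ["A''"],
    "A''": [],
    "B": ["B'"],
    "B'": [],
}


def resolve(item):
    """Step 2: return [item, dependency, dependency-of-dependency, ...]."""
    resolved, stack = [], [item]
    while stack:
        current = stack.pop()
        if current not in resolved:
            resolved.append(current)
            # Push in reverse so dependencies are visited in declared order.
            stack.extend(reversed(DEPS.get(current, [])))
    return resolved


def process(group):
    """Step 3: turn one item plus its dependencies into a final result."""
    root, *dependencies = group
    return {"item": root, "used": dependencies}


roots = ["A", "B"]                              # the list the second task maps over
groups = [resolve(item) for item in roots]      # step 2: a list of lists
results = [process(group) for group in groups]  # steps 3 and 4
```

The graph shape here is fixed (one list, two maps) even though the contents of each sub-list are only known at run time, which is the point of the strategy described above.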
Arsenii 04/17/2020, 2:01 AM
must be processed before ) Basically I need to create a tree of dependencies and traverse it in depth-first postorder. The "one list with all objects" approach does not play well with parallelism, since I need to complete some tasks (always the same) on one level before going to the next... And that's where a sub-flow with those tasks, or a task-looping mechanism, would have helped.
Jeremiah 04/17/2020, 3:35 AM
[A, A', A''] and the second element is [B, B', B''] and you could process those in any order, dynamically, by mapping over the parent list. However, your primary motivation of depth-first runtime dependency discovery is one that we don’t have first-class support for; I suspect that in order to use Prefect you may have to move some dependency resolution logic into your tasks themselves.
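One way to read "move some dependency resolution logic into your tasks themselves" is sketched below, under stated assumptions. Each mapped unit of work resolves its own chain in depth-first postorder (dependencies before dependents) and processes it sequentially, while the chains themselves run in any order. `ThreadPoolExecutor` is only a standard-library stand-in for a parallel executor such as the DaskExecutor, and `DEPS`, `postorder`, and `run_chain` are hypothetical names, not Prefect API.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical dependency data (an assumption); a real task would look
# this up at run time.
DEPS = {
    "A": ["A'"], "A'": ["A''"], "A''": [],
    "B": ["B'"], "B'": ["B''"], "B''": [],
}


def postorder(item, seen=None):
    """Return item and everything it needs, with dependencies first."""
    seen = set() if seen is None else seen
    if item in seen:  # already visited (also guards against cycles)
        return []
    seen.add(item)
    order = []
    for dep in DEPS.get(item, []):
        order.extend(postorder(dep, seen))
    order.append(item)  # the item itself comes after its dependencies
    return order


def run_chain(root):
    """One mapped unit of work: resolve and process a whole chain in order."""
    done = []
    for item in postorder(root):
        done.append(item)  # placeholder for the real per-object tasks
    return done


with ThreadPoolExecutor() as pool:
    # Chains are independent of each other, so they can run in parallel.
    chains = list(pool.map(run_chain, ["A", "B"]))
```

The trade-off is that ordering inside a chain is handled by ordinary Python rather than by the flow graph, so the orchestrator only sees one mapped task per top-level object.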