# prefect-community
i
Hello everyone, I am a machine learning engineer. Congrats on your amazing engine! I am migrating some workflows to Prefect and have run into the following problem: suppose I have two flows already registered and I want to create a flow that combines them, following https://docs.prefect.io/core/idioms/flow-to-flow.html, but I want to pass the output of a specific task in the first flow as a Parameter of the next flow (see snippet). Is this possible, and if so, how?
e
It seems possible. Check out `parameters` over at https://docs.prefect.io/api/latest/tasks/prefect.html#startflowrun. I think that, rather than keeping both flows under a parent flow, your flow_a should end with a final `StartFlowRun` task that schedules flow b with any parameters that come from upstream tasks in flow a.
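```python
from prefect import Flow
from prefect.tasks.prefect import StartFlowRun

with Flow('first') as flow:
    a = task_a()
    b = task_b(a)
    c = task_c(b)
    # Final task: start flow 'second', passing c as its "input" parameter
    StartFlowRun(flow_name='second', project_name=...)(parameters={"input": c})
```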
i
Hi @emre, thanks for your response. This solution, however, requires changes to flow_a. I am looking for a way to build something on top of these flows so that they remain unaffected.
I guess the actual question is whether it is possible to inspect or access the output of a flow's task when running the flow via the StartFlowRun task. If so, I can pass that output as a parameter to the next flow.
e
Oh, I see. I can't guarantee that my idea will work, since I don't have much experience with Prefect Server and the GraphQL API. Here goes: your `StartFlowRun(a)` should wait for the flow to finish. Then the `StartFlowRun` task will give you a flow run id, which you can use to query the GraphQL API. Prefect Server does not hold your task results directly; it stores references to them, such as an S3 file path if you are storing task results in S3. If you can find out where Prefect stores the result of the task you need, you can write some custom tasks to fetch that result. It's clunky, and I don't know if everything you need is there, but that's my idea 😅
I found that a column named `state_result` exists in the `task_run` table of my local Prefect Server's Postgres instance. Since I have never used results, all my values are null, but that column should contain the result you need. Now you need to somehow access that column via GraphQL.
🙌 1
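A minimal sketch of the lookup described above: building a GraphQL query that asks for the task runs of one task within a given flow run. The helper name `build_task_run_query` is hypothetical, and the field and filter names (`task_run`, `flow_run_id`, `task.name`, `serialized_state`) are assumptions about the server's Hasura-style schema, so verify them against your own server before relying on this.

```python
import json


def build_task_run_query(flow_run_id: str, task_name: str) -> str:
    """Return a GraphQL query for the task runs of one task in one flow run.

    ASSUMPTION: the field and filter names below (task_run, flow_run_id,
    task.name, serialized_state) follow the Hasura-style schema of Prefect
    Server; check them against your server's interactive API before use.
    """
    # json.dumps adds the surrounding quotes and escapes embedded quotes
    return (
        "query {\n"
        "  task_run(where: {\n"
        f"    flow_run_id: {{_eq: {json.dumps(flow_run_id)}}},\n"
        f"    task: {{name: {{_eq: {json.dumps(task_name)}}}}}\n"
        "  }) {\n"
        "    id\n"
        "    serialized_state\n"
        "  }\n"
        "}"
    )


# Print the query for a placeholder flow run id and task name
print(build_task_run_query("<flow-run-id>", "task_c"))

# Usage against a running server (requires prefect; not executed here):
#   from prefect import Client
#   response = Client().graphql(build_task_run_query(flow_run_id, "task_c"))
```

Inside a parent flow, this lookup would presumably live in a custom task placed downstream of the `StartFlowRun` task for flow a, so that the flow run id is available when the query is built.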
i
Yes, I think that's a possible workaround. Thanks @emre
Fyi, `serialized state` in task run contains all the necessary information (the location of the result, etc.).
👍 2
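A rough sketch of pulling the result location out of a task run's serialized state once it has been fetched. The layout used here (a `_result` entry holding a `location`) is an assumption and may differ between Prefect versions. The helper `result_location` and the example payload are hypothetical.

```python
from typing import Any, Dict, Optional


def result_location(serialized_state: Dict[str, Any]) -> Optional[str]:
    """Return the stored result location from a serialized task run state.

    ASSUMPTION: the state carries its result reference under "_result" with a
    "location" key; this layout is hypothetical and may differ by version.
    """
    # Treat a missing or null "_result" as "no stored result"
    result = serialized_state.get("_result") or {}
    return result.get("location")


# Hypothetical payload, shaped like the assumption above
example_state = {
    "type": "Success",
    "_result": {"type": "S3Result", "location": "flow-a/task-c/result.prefect"},
}
print(result_location(example_state))
print(result_location({"type": "Pending"}))
```

The location is only a reference. Reading the actual value still requires loading it from wherever the flow's results are stored (for example S3), and the loaded value can then be passed as a parameter to the next flow.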