Hi everyone, I am currently splitting a giant DAG into logically separate flows, but I am running into an issue. I have two flows, and one task in Flow2 needs the output from a task in Flow1. Does anyone know the easiest way to do this?
Kevin Kho
06/30/2021, 9:07 PM
Hey @Dan Zhao, if it’s small, I suggest using the KV Store to persist that and then retrieve it later. The KV Store can hold key-value pairs up to a maximum of 10 KB. If not through the KV Store, you would need to use the GraphQL API to query those results. We have a release coming soon, 0.15.0 (maybe by tomorrow), that will make this process easier without having to use either of these. It will make those GraphQL API calls for you.
Many thanks Kevin!
The output is fairly large (1 MB to 10 MB). If I go the GraphQL route, will this still observe the dependencies and cache validators?
Kevin Kho
06/30/2021, 9:19 PM
Between flows there is no way to pass things in memory, so you have to persist the output somewhere to be loaded later. The Result interface does that: it persists the value at a location for you. The GraphQL API would then be used to retrieve the location of this result so that it can be loaded.
Kevin Kho
06/30/2021, 9:19 PM
Even using 0.15.0, you would need to write that somewhere to be available for a later subflow.
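For readers following along, here is a rough sketch of the persist-then-load pattern described above. It uses only the Python standard library and is not Prefect's API: the `kv_store` dict, the `flow1_task` and `flow2_task` functions, and the file name are all hypothetical stand-ins. The idea it illustrates is that the large output goes to shared storage, while only its small location string is handed over between flows.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical stand-in for a small key-value store shared by both flows.
# A plain dict is used only so that this sketch runs on its own.
kv_store = {}

KV_LIMIT_BYTES = 10 * 1024  # the 10 KB limit mentioned in the thread


def flow1_task(shared_dir: Path) -> None:
    """Produce a large output, persist it, and record only its location."""
    output = {"rows": list(range(100_000))}  # placeholder for the real result
    location = shared_dir / "flow1_output.json"
    location.write_text(json.dumps(output))

    pointer = str(location)
    # The payload is too big for the store, but the pointer should be tiny.
    if len(pointer.encode("utf-8")) > KV_LIMIT_BYTES:
        raise ValueError("location string exceeds the key-value size limit")
    kv_store["flow1_output_location"] = pointer


def flow2_task() -> dict:
    """Look up where Flow1 wrote its output, then load it from there."""
    location = Path(kv_store["flow1_output_location"])
    return json.loads(location.read_text())


# Run the two "flows" one after the other against a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    flow1_task(Path(tmp))
    loaded = flow2_task()
    print(len(loaded["rows"]))
```

In a real deployment the temporary directory would be replaced by storage that both flows can reach, and the dict by whatever mechanism carries the location between them.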
Dan Zhao
06/30/2021, 9:20 PM
Got it - many thanks Kevin! I'll give it a try
Dan Zhao
06/30/2021, 9:20 PM
Is there a planned release date for 0.15.0?
Dan Zhao
06/30/2021, 9:21 PM
I am wondering whether I should learn the GraphQL way or wait for the release.