Prefect Community #ask-community

Andreas

05/30/2022, 3:16 PM

Hi! I would like to ask about a case with multiple return values from a Task. When I specify the flow edge manually by using `flow.add_edge(upstream_task, downstream_task, key)`, is there any way to specify which of the multiple return values from the upstream task should pass to the downstream task?

Anna Geller

05/30/2022, 3:51 PM

Why would you want to do it this way? This method is (to the best of my knowledge) relevant only for the imperative API, which no longer exists in Prefect 2.0 - I would be curious to hear more about your use case and why you are trying to set dependencies that way

Anna Geller

05/30/2022, 3:55 PM

At which stage are you in your Prefect adoption - are you already a long-term user or just getting started with Prefect?

Kevin Kho

05/30/2022, 4:10 PM

I would not recommend the imperative API in general. It’s a lot harder, and I’ve yet to see a use case that really needs it

👍 1

Andreas

05/31/2022, 9:47 AM

In short, I want to specify the DAG, not the individual tasks. How do I do that in Prefect 2.0? I don't want to type new code each time there is a tweak in the DAG. So I use an automated procedure that sets the dependencies by using the API to create the flows. This provides a more general approach where the code automatically creates the flows based on external info provided (a DAG with the tasks and their edges). I have already done some work in Prefect 1.0, mostly using the imperative API

Anna Geller

05/31/2022, 9:55 AM

> by using the API to create the flows

you can think of Python as an API for Prefect 2.0 - instead of having a hard-coded set of dependencies defined at build time, you can have a dynamically defined computational graph by doing nothing more than updating your code

Andreas

06/01/2022, 11:53 AM

I understand that this way is useful for many users, but not in this case. Due to the nature of the project I already have a DAG representation in JSON format of a great many workflows. As a result I have created functions that take the DAG representation as input and build the flows using `flow.add_edge`. In this way I avoid manually writing the code to define each flow. I want to transition to Prefect 2.0, but if there isn't a secondary way to set those dependencies between tasks like in Prefect 1.0, this seems really hard

Anna Geller

06/01/2022, 1:22 PM

> avoid manually writing the code to define each flow

it may be an oversimplification from my side, but I think: you either write Python code or you write a JSON config + a Python file consuming that JSON file. Using Python without a JSON DAG definition makes your architecture a bit simpler, since there is only one place to make changes, and it seems a bit friendlier? It's a matter of personal preference. You can certainly read JSON in your flow code and start there. I think the underlying problem you were trying to solve with the JSON config is reducing boilerplate - would you agree?

Andreas

06/01/2022, 1:42 PM

It is a little bit more complicated than that, but yes, in general Python code consumes a JSON DAG file to create a flow by manually specifying the dependencies between tasks that the JSON describes. Different JSON input can be provided to describe various flows that can leverage a pool of tasks. The underlying problem I was trying to solve was to avoid boilerplate related to defining multiple flows and to avoid writing a ton of control blocks like if/else/for statements. Also, this way it is easy to create DAGs in 3rd party programs that describe the relations between tasks, then extract the DAG and just pass it to the code to create a flow. This gives non-technical people who do not know how to write code (and only know the general functionality of the tasks) the ability to create flows graphically.
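For illustration, a JSON DAG of the kind described here might look roughly like the sketch below. The schema (`tasks`, `edges`, `upstream`, `downstream`, `key`), the task names, and the `TASK_POOL` registry are all assumptions made up for this example; none of them is a Prefect format or API.

```python
import json

# Hypothetical pool of reusable task functions (names invented for this sketch).
def extract():
    return [1, 2, 3]

def total(values):
    return sum(values)

TASK_POOL = {"extract": extract, "total": total}

# Hypothetical JSON DAG: each edge says which keyword argument ("key")
# of the downstream task receives the upstream task's result.
DAG_JSON = """
{
  "tasks": ["extract", "total"],
  "edges": [
    {"upstream": "extract", "downstream": "total", "key": "values"}
  ]
}
"""

def load_dag(text, pool):
    """Parse the JSON and check that tasks and edges refer to known names."""
    dag = json.loads(text)
    unknown = [name for name in dag["tasks"] if name not in pool]
    if unknown:
        raise ValueError(f"tasks not in pool: {unknown}")
    declared = set(dag["tasks"])
    for edge in dag["edges"]:
        for end in ("upstream", "downstream"):
            if edge[end] not in declared:
                raise ValueError(f"edge references undeclared task: {edge[end]}")
    return dag

dag = load_dag(DAG_JSON, TASK_POOL)
```

Validating the file up front means a typo in a graphically exported DAG fails early with a readable message instead of partway through a run.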

Anna Geller

06/01/2022, 1:47 PM

I hope not to disappoint you here, but Prefect is mainly for engineers who are willing to write some code - you won't get away without writing some minimal Python code, there is no support for drag and drop workflow definition - there are other tools that do it and if this aspect is important to you, Prefect may be the wrong tool for the job (to be fully transparent)

Andreas

06/01/2022, 2:10 PM

Well, I know that there is no such functionality in Prefect and I am not asking Prefect to implement such functionality. The drag and drop workflow definition is something external to Prefect that is my responsibility to integrate. I told you about it because you asked about a use case :) The questions I am looking for answers to are:

1. In Prefect 1.0, if I have multiple return values from the upstream task, can I specify them imperatively to the downstream task? The function `flow.add_edge(upstream_task, downstream_task, key)` works fine, but only for single return parameters under the `key` attribute.
2. Is there an alternative way to create dependencies in Prefect 2.0 (or planned), apart from the task parameter `wait_for`, like it is easily done in Prefect 1.0 with `add_edge`?

Anna Geller

06/01/2022, 2:26 PM

> I told you about it because you asked about a use case :)

gotcha, thanks for that

#1 not sure how to do it imperatively, sorry

#2 again here, I won't have any encouraging answer, because this entire imperative API was based on the concept of a build-time DAG, which Prefect 2.0 elevates to a runtime DAG. That's why dependencies are not defined by DAG nodes and edges, but via passing data between tasks + state dependencies via `wait_for`. There is certainly a solution for your problem in Prefect 2.0, but using plain Python rather than a class-based DAG API, e.g. you can add tasks and their dependencies dynamically based on what you read from those JSON files. This could be the first step before you migrate it to be Python code
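A minimal sketch of that idea in plain Python, so it runs without Prefect installed: tasks are called in dependency order and each edge passes data to a named argument of the downstream task. The DAG schema, the optional `index` field for picking one of several return values, the `run_dag` helper, and the task names are all assumptions invented for this example. In Prefect 2.0 the task functions would presumably be decorated with `@task` and `run_dag` called from inside a `@flow` function, so that passing data creates the dependencies as described above.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical tasks; split() has two return values.
def split():
    return 10, 20

def report(low, high):
    return f"{low}-{high}"

TASK_POOL = {"split": split, "report": report}

# Hypothetical DAG, e.g. the result of json.load() on a DAG file.
# "index" (an assumption of this sketch) selects one of several return values.
DAG = {
    "tasks": ["split", "report"],
    "edges": [
        {"upstream": "split", "downstream": "report", "key": "low", "index": 0},
        {"upstream": "split", "downstream": "report", "key": "high", "index": 1},
    ],
}

def run_dag(dag, pool):
    # Map each task to the set of tasks it depends on.
    deps = {name: set() for name in dag["tasks"]}
    for edge in dag["edges"]:
        deps[edge["downstream"]].add(edge["upstream"])

    results = {}
    # static_order() yields every task after all of its upstream tasks.
    for name in TopologicalSorter(deps).static_order():
        kwargs = {}
        for edge in dag["edges"]:
            if edge["downstream"] != name:
                continue
            value = results[edge["upstream"]]
            if "index" in edge:
                value = value[edge["index"]]
            kwargs[edge["key"]] = value
        results[name] = pool[name](**kwargs)
    return results

results = run_dag(DAG, TASK_POOL)
```

Because the graph is walked at run time, changing the JSON changes the workflow without touching the Python code, which is the property the imperative `add_edge` approach was being used for.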

Andreas

06/01/2022, 3:13 PM

Thank you Anna for your time and for answering my questions. I wish you well 😊

🙌 1
