Scott Moreland

11/24/2020, 2:42 PM
Suppose I have a long sequence of tasks within a given flow, where each task reads some persisted data created by the previous task, operates on that data, and persists the resulting output to a database via checkpointing. How can I...
1. Run a single task within the flow and pull its input data from the persisted checkpoint of the previous task, i.e. debug a single task.
2. Have my flow rebuild the persisted output of any given task, i.e. ignore the fact that the checkpoint exists, recompute and overwrite it.
Thanks in advance! Love the package and selling my team hard on it!

Jenny

11/24/2020, 3:57 PM
Hi @Scott Moreland - If you're using Cloud or the Server UI, check out the restart button. It lets you restart a flow run from any failed tasks (on the flow run page) or re-run a specific task (on the task run page).

Scott Moreland

11/24/2020, 6:57 PM
Great, thanks! Any tips for doing this locally as well?

Jenny

11/24/2020, 7:02 PM
When you say locally do you mean using flow.run or are you still registering the flow and using cloud/server?

Scott Moreland

11/24/2020, 7:40 PM
Yep, I mean using flow.run. I'm not registering the flow anywhere, just running it from python main as follows:
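```python
from prefect import Flow

# first_task, second_task and third_task are Prefect tasks defined elsewhere.
with Flow('my flow') as flow:
    first_output = first_task()
    second_output = second_task(first_output)
    third_output = third_task(second_output)

if __name__ == '__main__':
    flow.run()
```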
Wondering if there is a simple way to just run `second_task`, such that it reads `first_output` from the checkpoint (on disk), performs the computation defined by `second_task`, returns `second_output`, and then exits with a success code. I guess more generally, it would be nice to run a contiguous subset of the flow graph by defining start and stop tasks, if such a thing is possible with a DAG.
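To make the request concrete, here is a minimal sketch of the desired behaviour in plain Python. It does not use Prefect's API: `run_subset`, `PIPELINE`, the JSON checkpoint files, and the stand-in task bodies are all hypothetical names chosen for illustration. It runs a contiguous slice of a linear pipeline, loads the input of the start task from the upstream checkpoint on disk, and recomputes and overwrites existing checkpoints only when `force=True`.

```python
import json
from pathlib import Path

# Hypothetical stand-ins for the three tasks in the flow above.
def first_task(_=None):
    return [1, 2, 3]

def second_task(data):
    return [x * 10 for x in data]

def third_task(data):
    return sum(data)

# A linear pipeline: each task consumes the output of the one before it.
PIPELINE = [
    ("first_task", first_task),
    ("second_task", second_task),
    ("third_task", third_task),
]

def run_subset(checkpoint_dir, start, stop, force=False):
    """Run the tasks from `start` to `stop` (inclusive).

    The input of `start` is read from the upstream task's checkpoint file.
    An existing checkpoint is reused unless force=True, in which case the
    task is recomputed and its checkpoint overwritten.
    Returns (final_output, names_of_tasks_actually_executed).
    """
    names = [name for name, _ in PIPELINE]
    lo, hi = names.index(start), names.index(stop)
    if lo > hi:
        raise ValueError("start must not come after stop")

    checkpoint_dir = Path(checkpoint_dir)
    data = None
    if lo > 0:
        # Raises FileNotFoundError if the upstream task has never been run.
        upstream = checkpoint_dir / f"{names[lo - 1]}.json"
        data = json.loads(upstream.read_text())

    executed = []
    for name, fn in PIPELINE[lo:hi + 1]:
        path = checkpoint_dir / f"{name}.json"
        if path.exists() and not force:
            data = json.loads(path.read_text())  # reuse the checkpoint
        else:
            data = fn(data)
            path.write_text(json.dumps(data))    # create or overwrite it
            executed.append(name)
    return data, executed
```

With this sketch, `run_subset(some_dir, "second_task", "second_task")` would debug `second_task` alone, and adding `force=True` would rebuild its checkpoint even if one already exists.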

Jenny

11/24/2020, 8:30 PM
Hi @Scott Moreland - I double-checked this with the core team. Technically it should be possible, but it takes some extra work. You can pass a `state` to `flow.run` that contains the state of a previous flow run that failed (with all of its task states attached). The runner would then rerun only the failed task and the tasks after it, BUT you would first have to take that previous failed state and put it into a Scheduled state, i.e. set the tasks to rescheduled manually. The reason we added that button in the UI is so it can do all that logic for you!
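The logic described above can be sketched in plain Python. This is an illustration of the idea only, not Prefect's runner or state classes: `Status`, `reschedule_failed`, `resume`, and the stand-in tasks are hypothetical. A task whose previous state is successful is skipped and its stored result is reused, while a failed task is not retried until its state has been reset to scheduled.

```python
from enum import Enum

class Status(Enum):
    SUCCESS = "success"
    FAILED = "failed"
    SCHEDULED = "scheduled"

# Hypothetical stand-ins for the tasks of a linear three-step flow.
TASKS = [
    ("first_task", lambda _: [1, 2, 3]),
    ("second_task", lambda data: [x * 10 for x in data]),
    ("third_task", sum),
]

def reschedule_failed(previous):
    """Return a copy of the previous task states with FAILED reset to SCHEDULED."""
    return {
        name: (Status.SCHEDULED, None) if status is Status.FAILED else (status, result)
        for name, (status, result) in previous.items()
    }

def resume(tasks, states):
    """Run a linear pipeline, honouring the task states of an earlier run.

    SUCCESS   -> skip the task and reuse its stored result
    FAILED    -> leave as is and stop (it must be rescheduled first)
    SCHEDULED -> run the task (also the default for tasks with no state)
    Returns a dict mapping task name to (status, result).
    """
    data = None
    new_states = {}
    for name, fn in tasks:
        status, result = states.get(name, (Status.SCHEDULED, None))
        if status is Status.SUCCESS:
            data = result
            new_states[name] = (status, result)
            continue
        if status is Status.FAILED:
            new_states[name] = (status, result)
            break
        try:
            data = fn(data)
            new_states[name] = (Status.SUCCESS, data)
        except Exception:
            new_states[name] = (Status.FAILED, None)
            break
    return new_states
```

Calling `resume(TASKS, previous)` with an untouched failed state would stop at the failed task, whereas `resume(TASKS, reschedule_failed(previous))` would rerun that task and continue forward, which is the manual step the restart button automates.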

Scott Moreland

11/24/2020, 8:39 PM
Great! Thank you for the detailed response. I definitely see the value in the cloud service and hope to be a paying customer soon. Thanks for making such a well-crafted product; I definitely see the value in this over Airflow.

Jenny

11/24/2020, 8:41 PM
Thank you! And just in case it helps, Server (including the Server UI) is open source and the Cloud UI has a developer tier that's also free, so you can always try them out without being a paying customer.