An Hoang
01/05/2022, 7:25 PM
perform_permutation and generate_summary_df are both upstream dependencies of process_report_df. However, the flow fails at process_report_df without ever running the two upstream tasks. The error is due to not having data from upstream task generate_summary_df. The post-hoc state visualization doesn't show the two upstream tasks at all. Why does this happen? I will post the log and task results in the comments.
An Hoang
01/05/2022, 7:31 PM
An Hoang
01/05/2022, 7:36 PM
.....
'get_permute_odds_ratio_object': Starting task run...
[2022-01-05 14:11:47-0500] DEBUG - prefect.TaskRunner | Task 'get_permute_odds_ratio_object': Handling state change from Pending to Mapped
[2022-01-05 14:11:47-0500] INFO - prefect.TaskRunner | Task 'get_permute_odds_ratio_object': Finished task run for task with final state: 'Mapped'
[2022-01-05 14:11:47-0500] INFO - prefect.TaskRunner | Task 'perform_permutation': Starting task run...
[2022-01-05 14:11:47-0500] DEBUG - prefect.TaskRunner | Task 'perform_permutation': Handling state change from Pending to Mapped
[2022-01-05 14:11:47-0500] INFO - prefect.TaskRunner | Task 'perform_permutation': Finished task run for task with final state: 'Mapped'
[2022-01-05 14:11:47-0500] INFO - prefect.TaskRunner | Task 'generate_summary_df': Starting task run...
[2022-01-05 14:11:47-0500] DEBUG - prefect.TaskRunner | Task 'generate_summary_df': Handling state change from Pending to Mapped
[2022-01-05 14:11:47-0500] INFO - prefect.TaskRunner | Task 'generate_summary_df': Finished task run for task with final state: 'Mapped'
[2022-01-05 14:11:47-0500] INFO - prefect.TaskRunner | Task 'process_report_df': Starting task run...
[2022-01-05 14:11:47-0500] DEBUG - prefect.TaskRunner | Task 'process_report_df': Handling state change from Pending to Running
[2022-01-05 14:11:47-0500] DEBUG - prefect.TaskRunner | Task 'process_report_df': Calling task.run() method...
[2022-01-05 14:11:47-0500] ERROR - prefect.TaskRunner | Task 'process_report_df': Exception encountered during task execution!
Traceback (most recent call last):
  File "/lab/corradin_biobank/FOR_AN/OVP/corradin_ovp_utils/.venv/lib/python3.8/site-packages/prefect/engine/task_runner.py", line 876, in get_task_run_state
    value = prefect.utilities.executors.run_task_with_timeout(
  File "/lab/corradin_biobank/FOR_AN/OVP/corradin_ovp_utils/.venv/lib/python3.8/site-packages/prefect/utilities/executors.py", line 467, in run_task_with_timeout
    return task.run(*args, **kwargs) # type: ignore
  File "/lab/corradin_biobank/FOR_AN/OVP/corradin_ovp_utils/corradin_ovp_utils/prefect_flows/step2.py", line 175, in process_report_df
    all_report_df_melted = pd.concat( previous_steps_result_files + report_df_melted_list)
  File "/lab/corradin_biobank/FOR_AN/OVP/corradin_ovp_utils/.venv/lib/python3.8/site-packages/pandas/util/_decorators.py", line 311, in wrapper
    return func(*args, **kwargs)
  File "/lab/corradin_biobank/FOR_AN/OVP/corradin_ovp_utils/.venv/lib/python3.8/site-packages/pandas/core/reshape/concat.py", line 294, in concat
    op = _Concatenator(
  File "/lab/corradin_biobank/FOR_AN/OVP/corradin_ovp_utils/.venv/lib/python3.8/site-packages/pandas/core/reshape/concat.py", line 351, in __init__
    raise ValueError("No objects to concatenate")
ValueError: No objects to concatenate
[2022-01-05 14:11:47-0500] DEBUG - prefect.TaskRunner | Task 'process_report_df': Handling state change from Running to Failed
[2022-01-05 14:11:47-0500] INFO - prefect.TaskRunner | Task 'process_report_df': Finished task run for task with final state: 'Failed'
....
[2022-01-05 14:11:47-0500] DEBUG - prefect.FlowRunner | Flow 'OVP_step2': Handling state change from Running to Failed
An Hoang
01/05/2022, 7:37 PM
Kevin Kho
Kevin Kho
flow.run()?
An Hoang
01/05/2022, 7:39 PM
flow.run() right now
Kevin Kho
An Hoang
01/05/2022, 7:42 PM
Kevin Kho
Kevin Kho
.map() with a string input like .map("test") which can treat this as a list of 4 letters. You need an explicit unmapped for that
An Hoang
01/05/2022, 9:39 PM
assert list_to_be_mapped in each of the tasks? What if your flow has tons of mapped tasks?
Kevin Kho
map normally suggests dynamism, so you don't know the length ahead of time. State handlers are also run per mapped item, so I'm not seeing a good place to put an assert other than the downstream task, which errors as a result
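To make the last two points concrete, here is a minimal plain-Python sketch, not Prefect code and not the code from this flow. It shows why a string passed to a map can fan out into one item per character, and what a check inside the downstream task might look like. The helper name require_non_empty is hypothetical, and the name report_df_melted_list is borrowed from the traceback above purely for illustration.

```python
from typing import Iterable, List, TypeVar

T = TypeVar("T")


def require_non_empty(items: Iterable[T], name: str) -> List[T]:
    """Return items as a list, or raise a descriptive error if there are none.

    Hypothetical helper (not part of Prefect): meant to be called at the top
    of a downstream task that reduces the results of a mapped task.
    """
    collected = list(items)
    if not collected:
        raise ValueError(
            f"{name} is empty: the upstream mapped task produced no results. "
            "Check that the value passed to .map() is a non-empty list."
        )
    return collected


# A string is itself iterable, which is why mapping over "test"
# can be treated as mapping over four separate letters.
letters = list("test")
print(letters)  # ['t', 'e', 's', 't']

# An empty upstream result now fails with a message that names the input,
# instead of a bare "No objects to concatenate" from pd.concat.
try:
    require_non_empty([], "report_df_melted_list")
except ValueError as exc:
    print(exc)
```

Calling a check like this once in the reducing task keeps the validation in a single place, which fits the point above that the downstream task is the natural spot for it when a flow has many mapped tasks.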