# prefect-server
s
Hey. I have an issue. After clicking on the manual step, the flow has been stuck for 40 minutes. Is there anything we can do to wake the process up?
a
yes, there’s definitely something we can do about it 🙂 which state is it stuck in?
can you perhaps share a screenshot of the UI, the logs, or some more information otherwise?
s
It is in the Resume state
a
did someone configure manual approval for this flow? Can you share a slightly larger screenshot or tell us a bit more about your use case? otherwise it’s hard to tell why it would get stuck in a Resume state
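just for context, a manual approval step in Prefect 1.x is usually a task with the manual_only trigger - a minimal sketch of what I mean (the task names here are just placeholders):
```python
from prefect import Flow, task
from prefect.triggers import manual_only

@task
def warm_up():
    return "warm-up done"

# manual_only pauses the flow run at this task until someone approves it in the UI
@task(trigger=manual_only)
def wait_for_approval(upstream_result):
    return upstream_result

with Flow("manual-approval-example") as flow:
    wait_for_approval(warm_up())
```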
s
Ok. So, the flow executed a set of commands on an EC2 instance, then it paused waiting for the manual approval, where it sat for about an hour. Then it received the confirmation, and it has been stuck ever since
a
oh, I see - so the approval did work, but the flow is now stuck in a Resume state? It could be that some of your tasks don’t have the input data required to resume. Did you configure any Result class in your flow? Can you perhaps show your flow definition?
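for reference, this is roughly what I mean by a Result class - a small sketch assuming Prefect 1.x, with a placeholder bucket name:
```python
from prefect import task
from prefect.engine.results import S3Result

# with checkpointing on and a Result configured, the task's output is persisted
# to S3, so it can be loaded back when the flow run resumes after the pause
@task(
    checkpoint=True,
    result=S3Result(bucket="my-results-bucket", location="{flow_name}/{task_name}.prefect"),
)
def produce_commands():
    return ["echo hello"]
```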
s
I don’t think so. I executed the same flow yesterday, and it didn’t get stuck.
I configured result storage on the tasks whose output is going to be used after the resume
a
Can you share a minimal flow example that we could use to reproduce it? It’s hard to tell otherwise at which step something may be wrong
s
```python
    batch_matching_full_execution_commands = create_batch_matching_ec2_commands(export_version, next_ekata_dv, False)
    batch_matching_full_execution_commands.name = 'Create batch matching ec2 commands for full execution'
    batch_matching_full_execution_commands.checkpoint = True
    batch_matching_full_execution_commands.result = S3Result(bucket=S3_BUCKET, location=S3_LOCATION_PATTERN)

    run_batch_matching_export_notebook = notebook_run(databricks_conn_secret=conn,
                                                      json=notebook_submit_config)
    run_batch_matching_export_notebook.set_upstream(notebook_created)

    instance_id = create_ec2_instance(volume_size=2000)
    instance_id.set_upstream(run_batch_matching_export_notebook)
    instance_id.skip_on_upstream_skip = False
    instance_id.checkpoint = True
    instance_id.result = S3Result(bucket=S3_BUCKET, location=S3_LOCATION_PATTERN)

    # Execution for 5 minutes, so Ekata can scale up their services.
    batch_matching_warm_up_execution = execute_job_in_ec2_instance(instance_id=instance_id,
                                                                   commands=batch_matching_warm_up_commands,
                                                                   s3_dir_prefix='batch-matching-warm-up')
    batch_matching_warm_up_execution.name = 'Warm up ekata services in ec2 instance for 5 minutes'
    batch_matching_warm_up_execution.set_upstream(run_batch_matching_export_notebook)
    batch_matching_warm_up_execution.skip_on_upstream_skip = False

    ekata_verified_their_services = has_ekata_verified_their_services() # Manual Step
    ekata_verified_their_services.set_upstream(batch_matching_warm_up_execution)

    # Full Execution
    batch_matching_full_execution = execute_job_in_ec2_instance(instance_id=instance_id,
                                                                commands=batch_matching_full_execution_commands,
                                                                s3_dir_prefix='batch-matching',
                                                                execution_timeout=60 * 60 * 24 * 7)
    batch_matching_full_execution.name = 'Batch Matching Full execution in ec2 instance'
    batch_matching_full_execution.set_dependencies(upstream_tasks=[batch_matching_warm_up_execution,
                                                                   ekata_verified_their_services])
```
As you can see, result storage is configured on some of the tasks
z
Do you have an agent running? What kind of run config are you using?
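(by run config I mean something along these lines - a minimal Prefect 1.x sketch, where the image name and label are just placeholders:)
```python
from prefect import Flow, task
from prefect.run_configs import DockerRun

@task
def say_hello():
    print("hello")

with Flow("docker-run-example") as flow:
    say_hello()

# tells the Docker agent which image to start the flow run container from
flow.run_config = DockerRun(
    image="my-registry/my-flow-image:latest",
    labels=["ec2-docker-agent"],
)
```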
s
I am ending the day now. I will answer you tomorrow. Thanks for the reply
Yes, we do. We have a Docker agent running on an EC2 instance
a
@Santiago Gonzalez can you share your run configuration or your entire flow definition? alternatively, could you build a minimal example that we could use to reproduce the issue?
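even something as small as this would already help - a sketch of the kind of repro I have in mind, assuming Prefect 1.x, with placeholder names throughout:
```python
from prefect import Flow, task
from prefect.engine.results import S3Result
from prefect.triggers import manual_only

S3_RESULT = S3Result(bucket="my-bucket", location="{flow_name}/{task_name}.prefect")

@task(checkpoint=True, result=S3_RESULT)
def create_commands():
    return ["echo step one"]

# pauses here until approved in the UI; the upstream result has to be
# re-loadable from S3 for the run to continue after the Resume
@task(trigger=manual_only)
def approval(commands):
    return commands

@task(checkpoint=True, result=S3_RESULT)
def run_commands(commands):
    return f"ran {len(commands)} commands"

with Flow("resume-repro") as flow:
    run_commands(approval(create_commands()))
```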
z
Is the flow run container still running?
s
No, we cancelled it yesterday
a
thx for the update, let us know if you need any help with that in the future