Thread
#prefect-community
    Raymond Yu

    4 months ago
    Hey Prefect, we’re encountering an intermittent error when running wait_for_flow_run on a long-running DatabricksSubmitRun in another flow, even when the Databricks job runs to completion without an issue. Occasionally this results in the error enclosed below, which causes no heartbeat to be detected. Has anyone encountered this? Any ideas on what may be causing it and how to address the issue?
    Error during execution of task: ClientError([{'path': ['flow_run'], 'message': 'request to <http://hasura:3000/v1alpha1/graphql> failed, reason: read ECONNRESET', 'extensions': {'code': 'INTERNAL_SERVER_ERROR', 'exception': {'message': 'request to <http://hasura:3000/v1alpha1/graphql> failed, reason: read ECONNRESET', 'type': 'system', 'errno': 'ECONNRESET', 'code': 'ECONNRESET'}}}])
    On a possibly related note, we’re also encountering a similar issue when polling EMR termination status via boto3, i.e.
    emr_client.get_waiter("cluster_terminated").wait(ClusterId=cluster_id)
    where the task also loses its heartbeat despite the EMR cluster running successfully and terminating as expected.
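For readers unfamiliar with the waiter pattern mentioned above, here is a minimal sketch of what such a blocking poll does, written as an explicit loop. It is not boto3's implementation and not a fix for the heartbeat issue. The function name `wait_for_termination`, the injected `get_state` callable, and the default delay and attempt count are all assumptions made for illustration. The state strings are the EMR cluster states as I understand them.

```python
import time
from typing import Callable


def wait_for_termination(
    get_state: Callable[[], str],
    delay_s: float = 30.0,      # assumed polling interval, not confirmed by the thread
    max_attempts: int = 60,     # assumed attempt limit, not confirmed by the thread
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll get_state() until the cluster reports a terminal state.

    get_state is a hypothetical stand-in for whatever call returns the
    current cluster state as a string.
    """
    for attempt in range(max_attempts):
        state = get_state()
        if state == "TERMINATED":
            return state
        if state == "TERMINATED_WITH_ERRORS":
            raise RuntimeError("cluster terminated with errors")
        # Block between polls; skip the pause after the final attempt.
        if attempt < max_attempts - 1:
            sleep(delay_s)
    raise TimeoutError(f"cluster not terminated after {max_attempts} polls")
```

With boto3, `get_state` could plausibly be `lambda: emr_client.describe_cluster(ClusterId=cluster_id)["Cluster"]["Status"]["State"]`. The point of the sketch is that the task spends nearly all of its time blocked in `sleep`, for as long as the cluster runs.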
    Anna Geller

    4 months ago