# prefect-community
a
I just had a task that ran twice. I only wanted it to run once, and I got duplicate data as a result. I don't have retries enabled. Is this something other people see sometimes? Is there a way to avoid this?
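Independent of why the task ran twice, one general safeguard against duplicate data is to make the task's side effect idempotent. A minimal sketch of the "insert-or-ignore on a unique key" pattern, using stdlib sqlite3 (the table and column names are illustrative, not from this thread):

```python
import sqlite3

def write_idempotently(conn, record_id, payload):
    # INSERT OR IGNORE is a no-op when record_id already exists,
    # so running the task twice leaves exactly one row.
    conn.execute(
        "INSERT OR IGNORE INTO results (record_id, payload) VALUES (?, ?)",
        (record_id, payload),
    )
    conn.commit()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE results (record_id TEXT PRIMARY KEY, payload TEXT)")
write_idempotently(conn, "run-1", "data")
write_idempotently(conn, "run-1", "data")  # simulated duplicate execution
count = conn.execute("SELECT COUNT(*) FROM results").fetchone()[0]
```

With this pattern a duplicated task run rewrites the same key instead of appending a second copy.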
k
Are you using Dask? And are you on Prefect Cloud? This can happen when two jobs fire for the same flow run, or when Dask recovers after a worker dies and re-submits tasks.
a
I am on Prefect Cloud and I'm using a LocalDaskExecutor
The flow only ran once, but one task within it ran twice (only that one task)
k
Version locking should block the second execution. Did it actually run twice?
Is Kubernetes spinning up two jobs?
a
I believe so. I got duplicate data from it. How do I see if Kubernetes spun up two jobs?
k
Check for two pods I think?
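For a flow run that is still alive, the duplicate-job check could look something like this with kubectl (the `default` namespace is an assumption; substitute the namespace your Prefect agent deploys jobs into):

```shell
# List jobs and pods in the namespace the Prefect agent uses
# ("default" here is an assumption; substitute your agent's namespace).
kubectl get jobs -n default
kubectl get pods -n default

# A single flow run normally corresponds to one prefect-job-* entry;
# two jobs pointing at the same flow run would indicate a duplicate submission.
```

Once the run completes and the pods are cleaned up, this only helps if you check while the run is in flight or have cluster-level audit logging enabled.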
a
Do I have to turn version locking on?
k
It should be on by default on Prefect Cloud. It is configurable per Flow, and you can find it in the Flow settings.
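To check the current state programmatically rather than in the UI, a query along these lines against the Prefect Cloud (1.x) GraphQL API should return the flow's settings. The exact shape of the `settings` field is an assumption based on the 1.x schema; adjust to what your tenant's API exposes:

```graphql
# Hypothetical sketch: inspect a flow's settings blob, where the
# version-locking flag lives in the 1.x Cloud schema (assumption).
query {
  flow(where: { id: { _eq: "your-flow-id-here" } }) {
    id
    name
    settings
  }
}
```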
a
And the job completed, so the pods aren't up anymore
It's disabled for me
Is there a way to turn it on programmatically? I haven't used the GraphQL API at all
k
Oh, turning that on may help because it puts a lock on the task execution. Are you mapping? Yeah, just run:
from prefect.client import Client

# Enable version locking for the flow via Prefect Cloud's GraphQL API.
query = """
mutation {
  enable_flow_version_lock(input: { flow_id: "your-flow-id-here" }) {
    success
  }
}
"""

client = Client()
client.graphql(query)
I am wondering if your issue is related to this. I'll read through it to dig more
a
I am mapping (edit: I'm not mapping at the task level)
k
Let me read into that issue. I haven't gone through it yet, but I think you are running into it.
a
I am running GKE Autopilot, so I also think I am. I'm also mapping tasks in other flows, so even if I'm not seeing it here, it is probably relevant
This does seem pretty similar to what I'm seeing, but I am using LocalDaskExecutor, not the full DaskExecutor the others in that thread were using
d
Any update to this thread? I have a similar problem (though I am using a DaskExecutor on ECS) and I have been unable to resolve it so far. The linked GitHub thread shed some light on the problem, but the only proposed solution seems to be dropping autoscaling, which would mean retooling our CI, which sets the cluster config when the flow is registered. Not ideal.
k
Hey @Daniel Ross , no update to this thread and I think the Github thread is the best place to follow and that issue is being worked on. You can chime in there if you have additional details (and to follow it)
👍 1
d
Thanks Kevin! I'll keep an eye on it.
a
I ended up moving away from Autopilot to a standard cluster, and I don't think I'm seeing the same issue anymore
👍 1
d
Thanks for the update @Andrew Lawlor!