Joish

07/18/2023, 3:24 PM
Hey everyone, I'm encountering an issue with retrying tasks in the Prefect UI. When I try to retry a task, I'm getting the following error:
raise MissingResult("The result was not persisted and is no longer available.")
Here are some details:
- I'm using the Dask runner.
- My task results are stored in S3.
- The error occurs when I try to retry a failed task from the UI.
Has anyone else experienced a similar issue with the Dask runner and S3 result storage? I'd appreciate any insights or suggestions on how to resolve this problem. Thanks in advance for your help!

Christopher Boyd

07/18/2023, 3:28 PM
How are you persisting results, and using them in code?

Joish

07/18/2023, 3:30 PM
For persisting results I am using S3.
I am not sure how to use the result back in code.
Here's an overview of my workflow:
1. I retrieve a list of files to process.
2. I submit this list to Dask for distributed processing. Each file is treated as a separate task.
3. Dask processes each file and stores the output in S3.
4. After processing all the files, I move the output from S3 to my database.
Everything seems to be working well, but I'm considering a scenario where one of the files fails during processing. In such cases, I want to be able to retry the entire flow by using the retry button in the Prefect UI. My concern is that when I attempt to retry a failed task in the UI, I encounter the following error:
raise MissingResult("The result was not persisted and is no longer available.")
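The workflow described above can be sketched without Prefect or Dask to show what a retry needs: each per-file output has to be written under a key that a later attempt can find again. This is only a minimal sketch under stated assumptions, not the poster's code. A temporary local directory stands in for the S3 bucket, `ThreadPoolExecutor` stands in for Dask, and `key_for`, `process_file`, and `run_batch` are hypothetical names.

```python
import json
import tempfile
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def key_for(file_name: str) -> str:
    """Deterministic storage key, so a later retry can locate the same result."""
    return f"{file_name}.json"


def process_file(file_name: str) -> dict:
    """Hypothetical per-file work; fails for names containing 'bad'."""
    if "bad" in file_name:
        raise ValueError(f"cannot process {file_name}")
    return {"file": file_name, "rows": len(file_name)}


def run_batch(files, store: Path) -> dict:
    """Process files in parallel, skipping any whose result is already stored."""

    def one(file_name):
        target = store / key_for(file_name)
        if target.exists():  # persisted by an earlier attempt
            return file_name, "skipped"
        try:
            result = process_file(file_name)
        except ValueError:
            return file_name, "failed"
        target.write_text(json.dumps(result))
        return file_name, "done"

    with ThreadPoolExecutor(max_workers=4) as pool:
        return dict(pool.map(one, files))


store = Path(tempfile.mkdtemp())  # stand-in for the S3 bucket
files = ["a.csv", "bad.csv", "c.csv"]
first = run_batch(files, store)
second = run_batch(files, store)  # the "retry": only unfinished files are attempted
```

With deterministic keys, the second call reuses what the first one stored and only re-attempts the file that failed. With random keys there would be nothing for the retry to look up.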

Christopher Boyd

07/18/2023, 4:25 PM
Hrmm, are the inputs or outputs different between these? I’d have to review results to ensure the configuration is right here, but each result should be unique - if the inputs change (either in total, or in between tasks) then the output results would be different. That said, I’m not sure what might be the case here

Deceivious

07/18/2023, 5:00 PM
I've had this issue a lot myself.

Joish

07/18/2023, 5:42 PM
@Deceivious do you have any suggestions on this?

Deceivious

07/18/2023, 5:42 PM
None. I just rerun when this happens and hope it passes.

Joish

07/18/2023, 5:43 PM
Rerun as in start a new flow run, rather than retry the failed one?

Deceivious

07/18/2023, 5:43 PM
Retry....
We usually run at 3x the frequency of the data we need, i.e. if we need hourly data we run every 20 minutes. So if one run fails and the next passes, we just ignore it 😄

Joish

07/18/2023, 5:45 PM
Ohh OK. In my case I have a batch of 27 million rows.

Deceivious

07/18/2023, 5:46 PM
There have been multiple reports of such cases in the community before. It's hard to replicate, so there's not much to be done.

Joish

07/18/2023, 5:47 PM
Also, I am not sure if I am doing this right. Can you quickly eyeball the code and tell me if I am missing something? Is that OK?

Deceivious

07/18/2023, 5:47 PM
I don't use Dask, so I'm not sure if I'm the right person.

Joish

07/18/2023, 5:48 PM
ohh ok
I have been stuck on this issue for about 5 hours now.

Deceivious

07/18/2023, 5:48 PM
I personally think this should be treated as a cache miss and recomputed? Or at least allow the dev to choose.
Try running sequentially and see if that passes.
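The suggestion above, treating a missing result as a cache miss and letting the developer choose what happens, can be sketched in plain Python. This is an illustration of the idea only, not how Prefect behaves. `MissingResult` here is a local stand-in class rather than the Prefect exception, and `load_result`, `load_or_recompute`, and the `on_missing` option are hypothetical names.

```python
class MissingResult(Exception):
    """Local stand-in for the error in this thread; not imported from Prefect."""


def load_result(store: dict, key: str):
    """Return a persisted result, or raise MissingResult if it is gone."""
    try:
        return store[key]
    except KeyError:
        raise MissingResult(f"no persisted result for {key!r}") from None


def load_or_recompute(store: dict, key: str, compute, on_missing: str = "recompute"):
    """Load a result; on a miss either recompute and persist it, or re-raise."""
    try:
        return load_result(store, key)
    except MissingResult:
        if on_missing != "recompute":
            raise  # the developer opted out of silent recomputation
        store[key] = compute()
        return store[key]


store = {"a.csv": 10}  # in-memory stand-in for result storage
hit = load_or_recompute(store, "a.csv", lambda: -1)   # found, compute is not called
miss = load_or_recompute(store, "b.csv", lambda: 20)  # missing, recomputed and stored
```

Passing `on_missing="raise"` keeps the current fail-fast behavior for tasks that are too expensive or unsafe to recompute.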

Joish

07/18/2023, 5:49 PM
I want the Dask runner because I have to process all 27M rows as soon as possible,
so I am using multi-processing along with multi-threading.

Deceivious

07/18/2023, 5:50 PM
Yes, just as a test is what I mean.

Joish

07/18/2023, 5:50 PM
ohh ok sure let me do that
I also noticed this: I am passing result_storage_key, but the output in the bucket is a file with a random name. @Christopher Boyd @Deceivious
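To illustrate why a random file name matters for retries, here is a small sketch contrasting a random key with one derived from the task's inputs. It assumes a format-style template filled from the parameters. Whether Prefect 2.10.17 applies `result_storage_key` this way is not confirmed anywhere in this thread, and `default_key`, `templated_key`, and the `file_name` parameter are hypothetical names.

```python
import uuid


def default_key() -> str:
    """Random key: unique per call, so a retry cannot predict it."""
    return uuid.uuid4().hex


def templated_key(template: str, parameters: dict) -> str:
    """Fill a format-style template from the task's parameters."""
    return template.format(parameters=parameters)


params = {"file_name": "batch_0001.csv"}
template = "results/{parameters[file_name]}.json"

key_a = templated_key(template, params)  # first attempt
key_b = templated_key(template, params)  # retry with the same inputs
random_a = default_key()
random_b = default_key()
```

The templated keys match across attempts, while the random ones never do. If the bucket only contains randomly named files, one possible reading is that the key setting is not being picked up where the result is written.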

Deceivious

07/18/2023, 6:02 PM
Which version of Prefect?

Joish

07/18/2023, 6:03 PM
v2.10.17

Christopher Boyd

07/18/2023, 7:05 PM
I’m not 100% sure here unfortunately, and might need to test this out myself. The problem, as @Deceivious mentioned, is that I can’t quite confirm whether this is configured correctly and not working as intended, or not configured correctly. In the latter case, I don’t have enough knowledge here to recommend what to fix just yet.

Joish

07/18/2023, 8:35 PM
@Deceivious @Christopher Boyd thanks for your time