Sven Teresniak
07/28/2020, 9:09 AM
`LocalResult` as a flow's `result=mylocalresultinstance` to persist task results. The data is written to an NFS share which is available to the DaskWorker and the agent (exact same mountpoint). With default settings (no `result` keyword in the flow definition) all worked well. Now I use some templating to better organize (and later clean up) results.
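A minimal sketch of what "templating" a result location amounts to, assuming the template is filled with plain `str.format` from a context-like dict. The key names (`flow_name`, `task_name`, `date`), the helper `render_location`, and the NFS base directory are illustrative assumptions, not taken from the thread or from Prefect's code:

```python
from datetime import datetime, timezone
from pathlib import PurePosixPath


def render_location(template: str, context: dict) -> str:
    """Fill a location template from a context-like dict via str.format."""
    # Keys in the context that the template does not mention are ignored.
    return template.format(**context)


# Hypothetical context values; a real run would supply its own.
context = {
    "flow_name": "etl",
    "task_name": "extract",
    "date": datetime(2020, 7, 28, 9, 9, tzinfo=timezone.utc),
}

# A template that groups results by flow and task, one file per day.
template = "{flow_name}/{task_name}/{date:%Y-%m-%d}.pickle"
location = render_location(template, context)

# Hypothetical mountpoint shared by the worker and the agent.
full_path = PurePosixPath("/mnt/nfs/prefect-results") / location
print(full_path)
```

Grouping by flow and task name like this makes later cleanup a matter of deleting directories, but a template this coarse yields the same path for every run on the same day.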
Question:
1. The UI states that I cannot restart a task because of this warning: "If this flow run does not have a result handler, restarting is unlikely to succeed". Is the text not yet aware that the `result_handler` keyword is deprecated, and maybe checking the wrong setting? Does the UI need the NFS share (the result location) as well? Or any Prefect-related service except the agent and the Dask worker?
2. Once my `LocalResult` is working: is it possible to access results from the UI? I know I can load results using the `Result`'s subclasses. But it would be easier for testing and debugging.
3. Is there an elegant way to get rid of old results? Deleting old flow versions (and their runs) does not remove results.
4. How do I find which keyword parameters I can use in the format string of a `Result`'s `location` string? Everything from `prefect.context` and all keywords a `Task.run()` got?
5. Is Prefect using (persisted) results in any way? Let's assume 1. is a bug and gets fixed. Does a retry of a failed task run read its input from (maybe persisted) result objects? Something else?
6. The documentation does not say a word about uniqueness. If Prefect is using results (see 5.), then each persisted result must be unique, right? That is, overwriting a result by accident could lead to a complete fuckup?
josh
07/28/2020, 12:02 PM
Sven Teresniak
07/28/2020, 12:18 PM
`LocalResult()` and writing files like `prefect-result-2020-07-23t14-11-27-600908-00-00`.
Now, when I want to restart a failed flow, I cannot do this.
josh
07/28/2020, 12:25 PM
Sven Teresniak
07/28/2020, 12:25 PM
josh
07/28/2020, 12:26 PM
Sven Teresniak
07/28/2020, 12:26 PM
`target` - RTFM! This will come in handy!
It seems that `Result` persistence (Local or S3 in our case) without using `target` is not worth much, because basically nothing uses the content of pickled results now, right?
Marwan Sarieddine
07/28/2020, 1:05 PM
`Result` will only be used by Prefect when running against a backend and a future retry is needed, necessitating recreating a task's inputs.
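A rough sketch of the idea in the statement above, in plain Python rather than Prefect: when a downstream task has to be retried, its inputs are rebuilt from the upstream task's persisted result instead of re-running the upstream task. The helpers `write_result` and `read_result`, the two toy tasks, and the pickle-on-disk format are assumptions for illustration only and say nothing about how Prefect implements this internally:

```python
import pickle
import tempfile
from pathlib import Path


def write_result(path: Path, value) -> None:
    """Persist a task's return value (hypothetical helper)."""
    with path.open("wb") as fh:
        pickle.dump(value, fh)


def read_result(path: Path):
    """Load a previously persisted return value (hypothetical helper)."""
    with path.open("rb") as fh:
        return pickle.load(fh)


calls = {"upstream": 0, "downstream": 0}


def upstream():
    calls["upstream"] += 1
    return [1, 2, 3]


def downstream(data):
    calls["downstream"] += 1
    if calls["downstream"] == 1:
        # Simulate a transient failure on the first attempt only.
        raise RuntimeError("transient failure")
    return sum(data)


with tempfile.TemporaryDirectory() as tmp:
    upstream_path = Path(tmp) / "upstream.pickle"
    write_result(upstream_path, upstream())

    try:
        total = downstream(read_result(upstream_path))
    except RuntimeError:
        # Retry: recreate the input from disk; upstream() is not called again.
        total = downstream(read_result(upstream_path))

print(total, calls)
```

In this model the persisted file only matters on the retry path, which fits the impression in the thread that results are rarely read unless a retry or a `target` is involved.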
Sven Teresniak
07/28/2020, 1:51 PM
"Future retry" means "a failed downstream task with `max_retries>0`" reads the `Result` of its successful upstream task?
Or does "future retry" mean "re-run the failed flow-run" (this is not working at the moment)
Or both?
Marwan Sarieddine
07/28/2020, 1:59 PM
"Future retry" means "a failed downstream task with `max_retries>0`" reads the `Result` of its successful upstream task?
Yes, in this case I think the task is being run in a Retry state …
Or does "future retry" mean "re-run the failed flow-run" (this is not working at the moment)
I don’t think re-running a flow counts, given tasks are not being run in a retry state,
but it would be great if @josh or someone from the Prefect team could confirm.
Sven Teresniak
07/28/2020, 2:11 PM
`target` and `result` is mutually exclusive.
I understand that `Result` is a persisted return value of a task. Okay. This is useful when re-running downstream tasks. Useful thing.
From https://docs.prefect.io/core/PINs/PIN-16-Results-and-Targets.html#pin-16-results-and-targets I learned that `target` can (or was supposed to) cover non-returned results (from side effects like writes to a file or S3).
But with `result` and `target` mutually exclusive it is not possible to cache an expensive calculation without side effects, right? I have to persist the outcome of the calculation as part of the task and then point `target` to that exact location?
`target` the doc mentions "…at write time all formatting kwargs will be passed…"
Does that mean that `target` will also write (persist) returned results automatically if not present?
The docs only mention reading. Not writing.
`Task` the situation is much clearer now. When `target` is provided then the `result` location is set to `target`. Now it makes sense.
When result data is available under `target`, the cached data will be used. If result data is not available, the `Result` implementation will write the data returned from the task to that location for future use.
This makes sense.
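The read-or-write behaviour described above can be modelled in a few lines of plain Python. This is only a sketch of the logic as understood in this thread, not Prefect's implementation; `run_with_target`, the pickle format, and the file layout are hypothetical:

```python
import pickle
import tempfile
from pathlib import Path


def run_with_target(func, target: Path):
    """Return (value, was_cached): reuse data at `target` or compute and store it."""
    if target.exists():
        # Data is already present under the target: skip the computation.
        with target.open("rb") as fh:
            return pickle.load(fh), True
    value = func()
    # Nothing there yet: persist the return value for future runs.
    target.parent.mkdir(parents=True, exist_ok=True)
    with target.open("wb") as fh:
        pickle.dump(value, fh)
    return value, False


runs = []


def expensive():
    runs.append(1)  # record each real execution
    return 42


with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "expensive" / "2020-07-28.pickle"
    first, first_cached = run_with_target(expensive, target)
    second, second_cached = run_with_target(expensive, target)

print(first_cached, second_cached, len(runs))
```

Under this model an expensive calculation without side effects is cached simply by giving it a target; no manual write inside the task is needed. It also shows why the target path has to be unique per logical result: whatever is found at that path is returned without any check of where it came from.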