Guillaume Latour
05/24/2022, 6:57 AM
/home/<user>/.prefect/results folder, with <user> being the user launching the prefect server & agent on the other server (and who is not present in the docker)
I've added this line to the configuration: flows.checkpointing = false, and relaunched the server and agent, but nothing has changed: the results folder is still being filled with intermediate task results.
Am I doing something wrong? Is it possible to prevent these intermediate backups?
Ty in advance
Anna Geller
05/24/2022, 11:14 AM
"Is it possible to prevent these intermediate backups?"
Yes, there is! It would involve some work, though, as you would need to set it on each task: setting checkpoint=False on your task decorator will prevent that.
Guillaume Latour
05/24/2022, 12:54 PM
Anna Geller
05/24/2022, 12:59 PM
Guillaume Latour
05/24/2022, 1:19 PM
Anna Geller
05/24/2022, 1:51 PM
Kevin Kho
05/24/2022, 2:24 PM
checkpoint=False
Guillaume Latour
05/26/2022, 2:03 PM
from typing import Any, Callable

from prefect import task


def checkpoint_task(func: Callable, **task_init_kwargs: Any):
    # **task_init_kwargs is always a dict (never None), so no None check is needed.
    # Force checkpointing off for every task created through this wrapper.
    task_init_kwargs['checkpoint'] = False
    return task(func, **task_init_kwargs)
And I've updated all the task imports so they look like this: import checkpoint_task as task
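Note that the wrapper above takes the function as a required first argument, so it only covers the bare decorator form; a call such as checkpoint_task(name="x") would fail before reaching Prefect. Below is a minimal sketch of a variant that handles both forms. The names force_kwargs and fake_task are hypothetical: fake_task is only a stand-in so the sketch runs without Prefect installed, and the sketch assumes the real decorator accepts the function as its first positional argument, as the wrapper above already does.

```python
from functools import partial
from typing import Any, Callable, Optional


def force_kwargs(decorator: Callable[..., Any], **forced: Any) -> Callable[..., Any]:
    """Wrap a task-style decorator so the `forced` keyword arguments always win."""

    def wrapped(func: Optional[Callable] = None, **kwargs: Any) -> Any:
        kwargs.update(forced)  # forced values override whatever the caller passed
        if func is None:
            # Used as @wrapped(name="x"): return a decorator awaiting the function.
            return partial(wrapped, **kwargs)
        # Used as a bare @wrapped, or called directly as wrapped(fn, name="x").
        return decorator(func, **kwargs)

    return wrapped


# Hypothetical stand-in for the real decorator; it just records what it received.
def fake_task(func: Callable, **kwargs: Any) -> dict:
    return {"run": func, "options": kwargs}


# With Prefect available, one would presumably pass `prefect.task` here instead.
checkpoint_task = force_kwargs(fake_task, checkpoint=False)


@checkpoint_task
def add(x, y):
    return x + y


@checkpoint_task(name="double", checkpoint=True)  # the True is overridden
def double(x):
    return 2 * x
```

Keeping the forced value in one place means a task cannot switch checkpointing back on by accident, which is the behaviour the original wrapper was aiming for.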
I've created a docker image including this new code for the dask cluster.
And finally I've registered the newly defined flows with the prefect server.
I am still having the issue on the dask workers: they create a folder in /home/<user>/.prefect/results (with <user> being the one that is running the prefect agent)
Do you have an idea of how I could debug this? Where could this hardcoded checkpoint = False be overridden?
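One way to narrow this down is to check, just before registration, which task objects do not actually carry checkpoint=False. The sketch below is hedged: the helper name tasks_still_checkpointing is hypothetical, the SimpleNamespace objects are stand-ins for real tasks, and it assumes each task exposes name and checkpoint attributes (with a real flow, the collection passed in would presumably be flow.tasks). Treating an unset (None) value as "not disabled" is also an assumption.

```python
from types import SimpleNamespace
from typing import Any, Iterable, List


def tasks_still_checkpointing(tasks: Iterable[Any]) -> List[str]:
    """Return the sorted names of tasks whose `checkpoint` is not explicitly False."""
    return sorted(
        str(getattr(t, "name", t))
        for t in tasks
        if getattr(t, "checkpoint", None) is not False
    )


# Hypothetical stand-ins for task objects; a real flow would supply its own tasks.
example_tasks = [
    SimpleNamespace(name="extract", checkpoint=False),
    SimpleNamespace(name="transform", checkpoint=None),  # never set explicitly
    SimpleNamespace(name="load", checkpoint=True),
]

leftover = tasks_still_checkpointing(example_tasks)
print(leftover)  # ['load', 'transform']
```

An empty list would suggest the wrapper is applied everywhere, pointing the search towards the runtime environment (agent, workers, or a stale registration) rather than the task definitions.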
Is there some caching mechanism preventing the registration of the updated code?
Kevin Kho
05/26/2022, 2:28 PM
checkpoint is part of what is stored in the task. Did you re-register this Flow?
Guillaume Latour
05/26/2022, 2:34 PM
Kevin Kho
05/26/2022, 2:37 PM
Guillaume Latour
05/26/2022, 2:54 PM
dask.address = "tcp://<ip>:<port>" in my .prefect/config.toml file
I am using the daskdev/dask:latest docker image, on which I install our custom packages, and I launch a small cluster (1 scheduler and some workers). We do not use k8s yet.
Kevin Kho
05/26/2022, 2:56 PM