Guillaume Latour
05/24/2022, 6:57 AM
/home/<user>/.prefect/results folder, with <user> being the user who launches the Prefect server and agent on the other server (and who is not present inside Docker).
I've added this line to the configuration: flows.checkpointing = false, relaunched the server and agent, but nothing has changed: the results folder is still being filled with intermediate task results.
Am I doing something wrong? Is it possible to prevent these intermediate backups?
Ty in advance
Anna Geller
"Is it possible to prevent these intermediate backups?"
Yes, there is! It involves some work, since you need to set it on each task, but setting checkpoint=False on your task decorator will prevent that.
Guillaume Latour
05/24/2022, 12:54 PM
Anna Geller
Guillaume Latour
05/24/2022, 1:19 PM
Anna Geller
Kevin Kho
checkpoint=False
Guillaume Latour
05/26/2022, 2:03 PM
from typing import Any, Callable
from prefect import task

def checkpoint_task(func: Callable, **task_init_kwargs: Any):
    # **task_init_kwargs is always a dict (never None), so no None check is needed.
    # Force checkpointing off for every task created through this wrapper.
    task_init_kwargs['checkpoint'] = False
    return task(func, **task_init_kwargs)
And I've updated all the task imports so they look like this: import checkpoint_task as task
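A minimal sketch of the same wrapper idea, written so that it also works when the decorator is called with keyword arguments (for example a task name). It assumes the wrapped decorator accepts an optional function plus keyword arguments, which is how prefect.task behaves in Prefect 1.x as far as I know. The names fake_task and no_checkpoint are hypothetical: fake_task is only a stand-in so the snippet runs without Prefect installed.

```python
from typing import Any, Callable, Optional


def fake_task(fn: Optional[Callable] = None, **task_init_kwargs: Any):
    """Hypothetical stand-in for prefect.task, used only for illustration."""
    if fn is None:
        # Called with keyword arguments only: return a decorator.
        return lambda f: fake_task(f, **task_init_kwargs)
    # Record the keyword arguments on the function so they can be inspected.
    fn.task_init_kwargs = task_init_kwargs
    return fn


def no_checkpoint(task_factory: Callable) -> Callable:
    """Wrap a task decorator so that checkpoint=False is always passed."""

    def wrapper(fn: Optional[Callable] = None, **task_init_kwargs: Any):
        # Force the flag, overriding any value supplied by the caller.
        task_init_kwargs["checkpoint"] = False
        if fn is None:
            return lambda f: task_factory(f, **task_init_kwargs)
        return task_factory(fn, **task_init_kwargs)

    return wrapper


# With Prefect 1.x this would presumably be no_checkpoint(prefect.task).
task = no_checkpoint(fake_task)


@task
def extract():
    return [1, 2, 3]


@task(name="load", checkpoint=True)  # the explicit True is overridden
def load(data):
    return len(data)
```

Forcing the flag inside the wrapper means a stray checkpoint=True on an individual task cannot silently re-enable checkpointing.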
I've created a Docker image including this new code for the Dask cluster.
And finally I've registered the newly defined flows with the Prefect server.
I am still having the issue on the Dask workers: they create a folder in /home/<user>/.prefect/results (with <user> being the one that is running the Prefect agent).
Do you have an idea of how I could debug this? Where could this hardcoded checkpoint = False be overridden?
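One possible way to debug this before registering is to list every task whose checkpoint flag is not explicitly False. The sketch below assumes each task object exposes name and checkpoint attributes and that a flow exposes its tasks as an iterable (flow.tasks in Prefect 1.x, to my knowledge). The helper tasks_still_checkpointing and the SimpleNamespace objects are hypothetical stand-ins so the snippet runs without Prefect.

```python
from types import SimpleNamespace
from typing import Any, Iterable, List


def tasks_still_checkpointing(tasks: Iterable[Any]) -> List[str]:
    """Return the sorted names of tasks whose checkpoint flag is not False."""
    return sorted(
        t.name for t in tasks if getattr(t, "checkpoint", None) is not False
    )


# Hypothetical stand-ins; with Prefect 1.x one would pass flow.tasks instead.
demo_tasks = [
    SimpleNamespace(name="extract", checkpoint=False),
    SimpleNamespace(name="transform", checkpoint=None),
    SimpleNamespace(name="load", checkpoint=True),
]

leftover = tasks_still_checkpointing(demo_tasks)
print(leftover)
```

A non-empty list would point at tasks that were not created through the wrapper, for instance because they still import the original decorator.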
Is there some caching mechanism preventing the registration of the updated code?
Kevin Kho
checkpoint is part of what is stored in the task. Did you re-register this Flow?
Guillaume Latour
05/26/2022, 2:34 PM
Kevin Kho
Guillaume Latour
05/26/2022, 2:54 PM
dask.address = "tcp://<ip>:<port>" in my .prefect/config.toml file
I am using the daskdev/dask:latest Docker image, on which I install our custom packages, and I launch a small cluster (1 scheduler and some workers). We do not use k8s yet.
Kevin Kho