<@ULVA73B9P> Why am I getting the error "OSError: ...
# ask-marvin
@Marvin Why am I getting the error "OSError: [Errno 28] No space left on device" on my worker? How do I clean this up? I'm getting crashes for one of my jobs.
thought for 333.5 seconds
“OSError: [Errno 28] No space left on device” means the machine (or container/node) your worker runs on is out of disk or inode space. With Prefect workers this typically comes from one or more of:
- OS temp directories (e.g., /tmp) filling up from ephemeral run artifacts or user code
- Local result or artifact files written by your flow/tasks
- Container images/volumes/logs (Docker/Kubernetes workers)
- The Prefect home dir (`~/.prefect`) if it’s redirected to a small disk

**Quick triage**
- Check overall space and the biggest offenders:
```
df -h
df -ih
sudo du -xhd1 /tmp 2>/dev/null | sort -h
sudo du -xhd1 ~ 2>/dev/null | sort -h
```
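If the directory-level view doesn't point at an obvious culprit, a scan for individual large files can help; this is a generic sketch (the 1G threshold is arbitrary, adjust to taste):
```
# largest files on the root filesystem, sorted by size
sudo find / -xdev -type f -size +1G -exec ls -lh {} + 2>/dev/null | sort -k5 -h | tail -20
```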
- If you’re using a Docker worker, also check:
```
docker system df
docker ps -a | wc -l
docker volume ls | wc -l
```
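For a more granular view of what Docker itself is holding, the standard verbose/size flags give a per-item breakdown:
```
# per-image/container/volume breakdown, plus writable-layer sizes of containers
docker system df -v
docker ps -a -s
```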
- If you’re using a Kubernetes worker, check node and pod ephemeral storage:
```
kubectl top nodes
kubectl top pods -A
kubectl describe node <node-name> | grep -i "ephemeral"
```
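It’s also worth checking whether the kubelet has already flagged disk pressure or evicted pods; plain kubectl is enough here:
```
# nodes reporting DiskPressure, and pods that were evicted for it
kubectl describe nodes | grep -iE "^name:|diskpressure"
kubectl get pods -A | grep -i evicted
```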
**Where Prefect 3.x puts things by default**
- Prefect home: `~/.prefect` (configurable via `PREFECT_HOME`). This stores profiles/config, not large run data by default. Docs: Settings and profiles
- Process worker working directory: by default the worker’s current directory unless you set `working_dir` in the work pool base job template. Temporary checkouts or run working dirs are typically created under the OS temp dir (e.g., /tmp) and cleaned up when runs complete; abrupt kills can leave leftovers.
- There is no `PREFECT_LOCAL_STORAGE_PATH` in Prefect 3.x. If your flow writes results locally, that’s your code’s filesystem path (or a storage block/filesystem you configured).
- Logs: Prefect 3 logs go to the API/console by default; Prefect does not create local log files unless you explicitly configure a file handler.

**Cleanup, by worker type**

**Process worker (VM/bare metal)**
- Clear stale temp dirs older than a day (safe if no runs are currently executing):
```
sudo find /tmp -maxdepth 1 -name 'prefect*' -type d -mtime +1 -print -exec rm -rf {} +
sudo find /tmp -maxdepth 1 -type f -mtime +1 -size +10M -print -delete
```
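If you’re unsure whether a leftover directory is still in use by an active run, you can check for open file handles before removing it; lsof is standard, though the recursive `+D` scan can be slow on big trees, and the path below is a placeholder:
```
# list processes with files open under the suspect directory
sudo lsof +D /tmp/<suspect-dir> 2>/dev/null
```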
- Inspect and remove large artifacts your flows may have written locally (e.g., data extracts, intermediate parquet/csv):
```
sudo du -xh /path/to/your/project | sort -h | tail -50
```
- If `~/.prefect` is on a small disk, move it to a larger volume and point Prefect at it:
```
export PREFECT_HOME=/mnt/bigdisk/prefect
mkdir -p "$PREFECT_HOME"
```
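To sanity-check that the new location will be picked up (this assumes the `prefect` CLI is installed in the same environment the worker runs in):
```
# PREFECT_HOME should show up as a non-default, env-sourced setting
prefect config view | grep -i prefect_home
```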
  Then restart your worker so it picks up the new env. Docs: Settings and profiles
- Prevent recurrence:
  - Set the Process work pool’s base job template `working_dir` to a larger/writable volume (in the Prefect UI: Work Pools -> your process pool -> Edit base job template -> `working_dir`).
  - Optionally set `TMPDIR` in the pool’s env to redirect temp files:
```
TMPDIR=/mnt/bigdisk/tmp
```
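If you’d rather script the work pool change than edit it in the UI, the work-pool CLI can round-trip the base job template; the pool name below is hypothetical and the edit step is manual:
```
# dump the default Process template, edit working_dir / env (TMPDIR) in the JSON, then push it back
prefect work-pool get-default-base-job-template --type process > process-template.json
prefect work-pool update "my-process-pool" --base-job-template process-template.json
```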
**Docker worker**
- Prune unused images/containers/volumes (use filters so you don’t break active runs):
```
docker system prune -a --volumes --filter "until=168h"
docker volume prune --filter "label!=keep"
```
  Check what will be removed first by running without `-a` (which only prunes dangling images), or with `--dry-run` if your Docker version supports it for prune.
- Limit container log growth on the host by configuring Docker log rotation (daemon.json):
```
{
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "50m",
    "max-file": "3"
  }
}
```
  Then restart Docker.
- In the Docker work pool base job template, set:
  - `working_dir` to a mounted volume with more space
  - `volumes` to mount a large host path for temp/data, e.g. `/mnt/bigdisk/tmp:/tmp`
  - `env` to set `TMPDIR=/tmp` (which you’ve mounted)
- If your flows write results/artifacts locally, move them to object storage (S3, GCS, etc.) instead of the container filesystem.

**Kubernetes worker**
- Node-level cleanup:
  - Ensure image garbage collection and container log rotation are configured on nodes (containerd/docker). Large container logs commonly fill disks on K8s nodes.
  - Remove evicted/terminated pods and orphaned PVCs/volumes not in use (a command sketch follows after this list).
- Set resource policies so pods are killed before the node fills up:
  - Add ephemeral-storage requests/limits to your flow pods (via the work pool’s base job template pod spec).
  - Use `emptyDir` with `sizeLimit` for `/tmp`/working directories, or mount a larger PVC for `working_dir`.
- In the Kubernetes work pool base job template:
  - Set `working_dir` and mount an `emptyDir` or PVC there
  - Set `env` like `TMPDIR=/work/tmp` and mount that path
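For the node-level pod cleanup mentioned above, a conservative sweep might look like the following; list first and review before deleting, since `Failed` covers evicted pods but also other failures you may want to inspect:
```
# list, then delete, pods that finished in a Failed (including Evicted) state, across all namespaces
kubectl get pods -A --field-selector=status.phase=Failed
kubectl delete pods -A --field-selector=status.phase=Failed
```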
**Prefect-specific tips**
- There is no built-in `prefect cache clear` or similar CLI in Prefect 3.x, and we verified there’s no `prefect cache` command.
- If you’re persisting results locally, prefer a remote filesystem (S3/GCS/Azure) or a mounted large-volume path. For ad-hoc local files, ensure your flow/task code deletes temporary files (or use `tempfile.TemporaryDirectory()`).
- If you’re using `flow.from_source(...)`, the worker clones/exports your code per run; unexpected termination can leave temp dirs. Direct TMPDIR/working_dir to a mount you can prune safely.

**Optional recurring maintenance**
- Add a cron/systemd timer to purge ancient temp:
```
# m h dom mon dow user  command
15 3 * * * root find /tmp -maxdepth 1 -name 'prefect*' -type d -mtime +2 -exec rm -rf {} +
```
- For Docker hosts, schedule a periodic prune with conservative filters.

**References**
- Prefect 3 settings and profiles (PREFECT_HOME): Settings and profiles
- Prefect 3.2 release notes (worker/deploy improvements): Release notes
- Background on infra cleanup jobs (general guidance): The role of infrastructure cleanup jobs

Can you share:
- Which worker type you’re using (process, docker, kubernetes)?
- Where it’s running (VM, container, k8s node), and any custom `working_dir`/`TMPDIR` you’ve set?
- A quick `df -h` and the top few directories from `du -xhd1 /tmp`?

I can then give you the exact commands/config to fix and prevent it going forward.