<@ULVA73B9P> i am getting `OSError: [Errno 28] No ...
# ask-marvin
p
@Marvin I am getting
OSError: [Errno 28] No space left on device
from the following code. len(args) is 10000, and there are 16 processes. I have enough disk space; what can cause this error?
```python
with ProcessPoolTaskRunner(max_workers=args.processes) as runner:
    futures = [runner.submit(process_arg, args) for args in arguments]
    for future in futures:
        future.wait()
```
m
thought for 274.4 seconds
Short answer: this is usually not your main disk. With many processes/futures, ENOSPC often comes from one of:
- /dev/shm (shared memory) filling up, which is very common inside Docker/Kubernetes
- Inode exhaustion (too many small files), even if `df -h` shows plenty of space
- A small or isolated /tmp, or a container writable layer that's full
- Local result persistence creating lots of files

A few checks to pinpoint it:

Check shared memory:
```bash
df -h /dev/shm
ls -l /dev/shm | wc -l
```
If you're in Docker/K8s, /dev/shm is typically tiny (64 MB by default in Docker), and Python multiprocessing uses POSIX semaphores in /dev/shm; creating many of them can trigger "No space left on device".
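If it's easier to check from inside the running container, here is a minimal Python sketch using only the standard library; it assumes a Linux environment where /dev/shm exists:

```python
import shutil
from pathlib import Path

# How full is /dev/shm, and how many POSIX semaphore files are sitting in it?
usage = shutil.disk_usage("/dev/shm")
print(f"/dev/shm: {usage.used / 2**20:.1f} MiB used of {usage.total / 2**20:.1f} MiB")

sem_files = list(Path("/dev/shm").glob("sem.*"))
print(f"{len(sem_files)} sem.* entries in /dev/shm")
```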
Check inodes:
```bash
df -i
```
If IUse% is 100% for the filesystem hosting your temp/results, you're out of inodes.
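The same counters are available from Python via os.statvfs if shell access is awkward; the paths below are just examples, so point it at whatever filesystem holds your temp/results:

```python
import os

# f_files = total inodes, f_ffree = free inodes (what `df -i` reports)
for path in ("/", "/tmp", "/dev/shm"):
    st = os.statvfs(path)
    used = st.f_files - st.f_ffree
    pct = 100 * used / st.f_files if st.f_files else 0
    print(f"{path}: {used}/{st.f_files} inodes used ({pct:.0f}%)")
```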
Check temp and the container writable layer:
```bash
df -h /
df -h /tmp
```
In containers, the writable layer may be full even if the host isn't.

Mitigations:

1) If running in Docker/Kubernetes
- Increase shared memory:
  - Docker: add `--shm-size=1g`
  - Kubernetes: mount an `emptyDir` at `/dev/shm` with `medium: Memory` and a `sizeLimit` (see the sketch after this list)
- Ensure your container's writable layer or ephemeral volumes have enough space.
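For the Kubernetes shared-memory mount above, if you happen to build the pod spec from Python with the official kubernetes client, it looks roughly like this sketch (the volume name, image, and 1Gi limit are just examples); the equivalent YAML is the usual volumes/volumeMounts stanza:

```python
from kubernetes import client

# Memory-backed emptyDir mounted over /dev/shm (names and sizes are examples)
shm_volume = client.V1Volume(
    name="dshm",
    empty_dir=client.V1EmptyDirVolumeSource(medium="Memory", size_limit="1Gi"),
)
shm_mount = client.V1VolumeMount(name="dshm", mount_path="/dev/shm")

pod_spec = client.V1PodSpec(
    containers=[
        client.V1Container(
            name="flow-run",                  # example container name
            image="my-prefect-image:latest",  # example image
            volume_mounts=[shm_mount],
        )
    ],
    volumes=[shm_volume],
)
```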
2) Don't queue 10k futures at once
- Submit in batches so only a small number are "in flight"; this reduces OS resources (/dev/shm objects, temp files). For example:
```python
batch_size = 1000  # or 4 * args.processes

with ProcessPoolTaskRunner(max_workers=args.processes) as runner:
    for i in range(0, len(arguments), batch_size):
        futures = [runner.submit(process_arg, a) for a in arguments[i:i+batch_size]]
        for f in futures:
            f.wait()
```
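If fixed batches end up waiting on stragglers, a bounded window of in-flight futures works too; this sketch uses only the submit()/wait() calls from your snippet, and the window size is just an example:

```python
from collections import deque

max_in_flight = 4 * args.processes  # example bound; tune as needed
with ProcessPoolTaskRunner(max_workers=args.processes) as runner:
    in_flight = deque()
    for a in arguments:
        if len(in_flight) >= max_in_flight:
            in_flight.popleft().wait()  # block on the oldest submission first
        in_flight.append(runner.submit(process_arg, a))
    for f in in_flight:                 # drain whatever is left
        f.wait()
```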
3) Results/temp storage
- If you've enabled local result persistence, 10k tasks can create lots of small files; point results at a larger filesystem or disable persistence for this run (a sketch follows below).
- Prefect 3.x results docs: Results in Prefect 3
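As a sketch, assuming Prefect 3.x's result options (persist_result, result_storage, and LocalFileSystem), either of these keeps result files off a small filesystem; the basepath is just an example:

```python
from prefect import flow, task
from prefect.filesystems import LocalFileSystem

@task(persist_result=False)  # skip writing a result file for each of the 10k tasks
def process_arg(arg):
    ...

# ...or keep persistence but write results to a volume with room to spare:
@flow(result_storage=LocalFileSystem(basepath="/data/prefect-results"))  # example path
def main_flow(arguments):
    ...
```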
4) If you see many leftover `sem.*` files in `/dev/shm`, you're likely hitting the shared-memory limit or leaking semaphores after crashes. Cleaning them up and increasing `/dev/shm`
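As a stdlib-only sketch, this lists leftover semaphore files and can remove them; only run the removal once no workers are alive, since deleting a semaphore a live process still uses will break it:

```python
import os
from pathlib import Path

def clean_stale_semaphores(dry_run: bool = True):
    """List (and optionally remove) sem.* entries in /dev/shm."""
    for sem in Path("/dev/shm").glob("sem.*"):
        print("found", sem)
        if not dry_run:
            os.remove(sem)  # only safe after every worker process has exited
```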
usually resolves it.

If you can share:
- Whether this is running in Docker/Kubernetes (and any resource limits)
- The full traceback (to see if it fails creating a semaphore or writing a file)
- Whether you've enabled result persistence
…I can give more targeted guidance.