Alex Ilyin
08/12/2025, 4:39 PM

Marvin
08/12/2025, 4:40 PM
- If your cache_key_fn isn’t robust, you can serve stale results and scratch your head for hours.
- Side-effecting tasks shouldn’t rely on cached/persisted outputs for correctness.
- Observability and UX surprises
- Persisted results live in your storage; the UI shows references, not the full payload. Debugging large/complex results can be a scavenger hunt.
When to think twice:
- High-volume, small/fast tasks (I/O cost dominates).
- Sensitive data pipelines without strong storage policies.
- Workloads with unstable schemas or frequent dependency changes.
- Anything where correctness hinges on re-execution rather than reuse.
If you do persist (a sketch follows this list):
- Prefer JSON/structured serializers when possible; reserve Pickle for objects you truly can’t serialize otherwise.
- Set storage lifecycle policies (TTL/archival).
- Make cache keys explicit and conservative.
- Keep read/write environments aligned (Python and deps).
- Encrypt and lock down storage.
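A minimal 2.x sketch of the serializer side of that checklist (the summarize task is illustrative; TTL and encryption are enforced at the storage layer, e.g. bucket policies, not by Prefect):

from prefect import task

@task(
    persist_result=True,
    result_serializer="json",  # structured and portable; reserve pickle for what JSON can't express
)
def summarize(rows: list[dict]) -> dict:
    # a JSON-serializable return value stays readable across environments
    return {"count": len(rows)}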
If you’re on 2.x today, it’s fine, but consider planning for Prefect 3.x, which improves result handling and deployment ergonomics.

Alex Ilyin
08/12/2025, 4:58 PM

Marvin
08/12/2025, 5:02 PM
- Use task-level retries so transient failures rerun automatically:
from prefect import task

@task(retries=3, retry_delay_seconds=60)
def fragile():
    ...
- Make tasks idempotent with external checkpoints
- Have each task check external state (S3/DB/file) and no-op if work is already done. Then rerunning the flow only “does” the failed/missing parts.
- Example:
@task
def process_record(record_id):
    if already_processed(record_id):  # e.g., S3 object/DB row exists
        return marker_for(record_id)
    # do work, then write the marker
    write_marker(record_id)
    return marker_for(record_id)
- Split into subflows/stages with checkpoints
- Break your DAG into stages that write durable markers; rerunning only reruns stages missing their markers (see the sketch below).
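A sketch of that staging pattern, using local files as a stand-in for a durable marker store (S3 keys or DB rows work the same way; MARKER_DIR, stage_done, and mark_done are illustrative helpers):

from pathlib import Path
from prefect import flow

MARKER_DIR = Path("markers")  # stand-in for durable storage (S3, database, ...)

def stage_done(name: str) -> bool:
    return (MARKER_DIR / f"{name}.done").exists()

def mark_done(name: str) -> None:
    MARKER_DIR.mkdir(exist_ok=True)
    (MARKER_DIR / f"{name}.done").touch()

@flow
def extract():
    ...  # stage work goes here

@flow
def transform():
    ...

@flow
def pipeline():
    # a rerun only executes stages that are missing their marker
    for name, stage in [("extract", extract), ("transform", transform)]:
        if stage_done(name):
            continue
        stage()
        mark_done(name)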
What doesn’t exist in 2.x:
- A CLI/UI “rerun failed tasks only” for an already-failed flow run.
- Cross-run caching without persisted results to reconstruct prior task outputs.
If you’re open to minimal persistence, enabling cache keys + persisted results lets a fresh run skip previously successful tasks automatically. If not, external checkpoints are the usual pattern.
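A sketch of that combination in 2.x, using Prefect’s built-in task_input_hash (the enrich task itself is illustrative):

from datetime import timedelta
from prefect import task
from prefect.tasks import task_input_hash

@task(
    persist_result=True,                 # the result must be persisted for a fresh run to reuse it
    cache_key_fn=task_input_hash,        # same inputs -> same cache key across flow runs
    cache_expiration=timedelta(days=1),  # conservative TTL so stale hits age out
)
def enrich(record_id: int) -> dict:
    return {"id": record_id}

On a fresh run with the same inputs, tasks whose cache keys are still valid complete from the persisted result instead of re-executing.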
Can you share whether this is for rerunning after a completed failed run, and whether your tasks’ outputs are consumed downstream? That will determine how strict your checkpointing needs to be.

Alex Ilyin
08/12/2025, 5:30 PM

Marvin
08/12/2025, 5:32 PM
The @task decorator exposes persist_result, result_storage, and result_serializer, so you can enable result persistence surgically where it pays off.
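For example, a hedged sketch of that surgical enablement (the task names and LocalFileSystem path are assumptions; any storage block, e.g. S3, slots in the same way):

from prefect import flow, task
from prefect.filesystems import LocalFileSystem

@task
def list_ids() -> list[int]:
    return [1, 2, 3]  # cheap and fast: persistence stays off

@task(
    persist_result=True,
    result_storage=LocalFileSystem(basepath="/tmp/results"),  # assumption: swap in your own block
    result_serializer="json",
)
def expensive_lookup(record_id: int) -> dict:
    return {"id": record_id}  # only this task’s output is persisted

@flow
def pipeline():
    for record_id in list_ids():
        expensive_lookup(record_id)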