Santiago Gutierrez
08/20/2025, 8:25 AM
DaskTaskRunner()?
Marvin
08/20/2025, 8:26 AM
Marvin
08/20/2025, 8:29 AM
- Guard the flow call with if __name__ == "__main__": to avoid accidental execution when workers import your module, especially on Windows or when using processes.
- Keep the class’s init and attributes serializable. Don’t stash open file handles, DB connections, or clients on the class at definition time; create those inside task calls or methods executed on the worker.
- Local vs remote workers:
  - Local Dask cluster: same-file class usually works fine.
  - Remote Dask cluster or deployments: ensure workers can import your code. Best practice is to put reusable classes in a proper module/package and deploy from source so workers fetch the same codebase.
  - In Prefect 3.x, prefer flow.from_source(...).deploy(...) to ship code to workers.
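To make the "keep attributes serializable" point concrete, here is a minimal sketch using only the standard library. The names EagerStore, LazyStore, and attrs_are_picklable are hypothetical, invented for illustration, and SQLite stands in for whatever connection or client you actually use. It checks instance attributes with stdlib pickle, which only approximates what a Dask worker hand-off does, so treat it as a rough sanity check rather than a guarantee.

```python
import pickle
import sqlite3


class EagerStore:
    """Anti-pattern (hypothetical): holds a live connection as an attribute."""

    def __init__(self, path: str):
        self.path = path
        self.conn = sqlite3.connect(path)  # live connections cannot be pickled


class LazyStore:
    """Keeps only plain data; opens the connection where the work runs."""

    def __init__(self, path: str):
        self.path = path

    def count_rows(self) -> int:
        # The connection is created and closed inside the call, so it only
        # ever exists in the process that executes this method.
        conn = sqlite3.connect(self.path)
        try:
            conn.execute("CREATE TABLE IF NOT EXISTS t (x INTEGER)")
            return conn.execute("SELECT COUNT(*) FROM t").fetchone()[0]
        finally:
            conn.close()


def attrs_are_picklable(obj) -> bool:
    """Return True if every instance attribute survives stdlib pickle."""
    try:
        pickle.dumps(vars(obj))
        return True
    except Exception:
        return False
```

Calling attrs_are_picklable on an instance before submitting it to a task is a cheap way to catch a stray handle early. The LazyStore shape is the one to aim for: plain data on the instance, resources opened inside the method that runs on the worker.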
Working example (Prefect 3.x):
from prefect import flow, task
from prefect_dask import DaskTaskRunner


class Greeter:
    def __init__(self, name: str):
        self.name = name

    def greet(self) -> str:
        return f"hello {self.name}"


@task
def do_work(who: str) -> str:
    g = Greeter(who)
    return g.greet()


@flow(task_runner=DaskTaskRunner())
def my_flow(names):
    futures = [do_work.submit(n) for n in names]
    return [f.result() for f in futures]


if __name__ == "__main__":
    print(my_flow(["Arthur", "Ford"]))
What tends to break:
- Class defined inside a task/flow or inside if __name__ == "__main__": (harder to serialize/import).
- Class that closes over local variables or contains unpicklable members.
- Top-level side effects executing on worker import because there’s no __main__ guard.
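The first failure mode can be reproduced without Dask at all. The sketch below uses stdlib pickle and hypothetical names (TopLevelGreeter, make_local_greeter, survives_pickle). One caveat is baked in as an assumption: Dask can fall back to cloudpickle, which is more permissive than stdlib pickle, so a failure here is a conservative warning sign rather than proof that the task will fail on a cluster.

```python
import pickle


class TopLevelGreeter:
    """Module-level class: pickled by reference (module plus qualified name)."""

    def __init__(self, name: str):
        self.name = name


def make_local_greeter(name: str):
    # Class defined inside a function: stdlib pickle cannot find it by name,
    # because it is not an attribute of any importable module.
    class LocalGreeter:
        def __init__(self, name: str):
            self.name = name

    return LocalGreeter(name)


def survives_pickle(obj) -> bool:
    """Return True if obj round-trips through stdlib pickle."""
    try:
        pickle.loads(pickle.dumps(obj))
        return True
    except Exception:
        return False


if __name__ == "__main__":
    # The guard keeps these calls from running when a worker imports the module.
    print(survives_pickle(TopLevelGreeter("Arthur")))
    print(survives_pickle(make_local_greeter("Ford")))
```

Run as a script, the module-level instance round-trips and the function-local one does not. Moving the class to module scope (or, better, into an importable package) is the fix in both the stdlib and the Dask case.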
Docs, if you want a light bedtime read:
- Prefect Dask integration
- Task runners in Prefect 3.x
If you’re running on a remote Dask cluster or using deployments, do you already package your code or deploy from source? That determines whether “same file” will be reliable or merely optimistic.