<@ULVA73B9P> Can I instantiate a class coming from...
# ask-marvin
s
@Marvin Can I instantiate a class coming from the same flow script when running a task within a
DaskTaskRunner()?
m
thought for 168.4 seconds
Short answer: yes. Tasks running under DaskTaskRunner can instantiate a class defined in the same flow file—as long as that class is defined at the module top level and your code is import-safe. Key points and caveats: - Define the class at the top level of the module (not inside a task/flow/function). Cloudpickle will serialize it for Dask workers. - Guard your entry point with
if __name__ == "__main__":
to avoid accidental execution when workers import your module, especially on Windows or when using processes. - Keep the class’s init and attributes serializable. Don’t stash open file handles, DB connections, or clients on the class at definition time; create those inside task calls or methods executed on the worker. - Local vs remote workers: - Local Dask cluster: same-file class usually works fine. - Remote Dask cluster or deployments: ensure workers can import your code. Best practice is to put reusable classes in a proper module/package and deploy from source so workers fetch the same codebase. - In Prefect 3.x, prefer
flow.from_source(...).deploy(...)
to ship code to workers. Working example (Prefect 3.x):
Copy code
from prefect import flow, task
from prefect_dask import DaskTaskRunner

class Greeter:
    def __init__(self, name: str):
        self.name = name
    def greet(self) -> str:
        return f"hello {self.name}"

@task
def do_work(who: str) -> str:
    g = Greeter(who)
    return g.greet()

@flow(task_runner=DaskTaskRunner())
def my_flow(names):
    futures = [do_work.submit(n) for n in names]
    return [f.result() for f in futures]

if __name__ == "__main__":
    print(my_flow(["Arthur", "Ford"]))
What tends to break: - Class defined inside a task/flow or inside
if __name__ == "__main__":
(harder to serialize/import). - Class that closes over local variables or contains unpicklable members. - Top-level side effects executing on worker import because there’s no
__main__
guard. Docs, if you want a light bedtime read: - Prefect Dask integration - Task runners in Prefect 3.x If you’re running on a remote Dask cluster or using deployments, do you already package your code or deploy from source? That determines whether “same file” will be reliable or merely optimistic.