Question: when I run `flow.run(executor=DaskExecut...
# prefect-community
j
Question: when I run
flow.run(executor=DaskExecutor(address=dask_scheduler_address))
does it serialize the flow / computation graph and send it as a task to hold onto? Basically, if I have a dask scheduler spun up and I am working on my local machine, can I call
flow.run(...)
and then shut down my machine?
z
Hi @Jackson Maxfield Brown! I believe you can call
flow.run()
and shut down your computer-- looks like we're using a fire and forget approach to how we submit work to Dask.
j
Unfortunately wasn't able to get it working immediately using the following:
Copy code
from prefect import task, Flow
import random
import json
from time import sleep
from prefect.engine.executors import DaskExecutor

@task
def inc(x):
    sleep(random.random() / 10)
    return x + 1


@task
def dec(x):
    sleep(random.random() / 10)
    return x - 1


@task
def add(x, y):
    sleep(random.random() / 10)
    return x + y


@task(name="sum")
def list_sum(arr):
    return sum(arr)

@task(name="save")
def save_total(total)
    with open("~/dask_test.json", "w") as f:
        json.dump({"total": total}, f)

range_list = list(range(0,int(1e5)))

with Flow("dask-example") as flow:
    incs = inc.map(x=range_list)
    decs = dec.map(x=range_list)
    adds = add.map(x=incs, y=decs)
    total = list_sum(adds)
    save_total(total)

executor = DaskExecutor(address="<tcp://localhost>:{PORT}")

# This will run until the computation is complete
flow.run(executor=executor)
Will keep trying a few different configurations.
z
I'm sorry to hear that! What issues are you running into-- could very well be that this is an opportunity for us to improve our documentation.
j
The setup we are currently running is to develop a flow on our local machine (in this case it is a macbook pro but could be a local desktop or similar). From our local machine we SSH into our cluster and launch a dask scheduler, we keep this SSH session running in the background and grab the dask executor address. On our local machine with the Flow code above we pass in the dask executor port that we have open through SSH and we run the pipeline. That all works. What isn't working is doing all of that then closing our laptop lid or shutting the local machine down because the SSH connection gets dropped. The original question was whether or not Prefect (or I think Dask would be the culprit) basically serializes the task graph and holds onto it for us but it doesn't look like it does because dropping the SSH connection interrupts the flow from running.