# prefect-community
i
Hi - When I use the DaskExecutor without a host address the program compiles. When I spin up a scheduler and workers ahead of time and pass it to the DaskExecutor, I get two errors.
KilledWorker
and
dask-worker: error: unrecognized arguments: tcp://127.0.0.5:8786
with the second error always being any arguments I give the scheduler. Any thoughts on how to profile the DaskExecutor?
c
Hey itay - could you share the code you use to initialize the dask executor?
i
@Chris White
executor = DaskExecutor(address="tcp://127.0.0.5:8786", debug=True)
or debug=False
```python
if make_nodes_executor == 'dask':
    from prefect.engine.executors import DaskExecutor
    # client = Client()
    executor = DaskExecutor(address="tcp://127.0.0.5:8786", debug=True)  # Does not work
else:  # make_nodes_executor == "local"
    from prefect.engine.executors import LocalExecutor
    executor = LocalExecutor()
```
c
Hm that code looks fine - how are you creating your dask scheduler / workers?
i
The context to this is: there is a `Flow` within a `LOOP` task. That task is triggered by another flow run locally.
c
Hmmm, I’ve never seen this before, but it could be caused by running flows within tasks, especially if both flows use a dask executor
i
Actual code looks like this
c
Yea I think your problem is running a flow within a task here
i
That is what I was thinking... If that is the case, it's good enough for now, meaning the program is being parallelized at its largest choke point. Is there a way to get the IP of the dask scheduler without specifying it from prefect?
I am going to try it as a local agent in the cloud and try to get insight
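One way to avoid hard-coding the scheduler address is to start the cluster programmatically and read the address back. A minimal sketch, assuming `distributed` is installed (the `DaskExecutor` hand-off is shown as a comment, since it depends on the flow code above):

```python
from distributed import LocalCluster

if __name__ == "__main__":
    # Start a local scheduler + workers and read the address back,
    # instead of hard-coding something like tcp://127.0.0.5:8786.
    cluster = LocalCluster(n_workers=2, threads_per_worker=1)
    address = cluster.scheduler_address  # e.g. "tcp://127.0.0.1:<some port>"
    print(address)
    # executor = DaskExecutor(address=address)  # then hand this to the flow
    cluster.close()
```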
c
Yea, you should be able to access a dask worker client since this task will be running on a worker, but this is intended for submitting single pieces of work, not entire flows
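The worker client Chris mentions is `distributed.worker_client()`. A self-contained sketch (a local in-process cluster stands in for the pre-spun-up one; `nested` is a hypothetical stand-in for task code):

```python
from distributed import Client, worker_client

def nested(x):
    # Running on a worker: worker_client() hands back a client bound to
    # the same scheduler, meant for submitting extra *tasks*, not flows.
    with worker_client() as client:
        return client.submit(lambda y: y + 1, x).result()

if __name__ == "__main__":
    with Client(n_workers=1, threads_per_worker=2) as client:
        print(client.submit(nested, 41).result())  # 42
```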
i
Is dask the right tool for this, or should I use `multiprocessing` or some other library/method inside the LOOP?
@Chris White I can safely say prefect does not like how `generate_knowledge_graph` is put together. It works, but running it in the cloud or using storage does not. Basically anything that goes through `cloudpickle` fails. I need to learn how to chain and update flows.
c
Both would be appropriate, I think; you will just need to access their APIs directly and not through prefect
i
@Chris White Thank you
👍 1