
itay livni

02/08/2020, 4:35 PM
Hi - When I use the DaskExecutor without a host address, the program runs fine. When I spin up a scheduler and workers ahead of time and pass the address to the DaskExecutor, I get two errors:
KilledWorker
and
dask-worker: error: unrecognized arguments: tcp://127.0.0.5:8786
with the second error always echoing whatever arguments I give the scheduler. Any thoughts on how to profile the DaskExecutor?

Chris White

02/08/2020, 4:36 PM
Hey itay - could you share the code you use to initialize the dask executor?

itay livni

02/08/2020, 4:38 PM
@Chris White
executor = DaskExecutor(address="tcp://127.0.0.5:8786", debug=True)
or debug=False
if make_nodes_executor == 'dask':
    from prefect.engine.executors import DaskExecutor
    # client = Client()
    executor = DaskExecutor(address="tcp://127.0.0.5:8786", debug=True)  # Does not work
else:  # make_nodes_executor == "local"
    from prefect.engine.executors import LocalExecutor
    executor = LocalExecutor()
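For reference, a minimal sketch of that setup, assuming the scheduler and workers were started with the standard dask-scheduler / dask-worker CLI. The address handed to dask-worker and to DaskExecutor should be the bare tcp URL; the angle brackets in the pasted snippets look like Slack link markup and are what the dask-worker CLI is complaining about as unrecognized arguments:

# Assumed CLI startup, shown here as comments:
#   dask-scheduler --host 127.0.0.5      # serves tcp://127.0.0.5:8786
#   dask-worker tcp://127.0.0.5:8786     # no angle brackets around the address
from prefect import Flow, task
from prefect.engine.executors import DaskExecutor

@task
def say_hello():
    return "hello"

with Flow("dask-executor-check") as flow:
    say_hello()

# Connect to the pre-existing scheduler by its bare tcp address
executor = DaskExecutor(address="tcp://127.0.0.5:8786")
flow.run(executor=executor)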

Chris White

02/08/2020, 4:47 PM
Hm that code looks fine - how are you creating your dask scheduler / workers?

itay livni

02/08/2020, 5:21 PM
The context to this is: there is a Flow within a LOOP task. That task is triggered by another flow run locally.
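In rough outline, that pattern looks something like this (a sketch with placeholder names, not the actual code; it assumes the Prefect 0.x LOOP signal and a hypothetical build_inner_flow helper):

from prefect import Flow, task
from prefect.engine.signals import LOOP

@task
def looping_task():
    # Each loop iteration builds and runs an inner flow -- a flow run inside a task
    inner_flow = build_inner_flow()   # hypothetical helper that returns a Flow
    inner_flow.run()
    raise LOOP(message="next iteration")  # termination condition omitted

with Flow("outer") as outer_flow:
    looping_task()

outer_flow.run()  # the outer flow is run locally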

Chris White

02/08/2020, 5:25 PM
Hmmm, I’ve never seen this before, but it could be caused by running flows within tasks, especially if both flows use a dask executor.

itay livni

02/08/2020, 5:27 PM
The actual code looks like this:

Chris White

02/08/2020, 5:29 PM
Yea I think your problem is running a flow within a task here

itay livni

02/08/2020, 5:33 PM
That is what I was thinking... If that is the case, it's good enough for now, meaning the program is being parallelized where its largest choke point is. Is there a way to get the IP of the dask executor without specifying it from prefect?
I am going to try it as a local agent in the cloud and try to get more insight.

Chris White

02/08/2020, 5:35 PM
Yea, you should be able to access a dask worker client since this task will be running on a worker, but this is intended for submitting single pieces of work, not entire flows.
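A minimal sketch of what that could look like, assuming dask.distributed's worker_client and a hypothetical process_item function:

from dask.distributed import worker_client
from prefect import task

@task
def fan_out(items):
    # Inside a task that is already running on a dask worker, worker_client()
    # yields a client connected to the same scheduler; use it for individual
    # pieces of work, not for running entire flows.
    with worker_client() as client:
        futures = client.map(process_item, items)  # process_item is hypothetical
        return client.gather(futures)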

itay livni

02/08/2020, 5:39 PM
Is dask the right tool for this, or should I use multiprocessing or some other library/method inside the LOOP?
@Chris White I can safely say prefect does not like how generate_knowledge_graph is put together. It works, but running it in the cloud or using storage does not. Basically anything that involves cloudpickle fails. I need to learn how to chain and update flows.
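One way to narrow down where serialization breaks is to round-trip the flow through cloudpickle locally before registering it or adding storage (a sketch; generate_knowledge_graph_flow stands in for the actual Flow object):

import cloudpickle

try:
    cloudpickle.loads(cloudpickle.dumps(generate_knowledge_graph_flow))  # assumed Flow object
    print("flow serializes cleanly")
except Exception as exc:
    print(f"flow is not cloudpickle-serializable: {exc}")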

Chris White

02/08/2020, 7:46 PM
Both would be appropriate, I think; you will just need to access their APIs directly and not through prefect.
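For example, with multiprocessing used directly inside the task body (again a sketch; process_item is a hypothetical function):

import multiprocessing
from prefect import task

@task
def fan_out(items):
    # Parallelize inside the task with multiprocessing's own API,
    # rather than going through Prefect's executor machinery.
    with multiprocessing.Pool() as pool:
        return pool.map(process_item, items)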

itay livni

02/08/2020, 9:30 PM
@Chris White Thank you
👍 1