ciaran
06/17/2021, 1:54 PM
I'm declaring both a pod_template and a scheduler_pod_template in my dask_kubernetes.KubeCluster DaskExecutor.
Both the templates (for now) are exactly the same:
"pod_template": make_pod_spec(
    image=os.environ["BAKERY_IMAGE"],
    labels={"flow": flow_name},
    env={
        "AZURE_STORAGE_CONNECTION_STRING": os.environ[
            "FLOW_STORAGE_CONNECTION_STRING"
        ]
    },
),
"scheduler_pod_template": make_pod_spec(
    image=os.environ["BAKERY_IMAGE"],
    labels={"flow": flow_name},
    env={
        "AZURE_STORAGE_CONNECTION_STRING": os.environ[
            "FLOW_STORAGE_CONNECTION_STRING"
        ]
    },
),
If I try to run a flow with both declared, my Dask Scheduler pod fails with:
Traceback (most recent call last):
  File "/srv/conda/envs/notebook/bin/dask-worker", line 8, in <module>
    sys.exit(go())
  File "/srv/conda/envs/notebook/lib/python3.8/site-packages/distributed/cli/dask_worker.py", line 462, in go
    main()
  File "/srv/conda/envs/notebook/lib/python3.8/site-packages/click/core.py", line 1137, in __call__
    return self.main(*args, **kwargs)
  File "/srv/conda/envs/notebook/lib/python3.8/site-packages/click/core.py", line 1062, in main
    rv = self.invoke(ctx)
  File "/srv/conda/envs/notebook/lib/python3.8/site-packages/click/core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/srv/conda/envs/notebook/lib/python3.8/site-packages/click/core.py", line 763, in invoke
    return __callback(*args, **kwargs)
  File "/srv/conda/envs/notebook/lib/python3.8/site-packages/distributed/cli/dask_worker.py", line 406, in main
    nannies = [
  File "/srv/conda/envs/notebook/lib/python3.8/site-packages/distributed/cli/dask_worker.py", line 407, in <listcomp>
    t(
  File "/srv/conda/envs/notebook/lib/python3.8/site-packages/distributed/nanny.py", line 220, in __init__
    host = get_ip(get_address_host(self.scheduler.address))
  File "/srv/conda/envs/notebook/lib/python3.8/site-packages/distributed/comm/addressing.py", line 142, in get_address_host
    return backend.get_address_host(loc)
  File "/srv/conda/envs/notebook/lib/python3.8/site-packages/distributed/comm/tcp.py", line 572, in get_address_host
    return parse_host_port(loc)[0]
  File "/srv/conda/envs/notebook/lib/python3.8/site-packages/distributed/comm/addressing.py", line 90, in parse_host_port
    port = _default()
  File "/srv/conda/envs/notebook/lib/python3.8/site-packages/distributed/comm/addressing.py", line 69, in _default
    raise ValueError("missing port number in address %r" % (address,))
ValueError: missing port number in address '$(DASK_SCHEDULER_ADDRESS)'
But if I only declare the pod_template, everything works out great. I'm assuming the fact that I declare the scheduler_pod_template means I'm losing some default setup somewhere down the line?
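One possible reading of the traceback, sketched below with plain dictionaries rather than the real dask_kubernetes objects: the scheduler pod is starting dask-worker (the first frame is the dask-worker entry point), which suggests the spec built by make_pod_spec carries worker-style container args. Kubernetes leaves a `$(VAR)` reference unchanged when the variable is not defined for the container, so if nothing sets DASK_SCHEDULER_ADDRESS on the scheduler pod, the literal string reaches dask-worker. The container dict, its `args` values, the image name, and both helper functions are hypothetical stand-ins for illustration, not the library's API.

```python
import copy
import re

# Hypothetical stand-in for the container section of a worker-style pod spec.
# The args mirror what the traceback implies (dask-worker called with a
# $(DASK_SCHEDULER_ADDRESS) reference); the exact defaults are an assumption.
worker_container = {
    "name": "dask-worker",
    "image": "example/image:latest",  # placeholder image name
    "args": ["dask-worker", "$(DASK_SCHEDULER_ADDRESS)", "--nthreads", "1"],
}

_REF = re.compile(r"\$\(([A-Za-z_][A-Za-z0-9_]*)\)")


def expand_k8s_refs(args, env):
    """Roughly mimic Kubernetes $(VAR) expansion in container args.

    References to variables missing from `env` are left unchanged.
    The $$(VAR) escape syntax is ignored in this simplified sketch.
    """
    def substitute(match):
        return env.get(match.group(1), match.group(0))

    return [_REF.sub(substitute, arg) for arg in args]


def as_scheduler_container(container):
    """Return a copy of the container whose args start a scheduler instead."""
    scheduler = copy.deepcopy(container)
    scheduler["args"] = ["dask-scheduler"]
    return scheduler


# Worker pod: the variable is defined, so the reference is resolved.
print(expand_k8s_refs(worker_container["args"],
                      {"DASK_SCHEDULER_ADDRESS": "tcp://10.0.0.1:8786"}))

# Scheduler pod reusing the worker args: the variable is undefined,
# so the literal reference survives and dask-worker cannot parse it.
print(expand_k8s_refs(worker_container["args"], {}))

# A scheduler-style container has no reference left to expand.
print(expand_k8s_refs(as_scheduler_container(worker_container)["args"], {}))
```

If that reading is right, the scheduler_pod_template would need its container args overridden to run dask-scheduler rather than reusing the worker spec unchanged; how to do that with make_pod_spec is worth checking against the dask_kubernetes documentation for the installed version.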