# ask-community
Hello everybody! I am trying to test a PoC flow using Docker storage and DaskExecutor. I deployed a local Prefect server and a separate Dask cluster. Running the flow directly works well, but when registering and running it with a Docker agent (prefect agent docker), I get a permission error. Thanks for your help!
Hi @Amine Dirhoussi If you get the chance, could you move the traceback to the thread so we can keep the main channel a bit cleaner? Is your local Dask cluster running on the same machine as your server? You may need to configure networking so that your container can reach the Dask scheduler. The “Permission denied” is likely related to the user within your container. I do NOT recommend this for production, but for local development, if you add this line to your Dockerfile, I think this specific error may be fixed:
USER root
Do you have a specific reason to use your own local Dask cluster? Otherwise, if you’re not running in a distributed setting, you could try using LocalDaskExecutor instead, which parallelizes work across your local processes and threads and is much easier to set up and use.
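For illustration only (this is stdlib Python, not Prefect’s API): LocalDaskExecutor gives you roughly the kind of local parallelism you would get from a thread pool running independent tasks:

```python
from concurrent.futures import ThreadPoolExecutor

def task(x):
    """A stand-in for an independent flow task."""
    return x * x

# Run the "tasks" concurrently across local threads; a local threaded
# executor does essentially this for tasks with no mutual dependencies.
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(task, range(8)))

print(results)  # [0, 1, 4, 9, 16, 25, 36, 49]
```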
Hello @Anna Geller Thanks for the response, I'm new to Slack, sorry for the huge traceback block 😅 The server and Dask cluster are on the same machine for the local test, and the scheduler does receive the call. Do I need to update the permissions in the Dockerfile of the flow? Sorry, I'm a bit confused about how the cluster executor actually runs a flow that is packaged as a Docker image. I need to test it this way because our data science team has an existing Dask cluster that we need to use to run the flow on. Thanks a lot again.
@Amine Dirhoussi when you register your flow, Prefect pickles your flow and stores it in the Docker image. When Prefect runs the flow, it retrieves the flow object and submits the tasks to the Dask executor. I’m not entirely clear on how permissions need to be set in Docker storage; I only suggested changing the user to root for development because you were getting the Permission denied on the auth.toml file. I see that the traceback disappeared completely 🙂 You can add it (it’s helpful), but as part of this thread rather than the main channel.
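The register/run cycle described above is essentially a serialization round trip (Prefect uses cloudpickle for this). The sketch below uses the stdlib pickle module and a hypothetical minimal Flow stand-in, just to show the idea:

```python
import pickle

class Flow:
    """Hypothetical stand-in for a Prefect flow object."""
    def __init__(self, name, tasks):
        self.name = name
        self.tasks = tasks

# Registration time: the flow object is serialized and baked into the Docker image.
flow = Flow("poc-flow", ["extract", "transform", "load"])
blob = pickle.dumps(flow)

# Run time: the container deserializes the flow object and submits its
# tasks to the configured executor (e.g. a Dask cluster).
restored = pickle.loads(blob)
print(restored.name, restored.tasks)
```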
Unexpected error: PermissionError(13, 'Permission denied')
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/site-packages/prefect/engine/runner.py", line 48, in inner
    new_state = method(self, state, *args, **kwargs)
  File "/usr/local/lib/python3.7/site-packages/prefect/engine/flow_runner.py", line 543, in get_flow_run_state
    {e: state for e, state in upstream_states.items()}
  File "/usr/local/lib/python3.7/site-packages/prefect/executors/dask.py", line 424, in wait
    return self.client.gather(futures)
  File "/usr/local/lib/python3.7/site-packages/distributed/client.py", line 1949, in gather
    asynchronous=asynchronous,
  File "/usr/local/lib/python3.7/site-packages/distributed/client.py", line 841, in sync
    self.loop, func, *args, callback_timeout=callback_timeout, **kwargs
  File "/usr/local/lib/python3.7/site-packages/distributed/utils.py", line 326, in sync
    raise exc.with_traceback(tb)
  File "/usr/local/lib/python3.7/site-packages/distributed/utils.py", line 309, in f
    result[0] = yield future
  File "/usr/local/lib/python3.7/site-packages/tornado/gen.py", line 762, in run
    value = future.result()
  File "/usr/local/lib/python3.7/site-packages/distributed/client.py", line 1808, in _gather
    raise exception.with_traceback(traceback)
  File "/home/amine/miniconda3/envs/prefect-test/lib/python3.7/site-packages/prefect/executors/dask.py", line 60, in _maybe_run
  File "/home/amine/miniconda3/envs/prefect-test/lib/python3.7/site-packages/prefect/engine/flow_runner.py", line 775, in run_task
  File "/home/amine/miniconda3/envs/prefect-test/lib/python3.7/site-packages/prefect/engine/cloud/task_runner.py", line 44, in __init__
  File "/home/amine/miniconda3/envs/prefect-test/lib/python3.7/site-packages/prefect/client/client.py", line 135, in __init__
  File "/home/amine/miniconda3/envs/prefect-test/lib/python3.7/site-packages/prefect/client/client.py", line 266, in load_auth_from_disk
  File "/home/amine/miniconda3/envs/prefect-test/lib/python3.7/pathlib.py", line 1361, in exists
  File "/home/amine/miniconda3/envs/prefect-test/lib/python3.7/pathlib.py", line 1183, in stat
PermissionError: [Errno 13] Permission denied: '/root/.prefect/auth.toml'
Thanks for the reply. I'm still getting the permission denied. What I don't quite understand is the path of the Prefect config, /root/.prefect/, which I think is inside the Docker image? The base_image I created is very simple:
FROM prefecthq/prefect:latest
USER root
What I don't get is how the Dask scheduler can be called from within the Docker container (created by the Prefect agent when running the flow)?
On my local Dask worker I'm getting this error:
distributed.worker - WARNING - Compute Failed
Function:  _maybe_run
args:      ('prefect-6004fbcd4dac46eebdfe60061e41be04', <function run_task at 0x7f4518b88170>)
kwargs:    {'task': <Task: Constant[range]>, 'state': <Pending: "Task run created">, 'upstream_states': {}, 'context': {'image': 'dask-example:2021-10-21t12-04-03-051893-00-00', 'flow_run_id': '8472a134-d284-463d-86fb-9fef225f0d6c', 'flow_id': 'eef9aecf-0bb5-4852-a148-748883568a44', 'config': <Box: {'debug': False, 'home_dir': '/root/.prefect', 'backend': 'server', 'server': {'host': '<http://localhost>', 'port': 4200, 'host_port': 4200, 'host_ip': '127.0.0.1', 'endpoint': '<http://localhost:4200>', 'database': {'host': 'localhost', 'port': 5432, 'host_port': 5432, 'name': 'prefect_server', 'username': 'prefect', 'password': 'test-password', 'connection_url': '<postgresql://prefect>:test-password@localhost:5432/prefect_server', 'volume_path': '/root/.prefect/pg_data'}, 'graphql': {'host': '0.0.0.0', 'port': 4201, 'host_port': 4201, 'debug': False, 'path': '/graphql/'}, 'hasura': {'host': 'localhost', 'port': 3000, 'host_port': 3000, 'admin_secret': '', 'claims_namespace': 'hasura-claims', 'graphql_url'
Exception: PermissionError(13, 'Permission denied')
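One common reason a flow container cannot reach a scheduler or API that is bound to localhost on the host is Docker's default bridge network. A hedged sketch for local testing only (verify the flag against your Prefect version's prefect agent docker start --help):

```
# Attach flow-run containers to the host network so that "localhost"
# inside the container refers to the host machine (Linux only).
prefect agent docker start --network host
```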
Ok, I see. So you use a Docker agent, but your flow has no dependencies. Instead of
base_image="test:latest"
you could probably use prefecthq/prefect:latest directly. I think there is some issue in how Server, your Docker agent, and Dask communicate. From the Dask logs you shared, it looks like Dask can’t reach the GraphQL API. Only two things worth trying come to mind at the moment:
1. Try the same using a local agent and Local storage instead of the Docker agent and Docker storage, to see whether the error comes from permissions between those different components.
2. Try the same with Prefect Cloud: since the API is accessible from the internet (provided you authenticate with an API key), this would be easier to set up and debug. I only recommend #2 because we previously discussed that you picked Server only because you weren’t aware of how secure Cloud is with the hybrid execution model.
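To narrow down where the communication breaks, a plain TCP reachability check can be run from inside the flow container (stdlib sketch; the hosts and ports below are placeholders for your Dask scheduler and GraphQL endpoint):

```python
import socket

def can_reach(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Placeholders: 8786 is Dask's default scheduler port, 4200 the Server API port.
print(can_reach("localhost", 8786), can_reach("localhost", 4200))
```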
Thanks a lot for the response, I'll try the first solution. Unfortunately, using Prefect Cloud is out of scope for the team due to very restrictive rules, even though I explained how flow storage works.
btw just sharing in case this might be useful for you: https://docs.prefect.io/core/advanced_tutorials/dask-cluster.html
@Amine Dirhoussi it’s not just flow storage: both your code and data live entirely on your infrastructure, and Prefect never receives your data, only the metadata needed to orchestrate your flows. If you ever want to give it a try, you can contact sales@prefect.io; they can explain it better than me, including responding to specific security concerns.
I'll push in this direction! They actually tried the first solution with local storage and everything worked accordingly. The issue arises when using Docker storage 😕
That’s what I thought. The reason is that several containers need to communicate with each other, with the API, and with Dask.
Ok, so for each worker, a container is spawned from the image and executed on the worker?
Yes. And in a distributed setting, the challenge is to install all the dependencies needed by your flow on all workers. That’s why having all package dependencies baked into the image is beneficial.
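For example, a Dockerfile along these lines (file names are illustrative) bakes the flow's Python dependencies into the image, so every container started from it, including on Dask workers, has them available:

```
FROM prefecthq/prefect:latest
# Install the flow's dependencies so they exist wherever the image runs
COPY requirements.txt .
RUN pip install -r requirements.txt
```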