Hello, someone is encountering problems with cloud...
# ask-community
r
Hello, is anyone else encountering problems with the Cloud Run v2 work pool? I think some updates were made and something is now broken. A few weeks ago my scheduled deployment suddenly failed due to an "architecture error in docker image", and a few days later everything worked again without any action on my part. Right now I'm hitting the exception below, but only sometimes. Prefect team, are you aware of this?
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/site-packages/distributed/client.py", line 511, in __del__
  File "/usr/local/lib/python3.10/site-packages/distributed/utils.py", line 1836, in is_python_shutting_down
ImportError: sys.meta_path is None, Python is likely shutting down
c
Hi @Raffaele Scarano! Given the traceback, this appears to be an issue with your environment, specifically dask-distributed; for example, I found this GitHub issue referencing the same error message. The architecture error is a separate thing: Google Cloud Run will fail to run images built with `linux/arm64` architecture; you should build your Docker images using `linux/amd64` architecture (if you're using Prefect to build your images, you can specify this via `platform="linux/amd64"`).
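For anyone building the image outside Prefect, the same pin can be applied with Docker's own `--platform` build flag. Below is a minimal sketch that only assembles the command line. The helper name `docker_build_command` and the image tag are hypothetical and not from this thread, and the command still has to be run on a machine with Docker installed.

```python
import shlex


def docker_build_command(
    image: str, context: str = ".", platform: str = "linux/amd64"
) -> list[str]:
    """Return the argv for a `docker build` pinned to one target platform.

    Hypothetical helper for illustration; it does not invoke Docker itself.
    """
    return ["docker", "build", "--platform", platform, "--tag", image, context]


# Placeholder image name, not taken from the thread.
cmd = docker_build_command("my-registry/my-flow:latest")
print(shlex.join(cmd))
```

Passing the resulting list to `subprocess.run(cmd, check=True)` would perform the build. This is mostly relevant when building on an ARM machine such as an Apple Silicon laptop, where the default target is typically `linux/arm64`.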
r
@Chris White thank you for your attention to this. What drives me crazy is that the exact same deployment sometimes works and sometimes doesn't, and I don't get why. As you can see in the attached screenshot, the same flow sometimes completes successfully and sometimes crashes before starting.
c
Hm I do see that - we haven't made any updates to our Cloud Run work pool configuration in that time; my guess is that it's still a dask distributed bug, but it's a race condition so it's non-deterministic. It could be worth raising on their repo as an issue. How are you creating your dask client and submitting to it?
r
@Chris White I actually use the DaskTaskRunner and let it do all the magic. I noticed that it is always the same cities that cause the Cloud Run job to crash at startup, but only if they are launched in parallel (one job per city); if I launch one job for the 4 cities together, it runs successfully. Even stranger :)
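The workaround described here, one job covering several cities instead of one job per city, could be sketched roughly as below. This is only an illustration of the batching idea: the function `batch_parameters`, the `cities` parameter name, and the city names are all hypothetical, and how the resulting parameter sets are passed to a deployment depends on the actual flow.

```python
from typing import Iterator


def batch_parameters(cities: list[str], batch_size: int) -> Iterator[dict]:
    """Yield one parameter dict per job, each covering up to `batch_size` cities.

    Hypothetical helper: the "cities" key is an assumed flow parameter name.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(cities), batch_size):
        yield {"cities": cities[start:start + batch_size]}


# Placeholder city names, not taken from the thread.
cities = ["city_a", "city_b", "city_c", "city_d"]

# One job per city (the setup that crashed intermittently): four parameter sets.
per_city = list(batch_parameters(cities, batch_size=1))

# One job for all four cities (the setup that ran successfully): one parameter set.
combined = list(batch_parameters(cities, batch_size=4))

print(len(per_city), len(combined))
```

Batching reduces the number of Cloud Run jobs that start at the same moment, which fits the observation that the failures only appear with parallel launches. It does not explain or fix the underlying error.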