# prefect-cloud
c
Hello everyone, our team has just run into an issue where all of our clusters (dev and production) are no longer found by Prefect Cloud. It seems that the workers cannot start. This happened to all the environments without any changes in deployment at about 9am PST today. Does anyone have any thoughts about why this may have happened, considering we are pinned on an older version of Prefect (`2.16.4`)?
a
Is prefect_kubernetes pinned too?
c
no we have not pinned the version of that package
a
So my guess is that since it’s not pinned, it’s pulling the most up to date version of prefect-kubernetes which may be referencing a utility in a version of prefect you don’t have. Not entirely sure (on vacation at the moment but had a second to spare). @Nate is this your spidey sense too?
n
yes i think you're right @Alexander Azzam - 2.16.4 doesn't have `prefect.utilities.timeout`, but the newest kubernetes worker uses that module
c
We are actually using the provided image `prefecthq/prefect:2.16.4-python3.10`, so maybe the version is not pinned in that image? Either way, I just upgraded to the newest version and it seems to work alright now
n
> We are actually using the provided image `prefecthq/prefect:2.16.4-python3.10`, so maybe the version is not pinned in that image?

prefect 2.16.4 just doesn't have that module, so when installing `prefect-kubernetes>=0.4.0` on top (which uses that module) we get that import error. but yeah, upgrading prefect or downgrading prefect-kubernetes would resolve it - glad you got it figured out!
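The version mismatch described above can be sketched as a small compatibility check. This is only an illustration: the `compatible` helper is mine, and the assumption that `prefect.utilities.timeout` ships in the release right after 2.16.4 is inferred from this thread, not confirmed against the changelog.

```python
def parse(version: str) -> tuple:
    """Parse a simple 'X.Y.Z' version string into a comparable tuple."""
    return tuple(int(part) for part in version.split("."))


def compatible(prefect_version: str, prefect_kubernetes_version: str) -> bool:
    """Rough rule from this thread: prefect-kubernetes >= 0.4.0 imports
    prefect.utilities.timeout, which prefect 2.16.4 does not ship
    (assumed present from 2.16.5 onward)."""
    if parse(prefect_kubernetes_version) >= (0, 4, 0):
        return parse(prefect_version) >= (2, 16, 5)
    return True


print(compatible("2.16.4", "0.3.11"))  # the older worker runs fine on 2.16.4
print(compatible("2.16.4", "0.4.0"))   # the broken combination from this thread
```

This is why either side of the pair can be moved: upgrade prefect, or pin `prefect-kubernetes<0.4.0`.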
c
Thanks everyone!
n
catjam
c
Actually, spoke too soon. The worker was reporting to Prefect Cloud, but on initialization of a Prefect run I get different errors:
```
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/site-packages/prefect/workers/base.py", line 908, in _submit_run_and_capture_errors
    result = await self.run(
  File "/usr/local/lib/python3.10/site-packages/prefect_kubernetes/worker.py", line 612, in run
    async with self._get_configured_kubernetes_client(configuration) as client:
  File "/usr/local/lib/python3.10/contextlib.py", line 199, in __aenter__
    return await anext(self.gen)
  File "/usr/local/lib/python3.10/site-packages/prefect_kubernetes/worker.py", line 743, in _get_configured_kubernetes_client
    await config.load_incluster_config()
TypeError: object NoneType can't be used in 'await' expression
```
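That `TypeError` is the generic failure mode of awaiting a call that returned `None`. A minimal standalone repro of the error class (the `load_config` stand-in here is hypothetical, not the actual kubernetes client code):

```python
import asyncio


def load_config():
    # Synchronous stand-in for the client call the worker awaited;
    # it returns None, so the result is not awaitable.
    return None


async def main():
    try:
        await load_config()
    except TypeError as exc:
        # Same error class as in the worker traceback above
        print(exc)


asyncio.run(main())
```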
n
hi @Curtis White - sorry about that. that's a bug that went out; we've fixed it in the 2.x lineage but still need to release it (will be doing this asap). in the meantime, 0.3.11 should not have this bug
c
so do we need to create our own image then for our worker?
We have something like this right now
```yaml
spec:
  serviceAccountName: scheduler
  containers:
    - name: worker
      image: prefecthq/prefect:2.19.9-python3.11
      command:
        [
          "prefect",
          "worker",
          "start",
          "--pool",
          "${ENVIRONMENT_NAME_H}",
          "--type",
          "kubernetes",
          "--install-policy",
          "always",
        ]
      imagePullPolicy: "Always"
      securityContext:
        allowPrivilegeEscalation: false
```
n
there's a kubernetes flavored image you should be able to use, one sec, let me grab that
```
» docker run -it --rm prefecthq/prefect:2.19.9-python3.11-kubernetes bash

root@53647e34beb2:/opt/prefect# pip list | grep prefect
prefect                   2.19.9
prefect-kubernetes        0.3.11
```
c
For this image it seems to also download the newer version. This is with this config
n
hmm, it shouldn't - as I tried to show above, 0.3.11 is already installed on that image. if you have `EXTRA_PIP_PACKAGES` on the deployment / work pool, or are pip installing something in the `pull` section, then yeah, it would install prefect-kubernetes on top at runtime
yeah actually in the top of that screengrab you sent, you can see it says 0.3.11 is already in the site-packages. so i'd guess you have `EXTRA_PIP_PACKAGES` or it's the install-policy `always`
i might choose `if-not-present` instead of `always`
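The difference between the two policies can be sketched like this. This is an illustration of the behavior described in this thread, not Prefect's actual implementation; the policy value names come from the CLI flag discussed above.

```python
def should_install(policy: str, already_installed: bool) -> bool:
    """Whether a worker (re)installs prefect-kubernetes at startup."""
    if policy == "always":
        # reinstalls on every start, which can pull a newer release
        # (e.g. 0.4.0) on top of the version baked into the image
        return True
    if policy == "if-not-present":
        # only installs when the package is missing from the image
        return not already_installed
    raise ValueError(f"unknown policy: {policy}")


# With 0.3.11 already baked into the kubernetes-flavored image:
print(should_install("always", True))          # still installs, may upgrade
print(should_install("if-not-present", True))  # leaves the baked-in version alone
```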
c
We don’t have anything set up for `EXTRA_PIP_PACKAGES`, but I will try with this different policy
👍 1
same issue when changing this to `if-not-present`
n
as in, you see 0.4.0 being installed on top?
c
Yes, specifically it installs 0.4.0 and then uninstalls the 0.3.11 version
n
well, if you don't mind trying one more thing: you shouldn't need an install policy at all if we use the kubernetes-flavored image. can we try just removing that install policy flag entirely?
c
Ok, it's working now. It was because I changed the `imagePullPolicy: "Always"`, not the install policy in the worker start. I have a flow run executing at least
n
ah nice catch, didn't think of that