José Duarte

09/20/2022, 12:10 PM
Hey all, am currently running Prefect 2.3.2 on a minikube cluster and while the agent connects to Orion just fine, it does not seem to be reading jobs from the queue. I have a pending job and the agent just doesn’t pick it up. Any ideas?
More information: I am using an anonymous deployment which I inspected using the CLI and got the following:
Traceback (most recent call last):
  File "/Users/duarte/Documents/data/pdbt/.venv/lib/python3.8/site-packages/prefect/cli/_utilities.py", line 41, in wrapper
    return fn(*args, **kwargs)
  File "/Users/duarte/Documents/data/pdbt/.venv/lib/python3.8/site-packages/prefect/utilities/asyncutils.py", line 212, in wrapper
    return run_async_in_new_loop(async_fn, *args, **kwargs)
  File "/Users/duarte/Documents/data/pdbt/.venv/lib/python3.8/site-packages/prefect/utilities/asyncutils.py", line 141, in run_async_in_new_loop
    return anyio.run(partial(__fn, *args, **kwargs))
  File "/Users/duarte/Documents/data/pdbt/.venv/lib/python3.8/site-packages/anyio/_core/_eventloop.py", line 70, in run
    return asynclib.run(func, *args, **backend_options)
  File "/Users/duarte/Documents/data/pdbt/.venv/lib/python3.8/site-packages/anyio/_backends/_asyncio.py", line 292, in run
    return native_run(wrapper(), debug=debug)
  File "/Users/duarte/.pyenv/versions/3.8.9/lib/python3.8/asyncio/runners.py", line 44, in run
    return loop.run_until_complete(main)
  File "/Users/duarte/.pyenv/versions/3.8.9/lib/python3.8/asyncio/base_events.py", line 616, in run_until_complete
    return future.result()
  File "/Users/duarte/Documents/data/pdbt/.venv/lib/python3.8/site-packages/anyio/_backends/_asyncio.py", line 287, in wrapper
    return await func(*args)
  File "/Users/duarte/Documents/data/pdbt/.venv/lib/python3.8/site-packages/prefect/cli/deployment.py", line 152, in inspect
    deployment_json["infrastructure"] = Block._from_block_document(
  File "/Users/duarte/Documents/data/pdbt/.venv/lib/python3.8/site-packages/prefect/blocks/core.py", line 514, in _from_block_document
    block = block_cls.parse_obj(block_document.data)
  File "pydantic/main.py", line 526, in pydantic.main.BaseModel.parse_obj
  File "/Users/duarte/Documents/data/pdbt/.venv/lib/python3.8/site-packages/prefect/blocks/core.py", line 171, in __init__
    super().__init__(*args, **kwargs)
  File "pydantic/main.py", line 342, in pydantic.main.BaseModel.__init__
pydantic.error_wrappers.ValidationError: 1 validation error for KubernetesJob
job
  value is not a valid dict (type=type_error.dict)
An exception occurred.
I am not sure that this is the root cause, but I wish Prefect was noisy instead of failing silently.

Christopher Boyd

09/20/2022, 1:31 PM
Hi José, so you are running minikube locally, with an agent pointing to minikube. Is the agent in the cluster, or local? Can you interact with the Orion instance in any other way (e.g. creating work queues or blocks and seeing that they exist)?

José Duarte

09/20/2022, 1:33 PM
Hey @Christopher Boyd, everything is running in the minikube cluster. I am able to interact with Orion because my agent’s init container registers all flows in the Python module as deployments, and they’re all there.

Christopher Boyd

09/20/2022, 1:41 PM
I think the two main requirements for the agent to pick up the flow, then, are ensuring that the kubeconfig the agent uses to talk to the cluster is accurate, and ensuring that tags / work-queue are set appropriately. Beyond that, I would need to see how the deployment is registered, how the flow run is created / executed, and logs from the agent to see what it’s thinking.

José Duarte

09/20/2022, 1:44 PM
• The work-queue I am using is the default one.
• The agent has no logs though, is there a verbosity config?
• The deployment may be wrong but Prefect only showed me the error above when I inspected it.
• The flow run is created by clicking on Run in Orion’s Deployment page.

Christopher Boyd

09/20/2022, 2:28 PM
The agent should definitely have logs. By default it reports which work queue it’s listening to, which workspace it’s connected to, and updates as it looks for flow runs.

José Duarte

09/20/2022, 2:30 PM
It doesn’t do any kind of updates. It just says it’s “listening” to the default queue

Christopher Boyd

09/20/2022, 2:35 PM
Did you follow any sort of guide to set this up?

José Duarte

09/20/2022, 2:37 PM
More or less. I started by doing everything locally, then Docker Compose to get the containers right, and now minikube before moving this to Azure.
I’ve gathered the information from several resources, as there is no proper guide on “self-hosting” Prefect.

Christopher Boyd

09/20/2022, 2:41 PM
There are some resources available for this. Options 1 and 2 are for self-hosted, option 3 is for the cloud: https://discourse.prefect.io/t/deploying-prefect-agents-on-kubernetes/1298
Additionally, this one is set up for Azure, but steps 8 and on should apply across all other environments: https://discourse.prefect.io/t/how-to-deploy-a-prefect-2-0-agent-to-an-azure-kubernetes-cluster-and-connect-to-azure-blob-storage/1128

José Duarte

09/20/2022, 2:50 PM
I’ve done all those steps, though my deployment run just doesn’t seem to hit the queue
@Christopher Boyd after more investigation, I suspect that this had to do with the fact that my `job` field in the `KubernetesJob` was not a dict. Prefect is only validating the blocks after the fact, not during construction or registering… Either way I’ve hit another bug and I am currently reporting it
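
The kind of up-front check described in the last message can be sketched as follows. This is a minimal illustration in plain Python, not Prefect's own validation: the helper name `check_job_manifest`, the list of required keys, and the example manifest are all assumptions made for the sketch. The idea is only to fail loudly before registering a deployment if the value meant for the `job` field is not a dict.

```python
from collections.abc import Mapping
from typing import Any


def check_job_manifest(job: Any) -> dict:
    """Hypothetical helper: raise early if a Kubernetes Job manifest is not a dict."""
    if not isinstance(job, Mapping):
        # This is the situation reported by the ValidationError in the thread:
        # the value for `job` was not a dict.
        raise TypeError(f"job manifest must be a dict, got {type(job).__name__}")
    # Assumed minimal set of top-level keys for a Job manifest (illustrative only).
    missing = [key for key in ("apiVersion", "kind", "spec") if key not in job]
    if missing:
        raise ValueError(f"job manifest is missing keys: {missing}")
    return dict(job)


# Example manifest with illustrative values only.
manifest = {
    "apiVersion": "batch/v1",
    "kind": "Job",
    "metadata": {"labels": {}},
    "spec": {"template": {"spec": {"containers": [{"name": "prefect-job"}]}}},
}

checked = check_job_manifest(manifest)
# The checked dict is what would then be handed to the block's `job` field.
```

Passing a YAML string or a file path instead of the parsed manifest would raise a `TypeError` at this point, rather than surfacing later when the deployment is inspected.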