I am running a prefect agent with Orion right now with a dep Prefect Community #ask-community

Nelson Griffiths

03/23/2022, 1:20 PM

I am running a prefect agent with Orion right now with a deployed flow. The agent runs flows just fine if I start it and go hit quick run in the UI. But if I leave the agent sitting for too long I start getting this 403 error:

Traceback (most recent call last):
  File "/home/nelson/miniconda3/envs/my_project/lib/python3.9/site-packages/prefect/cli/base.py", line 59, in wrapper
    return fn(*args, **kwargs)
  File "/home/nelson/miniconda3/envs/my_project/lib/python3.9/site-packages/prefect/utilities/asyncio.py", line 120, in wrapper
    return run_async_in_new_loop(async_fn, *args, **kwargs)
  File "/home/nelson/miniconda3/envs/my_project/lib/python3.9/site-packages/prefect/utilities/asyncio.py", line 67, in run_async_in_new_loop
    return anyio.run(partial(__fn, *args, **kwargs))
  File "/home/nelson/miniconda3/envs/my_project/lib/python3.9/site-packages/anyio/_core/_eventloop.py", line 56, in run
    return asynclib.run(func, *args, **backend_options)
  File "/home/nelson/miniconda3/envs/my_project/lib/python3.9/site-packages/anyio/_backends/_asyncio.py", line 233, in run
    return native_run(wrapper(), debug=debug)
  File "/home/nelson/miniconda3/envs/my_project/lib/python3.9/asyncio/runners.py", line 44, in run
    return loop.run_until_complete(main)
  File "/home/nelson/miniconda3/envs/my_project/lib/python3.9/asyncio/base_events.py", line 642, in run_until_complete
    return future.result()
  File "/home/nelson/miniconda3/envs/my_project/lib/python3.9/site-packages/anyio/_backends/_asyncio.py", line 228, in wrapper
    return await func(*args)
  File "/home/nelson/miniconda3/envs/my_project/lib/python3.9/site-packages/prefect/cli/agent.py", line 71, in start
    await agent.get_and_submit_flow_runs()
  File "/home/nelson/miniconda3/envs/my_project/lib/python3.9/site-packages/prefect/agent.py", line 88, in get_and_submit_flow_runs
    submittable_runs = await self.client.get_runs_in_work_queue(
  File "/home/nelson/miniconda3/envs/my_project/lib/python3.9/site-packages/prefect/client.py", line 747, in get_runs_in_work_queue
    response = await self._client.post(
  File "/home/nelson/miniconda3/envs/my_project/lib/python3.9/site-packages/prefect/utilities/httpx.py", line 137, in post
    return await self.request(
  File "/home/nelson/miniconda3/envs/my_project/lib/python3.9/site-packages/prefect/utilities/httpx.py", line 80, in request
    response.raise_for_status()
  File "/home/nelson/miniconda3/envs/my_project/lib/python3.9/site-packages/httpx/_models.py", line 1510, in raise_for_status
    raise HTTPStatusError(message, request=request, response=self)
httpx.HTTPStatusError: Client error '403 Forbidden' for url 'https://api-beta.prefect.io/api/accounts/df4b7089-cc2a-48ae-b4ce-baea44b163d6/workspaces/b22af91f-f810-4bc3-ac90-a1fa0e042c55/work_queues/c91e1439-be7e-4a98-8df0-da39515197b2/get_runs'
For more information check: https://httpstatuses.com/403
An exception occurred.

Any ideas what might be causing this?

Kevin Kho

03/23/2022, 1:22 PM

Hey @Nelson Griffiths, will check with the team about this

Anna Geller

03/23/2022, 1:37 PM

can you share your deployment spec, @Nelson Griffiths?

Anna Geller

03/23/2022, 1:39 PM

If you are running your flow in a container using the Docker or Kubernetes flow runner, you may need to attach the API_KEY and API_URL env variables, e.g.:

Docker:

DeploymentSpec(
    name="example",
    flow=docker_flow,
    tags=["local"],
    flow_runner=DockerFlowRunner(
        image="prefecthq/prefect:2.0ba2-python3.9",
        env={
            "EXTRA_PIP_PACKAGES": "pandas",
            "PREFECT_API_KEY": "xxx",
        },
        volumes=["/Users/anna/.aws:/root/.aws"],
    ),
)

Kubernetes:

DeploymentSpec(
    name="prod",
    flow=kubernetes_flow,
    tags=["local"],
    flow_runner=KubernetesFlowRunner(
        env=dict(
            PREFECT_API_URL="https://api-beta.prefect.io/api/accounts/yyy/workspaces/xxx",
            PREFECT_API_KEY="YOUR_API_KEY",
        ),
    ),
)

Anna Geller

03/23/2022, 1:41 PM

you can also attach the same on UniversalFlowRunner:

DeploymentSpec(
    name="cloud",
    flow=universal_flow,
    tags=["local"],
    flow_runner=UniversalFlowRunner(
        env=dict(
            PREFECT_API_URL="https://api-beta.prefect.io/api/accounts/yyy/workspaces/xxx",
            PREFECT_API_KEY="YOUR_API_KEY",
        ),
    ),
)

Kevin Kho

03/23/2022, 2:04 PM

How long does it take for this to happen and is it consistent?

Nelson Griffiths

03/23/2022, 2:21 PM

It has happened 3/3 times now. This last one took about 35 min before it died. It ran a few scheduled flows in that time

Nelson Griffiths

03/23/2022, 2:22 PM

@Anna Geller here is my DeploymentSpec. Just running locally:

DeploymentSpec(flow=ingest_tweets,
               name="udot-data-collection",
               parameters={"username": "UDOTTRAFFIC", "lookback_days": 1},
               tags=["db", "local"],
               schedule=IntervalSchedule(interval=timedelta(minutes=5)))

Nelson Griffiths

03/23/2022, 2:23 PM

The strangest part is that it goes and gets and runs flows for 30 minutes successfully before throwing the error

Anna Geller

03/23/2022, 2:35 PM

So you're running everything locally, both Orion and your agent? And since you don't assign any FlowRunner, you use the default SubprocessFlowRunner.

Nelson Griffiths

03/23/2022, 2:36 PM

Sorry I am using prefect cloud and running my agent locally.

👍 1

Anna Geller

03/23/2022, 2:41 PM

Do you mind trying to attach the universal flow runner with the API key to your DeploymentSpec and let us know if this helps?

flow_runner=UniversalFlowRunner(
    env=dict(
        PREFECT_API_URL="https://api-beta.prefect.io/api/accounts/yyy/workspaces/xxx",
        PREFECT_API_KEY="YOUR_API_KEY",
    ),
),

I saw a similar error with the Docker flow runner and I assumed that this container (sub)process didn't get the API key... I'll ask the team about this but this is worth trying
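Editor's sketch: the snippets above hard-code the API key in the deployment file. A minimal alternative, assuming the same `PREFECT_API_URL` and `PREFECT_API_KEY` values are already exported in the shell that creates the deployment, is to read them from the environment. The helper name `prefect_api_env` is hypothetical and not part of Prefect.

```python
import os


def prefect_api_env(environ=None):
    """Collect the Prefect API settings to pass as a flow runner's ``env``.

    ``environ`` defaults to ``os.environ``; a plain dict can be passed instead.
    Raises KeyError naming any variable that is missing or empty, so a
    misconfigured shell is noticed when the deployment file is loaded.
    """
    source = os.environ if environ is None else environ
    names = ("PREFECT_API_URL", "PREFECT_API_KEY")
    missing = [name for name in names if not source.get(name)]
    if missing:
        raise KeyError("missing environment variables: " + ", ".join(missing))
    # Return only the two API settings, not the whole environment.
    return {name: source[name] for name in names}
```

The result could then be passed as `env=prefect_api_env()` to the flow runner in place of the literal `dict(...)` shown above, which keeps the key out of the repository.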

Anna Geller

03/23/2022, 2:43 PM

The 403 Forbidden error indicates an API key issue.

Nelson Griffiths

03/23/2022, 2:54 PM

I will give this a shot in a little bit and let you know if it fixes the issue

👍 1

Nelson Griffiths

03/23/2022, 4:03 PM

It has now been alive for about an hour. So this seems to have fixed the problem.

Nelson Griffiths

03/23/2022, 4:04 PM

I am guessing that the default behavior with a SubprocessFlowRunner will be looked into and fixed at some point though?

Kevin Kho

03/23/2022, 4:05 PM

If we can replicate it, yes, for sure. That is not intended. We will be trying to.

Nelson Griffiths

03/23/2022, 4:06 PM

Let me know if there is anything you need from me!

👍 1

Nelson Griffiths

03/24/2022, 1:24 AM

@Kevin Kho @Anna Geller This ran for a while but then turned into a 502 bad gateway error. Any further ideas?

Kevin Kho

03/24/2022, 1:26 AM

Not at the moment, will check in with the team tomorrow and discuss this

Nelson Griffiths

03/24/2022, 4:41 PM

As a further update my agent is bouncing between 403 and 502 errors now when I start it

Kevin Kho

03/24/2022, 4:44 PM

Oh man will look into this today

Nelson Griffiths

03/24/2022, 4:46 PM

I appreciate it! Hopefully I just did something dumb in my setup. 🤷🏼‍♂️

Kevin Kho

03/25/2022, 3:26 AM

So I have an agent running against Cloud, and am not finding any weirdness. You said it works for 30 mins. I have a set-up going right now and I’ll try to leave it overnight. I have a 10-minute schedule.

Kevin Kho

03/25/2022, 1:38 PM

My agent seems to be fine. Do you have any advice on how I can replicate this? Do you have more than one work queue or agent?

Nelson Griffiths

04/02/2022, 1:09 PM

Sorry, I was traveling for a bit. Is there any way I can get more detailed logs as to what is happening? I'm not sure how to tell you to replicate it. I'm happy to share my whole repository if that is helpful. It's just a small side project I'm working on.

Anna Geller

04/02/2022, 8:28 PM

Sharing your repo will be helpful, for sure! Also, can you perhaps recreate your workspace, work queue and agent from scratch? within 10 days some things could have changed 🙂 https://orion-docs.prefect.io/ui/cloud/

Nelson Griffiths

04/09/2022, 7:58 PM

I recreated everything and am still running into the same things. I will share my repo shortly, but I also have 2 other questions: 1. Is there some way to get better logs from the agent to understand why this is happening? 2. Is there a way I can ping the agent from another process to see if it is running and turn it back on programmatically as a workaround for now?

Anna Geller

04/09/2022, 11:23 PM

#1 Yup, you can set the log level to debug this way:

prefect config set PREFECT_LOGGING_LEVEL='DEBUG'

#2 To check if the agent process is running, you can inspect your running processes on the instance:

ps -ef | grep "prefect agent start"

But the easiest way is to inspect the work queue the agent polls for:

prefect work-queue inspect 'acffbcc8-ae65-4c83-a38a-96e2e5e5b441'
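Editor's sketch for question #2: the process check above could be wrapped in a small watchdog that restarts the agent when it is gone. This is only an illustration under stated assumptions: it relies on the standard `pgrep -f` utility rather than any Prefect API, the function names `agent_is_running` and `ensure_agent` are hypothetical, and the start command is left to the caller.

```python
import subprocess


def agent_is_running(pattern="prefect agent start", run=subprocess.run):
    """Return True if some process command line matches ``pattern``.

    ``pgrep -f`` matches against full command lines and exits with
    status 0 only when at least one process matches.
    """
    result = run(["pgrep", "-f", pattern], capture_output=True, text=True)
    return result.returncode == 0


def ensure_agent(start_command, is_running=agent_is_running, popen=subprocess.Popen):
    """Start the agent with ``start_command`` unless one is already running.

    Returns the new process handle, or None if an agent was already up.
    ``is_running`` and ``popen`` are injectable so the logic can be tested
    without touching real processes.
    """
    if is_running():
        return None
    # Launch the agent in the background; the caller decides how to log it.
    return popen(start_command)
```

Calling `ensure_agent([...])` periodically, for instance from cron, with the same command used to start the agent by hand would bring it back after a crash. It does not address why the agent receives the 403 or 502 responses in the first place.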
