Is there a healthcheck route we can use for an `agent` `executor` docker service?

Prefect Community #ask-community

Leo Meyerovich (Graphistry)

04/11/2020, 12:03 AM

Is there a healthcheck route we can use for an `agent` `executor` docker service? I saw a feb 5 PR around some storage healthchecks, but not seeing docs for instrumented monitoring here. Ideally something curl-able, like,

healthcheck:
      test: ["CMD-SHELL", "curl -sSf http://prefect/health | jq .code | grep 200 || exit 1"]

👍 1
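For reference, the kind of curl-able endpoint being asked for can be sketched with the Python standard library alone. This is only an illustration of the idea, not something Prefect ships: the `/health` path and the `{"code": 200}` body are assumptions inferred from the `jq .code | grep 200` check above, and `HealthHandler` and `start_health_server` are hypothetical names.

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class HealthHandler(BaseHTTPRequestHandler):
    """Answers GET /health with a small JSON body; any other path is a 404."""

    def do_GET(self):
        if self.path == "/health":
            # Body shape is an assumption chosen to satisfy `jq .code | grep 200`.
            body = json.dumps({"code": 200}).encode("utf-8")
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_error(404)

    def log_message(self, format, *args):
        # Silence per-request logging so frequent polling does not flood stderr.
        pass


def start_health_server(host="127.0.0.1", port=0):
    """Serve health checks from a daemon thread; port=0 lets the OS pick a free port."""
    server = HTTPServer((host, port), HealthHandler)
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    return server
```

An endpoint like this only proves the process can still answer HTTP requests. Whether that means the agent is actually polling for work is a separate question that the handler would have to check itself.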

Chris White

04/11/2020, 12:24 AM

Hey Leo! Which agent are you running and where? The k8s agent has a health check route but none of the other ones do at the moment. Is the goal to trigger a restart in the event the health check fails?

Leo Meyerovich (Graphistry)

04/11/2020, 12:34 AM

we're trying to figure out secure (federated) & observable & auto-restarting executors. basic soln we came to is docker-compose executors put onto every server donated to us. the toy setup used a prefect container base, but we'll be switching to an nvidia rapids base image and pip/conda installing prefect on top (we're trying to reuse gpu/pydata/etc. deps across all envs, so won't use prefect base). we're using standard docker monitoring etc. stacks, so hoping for some in-process rest endpoint we can just poll on.

Leo Meyerovich (Graphistry)

04/11/2020, 12:36 AM

looking at the `Dockerfile`, I see:

RUN prefect backend server

Leo Meyerovich (Graphistry)

04/11/2020, 12:37 AM

(I also see `prefect agent start`, but the intent here is operating the executor, not the agent)

Leo Meyerovich (Graphistry)

04/11/2020, 12:38 AM

(my understanding is `executor` = task runner = `backend server`, while `agent` is a client interface for submitting a job... still learning the lingo!)

Chris White

04/11/2020, 12:56 AM

when you’re talking about “executor health” what are you referring to? The only executor type that is long-lasting is a Dask executor but you don’t appear to be doing anything with Dask

Leo Meyerovich (Graphistry)

04/11/2020, 1:03 AM

We have a growing number of compute servers. Each one runs a docker container with `prefect` installed, and afaict, `prefect` will poll the central server for new tasks. We can make sure the docker container itself stays running, but not sure how to tell if `prefect` gets wedged.

Chris White

04/11/2020, 1:04 AM

It sounds like you’re running multiple Prefect agents on various servers; what type of agents are you running? Local Agents?

Leo Meyerovich (Graphistry)

04/11/2020, 1:06 AM

Not sure. The prototype has `RUN prefect backend server`. Ultimately these will be rapids.ai tasks + neural network stuff. We're not using prefect for its dask capabilities, just task dispatch & reporting.

Leo Meyerovich (Graphistry)

04/11/2020, 1:07 AM

Happy to follow recommendations!

Leo Meyerovich (Graphistry)

04/11/2020, 1:07 AM

https://github.com/TheDataRideAlongs/ProjectDomino/pull/60

Chris White

04/11/2020, 1:07 AM

`prefect backend server` doesn’t really do anything, it just updates your local user configuration to point to `localhost:4200` for the API instead of Cloud

Leo Meyerovich (Graphistry)

04/11/2020, 1:08 AM

Ah I see our entrypoint.sh also has:

# Keep the container running
prefect agent start

Chris White

04/11/2020, 1:08 AM

got it, yea that’s a Local Agent then 👍

Leo Meyerovich (Graphistry)

04/11/2020, 1:08 AM

But we can switch to whatever other agent, we're trying to figure out the right way to do it

Leo Meyerovich (Graphistry)

04/11/2020, 1:09 AM

(we already have the central UI server running elsewhere behind a bastion, and using VPC rules to allow direct central UI <> executor flows)

Chris White

04/11/2020, 1:10 AM

yea, local agent is perfectly fine; the local agent will submit flows to run in subprocesses. Generally we recommend using `supervisord` to manage the parent process running the agent
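To make the supervision idea concrete, here is a minimal Python sketch of what a process supervisor does at its core: run a command and restart it when it exits with an error. It is not a substitute for `supervisord`, which adds logging, process groups and much more. The `supervise` function and its parameters are hypothetical names used only for illustration.

```python
import subprocess
import sys
import time


def supervise(command, max_restarts=3, backoff_seconds=1.0):
    """Run `command`, restarting it after each non-zero exit.

    Returns the list of exit codes observed, one per run. Stops early
    on a clean exit (code 0) or once `max_restarts` restarts are used up.
    """
    exit_codes = []
    for attempt in range(max_restarts + 1):
        code = subprocess.call(command)  # blocks until the child exits
        exit_codes.append(code)
        if code == 0:
            break
        if attempt < max_restarts:
            time.sleep(backoff_seconds)  # brief pause before restarting
    return exit_codes


if __name__ == "__main__":
    # The command from the entrypoint discussed in this thread.
    print(supervise(["prefect", "agent", "start"]))
```

Restart-on-exit only catches a process that dies. A process that is alive but stuck never exits, so it needs a separate liveness signal.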

Leo Meyerovich (Graphistry)

04/11/2020, 1:11 AM

https://docs.prefect.io/orchestration/tutorial/multiple.html#install-a-supervised-agent

Leo Meyerovich (Graphistry)

04/11/2020, 1:12 AM

So the dockerfile should install supervisor, run the parent agent via that. Except does the parent agent have a healthcheck in case it gets wedged?

Chris White

04/11/2020, 1:12 AM

yup exactly

Chris White

04/11/2020, 1:12 AM

unfortunately not natively; supervisord might though

Leo Meyerovich (Graphistry)

04/11/2020, 1:13 AM

Yeah supervisord has a healthcheck framework, but it still comes down to the prefect agent process having some way of doing a "oh hi, yep just checked, I am indeed OK for what I consider OK to mean"
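One common way for a long-running process to say "I am OK" without an HTTP route is a heartbeat file: the process touches a file each time its main loop completes, and an external check treats a stale timestamp as wedged. The sketch below assumes such a hook could be wired into the polling loop. Nothing in this thread indicates the Prefect agent does this natively, and `record_heartbeat`, `is_healthy` and the 60-second threshold are hypothetical.

```python
import os
import time


def record_heartbeat(path):
    """Called by the long-running process each time its loop completes."""
    with open(path, "a"):
        os.utime(path, None)  # set the modification time to now


def is_healthy(path, max_age_seconds=60.0, now=None):
    """True if the heartbeat file exists and was touched recently enough."""
    try:
        last_beat = os.path.getmtime(path)
    except OSError:
        return False  # no heartbeat has ever been written
    current = time.time() if now is None else now
    return (current - last_beat) <= max_age_seconds
```

A container healthcheck or a supervisor event listener could then call `is_healthy` and exit non-zero when it returns `False`.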

Chris White

04/11/2020, 1:13 AM

we could definitely look into that as an enhancement though!

Chris White

04/11/2020, 1:14 AM

yea we could definitely add that, care to open an issue on GitHub?

Leo Meyerovich (Graphistry)

04/11/2020, 1:14 AM

Yep!

Chris White

04/11/2020, 1:14 AM

awesome thank you Leo!

Leo Meyerovich (Graphistry)

04/11/2020, 1:15 AM

`local agent`, right?

Chris White

04/11/2020, 1:16 AM

yup yup

Leo Meyerovich (Graphistry)

04/11/2020, 1:27 AM

https://github.com/PrefectHQ/prefect/issues/2313 There are fancier forms of this, so just wrote the bog simple & standard one :)

💯 1
