Setting up an EC2 agent service

Jarvis Stubblefield

12/07/2022, 6:01 PM
So I noticed this last night as I was getting everything set up, but it would seem that my Work Queue is unhealthy… I went back and ensured that my starting of the agent in production was using the correct queue. It appears to be spelled correctly and everything. However, the unhealthy bit has never gone away and it doesn’t have a last polled date.
I now have a flow that is 3 hours late in running because nothing picked it up. I’m working to resolve it. Currently, in production the Prefect API values are set through environment variables rather than configured via the CLI.

Kalise Richmond

12/07/2022, 6:03 PM
Do you have logs from the agent you can share? Also, have you set "PREFECT_LOGGING_LEVEL": "DEBUG" on the agent? That can help us identify more information.

Jarvis Stubblefield

12/07/2022, 6:03 PM
I did create a profile on the CLI (which just has a name and no values). Could it be looking at the profile and then not properly connecting since it doesn’t have the API Key / Workspace URL?
I’ll pop into the server and restart the agent…

Kalise Richmond

12/07/2022, 6:04 PM
It definitely could be that. The unhealthy status means that there is no connection between the agent and the work queue, which explains the late flows.

Jarvis Stubblefield

12/07/2022, 6:04 PM
So I can set the logs…
I made the profile so it would stop complaining that it didn’t exist…
lol
I can populate it with the appropriate values, but was just trying to let the ENV handle that…
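A quick check along these lines can confirm whether the variables are actually visible to the process that starts the agent. It is only a sketch: check_prefect_env is a made-up helper name, and it assumes the two settings that matter here are PREFECT_API_URL and PREFECT_API_KEY.

```shell
#!/bin/sh
# check_prefect_env is a hypothetical helper, not part of Prefect.
# It reports which Prefect Cloud connection settings are missing from
# the current environment (assumed: PREFECT_API_URL and PREFECT_API_KEY).
check_prefect_env() {
    missing=""
    for name in PREFECT_API_URL PREFECT_API_KEY; do
        # Indirect variable lookup that works in plain POSIX sh.
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            missing="$missing $name"
        fi
    done
    if [ -n "$missing" ]; then
        echo "missing:$missing"
        return 1
    fi
    echo "ok"
}

# Usage: run it in the same context that launches the agent, e.g.
#   check_prefect_env || echo "agent would start without Cloud credentials"
```

Running it from an interactive login and again from the service context shows whether the two environments differ.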

Kalise Richmond

12/07/2022, 6:07 PM
Oh I just found a new doc I've never read that might be helpful. Looks like there can be conflicts with the profile and environment variables. https://docs.prefect.io/concepts/settings/#conflicts-with-environment-variables
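If the installed Prefect version supports it, the CLI can report where each setting came from, which makes a profile/environment conflict easier to spot. A small sketch, assuming the --show-sources flag of prefect config view is available in this release; show_prefect_sources is a hypothetical wrapper name.

```shell
#!/bin/sh
# show_prefect_sources is a hypothetical wrapper, not a Prefect command.
# Assumption: `prefect config view --show-sources` exists in the installed
# release and prints one setting per line along with its source.
show_prefect_sources() {
    # Keep only the two Cloud connection settings.
    prefect config view --show-sources | grep -E 'PREFECT_API_(URL|KEY)'
}

# Usage (as the same user that runs the agent):
#   show_prefect_sources
```

If nothing is printed, neither setting is defined in the active profile or the environment.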

Jarvis Stubblefield

12/07/2022, 6:09 PM
Seems so… I have proven this by going ahead and running the agent directly within the environment, and it worked. Running as a systemd service is where the issue is.
It just picked up the late run and ran it when I did the following…
$ PREFECT_LOGGING_LEVEL=DEBUG prefect agent start --work-queue "tenzinga_django_prod"
Starting v2.7.0 agent connected to https://api.prefect.cloud/api/accounts/5626ffe9-0140-4e88-babc-4a4fc614bb99/workspaces/ee8a533d-2754-420e-87f2-2d6b084984af...
18:07:52.975 | DEBUG   | prefect.client - Connecting to API at https://api.prefect.cloud/api/accounts/5626ffe9-0140-4e88-babc-4a4fc614bb99/workspaces/ee8a533d-2754-420e-87f2-2d6b084984af/

  ___ ___ ___ ___ ___ ___ _____     _   ___ ___ _  _ _____
 | _ \ _ \ __| __| __/ __|_   _|   /_\ / __| __| \| |_   _|
 |  _/   / _|| _|| _| (__  | |    / _ \ (_ | _|| .` | | |
 |_| |_|_\___|_| |___\___| |_|   /_/ \_\___|___|_|\_| |_|


Agent started! Looking for work from queue(s): tenzinga_django_prod...
18:07:52.979 | DEBUG   | prefect.agent - Checking for scheduled flow runs...
18:07:52.980 | DEBUG   | prefect.agent - Checking for cancelled flow runs...
18:07:53.375 | INFO    | prefect.agent - Submitting flow run '5132e016-5fc1-4422-8b4f-31cbb6f3d808'
18:07:53.941 | INFO    | prefect.infrastructure.process - Opening process 'hypnotic-buffalo'...
18:07:53.942 | DEBUG   | prefect.infrastructure.process - Process 'hypnotic-buffalo' running command: /home/ec2-user/.local/share/virtualenvs/ppower-spzYstDT/bin/python -m prefect.engine in /tmp/tmpl3iv1780prefect
18:07:54.072 | INFO    | prefect.agent - Completed submission of flow run '5132e016-5fc1-4422-8b4f-31cbb6f3d808'
/home/ec2-user/.pyenv/versions/3.10.8/lib/python3.10/runpy.py:126: RuntimeWarning: 'prefect.engine' found in sys.modules after import of package 'prefect', but prior to execution of 'prefect.engine'; this may result in unpredictable behaviour
  warn(RuntimeWarning(msg))
18:07:56.137 | DEBUG   | prefect.client - Connecting to API at https://api.prefect.cloud/api/accounts/5626ffe9-0140-4e88-babc-4a4fc614bb99/workspaces/ee8a533d-2754-420e-87f2-2d6b084984af/
18:08:00.974 | DEBUG   | prefect.agent - Checking for scheduled flow runs...
18:08:04.174 | DEBUG   | prefect.agent - Checking for cancelled flow runs...
18:08:11.022 | DEBUG   | prefect.agent - Checking for scheduled flow runs...
18:08:12.059 | DEBUG   | prefect.agent - Checking for cancelled flow runs...
18:08:20.186 | DEBUG   | prefect.agent - Checking for cancelled flow runs...
18:08:22.396 | DEBUG   | prefect.agent - Checking for scheduled flow runs...
The flow run happens a bit below that… but it is working when executed manually in the environment.
@Kalise Richmond I do not explicitly set the profile in the systemd service…
So that “collision” shouldn’t be the issue.
It would seem to me that maybe it’s not picking up the correct environment setup.
The runtime warning in the above output was odd and I’m not sure why it is complaining…

Kalise Richmond

12/07/2022, 6:16 PM
How do you define the systemd service? I found this discourse that Ryan wrote with an example. https://discourse.prefect.io/t/how-to-run-a-prefect-2-agent-as-a-systemd-service-on-linux/1450

Jarvis Stubblefield

12/07/2022, 6:19 PM
Here is my example…
[Unit]
Description=prefect agent daemon
After=network.target

[Service]
PIDFile=/run/prefect_agent/pid
User=ec2-user
Group=ec2-user
RuntimeDirectory=prefect_agent
WorkingDirectory=/home/ec2-user/tenzinga/ppower
ExecStart=/home/ec2-user/.local/share/virtualenvs/ppower-spzYstDT/bin/prefect agent start --work-queue "tenzinga_django_prod"
ExecReload=/bin/kill -s HUP $MAINPID
ExecStop=/bin/kill -s TERM $MAINPID
Restart=always
PrivateTmp=true

[Install]
WantedBy=multi-user.target
Though I’m thinking I might need to change the ExecStart to something more like this …
pipenv run prefect agent start --work-queue "tenzinga_django_prod"
… that’ll ensure the environment variables are set I think…

Kalise Richmond

12/07/2022, 6:23 PM
Ya I'm just looking at an example I wrote a long time ago and this is what I had:
# Write the unit file as root; a plain `echo ... >>` redirect runs as the unprivileged user.
sudo tee /etc/systemd/system/prefect-agent.service > /dev/null <<'EOF'
[Unit]
Description=Prefect Agent
After=network.target
StartLimitIntervalSec=0
[Service]
Type=simple
Restart=always
RestartSec=1
User=ec2-user
# systemd requires an absolute path here; "~" is not expanded.
ExecStart=/home/ec2-user/.local/bin/prefect agent start 'kr-test'
[Install]
WantedBy=multi-user.target
EOF

Jarvis Stubblefield

12/07/2022, 6:24 PM
So I need a path to pipenv … let me get that in the right spot … 😉

Kalise Richmond

12/07/2022, 6:24 PM
I think I had Prefect installed at /home/ec2-user/.local/bin/prefect on the EC2 instance.
are you able to see anything about the status or logs from systemd when you start the daemon?
systemctl status prefect-agent.service

Jarvis Stubblefield

12/07/2022, 6:27 PM
Yea, right now it’s complaining about my pathing not being absolute…
Okay … when I run the command
/home/ec2-user/.pyenv/shims/pipenv run prefect agent start --work-queue tenzinga_django_prod
it works and starts the agent… however, now when I try to start the systemd service and then check the status, this is what I get…
[ec2-user@ip-172-31-31-75 ppower]$ systemctl status prefect_agent.service
● prefect_agent.service - prefect agent daemon
   Loaded: loaded (/etc/systemd/system/prefect_agent.service; enabled; vendor preset: disabled)
   Active: failed (Result: start-limit) since Wed 2022-12-07 18:28:20 UTC; 2s ago
  Process: 30777 ExecStop=/bin/kill -s TERM $MAINPID (code=exited, status=0/SUCCESS)
  Process: 31406 ExecStart=/home/ec2-user/.pyenv/shims/pipenv run prefect agent start --work-queue tenzinga_django_prod (code=exited, status=127)
 Main PID: 31406 (code=exited, status=127)

Dec 07 18:28:19 ip-172-31-31-75.us-west-2.compute.internal systemd[1]: Unit prefect_agent.service entered failed state.
Dec 07 18:28:19 ip-172-31-31-75.us-west-2.compute.internal systemd[1]: prefect_agent.service failed.
Dec 07 18:28:20 ip-172-31-31-75.us-west-2.compute.internal systemd[1]: prefect_agent.service holdoff time over, scheduling restart.
Dec 07 18:28:20 ip-172-31-31-75.us-west-2.compute.internal systemd[1]: Stopped prefect agent daemon.
Dec 07 18:28:20 ip-172-31-31-75.us-west-2.compute.internal systemd[1]: start request repeated too quickly for prefect_agent.service
Dec 07 18:28:20 ip-172-31-31-75.us-west-2.compute.internal systemd[1]: Failed to start prefect agent daemon.
Dec 07 18:28:20 ip-172-31-31-75.us-west-2.compute.internal systemd[1]: Unit prefect_agent.service entered failed state.
Dec 07 18:28:20 ip-172-31-31-75.us-west-2.compute.internal systemd[1]: prefect_agent.service failed.
Oddly enough, I use a command similar to my first posting to start gunicorn to serve the web application. It picks up the appropriate environment variables…

Kalise Richmond

12/07/2022, 6:40 PM
So weird. OK, so at least we know that the path in the ExecStart command is correct. Now to figure out why the systemd service is failing.
Oh, where did you store the service? It seems like it’s still exiting because it cannot find the command.
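For context, exit status 127 is the conventional "command not found" code, and a service does not inherit the login shell's environment, so something that resolves fine interactively can still fail under systemd. The sketch below approximates that situation; run_with_minimal_path is a hypothetical helper, and the PATH value is an assumption rather than the exact one systemd uses.

```shell
#!/bin/sh
# run_with_minimal_path is a hypothetical helper for reproducing
# "works in my shell, fails in the service" problems.
# It runs a command with an empty environment and a short PATH
# (assumed value; the real systemd default PATH may differ).
run_with_minimal_path() {
    env -i PATH=/usr/bin:/bin "$@"
}

# Usage: a command that cannot be resolved exits with status 127,
# the same code shown in the systemctl status output.
#   run_with_minimal_path pipenv --version; echo "exit status: $?"
```

If a command only fails inside this wrapper, the login shell's PATH or other variables are the likely difference.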

Jarvis Stubblefield

12/07/2022, 6:43 PM
Yes, the systemd service still cannot find the command.
If it is executing as ec2-user and within the working directory, it should find the command no problem…
As I am logged in as ec2-user and in that working directory … I know that’s not exactly how these services work, but still…
The service is saved in /etc/systemd/system/prefect_agent.service where I place my other service files.
I’m not certain that it would pick up the appropriate values from the config, but maybe I just need to set the config values and call it… I would prefer to get it working from the environment…

Kalise Richmond

12/07/2022, 6:51 PM
I think you should be able to use the environment variables no problem. From the place you saved the service, can you run the ExecStart command manually and does that start the agent?

Jarvis Stubblefield

12/07/2022, 7:00 PM
Both the original command above and the one I changed it to run the agent and connect to Prefect Cloud when I run them manually.
However, the systemd service does not. 😞
If I change it back to the original pathing it doesn’t error… it runs, but it doesn’t connect to the cloud, since those environment variables live in the virtualenv set up with pipenv, which is why I thought maybe I should use pipenv run to get that working.
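For anyone reading along, the effect being relied on can be sketched in plain shell. This assumes the variables live in a .env file in the project directory (which pipenv loads automatically) and that the file only contains simple KEY=value lines; run_with_dotenv is a hypothetical helper name, not a pipenv feature.

```shell
#!/bin/sh
# run_with_dotenv is a hypothetical helper that mimics, very roughly,
# what `pipenv run` does with a project's .env file.
# Assumption: the file holds simple KEY=value lines valid as shell syntax.
run_with_dotenv() {
    env_file=$1
    shift
    (
        set -a            # export every variable assigned while sourcing
        . "$env_file"
        set +a
        exec "$@"         # replace the subshell with the requested command
    )
}

# Usage (the path to the .env file is illustrative):
#   run_with_dotenv /path/to/project/.env prefect agent start --work-queue "tenzinga_django_prod"
```

Because the file is sourced inside a subshell, the variables reach the command without leaking into the calling shell.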
Okay… this might be part of it…
I have python set to 3.10.8 within my project directory…
However, I don’t believe I have pipenv installed in that version of python.
Well it says I do…
When I run pyenv whence pipenv it doesn’t list Python 3.10.8 as having pipenv; however, if I attempt to install it, it says it already exists in this version.
Okay … so I THINK the solution was…
I installed pipenv into the environment it was managing… then I could use the bin directory of the virtual environment to do the following…
Working systemd file…
[Unit]
Description=prefect agent daemon
After=network.target

[Service]
PIDFile=/run/prefect_agent/pid
User=ec2-user
Group=ec2-user
RuntimeDirectory=prefect_agent
WorkingDirectory=/home/ec2-user/tenzinga/ppower
ExecStart=/home/ec2-user/.local/share/virtualenvs/ppower-spzYstDT/bin/pipenv run prefect agent start --work-queue "tenzinga_django_prod"
ExecReload=/bin/kill -s HUP $MAINPID
ExecStop=/bin/kill -s TERM $MAINPID
PrivateTmp=true

[Install]
WantedBy=multi-user.target
Adding Restart=always back on there…
Well @Kalise Richmond thanks for being my rubber ducky and providing input and suggestions!! 😄
Seems it is up and running… the queue is showing healthy … excited to see runs happen automatically!
Tomorrow morning will likely be the first…
I think I need to adjust the timing for the schedule, as it is running at 8:45 in the morning instead of 14:45 in the afternoon… oddly enough, that is my timezone difference, which I didn’t take into account.
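The arithmetic behind that can be sketched as below, assuming the schedule was evaluated in UTC and the local offset is six hours behind (the six-hour gap comes from the times above; the UTC part is an assumption). shift_hhmm is a hypothetical helper name.

```shell
#!/bin/sh
# shift_hhmm HOUR MINUTE OFFSET_HOURS
# Hypothetical helper: prints the wall-clock time after adding OFFSET_HOURS
# (which may be negative), wrapped to a 24-hour day.
shift_hhmm() {
    hour=${1#0}; hour=${hour:-0}        # drop a leading zero so 08 is not read as octal
    minute=${2#0}; minute=${minute:-0}
    offset=$3
    total=$(( (hour * 60 + minute + offset * 60) % 1440 ))
    # The shell's % can yield a negative result, so wrap it back into range.
    if [ "$total" -lt 0 ]; then
        total=$(( total + 1440 ))
    fi
    printf '%02d:%02d\n' $(( total / 60 )) $(( total % 60 ))
}

# Usage: a schedule written as 14:45 but read as UTC fires at
#   shift_hhmm 14 45 -6    # prints 08:45 local time
```

Under the same assumption, getting a 14:45 local run means either scheduling 20:45 in UTC or setting an explicit timezone on the schedule.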

Kalise Richmond

12/07/2022, 7:28 PM
Yay! I'm so glad you got this working :marvin: