# prefect-server
h
Hi, I'm trying to set up my Prefect backend and one agent to be always running on my EC2 instance. Docker already starts automatically, and since restart is set to true in docker-server's compose.yml, Prefect server also starts automatically ever since I launched it once with --detach. But how do I make the agent start automatically on boot too? According to the docs I could use supervisor with the local agent. Is there any alternative using the Docker agent? Should I make my own compose.yml for the agent?
z
@Hugo Polloli You can definitely add an agent container to the docker-compose file. You can run
prefect server config
to get a configured compose file then modify it to include an agent service or you can just have a separate compose file.
k
Hi @Hugo Polloli, I think you can also look into startup scripts.
h
Hi, thanks to you both! I'll look into both of these solutions 🙂
i
@Hugo Polloli Hi Hugo. How did you get Prefect server to run on your EC2? I've tried prefect backend server and prefect server start, but I cannot access the UI at publicIP:8080.
k
If you SSH into the EC2, can you see it on localhost:8080?
h
If you try the above and can see the UI on localhost:8080, then did you add a security group to your EC2 instance that opens both ports 8080 and 4200 to the public?
i
Hi @Hugo Polloli, I have tried opening ports 8080 and 4200 to the public. Funny thing is, on Ubuntu it didn't work, but on Linux it works. Maybe there's something I'm missing... Is there a post, thread, or article you followed that documents the whole process from scratch?
h
Just in case, here're my inbound rules for the prefect security group I created,
k
Hey @Ile Lee, I think what probably happened is the server containers did not start properly. Could you check the logs and see if everything is working? Or check the UI on localhost on the ubuntu server?
i
@Hugo Polloli @Kevin Kho I've tried again from scratch. There aren't many people who have deployed Prefect server on AWS, and since you have, could you correct me if any of my steps are wrong? Sorry, I'm still quite new to this field. This is what I've done in the N. Virginia region (us-east-1):
Step 1: Created VPC
• with CIDR block 10.100.0.0/16
Step 2: Created Subnet
• in us-east-1b
• with CIDR block 10.100.1.0/24
• enabled auto-assign public IPv4 address
Step 3: Created Internet Gateway
• then attached it to the VPC
Step 4: Created Route Table
• Added destination 0.0.0.0/0 to target the internet gateway
• In subnet associations, added the subnet as an explicit subnet association
Step 5: Created EC2
• Also created Security Group: added inbound rules TCP 8080 (for Prefect UI), TCP 4200 (for GraphQL), TCP 22 (for SSH) open to anyone (0.0.0.0/0)
• Downloaded a new key pair
Step 6: SSH into the EC2 instance
• updated packages
• installed docker, docker-compose, prefect (ran sudo service docker start, so docker ps works)
Step 7: Created a directory at /home/ec2-user/.prefect
Step 8: Created a config.toml file in /home/ec2-user/.prefect with the following inside it:
[server]
       [server.ui]
       apollo_url = "http://public-ip:4200/graphql"
       graphql_url = "http://public-ip:4200/graphql"
From there I ran:
prefect backend server
prefect server start
The server starts and says I can visit localhost:8080, but when I go to my-public-ip:8080 the Prefect UI doesn't show. I'm sorry for such a long explanation, I just can't figure out why I can't get it running :(
I am using Linux AMI 2
h
Did you apply the security group created at step 5 to the EC2 instance ? See screenshot, then add it. If you already have then sorry but I don't know what else to check 😕
i
I have 😞 Which AMI are you using?
k
Can you confirm localhost 4200 and 8080 are up when you SSH in, just so we know if it’s a networking issue or an issue with spinning those up? Please check both, because the UI image had an issue that was fixed 2 days ago, and someone reported 4200 working but not 8080.
i
@Kevin Kho Hi. Sorry, I wasn't sure exactly how to do it, so I googled it and typed
nmap localhost
Output was as follows:
PORT     STATE SERVICE
22/tcp   open  ssh
3000/tcp open  ppp
5432/tcp open  postgresql
8080/tcp open  http-proxy
It doesn't seem to have port 4200
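For anyone without nmap installed, the same check can be sketched in Python with only the standard library. This is a minimal sketch, not part of Prefect: the helper name `is_port_open` is made up for illustration, and it assumes it is run on the EC2 instance itself so that 127.0.0.1 refers to the server.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success and an error code otherwise.
        return sock.connect_ex((host, port)) == 0


if __name__ == "__main__":
    # 8080 is the UI and 4200 is the GraphQL API, as discussed in this thread.
    for name, port in [("UI", 8080), ("GraphQL API", 4200)]:
        state = "open" if is_port_open("127.0.0.1", port) else "closed"
        print(f"{name} port {port}: {state}")
```

If both ports are open locally but unreachable from outside, the problem is more likely networking (security group, routing) than the containers.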
k
Maybe you can try hitting the API with
prefect get flows
?
Do you also have a browser you can use to open localhost:8080?
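If no browser is available on the instance, the API can also be probed directly. The sketch below assumes the GraphQL endpoint is at http://localhost:4200/graphql (the URL used in the config above) and that it answers the standard `{ __typename }` query like any GraphQL server; the function names are hypothetical, not part of the Prefect CLI or library.

```python
import json
import urllib.request


def build_graphql_request(url: str, query: str) -> urllib.request.Request:
    """Build a POST request carrying a GraphQL query as JSON."""
    payload = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def api_is_up(url: str = "http://localhost:4200/graphql", timeout: float = 5.0) -> bool:
    """Return True if the endpoint answers a trivial query without errors."""
    request = build_graphql_request(url, "{ __typename }")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            body = json.loads(response.read().decode("utf-8"))
    except (OSError, ValueError):
        # Connection refused, timeout, HTTP error, or a non-JSON reply.
        return False
    return isinstance(body, dict) and "data" in body and "errors" not in body


if __name__ == "__main__":
    print("API reachable:", api_is_up())
```

Running it once over SSH and once from a laptop (with the public IP in the URL) helps separate "the API never started" from "the API is up but blocked by networking".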
i
by localhost:8080 do you mean the Linux localhost?
Because on my physical computer I go to the browser, type ec2-instance-public-ip:8080
k
Yep 8080 is the UI so I’m wondering if you can open the UI with a web browser on the EC2 instance so that we know if it’s a networking issue or the container didn’t spin up right. If you SSH into the instance,
localhost:8080
should show the UI. By the way, I should mention, because many people don’t know this, that Prefect Cloud also has 10,000 free task runs per month, which is more than enough to get started.
i
So let's say I've SSH'ed into the EC2
and ran prefect backend server and prefect server start. Let's say my public EC2 instance IP is 3.144.2.10. In Firefox I go to 3.144.2.10:8080, and this is what it says: "Unable to connect. Firefox can’t establish a connection to the server at 44.195.83.50:8080. The site could be temporarily unavailable or too busy. Try again in a few moments. If you are unable to load any pages, check your computer’s network connection. If your computer or network is protected by a firewall or proxy, make sure that Firefox is permitted to access the Web."
On the SSH terminal I do get the "Welcome to PREFECT SERVER" and "Visit http://localhost:8080 to get started, or check the docs at https://docs.prefect.io" messages.
I can't reach the UI for some reason
k
Can you try clearing the Docker images you have and then doing that again? There was a bug in the UI image that was fixed 2 days ago. Is your UI image tagged as 0.15.3?
Any logs in the UI container?
i
I see 0.15.3 !
docker images shows core-0.15.3
k
Delete those images to force the re-download, and then do the
prefect server start
again since we fixed the image two days ago so you’ll be sure to pull the right image this time.
i
So I tried deleting the images using
docker image prune -a
then ran prefect server start, but the images still have the 0.15.3 tag.
Am I doing something wrong?
Sorry I'm a bit of a noob
k
If you did the
prefect server stop
, I think prune should work to remove those images. The fixed images are still tagged 0.15.3; only their contents were updated. The important thing is you should see new images being downloaded when you do
prefect server start
i
Awesome, it is pulling and downloading the images
When I type docker images, it still has the tag 0.15.3
k
That’s good. I would suggest watching the logs to check that everything starts successfully. If there are no errors, you can try the UI at
localhost:8080
again, and hopefully it works. The UI image logs should also tell you if it started successfully. You can find that container with docker, and then get the logs for that specific container.
i
towel_1 is throwing errors from prefect-server.Lazarus, prefect-server.Scheduler, and prefect-server.ZombieKiller:
field not found in type query_root
k
Could you post your error message here? I think that the UI will at least be up 😅
i
towel_1     | {"severity": "ERROR", "name": "prefect-server.Lazarus", "message": "Unexpected error: ValueError([{'extensions': {'path': '$.selectionSet.flow_run', 'code': 'validation-failed'}, 'message': 'field \"flow_run\" not found in type: \\'query_root\\''}])", "exc_info": "Traceback (most recent call last):\n  File \"/prefect-server/src/prefect_server/services/loop_service.py\", line 60, in run\n    await self.run_once()\n  File \"/prefect-server/src/prefect_server/services/towel/lazarus.py\", line 37, in run_once\n    return await self.reschedule_flow_runs()\n  File \"/prefect-server/src/prefect_server/services/towel/lazarus.py\", line 89, in reschedule_flow_runs\n    limit=5000,\n  File \"/prefect-server/src/prefect_server/database/orm.py\", line 501, in get\n    as_box=not apply_schema,\n  File \"/prefect-server/src/prefect_server/database/hasura.py\", line 85, in execute\n    as_box=as_box,\n  File \"/prefect-server/src/prefect_server/utilities/graphql.py\", line 84, in execute\n    raise ValueError(result[\"errors\"])\nValueError: [{'extensions': {'path': '$.selectionSet.flow_run', 'code': 'validation-failed'}, 'message': 'field \"flow_run\" not found in type: \\'query_root\\''}]"}
towel_1     | {"severity": "ERROR", "name": "prefect-server.Scheduler", "message": "Unexpected error: ValueError([{'extensions': {'path': '$.selectionSet.flow', 'code': 'validation-failed'}, 'message': 'field \"flow\" not found in type: \\'query_root\\''}])", "exc_info": "Traceback (most recent call last):\n  File \"/prefect-server/src/prefect_server/services/loop_service.py\", line 60, in run\n    await self.run_once()\n  File \"/prefect-server/src/prefect_server/services/towel/scheduler.py\", line 46, in run_once\n    offset=500 * iterations,\n  File \"/prefect-server/src/prefect_server/database/orm.py\", line 501, in get\n    as_box=not apply_schema,\n  File \"/prefect-server/src/prefect_server/database/hasura.py\", line 85, in execute\n    as_box=as_box,\n  File \"/prefect-server/src/prefect_server/utilities/graphql.py\", line 84, in execute\n    raise ValueError(result[\"errors\"])\nValueError: [{'extensions': {'path': '$.selectionSet.flow', 'code': 'validation-failed'}, 'message': 'field \"flow\" not found in type: \\'query_root\\''}]"}
towel_1     | {"severity": "ERROR", "name": "prefect-server.ZombieKiller", "message": "Unexpected error: ValueError([{'extensions': {'path': '$.selectionSet.task_run', 'code': 'validation-failed'}, 'message': 'field \"task_run\" not found in type: \\'query_root\\''}])", "exc_info": "Traceback (most recent call last):\n  File \"/prefect-server/src/prefect_server/services/loop_service.py\", line 60, in run\n    await self.run_once()\n  File \"/prefect-server/src/prefect_server/services/towel/zombie_killer.py\", line 216, in run_once\n    await self.reap_zombie_task_runs()\n  File \"/prefect-server/src/prefect_server/services/towel/zombie_killer.py\", line 153, in reap_zombie_task_runs\n    apply_schema=False,\n  File \"/prefect-server/src/prefect_server/database/orm.py\", line 501, in get\n    as_box=not apply_schema,\n  File \"/prefect-server/src/prefect_server/database/hasura.py\", line 85, in execute\n    as_box=as_box,\n  File \"/prefect-server/src/prefect_server/utilities/graphql.py\", line 84, in execute\n    raise ValueError(result[\"errors\"])\nValueError: [{'extensions': {'path': '$.selectionSet.task_run', 'code': 'validation-failed'}, 'message': 'field \"task_run\" not found in type: \\'query_root\\''}]"}
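Log lines like the ones above are hard to read raw. Since each one is a service prefix followed by a JSON object, a short script can reduce them to service, logger name, and message. This is only a sketch: the function names are made up, and it assumes the `service | {json}` layout seen in this thread, skipping any line that does not match.

```python
import json


def parse_compose_log_line(line: str):
    """Split 'service | {json}' into (service, record); None if it does not fit."""
    service, sep, payload = line.partition("|")
    if not sep:
        return None
    try:
        record = json.loads(payload.strip())
    except ValueError:
        return None
    if not isinstance(record, dict):
        return None
    return service.strip(), record


def summarize_errors(lines):
    """Collect (service, logger name, message) for every ERROR record."""
    summary = []
    for line in lines:
        parsed = parse_compose_log_line(line)
        if parsed is None:
            continue
        service, record = parsed
        if record.get("severity") == "ERROR":
            summary.append((service, record.get("name"), record.get("message")))
    return summary


if __name__ == "__main__":
    import sys

    # Pipe the server logs into this script on stdin.
    for service, name, message in summarize_errors(sys.stdin):
        print(f"{service}: {name}: {message}")
```

Applied to the three lines above, this would show that all three towel services fail the same way: a field missing from `query_root`.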
The UI doesn't connect either
😞
Also, when running
nmap localhost
shouldn't there be an entry for port 4200?
k
I guess? Looks like the API didn’t spin up properly. What AMI are you using? I’ll give this a shot.
i
I've tried with Linux AMI 2
And then with Ubuntu
The current setup is with Ubuntu
k
So I tried this and stuff is starting. I can access the API from my local machine, but not the UI, so it seems like there might still be something wrong there. Gonna have to ask the team and get back to you on Monday, but
towel
and the
api
should at least start. I did this on t5.large Linux AMI 2. Just had to get docker and docker compose working and then
prefect server start
.
i
I appreciate this so much, thank you.
k
Hey @Ile Lee, I think the easiest thing to try is to use the 0.15.2 image like
prefect server start --version core-0.15.2
i
@Kevin Kho Hi. Okay, I shall try that. I've also tried locally on my PC, and somehow I can reach the UI there. It just seems to be a problem when trying on EC2.
k
you do reach it on
localhost
or with an IP address?
i
On the computer I reached on localhost:8080
On the EC2 I used ip-address:8080
I've tried.
prefect server start --version core-0.15.2
Doesn't work 😕 I've tried with other versions as well. The closest I got was getting the UI's navigation bar. I messed around with it and then it stopped working again... Will try again
k
What do you mean with the UI Navigation bar? Does the API work (port 4200)?
i
I tried different versions using prefect server start --version <version-number>. There was one version where I could see the blue bar on top with the Prefect logo and some navigation links, but nothing else on the main page.
k
Hard to say what is going on. Honestly I would recommend a fresh start and using
prefect server start --version core-0.15.2
. This should work. For your setup, it seems like some stuff is not working right. Do you have logs you can share? I used the Linux AMI 2 and t5.large when I got it working.
i
Ohhhh myyyyy goshhh!!! It works!!!
I started it from scratch and used a t3.large instance with Linux AMI 2 this time, rather than a t2.micro. I wonder why though... Could it be that the instance was too small and low on compute?
Thank you so much Kevin by the way for all your help and time !
k
I think so. You probably didn’t have enough resources to spin everything up. Thanks for the patience, and 0.15.3 was re-tagged yesterday, I think, so it might work already.
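A quick way to spot an undersized instance before starting the server is to check total memory. The sketch below is Linux-only and uses just the standard library. The 4 GiB threshold is an assumption chosen for illustration, not an official Prefect requirement; for reference, a t2.micro has 1 GiB of memory and a t3.large has 8 GiB.

```python
import os

GIB = 1024 ** 3
# Assumption for illustration only, not an official Prefect requirement.
MIN_MEMORY_GIB = 4.0


def total_memory_gib() -> float:
    """Total physical memory in GiB, read via sysconf (Linux)."""
    page_size = os.sysconf("SC_PAGE_SIZE")
    page_count = os.sysconf("SC_PHYS_PAGES")
    return page_size * page_count / GIB


def has_enough_memory(total_gib: float, minimum_gib: float = MIN_MEMORY_GIB) -> bool:
    """Return True if total_gib meets the chosen minimum."""
    return total_gib >= minimum_gib


if __name__ == "__main__":
    total = total_memory_gib()
    verdict = "looks fine" if has_enough_memory(total) else "probably too small"
    print(f"Total memory: {total:.1f} GiB ({verdict})")
```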
i
Thank you for all the help!