Prefect Community #ask-community

Gonzalo

09/01/2021, 6:01 PM

Hello everyone 🙌 I'm having a problem with scheduling flows: none of my flows were scheduled automatically overnight, and I can't activate them through the interactive API (when I try to `set_schedule_active`, I always get `"success": false`). When I try to manually toggle the flow through the GUI, the entire server hangs until I manually kill the GraphQL process on the server. Can anyone help or provide some insight? I'm acting as an interpreter for the tech team, so I have to do some back and forth with my questions.

🆘 1

👀 1

Kevin Kho

09/01/2021, 6:14 PM

Hey @Gonzalo, that’s weird. It seems to be performance related on the GraphQL pod? This is Prefect Server right? Are all of your pods healthy? Is the db running out of memory?

upvote 1

Gonzalo

09/01/2021, 6:24 PM

Hey Kevin, yes, it is indeed Prefect Server. We're using Fargate Agent, not Kubernetes, so we're not sure what to answer regarding pods. The tech team says they're not sure what you mean when you refer to a db running out of memory.

Gonzalo

09/01/2021, 6:24 PM

I just received this error message from them

Zach Angell

09/01/2021, 6:28 PM

@Gonzalo it looks like they're trying to start Prefect Server but port 5432 is already allocated, so Postgres is unable to start

upvote 1
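
A minimal sketch of how the tech team could confirm the diagnosis above on the server host before restarting: it tries to bind the port and reports whether something else already holds it. The function name `port_is_free` and the default bind address are illustrative assumptions, not part of Prefect.

```python
import socket


def port_is_free(port: int, host: str = "0.0.0.0") -> bool:
    """Return True if (host, port) can be bound, i.e. no other process holds it."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            # Typically "address already in use": another process owns the port.
            return False
    return True


if __name__ == "__main__":
    # 5432 is the Postgres port mentioned in the thread.
    print("5432 free:", port_is_free(5432))
```

If this reports that the port is taken, a likely cause is a leftover Postgres process or container from an earlier server start, though the sketch itself cannot tell which process holds the port.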

Zach Angell

09/01/2021, 6:29 PM

If Prefect Server isn't running properly, you will not be able to edit schedules

Gonzalo

09/01/2021, 6:30 PM

Understood, I'll relay the information and I hope that solves it, thanks for the heads up!

👍 1

Gonzalo

09/01/2021, 9:18 PM

They're still having trouble. They restarted Prefect entirely; the following error occurred upon running, and none of the agents are running right now

Kevin Kho

09/01/2021, 9:25 PM

The agent not spinning up makes sense if the server doesn’t spin up. Did all of the containers spin up healthy when you ran `prefect server start`? It still seems like something may not have spun up right

Gonzalo

09/01/2021, 9:31 PM

They're checking the logs right now

Gonzalo

09/01/2021, 9:34 PM

All the containers seem to be ok, this is the latest error

Kevin Kho

09/01/2021, 9:35 PM

This is when starting an agent right? Are you able to access the UI/dashboard?

Gonzalo

09/01/2021, 9:36 PM

Yes, and yes I can access the dashboard

Kevin Kho

09/01/2021, 9:38 PM

Can you try

query { hello }

in the interactive API?

Gonzalo

09/01/2021, 9:39 PM

It returns

{
  "data": {
    "hello": "👋"
  }
}

Kevin Kho

09/01/2021, 9:40 PM

And when you query for flows, you see stuff?

query {flow {
  id
}}

Gonzalo

09/01/2021, 9:40 PM

I see several IDs
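
The two checks above can also be run from a script, for example from the agent machine, instead of the interactive API. This is a minimal sketch, assuming the endpoint accepts a standard GraphQL-over-HTTP POST with a JSON body; the helper names and the `localhost` URL in the usage note are placeholders, not values from this thread.

```python
import json
import urllib.request


def build_graphql_request(endpoint: str, query: str) -> urllib.request.Request:
    """Build a POST request carrying a GraphQL query as a JSON body."""
    body = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        endpoint,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_query(endpoint: str, query: str, timeout: float = 10.0) -> dict:
    """Send the query and return the decoded JSON response."""
    request = build_graphql_request(endpoint, query)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

For example, `run_query("http://localhost:4200/graphql", "query { hello }")` would send the same `hello` query used above; a timeout or connection error from the agent machine would point at networking rather than at Prefect itself.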

Kevin Kho

09/01/2021, 9:41 PM

What kind of agent are you starting? Is that a docker agent?

Gonzalo

09/01/2021, 9:41 PM

It's a Fargate agent

Kevin Kho

09/01/2021, 9:43 PM

Can you show me your config toml? just omit sensitive info

Gonzalo

09/01/2021, 9:43 PM

One second

Gonzalo

09/01/2021, 9:45 PM

[server]
  endpoint = "http://<name>:4200"
  request_timeout = 60
  [server.ui]
    apollo_url = "http://<name>:4200/graphql"

where "<name>" is the url

Kevin Kho

09/01/2021, 9:47 PM

Can you try adding `/graphql` after the endpoint?

Kevin Kho

09/01/2021, 9:48 PM

Is this the config.toml of the server or agent? I meant for the agent

Gonzalo

09/01/2021, 9:50 PM

That's the server's file, they're telling me the agent has no .toml

Kevin Kho

09/01/2021, 9:52 PM

Is the server on the same machine as the agent?

Gonzalo

09/01/2021, 9:54 PM

The server is on an EC2 instance and the agent is on an ECS machine

Gonzalo

09/01/2021, 9:54 PM

(I'm deeply sorry for this cumbersome game of telephone, but I'm also very grateful for your time)

Kevin Kho

09/01/2021, 9:56 PM

So with this walkthrough, you want

[server]
  [server.ui]
    apollo_url = "http://YOUR_MACHINES_PUBLIC_IP:4200/graphql"

on the server and

[server]
endpoint = "YOUR_MACHINES_PUBLIC_IP:4200/graphql"

for the agent, and also `prefect backend server` of course. Really weird if your agent has no config.toml cuz it looks like it’s hitting a nearly correct endpoint there.
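
Since the thread goes back and forth on whether the endpoint ends in `/graphql`, here is a small sketch that normalizes a raw endpoint string before it is written into the agent's config. It assumes the client wants a full URL with an explicit scheme (the agent snippet above omits one), and `normalize_endpoint` is a hypothetical helper, not a Prefect function.

```python
from urllib.parse import urlsplit, urlunsplit


def normalize_endpoint(raw: str, default_scheme: str = "http", path: str = "/graphql") -> str:
    """Return the endpoint with a scheme and a trailing /graphql path."""
    if "://" not in raw:
        # Assumption: a bare host:port means plain HTTP.
        raw = f"{default_scheme}://{raw}"
    parts = urlsplit(raw)
    clean_path = parts.path.rstrip("/")
    if not clean_path.endswith(path):
        clean_path += path
    return urlunsplit((parts.scheme, parts.netloc, clean_path, "", ""))


if __name__ == "__main__":
    print(normalize_endpoint("YOUR_MACHINES_PUBLIC_IP:4200"))
```

The resulting string is what would go into `endpoint` under `[server]` on the agent side, with port 4200 as discussed in this thread.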

Kevin Kho

09/01/2021, 9:57 PM

And of course! no worries

Gonzalo

09/01/2021, 10:35 PM

I think we're not really understanding the config.toml as far as the agent goes, we only have one for the server (we tried adding "/graphql" to the end of the endpoint url to no avail).

Gonzalo

09/01/2021, 10:37 PM

We have a backend.toml which reads:

backend = "server"

that's in the same directory

Kevin Kho

09/01/2021, 10:37 PM

You can make a `config.toml` for the agent also in that same `.prefect` directory

Gonzalo

09/01/2021, 10:47 PM

I think we found something, is GraphQL all on port 4200 by default? Some of our environment variables (such as GRAPHQL_SERVICE_PORT) point to 4201 instead

Gonzalo

09/01/2021, 10:49 PM

Also, making a `config.toml` for the agent under the same directory is confusing since we already have a file with that same name

Kevin Kho

09/01/2021, 10:51 PM

Not inside your working directory. Inside the `.prefect` directory where the `backend.toml` is? You already have a config? GraphQL on port 4201 looks good. But your endpoint will be on 4200

Kevin Kho

09/01/2021, 10:52 PM

They are separate here

Kevin Kho

09/01/2021, 10:54 PM

4200 is apollo, 4201 is the graphql

upvote 1

Gonzalo

09/01/2021, 11:03 PM

Yeah, there's already a `config.toml` there

Zach Angell

09/01/2021, 11:04 PM

@Gonzalo are you able to edit the existing `.prefect/config.toml` file on the agent ECS machine to include Kevin's addition?

[server]
endpoint = "YOUR_MACHINES_PUBLIC_IP:4200/graphql"

Zach Angell

09/01/2021, 11:05 PM

If you do have access to an interactive session on the agent machine, can you also confirm the machine can ping the server? This would rule out AWS networking issues that might be blocking the connection

upvote 1
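
A minimal sketch of the connectivity check suggested above, run from the agent machine. It opens a TCP connection to the server's port instead of sending an ICMP ping, on the assumption that what matters is whether port 4200 is reachable through the AWS security groups; `can_reach` is an illustrative name.

```python
import socket


def can_reach(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to (host, port) succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS failures.
        return False
```

Calling `can_reach("YOUR_MACHINES_PUBLIC_IP", 4200)` from the ECS side and getting `False` would suggest a networking or security-group problem between the agent and the EC2 instance, not a Prefect configuration issue.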
