
Darragh

05/26/2020, 3:35 PM
Hey guys, 2 questions for today…
• Do you know why I would get an output error with the following format? Please note the dodgy URL printed does not contain http, which is most likely the cause of the failure, but I’m not sure what it’s trying to find: “Invalid URL ‘1.2.3.4/graphql/alpha’”
• Am I right in assuming that if I want to run a very simple flow built into Local Storage and with LocalEnvironment, and I register it with my server running on Amazon, the local executor agent should just pick it up? Or am I missing a magic step?

Zachary Hughes

05/26/2020, 3:37 PM
Hi @Darragh! Could you give a bit more detail about where you're seeing the error about the invalid URL? As for running a simple local flow, you're correct. You should be able to register your flow with `flow.register()`, run a local agent, and have it pick it up. If you're seeing issues with that, the first thing I'd check is whether the labels on your flow match the labels for the agent you're running.

Darragh

05/26/2020, 3:39 PM
Labels could be it alright, I haven’t done/changed/added any. Are there recommended sets, or guidelines?
Regarding the error, I see it in the log output from `prefect agent start docker`
Actually I missed a piece from the error, might be more helpful…
Invalid URL ‘1.2.3.4/graphql/alpha’: No schema supplied. Perhaps you meant http://1.2.3.4/graphql/alpha?

Zachary Hughes

05/26/2020, 3:41 PM
When you register your flow, it's auto-labeled with information about the computer from which it was registered. If you spun up your agent on the same computer, those labels should match, but 9 times out of 10 scheduled runs not being picked up by the local agent ends up being a label mismatch. 🙂

Darragh

05/26/2020, 3:42 PM
That’ll be the problem then: the local machine registers it to the remote machine
Any helpful hints on what they should be?? 🙂

Zachary Hughes

05/26/2020, 3:44 PM
They'll be specific to the machine they were registered on. You should be able to navigate to the flow in your UI and see them no problem. Here's an example:
It might also be worth calling out that if you're trying to run a Docker agent, you'll want to build your flow with Docker storage. If you do that, the flow won't be auto-labeled and the flow/agent label mismatch issue becomes something of a moot point.

Darragh

05/26/2020, 3:50 PM
Good point, thanks! The docker ones are building and associating fine, but at runtime I get the URL error above. There’s some env variable gone bass ackwards I think

Will Milner

05/26/2020, 3:51 PM
try setting the value of the graphql_url in your prefect config to the IP of your amazon server
[server]

    [server.ui]
        graphql_url = "http://<SERVER_IP>:4200/graphql"
i ran into this problem myself when trying to run a docker agent on aws

Darragh

05/26/2020, 3:57 PM
Yeah I have that set in an env variable: PREFECT__SERVER__UI__GRAPHQL_URL="http://${PUBLIC_IP}:4200/graphql"
But I ALSO have: PREFECT__CLOUD__API=$PUBLIC_IP which was missing the http 🙂
Is that 2nd var actually needed?
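The missing `http` described above is the kind of thing that can be caught before an agent is started. Below is a minimal sketch, assuming the value is a bare host or host/path such as the `1.2.3.4/graphql/alpha` from the error message. The helper name `ensure_scheme` is hypothetical and is not part of Prefect.

```python
from urllib.parse import urlsplit


def ensure_scheme(url: str, default_scheme: str = "http") -> str:
    """Return url unchanged if it already starts with http(s), else prepend a scheme.

    Hypothetical helper for illustration; Prefect itself does not do this.
    """
    if urlsplit(url).scheme in ("http", "https"):
        return url
    return f"{default_scheme}://{url}"


# The value from the error message has no scheme, so one is added.
fixed = ensure_scheme("1.2.3.4/graphql/alpha")
# A value that already carries a scheme is left alone.
unchanged = ensure_scheme("http://1.2.3.4:4200/graphql")
```

Running a check like this over every URL-valued setting would have flagged the `PREFECT__CLOUD__API` value without having to read the agent logs.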

Will Milner

05/26/2020, 3:59 PM
I don't think so, this is what my config looks like
[server]
  
    [server.ui]
        graphql_url = "http://<SERVER_IP>:4200/graphql"

[environments]

    [environments.local]
        services_host = "<SERVER_IP>"
this is for prefect core btw
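The thread shows the same setting in two forms: the config key `graphql_url` under `[server.ui]` and the environment variable `PREFECT__SERVER__UI__GRAPHQL_URL`. The sketch below spells out the naming pattern those two forms suggest (a `PREFECT` prefix, sections joined with double underscores, upper-cased). The function name `env_var_for` is hypothetical, and the rule is inferred from the examples quoted here rather than taken from Prefect's source.

```python
def env_var_for(config_key: str, prefix: str = "PREFECT") -> str:
    """Map a dotted config key to the environment variable name used in this thread.

    Assumption: sections are joined with double underscores and upper-cased.
    """
    parts = [prefix] + config_key.split(".")
    return "__".join(part.upper() for part in parts)


graphql_var = env_var_for("server.ui.graphql_url")
cloud_api_var = env_var_for("cloud.api")
```

Both names produced here match the variables Darragh quoted, which is a quick way to confirm that an environment variable and a config file entry refer to the same setting.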

Darragh

05/26/2020, 4:01 PM
Yeah that’s what I’m using. I don’t have the environments.local piece set, will give that a go too!
No joy on the labels…

Zachary Hughes

05/26/2020, 4:16 PM
I could definitely be missing something, but it looks like you're missing the `DeepThought.local` tag on your agent.

Darragh

05/26/2020, 4:17 PM
ahhhhhh do I need ALL of the tags?
that one was auto generated, probably need to remove it…

Zachary Hughes

05/26/2020, 4:20 PM
Agent labels need to be a superset of the flow's labels. This doc may be useful, in case you haven't seen it. https://docs.prefect.io/orchestration/execution/overview.html#environments
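The superset rule can be modelled with plain Python sets. This is only an illustration of the rule as stated in this thread, not Prefect's actual matching code; the function name `agent_can_pick_up` and the label `aws-server` are made up for the example.

```python
def agent_can_pick_up(agent_labels, flow_labels) -> bool:
    """True when every label on the flow is also present on the agent."""
    return set(flow_labels) <= set(agent_labels)


# The flow was auto-labeled with the hostname of the machine that registered it.
flow_labels = ["DeepThought.local"]

# An agent started elsewhere without that label does not match...
mismatch = agent_can_pick_up(["aws-server"], flow_labels)
# ...while an agent carrying the flow's label (plus any extras) does.
match = agent_can_pick_up(["aws-server", "DeepThought.local"], flow_labels)
```

Note that a flow with no labels at all is matched by any agent under this rule, which is why Docker storage (no auto-labels) sidesteps the mismatch.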

Darragh

05/26/2020, 4:23 PM
Yeah I saw that, missed the superset part 🙂 It doesn’t mention anything about removing an auto-generated label, is that a thing? The machine registering these flows will be a temporary container in my CI/CD flow, so I don’t want it added as a label

Zachary Hughes

05/26/2020, 4:32 PM
It doesn't look like we have a way to opt out of the auto-generated labels at the moment, but I've opened an issue for that! https://github.com/PrefectHQ/prefect/issues/2653
In the meantime, we can focus on getting Docker storage up and running if that'll help unblock you. FWIW, using Docker storage will likely lead to a smoother CI/CD experience anyways.

Darragh

05/26/2020, 4:45 PM
Yeah the docker storage in general is working well, but it has me beating my head against a separate problem 😂 Think I mentioned it previously…
The registry the docker agent is pulling from is an AWS ECR, and the login times out every 12 hours. I’m trying to figure out a way to get the registry login refreshed, but it’s surprisingly painful…

Will Milner

05/26/2020, 4:48 PM
you could just call docker login in your CI/CD scripts, i do that in mine

Darragh

05/26/2020, 4:50 PM
Yeah, I do that in the CI/CD scripts and it works fine. It’s on the far side that I hit the issue: the EC2 instance I register the flows to, and run the docker agent on, will get a timeout after 12 hours from ECR, so if I have a flow scheduled to run more than 12 hours after I register then I’m in trouble 🙂
Maybe you guys have some experience with it, but it’s driving me mad… I create an EC2 instance through cloudformation, and the cloud-init/userdata commands install prefect and do all the necessary config, and an initial login using this command:
$(aws ecr get-login --no-include-email --region eu-west-1)

Will Milner

05/26/2020, 4:52 PM
won't the image the flow needs already be pulled down locally though? so no need to hit ECR again

Darragh

05/26/2020, 4:55 PM
That one might (at register time), but any subsequent ones won’t, is my assumption. I could be thinking about this wrong…. Let me write this out, make sure I have it right
• CI/CD - containerized build, builds each flow into docker, registers with the Amazon server
• Amazon server, prefect core and docker agent running on same EC2 instance
• Core gets notified of new flow, sends to docker agent to pull down
• EC2 instance [where docker agent is running] needs to be logged in to ECR to pull the image
• ECR login expires after 12 hours
• After 14 hours, CI/CD registers new flow with core on EC2, but the ECR login on EC2 has expired
So it’s my understanding that I need to do some voodoo to maintain that login on the EC2 instance
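One way to keep the login alive on the EC2 instance is a small script run on a timer (cron or similar) that re-authenticates before the 12-hour window closes. The sketch below covers only the pure-Python parts and makes several assumptions: the token and its expiry time come from ECR's authorization-token API (for example via boto3, not shown here), the token is a base64-encoded `user:password` pair, and the one-hour safety margin is arbitrary. All function names are hypothetical.

```python
import base64
from datetime import datetime, timedelta, timezone


def decode_ecr_token(authorization_token):
    """Split a base64-encoded 'user:password' token into (user, password)."""
    decoded = base64.b64decode(authorization_token).decode("utf-8")
    user, _, password = decoded.partition(":")
    return user, password


def needs_refresh(expires_at, now, margin=timedelta(hours=1)):
    """True once we are within `margin` of the token's expiry time."""
    return now >= expires_at - margin


def docker_login_command(user, registry):
    """Build the docker login argv; the password is meant to be piped on stdin."""
    return ["docker", "login", "--username", user, "--password-stdin", registry]


# Illustrative values only: a fake token and the 12-hour lifetime from the thread.
issued = datetime(2020, 5, 26, 12, 0, tzinfo=timezone.utc)
expires = issued + timedelta(hours=12)
fake_token = base64.b64encode(b"AWS:not-a-real-password").decode("ascii")
user, password = decode_ecr_token(fake_token)
```

A timer job would call `needs_refresh`, fetch a new token when it returns true, and pass the decoded password to the command from `docker_login_command` through `subprocess.run(..., input=password, text=True)`.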

Will Milner

05/26/2020, 4:59 PM
I guess it depends on how you configure your docker storage, but you could set the local_image to True on the docker storage - https://docs.prefect.io/api/latest/environments/storage.html#docker
you'd need your CI/CD pipeline to pull those images to the agent though

Darragh

05/26/2020, 5:02 PM
Pipeline is pushing the images out to AWS-ECR from GitLab, so I don’t know how that would work…
Docker agent then running on AWS tries to pull from ECR
I’d like it quite a lot if you were able to tell me I have it all bass ackwards 😂

Will Milner

05/26/2020, 5:03 PM
I don't know how much control you have over your CI/CD, but you could have 1 step that pushes to AWS-ECR, and then another step that runs on the agent machine that pulls that image down locally

Darragh

05/26/2020, 5:09 PM
Hmmmmmmmmmmm yeah I can see how that would work. I’d need to figure out the magic words for logging in to the EC2 box. It’s created on demand using aws-cdk, so the IP is liable to change 🙂
Would I be right in thinking that FargateTaskEnvironment might solve this?

Will Milner

05/26/2020, 5:22 PM
that could work since you could set up your Fargate cluster to be able to authenticate to your ECR using an IAM role

Darragh

05/26/2020, 5:25 PM
And ideally it would mean I’m running the tasks on Fargate env rather than on the same EC2 instance running Core
That seems like a more sensible idea than the shenanigans I’m currently trying to solve 😄
Question on Fargate - better practice to start an instance of fargate agent, or do them on demand from the flow and let it die after use?

Will Milner

05/26/2020, 5:28 PM
really depends on the entire use case, you'll end up saving money if you just do it on demand though

Darragh

05/26/2020, 5:33 PM
👍 Does that hold for scheduled flows as well? If I have a flow, or groups of flows, that I need to run on, say, a weekly schedule, I’d guess on-demand is better. In the flow, something like:
agent = FargateAgent(...)
agent.start()
flow.environment = FargateEnvironment
flow.register()

Will Milner

05/26/2020, 5:38 PM
on-demand is generally going to be cheaper than having a long running instance that just sits idle, the biggest downside would be that you need to wait for your instances to spin up when running your flows, so your overall execution time will be longer

Darragh

05/26/2020, 5:39 PM
That’s fine, the jobs I’ll be running will be measured in hours anyway, so it’s not an issue.
Is my assumption above correct? I can create an on-demand agent from inside my flow definition?

Darragh

05/26/2020, 5:47 PM
The docs suggest the possibility, but nothing explicit. I’ll try it and see 😂

Zachary Hughes

05/26/2020, 5:49 PM
Most folks who use the Fargate Agent have it running on an EC2 instance or something similar and submitting jobs to Fargate. As long as Fargate's properly authed with ECR, I think that'd do the trick for you.

Darragh

05/26/2020, 5:56 PM
Thanks guys, I’ll have a crack at it and let you know how it goes. @Zachary Hughes Would you have any more insight into my question on the on-demand agent? Is it possible to start the agent on demand within the flow?

Zachary Hughes

05/26/2020, 5:59 PM
It's a bit of a chicken and egg situation there. In order for your flow to be picked up, you'd need an agent to be running. In order for your agent to run, you'd need to have your flow picked up. So what I think you'll want to do is have your agent running constantly, but have your flows execute on Fargate. Whether the agent is running on an EC2 instance or your local machine is up to you-- it just needs proper credentials to submit things to Fargate.

Darragh

05/26/2020, 6:05 PM
Fair enough, that clears it up. Thanks guys!
👍 1