My question is regarding Storage. How does anythin...
# ask-community
t
My question is regarding Storage. How does anything change if I add
flow.storage=Local()
to my Hello world flow ? Everything seems to be the same
a
@Tilak Maddy So the storage is relevant when you want to deploy your flow to Prefect Cloud or Prefect Server. Because of the hybrid execution model, Prefect doesn't store your code or data, so the agent needs to know where it can find the flow definition. With a local agent and local storage, all flows that you register get a hostname label attached by default. So using this flow and then registering with the CLI:
prefect register --project p -p flows/local_storage_local_run_implicit.py
will automatically assign a Local storage with the provided local path
flow.storage = Local(path="flows/local_storage_local_run_implicit.py")
It will also assume LocalRun run configuration with your host name label.
flow.run_config = LocalRun(labels=["yourhostname"])
When you then start an agent on the same machine:
prefect agent local start
it will assign your hostname label to the agent, and this is how the agent gets matched with the flow: the LocalRun uses the same label. The reason for the default hostname label is that only a Local agent on the same machine can run this flow, because this specific Local storage path wouldn't exist on another machine (say, an EC2 instance on AWS). Does that answer your question?
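The label matching described above can be sketched in plain Python. This is a simplified model and not Prefect's actual scheduling code: it assumes an agent picks up a flow run only when every label on the flow is also present on the agent. The helper name `agent_can_run` and the "other machine" label are hypothetical.

```python
import socket


def agent_can_run(agent_labels, flow_labels):
    """Return True if every flow label is also an agent label (simplified model)."""
    return set(flow_labels) <= set(agent_labels)


# Default behaviour described above: both sides get the machine's hostname label.
hostname = socket.gethostname()
flow_labels = [hostname]          # implied by the default LocalRun
local_agent_labels = [hostname]   # assigned by `prefect agent local start`
other_machine_labels = ["some-other-host-" + hostname]  # hypothetical agent elsewhere

print(agent_can_run(local_agent_labels, flow_labels))
print(agent_can_run(other_machine_labels, flow_labels))
```

Under this model, the agent on the same machine matches the flow and the agent on a different machine does not, which is the point of the default hostname label.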
t
Okay, I understood the concept @Anna Geller. Your explanation is wonderful! I have a doubt, though. Inspired by this file https://github.com/anna-geller/packaging-prefect-flows/blob/master/flows/github_local_run.py I have created my own flow.
import prefect
from prefect import task, Flow
from prefect.storage import GitHub
from prefect.run_configs import LocalRun


@task
def gitbye():
    logger = prefect.context.get("logger")
    logger.info("Git bye world!")


with Flow("git-bye-flow") as flow:
    gitbye()

flow.storage = GitHub(
    repo="XXX/test-repo",
    path="github_flow.py",
    access_token_secret="XXX"
)
flow.run_config = LocalRun(labels=["dev"])
flow.register(project_name="tutorial")
This file exists on my local machine as well as in the GitHub repo. However, when I trigger a run from Prefect Cloud, my agent picks it up but then the flow fails. Here's what I found: an import error that I don't know how to fix. Could you assist me please?
a
Sure! To use GitHub storage, you need to install Prefect with the GitHub extra. The reason you get the error is that the GitHub package is not installed in your agent's environment. Installing:
pip install "prefect[github]"
And then restarting the agent so that the agent now has this package in its environment:
prefect agent local start --label dev --no-hostname-label
should fix the issue
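To confirm the fix took effect, a quick check along these lines may help. It is only a sketch: it assumes you run it with the same Python interpreter the agent uses, and that `github` is the import name of the PyGithub package pulled in by the `prefect[github]` extra. The helper name `missing_modules` is hypothetical.

```python
import importlib.util


def missing_modules(module_names):
    """Return the names from module_names that cannot be found in this environment."""
    return [name for name in module_names if importlib.util.find_spec(name) is None]


# "github" is the import name of PyGithub (assumption stated above).
missing = missing_modules(["prefect", "github"])
if missing:
    print("Missing in this environment:", ", ".join(missing))
else:
    print("Prefect and the GitHub dependency are importable.")
```

If `github` shows up as missing, the agent would hit the same import error when it tries to load the flow from GitHub storage.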
t
Thanks a ton @Anna Geller, GitHub storage with a local run has worked for me with your help. Now I am trying GitHub storage with a Vertex agent, inspired by https://github.com/anna-geller/packaging-prefect-flows/blob/master/flows/github_vertexrun.py. The issue I ran into is IAM PERMISSION DENIED (attached screenshot). I think it has to do with the permissions set up on the service account in GCP, and I don't know much about that. Please help me! How do I fix it? We are registered as an org on GCP, but I am not the owner, so I need to know which permissions to request. To be honest, I don't know much about IAM policies; they are too hard to understand. All I want is a Vertex agent in production that can run the flows.
prefect agent vertex start --label dev --service-account XXX@XXX.iam.gserviceaccount.com
Above command runs without errors but when I trigger a run from the prefect cloud, i get errors
a
@Tilak Maddy I actually was able to start the Vertex agent without providing a service account to the Vertex agent, but instead providing a GCP project:
prefect agent vertex start --project prefect-community
t
Hey @Anna Geller! Actually I think you missed the message before that. It was a continuation. https://prefect-community.slack.com/archives/CL09KU1K7/p1638470843234100?thread_ts=1638446024.174500&cid=CL09KU1K7
The problem was not with starting the vertex agent, it was with getting the flow to run when i trigger it in the prefect cloud ui
a
so the reason why your flow doesn’t run is probably because the agent is not configured correctly, right? what error do you get exactly? what is your run_config?
t
flow.run_config = VertexRun(labels=["dev"])
Since it's a 403 permission denied error, like you said, I think it has to do with the Vertex agent config.
@Anna Geller here's the stack trace, in case it helps:
    raise exceptions.from_grpc_error(exc) from exc
google.api_core.exceptions.PermissionDenied: 403 Permission 'aiplatform.customJobs.create' denied on resource '//aiplatform.googleapis.com/projects/edilitics1/locations/us-central1' (or it may not exist). [reason: "IAM_PERMISSION_DENIED"
domain: "aiplatform.googleapis.com"
metadata {
  key: "permission"
  value: "aiplatform.customJobs.create"
}
metadata {
  key: "resource"
  value: "projects/edilitics1/locations/us-central1"
}
]
a
The service account is wrong. Can you try this?
1. Create a Service Account and JSON key, e.g. with Basic Owner permissions
2. Download the JSON key from GCP
3. Run these commands:
export GOOGLE_APPLICATION_CREDENTIALS="/Your/path/to/key.json"
pip install prefect --upgrade
pip install "prefect[gcp]" --upgrade
prefect agent vertex start --project prefect-community --label vertex
or, in your case, use the label dev instead of vertex.
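Before starting the agent, it can be worth checking which service account the exported key actually belongs to. The sketch below only reads the key file that `GOOGLE_APPLICATION_CREDENTIALS` points to and reports its `client_email` and `project_id` fields; it does not check IAM permissions such as `aiplatform.customJobs.create`. The function name `describe_service_account_key` is hypothetical.

```python
import json
import os


def describe_service_account_key(env_var="GOOGLE_APPLICATION_CREDENTIALS"):
    """Return (client_email, project_id) from the key file the env var points to."""
    path = os.environ.get(env_var)
    if not path:
        raise RuntimeError(f"{env_var} is not set")
    with open(path) as f:
        key = json.load(f)
    if key.get("type") != "service_account":
        raise ValueError(f"{path} does not look like a service account key")
    return key.get("client_email"), key.get("project_id")


if __name__ == "__main__":
    try:
        email, project = describe_service_account_key()
        print(f"Agent would authenticate as {email} in project {project}")
    except (RuntimeError, ValueError, OSError) as exc:
        print(f"Credential check failed: {exc}")
```

If the reported account or project is not the one you expect, the 403 from the stack trace is likely to come back regardless of how the agent is started.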
t
prefect agent vertex start --project prefect-community --label vertex
Hey the project here refers to the project name in your gcp right ? Anyways I'll try it and get back. Thanks for all the help so far
a
correct, this is the GCP project name
t
@Anna Geller thanks a lot, I followed what you said above and was able to successfully run the project on a vertex agent that pulls flow storage from Github. This is super exciting !! But we have a slight problem here. I don't want to keep this -
prefect agent vertex start --project prefect-community --label vertex
command running on my local machine forever. I just want vertex to be running forever as an agent on GCP waiting for cloud to tell it when to trigger a run . Then it should fetch the file from Github and execute the flow. What do I do ?
@Anna Geller I am asking this question by looking at this

Diagram

a
@Tilak Maddy it's a great question! Many agents, such as the Kubernetes or ECS agents, have an option to run the agent process itself within a cluster. Vertex, however, doesn't have it. So the quickest way to make this agent run 24/7 would be to spin up a small VM on GCP, install prefect and supervisor there, and run the agent as a daemon process on that VM. The supervisor part is explained here for the local agent, but it will work the same way for Vertex.
t
Okay let me try that I'll get back to you. Thanks again
but will work the same way for Vertex
Hey @Anna Geller unfortunately it did not work ! Here is the command I ran
prefect agent vertex install --label dev --service-account XXX@XXX.iam.gserviceaccount.com
Usage: prefect agent vertex [OPTIONS] COMMAND [ARGS]...
Try 'prefect agent vertex -h' for help.

Error: No such command 'install'.
a
@Tilak Maddy the install command only generates the supervisor file. You can use the one for the local agent, but within the generated supervisor file you would then replace "prefect agent local start" with "prefect agent vertex start --label dev"
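That replacement can also be scripted. The sketch below is a minimal illustration, not a Prefect utility: `to_vertex_command` is a hypothetical helper, and `sample_conf` is a made-up stand-in for the generated supervisor file, whose real contents will differ.

```python
def to_vertex_command(supervisor_conf, vertex_command="prefect agent vertex start --label dev"):
    """Replace the local agent start command with the Vertex one in a config string."""
    local_command = "prefect agent local start"
    if local_command not in supervisor_conf:
        raise ValueError("local agent start command not found in config")
    return supervisor_conf.replace(local_command, vertex_command)


# Hypothetical stand-in for a generated supervisor file; the real file will differ.
sample_conf = """[program:prefect-agent]
command=prefect agent local start
"""

print(to_vertex_command(sample_conf))
```

Any flags that follow the original start command in the real file are left untouched by this replacement, so review the resulting command line before handing it to supervisor.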
t
Hey @Anna Geller, I did that, and the command was running in the background as a daemon process. It all worked even after closing the SSH terminal (which GCP offers). However, after 5 hours, when I looked at the Prefect Cloud dashboard > Agents, it said that the last query time was 4 hours 36 minutes ago. So the background process was apparently only running for about 20 minutes after I left the SSH session. So I ran the command
supervisorctl
Then ironically I saw that the prefect-agent process was in
RUNNING
mode. I typed
fg prefect-agent
to foreground the process and everything became normal. But look, I want it to run no matter what. I feel like this is a common issue and I can't seem to find the fix. Help me
🙏 please ?
Jesus! I was wrong! It's funny, I'll tell you what: it is a Prefect Cloud UI bug. When I closed the lid of my laptop and reopened it after 4 to 5 hours, the Cloud UI didn't update the last queried time. I had to refresh the page, and then it showed the correct last queried time, which lies in the range of 0 to 10 seconds.
a
Nice work! Yeah the UI generally refreshes automatically, but sometimes a manual refresh can be helpful.