Hi All, a little confused how Prefect agent knows ...
# prefect-community
m
Hi all, I'm a little confused about how the Prefect agent knows about the flow code. I am using Prefect Cloud and running a local Prefect agent. I have a simple Python workflow script that calls flow.register(project_name="Prefect Demo"), which registers the flow with Cloud. I'm assuming Cloud only has the metadata of the flow. How does the Prefect agent get to know about the workflow code?
n
Hi @Milly gupta - when you call flow.register, Prefect creates a serialized version of the metadata of your flow (for example, where the flow lives and what the task dependency graph looks like) and sends it to the API. When your agent queries the API for flows that need to be run, the API returns that serialized flow, which tells the agent where to find your flow code and how to run it. If you haven't specified storage for your flow, it defaults to Local storage. Local storage stores a reference to your flow in the form of a directory path, which an agent running on that same machine should be able to retrieve and run. Take a look at our docs on flow storage and the local agent for more information.
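To make the split between metadata and code concrete, here is a minimal sketch using only the standard library. It is not Prefect's actual serialization format: the field names (`tasks`, `edges`, `storage`, `path`) and the file path are assumptions chosen for illustration. The point is that the registered payload describes the flow and points at where the code lives, but carries no function bodies.

```python
import json

# Hypothetical stand-in for the payload sent at registration time.
# Field names and the path are illustrative, not Prefect's schema.
flow_metadata = {
    "name": "hello-flow",
    "tasks": ["say_hello"],
    "edges": [],  # a single task, so there are no dependencies
    "storage": {
        "type": "Local",
        "path": "/home/me/.prefect/flows/hello-flow.prefect",
    },
}

# This string is all the API would hold in this sketch.
payload = json.dumps(flow_metadata)

# What an agent-side consumer can recover: a location, not the code itself.
received = json.loads(payload)
storage_path = received["storage"]["path"]
print(storage_path)
```

If the agent runs on a machine where that path does not exist, it has nothing to load, which is why Local storage only works when the agent shares a filesystem with the registered flow.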
m
Ah ok, makes sense. So when running the agent in a Docker container, we need some storage like S3 or Azure Blob Storage?
n
That's correct - local storage would only work if you explicitly put your code in the container
m
OK. Then should flow.register be run from the same container where the Prefect agent is running? What is the recommended way to set this up?
n
That depends on what setup you're going for - if you want to run everything in the same container and use local storage, then it's probably easiest to run the register step from the container. Otherwise you could use some storage mechanism and you won't need to worry about where you register your flow.
m
Hi @nicholas, we talked about this a couple of months ago. In my setup I am using module storage, but when I register the flow, I can't see any tasks in the flow. My Docker container, which has the Prefect agent running, can detect the flow but can't see any tasks. I am getting:
Failed to retrieve task state with error: ClientError([{'path': ['get_or_create_task_run_info'], 'message': 'Expected type UUID!, found ""; Could not parse UUID: ', 'extensions': {'code': 'INTERNAL_SERVER_ERROR', 'exception': {'message': 'Expected type UUID!, found ""; Could not parse UUID: ', 'locations': [{'line': 2, 'column': 101}], 'path': None}}}])
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/dist-packages/prefect/engine/cloud/task_runner.py", line 154, in initialize_run
    task_run_info = self.client.get_task_run_info(
  File "/usr/local/lib/python3.8/dist-packages/prefect/client/client.py", line 1399, in get_task_run_info
    result = self.graphql(mutation)  # type: Any
  File "/usr/local/lib/python3.8/dist-packages/prefect/client/client.py", line 319, in graphql
    raise ClientError(result["errors"])
prefect.utilities.exceptions.ClientError: [{'path': ['get_or_create_task_run_info'], 'message': 'Expected type UUID!, found ""; Could not parse UUID: ', 'extensions': {'code': 'INTERNAL_SERVER_ERROR', 'exception': {'message': 'Expected type UUID!, found ""; Could not parse UUID: ', 'locations': [{'line': 2, 'column': 101}], 'path': None}}}]
n
Hi @Milly gupta - how large is your flow?
m
Very small just a hello world, it has only one task
n
Very interesting - is this against Cloud or Server?
m
Cloud
n
Got it, ok so I'll need a little more info: could you share your code and let me know which version of Prefect your flow is registered with and which version of Prefect your agent is running?
m
Yeah let me check
Prefect agent version - CORE VERSION 0.14.15
How can I check cloud version?
This is test_module.py:
from prefect import task, Flow

@task
def say_hello():
    print("hello")

with Flow("hello-flow") as flow:
    say_hello()
n
You can find the version of Prefect your flow is registered with on the flow page of your Cloud account, here:
m
and this is where I'm adding the storage and registering:
from prefect import Flow
from prefect.storage import Module

flow = Flow("hello-flow", storage=Module("test_module"))
flow.register(project_name="CDP")
Prefect Core Version: 0.14.15
n
Perfect, thank you. Can you try re-registering your flow?
m
Yeah sure
Just did
n
ok can you try running again?
m
I can't see any tasks in it
n
Interesting. Let me look on my end. Can you send a flow id?
m
Can I get the flow id from the UI?
Completed flow run submission (id: b6cf236d-6f38-4e84-b60f-90d82e5b4351)
n
Wait I think I know the issue. It looks like you've defined your flow twice, once here:
@task
def say_hello():
    print("hello")
with Flow("hello-flow") as flow:
    say_hello()
and another time here:
flow = Flow("hello-flow", storage=Module("test_module"))
flow.register(project_name="CDP")
Both have the same name but only 1 has tasks attached and it's not the one you're calling flow.register on.
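The underlying Python behavior can be sketched without Prefect at all. `MiniFlow` below is a hypothetical stand-in class, not Prefect's `Flow`; it only illustrates that constructing a second object with the same name produces a new, empty object rather than a handle to the first one.

```python
class MiniFlow:
    """Hypothetical stand-in for a flow object; not Prefect's Flow class."""

    def __init__(self, name, storage=None):
        self.name = name
        self.storage = storage
        self.tasks = []


# First definition: this object has a task attached.
flow_with_task = MiniFlow("hello-flow")
flow_with_task.tasks.append("say_hello")

# Second definition: same name, but a brand-new object with no tasks.
flow_to_register = MiniFlow("hello-flow", storage="test_module")

same_name = flow_with_task.name == flow_to_register.name
same_object = flow_with_task is flow_to_register
print(same_name, same_object, len(flow_to_register.tasks))
```

Sharing a name does not link the two objects, so registering the second one registers a flow with zero tasks.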
m
Ah so how can I register the flow and use module storage
n
You'll want to attach the storage to the flow you want to register, so this:
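from prefect import task, Flow
from prefect.storage import Module

@task
def say_hello():
    print("hello")

# Attach the storage to the same flow object that holds the task
storage = Module("test_module")
with Flow("hello-flow", storage=storage) as flow:
    say_hello()

flow.register(project_name="CDP")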
m
So I want to register the flow on a separate machine.
n
I'm not sure I understand your question, can you clarify?
m
I want to create the flow in one file but register as a different step
I think I don't understand how module storage works
In your above example what is test_module referring to?
n
I'm honestly not sure, that's what you sent me haha
Basically, module storage allows you to reference flows that are in modules that are installed and executable in your environment
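As a rough illustration of what "installed in your environment" means here, the sketch below resolves a flow by importing a module by name with the standard library. It is not Prefect's implementation: `load_flow_from_module`, the `flow` attribute name, and the fake module are assumptions made for the example.

```python
import importlib
import sys
import types


def load_flow_from_module(module_name, attr="flow"):
    """Import `module_name` and return its `attr` object.

    Illustrative only: raises ModuleNotFoundError when the module is
    not importable in the current environment.
    """
    module = importlib.import_module(module_name)
    return getattr(module, attr)


# Simulate an environment where test_module is installed.
fake = types.ModuleType("test_module")
fake.flow = "hello-flow object"
sys.modules["test_module"] = fake

loaded = load_flow_from_module("test_module")

# Simulate an environment where the module is not installed.
try:
    load_flow_from_module("module_not_installed_here")
    missing_ok = True
except ModuleNotFoundError:
    missing_ok = False

print(loaded, missing_ok)
```

The same lookup has to succeed wherever the flow runs, so the module must be importable in the agent's environment and not only on the machine that registered the flow.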
m
If I register the above flow outside the container (where the Prefect agent is deployed), will the container be able to execute the flow?
Do you have any working example?
n
Only if the module is installed in the agent's execution environment as well
I don't have one, since this is related to python modules being installed in the local environment.
It sounds like you want to use some sort of external storage to load and execute your flows outside of the environment where you have them written. Have you looked at Docker storage? This is typically one of the most reliable and easiest to configure external storage types.
m
Thanks @nicholas. But I am trying to use Module storage option atm