# prefect-community
c
Hey folks 👋 I'm completely new to Prefect, but it sounds amazing. What are the best practices for structuring flows to be run with Prefect Cloud? I have a hello-world example that I tried with our Snowflake warehouse. When ending the script with `flow.run()` (instead of `flow.register()`) and running with a local agent, everything works great. However, when I run the following script to register it in our Prefect Cloud account and execute it through the UI, it works (as in, the query gets executed) but I get the following error: `Unexpected error: TypeError("can't pickle _thread.lock objects")`.
My question is: what are best practices for structuring these scripts? Once you've built the flow, should you always end it with `flow.run()` and `flow.register()`? Or do you register flows manually when making changes, without including the `register()` call inside the actual script?
Here’s the toy example I ran:
import prefect
from prefect import task, Flow
from prefect.tasks.snowflake import SnowflakeQuery


query = """
    SHOW DATABASES;
"""

snowflake = SnowflakeQuery(
    {account info},
    query=query
)

flow = Flow("hello-snowflake", tasks=[snowflake])

flow.register(project_name="analytics")
j
Hi Charles, that sounds like the return value from the task isn't pickleable (which is required when running with `Results` enabled). We can fix that.
@Marvin open "Result from snowflake task isn't pickleable"
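To illustrate what "isn't pickleable" means here, below is a minimal sketch using only the standard library. The thread does not say what the Snowflake task actually returns, so `FakeConnection` is a hypothetical stand-in for any object that holds a thread lock (as the error message suggests), and `is_pickleable` is an illustrative helper, not part of Prefect.

```python
import pickle
import threading


class FakeConnection:
    """Hypothetical stand-in for a task return value that holds a thread lock."""

    def __init__(self):
        self._lock = threading.Lock()


def is_pickleable(obj) -> bool:
    """Return True if pickle can serialize obj, False otherwise."""
    try:
        pickle.dumps(obj)
    except (pickle.PicklingError, TypeError, AttributeError):
        return False
    return True


# Plain rows (lists, tuples, strings) serialize without trouble.
rows = [("DB_ONE",), ("DB_TWO",)]
print(is_pickleable(rows))

# An object that carries a lock cannot be pickled, which raises the TypeError above.
print(is_pickleable(FakeConnection()))
```

Under that assumption, a task that returns plain data instead of a connection-like object would avoid the error when results are stored.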
j
> My question is: what are best practices for structuring these scripts? Once you've built the flow, should you always end it with `flow.run()` and `flow.register()`? Do you manually register flows when making changes but not include the `register()` method inside the actual script?
You have full flexibility here. Generally, I recommend that users write their whole flow in a single file when possible (this makes it easier to ship around via Prefect's storage or Dask). If running locally or debugging, I generally either import the flow and call `flow.run` or end the script with `flow.run()`. When registering, I generally call `flow.register` in the script itself, but you can also use `prefect register flow -f yourflow.py -p your-project` to register using the CLI (then no `flow.register` call is needed in the file itself).
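One way to keep everything in a single file while still choosing between a local run and a registration is sketched below. It assumes the pre-2.0 Prefect API used in this thread (`Flow`, `task`, `flow.run`, `flow.register`). The `--register` and `--project` flags, the helper names, and the `say_hello` task are all hypothetical and only for illustration.

```python
import argparse


def build_flow():
    """Build and return the flow (assumes the pre-2.0 Prefect API)."""
    # Imported here so the argument parsing can be used without Prefect installed.
    from prefect import Flow, task

    @task
    def say_hello():
        print("hello")

    with Flow("hello-flow") as flow:
        say_hello()
    return flow


def parse_args(argv=None):
    """Parse the (hypothetical) command-line flags for this script."""
    parser = argparse.ArgumentParser(description="Run or register the flow.")
    parser.add_argument("--register", action="store_true",
                        help="register with the backend instead of running locally")
    parser.add_argument("--project", default="analytics",
                        help="project name to register the flow under")
    return parser.parse_args(argv)


def main(argv=None):
    args = parse_args(argv)
    flow = build_flow()
    if args.register:
        # Talks to the backend (Cloud/Server).
        flow.register(project_name=args.project)
    else:
        # Local run only; never talks to the backend.
        flow.run()
```

Ending the file with a call to `main()` would let the same script be run locally by default and registered only when `--register` is passed, so everyday debugging runs never trigger a registration.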
c
Awesome, thanks for your insights! If I end the flow with:
flow.register()
flow.run()
and run the flow on a daily schedule, would it then get registered (and have its version number incremented in Prefect Cloud) every time it runs?
j
Hmmm, I think there's some confusion here (or maybe not). Let me try to clarify:
• `flow.run` does a local run only. Runs this way never communicate with a backend (Prefect Cloud/Server); they're local only. This does not schedule a flow run with a backend.
• `flow.register` registers a flow with a backend. Each call will bump the version number.
If you're running a flow with a `Schedule`:
• `flow.run` will start a long-running process that will wait in between runs in accordance with the schedule.
• `flow.register` will register the flow with the backend (Cloud/Server). The backend will then schedule flow runs in accordance with that schedule, which are eventually picked up by the agent.
During a backend-executed (using Cloud/Server) flow run, it doesn't matter what the originating "script" ended with; all calls to `flow.run`/`flow.register` in the script are ignored.
c
Beautiful — I think that clarifies things for me. It was the relationship between the flow and the backend that was unclear. That makes a ton of sense now — thank you! 🙏