# ask-community
c
Hi all, I have a quick question about how registering flows with s3 storage works.
I have my flows set up so that each agency has a flow_<agency_name>.py file containing the main flow; I run and register the flow from this file. Additionally, I have a tasks/<agency_name>.py file where the tasks for that flow are written, and these are imported into the flow file. My question is: when I register the flow, are the tasks imported into the flow file included in the object stored in S3?
I'm also referencing SQL files by path in some tasks; it's clear those are not included, which makes sense.
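For context on the question: pickle-based storage serializes the flow object itself, and that object holds references to its task functions. Below is a minimal stdlib-pickle sketch of the idea; FlowLike and extract_data are hypothetical stand-ins, not Prefect APIs, and note that stdlib pickle stores functions by reference while cloudpickle (which Prefect actually uses) can serialize the function definitions themselves, which is why imported tasks travel with the stored flow.

```python
import pickle

# Stand-in for a task function that would live in tasks/<agency_name>.py.
def extract_data():
    return "rows"

# Stand-in for a Flow: it holds references to its task functions.
class FlowLike:
    def __init__(self, name, tasks):
        self.name = name
        self.tasks = tasks

flow = FlowLike("flow_demo", [extract_data])

# Serializing the flow captures the task functions it holds. Stdlib pickle
# records them by module/name reference; cloudpickle additionally serializes
# their bytecode, making the stored flow self-contained.
blob = pickle.dumps(flow)
restored = pickle.loads(blob)
print(restored.name, [t() for t in restored.tasks])  # → flow_demo ['rows']
```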
a
Hi @Claire Herdeman, first, you need to install the aws subpackage:
```shell
pip3 install "prefect[aws]"
```
Then you have several options to configure S3 storage:
1. You can choose the default pickle-based storage, which stores your flow in serialized form.
2. Or you can choose script-based storage, which assumes your flow is stored on S3 as a normal .py file; pass stored_as_script=True to configure that.
3. If you went for script-based storage, you can additionally let Prefect upload your local flow file to S3 by passing local_script_path as an argument. Otherwise, you can let your flows be pushed to S3 as part of a CI/CD pipeline.
Here is an example with a local agent:
```python
from prefect.storage import S3
from prefect.run_configs import LocalRun
from prefect import task, Flow

FLOW_NAME = "s3_storage_demo"
STORAGE = S3(
    bucket="prefect-datasets",
    key=f"flows/{FLOW_NAME}.py",
    stored_as_script=True,
    # if you add local_script_path, Prefect will upload the local flow script to S3 during registration
    local_script_path=f"{FLOW_NAME}.py",
)

@task(log_stdout=True)
def hello_world():
    print("hello world")

with Flow(
    FLOW_NAME, storage=STORAGE, run_config=LocalRun(labels=["s3"])
) as flow:
    hello_world()
```
When you start the agent, make sure that your AWS CLI is configured with permissions to use S3. Then you can start the agent using:
```shell
prefect agent local start --label s3
```
Let me know if you have any issues with it.
c
Oh, so to be clear, I've got it up and running totally fine; I'm just trying to understand the behavior.
I'm using the default pickle-based storage: is what's stored in S3 literally just the flow, or does it include the imported tasks that are part of the flow?
a
Sorry for misunderstanding. In that case, what's stored in S3 is the pickled flow object, which does include the imported tasks. But the SQL files that you reference by local path are not included within the flow. You could add a task that downloads those at runtime; that would probably be the easiest option.
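The runtime-download suggestion can be sketched like this. Here the "download" is just a local file read so the example stays self-contained; in a real task you would swap the read for an S3 fetch (e.g. via boto3) inside the task body. The file name and query are hypothetical.

```python
import tempfile
from pathlib import Path

def load_sql(path):
    # In production this would fetch the file from S3 (e.g. with boto3)
    # inside the task at run time, so the query text never needs to be
    # baked into the pickled flow.
    return Path(path).read_text()

# Demo: write a throwaway SQL file, then load it at "run time".
with tempfile.TemporaryDirectory() as d:
    sql_path = Path(d) / "agency_report.sql"
    sql_path.write_text("SELECT * FROM trips;")
    print(load_sql(sql_path))  # → SELECT * FROM trips;
```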
c
Makes sense, thank you!