#prefect-community
Jason

05/09/2022, 7:49 PM
I have a flow that runs locally but is returning a None result when run in Prefect Cloud. I think it's because I'm not setting the S3Result as the default? But this is confusing since I'm using the S3Storage as the default for flow storage. Does this seem right?
Updated: 9 May 2022 2:31pm
Result Type: Result
Result Location: None
Kevin Kho

05/09/2022, 7:59 PM
Are you printing inside the flow block?
Jason

05/09/2022, 8:02 PM
Yes - the individual task is fairly plain, and seems to return a result with the local flow:
@task(name="Extract Owners from Streamline")
def extract_owners() -> DataFrame:
    """
    Grab Owners from Streamline and serialize as a Pandas DataFrame.
    """

    logger = prefect.context.get("logger")
    streamline = Streamline(token_key=TOKEN_KEY, token_secret=TOKEN_SECRET)
    owners_df: DataFrame = streamline.get_owners()

    if owners_df.empty:
        raise FAIL(
            "Owners DataFrame is empty. Something is wrong with the API pull from Streamline."
        )

    logger.info(f"Pulled {owners_df.shape[0]} owners for {FLOW_NAME}")

    return owners_df
So within the Flow() as flow context, it's fed like this (perhaps incorrectly?):
owners_df = extract_owners()
load_owners_s3(owners_df, database=glue_database)
When it reaches load_owners_s3, it's a NoneType, which raises the appropriate error with aws-data-wrangler: wr.s3.to_parquet.
Oh, sorry Kevin, I misread your reply. I'm not printing; that info about the None result is what Prefect Cloud shows for the specific task.
I must have something misconfigured. Earlier info in #prefect-community suggests that:
by default, when you use e.g. S3 storage, Prefect will use S3 also for results automatically, unless you explicitly turn off checkpointing
Kevin Kho

05/09/2022, 8:37 PM
Wait, that thing you mentioned is more Prefect 2.0 I think
Is it none for a new flow run, or when restarting from failure?
Jason

05/09/2022, 8:54 PM
New flow run: I'm on Prefect 1-series still
That is when trying to explicitly set S3Result on the result for the flow, too.
So the nonetype fails the subsequent task
Kevin Kho

05/09/2022, 8:56 PM
What is the log of the shape?
Jason

05/09/2022, 8:57 PM
Oh, interesting. S3Result forced it to produce .prefect_results after the task finished. I spoke too quickly.
Kevin Kho

05/09/2022, 8:57 PM
This looks right. Are you sure the issue is not with glue_database?
Jason

05/09/2022, 8:57 PM
No, glue_database is good. I think the current issue (separate from the original post) is that the prefectTaskRole doesn't have access to the S3 bucket I was using locally to test serializing the Parquet files, but that's an easy fix to the IAM role.
So I think, at least with my setup, that it is necessary to explicitly pass in S3Result then
Kevin Kho

05/09/2022, 8:59 PM
The Result should not affect whether or not the return from a task is None
It gets persisted but it still gets fed in memory to the next task
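A toy model of the point above, in plain Python rather than Prefect itself: the names ToyResult and run_pipeline are made up for illustration and this is not how Prefect's engine is implemented. It only sketches the idea that persisting a task's return value is a side effect, while the value the next task receives is handed over in memory either way.

```python
import json
import tempfile
from pathlib import Path
from typing import Any, Callable, List, Optional


class ToyResult:
    """Hypothetical stand-in for a result backend: it only writes values to disk."""

    def __init__(self, directory: str) -> None:
        self.directory = Path(directory)

    def write(self, name: str, value: Any) -> Path:
        location = self.directory / f"{name}.json"
        location.write_text(json.dumps(value))
        return location


def run_pipeline(
    tasks: List[Callable[[Any], Any]], result: Optional[ToyResult] = None
) -> Any:
    """Run tasks in order, passing each return value to the next task in memory."""
    value = None
    for fn in tasks:
        value = fn(value)  # in-memory hand-off to the next task
        if result is not None:
            result.write(fn.__name__, value)  # persistence is a side effect only
    return value


def extract_owners(_: Any) -> list:
    # Stand-in data; the real task in the thread returns a DataFrame.
    return [{"owner": "a"}, {"owner": "b"}]


def count_owners(owners: list) -> int:
    return len(owners)


with tempfile.TemporaryDirectory() as tmp:
    without_result = run_pipeline([extract_owners, count_owners])
    with_result = run_pipeline([extract_owners, count_owners], ToyResult(tmp))
    persisted = sorted(p.name for p in Path(tmp).iterdir())

print(without_result, with_result, persisted)
```

In this sketch the downstream value is the same with or without the result backend; the only difference is the files left behind. If a downstream task really does receive None, the cause is more likely in how the upstream task ran or returned than in the result configuration.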