I have a flow that runs locally but is returning a None result | Prefect Community #ask-community

Jason

05/09/2022, 7:49 PM

I have a flow that runs locally but is returning a None result when run in Prefect Cloud. I think it's because I'm not setting the S3Result as the default? But this is confusing since I'm using the S3Storage as the default for flow storage. Does this seem right?

Jason

05/09/2022, 7:50 PM

Updated: 9 May 2022 2:31pm
Result Type: Result
Result Location: None

Kevin Kho

05/09/2022, 7:59 PM

Are you printing inside the flow block?

Jason

05/09/2022, 8:02 PM

Yes - the individual task is fairly plain, and seems to return a result with the local flow:

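import prefect
from pandas import DataFrame
from prefect import task
from prefect.engine.signals import FAIL

# Streamline, TOKEN_KEY, TOKEN_SECRET, and FLOW_NAME are defined elsewhere in the project.
@task(name="Extract Owners from Streamline")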
def extract_owners() -> DataFrame:
    """
    Grab Owners from Streamline and serialize as a Pandas DataFrame.
    """

    logger = prefect.context.get("logger")
    streamline = Streamline(token_key=TOKEN_KEY, token_secret=TOKEN_SECRET)
    owners_df: DataFrame = streamline.get_owners()

    if owners_df.empty:
        raise FAIL(
            "Owners DataFrame is empty. Something is wrong with the API pull from Streamline."
        )

    logger.info(f"Pulled {owners_df.shape[0]} owners for {FLOW_NAME}")

    return owners_df

Jason

05/09/2022, 8:03 PM

So within the Flow() as flow context, it's fed like this (perhaps incorrectly?):

owners_df = extract_owners()
load_owners_s3(owners_df, database=glue_database)

Jason

05/09/2022, 8:03 PM

When it reaches load_owners_s3, it's a NoneType, which raises the appropriate error with aws-data-wrangler: wr.s3.to_parquet.

Jason

05/09/2022, 8:07 PM

Oh, sorry Kevin, I misread your reply. I'm not printing; that info on the None result is what Prefect Cloud shows for the specific task.

Jason

05/09/2022, 8:29 PM

I must have something misconfigured. Earlier info in #CL09KU1K7 suggests that:

by default, when you use e.g. S3 storage, Prefect will use S3 also for results automatically, unless you explicitly turn off checkpointing

Kevin Kho

05/09/2022, 8:37 PM

Wait, that thing you mentioned is more Prefect 2.0 I think

Kevin Kho

05/09/2022, 8:37 PM

Is it none for a new flow run, or when restarting from failure?

Jason

05/09/2022, 8:54 PM

New flow run: I'm on Prefect 1-series still

Jason

05/09/2022, 8:54 PM

That is when trying to explicitly set S3Result on the result for the flow, too.

Jason

05/09/2022, 8:55 PM

So the NoneType fails the subsequent task

Kevin Kho

05/09/2022, 8:56 PM

What is the log of the shape?

Jason

05/09/2022, 8:57 PM

Oh, interesting. S3Result forced it to produce .prefect_results after the task finished. I spoke too quickly.

Kevin Kho

05/09/2022, 8:57 PM

This looks right. Are you sure the issue is not with glue_database?

Jason

05/09/2022, 8:57 PM

No, glue_database is good. I think the current issue (separate from the original post) is now that the prefectTaskRole doesn't have access to the S3 bucket that I was using locally to serialize the Parquet files to, but that's an easy fix to the IAM role.

Jason

05/09/2022, 8:58 PM

So I think, at least with my setup, that it is necessary to explicitly pass in S3Result then

Kevin Kho

05/09/2022, 8:59 PM

The Result should not affect whether or not the return from a task is None

Kevin Kho

05/09/2022, 8:59 PM

It gets persisted but it still gets fed in memory to the next task
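
A toy illustration of that point, not Prefect's actual implementation: the names ToyResult, run_task, and load_owners below are hypothetical stand-ins, and a temporary directory takes the place of S3. The sketch only shows the general idea that a result backend writes a copy of the return value on the side, while the downstream task still receives the in-memory value either way.

```python
import pickle
import tempfile
from pathlib import Path


class ToyResult:
    """Hypothetical stand-in for a result backend: pickles values into a directory."""

    def __init__(self, directory):
        self.directory = Path(directory)

    def write(self, name, value):
        # Persist a copy of the value and report where it was written.
        location = self.directory / f"{name}.pickle"
        location.write_bytes(pickle.dumps(value))
        return location


def run_task(fn, *args, result=None):
    """Run fn, optionally checkpoint its return value, and hand back the in-memory value."""
    value = fn(*args)
    location = result.write(fn.__name__, value) if result is not None else None
    return value, location


def extract_owners():
    # Placeholder data; the real task pulls a DataFrame from an API.
    return [{"owner_id": 1}, {"owner_id": 2}]


def load_owners(owners):
    if owners is None:
        raise ValueError("upstream returned None")
    return len(owners)


with tempfile.TemporaryDirectory() as tmp:
    # With a result backend: the value is persisted AND passed along in memory.
    owners, location = run_task(extract_owners, result=ToyResult(tmp))
    loaded_with_result, _ = run_task(load_owners, owners)
    persisted = pickle.loads(location.read_bytes())

# Without a result backend: nothing is persisted, but the value still flows downstream.
owners_no_result, no_location = run_task(extract_owners)
loaded_without_result, _ = run_task(load_owners, owners_no_result)
```

In this model the downstream function never reads from the store, so a missing or misconfigured result location would not by itself turn its input into None.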
