    Jason

    4 months ago
    I have a flow that runs locally but is returning a None result when run in Prefect Cloud. I think it's because I'm not setting the S3Result as the default? But this is confusing since I'm using the S3Storage as the default for flow storage. Does this seem right?
    Updated: 9 May 2022 2:31pm
    Result Type: Result
    Result Location: None
    Kevin Kho

    4 months ago
    Are you printing inside the flow block?
    Jason

    4 months ago
    Yes - the individual task is fairly plain, and seems to return a result with the local flow:
    @task(name="Extract Owners from Streamline")
    def extract_owners() -> DataFrame:
        """
        Grab Owners from Streamline and serialize as a Pandas DataFrame.
        """
    
        logger = prefect.context.get("logger")
        streamline = Streamline(token_key=TOKEN_KEY, token_secret=TOKEN_SECRET)
        owners_df: DataFrame = streamline.get_owners()
    
        if owners_df.empty:
            raise FAIL(
                "Owners DataFrame is empty. Something is wrong with the API pull from Streamline."
            )
    
        logger.info(f"Pulled {owners_df.shape[0]} owners for {FLOW_NAME}")
    
        return owners_df
    So within the "with Flow(...) as flow:" context, it's fed like this (perhaps incorrectly?):
        owners_df = extract_owners()
        load_owners_s3(owners_df, database=glue_database)
    When it reaches load_owners_s3, it's a NoneType, which raises the appropriate error from aws-data-wrangler's wr.s3.to_parquet.
    Oh, sorry Kevin, I misread your reply. I'm not printing; that info is the None result that Prefect Cloud shows for the specific task.
    I must have something misconfigured. Earlier info in #prefect-community suggests that:
    by default, when you use e.g. S3 storage, Prefect will use S3 also for results automatically, unless you explicitly turn off checkpointing
    Kevin Kho

    4 months ago
    Wait, that thing you mentioned is more Prefect 2.0 I think
    Is it None for a new flow run, or when restarting from failure?
    Jason

    4 months ago
    New flow run. I'm still on the Prefect 1 series.
    It happens even when I explicitly set S3Result as the result for the flow.
    So the NoneType fails the subsequent task.
    Kevin Kho

    4 months ago
    What is the log of the shape?
    Jason

    4 months ago
    Oh, interesting. S3Result forced it to produce .prefect_results after the task finished. I spoke too quickly.
    Kevin Kho

    4 months ago
    This looks right. Are you sure the issue is not with glue_database?
    Jason

    4 months ago
    No, glue_database is good. I think the current issue (separate from the original post) is that the prefectTaskRole doesn't have access to the S3 bucket I was using locally to write the Parquet files to, but that's an easy fix to the IAM role.
    So I think, at least with my setup, that it is necessary to explicitly pass in S3Result then
    Kevin Kho

    4 months ago
    The Result should not affect whether or not the return from a task is None
    It gets persisted but it still gets fed in memory to the next task
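
    To illustrate the point about persistence versus in-memory passing, here is a toy sketch in plain Python. It is not Prefect's implementation and uses no Prefect APIs: run_task, load_owners, and the pickle-file "result store" are hypothetical stand-ins, assumed only to show that writing a result somewhere is a side effect and the downstream task still receives the in-memory return value.

```python
import pickle
import tempfile
from pathlib import Path
from typing import Any, Callable


def run_task(fn: Callable[..., Any], *args: Any, result_dir: Path) -> Any:
    """Hypothetical runner: call fn, persist its return value, return it in memory."""
    value = fn(*args)
    # Persisting is a side effect (a stand-in for a result backend write).
    target = result_dir / f"{fn.__name__}.pkl"
    target.write_bytes(pickle.dumps(value))
    # The downstream task receives this in-memory object, not the file.
    return value


def extract_owners() -> list:
    # Stand-in for the real extract task; returns placeholder data.
    return ["owner-a", "owner-b"]


def load_owners(owners: list) -> int:
    # Fails loudly if the upstream task itself returned None.
    if owners is None:
        raise ValueError("upstream task returned None")
    return len(owners)


with tempfile.TemporaryDirectory() as tmp:
    result_dir = Path(tmp)
    owners = run_task(extract_owners, result_dir=result_dir)
    loaded = run_task(load_owners, owners, result_dir=result_dir)
    # Read the persisted copy back to compare it with the in-memory value.
    persisted = pickle.loads((result_dir / "extract_owners.pkl").read_bytes())
```

    In this model, a None arriving downstream can only come from the upstream function returning None, never from where the result is stored.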