
Russell Brooks

03/16/2022, 10:56 PM
Also, on a separate topic: the need to know the slug in advance, and to know what quirky suffix it has, e.g. "-1" or "-copy", is a problem when using `create_flow_run` and getting task results. I think you have an old ticket for this, but I lost a couple of hours on this yesterday. Also, when to use `StartFlowRun` versus `create_flow_run` is confusing to me. I think this is in the wrong forum, as it is not UI-related. I am new to Slack. Can someone help move it?

Kevin Kho

03/16/2022, 10:59 PM
The slugify algorithm is pretty aggressive because you can re-use a task multiple times and it makes copies in the Flow block. I have personally found the best way to get the slug is to print the content of `flow.serialize()` and look for it under "tasks". Otherwise you need to use the GraphQL API. `StartFlowRun` was the original task, but its return type changed depending on whether you set `wait=True` or `wait=False`: if wait was False, it would return a flow run ID, and if wait was True, it would return a state. This led to some pain when using it, so the `create_flow_run` task was created in 0.15.0 to give a uniform return type.
`wait_for_flow_run` was decoupled from `create_flow_run`.
Both are usable, but `create_flow_run` is probably preferred.
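A quick way to see the slugs without GraphQL is to inspect the serialized flow. This is a minimal sketch: the `serialized` dict below stands in for the output of `flow.serialize()` in Prefect 1.x, and the task names and slug suffixes are made up for illustration; with a real flow you would call `flow.serialize()` directly.

```python
# Sketch: list task slugs from a serialized Prefect 1.x flow.
# "serialized" mimics the assumed shape of flow.serialize();
# the task names and slugs here are hypothetical.
serialized = {
    "tasks": [
        {"name": "extract", "slug": "extract-1"},
        {"name": "transform", "slug": "transform-copy"},
    ]
}

def task_slugs(serialized_flow):
    """Map each task name to its (sometimes surprising) slug."""
    return {t["name"]: t["slug"] for t in serialized_flow.get("tasks", [])}

for name, slug in task_slugs(serialized).items():
    print(f"{name}: {slug}")
```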

Russell Brooks

03/16/2022, 11:52 PM
It is issue #4624. I always have to check what to append, and it is not obvious, e.g. sometimes I have to use "-copy" too.
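For reference, the pattern being discussed looks roughly like this in Prefect 1.x. This is a sketch, not a definitive implementation: the flow name, project name, and task slug (including the "-copy" suffix) are hypothetical, and it assumes a running Prefect backend.

```python
from prefect import Flow
from prefect.tasks.prefect import (
    create_flow_run,
    wait_for_flow_run,
    get_task_run_result,
)

with Flow("parent-flow") as parent:
    # Kick off the child flow; create_flow_run always returns the flow run ID.
    child_run_id = create_flow_run(flow_name="child-flow", project_name="demo")
    # Block until the child run finishes.
    wait_for_flow_run(child_run_id, raise_final_state=True)
    # Fetch a task's result by its slug -- note the non-obvious "-copy" suffix.
    result = get_task_run_result(child_run_id, task_slug="my_task-copy")
```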

Kevin Kho

03/16/2022, 11:54 PM
Yeah, it's really not. The whole subflow experience is clunky, which is why it's redefined in Prefect 2.0 as more first-class, and Prefect will handle the passing of data for you.

Russell Brooks

03/17/2022, 12:02 AM
Also, when I use `get_task_run_result` with a `LocalResult` it works OK. But with an `S3Result` it errors: `ClientError: An error occurred (403) when calling the HeadObject operation`. So I'm not sure how to get task results if they are stored in S3.

Kevin Kho

03/17/2022, 12:04 AM
That tends to be related to permissions. The execution environment of the Flow needs to have authentication to pull from that specific bucket.

Russell Brooks

03/17/2022, 12:18 AM
This sounds like the cause. But how do I give `get_task_run_result` my auth? It doesn't look like it takes any keyword arguments to tell it the bucket, location, and AWS credentials.

Kevin Kho

03/17/2022, 12:24 AM
It uses the result information defined in the subflow. It's like how S3 storage doesn't take credentials either; it uses your local `~/.aws` config to create a `boto3` client.
But it tries to use Prefect secrets first to make the connection.
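For context, setting that secret locally looks something like this in `~/.prefect/config.toml` in Prefect 1.x. The key names follow the Prefect 1.x `AWS_CREDENTIALS` secret convention; the values are placeholders.

```toml
[context.secrets]
# Prefect 1.x looks for an AWS_CREDENTIALS secret before falling back
# to the local ~/.aws configuration.
AWS_CREDENTIALS = { ACCESS_KEY = "your-access-key", SECRET_ACCESS_KEY = "your-secret-key" }
```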

Russell Brooks

03/17/2022, 12:29 AM
I have a dict for `AWS_CREDENTIALS` in my `config.toml`. Even if it finds and uses that, how would `get_task_run_result` know which bucket, endpoint, and location to use?

Kevin Kho

03/17/2022, 12:29 AM
Don't you define it in the subflow's `Result` object for that task?

Russell Brooks

03/17/2022, 12:34 AM
The Flow with the Task I want to get the results from is defined with an `S3Result` that has `bucket`, `location`, and `boto3_kwargs`. That works, and a file is created on MinIO. But when the downstream Flow, which is in a different module, creates a run of the upstream Flow, I doubt it is aware of all the parameters used to set up the `S3Result`.
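The setup described here would look roughly like the sketch below (Prefect 1.x API). The bucket name, result key template, and MinIO endpoint are all placeholders, not values from this thread.

```python
from prefect.engine.results import S3Result

# S3Result pointed at a MinIO server; all values below are placeholders.
result = S3Result(
    bucket="my-bucket",
    location="results/{task_name}.prefect",  # templated result key
    boto3_kwargs={"endpoint_url": "http://minio.local:9000"},
)
```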

Kevin Kho

03/17/2022, 12:36 AM
It pulls the task info from the GraphQL API.
The code path is long, but it starts here.

Russell Brooks

03/17/2022, 12:49 AM
Hmm, I'm not even sure this is the cause. It seems a bridge too far, with too many complications, to use `S3Result` for this purpose. I'll stick with `LocalResult` for now and see if anyone can help clarify this over the next period of time.