
Russell Brooks

03/16/2022, 10:56 PM
Also, on a separate topic: the need to know the slug in advance, and to know what quirky suffix it has, e.g. "-1" or "-copy", is a problem when using `create_flow_run` and getting task results. I think you have an old ticket for this, but I lost a couple of hours on this yesterday. Also, when to use `StartFlowRun` versus `create_flow_run` is confusing to me. I think this is in the wrong forum, as it is not UI-related. I am new to Slack. Can someone help move it?

Kevin Kho

03/16/2022, 10:59 PM
The slugify algorithm is pretty aggressive because you can re-use a task multiple times and it makes copies in the Flow block. I have personally found the best way to get the slug is to print the content of `flow.serialize()` and look for it under "tasks". Otherwise you need to use the GraphQL API. `StartFlowRun` was the original task, but its return type changed depending on whether you set `wait=True` or `wait=False`: if wait was False, it would return a flow run ID, and if wait was True, it would return a state. This led to some pain when using it, so the `create_flow_run` task was created in 0.15.0 to give a uniform return type.
`wait_for_flow_run` was decoupled from `create_flow_run`.
Both are usable, but `create_flow_run` is probably preferred.
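A quick way to see the slugs without GraphQL is to inspect the serialized flow. This is a minimal sketch: the `serialized` dict below stands in for the output of `flow.serialize()` in Prefect 1.x, and the task names and slug suffixes are made up for illustration; with a real flow you would call `flow.serialize()` directly.

```python
# Sketch: list task slugs from a serialized Prefect 1.x flow.
# "serialized" mimics the assumed shape of flow.serialize();
# the task names and slugs here are hypothetical.
serialized = {
    "tasks": [
        {"name": "extract", "slug": "extract-1"},
        {"name": "transform", "slug": "transform-copy"},
    ]
}

def task_slugs(serialized_flow):
    """Map each task name to its (sometimes surprising) slug."""
    return {t["name"]: t["slug"] for t in serialized_flow.get("tasks", [])}

for name, slug in task_slugs(serialized).items():
    print(f"{name}: {slug}")
```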

Russell Brooks

03/16/2022, 11:52 PM
It is issue #4624. I always have to check what to append, and it is not obvious, e.g. sometimes I have to use "-copy" too.
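For reference, the pattern being discussed looks roughly like this in Prefect 1.x. This is a sketch, not a definitive implementation: the flow name, project name, and task slug (including the "-copy" suffix) are hypothetical, and it assumes a running Prefect backend.

```python
from prefect import Flow
from prefect.tasks.prefect import (
    create_flow_run,
    wait_for_flow_run,
    get_task_run_result,
)

with Flow("parent-flow") as parent:
    # Kick off the child flow; create_flow_run always returns the flow run ID.
    child_run_id = create_flow_run(flow_name="child-flow", project_name="demo")
    # Block until the child run finishes.
    wait_for_flow_run(child_run_id, raise_final_state=True)
    # Fetch a task's result by its slug -- note the non-obvious "-copy" suffix.
    result = get_task_run_result(child_run_id, task_slug="my_task-copy")
```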

Kevin Kho

03/16/2022, 11:54 PM
Yeah, it's really not. The whole subflow experience is clunky, which is why it's redefined in Prefect 2.0 as more first-class, and Prefect will handle the passing of data for you.

Russell Brooks

03/17/2022, 12:02 AM
Also, when I use `get_task_run_result` with a `LocalResult` it works OK. But with an `S3Result` it errors: `ClientError: An error occurred (403) when calling the HeadObject operation`. So I'm not sure how to get task results if they are stored in S3.

Kevin Kho

03/17/2022, 12:04 AM
That tends to be related to permissions. The execution environment of the Flow needs to have authentication to pull from that specific bucket.

Russell Brooks

03/17/2022, 12:18 AM
This sounds like the cause. But how do I give `get_task_run_result` my auth? It doesn't look like it takes any keyword arguments to tell it the bucket, location, and AWS credentials.

Kevin Kho

03/17/2022, 12:24 AM
It uses the result information defined in the subflow. It's like how S3 storage doesn't take credentials either; it uses your local `~/.aws` config to create a `boto3` client.
But it tries to use Prefect secrets first to make the connection.
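For context, setting that secret locally looks something like this in `~/.prefect/config.toml` in Prefect 1.x. The key names follow the Prefect 1.x `AWS_CREDENTIALS` secret convention; the values are placeholders.

```toml
[context.secrets]
# Prefect 1.x looks for an AWS_CREDENTIALS secret before falling back
# to the local ~/.aws configuration.
AWS_CREDENTIALS = { ACCESS_KEY = "your-access-key", SECRET_ACCESS_KEY = "your-secret-key" }
```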

Russell Brooks

03/17/2022, 12:29 AM
I have a dict for `AWS_CREDENTIALS` in my `config.toml`. Even if it finds and uses that, how would `get_task_run_result` know which bucket, endpoint, and location to use?

Kevin Kho

03/17/2022, 12:29 AM
Don't you define it in the subflow's `Result` object for that task?

Russell Brooks

03/17/2022, 12:34 AM
The Flow with the Task I want to get the results from is defined with an `S3Result` that has `bucket`, `location`, and `boto3_kwargs`. That works, and a file is created on MinIO. But when the downstream Flow, which is in a different module, creates a run of the upstream Flow, I doubt it is aware of all the parameters used to set up the `S3Result`.
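The setup described here would look roughly like the sketch below (Prefect 1.x API). The bucket name, result key template, and MinIO endpoint are all placeholders, not values from this thread.

```python
from prefect.engine.results import S3Result

# S3Result pointed at a MinIO server; all values below are placeholders.
result = S3Result(
    bucket="my-bucket",
    location="results/{task_name}.prefect",  # templated result key
    boto3_kwargs={"endpoint_url": "http://minio.local:9000"},
)
```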

Kevin Kho

03/17/2022, 12:36 AM
It pulls the task info from the GraphQL API.
The code path is long, but it starts here.

Russell Brooks

03/17/2022, 12:49 AM
Hmm, I'm not even sure this is the cause. It seems a bridge too far, with too many complications, to use `S3Result` for this purpose. I'll stick with `LocalResult` for now and see if anyone can help clarify this over the next period of time.