Hi, I'm trying to list files from an S3 bucket but don't know how to get the result of the task. Can anyone help here? I expected to get an array of file names back, not a task object.
Dennis Hinnenkamp
07/07/2022, 9:11 AM
OK, I got it. I just passed the return value to another task, and now I can access the values there.
Anna Geller
07/07/2022, 11:52 AM
nice work, thanks for the update
Dennis Hinnenkamp
07/08/2022, 6:48 AM
Unfortunately, it didn't all work out as I had imagined. I get a list of file names back from one task, which I pass as parameters into a second task. In the second task, I then call a function from the AWS SDK and get this message:
TypeError: cannot pickle '_thread.lock' object
I don't know how to model the flow differently. The plan was to go through the list and delete or copy each file.
I think this function call causes the error:
Sorry for the late reply. The error you were getting is related to the s3_client object: you shouldn't pass the client between tasks because it cannot be serialized with cloudpickle. If you really need to do that, you would at least need to turn off checkpointing:
@task(checkpoint=False)
If you still struggle with this, can you share your code? I've used boto3 quite a lot, so perhaps I can help here.
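To illustrate the advice above, here is a minimal, standard-library-only sketch of why a client object fails to serialize while a plain list of file names does not. Everything in it is an assumption made for illustration: FakeS3Client is a hypothetical stand-in for the boto3 client (it simply holds a threading.Lock, which is what the error message points at), and can_pickle and delete_files are hypothetical helper names. It uses pickle rather than cloudpickle to stay dependency-free.

```python
import pickle
import threading


class FakeS3Client:
    """Hypothetical stand-in for an S3 client; NOT the boto3 API.

    It holds a threading.Lock, which is the kind of object named in
    the "cannot pickle '_thread.lock' object" error.
    """

    def __init__(self):
        self._lock = threading.Lock()
        self.deleted = []

    def delete_object(self, Bucket, Key):
        # Record the call instead of talking to AWS.
        with self._lock:
            self.deleted.append((Bucket, Key))


def can_pickle(obj):
    """Return True if obj survives pickle.dumps, False otherwise."""
    try:
        pickle.dumps(obj)
        return True
    except (TypeError, pickle.PicklingError):
        return False


def delete_files(bucket, keys, client_factory=FakeS3Client):
    """Task-style function: receives only plain data (bucket name, keys).

    The client is created inside the function, so it never has to be
    serialized or passed between tasks.
    """
    client = client_factory()
    for key in keys:
        client.delete_object(Bucket=bucket, Key=key)
    # Return plain data only, so the result itself stays serializable.
    return list(keys)


if __name__ == "__main__":
    print(can_pickle(["a.csv", "b.csv"]))  # plain file names
    print(can_pickle(FakeS3Client()))      # object holding a lock
    print(delete_files("my-bucket", ["a.csv", "b.csv"]))
```

In a real flow, one would presumably decorate delete_files with @task and build the client inside it with boto3.client("s3"), passing only the bucket name and the list of keys between tasks.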