
Prasanth Kothuri

03/03/2022, 5:15 PM
Hi all, I would like to write a pandas dataframe as CSV to S3 in Prefect. Shouldn't this work?
import os

from prefect.tasks.aws import S3Upload

# upload to s3
write_to_s3 = S3Upload(
    bucket=s3_bucket,
    boto_kwargs=dict(
        endpoint_url=os.getenv("s3_endpoint"),
        aws_access_key_id=os.getenv("s3_access_key"),
        aws_secret_access_key=os.getenv("s3_secret_key")
    )
)

output = write_to_s3(results.to_csv(index=False), key=file_name)
Looking at the code, I see that it uses BytesIO, so I had to encode it as below and it works!
write_to_s3(results.to_csv(index=False).encode(), key=file_name)
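For completeness, here is roughly how the fixed version can be wired into a Prefect 1.x flow. This is only a sketch: build_results is a hypothetical stand-in for whatever task actually produces the results dataframe, the key "results.csv" is a placeholder, and write_to_s3 is the S3Upload task configured above.
import pandas as pd
from prefect import Flow, task

@task
def build_results() -> pd.DataFrame:
    # hypothetical stand-in for the real task that produces `results`
    return pd.DataFrame({"a": [1, 2], "b": [3, 4]})

@task
def df_to_csv_bytes(df: pd.DataFrame) -> bytes:
    # S3Upload expects bytes, so encode the CSV string before uploading
    return df.to_csv(index=False).encode()

with Flow("write-df-to-s3") as flow:
    results = build_results()
    write_to_s3(df_to_csv_bytes(results), key="results.csv")  # placeholder key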

Kevin Kho

03/03/2022, 6:02 PM
Yeah, the task as-is only accepts bytes.

Anna Geller

03/03/2022, 8:15 PM
@Prasanth Kothuri as much as I like our task library, for loading dataframes to S3, awswrangler has much better functionality. You can write your Pandas dataframe to S3 as a CSV, or as a compressed Parquet file, in a single command:
from prefect import task
import awswrangler as wr

@task
def upload_df(df, path):
    # write the dataframe as a CSV object to the given s3:// path
    wr.s3.to_csv(df, path, index=False)
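And for the compressed Parquet variant mentioned above, a minimal sketch would swap in wr.s3.to_parquet (the path would be something like s3://bucket/prefix/file.parquet; awswrangler compresses with snappy by default; the task name here is just illustrative):
from prefect import task
import awswrangler as wr

@task
def upload_df_parquet(df, path):
    # writes a single snappy-compressed parquet file to the given s3:// path
    wr.s3.to_parquet(df, path)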