Hedgar

10/01/2022, 8:17 AM
@Anna Geller I think I’m somewhat confused. I’m using the awswrangler package to wrangle data and move it to an S3 bucket. Before signing up for Prefect Cloud, I could use wr by simply indicating an S3 bucket path, and my data would go to that path. Recently I created an S3 block via the Prefect CLI. Now, when trying to build a deployment with the -sb flag, I got an error saying that I indicated a wrong S3 path. Clarifying questions: 1. Must I use the Prefect Cloud S3 storage block as the path under the wr package? 2. Can I have a separate S3 bucket for storing the output of my data wrangling, different from the S3 bucket where my flow code should reside? I would appreciate it if your answer included some basic scenarios. I have gone through all your repos and none appears to address this issue; perhaps one does, maybe implicitly.

Anna Geller

10/01/2022, 2:14 PM
#1 No, awswrangler is independent of the S3 block. You can retrieve the boto3 session from an AwsCredentials block and pass it to your awswrangler operations. #2 Sure, you can. Here is an example:
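import awswrangler as wr
from prefect_aws.credentials import AwsCredentials

# Load the saved credentials block and build a boto3 session from it
creds = AwsCredentials.load("prod")
session = creds.get_boto3_session()

# Replace ... with your own df, path, and other arguments
wr.s3.to_parquet(..., boto3_session=session)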
You need to install both packages first:
pip install awswrangler prefect-aws
register the block:
prefect block register -m prefect_aws
and create the block:
from prefect_aws.credentials import AwsCredentials


aws = AwsCredentials(
    aws_access_key_id="xxx",
    aws_secret_access_key="xxx",
)
aws.save("prod", overwrite=True)
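
To illustrate point #2, here is a minimal sketch of keeping the wrangling output in a different bucket from the flow code. The bucket names, the constants, and the write_output helper are hypothetical and not part of Prefect or awswrangler; the only library call assumed is wr.s3.to_parquet with its df, path, and boto3_session keyword arguments. The writer parameter exists only so the helper can be tried out without touching S3.

```python
from typing import Any, Callable, Optional

# Hypothetical locations, for illustration only.
FLOW_CODE_PATH = "s3://my-flow-code-bucket/flows"      # what the S3 storage block (-sb) would point at
OUTPUT_PATH = "s3://my-data-output-bucket/wrangled/"   # where awswrangler writes the results


def write_output(
    df: Any,
    session: Any,
    path: str = OUTPUT_PATH,
    writer: Optional[Callable[..., Any]] = None,
) -> Any:
    """Write df to path using the given boto3 session.

    The output path is chosen independently of the flow code location.
    """
    if writer is None:
        # Imported lazily so the helper can be exercised without awswrangler installed.
        import awswrangler as wr

        writer = wr.s3.to_parquet
    return writer(df=df, path=path, boto3_session=session)
```

Inside a flow, this could be called as write_output(df, AwsCredentials.load("prod").get_boto3_session()), while the deployment's storage block keeps pointing at the flow code bucket.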