Prefect Community #ask-community

Hey, I have a Secret AWS_CREDENTIALS in Prefect Cl...

Mateo Merlo

05/30/2022, 2:24 PM

Hey, I have a Secret AWS_CREDENTIALS in Prefect Cloud (1.0) following this format:

{
  "ACCESS_KEY": "abcdef",
  "SECRET_ACCESS_KEY": "ghijklmn"
}

If I'm using pandas to read a file in S3:

df = pd.read_csv(f"s3://{s3_bucket_name}/{filename}")

Do I need to pass the credentials as a parameter to read_csv, or are they read automatically from Prefect Cloud? Currently I'm getting this error: "botocore.exceptions.ClientError: An error occurred (403) when calling the HeadObject operation: Forbidden" Thanks!

Anna Geller

05/30/2022, 2:34 PM

Try creating a boto3 session in which you use those credentials
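
A minimal sketch of that suggestion, assuming the secret has already been fetched inside the flow (in Prefect 1.0, for example with `Secret("AWS_CREDENTIALS").get()` from `prefect.client`) and is a dict in the format shown in the question. pandas does not look up Prefect Secrets on its own, so the credentials are passed explicitly here. The helper names `session_kwargs` and `read_csv_with_session` are hypothetical and only for illustration.

```python
import io


def session_kwargs(secret: dict) -> dict:
    """Map the Prefect secret's keys onto boto3.Session argument names."""
    return {
        "aws_access_key_id": secret["ACCESS_KEY"],
        "aws_secret_access_key": secret["SECRET_ACCESS_KEY"],
    }


def read_csv_with_session(secret: dict, bucket: str, key: str):
    """Download one S3 object through an explicit boto3 session and parse it as CSV."""
    # Imported lazily so session_kwargs() can be used without these packages installed.
    import boto3
    import pandas as pd

    session = boto3.Session(**session_kwargs(secret))
    obj = session.client("s3").get_object(Bucket=bucket, Key=key)
    return pd.read_csv(io.BytesIO(obj["Body"].read()))
```

If the 403 persists with explicit credentials, it may be worth checking that the IAM user behind the key is allowed to read that bucket and object.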

Mateo Merlo

05/30/2022, 2:54 PM

I will try it. Thanks!

👍 1

Volker L

06/06/2022, 12:03 PM

I highly recommend pyarrow datasets when working with Parquet datasets/files in an AWS S3 bucket. https://arrow.apache.org/docs/python/dataset.html Optionally, if you want to run queries on your Parquet datasets/files, you can use DuckDB. https://duckdb.org/2021/12/03/duck-arrow.html Here is a short working example:

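import duckdb
import pyarrow.dataset as ds
from pyarrow.fs import S3FileSystem
# Alternative to pyarrow.fs: import s3fs

con = duckdb.connect()

# Option 1: pyarrow's native S3 filesystem ("eu-central-1" is the AWS Frankfurt region)
fs = S3FileSystem(access_key="my_access_key", secret_key="my_secret_key", region="eu-central-1")
# Option 2: s3fs (uncomment to use it instead of option 1)
# fs = s3fs.S3FileSystem(anon=False, key="my_access_key", secret="my_secret_key")

# The path is "<bucket>/<prefix>"; the first directory level below it is read as the "exchange" field
history = ds.dataset("findata/forex_pros/history/D", partitioning=["exchange"], filesystem=fs)

# DuckDB resolves the Python variable `history` as a table in the query
aapl = con.execute("SELECT * FROM history WHERE symbol_id=6408").df()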
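
Relating this back to the original `pd.read_csv` call: pandas (1.2 and later) forwards a `storage_options` dict to s3fs, so the same `key`/`secret` arguments used for `s3fs.S3FileSystem` above can be passed to it directly. A small sketch, assuming the secret dict has the format from the question; `to_storage_options` and `read_csv_from_s3` are hypothetical helper names.

```python
def to_storage_options(secret: dict) -> dict:
    """Translate the Prefect secret's keys into s3fs keyword arguments."""
    return {"key": secret["ACCESS_KEY"], "secret": secret["SECRET_ACCESS_KEY"]}


def read_csv_from_s3(secret: dict, s3_bucket_name: str, filename: str):
    """Read a CSV from S3 with explicit credentials instead of the default lookup."""
    # Lazy import; assumes pandas >= 1.2 with s3fs installed.
    import pandas as pd

    return pd.read_csv(
        f"s3://{s3_bucket_name}/{filename}",
        storage_options=to_storage_options(secret),
    )
```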

Mateo Merlo

06/06/2022, 12:24 PM

Thanks @Volker L !!

🙂 1

Volker L

06/06/2022, 12:30 PM

You are welcome. I hope this short introduction is helpful. Contact me if you need more input.

✅ 1
