Hey, I have a Secret AWS_CREDENTIALS in Prefect Cl...
# prefect-community
m
Hey, I have a Secret AWS_CREDENTIALS in Prefect Cloud (1.0) following this format:
{
  "ACCESS_KEY": "abcdef",
  "SECRET_ACCESS_KEY": "ghijklmn"
}
If I'm using pandas to read a file in S3:
df = pd.read_csv(f"s3://{s3_bucket_name}/{filename}")
Do I need to pass the credentials as a parameter to read_csv, or are they read automatically from the Cloud? Currently I'm getting this error: "botocore.exceptions.ClientError: An error occurred (403) when calling the HeadObject operation: Forbidden". Thanks!
a
Try creating a boto3 session in which you use those credentials
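To make that concrete, here is a rough sketch rather than a definitive recipe. As far as I know the Secret is not picked up automatically: pandas reads S3 through s3fs/botocore, which only look at the usual AWS places (environment variables, ~/.aws/credentials, instance roles), so the values have to be passed explicitly. The helper names below (load_credentials, boto3_session_kwargs, pandas_storage_options, read_csv_from_s3) are made up for illustration, and the sketch assumes the Secret has exactly the two keys shown above and that pandas is 1.2 or newer with s3fs installed.

```python
import json


def load_credentials(secret):
    """Accept the Secret value either as a dict or as a JSON string."""
    return json.loads(secret) if isinstance(secret, str) else dict(secret)


def boto3_session_kwargs(secret):
    """Build keyword arguments for boto3.Session(...)."""
    creds = load_credentials(secret)
    return {
        "aws_access_key_id": creds["ACCESS_KEY"],
        "aws_secret_access_key": creds["SECRET_ACCESS_KEY"],
    }


def pandas_storage_options(secret):
    """Build storage_options for pd.read_csv; pandas forwards them to s3fs."""
    creds = load_credentials(secret)
    return {"key": creds["ACCESS_KEY"], "secret": creds["SECRET_ACCESS_KEY"]}


def read_csv_from_s3(s3_bucket_name, filename, secret):
    """Read a CSV from S3 with explicit credentials."""
    import pandas as pd  # imported lazily; needs pandas and s3fs installed

    return pd.read_csv(
        f"s3://{s3_bucket_name}/{filename}",
        storage_options=pandas_storage_options(secret),
    )


# Inside a Prefect 1.0 flow run, the secret could be fetched like this:
#   from prefect.client import Secret
#   secret = Secret("AWS_CREDENTIALS").get()
#   session = boto3.Session(**boto3_session_kwargs(secret))  # for direct boto3 calls
#   df = read_csv_from_s3(s3_bucket_name, filename, secret)  # for pandas
```

Note that a boto3 session on its own is not seen by pd.read_csv, so for the pandas call the storage_options route is probably the simpler one. If the 403 persists with explicit credentials, the IAM permissions on the bucket or object are worth checking next.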
m
I will try it. Thanks!
👍 1
v
I highly recommend using pyarrow datasets when working with parquet datasets/files in an AWS S3 bucket. https://arrow.apache.org/docs/python/dataset.html Optionally, if you want to run queries on your parquet datasets/files, you can use duckdb. https://duckdb.org/2021/12/03/duck-arrow.html Here is a short working example:
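import duckdb
import pyarrow.dataset as ds
from pyarrow.fs import S3FileSystem
# or, instead of pyarrow.fs:
# import s3fs

con = duckdb.connect()

# region must be a valid AWS region name; Frankfurt is "eu-central-1"
fs = S3FileSystem(access_key="my_access_key", secret_key="my_secret_key", region="eu-central-1")
# or, with s3fs (use only one of the two, since both assign fs):
# fs = s3fs.S3FileSystem(anon=False, key="my_access_key", secret="my_secret_key")

# the path is "<bucket>/<prefix>"; the first directory level below it is read as "exchange"
history = ds.dataset("findata/forex_pros/history/D", partitioning=["exchange"], filesystem=fs)

# duckdb resolves the Python variable `history` as a table name
aapl = con.execute("SELECT * FROM history WHERE symbol_id=6408").df()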
m
Thanks @Volker L !!
🙂 1
v
You are welcome. I hope this short introduction is helpful. Contact me if you need more input.