# ask-community
g
@Anna Geller @Kevin Kho (or anyone else) - Just curious, has there been anyone making a Pandas task library? I seem to be making my own as I go along, but just making sure I'm not doing it the hard way
a
I haven't heard of anyone, especially because it's so easy to call any Pandas operation in a function decorated with "task". Also, we have PandasSerializer for storing the results of Pandas operations. What are you looking for in such a task? We would definitely be open to that, especially for Orion.
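To make that concrete, here is a rough sketch of what "Pandas in a task" can look like. The function names and the `customer`/`amount` columns are made up for illustration. The Pandas logic lives in plain functions, and `task(...)` from Prefect wraps them afterwards, so the transformations can be tested without a flow.

```python
import pandas as pd


def drop_incomplete_orders(orders: pd.DataFrame) -> pd.DataFrame:
    """Keep only rows that have both a customer and an amount."""
    return orders.dropna(subset=["customer", "amount"]).reset_index(drop=True)


def total_by_customer(orders: pd.DataFrame) -> pd.DataFrame:
    """Sum order amounts per customer (hypothetical column names)."""
    return orders.groupby("customer", as_index=False)["amount"].sum()


def as_prefect_tasks():
    """Wrap the plain functions as Prefect tasks.

    Prefect is imported lazily so the Pandas logic above stays usable
    and testable even where Prefect is not installed.
    """
    from prefect import task

    return task(drop_incomplete_orders), task(total_by_customer)
```

Keeping each function to one small transformation is what keeps the tasks in the flow small and easy to debug; how the wrapped tasks are wired into a flow depends on the Prefect version in use.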
1
k
There is the serializer for results but I think Pandas has too much functionality to encompass in a few tasks?
upvote 1
g
A lot of what we're doing is basic ETL-to-data-warehouse stuff, and I'm wrapping a lot of the Pandas operations with the decorator just to keep the tasks in the flow "small" and "discrete" (and hopefully to make debugging simpler). I like the idea of standardizing the use of Pandas for ETL since it translates easily to our longer ML pipelines.
k
Have you seen hamilton?
g
Hmmm, looking at hamilton now. Trying to understand it in this context. Definitely looks useful for some of the pipelines we have.
k
I think it builds the DAG for you from your Pandas functions, which makes them more reusable (by enforcing naming conventions)
I dunno though, I just saw it and haven't tried it myself, but it might be worth checking out
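A toy illustration of the naming-convention idea, in plain Python rather than hamilton's own API: each function's name is the column it produces, and its parameter names are the columns it depends on. The `resolve` helper and the example columns are hypothetical, just enough to show how a DAG can fall out of the function signatures.

```python
import inspect

import pandas as pd


# Hypothetical column functions: the function name is the output,
# the parameter names are the inputs it depends on.
def revenue(price: pd.Series, quantity: pd.Series) -> pd.Series:
    return price * quantity


def revenue_share(revenue: pd.Series) -> pd.Series:
    return revenue / revenue.sum()


def resolve(name, functions, inputs, cache=None):
    """Compute `name` by recursively resolving its parameters by name."""
    cache = {} if cache is None else cache
    if name in inputs:
        return inputs[name]
    if name not in cache:
        fn = functions[name]
        args = {
            param: resolve(param, functions, inputs, cache)
            for param in inspect.signature(fn).parameters
        }
        cache[name] = fn(**args)
    return cache[name]


# Registry keyed by function name, which doubles as the output column name.
FUNCTIONS = {fn.__name__: fn for fn in (revenue, revenue_share)}
```

Asking for `resolve("revenue_share", FUNCTIONS, inputs)` with `price` and `quantity` in `inputs` computes `revenue` first, because the parameter name says so. No explicit wiring is needed.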
a
Also, to throw in a recommendation from my side: if you are doing a lot of loading data to and from an AWS data lake or Redshift, check out awswrangler, which is basically Pandas on AWS. It's extremely useful for writing Pandas-based ETL workflows and makes it kind of fun (rather than daunting) to work with the AWS Glue Data Catalog
g
Dang, that does look nice but we're using Google BigQuery.
m
I don't know the exact use case, but why not use dbt to transform the data in BigQuery?
upvote 1
a
Agree with Matthias. For BigQuery I was also just using the native GCS and BigQuery clients and then transforming the data with dbt. Sample repo: https://github.com/anna-geller/prefect-monte-carlo/tree/workshop
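For the load step, a minimal sketch, assuming the `google-cloud-bigquery` client and its `load_table_from_dataframe` method; the `load_frame` helper and any table ID passed to it are hypothetical and not taken from the sample repo. The client is passed in as an argument so the function can be tried with a stand-in object.

```python
import pandas as pd


def load_frame(client, frame: pd.DataFrame, table_id: str) -> None:
    """Load a DataFrame into a BigQuery table and wait for the job.

    `client` is expected to be a google.cloud.bigquery.Client; it is
    passed in so the function can be exercised with a stand-in in tests.
    """
    job = client.load_table_from_dataframe(frame, table_id)
    job.result()  # block until the load job finishes (raises on failure)
```

With the raw data landed this way, the transformations themselves can live in dbt models instead of Pandas tasks.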
👍 1