I noticed that in a mapped insert task I ran, there were quite a few rejected records (on purpose), but all the remaining good ones seemed to be inserted correctly...
and yet, it seems that commit=False is the default
is anyone aware of deeper dive docs on dask and psycopg?
ah ok, so the PostgresExecutor task does not support the cursor's executemany() method.. this would be really helpful in mapped tasks that load data in fixed-size batches..
currently each run() call opens a separate connection/transaction per single query.. it would be possible to make a super giant querystring for the whole chunk, but executemany() makes it much friendlier
would this be something welcome as a PR? I can pretty easily extend the task class in my code, but might be useful to others... and by extension i wonder what the ambitions/roadmap are for postgres tasks
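For anyone following along, here is a minimal sketch of the batching idea described above: one connection, one transaction, and one cursor.executemany() call per chunk. It is not the task library's code. The helper name insert_batch and the records table are hypothetical, and the demo uses an in-memory SQLite database only so it runs without a Postgres server. With psycopg2 the same DB-API calls apply, but the placeholders would be %s instead of ?.

```python
import sqlite3
from typing import Iterable, Sequence


def insert_batch(conn, query: str, rows: Iterable[Sequence]) -> int:
    """Insert one batch of rows in a single transaction (hypothetical helper).

    `conn` is any DB-API 2.0 connection (sqlite3 here, psycopg2 in practice).
    """
    rows = list(rows)
    cursor = conn.cursor()
    try:
        # One call covers the whole batch instead of one execute() per row.
        cursor.executemany(query, rows)
        # Commit explicitly so the batch is persisted as a unit.
        conn.commit()
    except Exception:
        # Any failing row discards the whole batch.
        conn.rollback()
        raise
    finally:
        cursor.close()
    return len(rows)


# Demo with SQLite as a stand-in for Postgres (assumption for runnability).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE records (id INTEGER PRIMARY KEY, name TEXT)")

batch = [(1, "a"), (2, "b"), (3, "c")]
inserted = insert_batch(
    conn, "INSERT INTO records (id, name) VALUES (?, ?)", batch
)
count = conn.execute("SELECT COUNT(*) FROM records").fetchone()[0]
```

Note that in this sketch a single bad row rolls back the entire batch, which differs from the per-record behaviour in the mapped run above, where rejected records did not affect the good ones. If throughput matters on Postgres, psycopg2.extras.execute_batch may be worth a look as well.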
josh
04/03/2020, 8:26 PM
Absolutely welcome as a PR!
The task library is generally community driven so there are no formal roadmap items (yet) for the tasks in it. The goal is to have a library of easy to use tasks for performing basic functions that can also serve as templates/inspiration for others to use when making their own tasks 🙂