Royzac

    Royzac

    5 months ago
    Does Pandas have a bulk update(not insert) functionality with sqlalchemy?
    Kevin Kho

    Kevin Kho

    5 months ago
    I don’t know a lot on this, but update sounds hard (and slow). We’ll see what others have to say. I’ve only dealt with inserts
    Royzac

    Royzac

    5 months ago
    Why do you think that? Here's an example for sqlalchemy but it has ridiculous hang(that never seems to end) session = Session() session.bulk_update_mappings(ConversionRate, merged_table_dict) session.commit() session.close()
    Kevin Kho

    Kevin Kho

    5 months ago
    I was personally in a similar situation and we tried a bulk upsert (pure SQL) and it was taking forever. I forget exact numbers but it felt like the insert was 30 minutes and the upsert was 20 hours. So similar to your ridiculous hang.
    But I dont know much here so dont take my word for it
    Kyle McChesney

    Kyle McChesney

    5 months ago
    Not sure about pandas, but I’ve generally had good luck using CSV import for bulk performance. You can CSV load into a temp table and then do UPDATE FROM on the temp table to the new table. Something like this: https://www.virtual-dba.com/blog/postgresql-updating-data-using-csv-file-with-dbeaver/ (ignore all the GUI stuff but the queries are the same)
    Kevin Kho

    Kevin Kho

    5 months ago
    ^ This sounds like the solution we ended up with at my last job where the temp table insertion was a bulk insert and this was way more performant
    Royzac

    Royzac

    5 months ago
    @Kyle McChesney I've used that method before, and would be an option, but it's not very seamless. I use pandas rd_sql, then manipulation, create temp table(to_sql) with it, then update to regular table with the data from the temp table.
    Kyle McChesney

    Kyle McChesney

    5 months ago
    yeah it is quite a painful process to walk through
    Royzac

    Royzac

    5 months ago
    Curious, how computationally expensive is it to merge values and pandas.to_sql('replace') vs bulk update vs iterated update? @Kyle McChesney @Kevin Kho