Cormac
11/30/2023, 12:11 PM
snowflake.connector.errors.ProgrammingError: 001795 (42601): SQL compilation error: error line 1 at position 115
maximum number of expressions in a list exceeded, expected at most 16,384, got 251,666
This is a common Snowflake issue:
• https://community.snowflake.com/s/question/0D50Z00008759PiSAI/error-maximum-number-of-ex[…]sions-in-a-list-exceeded-expected-at-most-16384-got-29933
• https://stackoverflow.com/questions/62541089/snowflake-connector-sql-compilation-error-maximum-number-of-expressions-in-a-li
◦ https://docs.snowflake.com/developer-guide/python-connector/python-connector-example#label-python-connector-binding-batch-inserts
I've used prefect_snowflake.database.SnowflakeConnector.execute_many in my implementation:
query_str = f"INSERT INTO {db}.{schema}.{table} (col1, col2, ...) VALUES (%(COL1)s, %(COL2)s, ...);"
conn_obj.execute_many(query_str, seq_of_parameters=param_list)
This works perfectly fine for small tables, but not for tables with more than 16,384 records to insert.
Reading
https://stackoverflow.com/questions/62541089/snowflake-connector-sql-compilation-error-maximum-number-of-expressions-in-a-li
I wonder if the parameters are being bound client-side, and whether server-side binding would make the issue go away.
To enforce server-side binding, I'd need to set qmark or numeric paramstyle - but that needs setting before the connection is created....
Can someone assist?
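One possible interim workaround, since pyformat execute_many is compiled client-side into one multi-row INSERT: chunk the parameter list so each call stays under the 16,384-expression limit. A minimal sketch, assuming the conn_obj, query_str, and param_list names from the snippet above; CHUNK_SIZE is an illustrative value, not a documented constant.

# Chunk the rows so each client-side-compiled INSERT stays under
# Snowflake's 16,384-expression limit. CHUNK_SIZE is an assumption;
# wide tables may need a smaller value.
CHUNK_SIZE = 16_000

for start in range(0, len(param_list), CHUNK_SIZE):
    chunk = param_list[start : start + CHUNK_SIZE]
    conn_obj.execute_many(query_str, seq_of_parameters=chunk)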
Sean Williams
11/30/2023, 3:58 PM

Jack Goslin
12/04/2023, 3:58 PM
I use .write_pandas() on the Snowflake session. In the background this pushes the data to a stage in Snowflake, then uses COPY commands to insert it into the target table. But I'm just using the snowflake-snowpark-python package, not the prefect_snowflake one. Straight insert statements in Snowflake are s l o w.
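For reference, a minimal sketch of the write_pandas() approach with snowflake-snowpark-python; the connection parameters and MY_TABLE are placeholders.

import pandas as pd
from snowflake.snowpark import Session

# Placeholder credentials - fill in for a real account.
session = Session.builder.configs({
    "account": "<account>",
    "user": "<user>",
    "password": "<password>",
    "warehouse": "<warehouse>",
    "database": "<db>",
    "schema": "<schema>",
}).create()

df = pd.DataFrame({"COL1": [1, 2], "COL2": ["a", "b"]})

# Stages the frame internally, then COPYs it into the target table,
# avoiding row-by-row INSERT statements.
session.write_pandas(df, table_name="MY_TABLE", auto_create_table=True)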
Cormac
12/04/2023, 4:04 PM
Once the snowflake module is loaded:
• instantiate a SnowflakeConnector
• force close all connections by calling the close() method on the instance
• set the Snowflake connector parameter style: snowflake.connector.paramstyle = 'qmark'
• create a sequence of parameters of the form ?, ?, ..., ? and use this instead of the [(col_name, value), ...] list of tuples advised in the Prefect SnowflakeConnector documentation
I absolutely do not want to convert to a pandas data frame in the interim.... too messy, and prone to horrible memory and conversion issues, particularly with lots of large DB ELT jobs....
Documentation really needs improving around ELT support - come on, Prefect team!
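A minimal sketch of that qmark workaround; the block name and table identifiers are placeholders.

import snowflake.connector
from prefect_snowflake.database import SnowflakeConnector

connector = SnowflakeConnector.load("my-snowflake-connector")  # placeholder block name

# Close any connection opened with the default (pyformat) paramstyle...
connector.close()

# ...then switch the module-level paramstyle so new connections use
# server-side (qmark) binding.
snowflake.connector.paramstyle = "qmark"

# Positional ? placeholders, and plain tuples of values instead of
# [(col_name, value), ...] pairs.
query_str = "INSERT INTO my_db.my_schema.my_table (col1, col2) VALUES (?, ?)"
param_list = [(1, "a"), (2, "b")]
connector.execute_many(query_str, seq_of_parameters=param_list)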
Cormac
12/04/2023, 4:05 PM

Cormac
12/04/2023, 4:09 PM

Jack Goslin
12/04/2023, 4:10 PM

Sean Williams
12/04/2023, 4:27 PM
"Let's start with the simplest method of data ingestion, the INSERT command. This is a suitable approach for bringing in a small amount of data, but it has some limitations for large data sets exceeding the single-digit MB range, particularly around ease of use, scalability, and error handling. For larger data sets, data engineers typically have the option to use a variety of ETL/ELT tools to ingest data, or preferably use object storage as an intermediate step alongside COPY INTO or Snowpipe."
If you want to stick with insert, Snowflake's advice is to stick to very small amounts of data. https://www.snowflake.com/blog/best-practices-for-data-ingestion/
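For completeness, a minimal sketch of the object-storage-plus-COPY pattern that post recommends, using the plain Snowflake connector; the file path, credentials, and table name are placeholders.

import snowflake.connector

conn = snowflake.connector.connect(
    account="<account>", user="<user>", password="<password>",
    warehouse="<warehouse>", database="<db>", schema="<schema>",
)
cur = conn.cursor()

# Upload a local CSV to the table's internal stage...
cur.execute("PUT file:///tmp/my_data.csv @%my_table")

# ...then bulk-load it; COPY INTO never hits the expression-list limit.
cur.execute("COPY INTO my_table FILE_FORMAT = (TYPE = CSV SKIP_HEADER = 1)")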
Cormac
12/04/2023, 4:32 PM