# data-tricks-and-tips
j
I’m seeing an error I haven’t run into before, even though I’ve built a good number of ML pipelines with Prefect. I’m passing a pandas DataFrame as an argument to a flow, and it’s not that big, but I’m getting this error:
```
PrefectHTTPStatusError: Client error '422 Unprocessable Entity' for url 'https://api.prefect.cloud/api/accounts/31b5f36a-4a3e-4b90-a1aa-ec5a6c794192/workspaces/2e0cb677-1bec-4912-9270-e845f84475b0/flow_runs/'
Response: {'exception_message': 'Invalid request received.', 'exception_detail': [{'loc': ['body', 'parameters'], 'msg': 'Flow run parameters must be less than 512KB when serialized.'
```
In the traceback, the first of the two DataFrame arguments prints out as a DataFrame, but the second prints out as a massive dictionary. They’re both definitely DataFrames, so I don’t see what the issue is.
k
Prefect stores flow parameters server-side, so there’s a limit on how large they can be when serialized. Would it be possible to save your dataframe out somewhere and read it back in inside the flow?
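e.g. something roughly like this (a minimal sketch, assuming Prefect 2.x; the flow name, the parquet path, and the data are all placeholders):
```python
import pandas as pd
from prefect import flow

@flow
def run_pipeline(df_path: str):
    # read the dataframe back in *inside* the flow; only the small
    # path string crosses the API as a flow parameter
    df = pd.read_parquet(df_path)
    print(df.shape)  # real transformations would go here

# stand-in for the real dataframe
df = pd.DataFrame({"a": range(1000)})
df.to_parquet("/tmp/pipeline_input.parquet")

# the parameter is now a tiny string, well under the 512KB cap
run_pipeline("/tmp/pipeline_input.parquet")
```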
j
Guess I could throw it into a temp table in Snowflake, which is the source anyway.
So the DataFrame just has to be under that size when it’s passed as a flow argument, then. No new limits on how big a dataset can be inside of a flow, right?
k
correct
and that limit is strictly on flow parameters, not task parameters
j
Good, because I’ve got some monster models. OK, I’ll park this in a temp table and use the Snowflake connector to pull down the little pieces of it inside the flow, and then do my stuff to it there.
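Roughly what I’m picturing (a sketch, assuming a snowflake-connector-python build with the pandas extras for `write_pandas` / `fetch_pandas_all`; the connection params, table name, and BATCH_ID filter are all placeholders):
```python
import pandas as pd
import snowflake.connector
from snowflake.connector.pandas_tools import write_pandas
from prefect import flow

def get_conn():
    # placeholder connection details
    return snowflake.connector.connect(
        account="my_account", user="my_user", password="...",
        warehouse="my_wh", database="my_db", schema="my_schema",
    )

# before kicking off the flow: park the dataframe in a staging table
# (a regular table here, since a true TEMPORARY table is session-scoped
# and wouldn't be visible from the flow run's own session)
big_df = pd.DataFrame({"BATCH_ID": [1, 1, 2], "X": [1.0, 2.0, 3.0]})  # stand-in
with get_conn() as conn:
    write_pandas(conn, big_df, table_name="PIPELINE_INPUT_TMP",
                 auto_create_table=True)

@flow
def run_models(table_name: str, batch_id: int):
    # only these small parameters ever hit the Prefect API
    with get_conn() as conn:
        cur = conn.cursor()
        cur.execute(
            f"SELECT * FROM {table_name} WHERE BATCH_ID = %s", (batch_id,)
        )
        df = cur.fetch_pandas_all()
    print(df.shape)  # monster models would run here

run_models("PIPELINE_INPUT_TMP", batch_id=1)
```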
Thanks for the info! Surprised I haven’t run into that before. Just luck, I guess.
k
np!