
Alex Furrier

07/13/2021, 4:00 PM
I'm using Prefect Server to run a flow that queries a NoSQL database. If the query result is large enough I get a 413 Payload Too Large error from GraphQL. Any suggestions for how to tackle this?
Copy code
Failed to set task state with error: HTTPError('413 Client Error: Payload Too Large for url: http://prefect-apollo.prefect:4200/graphql/graphql')
Traceback (most recent call last):
  File "/opt/conda/lib/python3.8/site-packages/prefect/engine/cloud/task_runner.py", line 91, in call_runner_target_handlers
    state = self.client.set_task_run_state(
  File "/opt/conda/lib/python3.8/site-packages/prefect/client/client.py", line 1518, in set_task_run_state
    result = self.graphql(
  File "/opt/conda/lib/python3.8/site-packages/prefect/client/client.py", line 298, in graphql
    result = self.post(
  File "/opt/conda/lib/python3.8/site-packages/prefect/client/client.py", line 213, in post
    response = self._request(
  File "/opt/conda/lib/python3.8/site-packages/prefect/client/client.py", line 459, in _request
    response = self._send_request(
  File "/opt/conda/lib/python3.8/site-packages/prefect/client/client.py", line 375, in _send_request
    response.raise_for_status()
  File "/opt/conda/lib/python3.8/site-packages/requests/models.py", line 943, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 413 Client Error: Payload Too Large for url: http://prefect-apollo.prefect:4200/graphql/graphql

nicholas

07/13/2021, 10:00 PM
Hi @Alex Furrier - it sounds like you're attempting to send a pretty large payload to the Server; Apollo Server imposes a hard limit on payloads larger than 1 MB. Generally you'll want to avoid sending large payloads through the Server and instead put them somewhere else between tasks, like a bucket, that downstream tasks can then fetch from to perform their operations.

Alex Furrier

07/13/2021, 10:03 PM
So the large payload is from a database query. Is there a way to avoid having the data sent as a payload? Or is my only option to make the query smaller and then either combine the results, or stream them and write to a bucket like you mentioned?

nicholas

07/13/2021, 10:09 PM
That depends on how you're making the query. If you're using a task from the task library (like PostgresFetch or something) you can always wrap that in your own task, make the query as you'd like, write the results out, and then return something like the bucket reference to your downstream tasks so they know where to look
So (in semi-pseudocode):
Copy code
from prefect import task
from prefect.tasks.postgres import PostgresFetch

@task
def postgres_wrapper(**kwargs):
    # Run the library task directly, then persist the (large) results yourself
    results = PostgresFetch(**kwargs).run()
    location = write_to_bucket(results)  # your own storage helper
    # Return only the lightweight reference for downstream tasks
    return location
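Downstream tasks can then take that reference instead of the data itself - a minimal sketch, where read_from_bucket is a hypothetical helper mirroring write_to_bucket above, and only small values are returned as task results:
Copy code
from prefect import Flow, task

@task
def process_results(location):
    # Fetch the full result set from storage only where it's needed
    results = read_from_bucket(location)  # hypothetical helper
    # Return something small so the task state stays under Apollo's ~1 MB limit
    return len(results)

with Flow("nosql-query") as flow:
    location = postgres_wrapper(query="SELECT ...")  # plus whatever connection kwargs you need
    row_count = process_results(location)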

Alex Furrier

07/13/2021, 10:41 PM
Makes sense, I'll try that. Thanks!