Hello friends, happy Monday. I’m getting a strange error in Pr... | Prefect Community #ask-community

Adam

12/07/2020, 2:28 PM

Hello friends, happy Monday. I’m getting a strange error in Prefect regarding

Failed to set task state with error: HTTPError('413 Client Error: Request Entity Too Large for url: https://api.prefect.io/graphql')

What’s the best way to debug this?

Adam

12/07/2020, 2:29 PM

It occurs between two tasks, where the first task is passing a list of filenames to the second. I suspect that list of filenames results in a query that is too large? But surely that is not sent to the API?

Chris White

12/07/2020, 4:15 PM

Hi @Adam - what type of Result configuration are you using for this task / for your flow?

Adam

12/07/2020, 4:15 PM

Hi @Chris White, we’re not using any

Chris White

12/07/2020, 4:16 PM

What is your Flow’s storage configuration?

Adam

12/07/2020, 4:17 PM

Docker. I recently updated the agent to 0.13.18 but the flow might still be on 0.13.8. Could that be an issue?

Chris White

12/07/2020, 4:18 PM

hmmm no, I don’t think so; without a result configuration I’m really surprised your state object was large. If you can reproduce this, perhaps do the following:

import prefect

def size_handler(task, old, new):
    prefect.context.logger.info(f"State {new} has serialized representation: {new.serialize()}")
    return new

and then add this as a state handler to your affected task:

@task(state_handlers=[size_handler])
...

# or

my_task(state_handlers=[size_handler])

Chris White

12/07/2020, 4:19 PM

and you’ll probably want to look at the logs in STDOUT because there’s a chance this log will be rejected by the API as well because it might be large
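
Since the full serialized state may itself be too large to log, a variant of the handler above could log only the size of the payload. This is a minimal sketch, not Prefect's own API: `serialized_size` and `make_size_handler` are hypothetical names, the logger is a plain standard-library logger, and the only assumption about the state object is that it has a `serialize()` method returning a dict, as the handler above already relies on.

```python
import json
import logging


def serialized_size(state) -> int:
    """Return the size in bytes of the JSON-encoded serialized state.

    Assumes `state.serialize()` returns a dict; values that are not
    JSON-native are stringified via `default=str`.
    """
    return len(json.dumps(state.serialize(), default=str).encode("utf-8"))


def make_size_handler(logger: logging.Logger):
    """Build a state handler that logs the payload size, not the payload."""

    def size_handler(task, old, new):
        logger.info(
            "State %s serializes to %d bytes",
            type(new).__name__,
            serialized_size(new),
        )
        # Hand the new state back unchanged so the run continues with it.
        return new

    return size_handler
```

The handler would be attached the same way as the one above, for example `state_handlers=[make_size_handler(logging.getLogger("size"))]`; which logger to pass is a choice left open here.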

Adam

12/07/2020, 5:08 PM

Thanks @Chris White, I’ll try the above

Adam

12/08/2020, 1:06 PM

Hi @Chris White, here is the output of the state handler:

State <Running: "Starting task run."> has serialized representation: {'message': 'Starting task run.', 'context': {'tags': []}, 'cached_inputs': {}, '_result': {'__version__': '0.13.18', 'type': 'NoResultType'}, '__version__': '0.13.18', 'type': 'Running'}

Although from looking at the stdout logs, it appears that my code is somehow logging contents of a file to stdout. As I am using

log_stdout=True

for this task, it seems that’s the culprit. Very strange though that it’s outputting the contents of a file. This is one of the methods that gets called from within the task:

from os import path

def clean_file(filepath: str):
    print(f"Cleaning file {filepath}")
    filename = path.basename(filepath)
    with open(filepath, "r") as fin:
        data = fin.read().splitlines(True)
        for index, value in enumerate(data):
            # prefix every line with the name of the file it came from
            data[index] = "|".join([filename, value])
    with open(filepath, "w") as fout:
        # skip line 1 (header) and skip last line (footer)
        fout.writelines(data[1 : len(data) - 1])
    return filepath

It appears

fout.writelines

is coming up on stdout rather than into the file. Or am I mistaken?
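
One way to check the question above in isolation is to run the function against a temporary file while capturing stdout. The sketch below is an assumption-laden reproduction, not the original job: `run_and_capture` is a hypothetical helper, the `print` call is dropped so that any captured stdout would have to come from the file writes, and it says nothing about what else runs inside the container.

```python
import contextlib
import io
import os
import tempfile


def clean_file(filepath: str) -> str:
    """Prefix each line with the file name, dropping header and footer."""
    filename = os.path.basename(filepath)
    with open(filepath, "r") as fin:
        data = fin.read().splitlines(True)
    data = ["|".join([filename, line]) for line in data]
    with open(filepath, "w") as fout:
        # skip line 1 (header) and skip last line (footer)
        fout.writelines(data[1 : len(data) - 1])
    return filepath


def run_and_capture(contents: str):
    """Run clean_file on a temp file; return (captured stdout, file text)."""
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "sample.txt")
        with open(target, "w") as f:
            f.write(contents)
        captured = io.StringIO()
        with contextlib.redirect_stdout(captured):
            clean_file(target)
        with open(target) as f:
            return captured.getvalue(), f.read()
```

If the captured stdout is empty and the file holds the cleaned rows, `fout.writelines` is writing to the file as expected, and the file contents seen in the container logs would have to come from somewhere else.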

Adam

12/08/2020, 2:59 PM

I’ve removed all

log_stdout

and explicitly use the prefect logger, but I’m still seeing the actual file written to stdout. It used to work though, so very confused 😕

Adam

12/08/2020, 3:00 PM

Locally when running

flow.run()

I don’t have this issue though

Chris White

12/08/2020, 4:16 PM

So the original traceback was for a state; logs are batched up in the background and wouldn’t cause a state update failure. You only posted the serialized form for the

Running

state. What did the final state look like?

Adam

12/09/2020, 9:31 AM

Hi @Chris White So I’m no longer seeing the “set task state” error, but I am seeing this error:

Failed to write log with error: 413 Client Error: Request Entity Too Large for url: https://api.prefect.io/graphql

It never proceeds further than a few of those errors as eventually it says

No heartbeat detected from the remote task; marking the run as failed.

That being said, when I look at the actual logs on the container, I see a file being printed which shouldn’t be (it seems this file being printed is sent as a log). I have disabled all

log_stdout

properties on the task though. Any ideas?
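
While the source of the oversized output is being tracked down, one possible stopgap is to cap the length of each log message before it is shipped. The sketch below uses only the standard-library `logging` module; `TruncateFilter` and the `MAX_CHARS` value are hypothetical, the API's actual size limit is not stated in this thread, and whether Prefect's own log handler would honor such a filter is an assumption that would need checking.

```python
import logging

MAX_CHARS = 2000  # assumed cap; the real request size limit is unknown here


class TruncateFilter(logging.Filter):
    """Shorten any log message longer than max_chars, noting what was cut."""

    def __init__(self, max_chars: int = MAX_CHARS):
        super().__init__()
        self.max_chars = max_chars

    def filter(self, record: logging.LogRecord) -> bool:
        message = record.getMessage()
        if len(message) > self.max_chars:
            dropped = len(message) - self.max_chars
            # Store the already formatted, shortened text and clear the
            # args so the message is not %-formatted a second time.
            record.msg = f"{message[:self.max_chars]}... [{dropped} chars truncated]"
            record.args = None
        return True  # never drop the record, only shorten it
```

A filter like this could be attached with `logger.addFilter(TruncateFilter())` or on a specific handler with `handler.addFilter(...)`. A filter on a logger only sees records logged directly on that logger, so attaching it to the handler is usually the broader option.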

Adam

12/10/2020, 4:59 PM

Hey @Chris White, good morning 🙂 Any ideas? Do you have any idea why writing to a file is logged to stdout when running in Kubernetes, but not locally?

Adam

12/10/2020, 4:59 PM

It’s causing one of our main ETL jobs to fail :(

Chris White

12/10/2020, 5:31 PM

Hey @Adam - I think this will be better as a GitHub issue w/ some code snippets - I don’t have enough information to debug here; the symptoms you’re describing appear contradictory to me; the failure to write a log has no bearing on the final state of your task runs, so there’s something missing that we’ll need to identify

Adam

12/10/2020, 5:39 PM

Sure, will post as an issue with the code snippets

Adam

12/10/2020, 5:39 PM

Thanks!

Chris White

12/10/2020, 5:39 PM

👍 👍
