Adam
12/07/2020, 2:28 PM
Failed to set task state with error: HTTPError('413 Client Error: Request Entity Too Large for url: https://api.prefect.io/graphql')
What's the best way to debug this?
Chris White
Adam
12/07/2020, 4:15 PM
Chris White
Adam
12/07/2020, 4:17 PM
Chris White
import prefect
def size_handler(task, old, new):
    prefect.context.logger.info(f"State {new} has serialized representation: {new.serialize()}")
    return new  # pass the new state through unchanged
and then add this as a state handler to your affected task:
@task(state_handlers=[size_handler])
...
# or
my_task(state_handlers=[size_handler])
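If the full representation is too noisy to read, a variant of the same idea is to log only its size. This is a minimal sketch, assuming the dict returned by `serialize()` can be encoded as JSON; `serialized_size_bytes` is a hypothetical helper, not part of Prefect, and `example_state` is a stand-in for a real serialized state:

```python
import json


def serialized_size_bytes(payload: dict) -> int:
    """Return the UTF-8 byte length of a dict once encoded as JSON."""
    return len(json.dumps(payload).encode("utf-8"))


# Stand-in for the dict that a state's serialize() call would return.
example_state = {"message": "Starting task run.", "type": "Running"}

size = serialized_size_bytes(example_state)
print(f"Serialized state is {size} bytes")
```

Inside the handler, the same helper could be called on `new.serialize()` so that only a number is logged for each state change.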
Adam
12/07/2020, 5:08 PM
State <Running: "Starting task run."> has serialized representation: {'message': 'Starting task run.', 'context': {'tags': []}, 'cached_inputs': {}, '_result': {'__version__': '0.13.18', 'type': 'NoResultType'}, '__version__': '0.13.18', 'type': 'Running'}
Although from looking at the stdout logs, it appears that my code is somehow logging the contents of a file to stdout. As I am using log_stdout=True for this task, it seems that's the culprit.
Very strange though that it's outputting the contents of a file. This is one of the methods that gets called from within the task:
from os import path
def clean_file(filepath: str):
    print(f"Cleaning file {filepath}")
    filename = path.basename(filepath)
    with open(filepath, "r") as fin:
        data = fin.read().splitlines(True)
    for index, value in enumerate(data):
        data[index] = "|".join([filename, value])
    with open(filepath, "w") as fout:
        # skip line 1 (header) and skip last line (footer)
        fout.writelines(data[1 : len(data) - 1])
    return filepath
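A quick way to check whether clean_file itself prints file contents is to run it against a temporary file with stdout captured. This is a minimal sketch using only the standard library; the sample file name and its contents are made up for illustration:

```python
import contextlib
import io
import tempfile
from os import path


def clean_file(filepath: str) -> str:
    print(f"Cleaning file {filepath}")
    filename = path.basename(filepath)
    with open(filepath, "r") as fin:
        data = fin.read().splitlines(True)
    for index, value in enumerate(data):
        data[index] = "|".join([filename, value])
    with open(filepath, "w") as fout:
        # skip line 1 (header) and skip last line (footer)
        fout.writelines(data[1 : len(data) - 1])
    return filepath


with tempfile.TemporaryDirectory() as tmpdir:
    sample = path.join(tmpdir, "sample.txt")
    with open(sample, "w") as f:
        f.write("header\nrow1\nrow2\nfooter\n")

    # Capture everything clean_file sends to stdout.
    captured = io.StringIO()
    with contextlib.redirect_stdout(captured):
        clean_file(sample)
    stdout_text = captured.getvalue()

    # Read back what actually landed in the file.
    with open(sample) as f:
        file_text = f.read()

print(repr(stdout_text))
print(repr(file_text))
```

If the captured stdout holds only the "Cleaning file" line while the rows end up in the file, the file contents seen on stdout are coming from somewhere else in the task.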
It appears the fout.writelines output is coming up on stdout rather than going into the file. Or am I mistaken?
I have disabled log_stdout and explicitly use the prefect logger, but I'm still seeing the actual file written to stdout. It used to work though, so very confused 😕
With flow.run() I don't have this issue though.
Chris White
That was the Running state. What did the final state look like?
Adam
12/09/2020, 9:31 AM
Failed to write log with error: 413 Client Error: Request Entity Too Large for url: https://api.prefect.io/graphql
It never proceeds further than a few of those errors, as eventually it says "No heartbeat detected from the remote task; marking the run as failed."
That being said, when I look at the actual logs on the container, I see a file being printed which shouldn't be (it seems the printed file is being sent as a log). I have disabled all log_stdout properties on the task though. Any ideas?
Chris White
Adam
12/10/2020, 5:39 PM
Chris White