How can we check whether all flow run logs have arrived at p Prefect Community #prefect-server


jack

12/10/2021, 7:21 PM

How can we check whether all flow-run-logs have arrived at prefect cloud? When a flow-run is marked as failed, we wait 10 seconds, then query the flow-run-logs to see what happened. Sometimes 10 seconds isn't long enough of a delay, and instead of getting all of the logs, we only get the first N. Is there some flag we can wait for instead of waiting an arbitrary duration, so that we can be sure to get all the logs?

Anna Geller

12/10/2021, 7:25 PM

@jack Can you elaborate on the problem you are trying to solve this way? Do you want to get notified about the reason why the flow run failed? If so, a state handler or an Automations notification would probably help here. This way, you could get the exception as part of, e.g., a Slack or email message.
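A minimal sketch of the state-handler idea, assuming the Prefect 1.x convention that a handler is called as `handler(obj, old_state, new_state)` and returns the new state, and that states expose `is_failed()` and `message`. The factory name `make_failure_notifier` and the `send` callable are hypothetical; `send` stands in for whatever posts to Slack or email.

```python
from typing import Any, Callable


def make_failure_notifier(send: Callable[[str], None]) -> Callable[[Any, Any, Any], Any]:
    """Build a state handler that reports failures through `send` (hypothetical helper)."""

    def notify_on_failure(obj: Any, old_state: Any, new_state: Any) -> Any:
        # Only react when the run or task transitions into a failed state.
        if new_state.is_failed():
            name = getattr(obj, "name", "<unnamed>")
            send(f"{name} failed: {new_state.message!r}")
        # A state handler must hand the state back to the caller.
        return new_state

    return notify_on_failure
```

Under the same assumption, the handler would be attached with something like `Flow("my-flow", state_handlers=[make_failure_notifier(print)])`, so the failure reason is pushed at the moment of failure rather than pulled from the API afterwards.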

Kevin Kho

12/10/2021, 7:35 PM

The cause of this is that Prefect Cloud batches logs before they are persisted on our side, so there will always be a delay.

jack

12/10/2021, 7:51 PM

I don't mind a delay. But I would like to know how long I need to wait in order to get all of them. Is there a set cadence for shipping the logs as a batch?

Kevin Kho

12/10/2021, 7:55 PM

I suspect it’s hard to pin down a number because even if there were a cadence, the processing time can vary depending on the load. Will ask though.
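Since the thread does not confirm any "all logs delivered" flag, one workaround is to poll until the number of logs stops growing. The sketch below is a heuristic only and cannot guarantee completeness. `fetch_logs` is a hypothetical callable that returns all logs currently available for the flow run; the function name and the default intervals are assumptions, not Prefect settings.

```python
import time
from typing import Any, Callable, List, Sequence


def wait_for_stable_logs(
    fetch_logs: Callable[[], Sequence[Any]],
    poll_seconds: float = 5.0,
    stable_polls: int = 3,
    timeout_seconds: float = 120.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> List[Any]:
    """Poll until the log count is unchanged for `stable_polls` consecutive polls."""
    deadline = clock() + timeout_seconds
    last_count = -1
    unchanged = 0
    while True:
        logs = list(fetch_logs())
        if len(logs) == last_count:
            unchanged += 1
        else:
            # New logs arrived, so restart the stability counter.
            unchanged = 0
            last_count = len(logs)
        # Stop when the count has settled or the overall timeout has passed.
        if unchanged >= stable_polls or clock() >= deadline:
            return logs
        sleep(poll_seconds)
```

This trades the single fixed 10-second wait for a bounded loop: quiet runs finish after a few polls, and busy periods get up to `timeout_seconds` before the caller takes whatever has arrived.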

Kevin Kho

12/10/2021, 8:01 PM

Anna brings up good suggestions: if you need instant logs to do something, you might be able to do it from the Flow side.

jack

12/10/2021, 8:07 PM

We've been slurping the logs so that we have all of the pertinent logs (prefect and otherwise) in a common location on disk.

Kevin Kho

12/10/2021, 8:08 PM

It seems roundabout to get the Prefect logs by querying the database. A FileHandler might be better, especially because querying for logs would require pagination. Still trying to get an answer for you, though.
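A minimal sketch of the FileHandler suggestion using only the standard library. It assumes that Prefect's loggers live under a logger named `"prefect"`, which should be checked against the installed version. The helper name `add_file_handler` and the format string are illustrative.

```python
import logging


def add_file_handler(path: str, logger_name: str = "prefect") -> logging.FileHandler:
    """Attach a file handler so records from `logger_name` are also written to `path`."""
    handler = logging.FileHandler(path)
    handler.setFormatter(
        logging.Formatter("%(asctime)s | %(levelname)s | %(name)s | %(message)s")
    )
    # Child loggers propagate to this logger by default, so one handler covers them.
    logging.getLogger(logger_name).addHandler(handler)
    return handler
```

Called once at flow start-up (for example `add_file_handler("/var/log/flows/run.log")`, a placeholder path), this writes the logs to disk as they are emitted, with no API query or pagination involved.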

Anna Geller

12/10/2021, 8:10 PM

@jack to add to that, what agent do you use? It’s actually also easier to forward flow run logs from your agent or other execution layer (e.g. a Kubernetes/ECS cluster) to some central location rather than going through the API. This could significantly reduce latency because you would retrieve those logs directly from your own infrastructure.

jack

12/10/2021, 8:12 PM

We are running on ECS. Does that answer the question about agents, or is that a different question?

Anna Geller

12/10/2021, 8:14 PM

Yes, it does. Actually, all your logs are then available in CloudWatch by default, and you could configure CloudWatch to automatically forward your logs to something else, such as an S3 bucket. Or you could configure a log driver on the ECS tasks and the ECS agent to send all logs to, e.g., Splunk.
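For reference, the log driver is set per container in the ECS task definition through its `logConfiguration` field. The sketch below builds that fragment for the `awslogs` driver as a plain Python dict. The group name, region, stream prefix, and container name are placeholders, and how the fragment is passed to the agent or run configuration depends on the setup.

```python
from typing import Any, Dict


def awslogs_configuration(group: str, region: str, stream_prefix: str) -> Dict[str, Any]:
    """Return an ECS `logConfiguration` fragment that ships container logs to CloudWatch."""
    return {
        "logDriver": "awslogs",
        "options": {
            "awslogs-group": group,
            "awslogs-region": region,
            "awslogs-stream-prefix": stream_prefix,
        },
    }


# Placeholder values for illustration only.
container_definition = {
    "name": "flow",
    "logConfiguration": awslogs_configuration("/ecs/prefect-flows", "us-east-1", "flow-run"),
}
```

Sending to Splunk instead would mean swapping `logDriver` and its options for that driver's settings while keeping the same overall shape.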

jack

12/10/2021, 8:18 PM

We have also been ingesting flow run logs for successful runs. This way we have been able to capture the resident size for each flow run.
