    jack

    9 months ago
    How can we check whether all flow-run-logs have arrived at Prefect Cloud? When a flow-run is marked as failed, we wait 10 seconds, then query the flow-run-logs to see what happened. Sometimes 10 seconds isn't a long enough delay, and instead of getting all of the logs, we only get the first N. Is there some flag we can wait for instead of waiting an arbitrary duration, so that we can be sure to get all the logs?
    Anna Geller

    9 months ago
    @jack Can you elaborate on the problem you are trying to solve this way? Do you want to get notified about the reason why the flow run failed? If so, a state handler or an Automations notification would probably help here. This way, you could get the exception, e.g. as part of a Slack or email message.
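    A minimal sketch of the state-handler idea, assuming the Prefect 1.x handler convention of `(obj, old_state, new_state)` returning the new state. The `send` parameter is a hypothetical hook for whatever notifier you use (Slack, email); it defaults to `print` here.

```python
def notify_on_failure(obj, old_state, new_state, send=print):
    """State handler sketch: report the failure message when a run fails.

    `send` is a hypothetical notifier callable, not part of Prefect.
    """
    if new_state.is_failed():
        # Flows and tasks both expose a `name`; fall back if it is missing.
        name = getattr(obj, "name", "<unknown>")
        send(f"{name} failed: {new_state.message}")
    # State handlers must return the state that should be used going forward.
    return new_state
```

    In Prefect 1.x this would typically be attached with something like `Flow("my-flow", state_handlers=[notify_on_failure])`, so the failure reason is available from inside the run without waiting for logs to reach the backend.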
    Kevin Kho

    9 months ago
    So the cause of this is that in Prefect Cloud, there is a batching of logs before they are persisted with us, so there will always be a delay.
    jack

    9 months ago
    I don't mind a delay. But I would like to know how long I need to wait in order to get all of them. Is there a set cadence for shipping the logs as a batch?
    Kevin Kho

    9 months ago
    I suspect it’s hard to pin down a number because, even if there were a cadence, the processing time can vary depending on the load. Will ask though.
    Anna brings up good suggestions: if you need instant logs to do something, you might be able to do it from the Flow side.
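    Since no fixed cadence is confirmed here, one workaround if you do keep polling the API is to wait until the log count stops growing for a quiet period rather than sleeping a fixed 10 seconds. This is only a heuristic sketch: `fetch_log_count` is a hypothetical callable you would implement against your own log query, and the default timings are arbitrary assumptions. It cannot guarantee that every log has arrived.

```python
import time


def wait_for_logs_to_settle(fetch_log_count, quiet_period=15.0,
                            poll_interval=5.0, timeout=120.0,
                            sleep=time.sleep, clock=time.monotonic):
    """Poll until the log count is unchanged for `quiet_period` seconds.

    Returns (last_count, settled). `settled` is False if `timeout` elapsed
    while the count was still changing.
    """
    deadline = clock() + timeout
    last_count = fetch_log_count()
    last_change = clock()
    while clock() < deadline:
        if clock() - last_change >= quiet_period:
            return last_count, True
        sleep(poll_interval)
        count = fetch_log_count()
        if count != last_count:
            # New logs arrived, so restart the quiet-period window.
            last_count, last_change = count, clock()
    return last_count, False
```

    The `sleep` and `clock` parameters exist only so the loop can be exercised without real waiting.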
    jack

    9 months ago
    We've been slurping the logs so that we have all of the pertinent logs (prefect and otherwise) in a common location on disk.
    Kevin Kho

    9 months ago
    It seems roundabout to get the Prefect logs by pulling them from the database. Maybe a FileHandler would be better, especially because querying for logs would require pagination. Still trying to get an answer for you though.
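    A rough sketch of the FileHandler suggestion using only the standard library. It assumes Prefect 1.x loggers live under a root logger named `"prefect"`; the helper name, log format, and file path are illustrative choices, not anything Prefect prescribes.

```python
import logging


def add_file_handler(path, logger_name="prefect", level=logging.INFO):
    """Attach a FileHandler so records are also written to `path`.

    Assumption: the loggers of interest are `logger_name` or its children,
    whose records propagate up to this handler.
    """
    logger = logging.getLogger(logger_name)
    handler = logging.FileHandler(path)
    handler.setLevel(level)
    handler.setFormatter(
        logging.Formatter("%(asctime)s | %(levelname)s | %(name)s | %(message)s")
    )
    logger.addHandler(handler)
    return handler
```

    This would need to run in the same process as the flow run (for example at import time of the flow module), so the logs land on disk directly and no API pagination is involved.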
    Anna Geller

    9 months ago
    @jack to add to that, what agent do you use? It’s actually also easier to forward flow run logs from your agent or other execution layer (e.g. a Kubernetes/ECS cluster) to some central location rather than going through the API. This could significantly reduce latency, because you would retrieve those logs directly from your own infrastructure.
    jack

    9 months ago
    We are running on ECS. Does that answer the question about agents, or is that a different question?
    Anna Geller

    9 months ago
    Yes, it does. Actually, all your logs are then available in CloudWatch by default, and you could, e.g., configure CloudWatch to automatically forward your logs to something else, like an S3 bucket. Or you could configure a log driver on the ECS tasks and the ECS agent to send all logs to, e.g., Splunk.
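    To illustrate the log-driver option, here is a sketch of the `logConfiguration` fragment that an ECS container definition could carry for Splunk. The helper name, the URL, the secret ARN, and the `splunk-source` value are placeholders; keeping the token in `secretOptions` rather than in plain options is an assumption about how you would want to store it.

```python
def splunk_log_configuration(splunk_url, token_secret_arn,
                             source="prefect-flow-run"):
    """Build the logConfiguration dict for an ECS container definition.

    All argument values are placeholders supplied by the caller.
    """
    return {
        "logDriver": "splunk",
        "options": {
            "splunk-url": splunk_url,
            "splunk-source": source,
        },
        # Reference the HEC token from a secret instead of inlining it.
        "secretOptions": [
            {"name": "splunk-token", "valueFrom": token_secret_arn},
        ],
    }
```

    The resulting dict would go under `containerDefinitions[].logConfiguration` in the task definition used for flow runs.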
    jack

    9 months ago
    We have also been ingesting flow run logs for successful runs. This way we have been able to capture the resident size for each flow run.