Pierre Monico 08/25/2021, 9:04 AM
run config. However, since I deployed to the server, the runs have not been very reliable. I keep getting random errors in a totally non-deterministic way (cf. image). The flows all run fine on my machine and were also running fine when executed in a Docker container I was managing myself (on cloud infrastructure). Some of the errors include:
• HTTPConnectionPool(host='host.docker.internal', port=4200): Read timed out. (read timeout=15)
(I am writing to Postgres tables - the table does exist and sometimes the run even succeeds)
• TypeError: 'NoneType' object is not subscriptable
I believe most of the errors are related to some timeout issue, or to the state of the task not being monitorable, but I am confused as to why this happens. My VM has sufficient resources (I checked the monitoring), but I am wondering whether it might be worth scaling it up. I know the question is a bit broad, but I would be happy to hear from anyone who has experienced something similar and/or knows the reason or a fix.
No heartbeat detected from the remote task; retrying the run.This will be retry 1 of 3.
Kevin Kho 08/25/2021, 2:48 PM
with a mapped task by chance? Also the
error normally happens when something didn’t succeed but the flow continued anyway. Heartbeat issues are related to memory issues 90% of the time. Heartbeats tell Prefect that the task is still running, and in most of the cases we’ve seen, heartbeats die when the flow/task is memory constrained. All things considered, I think the best advice for you is to scale up here.
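To make the heartbeat idea above concrete, here is a minimal sketch in plain Python. It is not Prefect's implementation: `HeartbeatMonitor`, `run_with_heartbeat`, and the interval/timeout values are hypothetical names and numbers chosen only for illustration. The sketch assumes a background thread that records a timestamp at a fixed interval while the task runs, and a monitor that treats the task as dead once the last timestamp is older than a timeout.

```python
import threading
import time


class HeartbeatMonitor:
    """Toy heartbeat check (hypothetical; not Prefect's actual code)."""

    def __init__(self, timeout_s: float) -> None:
        self.timeout_s = timeout_s
        self._last_beat = time.monotonic()
        self._lock = threading.Lock()

    def beat(self) -> None:
        # Record that the task was still running at this moment.
        with self._lock:
            self._last_beat = time.monotonic()

    def is_alive(self) -> bool:
        # The task counts as alive only if a beat arrived recently enough.
        with self._lock:
            return time.monotonic() - self._last_beat < self.timeout_s


def run_with_heartbeat(task, monitor: HeartbeatMonitor, interval_s: float):
    """Run `task()` while a background thread sends periodic heartbeats."""
    stop = threading.Event()

    def _beat_loop() -> None:
        while not stop.is_set():
            monitor.beat()
            stop.wait(interval_s)  # sleep, but wake early when asked to stop

    beater = threading.Thread(target=_beat_loop, daemon=True)
    beater.start()
    try:
        return task()
    finally:
        stop.set()
        beater.join()


if __name__ == "__main__":
    monitor = HeartbeatMonitor(timeout_s=0.5)
    result = run_with_heartbeat(lambda: sum(range(1000)), monitor, interval_s=0.05)
    print(result, monitor.is_alive())
```

In this toy model, if the process running the task is killed or starved (for example because the machine runs out of memory), the beat loop stops too, so the monitor only sees silence and eventually reports the task as dead even though no explicit error was raised.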
Pierre Monico 08/25/2021, 3:07 PM
Kevin Kho 08/26/2021, 1:58 PM