# prefect-server
p
I now have a working Prefect Server with flows running within a DockerRun run config. However, since I deployed to the server, the runs are not very reliable. I keep getting random errors in a totally non-deterministic way (cf. image). The flows all run fine on my machine and were also running fine when executed in a Docker container I was managing myself (on cloud infrastructure). Some of the errors include:
• HTTPConnectionPool(host='host.docker.internal', port=4200): Read timed out. (read timeout=15)
• sqlalchemy.exc.NoSuchTableError (I am writing to Postgres tables - the table does exist, and sometimes the run even succeeds)
• TypeError: 'NoneType' object is not subscriptable
• No heartbeat detected from the remote task; retrying the run. This will be retry 1 of 3.
I believe most of the errors are related to some timeout issue / the state of the task not being monitorable, but I am confused as to why this happens. My VM has sufficient resources (I checked the monitoring), but I am thinking it might be worth scaling it up? I know the question is a bit broad, but I would be happy to hear if anyone has experienced something similar and/or knows the reasons / a fix.
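For context, a minimal sketch of the kind of setup described above: a Prefect 1.x flow registered against Prefect Server with a DockerRun run config and picked up by a Docker agent. The flow name, image, project name, and task body are placeholders, not the poster's actual values.

```python
from prefect import Flow, task
from prefect.run_configs import DockerRun


@task
def write_to_postgres():
    ...  # placeholder: the real task writes rows to an existing Postgres table


with Flow("example-etl") as flow:  # hypothetical flow name
    write_to_postgres()

# Each flow run is executed in its own container, launched by a Docker agent.
flow.run_config = DockerRun(image="my-registry/example-etl:latest")  # hypothetical image
flow.register(project_name="example-project")  # hypothetical project name
```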
k
I think some of these might be resolved by scaling up. Are you accessing sqlalchemy with a mapped task, by chance? Also, the NoneType error normally happens when something didn't succeed and the flow continues anyway. Heartbeat issues are related to memory issues 90% of the time. Heartbeats tell Prefect that the task is still running, and in most of the cases we've seen, heartbeats die when the flow/task is memory constrained. All things considered, I think the best advice for you is to scale up here.
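One way to act on the memory / scale-up advice without immediately resizing the VM is to give the flow-run container more memory. Assuming a Prefect 1.x version whose DockerRun accepts a host_config dict (forwarded to Docker), a sketch could look like the following; the image name and limit values are placeholders.

```python
from prefect.run_configs import DockerRun

flow.run_config = DockerRun(
    image="my-registry/example-etl:latest",  # hypothetical image
    host_config={
        "mem_limit": "4g",       # memory cap for the flow-run container
        "memswap_limit": "4g",   # keep swap from masking the memory pressure
    },
)
```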
p
Thanks a lot for the detailed advice. I’ll look into scaling the machine and will see if it fixes it.
Update: checking the monitoring again, it was indeed a problem with the VM resources. For a start, I spread my flow schedules over several time slots (instead of having them all fire at the same time), and it already seems to help.
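A sketch of what "spreading the schedules" can look like in Prefect 1.x, assuming three hypothetical flows (flow_a, flow_b, flow_c) that previously all ran at the top of the hour; the cron strings are illustrative.

```python
from prefect.schedules import Schedule
from prefect.schedules.clocks import CronClock

# Stagger the cron minutes so the flows don't all start at once
# and compete for the VM's memory and the agent at the same time.
flow_a.schedule = Schedule(clocks=[CronClock("0 * * * *")])   # runs at :00
flow_b.schedule = Schedule(clocks=[CronClock("20 * * * *")])  # runs at :20
flow_c.schedule = Schedule(clocks=[CronClock("40 * * * *")])  # runs at :40
```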
Indeed, just spreading the flow schedules did wonders 😍
Thanks again for always helping out @Kevin Kho!
k
Of course!