heyo! i'm using `prefect agent docker start` ... ...
# prefect-server
c
heyo! i'm using `prefect agent docker start` ... is there a way to prevent the container from being removed once it's completed?
k
Hey, I’m not sure honestly. Looking at it, but would you happen to know how to do it without Prefect?
c
usually exited containers stick around, i have to assume prefect is doing the equivalent of
docker create ...
docker run ... --rm
usually i don't `docker run ... --rm`
k
Prefect specifically uses create_container of docker-py so I am looking to see if it can be passed in as a kwarg but not seeing anything. The API doc is here for reference
z
We set this here
# By default, auto-remove containers
host_config: Dict[str, Any] = {"auto_remove": True}
You can set `auto_remove=False` in the run config's `host_config`, which gets merged here
if run_config is not None and run_config.host_config:
    # The host_config passed from the run_config will overwrite defaults
    host_config.update(run_config.host_config)
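To make the merge concrete, here is a minimal standalone sketch of the behavior quoted above. It uses plain dicts only; `build_host_config` is a hypothetical helper name for illustration and is not part of Prefect.

```python
from typing import Any, Dict, Optional


def build_host_config(run_host_config: Optional[Dict[str, Any]] = None) -> Dict[str, Any]:
    """Hypothetical helper that mirrors the merge in the quoted agent code."""
    # Default taken from the quoted snippet: auto-remove containers.
    host_config: Dict[str, Any] = {"auto_remove": True}
    if run_host_config:
        # Keys supplied by the run config overwrite the defaults.
        host_config.update(run_host_config)
    return host_config


print(build_host_config())
print(build_host_config({"auto_remove": False}))
```

Because `dict.update` overwrites matching keys, passing `{"auto_remove": False}` should be enough to flip the default while leaving any other host settings untouched.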
c
lemee try that, standby!
z
I’d accept an agent level toggle for the default behavior as in https://github.com/PrefectHQ/prefect/pull/4351
If you’d like to contribute 🙂
c
is host_config actually the right place?
z
Yeah it is
Well, I think it is. We definitely set it to `True` there.
c
my type linter is complaining, i'll see what i'm doing wrong... prob a environmental issue
We just type it as a `dict` internally
c
yea hmm, even register-flow is complaining about using `host_config={auto_remove=False}` 🤔
root@ca683cf00884:/app# prefect version
1.1.0
what i'm really really trying to solve is the container is starting, then stopping, but Prefect only shows that it was submitted, not failed... not getting any signal as to why it's failing
no logs for the flow run other than that the container was submitted for execution, and it stays in the Scheduled state...
i'm volume linking in our code and pickled prefect from the host OS, so i figure something is wrong there, just not what, specifically
k
I think it should be:
host_config={"auto_remove":False}
? Or were you just typing and that’s not a copy paste?
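For anyone else hitting the same complaint, a quick sketch using only the standard library shows why the unquoted form is rejected before Prefect ever sees it. The `parses` helper is a hypothetical name used just for this illustration.

```python
import ast


def parses(source: str) -> bool:
    """Return True if `source` is syntactically valid Python."""
    try:
        ast.parse(source)
        return True
    except SyntaxError:
        return False


# JS-style object literal: not valid Python syntax.
print(parses("host_config={auto_remove=False}"))
# Dict literal with a quoted key: valid.
print(parses('host_config={"auto_remove": False}'))
# dict(...) with keyword arguments builds the same mapping without quoting the key.
print(dict(auto_remove=False) == {"auto_remove": False})
```

So the linter and the register step were both flagging a plain syntax error, not an environment problem.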
c
omg rookie mistake sorry
so used to JS/TS in life where i don't have to quote map keys 🤦‍♂️
ah hah! it's a networking issue...
requests.exceptions.ConnectionError: HTTPConnectionPool(host='apollo', port=4200): Max retries exceeded with url: /graphql (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x400a087ad0>: Failed to establish a new connection: [Errno 111] Connection refused'))
still very strange that didn't end up in flow logs and the job stays in Submitted state... feels like a bug? perhaps a race condition when the container is launched, fails, and gets rm'd before prefect agent can gather logs?
k
Stuck in submitted means the flow process never started, which is the thing that sends the logs to Cloud. So it’s not a bug necessarily, but more like there was nothing to send the logs. This also happens when a Kubernetes job couldn’t get the hardware it needed for example
So these will eventually get marked as Failed if they are retried enough times and still can't get past the Submitted state
z
It can’t send logs / update its state if it can’t contact the API
Here it looks like it’s trying to connect to `apollo` as the API url, are you running your server on the same machine as your agent?
c
ahhhh yes that makes sense
i added `--network` to `prefect agent docker start` and got past the issue!
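If it helps with debugging this kind of thing, here is a rough sketch of a reachability check that could be run from inside the flow container. It only tests whether a TCP connection can be opened; `can_connect` is a hypothetical helper, and the host and port to try (`apollo`, `4200`) are taken from the traceback above.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to (host, port) can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False
```

Calling `can_connect("apollo", 4200)` from a container that is not on the server's Docker network would be expected to return `False`, which matches the `Connection refused` error in the traceback.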
z
👍 yep that was the next recommendation
c
stellar, i'll look into a PR for the agent to get a little more customization for the remove behavior like Kube
thanks!