# ask-community
t
seeing something weird in prefect 2. our stack is AWS ECR + docker. we have steps to auth into ECR and then tag the image with the commit hash. we have about 20-30 deployments in our prefect.yaml, and around deployment #25 it stops using the cached docker build/push steps and tries to push the docker image again. this fails because there's already an image with that tag and the repo is immutable. anyone ever seen anything like this?
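for reference, the kind of shared build/push section being described looks roughly like this in prefect.yaml; the commit-hash step, registry URI, repo, region, pool name, and entrypoint below are placeholders/assumptions, not the actual config:

build:
- prefect.deployments.steps.run_shell_script:
    id: get-commit-hash
    script: git rev-parse --short HEAD
    stream_output: false
- prefect.deployments.steps.run_shell_script:
    # hypothetical ECR login; account id and region are placeholders
    id: ecr-auth
    script: aws ecr get-login-password --region us-east-1 | docker login --username AWS --password-stdin 123456789012.dkr.ecr.us-east-1.amazonaws.com
- prefect_docker.deployments.steps.build_docker_image:
    id: build-image
    requires: prefect-docker
    image_name: 123456789012.dkr.ecr.us-east-1.amazonaws.com/my-flows
    tag: "{{ get-commit-hash.stdout }}"
    dockerfile: Dockerfile

push:
- prefect_docker.deployments.steps.push_docker_image:
    requires: prefect-docker
    image_name: "{{ build-image.image_name }}"
    tag: "{{ build-image.tag }}"

deployments:
- name: healthcheck-1
  entrypoint: flows/healthcheck.py:healthcheck
  work_pool:
    name: docker-pool
    job_variables:
      image: "{{ build-image.image }}"
# ... ~30 more deployments, all referencing the same build-image output

since every deployment points at the same build-image output, the expectation is that the build/push steps run once and get reused (cached) for the rest of the deploy.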
n
hrm, I have not. so it's 20-30 of the same image? I can try to reproduce
t
yes, all using the same image. let me know if you'd like me to dm you the prefect.yaml nate
n
if you could send the build and push steps (which I assume are common to them all) that would be helpful
t
and it's at deployment #40 that it loses the cache, not #25.
n
doh, ok just tried 20 and it worked. I'll up it to 40
sure enough
╭────────────────────────────────────────────────────────────────────────────────────────────────────────╮
│ Deploying healthcheck-docker-test-36                                                                   │
╰────────────────────────────────────────────────────────────────────────────────────────────────────────╯
Running deployment build steps...
 > Running run_shell_script step...
 > Running run_shell_script step...
Authenticating with existing credentials...
the 37th misses cache somehow 🤔
oh wait. no mine is failing bc I'm authing over and over again, hold on
hm so if I stop running docker login (using dockerhub instead of ECR) this is working for me
» PREFECT_LOGGING_LEVEL=DEBUG prefect --no-prompt deploy --all --prefect-file many-of-the-same.yaml
using PREFECT_LOGGING_LEVEL=DEBUG bc it shows me some logs from prefect_docker that show whether I hit or miss the cache
t
interesting...
maybe it's the login to ecr that's timing out then... let me see if I can tweak it
on my side it looks like the docker steps are all cached until it gets to one of the steps copying files onto the container. still not sure what would be causing that... everything is using the same image.
Step 9/15 : RUN ...
 ---> Using cache
 ---> bd8ca778e7c6
Step 10/15 : COPY ...
 ---> 7c8b7b94b3ae
Step 11/15 : COPY ...
n
oh ok. sorry. so when I said cache I meant the prefect_docker step cache, not docker's internal layer cache
that makes sense to me that docker would bust the cache if you copy new files at some point, it would only reuse the cached layers above that and rebuild everything after it, right?
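to make the layer-cache point concrete, here's a hypothetical Dockerfile shaped like the output above (the base image and paths are made up): everything before the first COPY whose input files changed is reused from cache, and that COPY plus every layer after it gets rebuilt.

# hypothetical layout, loosely matching the build steps shown above
FROM python:3.11-slim
WORKDIR /opt/prefect

# dependency layers: reused from cache as long as requirements.txt is unchanged
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# the first COPY whose input files differ (e.g. stray __pycache__/*.pyc files
# that slipped past .dockerignore) rebuilds this layer and every layer below it
COPY flows/ ./flows/
COPY prefect.yaml .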
t
that's what I'm trying to think through. it misses the internal layer cache just on the deployments I most recently added. but they should all be using the same docker image with the files preloaded onto it. it's able to chug through other deployments that use 4-5 different flows without issue
n
do you wanna share your dockerfile if you can?
t
for sure, I'll dm it over
ok, a couple of updates on this. I moved the deployment that was causing issues to be second on the list and it's still generating the same issue. so I don't think it's a docker login timing issue or anything like that. something about that deployment is causing the internal docker cache to reset and I'm not sure what.
to close the loop on this, I ls'd all the files and found a number of leftover __pycache__/.pyc files. our .dockerignore pattern wasn't catching them properly, so I updated it and we're golden now!
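for anyone who hits the same thing: one common reason a pattern "isn't catching them" is that a bare __pycache__ entry in .dockerignore only matches at the root of the build context, so bytecode in subdirectories still lands in the context and busts the cache of the COPY layer. patterns along these lines (adjust for your layout) catch them everywhere:

# exclude Python bytecode anywhere in the build context, not just at the root
**/__pycache__
**/*.pyc
**/*.pyo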
n
nice!! glad you got it