Hi everyone! @Aaron Prescott and I are trying to set up our first flows on Prefect Cloud using an AWS EKS agent. Right now we are getting stuck when trying to specify the storage in AWS ECR. When I run
[2021-07-13 14:32:00+0100] INFO - prefect.Docker | Building the flow's Docker storage...
invalid reference format
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/hilaryroberts/.pyenv/versions/3.8.11/lib/python3.8/site-packages/prefect/storage/docker.py", line 303, in build
    self._build_image(push=push)
  File "/Users/hilaryroberts/.pyenv/versions/3.8.11/lib/python3.8/site-packages/prefect/storage/docker.py", line 369, in _build_image
    raise ValueError(
ValueError: Your docker image failed to build! Your flow might have failed one of its deployment health checks - please ensure that all necessary files and dependencies have been included.
It should be using the default image from the prefecthq dockerhub, so I wonder why it's failing to build. Anyone know what I might be doing wrong?
Kevin Kho
07/13/2021, 1:51 PM
Hey @Hilary Roberts, just clarifying, does this happen on flow registration or execution?
Hilary Roberts
07/13/2021, 1:51 PM
On registration
Kevin Kho
07/13/2021, 1:54 PM
You’re not providing any Dockerfile, right?
Kevin Kho
07/13/2021, 2:04 PM
I replicated this. Looking into it
Hilary Roberts
07/13/2021, 2:07 PM
No I'm not. Great thanks!
Kevin Kho
07/13/2021, 3:01 PM
This is working for me now. Stuff I had to do: remove https:// from the registry URL, log in to ECR from the CLI, and create an ECR repository whose name matches the image_name.
from prefect import task, Flow
from prefect.storage.docker import Docker

@task
def abc(x):
    return x

with Flow("ecs_test") as flow:
    abc(1)

# Registry URL has no https:// scheme; the ECR repository name must match image_name.
storage = Docker(registry_url="0298646XXXXX.dkr.ecr.us-east-2.amazonaws.com/", image_name="ecs_test")
flow.storage = storage
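The "remove https from the registry url" step can be sketched as a small helper. Docker image references cannot contain a URL scheme, which would explain the "invalid reference format" message above. This is only an illustration: `normalize_registry_url` and `image_reference` are hypothetical names, not part of Prefect, and the `latest` tag is an assumption made for the example.

```python
from urllib.parse import urlsplit


def normalize_registry_url(registry_url: str) -> str:
    """Strip any URL scheme and trailing slash from a registry URL.

    Hypothetical helper (not a Prefect API): Docker image references
    must start with the registry host, not with "https://".
    """
    url = registry_url.strip()
    if "://" in url:
        parts = urlsplit(url)
        # Keep the host plus any path prefix, drop the scheme.
        url = parts.netloc + parts.path
    return url.rstrip("/")


def image_reference(registry_url: str, image_name: str, tag: str = "latest") -> str:
    """Build a registry/image:tag reference; the tag default is an assumption."""
    return f"{normalize_registry_url(registry_url)}/{image_name}:{tag}"


if __name__ == "__main__":
    # The scheme is removed, so the result is a valid image reference.
    print(image_reference("https://0298646XXXXX.dkr.ecr.us-east-2.amazonaws.com/", "ecs_test"))
```

The normalized value could then be passed as `registry_url` to `Docker(...)`. The ECR login and repository creation still have to be done separately.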