Hi we are trying to use `register()` to push a flo...
# prefect-community
l
Hi, we are trying to use `register()` to push a flow to an ECR (Amazon) private registry, but we can't figure out how Prefect picks up the credentials to do it. We know that with the command-line `docker build` and `docker push`, you need to run a special `docker login` with the `aws ecr get-login...` command, but we don't know how to pass this on to Prefect for `register()`. We feel like we are missing a step.
👀 1
We find documentation in the DaskKubernetesEnvironment, about specifying a secret containing the docker credentials, but nothing at the registration step
this is a skeleton of our registration code:
```python
from prefect.environments.storage import Docker
from strdata.strdata_poc import strdata

strdata.storage = Docker(registry_url="our_empty_aws_ecr_repository",
                         dockerfile="our_custom_directory_path")
strdata.register(project_name="test")
```
d
Hi @Luis Muniz! Docker Storage uses the [Docker SDK for Python](https://docker-py.readthedocs.io/en/stable/index.html) to build the image and push to a registry. Make sure you have the Docker daemon running locally and you are configured to push to your desired container registry. Additionally make sure whichever platform Agent deploys the container also has permissions to pull from that same registry.
I’m not as familiar with ECR (we use GCP), but I know that for Google I need to have my local machine configured with my Google credentials.
If you can confirm you’re logged in before your Python session, see if you can export an appropriate environment variable for Python to pick up when calling `flow.register`.
I often have to export `GOOGLE_APPLICATION_CREDENTIALS` to my Python session when registering flows with Docker storage + Google Container Registry.
i
@Luis Muniz give me a couple minutes I will sanitize the script I use to push to ECR. But just to confirm you are not interested in this? https://docs.prefect.io/orchestration/execution/storage_options.html#non-docker-storage-for-containerized-environments
💯 1
d
Thanks for sharing @itay livni!
l
Hi thanks @itay livni, this is something I will be doing later, for now I want first to keep it simple and generate one image per flow
i
```python
from prefect.environments.storage import Docker

# ecr_url and docker_flpth are defined elsewhere in the script.
storage = Docker(
    registry_url=ecr_url,  # e.g. "your_account.dkr.ecr.mars-east-75.amazonaws.com/get-tsx-moc-ecr"
    python_dependencies=["pandas", "sqlalchemy", "psycopg2", "boto3", "humps", "requests", "yfinance"],
    dockerfile=docker_flpth,
    image_name="etl-moc-img",
    image_tag="latest",
)
```
I was looking for my old script but could not find it... But what errors are you getting? The other key thing is `ecr_repo_name = f"{ecr_url.replace('https://', '')}"`.
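The scheme-stripping step mentioned here can be sketched as a small helper. This is only an illustration, not part of Prefect: `strip_scheme` is a hypothetical name, and the example URL is the placeholder from the snippet above. Unlike a plain `replace('https://', '')`, it also handles an `http://` prefix.

```python
from urllib.parse import urlsplit


def strip_scheme(registry_url: str) -> str:
    """Return registry_url without a leading http:// or https:// scheme.

    Hypothetical helper for illustration; not part of Prefect or docker-py.
    """
    if "://" not in registry_url:
        # Already scheme-less: only drop a trailing slash, if any.
        return registry_url.rstrip("/")
    parts = urlsplit(registry_url)
    # Keep host and repository path, drop the scheme.
    return (parts.netloc + parts.path).rstrip("/")


ecr_url = "https://your_account.dkr.ecr.mars-east-75.amazonaws.com/get-tsx-moc-ecr"
print(strip_scheme(ecr_url))
```

The result could then be passed as `registry_url`, assuming the registry expects a bare `host/repository` value.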
👍 1
l
@Dylan I think your initial answer already tells us that we don't really need to do anything more: executing the get-login command should be enough for registration. We are running into an error and mistakenly thought that it was a credentials issue.
let me show you the error message we get
👍 1
```
requests.exceptions.HTTPError: 500 Server Error: Internal Server Error for url: http+docker://localhost/v1.40/build?t=http%3A%2F659990142216.dkr.ecr.eu-west-1.amazonaws.com%2Fstrdata-flow%2Fstrdata-poc%3A2020-07-02t14-43-38-253660-00-00&q=False&nocache=False&rm=False&forcerm=True&pull=False&dockerfile=.%2Ftmpj_kuw1de%2FDockerfile
docker.errors.APIError: 500 Server Error: Internal Server Error ("invalid reference format")
```
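The `t` query parameter in the failing URL is the image tag sent to the Docker daemon, and it is easier to read once percent-decoded. The sketch below uses only the standard library; `requested_tag` is a hypothetical helper name, and the URL is copied from the error above.

```python
from urllib.parse import parse_qs, urlsplit

# Build URL copied from the error message above.
FAILED_BUILD_URL = (
    "http+docker://localhost/v1.40/build"
    "?t=http%3A%2F659990142216.dkr.ecr.eu-west-1.amazonaws.com"
    "%2Fstrdata-flow%2Fstrdata-poc%3A2020-07-02t14-43-38-253660-00-00"
    "&q=False&nocache=False&rm=False&forcerm=True&pull=False"
    "&dockerfile=.%2Ftmpj_kuw1de%2FDockerfile"
)


def requested_tag(build_url: str) -> str:
    """Return the percent-decoded value of the `t` (tag) query parameter."""
    query = parse_qs(urlsplit(build_url).query)
    return query["t"][0]


tag = requested_tag(FAILED_BUILD_URL)
print(tag)
```

The decoded tag starts with `http:/`, which would be consistent with the `invalid reference format` message, since an image reference is not expected to carry a URL scheme.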
d
Hmmm
l
For the record, we tried a `docker pull` to make sure that our credentials work.
d
Thanks!
l
@itay livni thanks this will help no doubt when we get the deeper error fixed
d
Hey @Luis Muniz this looks like a bug to me, that registry url looks potentially malformed
Would you mind opening an issue?
l
Sure
d
Thanks! We’ll address this as soon as we can
i
@Luis Muniz @Dylan Trying to replicate but yes that url looks bad
d
In the meantime, you can build and push the image yourself and use one of the other storage options to collect your flow code
b
hello @Dylan @itay livni I'm investigating this issue alongside @Luis Muniz. I will be creating the issue in a few minutes.
thanks for all the help so far! Really appreciated
d
Of course!
b
So meanwhile, I guess the proposition would be to have an initial built base image and use S3 storage instead?
l
@bruno.corucho Yes, you can definitely do that (long term too if you want, which sounds like it may be your end goal). For right now, though, it is more or less a way to get around whatever bug is causing the malformed URL when Prefect builds your Docker image for you using the Python SDK.
i
@bruno.corucho I tried to replicate but could not. I did find my script, though, and it successfully registered to Cloud and pushed to ECR: https://gist.github.com/gryBox/40e6812fd623d5942fd8985b8c63df3a
@bruno.corucho Did you successfully push the image using the steps provided by AWS in the console, `View push commands`?
b
After some attempts, and after checking out @itay livni’s illustration, the issues were the following:
• We did not remove the http prefix from our registry_url (oops).
• Our custom Dockerfile was not properly configured (it was not correctly pointing to the root directory of the project).
• We did not know that the name we give to the flow would affect our attempt to register it to our ECR repository.
Also, the error message was not very helpful: "InterruptedError: name unknown: The repository with name 'strdata-poc' does not exist in the registry with id '659990142216'". It took us a while to realise that this "strdata-poc" label was actually referencing our flow, which was named "STRDATA POC". On top of that, we also didn't know that Prefect would append this label to our registry_url (our_ecr_url vs our_ecr_url/strdata-poc).
Correct me if I drew any wrong conclusions here, but I think that was it. Once again, you guys were truly helpful. Thank you SO much for assisting us this fast!
🚀 1
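To make the naming behaviour described in this thread concrete, here is a minimal sketch of how a flow name such as "STRDATA POC" could end up as the repository suffix `strdata-poc`. The helpers `approximate_slug` and `expected_image_name` are hypothetical and only approximate what was observed here; Prefect's actual slug rules may differ.

```python
import re


def approximate_slug(flow_name: str) -> str:
    """Lowercase the name and replace runs of non-alphanumerics with '-'.

    An approximation of the observed behaviour, not Prefect's own code.
    """
    return re.sub(r"[^a-z0-9]+", "-", flow_name.lower()).strip("-")


def expected_image_name(registry_url: str, flow_name: str) -> str:
    """Join the registry URL and the slugged flow name with a single '/'."""
    return f"{registry_url.rstrip('/')}/{approximate_slug(flow_name)}"


# "our_ecr_url" is the placeholder used in the message above.
print(expected_image_name("our_ecr_url", "STRDATA POC"))
```

Under that assumption, the ECR repository that has to exist before registering is the one named by the combined path, not just the value passed as `registry_url`.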