Prefect Community #ask-marvin: any idea on this error `pull image manifest has been retried 5 time(s)`

Kyle McChesney

12/11/2024, 11:09 PM

@Marvin - any idea on this error:

pull image manifest has been retried 5 time(s): failed to resolve ref docker.io/prefecthq/prefect:3.1.6-python3.10: failed to do request: Head "https://registry-1.docker.io/v2/prefecthq/prefect/manifests/3.1.6-python3.10": dial tcp 54.198.86.24:443: i/o timeout

Marvin

12/11/2024, 11:10 PM

It looks like there was an issue accessing the GitHub repository to check for similar issues related to your error. However, based on my prior experience, the error you encountered is typically related to network connectivity issues when trying to pull a Docker image. Here are a few steps you can consider to resolve the issue:
1. Check Network Connection: Make sure your internet connection is stable. The timeout error indicates that the request to Docker's registry isn't completing, likely due to network issues.
2. Retry the Pull: Sometimes, this issue resolves by simply trying the command again after some time.
3. Firewall and Security Groups: Ensure that your firewall or any network security groups allow outbound traffic to Docker's registry.
4. Docker Daemon Restart: Restart your Docker daemon to clear any potential stale network connections.
5. DNS Configuration: Consider checking your DNS configuration, as sometimes the default DNS servers may have difficulty resolving docker.io.
6. Try a Different Network: If possible, switch to a different network or connect directly to the internet without going through a corporate proxy or VPN that might be blocking Docker registry traffic.
If the problem persists after these steps, it might be an issue on Docker Hub's side, and waiting or reaching out to their support could be helpful. Meanwhile, feel free to reach out if there's anything else specific you want help with!

Nate

12/11/2024, 11:12 PM

hmm, I wonder if it was a network timeout? fwiw

» docker run -p 4200:4200 --rm prefecthq/prefect:3.1.6-python3.10 -- prefect server start --host 0.0.0.0
Unable to find image 'prefecthq/prefect:3.1.6-python3.10' locally
3.1.6-python3.10: Pulling from prefecthq/prefect
bb3f2b52e6af: Already exists
....
a92eb3f0465d: Pull complete
Digest: sha256:c563cde3c0c1b982f5623847136a32c6a3a5667fa5ec9f175965d189f194c797
Status: Downloaded newer image for prefecthq/prefect:3.1.6-python3.10

 ___ ___ ___ ___ ___ ___ _____
| _ \ _ \ __| __| __/ __|_   _|
|  _/   / _|| _|| _| (__  | |
|_| |_|_\___|_| |___\___| |_|

Configure Prefect to communicate with the server with:

    prefect config set PREFECT_API_URL=http://0.0.0.0:4200/api

View the API reference documentation at http://0.0.0.0:4200/docs

Check out the dashboard at http://0.0.0.0:4200

Kyle McChesney

12/11/2024, 11:13 PM

It's happening on flow submission to an ECS worker. The worker ECS task has "all outbound" access in its security group.

Kyle McChesney

12/11/2024, 11:16 PM

wait never mind, seems like it was transient

Kyle McChesney

12/11/2024, 11:17 PM

nevermind again, it failed again, same issue.

Nate

12/11/2024, 11:20 PM

hmm i do suspect that one way or another it has to do with the network of the place trying to pull

Kyle McChesney

12/11/2024, 11:25 PM

I think the worker task is not forwarding / setting the SG onto the flow task. Is there a doc/reference for job variables for how to map that? `"security_groups": []` or `"securityGroups": []`?

Nate

12/11/2024, 11:26 PM

https://docs.prefect.io/integrations/prefect-aws/ecs_guide#3-create-an-ecs-service-to-host-your-worker

Nate

12/11/2024, 11:26 PM

can't remember so well but I think the best information would be there somewhere

Kyle McChesney

12/11/2024, 11:28 PM

This is for the worker though, the pull is happening / trying to happen in ECS for the flow run task.

Kyle McChesney

12/11/2024, 11:28 PM

probably the same syntax though, i see

Nate

12/11/2024, 11:29 PM

but that’s the worker's purview. it pulls the image and spins up the runtime infra

Nate

12/11/2024, 11:36 PM

i’m not an ecs expert so i could be wrong but generally that’s how workers work. i’ll tap in @Kevin Grismore here to fact check bc he knows more than me about this

Kyle McChesney

12/11/2024, 11:37 PM

I appreciate the help, I also emailed poc@prefect.io cause we are doing a proof of concept. Don't want to bother too many folks.

Kevin Grismore

12/11/2024, 11:38 PM

yeah, the worker acquires the config from the flow run/deployment/work pool, then sends a runTask request to ecs

Kevin Grismore

12/11/2024, 11:38 PM

so this is work pool config right?

Kyle McChesney

12/11/2024, 11:40 PM

yea, I think I just don't have everything configured right in the job variables (for the pool or deployment or flow run)

Kyle McChesney

12/11/2024, 11:41 PM

I have the following in job variables

{
  "cluster": "$cluster",
  "subnets": [
    "subnet-#############",
    "subnet-#############",
    "subnet-#############"
  ],
  "securityGroups": [
    "sg-###########"
  ]
}

Kyle McChesney

12/11/2024, 11:42 PM

does not seem to be picking them up though. The flow run task ends up in a default SG and seemingly a random other subnet

Kevin Grismore

12/12/2024, 12:02 AM

yeah, do `security_groups`

Kevin Grismore

12/12/2024, 12:02 AM

wait maybe I'm wrong

Kevin Grismore

12/12/2024, 12:03 AM

let me check one of mine

Kyle McChesney

12/12/2024, 12:03 AM

Does it also maybe need to be nested under `awsvpcConfiguration`?

Kevin Grismore

12/12/2024, 12:05 AM

here's a reference https://github.com/PrefectHQ/prefect/blob/main/src/integrations/prefect-aws/prefect_aws/workers/ecs_worker.py#L1449

Kevin Grismore

12/12/2024, 12:06 AM

but if you provide a custom one, we drop it in for you like this https://github.com/PrefectHQ/prefect/blob/main/src/integrations/prefect-aws/prefect_aws/workers/ecs_worker.py#L1499

Kevin Grismore

12/12/2024, 12:07 AM

ahaaa yeah you can't just have them in job_variables like that

Kevin Grismore

12/12/2024, 12:07 AM

you need to match each section to the key in the work pool template

Kevin Grismore

12/12/2024, 12:08 AM

"network_configuration": {
  "subnets": [
    "subnet-#############",
    "subnet-#############",
    "subnet-#############"
  ],
  "securityGroups": [
    "sg-###########"
  ]
}

Kyle McChesney

12/12/2024, 12:11 AM

kk, got an error to include a vpc id if using a custom config, but that's a good sign
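
A minimal Python sketch of the job variable shape discussed above, with the subnets and security groups nested under `network_configuration` instead of sitting at the top level. The `vpc_id` key name is an assumption inferred from the error Kyle reports, the `build_job_variables` helper is hypothetical, and all IDs are placeholders.

```python
import json


def build_job_variables(vpc_id, subnets, security_groups):
    """Nest subnet and security group settings under network_configuration.

    Hypothetical helper for illustration; key names follow the thread above,
    and "vpc_id" is an assumption based on the reported error.
    """
    return {
        "vpc_id": vpc_id,
        "network_configuration": {
            "subnets": list(subnets),
            "securityGroups": list(security_groups),
        },
    }


# Placeholder IDs, not real AWS resources.
job_variables = build_job_variables(
    vpc_id="vpc-EXAMPLE",
    subnets=["subnet-EXAMPLE-A", "subnet-EXAMPLE-B"],
    security_groups=["sg-EXAMPLE"],
)
print(json.dumps(job_variables, indent=2))
```

The printed JSON could be pasted into the job variables of a deployment or flow run, after swapping in real IDs.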

Kyle McChesney

12/12/2024, 12:14 AM

okay it started and ended up in the right subnet, etc. It crashed immediately, but that's progress. I have to hop off for now, thanks a bunch @Kevin Grismore @Nate and @Marvin. For tomorrow, if possible could someone point me to the docs for `the work pool template` or whatever it's called. It's been discussed a lot in the docs but I don't see a reference for it upon a quick review of things. I need to configure logs next. 👋

Kevin Grismore

12/12/2024, 12:15 AM

prefect UI -> work pools -> select a work pool -> edit -> advanced tab

Kevin Grismore

12/12/2024, 12:15 AM

the `variables` section has the names, descriptions, and data types of all the `job_variables`. The `job_configuration` section has where those variables slot in to form the actual thing that gets built by the worker
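
A rough Python model of that relationship may help. It is a simplified sketch, not Prefect's actual implementation: the template is a trimmed stand-in, the `resolve` and `build_job_configuration` helpers are hypothetical, and the real worker may treat unknown or missing variables differently. It assumes placeholders of the form `{{ name }}` in `job_configuration`.

```python
import re

# Matches a string that consists of exactly one "{{ name }}" placeholder.
PLACEHOLDER = re.compile(r"^\{\{\s*(\w+)\s*\}\}$")


def resolve(node, values):
    """Recursively swap "{{ name }}" placeholders for entries in values."""
    if isinstance(node, dict):
        return {key: resolve(child, values) for key, child in node.items()}
    if isinstance(node, list):
        return [resolve(child, values) for child in node]
    if isinstance(node, str):
        match = PLACEHOLDER.match(node)
        if match:
            # Variables with no value resolve to None in this sketch.
            return values.get(match.group(1))
    return node


def build_job_configuration(template, job_variables):
    """Layer job_variables over the template's defaults, then fill the slots."""
    properties = template["variables"]["properties"]
    defaults = {
        name: spec["default"] for name, spec in properties.items() if "default" in spec
    }
    return resolve(template["job_configuration"], {**defaults, **job_variables})


# Trimmed, illustrative stand-in for a work pool template.
template = {
    "variables": {
        "properties": {
            "cluster": {"type": "string", "default": "default-cluster"},
            "network_configuration": {"type": "object"},
        }
    },
    "job_configuration": {
        "cluster": "{{ cluster }}",
        "network_configuration": "{{ network_configuration }}",
    },
}

# A top-level key with no matching placeholder never reaches the job configuration.
flat = build_job_configuration(template, {"securityGroups": ["sg-EXAMPLE"]})

# The same setting nested under a declared variable is slotted in.
nested = build_job_configuration(
    template, {"network_configuration": {"securityGroups": ["sg-EXAMPLE"]}}
)
print(flat)
print(nested)
```

In this model, only names that appear as placeholders in `job_configuration` have any effect, which matches the advice to key each section to the work pool template.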

Kyle McChesney

12/12/2024, 5:58 PM

Morning @Kevin Grismore - thanks for pointing all this out, this is exactly what I need. One follow-up question: if I want to customize this at the work pool level, and said work pool is managed via Terraform, I assume I need to do the following:
• Copy-paste the whole JSON from the advanced tab into a file
• Update the values or defaults in `variables` for the things I want set
• Set it as the `base_job_template`
Is that right? There is no way to do "use the default template for the work pool + my overrides" if using Terraform/API?
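
The second step in that list could be scripted instead of edited by hand. The following is a sketch only, assuming the copied JSON has a `variables.properties` mapping where each variable may carry a `default`. The `with_default_overrides` helper is hypothetical, the template is a trimmed stand-in for the JSON from the advanced tab, and the IDs are placeholders. It does not settle whether Terraform or the API can merge overrides onto the default template.

```python
import copy
import json


def with_default_overrides(base_job_template, overrides):
    """Return a copy of the template with new defaults set on named variables."""
    template = copy.deepcopy(base_job_template)
    properties = template["variables"]["properties"]
    for name, value in overrides.items():
        if name not in properties:
            # Fail loudly rather than silently adding an unused variable.
            raise KeyError(f"{name!r} is not a variable in this template")
        properties[name]["default"] = value
    return template


# Trimmed stand-in for the JSON copied from the work pool's advanced tab.
copied_template = {
    "job_configuration": {
        "vpc_id": "{{ vpc_id }}",
        "network_configuration": "{{ network_configuration }}",
    },
    "variables": {
        "type": "object",
        "properties": {
            "vpc_id": {"type": "string"},
            "network_configuration": {"type": "object"},
        },
    },
}

updated = with_default_overrides(
    copied_template,
    {
        "vpc_id": "vpc-EXAMPLE",
        "network_configuration": {
            "subnets": ["subnet-EXAMPLE-A"],
            "securityGroups": ["sg-EXAMPLE"],
        },
    },
)

# This JSON string is what would be saved to a file and supplied as base_job_template.
print(json.dumps(updated, indent=2))
```

Working on a deep copy keeps the copied template untouched, so the same source file can feed several work pools with different defaults.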
