<@ULVA73B9P> - any idea on this error: `pull image...
# ask-marvin
k
@Marvin - any idea on this error:
pull image manifest has been retried 5 time(s): failed to resolve ref docker.io/prefecthq/prefect:3.1.6-python3.10: failed to do request: Head "https://registry-1.docker.io/v2/prefecthq/prefect/manifests/3.1.6-python3.10": dial tcp 54.198.86.24:443: i/o timeout
?
m
It looks like there was an issue accessing the GitHub repository to check for similar issues related to your error. However, based on my prior experience, the error you encountered is typically related to network connectivity issues when trying to pull a Docker image. Here are a few steps you can consider to resolve the issue:
1. Check Network Connection: Make sure your internet connection is stable. The timeout error indicates that the request to Docker's registry isn't completing, likely due to network issues.
2. Retry the Pull: Sometimes, this issue resolves by simply trying the command again after some time.
3. Firewall and Security Groups: Ensure that your firewall or any network security groups allow outbound traffic to Docker's registry.
4. Docker Daemon Restart: Restart your Docker daemon to clear any potential stale network connections.
5. DNS Configuration: Consider checking your DNS configuration, as sometimes the default DNS servers may have difficulty resolving docker.io.
6. Try a Different Network: If possible, switch to a different network or connect directly to the internet without going through a corporate proxy or VPN that might be blocking Docker registry traffic.
If the problem persists after these steps, it might be an issue on the Docker Hub's side, and waiting or reaching out to their support could be helpful. Meanwhile, feel free to reach out if there's anything else specific you want help with!
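A minimal sketch of the connectivity check behind steps 1 and 3, assuming Python is available in the environment doing the pull. The function name `can_reach` is hypothetical, and the check only says whether a TCP connection opens, nothing about registry authentication or rate limits.

```python
import socket


def can_reach(host: str, port: int = 443, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port opens within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts alike.
        return False
```

Calling `can_reach("registry-1.docker.io")` only says something useful when it runs from the same subnet and security group as the task that fails to pull; a successful check from a laptop does not rule out a network problem elsewhere.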
n
hmm, I wonder if it was a network timeout? fwiw
» docker run -p 4200:4200 --rm prefecthq/prefect:3.1.6-python3.10 -- prefect server start --host 0.0.0.0
Unable to find image 'prefecthq/prefect:3.1.6-python3.10' locally
3.1.6-python3.10: Pulling from prefecthq/prefect
bb3f2b52e6af: Already exists
....
a92eb3f0465d: Pull complete
Digest: sha256:c563cde3c0c1b982f5623847136a32c6a3a5667fa5ec9f175965d189f194c797
Status: Downloaded newer image for prefecthq/prefect:3.1.6-python3.10

 ___ ___ ___ ___ ___ ___ _____
| _ \ _ \ __| __| __/ __|_   _|
|  _/   / _|| _|| _| (__  | |
|_| |_|_\___|_| |___\___| |_|

Configure Prefect to communicate with the server with:

    prefect config set PREFECT_API_URL=http://0.0.0.0:4200/api

View the API reference documentation at http://0.0.0.0:4200/docs

Check out the dashboard at http://0.0.0.0:4200
k
It's happening on flow submission to an ECS worker. The worker ECS task has "all outbound" access in its security group.
wait never mind, seems like it was transient
nevermind again, it failed again, same issue.
n
hmm i do suspect that one way or another it has to do with the network of the place trying to pull
k
I think the worker task is not forwarding / setting the SG onto the flow task. Is there a doc/reference for job variables for how to map that? "security_groups": [] or "securityGroups": []?
can't remember so well, but I think the best information would be there somewhere
k
This is for the worker though, the pull is happening / trying to happen in ECS for the flow run task.
probably the same syntax though, i see
n
but that’s the worker’s purview. it pulls the image and spins up the runtime infra
i’m not an ecs expert so i could be wrong but generally that’s how workers work. i’ll tap in @Kevin Grismore here to fact check bc he knows more than me about this
k
I appreciate the help, I also emailed poc@prefect.io cause we are doing a proof of concept. Don't want to bother too many folks.
k
yeah, the worker acquires the config from the flow run/deployment/work pool, then sends a runTask request to ecs
so this is work pool config right?
k
yea, I think I just don't have everything configured right in the job variables (for the pool or deployment or flow run)
I have the following in job variables
{
  "cluster": "$cluster",
  "subnets": [
    "subnet-#############",
    "subnet-#############",
    "subnet-#############"
  ],
  "securityGroups": [
    "sg-###########"
  ]
}
does not seem to be picking them up though. The flow run task ends up in a default SG and seemingly a random other subnet
k
yeah, do security_groups
wait maybe I'm wrong
let me check one of mine
k
Does it also maybe need to be nested under awsvpcConfiguration?
ahaaa yeah you can't just have them in job_variables like that
you need to match each section to the key in the work pool template
"network_configuration": {
  "subnets": [
    "subnet-#############",
    "subnet-#############",
    "subnet-#############"
  ],
  "securityGroups": [
    "sg-###########"
  ]
}
k
kk, got an error saying to include a VPC ID if using a custom config, but that's a good sign
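A minimal sketch of the corrected job variables, built as a Python dict and printed as JSON. It nests `subnets` and `securityGroups` under `network_configuration` as suggested above. The `vpc_id` key is an assumption inferred from the error message about including a VPC ID; the exact variable name should be checked against the work pool's template. All IDs are placeholders and `build_job_variables` is a hypothetical helper.

```python
import json


def build_job_variables(cluster, vpc_id, subnets, security_groups):
    """Assemble job variables with networking nested under network_configuration."""
    return {
        "cluster": cluster,
        # Assumed key name, based on the "include a vpc id" error.
        "vpc_id": vpc_id,
        "network_configuration": {
            "subnets": list(subnets),
            "securityGroups": list(security_groups),
        },
    }


job_variables = build_job_variables(
    cluster="my-cluster",  # placeholder
    vpc_id="vpc-EXAMPLE",  # placeholder
    subnets=["subnet-EXAMPLE-A", "subnet-EXAMPLE-B"],  # placeholders
    security_groups=["sg-EXAMPLE"],  # placeholder
)
print(json.dumps(job_variables, indent=2))
```

The printed JSON can be pasted wherever job variables are set (work pool, deployment, or flow run).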
okay it started and ended up in the right subnet, etc. It crashed immediately, but that's progress. I have to hop off for now, thanks a bunch @Kevin Grismore @Nate and @Marvin. For tomorrow, if possible could someone point me to the docs for the work pool template or whatever it's called. It's been discussed a lot in the docs but I don't see a reference for it upon a quick review of things. I need to configure logs next. 👋
k
prefect UI -> work pools -> select a work pool -> edit -> advanced tab
the variables section has the names, descriptions, and data types of all the job_variables. The job_configuration section has where those variables slot in to form the actual thing that gets built by the worker
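A toy illustration of how the two sections relate, assuming the template uses `{{ name }}` placeholders in job_configuration and per-variable defaults in variables. This is a simplified sketch, not Prefect's actual resolution code; `resolve` and `toy_template` are hypothetical names.

```python
import re

# Matches a string that consists of exactly one {{ name }} placeholder.
PLACEHOLDER = re.compile(r"^\{\{\s*(\w+)\s*\}\}$")


def resolve(template, job_variables):
    """Fill job_configuration placeholders from defaults, then from overrides."""
    props = template["variables"]["properties"]
    values = {name: spec.get("default") for name, spec in props.items()}
    values.update(job_variables)  # job variables win over template defaults

    def fill(node):
        if isinstance(node, dict):
            return {key: fill(value) for key, value in node.items()}
        if isinstance(node, list):
            return [fill(value) for value in node]
        if isinstance(node, str):
            match = PLACEHOLDER.match(node)
            if match:
                return values.get(match.group(1))
        return node

    return fill(template["job_configuration"])


toy_template = {
    "variables": {
        "type": "object",
        "properties": {
            "cluster": {"type": "string", "default": "default-cluster"},
            "network_configuration": {"type": "object", "default": None},
        },
    },
    "job_configuration": {
        "cluster": "{{ cluster }}",
        "network_configuration": "{{ network_configuration }}",
    },
}

resolved = resolve(
    toy_template,
    {"network_configuration": {"subnets": ["subnet-EXAMPLE"]}},
)
print(resolved)
```

In this toy model, a key that no placeholder refers to (a top-level `subnets`, say) has no effect on the result, which would be consistent with the values not being picked up earlier.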
k
Morning @Kevin Grismore - thanks for pointing all this out, this is exactly what I need. One follow up question, if I want to customize this at the work pool level, and said work pool is managed via terraform, I assume I need to do the following:
• Copy paste the whole JSON from the advanced tab into a file
• Update the values or defaults in variables for the things I want set
• Set it as the base_job_template
Is that right? There is no way to do "use the default template for the work pool + my overrides" if using terraform/API?
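One way to approximate "default template + my overrides" outside Terraform is to script the copy-and-edit steps: load the exported default template, overwrite only the defaults you care about, and write the result to the file that base_job_template points at. A minimal sketch, assuming the template has the variables → properties → default layout; `with_default_overrides` is a hypothetical helper and the template below is a cut-down stand-in for the real export.

```python
import copy
import json


def with_default_overrides(default_template, overrides):
    """Return a copy of the template with selected variable defaults replaced."""
    template = copy.deepcopy(default_template)
    props = template["variables"]["properties"]
    for name, value in overrides.items():
        if name not in props:
            raise KeyError(f"{name!r} is not a variable in this template")
        props[name]["default"] = value
    return template


# Cut-down stand-in for the JSON copied from the advanced tab.
default_template = {
    "variables": {
        "type": "object",
        "properties": {
            "cluster": {"type": "string"},
            "vpc_id": {"type": "string"},
        },
    },
    "job_configuration": {"cluster": "{{ cluster }}", "vpc_id": "{{ vpc_id }}"},
}

custom_template = with_default_overrides(
    default_template,
    {"cluster": "my-cluster"},  # placeholder value
)
print(json.dumps(custom_template, indent=2))
```

The printed JSON is what would be saved and referenced as base_job_template. The Prefect CLI may also be able to export the default template directly (I believe via `prefect work-pool get-default-base-job-template`), but treat that command as an assumption to verify against your installed version.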