# ask-marvin
@Marvin I am able to deploy my script to prefect, but when it runs in coiled, I get errors. My coiled function is set to run using a container (example below) and my deployment is using another container (example below). The image is deployed with --platform linux/amd64 and works when testing locally using software created using the same image. The error I'm getting is "Failed pulling software environment image, error code 1".

# coiled function example
@task
@coiled.function(
    name="prefect_jaxa_hourly_sst_median",
    container="bathymaps/prefect-coiled:v2.0",
    #software='prefect-coiled-v2',
    region="ap-southeast-2",
    memory="64 GiB",
    n_workers=2,
    idle_timeout="10m",
    environ={"CPL_VSIL_USE_TEMP_FILE_FOR_RANDOM_WRITE": "YES"},
)

# deploy example
if __name__ == "__main__":
    copernicus_hourly_sst_so_ssh_jaxa_flow.deploy(
        name="copernicus_hourly_sst_so_ssh_jaxa_flow",
        work_pool_name="bathymaps-coiled-pool",
        image="bathymaps/prefect-coiled:v1.0",
        build=False,
        cron="0 * * * *",
    )
thought for 69.3 seconds
Thanks for the details — this sounds like a Coiled-side image pull failure rather than a Prefect deployment issue.

What's happening
- Your Prefect deployment starts fine (Docker image v1.0), then your task calls @coiled.function(...), which tells Coiled to launch workers using a different image (v2.0).
- The error "Failed pulling software environment image, error code 1" is Coiled failing to pull the container="bathymaps/prefect-coiled:v2.0" image in the target region (ap-southeast-2).

Common causes and quick checks
1) Tag mismatch
- You're using v1.0 for the Prefect deployment and v2.0 for Coiled. If v2.0 wasn't pushed or is private, Coiled can't pull it.
- Recommendation: either align both to the same, known-good tag or verify v2.0 exists and is accessible.
2) Image accessibility and registry auth
- Is bathymaps/prefect-coiled:v2.0 public? If private, Coiled needs registry credentials configured for your account.
- Try pulling from a Linux/amd64 host outside your network to verify:
Copy code
docker pull --platform linux/amd64 bathymaps/prefect-coiled:v2.0
If that fails or prompts for auth, configure Coiled with the appropriate registry credentials (Docker Hub/ECR/GCR) in your Coiled account.
3) Architecture/manifest
- Even if you built with --platform linux/amd64, confirm the image manifest includes amd64 and that there isn't a stray arm64-only tag.
Copy code
docker buildx imagetools inspect bathymaps/prefect-coiled:v2.0
Ensure linux/amd64 is listed.
4) Region and registry rate limits
- Coiled will pull the image in ap-southeast-2. If you're hitting rate limits (Docker Hub) or experiencing slow pulls, consider hosting the image in a registry local to that region (e.g., AWS ECR in ap-southeast-2) and point container= to that fully-qualified image.
5) Using both software/container
- You commented out software=... (good). Make sure you never pass both at once — only one should be active.

Minimal repro to isolate Coiled from Prefect
Run this locally (outside Prefect) to confirm Coiled can pull and run your container in ap-southeast-2:
Copy code
import coiled

@coiled.function(
    name="test-pull",
    container="bathymaps/prefect-coiled:v2.0",
    region="ap-southeast-2",
    n_workers=1,
)
def ping():
    import platform, sys
    return {"machine": platform.machine(), "python": sys.version}

print(ping())
If this fails with the same error, it confirms an image access/registry/arch issue on the Coiled side.

Recommendations to proceed
- For consistency, align the images (either use v1.0 everywhere or upgrade your Prefect deployment to v2.0):
Copy code
copernicus_hourly_sst_so_ssh_jaxa_flow.deploy(
    name="copernicus_hourly_sst_so_ssh_jaxa_flow",
    work_pool_name="bathymaps-coiled-pool",
    image="bathymaps/prefect-coiled:v2.0",
    build=False,
    cron="0 * * * *",
)
- Ensure the Coiled image (the one in container=) is public or that Coiled has valid credentials for your registry.
- If possible, host the image in ECR in ap-southeast-2 and reference that ECR URL in container=... to avoid Docker Hub rate limits and speed up pulls (see the sketch at the end of this message).

Could you share:
- Whether bathymaps/prefect-coiled:v2.0 is public or private and which registry it's on
- The exact Coiled log snippet around the failure (including any "denied"/"not found" lines)
- Confirmation that docker pull --platform linux/amd64 bathymaps/prefect-coiled:v2.0 works from a Linux amd64 machine

With that info, I can help pinpoint whether it's an auth, tag, or architecture issue.
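For reference, a fully-qualified ECR reference in the Coiled decorator would look roughly like the sketch below; the account ID and repository name are placeholders, and it assumes the image has already been pushed to that ECR repository in ap-southeast-2.
Copy code
@task
@coiled.function(
    name="prefect_jaxa_hourly_sst_median",
    # hypothetical ECR repository; substitute your AWS account ID and repo name
    container="123456789012.dkr.ecr.ap-southeast-2.amazonaws.com/prefect-coiled:v2.0",
    region="ap-southeast-2",
    memory="64 GiB",
    n_workers=2,
    idle_timeout="10m",
)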
@Marvin I did the ping test and got this

(prefect-coiled) thomascobban@Thomass-MBP prefect-coiled % /Users/thomascobban/dev/uv-projects/prefect-coiled/.venv/bin/python /Users/thomascobban/dev/uv-projects/prefect-coiled/test.py
[2025-09-08 14:33:07,897][INFO ][coiled] Creating software environment
[2025-09-08 14:33:08,218][INFO ][coiled] Software environment created
[2025-09-08 14:33:10,017][INFO ][coiled] Creating Cluster (name: test-pull, https://cloud.coiled.io/clusters/1137999 ). This usually takes 1-2 minutes...
{'machine': 'x86_64', 'python': '3.12.11 (main, Aug 12 2025, 23:11:07) [GCC 12.2.0]'}
(prefect-coiled) thomascobban@Thomass-MBP prefect-coiled %
thought for 228.1 seconds
Great — that ping test confirms Coiled can pull and run your container in ap-southeast-2. So the failure during the Prefect run is about differences between your Prefect execution and the minimal test.

Most likely causes that fit your symptoms
- Tag drift: Prefect runs your flow in v1.0 while Coiled workers use v2.0. If v1.0 and v2.0 were built differently (or you have an old decorator in the deployed code), Coiled may be trying to pull a tag that doesn't exist or is private.
- Coiled client version mismatch: The Prefect deployment image (v1.0) may include an older coiled client that behaves differently than your local/test setup (where it worked).
- Registry rate-limits/timeouts: Your failing runs launch with n_workers=2; the ping test used 1. Parallel pulls can hit Docker Hub rate limits or intermittent timeouts.
- Env drift: The code that actually ran in Prefect may reference a different container= value than the code you tested locally.

Quick fixes to try
1) Align your images to a single, known-good tag (e.g., v2.0) for both the Prefect deployment and the Coiled container:
Copy code
# Coiled task
@coiled.function(
    name="prefect_jaxa_hourly_sst_median",
    container="bathymaps/prefect-coiled:v2.0",
    region="ap-southeast-2",
    memory="64 GiB",
    n_workers=2,
    idle_timeout="10m",
    environ={"CPL_VSIL_USE_TEMP_FILE_FOR_RANDOM_WRITE": "YES"},
)

# Prefect deploy
copernicus_hourly_sst_so_ssh_jaxa_flow.deploy(
    name="copernicus_hourly_sst_so_ssh_jaxa_flow",
    work_pool_name="bathymaps-coiled-pool",
    image="bathymaps/prefect-coiled:v2.0",
    build=False,
    cron="0 * * * *",
)
2) Rule out rate limits
- Temporarily set n_workers=1 in the Prefect run and try again. If that works but 2 fails, it's likely registry rate-limiting.
- If you hit limits, host the image in ECR in ap-southeast-2 (or another registry close to the region) and set container= to the fully-qualified ECR URL. This avoids Docker Hub limits and speeds up pulls.
3) Ensure the Prefect image has an up-to-date Coiled client
- Make sure both images include the same coiled version:
Copy code
python -c "import coiled, sys; print('coiled:', coiled.__version__, 'python:', sys.version)"
- If your v1.0 image is older, update and re-push or switch your deployment to v2.0 as above.
4) Verify the exact image being pulled in the failing run
- From the Coiled cluster page linked in the failing Prefect run, copy the lines around the pull failure. We want to see the precise image URL/tag and whether it says "not found", "denied", or rate-limited.

If you can share:
- The Coiled log snippet from the failing Prefect run (especially the line with the image name and the "denied/not found" text)
- Whether the images are public or private
- The work pool type for "bathymaps-coiled-pool" (Docker, Process, etc.)
…we can pinpoint whether this is tag drift, auth, or rate limits.

References
- Prefect + Coiled integration guide
- Coiled: Using Docker images
@Marvin If I run my flow like the below

if __name__ == "__main__":
    copernicus_hourly_sst_so_ssh_jaxa_flow()

it works, but if I run it like this via a deployment, it doesn't work

if __name__ == "__main__":
    copernicus_hourly_sst_so_ssh_jaxa_flow.deploy(
        name="copernicus_hourly_sst_so_ssh_jaxa_flow",
        work_pool_name="bathymaps-coiled-pool",
        image="bathymaps/prefect-coiled:v2.0",
        build=False,
        cron="0 * * * *",
    )
thought for 92.6 seconds
Thanks — that narrows it down nicely.

Why it works locally but not via deployment
- When you run the flow directly, it uses your local Coiled config/credentials (token, defaults, registry auth) and your local coiled version — and we know your container pulls fine in ap-southeast-2.
- When you run via a Prefect deployment, the code executes inside your work pool's worker container. That environment likely differs:
  - Different or missing Coiled credentials/config (COILED_API_TOKEN, account defaults, registry auth)
  - A different coiled client version than your local
  - Potentially different code or decorator values than what you just tested (if the deployment wasn't updated)
Given your ping test succeeded, this is almost certainly about the worker environment, not the image tag itself.

Concrete fixes to try
1) Ensure the worker has Coiled credentials and defaults
- In the Prefect UI, open your work pool "bathymaps-coiled-pool" and add environment variables:
  - COILED_API_TOKEN=<your token>
  - COILED_DEFAULT_REGION=ap-southeast-2
  - (If you use a non-default account/org) COILED_ACCOUNT=<your-account>
- Alternatively, pass env at deploy time via job variables (works for Docker/Process pools):
Copy code
copernicus_hourly_sst_so_ssh_jaxa_flow.deploy(
    name="copernicus_hourly_sst_so_ssh_jaxa_flow",
    work_pool_name="bathymaps-coiled-pool",
    image="bathymaps/prefect-coiled:v2.0",
    build=False,
    cron="0 * * * *",
    job_variables={"env": {
        "COILED_API_TOKEN": "...",
        "COILED_DEFAULT_REGION": "ap-southeast-2"
    }},
)
2) Align versions and config between local and deployment
- Make sure the deployment image has the same coiled version as your local:
Copy code
python -c "import coiled, sys; print('coiled:', coiled.__version__, 'python:', sys.version)"
- Keep both the Prefect deployment image and the Coiled container= tag the same (you already switched to v2.0 — good).
3) Rule out registry rate limits and concurrency
- Temporarily set n_workers=1 in the @coiled.function. If that works but 2 fails, you're likely hitting registry rate limits. Hosting the image in ECR in ap-southeast-2 and using its full URL in container= helps a lot.
4) Verify what the worker is actually doing
- Add a tiny debug task at the start of your flow to inspect the runtime:
Copy code
from prefect import task, get_run_logger

@task
def debug_env():
    import coiled, os, platform
    logger = get_run_logger()
    logger.info(f"coiled version: {getattr(coiled, '__version__', 'unknown')}")
    logger.info(f"whoami: {getattr(coiled, 'whoami', lambda: 'n/a')()}")
    logger.info(f"COILED_DEFAULT_REGION: {os.getenv('COILED_DEFAULT_REGION')}")
    logger.info(f"machine: {platform.machine()}")
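For example, wiring it in at the top of the flow might look like the sketch below, where the flow decorator and the elided body stand in for your existing flow code:
Copy code
from prefect import flow

@flow
def copernicus_hourly_sst_so_ssh_jaxa_flow():
    debug_env()  # log the runtime environment before any Coiled work starts
    ...          # existing flow logic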
Then call debug_env() at the start of the flow and check the Prefect logs.
5) Double-check the exact image being pulled by Coiled during the failing run
- In the Coiled cluster page linked from the failing run, look for lines around the pull error. If it says "denied" or "not found", that points to auth/tag issues. If it looks like timeout/rate-limit, that points to registry limits.

A couple of gotchas I've seen cause this
- The worker is using a different Coiled token/account than your local. That account does not have registry credentials configured, so private images fail to pull. Configure registry credentials in that Coiled account or make the image public.
- The code deployed is not the same as the code you just tested (e.g., a stale decorator using a different container= or software=). Re-deploy from the updated code and confirm the deployment shows your latest commit/hash.
- Environment defaults in the worker image or profile override your container= (e.g., COILED_SOFTWARE or a default config in ~/.config/coiled). The debug task above will help reveal this.

If you can share:
- The worker type for "bathymaps-coiled-pool" (Docker vs Process)
- Whether your image is private and where it's hosted
- The Coiled log lines around the failure (especially the line with the image tag and "denied/not found")
…I can give more targeted next steps.

References
- Prefect + Coiled guide: How to run Prefect on Coiled
- Coiled: Using Docker images and credentials: Using Docker images in Coiled
@Marvin how do I check what coiled credentials are used in the work-pool
thought for 66.6 seconds
Short answer: work pools don't "store" Coiled credentials. The credentials that get used come from the environment where your worker/flow runs (env vars like COILED_API_TOKEN / COILED_ACCOUNT or ~/.config/coiled/config.yaml inside the container/host).

Ways to check what's actually being used

Prefect UI (recommended)
- Work Pools -> your pool -> Base job template -> Environment variables
- Deployments -> your deployment -> Infrastructure -> Environment variables
- These two places show env vars that will be present in the flow-run container, which is what Coiled will use.

Prefect CLI
- Inspect the pool to see its base job template (including env):
Copy code
prefect work-pool inspect "bathymaps-coiled-pool"
- Look for an env section in the output.
- If you pass env via Flow.deploy, it will be under job_variables["env"] for that deployment.

Check inside the running environment (most definitive)
- Add a quick debug task to your flow to see which credentials/config are in effect at runtime:
Copy code
from prefect import task, get_run_logger
import os, pathlib

@task
def report_coiled_auth():
    logger = get_run_logger()
    try:
        import coiled
        who = coiled.whoami()
    except Exception as e:
        who = f"whoami failed: {e}"

    logger.info(f"COILED_API_TOKEN set: {'yes' if os.getenv('COILED_API_TOKEN') else 'no'}")
    logger.info(f"COILED_ACCOUNT: {os.getenv('COILED_ACCOUNT')}")
    logger.info(f"COILED_DEFAULT_REGION: {os.getenv('COILED_DEFAULT_REGION')}")
    logger.info(f"coiled.whoami(): {who}")

    cfg = pathlib.Path.home() / ".config" / "coiled" / "config.yaml"
    logger.info(f"~/.config/coiled/config.yaml exists: {cfg.exists()}")
Call report_coiled_auth() at the start of your flow and check the logs.

Inspect the worker container/host directly
- If you're using a Docker work pool and have access to the worker container:
Copy code
docker exec -it <worker-container-id> env | grep COILED
docker exec -it <worker-container-id> sh -lc 'cat ~/.config/coiled/config.yaml || true'
- For a Process worker, check the shell/systemd service/env used to start prefect worker start -p bathymaps-coiled-pool.

What to set if missing/misconfigured
- Set these env vars in your pool's base job template or as job_variables on the deployment:
  - COILED_API_TOKEN
  - COILED_ACCOUNT (if you use a non-default account/org)
  - COILED_DEFAULT_REGION=ap-southeast-2 (optional default)
- Ensure your deployment image contains a recent coiled client, same as you tested locally.

If you tell me:
- Your work pool type (Docker, Process, Kubernetes)
- How you're starting the worker
…I can give exact steps to view/set the env in that environment.

Links: Run Prefect on Coiled
@Marvin here is my work pool output

(prefect-coiled) thomascobban@Thomass-MBP prefect-coiled % prefect work-pool inspect "bathymaps-coiled-pool"
WorkPool(
    id='cc753fd9-299c-4053-ab22-dd1741e688c2',
    created=DateTime(2025, 6, 23, 6, 38, 31, 143081, tzinfo=Timezone('UTC')),
    updated=DateTime(2025, 8, 11, 7, 39, 38, 83193, tzinfo=Timezone('UTC')),
    name='bathymaps-coiled-pool',
    type='coiled:push',
    base_job_template={
        'variables': {
            'type': 'object',
            'required': ['credentials'],
            'properties': {
                'arm': {'type': 'boolean', 'title': 'Arm', 'default': True},
                'cpu': {'type': 'string', 'title': 'Cpu', 'default': '8', 'description': 'Use a VM with this number of CPU (or vCPU) cores'},
                'env': {'type': 'object', 'title': 'Environment Variables', 'description': 'Environment variables to set when starting a flow run.', 'additionalProperties': {'type': 'string'}},
                'gpu': {'type': 'boolean', 'title': 'Gpu', 'description': 'Use a VM with NVIDIA GPU available'},
                'name': {'type': 'string', 'title': 'Name', 'description': 'Name given to created infrastructure.'},
                'image': {'type': 'string', 'title': 'Image', 'example': 'docker.io/prefecthq/prefect:3-latest', 'description': "Image to use. Defaults to the latest Prefect base image. Required if you aren't using a Coiled software environment"},
                'labels': {'type': 'object', 'title': 'Labels', 'description': 'Labels applied to created infrastructure.', 'additionalProperties': {'type': 'string'}},
                'memory': {'type': 'string', 'title': 'Memory', 'default': '16 GiB', 'description': "Use a VM with this amount of memory; specify as a string such as '16 GiB'"},
                'region': {'type': 'string', 'title': 'Region', 'default': 'ap-southeast-2', 'description': 'The region in which to run the job on Coiled; by default uses default region from Coiled workspace'},
                'command': {'type': 'string', 'title': 'Command', 'description': 'The command to use when starting a flow run. In most cases, this should be left blank and the command will be automatically generated.'},
                'software': {'type': 'string', 'title': 'Software', 'description': 'Name of Coiled software environment to use'},
                'vm_types': {'type': 'array', 'items': {'type': 'string'}, 'title': 'Vm Types', 'description': "List of specific VM type(s) to use; often it's best to specify cpu and/or memory and let Coiled determine appropriate VM types."},
                'workspace': {'type': 'string', 'title': 'Workspace', 'description': 'The Coiled workspace to use. If not provided the default Coiled workspace for your user will be used.'},
                'credentials': {'allOf': [{'$ref': '#/definitions/CoiledCredentials'}], 'title': 'Coiled API token', 'default': {'$ref': {'block_document_id': '7cf315a5-c9eb-4e02-b21b-3e69a58fa4e8'}}, 'description': 'The Coiled API token used to connect to Coiled.'},
                'job_timeout': {'type': 'string', 'title': 'Job Timeout', 'description': "Timeout for the job execution, e.g. '20 minutes', '1 hour'. If not provided, the job will run until completion."}
            },
            'definitions': {
                'CoiledCredentials': {
                    'type': 'object',
                    'title': 'CoiledCredentials',
                    'required': ['api_token'],
                    'properties': {'api_token': {'type': 'string', 'title': 'Api Token', 'format': 'password', 'writeOnly': True, 'description': 'Coiled API token'}},
                    'description': 'Block used to manage Coiled Credentials.',
                    'secret_fields': ['api_token'],
                    'block_type_slug': 'coiled-credentials',
                    'block_schema_references': {}
                }
            }
        },
        'job_configuration': {
            'arm': '{{ arm }}', 'cpu': '{{ cpu }}', 'env': '{{ env }}', 'gpu': '{{ gpu }}',
            'name': '{{ name }}', 'image': '{{ image }}', 'labels': '{{ labels }}', 'memory': '{{ memory }}',
            'region': '{{ region }}', 'command': '{{ command }}', 'software': '{{ software }}',
            'vm_types': '{{ vm_types }}', 'workspace': '{{ workspace }}', 'credentials': '{{ credentials }}',
            'job_timeout': '{{ job_timeout }}'
        }
    },
    status=WorkPoolStatus.READY,
    storage_configuration=WorkPoolStorageConfiguration(),
    default_queue_id='56ae0390-cb16-41e5-99a8-2c69f11b6099'
(prefect-coiled) thomascobban@Thomass-MBP prefect-coiled %
Your message was too long, here's your message at the allowed limit: here is my work pool output (prefect-coiled) thomascobban@Thomass-MBP prefect-coiled % prefect work-pool inspect bathymaps-coiled-pool WorkPool( …
@Marvin what should I be looking for in the output of my workpool inspection using the cli
thought for 106.0 seconds
Great question. When you run:
Copy code
prefect work-pool inspect "bathymaps-coiled-pool"
look for these fields in the output:

1) Work pool type
- Tells you how env is applied.
- Docker/Kubernetes/ECSTask pools: env is defined in the pool's base job template and passed into the flow-run container.
- Process pool: env is inherited from the machine/process that starts the worker; the pool usually won't show COILED_* here.

2) base_job_template (critical)
- image: The container your flow run executes in. Make sure it's the expected tag (e.g., bathymaps/prefect-coiled:v2.0).
- env: The environment variables that will exist inside the flow-run container. You want to see:
  - COILED_API_TOKEN (non-empty)
  - COILED_ACCOUNT (if you use a non-default account)
  - COILED_DEFAULT_REGION=ap-southeast-2 (optional but helpful)
- Any other settings relevant to your runtime (volumes, command, etc.)

A "good" Docker work pool snippet looks like:
Copy code
"type": "docker",
"base_job_template": {
  "image": "bathymaps/prefect-coiled:v2.0",
  "env": {
    "COILED_API_TOKEN": "****",
    "COILED_ACCOUNT": "my-account",
    "COILED_DEFAULT_REGION": "ap-southeast-2"
  },
  ...
}
Also check deployment-level overrides
- Your deployment can add/override env. Inspect it with:
Copy code
prefect deployment inspect "copernicus_hourly_sst_so_ssh_jaxa_flow/copernicus_hourly_sst_so_ssh_jaxa_flow"
- Look for fields like image and job_variables/env (or infrastructure/env). If env is defined here, it merges with or overrides the pool's base job template.

Notes by pool type
- Docker/Kubernetes/ECS: You should see your COILED_* variables in base_job_template.env (or in the deployment's env). If they're missing, add them in the pool's Base job template (UI) or via deployment job_variables.
- Process: You won't typically see COILED_* in the pool inspection. Those must be set in the environment that launches the worker (the shell/systemd service running prefect worker; see the sketch at the end of this message). In this case, rely on your debug task inside the flow to verify what's actually present at runtime.

If anything's missing, set it
- UI: Work Pools -> your pool -> Base job template -> Environment variables
- Or update the deployment to include:
Copy code
job_variables={"env": {
  "COILED_API_TOKEN": "...",
  "COILED_ACCOUNT": "my-account",
  "COILED_DEFAULT_REGION": "ap-southeast-2"
}}
If you paste the relevant parts of your inspect output (type, image, env), I can confirm whether it’s configured correctly.
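For the Process-worker case mentioned above, a minimal sketch of providing the variables in the shell that starts the worker (the values are placeholders):
Copy code
export COILED_API_TOKEN="your-coiled-token"
export COILED_ACCOUNT="your-account"            # only if you use a non-default account
export COILED_DEFAULT_REGION="ap-southeast-2"
prefect worker start -p bathymaps-coiled-pool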
@Marvin the only job variables I have in my deployment are { "image": "bathymaps/prefect-coiled:v2.0" }
thought for 42.1 seconds
Got it — if your deployment's job variables only set image, then your flow-run container probably isn't getting any Coiled credentials. That's why it works locally (you have creds/config) but fails when run via the work pool.

You have two good options:

A) Add COILED_* env vars to the deployment
- Update your .deploy(...) call to pass env via job_variables:
Copy code
import os

copernicus_hourly_sst_so_ssh_jaxa_flow.deploy(
    name="copernicus_hourly_sst_so_ssh_jaxa_flow",
    work_pool_name="bathymaps-coiled-pool",
    image="bathymaps/prefect-coiled:v2.0",
    build=False,
    cron="0 * * * *",
    job_variables={"env": {
        "COILED_API_TOKEN": os.getenv("COILED_API_TOKEN"),   # set in your shell/CI
        "COILED_ACCOUNT": "your-account",                    # if applicable
        "COILED_DEFAULT_REGION": "ap-southeast-2"            # optional but helpful
    }},
)
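One way to make that token available when you run the deploy script is sketched below; deploy.py is a placeholder for whichever file contains this .deploy() call:
Copy code
export COILED_API_TOKEN="your-coiled-token"   # or source it from your secrets manager / CI
python deploy.py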
- Make sure COILED_API_TOKEN is set in the environment where you call .deploy() (so you don't hardcode secrets).

B) Set env on the work pool base job template (UI)
- Prefect UI -> Work Pools -> bathymaps-coiled-pool -> Edit base job template -> Environment variables:
  - COILED_API_TOKEN
  - COILED_ACCOUNT (if you use a non-default account)
  - COILED_DEFAULT_REGION=ap-southeast-2
- Save, then restart your worker if needed. New flow runs will inherit these vars.

What to verify after
- Inspect your deployment to confirm env is present:
Copy code
prefect deployment inspect "copernicus_hourly_sst_so_ssh_jaxa_flow/copernicus_hourly_sst_so_ssh_jaxa_flow"
Look for job_variables.env with your COILED_* vars (masked).
- Add a quick debug task at the start of your flow to confirm at runtime:
Copy code
from prefect import task, get_run_logger
import os, pathlib

@task
def report_coiled_auth():
    logger = get_run_logger()
    try:
        import coiled
        who = coiled.whoami()
    except Exception as e:
        who = f"whoami failed: {e}"

    logger.info(f"COILED_API_TOKEN set: {'yes' if os.getenv('COILED_API_TOKEN') else 'no'}")
    logger.info(f"COILED_ACCOUNT: {os.getenv('COILED_ACCOUNT')}")
    logger.info(f"COILED_DEFAULT_REGION: {os.getenv('COILED_DEFAULT_REGION')}")
    logger.info(f"coiled.whoami(): {who}")

    cfg = pathlib.Path.home() / ".config" / "coiled" / "config.yaml"
    logger.info(f"~/.config/coiled/config.yaml exists: {cfg.exists()}")
Notes
- If your Coiled container image is private on Docker Hub/ECR/GCR, ensure your Coiled account has registry credentials configured; otherwise pulls can fail even with a valid Coiled API token. See Coiled: Using Docker images.
- If your work pool is a Process worker, you can also set these env vars in the environment that starts the worker process. For Docker/Kubernetes pools, prefer setting them in the pool's base job template or the deployment's job_variables.env.

If you can share your work pool type (Docker vs Process) and whether your image is public/private, I can give exact "where to put the env" steps.

Prefect + Coiled guide: Run Prefect on Coiled.
@Marvin I jumped into the work pool UI and under CoiledCredentials I have a Block set used to manage Coiled Credentials.
thought for 34.5 seconds
Nice — having the CoiledCredentials block on your Coiled work pool is the right setup. A couple of key points about what that block does (and doesn't) do:

What the CoiledCredentials block covers
- It authenticates your flow runs to Coiled (API token/account). Prefect will inject those creds so calls like coiled.function(...) can create clusters.
- You can confirm at runtime with a quick check:
Copy code
@task
def report_coiled_auth():
    from prefect import get_run_logger
    import os, pathlib
    import coiled
    logger = get_run_logger()
    try:
        who = coiled.whoami()
    except Exception as e:
        who = f"whoami failed: {e}"
    logger.info(f"coiled.whoami(): {who}")
    logger.info(f"COILED_API_TOKEN set: {'yes' if os.getenv('COILED_API_TOKEN') else 'no'}")
    logger.info(f"COILED_ACCOUNT: {os.getenv('COILED_ACCOUNT')}")
    cfg = pathlib.Path.home() / ".config" / "coiled" / "config.yaml"
    logger.info(f"~/.config/coiled/config.yaml exists: {cfg.exists()}")
What it does not cover
- Docker/registry credentials for pulling your container= image. If bathymaps/prefect-coiled:v2.0 is private (or if you hit Docker Hub rate limits), Coiled needs registry auth configured in your Coiled account. Without that, you'll see "Failed pulling software environment image".

What to do next
- If the image is private or you suspect rate-limits, add registry credentials in Coiled (Docker Hub/ECR/GCR) and re-run:
  - Coiled docs on using Docker images and auth: Using Docker images in Coiled
- For best reliability/performance in ap-southeast-2, push the image to ECR in that region and set container= to the full ECR URL.
- Keep passing region="ap-southeast-2" in your @coiled.function(...) (you already do).
- Optionally add COILED_DEFAULT_REGION via your pool's Base job template env or deployment job_variables.env, but it's not required since you set region in code.

Reminder about the Prefect deployment image
- In a Coiled work pool, the "image" you pass to .deploy(image=...) does not control the Coiled worker container; the container that must be pulled by Coiled is the one in your @coiled.function(container=...). Credentials for pulling that image come from your Coiled account's registry settings, not from the Prefect work pool image (see the sketch below).

If you paste the exact Coiled log lines around the failure (e.g., "denied", "not found", or rate-limit text), I can confirm whether it's an auth vs. availability issue.

Prefect + Coiled guide: Run Prefect on Coiled
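To make that distinction concrete, here is a minimal side-by-side sketch using the names from your code (flow body and most parameters elided):
Copy code
# Image for the flow-run process, supplied to the Coiled push work pool via the deployment:
copernicus_hourly_sst_so_ssh_jaxa_flow.deploy(
    name="copernicus_hourly_sst_so_ssh_jaxa_flow",
    work_pool_name="bathymaps-coiled-pool",
    image="bathymaps/prefect-coiled:v2.0",
    build=False,
)

# Image for the task workers, pulled separately by Coiled when the flow calls the function;
# if this repository is private, registry credentials must be configured in your Coiled account:
@task
@coiled.function(
    name="prefect_jaxa_hourly_sst_median",
    container="bathymaps/prefect-coiled:v2.0",
    region="ap-southeast-2",
)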