a
Hi team! I'm currently seeing the following error on one of my flow runs for a custom work pool on the Teams Plan:
Container 'prefect' exited with non-zero exit code 126.
This is the only error that I'm seeing from the flow run within my ECS task definition. Is there any other way to see more logs or more information on what specific command may be causing this? I don't see any other information within the logs on the UI or on CloudWatch other than the task being deprovisioned followed by this error.
This was also with a custom dockerfile. I also tried using the default dockerfile with push=True and got an exit code of 255.
j
hey, the CloudWatch logs from your container would be the most granular thing to look at. https://stackoverflow.com/questions/9979251/error-with-bash-script-exit-code-126 It sounds like it might be a file permission issue. I'm guessing some combination of your custom dockerfile + entrypoint + task definition and what command is actually executing
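To illustrate the file permission theory mentioned above, here is a minimal sketch of how exit code 126 arises on a POSIX system: the shell finds the file but is not allowed to execute it. The script name `entrypoint.sh` and the helper function are hypothetical and are not taken from the dockerfile in this thread.

```python
import os
import shlex
import stat
import subprocess
import tempfile


def exit_code_for_non_executable_script() -> int:
    """Run a script that lacks the execute bit and return the shell's exit code."""
    with tempfile.TemporaryDirectory() as tmp:
        # Hypothetical entrypoint script, standing in for whatever the image runs.
        script = os.path.join(tmp, "entrypoint.sh")
        with open(script, "w") as f:
            f.write("#!/bin/sh\necho hello\n")
        # Read/write for the owner only: no execute bit for anyone.
        os.chmod(script, stat.S_IRUSR | stat.S_IWUSR)
        # Ask a POSIX shell to execute the file directly by path.
        result = subprocess.run(["sh", "-c", shlex.quote(script)], capture_output=True)
        return result.returncode


if __name__ == "__main__":
    # On a POSIX shell this reports 126 ("found but not executable").
    print(exit_code_for_non_executable_script())
```

If this is the cause, adding the execute bit to the entrypoint script in the dockerfile (for example with `chmod +x`) would be the thing to check first.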
a
I see the following error in my CloudWatch logs prior to the task being provisioned but no other logs, could this be related?
fatal: not a git repository (or any of the parent directories): .git
I also do still see this error with the default docker file when I run with push=True and build=True
j
In your entrypoint/command are you calling git checkout or something?
easiest way to debug is probably to try and run your docker image locally until you start seeing prefect errors (which means it can run the prefect command successfully)
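A rough sketch of that local check, assuming Docker is installed and the image lets you append a command after its entrypoint. The helper names are hypothetical; `prefect version` is used only as a cheap command that proves the Prefect CLI can start inside the container.

```python
import subprocess


def local_smoke_test_command(image: str, platform: str = "linux/amd64") -> list[str]:
    """Build a `docker run` command that starts the image and runs `prefect version`."""
    return ["docker", "run", "--rm", "--platform", platform, image, "prefect", "version"]


def run_smoke_test(image: str) -> int:
    """Run the smoke test and return the container's exit code."""
    # Per Docker's documented exit codes, 126 means the contained command could
    # not be invoked and 127 means it could not be found.
    return subprocess.run(local_smoke_test_command(image)).returncode
```

Calling `run_smoke_test("<your image>")` and getting 0 back suggests the image can at least launch Prefect for that platform; a non-zero code narrows the problem to the image rather than the ECS task definition.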
a
Appreciate the responses! I'm able to run it successfully locally (I'm opting to just use the Prefect build process now for a Docker image and pushing it to ECR). However, I'm still running into a very general exit code here (255). Here's a snippet of my ECS task logs: I've seen this old GitHub thread about it being related to the platform build; could that be the case? Here's my current step for building / deploying the image:
my_target_flow.deploy(
    name="test-flow",
    work_pool_name="dev-ecs-pool",
    image="<my_ecr_repo>",  # placeholder for the ECR image name
    push=True,
    build=True,
)
j
Maybe the GitHub log is a red herring. 255 can occur with the wrong build architecture. In your build steps you can give a platform kwarg: https://docs.prefect.io/v3/how-to-guides/deployments/prefect-yaml#the-build-action https://docs.prefect.io/v3/how-to-guides/deployment_infra/serverless#deployment
a
Thanks for all the help! The platform kwarg was useful on my dockerfile. I believe I'm nearly there but I'm now running into the same issue outlined in this thread here: https://linen.prefect.io/t/26775274/does-anyone-know-if-there-is-something-wrong-with-the-abilit Do you know if there was ever a common resolution here by chance? I'm building my Docker images on my Mac for context.
I am on the following docker version: Docker version 28.2.2
Figured it out! The issue was actually with prefect-docker as an FYI
j
Apologies for not responding before. Glad you were able to figure it out! What was the issue?
c
Hey, new here 👋 I get this issue too, but when using the Kubernetes work pool. I get the following events in GKE on image download:
Events:
  Type     Reason     Age   From               Message
  ----     ------     ----  ----               -------
  Normal   Scheduled  18s   default-scheduler  Successfully assigned prefect/beryl-gharial-b4f5z-l6nhz to gke-XXX-development-autoscale-main-v2-a0a3a95e-j7vl
  Normal   Pulling    17s   kubelet            Pulling image "europe-north1-docker.pkg.dev/XXXXXX/ds-lab/analytics-scheduler-test:2025-07-22t11-48-11-690613-00-00"
  Warning  Failed     10s   kubelet            Failed to pull image "europe-north1-docker.pkg.dev/XXXXXX/ds-lab/analytics-scheduler-test:2025-07-22t11-48-11-690613-00-00": failed to pull and unpack image "europe-north1-docker.pkg.dev/XXXXXX/ds-lab/analytics-scheduler-test:2025-07-22t11-48-11-690613-00-00": wrong diff id calculated on extraction "sha256:11a485f0ff53881081e2b5e8c0791997d67e8e773e9571b549f302bb24cc1b55"
  Warning  Failed     10s   kubelet            Error: ErrImagePull
  Normal   BackOff    10s   kubelet            Back-off pulling image "europe-north1-docker.pkg.dev/XXXXXX/ds-lab/analytics-scheduler-test:2025-07-22t11-48-11-690613-00-00"
  Warning  Failed     10s   kubelet            Error: ImagePullBackOff
And this is my config for the flows:
from prefect import deploy, flow
from prefect.docker import DockerImage

workpool_name = "Default worker pool"


@flow(log_prints=True)
def buy():
    print("Buying securities.")


@flow(log_prints=True)
def sell():
    print("Selling securities.")


@flow(log_prints=True)
def main():
    print("Starting flow")
    buy()
    sell()
    print("Flow completed")


if __name__ == "__main__":
    deploy(
        buy.to_deployment(name="buy-deploy"),
        sell.to_deployment(name="sell-deploy"),
        main.to_deployment(name="main-deploy"),
        work_pool_name=workpool_name,
        image=DockerImage(
            name="europe-north1-docker.pkg.dev/XXXXXXX/ds-lab/analytics-scheduler-test",
            platform="linux/amd64",
        ),
        push=True,
    )
I have inspected the uploaded image and the arch seems OK, so I am at a loss understanding why the image is not runnable on GKE. Any thoughts?
Forgot to say, but I am also on a Mac (M-series)
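For anyone repeating the architecture check described above, here is a minimal sketch that reads the OS and architecture recorded in a locally available image via `docker image inspect`. The helper names are hypothetical, and this only confirms the platform metadata; it does not address the `wrong diff id` error itself.

```python
import subprocess


def inspect_command(image: str) -> list[str]:
    """Build a `docker image inspect` command that prints OS/architecture."""
    return ["docker", "image", "inspect", "--format", "{{.Os}}/{{.Architecture}}", image]


def parse_platform(output: str) -> tuple[str, str]:
    """Split output such as 'linux/amd64' into (os, architecture)."""
    os_name, sep, arch = output.strip().partition("/")
    if not sep or not os_name or not arch:
        raise ValueError(f"unexpected inspect output: {output!r}")
    return os_name, arch


def image_platform(image: str) -> tuple[str, str]:
    """Run the inspect command (requires Docker) and return (os, architecture)."""
    result = subprocess.run(inspect_command(image), capture_output=True, text=True, check=True)
    return parse_platform(result.stdout)
```

A result of `("linux", "amd64")` matches the `platform="linux/amd64"` setting in the config above, which would point the investigation away from the architecture and toward how the image layers were pushed or pulled.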