# ask-marvin
c
@Marvin Can you explain the following flow error?
Copy code
Finished in state Failed("Flow run encountered an exception: ProgrammingError: 255002: Optional dependency: 'pandas' is not installed, please see the following link for install instructions: <https://docs.snowflake.com/en/user-guide/python-connector-pandas.html#installation>")
m
thought for 34.7 seconds
Short version: your flow called a Snowflake method that requires pandas, but pandas is not installed in the environment where the flow actually ran. The Snowflake Python connector raises ProgrammingError 255002 in that case, and Prefect reports the flow as Failed.

Common calls that trigger this:
- cursor.fetch_pandas_all() or fetch_pandas_batches()
- snowflake.connector.pandas_tools.write_pandas(...)
- Snowpark DataFrame.to_pandas()
- SQLAlchemy Snowflake helpers that return DataFrames

How to fix
- Install the pandas extra for the Snowflake connector in the runtime environment:
Copy code
pip install "snowflake-connector-python[pandas]"
This pulls in pandas (and related deps like pyarrow) that those methods need.
- If you’re running in a container/worker image, add it to the image your deployment uses (e.g., in your Dockerfile):
Copy code
RUN pip install --no-cache-dir "snowflake-connector-python[pandas]"
- If you manage dependencies with requirements/pyproject, add:
Copy code
snowflake-connector-python[pandas]
Reference: Snowflake pandas integration installation

If you’re not explicitly using pandas in your code, it’s still likely one of the above Snowflake helpers under the hood. If you share:
- where the flow runs (local, Docker/K8s work pool, etc.), and
- the Snowflake call you’re making,
I can suggest the exact dependency change and where to put it.
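For illustration, a minimal sketch of a flow that can hit this error path (the connection parameters and table name below are placeholders, not from your code):
Copy code
from prefect import flow
import snowflake.connector

@flow
def example_flow():
    # placeholder credentials; in practice use a Block or environment variables
    conn = snowflake.connector.connect(
        account="your-account",
        user="your-user",
        password="your-password",
        warehouse="your-warehouse",
    )
    cur = conn.cursor()
    cur.execute("SELECT * FROM your_table LIMIT 10")
    # fetch_pandas_all() imports pandas (and pyarrow) lazily; if either is
    # missing, the connector raises ProgrammingError 255002 and the flow fails
    df = cur.fetch_pandas_all()
    print(df.shape)
With the pandas extra installed in the runtime environment, the fetch_pandas_all() call above succeeds.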
c
@Marvin I am still getting the same flow error. This is what my docker file looks like:
Copy code
FROM prefecthq/prefect:3.2.14-python3.9
COPY . /opt/prefect/Scripts/
WORKDIR /opt/prefect/Scripts/
RUN pip install --no-cache-dir "snowflake-connector-python[pandas]" adlfs s3fs requests pandas pyodbc snowflake prefect_azure && apt-get update && apt-get install -y unixodbc
m
thought for 327.3 seconds
Thanks for sharing the Dockerfile — that helps. Given you’re still seeing “Optional dependency: 'pandas' is not installed” at runtime, the most common causes are:

1) The flow is not actually running in your custom image
- If your deployment/worker is using the default Prefect image (e.g., prefecthq/prefect:3.2.14-python3.9), it won’t have pandas or the Snowflake extras you installed.
- Check the image used by the run in the Prefect UI: open the flow run → Infrastructure panel → confirm the exact image name used.
- If it’s not your custom image, point your deployment to it and rerun. In the Prefect 3.x Python API:
Copy code
from prefect import flow

flow.from_source(
    source="https://github.com/your-org/your-repo",  # placeholder; or a local path with your code
    entrypoint="path/to/flow.py:my_flow",
).deploy(
    name="my-deployment",
    work_pool_name="my-docker-pool",
    image="your-registry/your-image:tag",
    build=False,  # image already built
    push=False,   # image already pushed
)
- Or set the “image” in your Docker work pool’s job variables or in your prefect.yaml and redeploy.
- Make sure your Docker worker is running and attached to that pool.
Docs: Use Docker with Prefect work pools

2) You’re running on a non-Docker worker
- If you’re using a Process/Kubernetes/other pool, the Dockerfile won’t affect the runtime environment unless the job actually pulls and runs that image.
- For Process workers, install the deps where the worker runs:
Copy code
pip install "snowflake-connector-python[pandas]" pandas pyarrow
3) Small but important Dockerfile notes
- You’re installing both snowflake-connector-python[pandas] and snowflake. The PyPI package named snowflake is not the Snowflake connector and can cause module conflicts. Remove snowflake from your pip line.
- If you use DataFrame helpers (e.g., write_pandas, fetch_pandas_all), you often need pyarrow too. Add it explicitly:
Copy code
RUN pip install --no-cache-dir "snowflake-connector-python[pandas]" pandas pyarrow adlfs s3fs requests pyodbc prefect-azure && \
      apt-get update && apt-get install -y unixodbc
Note: In Prefect 3.x the Azure collection is published as prefect-azure (hyphen), not prefect_azure.
- Quick sanity check of your image locally:
Copy code
docker run --rm -it your-registry/your-image:tag \
    python -c "import pandas, snowflake.connector; print('OK', pandas.__version__, snowflake.connector.__version__)"
Checklist to get you unblocked
- Confirm the image used by your failing run in the UI.
- If incorrect, update your deployment or pool to use your custom image and ensure the worker is running for that pool.
- Rebuild/tag/push your image, then redeploy:
Copy code
docker build -t your-registry/your-image:tag .
docker push your-registry/your-image:tag
- Remove snowflake from your pip installs; keep snowflake-connector-python[pandas] (+ pyarrow).
- If not using a Docker work pool, install the deps in the actual runtime environment.

If you can share:
- Your work pool type (Docker, Process, K8s, etc.)
- How you’re creating the deployment (prefect.yaml or code/CLI) and which image it references
- A link or screenshot of the flow run’s Infrastructure panel showing the image
…I can pinpoint the exact change needed.
c
@Marvin Our work pool is on a Kubernetes Cluster. The deployment is created using this configuration in Prefect.yaml
Copy code
- name: VenueNextColumns
  version:
  tags: []
  description:
  entrypoint: venueNextParseColumns_snowflake.py:main_flow
  parameters: {}
  work_pool:
    name: venuenext-kubernetes
    worker_name: KubernetesWorker ce78e435-8a1e-4950-8d8b-e4ee47546859
    job_variables:
      image: '{{ build-image.image }}'
  schedules: []
  push:
  - prefect_azure.deployments.steps.push_to_azure_blob_storage:
      container: prefectscripts
      folder: ''
      credentials: '{{ prefect.blocks.azure-blob-storage-credentials.prefectscriptskubernetes
        }}'
  pull:
  - prefect_azure.deployments.steps.pull_from_azure_blob_storage:
      container: prefectscripts
      folder: ''
      credentials: '{{ prefect.blocks.azure-blob-storage-credentials.prefectscriptskubernetes
        }}'
  concurrency_limit:
m
thought for 221.1 seconds
Thanks — with a Kubernetes work pool and this prefect.yaml, the key issue is that your deployment’s job_variables.image points to {{ build-image.image }}, but there is no step in your prefect.yaml that actually builds/pushes an image and sets that variable. As a result, the Kubernetes job is likely using the work pool’s default image (a base Prefect image that does not include pandas), which causes the Snowflake “pandas not installed” error.

Two ways to fix it:

Option A: Hardcode your custom image tag in the deployment
- Build and push your image to a registry your cluster can pull from.
- Set job_variables.image to that exact tag. Example:
Copy code
work_pool:
  name: venuenext-kubernetes
  job_variables:
    image: "your-registry/your-image:tag"
    # if your registry is private:
    # image_pull_secrets:
    #   - name: your-registry-secret
Option B: Wire up docker build/push steps in prefect.yaml, then keep using {{ build-image.image }}
- Add build/push steps using the prefect-docker collection so build-image.image is produced:
Copy code
build:
- prefect_docker.deployments.steps.build_docker_image:
    id: build-image
    requires: prefect-docker
    image_name: your-registry/your-image
    tag: v1
    dockerfile: ./Dockerfile
- prefect_docker.deployments.steps.push_docker_image:
    id: push-image
    requires: prefect-docker
    image_name: '{{ build-image.image_name }}'
    tag: '{{ build-image.tag }}'
- Keep your deployment’s job_variables.image: '{{ build-image.image }}'.

Make sure the image actually has the Snowflake pandas extras
Your Dockerfile is close, but I recommend these changes:
- Remove snowflake (that’s not the connector and can cause conflicts)
- Install the Azure collection with the correct PyPI name prefect-azure
- Add pyarrow (commonly needed with Snowflake pandas helpers)

Example:
Copy code
FROM prefecthq/prefect:3.2.14-python3.9
COPY . /opt/prefect/Scripts/
WORKDIR /opt/prefect/Scripts/
RUN pip install --no-cache-dir \
      "snowflake-connector-python[pandas]" \
      pandas \
      pyarrow \
      adlfs \
      s3fs \
      requests \
      pyodbc \
      prefect-azure \
  && apt-get update \
  && apt-get install -y --no-install-recommends unixodbc \
  && rm -rf /var/lib/apt/lists/*
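If you go with Option A, a typical sequence to rebuild, push, and redeploy from the project directory looks like this (registry, image name, and tag are placeholders):
Copy code
docker build -t your-registry/your-image:v2 .
docker push your-registry/your-image:v2
# point job_variables.image (or the pool default) at the new tag, then redeploy:
prefect deploy --name VenueNextColumns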
Verify what image was used by the failing run
- In the Prefect UI: Flow run → Infrastructure panel → check the “image” value. If it’s not your custom image, the deployment wasn’t pointing at it.

Helpful docs
- Kubernetes work pools and job variables: Run on Kubernetes
- Customize job variables: Customize job variables

If you share:
- The image value shown on the failing run’s Infrastructure panel
- Whether your cluster needs imagePullSecrets
- Your intended image tag/registry
…I can provide the exact prefect.yaml snippet you need.
c
@Marvin I checked the Kubernetes cluster and it is pulling the correct image from Docker. Any other ideas?
m
thought for 64.0 seconds
Got it — if the cluster is definitely pulling your custom image, there are a few other common causes for this exact “pandas not installed” error:

1) The pod is running an older cached image tag
- Even if it “pulled,” Kubernetes may be reusing a local cached image with the same tag.
- Fix: use unique tags per build (e.g., include a sha or timestamp; one way to generate such tags is sketched after the snippet below) and/or set the image pull policy to Always:
Copy code
work_pool:
  name: venuenext-kubernetes
  job_variables:
    image: "your-registry/your-image:2024-10-14-001"  # new tag
    image_pull_policy: "Always"
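One way to generate a unique tag per build, for example from the current git commit (a sketch; registry and image name are placeholders):
Copy code
TAG=$(git rev-parse --short HEAD)
docker build -t your-registry/your-image:$TAG .
docker push your-registry/your-image:$TAG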
Docs: Run on Kubernetes

2) Package conflict: the snowflake PyPI package is installed
- Your original Dockerfile installs snowflake, which is not the Snowflake connector and can interfere with imports.
- Rebuild your image without snowflake and make sure you’re using prefect-azure (hyphen), not prefect_azure (underscore) as the PyPI package name.
- Also add pyarrow, which Snowflake pandas helpers typically require.

Example Dockerfile:
Copy code
FROM prefecthq/prefect:3.2.14-python3.9
COPY . /opt/prefect/Scripts/
WORKDIR /opt/prefect/Scripts/

# Use the same interpreter the container will run with
RUN python -m pip install --no-cache-dir \
      "snowflake-connector-python[pandas]" \
      pandas \
      pyarrow \
      adlfs \
      s3fs \
      requests \
      pyodbc \
      prefect-azure \
  && apt-get update \
  && apt-get install -y --no-install-recommends unixodbc \
  && rm -rf /var/lib/apt/lists/*
3) Packages installed into a different Python than the one used at runtime
- Using pip vs python -m pip can sometimes install into the wrong site-packages if multiple interpreters/venvs exist.
- You can sanity-check during build:
Copy code
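# Note: heredoc RUN steps like this need BuildKit (the default builder in recent
# Docker); if your builder rejects it, use a plain `RUN python -c "..."` check instead.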
RUN python - <<'PY'
import sys
print("Python exe:", sys.executable)
import pandas, snowflake.connector
print("pandas:", pandas.__version__)
print("snowflake:", snowflake.connector.__version__)
PY
4) The Snowflake call needs pandas/pyarrow, but import fails for a different reason
- The Snowflake connector raises the exact “Optional dependency: 'pandas' is not installed” message whenever importing pandas fails — that includes missing wheels or incompatible binaries.
- Verifying inside the running pod will tell us quickly:
Copy code
# Grab the failing flow run's pod name, then:
kubectl exec -it <pod-name> -- python -c "import pandas, snowflake.connector; print('OK', pandas.__version__, snowflake.connector.__version__)"
If that import fails, it’s definitely an environment issue (wrong tag, wrong Python path, or a conflict from the snowflake package).

5) Add a quick runtime check in your flow to confirm the environment
At the start of the flow:
Copy code
from prefect import flow, get_run_logger

@flow
def main_flow(...):
    logger = get_run_logger()
    try:
        import sys, pandas, snowflake.connector
        logger.info(f"Python: {sys.executable}")
        logger.info(f"pandas: {pandas.__version__}")
        logger.info(f"snowflake: {snowflake.connector.__version__}")
    except Exception as e:
        logger.exception(f"Import check failed: {e}")
    # ...rest of flow...
Quick checklist
- Rebuild with the corrected Dockerfile (no snowflake, use prefect-azure, add pyarrow, use python -m pip).
- Push with a new, unique tag; set image_pull_policy: Always.
- Redeploy and run again.
- If it still fails, exec into the pod and run the one-liner import test above — that will pinpoint whether it’s the image or the application code.

If you can share:
- The exact image tag and whether image_pull_policy is set
- Output of the in-pod python -c "import pandas, snowflake.connector; ..." test
- The specific Snowflake API you’re calling (e.g., fetch_pandas_all, write_pandas, Snowpark to_pandas())
…I can zero in on the remaining cause.
c
@Marvin I analyzed the packages in my docker file and I have these packages installed:
Copy code
snowflake-connector-python=4.0.0
pandas=2.3.3
pyarrow=21.0.0
However, I am still getting the error:
Copy code
snowflake.connector.errors.ProgrammingError: 255002: Optional dependency: 'pandas' is not installed, please see the following link for install instructions: <https://docs.snowflake.com/en/user-guide/python-connector-pandas.html#installation>
Any ideas?
m
thought for 73.5 seconds
Thanks for the details — given those versions are in your image, Snowflake only throws 255002 when import pandas fails at runtime. So we need to find why the pandas import is failing inside the pod.

Fast way to pinpoint the cause
- Exec into a failing flow pod and run this:
Copy code
kubectl exec -it <pod-name> -- sh -lc 'python - <<PY
import sys, importlib
print("Python:", sys.version, sys.executable)
for name in ("pandas", "numpy", "pyarrow", "snowflake.connector"):
    try:
        m = importlib.import_module(name)
        path = getattr(m, "__file__", None)
        ver = getattr(m, "__version__", "unknown")
        print(f"{name} OK {ver} at {path}")
    except Exception as e:
        print(f"{name} FAILED: {type(e).__name__}: {e}")
PY'
- This will tell us exactly which import fails and why (e.g., version mismatch, missing binary, shadowed module, etc.). Please paste the output here.

Common root causes I’ve seen for this exact symptom

1) Local module shadowing pandas
- A file/folder in your project named pandas.py or pandas/ can hijack the import.
- Check what’s being imported: the command above prints the module path. If it points into your project instead of site-packages, rename/remove that file.
- You can also log it in your flow:
Copy code
import pandas, sys
print("pandas file:", getattr(pandas, "__file__", "n/a"))
2) NumPy/pandas ABI mismatch
- pandas needs a NumPy build it is compatible with; if the installed NumPy is incompatible or broken, import pandas fails with an ImportError that Snowflake then reports as “pandas not installed”.
- Check that NumPy and pandas import together in the runtime:
Copy code
kubectl exec -it <pod> -- python -c "import numpy; import pandas; print(numpy.__version__, pandas.__version__)"
- If this fails, explicitly pin compatible versions in your image:
Copy code
RUN python -m pip install --no-cache-dir \
    "numpy>=2.0,<3" \
    "pandas==2.3.3" \
    "pyarrow==21.0.0" \
    "snowflake-connector-python[pandas]==4.0.0"
3) Wrong interpreter/site-packages at runtime
- Very rarely, packages are installed with one interpreter but the job runs with another.
- Your base image is fine, but to be safe, use python -m pip in the Dockerfile (you did list versions; confirming with the exec test above will rule this out).
- Also ensure no venv is being activated in your container entrypoint that hides system site-packages.

4) Cached or stale image tag
- Even if the cluster “pulled,” it may still run a cached layer without the latest wheels.
- Use a unique image tag per build and set:
Copy code
work_pool:
  job_variables:
    image_pull_policy: "Always"
5) Missing system libs for pandas/pyarrow wheels
- Uncommon on the Prefect base, but if import errors mention GLIBC/libstdc++ etc., we’ll see it in the exec output. We can add the needed apt packages if so.

Two extra sanity checks
- At the very top of your flow, log imports to the Prefect UI so you don’t need pod access:
Copy code
from prefect import flow, get_run_logger
@flow
def main_flow():
    logger = get_run_logger()
    try:
        import sys, pandas, numpy, pyarrow, snowflake.connector
        <http://logger.info|logger.info>(f"Python: {sys.executable}")
        <http://logger.info|logger.info>(f"pandas: {pandas.__version__} ({getattr(pandas, '__file__', 'n/a')})")
        <http://logger.info|logger.info>(f"numpy: {numpy.__version__}")
        <http://logger.info|logger.info>(f"pyarrow: {pyarrow.__version__}")
        <http://logger.info|logger.info>(f"snowflake: {snowflake.connector.__version__}")
    except Exception as e:
        logger.exception(f"Import check failed: {e}")
    # rest of your flow...
- Confirm the snowflake PyPI distribution is not installed (only snowflake-connector-python should be). The wrong snowflake package can cause weird import behavior:
Copy code
kubectl exec -it <pod> -- python -m pip show snowflake  # should report "not found" if only the connector is installed
If you can share the output of the in-pod import test (especially the failure message and module paths), I can give you the precise fix. My hunch is either a numpy/pandas mismatch or a local pandas.py shadowing the import.
c
@Marvin I ran the script attached in a failed pod:
Copy code
import sys, importlib
print("Python:", sys.version, sys.executable)
for name in ("pandas", "numpy", "pyarrow", "snowflake.connector"):
    try:
        m = importlib.import_module(name)
        path = getattr(m, "__file__", None)
        ver = getattr(m, "__version__", "unknown")
        print(f"{name} OK {ver} at {path}")
    except Exception as e:
        print(f"{name} FAILED: {type(e).__name__}: {e}")
PY'
The printed statements I got were:
Copy code
pandas OK 2.2.3 at /usr/local/lib/python3.9/site-packages/pandas/__init__.py
numpy OK 2.0.2 at /usr/local/lib/python3.9/site-packages/numpy/__init__.py
pyarrow FAILED: ModuleNotFoundError: No module named 'pyarrow'
snowflake.connector OK 3.14.0 at /usr/local/lib/python3.9/site-packages/snowflake/connector/__init__.py
Could you help me interpret this print screen?
m
thought for 73.9 seconds
Thanks for running that check — it tells us a lot.

What your pod shows
- pandas OK 2.2.3
- numpy OK 2.0.2
- pyarrow FAILED (not installed)
- snowflake-connector-python is 3.14.0 (not 4.0.0)

Interpretation
- Your flow’s container can import pandas just fine. However, pyarrow is missing in the runtime environment.
- Many Snowflake “pandas” helpers (e.g., fetch_pandas_all, write_pandas, Snowpark to_pandas) require both pandas and pyarrow. In connector 3.x, a missing optional dep sometimes bubbles up as the generic “Optional dependency: 'pandas' is not installed” error, even when pyarrow is the missing piece. So this output explains the 255002 error.
- The versions in the pod do not match what you expected (pandas 2.3.3 / connector 4.0.0 / pyarrow 21.0.0). This strongly suggests the image actually running does not include those pins, or a later layer/step in your build or runtime is installing different versions.

How to fix quickly
- Ensure pyarrow is in the image used by the Kubernetes job.
- Pin the versions you want and build/push a new, uniquely tagged image. Set image_pull_policy: Always to avoid a cached pull.

Dockerfile example (pin and install with the same interpreter)
Copy code
FROM prefecthq/prefect:3.2.14-python3.9

WORKDIR /opt/prefect/Scripts/
COPY . /opt/prefect/Scripts/

RUN python -m pip install --no-cache-dir \
      "snowflake-connector-python[pandas]==4.0.0" \
      "pandas==2.3.3" \
      "numpy>=2.0,<3" \
      "pyarrow==21.0.0" \
      adlfs s3fs requests pyodbc prefect-azure \
  && apt-get update \
  && apt-get install -y --no-install-recommends unixodbc \
  && rm -rf /var/lib/apt/lists/*
Then:
- Build/push with a unique tag and set the image pull policy:
Copy code
# build/push
docker build -t your-registry/your-image:2024-10-14-1 .
docker push your-registry/your-image:2024-10-14-1
- In your prefect.yaml deployment:
Copy code
work_pool:
  name: venuenext-kubernetes
  job_variables:
    image: "your-registry/your-image:2024-10-14-1"
    image_pull_policy: "Always"
- Redeploy and run again.

Double-check in the pod after redeploy
Copy code
kubectl exec -it <new-pod> -- sh -lc 'python - <<PY
import importlib
for name in ("pandas","numpy","pyarrow","snowflake.connector"):
    try:
        m = importlib.import_module(name)
        print(name, "OK", getattr(m, "__version__", "unknown"))
    except Exception as e:
        print(name, "FAILED:", e)
PY'
If you need to confirm the current pod’s state right now (without rebuild), you can also test the hypothesis by installing pyarrow in-place:
Copy code
kubectl exec -it <pod> -- sh -lc 'python -m pip install --no-cache-dir pyarrow && python -c "import pyarrow; print(pyarrow.__version__)"'
If your flow then proceeds past the error, that confirms pyarrow was the missing dependency. You’ll still want to bake it into the image and redeploy with a unique tag.

Given your pod currently shows connector 3.14.0, if you don’t want to bump to 4.0.0 yet, at least add pyarrow to the existing image and re-tag:
Copy code
python -m pip install "snowflake-connector-python[pandas]==3.14.0" pyarrow
Most likely causes for the mismatch you saw
- The image used by the job was not the one you expected (stale tag/cache). Use a unique tag and image_pull_policy: Always.
- Another install step (requirements.txt, a runtime init step, or a layer in your Dockerfile) installed/overrode the connector to 3.14.0 and did not include pyarrow.
- Installing with plain pip vs python -m pip into a different interpreter/site-packages.

Once pyarrow is present and versions are aligned, this specific 255002 should go away. If it persists after you confirm pyarrow imports inside the pod, share the new import test output and the exact Snowflake call you’re making (e.g., fetch_pandas_all, write_pandas, etc.) and we’ll dig further.
c
@Marvin Is there a way to print the name of the Docker image my code is using in the code?
m
thought for 54.5 seconds
Yes. In Prefect 3.x you can read the resolved job variables for the current flow run — for Kubernetes that typically includes the image. Add this near the start of your flow:
Copy code
from prefect import flow, get_run_logger
from prefect.runtime import flow_run

@flow
def main_flow():
    logger = get_run_logger()
    jv = getattr(flow_run, "job_variables", None) or {}
    image = jv.get("image")
    <http://logger.info|logger.info>(f"Kubernetes job_variables: {jv}")
    <http://logger.info|logger.info>(f"Container image: {image}")
    # ...rest of your flow...
Notes:
- flow_run.job_variables is populated when the flow runs from a deployment. If you run locally (e.g., .serve()), it may be empty or missing.

If you want to query Kubernetes directly (e.g., to double-check what actually ran), you can read the current Pod spec using the in-cluster Kubernetes client. This requires your service account to have permission to read its own Pod:
Copy code
import os
from kubernetes import client, config
from prefect import get_run_logger

def log_pod_images():
    logger = get_run_logger()
    config.load_incluster_config()
    pod_name = os.environ.get("HOSTNAME")  # default to current pod hostname
    with open("/var/run/secrets/kubernetes.io/serviceaccount/namespace") as f:
        namespace = f.read().strip()
    v1 = client.CoreV1Api()
    pod = v1.read_namespaced_pod(name=pod_name, namespace=namespace)
    images = [c.image for c in pod.spec.containers]
    <http://logger.info|logger.info>(f"K8s pod containers: {images}")
Docs:
- Kubernetes work pools and job variables: Run on Kubernetes
- Customize job variables: Customize job variables
c
@Marvin Running that code returned the following print lines
Copy code
Kubernetes job_variables: {}
Container image: None
Does this mean my flow is not grabbing the correct docker image? How can I update my code to ensure it grabs the correct docker image?
m
thought for 64.5 seconds
Great question. The empty job_variables means your deployment did not set any per-deployment overrides. That does not mean Kubernetes used the wrong image — it likely means the image is coming from your Kubernetes work pool’s default job template instead of the deployment’s job_variables.

Key points:
- flow_run.job_variables only shows what the deployment overrides. If your image is defined on the work pool’s base job template, job_variables will be empty in code.
- To confirm the actual image at runtime, query the current Pod via the Kubernetes API from inside the pod (this reads the real spec).

Add this to your flow to log the real container image(s):
Copy code
# pip install kubernetes in your image if you don't have it
# RUN python -m pip install kubernetes

from prefect import flow, get_run_logger
import os

def log_pod_images():
    from kubernetes import client, config
    logger = get_run_logger()
    try:
        config.load_incluster_config()
        with open("/var/run/secrets/kubernetes.io/serviceaccount/namespace") as f:
            namespace = f.read().strip()
        pod_name = os.environ.get("HOSTNAME")
        v1 = client.CoreV1Api()
        pod = v1.read_namespaced_pod(name=pod_name, namespace=namespace)
        images = [c.image for c in pod.spec.containers]
        <http://logger.info|logger.info>(f"Pod: {namespace}/{pod_name} images: {images}")
    except Exception as e:
        logger.exception(f"Failed to query K8s pod image: {e}")

@flow
def main_flow():
    logger = get_run_logger()
    # This prints deployment overrides (empty if image comes from pool defaults)
    from prefect.runtime import flow_run
    <http://logger.info|logger.info>(f"Kubernetes job_variables: {getattr(flow_run, 'job_variables', {})}")
    # This prints the actual pod container image(s)
    log_pod_images()
    # ...rest of your flow...
How to “ensure it grabs the correct docker image”
You have two reliable options.

1) Set the image on the deployment
- In your prefect.yaml under this deployment, set:
Copy code
work_pool:
  name: venuenext-kubernetes
  job_variables:
    image: "your-registry/your-image:unique-tag"
    image_pull_policy: "Always"
- Use a unique tag per build to avoid stale caches.

2) Set the default image on the work pool
- Prefect UI → Work Pools → venuenext-kubernetes → Edit default job template/variables → set image and imagePullPolicy.
- Or inspect/update via CLI. Inspect current defaults:
Copy code
prefect work-pool inspect "venuenext-kubernetes"
- Update by supplying a base job template JSON if needed (a CLI sketch follows below).

Docs:
- Customize job variables
- Run on Kubernetes

Once you set the image in either place and redeploy, the Kubernetes API snippet above will confirm the exact image used by the pod.
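If you go the work-pool route via the CLI, a sketch of that flow (the template filename is a placeholder; edit the image default in the JSON before updating):
Copy code
prefect work-pool get-default-base-job-template --type kubernetes > k8s-job-template.json
# edit k8s-job-template.json: set the default "image" (and image_pull_policy) to your image
prefect work-pool update "venuenext-kubernetes" --base-job-template k8s-job-template.json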
c
@Marvin How do I deploy a flow without pushing my entire directory into an Azure blob storage?
m
thought for 295.5 seconds
You have a few good options in Prefect 3.x to avoid uploading your whole project to Azure Blob:

Option 1: Bake your code into the Docker image (no storage at all)
- Copy only the needed files into your image and run from there. Remove your Azure push/pull steps.
- Example prefect.yaml snippet:
Copy code
deployments:
- name: VenueNextColumns
  entrypoint: venueNextParseColumns_snowflake.py:main_flow  # path inside the container
  work_pool:
    name: venuenext-kubernetes
    job_variables:
      image: your-registry/your-image:unique-tag
      image_pull_policy: Always
  push: []
  pull: []
- Ensure your Dockerfile copies just the flow code you need, and the path matches the entrypoint.
- Docs: Run on Kubernetes

Option 2: Pull code from Git at runtime (no Azure storage)
- Replace your Azure push/pull steps with a Git clone pull step so workers fetch the code directly.
Copy code
deployments:
- name: VenueNextColumns
  entrypoint: venueNextParseColumns_snowflake.py:main_flow
  work_pool:
    name: venuenext-kubernetes
    job_variables:
      image: your-registry/your-image:unique-tag
      image_pull_policy: Always
  push: []
  pull:
  - prefect.deployments.steps.git_clone:
      repository: https://github.com/your-org/your-repo.git
      branch: main
      # credentials: '{{ prefect.blocks.secret.github-token }}'  # if private
- Docs: Store flow code, prefect.yaml

Option 3: Keep Azure, but ignore almost everything
- Your current step supports an ignore file. Create a .prefectignore that excludes the whole directory except the files you need.
Copy code
# .prefectignore
*
!venueNextParseColumns_snowflake.py
!your_package/**
!.prefectignore
- Or point the step at a custom ignore file:
Copy code
push:
- prefect_azure.deployments.steps.push_to_azure_blob_storage:
    container: prefectscripts
    folder: ''
    credentials: '{{ prefect.blocks.azure-blob-storage-credentials.prefectscriptskubernetes }}'
    ignore_file: .prefectignore
- Step signature reference: it accepts ignore_file and defaults to .prefectignore.

Which should you pick?
- If you already run on Kubernetes with a custom image, Option 1 is simplest and fastest at runtime.
- If you keep your code in Git and want light images, Option 2 works great.
- If you must use Azure storage but want smaller uploads, Option 3.

If you tell me which route you prefer, I can draft the exact prefect.yaml and Dockerfile changes for your project.

Links for reference:
- Store flow code
- prefect.yaml
- Run on Kubernetes
c
@Marvin I deployed my updated code and received the following error:
Copy code
AttributeError: module 'botocore' has no attribute 'config'
m
thought for 66.9 seconds
That error isn’t Prefect-specific — it means Python is not finding the expected submodule at botocore.config. Most often, it’s one of these:

Likely causes

1) Wrong import pattern
- This will fail:
Copy code
import botocore
cfg = botocore.config.Config(...)
- Use either of these instead:
Copy code
from botocore.config import Config
# or
import botocore.config as bc_config
cfg = bc_config.Config(...)
2) Local module shadowing the real botocore
- If your project has a file/folder named botocore.py or botocore/, it will shadow the installed AWS library.
- Quick check (run inside the failing pod or log in your flow):
Copy code
import botocore, sys
print("botocore imported from:", getattr(botocore, "__file__", "builtin"))
print("sys.path:", sys.path)
If the path points into your project instead of site-packages, rename/remove that file/folder.

3) Version mismatch between boto3, botocore, aiobotocore/s3fs
- s3fs pulls aiobotocore which pins specific botocore versions. If versions drift, odd import errors can show up.
- Check versions:
Copy code
import boto3, botocore, pkgutil
print("boto3:", boto3.__version__)
print("botocore:", botocore.__version__)
print("aiobotocore installed:", any(m.name=='aiobotocore' for m in pkgutil.iter_modules()))
- If mismatched, pin compatible versions and rebuild with a new image tag. For example:
Copy code
python -m pip install --no-cache-dir \
  "boto3>=1.34,<1.35" \
  "botocore>=1.34,<1.35" \
  "s3fs>=2024.5.0"
4) Submodule not imported before attribute access
- Accessing botocore.config only works after the submodule is imported. The from-import forms above ensure that.

Quick next steps
- Share the failing line(s) of code that reference botocore so I can suggest the exact fix.
- Run this small diagnostic in the pod and paste the output:
Copy code
python - <<'PY'
import sys, importlib, pkgutil
def info(name):
    try:
        m = importlib.import_module(name)
        print(f"{name} OK version={getattr(m,'__version__','?')} path={getattr(m,'__file__','builtin')}")
    except Exception as e:
        print(f"{name} FAIL {type(e).__name__}: {e}")

info("boto3")
info("botocore")
try:
    from botocore.config import Config
    print("from botocore.config import Config OK")
except Exception as e:
    print("botocore.config import FAIL:", e)

print("aiobotocore present:", any(m.name=='aiobotocore' for m in pkgutil.iter_modules()))
PY
If this is coming from s3fs usage, we can align versions or switch to the recommended import pattern and it should clear up quickly.