Charlie Nash
10/14/2025, 7:38 PMFinished in state Failed("Flow run encountered an exception: ProgrammingError: 255002: Optional dependency: 'pandas' is not installed, please see the following link for install instructions: <https://docs.snowflake.com/en/user-guide/python-connector-pandas.html#installation>")Marvin
10/14/2025, 7:38 PMMarvin
10/14/2025, 7:39 PMThat error comes from the Snowflake Python connector: it only imports pandas when a pandas-dependent code path runs, such as (a minimal sketch follows the list):
- cursor.fetch_pandas_all() or fetch_pandas_batches()
- snowflake.connector.pandas_tools.write_pandas(...)
- Snowpark DataFrame.to_pandas()
- SQLAlchemy Snowflake helpers that return DataFrames
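For example, a minimal sketch of a call that hits this code path (the connection parameters are placeholders, not your real config):
# Sketch: any pandas-returning helper imports pandas (and pyarrow) internally;
# if either is missing you get ProgrammingError 255002.
import snowflake.connector

conn = snowflake.connector.connect(
    account="your-account",      # placeholders - substitute your own values
    user="your-user",
    password="your-password",
    warehouse="your-warehouse",
    database="your-database",
)
cur = conn.cursor()
cur.execute("SELECT 1 AS n")
df = cur.fetch_pandas_all()      # this is the line that needs pandas/pyarrow
print(df)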
How to fix
- Install the pandas extra for the Snowflake connector in the runtime environment:
pip install "snowflake-connector-python[pandas]"
This pulls in pandas (and related deps like pyarrow) that those methods need.
- If you’re running in a container/worker image, add it to the image your deployment uses (e.g., in your Dockerfile):
RUN pip install --no-cache-dir "snowflake-connector-python[pandas]"
- If you manage dependencies with requirements/pyproject, add:
snowflake-connector-python[pandas]
Reference: Snowflake pandas integration installation
If you’re not explicitly using pandas in your code, it’s still likely one of the above Snowflake helpers under the hood. If you share:
- where the flow runs (local, Docker/K8s work pool, etc.), and
- the Snowflake call you’re making,
I can suggest the exact dependency change and where to put it.Charlie Nash
10/14/2025, 7:44 PMFROM prefecthq/prefect:3.2.14-python3.9
COPY . /opt/prefect/Scripts/
WORKDIR /opt/prefect/Scripts/
RUN pip install --no-cache-dir "snowflake-connector-python[pandas]" adlfs s3fs requests pandas pyodbc snowflake prefect_azure && apt-get update && apt-get install -y unixodbcMarvin
10/14/2025, 7:44 PMMarvin
10/14/2025, 7:50 PM1) Your deployment isn't using the image you built
- If you're on a Docker work pool, point the deployment at your image, e.g.:
from prefect import flow

# load the flow from its entrypoint, then deploy it against your existing image
flow.from_source(
    source=".",  # path (or repo URL) containing path/to/flow.py
    entrypoint="path/to/flow.py:my_flow",
).deploy(
    name="my-deployment",
    work_pool_name="my-docker-pool",
    image="your-registry/your-image:tag",
    build=False,  # image already built
    push=False,   # image already pushed
)
- Or set the “image” in your Docker work pool’s job variables or in your prefect.yaml and redeploy.
- Make sure your Docker worker is running and attached to that pool.
Docs: Use Docker with Prefect work pools
2) You’re running on a non-Docker worker
- If you’re using a Process/Kubernetes/other pool, the Dockerfile won’t affect the runtime environment unless the job actually pulls and runs that image.
- For Process workers, install the deps where the worker runs:
pip install "snowflake-connector-python[pandas]" pandas pyarrow
3) Small but important Dockerfile notes
- You’re installing both snowflake-connector-python[pandas] and snowflake. The PyPI package named snowflake is not the Snowflake connector and can cause module conflicts. Remove snowflake from your pip line.
- If you use DataFrame helpers (e.g., write_pandas, fetch_pandas_all), you often need pyarrow too. Add it explicitly:
RUN pip install --no-cache-dir "snowflake-connector-python[pandas]" pandas pyarrow adlfs s3fs requests pyodbc prefect-azure && \
apt-get update && apt-get install -y unixodbc
Note: In Prefect 3.x the Azure collection is published as prefect-azure (hyphen), not prefect_azure.
- Quick sanity test your image locally:
docker run --rm -it your-registry/your-image:tag \
python -c "import pandas, snowflake.connector; print('OK', pandas.__version__, snowflake.connector.__version__)"
Checklist to get you unblocked
- Confirm the image used by your failing run in the UI.
- If incorrect, update your deployment or pool to use your custom image and ensure the worker is running for that pool.
- Rebuild/tag/push your image, then redeploy:
docker build -t your-registry/your-image:tag .
docker push your-registry/your-image:tag
- Remove snowflake from your pip installs; keep snowflake-connector-python[pandas] (+ pyarrow).
- If not using a Docker work pool, install the deps in the actual runtime environment.
If you can share:
- Your work pool type (Docker, Process, K8s, etc.)
- How you’re creating the deployment (prefect.yaml or code/CLI) and which image it references
- A link or screenshot of the flow run’s Infrastructure panel showing the image
…I can pinpoint the exact change needed.Charlie Nash
10/14/2025, 8:08 PM- name: VenueNextColumns
version:
tags: []
description:
entrypoint: venueNextParseColumns_snowflake.py:main_flow
parameters: {}
work_pool:
name: venuenext-kubernetes
worker_name: KubernetesWorker ce78e435-8a1e-4950-8d8b-e4ee47546859
job_variables:
image: '{{ build-image.image }}'
schedules: []
push:
- prefect_azure.deployments.steps.push_to_azure_blob_storage:
container: prefectscripts
folder: ''
credentials: '{{ prefect.blocks.azure-blob-storage-credentials.prefectscriptskubernetes
}}'
pull:
- prefect_azure.deployments.steps.pull_from_azure_blob_storage:
container: prefectscripts
folder: ''
credentials: '{{ prefect.blocks.azure-blob-storage-credentials.prefectscriptskubernetes
}}'
concurrency_limit:Marvin
10/14/2025, 8:09 PMMarvin
10/14/2025, 8:13 PMYour deployment’s job_variables.image points to {{ build-image.image }}, but there is no build step in your prefect.yaml that actually builds/pushes an image and sets that variable. As a result, the Kubernetes job is likely using the work pool’s default image (a base Prefect image that does not include pandas), which causes the Snowflake “pandas not installed” error.
Two ways to fix it:
Option A: Hardcode your custom image tag in the deployment
- Build and push your image to a registry your cluster can pull from.
- Set job_variables.image to that exact tag.
Example:
work_pool:
  name: venuenext-kubernetes
  job_variables:
    image: "your-registry/your-image:tag"
    # if your registry is private:
    # image_pull_secrets:
    #   - name: your-registry-secret
Option B: Wire up docker build/push steps in prefect.yaml, then keep using {{ build-image.image }}
- Add build/push steps using the prefect-docker collection so build-image.image is produced:
build:
- prefect_docker.deployments.steps.build_docker_image:
    id: build-image
    requires: prefect-docker
    image_name: your-registry/your-image
    tag: v1
    dockerfile: ./Dockerfile
- prefect_docker.deployments.steps.push_docker_image:
    id: push-image
    requires: prefect-docker
    image_name: '{{ build-image.image_name }}'
    tag: '{{ build-image.tag }}'
- Keep your deployment’s job_variables.image: '{{ build-image.image }}'.
Make sure the image actually has the Snowflake pandas extras
Your Dockerfile is close, but I recommend these changes:
- Remove snowflake (that’s not the connector and can cause conflicts)
- Install the Azure collection with the correct PyPI name prefect-azure
- Add pyarrow (commonly needed with Snowflake pandas helpers)
Example:
FROM prefecthq/prefect:3.2.14-python3.9
COPY . /opt/prefect/Scripts/
WORKDIR /opt/prefect/Scripts/
RUN pip install --no-cache-dir \
"snowflake-connector-python[pandas]" \
pandas \
pyarrow \
adlfs \
s3fs \
requests \
pyodbc \
prefect-azure \
&& apt-get update \
&& apt-get install -y --no-install-recommends unixodbc \
&& rm -rf /var/lib/apt/lists/*
Verify what image was used by the failing run
- In the Prefect UI: Flow run → Infrastructure panel → check the “image” value. If it’s not your custom image, the deployment wasn’t pointing at it.
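- If you prefer the CLI, a hedged alternative is to inspect the run and look for its job variables and work pool details in the output (the flow-run ID is a placeholder; exact output fields depend on your Prefect version):
prefect flow-run inspect <flow-run-id>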
Helpful docs
- Kubernetes work pools and job variables: Run on Kubernetes
- Customize job variables: Customize job variables
If you share:
- The image value shown on the failing run’s Infrastructure panel
- Whether your cluster needs imagePullSecrets
- Your intended image tag/registry
…I can provide the exact prefect.yaml snippet you need.Charlie Nash
10/14/2025, 8:33 PMMarvin
10/14/2025, 8:34 PMMarvin
10/14/2025, 8:35 PMA few likely causes to check:
1) The job is still running a stale or cached image
- Push a new, unique tag and force pulls:
work_pool:
  name: venuenext-kubernetes
  job_variables:
    image: "your-registry/your-image:2024-10-14-001"  # new tag
    image_pull_policy: "Always"
Docs: Run on Kubernetes
2) Package conflict: the snowflake PyPI package is installed
- Your original Dockerfile installs snowflake, which is not the Snowflake connector and can interfere with imports.
- Rebuild your image without snowflake and make sure you’re using prefect-azure (hyphen), not prefect_azure (underscore) as the PyPI package name.
- Also add pyarrow, which Snowflake pandas helpers typically require.
Example Dockerfile:
FROM prefecthq/prefect:3.2.14-python3.9
COPY . /opt/prefect/Scripts/
WORKDIR /opt/prefect/Scripts/
# Use the same interpreter the container will run with
RUN python -m pip install --no-cache-dir \
"snowflake-connector-python[pandas]" \
pandas \
pyarrow \
adlfs \
s3fs \
requests \
pyodbc \
prefect-azure \
&& apt-get update \
&& apt-get install -y --no-install-recommends unixodbc \
&& rm -rf /var/lib/apt/lists/*
3) Packages installed into a different Python than the one used at runtime
- Using pip vs python -m pip can sometimes install into the wrong site-packages if multiple interpreters/venvs exist.
- You can sanity-check during build (heredoc RUN steps require BuildKit / a recent Docker):
RUN python - <<'PY'
import sys
print("Python exe:", sys.executable)
import pandas, snowflake.connector
print("pandas:", pandas.__version__)
print("snowflake:", snowflake.connector.__version__)
PY
4) The Snowflake call needs pandas/pyarrow, but import fails for a different reason
- The Snowflake connector raises the exact “Optional dependency: 'pandas' is not installed” message whenever importing pandas fails — that includes missing wheels or incompatible binaries.
- Verifying inside the running pod will tell us quickly:
# Grab the failing flow run's pod name, then:
kubectl exec -it <pod-name> -- python -c "import pandas, snowflake.connector; print('OK', pandas.__version__, snowflake.connector.__version__)"
If that import fails, it’s definitely an environment issue (wrong tag, wrong Python path, or the conflict from snowflake package).
5) Add a quick runtime check in your flow to confirm the environment
At the start of the flow:
from prefect import flow, get_run_logger

@flow
def main_flow(...):
    logger = get_run_logger()
    try:
        import sys, pandas, snowflake.connector
        logger.info(f"Python: {sys.executable}")
        logger.info(f"pandas: {pandas.__version__}")
        logger.info(f"snowflake: {snowflake.connector.__version__}")
    except Exception as e:
        logger.exception(f"Import check failed: {e}")
    # ...rest of flow...
Quick checklist
- Rebuild with the corrected Dockerfile (no snowflake, use prefect-azure, add pyarrow, use python -m pip).
- Push with a new, unique tag; set image_pull_policy: Always.
- Redeploy and run again.
- If it still fails, exec into the pod and run the one-liner import test above — that will pinpoint whether it’s the image or the application code.
If you can share:
- The exact image tag and whether image_pull_policy is set
- Output of the in-pod python -c "import pandas, snowflake.connector; ..." test
- The specific Snowflake API you’re calling (e.g., fetch_pandas_all, write_pandas, Snowpark to_pandas())
…I can zero in on the remaining cause.Charlie Nash
10/14/2025, 9:20 PMsnowflake-connector-python=4.0.0
pandas=2.3.3
pyarrow=21.0.0
However, I am still getting the error:
snowflake.connector.errors.ProgrammingError: 255002: Optional dependency: 'pandas' is not installed, please see the following link for install instructions: <https://docs.snowflake.com/en/user-guide/python-connector-pandas.html#installation>
Any ideas?Marvin
10/14/2025, 9:20 PMMarvin
10/14/2025, 9:22 PMThat 255002 error is raised whenever the connector’s internal `import pandas` fails at runtime, so we need to find out why the pandas import is failing inside the pod.
Fast way to pinpoint the cause
- Exec into a failing flow pod and run this:
kubectl exec -it <pod-name> -- sh -lc 'python - <<PY
import sys, importlib
print("Python:", sys.version, sys.executable)
for name in ("pandas", "numpy", "pyarrow", "snowflake.connector"):
    try:
        m = importlib.import_module(name)
        path = getattr(m, "__file__", None)
        ver = getattr(m, "__version__", "unknown")
        print(f"{name} OK {ver} at {path}")
    except Exception as e:
        print(f"{name} FAILED: {type(e).__name__}: {e}")
PY'
- This will tell us exactly which import fails and why (e.g., version mismatch, missing binary, shadowed module, etc.). Please paste the output here.
Common root causes I’ve seen for this exact symptom
1) Local module shadowing pandas
- A file/folder in your project named pandas.py or pandas/ can hijack the import.
- Check what’s being imported: the command above prints the module path. If it points into your project instead of site-packages, rename/remove that file.
- You can also log it in your flow:
import pandas, sys
print("pandas file:", getattr(pandas, "__file__", "n/a"))
2) Numpy/pandas ABI mismatch
- Pandas 2.3.x requires NumPy 2.x. If an older NumPy is present, import pandas will fail with an ImportError that Snowflake reports as “pandas not installed”.
- Ensure NumPy 2.x is installed in the runtime:
kubectl exec -it <pod> -- python -c "import numpy; import pandas; print(numpy.__version__, pandas.__version__)"
- If this fails, explicitly pin compatible versions in your image:
RUN python -m pip install --no-cache-dir \
"numpy>=2.0,<3" \
"pandas==2.3.3" \
"pyarrow==21.0.0" \
"snowflake-connector-python[pandas]==4.0.0"
3) Wrong interpreter/site-packages at runtime
- Very rarely, packages are installed with one interpreter but the job runs with another.
- Your base image is fine, but to be safe, use python -m pip in the Dockerfile (you did list versions; confirming with the exec test above will rule this out).
- Also ensure no venv is being activated in your container entrypoint that hides system site-packages.
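One way to check this before deploying, assuming you can run the image locally (the tag is a placeholder):
docker run --rm your-registry/your-image:tag sh -lc \
  'which python pip && python -m pip --version && pip --version'
If pip and python -m pip report different locations or versions, the packages are landing in a different interpreter than the one the job runs.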
4) Cached or stale image tag
- Even if the cluster “pulled,” it may still run a cached layer without the latest wheels.
- Use a unique image tag per build and set:
work_pool:
  job_variables:
    image_pull_policy: "Always"
5) Missing system libs for pandas/pyarrow wheels
- Uncommon on the Prefect base, but if import errors mention GLIBC/libstdc++ etc., we’ll see it in the exec output. We can add the needed apt packages if so.
Two extra sanity checks
- At the very top of your flow, log imports to the Prefect UI so you don’t need pod access:
from prefect import flow, get_run_logger

@flow
def main_flow():
    logger = get_run_logger()
    try:
        import sys, pandas, numpy, pyarrow, snowflake.connector
        logger.info(f"Python: {sys.executable}")
        logger.info(f"pandas: {pandas.__version__} ({getattr(pandas, '__file__', 'n/a')})")
        logger.info(f"numpy: {numpy.__version__}")
        logger.info(f"pyarrow: {pyarrow.__version__}")
        logger.info(f"snowflake: {snowflake.connector.__version__}")
    except Exception as e:
        logger.exception(f"Import check failed: {e}")
    # rest of your flow...
- Confirm there’s no snowflake PyPI package installed (only snowflake-connector-python). The wrong snowflake package can cause weird import behavior:
kubectl exec -it <pod> -- python -m pip show snowflake
If that prints package info, the standalone snowflake distribution is installed. (Simply importing snowflake would always succeed, because the connector itself provides that namespace, so check the installed distribution rather than the module.)
10/14/2025, 9:22 PMpandas.py shadowing the import.Charlie Nash
10/14/2025, 9:49 PMimport sys, importlib
print("Python:", sys.version, sys.executable)
for name in ("pandas", "numpy", "pyarrow", "snowflake.connector"):
    try:
        m = importlib.import_module(name)
        path = getattr(m, "__file__", None)
        ver = getattr(m, "__version__", "unknown")
        print(f"{name} OK {ver} at {path}")
    except Exception as e:
        print(f"{name} FAILED: {type(e).__name__}: {e}")
PY'
The printed statements I got were:
pandas OK 2.2.3 at /usr/local/lib/python3.9/site-packages/pandas/__init__.py
numpy OK 2.0.2 at /usr/local/lib/python3.9/site-packages/numpy/__init__.py
pyarrow FAILED: ModuleNotFoundError: No module named 'pyarrow'
snowflake.connector OK 3.14.0 at /usr/local/lib/python3.9/site-packages/snowflake/connector/__init__.py
Could you help me interpret this print screen?Marvin
10/14/2025, 9:49 PMMarvin
10/14/2025, 9:50 PMThat output explains the error: pyarrow is missing from the pod, and the connector there is 3.14.0 rather than the 4.0.0 you expected, so the pod is almost certainly not running the image you just built. Rebuild with pyarrow (and your intended pins) baked in:
FROM prefecthq/prefect:3.2.14-python3.9
WORKDIR /opt/prefect/Scripts/
COPY . /opt/prefect/Scripts/
RUN python -m pip install --no-cache-dir \
"snowflake-connector-python[pandas]==4.0.0" \
"pandas==2.3.3" \
"numpy>=2.0,<3" \
"pyarrow==21.0.0" \
adlfs s3fs requests pyodbc prefect-azure \
&& apt-get update \
&& apt-get install -y --no-install-recommends unixodbc \
&& rm -rf /var/lib/apt/lists/*
Then:
- Build/push with a unique tag and set image pull policy:
# build/push
docker build -t your-registry/your-image:2024-10-14-1 .
docker push your-registry/your-image:2024-10-14-1
- In your prefect.yaml deployment:
work_pool:
  name: venuenext-kubernetes
  job_variables:
    image: "your-registry/your-image:2024-10-14-1"
    image_pull_policy: "Always"
- Redeploy and run again.
Double-check in the pod after redeploy
kubectl exec -it <new-pod> -- sh -lc 'python - <<PY
import importlib
for name in ("pandas", "numpy", "pyarrow", "snowflake.connector"):
    try:
        m = importlib.import_module(name)
        print(name, "OK", getattr(m, "__version__", "unknown"))
    except Exception as e:
        print(name, "FAILED:", e)
PY'
If you need to confirm the current pod’s state right now (without rebuild), you can also test the hypothesis by installing pyarrow in-place:
kubectl exec -it <pod> -- sh -lc 'python -m pip install --no-cache-dir pyarrow && python -c "import pyarrow; print(pyarrow.__version__)"'
If your flow then proceeds past the error, that confirms pyarrow was the missing dependency. You’ll still want to bake it into the image and redeploy with a unique tag.
Given your pod currently shows connector 3.14.0, if you don’t want to bump to 4.0.0 yet, at least add pyarrow to the existing image and re-tag:
python -m pip install "snowflake-connector-python[pandas]==3.14.0" pyarrow
Most likely causes for the mismatch you saw
- The image used by the job was not the one you expected (stale tag/cache). Use a unique tag and image_pull_policy: Always.
- Another install step (requirements.txt, a runtime init step, or a layer in your Dockerfile) installed/overrode the connector to 3.14.0 and did not include pyarrow.
- Installing with plain pip vs python -m pip into a different interpreter/site-packages.
Once pyarrow is present and versions are aligned, this specific 255002 should go away. If it persists after you confirm pyarrow imports inside the pod, share the new import test output and the exact Snowflake call you’re making (e.g., fetch_pandas_all, write_pandas, etc.) and we’ll dig further.Charlie Nash
10/15/2025, 3:45 PMMarvin
10/15/2025, 3:45 PMMarvin
10/15/2025, 3:46 PMYou can log the deployment's job variables (including any image override) from inside the flow via prefect.runtime:
from prefect import flow, get_run_logger
from prefect.runtime import flow_run

@flow
def main_flow():
    logger = get_run_logger()
    jv = getattr(flow_run, "job_variables", None) or {}
    image = jv.get("image")
    logger.info(f"Kubernetes job_variables: {jv}")
    logger.info(f"Container image: {image}")
    # ...rest of your flow...
Notes:
- flow_run.job_variables is populated when the flow runs from a deployment. If you run locally (e.g., .serve()), it may be empty or missing.
If you want to query Kubernetes directly (e.g., to double-check what actually ran), you can read the current Pod spec using the in-cluster Kubernetes client. This requires your service account to have permission to read its own Pod:
import os
from kubernetes import client, config
from prefect import get_run_logger

def log_pod_images():
    logger = get_run_logger()
    config.load_incluster_config()
    pod_name = os.environ.get("HOSTNAME")  # defaults to the current pod name
    with open("/var/run/secrets/kubernetes.io/serviceaccount/namespace") as f:
        namespace = f.read().strip()
    v1 = client.CoreV1Api()
    pod = v1.read_namespaced_pod(name=pod_name, namespace=namespace)
    images = [c.image for c in pod.spec.containers]
    logger.info(f"K8s pod containers: {images}")
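If that in-cluster call fails with a 403, the pod's service account likely lacks read access to pods. A minimal RBAC sketch, assuming the default service account and a placeholder namespace (adjust both to whatever your work pool's job template actually uses):
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: read-own-pods
  namespace: your-namespace      # placeholder
rules:
- apiGroups: [""]
  resources: ["pods"]
  verbs: ["get"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: read-own-pods
  namespace: your-namespace      # placeholder
subjects:
- kind: ServiceAccount
  name: default                  # assumption - the SA your flow-run pods use
  namespace: your-namespace
roleRef:
  kind: Role
  name: read-own-pods
  apiGroup: rbac.authorization.k8s.io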
Docs:
- Kubernetes work pools and job variables: Run on Kubernetes
- Customize job variables: Customize job variablesCharlie Nash
10/15/2025, 3:52 PMKubernetes job_variables: {}
Container image: None
Does this mean my flow is not grabbing the correct docker image? How can I update my code to ensure it grabs the correct docker image?Marvin
10/15/2025, 3:52 PMMarvin
10/15/2025, 3:53 PM# pip install kubernetes in your image if you don't have it
# RUN python -m pip install kubernetes
from prefect import flow, get_run_logger
import os
def log_pod_images():
from kubernetes import client, config
logger = get_run_logger()
try:
config.load_incluster_config()
with open("/var/run/secrets/kubernetes.io/serviceaccount/namespace") as f:
namespace = f.read().strip()
pod_name = os.environ.get("HOSTNAME")
v1 = client.CoreV1Api()
pod = v1.read_namespaced_pod(name=pod_name, namespace=namespace)
images = [c.image for c in pod.spec.containers]
<http://logger.info|logger.info>(f"Pod: {namespace}/{pod_name} images: {images}")
except Exception as e:
logger.exception(f"Failed to query K8s pod image: {e}")
@flow
def main_flow():
logger = get_run_logger()
# This prints deployment overrides (empty if image comes from pool defaults)
from prefect.runtime import flow_run
<http://logger.info|logger.info>(f"Kubernetes job_variables: {getattr(flow_run, 'job_variables', {})}")
# This prints the actual pod container image(s)
log_pod_images()
# ...rest of your flow...
How to “ensure it grabs the correct docker image”
You have two reliable options:
- Set the image on the deployment:
- In your prefect.yaml under this deployment, set:
work_pool:
  name: venuenext-kubernetes
  job_variables:
    image: "your-registry/your-image:unique-tag"
    image_pull_policy: "Always"
- Use a unique tag per build to avoid stale caches.
- Or set the default image on the work pool:
- Prefect UI → Work Pools → venuenext-kubernetes → Edit default job template/variables → set image and imagePullPolicy.
- Or inspect/update via CLI:
- Inspect current defaults:
prefect work-pool inspect "venuenext-kubernetes"
- Update by supplying a base job template JSON if needed.
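A hedged sketch of that workflow via the CLI (verify these subcommands and flags against your Prefect version before relying on them):
prefect work-pool get-default-base-job-template --type kubernetes > k8s-job-template.json
# edit k8s-job-template.json: set the default for the "image" variable (and imagePullPolicy)
prefect work-pool update "venuenext-kubernetes" --base-job-template k8s-job-template.json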
Docs:
- Customize job variables
- Run on Kubernetes
Once you set the image in either place and redeploy, the Kubernetes API snippet above will confirm the exact image used by the pod.Charlie Nash
10/15/2025, 5:45 PMMarvin
10/15/2025, 5:45 PMMarvin
10/15/2025, 5:50 PMOption 1: Bake the flow code into your Docker image (skip remote storage)
- Point the deployment at the code already inside the container and clear the push/pull steps:
deployments:
- name: VenueNextColumns
  entrypoint: venueNextParseColumns_snowflake.py:main_flow  # path inside the container
  work_pool:
    name: venuenext-kubernetes
    job_variables:
      image: your-registry/your-image:unique-tag
      image_pull_policy: Always
  push: []
  pull: []
- Ensure your Dockerfile copies just the flow code you need, and that the path matches the entrypoint (a minimal sketch follows this list).
- Docs: Run on Kubernetes
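A minimal Dockerfile sketch for this option, assuming the entrypoint above (file names and packages are illustrative; add the rest of your dependencies as needed):
FROM prefecthq/prefect:3.2.14-python3.9
WORKDIR /opt/prefect/Scripts/
# copy only the flow code the entrypoint needs, not the whole directory
COPY venueNextParseColumns_snowflake.py ./
RUN python -m pip install --no-cache-dir \
    "snowflake-connector-python[pandas]" pandas pyarrow prefect-azure pyodbc \
    && apt-get update \
    && apt-get install -y --no-install-recommends unixodbc \
    && rm -rf /var/lib/apt/lists/*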
Option 2: Pull code from Git at runtime (no Azure storage)
- Replace your Azure push/pull steps with a Git clone pull step so workers fetch the code directly.
deployments:
- name: VenueNextColumns
  entrypoint: venueNextParseColumns_snowflake.py:main_flow
  work_pool:
    name: venuenext-kubernetes
    job_variables:
      image: your-registry/your-image:unique-tag
      image_pull_policy: Always
  push: []
  pull:
  - prefect.deployments.steps.git_clone:
      repository: https://github.com/your-org/your-repo.git
      branch: main
      # credentials: '{{ prefect.blocks.secret.github-token }}'  # if private
- Docs: Store flow code, prefect.yaml
Option 3: Keep Azure, but ignore almost everything
- Your current step supports an ignore file. Create a .prefectignore that excludes the whole directory except the files you need.
# .prefectignore
*
!venueNextParseColumns_snowflake.py
!your_package/**
!.prefectignore
- Or point the step at a custom ignore file:
push:
- prefect_azure.deployments.steps.push_to_azure_blob_storage:
    container: prefectscripts
    folder: ''
    credentials: '{{ prefect.blocks.azure-blob-storage-credentials.prefectscriptskubernetes }}'
    ignore_file: .prefectignore
- Step signature reference: it accepts ignore_file and defaults to .prefectignore.
Which should you pick?
- If you already run on Kubernetes with a custom image, Option 1 is simplest and fastest at runtime.
- If you keep your code in Git and want light images, Option 2 works great.
- If you must use Azure storage but want smaller uploads, Option 3.
If you tell me which route you prefer, I can draft the exact prefect.yaml and Dockerfile changes for your project. Links for reference:
- Store flow code
- prefect.yaml
- Run on KubernetesCharlie Nash
10/15/2025, 5:57 PMAttributeError: module 'botocore' has no attribute 'config'Marvin
10/15/2025, 5:57 PMMarvin
10/15/2025, 5:58 PM1) Accessing botocore.config without importing the submodule
- This pattern fails because a plain import botocore does not load the config submodule:
import botocore
cfg = botocore.config.Config(...)
- Use either of these instead:
from botocore.config import Config
# or
import botocore.config as bc_config
cfg = bc_config.Config(...)
2) Local module shadowing the real botocore
- If your project has a file/folder named botocore.py or botocore/, it will shadow the installed AWS library.
- Quick check (run inside the failing pod or log in your flow):
import botocore, sys
print("botocore imported from:", getattr(botocore, "__file__", "builtin"))
print("sys.path:", sys.path)
If the path points into your project instead of site-packages, rename/remove that file/folder.
3) Version mismatch between boto3, botocore, aiobotocore/s3fs
- s3fs pulls aiobotocore which pins specific botocore versions. If versions drift, odd import errors can show up.
- Check versions:
import boto3, botocore, pkgutil
print("boto3:", boto3.__version__)
print("botocore:", botocore.__version__)
print("aiobotocore installed:", any(m.name=='aiobotocore' for m in pkgutil.iter_modules()))
- If mismatched, pin compatible versions and rebuild with a new image tag. For example:
python -m pip install --no-cache-dir \
"boto3>=1.34,<1.35" \
"botocore>=1.34,<1.35" \
"s3fs>=2024.5.0"
4) Submodule not imported before attribute access
- Accessing botocore.config only works after the submodule is imported. The from-import forms above ensure that.
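A small, hedged demonstration of that behavior (run it in a fresh interpreter, since anything that already imported botocore.config will mask the failure):
import botocore

try:
    botocore.config.Config()              # fails: the submodule isn't imported yet
except AttributeError as exc:
    print("reproduced:", exc)

from botocore.config import Config       # importing the submodule fixes it
print("ok:", Config(retries={"max_attempts": 5}))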
Quick next steps
- Share the failing line(s) of code that reference botocore so I can suggest the exact fix.
- Run this small diagnostic in the pod and paste output:
python - <<'PY'
import sys, importlib, pkgutil

def info(name):
    try:
        m = importlib.import_module(name)
        print(f"{name} OK version={getattr(m,'__version__','?')} path={getattr(m,'__file__','builtin')}")
    except Exception as e:
        print(f"{name} FAIL {type(e).__name__}: {e}")

info("boto3")
info("botocore")
try:
    from botocore.config import Config
    print("from botocore.config import Config OK")
except Exception as e:
    print("botocore.config import FAIL:", e)
print("aiobotocore present:", any(m.name=='aiobotocore' for m in pkgutil.iter_modules()))
PY
If this is coming from s3fs usage, we can align versions or switch to the recommended import pattern and it should clear up quickly.