Miguel Moncada
05/22/2024, 4:02 PM
europe-docker.pkg.dev/project/folder-staging/image_name:latest), previously this image had an issue as it was missing a dependency (the pandas library). I fixed this and the image was redeployed via our CI; I've verified that the image now has the dependency installed correctly.
The CI was also triggered to redeploy the flow by calling the following function:
def deploy_projects_observability_export_flow():
    projects_observability_export_flow.deploy(
        name="projects_observability_export_flow_k8s_deployment",
        work_pool_name="default-worker-pool",
        image="europe-docker.pkg.dev/***/prefect-runners-staging/default:latest",
        tags=["staging"],
        schedule=CronSchedule(cron="0 0 * * *", timezone="UTC"),
        job_variables={
            "cpu_request": "110m",
            "mem_request": "128Mi",
            "cpu_limit": "1200m",
            "mem_limit": "8Gi",
        },
    )
However, the flow runs that are set to use this image keep failing with a ModuleNotFoundError. Do you happen to know what I'm missing?
Kevin Grismore
05/22/2024, 4:31 PM
Miguel Moncada
05/22/2024, 4:31 PM
Miguel Moncada
05/22/2024, 4:32 PM
k describe job -n prefect phenomenal-sidewinder-csnmg
Name: phenomenal-sidewinder-csnmg
Namespace: prefect
Selector: batch.kubernetes.io/controller-uid=32b709ed-0221-40e2-94af-0c99a6864fc0
Labels: prefect.io/deployment-id=99693312-dfdf-458b-ba2a-fe9a36423863
prefect.io/deployment-name=projects_observability_export_flow_k8s_deployment
prefect.io/deployment-updated=2024-05-22t16-28-20.988838z
prefect.io/flow-id=05196d14-2c87-408d-811b-0f7f543717cf
prefect.io/flow-name=projects_observability_export_flow
prefect.io/flow-run-id=a6344f2c-32ba-4300-9e6e-49dadfb62054
prefect.io/flow-run-name=phenomenal-sidewinder
prefect.io/version=2.19.1
Annotations: batch.kubernetes.io/job-tracking:
Parallelism: 1
Completions: 1
Completion Mode: NonIndexed
Start Time: Wed, 22 May 2024 18:29:19 +0200
Completed At: Wed, 22 May 2024 18:29:38 +0200
Duration: 19s
Pods Statuses: 0 Active (0 Ready) / 1 Succeeded / 0 Failed
Pod Template:
Labels: batch.kubernetes.io/controller-uid=32b709ed-0221-40e2-94af-0c99a6864fc0
batch.kubernetes.io/job-name=phenomenal-sidewinder-csnmg
controller-uid=32b709ed-0221-40e2-94af-0c99a6864fc0
job-name=phenomenal-sidewinder-csnmg
Containers:
prefect-job:
Image: europe-docker.pkg.dev/cambium-earth/prefect-runners-staging/projects_observability:latest
Port: <none>
Host Port: <none>
Args:
prefect
flow-run
execute
Limits:
cpu: 1200m
memory: 8Gi
Requests:
cpu: 110m
memory: 128Mi
Environment:
PREFECT_DEBUG_MODE: False
PREFECT_API_URL: https://api.prefect.cloud/api/accounts/XXXXXX/workspaces/XXXXX
PREFECT_API_KEY: pnu_XXXXX
PREFECT_API_ENABLE_HTTP2: True
PREFECT_WORKER_QUERY_SECONDS: 5.0
PREFECT_WORKER_PREFETCH_SECONDS: 10.0
PREFECT__FLOW_RUN_ID: a6344f2c-32ba-4300-9e6e-49dadfb62054
Mounts: <none>
Volumes: <none>
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal SuccessfulCreate 62s job-controller Created pod: phenomenal-sidewinder-csnmg-bx94s
Normal Completed 42s job-controller Job completed
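One detail in the output above may be worth a second look: the Image line names `.../prefect-runners-staging/projects_observability:latest`, while the `deploy()` call shown earlier passes `.../prefect-runners-staging/default:latest`. A minimal sketch for spotting that kind of drift, assuming the `kubectl describe job` text has been saved to a string. `image_from_describe` and `same_name_and_tag` are hypothetical helper names (not part of Prefect or kubectl), and only the final `name:tag` segment is compared because the project segment is redacted in the deploy call:

```python
import re
from typing import Optional


def image_from_describe(describe_text: str) -> Optional[str]:
    """Return the first 'Image:' value found in `kubectl describe` output."""
    match = re.search(r"^\s*Image:\s*(\S+)\s*$", describe_text, flags=re.MULTILINE)
    return match.group(1) if match else None


def same_name_and_tag(image_a: str, image_b: str) -> bool:
    """Compare only the final 'name:tag' segment of two image references."""
    return image_a.rsplit("/", 1)[-1] == image_b.rsplit("/", 1)[-1]


# Excerpt copied from the job description above.
describe_output = """\
Containers:
prefect-job:
Image: europe-docker.pkg.dev/cambium-earth/prefect-runners-staging/projects_observability:latest
Port: <none>
"""
# Image string as it appears in the deploy() call (project segment redacted).
deployed_image = "europe-docker.pkg.dev/***/prefect-runners-staging/default:latest"

running_image = image_from_describe(describe_output)
print(running_image)
print(same_name_and_tag(running_image, deployed_image))  # False
```

A mismatch here does not by itself explain the missing module, but it does mean the image that was inspected and the image that ran may not be the same one.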
Miguel Moncada
05/22/2024, 4:33 PM
Worker 'KubernetesWorker b1900e5c-ad9d-4be0-9998-e1a160579573' submitting flow run 'a6344f2c-32ba-4300-9e6e-49dadfb62054'
06:29:19 PM
prefect.flow_runs.worker
Creating Kubernetes job...
06:29:19 PM
prefect.flow_runs.worker
Job 'phenomenal-sidewinder-csnmg': Pod has status 'Pending'.
06:29:19 PM
prefect.flow_runs.worker
Completed submission of flow run 'a6344f2c-32ba-4300-9e6e-49dadfb62054'
06:29:19 PM
prefect.flow_runs.worker
Job 'phenomenal-sidewinder-csnmg': Pod has status 'Running'.
06:29:22 PM
prefect.flow_runs.worker
Opening process...
06:29:25 PM
prefect.flow_runs.runner
Downloading flow code from storage at '.'
06:29:28 PM
prefect.flow_runs
Flow could not be retrieved from deployment.
Traceback (most recent call last):
File "<frozen importlib._bootstrap_external>", line 940, in exec_module
File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed
File "/opt/prefect/data-canopy/data-flows/dataflows/flows/projects_observability_export_flow.py", line 1, in <module>
import pandas as pd
ModuleNotFoundError: No module named 'pandas'
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/usr/local/lib/python3.11/site-packages/prefect/engine.py", line 420, in retrieve_flow_then_begin_flow_run
else await load_flow_from_flow_run(flow_run, client=client)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/prefect/client/utilities.py", line 100, in with_injected_client
return await fn(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/prefect/deployments/deployments.py", line 317, in load_flow_from_flow_run
flow = await run_sync_in_worker_thread(load_flow_from_entrypoint, str(import_path))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/prefect/utilities/asyncutils.py", line 132, in run_sync_in_worker_thread
return await anyio.to_thread.run_sync(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/anyio/to_thread.py", line 33, in run_sync
return await get_asynclib().run_sync_in_worker_thread(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/anyio/_backends/_asyncio.py", line 877, in run_sync_in_worker_thread
return await future
^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/anyio/_backends/_asyncio.py", line 807, in run
result = context.run(func, *args)
^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/prefect/flows.py", line 1682, in load_flow_from_entrypoint
flow = import_object(entrypoint)
^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/prefect/utilities/importtools.py", line 201, in import_object
module = load_script_as_module(script_path)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/prefect/utilities/importtools.py", line 164, in load_script_as_module
raise ScriptError(user_exc=exc, path=path) from exc
prefect.exceptions.ScriptError: Script at 'data-flows/dataflows/flows/projects_observability_export_flow.py' encountered an exception: ModuleNotFoundError("No module named 'pandas'")
06:29:28 PM
prefect.flow_runs
Process for flow run 'phenomenal-sidewinder' exited cleanly.
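The traceback shows that the interpreter running `prefect flow-run execute` could not import `pandas`, so it can help to record which interpreter that is and what it can see. A minimal sketch using only the standard library; `describe_environment` is a hypothetical helper name, and the assumption is that it runs from a throwaway script inside the same container, since the failing import here happens before any flow code executes:

```python
import importlib.util
import sys


def describe_environment(modules):
    """Report the running interpreter and whether each top-level module resolves."""
    report = {"executable": sys.executable, "modules": {}}
    for name in modules:
        # find_spec returns None when a top-level module cannot be located.
        spec = importlib.util.find_spec(name)
        report["modules"][name] = {
            "found": spec is not None,
            "origin": getattr(spec, "origin", None),
        }
    return report


if __name__ == "__main__":
    info = describe_environment(["pandas", "prefect"])
    print("python:", info["executable"])
    for name, details in info["modules"].items():
        status = details["origin"] if details["found"] else "NOT FOUND"
        print(f"{name}: {status}")
```

Comparing this output between an interactive session in the image and the Kubernetes job would show whether both are using the same interpreter and site-packages.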
Miguel Moncada
05/22/2024, 4:36 PM
ValueError: ctypes objects containing pointers cannot be pickled
UPDATE: Not sure why this was working earlier, but I changed the logic to avoid returning custom class instances, so we can ignore this.
Miguel Moncada
05/22/2024, 6:19 PM
Kevin Grismore
05/22/2024, 6:22 PM
Miguel Moncada
05/22/2024, 6:22 PM
Miguel Moncada
05/23/2024, 7:43 AM
Miguel Moncada
05/23/2024, 7:59 AM
docker run -ti europe-docker.pkg.dev/cambium-earth/prefect-runners-staging/default:latest
WARNING: The requested image's platform (linux/amd64) does not match the detected host platform (linux/arm64/v8) and no specific platform was requested
___ ___ ___ ___ ___ ___ _____
| _ \ _ \ __| __| __/ __|_ _|
| _/ / _|| _|| _| (__ | |
|_| |_|_\___|_| |___\___| |_|
root@42107936d492:/opt/prefect# ls
dataflows dbt entrypoint.sh requirements-client.txt setup.py
dataflows.egg-info dist include-requirements.txt requirements.txt
root@42107936d492:/opt/prefect# pip freeze
agate==1.9.1
aiohttp==3.9.5
aiosignal==1.3.1
aiosqlite==0.20.0
alembic==1.13.1
annotated-types==0.6.0
anyio==3.7.1
apprise==1.8.0
asgi-lifespan==2.1.0
async-timeout==4.0.3
asyncpg==0.29.0
attrs==23.2.0
Babel==2.15.0
cachetools==5.3.3
certifi==2024.2.2
cffi==1.16.0
charset-normalizer==3.3.2
click==8.1.7
click-plugins==1.1.1
cligj==0.7.2
cloudpickle==3.0.0
colorama==0.4.6
coolname==2.2.0
croniter==2.0.5
cryptography==42.0.7
daff==1.3.46
# Editable install with no version control (dataflows==0.1.0)
-e /opt/prefect
dateparser==1.2.0
dbt-adapters==1.2.1
dbt-common==1.1.0
dbt-core==1.8.0
dbt-extractor==0.5.1
dbt-semantic-interfaces==0.5.1
decorator==5.1.1
dnspython==2.6.1
docker==6.1.3
duckdb==0.10.2
email_validator==2.1.1
exceptiongroup==1.2.1
Faker==24.14.1
fiona==1.9.6
frozenlist==1.4.1
fsspec==2024.3.1
gcsfs==2024.3.1
geopandas==0.14.4
google-api-core==2.19.0
google-api-python-client==2.129.0
google-auth==2.29.0
google-auth-httplib2==0.2.0
google-auth-oauthlib==1.2.0
google-cloud-core==2.4.1
google-cloud-storage==2.16.0
google-crc32c==1.5.0
google-resumable-media==2.7.0
googleapis-common-protos==1.63.0
graphql-core==3.2.3
graphviz==0.20.3
greenlet==3.0.3
griffe==0.45.0
h11==0.14.0
h2==4.1.0
hpack==4.0.0
httpcore==1.0.5
httplib2==0.22.0
httpx==0.27.0
humanize==4.9.0
hyperframe==6.0.1
idna==3.7
importlib-metadata==6.11.0
importlib_resources==6.1.3
isodate==0.6.1
itsdangerous==2.2.0
Jinja2==3.1.4
jinja2-humanize-extension==0.4.0
jsonpatch==1.33
jsonpointer==2.4
jsonschema==4.22.0
jsonschema-specifications==2023.12.1
kubernetes==29.0.0
leather==0.4.0
Logbook==1.5.3
Mako==1.3.5
Markdown==3.6
markdown-it-py==3.0.0
MarkupSafe==2.1.5
mashumaro==3.13
mdurl==0.1.2
minimal-snowplow-tracker==0.0.2
more-itertools==10.2.0
msgpack==1.0.8
multidict==6.0.5
networkx==3.3
numpy==1.26.4
oauthlib==3.2.2
orjson==3.10.3
packaging==24.0
pandas==2.2.2
parsedatetime==2.6
pathspec==0.12.1
pendulum==2.1.2
prefect==2.18.1
prefect-dbt==0.4.3
prefect-gcp==0.5.10
prefect-shell==0.2.5
proto-plus==1.23.0
protobuf==4.25.3
psycopg==3.1.19
pyasn1==0.6.0
pyasn1_modules==0.4.0
pycparser==2.22
pydantic==2.7.1
pydantic_core==2.18.2
pygit2==1.14.1
Pygments==2.18.0
pyparsing==3.1.2
pyproj==3.6.1
python-dateutil==2.9.0.post0
python-dotenv==1.0.1
python-multipart==0.0.9
python-slugify==8.0.4
pytimeparse==1.1.8
pytz==2024.1
pytzdata==2020.1
PyYAML==6.0.1
readchar==4.1.0
referencing==0.35.1
regex==2024.5.15
requests==2.31.0
requests-oauthlib==2.0.0
rfc3339-validator==0.1.4
rich==13.7.1
rpds-py==0.18.1
rsa==4.9
ruamel.yaml==0.18.6
ruamel.yaml.clib==0.2.8
sgqlc==16.3
shapely==2.0.4
shellingham==1.5.4
six==1.16.0
sniffio==1.3.1
SQLAlchemy==2.0.30
sqlparse==0.5.0
tenacity==8.3.0
text-unidecode==1.3
toml==0.10.2
typer==0.12.3
typing_extensions==4.11.0
tzdata==2024.1
tzlocal==5.2
ujson==5.10.0
uritemplate==4.1.1
urllib3==2.2.1
uvicorn==0.28.1
websocket-client==1.8.0
websockets==12.0
yarl==1.9.4
zipp==3.18.2
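The listing above includes `pandas==2.2.2`, which is what makes the runtime error surprising. A minimal sketch for checking a saved `pip freeze` listing against the packages a flow needs; `parse_freeze` and `missing_pins` are hypothetical helper names, the parser only understands plain `name==version` lines, and the `required` pins are taken from the `EXTRA_PIP_PACKAGES` value that appears later in the thread:

```python
def parse_freeze(text):
    """Map lower-cased package names to versions from `pip freeze` output."""
    pins = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blanks, comments, and editable installs such as '-e /opt/prefect'.
        if not line or line.startswith(("#", "-e")) or "==" not in line:
            continue
        name, version = line.split("==", 1)
        pins[name.lower()] = version
    return pins


def missing_pins(installed, required):
    """Return required pins that are absent or installed at another version."""
    return {
        name: version
        for name, version in required.items()
        if installed.get(name.lower()) != version
    }


# A few lines copied from the listing above.
freeze_output = """\
# Editable install with no version control (dataflows==0.1.0)
-e /opt/prefect
geopandas==0.14.4
pandas==2.2.2
psycopg==3.1.19
"""
required = {"pandas": "2.2.2", "geopandas": "0.14.4", "psycopg-binary": "3.1.19"}
print(missing_pins(parse_freeze(freeze_output), required))
# {'psycopg-binary': '3.1.19'}
```
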
The job's pod definition shows that the `imagePullPolicy` is correctly set to `Always`:
k get pod -n prefect amigurumi-grebe-hmq8s-78mwz -o yaml
<...>
    image: europe-docker.pkg.dev/cambium-earth/prefect-runners-staging/default:latest
    imagePullPolicy: Always
    name: prefect-job
    resources:
      limits:
        cpu: 1200m
        memory: 8Gi
      requests:
        cpu: 110m
        memory: 128Mi
<...>
And yet the same error occurs:
Job 'amigurumi-grebe-hmq8s': Pod has status 'Running'.
09:46:08 AM
prefect.flow_runs.worker
Opening process...
09:46:10 AM
prefect.flow_runs.runner
Downloading flow code from storage at '.'
09:46:14 AM
prefect.flow_runs
Flow could not be retrieved from deployment.
Traceback (most recent call last):
File "<frozen importlib._bootstrap_external>", line 940, in exec_module
File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed
File "/opt/prefect/data-canopy/data-flows/dataflows/flows/projects_observability_export_flow.py", line 4, in <module>
import pandas as pd
ModuleNotFoundError: No module named 'pandas'
Am I missing something obvious?
Miguel Moncada
05/23/2024, 9:16 AM
"env": {
    "EXTRA_PIP_PACKAGES": "pandas==2.2.2 geopandas==0.14.4 psycopg==3.1.19 psycopg-binary==3.1.19",
}
In my job_variables
cc @Kevin Grismore
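For context, the entrypoint of Prefect's official Docker images runs `pip install` on whatever `EXTRA_PIP_PACKAGES` contains when the container starts. A minimal sketch of how the `job_variables` from the earlier `deploy()` call could be assembled with that `env` entry; `build_job_variables` is a hypothetical helper, and whether a custom image honors the variable depends on it keeping the stock Prefect entrypoint:

```python
def build_job_variables(extra_packages):
    """Combine the resource settings from the deploy call with an env entry."""
    return {
        "cpu_request": "110m",
        "mem_request": "128Mi",
        "cpu_limit": "1200m",
        "mem_limit": "8Gi",
        # Space-separated pins, the same form `pip install` takes on its command line.
        "env": {"EXTRA_PIP_PACKAGES": " ".join(extra_packages)},
    }


job_variables = build_job_variables(
    ["pandas==2.2.2", "geopandas==0.14.4", "psycopg==3.1.19", "psycopg-binary==3.1.19"]
)
print(job_variables["env"]["EXTRA_PIP_PACKAGES"])
# pandas==2.2.2 geopandas==0.14.4 psycopg==3.1.19 psycopg-binary==3.1.19
```

The resulting dict would be passed as `job_variables=` to `deploy()`. Packages listed this way are installed on every container start, so each flow run pays the installation time.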