Nick Kagkalos
10/15/2024, 6:25 PM
ModuleNotFoundError for Python pandas even though it is installed locally:
...
prefect.exceptions.ScriptError: Script at '02-workflow-orchestration/test_pandas.py' encountered an exception: ModuleNotFoundError("No module named 'pandas'")
> Running git_clone step...
18:09:37.949 | ERROR | prefect.flow_runs.runner - Process for flow run 'wild-orangutan' exited with status code: 1
Prefect version:
Version: 3.0.4
API version: 0.8.4
Python version: 3.12.4
Git commit: c068d7e2
Built: Tue, Oct 1, 2024 11:54 AM
OS/Arch: linux/x86_64
Profile: local
Server type: server
Pydantic version: 2.9.2
Integrations:
  prefect-gcp: 0.6.1
  prefect-sqlalchemy: 0.5.1
  prefect-docker: 0.6.1
My Deployment script:
from prefect import flow

if __name__ == "__main__":
    flow.from_source(
        source="https://github.com/username/repo-name.git",
        entrypoint="my-script-directory/test_pandas.py:pandas_flow"
    ).deploy(
        name="test-pandas-deployment",
        work_pool_name="my-work-pool",
    )
My Python script fetching pandas library:
import pandas as pd
from prefect import flow, task

@task
def create_and_print_dataframe():
    df = pd.DataFrame({
        'A': [1, 2, 3],
        'B': [4, 5, 6]
    })
    print(df)

@flow()
def pandas_flow():
    create_and_print_dataframe()
Does anyone know what the problem is and how I could tackle it? TIA!
Nate
10/15/2024, 6:55 PM
Nick Kagkalos
10/15/2024, 7:41 PM
Nick Kagkalos
10/15/2024, 7:44 PM
Nate
10/15/2024, 7:48 PM
"I guess I need it to be Process?"
that's one thing you could do, but in general it's good for deployments to be self-contained, meaning that they know what deps they need. so I'll give 2 suggestions on how to fix this, the first being fine for testing and non-ideal for prod, the latter being my recommendation:
• slap job_variables=dict(env=dict(EXTRA_PIP_PACKAGES="pandas")) in your .deploy() call, which will install pandas at runtime each time you run a flow
• you guessed it, build your own Dockerfile and specify that as your image in your .deploy() call, like this
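For reference, the two options above could be sketched roughly as follows. This is a sketch under assumptions, not the exact code from the thread: the repo URL, entrypoint, and pool name are the placeholders from the deployment script earlier; the image name my-registry/test-pandas, its tag, and the two function names are made up for illustration. It also assumes the work pool runs flows in a container based on an official Prefect image (whose entrypoint is what reads EXTRA_PIP_PACKAGES) and that DockerImage is importable from prefect.docker, as in Prefect 3.x.

```python
# Sketch of the two suggested fixes. Names marked "hypothetical" are not
# from the thread and should be replaced with your own values.

SOURCE = "https://github.com/username/repo-name.git"
ENTRYPOINT = "my-script-directory/test_pandas.py:pandas_flow"

# Option 1: ask the container to pip-install pandas at startup on every run.
RUNTIME_DEPS = dict(env=dict(EXTRA_PIP_PACKAGES="pandas"))


def deploy_with_runtime_install():
    """Option 1: fine for testing, slower because pandas is installed per run."""
    from prefect import flow  # imported lazily so this file loads without Prefect

    flow.from_source(source=SOURCE, entrypoint=ENTRYPOINT).deploy(
        name="test-pandas-deployment",
        work_pool_name="my-work-pool",
        job_variables=RUNTIME_DEPS,
    )


def deploy_with_custom_image():
    """Option 2: bake pandas into an image built from your own Dockerfile."""
    from prefect import flow
    from prefect.docker import DockerImage  # assumed Prefect 3.x import path

    flow.from_source(source=SOURCE, entrypoint=ENTRYPOINT).deploy(
        name="test-pandas-deployment",
        work_pool_name="my-work-pool",
        image=DockerImage(
            name="my-registry/test-pandas",  # hypothetical image name
            tag="latest",                    # hypothetical tag
            dockerfile="Dockerfile",         # your Dockerfile that pip-installs pandas
        ),
    )
```

Calling one of the two functions from the deployment script replaces the original .deploy() call. The second keeps the dependency list in the image, so runs do not reinstall pandas each time.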
if you're into videos for this type of thing, I actually made one very related to this topic a bit ago
Nick Kagkalos
10/15/2024, 7:52 PM
Nate
10/15/2024, 7:58 PM
(pip install foo bar baz) so job_variables=dict(env=dict(EXTRA_PIP_PACKAGES="pandas matplotlib requests"))
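If the package list grows, the space-delimited value can be assembled instead of typed by hand. A minimal sketch; the helper name extra_pip_job_variables is hypothetical and not part of Prefect:

```python
def extra_pip_job_variables(*packages):
    """Build job_variables with a space-delimited EXTRA_PIP_PACKAGES value."""
    # Same format as the arguments to `pip install foo bar baz`.
    return {"env": {"EXTRA_PIP_PACKAGES": " ".join(packages)}}


job_variables = extra_pip_job_variables("pandas", "matplotlib", "requests")
print(job_variables)
# {'env': {'EXTRA_PIP_PACKAGES': 'pandas matplotlib requests'}}
```

The resulting dict can be passed as job_variables=... in the .deploy() call.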
Nick Kagkalos
10/15/2024, 8:07 PM