Hash Lin
04/16/2022, 2:28 PM
Failed to load and execute flow run: ModuleNotFoundError("No module named '/Users/xxx/'")
Thanks for helping. 🙇
Kevin Kho
04/16/2022, 5:43 PM
Hash Lin
04/17/2022, 7:18 AM
import prefect
from prefect import task, Flow
from prefect.run_configs import KubernetesRun

from app.job.experiment import ExperimentJob


@task
def process():
    logger = prefect.context.get("logger")
    experiment_job = ExperimentJob(logger)
    experiment_job.update_experiment_user_count()


with Flow(
    "update_experiment_user_count",
    run_config=KubernetesRun(
        image_pull_policy="Always",
        labels=["k8s"],
        image="xxx.dkr.ecr.us-west-2.amazonaws.com/prefect-test",
    ),
) as flow:
    process()

flow.register(project_name="test_project", labels=["k8s"])
flow = Flow("my-flow")
flow.storage = GitLab(repo="my/repo", path="/flow.py", ref="MISOCU-3337")
https://cln.sh/GPi8xP
I'm not quite sure where I went wrong here. 🤦
flow.py as path, not /flow.py. I think the documentation guided me in the wrong direction. 🤦
ModuleNotFoundError problem after I used GitLab Storage. Here's the error message and the example code.
prefect ModuleNotFoundError: No module named 'app'
import prefect
from prefect import task, Flow
from prefect.run_configs import KubernetesRun
from prefect.storage import GitLab

from app.job.experiment import ExperimentJob


@task
def process():
    logger = prefect.context.get("logger")
    experiment_job = ExperimentJob(logger)
    experiment_job.update_experiment_user_count()


with Flow(
    "update_experiment_user_count",
    run_config=KubernetesRun(
        image_pull_policy="Always",
        labels=["k8s"],
        image="xxx.dkr.ecr.us-west-2.amazonaws.com/prefect-test",
    ),
    storage=GitLab(
        access_token_secret="gitlab-access-token",
        repo="my/repo",
        ref="MISOCU-3337",
        path="flow.py",  # relative path, no leading slash
    ),
) as flow:
    process()

flow.register(project_name="test_project", labels=["k8s"])
I also created a setup.py file and defined the dependencies.
from setuptools import setup, find_packages

with open('requirements.txt') as f:
    requirements = f.read().splitlines()

setup(
    name="flow_utilities",
    version="0.1",
    packages=find_packages(),
    install_requires=requirements,
)
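A "No module named 'app'" error like the one above usually means the package is not installed in the image the flow run executes in. As far as I know, Git-based storage in Prefect 1.x fetches only the flow file, not the rest of the repo. Below is a minimal sketch for checking this from inside the container, using only the standard library. `missing_modules` is a hypothetical helper name, not part of Prefect.

```python
import importlib.util


def missing_modules(names):
    """Return the module names that cannot be found in this environment."""
    missing = []
    for name in names:
        try:
            spec = importlib.util.find_spec(name)
        except ModuleNotFoundError:
            # Raised for dotted names when a parent package is absent.
            spec = None
        if spec is None:
            missing.append(name)
    return missing
```

Running `missing_modules(["prefect", "app.job.experiment"])` in a Python shell inside the image should return an empty list if the flow's imports can be resolved there. Note that for dotted names this imports the parent packages as a side effect.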
Kevin Kho
04/17/2022, 3:03 PM
Hash Lin
04/17/2022, 3:12 PM
Kevin Kho
04/17/2022, 3:13 PM
Tanasorn Chindasook
05/05/2022, 6:49 PM
Kevin Kho
05/05/2022, 6:51 PM
Tanasorn Chindasook
05/06/2022, 6:58 AM
Kevin Kho
05/06/2022, 1:45 PM
Mateo Merlo
05/09/2022, 11:01 AM
setup.py @Hash Lin shared above.
My Dockerfile is:
FROM prefecthq/prefect:latest-python3.9
RUN /usr/local/bin/python -m pip install --upgrade pip
WORKDIR /opt/prefect
COPY src/requirements.txt .
COPY src/setup.py .
RUN pip install .
I built this image, pushed it to Artifact Registry in GCP, and pointed the cluster at this image.
Should I create another file to be able to add pandas and have it available during the flow run?
Kevin Kho
05/09/2022, 2:02 PM
Mateo Merlo
05/09/2022, 6:12 PM
flow_test.py
import platform
import pandas as pd
import prefect
from prefect import Flow, Parameter, task
from prefect.client.secrets import Secret
from prefect.storage import GitHub
from prefect.run_configs import KubernetesRun
import subprocess

PREFECT_PROJECT_NAME = "testing"
FLOW_NAME = "flow_test"
AGENT_LABEL = "etl"
STORAGE = GitHub(
    repo="mateo2181/prefect-flows-test",
    path=f"flows/{FLOW_NAME}.py",
    access_token_secret="GITHUB_ACCESS_TOKEN"
)
RUN_CONFIG = KubernetesRun(labels=[AGENT_LABEL])


@task
def extract_and_load(dataset: str) -> None:
    logger = prefect.context.get("logger")
    file = f"gs://football_transfers/transfers/{dataset}"
    df = pd.read_csv(file)
    logger.info("Dataset %s with %d rows loaded to DB", dataset, len(df))


@task(log_stdout=True)
def hello_world():
    print(f"Hello from {FLOW_NAME} v2!")
    print(f"Running this task with Prefect: {prefect.__version__} and Python {platform.python_version()}")


with Flow(FLOW_NAME, storage=STORAGE, run_config=RUN_CONFIG,) as flow:
    # user_input = Parameter("user_input", default="Marvin")
    hw = hello_world()
    extract_and_load("1999_dutch_eredivisie.csv")
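To see whether pandas actually made it into the image the flow run uses, the version report in hello_world could be extended. Below is a minimal sketch using only the standard library (Python 3.8+). `installed_version` and `describe` are hypothetical helper names, not Prefect APIs.

```python
from importlib import metadata


def installed_version(distribution):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


def describe(distributions):
    """Build one report line per distribution, e.g. for printing inside a task."""
    return [
        f"{name}: {installed_version(name) or 'NOT INSTALLED'}"
        for name in distributions
    ]
```

Printing the lines from `describe(["pandas", "prefect"])` inside a task with log_stdout=True would show in the flow-run logs what the job's image contains. If pandas is reported as missing, the run is probably using a different image than the one built from the Dockerfile above.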
Kevin Kho
05/09/2022, 6:45 PM
Mateo Merlo
05/09/2022, 7:02 PM
In k8s.cfg I have these lines:
spec:
  containers:
  - args:
    - prefect agent kubernetes start
    image: europe-west6-docker.pkg.dev/zoomcamp-340819/prefect-agents/prefect-agents-etl:latest
If I have this, I don't need to specify the image in the KubernetesRun config, right?
Kevin Kho
05/09/2022, 7:05 PM
KubernetesRun, or you need to specify it in the job template that the agent passes here
Mateo Merlo
05/09/2022, 7:15 PM
Kevin Kho
05/09/2022, 7:18 PM
Mateo Merlo
05/09/2022, 7:24 PM
Kevin Kho
05/09/2022, 7:26 PM
Mateo Merlo
05/09/2022, 7:41 PM
Kevin Kho
05/09/2022, 7:46 PM