haf
10/20/2021, 8:06 AMNoah Holm
10/20/2021, 8:08 AMhaf
10/20/2021, 8:08 AMhaf
10/20/2021, 8:08 AMNoah Holm
10/20/2021, 8:09 AMhaf
10/20/2021, 8:10 AMhaf
10/20/2021, 8:10 AMhaf
10/20/2021, 8:11 AMNoah Holm
10/20/2021, 8:12 AMhaf
10/20/2021, 8:12 AMhaf
10/20/2021, 8:13 AMNoah Holm
10/20/2021, 8:13 AMhaf
10/20/2021, 8:15 AMhaf
10/20/2021, 8:16 AMNoah Holm
10/20/2021, 8:17 AMhaf
10/20/2021, 8:17 AM
from argparse import ArgumentParser
from os import path

parser = ArgumentParser(add_help=False)
parser.add_argument(
    "--debug",
    default=False,
    required=False,
    action="store_true",
    dest="debug",
    help="debug flag",
)
# One subcommand per action: "register" and "run".
subparser = parser.add_subparsers(dest="command")
register = subparser.add_parser("register")
run = subparser.add_parser("run")
register.add_argument("-c", "--commit-ref", dest="commit_ref", type=str, required=True)
register.add_argument("-p", "--project-name", dest="project_name", type=str, default="dbt")
# Repeatable option: each -l/--labels occurrence is appended to the list.
register.add_argument("-l", "--labels", action="append", default=[])
register.add_argument("--build", dest="build", action="store_true", default=False)
run.add_argument(
    "--run-on-schedule", dest="run_on_schedule", action="store_true", default=False
)
run.add_argument(
    "--basepath", dest="basepath", type=str, default=path.dirname(path.realpath(__file__))
)
args = parser.parse_args()
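To see how the repeatable -l/--labels option accumulates values, here is a minimal, self-contained sketch. It rebuilds only the --debug flag and the register subcommand from the parser above and parses an explicit argument list instead of sys.argv. The function name build_parser and the sample values (abc123, prod, k8s) are illustrative assumptions, not part of the original script.

```python
from argparse import ArgumentParser


def build_parser() -> ArgumentParser:
    """Rebuild a trimmed-down version of the parser from the message above."""
    parser = ArgumentParser(add_help=False)
    parser.add_argument("--debug", action="store_true", default=False)
    subparser = parser.add_subparsers(dest="command")
    register = subparser.add_parser("register")
    register.add_argument("-c", "--commit-ref", dest="commit_ref", type=str, required=True)
    # action="append" collects every occurrence of -l/--labels into one list.
    register.add_argument("-l", "--labels", action="append", default=[])
    return parser


# Hypothetical invocation: register --commit-ref abc123 --labels prod --labels k8s
args = build_parser().parse_args(
    ["register", "--commit-ref", "abc123", "--labels", "prod", "--labels", "k8s"]
)
print(args.command, args.commit_ref, args.labels)
```

Note that --debug belongs to the top-level parser, so it has to appear before the subcommand name on the command line.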
haf
10/20/2021, 8:18 AMhaf
10/20/2021, 8:18 AM
--labels prod
Noah Holm
10/20/2021, 8:18 AMhaf
10/20/2021, 8:19 AMNoah Holm
10/20/2021, 8:20 AMNoah Holm
10/20/2021, 8:21 AMhaf
10/20/2021, 8:34 AMNoah Holm
10/20/2021, 8:35 AMhaf
10/20/2021, 8:35 AMhaf
10/20/2021, 8:35 AMNoah Holm
10/20/2021, 8:36 AMNoah Holm
10/20/2021, 8:36 AMhaf
10/20/2021, 8:38 AMhaf
10/20/2021, 8:38 AMhaf
10/20/2021, 8:38 AMhaf
10/20/2021, 8:38 AMNoah Holm
10/20/2021, 8:48 AM
add_default_labels=False kwarg at least
https://docs.prefect.io/api/latest/storage.html#local
Noah Holm
10/20/2021, 8:48 AMAnna Geller
haf
10/20/2021, 9:27 AMhaf
10/20/2021, 9:27 AMhaf
10/20/2021, 9:28 AMhaf
10/20/2021, 9:30 AMhaf
10/20/2021, 9:30 AMvegan-bear
haf
10/20/2021, 9:35 AM
Failed to load and execute Flow's environment: ValueError('Flow is not contained in this Storage')
Anna Geller
haf
10/20/2021, 9:37 AM
# Imports this fragment relies on (Prefect 0.15-style module paths);
# flow, args and at_night are defined elsewhere in the script.
from datetime import timedelta

import prefect
from prefect.run_configs import KubernetesRun
from prefect.schedules import IntervalSchedule
from prefect.storage import Local

if args.debug:
    prefect.config.logging.level = "DEBUG"
if args.command == "run":
    prefect.context["basepath"] = args.basepath
    print(f"prefect.context.get(basepath)='{prefect.context.get('basepath')}'")
    flow.run(run_on_schedule=args.run_on_schedule)
elif args.command == "register":
    image = f"europe-docker.pkg.dev/logary-delivery/cd/data-pipelines:{args.commit_ref}"
    print(f"Registering flow with labels={args.labels} image={image}")
    flow.schedule = IntervalSchedule(start_date=at_night(), interval=timedelta(hours=24))
    # The flow file is expected at this path inside the image.
    flow.storage = Local(
        path="/app/flows/run_mmm.py",
        stored_as_script=True,
        add_default_labels=False,
    )
    flow.run_config = KubernetesRun(
        image=image,
        labels=args.labels,
    )
    flow.register(
        project_name=args.project_name,
        build=args.build,
        idempotency_key=args.commit_ref,
        labels=args.labels,
        add_default_labels=False,
    )
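The labels and add_default_labels=False arguments above matter because, as far as I understand Prefect 0.x, an agent only picks up a flow run when every label on the flow is also present on the agent. The sketch below is a toy model of that subset rule, not Prefect code; the function agent_can_pick_up and the hostname-style label are hypothetical.

```python
from typing import Iterable


def agent_can_pick_up(flow_labels: Iterable[str], agent_labels: Iterable[str]) -> bool:
    """Toy model: the flow's labels must be a subset of the agent's labels."""
    return set(flow_labels) <= set(agent_labels)


agent = ["prod"]

# Labels passed on the command line only.
print(agent_can_pick_up(["prod"], agent))

# Same labels plus a default label such as the registering machine's hostname
# (hypothetical value); an agent without that label would skip the run.
print(agent_can_pick_up(["prod", "ci-runner-42"], agent))
```

Under this reading, an extra default label added at registration time is enough to keep a Kubernetes agent labelled only prod from ever seeing the run.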
Anna Geller
Anna Geller
haf
10/20/2021, 9:45 AMhaf
10/20/2021, 9:45 AMAnna Geller
haf
10/20/2021, 9:49 AMhaf
10/20/2021, 9:49 AMhaf
10/20/2021, 9:49 AMAnna Geller
haf
10/20/2021, 9:51 AMAnna Geller
prefect agent local start
Then, you don’t even need to pass any storage or agent, because Local storage and agent are the defaults:
# hw_flow.py
from prefect import task, Flow


@task(log_stdout=True)
def hello_world():
    print("hello world")


with Flow("idempotent-flow") as flow:
    hw = hello_world()
Then, you can use the CLI to register your flow with Prefect Cloud:
prefect register --project YOUR_PROJECT_NAME -p hw_flow.py
Anna Geller
prefect auth login --key "YOUR_KEY"
haf
10/20/2021, 9:55 AMhaf
10/20/2021, 9:55 AMAnna Geller
haf
10/20/2021, 9:56 AMhaf
10/20/2021, 9:56 AMhaf
10/20/2021, 9:56 AMAnna Geller
haf
10/20/2021, 9:57 AMhaf
10/20/2021, 9:57 AMhaf
10/20/2021, 9:57 AMhaf
10/20/2021, 9:57 AMAnna Geller
haf
10/20/2021, 9:57 AMhaf
10/20/2021, 9:57 AM
$ docker run --rm -it europe-docker.pkg.dev/logary-delivery/cd/data-pipelines:xxx
_____ _____ ______ ______ ______ _____ _______
| __ \| __ \| ____| ____| ____/ ____|__ __|
| |__) | |__) | |__ | |__ | |__ | | | |
| ___/| _ /| __| | __| | __|| | | |
| | | | \ \| |____| | | |___| |____ | |
|_| |_| \_\______|_| |______\_____| |_|
Thanks for using Prefect!!!
This is the official docker image for Prefect Core, intended for executing
Prefect Flows. For more information, please see the docs:
<https://docs.prefect.io/core/getting_started/installation.html#docker>
root@514926ea8f6a:/app# ls
Pipfile Pipfile.lock dbt dbt_project.yml flows infer packages.yml postinstall.py profiles.yml
root@514926ea8f6a:/app# ls flows
__pycache__ exchange_rates.py run_mmm.py
dask-worker-space exchange_rates__insert_rate.sql run_mmm__metrics_eligible_channels.sql
dbt.py exchange_rates__missing_dates.sql run_mmm__revenues_eligible_apps.sql
root@514926ea8f6a:/app# cd flows
root@514926ea8f6a:/app/flows# l
bash: l: command not found
root@514926ea8f6a:/app/flows# pwd
/app/flows
root@514926ea8f6a:/app/flows# exit
logout
Anna Geller
haf
10/20/2021, 9:59 AMhaf
10/20/2021, 9:59 AMAnna Geller
haf
10/20/2021, 9:59 AMhaf
10/20/2021, 9:59 AM
FROM prefecthq/prefect:0.15.4-python3.8
RUN pip install --upgrade pip setuptools wheel twine \
    && pip install pipenv \
    && apt-get update \
    && apt-get install -y --no-install-recommends curl gcc python3-dev libssl-dev \
    && rm -rf /var/lib/apt/lists/*
WORKDIR /app
COPY Pipfile* packages.yml profiles.yml .user.yml .python-version dbt_project.yml postinstall.py ./
COPY infer ./infer
RUN PIPENV_VENV_IN_PROJECT=1 pipenv install --deploy
ENV PATH="/app/.venv/bin:$PATH"
RUN python postinstall.py
COPY flows ./flows
COPY dbt ./dbt
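The Dockerfile above depends on /app/.venv/bin being first on PATH at run time. A quick way to confirm which interpreter and which pip a container actually resolves is a small check like the one below, run inside the container. It is a sketch using only the standard library; the helper name describe_python_env is an assumption made for illustration.

```python
import shutil
import sys


def describe_python_env() -> dict:
    """Report which interpreter is running and which pip is first on PATH."""
    return {
        "executable": sys.executable,
        # In a virtual environment sys.prefix differs from sys.base_prefix.
        "in_virtualenv": sys.prefix != sys.base_prefix,
        # None when no pip executable is found on PATH.
        "pip_on_path": shutil.which("pip"),
    }


for key, value in describe_python_env().items():
    print(f"{key}: {value}")
```

If in_virtualenv is False or pip_on_path points outside /app/.venv, the PATH set in the Dockerfile is probably not the one the process sees.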
haf
10/20/2021, 9:59 AMAnna Geller
RUN pip install -r requirements.txt
haf
10/20/2021, 9:59 AMhaf
10/20/2021, 10:00 AMhaf
10/20/2021, 10:00 AMhaf
10/20/2021, 10:00 AMhaf
10/20/2021, 10:00 AM
python postinstall.py
haf
10/20/2021, 10:00 AMAnna Geller
haf
10/20/2021, 10:01 AM
I have a Pipfile because it makes it more consistent. If this really is my issue, then I'm happy to discuss the hows and whys of this, but I don't think it is.
Anna Geller
Anna Geller
COPY /path/to/your/flow.py .
haf
10/20/2021, 10:06 AMhaf
10/20/2021, 10:06 AMAnna Geller
haf
10/20/2021, 10:06 AMhaf
10/20/2021, 10:06 AMAnna Geller
COPY flows .
haf
10/20/2021, 10:08 AMhaf
10/20/2021, 10:09 AMhaf
10/20/2021, 10:10 AMhaf
10/20/2021, 10:10 AMhaf
10/20/2021, 10:10 AMAnna Geller
stored_as_script=True
in Docker storage. I'm not really recommending Docker storage; I think it would be easier with GCS 🙂, but this was your preference. Docker storage is the easiest to get started with because you can pass your Dockerfile, and the image is built whenever you register your flow, so all dependencies are baked into the image.
You need to use Docker storage, not Local storage. Remember the links to the docs I sent you?
haf
10/20/2021, 10:20 AMhaf
10/20/2021, 10:20 AMhaf
10/20/2021, 10:21 AMhaf
10/20/2021, 10:21 AMhaf
10/20/2021, 10:21 AMhaf
10/20/2021, 10:21 AM
# Requires: from prefect.storage import Docker (image_base is defined elsewhere)
flow.storage = Docker(
    path="/app/flows/run_mmm.py",
    image_name=image_base,
    image_tag=args.commit_ref,
    stored_as_script=True,
    add_default_labels=False,
)
haf
10/20/2021, 10:22 AMhaf
10/20/2021, 10:22 AM
Failed to load and execute Flow's environment: ValueError('Flow is not contained in this Storage')
haf
10/20/2021, 10:25 AMAnna Geller
haf
10/20/2021, 10:43 AMhaf
10/20/2021, 10:43 AMhaf
10/20/2021, 10:43 AMhaf
10/20/2021, 10:43 AMhaf
10/20/2021, 10:43 AMhaf
10/20/2021, 10:43 AMhaf
10/20/2021, 10:43 AMhaf
10/20/2021, 10:44 AMhaf
10/20/2021, 10:44 AMhaf
10/20/2021, 10:44 AMhaf
10/20/2021, 10:45 AMAnna Geller
haf
10/20/2021, 10:45 AMhaf
10/20/2021, 10:45 AMAnna Geller
haf
10/20/2021, 10:46 AM
"Docker storage is ideal because the image gets rebuilt any time you register your flow"
No, it's not ideal, I don't want this
haf
10/20/2021, 10:46 AMhaf
10/20/2021, 10:47 AM
requirements.txt but doesn't support editable dependencies
haf
10/20/2021, 10:47 AMhaf
10/20/2021, 10:47 AMhaf
10/20/2021, 10:47 AMhaf
10/20/2021, 10:48 AMhaf
10/20/2021, 10:48 AMhaf
10/20/2021, 10:49 AM
.prefect files are being added as part of the build; AFAIK these are the pickled files
Anna Geller
RUN pip install .
haf
10/20/2021, 10:50 AM
pip install -e . yes
Anna Geller
haf
10/20/2021, 10:50 AMpipenv
does this as part of pipenv install --deploy
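Whether a package in the image ended up as an editable install or a regular one can be checked by asking Python where it would import the package from: an editable install usually resolves to the source tree, a regular install to site-packages. Below is a minimal sketch using only the standard library; module_origin is a hypothetical helper name, and json is used only because it is always importable.

```python
import importlib.util
from typing import Optional


def module_origin(name: str) -> Optional[str]:
    """Return the file a top-level module would be imported from, or None."""
    spec = importlib.util.find_spec(name)
    return spec.origin if spec is not None else None


# "json" is a stand-in; substitute the name of your own package.
print(module_origin("json"))
```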
haf
10/20/2021, 10:50 AM
When you do pip install . you're building a .egg file, but when you do an editable install you're not.
haf
10/20/2021, 10:51 AMhaf
10/20/2021, 10:51 AMhaf
10/20/2021, 10:52 AM
flow.register()
haf
10/20/2021, 10:52 AMhaf
10/20/2021, 10:53 AM
What you're missing is that when you do pip install . you're building a .egg file, but when you do an editable install you're not.
Or in other words, "editable" just means "load files from disk at runtime" while the pip install . means load it from the egg file.
Anna Geller
haf
10/20/2021, 10:55 AMAnna Geller
haf
10/20/2021, 10:57 AMhaf
10/20/2021, 10:57 AMhaf
10/20/2021, 10:57 AMhaf
10/20/2021, 10:58 AM
"Docker image is a packaging mechanism by itself"
Yes, but not "package" as in "python package"
Anna Geller
haf
10/20/2021, 10:59 AM
"The result will be the same: a package is installed in the env so that it can be used by Prefect flows, right?"
But yes, with the Dockerfile I posted the python packages can be referenced by Prefect flows; and this is what is needed
haf
10/20/2021, 11:00 AMhaf
10/20/2021, 11:00 AMAnna Geller
Anna Geller
haf
10/20/2021, 11:03 AMAnna Geller
haf
10/20/2021, 11:04 AMAnna Geller
haf
10/20/2021, 11:05 AMhaf
10/21/2021, 8:42 AMAnna Geller
haf
10/21/2021, 8:44 AM
tini entrypoint which threw away all ENVs
• this meant pip ran with the system pip, not the venv pip
• you're right that it would have been better to install with requirements.txt, but only insofar as pipenv doesn't tie into pyproject.toml, which seems to be "the way" nowadays after PEP 518: https://www.python.org/dev/peps/pep-0518/
haf
10/21/2021, 8:45 AM
docker = Docker(
    path="/app/flows/run_mmm.py",
    registry_url="ex/cd",
    dockerfile="Dockerfile",
    image_name="data-pipelines",
    image_tag=args.commit_ref,
    ignore_healthchecks=True,
    stored_as_script=True,
)
docker.add_flow(flow)
flow.storage = docker
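One way to read the ValueError('Flow is not contained in this Storage') seen earlier in this thread is that the storage object keeps a registry of flow names and is asked for a name it never recorded, which an explicit add_flow call avoids. The sketch below is a toy model under that assumption; ToyStorage, get_flow_location and the flow name run_mmm are hypothetical, and this is not Prefect's implementation.

```python
from typing import Dict


class ToyStorage:
    """Toy stand-in for a storage object that maps flow names to locations."""

    def __init__(self, path: str) -> None:
        self.path = path
        self.flows: Dict[str, str] = {}

    def add_flow(self, flow_name: str) -> str:
        # Record where the flow lives so it can be looked up at run time.
        self.flows[flow_name] = self.path
        return self.path

    def get_flow_location(self, flow_name: str) -> str:
        if flow_name not in self.flows:
            raise ValueError("Flow is not contained in this Storage")
        return self.flows[flow_name]


storage = ToyStorage(path="/app/flows/run_mmm.py")
try:
    storage.get_flow_location("run_mmm")  # nothing registered yet
except ValueError as exc:
    print(f"lookup failed: {exc}")

storage.add_flow("run_mmm")
print(storage.get_flow_location("run_mmm"))
```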
haf
10/21/2021, 8:46 AM
add_flow it just crashes with the message I showed you before.
haf
10/21/2021, 8:46 AMAnna Geller
haf
10/21/2021, 8:59 AMhaf
10/21/2021, 8:59 AMhaf
10/21/2021, 9:00 AMAnna Geller
haf
10/21/2021, 10:40 AM