Joël Luijmes
12/08/2021, 9:28 AM
LocalDaskExecutor as my requirements for concurrency were low. However, with this flow I want to use the real DaskExecutor.
As far as I’m aware, Prefect can create a temporary Dask cluster for the running flow (using KubeCluster), but it can alternatively use an existing deployed Dask cluster (or even Dask Gateway, if I’m not mistaken).
Note that I’m running in Kubernetes, so adaptive scaling could be interesting.
Are there any guidelines, suggestions, or experiences for using a temporary vs. a static Dask cluster? Additionally, if I have Docker as storage, how can I supply the correct image tag containing my modules/dependencies while registering the flow?
Ivan Zaikin
12/08/2021, 10:00 AM
prefect==2.0a5 in Docker.
Here is my Dockerfile:
FROM python:3.8
RUN adduser prefect
USER prefect
WORKDIR /home/prefect
COPY --chown=prefect:prefect requirements.in ./
ENV LANG C.UTF-8
ENV LC_ALL C.UTF-8
ENV PATH="/home/prefect/.local/bin:${PATH}"
RUN pip install --user --no-cache-dir -r requirements.in
COPY --chown=prefect:prefect flow.py flow_deployment.py ./
Inside the container I create a deployment and several flow runs, but all of them are marked as “late”. Here is the terminal output:
$ prefect orion start --host 0.0.0.0 --log-level DEBUG
Starting Orion API server...
INFO: Started server process [71]
INFO: Waiting for application startup.
09:54:06.189 | Scheduler service scheduled to start in-app
09:54:06.189 | MarkLateRuns service scheduled to start in-app
INFO: Application startup complete.
INFO: Uvicorn running on http://0.0.0.0:4200 (Press CTRL+C to quit)
09:54:06.501 | Finished monitoring for late runs.
09:54:06.538 | Scheduled 0 runs.
Starting agent connected to http://0.0.0.0:4200/api/...
Agent started! Checking for flow runs...
09:54:07.298 | Submitting flow run 'f0855bd3-2eab-4346-ad3a-2e237a688faa'
09:54:07.298 | Submitting flow run '521624eb-cec8-4f9b-9e92-f203e104586a'
09:54:07.298 | Submitting flow run 'f25da7b6-7893-4107-aaa3-df22377e2ccf'
09:54:07.299 | Submitting flow run '2895bdfa-082c-43b5-afc2-d0dcc269bf51'
09:54:07.299 | Submitting flow run '2b445dbd-58b4-4acb-89b7-1f6782dc0ec9'
09:54:07.300 | Completed submission of flow run 'f0855bd3-2eab-4346-ad3a-2e237a688faa'
09:54:07.300 | Completed submission of flow run '521624eb-cec8-4f9b-9e92-f203e104586a'
09:54:07.300 | Completed submission of flow run 'f25da7b6-7893-4107-aaa3-df22377e2ccf'
09:54:07.300 | Completed submission of flow run '2895bdfa-082c-43b5-afc2-d0dcc269bf51'
09:54:07.300 | Completed submission of flow run '2b445dbd-58b4-4acb-89b7-1f6782dc0ec9'
09:54:08.969 | Flow run '521624eb-cec8-4f9b-9e92-f203e104586a' exited with exception: KeyError('__main__')
09:54:08.975 | Flow run '2895bdfa-082c-43b5-afc2-d0dcc269bf51' exited with exception: KeyError('__main__')
09:54:08.976 | Flow run 'f25da7b6-7893-4107-aaa3-df22377e2ccf' exited with exception: KeyError('__main__')
09:54:08.979 | Flow run '2b445dbd-58b4-4acb-89b7-1f6782dc0ec9' exited with exception: KeyError('__main__')
09:54:08.980 | Flow run 'f0855bd3-2eab-4346-ad3a-2e237a688faa' exited with exception: KeyError('__main__')
Is there a way to debug these KeyErrors?
Ievgenii Martynenko
12/08/2021, 1:04 PM
logger = logging.getLogger(). The idea is to extend the Prefect Task class and run some magic using the LIB library. I've read that we can add as many NAMED loggers as we want using https://docs.prefect.io/core/concepts/logging.html#extra-loggers, but since we have the root one, what happens now is: Prefect logs its records as usual, but messages from LIB are not passed. I suppose this is because the LIB logger is the root one.
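A minimal sketch of the routing described above, using only the standard library. The logger name "framework" and the ListHandler class are hypothetical stand-ins, and the sketch assumes the framework attaches its handlers to named loggers only, which would explain why records emitted on the root logger never reach them.

```python
import logging


class ListHandler(logging.Handler):
    """Collects messages in a list so the routing can be inspected."""

    def __init__(self):
        super().__init__()
        self.messages = []

    def emit(self, record):
        self.messages.append(record.getMessage())


# Hypothetical stand-in for a framework logger that owns its own handler.
framework_logger = logging.getLogger("framework")
framework_logger.setLevel(logging.INFO)
framework_handler = ListHandler()
framework_logger.addHandler(framework_handler)

# Stand-in for LIB, which logs through the root logger.
lib_logger = logging.getLogger()
lib_logger.setLevel(logging.INFO)

# Records propagate from child loggers up to the root, not the other way
# around, so this record never reaches framework_handler.
lib_logger.info("from LIB via root")
framework_logger.info("from framework")

# One possible workaround: attach the same handler to the root logger.
# (Named loggers holding this handler would then emit twice unless their
# propagate flag is disabled.)
lib_logger.addHandler(framework_handler)
lib_logger.info("from LIB after attaching handler")

print(framework_handler.messages)
```

Having LIB log through a named logger instead of the root one would be another option, since a named logger can then be registered as an extra logger.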
Have you ever faced such a situation?
Justin
12/08/2021, 2:32 PM
flow = Flow("taskname")
flow.run_config = KubernetesRun(
    env={
        "POSTGRES_USER": "2234234",
        ...
    },
    image="dockerimage:latest",
)
flow.register('projectname')
Vadym Dytyniak
12/08/2021, 2:54 PM
Vipul
12/08/2021, 2:59 PM
Jason Motley
12/08/2021, 3:09 PM
Leon Kozlowski
12/08/2021, 4:18 PM
Environment:* from the agent? I am having issues persisting a service account and role ARN that give flows privileges to hit AWS resources (details in thread)
Sam Werbalowsky
12/08/2021, 4:31 PM
StringFormatter but pass in a file that is opened… is something like this bad practice:
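from prefect import task
from prefect.tasks.templates import StringFormatter

@task
def format_sql_file(file):
    # Read the SQL template from disk, then fill in its placeholders.
    with open(file, 'r') as sql:
        string_to_format = sql.read()
    return StringFormatter().run(var1='my_value', var2='other_value', template=string_to_format)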
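For comparison, a minimal sketch of the same read-then-format step using plain str.format instead of StringFormatter. The file name, query text, and placeholder values are hypothetical, and the sketch assumes the template uses {var1}-style placeholders.

```python
import tempfile
from pathlib import Path


def format_sql_file(path, **values):
    """Read a SQL template from disk and fill in its named placeholders."""
    template = Path(path).read_text()
    return template.format(**values)


# Usage with a throwaway template file (hypothetical query).
with tempfile.TemporaryDirectory() as tmp:
    sql_path = Path(tmp) / "query.sql"
    sql_path.write_text("SELECT * FROM {var1} WHERE status = '{var2}'")
    rendered = format_sql_file(sql_path, var1="my_value", var2="other_value")

print(rendered)
```

Keeping the file read and the formatting in one small function means the file handle is closed before the formatted string is returned.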
Pedro Machado
12/08/2021, 6:37 PM
Brian S
12/08/2021, 7:30 PM
Failed to load and execute Flow's environment: ModuleNotFoundError("No module named 'xxx'") when submitting the flow to ECS (Fargate). The module in question is a custom Python module that I include in the agent Docker image, and it's installed via pip install. I feel like I'm missing something here, but from what I've read, that should work. Does anyone have a tip on how I might get the flow execution to be aware of this custom module? Is the agent the proper place to install the module?
Tom Shaffner
12/08/2021, 10:23 PM
Jacob Blanco
12/09/2021, 2:06 AM
Anh Nguyen
12/09/2021, 3:42 AM
Klemen Strojan
12/09/2021, 12:02 PM
Klemen Strojan
12/09/2021, 12:56 PM
1.22.2
After running
prefect agent kubernetes install \
-k ${PREFECT_DE_PROD_API} \
--namespace prefect-latest \
--mem-request=16Gi \
--mem-limit=128Gi \
--cpu-request=4 \
--cpu-limit=32 \
--image-pull-secrets azurecr-secret \
--label k8s \
--label prod \
--label latest \
--rbac | kubectl apply --namespace=prefect-latest -f -
we get:
deployment.apps/prefect-agent created
role.rbac.authorization.k8s.io/prefect-agent-rbac unchanged
error: unable to recognize "STDIN": no matches for kind "RoleBinding" in version "rbac.authorization.k8s.io/v1beta1"
We should be using rbac.authorization.k8s.io/v1
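One possible workaround, sketched under the assumption that the generated manifest differs from a valid one only in its RBAC apiVersion string: rewrite that string before passing the manifest to kubectl apply. The function name and the sample manifest below are hypothetical.

```python
OLD_API = "rbac.authorization.k8s.io/v1beta1"
NEW_API = "rbac.authorization.k8s.io/v1"


def upgrade_rbac_api_version(manifest: str) -> str:
    """Replace the v1beta1 RBAC apiVersion with v1 in a YAML manifest string."""
    return manifest.replace(f"apiVersion: {OLD_API}", f"apiVersion: {NEW_API}")


# Hypothetical fragment resembling the RoleBinding in the generated output.
sample = """apiVersion: rbac.authorization.k8s.io/v1beta1
kind: RoleBinding
metadata:
  name: prefect-agent-rbac
"""

patched = upgrade_rbac_api_version(sample)
print(patched)
```

The install command's output could be saved to a file, passed through this function, and then applied, instead of being piped straight into kubectl.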
What should we do?
Vincent Chéry
12/09/2021, 1:40 PM
LocalAgent running, which gets its orders from a private Prefect Server instance and runs flow runs locally, and I basically do not want to persist any results on disk. Following the docs, I have defined all my tasks with @task(checkpoint=False), which does the job for custom tasks but does not give me a solution for Prefect tasks like prefect.tasks.control_flow.merge.
I have tried setting task checkpointing to false globally:
[tasks]
checkpointing = false
[tasks.defaults]
checkpointing = false
without success.
Any idea? Thanks a lot in advance!
Lucas Hosoya
12/09/2021, 2:16 PM
Ryan Brennan
12/09/2021, 2:56 PM
KeyError('The secret <SECRET NAME> was not found. Please ensure that it was set correctly in your tenant
This seems to happen the most when I’m using task mapping, but it seems pretty random. 100 iterations of the mapped task will work, then the 101st fails with the error above, then the next 100 are successful. Has anyone run into this issue or have any ideas on what might be happening?
Jason Motley
12/09/2021, 4:07 PM
Josh
12/09/2021, 7:25 PM
Nikhil Sthalekar
12/09/2021, 7:31 PM
Leon Kozlowski
12/09/2021, 9:12 PM
prefect.context inside a flow run?
Joseph Mathes
12/09/2021, 10:41 PM
Kevin Kho
12/10/2021, 12:12 AM
John T
12/10/2021, 12:33 AM
thread.lock issue. Is there anything I could do about this?
I also tried ResourceManager to see if I could avoid it, but that is not working either.
Pedro Machado
12/10/2021, 12:51 AM
Maria
12/10/2021, 6:32 AM
postgres_execute = PostgresExecute(db_name="abc", user="peter", host="my_host", port=5432)
I have multiple databases (dev, test, demo, plus a prod DB per client), and I need to query/load data into the relevant one. I have a config file and I was hoping I could use a Parameter to choose which config line to use, but it seems like I cannot override those params... What is the recommended approach for such a use case?
Cristian Toma
12/10/2021, 6:53 AM
Ruben Sik
12/10/2021, 8:23 AM
Ruben Sik
12/10/2021, 8:23 AM
FROM python:3.8-slim-buster
WORKDIR /app
COPY requirements.txt ./requirements.txt
RUN pip install "prefect[azure, gitlab]"
RUN pip install -r requirements.txt
COPY . .
ENV PYTHONPATH "${PYTHONPATH}:/app"
Furthermore, we are using GitLab storage, and our folder structure looks like:
app
├── __init__.py
├── flows
│   ├── __init__.py
│   ├── test_flow.py
│   ├── helper_script.py
│   └── utils
│       ├── __init__.py
│       └── helper_script.py
└── utils
    ├── __init__.py
    └── helper_script.py
We've copied helper_script.py to multiple locations to try various imports.
When we run os.listdir() in our flow script "test_flow.py", the logger shows that our utils folder is in the current working directory. os.listdir() returns [utils, flows], which should mean that "import utils.helper_script" is possible (it works on the command line of the Docker image, but not when running the flow via a Docker agent).
Are we overlooking something?
Anna Geller
12/10/2021, 10:07 AM
setup.py and make utils a package. This way your package will be importable anywhere (in a virtual environment, Docker image, etc.), you don’t have to manually add directories to the PYTHONPATH, and it makes your code much cleaner. This post explains how to do it.
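A small diagnostic sketch, assuming only the standard library: os.listdir() reports what is in the working directory, while imports are resolved against sys.path, so the two can disagree. The helper name describe_import is hypothetical; running it inside the flow shows where Python would load utils.helper_script from, if anywhere.

```python
import importlib.util
import os
import sys


def describe_import(module_name):
    """Return the file Python would load module_name from, or None if not found."""
    try:
        spec = importlib.util.find_spec(module_name)
    except ModuleNotFoundError:
        # Raised when a parent package (e.g. "utils") cannot be found.
        return None
    return None if spec is None else spec.origin


print("cwd:", os.getcwd())
print("sys.path:", sys.path)
print("utils.helper_script ->", describe_import("utils.helper_script"))
```

If the last line prints None inside the agent-run flow but a path on the image's command line, the search path differs between the two, which an installed package sidesteps.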
I also have an example Dockerfile and setup.py in this repo.
Ruben Sik
12/10/2021, 1:37 PM
Anna Geller
12/10/2021, 1:52 PM
app directory also on your local, right?
COPY app/. .
/app/app/utils
so the import would have to be:
from app.app.utils import ...
FROM prefecthq/prefect:0.15.10-python3.8
RUN /usr/local/bin/python -m pip install --upgrade pip
WORKDIR /opt/prefect
COPY utils/ /opt/prefect/utils/
COPY requirements.txt .
COPY setup.py .
RUN pip install .
COPY flows/ /opt/prefect/flows/
stored_as_script=True. If you wanna use this pattern, check out this example.
Ruben Sik
12/10/2021, 3:02 PM