Bob De Schutter
05/20/2022, 2:58 PM
for flow in flows:
    flow.register(
        project_name=f"Data Platform - {os.environ.get('PREFECT_TARGET_ENV', 'Develop')}",
        add_default_labels=False,
        set_schedule_active=os.environ.get('PREFECT_TARGET_ENV', 'Develop') == 'Production',
    )
This has worked for me before but since I upgraded prefect to 1.2.1, I'm getting an error message related to the Azure storage I'm using for the flows:
azure.core.exceptions.ResourceExistsError: The specified blob already exists
Any idea why this strategy is not working anymore?
James Keegan
05/20/2022, 3:47 PM
ShellTask task) but keep running into this error. Any suggestions?
The SSH key is set up correctly, but there doesn't appear to be a user associated with the ShellTask commands.
This wasn't an issue previously, as older flow images can pull git projects just fine. It only came up recently, after I rebuilt the image (the Prefect and Python versions are still the same).
Cloning into './dbt_project'...
No user exists for uid 1000190000
fatal: Could not read from remote repository.
Please make sure you have the correct access rights
and the repository exists.
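The `No user exists for uid ...` line is the kind of message printed when the numeric uid of the running process has no entry in the system user database. A minimal sketch for checking that condition from inside the container, using Python's standard `pwd` module (Unix only). `has_passwd_entry` is a hypothetical helper name, and the check only diagnoses the situation, it does not fix it:

```python
import os
import pwd


def has_passwd_entry(uid: int) -> bool:
    """Return True if the system user database has an entry for this numeric uid."""
    try:
        pwd.getpwuid(uid)
    except KeyError:
        # pwd.getpwuid raises KeyError when no entry exists for the uid.
        return False
    return True


# Check the uid the current process is actually running as.
current_uid_known = has_passwd_entry(os.getuid())
```

If `current_uid_known` is false inside the rebuilt image but true inside an older one, the difference probably lies in how the image or the cluster assigns the runtime user rather than in the SSH key.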
James Keegan
05/20/2022, 5:00 PM
cloudpickle v2.1.0 was just released an hour ago and it has broken the healthcheck when we try to register a flow. We're using prefecthq/prefect:0.15.6-python3.7 as a base image.
We now get this error when building the docker image.
Step 21/21 : RUN python /opt/prefect/healthcheck.py '["/opt/prefect/flows/prefect-dbt-run-modelling.prefect"]' '(3, 7)'
---> Running in 1d14c333ced9
Beginning health checks...
System Version check: OK
Traceback (most recent call last):
  File "/opt/prefect/healthcheck.py", line 152, in <module>
    flows = cloudpickle_deserialization_check(flow_file_paths)
  File "/opt/prefect/healthcheck.py", line 44, in cloudpickle_deserialization_check
    flows.append(cloudpickle.loads(flow_bytes))
AttributeError: Can't get attribute '_make_function' on <module 'cloudpickle.cloudpickle' from '/usr/local/lib/python3.7/site-packages/cloudpickle/cloudpickle.py'>
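The `AttributeError` above is raised while unpickling: the serialized flow refers to `_make_function` by module and name, and the `cloudpickle` module installed in the image does not define that attribute. The sketch below reproduces the same class of failure with the standard `pickle` module and a hypothetical stand-in module called `fake_lib`. It illustrates the mechanism only and makes no claim about which cloudpickle versions are involved:

```python
import pickle
import sys
import types

# Hypothetical stand-in for a library module; this is NOT cloudpickle itself.
fake_lib = types.ModuleType("fake_lib")
sys.modules["fake_lib"] = fake_lib


def _make_function():
    return "made"


# Make the function look as if it were defined inside fake_lib.
_make_function.__module__ = "fake_lib"
fake_lib._make_function = _make_function

# Functions are pickled by reference: only the module and attribute name are stored.
payload = pickle.dumps(fake_lib._make_function)

# Simulate loading in an environment whose copy of the module lacks the attribute.
del fake_lib._make_function

try:
    pickle.loads(payload)
    load_error = None
except AttributeError as exc:
    load_error = exc
```

Here `load_error` ends up holding an `AttributeError` that names `_make_function`, which is why comparing the library version that wrote the pickle with the one reading it is a reasonable first check.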
Florian Guily
05/20/2022, 5:21 PM
config.py file in flow_utilities folder like this: from flow_utilities.config import funcA, funcB. When I do so, I get an error. So I tried to reproduce it with the dummy code you have in the image. I get the same error, which is ModuleNotFoundError: No module named 'flow_utilities'. This really seems stupid to me, as it is just importing some functions from a local file, but somehow I can't make it work with this file architecture. What am I doing wrong?
Florian Guily
05/20/2022, 5:46 PM
Secret.get() it apparently only searches for local secrets. The docs say I have to change prefect.config.use_local_secrets to false, but I can't understand how... I suppose it is in the config.toml file, but as the flow has to run on an EKS cluster, how can I specify this for the cluster?
Mike Vanbuskirk
05/20/2022, 7:24 PM
Malthe Karbo
05/20/2022, 7:54 PM
Alvaro Durán Tovar
05/20/2022, 8:42 PM
def build(...):
    with Flow(...) as flow:
        ...
    return flow
Is it possible to use Docker storage? I'm thinking of possible issues when trying to find the path of the flow inside the docker file: there won't be any "flow" variable at the module level.
Philip MacMenamin
05/20/2022, 9:42 PM
Nash Taylor
05/20/2022, 10:55 PM
py.typed on my own. I see that there is a py.typed on the master branch; was there a conscious decision made not to include this marker file in 2.0? If so, why?
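The marker file in question is the `py.typed` file that PEP 561 uses to signal that a package ships inline type information. A minimal sketch for checking whether an installed package includes it, using only the standard library (`ships_py_typed` is a hypothetical helper name, not part of Prefect or pyright):

```python
import importlib.util
from pathlib import Path


def ships_py_typed(package_name: str) -> bool:
    """Return True if the importable package contains a PEP 561 py.typed marker."""
    spec = importlib.util.find_spec(package_name)
    if spec is None or spec.submodule_search_locations is None:
        # Not installed, or a single-file module rather than a package.
        return False
    return any(
        (Path(location) / "py.typed").is_file()
        for location in spec.submodule_search_locations
    )
```

Calling `ships_py_typed("prefect")` in the environment the type checker uses would show whether the installed release carries the marker.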
Without the marker file, I get:
from prefect.flows import flow
reveal_type(flow) # Unknown
With the marker file, I get:
from prefect.flows import flow
reveal_type(flow) # Overload[(__fn: (**P@flow) -> R@flow, /) -> Flow[P@flow, R@flow], (*, name: str = None, version: str = None, task_runner: BaseTaskRunner = ConcurrentTaskRunner, description: str = None, timeout_seconds: int | float = None, validate_parameters: bool = True) -> (((**P@flow) -> R@flow) -> Flow[P@flow, R@flow]), (__fn: Unknown | None = None, *, name: str = None, version: str = None, task_runner: BaseTaskRunner = ConcurrentTaskRunner, description: str = None, timeout_seconds: int | float = None, validate_parameters: bool = True) -> (Flow[P, R] | (((**P) -> R) -> Flow[P, R]))]
which is much better.
For example, now when I decorate a function with @task and use it in an @flow, the return type of the task is known to have a .result() method. When it was just an Unknown, pyright would complain that I'm accessing a method that doesn't exist. For someone who enforces fully passing mypy (now pyright) checks on every pull request, this is kind of a necessity.
GGK Kellie
05/21/2022, 2:34 AM
Daniel Sääf
05/21/2022, 11:45 AM
16:34:10.826 is the last thing that happens in the task read_blob, in which the error occurs. So it looks to me like something goes wrong when reporting the task.
The error message doesn't tell me much, so any advice on how I should troubleshoot this would be really helpful (or a guess at what might be wrong).
Constantino Schillebeeckx
05/21/2022, 2:08 PM
Clément VEROVE
05/21/2022, 3:11 PM
docker volume create / docker-compose up, so I need a Docker daemon, but it cannot be outside my job.
Here is my job template:
apiVersion: batch/v1
kind: Job
spec:
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: flow-container
        - name: dind-daemon
          image: docker:stable-dind
          env:
            - name: DOCKER_TLS_CERTDIR
              value: ""
          securityContext:
            privileged: true
      imagePullSecrets:
        - name: regcred
It works, but my Docker daemon container never stops...
Any ideas?
Daniel Saxton
05/21/2022, 4:04 PM
Joshua Greenhalgh
05/21/2022, 4:22 PM
DEV_TAG = os.environ.get("DEV", "") != ""
JOB_IMAGE_NAME = f"blah/flows{':dev' if DEV_TAG else ''}"
And then in each flow I ref the JOB_IMAGE_NAME - this just changes the image but otherwise uses the job template I have defined on the agent:
apiVersion: batch/v1
kind: Job
spec:
  template:
    spec:
      containers:
        - name: flow
          imagePullPolicy: Always
          env:
            - name: SOME_ENV
              valueFrom:
                secretKeyRef:
                  name: secret-env-vars
                  key: some_env
                  optional: false
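The idea of changing only the image while keeping the rest of a job template can be sketched on plain data. This assumes the template has already been parsed into a Python dict; the `template` literal is a trimmed stand-in for the YAML above and `with_image` is a hypothetical helper, not part of Prefect or dask-kubernetes:

```python
import copy


def with_image(job_template: dict, image: str) -> dict:
    """Return a deep copy of a Kubernetes Job template with every container image replaced."""
    patched = copy.deepcopy(job_template)
    for container in patched["spec"]["template"]["spec"]["containers"]:
        container["image"] = image
    return patched


# Hypothetical, trimmed stand-in for the template above, already parsed into a dict.
template = {
    "apiVersion": "batch/v1",
    "kind": "Job",
    "spec": {"template": {"spec": {"containers": [
        {
            "name": "flow",
            "imagePullPolicy": "Always",
            "env": [{"name": "SOME_ENV", "valueFrom": {"secretKeyRef": {
                "name": "secret-env-vars", "key": "some_env", "optional": False}}}],
        },
    ]}}},
}

dev_job = with_image(template, "blah/flows:dev")
```

Because the helper works on a deep copy, the `env` section carries over unchanged and the original `template` is left untouched.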
Now when I specify the dask setup I do the following;
executor=DaskExecutor(
    cluster_class=lambda: KubeCluster(make_pod_spec(image=JOB_IMAGE_NAME)),
    adapt_kwargs={"minimum": 2, "maximum": 3},
)
But this is obviously missing the env part of my default template. I would like to not have to respecify it (it's much bigger than the above snippet). Is it possible to grab a handle on the default template and just override the image name?
Kayvan Shah
05/21/2022, 4:57 PM
$ prefect deployment inspect 'hello-world/hello-world-daily'
{
    'id': '710145d4-a5cb-4e58-a887-568e4df9da88',
    'created': '2022-04-25T20:23:42.311269+00:00',
    'updated': '2022-04-25T20:23:42.309339+00:00',
    'name': 'hello-world-daily',
    'flow_id': '80768746-cc02-4d25-a01c-4e4a92797142',
    'flow_data': {
        'encoding': 'blockstorage',
        'blob': '{"data": "\\"f8e7f81f24512625235fe5814f1281ae\\"", "block_id":
"c204821d-a44f-4b9e-aec3-fcf24619d22f"}'
    },
    'schedule': {
        'interval': 86400.0,
        'timezone': None,
        'anchor_date': '2020-01-01T00:00:00+00:00'
    },
    'is_schedule_active': True,
    'parameters': {},
    'tags': ['earth'],
    'flow_runner': {'type': 'universal', 'config': {'env': {}}}
}
Is there an extensive example available showing how to write the complete config for a flow?
Kayvan Shah
05/21/2022, 6:29 PM
Jan Domanski
05/21/2022, 6:43 PM
Marwan Sarieddine
05/21/2022, 8:48 PM
Hui Zheng
05/21/2022, 8:56 PM
Anna Geller
05/21/2022, 9:15 PM
Quan Cao
05/22/2022, 5:14 AM
httpx.HTTPStatusError: Server error '500 Internal Server Error' for url '<https://api-beta.prefect.io/api/accounts/ff4de07a-3c0a-4831-a96f-236ce6513b52/workspaces/c36fd28e-d969-4a32-bb11-0f7712f96635/work_queues/cfbcef84-6668-41ff-83dd-0053055deffb/get_runs>'
jars
05/22/2022, 10:49 AM
Anna Geller
05/22/2022, 10:56 AM
jars
05/22/2022, 10:59 AM
Kayvan Shah
05/22/2022, 11:10 AM
Assaf Ben Shimon
05/22/2022, 11:12 AM
httpx.HTTPStatusError: Server error '500 Internal Server Error' for url '<https://api-beta.prefect.io/api/accounts/927cf1b8-198e-4236-841e-36098d433977/workspaces/be783b36-6bf5-4c3e-b919-8da5b8ea57cc/task_runs/277f8cff-c1aa-47b4-a5d7-126ce2957ad3/set_state>'
jars
05/22/2022, 11:26 AM
Anna Geller
05/22/2022, 11:52 AM
Anna Geller
05/22/2022, 11:52 AM
Florian Kühnlenz
05/22/2022, 12:41 PM
Anna Geller
05/22/2022, 2:25 PM
Florian Kühnlenz
05/22/2022, 2:29 PM
cb076c5-f718-4225-b9e9-bb15e0d19665
This is probably the one. For some reason we were notified about it twice. The other errors were from before 11:31 UTC, I just realized.
Anna Geller
05/22/2022, 6:31 PM
Florian Kühnlenz
05/22/2022, 6:33 PM
1cb076c5-f718-4225-b9e9-bb15e0d19665
Anna Geller
05/24/2022, 1:39 PM