Lior Barak
12/04/2023, 9:10 AM
sqlalchemy.exc.TimeoutError: QueuePool limit of size 5 overflow 10 reached, connection timed out, timeout 30.00 (Background on this error at: <https://sqlalche.me/e/20/3o7r>)
This causes flows to crash and sometimes creates zombie flows that I have to cancel/delete manually.
The loads are around 100 flows (each flow has around 100 sub-tasks), and flows are running on Process type agents.
I can increase these settings on the server, or maybe even put in a better Postgres DB, but I'm not sure how to avoid these errors in the future.
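(The "size 5 overflow 10" in that error corresponds to the server's SQLAlchemy connection-pool defaults, which are configurable. A minimal sketch, assuming Prefect 2.x setting names; the values are illustrative, not recommendations:)

# Assumed Prefect 2.x server settings for the SQLAlchemy pool
export PREFECT_SQLALCHEMY_POOL_SIZE=20      # default is 5
export PREFECT_SQLALCHEMY_MAX_OVERFLOW=40   # default is 10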
Awaish Kumar
12/04/2023, 9:27 AM
...
# the deployments section allows you to provide configuration for deploying flows
deployments:
- name: test_flow
  entrypoint: flows/test_flow.py:test_flow
  work_pool:
    name: dimensional-etl
I added GCP Cloud Run push as my work pool with the prefect.yml given above. But when trying to trigger the flow run I am getting the error below. Do you have any idea what's happening? Many thanks
Failed due to a(n) InfrastructureError Flow run infrastructure exited with non-zero status code: Exited with non 0 code. (Error Code: 1). This may be caused by attempting to run an image with a misspecified platform or architecture.
Steps I took to get to this point are:
1. Add a GCP Cloud Run push work pool named "dimensional-etl"
2. Create a test flow
3. Add entry in deployments section in prefect.yml file
4. Github action to deploy flows
5. Then trigger a flow run from UI
Am I missing something? @Jenny
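(That "misspecified platform or architecture" hint very often means the image was built on an ARM machine, e.g. an Apple Silicon laptop, while Cloud Run only runs linux/amd64 containers. A hedged sketch of checking and rebuilding; the image name here is a placeholder, not one from this thread:)

# Check which platform the pushed image was built for
docker image inspect my-registry/test-flow:latest --format '{{.Os}}/{{.Architecture}}'

# Rebuild explicitly for Cloud Run's platform
docker build --platform linux/amd64 -t my-registry/test-flow:latest .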
Naila Chennit
12/04/2023, 3:44 PM

Scott Melhop
12/04/2023, 4:34 PM

Belal Aboabdo
12/04/2023, 5:18 PM

Peter Peter
12/04/2023, 6:05 PM

Ian Thomas
12/04/2023, 8:30 PM

schwannden kuo
12/05/2023, 2:40 AM

Awaish Kumar
12/05/2023, 7:35 AM
Validation of flow parameters failed with error:
Traceback (most recent call last):
File "/usr/local/lib/python3.9/site-packages/prefect/engine.py", line 434, in retrieve_flow_then_begin_flow_run
parameters = flow.validate_parameters(flow_run.parameters)
File "/usr/local/lib/python3.9/site-packages/prefect/flows.py", line 502, in validate_parameters
validated_fn = V2ValidatedFunction(
File "/usr/local/lib/python3.9/site-packages/pydantic/v1/decorator.py", line 126, in __init__
File "/usr/local/lib/python3.9/site-packages/prefect/_internal/pydantic/v2_validated_func.py", line 51, in create_model
class DecoratorBaseModel(BaseModel):
File "/usr/local/lib/python3.9/site-packages/pydantic/_internal/_model_construction.py", line 92, in __new__
File "/usr/local/lib/python3.9/site-packages/pydantic/_internal/_model_construction.py", line 308, in inspect_namespace
File "pydantic/fields.py", line 32, in init pydantic.fields
File "pydantic/class_validators.py", line 8, in init pydantic.class_validators
File "/usr/local/lib/python3.9/site-packages/pydantic/_migration.py", line 278, in wrapper
ModuleNotFoundError: No module named 'pydantic._internal._validators'
Packages:
prefect==2.14.5
prefect-dbt==0.4.1
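(One plausible reading, not a confirmed diagnosis: the traceback interleaves pydantic v2 file paths with compiled pydantic v1 modules ("init pydantic.fields"), which points to a mixed or partially overwritten pydantic install, e.g. one dependency downgrading pydantic over an existing v2. A hedged sketch of cleaning it up; the version range is an assumption to check against the pins of prefect 2.14.5 and prefect-dbt 0.4.1:)

pip uninstall -y pydantic
pip install --force-reinstall "pydantic>=2,<3"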
Johan sh
12/05/2023, 9:24 AM

Blina
12/05/2023, 1:34 PM

Ryan Brennan
12/05/2023, 4:16 PM

Sean Malone
12/05/2023, 5:01 PM
In my prefect.yaml deployments I have 2 types of pull actions I am assigning to YAML aliases (dev/prod). This is the error I get at runtime:
Flow could not be retrieved from deployment.
Traceback (most recent call last):
File "/usr/local/lib/python3.10/site-packages/prefect/deployments/steps/core.py", line 154, in run_steps
step_output = await run_step(step, upstream_outputs)
File "/usr/local/lib/python3.10/site-packages/prefect/deployments/steps/core.py", line 110, in run_step
raise ValueError(
ValueError: Step has unexpected additional keys: repository, access_token, prefect.deployments.steps.git_clone
Here is my prefect.yaml, am I misusing aliases here?
name: my-prefect
prefect-version: 2.14.9

actions:
  pull:
    dev: &pull_dev
      - prefect.deployments.steps.git_clone:
          repository: <https://github.com/my-repo.git>
          branch: dev
          access_token: "{{ prefect.blocks.secret.my-secret }}"
    prod: &pull_prod
      - prefect.deployments.steps.git_clone:
          repository: <https://github.com/my-repo.git>
          branch: main
          access_token: "{{ prefect.blocks.secret.my-secret }}"

build: null
push: null
pull: null

# the deployments section allows you to provide configuration for deploying flows
deployments:
- name: my-deploy
  version: 1
  tags: []
  description:
  schedule:
  entrypoint:
  parameters: {}
  work_pool:
    name:
    work_queue_name:
    job_variables: {}
  pull: *pull_dev # I want this deployment to pull from the dev branch

I am able to successfully run the deployment with a single pull: definition, but it breaks when trying to use an alias to define the pull action on the deployment (pull: *pull_dev).
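(Since anchors and aliases are resolved by the YAML parser itself, one way to debug this independently of Prefect is to load the file and look at what the alias actually expanded to; the "unexpected additional keys" error suggests the step's keys ended up as siblings of the step name rather than nested under it. A minimal sketch, assuming PyYAML is installed:)

import yaml

# Load prefect.yaml exactly as a YAML parser sees it; &anchors/*aliases are
# already resolved at this point, before Prefect interprets any steps.
with open("prefect.yaml") as f:
    config = yaml.safe_load(f)

# Inspect the structure the deployment's pull step actually received
print(config["deployments"][0]["pull"])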
Acchhu
12/05/2023, 5:56 PM

Devin Flake
12/05/2023, 6:18 PM
I'm having an issue with my prefect.yaml as well - it doesn't seem to be downloading the flow code:

Ian Thomas
12/05/2023, 8:32 PM
When I run:
$ prefect deployment apply prefect.yaml
I get the following error:
'prefect.yaml' did not conform to deployment spec: AttributeError("'NoneType' object has no attribute 'name'")
Contents of the prefect.yaml:
name: a_flow
prefect-version: 2.14.9

pull:
- prefect.deployments.steps.set_working_directory:
    directory: /app

definitions:
  tags: &common_tags
    - development
    - eks
  work_pool: &common_work_pool
    name: development-kubernetes
    job_variables:
      image: "{{ $PREFECT_IMAGE_NAME }}:$CIRCLE_SHA1"

deployments:
- name: default
  entrypoint: main.py:run_flow
  path: /app
  tags: *common_tags
  version: "{{ $CIRCLE_SHA1 }}"
  work_pool_name: *common_work_pool
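(Two things stand out here, offered as observations rather than confirmed fixes: prefect deployment apply expects a deployment YAML produced by prefect deployment build, while a prefect.yaml like this one is normally applied with prefect deploy; and *common_work_pool expands to a mapping, but it is assigned to work_pool_name, which expects a plain string. The mapping form usually lives under work_pool, roughly:)

deployments:
- name: default
  entrypoint: main.py:run_flow
  path: /app
  tags: *common_tags
  version: "{{ $CIRCLE_SHA1 }}"
  work_pool: *common_work_pool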
John LaRocque
12/05/2023, 11:07 PM
exec "$@" as here: https://github.com/PrefectHQ/prefect/blob/main/scripts/entrypoint.sh ? Apologies if the solution is obvious/in the docs somewhere.
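(For context, the pattern in the linked script: a wrapper entrypoint does its setup and then hands off to whatever command the infrastructure passes in. A minimal sketch of a custom image entrypoint in that style:)

#!/usr/bin/env bash
set -e

# ...any site-specific setup here (env vars, credentials, migrations)...

# Replace the shell with the command Prefect's infrastructure supplies,
# so signals reach the flow-run process directly.
exec "$@"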
Tim Galvin
12/06/2023, 4:57 AM

Mohamed Rafiyudeen
12/06/2023, 5:38 AM

Daryl
12/06/2023, 7:03 AM
The docker-compose that was the top Google search result was implied to be out of date, since Prefect is moving off Agents etc.
Does anyone have a solid, up-to-date docker-compose and Dockerfile for an on-prem server (or my local laptop for dev) they could share (esp. if battle-tested in prod)? The idea here is being able to run the community edition for pipelines in Docker (not worried about deploying dockerized individual flows at the moment).
Daryl
12/06/2023, 7:07 AM
(Ideally a docker-compose and a build Dockerfile, since I have things like astropy etc. that I need to have available to Prefect.)
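(A minimal sketch of a self-hosted Prefect 2 server with Postgres; the image tags, credentials, and setting names are assumptions to pin down for your environment, not a battle-tested production file:)

services:
  postgres:
    image: postgres:15
    environment:
      POSTGRES_USER: prefect
      POSTGRES_PASSWORD: prefect
      POSTGRES_DB: prefect
    volumes:
      - pgdata:/var/lib/postgresql/data

  prefect-server:
    # Swap this for your own Dockerfile build (e.g. FROM prefecthq/prefect:2-python3.11
    # plus astropy etc.) if extra packages need to be available.
    image: prefecthq/prefect:2-python3.11
    command: prefect server start --host 0.0.0.0
    environment:
      PREFECT_API_DATABASE_CONNECTION_URL: postgresql+asyncpg://prefect:prefect@postgres:5432/prefect
    ports:
      - "4200:4200"
    depends_on:
      - postgres

volumes:
  pgdata: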
schwannden kuo
12/06/2023, 10:41 AM

Morten Hauge
12/06/2023, 11:41 AM
You can use .fn on a flow to run it directly. Unfortunately, this results in an error if the flow calls tasks that aren't also run using .fn. Is there a way to run a flow such that both the flow itself and all tasks are automatically called as task.fn and flow.fn? This would be really useful for lightweight E2E tests that don't need the resilience of actually running the flows, checking states etc.
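(Prefect's documented alternative for tests is the test harness, which runs flows and their tasks for real against a throwaway local API rather than patching everything with .fn; it is heavier than pure .fn calls but needs no per-task changes. A minimal sketch:)

from prefect import flow, task
from prefect.testing.utilities import prefect_test_harness

@task
def double(x: int) -> int:
    return x * 2

@flow
def my_flow(x: int) -> int:
    return double(x)

def test_my_flow():
    # All runs inside this context hit a temporary local database,
    # so the test is self-contained and repeatable.
    with prefect_test_harness():
        assert my_flow(2) == 4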
Gregory Hunt
12/06/2023, 1:52 PM
PREFECT_ is set to 30 and the PREFECT_API_DATABASE_CONNECTION_TIMEOUT doesn't seem to change it:
sqlalchemy.exc.TimeoutError: QueuePool limit of size 50 overflow 50 reached, connection timed out, timeout 30.00 (Background on this error at: <https://sqlalche.me/e/20/3o7r>)
vkrot
12/06/2023, 2:23 PM
@flow(name="test")
def main():
    out = ShellOperation(commands=['echo hi'], stream_output=True).run()
    print_output(out)
    trace_success()

if __name__ == '__main__':
    Deployment.build_from_flow(
        flow=main,
        name="test",
        schedule=(CronSchedule(cron="0 0 * * *", timezone="UTC")),
        work_pool_name="default",
        infrastructure=KubernetesJob(
            image='***.dkr.ecr.us-east-1.amazonaws.com/prefect:prefect-clickhouse-backup_6e7690c',
            namespace='prefect',
            finished_job_ttl=60*60*24*7,
            env={
                'CLICKHOUSE_HOST': 'clickhouse.clickhouse.svc.cluster.local',
            },
        ),
        tags=["clickhouse"],
    ).apply()
When I navigate to the deployment I see that it uses an anonymous infrastructure block. The problem is that whenever I run the deployment it doesn't set the image to what is specified here, and it also doesn't set the env; it just runs with the default Prefect image. What am I doing wrong?
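(One hedged possibility: when a deployment targets a worker-backed work pool, the pool's base job template takes over, and per-deployment image/env are supplied as overrides rather than via an infrastructure block. A sketch using infra_overrides; the keys assume the Kubernetes work pool's default template:)

Deployment.build_from_flow(
    flow=main,
    name="test",
    work_pool_name="default",
    # Override the work pool's job template instead of attaching a
    # KubernetesJob block; keys must match variables the template declares.
    infra_overrides={
        "image": "***.dkr.ecr.us-east-1.amazonaws.com/prefect:prefect-clickhouse-backup_6e7690c",
        "namespace": "prefect",
        "env": {"CLICKHOUSE_HOST": "clickhouse.clickhouse.svc.cluster.local"},
    },
    tags=["clickhouse"],
).apply()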
Sergio Shevtsov
12/06/2023, 6:53 PM

Bryce Codell
12/07/2023, 4:12 AM
Is there a way to override the ephemeralStorage parameter for ECS Tasks in Prefect Cloud? I tried adding the snippet below to the Job Variables configuration, but based on the logs it looks like each ECS Task is still using the default (20 GiB). Using this configuration works to override the default cpu and memory parameters, so it's unclear why overriding this parameter isn't working:
"ephemeralStorage": {
"sizeInGiB": 40
}
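(A hedged note: work pool job variables only take effect where the pool's base job template actually references them, so a key the template never declares can be silently ignored. Inspecting the pool shows what its template supports; the pool name below is a placeholder:)

prefect work-pool inspect "my-ecs-pool"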
jcozar
12/07/2023, 8:08 AM

mira
12/07/2023, 10:49 AM
flow.deploy(
    ...
    image=DeploymentImage(
        name=os.getenv("ECR_REPO_URL", ""),
        tag=os.getenv("IMAGE_TAG"),
        dockerfile=cfd / "Dockerfile",
    ),
    ...
)
But then I get the error: Flow run could not be submitted to infrastructure: An error occurred (ClientException) when calling the RegisterTaskDefinition operation: Fargate requires task definition to have execution role ARN to support ECR images.
Why do I still have to provision the deployment with an execution role? Shouldn't it (or the work pool) create one? Or is it because it is my own ECR repo? Where do you usually push / save the flow image to run it on the ECS cluster (especially in the frame of the ecs:push work pool with infra provisioning)?
Thank you and best regards!
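(A hedged sketch of one way through this: the ECS work pool exposes the execution role as a job variable, so it can be supplied per deployment; the variable name is assumed from prefect-aws's ECS worker schema and the ARN is a placeholder:)

flow.deploy(
    ...
    image=DeploymentImage(
        name=os.getenv("ECR_REPO_URL", ""),
        tag=os.getenv("IMAGE_TAG"),
    ),
    job_variables={
        # Role that lets ECS pull from ECR and write logs; create it once
        # (or reuse the one the ecs:push provisioner made) and reference it here.
        "execution_role_arn": "arn:aws:iam::123456789012:role/prefect-ecs-execution",
    },
    ...
)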
Lior Barak
12/07/2023, 2:45 PM
2023-12-07T15:19:54+02:00 13:19:54.822 | WARNING | prefect.server.services.cancellationcleanup - CancellationCleanup took 333.405324 seconds to run, which is longer than its loop interval of 20.0 seconds.
2023-12-07T15:24:52+02:00 13:24:52.771 | WARNING | prefect.server.services.cancellationcleanup - CancellationCleanup took 297.947375 seconds to run, which is longer than its loop interval of 20.0 seconds.
2023-12-07T15:29:54+02:00 13:29:54.353 | WARNING | prefect.server.services.cancellationcleanup - CancellationCleanup took 301.581939 seconds to run, which is longer than its loop interval of 20.0 seconds.
2023-12-07T15:35:23+02:00 13:35:23.989 | WARNING | prefect.server.services.cancellationcleanup - CancellationCleanup took 329.635069 seconds to run, which is longer than its loop interval of 20.0 seconds.
2023-12-07T15:41:26+02:00 13:41:26.275 | WARNING | prefect.server.services.cancellationcleanup - CancellationCleanup took 362.284819 seconds to run, which is longer than its loop interval of 20.0 seconds.
2023-12-07T15:46:40+02:00 13:46:40.591 | WARNING | prefect.server.services.cancellationcleanup - CancellationCleanup took 314.314349 seconds to run, which is longer than its loop interval of 20.0 seconds.
2023-12-07T15:52:41+02:00 13:52:41.746 | WARNING | prefect.server.services.cancellationcleanup - CancellationCleanup took 361.154281 seconds to run, which is longer than its loop interval of 20.0 seconds.
2023-12-07T15:59:18+02:00 13:59:18.214 | WARNING | prefect.server.services.cancellationcleanup - CancellationCleanup took 396.467137 seconds to run, which is longer than its loop interval of 20.0 seconds.
2023-12-07T16:04:05+02:00 14:04:05.897 | WARNING | prefect.server.services.cancellationcleanup - CancellationCleanup took 287.68302 seconds to run, which is longer than its loop interval of 20.0 seconds.
2023-12-07T16:09:54+02:00 14:09:54.378 | WARNING | prefect.server.services.cancellationcleanup - CancellationCleanup took 348.479282 seconds to run, which is longer than its loop interval of 20.0 seconds.
2023-12-07T16:16:08+02:00 14:16:08.516 | WARNING | prefect.server.services.cancellationcleanup - CancellationCleanup took 374.137448 seconds to run, which is longer than its loop interval of 20.0 seconds.
2023-12-07T16:22:20+02:00 14:22:20.136 | WARNING | prefect.server.services.cancellationcleanup - CancellationCleanup took 371.61939 seconds to run, which is longer than its loop interval of 20.0 seconds.
2023-12-07T16:27:01+02:00 14:27:01.589 | WARNING | prefect.server.services.cancellationcleanup - CancellationCleanup took 281.451967 seconds to run, which is longer than its loop interval of 20.0 seconds.
2023-12-07T16:32:15+02:00 14:32:15.239 | WARNING | prefect.server.services.cancellationcleanup - CancellationCleanup took 313.649418 seconds to run, which is longer than its loop interval of 20.0 seconds.
2023-12-07T16:37:58+02:00 14:37:58.483 | WARNING | prefect.server.services.cancellationcleanup - CancellationCleanup took 343.243348 seconds to run, which is longer than its loop interval of 20.0 seconds.
The servers are under stress, sure, but this seems a little excessive.
I'm self-hosting on k8s, and these errors happen with both a local and a remote (AWS RDS) DB.
Should I be worried?
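(If the cleanup loop itself is the pressure point, each server service can be tuned or switched off; a sketch assuming Prefect 2.x setting names:)

# Run the cancellation cleanup loop less often (seconds)
export PREFECT_API_SERVICES_CANCELLATION_CLEANUP_LOOP_SECONDS=300
# Or disable the service entirely while investigating
export PREFECT_API_SERVICES_CANCELLATION_CLEANUP_ENABLED=false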