An Vu Trong
10/25/2024, 12:50 PM
# Welcome to your prefect.yaml file! You can use this file for storing and managing
# configuration for deploying your flows. We recommend committing this file to source
# control along with your flow code.
# Generic metadata about this project
name: fintech_api
prefect-version: 3.0.4
# build section allows you to manage and build docker images
build:
# push section allows you to manage if and how this project is uploaded to remote locations
push:
# pull section allows you to provide instructions for cloning this project in remote locations
pull:
- prefect.deployments.steps.set_working_directory:
    directory: /home/anvutrong/trong_an_personal/Promete/fintech_api
# the deployments section allows you to provide configuration for deploying flows
deployments:
- name: daily_update
  tags:
  - staging pipeline
  description: Fetch and store data from Wifeed API to MongoDB
  entrypoint: src/api/flow_deployment.py:daily_update
  parameters: {}
  work_pool:
    name: local-wp
    work_queue_name: primary-queue
    job_variables: {}
  version:
  concurrency_limit: 1
  collision_strategy: CANCEL_NEW
  enforce_parameter_schema: true
  schedules:
  - rrule: RRULE:FREQ=DAILY;INTERVAL=1;BYDAY=MO,TU,WE,TH,FR;BYHOUR=0;BYMINUTE=0;BYSECOND=0
    timezone: Asia/Bangkok
    active: true
    max_active_runs:
    catchup: false
- name: intraday_eod_auto
  tags:
  - stock price pipeline
  description: Automatically fetch intraday stock price into clickhouse DB
  entrypoint: src/api/flow_deployment.py:intraday_eod
  parameters: {check_first_time_migration: false}
  work_pool:
    name: local-wp
    work_queue_name: primary-queue
    job_variables: {}
  concurrency_limit: 1
  collision_strategy: CANCEL_NEW
  enforce_parameter_schema: true
  schedules:
  - rrule: RRULE:FREQ=MINUTELY;INTERVAL=1;BYDAY=MO,TU,WE,TH,FR;BYHOUR=9,10,11,13,14,15
    timezone: Asia/Bangkok
    active: true
    max_active_runs: 1
    catchup: false
  version:
Andrew Brookins
10/26/2024, 1:57 AM
Andrew Brookins
10/26/2024, 2:08 AM
prefect deploy ... --concurrency-limit 3 --collision-strategy CANCEL_NEW
If you're editing your YAML file directly and want to set a collision strategy, you do that a little differently than how your YAML is configured. Check this out:
concurrency_limit:
  limit: 3
  collision_strategy: CANCEL_NEW
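For anyone converting several deployments at once, here is a minimal Python sketch of the change described above: it moves a top-level collision_strategy under concurrency_limit. The helper name nest_collision_strategy is hypothetical and is not part of Prefect; it only reshapes a plain dict such as one loaded from prefect.yaml.

```python
from copy import deepcopy


def nest_collision_strategy(deployment: dict) -> dict:
    """Return a copy of `deployment` with `collision_strategy` nested under
    `concurrency_limit` (hypothetical helper, not a Prefect API)."""
    fixed = deepcopy(deployment)
    strategy = fixed.pop("collision_strategy", None)
    if strategy is None:
        # Nothing to move: the deployment has no top-level strategy.
        return fixed
    limit = fixed.get("concurrency_limit")
    if isinstance(limit, dict):
        # Already nested: keep an existing nested strategy if one is set.
        limit.setdefault("collision_strategy", strategy)
    else:
        # Flat form (`concurrency_limit: 1`): wrap it in the nested mapping.
        fixed["concurrency_limit"] = {"limit": limit, "collision_strategy": strategy}
    return fixed


# The flat shape used in the original prefect.yaml.
flat = {
    "name": "daily_update",
    "concurrency_limit": 1,
    "collision_strategy": "CANCEL_NEW",
}
nested = nest_collision_strategy(flat)
print(nested["concurrency_limit"])
```

The input dict is left untouched, so the result can be compared against the original before writing anything back to the file.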
An Vu Trong
10/26/2024, 7:26 AM
Andrew Brookins
10/28/2024, 5:42 PM
An Vu Trong
10/29/2024, 4:58 AM
Andrew Brookins
10/30/2024, 7:27 PM
catchup=False and have the worker ignore late runs and only run the next non-late run, or something like that?
Andrew Brookins
10/30/2024, 7:28 PM
Andrew Brookins
10/30/2024, 7:29 PM
deployment.catch_up_late_runs = False or deployment.schedules[0].catch_up_late_runs = False. I think we'd probably start at the deployment level and then consider afterward if we also wanted/needed to let individual schedules override.
An Vu Trong
11/05/2024, 3:29 AM
- name: daily_update
  tags:
  - staging pipeline
  description: Fetch and store data from Wifeed API to MongoDB
  entrypoint: src/api/flow_controller.py:daily_update
  parameters: {}
  work_pool:
    name: local-wp
    work_queue_name: primary-queue
    job_variables: {}
  version:
  concurrency_limit:
    limit: 1
    collision_strategy: CANCEL_NEW
  enforce_parameter_schema: true
  catch_up_late_runs: false
  schedules:
  - rrule: RRULE:FREQ=DAILY;INTERVAL=1;BYDAY=MO,TU,WE,TH,FR;BYHOUR=0;BYMINUTE=0;BYSECOND=0
    timezone: Asia/Bangkok
    active: true
    max_active_runs:
    catchup: false
    catch_up_late_runs: false
@Andrew Brookins I set this up with catch_up_late_runs = False as you suggested, but the late runs were still executed. Any idea how to do this properly? Thank you very much.
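To make the behaviour discussed in this thread concrete ("ignore late runs and only run the next non-late run"), here is a small standalone Python sketch. It is only an illustration of those semantics under my own assumptions; drop_late_runs is a hypothetical function and does not call or describe any Prefect API.

```python
from datetime import datetime, timedelta, timezone


def drop_late_runs(scheduled, now):
    """Split scheduled start times into late ones (before `now`) and the
    next upcoming one. Hypothetical helper, not part of Prefect."""
    skipped = sorted(t for t in scheduled if t < now)
    upcoming = sorted(t for t in scheduled if t >= now)
    next_run = upcoming[0] if upcoming else None
    return skipped, next_run


# Example: a minutely schedule where the worker was down for three minutes.
now = datetime(2024, 10, 30, 9, 5, tzinfo=timezone.utc)
scheduled = [now + timedelta(minutes=m) for m in (-3, -2, -1, 1, 2)]
skipped, next_run = drop_late_runs(scheduled, now)
print(len(skipped), next_run.isoformat())
```

Under this reading, the three runs scheduled before `now` would be skipped rather than executed late, and only the run one minute ahead would start next.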