Martijn van Elferen
07/05/2025, 3:36 PM
Marvin
07/05/2025, 3:36 PM
Marvin
07/05/2025, 3:37 PM
yaml
# prefect.yaml
deployments:
  - name: process-customer-data-us
    entrypoint: flows/process_data.py:main_flow
    parameters:
      region: "US"
      batch_size: 1000
    work_pool:
      name: my-work-pool
  - name: process-customer-data-eu
    entrypoint: flows/process_data.py:main_flow
    parameters:
      region: "EU"
      batch_size: 500
    work_pool:
      name: my-work-pool
You can deploy these configurations using:
bash
prefect deploy
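If you only need one of these deployments, prefect deploy also accepts a name filter; for instance (using the deployment name from the example above):
bash
# Deploy a single named deployment instead of everything in prefect.yaml
prefect deploy --name process-customer-data-us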
2. Ad-hoc Flow Runs (Better for development/testing):
For one-off runs or testing, you can run the flow directly with different parameters:
python
import asyncio

from prefect import flow, get_client


@flow
def my_flow(region: str, batch_size: int):
    print(f"Processing {region} with batch size {batch_size}")


# Run directly with different parameters
my_flow("US", 1000)
my_flow("EU", 500)


# Or create flow runs programmatically from an existing deployment
async def trigger_run():
    async with get_client() as client:
        await client.create_flow_run_from_deployment(
            deployment_id="deployment-id",  # placeholder deployment ID
            parameters={"region": "US", "batch_size": 1000},
        )


asyncio.run(trigger_run())
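If you'd rather not manage the async client yourself, the run_deployment helper wraps the same API call; a minimal sketch (the "flow-name/deployment-name" string below is a placeholder you'd replace with your own):
python
from prefect.deployments import run_deployment

# Trigger a run of an existing deployment by "flow-name/deployment-name"
flow_run = run_deployment(
    name="main-flow/process-customer-data-us",  # placeholder name
    parameters={"region": "US", "batch_size": 1000},
)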
Some key considerations:
1. Each deployment can have its own:
- Parameters
- Schedule
- Work pool configuration
- Infrastructure settings
2. Using multiple deployments is beneficial when you need:
- Different schedules for different parameter sets
- Separate monitoring and observability for each parameter configuration
- Different infrastructure requirements based on parameters
- Clear separation of concerns in your workflow management
3. The prefect.yaml file can use YAML anchors and aliases to avoid repetition when you have many similar deployments.
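For illustration, here's a minimal sketch of that pattern, assuming a top-level definitions block for reusable aliases (the cron schedule is illustrative); it also shows per-deployment schedules alongside parameters and work pools:
yaml
definitions:
  work_pools:
    default: &default-work-pool
      name: my-work-pool
  schedules:
    daily: &daily
      cron: "0 6 * * *"  # illustrative schedule

deployments:
  - name: process-customer-data-us
    entrypoint: flows/process_data.py:main_flow
    parameters:
      region: "US"
      batch_size: 1000
    schedules:
      - *daily
    work_pool: *default-work-pool
  - name: process-customer-data-eu
    entrypoint: flows/process_data.py:main_flow
    parameters:
      region: "EU"
      batch_size: 500
    schedules:
      - *daily
    work_pool: *default-work-pool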
Remember that in Prefect 3.x, deployments are managed through the prefect.yaml file or programmatically using flow.from_source().deploy(). This is different from Prefect 2.x, which used the now-removed Deployment.build_from_flow() method.
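For reference, a minimal sketch of that programmatic path (the repository URL is a placeholder; the entrypoint matches the example above):
python
from prefect import flow

if __name__ == "__main__":
    flow.from_source(
        source="https://github.com/your-org/your-repo.git",  # placeholder repo
        entrypoint="flows/process_data.py:main_flow",
    ).deploy(
        name="process-customer-data-us",
        work_pool_name="my-work-pool",
        parameters={"region": "US", "batch_size": 1000},
    )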
For more details on deployment configuration options, check out the deployment documentation.
Martijn van Elferen
07/05/2025, 3:39 PM
Marvin
07/05/2025, 3:39 PM
Marvin
07/05/2025, 3:40 PM
python
from prefect import flow


@flow
def my_flow():
    pass


if __name__ == "__main__":
    my_flow.deploy(
        name="my-deployment",
        work_pool_name="my-pool",
        parameters={"param1": "value1"}
    )
Advantages:
- More Pythonic and familiar for developers
- Easier to implement dynamic/programmatic deployment logic
- Great for CI/CD pipelines
- Keeps deployment configuration close to the flow code
- Simpler to get started with
- Better for automated deployment management
prefect.yaml Approach:
yaml
deployments:
  - name: my-deployment
    entrypoint: flows/my_flow.py:my_flow
    work_pool:
      name: my-pool
    parameters:
      param1: value1
Advantages:
- Better for managing multiple deployments across a large project
- Easier to version control deployment configurations
- More suitable for teams that prefer declarative configuration
- Clearer overview of all deployments in one place
- Better for infrastructure-as-code approaches
Your current approach of using in-code deployments is absolutely fine and follows modern Prefect practices. In fact, it's often simpler and more maintainable for many use cases, especially when:
- You're working on smaller to medium-sized projects
- You want to keep deployment logic close to your flow code
- You need programmatic control over deployments
- You're implementing CI/CD pipelines
The prefect.yaml approach becomes more valuable when:
- You're managing many deployments across a large project
- You need a central place to view all deployment configurations
- You're working in a team that prefers infrastructure-as-code
- You want to separate deployment configuration from flow logic
Bottom line: There's no strict requirement to switch to prefect.yaml if your current in-code deployment approach is working well for you. Both methods are supported and considered best practice depending on your specific needs.
Martijn van Elferen
07/05/2025, 3:41 PM
Marvin
07/05/2025, 3:41 PM
Marvin
07/05/2025, 3:42 PM
prefect.yaml does offer superior organization and clarity. Let me show you a comprehensive approach combining prefect.yaml with CI/CD best practices.
Here's an example prefect.yaml that demonstrates parameter management for different DBT environments:
yaml
# prefect.yaml
deployments:
  # Use YAML anchors for shared configurations
  - &base-deployment
    work_pool:
      name: dbt-pool
      work_queue_name: default
    pull:
      - prefect.deployments.steps.git_clone:
          repository: https://github.com/your-org/your-repo.git
          branch: main

  # Development environment
  - <<: *base-deployment
    name: dbt-dev
    entrypoint: flows/dbt_flow.py:dbt_flow
    parameters:
      profiles_dir: "."
      project_dir: "dbt_project"
      select: "tag:daily"  # DBT selector
      target: "dev"
      vars:
        env: "dev"
        days_lookback: 7

  # Production environment
  - <<: *base-deployment
    name: dbt-prod
    entrypoint: flows/dbt_flow.py:dbt_flow
    parameters:
      profiles_dir: "."
      project_dir: "dbt_project"
      select: "tag:daily"
      target: "prod"
      vars:
        env: "prod"
        days_lookback: 30
And here's a corresponding GitHub Actions workflow that demonstrates CI/CD best practices:
yaml
# .github/workflows/prefect-cicd.yml
name: Prefect CI/CD

on:
  push:
    branches: [main]
  pull_request:
    branches: [main]

jobs:
  deploy-flows:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3

      - name: Set up Python
        uses: actions/setup-python@v4
        with:
          python-version: '3.10'

      - name: Install dependencies
        run: |
          python -m pip install --upgrade pip
          pip install prefect dbt-core dbt-snowflake  # adjust based on your DBT adapter

      - name: Login to Prefect Cloud
        run: |
          prefect cloud login --key ${{ secrets.PREFECT_API_KEY }}

      - name: Deploy flows
        run: |
          # Deploy to dev on PR
          if [[ $GITHUB_EVENT_NAME == 'pull_request' ]]; then
            prefect deploy --name dbt-dev
          fi
          # Deploy everything (dev and prod) on main branch push
          if [[ $GITHUB_EVENT_NAME == 'push' && $GITHUB_REF == 'refs/heads/main' ]]; then
            prefect deploy --all
          fi
Here's the corresponding DBT flow code (flows/dbt_flow.py):
python
import json
from typing import Dict, Optional

from prefect import flow
from prefect_dbt.cli.commands import DbtCoreOperation


@flow
def dbt_flow(
    profiles_dir: str,
    project_dir: str,
    select: str,
    target: str,
    vars: Optional[Dict] = None,
):
    # DbtCoreOperation runs shell commands, so the selector, target,
    # and vars are passed as dbt CLI flags rather than keyword arguments
    command = f"dbt build --select {select} --target {target}"
    if vars:
        command += f" --vars '{json.dumps(vars)}'"
    dbt_op = DbtCoreOperation(
        commands=[command],
        project_dir=project_dir,
        profiles_dir=profiles_dir,
    )
    result = dbt_op.run()
    return result


if __name__ == "__main__":
    # Local testing only - deployment handled by prefect.yaml
    dbt_flow(
        profiles_dir=".",
        project_dir="dbt_project",
        select="tag:daily",
        target="dev",
        vars={"env": "dev", "days_lookback": 7},
    )
Key Best Practices:
1. Environment Separation:
- Use YAML anchors (&base-deployment) to