# ask-marvin
f
@Marvin what is a common GitHub directory structure to organize prefect info like flows, variables, blocks, prefect.yaml etc
m
thought for 365.8 seconds
Great question. There isn’t a single “official” layout, but the following is a common, battle‑tested structure for Prefect 3.x projects on GitHub that keeps flows, deployments, variables, and blocks cleanly organized.
Example repository layout
.
├─ src/
│  └─ my_project/
│     ├─ flows/
│     │  ├─ etl_daily.py           # your flow(s)
│     │  └─ utils.py
│     ├─ blocks/                   # optional: custom block classes
│     │  └─ custom_blocks.py
│     └─ __init__.py
├─ scripts/
│  ├─ seed_variables.sh            # sets Prefect Variables via CLI
│  ├─ seed_blocks.py               # creates Prefect Block instances via Python
│  └─ register_blocks.sh           # registers custom block types (if any)
├─ tests/
│  └─ test_flows.py
├─ prefect.yaml                    # deployments config for `prefect deploy`
├─ pyproject.toml                  # or requirements.txt
├─ .prefectignore                  # exclude files from packaging (optional)
├─ .env.example                    # local env hints (do NOT commit real secrets)
├─ Dockerfile                      # if packaging into an image
├─ .github/
│  └─ workflows/
│     └─ prefect-deploy.yml        # CI for deploys/seeding
└─ README.md
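For the optional .prefectignore shown in the tree, a minimal starting point might look like the following (illustrative patterns only; the file uses .gitignore-style syntax):
```
# .prefectignore (illustrative)
.git/
__pycache__/
*.pyc
.venv/
tests/
.env
```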
What goes where (and why)
- src/my_project/flows: All your flow code, kept importable with a src/ layout so testing and packaging are predictable.
- prefect.yaml: The project’s deployment definitions (what to deploy, where, and how). You’ll commit this. Running `prefect deploy` from the repo root will read it.
- scripts/seed_variables.sh: A repeatable script to seed Prefect Variables in your Prefect server.
- scripts/seed_blocks.py: A repeatable script to create/update Prefect Block instances (secrets, credentials, storage) via Python, so you never commit secrets.
- scripts/register_blocks.sh: If you define custom Blocks (classes), register their types once so they show up in the UI.
- .prefectignore: Exclude large/unnecessary files when packaging or building artifacts.
- .github/workflows/prefect-deploy.yml: CI pipeline that runs `prefect deploy`, and optionally seeds variables/blocks in a controlled way.
Minimal prefect.yaml (Prefect 3.x)
# Name your project and pin the Prefect version used for deployments
name: my-project
prefect-version: ">=3.0.0"

# Define one or more deployments
deployments:
  - name: etl-daily
    entrypoint: "src/my_project/flows/etl_daily.py:etl_daily"  # file:function
    work_pool:
      name: "my-work-pool"
      # job_variables: {}           # optionally customize job variables here
    parameters:
      run_date: null
    tags: ["daily", "etl"]
    description: "Daily ETL job"
    # schedule:                     # optional; can be cron or interval
    #   cron: "0 2 * * *"
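For reference, a cron string reads minute, hour, day-of-month, month, day-of-week; a quick stdlib decode of the commented `0 2 * * *` above (sketch):
```python
# Decode the five cron fields of "0 2 * * *"
fields = dict(
    zip(
        ["minute", "hour", "day_of_month", "month", "day_of_week"],
        "0 2 * * *".split(),
    )
)
print(fields)  # minute 0, hour 2, everything else "*": runs daily at 02:00
```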
Seed Prefect Variables (safe for CI)
#!/usr/bin/env bash
set -euo pipefail

# Example: non-secret config you want available as Prefect Variables
prefect variable set API_BASE_URL "https://api.example.com" --overwrite --tag prod
prefect variable set DEFAULT_PAGE_SIZE 100 --overwrite
- CLI reference: prefect variable CLI
Register custom block types (if you defined any)
#!/usr/bin/env bash
set -euo pipefail

# Registers all block classes found in your module so they appear in the UI
prefect block register -m my_project.blocks
- CLI reference: prefect block CLI
Seed Prefect Block instances (Python; good for secrets/credentials)
# scripts/seed_blocks.py
import os

from prefect.blocks.system import Secret

# Example: store a secret in Prefect, read from an env/CI secret
# (API_TOKEN is an illustrative variable name)
Secret(value=os.environ["API_TOKEN"]).save("api-token", overwrite=True)

# Example for an integration that needs extras:
# from prefect_aws.credentials import AwsCredentials
# AwsCredentials(aws_access_key_id=..., aws_secret_access_key=...).save("prod-aws", overwrite=True)
- Run with extras if needed (AWS/Docker/Kubernetes):
```
uv run --with 'prefect[aws]' python scripts/seed_blocks.py
```
GitHub Actions CI skeleton
```
name: Prefect Deploy

on:
  push:
    branches: [ main ]

jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: astral-sh/setup-uv@v2
      - run: uv sync
      # Optionally seed variables/blocks (ensure secrets come from GitHub Secrets)
      - run: ./scripts/seed_variables.sh
      - run: uv run python scripts/seed_blocks.py
      # Create/update deployments defined in prefect.yaml
      - run: uv run prefect deploy
```
Tips and best practices
- Keep flow code importable (src/ layout) and tested (tests/).
- Don’t commit secrets. Store them as Blocks and seed via CI or one-off scripts.
- Use Variables for non-secret, environment-specific configuration.
- Use .prefectignore to keep artifacts lean.
- If you package code into Docker images or use cloud storage, wire those steps into CI and/or use the optional build/push/pull actions in prefect.yaml as needed.
- Use `prefect init` to generate a starter prefect.yaml from recipe prompts (help: `prefect init --help`).
Helpful links
- Prefect 3.x prefect.yaml guide: prefect.yaml (deployments)
- Variables CLI: prefect variable
- Blocks CLI: prefect block
If you can share whether you’re targeting Docker, Kubernetes, or just pure Python workers, I can tailor the prefect.yaml and CI bits (e.g., image build/push, job variables) to match your setup.
f
@Marvin what if I have multiple deployments?
m
thought for 58.2 seconds
Totally fine: put them all in the same prefect.yaml under the deployments list. Each item defines a single deployment, so you can mix multiple flows and/or multiple variants of the same flow (different schedules, parameters, work pools, etc.).
Example with multiple deployments
name: my-project
prefect-version: ">=3.0.0"

deployments:
  # Same flow, daily schedule
  - name: etl-daily
    entrypoint: "src/my_project/flows/etl.py:etl"
    work_pool:
      name: "default-pool"
    schedule:
      cron: "0 2 * * *"
    parameters:
      mode: "daily"
    tags: ["etl", "daily"]

  # Same flow, backfill parameters
  - name: etl-backfill
    entrypoint: "src/my_project/flows/etl.py:etl"
    work_pool:
      name: "high-throughput-pool"
    parameters:
      mode: "backfill"
      start_date: "2024-01-01"
      end_date: "2024-01-31"
    tags: ["etl", "backfill"]

  # Different flow, different pool/schedule
  - name: reports-monthly
    entrypoint: "src/my_project/flows/reports.py:generate_reports"
    work_pool:
      name: "analytics-pool"
    schedule:
      cron: "0 3 1 * *"
    tags: ["reports", "monthly"]
How to deploy multiple deployments
- Deploy everything defined in prefect.yaml (from the repo root):
prefect deploy
- Deploy a single flow/deployment by entrypoint:
prefect deploy src/my_project/flows/etl.py:etl
- Deploy a subset by name pattern (as hinted in `prefect deploy --help`):
prefect deploy --name 'etl-*'
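Assuming the `--name` pattern uses shell-style globbing (as the help text hints), Python's fnmatch shows what a pattern like `etl-*` would select (sketch):
```python
from fnmatch import fnmatch

# Deployment names from the example prefect.yaml above
deployment_names = ["etl-daily", "etl-backfill", "reports-monthly"]
selected = [name for name in deployment_names if fnmatch(name, "etl-*")]
print(selected)  # ['etl-daily', 'etl-backfill']
```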
Tips
- Reuse the same flow function in multiple deployments to vary parameters, schedules, or work pools without duplicating code.
- Keep names clear and consistent. Prefect identifies deployments as flow_name/deployment_name; using distinct deployment names helps filtering with patterns.
- For larger repos/monorepos, you can keep multiple folders, each with its own prefect.yaml, and run `prefect deploy` from the appropriate folder.
Docs: prefect.yaml (deployments)
f
@Marvin let me know more about this: scripts/ │ ├─ seed_variables.sh # sets Prefect Variables via CLI │ ├─ seed_blocks.py # creates Prefect Block instances via
m
thought for 86.9 seconds
Great! These scripts give you a repeatable, versioned way to bootstrap configuration in a new environment (or refresh an existing one) without committing secrets.
What they’re for
- seed_variables.sh: Creates/updates Prefect Variables (non-secret config) via the CLI. Good for URLs, flags, batch sizes, etc.
- seed_blocks.py: Creates/updates Prefect Block instances (secrets/credentials/storage/etc.) via Python. Keeps sensitive values out of Git and in your Prefect backend.
Key properties
- Idempotent: Re-running won’t duplicate anything (use `--overwrite` or `overwrite=True`).
- Environment-aware: Use tags or naming conventions to separate dev/staging/prod.
- CI-friendly: Runs in GitHub Actions or any CI where Prefect API credentials are available.
Example: seed_variables.sh
- Simple, direct version (good for a few variables):
#!/usr/bin/env bash
set -euo pipefail

# Optional: pass an env name (dev|staging|prod) for tagging
ENVIRONMENT="${1:-dev}"

prefect variable set API_BASE_URL "https://api.example.com" --overwrite --tag "$ENVIRONMENT"
prefect variable set DEFAULT_PAGE_SIZE "100" --overwrite --tag "$ENVIRONMENT"

# Verify
prefect variable ls
- File-driven version (good for many variables). Put config/variables/dev.env like:
# config/variables/dev.env
API_BASE_URL=https://dev-api.example.com
DEFAULT_PAGE_SIZE=100
FEATURE_X_ENABLED=true
Then script:
#!/usr/bin/env bash
set -euo pipefail

ENVIRONMENT="${1:-dev}"
VARS_FILE="config/variables/${ENVIRONMENT}.env"

if [[ ! -f "$VARS_FILE" ]]; then
  echo "Variables file not found: $VARS_FILE" >&2
  exit 1
fi

# Read KEY=VALUE lines (skip comments/blank lines)
while IFS='=' read -r KEY VALUE; do
  [[ -z "${KEY// }" || "$KEY" =~ ^# ]] && continue
  # Trim quotes around VALUE if present
  VALUE="${VALUE%\"}"; VALUE="${VALUE#\"}"
  prefect variable set "$KEY" "$VALUE" --overwrite --tag "$ENVIRONMENT"
done < "$VARS_FILE"

prefect variable ls
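If you ever prefer to do the same file-driven seeding from Python, the parse-and-skip logic is small (stdlib sketch; the hypothetical `parse_env_file` mirrors the bash loop above):
```python
# Mirrors the bash loop: skip blanks/comments, split on the first '=', trim quotes
from pathlib import Path


def parse_env_file(path: str) -> dict[str, str]:
    variables = {}
    for line in Path(path).read_text().splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        variables[key.strip()] = value.strip().strip('"')
    return variables
```
Each resulting key/value pair could then be passed to `prefect variable set` (e.g. via subprocess) just like the shell version.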
CLI reference: prefect variable CLI
Example: seed_blocks.py
- Use Python so you can safely pull secrets from CI env vars and save Blocks.
Minimal secrets example
# scripts/seed_blocks.py
import os
from prefect.blocks.system import Secret

# Expect token from CI or local env
api_token = os.environ["EXAMPLE_API_TOKEN"]
Secret(value=api_token).save("example-api-token", overwrite=True)
print("Saved Secret block: example-api-token")
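If a bare KeyError from `os.environ[...]` is too terse for CI logs, a hypothetical `require_env` helper gives a clearer failure message (stdlib sketch):
```python
import os


def require_env(name: str) -> str:
    """Return the named env var, or fail with a message pointing at CI secrets."""
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(f"{name} is not set; add it as a CI secret before seeding")
    return value
```
Then `Secret(value=require_env("EXAMPLE_API_TOKEN")).save(...)` fails with an actionable message when the secret is missing.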
AWS credentials + S3 bucket example
# scripts/seed_blocks.py
import os
from prefect.blocks.system import Secret

# Optional: save a Secret
if "EXAMPLE_API_TOKEN" in os.environ:
    Secret(value=os.environ["EXAMPLE_API_TOKEN"]).save("example-api-token", overwrite=True)

# AWS-related (requires extras)
from prefect_aws.credentials import AwsCredentials
from prefect_aws.s3 import S3Bucket

aws_creds = AwsCredentials(
    aws_access_key_id=os.environ["AWS_ACCESS_KEY_ID"],
    aws_secret_access_key=os.environ["AWS_SECRET_ACCESS_KEY"],
    region_name=os.getenv("AWS_DEFAULT_REGION", "us-east-1"),
)
aws_creds.save("prod-aws", overwrite=True)
print("Saved AwsCredentials block: prod-aws")

s3_block = S3Bucket(
    bucket_name=os.environ["S3_BUCKET_NAME"],
    credentials=AwsCredentials.load("prod-aws"),
)
s3_block.save("prod-data-bucket", overwrite=True)
print("Saved S3Bucket block: prod-data-bucket")
Run with required extras (example):
uv run --with 'prefect[aws]' python scripts/seed_blocks.py
Block CLI notes
- Register custom block types (if you define your own classes) so they appear in the UI:
prefect block register -m my_project.blocks
CLI reference: prefect block CLI
Using variables/blocks in your flows
from prefect import flow
from prefect.variables import Variable
from prefect.blocks.system import Secret

@flow
def etl():
    api_base = Variable.get("API_BASE_URL")
    token = Secret.load("example-api-token").get()
    # use api_base and token...
Typical CI wiring (GitHub Actions)
- You’ll need PREFECT_API_URL and PREFECT_API_KEY set as repo or environment secrets so the CLI/Python can talk to your Prefect server/Cloud.
name: Seed Prefect Config

on:
  workflow_dispatch:
  push:
    branches: [ main ]

jobs:
  seed:
    runs-on: ubuntu-latest
    env:
      PREFECT_API_URL: ${{ secrets.PREFECT_API_URL }}
      PREFECT_API_KEY: ${{ secrets.PREFECT_API_KEY }}
      EXAMPLE_API_TOKEN: ${{ secrets.EXAMPLE_API_TOKEN }}
      AWS_ACCESS_KEY_ID: ${{ secrets.AWS_ACCESS_KEY_ID }}
      AWS_SECRET_ACCESS_KEY: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
      S3_BUCKET_NAME: my-prod-bucket
    steps:
      - uses: actions/checkout@v4
      - uses: astral-sh/setup-uv@v2
      - run: uv sync

      # seed variables
      - run: ./scripts/seed_variables.sh prod

      # seed blocks (with extras if needed)
      - run: uv run --with 'prefect[aws]' python scripts/seed_blocks.py

      # optional: verify
      - run: uv run prefect variable ls
      - run: uv run prefect block ls
Best practices
- Don’t commit secrets. Keep values in CI secrets or a secure vault; create blocks from those values at deploy time.
- Use tags or block/variable naming conventions per environment (e.g., prod-aws, staging-aws).
- Make scripts idempotent (`--overwrite` and `overwrite=True`) so they can run safely on every deploy.
- Keep the “definitions” in code (names and structure) and the “values” in your environment (CI secrets, .env files not committed).
- If you have custom Blocks, put their classes under src/my_project/blocks and run `prefect block register -m my_project.blocks` once per change.
If you tell me which integrations you’re using (AWS/GCP/Azure/Docker/Kubernetes), I can tailor a ready-to-run seed_blocks.py for your stack.