Great question! Here’s a concise, table-by-table overview of what Prefect (assuming Prefect 3.x) stores in each table. Exact columns can vary between releases; for the authoritative schema, see the ORM models in the source:
Prefect Server ORM models (schema)
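Since columns vary between releases, it can help to inspect the database you actually have. Here is a minimal sketch using Python's built-in sqlite3 module. It assumes a local Prefect install backed by SQLite at ~/.prefect/prefect.db (a common default, but check your own settings); the helper names list_tables and list_columns are made up for illustration. For Postgres you would query information_schema instead.

```python
import sqlite3
from pathlib import Path

# Assumption: a local Prefect install using SQLite at this path.
DEFAULT_DB = Path.home() / ".prefect" / "prefect.db"


def list_tables(conn):
    """Return the names of all user tables, sorted alphabetically."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' "
        "ORDER BY name"
    ).fetchall()
    return [name for (name,) in rows]


def list_columns(conn, table):
    """Return (column_name, declared_type) pairs for one table."""
    # Table names cannot be bound as SQL parameters, so quote the identifier.
    quoted = '"' + table.replace('"', '""') + '"'
    # PRAGMA table_info rows are (cid, name, type, notnull, dflt_value, pk).
    return [(row[1], row[2]) for row in conn.execute(f"PRAGMA table_info({quoted})")]
```

To use it, open a connection with sqlite3.connect(DEFAULT_DB), then loop over list_tables(conn) and print list_columns(conn, table) for each table. Work on a copy of the file if the server is running.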
Runs and states
- flow: Registered flows. Stores flow metadata such as name, tags, and creation/update timestamps; referenced by deployments and flow runs. Storage/source information lives on the deployment rather than the flow.
- flow_run: One row per flow run. Holds identifiers and runtime metadata (flow_id, deployment_id, parameters snapshot, tags, start/end timestamps, state_id/current state pointer, parent/child relationships, etc.).
- flow_run_state: Full state history for flow runs. Each row records a transition (type, name, timestamp, message, state_details JSON like scheduled time, pause details, etc.).
- task_run: One row per task execution inside a flow run. Includes flow_run_id, task_key, dynamic_key, cache_key, state pointer, retries, run times, etc.
- task_run_state: Full state history for task runs (same structure as flow_run_state but for tasks).
- task_run_state_cache: Cache index for task results keyed by cache_key, with a pointer to the associated task run state and a cache expiration; used for task caching.
- log: All structured logs emitted by flows/tasks. Includes flow_run_id/task_run_id, timestamp, level, logger/name, message, and extra metadata.
- flow_run_input: Server-side storage for flow run “inputs”: keyed payloads attached to a specific flow run (e.g., data sent to a run via the UI/APIs, such as the input supplied when resuming a paused run).
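To make the "current state pointer" versus "full state history" distinction concrete, here is a small sketch against an in-memory SQLite database. The schema is hypothetical and heavily simplified (the real tables have many more columns), and the helper names are made up; it only illustrates how flow_run.state_id points at one row of flow_run_state while the history is every row for that run.

```python
import sqlite3

# Hypothetical, simplified schema for illustration only.
SCHEMA = """
CREATE TABLE flow_run (
    id TEXT PRIMARY KEY,
    name TEXT,
    state_id TEXT            -- pointer to the current row in flow_run_state
);
CREATE TABLE flow_run_state (
    id TEXT PRIMARY KEY,
    flow_run_id TEXT REFERENCES flow_run(id),
    type TEXT,
    timestamp TEXT           -- ISO 8601, so string order matches time order
);
"""


def make_runs_db():
    """Build an in-memory database with one run and three state transitions."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO flow_run VALUES ('run-1', 'demo-run', 's3')")
    conn.executemany(
        "INSERT INTO flow_run_state VALUES (?, ?, ?, ?)",
        [
            ("s1", "run-1", "SCHEDULED", "2024-01-01T00:00:00"),
            ("s2", "run-1", "RUNNING", "2024-01-01T00:00:05"),
            ("s3", "run-1", "COMPLETED", "2024-01-01T00:00:09"),
        ],
    )
    return conn


def current_state(conn, flow_run_id):
    """Follow the state_id pointer to the run's current state type."""
    row = conn.execute(
        "SELECT s.type FROM flow_run r "
        "JOIN flow_run_state s ON s.id = r.state_id "
        "WHERE r.id = ?",
        (flow_run_id,),
    ).fetchone()
    return row[0] if row else None


def state_history(conn, flow_run_id):
    """Return every recorded state type for the run, oldest first."""
    rows = conn.execute(
        "SELECT type FROM flow_run_state WHERE flow_run_id = ? ORDER BY timestamp",
        (flow_run_id,),
    ).fetchall()
    return [state_type for (state_type,) in rows]
```

The same pattern applies to task_run and task_run_state. Reading the pointer is a single-row lookup, while the history query scans all transitions for the run.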
Deployments and scheduling
- deployment: Deployment definitions for a flow. Stores name, flow_id, version link, default parameters, tags, work_pool association, job variables/overrides, storage/source location, description, etc.
- deployment_version: Immutable records of deployment “releases” (build metadata, manifest/hash, parameters snapshot, source reference, version number).
- deployment_schedule: Zero or more schedules attached to a deployment (cron/interval/RRule definitions, timezone, active/paused).
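The "zero or more schedules per deployment" relationship can be sketched like this. Again, this is a hypothetical, simplified schema in an in-memory SQLite database, not Prefect's actual columns; the JSON payloads and the helper names are assumptions for illustration.

```python
import json
import sqlite3

# Hypothetical, simplified schema for illustration only.
SCHEMA = """
CREATE TABLE deployment (id TEXT PRIMARY KEY, name TEXT);
CREATE TABLE deployment_schedule (
    id TEXT PRIMARY KEY,
    deployment_id TEXT REFERENCES deployment(id),
    schedule TEXT,      -- JSON-encoded schedule definition
    active INTEGER      -- 1 = active, 0 = paused
);
"""


def make_deployments_db():
    """One deployment with two schedules (one paused) and one with none."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany(
        "INSERT INTO deployment VALUES (?, ?)",
        [("d1", "etl"), ("d2", "adhoc")],
    )
    conn.executemany(
        "INSERT INTO deployment_schedule VALUES (?, ?, ?, ?)",
        [
            ("sch1", "d1", json.dumps({"cron": "0 * * * *", "timezone": "UTC"}), 1),
            ("sch2", "d1", json.dumps({"interval": 3600}), 0),
        ],
    )
    return conn


def active_schedules(conn):
    """Map each deployment name to its list of active schedule definitions."""
    # Start with every deployment so those without schedules map to [].
    result = {name: [] for (name,) in conn.execute("SELECT name FROM deployment")}
    rows = conn.execute(
        "SELECT d.name, s.schedule FROM deployment d "
        "JOIN deployment_schedule s ON s.deployment_id = d.id "
        "WHERE s.active = 1 ORDER BY s.id"
    )
    for name, schedule in rows:
        result[name].append(json.loads(schedule))
    return result
```

Keeping schedules in their own table is what lets a deployment have none, one, or several, each of which can be paused independently.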
Work pools, queues, and workers
- work_pool: Execution backends (e.g., Process, Docker, Kubernetes, ECS). Stores type, base job template, default settings, concurrency controls, pause status.
- work_queue: Named queues under a work pool, with optional filters and per-queue concurrency limits; used to route scheduled work to workers.
- worker: Registered workers and their heartbeats/configuration. Includes worker name, type, pool_id, status, last_heartbeat, and capabilities.
- agent: Legacy agent records (from Prefect 2.x). Retained for compatibility/telemetry in some environments; superseded by workers in 3.x.
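As a rough illustration of how heartbeat rows can be used, the sketch below treats a worker as online if its last heartbeat is recent enough. The WorkerRow class, the online_workers helper, and the 90-second threshold are all assumptions for this example, not Prefect's actual logic or defaults.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class WorkerRow:
    """Hypothetical stand-in for a row in the worker table."""
    name: str
    work_pool: str
    last_heartbeat: datetime


def online_workers(workers, now, max_age=timedelta(seconds=90)):
    """Return names of workers whose last heartbeat is within max_age of now."""
    return [w.name for w in workers if now - w.last_heartbeat <= max_age]


now = datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
workers = [
    WorkerRow("worker-a", "k8s-pool", now - timedelta(seconds=30)),
    WorkerRow("worker-b", "k8s-pool", now - timedelta(minutes=10)),
]
print(online_workers(workers, now))  # ['worker-a']
```

Passing now explicitly keeps the check deterministic and easy to test; in practice you would pass datetime.now(timezone.utc).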
Blocks (configuration objects)
- block_type: Definitions for block types (e.g., “S3 Bucket”, “GCS Credentials”). Includes slug, display name, logo/docs links, and code entrypoint info.
- block_schema: JSON schema for a block_type at a specific version (field definitions, defaults, encryption flags).
- block_schema_reference: References between block schemas (supports nested/linked schemas).
- block_document: A concrete block instance with values (linked to block_schema/type), including name, is_protected, is_anonymous, and the stored encrypted/unencrypted payload.
- block_document_reference: Links from one block document to another (used for nested block configurations).
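Here is a sketch of how nested block configurations can be assembled from document rows plus reference rows. The dictionaries below are hypothetical in-memory stand-ins for block_document and block_document_reference, and resolve_block is a made-up helper; the real tables and resolution logic are more involved.

```python
# Hypothetical stand-ins for block_document rows, keyed by document id.
documents = {
    "doc-bucket": {"name": "my-bucket", "data": {"bucket": "reports"}},
    "doc-creds": {"name": "my-creds", "data": {"region": "us-east-1"}},
}

# Hypothetical stand-ins for block_document_reference rows:
# (parent_document_id, field_name, referenced_document_id)
references = [("doc-bucket", "credentials", "doc-creds")]


def resolve_block(doc_id, documents, references, _seen=frozenset()):
    """Return a document's data with referenced documents nested under their field names."""
    if doc_id in _seen:
        raise ValueError(f"circular block reference at {doc_id}")
    # Copy so the stored document is not mutated while nesting children.
    data = dict(documents[doc_id]["data"])
    for parent_id, field, child_id in references:
        if parent_id == doc_id:
            data[field] = resolve_block(
                child_id, documents, references, _seen | {doc_id}
            )
    return data


print(resolve_block("doc-bucket", documents, references))
# {'bucket': 'reports', 'credentials': {'region': 'us-east-1'}}
```

Storing the link as a separate row, rather than copying the nested values, means an update to the referenced document is picked up by every block that points at it.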
Automations and events
- automation: Automation definitions (triggers and actions). Stores name, enabled flag, trigger configuration (event filters, windows), actions (e.g., create flow run), owner/metadata.
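- automation_bucket: Internal evaluation state for automation triggers. Each bucket tracks matching events for a trigger (counts, window start/end, last event seen) so the server can decide when the trigger should fire.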
- automation_related_resource: Join table tying automations to specific resources they relate to (for targeting/scoping).
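To give a feel for what a windowed trigger evaluates, here is a minimal sketch of a "fire when at least N matching events occur within a time window" check. This is a generic sliding-window count written for illustration, not Prefect's implementation; should_fire and its parameters are hypothetical names.

```python
from datetime import datetime, timedelta


def should_fire(event_times, threshold, within):
    """Return True if any `threshold` events fall inside a span of length `within`."""
    if threshold < 1:
        raise ValueError("threshold must be at least 1")
    times = sorted(event_times)
    # Compare each event with the one (threshold - 1) positions after it.
    for i in range(len(times) - threshold + 1):
        if times[i + threshold - 1] - times[i] <= within:
            return True
    return False


start = datetime(2024, 1, 1, 0, 0, 0)
events = [start, start + timedelta(seconds=10), start + timedelta(seconds=50)]
print(should_fire(events, threshold=2, within=timedelta(seconds=15)))  # True
print(should_fire(events, threshold=3, within=timedelta(seconds=15)))  # False
```

A server evaluating many automations would keep this kind of running state per trigger instead of rescanning all events, which is roughly the role a bucket table plays.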