# marvin-ai
Hi @Marvin, What do you think of this Prefect server self-hosted deployment on EC2? Would it be sufficient and optimized to run on a t3.medium with best practices? user_data.sh.tftpl:
```bash
#!/bin/bash
set -euxo pipefail

...
# Install Docker Compose
mkdir -p /usr/local/lib/docker/cli-plugins
curl -L "https://github.com/docker/compose/releases/download/v2.29.2/docker-compose-linux-x86_64" -o /usr/local/lib/docker/cli-plugins/docker-compose
chmod +x /usr/local/lib/docker/cli-plugins/docker-compose

# Setup directories
mkdir -p /data/postgres /opt/prefect /var/lib/tailscale
chown -R ec2-user:ec2-user /opt/prefect

# Mount Postgres volume
if ! blkid /dev/xvdb >/dev/null 2>&1; then
  mkfs -t xfs /dev/xvdb
fi
if ! grep -q "/dev/xvdb" /etc/fstab; then
  echo "/dev/xvdb /data/postgres xfs defaults,nofail 0 2" >> /etc/fstab
fi
mount -a
chown -R 999:999 /data/postgres

# Fetch or generate secrets from SSM
cd /opt/prefect
rm -f .env

# Create docker-compose.yml
cat > docker-compose.yml << 'COMPOSE_EOF'
services:
  postgres:
    image: postgres:15
    container_name: prefect-postgres
    environment:
      POSTGRES_USER: prefect
      POSTGRES_PASSWORD: $${DB_PASSWORD}
      POSTGRES_DB: prefect
    volumes:
      - /data/postgres:/var/lib/postgresql/data
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U prefect"]
      interval: 5s
      timeout: 5s
      retries: 5
    restart: always
    networks:
      - prefect

  redis:
    image: redis:7-alpine
    container_name: prefect-redis
    healthcheck:
      test: ["CMD-SHELL", "redis-cli ping"]
      interval: 5s
      timeout: 5s
      retries: 5
    restart: always
    networks:
      - prefect

  prefect-server:
    image: prefecthq/prefect:3-latest
    container_name: prefect-server
    depends_on:
      postgres:
        condition: service_healthy
      redis:
        condition: service_healthy
    environment:
      # Database

      # Server config

      # Redis messaging
      # UI config - point to ALB for external access

    command: prefect server start
    ports:
      - "4200:4200"
    restart: always
    networks:
      - prefect

  tailscale:
    image: tailscale/tailscale:latest
    container_name: tailscale
    hostname: $${TAILSCALE_HOSTNAME}
    environment:
      - TS_AUTHKEY=$${TAILSCALE_AUTH_KEY}
      - TS_STATE_DIR=/var/lib/tailscale
      - TS_USERSPACE=false
      - TS_ACCEPT_DNS=true
      - TS_EXTRA_ARGS=--advertise-tags=tag:server
    volumes:
      - /var/lib/tailscale:/var/lib/tailscale
      - /dev/net/tun:/dev/net/tun
    cap_add:
      - NET_ADMIN
      - SYS_MODULE
    network_mode: host
    restart: always
networks:
  prefect:
    driver: bridge
COMPOSE_EOF

# Create .env file with secrets
cat > .env << ENV_EOF
DB_PASSWORD=$DB_PASSWORD
PREFECT_API_KEY=$PREFECT_API_KEY
TAILSCALE_AUTH_KEY=$TAILSCALE_AUTH_KEY
TAILSCALE_HOSTNAME=$TAILSCALE_HOSTNAME
APP_DOMAIN=$APP_DOMAIN
AWS_REGION=$AWS_REGION
PROJECT=$PROJECT
ENVIRONMENT=$ENVIRONMENT
ENV_EOF

# Ensure proper permissions
chown ec2-user:ec2-user docker-compose.yml .env
chmod 600 .env

# Configure Docker daemon to use CloudWatch logs driver
mkdir -p /etc/docker
cat > /etc/docker/daemon.json << DAEMON_EOF
{
  "log-driver": "awslogs",
  "log-opts": {
    "awslogs-region": "$AWS_REGION"
  }
}
DAEMON_EOF
systemctl restart docker

# Start services
docker compose up -d

# Wait for Tailscale to connect
echo "Waiting for Tailscale to establish connection..."
sleep 10

# If routes are specified, advertise them
if [ -n "$ADVERTISE_ROUTES" ] && [ "$ADVERTISE_ROUTES" != "" ]; then
  echo "Advertising routes: $ADVERTISE_ROUTES"
  docker exec tailscale tailscale set --advertise-routes="$ADVERTISE_ROUTES" || true
fi

# Display Tailscale status
docker exec tailscale tailscale status || true

# Create systemd service for auto-start
cat > /etc/systemd/system/prefect.service << 'SYSTEMD_EOF'
[Unit]
Description=Prefect Server Docker Compose
Requires=docker.service
After=docker.service

[Service]
Type=oneshot
WorkingDirectory=/opt/prefect
ExecStart=/usr/bin/docker compose up -d
ExecStop=/usr/bin/docker compose down
RemainAfterExit=yes

[Install]
WantedBy=multi-user.target
SYSTEMD_EOF

systemctl daemon-reload
systemctl enable prefect.service
```
Great start! Overall, this will work on a single EC2 host, and a t3.medium can handle a small-team, low-to-moderate-throughput deployment of Prefect Server. I'd make a few tweaks for correctness, reliability, and ops hygiene.

High-level verdict
- Suitability: A t3.medium (2 vCPU, 4GB) is sufficient for light usage (e.g., a few hundred flow runs/day, low concurrency). For steadier performance, enable T3 Unlimited, or step up to t3.large/t3a.large if you expect bursts or heavier UI/API use.
- Architecture: Using Docker Compose with a separate Postgres volume is good. Tailscale is fine for private access. If exposing publicly via an ALB, harden CORS and pin the image version.
- Missing bits: Server env vars, a healthcheck for the server, per-service CloudWatch logging (your daemon.json is not enough), and optionally a worker service.

Key fixes and recommendations

1) Use the correct Prefect 3.x env vars in the server

Fill in the `prefect-server` `environment:` block with these. This ensures the API listens externally, the server uses your Postgres DB, and the UI can reach the API when behind an ALB.
```yaml
  prefect-server:
    image: prefecthq/prefect:3.0.11   # pin a version; avoid 3-latest
    container_name: prefect-server
    depends_on:
      postgres:
        condition: service_healthy
      redis:
        condition: service_healthy
    environment:
      # Postgres connection ($$ escapes the template var for Terraform's templatefile)
      PREFECT_SERVER_DATABASE_CONNECTION_URL: "postgresql+asyncpg://prefect:$${DB_PASSWORD}@postgres:5432/prefect"

      # Server API bind address/port (expose to the ALB and Tailscale)
      PREFECT_SERVER_API_HOST: "0.0.0.0"
      PREFECT_SERVER_API_PORT: "4200"

      # If you front with an ALB on a custom domain, consider setting:
      # - exact CORS allowed origin (avoid '*')
      PREFECT_SERVER_API_CORS_ALLOWED_ORIGINS: "https://$${APP_DOMAIN}"
      # - base path if the ALB rewrites paths (omit if not needed)
      # PREFECT_SERVER_API_BASE_PATH: "/"

      # UI: keep enabled (default), and set the API URL if the UI is accessed via your domain
      # Omit if the ALB terminates TLS and forwards to the same host/port without path changes
      PREFECT_SERVER_UI_ENABLED: "true"
      PREFECT_SERVER_UI_API_URL: "https://$${APP_DOMAIN}/api"
    command: prefect server start
    ports:
      - "4200:4200"
    restart: always
    healthcheck:
      # the slim Prefect image may not ship wget/curl, so probe with Python
      test: ["CMD", "python", "-c", "import urllib.request; urllib.request.urlopen('http://127.0.0.1:4200/api/health')"]
      interval: 10s
      timeout: 5s
      retries: 5
    networks:
      - prefect
```
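Once the stack is up, a quick host-side sanity check (assuming the 4200 port mapping above) is to hit the same health endpoint from the EC2 instance:

```bash
# exits non-zero until the API is ready; prints the health response once it is
curl -fsS http://localhost:4200/api/health
```

The same `/api/health` path is a reasonable target for an ALB target group health check.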
Notes
- Database URL format: `postgresql+asyncpg://user:password@host:port/db`. In 3.x, the canonical setting is `PREFECT_SERVER_DATABASE_CONNECTION_URL`; for backward compatibility, `PREFECT_API_DATABASE_CONNECTION_URL` also works. See the Settings reference.
- The host/port/CORS/UI settings above map directly to ServerAPISettings and ServerUISettings. See Self-hosted server with Docker and the Settings reference.

2) Add a Prefect worker (if you want flows to run)

Prefect Server does not execute flows by itself; you'll need at least one worker pointed at your server. You can run it on the same box:
```yaml
  worker:
    image: prefecthq/prefect:3.0.11
    container_name: prefect-worker
    depends_on:
      prefect-server:
        condition: service_healthy
    environment:
      # Inside the compose network, refer to the server container by name
      PREFECT_API_URL: "http://prefect-server:4200/api"
    # Create a work pool named 'default' first; keep comments out of the
    # command value itself, or the shell receives them as literal arguments
    command: prefect worker start -p default -n ec2-t3m
    restart: always
    networks:
      - prefect
```
After the server is up, create the work pool you'll use for deployments, either in the UI or via the CLI. If using the CLI on the host, set `PREFECT_API_URL=https://$APP_DOMAIN/api` first.
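For example (a sketch; the pool name and type are yours to choose, and the domain comes from your own setup):

```bash
export PREFECT_API_URL="https://$APP_DOMAIN/api"
prefect work-pool create "default" --type process
```

The pool name must match the `-p` argument passed to `prefect worker start`.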
3) Logging to CloudWatch: configure per-service (not daemon-only)

Setting only a daemon-wide `awslogs` region in `/etc/docker/daemon.json` is insufficient and can actually break container starts, because `awslogs-group` is required. Configure it per service:
```yaml
    logging:
      driver: awslogs
      options:
        awslogs-region: "$${AWS_REGION}"
        awslogs-group: "/ecs/prefect-selfhosted"
        awslogs-stream: "prefect-server"
        awslogs-create-group: "true"
```
Repeat for `postgres`, `redis`, and `worker` (with a different `awslogs-stream` per service). Alternatively, keep Docker's default `json-file` logger and ship logs via the CloudWatch agent.

4) Redis: optional in many cases
- For a single-node server with the built-in services, Redis is not strictly required for basic orchestration. If you don't explicitly use features requiring a broker at scale, you can remove Redis to simplify ops.
- If you keep Redis, keep the healthcheck (nice!) and make sure your security group and EBS backup posture are acceptable.

5) Health and lifecycle
- You already have Postgres/Redis healthchecks; add one for the Prefect server (see above).
- Consider using `docker compose up -d --pull always` during upgrades, and pin the Prefect image with a version tag.
- Your systemd oneshot service is fine; you might add a timer/cron for a periodic `docker compose pull && docker compose up -d`.

6) Security and secrets
- You're writing an `.env` file with secrets. If possible, fetch them from SSM Parameter Store at boot and export them directly into the process environment, or use Docker secrets. At minimum, restrict permissions (you did `chmod 600`, good).
- If the ALB is the entry point, restrict the EC2 security group so port 4200 is not public. With Tailscale, you can avoid public exposure entirely.

7) Data durability and DB ops
- You set up a dedicated EBS volume with XFS for Postgres; good. Make sure you:
  - Enable snapshots (e.g., DLM or AWS Backup).
  - Set `max_connections`/`shared_buffers` appropriately if you scale; for small installations the defaults are fine.
  - Consider RDS if you want managed backups/patching.

8) Instance sizing guidance for t3.medium
- Works well for:
  - 1 server + Postgres + (optional) 1 worker
  - Tens of concurrent tasks and a few hundred daily runs
- Caveats:
  - Burstable credits can throttle CPU under sustained load; enable T3 Unlimited or size up.
  - Postgres and the UI can contend for memory on 4GB if you add a busy worker; monitor memory and swap.
  - If you foresee heavier concurrency, step up to t3.large/t3a.large (8GB) or split the DB onto RDS.

Reference docs
- Prefect 3 self-hosted with Docker: Server in Docker
- Docker Compose example and guidance: Docker Compose guide
- Settings reference (env var names): Settings reference

If you share how you plan to access the UI (Tailscale-only vs ALB + public domain) and your expected workload, I can tailor the exact env vars and CORS/UI settings and suggest instance sizing with more precision.
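To put the burst-credit caveat in numbers, here is a back-of-envelope sketch using AWS's published t3.medium figures (24 CPU credits earned/hour, up to 576 banked, 2 vCPUs, one credit = one vCPU-minute at 100%; re-check these against current AWS docs):

```shell
# How long can a t3.medium run both vCPUs flat out from a full credit bucket?
EARN_PER_HOUR=24     # CPU credits earned per hour (t3.medium)
MAX_CREDITS=576      # maximum banked credits (t3.medium)
VCPUS=2
BURN_PER_HOUR=$((VCPUS * 60))                 # credits/hour at 100% on both vCPUs
NET_DRAIN=$((BURN_PER_HOUR - EARN_PER_HOUR))  # net credits drained per hour
echo "$((MAX_CREDITS / NET_DRAIN)) hours of sustained full-CPU burst"
```

After the bucket empties, CPU is throttled to the baseline (about 20% per vCPU on t3.medium) unless T3 Unlimited is enabled, which is exactly when a busy scheduler plus Postgres starts to feel sluggish.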