Saad80
08/29/2025, 9:57 PM
Marvin
08/29/2025, 9:58 PM
Marvin
08/29/2025, 10:00 PM
The offending setting is PREFECT_SERVER_ALLOW_EPHEMERAL_MODE. That setting does not exist in Prefect 3.x (it’s a leftover from older configs), so Pydantic raises a validation error during import.
How to fix
Because this occurs on import, fix the profile first (before running Prefect/your flow):
1) Check and remove any environment variable
- Run:
env | grep PREFECT_SERVER_ALLOW_EPHEMERAL_MODE
- If you see it, unset it for the current shell:
unset PREFECT_SERVER_ALLOW_EPHEMERAL_MODE
- Also remove it from your shell profile (e.g., ~/.bashrc, ~/.zshrc, systemd unit, etc.) so it doesn’t reappear.
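To double-check programmatically, here is a small stdlib-only sketch (the helper name is mine, not a Prefect API) that lists every PREFECT_* variable in an environment mapping so stray settings are easy to spot:

```python
import os

def prefect_env_vars(environ):
    """Return the names of all PREFECT_* variables in the given mapping."""
    return sorted(name for name in environ if name.startswith("PREFECT_"))

# Inspect the real environment; PREFECT_SERVER_ALLOW_EPHEMERAL_MODE should not appear.
print(prefect_env_vars(os.environ))
```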
2) Clean up your profiles.toml (most likely culprit)
- Open or move aside your user profiles file:
mv ~/.prefect/profiles.toml ~/.prefect/profiles.toml.bak
- This prevents Prefect from loading the invalid setting.
3) Recreate default profiles
- Now that the bad file is out of the way, you can safely run:
prefect profile populate-defaults
prefect profile ls
prefect profile inspect
- If you need a custom profile again, create it with:
prefect profile create my-profile
and only add settings from the Prefect 3 settings reference.
4) Re-run your flow
python3 ray-flow.py
Notes
- If you were trying to enable an “ephemeral server” behavior from older docs, that specific setting is not part of Prefect 3.x. In 3.x, just configure your API target normally:
- Prefect Cloud: log in and leave PREFECT_API_URL managed by the CLI
- Prefect Server (self-hosted): set PREFECT_API_URL to your server API URL (e.g., http://127.0.0.1:4200/api)
Docs (Prefect 3.x)
- Settings reference: Prefect 3 settings reference
- Settings and profiles: Settings and profiles concept
If you want, paste the relevant part of your ~/.prefect/profiles.toml
(with any secrets redacted), and I’ll point out exactly what to keep or remove. Also let me know if you were upgrading from Prefect 2.x; I can help map any other settings that changed.
Saad80
08/29/2025, 10:02 PM
Marvin
08/29/2025, 10:02 PM
Marvin
08/29/2025, 10:03 PM
Since your traceback references prefect.runner and shows a Pydantic v1 validation error for an unknown setting, your script is actually importing Prefect 3.x at runtime. In Prefect 3, PREFECT_SERVER_ALLOW_EPHEMERAL_MODE is not a valid setting, hence the failure. Your pip3 show prefect output shows 2.14.20, which means pip3 and python3 are likely pointing at different environments.
Quick checks
Run these to confirm what your script’s Python sees:
which python3
python3 -c "import prefect, sys; print('prefect', prefect.__version__, 'from', prefect.__file__, 'python', sys.executable)"
python3 -m pip show prefect
If that prints a 3.x version, you’ve got a mismatch between pip and python.
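A tiny guard near the top of ray-flow.py can make that mismatch fail loudly instead of surfacing as a Pydantic error. The helper below is my own sketch (the parsing logic is not a Prefect API):

```python
def major_version(version: str) -> int:
    """Extract the major component from a version string like '2.14.20'."""
    return int(version.split(".")[0])

# In ray-flow.py you might then assert the interpreter imports Prefect 2.x:
#   import prefect
#   assert major_version(prefect.__version__) == 2, f"Expected Prefect 2.x, got {prefect.__version__}"
print(major_version("2.14.20"))  # → 2
```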
How to fix (stay on Prefect 2.x)
- Ensure the python you run your flow with has Prefect 2.x installed:
- Option A: Reinstall Prefect 2 in the interpreter you’re using:
python3 -m pip uninstall -y prefect
python3 -m pip install 'prefect<3' prefect-ray
python3 -c "import prefect; print(prefect.__version__)"
- Option B (recommended): Use a clean virtual environment so pip and python are aligned:
python3 -m venv .venv
source .venv/bin/activate
pip install 'prefect<3' prefect-ray
python -c "import prefect; print(prefect.__version__)"
python ray-flow.py
Notes
- Your ~/.prefect/profiles.toml is fine for Prefect 2.x; you do not need to remove PREFECT_SERVER_ALLOW_EPHEMERAL_MODE if you’re staying on 2.x.
- The error only appears because Prefect 3 is being imported somewhere on your path; once you’re truly running 2.14.20, that setting will be accepted.
If you prefer to migrate to Prefect 3 instead, remove PREFECT_SERVER_ALLOW_EPHEMERAL_MODE from your profile and update your code accordingly, but since you’re using prefect-ray (a 2.x integration), staying on 2.x is likely the right call here.
Send me the output of the version/path checks above if you want me to pinpoint the exact mismatch.Saad80
08/29/2025, 11:02 PMMarvin
08/29/2025, 11:02 PMMarvin
08/29/2025, 11:04 PMshout
tasks are submitted to Ray and run in parallel on Ray workers (each sleeps 0.5s and prints “#<n>”).
- The script waits for all tasks to finish, then exits.
- If you have a Prefect API configured (PREFECT_API_URL set), you’ll see one flow run with 10 task runs in the Prefect UI; otherwise it runs entirely locally.
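The parallel behavior can be illustrated with a stdlib-only analogue (threads standing in for Ray workers; the sleep duration here is shortened and purely illustrative):

```python
import time
from concurrent.futures import ThreadPoolExecutor

def shout(n: int) -> str:
    time.sleep(0.1)  # stand-in for the 0.5s task sleep
    return f"#{n}"

start = time.perf_counter()
# 10 workers run all 10 tasks concurrently, so total time is ~1 sleep, not 10.
with ThreadPoolExecutor(max_workers=10) as pool:
    results = list(pool.map(shout, range(10)))
elapsed = time.perf_counter() - start
print(results[:3], f"{elapsed:.2f}s")
```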
Important details and fixes for your snippet
- Use a valid Ray address string: replace <host> with your head node’s address and don’t include literal angle brackets:
address="ray://<host>:10001"
- Fix the runtime_env key: Ray expects pip, not pip3.
RayTaskRunner(
    address="ray://<host>:10001",
    init_kwargs={"runtime_env": {"pip": ["prefect==2.14.20", "prefect-ray"]}},
)
Alternatively, pre-install packages on your Ray cluster nodes and omit runtime_env.
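Since the pip/pip3 key mix-up is easy to make, a small stdlib-only helper (my own sketch, not part of prefect-ray) can normalize a runtime_env dict before you pass it to RayTaskRunner:

```python
def normalize_runtime_env(runtime_env: dict) -> dict:
    """Rename a mistaken 'pip3' key to the 'pip' key Ray expects."""
    fixed = dict(runtime_env)
    if "pip3" in fixed and "pip" not in fixed:
        fixed["pip"] = fixed.pop("pip3")
    return fixed

print(normalize_runtime_env({"pip3": ["prefect==2.14.20", "prefect-ray"]}))
# → {'pip': ['prefect==2.14.20', 'prefect-ray']}
```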
- Printing vs logging: print() output may land in Ray worker logs, not your local console or Prefect logs. If you want to see output in Prefect, use the run logger:
from prefect import get_run_logger

@task
def shout(number):
    logger = get_run_logger()
    time.sleep(0.5)
    logger.info(f"#{number}")
- Waiting: .wait() is fine; .result() is useful if you want to raise any task exceptions in your driver process.
What you’ll see if things are working
- Quick parallel execution: total runtime should be ~0.6–1.5s for 10 tasks (0.5s each) plus overhead.
- Ray will log that it’s connecting to the cluster.
- In the Prefect UI (if connected), a single flow run with 10 completed task runs.
Common pitfalls to watch for
- Wrong runtime_env key: using pip3 will error; use pip.
- Connectivity: if the driver can’t reach ray://<host>:10001 (firewall/SG), it will hang or throw connection errors. You can test directly:
python3 -c "import ray; ray.init(address='ray://<host>:10001'); print(ray.cluster_resources())"
- Package availability on workers: ensure Prefect (and any task deps) are installed on Ray workers, either via runtime_env={"pip": [...]} or by baking them into your cluster images.
- Mixed Prefect versions: if you see the “Unknown setting 'PREFECT_SERVER_ALLOW_EPHEMERAL_MODE'” error again, your driver is importing Prefect 3.x somewhere. Make sure python3 is using the environment with Prefect 2.14.20.
Docs
- Prefect + Ray (2.x): prefect-ray integration
- Task runners (2.x): Task runners overview
- Logging (2.x): Logging guide
If you share the actual Ray address you’re using (sanitized) and any connection errors, I can help verify connectivity and version compatibility.
Saad80
08/29/2025, 11:10 PM
Marvin
08/29/2025, 11:10 PM
Marvin
08/29/2025, 11:10 PM
Saad80
08/29/2025, 11:10 PM
Marvin
08/29/2025, 11:10 PM
Marvin
08/29/2025, 11:10 PM
Saad80
08/29/2025, 11:36 PM
Marvin
08/29/2025, 11:36 PM
Marvin
08/29/2025, 11:38 PM
Option A: SSH tunnel (keep the API private)
1) On EC2, run Prefect Server bound to localhost:
prefect server start --host 127.0.0.1 --port 4200
2) From your laptop, create a tunnel:
ssh -i /path/to/key.pem -N -L 4200:127.0.0.1:4200 ec2-user@<ec2-public-dns>
3) Point your CLI at the tunneled API:
prefect config set PREFECT_API_URL="http://127.0.0.1:4200/api"
prefect config view | grep PREFECT_API_URL
curl -s http://127.0.0.1:4200/api/health
Security group requirements: only inbound SSH (22) from your IP. No 4200 open to the internet.
Option B: Publicly expose the API (lock down to your IP and add TLS)
1) On EC2, run Prefect Server listening on all interfaces:
prefect server start --host 0.0.0.0 --port 4200
2) AWS Security Group for the instance:
- Inbound:
- TCP 4200 from your laptop’s public IP only (x.x.x.x/32)
- TCP 22 from your IP for SSH
- Outbound: allow all (default)
3) OS firewall: ensure no local firewall blocks 4200 (Amazon Linux typically allows by default).
4) Point your CLI at the EC2 address:
prefect config set PREFECT_API_URL="http://<ec2-public-dns>:4200/api"
prefect config view | grep PREFECT_API_URL
curl -s http://<ec2-public-dns>:4200/api/health
5) Strongly recommended: put Nginx/ALB in front with HTTPS
- Terminate TLS on 443 and proxy to 127.0.0.1:4200.
- Then set:
prefect config set PREFECT_API_URL="https://<your-domain>/api"
- Restrict inbound 443 to your IP(s) or your VPN.
What your CLI needs to work
- The CLI/SDK on your laptop must be able to reach PREFECT_API_URL over TCP.
- In Prefect 2.x, the correct value is the full API endpoint including /api, e.g. http://host:4200/api or https://domain/api.
- You can validate with:
prefect config view
prefect config validate
curl -s $PREFECT_API_URL/health
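One small gotcha when validating: the health endpoint lives at <api-url>/health, so the URL has to be joined without doubling slashes. A tiny sketch (the helper name is mine):

```python
def health_url(api_url: str) -> str:
    """Build the health-check URL from a PREFECT_API_URL value."""
    return api_url.rstrip("/") + "/health"

print(health_url("http://127.0.0.1:4200/api"))  # → http://127.0.0.1:4200/api/health
```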
Notes and gotchas
- Prefect 2 Server has no built-in auth/TLS. If you expose it, use a reverse proxy and restrict ingress.
- If you only need the CLI (not the UI), you still only need port 4200 (or 443 via proxy).
- If workers/flow-runners live in the same VPC, you can keep the API private and access via VPN/SSH tunnel instead of opening to the internet.
Docs (Prefect 2.x)
- Hosting Prefect Server: Self-hosting guide
- Settings: Settings guide
If you tell me whether you prefer SSH tunnel, VPN, or public exposure with TLS, I can share an exact command/proxy config and the precise Security Group rule to add.
Saad80
08/30/2025, 3:35 AM
Marvin
08/30/2025, 3:35 AM
Marvin
08/30/2025, 3:37 AM
A) Using agents/work queues (classic 2.x pattern)
prefect deployment build flows/ray-flow.py:count_to \
-n ray-count \
-q default \
-o ray-deploy.yaml
# Inspect the YAML to ensure no null/None values
cat ray-deploy.yaml
# Then apply
prefect deployment apply ray-deploy.yaml
B) Using workers/work pools (newer 2.x pattern)
# Ensure a work pool exists, e.g., base worker pool name 'default-agent-pool'
prefect work-pool ls
prefect deployment build flows/ray-flow.py:count_to \
-n ray-count \
--pool default-agent-pool \
-o ray-deploy.yaml
cat ray-deploy.yaml
prefect deployment apply ray-deploy.yaml
Minimal deployment YAML template you can use to test
name: ray-count
version: "1"
work_pool_name: "default-agent-pool" # or remove this and use 'work_queue_name: default' if you use agents
job_variables: {} # keep as {}, not null
schedule: null
tags: []
parameters: {}
entrypoint: flows/ray-flow.py:count_to
description: ""
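If you build the deployment spec in Python first, you can catch None values before they ever reach the server. A stdlib-only sketch (the validation helper is mine, not a Prefect API; schedule is one field where null is legitimate):

```python
def none_valued_fields(spec: dict) -> list:
    """Return top-level keys whose value is None; these often trigger validation errors."""
    return sorted(k for k, v in spec.items() if v is None)

# Hypothetical spec where the work pool env var was never set.
spec = {
    "name": "ray-count",
    "work_pool_name": None,
    "job_variables": {},
    "parameters": {},
}
print(none_valued_fields(spec))  # → ['work_pool_name']
```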
Checklist to resolve
- Ensure you’re on Prefect 2.x on both client and server (2.14.20). Run prefect version on your laptop and on EC2.
- If you’re using workers/pools, include a non-empty work_pool_name and ensure the pool exists (prefect work-pool ls).
- If you’re using agents/queues, include a non-empty work_queue_name (e.g., default) and have an agent pointed at that queue.
- Do not set optional strings to None in YAML; either provide a string or omit the field.
- If you attempted to “recreate” an existing deployment, you might also see 409 due to slug conflict. You can delete the old one and re-apply:
prefect deployment ls
prefect deployment delete <flow-name>/<deployment-name>
prefect deployment apply ray-deploy.yaml
- Server logs on EC2 will usually include which exact field failed validation. If you can capture that log line, I can pinpoint the fix.
Docs (Prefect 2.x)
- Deployments overview: Deployments (2.x)
- Build/apply flow deployments: Block-based deployments
- Work pools and workers: Work pools
- Prefect + Ray integration: prefect-ray
If you paste your command and the YAML (or Python snippet) you used to create the deployment, I’ll tell you exactly which field is coming through as None and how to correct it.
Saad80
08/30/2025, 3:38 AM
Marvin
08/30/2025, 3:38 AM
Marvin
08/30/2025, 3:41 AM
def build_job_env():
    env = {
        "AWS_DEFAULT_REGION": os.getenv("AWS_DEFAULT_REGION"),
        "AWS_DEFAULT_BUCKET": os.getenv("AWS_DEFAULT_BUCKET"),
        "AWS_ECR_REGISTRY": os.getenv("AWS_ECR_REGISTRY"),
        "RAY_HEAD_NODE_IP": os.getenv("RAY_HEAD_NODE_IP"),
    }
    # Warn and drop any unset values
    missing = [k for k, v in env.items() if v in (None, "")]
    if missing:
        print(f"Warning: missing env vars for job_variables.env: {missing} (they will be omitted)")
    return {k: v for k, v in env.items() if v not in (None, "")}
# ...
wp = os.getenv('PREFECT_WORKPOOL_NAME')
print("Deploying to work pool:", wp)
img = get_ray_image_uri()
print("Image URI to use:", img)

test_ray.deploy(
    name="test_ray_1",
    work_pool_name=wp,  # must be a non-empty string
    description="Test Ray deployment",
    image=DeploymentImage(
        name=img,  # must be a non-empty string
        platform="linux/amd64",
        dockerfile="Dockerfile",
        buildargs={"BUILD_TIME": str(int(time.time()))},
    ),
    job_variables={
        "cpu": 1024,
        "memory": 1024,
        "env": build_job_env(),  # ensure only strings here
    },
)
Also double-check these are non-empty strings at runtime:
- work_pool_name (from PREFECT_WORKPOOL_NAME)
- image.name (comes from AWS_ECR_REGISTRY)
- name (you hard-coded "test_ray_1", good)
- Any other optional strings you pass (description is fine)
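The checks above can be enforced with a small guard before calling deploy. This is my own sketch (not a Prefect helper), with hypothetical field names matching your script:

```python
def require_nonempty_strings(**fields):
    """Raise early if any required field is None or an empty string."""
    bad = sorted(k for k, v in fields.items() if not isinstance(v, str) or not v)
    if bad:
        raise ValueError(f"These fields must be non-empty strings: {bad}")

# Usage sketch before test_ray.deploy(...):
#   require_nonempty_strings(work_pool_name=wp, image_name=img, name="test_ray_1")
require_nonempty_strings(name="test_ray_1", work_pool_name="my-pool")
print("ok")
```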
Important version note
- Your code uses Flow.deploy(...) and DeploymentImage, which are Prefect 3-style APIs. If you truly want to stay on Prefect 2.14.20 end-to-end, you’d typically use the 2.x deployment flow (build/apply YAML) rather than Flow.deploy. Please confirm the versions on both your laptop and the EC2 server:
- On laptop and EC2:
prefect version
- If they don’t both show a 3.x version, you may be mixing client/server versions which can also cause schema errors.
Next steps
- Add the filtering patch above and re-run; if it still fails, please share the stdout prints for:
- work_pool_name value
- image URI
- the final env dict you’re sending in job_variables
- If you prefer to stick with Prefect 2.x, I can show the equivalent “prefect deployment build/apply” approach for your flow. If you’re moving to 3.x, we’ll keep using Flow.deploy but ensure your server is also 3.x.
If you can also grab the server log line around the 409 on EC2, it often includes the exact field that failed validation and we can pinpoint it immediately.