hagai
07/24/2025, 8:27 AM
➜ ~ prefect config view
PREFECT_PROFILE='default'
PREFECT_API_URL='https://prefect.some.com' (from profile)
PREFECT_SERVER_API_HOST='prefect.some.com' (from profile)
PREFECT_SERVER_API_PORT='443' (from profile)
When I run any Prefect CLI command I get this:
➜ ~ prefect work-pool ls
Traceback (most recent call last):
File "/opt/homebrew/lib/python3.10/site-packages/prefect/cli/_utilities.py", line 42, in wrapper
return fn(*args, **kwargs)
File "/opt/homebrew/lib/python3.10/site-packages/prefect/utilities/asyncutils.py", line 311, in coroutine_wrapper
return call()
File "/opt/homebrew/lib/python3.10/site-packages/prefect/_internal/concurrency/calls.py", line 432, in __call__
return self.result()
File "/opt/homebrew/lib/python3.10/site-packages/prefect/_internal/concurrency/calls.py", line 318, in result
return self.future.result(timeout=timeout)
File "/opt/homebrew/lib/python3.10/site-packages/prefect/_internal/concurrency/calls.py", line 179, in result
return self.__get_result()
File "/opt/homebrew/Cellar/python@3.10/3.10.17/Frameworks/Python.framework/Versions/3.10/lib/python3.10/concurrent/futures/_base.py", line 403, in __get_result
raise self._exception
File "/opt/homebrew/lib/python3.10/site-packages/prefect/_internal/concurrency/calls.py", line 389, in _run_async
result = await coro
File "/opt/homebrew/lib/python3.10/site-packages/prefect/cli/work_pool.py", line 284, in ls
pools = await client.read_work_pools()
File "/opt/homebrew/lib/python3.10/site-packages/prefect/client/orchestration.py", line 2662, in read_work_pools
response = await self._client.post("/work_pools/filter", json=body)
File "/opt/homebrew/lib/python3.10/site-packages/httpx/_client.py", line 1859, in post
return await self.request(
File "/opt/homebrew/lib/python3.10/site-packages/httpx/_client.py", line 1540, in request
return await self.send(request, auth=auth, follow_redirects=follow_redirects)
File "/opt/homebrew/lib/python3.10/site-packages/prefect/client/base.py", line 327, in send
response = await self._send_with_retry(
File "/opt/homebrew/lib/python3.10/site-packages/prefect/client/base.py", line 243, in _send_with_retry
await self._add_csrf_headers(request=request)
File "/opt/homebrew/lib/python3.10/site-packages/prefect/client/base.py", line 393, in _add_csrf_headers
token: CsrfToken = CsrfToken.parse_obj(token_response.json())
File "/opt/homebrew/lib/python3.10/site-packages/httpx/_models.py", line 832, in json
return jsonlib.loads(self.content, **kwargs)
File "/opt/homebrew/Cellar/python@3.10/3.10.17/Frameworks/Python.framework/Versions/3.10/lib/python3.10/json/__init__.py", line 346, in loads
return _default_decoder.decode(s)
File "/opt/homebrew/Cellar/python@3.10/3.10.17/Frameworks/Python.framework/Versions/3.10/lib/python3.10/json/decoder.py", line 337, in decode
obj, end = self.raw_decode(s, idx=_w(s, 0).end())
File "/opt/homebrew/Cellar/python@3.10/3.10.17/Frameworks/Python.framework/Versions/3.10/lib/python3.10/json/decoder.py", line 355, in raw_decode
raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
An exception occurred.
Marvin
07/24/2025, 8:28 AM
A couple of things stand out:
1. Your PREFECT_API_URL uses https://, but your PREFECT_SERVER_API_HOST is showing both http:// and https:// (prefect.some.com)
2. The JSONDecodeError you're seeing typically occurs when the server isn't responding with valid JSON, which often means the connection isn't being established properly.
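As a side note, a quick way to see what the client is actually receiving instead of JSON is to replay the request the CLI was making with curl and dump the raw headers and body. This is only a sketch, reusing the placeholder hostname from this thread:
bash
# Sketch: replay the work-pool query the CLI was making and capture the raw
# response, so an HTML error page, redirect, or empty body becomes visible.
curl -sS -D /tmp/prefect_headers.txt -o /tmp/prefect_body.txt \
  -H "Content-Type: application/json" \
  -d '{}' \
  "https://prefect.some.com/api/work_pools/filter"
cat /tmp/prefect_headers.txt /tmp/prefect_body.txt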
Here are the steps to fix this:
1. First, clean up your configuration. Run these commands to set a consistent configuration:
bash
prefect config set PREFECT_API_URL="https://prefect.some.com"
prefect config set PREFECT_SERVER_API_HOST="prefect.some.com"
prefect config set PREFECT_SERVER_API_PORT="443"
2. Since you're using HTTPS (port 443), ensure your environment has the proper SSL configuration:
bash
# If you're using a custom CA certificate
export SSL_CERT_FILE=/path/to/your/cert.pem
# If you need to disable SSL verification (not recommended for production)
export PREFECT_API_DISABLE_SSL_VALIDATION=true
3. Verify your connection can reach the server:
bash
# Test the connection
curl -v https://prefect.some.com/api/health
# If you need to skip SSL verification for testing
curl -v -k https://prefect.some.com/api/health
4. If you're behind a corporate proxy, make sure to set:
bash
export HTTPS_PROXY="your-proxy-url"
5. Double-check that your NLB is properly configured:
- Ensure the health checks are configured correctly
- Verify the target group is pointing to port 4200 on your EKS pods
- Confirm the security groups allow traffic from your client to the NLB on port 443
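If you have AWS CLI access, one way to spot-check those last points is sketched below; the load balancer name "my-prefect-nlb" is a placeholder you would replace with your own:
bash
# Sketch, assuming AWS CLI access; "my-prefect-nlb" is a placeholder name.
LB_ARN=$(aws elbv2 describe-load-balancers --names "my-prefect-nlb" \
  --query 'LoadBalancers[0].LoadBalancerArn' --output text)
TG_ARN=$(aws elbv2 describe-target-groups --load-balancer-arn "$LB_ARN" \
  --query 'TargetGroups[0].TargetGroupArn' --output text)
# The listener should be on 443; the target group should point at port 4200
# with healthy targets.
aws elbv2 describe-listeners --load-balancer-arn "$LB_ARN" \
  --query 'Listeners[].{Port:Port,Protocol:Protocol}'
aws elbv2 describe-target-health --target-group-arn "$TG_ARN"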
If these steps don't resolve the issue, please try:
1. Running curl against your endpoint to see the raw response
2. Checking the Prefect server logs in your EKS cluster for any errors
3. Verifying that the NLB's target group health checks are passing
Could you please:
1. Try the configuration changes above and let me know if they help
2. Share the output of curl -v <https://prefect.some.com/api/health>
(with any sensitive information redacted)
3. Check if there are any relevant errors in your Prefect server logs?
This will help us narrow down whether this is a network configuration issue, an SSL issue, or a server configuration problem.
hagai
07/24/2025, 9:38 AM
2. Output of the curl command:
~ curl -v https://prefect.some.com/api/health
* Host prefect.some.com:443 was resolved.
* IPv6: (none)
* IPv4: 10.15.14.138
* Trying 10.15.14.138:443...
* Connected to prefect.some.com (10.15.14.138) port 443
* ALPN: curl offers h2,http/1.1
* (304) (OUT), TLS handshake, Client hello (1):
* CAfile: /etc/ssl/cert.pem
* CApath: none
* (304) (IN), TLS handshake, Server hello (2):
* TLSv1.2 (IN), TLS handshake, Certificate (11):
* TLSv1.2 (IN), TLS handshake, Server key exchange (12):
* TLSv1.2 (IN), TLS handshake, Server finished (14):
* TLSv1.2 (OUT), TLS handshake, Client key exchange (16):
* TLSv1.2 (OUT), TLS change cipher, Change cipher spec (1):
* TLSv1.2 (OUT), TLS handshake, Finished (20):
* TLSv1.2 (IN), TLS change cipher, Change cipher spec (1):
* TLSv1.2 (IN), TLS handshake, Finished (20):
* SSL connection using TLSv1.2 / ECDHE-RSA-AES128-GCM-SHA256 / [blank] / UNDEF
* ALPN: server did not agree on a protocol. Uses default.
* Server certificate:
* subject: CN=*.some.com
* start date: Jun 16 00:00:00 2025 GMT
* expire date: Jul 15 23:59:59 2026 GMT
* subjectAltName: host "prefect.some.com" matched cert's "*.some.com"
* issuer: C=US; O=Amazon; CN=Amazon RSA 2048 M02
* SSL certificate verify ok.
* using HTTP/1.x
> GET /api/health HTTP/1.1
> Host: prefect.some.com
> User-Agent: curl/8.7.1
> Accept: */*
>
* Request completely sent off
< HTTP/1.1 200 OK
< date: Thu, 24 Jul 2025 09:33:52 GMT
< server: uvicorn
< content-length: 4
< content-type: application/json
<
* Connection #0 to host prefect.some.com left intact
true%
3. No errors in the EKS logs
Marvin
07/24/2025, 9:38 AM
The health endpoint responds fine. The traceback shows the client failing while it fetches a CSRF token, so let's check that endpoint directly:
bash
curl -v https://prefect.some.com/api/csrf
Could you run this command and share the output?
Also, since you're using an NLB, there are a few things we should verify:
1. Is your NLB configured with TLS termination, or is it just passing through the TLS traffic?
2. In your Kubernetes service configuration for Prefect, could you confirm:
- The service type (ClusterIP/LoadBalancer)
- The port mappings
- Any annotations
Could you share your Kubernetes service configuration (with any sensitive information redacted)?
The fact that the health check works but other endpoints fail suggests one of these possibilities:
1. The NLB might be incorrectly handling certain paths
2. There might be a mismatch in how TLS is being terminated and forwarded
3. There could be an issue with how headers are being passed through the NLB
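Related to the second and third possibilities, a small optional check (only a sketch, same placeholder hostname; curl's --http1.1/--http2 flags simply compare protocol negotiation through the NLB):
bash
# Sketch: compare forcing HTTP/1.1 vs attempting HTTP/2 through the NLB.
curl -s -o /dev/null -w 'http1.1 -> %{http_code} (%{http_version})\n' --http1.1 \
  "https://prefect.some.com/api/health"
curl -s -o /dev/null -w 'http2   -> %{http_code} (%{http_version})\n' --http2 \
  "https://prefect.some.com/api/health"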
While we wait for those details, let's try one more thing. Could you temporarily modify your configuration to test with HTTP instead of HTTPS (just for troubleshooting):
bash
prefect config set PREFECT_API_URL="<http://prefect.some.com:443>"
prefect config set PREFECT_SERVER_API_HOST="<http://prefect.some.com|prefect.some.com>"
prefect config set PREFECT_SERVER_API_PORT="443"
Then try running prefect work-pool ls again. This will help us determine if the issue is specific to HTTPS handling.
Also, could you share your Prefect server deployment configuration (particularly any environment variables set for the server)?
hagai
07/24/2025, 9:41 AM
~ curl -v https://prefect.some.com/api/csrf
* Host prefect.some.com:443 was resolved.
* IPv6: (none)
* IPv4: 10.15.14.138
* Trying 10.15.14.138:443...
* Connected to prefect.some.com (10.15.14.138) port 443
* ALPN: curl offers h2,http/1.1
* (304) (OUT), TLS handshake, Client hello (1):
* CAfile: /etc/ssl/cert.pem
* CApath: none
* (304) (IN), TLS handshake, Server hello (2):
* TLSv1.2 (IN), TLS handshake, Certificate (11):
* TLSv1.2 (IN), TLS handshake, Server key exchange (12):
* TLSv1.2 (IN), TLS handshake, Server finished (14):
* TLSv1.2 (OUT), TLS handshake, Client key exchange (16):
* TLSv1.2 (OUT), TLS change cipher, Change cipher spec (1):
* TLSv1.2 (OUT), TLS handshake, Finished (20):
* TLSv1.2 (IN), TLS change cipher, Change cipher spec (1):
* TLSv1.2 (IN), TLS handshake, Finished (20):
* SSL connection using TLSv1.2 / ECDHE-RSA-AES128-GCM-SHA256 / [blank] / UNDEF
* ALPN: server did not agree on a protocol. Uses default.
* Server certificate:
* subject: CN=*.some.com
* start date: Jun 16 00:00:00 2025 GMT
* expire date: Jul 15 23:59:59 2026 GMT
* subjectAltName: host "prefect.some.com" matched cert's "*.some.com"
* issuer: C=US; O=Amazon; CN=Amazon RSA 2048 M02
* SSL certificate verify ok.
* using HTTP/1.x
> GET /api/csrf HTTP/1.1
> Host: prefect.some.com
> User-Agent: curl/8.7.1
> Accept: */*
>
* Request completely sent off
< HTTP/1.1 404 Not Found
< date: Thu, 24 Jul 2025 09:38:59 GMT
< server: uvicorn
< content-length: 22
< content-type: application/json
<
* Connection #0 to host prefect.some.com left intact
{"detail":"Not Found"}%
Marvin
07/24/2025, 9:42 AM
The 404 from /api/csrf is a useful clue. A few things to check:
1. Whether your server deployment sets PREFECT_SERVER_CSRF_PROTECTION_ENABLED
2. Try setting this explicitly in your server deployment:
yaml
env:
  - name: PREFECT_SERVER_CSRF_PROTECTION_ENABLED
    value: "false"
3. Another approach would be to modify your client configuration to disable CSRF validation temporarily (for testing):
bash
export PREFECT_CLIENT_CSRF_VALIDATION_ENABLED=false
Then try your command again:
bash
prefect work-pool ls
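If it still fails the same way, re-running the command with Prefect's debug settings turned up (standard settings, shown here only as a sketch) usually surfaces the exact URL the client is hitting and a fuller traceback:
bash
# Sketch: re-run the failing command with verbose logging and full tracebacks.
PREFECT_DEBUG_MODE=true PREFECT_LOGGING_LEVEL=DEBUG prefect work-pool ls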
Could you also share:
1. Your Kubernetes service configuration for the Prefect server
2. Is the NLB doing TLS termination, or is it just passing through the TLS traffic?
3. The result of trying the command with CSRF validation disabled as suggested above
This will help us determine if:
a) The CSRF configuration is the root cause
b) There's a routing issue with the NLB
c) There's a mismatch in how the server is configured versus how it's being accessed
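For point (a), it can also help to confirm what the client side has actually resolved for the CSRF-related settings; a small sketch using the standard config command:
bash
# Sketch: show effective CSRF-related client settings and where each comes from.
prefect config view --show-defaults --show-sources | grep -i csrf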
Also, just to verify - when you access the Prefect UI through your browser at https://prefect.some.com, does it work correctly?
hagai
07/24/2025, 9:45 AM
In my server deployment I already have PREFECT_SERVER_CSRF_PROTECTION_ENABLED: false
In my client I ran:
export PREFECT_CLIENT_CSRF_VALIDATION_ENABLED=false
but still got the same error.
The NLB is doing TLS termination. The traffic is HTTPS from the client to the NLB, and from the NLB to the Prefect server it's HTTP on port 4200.
Marvin
07/24/2025, 9:46 AM
Since the NLB terminates TLS and forwards plain HTTP to the server on port 4200, let's reset the client configuration and try again:
bash
# Clear existing configuration first
prefect config unset PREFECT_API_URL
prefect config unset PREFECT_SERVER_API_HOST
prefect config unset PREFECT_SERVER_API_PORT
# Set new configuration
prefect config set PREFECT_API_URL="<https://prefect.some.com>"
prefect config set PREFECT_SERVER_API_HOST="0.0.0.0"
prefect config set PREFECT_SERVER_API_PORT="4200"
# Additional settings that might help
export PREFECT_CLIENT_CSRF_VALIDATION_ENABLED=false
export PREFECT_API_DISABLE_SSL_VALIDATION=true
Also, could you share your Kubernetes service configuration? I want to make sure the service is properly configured to handle the forwarded traffic from the NLB.
The expected configuration should look something like this:
yaml
apiVersion: v1
kind: Service
metadata:
  name: prefect-server
  annotations:
    service.beta.kubernetes.io/aws-load-balancer-type: nlb
    service.beta.kubernetes.io/aws-load-balancer-ssl-cert: "your-cert-arn"
    service.beta.kubernetes.io/aws-load-balancer-ssl-ports: "443"
spec:
  ports:
    - port: 443
      targetPort: 4200
      protocol: TCP
  type: LoadBalancer
Could you also try this quick test to verify the internal HTTP communication is working:
bash
# Test the internal HTTP endpoint (if you have access to the cluster network)
curl -v http://prefect-server-service:4200/api/health
Additionally, could you check your Prefect server logs after attempting to run a command? Even though you mentioned there are no errors, there might be some relevant INFO or DEBUG level messages that could help us understand what's happening when the requests come in.
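A sketch for tailing those logs while reproducing the error; the deployment name and namespace here are assumptions based on typical Helm chart defaults, not something confirmed in this thread:
bash
# Sketch: tail the Prefect server logs while re-running the failing CLI command.
# Deployment name and namespace are assumed from the Helm chart defaults.
kubectl logs -n prefect-server deployment/prefect-server --tail=100 -f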
Let me know:
1. If the new configuration changes help
2. Your Kubernetes service configuration (if different from what I showed above)
3. Any relevant log entries from the Prefect server when you try to run commands
hagai
07/24/2025, 9:56 AM
apiVersion: v1
kind: Service
metadata:
  annotations:
    service.beta.kubernetes.io/aws-load-balancer-name: <load-balancer-name>
    service.beta.kubernetes.io/aws-load-balancer-nlb-target-type: ip
    service.beta.kubernetes.io/aws-load-balancer-scheme: internal
    service.beta.kubernetes.io/aws-load-balancer-ssl-cert: <cert-arn>
    service.beta.kubernetes.io/aws-load-balancer-ssl-ports: "443"
    service.beta.kubernetes.io/aws-load-balancer-type: nlb
  labels:
    app.kubernetes.io/component: server
    app.kubernetes.io/instance: prefect-server
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: prefect-server
    app.kubernetes.io/version: 2.20.4
    prefect-version: 2.20.4
  name: prefect-server
  namespace: prefect-server
spec:
  loadBalancerClass: service.k8s.aws/nlb
  ports:
    - name: server-svc-port
      nodePort: 31440
      port: 443
      protocol: TCP
      targetPort: 4200
Marvin
07/24/2025, 9:56 AM
A few more things to try:
1. Test the failing endpoint directly with curl:
bash
curl -v -H "Accept: application/json" -H "X-Forwarded-Proto: https" <https://prefect.some.com/api/work_pools/filter> -d '{}'
2. Also, let's check your Prefect server deployment configuration. Could you share:
- The relevant parts of your Prefect server deployment YAML (particularly the environment variables)
- The output of kubectl describe pod <prefect-server-pod-name> -n prefect-server
3. Try setting these additional environment variables in your client environment:
bash
export PREFECT_CLIENT_CONNECT_TIMEOUT=30
export PREFECT_CLIENT_RETRY_EXTRA_CODES="404"
The fact that the health check endpoint works but other endpoints don't suggests this might be related to how the NLB is handling the path routing or how the Prefect server is configured to handle its base URL. Could you also confirm:
1. Are you using any ingress controller in addition to the NLB?
2. What's the value of PREFECT_SERVER_API_BASE_URL in your server deployment (if set)?
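One way to answer that second question without digging through manifests is to dump the environment variables on the server container; this is only a sketch, and the deployment name is again assumed from the Helm chart defaults:
bash
# Sketch: list env vars set on the Prefect server container, including any
# PREFECT_SERVER_* values such as a base URL or CSRF settings.
kubectl get deployment prefect-server -n prefect-server \
  -o jsonpath='{range .spec.template.spec.containers[0].env[*]}{.name}={.value}{"\n"}{end}'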