< Marvin> For Prefect Cloud what is the limit of API request Prefect Community #ask-marvin

# ask-marvin

Daniel Chan

07/15/2025, 6:04 PM

@Marvin For Prefect Cloud, what is the limit of API requests per minute for Enterprise?

Marvin

07/15/2025, 6:05 PM

Based on the research, I need to inform you that the specific API rate limits for Prefect Cloud Enterprise accounts are not publicly documented. What is documented:

- Free tier: 400 requests/minute for flow runs, task runs, and flows endpoints
- Pro tier: 2,000 requests/minute for these endpoints
- Logs have separate limits (700/min for Free, 10,000/min for Pro)

For Enterprise tier limits, you would need to contact Prefect support directly, as these limits may be customized based on your specific Enterprise agreement.

You can find the general information about rate limits in the documentation here: https://docs-3.prefect.io/v3/concepts/rate-limits

If you hit a rate limit, the API will return a 429 HTTP response with a Retry-After header indicating when you can resume making requests.

If you need the specific Enterprise tier limits for your use case, I'd recommend reaching out to your Prefect account representative or Prefect support for detailed information.
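
For custom code that calls the API directly, the 429 and Retry-After behavior described above could be handled roughly as follows. This is a minimal sketch, not Prefect's own client logic: `retry_after_seconds`, `call_with_rate_limit_retry`, and the `send` callable (assumed to return a status code, a headers dict, and a body) are hypothetical names introduced for illustration.

```python
import time
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def retry_after_seconds(header_value, now=None, default=1.0):
    """Turn a Retry-After header value into a non-negative delay in seconds.

    The header may hold a number of seconds or an HTTP date.
    Falls back to `default` when the header is missing or unparseable.
    """
    if header_value is None:
        return default
    value = header_value.strip()
    try:
        return max(0.0, float(value))
    except ValueError:
        pass
    try:
        when = parsedate_to_datetime(value)
    except (TypeError, ValueError):
        return default
    if when.tzinfo is None:
        when = when.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    return max(0.0, (when - now).total_seconds())


def call_with_rate_limit_retry(send, max_attempts=5, sleep=time.sleep):
    """Call the hypothetical send() and wait as instructed on 429 responses.

    send() is assumed to return (status_code, headers_dict, body).
    Returns (status_code, body) of the last attempt.
    """
    for attempt in range(max_attempts):
        status, headers, body = send()
        if status != 429:
            return status, body
        if attempt < max_attempts - 1:
            sleep(retry_after_seconds(headers.get("Retry-After")))
    return status, body
```

Passing `sleep` as a parameter keeps the waiting behavior easy to replace in tests or in async wrappers.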

Daniel Chan

07/15/2025, 6:20 PM

@Marvin Do you know if I can reduce the calls being made to the /api/admin/version endpoint path?

Marvin

07/15/2025, 6:21 PM

Based on my research, I don't see any direct configuration option to reduce or disable calls to the /api/admin/version endpoint. This endpoint is part of Prefect's core functionality for version checking and API compatibility.

However, I can suggest a few general approaches to minimize the impact:

1. If you're experiencing rate limiting issues due to these calls, you might want to:
   - Consider upgrading your Prefect Cloud tier if you're hitting limits frequently
   - Reduce the number of concurrent clients/workers if possible
   - Implement client-side caching of the version information in your application code
2. If you're concerned about the frequency of these calls specifically, you might want to:
   - Monitor and log when these calls are being made to understand the pattern
   - Consider consolidating your Prefect client instances if you have multiple instances running

Since this seems to be a specific concern, you might want to:

1. Open a feature request on the Prefect GitHub repository to request configuration options for this behavior
2. Reach out to Prefect support (especially if you're an Enterprise customer) to discuss your specific use case and potential solutions

Would you mind sharing what specific issue you're running into with these version endpoint calls? Are you hitting rate limits, or is there another concern? This would help me provide more targeted advice.
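
The client-side caching idea above might look like the sketch below. It only helps with version lookups that your own application code performs and does not change how often Prefect's client calls the endpoint internally. `TTLCache` and the `fetch` callable are hypothetical names, and the 300-second default lifetime is an arbitrary assumption.

```python
import time


class TTLCache:
    """Hold one fetched value and reuse it until it is older than ttl_seconds."""

    def __init__(self, fetch, ttl_seconds=300.0, clock=time.monotonic):
        self._fetch = fetch          # hypothetical callable that does the real lookup
        self._ttl = ttl_seconds
        self._clock = clock          # injectable so tests can control time
        self._value = None
        self._expires_at = None      # None means nothing has been cached yet

    def get(self):
        """Return the cached value, calling fetch() only when it has expired."""
        now = self._clock()
        if self._expires_at is None or now >= self._expires_at:
            self._value = self._fetch()
            self._expires_at = now + self._ttl
        return self._value
```

A single shared `TTLCache` instance per process keeps repeated lookups from turning into repeated requests.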

Daniel Chan

07/15/2025, 6:23 PM

@Marvin I am running Prefect 3.4.7 in a GCP GKE cluster, and am noticing 10x more calls to /api/admin/version than to any of my other non-log-related endpoints. I believe this is negatively affecting my Prefect deployment, as many of these calls are causing other calls to return 5xx. Is there a way to reduce the rate of calls being made to /api/admin/version, even if this is a core requirement?

Marvin

07/15/2025, 6:24 PM

Given your specific situation with high-frequency /api/admin/version calls causing 5xx errors in a GKE environment, here are a few potential approaches:

1. Client Configuration
   - While there's no direct setting to control the version check frequency, you can try setting PREFECT_CLIENT_RETRY_EXTRA_CODES="500,502,503,504" as an environment variable. This will help handle 5xx errors more gracefully.
2. Kubernetes-specific Optimizations
   - Ensure your Kubernetes pods have appropriate resource limits and requests set
   - Consider implementing a network policy to rate limit these specific API calls at the cluster level
   - If you're running multiple workers/agents, you might want to adjust the number of concurrent workers to reduce the overall API load
3. Monitoring and Debugging
   - Enable detailed logging to understand exactly when and where these version checks are being triggered
   - Set PREFECT_LOGGING_LEVEL="DEBUG" to get more information about the API calls
   - Monitor the correlation between version check calls and your 5xx errors
4. Temporary Workaround
   If this is severely impacting your operations, you could potentially use a reverse proxy (like nginx) in front of your Prefect API calls to:
   - Cache the /api/admin/version endpoint responses
   - Implement rate limiting for this specific endpoint
   - Return cached responses when rate limits are exceeded
5. Best Practices
   - Ensure you're not creating unnecessary Prefect client instances
   - Consider implementing circuit breakers in your application code for API calls
   - Monitor and adjust your pod's readiness/liveness probe settings if they might be contributing to the load

Given that this is causing significant issues in your production environment, I would recommend:

1. Opening a GitHub issue at https://github.com/PrefectHQ/prefect describing this specific problem, including:
   - Your exact setup
   - The frequency of these calls
   - The impact on your system
   - Any patterns you've noticed
2. If you're a Prefect Cloud customer, contact Prefect support directly as they might have additional recommendations or be able to adjust rate limits for your account.

Would you like me to provide more detailed information about implementing any of these suggestions? Also, it would be helpful to know:

- How many workers/agents are you running in your GKE cluster?
- Are you seeing any patterns in when these version checks occur?
- Are you using any custom deployment configurations that might be creating multiple client instances?
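
To check how version check calls line up with 5xx responses, as suggested under Monitoring and Debugging, one rough approach is to tally requests and server errors per path from access logs. The sketch below assumes a proxy or ingress writes lines in the common access-log shape (`"GET /path HTTP/1.1" 200`). That log format, `LINE_RE`, and `tally` are assumptions for illustration, not something Prefect emits by default.

```python
import re
from collections import Counter

# Matches the request and status portion of a common-format access log line,
# for example: "GET /api/admin/version HTTP/1.1" 200
LINE_RE = re.compile(
    r'"(?:GET|POST|PUT|PATCH|DELETE|HEAD|OPTIONS) (\S+) HTTP/[\d.]+" (\d{3})'
)


def tally(lines):
    """Count total requests and 5xx responses per path.

    Query strings are dropped so that /x?a=1 and /x?a=2 count as one path.
    Lines that do not match the assumed format are skipped.
    """
    requests = Counter()
    server_errors = Counter()
    for line in lines:
        match = LINE_RE.search(line)
        if match is None:
            continue
        path = match.group(1).split("?", 1)[0]
        status = int(match.group(2))
        requests[path] += 1
        if 500 <= status <= 599:
            server_errors[path] += 1
    return requests, server_errors
```

Calling `requests.most_common(5)` on the first returned counter gives a quick view of which endpoints dominate the traffic.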
