# prefect-community
j
How can I revoke user API Keys?
a
The easiest way is via the UI. You can revoke Service Account keys at the team level here: https://cloud.prefect.io/team/service-accounts, and personal user keys can be revoked here: https://cloud.prefect.io/user/keys
j
Sorry, what I mean is: as an admin, if another user creates an API key, can I revoke it?
Or, said another way, can I control who has the ability to use the Cloud API?
a
What is your Cloud plan? If you are on Enterprise, you can have read-only accounts and an admin user who can control that and revoke access when needed. It's definitely possible, but this is an Enterprise feature; you would need to discuss it with the sales@prefect.io team to enable it on your plan/tenant and configure it based on your needs.
j
We are on the Enterprise plan, yes. OK, so currently anything the user can do in the Cloud UI they can also do via the API, and the only way to control both is via permissions. My concern is primarily around rate limiting of flow-run submissions. It's much easier to accidentally submit too many flow runs in code than it is in the UI.
a
So it's not about permissions then, but rather about flow-level concurrency limits?
> It’s much easier to submit too many flow runs accidentally in code than it is in the UI
There are two ways users may run flows: locally (Python client and CLI) and via the backend. The latter includes running flows via the UI, the API, and even the CLI. RBAC permissions and concurrency limits apply only to backend runs. Does this answer your question, or is it still unclear?
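To illustrate where those limits do and don't apply, here is a stubbed sketch (these functions are stand-ins of my own, not Prefect's API; the real entry points would be `flow.run()` for local runs and the Cloud backend for the rest):

```python
# Stand-ins illustrating Prefect 1.x's two execution paths.
# (Assumption: these are illustrative stubs, not real Prefect calls.)

def run_locally(flow_name):
    # Local path: Python client or CLI, running in the current process
    # (e.g. a Jupyter notebook). RBAC permissions and concurrency
    # limits do NOT apply here.
    return {"flow": flow_name, "path": "local", "limits_applied": False}

def run_via_backend(flow_name):
    # Backend path: a run created via the UI, API, or CLI against
    # Prefect Cloud, picked up by an agent. RBAC permissions and
    # concurrency limits DO apply.
    return {"flow": flow_name, "path": "backend", "limits_applied": True}
```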
j
Sorry for the confusion. I guess my questions are all around controlling execution of flows via potentially “dangerous” means, but the approaches are different:
1. Prevent people from creating API keys at all, so they must use the UI or set a schedule.
2. Limit runaway execution of flows, which can be managed by concurrency limits, although I guess that would require setting global concurrency for all flows.
This is just me spitballing, but it would be nice to be able to limit the rate of API requests for end users. As in, “the data scientist group can’t request more than 20 flow executions per hour” or something like that.
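Since Prefect Cloud doesn't expose a per-group request rate limit like that, the "20 flow executions per hour" idea could be approximated on the client side. A minimal sketch, assuming you wrap your own submission code in a rolling-window limiter (the class name and limit value are my own, not a Prefect feature):

```python
import time
from collections import deque

class HourlyRateLimiter:
    """Client-side guard: allow at most `limit` flow-run submissions
    per rolling window. A sketch only; you would call `allow()` before
    each real submission to Prefect Cloud and skip or queue the run
    when it returns False."""

    def __init__(self, limit=20, window_seconds=3600):
        self.limit = limit
        self.window = window_seconds
        self.timestamps = deque()  # submission times inside the window

    def allow(self, now=None):
        now = time.monotonic() if now is None else now
        # Drop submissions that have aged out of the rolling window.
        while self.timestamps and now - self.timestamps[0] >= self.window:
            self.timestamps.popleft()
        if len(self.timestamps) < self.limit:
            self.timestamps.append(now)
            return True
        return False
```

A guard like this only helps with cooperative clients, of course; it doesn't stop someone who bypasses the wrapper, which is why server-side concurrency limits remain the backstop.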
a
I can understand #1, to ensure e.g. that flows are only registered via CI/CD rather than manually, but I don't understand why #2 would be helpful. Do you want to add this to limit how often users talk to some database or other resources, to prevent overloading external systems? I can't think of a use case where introducing such limits per team would actually be helpful; I'm curious if you can tell me more about the actual problem you're trying to solve.
But even the #2 use case is totally doable with global concurrency limits.
j
Let me give you a concrete case: one of our data scientists asked if they could run a flow using an API key; we said yes and showed them how. They immediately proceeded to unintentionally spawn 100 flow runs, which ate all the memory on the EC2 instance we run on. Of course, the preventative solution here is to run on ECS so that memory is not shared across all the other flows. I can also imagine cases where we might have some well-defined, PR-reviewed services that spawn jobs using a Service Account, which would be allowed to spawn a larger number of flow runs, since we have more confidence that those jobs are well formed and aren't doing something silly. Whereas end users running something in a Jupyter notebook, which is not really peer reviewed, COULD in theory make a mistake and spawn 100s or 1000s of flow runs without meaning to.
I realize these are not fully fleshed-out use cases, and I suspect a concurrency limit is sufficient to prevent disaster (although the runs would still get scheduled).
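One cheap client-side mitigation for the "accidentally spawn 100 runs" scenario is a hard cap on batch submissions. A sketch only, with the Prefect call stubbed out so it is self-contained; in real code `create_flow_run` would be a call through Prefect's `Client`, and the cap value is an assumption:

```python
MAX_RUNS_PER_BATCH = 10  # assumed cap, tune to taste

def create_flow_run(flow_id):
    # Stub standing in for a real submission via Prefect Cloud's API.
    return f"run-of-{flow_id}"

def submit_runs(flow_id, n):
    """Refuse to submit more than MAX_RUNS_PER_BATCH runs at once,
    so a loop in a notebook can't accidentally spawn hundreds."""
    if n > MAX_RUNS_PER_BATCH:
        raise ValueError(
            f"refusing to submit {n} runs; cap is {MAX_RUNS_PER_BATCH}"
        )
    return [create_flow_run(flow_id) for _ in range(n)]
```

Like any client-side guard, this only protects well-behaved code paths; server-side concurrency limits are still needed for the adversarial or careless case.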
a
Thanks for sharing! This makes sense, and I believe if your agent on EC2 has a concrete label, it's easy to limit concurrency based on that label. And regarding data scientists spinning up hundreds of runs from a Jupyter notebook, I believe that:
• general user education and tackling this from a process perspective may help more than restrictive rules
• usually, such runs are created only locally and don't run on the agent, so this shouldn't be a problem: when the data scientists do `flow.run()` in a Jupyter notebook, this runs only on their local machine, not on EC2 (storage and run configs are ignored when running flows locally)
Also, I got a response from the team regarding use case #1: it's possible via custom roles. This way, you can create a custom user role and remove the “create API key” permission.
j
> when the data scientists do `flow.run()` in a Jupyter notebook, this runs only on their local machine, not on EC2 (storage and run configs are ignored when running flows locally)
In this case they explicitly wanted to spawn runs on our Prefect EC2 instance, since the connectivity with EMR and other AWS resources is already set up, and since we are in fintech we are limited in what resources we can interact with directly. I agree with your comment on education, but people make mistakes, and I would prefer that people feel they have the freedom to work without necessarily breaking anything if they make a mistake. I would also prefer that someone with an API key couldn't DDoS our Prefect instance and kill a bunch of critical jobs. Anyway, thanks for the information. Lots of food for thought here. If I come up with more insights, I'll share them back.