Daniel Soares
08/06/2025, 11:22 AMRefreshError(TransportError('Failed to retrieve <http://metadata.google.internal/computeMetadata/v1/instance/service-accounts/default/token?scopes=https%3A%2F%2Fwww.googleapis.com%2Fauth%2Fcloud-platform> from the Google Compute Engine metadata service. Status: 403 Response:\nb"Unable to generate access token; IAM returned 403 Forbidden: Permission \'iam.serviceAccounts.getAccessToken\' denied on resource (or it may not exist).\\nThis error could be caused by a missing IAM policy binding on the target IAM service account.\\nFor more information, refer to the Workload Identity documentation:\\n\\t<https://cloud.google.com/kubernetes-engine/docs/how-to/workload-identity#authenticating_to>\\n\\n"', <google_auth_httplib2._Response object at 0x7de56c381c50>))
I have a gcp service account attached to the prefect-worker pods with the roles "roles/run.developer",
"roles/run.invoker",
"roles/iam.serviceAccountTokenCreator" and also with the workload identity binding to the GKE workload identity service account.
What can I do to solve this?Marvin
08/06/2025, 11:22 AMMarvin
08/06/2025, 11:23 AM1. IAM Roles:
Make sure the GCP service account also has:
- roles/iam.serviceAccountUser
- roles/run.developer
2. Workload Identity Configuration:
The error suggests the Workload Identity setup isn't complete. Here's what to verify:
a. Ensure the GKE cluster has Workload Identity enabled:
bash
gcloud container clusters describe YOUR_CLUSTER_NAME --zone=YOUR_ZONE | grep workloadPool
b. Verify the Kubernetes service account annotation:
bash
kubectl describe serviceaccount prefect-worker -n YOUR_NAMESPACE
It should have an annotation like:
yaml
iam.gke.io/gcp-service-account: YOUR-GSA@YOUR-PROJECT.iam.gserviceaccount.com
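If the annotation is missing, you can add it in place (substituting your real names):
bash
kubectl annotate serviceaccount prefect-worker -n YOUR_NAMESPACE \
  iam.gke.io/gcp-service-account=YOUR-GSA@YOUR-PROJECT.iam.gserviceaccount.com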
c. Verify the IAM binding between the GCP service account and Kubernetes service account:
bash
gcloud iam service-accounts get-iam-policy YOUR-GSA@YOUR-PROJECT.iam.gserviceaccount.com
You should see a binding like:
yaml
roles/iam.workloadIdentityUser:
members:
- serviceAccount:YOUR-PROJECT.svc.id.goog[YOUR-NAMESPACE/prefect-worker]
3. Pod Configuration:
Ensure your pod spec includes:
yaml
serviceAccountName: prefect-worker # or whatever your K8s service account name is
4. Verify the Setup:
You can test the Workload Identity configuration by running:
bash
# Note: --serviceaccount was removed from `kubectl run` in kubectl 1.24; on newer
# clusters use --overrides='{"spec":{"serviceAccountName":"prefect-worker"}}' instead.
kubectl run -it \
--serviceaccount=prefect-worker \
--image=google/cloud-sdk:slim \
--namespace=YOUR_NAMESPACE \
workload-identity-test -- /bin/bash
# Inside the pod:
gcloud auth list
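You can also ask the metadata server directly which identity it hands out; with Workload Identity wired up correctly this should print your GCP service account's email rather than the Compute Engine default:
bash
# Still inside the test pod:
curl -H "Metadata-Flavor: Google" \
  http://metadata.google.internal/computeMetadata/v1/instance/service-accounts/default/email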
5. Common Fixes:
- If the binding isn't working, recreate it:
bash
gcloud iam service-accounts add-iam-policy-binding YOUR-GSA@YOUR-PROJECT.iam.gserviceaccount.com \
--role roles/iam.workloadIdentityUser \
--member "serviceAccount:YOUR-PROJECT.svc.id.goog[YOUR-NAMESPACE/prefect-worker]"
- Ensure the namespace matches exactly in all configurations
- Double-check there are no typos in service account names
- Verify the GCP service account exists and is enabled
The error message specifically mentions iam.serviceAccounts.getAccessToken, which suggests the Workload Identity binding isn't properly configured or the pod isn't using the correct Kubernetes service account.
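A quick way to confirm which Kubernetes service account the worker pod is actually running as (the pod name is a placeholder):
bash
kubectl get pod <worker-pod-name> -n YOUR_NAMESPACE \
  -o jsonpath='{.spec.serviceAccountName}'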
Would you like me to provide more specific guidance about any of these areas? It would be helpful if you could share:
1. The exact namespace you're using
2. The name of your Kubernetes service account
3. The full name of your GCP service account
This would allow me to provide more targeted troubleshooting steps.Daniel Soares
08/06/2025, 11:48 AMMarvin
08/06/2025, 11:49 AMMarvin
08/06/2025, 11:50 AM1. Worker Environment Variables:
The worker pod itself needs at least these set:
yaml
- PREFECT_API_KEY
- PREFECT_API_URL
- GCP_REGION # The region where Cloud Run jobs will be created
2. Worker Configuration Variables:
When creating the work pool, make sure these are set:
yaml
- project_name # Your GCP project
- location # GCP region
- cpu # CPU allocation for Cloud Run jobs
- memory # Memory allocation for Cloud Run jobs
3. GCP-specific Configuration:
The worker needs to know which service account to use for creating Cloud Run jobs. This can be specified in two ways:
yaml
- service_account_name: your-service-account@your-project.iam.gserviceaccount.com
or
yaml
- service_account_email: your-service-account@your-project.iam.gserviceaccount.com
Can you verify:
1. Are all these environment variables set in your worker pod?
2. What values do you have configured for your work pool?
You can check the worker pod's environment variables with:
bash
kubectl exec -it <worker-pod-name> -- env | grep PREFECT
And you can check the work pool configuration with:
bash
prefect work-pool inspect <your-workpool-name>
Also, since you mentioned the worker pod is receiving the flow run but failing to create the Cloud Run job, can you check the worker logs? This might give us more context about where exactly the authentication is failing:
bash
kubectl logs <worker-pod-name>
The error you're seeing suggests that even though Workload Identity is working at the pod level, there might be an issue with how the worker is trying to authenticate when creating the Cloud Run job. This could be because:
1. The worker is not using the Workload Identity credentials correctly
2. The service account configuration in the work pool doesn't match the Workload Identity setup
3. There might be missing environment variables that the worker needs to properly authenticate
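One quick check is to ask Application Default Credentials which identity they resolve to from inside the worker pod. A minimal sketch (it only needs google-auth, which prefect-gcp already pulls in):
python
from google.auth import default
import google.auth.transport.requests

# Resolve credentials exactly the way the worker does.
credentials, project = default()
credentials.refresh(google.auth.transport.requests.Request())

print("project:", project)
# After refresh(), metadata-based credentials report the service account
# email the token was minted for; seeing the Compute Engine default here
# would mean the pod is not picking up the Workload Identity binding.
print("identity:", getattr(credentials, "service_account_email", "unknown"))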
Can you share these details so we can pinpoint the exact issue?Daniel Soares
08/06/2025, 12:00 PMInstalling prefect-gcp...
Using Python 3.11.13 environment at: /usr/local
Resolved 116 packages in 1.56s
Downloading google-api-python-client (13.1MiB)
Downloading prefect (5.8MiB)
Downloading apprise (1.3MiB)
Downloading grpcio (5.9MiB)
Downloading cryptography (4.3MiB)
Downloading sqlalchemy (3.1MiB)
Downloading uv (18.0MiB)
Downloading apprise
Downloading grpcio
Downloading sqlalchemy
Downloading cryptography
Downloading uv
Downloading prefect
Downloading google-api-python-client
Prepared 34 packages in 1.55s
error: failed to remove file `/usr/local/lib/python3.11/site-packages/cachetools-6.1.0.dist-info/INSTALLER`: Read-only file system (os error 30)
Worker 'CloudRunWorkerV2 8b50ccb0-097a-4d9c-937d-3f0f01179363' started!
11:47:10.930 | INFO | prefect.flow_runs.worker - Worker 'CloudRunWorkerV2 8b50ccb0-097a-4d9c-937d-3f0f01179363' submitting flow run 'cb2d6285-ea3a-496a-b0b8-a2a15d68abf4'
11:47:11.096 | WARNING | prefect.flow_runs.worker - PREFECT_API_AUTH_STRING is provided as a plaintext environment variable. For better security, consider providing it as a secret using 'prefect_api_auth_string_secret' or 'env_from_secrets' in your base job template.
11:47:11.139 | INFO | prefect.flow_runs.worker - Creating Cloud Run JobV2 tasteful-coati-985e9381fc6245028197305f4db94333
11:47:11.154 | WARNING | google_auth_httplib2 - httplib2 transport does not support per-request timeout. Set the timeout when constructing the httplib2.Http instance.
11:47:11.173 | WARNING | google_auth_httplib2 - httplib2 transport does not support per-request timeout. Set the timeout when constructing the httplib2.Http instance.
11:47:11.299 | ERROR | prefect.flow_runs.worker - Failed to submit flow run 'cb2d6285-ea3a-496a-b0b8-a2a15d68abf4' to infrastructure.
Traceback (most recent call last):
File "/home/prefect/.local/lib/python3.11/site-packages/google/auth/compute_engine/credentials.py", line 128, in refresh
self.token, self.expiry = _metadata.get_service_account_token(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/prefect/.local/lib/python3.11/site-packages/google/auth/compute_engine/_metadata.py", line 375, in get_service_account_token
token_json = get(request, path, params=params, headers=metrics_header)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/prefect/.local/lib/python3.11/site-packages/google/auth/compute_engine/_metadata.py", line 267, in get
raise exceptions.TransportError(
google.auth.exceptions.TransportError: ('Failed to retrieve <http://metadata.google.internal/computeMetadata/v1/instance/service-accounts/default/token?scopes=https%3A%2F%2Fwww.googleapis.com%2Fauth%2Fcloud-platform> from the Google Compute Engine metadata service. Status: 403 Response:\nb"Unable to generate access token; IAM returned 403 Forbidden: Permission \'iam.serviceAccounts.getAccessToken\' denied on resource (or it may not exist).\\nThis error could be caused by a missing IAM policy binding on the target IAM service account.\\nFor more information, refer to the Workload Identity documentation:\\n\\t<https://cloud.google.com/kubernetes-engine/docs/how-to/workload-identity#authenticating_to>\\n\\n"', <google_auth_httplib2._Response object at 0x7982bb1b5d10>)
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/home/prefect/.local/lib/python3.11/site-packages/prefect/workers/base.py", line 1262, in _submit_run_and_capture_errors
result = await self.run(
^^^^^^^^^^^^^^^
File "/home/prefect/.local/lib/python3.11/site-packages/prefect_gcp/workers/cloud_run_v2.py", line 643, in run
await run_sync_in_worker_thread(
File "/home/prefect/.local/lib/python3.11/site-packages/prefect/utilities/asyncutils.py", line 233, in run_sync_in_worker_thread
result = await anyio.to_thread.run_sync(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/anyio/to_thread.py", line 56, in run_sync
return await get_async_backend().run_sync_in_worker_thread(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/anyio/_backends/_asyncio.py", line 2470, in run_sync_in_worker_thread
return await future
^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/anyio/_backends/_asyncio.py", line 967, in run
result = context.run(func, *args)
^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/prefect/.local/lib/python3.11/site-packages/prefect/utilities/asyncutils.py", line 243, in call_with_mark
return call()
^^^^^^
File "/home/prefect/.local/lib/python3.11/site-packages/prefect_gcp/workers/cloud_run_v2.py", line 714, in _create_job_and_wait_for_registration
JobV2.create(
File "/home/prefect/.local/lib/python3.11/site-packages/prefect_gcp/models/cloud_run_v2.py", line 167, in create
response = request.execute()
^^^^^^^^^^^^^^^^^
File "/home/prefect/.local/lib/python3.11/site-packages/googleapiclient/_helpers.py", line 130, in positional_wrapper
return wrapped(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/prefect/.local/lib/python3.11/site-packages/googleapiclient/http.py", line 923, in execute
resp, content = _retry_request(
^^^^^^^^^^^^^^^
File "/home/prefect/.local/lib/python3.11/site-packages/googleapiclient/http.py", line 191, in _retry_request
resp, content = http.request(uri, method, *args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/prefect/.local/lib/python3.11/site-packages/google_auth_httplib2.py", line 209, in request
self.credentials.before_request(self._request, method, uri, request_headers)
File "/home/prefect/.local/lib/python3.11/site-packages/google/auth/credentials.py", line 239, in before_request
self._blocking_refresh(request)
File "/home/prefect/.local/lib/python3.11/site-packages/google/auth/credentials.py", line 202, in _blocking_refresh
self.refresh(request)
File "/home/prefect/.local/lib/python3.11/site-packages/google/auth/compute_engine/credentials.py", line 133, in refresh
raise new_exc from caught_exc
google.auth.exceptions.RefreshError: ('Failed to retrieve <http://metadata.google.internal/computeMetadata/v1/instance/service-accounts/default/token?scopes=https%3A%2F%2Fwww.googleapis.com%2Fauth%2Fcloud-platform> from the Google Compute Engine metadata service. Status: 403 Response:\nb"Unable to generate access token; IAM returned 403 Forbidden: Permission \'iam.serviceAccounts.getAccessToken\' denied on resource (or it may not exist).\\nThis error could be caused by a missing IAM policy binding on the target IAM service account.\\nFor more information, refer to the Workload Identity documentation:\\n\\t<https://cloud.google.com/kubernetes-engine/docs/how-to/workload-identity#authenticating_to>\\n\\n"', <google_auth_httplib2._Response object at 0x7982bb1b5d10>)
11:47:11.349 | INFO | prefect.flow_runs.worker - Reported flow run 'cb2d6285-ea3a-496a-b0b8-a2a15d68abf4' as crashed: Flow run could not be submitted to infrastructure:
RefreshError(TransportError('Failed to retrieve <http://metadata.google.internal/computeMetadata/v1/instance/service-accounts/default/token?scopes=https%3A%2F%2Fwww.googleapis.com%2Fauth%2Fcloud-platform> from the Google Compute Engine metadata service. Status: 403 Response:\nb"Unable to generate access token; IAM returned 403 Forbidden: Permission \'iam.serviceAccounts.getAccessToken\' denied on resource (or it may not exist).\\nThis error could be caused by a missing IAM policy binding on the target IAM service account.\\nFor more information, refer to the Workload Identity documentation:\\n\\t<https://cloud.google.com/kubernetes-engine/docs/how-to/workload-identity#authenticating_to>\\n\\n"', <google_auth_httplib2._Response object at 0x7982bb1b5d10>))
Wait for more info on the next messageMarvin
08/06/2025, 12:00 PMMarvin
08/06/2025, 12:00 PMDaniel Soares
08/06/2025, 12:01 PMWorkPool(
id='eab22801-5ef4-4e26-bb04-8476a9d5388d',
created=DateTime(2025, 7, 25, 9, 47, 15, 851829, tzinfo=Timezone('UTC')),
updated=DateTime(2025, 8, 6, 10, 41, 48, 678144, tzinfo=Timezone('UTC')),
name='base',
description='Main workpool for Prefect workloads',
type='cloud-run-v2',
base_job_template={
'variables': {
'type': 'object',
'properties': {
'cpu': {
'type': 'string',
'title': 'CPU',
'default': '1000m',
'description': 'The CPU to allocate to the Cloud Run job.'
},
'env': {
'type': 'object',
'title': 'Environment Variables',
'description': 'Environment variables to set when starting a flow
run.',
'additionalProperties': {'type': 'string'}
},
'args': {
'type': 'array',
'items': {'type': 'string'},
'title': 'Args',
'description': "The arguments to pass to the Cloud Run Job V2's
entrypoint command."
},
'image': {
'type': 'string',
'title': 'Image Name',
'default': 'prefecthq/prefect:2-latest',
'description': 'The image to use for the Cloud Run job. If not
provided the default Prefect image will be used.'
},
'memory': {
'type': 'string',
'title': 'Memory',
'default': '512Mi',
'example': '512Mi',
'pattern': '^\\d+(?:G|Gi|M|Mi)$',
'description': 'The memory to allocate to the Cloud Run job along
with the units, whichcould be: G, Gi, M, Mi.'
},
'region': {
'type': 'string',
'title': 'Region',
'default': 'europe-west6',
'description': 'The region in which to run the Cloud Run job'
},
'timeout': {
'type': 'integer',
'title': 'Job Timeout',
'default': 600,
'maximum': 86400,
'description': 'The length of time that Prefect will wait for a
Cloud Run Job to complete before raising an exception (maximum of 86400 seconds, 1
day).',
'exclusiveMinimum': 0
},
'keep_job': {
'type': 'boolean',
'title': 'Keep Job After Completion',
'default': True,
'description': 'Keep the completed Cloud run job on Google Cloud
Platform.'
},
'max_retries': {
'type': 'integer',
'title': 'Max Retries',
'default': 0,
'description': 'The number of times to retry the Cloud Run job.'
}
},
'description': 'Default variables for the Cloud Run worker V2.\n\nThe
schema for this class is used to populate the `variables` section of the\ndefault base
job template.'
},
'job_configuration': {
'env': '{{ env }}',
'region': '{{ region }}',
'timeout': '{{ timeout }}',
'job_body': {
'client': 'prefect',
'template': {
'template': {
'timeout': '{{ timeout }}',
'vpcAccess': {
'connector':
'projects/my-net-project/locations/my-location/connectors/my-connector'
},
'containers': [
{
'env': [],
'args': '{{ args }}',
'image': '{{ image }}',
'resources': {
'limits': {
'cpu': '{{ cpu }}',
'memory': '{{ memory }}'
}
}
}
],
'maxRetries': '{{ max_retries }}',
'serviceAccount':
'prefect@my-project.iam.gserviceaccount.com'
}
},
'launchStage': 'BETA'
},
'keep_job': '{{ keep_job }}'
}
},
status=WorkPoolStatus.READY,
storage_configuration=WorkPoolStorageConfiguration(),
default_queue_id='b86a0d36-f11d-4190-ae8c-1b705b7ec91b'
)
Output of the env vars from prefect-worker pod:
PREFECT_KUBERNETES_CLUSTER_UID=my-uid
PREFECT_API_AUTH_STRING=my-auth-string
PREFECT_API_ENABLE_HTTP2=true
PREFECT_WORKER_PREFETCH_SECONDS=10
PREFECT_WORKER_QUERY_SECONDS=5
PREFECT_API_URL=http://my-prefect.test.com/api
PREFECT_INTEGRATIONS_KUBERNETES_OBSERVER_NAMESPACES=prefect
PREFECT_DEBUG_MODE=false
PREFECT_SERVER_PORT_4200_TCP_PROTO=tcp
PREFECT_SERVER_PORT=tcp://my-internal-ip:4200
PREFECT_SERVER_PORT_4200_TCP_PORT=4200
PREFECT_SERVER_PORT_4200_TCP_ADDR=my-internal-ip
PREFECT_SERVER_SERVICE_HOST=my-internal-ip
PREFECT_SERVER_SERVICE_PORT=4200
PREFECT_SERVER_SERVICE_PORT_SERVER_SVC_PORT=4200
PREFECT_SERVER_PORT_4200_TCP=tcp://my-internal-ip:4200Marvin
08/06/2025, 12:01 PMMarvin
08/06/2025, 12:01 PM1. In your work pool configuration, the service account is hardcoded in the job template:
yaml
'serviceAccount': 'prefect@my-project.iam.gserviceaccount.com'
2. Looking at your worker pod's environment variables, I notice that some important variables are missing:
- GCP_REGION
(although this is specified in your work pool as 'europe-west6')
- GOOGLE_CLOUD_PROJECT
(your GCP project ID)
The error you're seeing suggests that when the worker tries to create a Cloud Run job, it's not properly authenticating with GCP. Here's what I recommend:
1. Add these environment variables to your worker deployment (in your Helm values or Kubernetes manifest):
yaml
env:
- name: GOOGLE_CLOUD_PROJECT
value: "your-project-id"
- name: GCP_REGION
value: "europe-west6" # match your work pool region
2. Verify the service account permissions again. The service account needs these exact roles:
- roles/run.developer
- roles/run.invoker
- roles/iam.serviceAccountUser
3. In your work pool configuration, I notice you're using a VPC connector. Make sure the service account also has permission to use this VPC connector:
- roles/vpcaccess.user
4. You can also try adding some debug environment variables to get more information:
yaml
env:
- name: PREFECT_LOGGING_LEVEL
value: "DEBUG"
- name: GOOGLE_API_USE_CLIENT_CERTIFICATE
value: "true"
5. Check if the Cloud Run API is enabled in your project:
bash
gcloud services list | grep run.googleapis.com
If not, enable it:
bash
gcloud services enable run.googleapis.com
Can you try adding those environment variables to your worker deployment and let me know if that resolves the issue? If not, with the debug logging enabled, we should get more detailed error messages that can help pinpoint the exact cause.
Also, could you share a redacted version of the worker logs when it tries to create a Cloud Run job? This would help us see exactly where in the process it's failing.Daniel Soares
08/06/2025, 1:26 PMMarvin
08/06/2025, 1:26 PMMarvin
08/06/2025, 1:26 PMDaniel Soares
08/06/2025, 2:04 PMMarvin
08/06/2025, 2:05 PMDaniel Soares
08/06/2025, 2:05 PMMarvin
08/06/2025, 2:05 PMDaniel Soares
08/06/2025, 2:05 PMMarvin
08/06/2025, 2:05 PMThe key part of the error is:
Permission 'iam.serviceAccounts.getAccessToken' denied on resource (or it may not exist)
This specific error occurs when the Workload Identity setup is incomplete. Even though you mentioned Workload Identity is working, there seems to be an issue with the token generation step.
Let's verify the complete Workload Identity setup:
1. First, check if the Kubernetes service account has the correct annotation:
bash
kubectl get serviceaccount prefect-worker -n your-namespace -o yaml
It should have:
yaml
metadata:
annotations:
iam.gke.io/gcp-service-account: prefect@my-project.iam.gserviceaccount.com
2. Verify the IAM binding includes the exact format:
bash
gcloud iam service-accounts get-iam-policy prefect@my-project.iam.gserviceaccount.com
It should show:
yaml
bindings:
- members:
- serviceAccount:my-project.svc.id.goog[your-namespace/prefect-worker]
role: roles/iam.workloadIdentityUser
3. Add this specific role to your GCP service account if not already present:
bash
gcloud projects add-iam-policy-binding my-project \
--member="serviceAccount:prefect@my-project.iam.gserviceaccount.com" \
--role="roles/iam.serviceAccountTokenCreator"
4. In your Kubernetes deployment for the worker, ensure you have:
yaml
spec:
template:
spec:
serviceAccountName: prefect-worker # Must match the K8s service account name
nodeSelector:
iam.gke.io/gke-metadata-server-enabled: "true"
5. Also, try adding these environment variables to your worker deployment:
yaml
env:
- name: GOOGLE_APPLICATION_CREDENTIALS
value: "/var/run/secrets/tokens/gcp-token"
- name: CLOUDSDK_AUTH_CREDENTIAL_FILE_OVERRIDE
value: "/var/run/secrets/tokens/gcp-token"
Can you verify these settings and let me know if any are missing or different in your setup? Also, could you try running a test pod with the same service account to verify the Workload Identity is working:
bash
kubectl run -it \
--serviceaccount=prefect-worker \
--image=google/cloud-sdk:slim \
--namespace=your-namespace \
workload-identity-test -- /bin/bash
# Inside the pod run:
gcloud auth list
gcloud projects list
This will help us confirm if the issue is specific to the Prefect worker or if it's a general Workload Identity configuration issue.Marvin
08/06/2025, 2:05 PMDaniel Soares
08/06/2025, 2:05 PMMarvin
08/06/2025, 2:05 PMMarvin
08/06/2025, 2:05 PMMarvin
08/06/2025, 2:05 PM1. First, remove the explicit service account from the work pool's job template:
bash
prefect work-pool update base --update-base-job-template '{
"job_configuration": {
"job_body": {
"template": {
"template": {
"serviceAccount": null # Remove the explicit service account
}
}
}
}
}'
2. Add these specific environment variables to your worker deployment:
yaml
env:
- name: GOOGLE_CLOUD_PROJECT
value: "your-project-id"
- name: USE_GKE_GCLOUD_AUTH_PLUGIN
value: "True"
- name: GOOGLE_APPLICATION_CREDENTIALS
value: "/var/run/secrets/kubernetes.io/serviceaccount/token"
3. Also, let's verify the worker's service account has all the necessary roles. Run these commands:
bash
# Add Cloud Run Admin role
gcloud projects add-iam-policy-binding your-project-id \
--member="serviceAccount:prefect@my-project.iam.gserviceaccount.com" \
--role="roles/run.admin"
# Add Service Account User role
gcloud projects add-iam-policy-binding your-project-id \
--member="serviceAccount:prefect@my-project.iam.gserviceaccount.com" \
--role="roles/iam.serviceAccountUser"
# Add Service Account Token Creator role
gcloud projects add-iam-policy-binding your-project-id \
--member="serviceAccount:prefect@my-project.iam.gserviceaccount.com" \
--role="roles/iam.serviceAccountTokenCreator"
4. In your worker pod, you can try to verify the authentication is working by running this Python code:
python
from google.auth import default
from google.cloud import run_v2
credentials, project = default()
client = run_v2.JobsClient(credentials=credentials)
# Try to list jobs
parent = f"projects/{project}/locations/europe-west6"
jobs = client.list_jobs(parent=parent)
for job in jobs:
    print(job.name)
Can you:
1. Apply these changes to your worker deployment
2. Verify all the IAM roles are properly set
3. Try running a flow again
If it still fails, can you run the Python code snippet above in your worker pod and let me know what output/error you get? This will help us understand exactly where the authentication chain is breaking.Daniel Soares
08/06/2025, 2:05 PMMarvin
08/06/2025, 2:05 PMMarvin
08/06/2025, 2:05 PMMarvin
08/06/2025, 2:05 PMThe worker is requesting a token from the default metadata endpoint:
http://metadata.google.internal/computeMetadata/v1/instance/service-accounts/default/token
This suggests that the worker isn't properly configured to use the Workload Identity service account. Let's fix this:
1. First, check your Kubernetes service account annotation to ensure it's exactly correct:
bash
kubectl get serviceaccount prefect-worker -n your-namespace -o yaml
2. In your work pool configuration, I notice you have the service account hardcoded in the job template. Instead, let's modify it to use the Workload Identity service account. Update your work pool configuration:
bash
prefect work-pool create-type cloud-run-v2 \
--name base \
--base-job-template '{
"job_configuration": {
"template": {
"serviceAccount": null, # Remove the hardcoded service account
"containers": [{
"image": "{{ image }}",
"resources": {
"limits": {
"cpu": "{{ cpu }}",
"memory": "{{ memory }}"
}
}
}]
}
}
}'
3. Add these specific environment variables to your worker deployment:
yaml
spec:
template:
spec:
serviceAccountName: prefect-worker # Your K8s service account name
containers:
- name: prefect-worker
env:
- name: GOOGLE_CLOUD_PROJECT
value: "your-project-id"
- name: USE_GKE_GCLOUD_AUTH_PLUGIN
value: "True"
- name: CLOUDSDK_CORE_PROJECT
value: "your-project-id"
4. Verify the IAM roles on your GCP service account:
bash
gcloud iam service-accounts get-iam-policy prefect@my-project.iam.gserviceaccount.com
Make sure it has these exact roles:
bash
gcloud projects add-iam-policy-binding your-project-id \
--member="serviceAccount:prefect@my-project.iam.gserviceaccount.com" \
--role="roles/iam.serviceAccountTokenCreator"
gcloud projects add-iam-policy-binding your-project-id \
--member="serviceAccount:prefect@my-project.iam.gserviceaccount.com" \
--role="roles/iam.workloadIdentityUser"
gcloud projects add-iam-policy-binding your-project-id \
--member="serviceAccount:prefect@my-project.iam.gserviceaccount.com" \
--role="roles/run.developer"
5. Verify the Workload Identity binding:
bash
gcloud iam service-accounts add-iam-policy-binding \
prefect@my-project.iam.gserviceaccount.com \
--role="roles/iam.workloadIdentityUser" \
--member="serviceAccount:your-project-id.svc.id.goog[your-namespace/prefect-worker]"
6. Test the Workload Identity setup:
bash
kubectl run -it --rm \
--serviceaccount=prefect-worker \
--image=google/cloud-sdk:slim \
--namespace=your-namespace \
workload-identity-test -- bash -c \
'gcloud auth list && gcloud projects list'
The key changes here are:
1. Removing the hardcoded service account from the work pool configuration
2. Adding the necessary environment variables to the worker deployment
3. Ensuring all IAM roles are correctly set
4. Verifying the Workload Identity binding is correct
Can you try these changes and let me know if the error persists? Also, could you share:
1. The output of kubectl get serviceaccount prefect-worker -n your-namespace -o yaml
2. The output of the test command in step 6?Marvin
08/06/2025, 2:05 PM1. First, make sure the service account has the Cloud Run roles:
bash
gcloud projects add-iam-policy-binding your-project-id \
--member="serviceAccount:prefect@my-project.iam.gserviceaccount.com" \
--role="roles/run.developer"
gcloud projects add-iam-policy-binding your-project-id \
--member="serviceAccount:prefect@my-project.iam.gserviceaccount.com" \
--role="roles/run.invoker"
2. Check if the Cloud Run API is enabled:
bash
gcloud services enable run.googleapis.com
3. In your work pool configuration, I notice you have a VPC connector specified. Make sure:
- The VPC connector exists and is properly configured
- The service account has permission to use the VPC connector:
bash
gcloud projects add-iam-policy-binding your-project-id \
--member="serviceAccount:prefect@my-project.iam.gserviceaccount.com" \
--role="roles/vpcaccess.user"
4. Let's modify your work pool configuration to add more debugging information. Update your work pool with:
bash
prefect work-pool create --type cloud-run-v2 --update base \
--job-variables '{
"region": "europe-west6",
"image": "prefecthq/prefect:2-latest",
"cpu": "1000m",
"memory": "512Mi",
"env": {
"PREFECT_LOGGING_LEVEL": "DEBUG",
"GOOGLE_CLOUD_PROJECT": "your-project-id"
}
}'
5. Can you also try running a simple test to verify the service account can create Cloud Run jobs directly? Create a test pod:
bash
kubectl run -it \
--serviceaccount=prefect-worker \
--image=google/cloud-sdk:slim \
--namespace=your-namespace \
cloud-run-test -- /bin/bash
# Inside the pod:
gcloud auth list
gcloud run jobs create test-job \
--image=gcr.io/cloudrun/hello \
--region=europe-west6
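If that job is created successfully, the permissions are fine, and you can clean it up with:
bash
gcloud run jobs delete test-job --region=europe-west6 --quiet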
If this test fails with a similar error, it would confirm that the issue is with the service account permissions rather than the Prefect worker configuration.
Can you:
1. Confirm all these permissions are in place
2. Try the test pod approach
3. Share the output of:
bash
gcloud iam service-accounts get-iam-policy prefect@my-project.iam.gserviceaccount.com
This will help us determine if it's a permissions issue or a configuration issue with the worker itself.Marvin
08/06/2025, 2:06 PM1. Add these environment variables to your worker deployment:
yaml
env:
- name: GOOGLE_CLOUD_PROJECT
value: "your-project-id" # Replace with your actual project ID
- name: CLOUDSDK_CORE_PROJECT
value: "your-project-id" # Same as above
- name: GOOGLE_APPLICATION_CREDENTIALS
value: "/var/run/secrets/kubernetes.io/serviceaccount/token"
- name: USE_GKE_GCLOUD_AUTH_PLUGIN
value: "True"
2. Modify your GCP service account permissions. Run these commands:
bash
# Replace with your actual service account and project
export SA_EMAIL="<mailto:prefect@my-project.iam.gserviceaccount.com|prefect@my-project.iam.gserviceaccount.com>"
export PROJECT_ID="your-project-id"
# Add necessary roles
gcloud projects add-iam-policy-binding $PROJECT_ID \
--member="serviceAccount:$SA_EMAIL" \
--role="roles/iam.serviceAccountTokenCreator"
gcloud projects add-iam-policy-binding $PROJECT_ID \
--member="serviceAccount:$SA_EMAIL" \
--role="roles/run.admin"
# This is important - add permission to act as itself
gcloud iam service-accounts add-iam-policy-binding $SA_EMAIL \
--member="serviceAccount:$SA_EMAIL" \
--role="roles/iam.serviceAccountUser"
3. Update your Kubernetes service account to explicitly specify the GCP service account:
yaml
apiVersion: v1
kind: ServiceAccount
metadata:
name: prefect-worker
annotations:
iam.gke.io/gcp-service-account: "prefect@my-project.iam.gserviceaccount.com"
4. Make sure your worker deployment is using this service account:
yaml
spec:
template:
spec:
serviceAccountName: prefect-worker
5. Update your work pool configuration to use the default compute service account for the Cloud Run jobs:
bash
prefect work-pool update base --update-base-job-template '{
"job_configuration": {
"job_body": {
"template": {
"template": {
"serviceAccount": null
}
}
}
}
}'
After making these changes:
1. Delete the existing worker pod to force a new one with the updated configuration
2. Try running a flow again
If you still see issues, can you run this diagnostic command in the worker pod:
bash
kubectl exec -it <worker-pod> -- bash -c "curl -H 'Metadata-Flavor: Google' http://metadata.google.internal/computeMetadata/v1/instance/service-accounts/default/token"
This will help us verify if the pod can properly obtain GCP tokens.
The key difference in these changes is:
1. We're explicitly setting the GOOGLE_APPLICATION_CREDENTIALS path
2. We're ensuring the service account has permission to create tokens
3. We're removing the explicit service account from the Cloud Run job template
4. We're adding the service account's ability to act as itself
These changes should resolve the token generation issue you're seeing.Daniel Soares
08/06/2025, 3:59 PMMarvin
08/06/2025, 4:00 PMMarvin
08/06/2025, 4:00 PMLet's create a test script called test_auth.py
and copy it into the worker pod:
bash
# Create the test file
cat << 'EOF' > test_auth.py
from google.auth import default
from google.cloud import run_v2
import google.auth.transport.requests
def test_auth():
print("1. Getting default credentials...")
credentials, project = default()
print(f"2. Project ID: {project}")
print("3. Refreshing credentials...")
request = google.auth.transport.requests.Request()
credentials.refresh(request)
print("4. Token info:", credentials.token)
print("5. Creating Cloud Run client...")
client = run_v2.JobsClient(credentials=credentials)
print("6. Trying to list jobs...")
parent = f"projects/{project}/locations/europe-west6"
jobs = client.list_jobs(parent=parent)
for job in jobs:
print(f"Found job: {job.name}")
if __name__ == "__main__":
test_auth()
EOF
# Copy it to the pod
kubectl cp test_auth.py <pod-name>:/tmp/test_auth.py -n <namespace>
# Execute it in the pod
kubectl exec -it <pod-name> -n <namespace> -- python /tmp/test_auth.py
This will help us diagnose exactly where the authentication is failing. Can you run this and share the output?
Also, let's verify the current IAM bindings for your service account. Run this outside the pod:
bash
# Replace with your actual service account email
export SA_EMAIL="<mailto:prefect@my-project.iam.gserviceaccount.com|prefect@my-project.iam.gserviceaccount.com>"
# Check IAM bindings
gcloud iam service-accounts get-iam-policy $SA_EMAIL
And check the Kubernetes service account configuration:
bash
kubectl get serviceaccount prefect-worker -n <namespace> -o yaml
Please share the outputs of:
1. The Python test script
2. The IAM policy check
3. The Kubernetes service account configuration
This will help us pinpoint exactly where the authentication chain is breaking.Daniel Soares
08/06/2025, 4:05 PMTraceback (most recent call last):
File "/tmp/test_auth.py", line 2, in <module>
from google.cloud import run_v2
ImportError: cannot import name 'run_v2' from 'google.cloud' (unknown location)
command terminated with exit code 1
2. bindings:
- members:
- serviceAccount:my-project.svc.id.goog[prefect/prefect-worker-base]
role: roles/iam.workloadIdentityUser
etag: BwY7tDND5N4=
version: 1
3.
apiVersion: v1
kind: ServiceAccount
metadata:
annotations:
iam.gke.io/gcp-service-account: prefect@my-project.iam.gserviceaccount.com
meta.helm.sh/release-name: prefect-worker-base
meta.helm.sh/release-namespace: prefect
creationTimestamp: "2025-08-06T15:49:24Z"
labels:
app.kubernetes.io/component: worker
app.kubernetes.io/instance: prefect-worker-base
app.kubernetes.io/managed-by: Helm
app.kubernetes.io/name: prefect-worker
app.kubernetes.io/version: 3.4.10
helm.sh/chart: prefect-worker-2025.7.28164749
prefect-version: 3.4.10
name: prefect-worker-base
namespace: prefect
resourceVersion: "1754495364822991003"
uid: 5ce72423-d044-428c-a9d9-48b1cd037ac8
Marvin
08/06/2025, 4:05 PMMarvin
08/06/2025, 4:05 PM1. The ImportError means the google-cloud-run
package isn't installed in the worker pod.
2. There's a mismatch in your service account names:
- Your K8s service account is named prefect-worker-base
- But your workload identity binding is looking for prefect-worker
3. Let's fix these issues:
First, update your Workload Identity binding to match the correct service account name:
bash
# Remove old binding
gcloud iam service-accounts remove-iam-policy-binding \
prefect@my-project.iam.gserviceaccount.com \
--role="roles/iam.workloadIdentityUser" \
--member="serviceAccount:my-project.svc.id.goog[prefect/prefect-worker]"
# Add correct binding
gcloud iam service-accounts add-iam-policy-binding \
prefect@my-project.iam.gserviceaccount.com \
--role="roles/iam.workloadIdentityUser" \
--member="serviceAccount:my-project.svc.id.goog[prefect/prefect-worker-base]"
Next, add these environment variables to your Helm values or worker deployment:
yaml
env:
- name: GOOGLE_CLOUD_PROJECT
value: "your-project-id" # Replace with your actual project ID
- name: CLOUDSDK_CORE_PROJECT
value: "your-project-id" # Same as above
Also, make sure your GCP service account has all required roles:
bash
export PROJECT_ID="your-project-id" # Replace with your project
export SA_EMAIL="<mailto:prefect@my-project.iam.gserviceaccount.com|prefect@my-project.iam.gserviceaccount.com>"
# Add all necessary roles
gcloud projects add-iam-policy-binding $PROJECT_ID \
--member="serviceAccount:$SA_EMAIL" \
--role="roles/run.admin"
gcloud projects add-iam-policy-binding $PROJECT_ID \
--member="serviceAccount:$SA_EMAIL" \
--role="roles/iam.serviceAccountUser"
gcloud projects add-iam-policy-binding $PROJECT_ID \
--member="serviceAccount:$SA_EMAIL" \
--role="roles/iam.serviceAccountTokenCreator"
gcloud projects add-iam-policy-binding $PROJECT_ID \
--member="serviceAccount:$SA_EMAIL" \
--role="roles/vpcaccess.user"
Let's create a simpler test script that doesn't require additional packages:
python
from google.auth import default
import google.auth.transport.requests
def test_auth():
print("1. Getting default credentials...")
credentials, project = default()
print(f"2. Project ID: {project}")
print("3. Refreshing credentials...")
request = google.auth.transport.requests.Request()
credentials.refresh(request)
print("4. Token info:", credentials.token[:20] + "...") # Only show start of token
if __name__ == "__main__":
test_auth()
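You can run it without copying files into the pod by piping it to the pod's Python on stdin (pod name is a placeholder):
bash
kubectl exec -i <worker-pod-name> -n prefect -- python - < test_auth.py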
Can you:
1. Run the updated IAM commands
2. Apply the environment variables to your worker deployment
3. Try the new test script
4. Attempt to run a flow again
Also, please share your work pool configuration:
bash
prefect work-pool inspect base
This will help us ensure the Cloud Run job configuration is correct as well.Daniel Soares
08/06/2025, 4:10 PM1. Getting default credentials...
2. Project ID: my-project
3. Refreshing credentials...
4. Token info: jknkjnvbbbkjjknegrfg...
Workpool configs:
WorkPool(
id='eab22801-5ef4-4e26-bb04-8476a9d5388d',
created=DateTime(2025, 7, 25, 9, 47, 15, 851829, tzinfo=Timezone('UTC')),
updated=DateTime(2025, 8, 6, 14, 24, 50, 754629, tzinfo=Timezone('UTC')),
name='base',
description='Main workpool for Prefect workloads',
type='cloud-run-v2',
base_job_template={
'variables': {
'type': 'object',
'properties': {
'cpu': {
'type': 'string',
'title': 'CPU',
'default': '1000m',
'description': 'The CPU to allocate to the Cloud Run job.'
},
'env': {
'type': 'object',
'title': 'Environment Variables',
'description': 'Environment variables to set when starting a flow
run.',
'additionalProperties': {'type': 'string'}
},
'args': {
'type': 'array',
'items': {'type': 'string'},
'title': 'Args',
'description': "The arguments to pass to the Cloud Run Job V2's
entrypoint command."
},
'image': {
'type': 'string',
'title': 'Image Name',
'default': 'prefecthq/prefect:2-latest',
'description': 'The image to use for the Cloud Run job. If not
provided the default Prefect image will be used.'
},
'memory': {
'type': 'string',
'title': 'Memory',
'default': '512Mi',
'example': '512Mi',
'pattern': '^\\d+(?:G|Gi|M|Mi)$',
'description': 'The memory to allocate to the Cloud Run job along
with the units, whichcould be: G, Gi, M, Mi.'
},
'region': {
'type': 'string',
'title': 'Region',
'default': 'my-region',
'description': 'The region in which to run the Cloud Run job'
},
'timeout': {
'type': 'integer',
'title': 'Job Timeout',
'default': 600,
'maximum': 86400,
'description': 'The length of time that Prefect will wait for a
Cloud Run Job to complete before raising an exception (maximum of 86400 seconds, 1
day).',
'exclusiveMinimum': 0
},
'keep_job': {
'type': 'boolean',
'title': 'Keep Job After Completion',
'default': True,
'description': 'Keep the completed Cloud run job on Google Cloud
Platform.'
},
'max_retries': {
'type': 'integer',
'title': 'Max Retries',
'default': 0,
'description': 'The number of times to retry the Cloud Run job.'
}
},
'description': 'Default variables for the Cloud Run worker V2.\n\nThe
schema for this class is used to populate the variables
section of the\ndefault base
job template.'
},
'job_configuration': {
'env': '{{ env }}',
'region': '{{ region }}',
'timeout': '{{ timeout }}',
'job_body': {
'client': 'prefect',
'template': {
'template': {
'timeout': '{{ timeout }}',
'vpcAccess': {
'connector':
'projects/my-net-proj/locations/my-location/connectors/name'
},
'containers': [
{
'env': [],
'args': '{{ args }}',
'image': '{{ image }}',
'resources': {
'limits': {
'cpu': '{{ cpu }}',
'memory': '{{ memory }}'
}
}
}
],
'maxRetries': '{{ max_retries }}',
'serviceAccount': None
}
},
'launchStage': 'BETA'
},
'keep_job': '{{ keep_job }}'
}
},
status=WorkPoolStatus.READY,
storage_configuration=WorkPoolStorageConfiguration(),
default_queue_id='b86a0d36-f11d-4190-ae8c-1b705b7ec91b'
)
Triggering a flow again, now I get this error:
googleapiclient.errors.HttpError: <HttpError 403 when requesting https://run.googleapis.com/v2/projects/my-project/locations/my-location/jobs?jobId=devout-turtle-69e6f41daa3340c99dec962e89315651&alt=json returned "The caller does not have permission". Details: "The caller does not have permission">Marvin
08/06/2025, 4:11 PMMarvin
08/06/2025, 4:11 PMGood progress - the token is being issued now. The 403 on job creation means the service account is still missing Cloud Run permissions:
bash
# Replace these with your actual values
export PROJECT_ID="my-project"
export SA_EMAIL="<mailto:prefect@my-project.iam.gserviceaccount.com|prefect@my-project.iam.gserviceaccount.com>"
# Add Cloud Run Admin role (this is needed for creating jobs)
gcloud projects add-iam-policy-binding $PROJECT_ID \
--member="serviceAccount:$SA_EMAIL" \
--role="roles/run.admin"
# Add Service Account User role (needed to act as the service account)
gcloud projects add-iam-policy-binding $PROJECT_ID \
--member="serviceAccount:$SA_EMAIL" \
--role="roles/iam.serviceAccountUser"
# Add VPC Access User role (since you're using a VPC connector)
gcloud projects add-iam-policy-binding $PROJECT_ID \
--member="serviceAccount:$SA_EMAIL" \
--role="roles/vpcaccess.user"
Also, I notice in your work pool configuration that:
1. The region in the job configuration (my-location) doesn't match the region variable default (my-region)
2. The serviceAccount is set to None
Let's update your work pool configuration:
bash
prefect work-pool update base --update-base-job-template '{
"job_configuration": {
"region": "{{ region }}",
"job_body": {
"template": {
"template": {
"serviceAccount": "<mailto:prefect@my-project.iam.gserviceaccount.com|prefect@my-project.iam.gserviceaccount.com>",
"vpcAccess": {
"connector": "projects/my-net-proj/locations/my-location/connectors/name"
}
}
}
}
}
}'
And set the correct region:
bash
prefect work-pool update base --set-base-job-variable region="my-location"
After making these changes:
1. Delete the existing worker pod to force a refresh
2. Try running your flow again
The key changes are:
1. Adding the explicit Cloud Run Admin role
2. Setting the service account explicitly in the job template
3. Ensuring the regions match between the configuration and the variable
4. Adding the VPC Access User role for the VPC connector
Can you apply these changes and let me know if you still get the permission error?Daniel Soares
08/06/2025, 4:15 PMMarvin
08/06/2025, 4:15 PMMarvin
08/06/2025, 4:16 PM1. Try setting the service account to an empty string in the work pool template:
bash
prefect work-pool update base --update-base-job-template '{
"job_configuration": {
"job_body": {
"template": {
"template": {
"serviceAccount": "",
"vpcAccess": {
"connector": "projects/my-net-proj/locations/my-location/connectors/name"
},
"containers": [
{
"env": [],
"resources": {
"limits": {
"cpu": "{{ cpu }}",
"memory": "{{ memory }}"
}
}
}
]
}
}
}
}
}'
2. Add these specific environment variables to your worker deployment (via Helm values):
yaml
env:
- name: GOOGLE_CLOUD_PROJECT
value: "my-project" # Your actual project ID
- name: GOOGLE_APPLICATION_CREDENTIALS
value: "/var/run/secrets/kubernetes.io/serviceaccount/token"
- name: CLOUDSDK_AUTH_CREDENTIAL_FILE_OVERRIDE
value: "/var/run/secrets/kubernetes.io/serviceaccount/token"
- name: USE_GKE_GCLOUD_AUTH_PLUGIN
value: "True"
- name: PREFECT_CLOUD_RUN_JOB_CREDENTIALS_MODE
value: "workload_identity"
3. Let's verify all IAM permissions are correctly set:
bash
export PROJECT_ID="my-project"
export SA_EMAIL="<mailto:prefect@my-project.iam.gserviceaccount.com|prefect@my-project.iam.gserviceaccount.com>"
# Remove all existing bindings first
for role in roles/run.admin roles/run.developer roles/run.invoker roles/iam.serviceAccountUser roles/iam.serviceAccountTokenCreator roles/vpcaccess.user; do
gcloud projects remove-iam-policy-binding $PROJECT_ID \
--member="serviceAccount:$SA_EMAIL" \
--role="$role"
done
# Add them back one by one
gcloud projects add-iam-policy-binding $PROJECT_ID \
--member="serviceAccount:$SA_EMAIL" \
--role="roles/run.admin"
gcloud projects add-iam-policy-binding $PROJECT_ID \
--member="serviceAccount:$SA_EMAIL" \
--role="roles/iam.serviceAccountUser"
gcloud projects add-iam-policy-binding $PROJECT_ID \
--member="serviceAccount:$SA_EMAIL" \
--role="roles/vpcaccess.user"
# Add workload identity binding
gcloud iam service-accounts add-iam-policy-binding $SA_EMAIL \
--role="roles/iam.workloadIdentityUser" \
--member="serviceAccount:$PROJECT_ID.svc.id.goog[prefect/prefect-worker-base]"
4. Verify the Cloud Run API is enabled:
bash
gcloud services enable run.googleapis.com
5. After applying these changes:
bash
# Delete the worker pod to force a refresh
kubectl delete pod -l app.kubernetes.io/name=prefect-worker -n prefect
# Wait for the new pod to be ready
kubectl wait --for=condition=ready pod -l app.kubernetes.io/name=prefect-worker -n prefect
6. Before trying your flow, let's run this test in the worker pod:
python
from google.auth import default
from google.auth.transport import requests
import json
credentials, project = default()
request = requests.Request()
credentials.refresh(request)
print("Project:", project)
print("Token type:", credentials.token_type)
print("Scopes:", credentials.scopes)
print("Valid:", credentials.valid)
print("Expired:", credentials.expired)
Can you:
1. Apply these changes
2. Run the test script
3. Try your flow again
If it still fails, please share:
1. The output of the test script
2. The exact error message from the flow run
3. The output of kubectl describe pod <worker-pod-name> -n prefect
This will help us understand if there's an issue with the token scope or if there's another configuration problem we need to address.Daniel Soares
08/06/2025, 4:18 PMMarvin
08/06/2025, 4:18 PMMarvin
08/06/2025, 4:18 PMLet's get more detail on the credentials the worker is using. Run this updated test script in the worker pod:
python
from google.auth import default
from google.auth.transport import requests
import json
def test_auth():
try:
print("1. Getting default credentials...")
credentials, project = default()
print(f"2. Project ID: {project}")
print("3. Credential type:", type(credentials).__name__)
print("4. Refreshing credentials...")
request = requests.Request()
credentials.refresh(request)
print("5. Token available:", bool(credentials.token))
print("6. Token (first 20 chars):", credentials.token[:20] if credentials.token else "No token")
print("7. Credential attributes:", dir(credentials))
except Exception as e:
print(f"Error occurred: {type(e).__name__} - {str(e)}")
if __name__ == "__main__":
test_auth()
Also, let's try a different approach with the worker configuration. Update your work pool with this configuration:
bash
prefect work-pool update base --update-base-job-template '{
"job_configuration": {
"job_body": {
"template": {
"template": {
"containers": [
{
"image": "{{ image }}",
"resources": {
"limits": {
"cpu": "{{ cpu }}",
"memory": "{{ memory }}"
}
},
"env": [
{
"name": "PREFECT_API_URL",
"value": "{{ prefect.api.url }}"
},
{
"name": "PREFECT_API_KEY",
"value": "{{ prefect.api.key }}"
}
]
}
],
"vpcAccess": {
"connector": "projects/my-net-proj/locations/my-location/connectors/name"
}
}
}
},
"region": "my-location"
}
}'
And update your worker deployment with these environment variables (via Helm values):
yaml
worker:
env:
- name: GOOGLE_CLOUD_PROJECT
value: "my-project" # Your actual project ID
- name: CLOUDSDK_CORE_PROJECT
value: "my-project" # Same as above
- name: GOOGLE_APPLICATION_CREDENTIALS
value: "/var/run/secrets/kubernetes.io/serviceaccount/token"
- name: USE_GKE_GCLOUD_AUTH_PLUGIN
value: "True"
Can you:
1. Apply the new work pool configuration
2. Update your worker deployment with the new environment variables
3. Run the updated test script
4. Try running a flow again
Please share:
1. The output of the new test script
2. Any new error messages you see when running the flow
This will help us understand exactly what kind of credentials the worker is getting and how we might need to adjust the configuration.Daniel Soares
08/06/2025, 4:20 PMMarvin
08/06/2025, 4:21 PMMarvin
08/06/2025, 4:21 PM1. Let's re-apply the full set of IAM roles:
bash
export PROJECT_ID="my-project"
export SA_EMAIL="<mailto:prefect@my-project.iam.gserviceaccount.com|prefect@my-project.iam.gserviceaccount.com>"
# Grant Cloud Run Admin role at the project level
gcloud projects add-iam-policy-binding $PROJECT_ID \
--member="serviceAccount:$SA_EMAIL" \
--role="roles/run.admin"
# Grant Service Account User role
gcloud projects add-iam-policy-binding $PROJECT_ID \
--member="serviceAccount:$SA_EMAIL" \
--role="roles/iam.serviceAccountUser"
# Grant Service Account Token Creator
gcloud projects add-iam-policy-binding $PROJECT_ID \
--member="serviceAccount:$SA_EMAIL" \
--role="roles/iam.serviceAccountTokenCreator"
# Grant VPC Access User
gcloud projects add-iam-policy-binding $PROJECT_ID \
--member="serviceAccount:$SA_EMAIL" \
--role="roles/vpcaccess.user"
# Grant Cloud Run Invoker
gcloud projects add-iam-policy-binding $PROJECT_ID \
--member="serviceAccount:$SA_EMAIL" \
--role="roles/run.invoker"
# Grant Service Account User on itself
gcloud iam service-accounts add-iam-policy-binding $SA_EMAIL \
--member="serviceAccount:$SA_EMAIL" \
--role="roles/iam.serviceAccountUser"
2. Update your work pool configuration to explicitly use the service account:
bash
prefect work-pool update base --update-base-job-template '{
"job_configuration": {
"job_body": {
"template": {
"template": {
"serviceAccount": "<mailto:prefect@my-project.iam.gserviceaccount.com|prefect@my-project.iam.gserviceaccount.com>",
"containers": [
{
"image": "{{ image }}",
"resources": {
"limits": {
"cpu": "{{ cpu }}",
"memory": "{{ memory }}"
}
}
}
],
"vpcAccess": {
"connector": "projects/my-net-proj/locations/my-location/connectors/name"
}
}
}
},
"region": "my-location"
}
}'
3. Add these specific environment variables to your worker deployment:
yaml
env:
- name: GOOGLE_CLOUD_PROJECT
value: "my-project"
- name: CLOUDSDK_CORE_PROJECT
value: "my-project"
- name: GOOGLE_APPLICATION_CREDENTIALS
value: "/var/run/secrets/kubernetes.io/serviceaccount/token"
- name: USE_GKE_GCLOUD_AUTH_PLUGIN
value: "True"
- name: PREFECT_LOGGING_LEVEL
value: "DEBUG"
- name: PREFECT_CLOUD_RUN_JOB_CREDENTIALS_MODE
value: "workload_identity"
4. Let's also verify the Cloud Run API is enabled and accessible:
bash
# Enable the API
gcloud services enable run.googleapis.com
# Verify the service account can list jobs (run this in Cloud Shell or similar)
gcloud run jobs list \
--impersonate-service-account=prefect@my-project.iam.gserviceaccount.com \
--project=my-project \
--region=my-location
5. After applying all these changes:
bash
# Delete the worker pod to force a refresh
kubectl delete pod -l app.kubernetes.io/name=prefect-worker -n prefect
Can you:
1. Apply all these changes
2. Try running a flow again
3. If it fails, please share:
- The complete worker pod logs: kubectl logs <pod-name> -n prefect
- The output of: gcloud projects get-iam-policy my-project --flatten="bindings[].members" --filter="bindings.members:prefect@my-project.iam.gserviceaccount.com" --format="table(bindings.role)"
This will help us see exactly what permissions are in place and what might be missing.Marvin
08/06/2025, 4:22 PMMarvin
08/06/2025, 4:22 PMI see in your work pool configuration that serviceAccount is set to None. Let's fix this and make a few other adjustments:
1. First, update your work pool configuration with the explicit service account:
bash
prefect work-pool update base --update-base-job-template '{
"job_configuration": {
"job_body": {
"template": {
"template": {
"serviceAccount": "<mailto:prefect@my-project.iam.gserviceaccount.com|prefect@my-project.iam.gserviceaccount.com>",
"vpcAccess": {
"connector": "projects/my-net-proj/locations/my-location/connectors/name"
},
"containers": [
{
"env": [],
"args": "{{ args }}",
"image": "{{ image }}",
"resources": {
"limits": {
"cpu": "{{ cpu }}",
"memory": "{{ memory }}"
}
}
}
]
}
}
},
"region": "my-location"
}
}'
2. Update the region setting to match your location:
bash
prefect work-pool update base --set-base-job-variable region="my-location"
3. Let's verify and update all the necessary IAM permissions:
bash
export PROJECT_ID="my-project"
export SA_EMAIL="<mailto:prefect@my-project.iam.gserviceaccount.com|prefect@my-project.iam.gserviceaccount.com>"
# First, grant project-level permissions
for role in \
"roles/run.admin" \
"roles/run.invoker" \
"roles/iam.serviceAccountUser" \
"roles/vpcaccess.user"; do
gcloud projects add-iam-policy-binding $PROJECT_ID \
--member="serviceAccount:$SA_EMAIL" \
--role="$role"
done
# Grant the service account permission to act as itself
gcloud iam service-accounts add-iam-policy-binding $SA_EMAIL \
--member="serviceAccount:$SA_EMAIL" \
--role="roles/iam.serviceAccountUser"
# Update the Workload Identity binding
gcloud iam service-accounts add-iam-policy-binding $SA_EMAIL \
--member="serviceAccount:$PROJECT_ID.svc.id.goog[prefect/prefect-worker-base]" \
--role="roles/iam.workloadIdentityUser"
4. Add these environment variables to your worker deployment:
yaml
env:
- name: GOOGLE_CLOUD_PROJECT
value: "my-project"
- name: CLOUDSDK_CORE_PROJECT
value: "my-project"
- name: PREFECT_LOGGING_LEVEL
value: "DEBUG"
- name: GOOGLE_APPLICATION_CREDENTIALS
value: "/var/run/secrets/kubernetes.io/serviceaccount/token"
5. After applying these changes, let's verify the permissions are working. Create this test script and run it in the worker pod:
python
from google.auth import default
from google.cloud import run_v2
import google.auth.transport.requests
def test_permissions():
print("1. Getting credentials...")
credentials, project = default()
print("2. Refreshing token...")
auth_req = google.auth.transport.requests.Request()
credentials.refresh(auth_req)
print("3. Creating Cloud Run client...")
client = run_v2.JobsClient(credentials=credentials)
print("4. Testing list jobs...")
parent = f"projects/{project}/locations/my-location"
try:
jobs = client.list_jobs(request={"parent": parent})
print("Successfully listed jobs!")
except Exception as e:
print(f"Error listing jobs: {str(e)}")
if __name__ == "__main__":
test_permissions()
Can you:
1. Apply the work pool updates
2. Apply the IAM permissions
3. Update your worker deployment
4. Run the new test script (after installing google-cloud-run with pip install google-cloud-run)
5. Try running a flow again
If it still fails, please share:
1. The output of the test script
2. The complete worker pod logs after trying to run a flow
3. The output of:
bash
gcloud projects get-iam-policy $PROJECT_ID \
--flatten="bindings[].members" \
--filter="bindings.members:$SA_EMAIL" \
--format="table(bindings.role)"
This will help us understand exactly what permissions are in place and what might be missing.Daniel Soares
08/06/2025, 4:24 PMMarvin
08/06/2025, 4:25 PMMarvin
08/06/2025, 4:25 PM1. First, simplify the work pool template back to the basics:
bash
prefect work-pool update base --update-base-job-template '{
"job_configuration": {
"job_body": {
"template": {
"template": {
"containers": [
{
"image": "{{ image }}",
"resources": {
"limits": {
"cpu": "{{ cpu }}",
"memory": "{{ memory }}"
}
}
}
]
}
}
},
"region": "my-location"
}
}'
2. Then, create a new GCP service account specifically for running the jobs:
bash
# Create a new service account for running jobs
gcloud iam service-accounts create prefect-runner \
--display-name="Prefect Runner Service Account"
# Grant it Cloud Run permissions
gcloud projects add-iam-policy-binding my-project \
--member="serviceAccount:prefect-runner@my-project.iam.gserviceaccount.com" \
--role="roles/run.developer"
# Grant the worker service account permission to impersonate it
gcloud iam service-accounts add-iam-policy-binding \
prefect-runner@my-project.iam.gserviceaccount.com \
--member="serviceAccount:prefect@my-project.iam.gserviceaccount.com" \
--role="roles/iam.serviceAccountUser"
3. Update the work pool to use this new service account:
bash
prefect work-pool update base --update-base-job-template '{
"job_configuration": {
"job_body": {
"template": {
"template": {
"serviceAccount": "<mailto:prefect-runner@my-project.iam.gserviceaccount.com|prefect-runner@my-project.iam.gserviceaccount.com>",
"containers": [
{
"image": "{{ image }}",
"resources": {
"limits": {
"cpu": "{{ cpu }}",
"memory": "{{ memory }}"
}
}
}
]
}
}
},
"region": "my-location"
}
}'
This approach:
1. Creates a separate service account for running the jobs
2. Simplifies the work pool configuration
3. Uses proper service account delegation
If you're willing to try one last time, apply these changes and try running a flow. This pattern has worked in similar situations where the permissions were causing issues.
If this still doesn't work, I understand your frustration, and you might want to:
1. Open an issue on the Prefect GitHub repository
2. Consider using a different worker type temporarily
3. Reach out to Prefect support with the specific error messages and configuration
Would you like to try this last approach, or would you prefer to explore alternative solutions?Daniel Soares
08/06/2025, 5:13 PMMarvin
08/06/2025, 5:13 PMMarvin
08/06/2025, 5:13 PM2. Try a Different Worker Type
a. Use the kubernetes worker type instead of `cloud-run-v2`:
bash
prefect work-pool create kubernetes-pool --type kubernetes
b. Or use the Process worker type with a Kubernetes deployment:
bash
prefect work-pool create process-pool --type process
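In either case, you would then start a worker against the new pool:
bash
prefect worker start --pool kubernetes-pool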
3. Contact Prefect Support
Since this seems to be a specific issue with the Cloud Run v2 worker type, you might want to reach out to Prefect support directly:
https://www.prefect.io/support
Would you like me to:
1. Help you set up an alternative worker type as a temporary solution?
2. Help you draft a detailed GitHub issue?
3. Provide information about a different deployment approach?