@Marvin I am trying to run the prefect worker...
# ask-marvin
d
@Marvin I am trying to run the Prefect worker deployed via Helm chart in Kubernetes with type cloud-run-v2, but when I try to launch a flow into that worker, I always get this error
```
RefreshError(TransportError('Failed to retrieve http://metadata.google.internal/computeMetadata/v1/instance/service-accounts/default/token?scopes=https%3A%2F%2Fwww.googleapis.com%2Fauth%2Fcloud-platform from the Google Compute Engine metadata service. Status: 403 Response:\nb"Unable to generate access token; IAM returned 403 Forbidden: Permission \'iam.serviceAccounts.getAccessToken\' denied on resource (or it may not exist).\\nThis error could be caused by a missing IAM policy binding on the target IAM service account.\\nFor more information, refer to the Workload Identity documentation:\\n\\thttps://cloud.google.com/kubernetes-engine/docs/how-to/workload-identity#authenticating_to\\n\\n"', <google_auth_httplib2._Response object at 0x7de56c381c50>))
```
I have a gcp service account attached to the prefect-worker pods with the roles "roles/run.developer", "roles/run.invoker", "roles/iam.serviceAccountTokenCreator" and also with the workload identity binding to the GKE workload identity service account. What can I do to solve this?
m
thought for 43.1 seconds
Based on the error message and research, this appears to be a Workload Identity configuration issue. Here are the key points and solutions to check:

1. Required IAM Roles: You have most of the required roles, but you should also have:
- `roles/iam.serviceAccountUser`
- `roles/run.developer`

2. Workload Identity Configuration: The error suggests the Workload Identity setup isn't complete. Here's what to verify:

a. Ensure the GKE cluster has Workload Identity enabled:
```bash
gcloud container clusters describe YOUR_CLUSTER_NAME --zone=YOUR_ZONE | grep workloadPool
```
b. Verify the Kubernetes service account annotation:
```bash
kubectl describe serviceaccount prefect-worker -n YOUR_NAMESPACE
```
It should have an annotation like:
```yaml
iam.gke.io/gcp-service-account: YOUR-GSA@YOUR-PROJECT.iam.gserviceaccount.com
```
c. Verify the IAM binding between the GCP service account and Kubernetes service account:
```bash
gcloud iam service-accounts get-iam-policy YOUR-GSA@YOUR-PROJECT.iam.gserviceaccount.com
```
You should see a binding like:
```yaml
roles/iam.workloadIdentityUser:
  members:
  - serviceAccount:YOUR-PROJECT.svc.id.goog[YOUR-NAMESPACE/prefect-worker]
```
3. Pod Configuration: Ensure your pod spec includes:
```yaml
serviceAccountName: prefect-worker  # or whatever your K8s service account name is
```
4. Verify the Setup: You can test the Workload Identity configuration by running:
```bash
kubectl run -it workload-identity-test \
  --image=google/cloud-sdk:slim \
  --namespace=YOUR_NAMESPACE \
  --overrides='{"spec": {"serviceAccountName": "prefect-worker"}}' \
  -- /bin/bash

# Inside the pod:
gcloud auth list
```
5. Common Fixes: - If the binding isn't working, recreate it:
```bash
gcloud iam service-accounts add-iam-policy-binding YOUR-GSA@YOUR-PROJECT.iam.gserviceaccount.com \
  --role roles/iam.workloadIdentityUser \
  --member "serviceAccount:YOUR-PROJECT.svc.id.goog[YOUR-NAMESPACE/prefect-worker]"
```
- Ensure the namespace matches exactly in all configurations
- Double-check there are no typos in service account names
- Verify the GCP service account exists and is enabled

The error message specifically mentions `iam.serviceAccounts.getAccessToken`, which suggests the Workload Identity binding isn't properly configured or the pod isn't using the correct Kubernetes service account. Would you like me to provide more specific guidance about any of these areas? It would be helpful if you could share:
1. The exact namespace you're using
2. The name of your Kubernetes service account
3. The full name of your GCP service account
This would allow me to provide more targeted troubleshooting steps.
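If you want to check this from inside the worker pod itself, here is a minimal sketch (an illustration assuming Python is available in the pod, as it is in the standard Prefect worker images; the filename is hypothetical) that asks the GKE metadata server which identity the pod's `default` service account resolves to, and then attempts the same token mint that is failing:
```python
# check_identity.py -- run inside the worker pod, e.g.:
#   kubectl exec -it <worker-pod-name> -- python check_identity.py
import urllib.request

BASE = "http://metadata.google.internal/computeMetadata/v1/"

def md(path: str) -> str:
    # Every metadata-server request must carry the Metadata-Flavor header
    req = urllib.request.Request(BASE + path, headers={"Metadata-Flavor": "Google"})
    return urllib.request.urlopen(req, timeout=5).read().decode()

# Under Workload Identity, this should print the bound GCP service account email
print("identity:", md("instance/service-accounts/default/email"))

# This is the exact request that is returning 403 in the worker
print("token:", md("instance/service-accounts/default/token")[:60], "...")
```
If the `email` call already fails or returns an unexpected account, the pod is not running under the annotated Kubernetes service account; if `email` is right but `token` returns 403, the `roles/iam.workloadIdentityUser` binding on the GCP service account is the likely problem.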
d
@Marvin All those points are checked. Workload Identity works, everything is properly set up. Somehow, when I trigger a flow run, it goes into the worker pod, and then the worker pod is not able to create the Cloud Run job for the flow because it always gives that error. Is there any env var I need to set in the worker or something?
m
thought for 49.5 seconds
Based on the research and recent GitHub issues, there are a few key environment variables and configurations that need to be set for the Cloud Run v2 worker to function properly. Let me outline what needs to be configured:

1. Required Environment Variables for the Worker Pod:
```yaml
- PREFECT_API_KEY
- PREFECT_API_URL
- GCP_REGION  # The region where Cloud Run jobs will be created
```
2. Worker Configuration Variables: When creating the work pool, make sure these are set:
```yaml
- project_name  # Your GCP project
- location      # GCP region
- cpu           # CPU allocation for Cloud Run jobs
- memory        # Memory allocation for Cloud Run jobs
```
3. GCP-specific Configuration: The worker needs to know which service account to use for creating Cloud Run jobs. This can be specified in two ways:
```yaml
- service_account_name: your-service-account@your-project.iam.gserviceaccount.com
```
or
```yaml
- service_account_email: your-service-account@your-project.iam.gserviceaccount.com
```
Can you verify:
1. Are all these environment variables set in your worker pod?
2. What values do you have configured for your work pool?
You can check the worker pod's environment variables with:
```bash
kubectl exec -it <worker-pod-name> -- env | grep PREFECT
```
And you can check the work pool configuration with:
```bash
prefect work-pool inspect <your-workpool-name>
```
Also, since you mentioned the worker pod is receiving the flow run but failing to create the Cloud Run job, can you check the worker logs? This might give us more context about where exactly the authentication is failing:
```bash
kubectl logs <worker-pod-name>
```
The error you're seeing suggests that even though Workload Identity is working at the pod level, there might be an issue with how the worker is trying to authenticate when creating the Cloud Run job. This could be because:
1. The worker is not using the Workload Identity credentials correctly
2. The service account configuration in the work pool doesn't match the Workload Identity setup
3. There might be missing environment variables that the worker needs to properly authenticate
Can you share these details so we can pinpoint the exact issue?
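To see exactly which credentials the worker process resolves, here is a small sketch (illustrative only; it assumes you can `kubectl exec` into the worker pod, where `google-auth` is already installed as a dependency of `prefect-gcp`) that runs the same Application Default Credentials discovery and token refresh the worker performs:
```python
# probe_adc.py -- run inside the worker pod with:
#   kubectl exec -it <worker-pod-name> -- python probe_adc.py
import google.auth
from google.auth.transport.requests import Request

# Same credential discovery the Google API client performs inside the worker
credentials, project = google.auth.default(
    scopes=["https://www.googleapis.com/auth/cloud-platform"]
)
print("credentials:", type(credentials).__module__, type(credentials).__name__)
print("project:", project)

# Force the token refresh that raises RefreshError in the worker logs
credentials.refresh(Request())
print("token acquired, expires at:", credentials.expiry)
```
On GKE with Workload Identity you should see a `google.auth.compute_engine` credentials class; if the refresh raises the same `RefreshError`, the problem is reproducible entirely outside of Prefect.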
d
@Marvin These are the logs from the worker pod:
```
Installing prefect-gcp...
Using Python 3.11.13 environment at: /usr/local
Resolved 116 packages in 1.56s
Downloading google-api-python-client (13.1MiB)
Downloading prefect (5.8MiB)
Downloading apprise (1.3MiB)
Downloading grpcio (5.9MiB)
Downloading cryptography (4.3MiB)
Downloading sqlalchemy (3.1MiB)
Downloading uv (18.0MiB)
 Downloading apprise
 Downloading grpcio
 Downloading sqlalchemy
 Downloading cryptography
 Downloading uv
 Downloading prefect
 Downloading google-api-python-client
Prepared 34 packages in 1.55s
error: failed to remove file `/usr/local/lib/python3.11/site-packages/cachetools-6.1.0.dist-info/INSTALLER`: Read-only file system (os error 30)
Worker 'CloudRunWorkerV2 8b50ccb0-097a-4d9c-937d-3f0f01179363' started!
11:47:10.930 | INFO    | prefect.flow_runs.worker - Worker 'CloudRunWorkerV2 8b50ccb0-097a-4d9c-937d-3f0f01179363' submitting flow run 'cb2d6285-ea3a-496a-b0b8-a2a15d68abf4'
11:47:11.096 | WARNING | prefect.flow_runs.worker - PREFECT_API_AUTH_STRING is provided as a plaintext environment variable. For better security, consider providing it as a secret using 'prefect_api_auth_string_secret' or 'env_from_secrets' in your base job template.
11:47:11.139 | INFO    | prefect.flow_runs.worker - Creating Cloud Run JobV2 tasteful-coati-985e9381fc6245028197305f4db94333
11:47:11.154 | WARNING | google_auth_httplib2 - httplib2 transport does not support per-request timeout. Set the timeout when constructing the httplib2.Http instance.
11:47:11.173 | WARNING | google_auth_httplib2 - httplib2 transport does not support per-request timeout. Set the timeout when constructing the httplib2.Http instance.
11:47:11.299 | ERROR   | prefect.flow_runs.worker - Failed to submit flow run 'cb2d6285-ea3a-496a-b0b8-a2a15d68abf4' to infrastructure.
Traceback (most recent call last):
  File "/home/prefect/.local/lib/python3.11/site-packages/google/auth/compute_engine/credentials.py", line 128, in refresh
    self.token, self.expiry = _metadata.get_service_account_token(
                              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/prefect/.local/lib/python3.11/site-packages/google/auth/compute_engine/_metadata.py", line 375, in get_service_account_token
    token_json = get(request, path, params=params, headers=metrics_header)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/prefect/.local/lib/python3.11/site-packages/google/auth/compute_engine/_metadata.py", line 267, in get
    raise exceptions.TransportError(
google.auth.exceptions.TransportError: ('Failed to retrieve http://metadata.google.internal/computeMetadata/v1/instance/service-accounts/default/token?scopes=https%3A%2F%2Fwww.googleapis.com%2Fauth%2Fcloud-platform from the Google Compute Engine metadata service. Status: 403 Response:\nb"Unable to generate access token; IAM returned 403 Forbidden: Permission \'iam.serviceAccounts.getAccessToken\' denied on resource (or it may not exist).\\nThis error could be caused by a missing IAM policy binding on the target IAM service account.\\nFor more information, refer to the Workload Identity documentation:\\n\\thttps://cloud.google.com/kubernetes-engine/docs/how-to/workload-identity#authenticating_to\\n\\n"', <google_auth_httplib2._Response object at 0x7982bb1b5d10>)

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/prefect/.local/lib/python3.11/site-packages/prefect/workers/base.py", line 1262, in _submit_run_and_capture_errors
    result = await self.run(
             ^^^^^^^^^^^^^^^
  File "/home/prefect/.local/lib/python3.11/site-packages/prefect_gcp/workers/cloud_run_v2.py", line 643, in run
    await run_sync_in_worker_thread(
  File "/home/prefect/.local/lib/python3.11/site-packages/prefect/utilities/asyncutils.py", line 233, in run_sync_in_worker_thread
    result = await anyio.to_thread.run_sync(
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/anyio/to_thread.py", line 56, in run_sync
    return await get_async_backend().run_sync_in_worker_thread(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/anyio/_backends/_asyncio.py", line 2470, in run_sync_in_worker_thread
    return await future
           ^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/anyio/_backends/_asyncio.py", line 967, in run
    result = context.run(func, *args)
             ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/prefect/.local/lib/python3.11/site-packages/prefect/utilities/asyncutils.py", line 243, in call_with_mark
    return call()
           ^^^^^^
  File "/home/prefect/.local/lib/python3.11/site-packages/prefect_gcp/workers/cloud_run_v2.py", line 714, in _create_job_and_wait_for_registration
    JobV2.create(
  File "/home/prefect/.local/lib/python3.11/site-packages/prefect_gcp/models/cloud_run_v2.py", line 167, in create
    response = request.execute()
               ^^^^^^^^^^^^^^^^^
  File "/home/prefect/.local/lib/python3.11/site-packages/googleapiclient/_helpers.py", line 130, in positional_wrapper
    return wrapped(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/prefect/.local/lib/python3.11/site-packages/googleapiclient/http.py", line 923, in execute
    resp, content = _retry_request(
                    ^^^^^^^^^^^^^^^
  File "/home/prefect/.local/lib/python3.11/site-packages/googleapiclient/http.py", line 191, in _retry_request
    resp, content = http.request(uri, method, *args, **kwargs)
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/prefect/.local/lib/python3.11/site-packages/google_auth_httplib2.py", line 209, in request
    self.credentials.before_request(self._request, method, uri, request_headers)
  File "/home/prefect/.local/lib/python3.11/site-packages/google/auth/credentials.py", line 239, in before_request
    self._blocking_refresh(request)
  File "/home/prefect/.local/lib/python3.11/site-packages/google/auth/credentials.py", line 202, in _blocking_refresh
    self.refresh(request)
  File "/home/prefect/.local/lib/python3.11/site-packages/google/auth/compute_engine/credentials.py", line 133, in refresh
    raise new_exc from caught_exc
google.auth.exceptions.RefreshError: ('Failed to retrieve http://metadata.google.internal/computeMetadata/v1/instance/service-accounts/default/token?scopes=https%3A%2F%2Fwww.googleapis.com%2Fauth%2Fcloud-platform from the Google Compute Engine metadata service. Status: 403 Response:\nb"Unable to generate access token; IAM returned 403 Forbidden: Permission \'iam.serviceAccounts.getAccessToken\' denied on resource (or it may not exist).\\nThis error could be caused by a missing IAM policy binding on the target IAM service account.\\nFor more information, refer to the Workload Identity documentation:\\n\\thttps://cloud.google.com/kubernetes-engine/docs/how-to/workload-identity#authenticating_to\\n\\n"', <google_auth_httplib2._Response object at 0x7982bb1b5d10>)
11:47:11.349 | INFO    | prefect.flow_runs.worker - Reported flow run 'cb2d6285-ea3a-496a-b0b8-a2a15d68abf4' as crashed: Flow run could not be submitted to infrastructure:
RefreshError(TransportError('Failed to retrieve http://metadata.google.internal/computeMetadata/v1/instance/service-accounts/default/token?scopes=https%3A%2F%2Fwww.googleapis.com%2Fauth%2Fcloud-platform from the Google Compute Engine metadata service. Status: 403 Response:\nb"Unable to generate access token; IAM returned 403 Forbidden: Permission \'iam.serviceAccounts.getAccessToken\' denied on resource (or it may not exist).\\nThis error could be caused by a missing IAM policy binding on the target IAM service account.\\nFor more information, refer to the Workload Identity documentation:\\n\\thttps://cloud.google.com/kubernetes-engine/docs/how-to/workload-identity#authenticating_to\\n\\n"', <google_auth_httplib2._Response object at 0x7982bb1b5d10>))
```
Wait for more info on the next message
m
Your message was too long, here's your message at the allowed limit: (truncated echo of the worker logs above)
d
@Marvin here is more info. Output of prefect work-pool inspect:
```
WorkPool(
    id='eab22801-5ef4-4e26-bb04-8476a9d5388d',
    created=DateTime(2025, 7, 25, 9, 47, 15, 851829, tzinfo=Timezone('UTC')),
    updated=DateTime(2025, 8, 6, 10, 41, 48, 678144, tzinfo=Timezone('UTC')),
    name='base',
    description='Main workpool for Prefect workloads',
    type='cloud-run-v2',
    base_job_template={
        'variables': {
            'type': 'object',
            'properties': {
                'cpu': {
                    'type': 'string',
                    'title': 'CPU',
                    'default': '1000m',
                    'description': 'The CPU to allocate to the Cloud Run job.'
                },
                'env': {
                    'type': 'object',
                    'title': 'Environment Variables',
                    'description': 'Environment variables to set when starting a flow
run.',
                    'additionalProperties': {'type': 'string'}
                },
                'args': {
                    'type': 'array',
                    'items': {'type': 'string'},
                    'title': 'Args',
                    'description': "The arguments to pass to the Cloud Run Job V2's
entrypoint command."
                },
                'image': {
                    'type': 'string',
                    'title': 'Image Name',
                    'default': 'prefecthq/prefect:2-latest',
                    'description': 'The image to use for the Cloud Run job. If not
provided the default Prefect image will be used.'
                },
                'memory': {
                    'type': 'string',
                    'title': 'Memory',
                    'default': '512Mi',
                    'example': '512Mi',
                    'pattern': '^\\d+(?:G|Gi|M|Mi)$',
                    'description': 'The memory to allocate to the Cloud Run job along
with the units, whichcould be: G, Gi, M, Mi.'
                },
                'region': {
                    'type': 'string',
                    'title': 'Region',
                    'default': 'europe-west6',
                    'description': 'The region in which to run the Cloud Run job'
                },
                'timeout': {
                    'type': 'integer',
                    'title': 'Job Timeout',
                    'default': 600,
                    'maximum': 86400,
                    'description': 'The length of time that Prefect will wait for a
Cloud Run Job to complete before raising an exception (maximum of 86400 seconds, 1
day).',
                    'exclusiveMinimum': 0
                },
                'keep_job': {
                    'type': 'boolean',
                    'title': 'Keep Job After Completion',
                    'default': True,
                    'description': 'Keep the completed Cloud run job on Google Cloud
Platform.'
                },
                'max_retries': {
                    'type': 'integer',
                    'title': 'Max Retries',
                    'default': 0,
                    'description': 'The number of times to retry the Cloud Run job.'
                }
            },
            'description': 'Default variables for the Cloud Run worker V2.\n\nThe
schema for this class is used to populate the `variables` section of the\ndefault base
job template.'
        },
        'job_configuration': {
            'env': '{{ env }}',
            'region': '{{ region }}',
            'timeout': '{{ timeout }}',
            'job_body': {
                'client': 'prefect',
                'template': {
                    'template': {
                        'timeout': '{{ timeout }}',
                        'vpcAccess': {
                            'connector':
'projects/my-net-project/locations/my-location/connectors/my-connector'
                        },
                        'containers': [
                            {
                                'env': [],
                                'args': '{{ args }}',
                                'image': '{{ image }}',
                                'resources': {
                                    'limits': {
                                        'cpu': '{{ cpu }}',
                                        'memory': '{{ memory }}'
                                    }
                                }
                            }
                        ],
                        'maxRetries': '{{ max_retries }}',
                        'serviceAccount': 'prefect@my-project.iam.gserviceaccount.com'
                    }
                },
                'launchStage': 'BETA'
            },
            'keep_job': '{{ keep_job }}'
        }
    },
    status=WorkPoolStatus.READY,
    storage_configuration=WorkPoolStorageConfiguration(),
    default_queue_id='b86a0d36-f11d-4190-ae8c-1b705b7ec91b'
)
```
Output of the env vars from the prefect-worker pod:
```
PREFECT_KUBERNETES_CLUSTER_UID=my-uid
PREFECT_API_AUTH_STRING=my-auth-string
PREFECT_API_ENABLE_HTTP2=true
PREFECT_WORKER_PREFETCH_SECONDS=10
PREFECT_WORKER_QUERY_SECONDS=5
PREFECT_API_URL=http://my-prefect.test.com/api
PREFECT_INTEGRATIONS_KUBERNETES_OBSERVER_NAMESPACES=prefect
PREFECT_DEBUG_MODE=false
PREFECT_SERVER_PORT_4200_TCP_PROTO=tcp
PREFECT_SERVER_PORT=tcp://my-internal-ip:4200
PREFECT_SERVER_PORT_4200_TCP_PORT=4200
PREFECT_SERVER_PORT_4200_TCP_ADDR=my-internal-ip
PREFECT_SERVER_SERVICE_HOST=my-internal-ip
PREFECT_SERVER_SERVICE_PORT=4200
PREFECT_SERVER_SERVICE_PORT_SERVER_SVC_PORT=4200
PREFECT_SERVER_PORT_4200_TCP=tcp://my-internal-ip:4200
```
m
thought for 12.8 seconds
Thank you for providing those details. I notice a few important things:

1. In your work pool configuration, I see that you have a service account specified in the `job_configuration`:
```yaml
'serviceAccount': 'prefect@my-project.iam.gserviceaccount.com'
```
2. Looking at your worker pod's environment variables, I notice that some important variables are missing:
- `GCP_REGION` (although this is specified in your work pool as 'europe-west6')
- `GOOGLE_CLOUD_PROJECT` (your GCP project ID)

The error you're seeing suggests that when the worker tries to create a Cloud Run job, it's not properly authenticating with GCP. Here's what I recommend:

1. Add these environment variables to your worker deployment (in your Helm values or Kubernetes manifest):
```yaml
env:
  - name: GOOGLE_CLOUD_PROJECT
    value: "your-project-id"
  - name: GCP_REGION
    value: "europe-west6"  # match your work pool region
```
2. Verify the service account permissions again. The service account needs these exact roles:
```
- roles/run.developer
- roles/run.invoker
- roles/iam.serviceAccountUser
```
3. In your work pool configuration, I notice you're using a VPC connector. Make sure the service account also has permission to use this VPC connector:
```
- roles/vpcaccess.user
```
4. You can also try adding some debug environment variables to get more information:
```yaml
env:
  - name: PREFECT_LOGGING_LEVEL
    value: "DEBUG"
  - name: GOOGLE_API_USE_CLIENT_CERTIFICATE
    value: "true"
```
5. Check if the Cloud Run API is enabled in your project:
```bash
gcloud services list | grep run.googleapis.com
```
If not, enable it:
```bash
gcloud services enable run.googleapis.com
```
Can you try adding those environment variables to your worker deployment and let me know if that resolves the issue? If not, with the debug logging enabled, we should get more detailed error messages that can help pinpoint the exact cause. Also, could you share a redacted version of the worker logs when it tries to create a Cloud Run job? This would help us see exactly where in the process it's failing.
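In the meantime, one way to isolate the failure from Prefect entirely is to reproduce the same Cloud Run v2 call the worker makes. Here is a hedged sketch (assumptions: it runs inside the worker pod, uses region `europe-west6` from your work pool, and uses the discovery-based client that your traceback shows `prefect_gcp` using):
```python
# list_jobs.py -- exercises the same auth + API path as the Cloud Run v2 worker
import google.auth
from googleapiclient.discovery import build

credentials, project = google.auth.default(
    scopes=["https://www.googleapis.com/auth/cloud-platform"]
)
# prefect-gcp drives Cloud Run v2 through google-api-python-client as well
service = build("run", "v2", credentials=credentials)
response = (
    service.projects()
    .locations()
    .jobs()
    .list(parent=f"projects/{project}/locations/europe-west6")
    .execute()
)
print([job["name"] for job in response.get("jobs", [])])
```
If this raises the same `RefreshError`, the issue is purely between the pod's identity and IAM; if it succeeds, the worker's job template (for example the pinned `serviceAccount`) is the next thing to look at.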
d
@Marvin I enabled the debug mode and set up the env vars as you showed. The logs from the worker are:
```
13:23:40.991 | DEBUG | prefect.utilities.services.critical_service_loop - Starting run of 'get_and_submit_flow_runs'
13:23:40.992 | DEBUG | prefect.workers.cloud-run-v2.cloudrunworkerv2 0c097b6f-d6e8-4364-8475-331cfd5da9e6 - Querying for flow runs scheduled before 2025-08-06 13:23:50.991839+00:00
13:23:41.023 | DEBUG | prefect.workers.cloud-run-v2.cloudrunworkerv2 0c097b6f-d6e8-4364-8475-331cfd5da9e6 - Discovered 1 scheduled_flow_runs
13:23:41.023 | INFO | prefect.flow_runs.worker - Worker 'CloudRunWorkerV2 0c097b6f-d6e8-4364-8475-331cfd5da9e6' submitting flow run 'cb2d6285-ea3a-496a-b0b8-a2a15d68abf4'
13:23:41.081 | DEBUG | prefect.workers.cloud-run-v2.cloudrunworkerv2 0c097b6f-d6e8-4364-8475-331cfd5da9e6 - Ready to submit cb2d6285-ea3a-496a-b0b8-a2a15d68abf4: True
13:23:41.182 | WARNING | prefect.flow_runs.worker - PREFECT_API_AUTH_STRING is provided as a plaintext environment variable. For better security, consider providing it as a secret using 'prefect_api_auth_string_secret' or 'env_from_secrets' in your base job template.
13:23:41.185 | DEBUG | prefect.events.clients - EventsClient(id=138237298386704): Emitting event id=01987f8d-43ff-7f5c-ae81-6e0fd4ec7684.
13:23:41.185 | DEBUG | prefect.events.clients - Added event id=01987f8d-43ff-7f5c-ae81-6e0fd4ec7684 to unconfirmed events list. There are now 3 unconfirmed events.
13:23:41.186 | DEBUG | prefect.events.clients - EventsClient(id=138237298386704): Emit reconnection attempt 0.
13:23:41.186 | DEBUG | prefect.events.clients - EventsClient(id=138237298386704): Sending event id=01987f8d-43ff-7f5c-ae81-6e0fd4ec7684.
13:23:41.187 | DEBUG | prefect.events.clients - EventsClient(id=138237298386704): Checkpointing event id=01987f8d-43ff-7f5c-ae81-6e0fd4ec7684.
13:23:41.229 | INFO | prefect.flow_runs.worker - Creating Cloud Run JobV2 tasteful-coati-754c241e2c2d4b829a180bc63f9d7a73
13:23:41.246 | WARNING | google_auth_httplib2 - httplib2 transport does not support per-request timeout. Set the timeout when constructing the httplib2.Http instance.
13:23:41.267 | WARNING | google_auth_httplib2 - httplib2 transport does not support per-request timeout. Set the timeout when constructing the httplib2.Http instance.
13:23:41.311 | ERROR | prefect.flow_runs.worker - Failed to submit flow run 'cb2d6285-ea3a-496a-b0b8-a2a15d68abf4' to infrastructure.
[traceback identical to the one in the previous message, ending in:]
google.auth.exceptions.RefreshError: ('Failed to retrieve http://metadata.google.internal/computeMetadata/v1/instance/service-accounts/default/token?scopes=https%3A%2F%2Fwww.googleapis.com%2Fauth%2Fcloud-platform from the Google Compute Engine metadata service. Status: 403 Response:\nb"Unable to generate access token; IAM returned 403 Forbidden: Permission \'iam.serviceAccounts.getAccessToken\' denied on resource (or it may not exist).\\nThis error could be caused by a missing IAM policy binding on the target IAM service account.\\nFor more information, refer to the Workload Identity documentation:\\n\\thttps://cloud.google.com/kubernetes-engine/docs/how-to/workload-identity#authenticating_to\\n\\n"', <google_auth_httplib2._Response object at 0x7db9e01a7f90>)
13:23:41.361 | INFO | prefect.flow_runs.worker - Reported flow run 'cb2d6285-ea3a-496a-b0b8-a2a15d68abf4' as crashed: Flow run could not be submitted to infrastructure: RefreshError(TransportError(...same 403 error as above...))
13:23:44.487 | DEBUG | prefect.utilities.services.critical_service_loop - Starting run of 'sync_with_backend'
13:23:44.531 | DEBUG | prefect.workers.cloud-run-v2.cloudrunworkerv2 0c097b6f-d6e8-4364-8475-331cfd5da9e6 - Worker synchronized with the Prefect API server.
13:23:46.786 | DEBUG | prefect.utilities.services.critical_service_loop - Starting run of 'get_and_submit_flow_runs'
13:23:46.786 | DEBUG | prefect.workers.cloud-run-v2.cloudrunworkerv2 0c097b6f-d6e8-4364-8475-331cfd5da9e6 - Querying for flow runs scheduled before 2025-08-06 13:23:56.786714+00:00
13:23:46.812 | DEBUG | prefect.workers.cloud-run-v2.cloudrunworkerv2 0c097b6f-d6e8-4364-8475-331cfd5da9e6 - Discovered 0 scheduled_flow_runs
```
m
Your message was too long, here's your message at the allowed limit: (truncated echo of the debug logs above)
d
@Marvin here are the worker logs. Wait for the next 4 messages
(chunk 1 of 4: a re-paste of the same worker debug logs shown above)
m
thought for 16.9 seconds
d
@Marvin (chunk 2 of 4: continuation of the same traceback shown above)
m
Error: status_code: 429, model_name: claude-3-5-sonnet-latest, body: {'type': 'error', 'error': {'type': 'rate_limit_error', 'message': 'Number of concurrent connections has exceeded your rate limit. Please try again later or contact sales at https://www.anthropic.com/contact-sales to discuss your options for a rate limit increase.'}}
d
@Marvin (chunk 3 of 4: continuation of the same traceback shown above)
m
Looking at the logs, I notice a few important things:
1. The worker successfully picks up the flow run and attempts to create a Cloud Run job named "tasteful-coati-754c241e2c2d4b829a180bc63f9d7a73"
2. The critical error is:
```
Permission 'iam.serviceAccounts.getAccessToken' denied on resource (or it may not exist)
```
This specific error occurs when the Workload Identity setup is incomplete. Even though you mentioned Workload Identity is working, there seems to be an issue with the token generation step. Let's verify the complete Workload Identity setup:

1. First, check if the Kubernetes service account has the correct annotation:
```bash
kubectl get serviceaccount prefect-worker -n your-namespace -o yaml
```
It should have:
```yaml
metadata:
  annotations:
    iam.gke.io/gcp-service-account: prefect@my-project.iam.gserviceaccount.com
```
2. Verify the IAM binding includes the exact format:
```bash
gcloud iam service-accounts get-iam-policy prefect@my-project.iam.gserviceaccount.com
```
It should show:
```yaml
bindings:
- members:
  - serviceAccount:my-project.svc.id.goog[your-namespace/prefect-worker]
  role: roles/iam.workloadIdentityUser
```
3. Add this specific role to your GCP service account if not already present:
```bash
gcloud projects add-iam-policy-binding my-project \
    --member="serviceAccount:prefect@my-project.iam.gserviceaccount.com" \
    --role="roles/iam.serviceAccountTokenCreator"
```
4. In your Kubernetes deployment for the worker, ensure you have:
```yaml
spec:
  template:
    spec:
      serviceAccountName: prefect-worker  # Must match the K8s service account name
      nodeSelector:
        iam.gke.io/gke-metadata-server-enabled: "true"
```
5. Also, try adding these environment variables to your worker deployment:
```yaml
env:
  - name: GOOGLE_APPLICATION_CREDENTIALS
    value: "/var/run/secrets/tokens/gcp-token"
  - name: CLOUDSDK_AUTH_CREDENTIAL_FILE_OVERRIDE
    value: "/var/run/secrets/tokens/gcp-token"
```
Can you verify these settings and let me know if any are missing or different in your setup? Also, could you try running a test pod with the same service account to verify the Workload Identity is working:
```bash
kubectl run -it workload-identity-test \
  --image=google/cloud-sdk:slim \
  --namespace=your-namespace \
  --overrides='{"spec": {"serviceAccountName": "prefect-worker"}}' \
  -- /bin/bash

# Inside the pod run:
gcloud auth list
gcloud projects list
```
This will help us confirm if the issue is specific to the Prefect worker or if it's a general Workload Identity configuration issue.
thought for 14.9 seconds
d
@Marvin (chunk 4 of 4: the end of the same traceback, the final RefreshError, the 'Reported flow run ... as crashed' entry, and the subsequent polling log lines, all shown above)
m
thought for 18.9 seconds
thought for 13.8 seconds
This is the continuation of the error stack trace, and it's showing the authentication chain that's failing. The error is occurring during the credential refresh process when trying to make a request to Cloud Run. Let's try a different approach. Since we know the Workload Identity is working at the pod level (as you mentioned), but the Cloud Run worker is having issues, we can try to modify how the worker authenticates.

1. First, let's modify your work pool configuration. In your base job template, try removing the explicit service account specification and let it use the default compute service account:
```bash
# Edit your base job template JSON so that
# job_configuration.job_body.template.template no longer contains the
# hardcoded "serviceAccount" key (note: JSON does not allow comments),
# then apply the updated template:
prefect work-pool update base --base-job-template base-job-template.json
```
2. Add these specific environment variables to your worker deployment:
```yaml
env:
  - name: GOOGLE_CLOUD_PROJECT
    value: "your-project-id"
  - name: USE_GKE_GCLOUD_AUTH_PLUGIN
    value: "True"
  - name: GOOGLE_APPLICATION_CREDENTIALS
    value: "/var/run/secrets/kubernetes.io/serviceaccount/token"
```
3. Also, let's verify the worker's service account has all the necessary roles. Run these commands:
```bash
# Add Cloud Run Admin role
gcloud projects add-iam-policy-binding your-project-id \
    --member="serviceAccount:prefect@my-project.iam.gserviceaccount.com" \
    --role="roles/run.admin"

# Add Service Account User role
gcloud projects add-iam-policy-binding your-project-id \
    --member="serviceAccount:prefect@my-project.iam.gserviceaccount.com" \
    --role="roles/iam.serviceAccountUser"

# Add Service Account Token Creator role
gcloud projects add-iam-policy-binding your-project-id \
    --member="serviceAccount:prefect@my-project.iam.gserviceaccount.com" \
    --role="roles/iam.serviceAccountTokenCreator"
```
4. In your worker pod, you can try to verify the authentication is working by running this Python code:
Copy code
python
from google.auth import default
from google.cloud import run_v2

credentials, project = default()
client = run_v2.JobsClient(credentials=credentials)

# Try to list jobs
parent = f"projects/{project}/locations/europe-west6"
jobs = client.list_jobs(parent=parent)
Can you:
1. Apply these changes to your worker deployment
2. Verify all the IAM roles are properly set
3. Try running a flow again
If it still fails, can you run the Python code snippet above in your worker pod and let me know what output/error you get? This will help us understand exactly where the authentication chain is breaking.
d
@Marvin those are all the logs, what can I do to fix it?
m
Error: status_code: 429, model_name: claude-3-5-sonnet-latest, body: {'type': 'error', 'error': {'type': 'rate_limit_error', 'message': 'Number of concurrent connections has exceeded your rate limit. Please try again later or contact sales at https://www.anthropic.com/contact-sales to discuss your options for a rate limit increase.'}}
thought for 20.6 seconds
Looking at the continued logs, it's clear that the worker is requesting a token for the "default" service account on the metadata server:
```
http://metadata.google.internal/computeMetadata/v1/instance/service-accounts/default/token
```
Note that under Workload Identity the "default" path is expected: the GKE metadata server maps it to whichever GCP service account is bound to the pod's Kubernetes service account. The 403 therefore means IAM is refusing to mint a token for that mapped account, which points back at the binding or at which service account the pod is actually using. Let's fix this:

1. First, check your Kubernetes service account annotation to ensure it's exactly correct:
```bash
kubectl get serviceaccount prefect-worker -n your-namespace -o yaml
```
2. In your work pool configuration, I notice you have the service account hardcoded in the job template. Instead, let's remove it so the job falls back to the default service account. Update your work pool configuration:
Copy code
bash
cat > base-job-template.json << 'EOF'
{
  "job_configuration": {
    "job_body": {
      "template": {
        "template": {
          "serviceAccount": null,
          "containers": [{
            "image": "{{ image }}",
            "resources": {
              "limits": {
                "cpu": "{{ cpu }}",
                "memory": "{{ memory }}"
              }
            }
          }]
        }
      }
    }
  }
}
EOF
prefect work-pool update base --base-job-template base-job-template.json
3. Add these specific environment variables to your worker deployment:
Copy code
yaml
spec:
  template:
    spec:
      serviceAccountName: prefect-worker  # Your K8s service account name
      containers:
        - name: prefect-worker
          env:
            - name: GOOGLE_CLOUD_PROJECT
              value: "your-project-id"
            - name: USE_GKE_GCLOUD_AUTH_PLUGIN
              value: "True"
            - name: CLOUDSDK_CORE_PROJECT
              value: "your-project-id"
4. Verify the IAM roles on your GCP service account:
Copy code
bash
gcloud iam service-accounts get-iam-policy prefect@my-project.iam.gserviceaccount.com
Make sure it has these exact roles:
Copy code
bash
gcloud projects add-iam-policy-binding your-project-id \
    --member="serviceAccount:prefect@my-project.iam.gserviceaccount.com" \
    --role="roles/iam.serviceAccountTokenCreator"

gcloud projects add-iam-policy-binding your-project-id \
    --member="serviceAccount:prefect@my-project.iam.gserviceaccount.com" \
    --role="roles/run.developer"

# Note: roles/iam.workloadIdentityUser is granted on the service account
# itself (next step), with the Kubernetes identity as the member -- not
# at the project level.
5. Verify the Workload Identity binding:
Copy code
bash
gcloud iam service-accounts add-iam-policy-binding \
    prefect@my-project.iam.gserviceaccount.com \
    --role="roles/iam.workloadIdentityUser" \
    --member="serviceAccount:your-project-id.svc.id.goog[your-namespace/prefect-worker]"
6. Test the Workload Identity setup:
Copy code
bash
# Note: newer kubectl versions removed the --serviceaccount flag; if yours
# has, use --overrides='{"spec":{"serviceAccountName":"prefect-worker"}}'.
kubectl run -it --rm \
    --serviceaccount=prefect-worker \
    --image=google/cloud-sdk:slim \
    --namespace=your-namespace \
    workload-identity-test -- bash -c \
    'gcloud auth list && gcloud projects list'
The key changes here are: 1. Removing the hardcoded service account from the work pool configuration 2. Adding the necessary environment variables to the worker deployment 3. Ensuring all IAM roles are correctly set 4. Verifying the Workload Identity binding is correct Can you try these changes and let me know if the error persists? Also, could you share: 1. The output of
kubectl get serviceaccount prefect-worker -n your-namespace -o yaml
2. The output of the test command in step 6?
I see the full error trace now. The issue is happening at the point where the worker tries to execute the Cloud Run job creation request. Let's verify a few specific Cloud Run-related permissions and configurations: 1. First, make sure your GCP service account has these specific Cloud Run permissions:
Copy code
bash
gcloud projects add-iam-policy-binding your-project-id \
    --member="serviceAccount:prefect@my-project.iam.gserviceaccount.com" \
    --role="roles/run.developer"

gcloud projects add-iam-policy-binding your-project-id \
    --member="serviceAccount:prefect@my-project.iam.gserviceaccount.com" \
    --role="roles/run.invoker"
2. Check if the Cloud Run API is enabled:
Copy code
bash
gcloud services enable run.googleapis.com
3. In your work pool configuration, I notice you have a VPC connector specified. Make sure: - The VPC connector exists and is properly configured - The service account has permission to use the VPC connector:
Copy code
bash
gcloud projects add-iam-policy-binding your-project-id \
    --member="serviceAccount:prefect@my-project.iam.gserviceaccount.com" \
    --role="roles/vpcaccess.user"
4. Let's add more debugging information. Job variables like these are most easily set per deployment, for example in your prefect.yaml (the deployment name here is a placeholder):
Copy code
yaml
deployments:
  - name: my-deployment  # your deployment name
    work_pool:
      name: base
      job_variables:
        region: "europe-west6"
        image: "prefecthq/prefect:2-latest"
        cpu: "1000m"
        memory: "512Mi"
        env:
          PREFECT_LOGGING_LEVEL: "DEBUG"
          GOOGLE_CLOUD_PROJECT: "your-project-id"
5. Can you also try running a simple test to verify the service account can create Cloud Run jobs directly? Create a test pod:
Copy code
bash
kubectl run -it \
  --serviceaccount=prefect-worker \
  --image=google/cloud-sdk:slim \
  --namespace=your-namespace \
  cloud-run-test -- /bin/bash

# Inside the pod:
gcloud auth list
gcloud run jobs create test-job \
  --image=gcr.io/cloudrun/hello \
  --region=europe-west6
If this test fails with a similar error, it would confirm that the issue is with the service account permissions rather than the Prefect worker configuration. Can you: 1. Confirm all these permissions are in place 2. Try the test pod approach 3. Share the output of:
Copy code
bash
gcloud iam service-accounts get-iam-policy prefect@my-project.iam.gserviceaccount.com
This will help us determine if it's a permissions issue or a configuration issue with the worker itself.
Looking at the complete error stack trace, this is a classic Workload Identity token issue. Even though the pod can access GCP resources, the Cloud Run worker is having trouble getting a token for creating Cloud Run jobs. Here's what to do: 1. First, add these environment variables to your worker deployment (via Helm values or K8s manifest):
Copy code
yaml
env:
  - name: GOOGLE_CLOUD_PROJECT
    value: "your-project-id"  # Replace with your actual project ID
  - name: CLOUDSDK_CORE_PROJECT
    value: "your-project-id"  # Same as above
  - name: USE_GKE_GCLOUD_AUTH_PLUGIN
    value: "True"
  # Do not set GOOGLE_APPLICATION_CREDENTIALS to the Kubernetes token file;
  # it is not a Google credentials file, and Workload Identity needs none.
2. Modify your GCP service account permissions. Run these commands:
Copy code
bash
# Replace with your actual service account and project
export SA_EMAIL="<mailto:prefect@my-project.iam.gserviceaccount.com|prefect@my-project.iam.gserviceaccount.com>"
export PROJECT_ID="your-project-id"

# Add necessary roles
gcloud projects add-iam-policy-binding $PROJECT_ID \
    --member="serviceAccount:$SA_EMAIL" \
    --role="roles/iam.serviceAccountTokenCreator"

gcloud projects add-iam-policy-binding $PROJECT_ID \
    --member="serviceAccount:$SA_EMAIL" \
    --role="roles/run.admin"

# This is important - add permission to act as itself
gcloud iam service-accounts add-iam-policy-binding $SA_EMAIL \
    --member="serviceAccount:$SA_EMAIL" \
    --role="roles/iam.serviceAccountUser"
3. Update your Kubernetes service account to explicitly specify the GCP service account:
Copy code
yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: prefect-worker
  annotations:
    iam.gke.io/gcp-service-account: "prefect@my-project.iam.gserviceaccount.com"
4. Make sure your worker deployment is using this service account:
Copy code
yaml
spec:
  template:
    spec:
      serviceAccountName: prefect-worker
5. Update your work pool configuration to use the default compute service account for the Cloud Run jobs:
Copy code
bash
cat > base-job-template.json << 'EOF'
{
  "job_configuration": {
    "job_body": {
      "template": {
        "template": {
          "serviceAccount": null
        }
      }
    }
  }
}
EOF
prefect work-pool update base --base-job-template base-job-template.json
After making these changes: 1. Delete the existing worker pod to force a new one with the updated configuration 2. Try running a flow again If you still see issues, can you run this diagnostic command in the worker pod:
Copy code
bash
kubectl exec -it <worker-pod> -- bash -c "curl -H 'Metadata-Flavor: Google' http://metadata.google.internal/computeMetadata/v1/instance/service-accounts/default/token"
This will help us verify if the pod can properly obtain GCP tokens. The key difference in these changes is: 1. We're relying on Workload Identity for credentials instead of pointing GOOGLE_APPLICATION_CREDENTIALS at a key file 2. We're ensuring the service account has permission to create tokens 3. We're removing the explicit service account from the Cloud Run job template 4. We're adding the service account's ability to act as itself These changes should resolve the token generation issue you're seeing.
d
@Marvin Curl is not a part of the prefect-worker pod
m
thought for 10.3 seconds
Ah, you're right. Let's try a Python-based diagnostic test instead. Create a file called test_auth.py and copy it into the worker pod:
Copy code
bash
# Create the test file
cat << 'EOF' > test_auth.py
from google.auth import default
from google.cloud import run_v2
import google.auth.transport.requests

def test_auth():
    print("1. Getting default credentials...")
    credentials, project = default()
    
    print(f"2. Project ID: {project}")
    
    print("3. Refreshing credentials...")
    request = google.auth.transport.requests.Request()
    credentials.refresh(request)
    
    print("4. Token info:", credentials.token)
    
    print("5. Creating Cloud Run client...")
    client = run_v2.JobsClient(credentials=credentials)
    
    print("6. Trying to list jobs...")
    parent = f"projects/{project}/locations/europe-west6"
    jobs = client.list_jobs(parent=parent)
    for job in jobs:
        print(f"Found job: {job.name}")

if __name__ == "__main__":
    test_auth()
EOF

# Copy it to the pod
kubectl cp test_auth.py <pod-name>:/tmp/test_auth.py -n <namespace>

# Execute it in the pod
kubectl exec -it <pod-name> -n <namespace> -- python /tmp/test_auth.py
This will help us diagnose exactly where the authentication is failing. Can you run this and share the output? Also, let's verify the current IAM bindings for your service account. Run this outside the pod:
Copy code
bash
# Replace with your actual service account email
export SA_EMAIL="<mailto:prefect@my-project.iam.gserviceaccount.com|prefect@my-project.iam.gserviceaccount.com>"

# Check IAM bindings
gcloud iam service-accounts get-iam-policy $SA_EMAIL
And check the Kubernetes service account configuration:
Copy code
bash
kubectl get serviceaccount prefect-worker -n <namespace> -o yaml
Please share the outputs of: 1. The Python test script 2. The IAM policy check 3. The Kubernetes service account configuration This will help us pinpoint exactly where the authentication chain is breaking.
d
@Marvin 1. the script gives an error
Copy code
Traceback (most recent call last):
  File "/tmp/test_auth.py", line 2, in <module>
    from google.cloud import run_v2
ImportError: cannot import name 'run_v2' from 'google.cloud' (unknown location)
command terminated with exit code 1
2.
Copy code
yaml
bindings:
- members:
  - serviceAccount:my-project.svc.id.goog[prefect/prefect-worker-base]
  role: roles/iam.workloadIdentityUser
etag: BwY7tDND5N4=
version: 1
3.
Copy code
apiVersion: v1
kind: ServiceAccount
metadata:
  annotations:
    iam.gke.io/gcp-service-account: prefect@my-project.iam.gserviceaccount.com
    meta.helm.sh/release-name: prefect-worker-base
    meta.helm.sh/release-namespace: prefect
  creationTimestamp: "2025-08-06T15:49:24Z"
  labels:
    app.kubernetes.io/component: worker
    app.kubernetes.io/instance: prefect-worker-base
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: prefect-worker
    app.kubernetes.io/version: 3.4.10
    helm.sh/chart: prefect-worker-2025.7.28164749
    prefect-version: 3.4.10
  name: prefect-worker-base
  namespace: prefect
  resourceVersion: "1754495364822991003"
  uid: 5ce72423-d044-428c-a9d9-48b1cd037ac8
m
thought for 15.5 seconds
I notice a few important things from your outputs: 1. The Python script failed because the google-cloud-run package isn't installed in the worker pod. 2. Your Workload Identity binding member is [prefect/prefect-worker-base], which does match your Kubernetes service account name (prefect-worker-base) and namespace (prefect), so that binding looks correct. 3. To rule out a stale or mistyped binding anyway, you can re-apply it (the command is idempotent):
Copy code
bash
gcloud iam service-accounts add-iam-policy-binding \
    prefect@my-project.iam.gserviceaccount.com \
    --role="roles/iam.workloadIdentityUser" \
    --member="serviceAccount:my-project.svc.id.goog[prefect/prefect-worker-base]"
Next, add these environment variables to your Helm values or worker deployment:
Copy code
yaml
env:
  - name: GOOGLE_CLOUD_PROJECT
    value: "your-project-id"  # Replace with your actual project ID
  - name: CLOUDSDK_CORE_PROJECT
    value: "your-project-id"  # Same as above
Also, make sure your GCP service account has all required roles:
Copy code
bash
export PROJECT_ID="your-project-id"  # Replace with your project
export SA_EMAIL="<mailto:prefect@my-project.iam.gserviceaccount.com|prefect@my-project.iam.gserviceaccount.com>"

# Add all necessary roles
gcloud projects add-iam-policy-binding $PROJECT_ID \
    --member="serviceAccount:$SA_EMAIL" \
    --role="roles/run.admin"

gcloud projects add-iam-policy-binding $PROJECT_ID \
    --member="serviceAccount:$SA_EMAIL" \
    --role="roles/iam.serviceAccountUser"

gcloud projects add-iam-policy-binding $PROJECT_ID \
    --member="serviceAccount:$SA_EMAIL" \
    --role="roles/iam.serviceAccountTokenCreator"

gcloud projects add-iam-policy-binding $PROJECT_ID \
    --member="serviceAccount:$SA_EMAIL" \
    --role="roles/vpcaccess.user"
Let's create a simpler test script that doesn't require additional packages:
Copy code
python
from google.auth import default
import google.auth.transport.requests

def test_auth():
    print("1. Getting default credentials...")
    credentials, project = default()
    print(f"2. Project ID: {project}")
    
    print("3. Refreshing credentials...")
    request = google.auth.transport.requests.Request()
    credentials.refresh(request)
    
    print("4. Token info:", credentials.token[:20] + "...")  # Only show start of token

if __name__ == "__main__":
    test_auth()
Can you: 1. Run the updated IAM commands 2. Apply the environment variables to your worker deployment 3. Try the new test script 4. Attempt to run a flow again Also, please share your work pool configuration:
Copy code
bash
prefect work-pool inspect base
This will help us ensure the Cloud Run job configuration is correct as well.
d
@Marvin 3. The script returns this:
Copy code
1. Getting default credentials...
2. Project ID: my-project
3. Refreshing credentials...
4. Token info: jknkjnvbbbkjjknegrfg...
Workpool configs: WorkPool( id='eab22801-5ef4-4e26-bb04-8476a9d5388d', created=DateTime(2025, 7, 25, 9, 47, 15, 851829, tzinfo=Timezone('UTC')), updated=DateTime(2025, 8, 6, 14, 24, 50, 754629, tzinfo=Timezone('UTC')), name='base', description='Main workpool for Prefect workloads', type='cloud-run-v2', base_job_template={ 'variables': { 'type': 'object', 'properties': { 'cpu': { 'type': 'string', 'title': 'CPU', 'default': '1000m', 'description': 'The CPU to allocate to the Cloud Run job.' }, 'env': { 'type': 'object', 'title': 'Environment Variables', 'description': 'Environment variables to set when starting a flow run.', 'additionalProperties': {'type': 'string'} }, 'args': { 'type': 'array', 'items': {'type': 'string'}, 'title': 'Args', 'description': "The arguments to pass to the Cloud Run Job V2's entrypoint command." }, 'image': { 'type': 'string', 'title': 'Image Name', 'default': 'prefecthq/prefect:2-latest', 'description': 'The image to use for the Cloud Run job. If not provided the default Prefect image will be used.' }, 'memory': { 'type': 'string', 'title': 'Memory', 'default': '512Mi', 'example': '512Mi', 'pattern': '^\\d+(?:G|Gi|M|Mi)$', 'description': 'The memory to allocate to the Cloud Run job along with the units, whichcould be: G, Gi, M, Mi.' }, 'region': { 'type': 'string', 'title': 'Region', 'default': 'my-region', 'description': 'The region in which to run the Cloud Run job' }, 'timeout': { 'type': 'integer', 'title': 'Job Timeout', 'default': 600, 'maximum': 86400, 'description': 'The length of time that Prefect will wait for a Cloud Run Job to complete before raising an exception (maximum of 86400 seconds, 1 day).', 'exclusiveMinimum': 0 }, 'keep_job': { 'type': 'boolean', 'title': 'Keep Job After Completion', 'default': True, 'description': 'Keep the completed Cloud run job on Google Cloud Platform.' }, 'max_retries': { 'type': 'integer', 'title': 'Max Retries', 'default': 0, 'description': 'The number of times to retry the Cloud Run job.' } }, 'description': 'Default variables for the Cloud Run worker V2.\n\nThe schema for this class is used to populate the
variables
section of the\ndefault base job template.' }, 'job_configuration': { 'env': '{{ env }}', 'region': '{{ region }}', 'timeout': '{{ timeout }}', 'job_body': { 'client': 'prefect', 'template': { 'template': { 'timeout': '{{ timeout }}', 'vpcAccess': { 'connector': 'projects/my-net-proj/locations/my-location/connectors/name' }, 'containers': [ { 'env': [], 'args': '{{ args }}', 'image': '{{ image }}', 'resources': { 'limits': { 'cpu': '{{ cpu }}', 'memory': '{{ memory }}' } } } ], 'maxRetries': '{{ max_retries }}', 'serviceAccount': None } }, 'launchStage': 'BETA' }, 'keep_job': '{{ keep_job }}' } }, status=WorkPoolStatus.READY, storage_configuration=WorkPoolStorageConfiguration(), default_queue_id='b86a0d36-f11d-4190-ae8c-1b705b7ec91b' ) Triggering a flow again, now I get this error: googleapiclient.errors.HttpError: <HttpError 403 when requesting https://run.googleapis.com/v2/projects/my-project/locations/my-location/jobs?jobId=devout-turtle-69e6f41daa3340c99dec962e89315651&alt=json returned "The caller does not have permission". Details: "The caller does not have permission">
m
thought for 12.8 seconds
Ah, now we're getting somewhere! The authentication is working (as shown by your test script getting a token), but the service account doesn't have the correct permissions to create Cloud Run jobs. Let's fix the permissions. Run these commands:
Copy code
bash
# Replace these with your actual values
export PROJECT_ID="my-project"
export SA_EMAIL="<mailto:prefect@my-project.iam.gserviceaccount.com|prefect@my-project.iam.gserviceaccount.com>"

# Add Cloud Run Admin role (this is needed for creating jobs)
gcloud projects add-iam-policy-binding $PROJECT_ID \
    --member="serviceAccount:$SA_EMAIL" \
    --role="roles/run.admin"

# Add Service Account User role (needed to act as the service account)
gcloud projects add-iam-policy-binding $PROJECT_ID \
    --member="serviceAccount:$SA_EMAIL" \
    --role="roles/iam.serviceAccountUser"

# Add VPC Access User role (since you're using a VPC connector)
gcloud projects add-iam-policy-binding $PROJECT_ID \
    --member="serviceAccount:$SA_EMAIL" \
    --role="roles/vpcaccess.user"
Also, I notice in your work pool configuration that: 1. The region in the job configuration (my-location) doesn't match the region variable default (my-region) 2. The serviceAccount is set to None. Let's update your work pool configuration:
Copy code
bash
cat > base-job-template.json << 'EOF'
{
  "job_configuration": {
    "region": "{{ region }}",
    "job_body": {
      "template": {
        "template": {
          "serviceAccount": "prefect@my-project.iam.gserviceaccount.com",
          "vpcAccess": {
            "connector": "projects/my-net-proj/locations/my-location/connectors/name"
          }
        }
      }
    }
  }
}
EOF
prefect work-pool update base --base-job-template base-job-template.json
Also make sure the region variable's default in the base job template (my-region in your inspect output) matches your actual location, my-location; there's no dedicated CLI flag for a single job variable, so change the default in the template JSON.
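Before re-running the flow, you can also try to reproduce the exact create-job call the worker makes, to isolate the 403 from Prefect itself. A rough sketch, assuming google-cloud-run is installed in the pod (pip install google-cloud-run) and using placeholder project, region, and job names:
Copy code
python
from google.cloud import run_v2

client = run_v2.JobsClient()  # uses Application Default Credentials

# A throwaway job definition mirroring what the worker submits
job = run_v2.Job(
    template=run_v2.ExecutionTemplate(
        template=run_v2.TaskTemplate(
            containers=[run_v2.Container(image="prefecthq/prefect:2-latest")]
        )
    )
)
operation = client.create_job(
    parent="projects/my-project/locations/my-location",  # placeholders
    job=job,
    job_id="prefect-permission-probe",
)
print(operation.result().name)  # blocks until the job resource is created
If this raises the same 403, the problem is purely on the IAM side and has nothing to do with the worker's configuration.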
After making these changes: 1. Delete the existing worker pod to force a refresh 2. Try running your flow again The key changes are: 1. Adding the explicit Cloud Run Admin role 2. Setting the service account explicitly in the job template 3. Ensuring the regions match between the configuration and the variable 4. Adding the VPC Access User role for the VPC connector Can you apply these changes and let me know if you still get the permission error?
d
@Marvin nothing works, the flow still crashes with the same error
m
thought for 18.5 seconds
Let's try a different approach since we're still getting the same error. The issue might be with how the Cloud Run worker is trying to authenticate. 1. First, let's modify your work pool configuration to use the default compute service account instead of specifying one:
Copy code
bash
cat > base-job-template.json << 'EOF'
{
  "job_configuration": {
    "job_body": {
      "template": {
        "template": {
          "serviceAccount": "",
          "vpcAccess": {
            "connector": "projects/my-net-proj/locations/my-location/connectors/name"
          },
          "containers": [
            {
              "env": [],
              "resources": {
                "limits": {
                  "cpu": "{{ cpu }}",
                  "memory": "{{ memory }}"
                }
              }
            }
          ]
        }
      }
    }
  }
}
EOF
prefect work-pool update base --base-job-template base-job-template.json
2. Add these specific environment variables to your worker deployment (via Helm values):
Copy code
yaml
env:
  - name: GOOGLE_CLOUD_PROJECT
    value: "my-project"  # Your actual project ID
  - name: USE_GKE_GCLOUD_AUTH_PLUGIN
    value: "True"
3. Let's verify all IAM permissions are correctly set:
Copy code
bash
export PROJECT_ID="my-project"
export SA_EMAIL="<mailto:prefect@my-project.iam.gserviceaccount.com|prefect@my-project.iam.gserviceaccount.com>"

# Remove all existing bindings first
for role in roles/run.admin roles/run.developer roles/run.invoker roles/iam.serviceAccountUser roles/iam.serviceAccountTokenCreator roles/vpcaccess.user; do
    gcloud projects remove-iam-policy-binding $PROJECT_ID \
        --member="serviceAccount:$SA_EMAIL" \
        --role="$role"
done

# Add them back one by one
gcloud projects add-iam-policy-binding $PROJECT_ID \
    --member="serviceAccount:$SA_EMAIL" \
    --role="roles/run.admin"

gcloud projects add-iam-policy-binding $PROJECT_ID \
    --member="serviceAccount:$SA_EMAIL" \
    --role="roles/iam.serviceAccountUser"

gcloud projects add-iam-policy-binding $PROJECT_ID \
    --member="serviceAccount:$SA_EMAIL" \
    --role="roles/vpcaccess.user"

# Add workload identity binding
gcloud iam service-accounts add-iam-policy-binding $SA_EMAIL \
    --role="roles/iam.workloadIdentityUser" \
    --member="serviceAccount:$PROJECT_ID.svc.id.goog[prefect/prefect-worker-base]"
4. Verify the Cloud Run API is enabled:
Copy code
bash
gcloud services enable run.googleapis.com
5. After applying these changes:
Copy code
bash
# Delete the worker pod to force a refresh
kubectl delete pod -l app.kubernetes.io/name=prefect-worker -n prefect

# Wait for the new pod to be ready
kubectl wait --for=condition=ready pod -l app.kubernetes.io/name=prefect-worker -n prefect
6. Before trying your flow, let's run this test in the worker pod:
Copy code
python
from google.auth import default
from google.auth.transport import requests
import json

credentials, project = default()
request = requests.Request()
credentials.refresh(request)

print("Project:", project)
print("Token type:", credentials.token_type)
print("Scopes:", credentials.scopes)
print("Valid:", credentials.valid)
print("Expired:", credentials.expired)
Can you: 1. Apply these changes 2. Run the test script 3. Try your flow again If it still fails, please share: 1. The output of the test script 2. The exact error message from the flow run 3. The output of
kubectl describe pod <worker-pod-name> -n prefect
This will help us understand if there's an issue with the token scope or if there's another configuration problem we need to address.
d
@Marvin the script does not work
Copy code
Traceback (most recent call last):
  File "/tmp/test_auth.py", line 10, in <module>
    print("Token type:", credentials.token_type)
                         ^^^^^^^^^^^^^^^^^^^^^^
AttributeError: 'Credentials' object has no attribute 'token_type'. Did you mean: 'token_state'?
command terminated with exit code 1
m
thought for 14.7 seconds
Let's modify the test script to show the information we can actually access. Here's an updated version:
Copy code
python
from google.auth import default
from google.auth.transport import requests
import json

def test_auth():
    try:
        print("1. Getting default credentials...")
        credentials, project = default()
        print(f"2. Project ID: {project}")
        
        print("3. Credential type:", type(credentials).__name__)
        
        print("4. Refreshing credentials...")
        request = requests.Request()
        credentials.refresh(request)
        
        print("5. Token available:", bool(credentials.token))
        print("6. Token (first 20 chars):", credentials.token[:20] if credentials.token else "No token")
        
        print("7. Credential attributes:", dir(credentials))
        
    except Exception as e:
        print(f"Error occurred: {type(e).__name__} - {str(e)}")

if __name__ == "__main__":
    test_auth()
Also, let's try a different approach with the worker configuration. Update your work pool with this configuration:
Copy code
bash
cat > base-job-template.json << 'EOF'
{
  "job_configuration": {
    "job_body": {
      "template": {
        "template": {
          "containers": [
            {
              "image": "{{ image }}",
              "resources": {
                "limits": {
                  "cpu": "{{ cpu }}",
                  "memory": "{{ memory }}"
                }
              },
              "env": [
                {
                  "name": "PREFECT_API_URL",
                  "value": "{{ prefect.api.url }}"
                },
                {
                  "name": "PREFECT_API_KEY",
                  "value": "{{ prefect.api.key }}"
                }
              ]
            }
          ],
          "vpcAccess": {
            "connector": "projects/my-net-proj/locations/my-location/connectors/name"
          }
        }
      }
    },
    "region": "my-location"
  }
}
EOF
prefect work-pool update base --base-job-template base-job-template.json
And update your worker deployment with these environment variables (via Helm values):
Copy code
yaml
worker:
  env:
    - name: GOOGLE_CLOUD_PROJECT
      value: "my-project"  # Your actual project ID
    - name: CLOUDSDK_CORE_PROJECT
      value: "my-project"  # Same as above
    - name: USE_GKE_GCLOUD_AUTH_PLUGIN
      value: "True"
Can you: 1. Apply the new work pool configuration 2. Update your worker deployment with the new environment variables 3. Run the updated test script 4. Try running a flow again Please share: 1. The output of the new test script 2. Any new error messages you see when running the flow This will help us understand exactly what kind of credentials the worker is getting and how we might need to adjust the configuration.
d
@Marvin 1:
Copy code
1. Getting default credentials...
2. Project ID: rad-data-processing-qa
3. Credential type: Credentials
4. Refreshing credentials...
5. Token available: True
6. Token (first 20 chars): jknkjsndfjhbkjjknegr...
7. Credential attributes: ['__abstractmethods__', '__class__', '__delattr__', '__dict__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__getstate__', '__gt__', '__hash__', '__init__', '__init_subclass__', '__le__', '__lt__', '__module__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', '__weakref__', '_abc_impl', '_apply', '_blocking_refresh', '_default_scopes', '_metric_header_for_usage', '_non_blocking_refresh', '_quota_project_id', '_refresh_worker', '_retrieve_info', '_scopes', '_service_account_email', '_trust_boundary', '_universe_domain', '_universe_domain_cached', '_use_non_blocking_refresh', 'apply', 'before_request', 'default_scopes', 'expired', 'expiry', 'get_cred_info', 'has_scopes', 'quota_project_id', 'refresh', 'requires_scopes', 'scopes', 'service_account_email', 'token', 'token_state', 'universe_domain', 'valid', 'with_non_blocking_refresh', 'with_quota_project', 'with_quota_project_from_environment', 'with_scopes', 'with_universe_domain']
2: still the same error
Copy code
Reported flow run '75b406be-439d-4b66-aa43-604709059d62' as crashed: Flow run could not be submitted to infrastructure: <HttpError 403 when requesting https://run.googleapis.com/v2/projects/my-project/locations/my-location/jobs?jobId=devout-turtle-25d22a6342bb40fd91fe4978fd88ab5e&alt=json returned "The caller does not have permission". Details: "The caller does not have permission">
m
thought for 18.1 seconds
The test script shows that we're getting valid credentials, but the permissions are still not correct. Let's try a more comprehensive approach to the permissions (see also the permission probe sketched at the end of this message): 1. First, let's grant a broader set of permissions to ensure nothing is missing:
Copy code
bash
export PROJECT_ID="my-project"
export SA_EMAIL="<mailto:prefect@my-project.iam.gserviceaccount.com|prefect@my-project.iam.gserviceaccount.com>"

# Grant Cloud Run Admin role at the project level
gcloud projects add-iam-policy-binding $PROJECT_ID \
    --member="serviceAccount:$SA_EMAIL" \
    --role="roles/run.admin"

# Grant Service Account User role
gcloud projects add-iam-policy-binding $PROJECT_ID \
    --member="serviceAccount:$SA_EMAIL" \
    --role="roles/iam.serviceAccountUser"

# Grant Service Account Token Creator
gcloud projects add-iam-policy-binding $PROJECT_ID \
    --member="serviceAccount:$SA_EMAIL" \
    --role="roles/iam.serviceAccountTokenCreator"

# Grant VPC Access User
gcloud projects add-iam-policy-binding $PROJECT_ID \
    --member="serviceAccount:$SA_EMAIL" \
    --role="roles/vpcaccess.user"

# Grant Cloud Run Invoker
gcloud projects add-iam-policy-binding $PROJECT_ID \
    --member="serviceAccount:$SA_EMAIL" \
    --role="roles/run.invoker"

# Grant Service Account User on itself
gcloud iam service-accounts add-iam-policy-binding $SA_EMAIL \
    --member="serviceAccount:$SA_EMAIL" \
    --role="roles/iam.serviceAccountUser"
2. Update your work pool configuration to explicitly use the service account:
Copy code
bash
cat > base-job-template.json << 'EOF'
{
  "job_configuration": {
    "job_body": {
      "template": {
        "template": {
          "serviceAccount": "prefect@my-project.iam.gserviceaccount.com",
          "containers": [
            {
              "image": "{{ image }}",
              "resources": {
                "limits": {
                  "cpu": "{{ cpu }}",
                  "memory": "{{ memory }}"
                }
              }
            }
          ],
          "vpcAccess": {
            "connector": "projects/my-net-proj/locations/my-location/connectors/name"
          }
        }
      }
    },
    "region": "my-location"
  }
}
EOF
prefect work-pool update base --base-job-template base-job-template.json
3. Add these specific environment variables to your worker deployment:
Copy code
yaml
env:
  - name: GOOGLE_CLOUD_PROJECT
    value: "my-project"
  - name: CLOUDSDK_CORE_PROJECT
    value: "my-project"
  - name: USE_GKE_GCLOUD_AUTH_PLUGIN
    value: "True"
  - name: PREFECT_LOGGING_LEVEL
    value: "DEBUG"
4. Let's also verify the Cloud Run API is enabled and accessible:
Copy code
bash
# Enable the API
gcloud services enable run.googleapis.com

# Verify the service account can list jobs (run this in Cloud Shell or similar)
gcloud run jobs list \
  --impersonate-service-account=prefect@my-project.iam.gserviceaccount.com \
  --project=my-project \
  --region=my-location
5. After applying all these changes:
Copy code
bash
# Delete the worker pod to force a refresh
kubectl delete pod -l app.kubernetes.io/name=prefect-worker -n prefect
Can you: 1. Apply all these changes 2. Try running a flow again 3. If it fails, please share: - The complete worker pod logs:
kubectl logs <pod-name> -n prefect
- The output of:
gcloud projects get-iam-policy my-project --flatten="bindings[].members" --filter="bindings.members:prefect@my-project.iam.gserviceaccount.com" --format="table(bindings.role)"
This will help us see exactly what permissions are in place and what might be missing.
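As mentioned above, you can also ask GCP directly which permissions the worker's credentials hold, instead of inferring them from role bindings. A sketch using the Resource Manager client; this assumes google-cloud-resource-manager is installed in the pod (pip install google-cloud-resource-manager), and the project ID is a placeholder:
Copy code
python
from google.cloud import resourcemanager_v3

# testIamPermissions reports which of the listed permissions the *caller*
# holds on the resource, so run this inside the worker pod.
client = resourcemanager_v3.ProjectsClient()
needed = ["run.jobs.create", "run.jobs.get", "run.jobs.run"]
resp = client.test_iam_permissions(
    resource="projects/my-project",  # placeholder project ID
    permissions=needed,
)
granted = set(resp.permissions)
for p in needed:
    print(("OK      " if p in granted else "MISSING ") + p)
Any permission printed as MISSING points at the role binding that still isn't effective.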
thought for 20.4 seconds
I notice in your work pool configuration that serviceAccount is set to None. Let's fix this and make a few other adjustments: 1. First, update your work pool configuration with the explicit service account:
Copy code
bash
cat > base-job-template.json << 'EOF'
{
  "job_configuration": {
    "job_body": {
      "template": {
        "template": {
          "serviceAccount": "prefect@my-project.iam.gserviceaccount.com",
          "vpcAccess": {
            "connector": "projects/my-net-proj/locations/my-location/connectors/name"
          },
          "containers": [
            {
              "env": [],
              "args": "{{ args }}",
              "image": "{{ image }}",
              "resources": {
                "limits": {
                  "cpu": "{{ cpu }}",
                  "memory": "{{ memory }}"
                }
              }
            }
          ]
        }
      }
    },
    "region": "my-location"
  }
}
EOF
prefect work-pool update base --base-job-template base-job-template.json
2. Make sure the region variable's default in the base job template matches your actual location (my-location).
3. Let's verify and update all the necessary IAM permissions:
Copy code
bash
export PROJECT_ID="my-project"
export SA_EMAIL="<mailto:prefect@my-project.iam.gserviceaccount.com|prefect@my-project.iam.gserviceaccount.com>"

# First, grant project-level permissions
for role in \
    "roles/run.admin" \
    "roles/run.invoker" \
    "roles/iam.serviceAccountUser" \
    "roles/vpcaccess.user"; do
    gcloud projects add-iam-policy-binding $PROJECT_ID \
        --member="serviceAccount:$SA_EMAIL" \
        --role="$role"
done

# Grant the service account permission to act as itself
gcloud iam service-accounts add-iam-policy-binding $SA_EMAIL \
    --member="serviceAccount:$SA_EMAIL" \
    --role="roles/iam.serviceAccountUser"

# Update the Workload Identity binding
gcloud iam service-accounts add-iam-policy-binding $SA_EMAIL \
    --member="serviceAccount:$PROJECT_ID.svc.id.goog[prefect/prefect-worker-base]" \
    --role="roles/iam.workloadIdentityUser"
4. Add these environment variables to your worker deployment:
Copy code
yaml
env:
  - name: GOOGLE_CLOUD_PROJECT
    value: "my-project"
  - name: CLOUDSDK_CORE_PROJECT
    value: "my-project"
  - name: PREFECT_LOGGING_LEVEL
    value: "DEBUG"
5. After applying these changes, let's verify the permissions are working. Create this test script and run it in the worker pod:
Copy code
python
from google.auth import default
from google.cloud import run_v2
import google.auth.transport.requests

def test_permissions():
    print("1. Getting credentials...")
    credentials, project = default()
    
    print("2. Refreshing token...")
    auth_req = google.auth.transport.requests.Request()
    credentials.refresh(auth_req)
    
    print("3. Creating Cloud Run client...")
    client = run_v2.JobsClient(credentials=credentials)
    
    print("4. Testing list jobs...")
    parent = f"projects/{project}/locations/my-location"
    try:
        jobs = client.list_jobs(request={"parent": parent})
        print("Successfully listed jobs!")
    except Exception as e:
        print(f"Error listing jobs: {str(e)}")

if __name__ == "__main__":
    test_permissions()
Can you: 1. Apply the work pool updates 2. Apply the IAM permissions 3. Update your worker deployment 4. Run the new test script (after installing google-cloud-run with pip install google-cloud-run) 5. Try running a flow again If it still fails, please share: 1. The output of the test script 2. The complete worker pod logs after trying to run a flow 3. The output of:
Copy code
bash
gcloud projects get-iam-policy $PROJECT_ID \
    --flatten="bindings[].members" \
    --filter="bindings.members:$SA_EMAIL" \
    --format="table(bindings.role)"
This will help us understand exactly what permissions are in place and what might be missing.
d
@Marvin nothing works, I give up
m
thought for 17.2 seconds
Before you give up, let me try one last approach that I think might work. There's a specific pattern that often resolves these Cloud Run v2 worker permission issues: 1. First, let's simplify the work pool configuration completely:
Copy code
bash
cat > base-job-template.json << 'EOF'
{
  "job_configuration": {
    "job_body": {
      "template": {
        "template": {
          "containers": [
            {
              "image": "{{ image }}",
              "resources": {
                "limits": {
                  "cpu": "{{ cpu }}",
                  "memory": "{{ memory }}"
                }
              }
            }
          ]
        }
      }
    },
    "region": "my-location"
  }
}
EOF
prefect work-pool update base --base-job-template base-job-template.json
2. Then, create a new GCP service account specifically for running the jobs:
Copy code
bash
# Create a new service account for running jobs
gcloud iam service-accounts create prefect-runner \
    --display-name="Prefect Runner Service Account"

# Grant it Cloud Run permissions
gcloud projects add-iam-policy-binding my-project \
    --member="serviceAccount:prefect-runner@my-project.iam.gserviceaccount.com" \
    --role="roles/run.developer"

# Grant the worker service account permission to impersonate it
gcloud iam service-accounts add-iam-policy-binding \
    prefect-runner@my-project.iam.gserviceaccount.com \
    --member="serviceAccount:prefect@my-project.iam.gserviceaccount.com" \
    --role="roles/iam.serviceAccountUser"
3. Update the work pool to use this new service account:
Copy code
bash
cat > base-job-template.json << 'EOF'
{
  "job_configuration": {
    "job_body": {
      "template": {
        "template": {
          "serviceAccount": "prefect-runner@my-project.iam.gserviceaccount.com",
          "containers": [
            {
              "image": "{{ image }}",
              "resources": {
                "limits": {
                  "cpu": "{{ cpu }}",
                  "memory": "{{ memory }}"
                }
              }
            }
          ]
        }
      }
    },
    "region": "my-location"
  }
}
EOF
prefect work-pool update base --base-job-template base-job-template.json
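Optionally, you can verify the delegation chain from inside the worker pod before re-running anything. A sketch with google-auth (already present, since google.auth.default worked in your earlier tests); the runner account name comes from the commands above:
Copy code
python
import google.auth.transport.requests
from google.auth import default, impersonated_credentials

# Source: the worker's Workload Identity credentials (prefect@...)
source_credentials, project = default()

# Target: the runner account the Cloud Run jobs will execute as
target = impersonated_credentials.Credentials(
    source_credentials=source_credentials,
    target_principal="prefect-runner@my-project.iam.gserviceaccount.com",
    target_scopes=["https://www.googleapis.com/auth/cloud-platform"],
)
# Refreshing forces a call to iam.serviceAccounts.getAccessToken -- the same
# permission named in your original 403 error.
target.refresh(google.auth.transport.requests.Request())
print("Impersonation OK, token starts with:", target.token[:20])
If the refresh fails with the same getAccessToken 403 from your very first error, grant roles/iam.serviceAccountTokenCreator on prefect-runner to the worker's account; impersonation needs it, while roles/iam.serviceAccountUser only covers acting as the account when deploying jobs.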
This approach: 1. Creates a separate service account for running the jobs 2. Simplifies the work pool configuration 3. Uses proper service account delegation If you're willing to try one last time, apply these changes and try running a flow. This pattern has worked in similar situations where the permissions were causing issues. If this still doesn't work, I understand your frustration, and you might want to: 1. Open an issue on the Prefect GitHub repository 2. Consider using a different worker type temporarily 3. Reach out to Prefect support with the specific error messages and configuration Would you like to try this last approach, or would you prefer to explore alternative solutions?
d
@Marvin the error persists
m
thought for 9.8 seconds
I understand your frustration. Since we've tried multiple approaches and the error persists, here are your best options: 1. Open a GitHub Issue This seems to be a specific issue that needs attention from the Prefect team. You can open an issue at: https://github.com/PrefectHQ/prefect/issues/new Include: - The error message - Your work pool configuration - The fact that Workload Identity is properly configured (as evidenced by the token generation working) - All the IAM roles you've tried 2. Alternative Solutions In the meantime, you could: a. Use a different worker type like kubernetes instead of cloud-run-v2 (see the deployment sketch after these commands):
Copy code
bash
prefect work-pool create kubernetes-pool --type kubernetes
b. Or use the Process worker type with a Kubernetes deployment:
Copy code
bash
prefect work-pool create process-pool --type process
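If you do fall back to the kubernetes pool, pointing an existing flow at it is a small change. A hypothetical sketch; the repo URL, entrypoint path, deployment name, and image are all placeholders to adapt:
Copy code
python
from prefect import flow

if __name__ == "__main__":
    # Pull the flow code from source at run time and use a stock image,
    # so no image build is required for the fallback pool.
    flow.from_source(
        source="https://github.com/your-org/your-repo",  # placeholder repo
        entrypoint="flows/hello.py:hello",               # placeholder path
    ).deploy(
        name="hello-k8s",
        work_pool_name="kubernetes-pool",
        image="prefecthq/prefect:3-latest",  # image the job container runs
        build=False,  # reuse the existing image instead of building one
    )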
3. Contact Prefect Support Since this seems to be a specific issue with the Cloud Run v2 worker type, you might want to reach out to Prefect support directly: https://www.prefect.io/support Would you like me to: 1. Help you set up an alternative worker type as a temporary solution? 2. Help you draft a detailed GitHub issue? 3. Provide information about a different deployment approach?