# prefect-server
j
Hey! We're getting a lot of these errors on our K8s agent and it's preventing the server and agent from communicating as expected. Any quick fixes for this? Error in thread 🧵
[2022-06-07 13:38:40,262] ERROR - agent | Error while managing existing k8s jobs
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/site-packages/prefect/agent/kubernetes/agent.py", line 409, in heartbeat
    self.manage_jobs()
  File "/usr/local/lib/python3.6/site-packages/prefect/agent/kubernetes/agent.py", line 349, in manage_jobs
    level="ERROR",
  File "/usr/local/lib/python3.6/site-packages/prefect/client/client.py", line 2062, in write_run_logs
    mutation, variables=dict(input=dict(logs=logs))
  File "/usr/local/lib/python3.6/site-packages/prefect/client/client.py", line 569, in graphql
    raise ClientError(result["errors"])
prefect.exceptions.ClientError: [{'message': 'Foreign key violation.', 'locations': [{'line': 2, 'column': 5}], 'path': ['write_run_logs'], 'extensions': {'code': 'INTERNAL_SERVER_ERROR', 'exception': {'message': 'Foreign key violation.'}}}]
ERROR:agent:Error while managing existing k8s jobs
a
what's your prefect diagnostics?
looks like an issue with duplicated or deleted flow runs: the backend may be trying to write logs for a flow run that was deleted or no longer exists. Hard to say based only on the info you've given so far
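To illustrate the hypothesis above, here is a minimal sketch of how writing a log row for a deleted flow run can trigger a foreign key violation. The `flow_run` and `log` tables, their columns, and the `write_log` helper are hypothetical simplifications, not Prefect Server's actual schema or API, and the sketch uses an in-memory SQLite database rather than Postgres with Hasura.

```python
import sqlite3

# Hypothetical, simplified schema: NOT Prefect Server's real tables.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces foreign keys when enabled
conn.execute("CREATE TABLE flow_run (id TEXT PRIMARY KEY)")
conn.execute(
    "CREATE TABLE log ("
    " id INTEGER PRIMARY KEY,"
    " flow_run_id TEXT NOT NULL REFERENCES flow_run(id) ON DELETE CASCADE,"
    " message TEXT)"
)


def write_log(flow_run_id, message):
    """Insert a log row; return False if the insert hits a constraint violation."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO log (flow_run_id, message) VALUES (?, ?)",
                (flow_run_id, message),
            )
        return True
    except sqlite3.IntegrityError:
        return False


# The flow run exists, so its logs can be written.
conn.execute("INSERT INTO flow_run (id) VALUES ('run-1')")
conn.commit()
ok_before = write_log("run-1", "job started")

# The flow run is deleted while something is still reporting logs for it.
conn.execute("DELETE FROM flow_run WHERE id = 'run-1'")
conn.commit()
ok_after = write_log("run-1", "job still running")

print(ok_before, ok_after)
```

If this is what is happening, the errors would keep recurring for as long as the agent keeps reporting logs for jobs whose flow runs are gone from the database.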
j
{
  "config_overrides": {
    "server": {
      "database": {
        "connection_url": true
      },
      "ui": {
        "apollo_url": true
      }
    }
  },
  "env_vars": [
    "PREFECT__CONTEXT__SECRETS__SP_USER",
    "PREFECT__CONTEXT__SECRETS__TABLEAU_USERNAME",
    "PREFECT__CONTEXT__SECRETS__RFL_SUGARCRM_DB_HOST",
    "PREFECT__CONTEXT__SECRETS__PRO14_DD_API_KEY",
    "PREFECT__CONTEXT__SECRETS__FS_AUTH_PASSWORD",
    "PREFECT__CONTEXT__SECRETS__SP_PASS",
    "PREFECT__CONTEXT__SECRETS__RFL_SUGARCRM_DB_PORT",
    "PREFECT__CONTEXT__SECRETS__RFL_SUGARCRM_DB_USER",
    "PREFECT__CONTEXT__SECRETS__TABLEAU_PASSWORD",
    "PREFECT__CONTEXT__SECRETS__INCROWD_SFTP_EMBULK_USER",
    "PREFECT__CONTEXT__SECRETS__DREMIO_DB_USER",
    "PREFECT__CONTEXT__SECRETS__SP_DB",
    "PREFECT__CONTEXT__SECRETS__INCROWD_SFTP_EMBULK_PASS",
    "PREFECT__CONTEXT__SECRETS__SP_HOST",
    "PREFECT__CONTEXT__SECRETS__DREMIO_DB_PORT",
    "PREFECT__CONTEXT__SECRETS__SP_PORT",
    "PREFECT__CONTEXT__SECRETS__RFL_SUGARCRM_DB_DB",
    "PREFECT__CONTEXT__SECRETS__DREMIO_DB_PASS",
    "PREFECT__CONTEXT__SECRETS__RFL_SUGARCRM_DB_PASS",
    "PREFECT__CONTEXT__SECRETS__DREMIO_DB_HOST",
    "PREFECT__CONTEXT__SECRETS__TABLEAU_HOST",
    "PREFECT__CONTEXT__SECRETS__FS_AUTH_USERNAME",
    "PREFECT__CONTEXT__SECRETS__INCROWD_SFTP_EMBULK_HOST",
    "PREFECT__CONTEXT__SECRETS__INCROWD_SFTP_EMBULK_PORT"
  ],
  "system_information": {
    "platform": "Linux-5.4.0-110-generic-x86_64-with-glibc2.31",
    "prefect_backend": "server",
    "prefect_version": "1.2.0",
    "python_version": "3.9.7"
  }
}
Diagnostics ☝️
If you need anything else, let me know 👍 Restarting the server didn't have any effect either; I'll see if I can restart the externalised Postgres cluster
No luck so far. Here's some output from the server's logs:
graphql_1  |
graphql_1  | During handling of the above exception, another exception occurred:
graphql_1  |
graphql_1  | Traceback (most recent call last):
graphql_1  |   File "/usr/local/lib/python3.7/site-packages/graphql/execution/execute.py", line 668, in await_completed
graphql_1  |     return await completed
graphql_1  |   File "/usr/local/lib/python3.7/site-packages/graphql/execution/execute.py", line 653, in await_result
graphql_1  |     return_type, field_nodes, info, path, await result
graphql_1  |   File "/usr/local/lib/python3.7/site-packages/graphql/execution/execute.py", line 727, in complete_value
graphql_1  |     raise result
graphql_1  |   File "/usr/local/lib/python3.7/site-packages/graphql/execution/execute.py", line 626, in await_result
graphql_1  |     return await result
graphql_1  |   File "/prefect-server/src/prefect_server/graphql/extensions.py", line 52, in resolve
graphql_1  |     result = await result
graphql_1  |   File "/prefect-server/src/prefect_server/graphql/logs.py", line 26, in resolve_write_run_logs
graphql_1  |     await api.logs.create_logs(logs)
graphql_1  |   File "/prefect-server/src/prefect_server/api/logs.py", line 47, in create_logs
graphql_1  |     await models.Log.insert_many(model_logs, selection_set={"affected_rows"})
graphql_1  |   File "/prefect-server/src/prefect_server/database/orm.py", line 318, in insert_many
graphql_1  |     insert_mutation_name=cls.__root_fields__.get("insert"),
graphql_1  |   File "/prefect-server/src/prefect_server/database/hasura.py", line 237, in insert
graphql_1  |     result = await self.execute_mutations_in_transaction(mutations=[graphql])
graphql_1  |   File "/prefect-server/src/prefect_server/database/hasura.py", line 165, in execute_mutations_in_transaction
graphql_1  |     as_box=as_box,
graphql_1  |   File "/prefect-server/src/prefect_server/database/hasura.py", line 91, in execute
graphql_1  |     raise ValueError("Foreign key violation.")
graphql_1  | ValueError: Foreign key violation.
graphql_1  |
graphql_1  | The above exception was the direct cause of the following exception:
graphql_1  |
graphql_1  | Traceback (most recent call last):
graphql_1  |   File "/usr/local/lib/python3.7/site-packages/graphql/execution/execute.py", line 668, in await_completed
graphql_1  |     return await completed
graphql_1  |   File "/usr/local/lib/python3.7/site-packages/graphql/execution/execute.py", line 653, in await_result
graphql_1  |     return_type, field_nodes, info, path, await result
graphql_1  |   File "/usr/local/lib/python3.7/site-packages/graphql/execution/execute.py", line 727, in complete_value
graphql_1  |     raise result
graphql_1  |   File "/usr/local/lib/python3.7/site-packages/graphql/execution/execute.py", line 626, in await_result
graphql_1  |     return await result
graphql_1  |   File "/prefect-server/src/prefect_server/graphql/extensions.py", line 52, in resolve
graphql_1  |     result = await result
graphql_1  |   File "/prefect-server/src/prefect_server/graphql/logs.py", line 26, in resolve_write_run_logs
graphql_1  |     await api.logs.create_logs(logs)
graphql_1  |   File "/prefect-server/src/prefect_server/api/logs.py", line 47, in create_logs
graphql_1  |     await models.Log.insert_many(model_logs, selection_set={"affected_rows"})
graphql_1  |   File "/prefect-server/src/prefect_server/database/orm.py", line 318, in insert_many
graphql_1  |     insert_mutation_name=cls.__root_fields__.get("insert"),
graphql_1  |   File "/prefect-server/src/prefect_server/database/hasura.py", line 237, in insert
graphql_1  |     result = await self.execute_mutations_in_transaction(mutations=[graphql])
graphql_1  |   File "/prefect-server/src/prefect_server/database/hasura.py", line 165, in execute_mutations_in_transaction
graphql_1  |     as_box=as_box,
graphql_1  |   File "/prefect-server/src/prefect_server/database/hasura.py", line 91, in execute
graphql_1  |     raise ValueError("Foreign key violation.")
graphql_1  | graphql.error.graphql_error.GraphQLError: Foreign key violation.
graphql_1  |
graphql_1  | GraphQL request:2:3
graphql_1  | 1 | mutation ($input: write_run_logs_input!) {
graphql_1  | 2 |   write_run_logs(input: $input) {
graphql_1  |   |   ^
graphql_1  | 3 |     success
graphql_1  | INFO:     172.19.0.4:46026 - "POST / HTTP/1.1" 200 OK
graphql_1  | INFO:     172.19.0.4:46036 - "POST / HTTP/1.1" 200 OK
graphql_1  | INFO:     172.19.0.4:46034 - "POST / HTTP/1.1" 200 OK
graphql_1  | INFO:     172.19.0.4:46042 - "POST / HTTP/1.1" 200 OK
graphql_1  | INFO:     172.19.0.4:46044 - "POST / HTTP/1.1" 200 OK
graphql_1  | INFO:     172.19.0.4:46050 - "POST / HTTP/1.1" 200 OK
graphql_1  | INFO:     172.19.0.4:46058 - "POST / HTTP/1.1" 200 OK
graphql_1  | INFO:     172.19.0.4:46060 - "POST / HTTP/1.1" 200 OK
a
can you try adding this env variable? could be a Hasura issue:
export HASURA_GRAPHQL_V1_BOOLEAN_NULL_COLLAPSE=true
more on that: https://discourse.prefect.io/t/we-upgrade-from-0-15-14-to-1-1-0-and-it-seems-like-our-f[…]ing-shows-up-on-the-dashboard-unless-we-pick-a-project/825