Amir
11/02/2022, 7:34 PM

Nate
11/02/2022, 8:36 PM
kubectl get pods # to find the problem pod
kubectl logs problem-pod-name
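For a pod that dies or restarts, a few more kubectl invocations are often useful here (a sketch; substitute the real pod name and namespace):

```shell
# Show restart counts and status for all pods in the namespace
kubectl get pods -n prefect

# Events and last container state (e.g. OOMKilled) for the suspect pod
kubectl describe pod problem-pod-name -n prefect

# Logs from the previous (crashed) container instance, if it restarted
kubectl logs problem-pod-name -n prefect --previous
```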
Amir
11/03/2022, 3:19 PM

Nate
11/03/2022, 3:27 PM

Amir
11/03/2022, 4:08 PM
apiVersion: helm.toolkit.fluxcd.io/v2beta1
kind: HelmRelease
metadata:
  name: agent
  namespace: prefect
spec:
  interval: 30m
  chart:
    spec:
      chart: prefect-agent
      version: 2022.09.26
      sourceRef:
        name: prefecthq
        kind: HelmRepository
        namespace: kube-system
  values:
    # See https://github.com/PrefectHQ/prefect-helm/blob/2022.09.26/charts/prefect-agent/values.yaml for details
    agent:
      image:
        prefectTag: 2.4.5-python3.9
2.
apiVersion: helm.toolkit.fluxcd.io/v2beta1
kind: HelmRelease
metadata:
  name: agent
  namespace: prefect
spec:
  values:
    # See https://github.com/PrefectHQ/prefect-helm/blob/2022.09.26/charts/prefect-agent/values.yaml for details
    agent:
      cloudApiConfig:
        accountId: <REDACTED ID>
        workspaceId: <REDACTED ID>
      containerSecurityContext:
        readOnlyRootFilesystem: false
      image:
        debug: true
Some background on the flow itself: the flow uses the SequentialTaskRunner (current config: @flow(task_runner=SequentialTaskRunner())). There are three tasks within the flow: 1) pull data from Snowflake to local, 2) shell_run_command to run the model (poetry run python src/driver/__main__.py), and 3) upload the output data to Snowflake. Here is the flow:
@flow(task_runner=SequentialTaskRunner())
def flow():
    logger = get_run_logger()
    ######
    # Retrieve the input data needed for the transformations from Snowflake:
    ######
    logger.info(":large_yellow_square: INFO Downloading Snowflake Data")
    snowflake_to_local_task.submit()
    logger.info(":white_check_mark: INFO Finished Snowflake Downloads")
    ######
    # Run transformation script
    ######
    logger.info(":large_yellow_square: INFO Running Transformations")
    commands = ["cd models/model_folder/ && poetry install && poetry run python src/driver/__main__.py"]
    for cmd in commands:
        logger.info(f"INFO Running the command {cmd}:")
        shell_run_command(command=cmd, return_all=True)
    logger.info(":white_check_mark: INFO Finished Running Transformations")
    ######
    # Upload the results from the model to Snowflake:
    ######
    logger.info(":large_yellow_square: INFO Uploading Data to Snowflake")
    local_to_snowflake_task.submit()
    logger.info(":white_check_mark: INFO Finished Uploading Data to Snowflake")
    logger.info(":white_check_mark: :white_check_mark: :white_check_mark: INFO RUN COMPLETE")
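As an aside, the shell step can be reproduced outside Prefect with plain subprocess, which helps isolate whether a hang is in the model script itself or in the Prefect machinery (a minimal sketch; `run_shell` is a hypothetical helper, not part of the flow above):

```python
import subprocess

def run_shell(cmd: str) -> list[str]:
    """Run a shell command and return its output lines,
    roughly mirroring shell_run_command(..., return_all=True)."""
    result = subprocess.run(
        cmd, shell=True, capture_output=True, text=True, check=True
    )
    return result.stdout.splitlines()

# Same pattern as the transformation step, with a trivial command:
print(run_shell("echo step-1 && echo step-2"))
```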
Nate
11/03/2022, 8:09 PM
values.yaml?

Amir
11/03/2022, 8:26 PM

Nate
11/03/2022, 8:32 PM

Amir
11/03/2022, 8:37 PM
11/03/2022, 8:37 PM"resources": {
"requests": {
"cpu": "4000m",
"memory": "32Gi"
},
"limits": {
"cpu": "8000m",
"memory": "62Gi"
which is now reflected in the pod. I'm unsure whether setting limits will help prevent the issue, but I'm noting it here (also a best-practice habit I should get into). The flow is about 100 minutes into its run, so the issue may reappear shortly.

Nate
11/03/2022, 8:49 PM
from prefect.deployments import run_deployment
from typing import List

from prefect import flow, unmapped

@flow
def parent(date_range: List, chunk_size: int = 5):
    chunked_dates = [
        date_range[i : i + chunk_size]
        for i in range(0, len(date_range), chunk_size)
    ]
    run_deployment.map(name=unmapped("processNDatesFlow"), params=chunked_dates)
Amir
11/03/2022, 9:28 PM
top -c 1
from within the terminal also shows 0.
Such an odd problem. It's almost as if it just gives up.

Nate
11/03/2022, 9:33 PM
run_deployment
isn't a task, so you can't use map
but you can use a for loop, or wrap `run_deployment` in a task and map that
I just thought it might be useful to see logs from the pod you mentioned showing weird behaviour above