Amir
11/02/2022, 7:34 PM

Nate
11/02/2022, 8:36 PM
kubectl get pods # to find the problem pod
kubectl logs problem-pod-name

Amir
11/03/2022, 3:19 PM

Nate
11/03/2022, 3:27 PM

Amir
11/03/2022, 4:08 PM
1.
apiVersion: helm.toolkit.fluxcd.io/v2beta1
kind: HelmRelease
metadata:
  name: agent
  namespace: prefect
spec:
  interval: 30m
  chart:
    spec:
      chart: prefect-agent
      version: 2022.09.26
      sourceRef:
        name: prefecthq
        kind: HelmRepository
        namespace: kube-system
  values:
    #See https://github.com/PrefectHQ/prefect-helm/blob/2022.09.26/charts/prefect-agent/values.yaml for details
    agent:
      image:
        prefectTag: 2.4.5-python3.9
2.
apiVersion: helm.toolkit.fluxcd.io/v2beta1
kind: HelmRelease
metadata:
  name: agent
  namespace: prefect
spec:
  values:
    #See https://github.com/PrefectHQ/prefect-helm/blob/2022.09.26/charts/prefect-agent/values.yaml for details
    agent:
      cloudApiConfig:
        accountId: <REDACTED ID>
        workspaceId: <REDACTED ID>
      containerSecurityContext:
        readOnlyRootFilesystem: false
      image:
        debug: true
Some background on the flow itself: it uses the SequentialTaskRunner (current config: @flow(task_runner=SequentialTaskRunner())). There are three tasks within the flow: 1) pull data from Snowflake to local storage, 2) shell_run_command to run the model (poetry run python src/driver/__main__.py), and 3) upload the output data back to Snowflake. Here is the flow:
from prefect import flow, get_run_logger
from prefect.task_runners import SequentialTaskRunner
from prefect_shell import shell_run_command

# snowflake_to_local_task and local_to_snowflake_task are defined elsewhere in the project

@flow(task_runner=SequentialTaskRunner())
def flow():
    logger = get_run_logger()
    ######
    # Retrieve the input data needed for the transformations from Snowflake:
    ######
    logger.info(":large_yellow_square: INFO Downloading Snowflake Data")
    snowflake_to_local_task.submit()

    logger.info(":white_check_mark: INFO Finished Snowflake Downloads")
    ######
    # Run transformation script
    ######
    logger.info(":large_yellow_square: INFO Running Transformations")
    commands = ["cd models/model_folder/ && poetry install && poetry run python src/driver/__main__.py"]
    for cmd in commands:
        logger.info(f"INFO Running the command {cmd}:")
        shell_run_command(command=cmd, return_all=True)
    logger.info(":white_check_mark: INFO Finished Running Transformations")
    ######
    # Upload the results from the model to Snowflake:
    ######
    logger.info(":large_yellow_square: INFO Uploading Data to Snowflake")
    local_to_snowflake_task.submit()
    logger.info(":white_check_mark: INFO Finished Uploading Data to Snowflake")
    logger.info(":white_check_mark: :white_check_mark: :white_check_mark: INFO RUN COMPLETE")

Nate
11/03/2022, 8:09 PM
values.yaml ?

Amir
11/03/2022, 8:26 PM

Nate
11/03/2022, 8:32 PM

Amir
11/03/2022, 8:37 PM

Amir
11/03/2022, 8:39 PM
11/03/2022, 8:39 PM"resources": {
              "requests": {
                "cpu": "4000m",
                "memory": "32Gi"
              },
              "limits": {
                "cpu": "8000m",
                "memory": "62Gi"
which is now reflected in the pod. I'm unsure if setting limits will help prevent the issue, but just noting it here (also a best practice habit I should get into). The flow is about 100m into its run, so the issue may appear again shortlyNate
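For reference, a minimal sketch of how requests/limits like these could be attached to the flow-run pod through a Prefect 2 KubernetesJob infrastructure block's customizations (a JSON 6902 patch). This assumes the flow runs on KubernetesJob infrastructure, which the thread doesn't confirm, and the block name model-run-infra is hypothetical:

from prefect.infrastructure import KubernetesJob

# JSON 6902 patch adding requests/limits to the job's first container
# (paths and values are illustrative, not the exact config from this thread)
k8s_job = KubernetesJob(
    customizations=[
        {
            "op": "add",
            "path": "/spec/template/spec/containers/0/resources",
            "value": {
                "requests": {"cpu": "4000m", "memory": "32Gi"},
                "limits": {"cpu": "8000m", "memory": "62Gi"},
            },
        }
    ],
)
k8s_job.save("model-run-infra", overwrite=True)  # hypothetical block name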
Nate
11/03/2022, 8:49 PM
from typing import List

from prefect import flow, unmapped
from prefect.deployments import run_deployment

@flow
def parent(date_range: List, chunk_size: int = 5):
    chunked_dates = [
        date_range[i : i + chunk_size]
        for i in range(0, len(date_range), chunk_size)
    ]
    run_deployment.map(name=unmapped("processNDatesFlow"), params=chunked_dates)

Amir
11/03/2022, 9:28 PM
top -c 1 from within the terminal also shows 0.
Such an odd problem. It's almost as if it just gives up.

Amir
11/03/2022, 9:28 PM

Nate
11/03/2022, 9:33 PM
run_deployment isn't a task, so you can't use map, but you can use some for loop or wrap `run_deployment` in a task and map that
I just thought it might be useful to see logs from the pod you mentioned showing weird behaviour above
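A minimal sketch of the wrap-in-a-task pattern Nate describes, so the date chunks can be mapped. The deployment name processNDatesFlow comes from the snippet above; the task name run_chunk and the parameter name dates are assumptions for illustration:

from typing import List

from prefect import flow, task, unmapped
from prefect.deployments import run_deployment

@task
def run_chunk(deployment_name: str, dates: List[str]):
    # run_deployment is a plain function, so wrapping it in a task makes it mappable
    return run_deployment(name=deployment_name, parameters={"dates": dates})

@flow
def parent(date_range: List[str], chunk_size: int = 5):
    chunked_dates = [
        date_range[i : i + chunk_size]
        for i in range(0, len(date_range), chunk_size)
    ]
    # unmapped keeps the deployment name constant across the mapped calls
    run_chunk.map(unmapped("processNDatesFlow"), chunked_dates)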