# prefect-cloud
l
I'm crossposting this because I have not received a response: I have a k8s worker and sometimes my flows will auto fail with the following:
Submission failed. ValueError: Process cannot be run with empty command.
- is there any way I can debug this?
n
hi @Leon Kozlowski - are you using a work pool / infra block where you've overwritten the command field?
l
We are using a k8s-worker + command is not overridden
This is my dockerfile
FROM prefecthq/prefect:2.14.20-python3.10

RUN apt-get update && apt-get install --no-install-recommends --no-upgrade -y \
    libpq-dev \
    curl -y && \
    rm -rf /var/lib/apt/lists/*

ARG PIP_INDEX_URL
RUN test -n "${PIP_INDEX_URL}" || (echo "'--build-arg PIP_INDEX_URL' was not set!" && false)

COPY requirements.txt .
RUN pip install -U pip
RUN pip install --no-cache-dir -r requirements.txt

WORKDIR /app

COPY src src
COPY opsgenie.yaml opsgenie.yaml
Example prefect.yaml deployment config:
pull:
- prefect.deployments.steps.set_working_directory:
    directory: /app

deployments:
  - name: prod-insights-generic
    tags:
      - prod
      - insights
      - redshift
    description: Generic Insights Report Handler
    schedule: 
      cron: "0 8 * * *"
      timezone: US/Eastern
    entrypoint: src/flow.py:insights_generic
    work_pool:
      name: k8s-work-pool
      work_queue_name: prod
      job_variables:
        image: "{{ build-image.image }}"
n
hmm that all looks reasonable at a glance
do you have more workers logs from this failure?
l
Let me check
These are the two runs where this happened this morning
If I run:
k logs prefect-worker-58dc454f98-cnth6 | grep acrid-avocet
or
k logs prefect-worker-58dc454f98-cnth6 | grep zeta595-aguateca
I'm getting no logs
k logs prefect-worker-58dc454f98-cnth6 | grep 'Submission failed'
also has nothing
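The three separate greps above could also be done in one pass. A minimal sketch, assuming the worker logs have already been captured as text (for example from `k logs`); the helper name `filter_log_lines` and the sample lines are hypothetical placeholders, not real worker output:

```python
import re
from typing import Iterable, List


def filter_log_lines(lines: Iterable[str], patterns: List[str]) -> List[str]:
    """Return the lines that contain any of the given literal patterns."""
    if not patterns:
        return []
    # Escape each pattern so names are matched literally, not as regexes.
    matcher = re.compile("|".join(re.escape(p) for p in patterns))
    return [line for line in lines if matcher.search(line)]


# Placeholder text standing in for captured log output (not real logs).
sample_lines = [
    "placeholder line with no match",
    "placeholder line mentioning acrid-avocet",
    "placeholder line containing Submission failed",
]

matches = filter_log_lines(
    sample_lines,
    ["acrid-avocet", "zeta595-aguateca", "Submission failed"],
)
for line in matches:
    print(line)
```

If this also returns nothing for the real logs, that is consistent with the worker never logging anything about these runs.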
n
hmm - sorry if this is an annoying question 🙂 but are you sure that's the right worker pod? it'd be weird if there was nothing about the flow run in the worker logs
l
valid question -
k get pods | grep worker
prefect-worker-58dc454f98-cnth6       1/1     Running   0          42h
only 1 worker pod
n
hmm how often do you notice this failure mode?
l
It's pretty random
today it was on a scheduled run and a manual run
4 times in the past 6 days and the flow runs once per day
n
hmm that is strange @Kevin Grismore when u have a sec, have you seen this? tldr intermittent k8s failures, no worker logs to speak of
Submission failed. ValueError: Process cannot be run with empty command.
no overridden command
k
this does not look familiar to me
n
hmm okay, me neither yet
k
kind of puzzled at why the worker would set an empty command. wonder if the job gets created in k8s? could you look at the yaml manifest and see if there's a command in there?
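For cases where a job does get created, the manifest check suggested here could be scripted. A rough sketch, assuming the Job manifest has already been loaded into a Python dict (for example from `kubectl get job <name> -o json`); the helper name and the example manifest are hypothetical. It checks both `command` and `args`, since which field carries the command depends on the job template:

```python
from typing import Any, Dict, List


def containers_missing_command(job_manifest: Dict[str, Any]) -> List[str]:
    """Return names of containers that set neither `command` nor `args`."""
    # A Kubernetes Job keeps its pod spec under spec.template.spec.
    pod_spec = job_manifest.get("spec", {}).get("template", {}).get("spec", {})
    missing = []
    for container in pod_spec.get("containers", []):
        if not container.get("command") and not container.get("args"):
            missing.append(container.get("name", "<unnamed>"))
    return missing


# Hypothetical manifest for illustration only; names and values are made up.
example_job = {
    "kind": "Job",
    "spec": {
        "template": {
            "spec": {
                "containers": [
                    {"name": "with-args", "args": ["echo", "hello"]},
                    {"name": "no-command", "image": "example:latest"},
                    {"name": "empty-command", "command": []},
                ]
            }
        }
    },
}

print(containers_missing_command(example_job))
```

A container listed here is not necessarily broken: with neither field set, Kubernetes falls back to the image's own ENTRYPOINT and CMD.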
l
the job doesn't get created, nor does the worker acknowledge that it even tried
n
what version is your worker?
k
🤔
l
worker:
    image:
      prefectTag: 2.14.20-python3.10-kubernetes
k exec -it prefect-worker-77df78588d-qlhqh -- sh
$ prefect --version
2.14.20
This was also happening on 2.14.16
Is there any update here/any tracking - should I create an issue?
Another error I'm seeing related to the worker deployment is random errors relating to WORKDIR
Flow could not be retrieved from deployment.
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/site-packages/prefect/deployments/steps/core.py", line 122, in run_steps
    step_output = await run_step(step, upstream_outputs)
  File "/usr/local/lib/python3.10/site-packages/prefect/deployments/steps/core.py", line 93, in run_step
    result = await from_async.call_soon_in_new_thread(
  File "/usr/local/lib/python3.10/site-packages/prefect/_internal/concurrency/calls.py", line 291, in aresult
    return await asyncio.wrap_future(self.future)
  File "/usr/local/lib/python3.10/site-packages/prefect/_internal/concurrency/calls.py", line 315, in _run_sync
    result = self.fn(*self.args, **self.kwargs)
  File "/usr/local/lib/python3.10/site-packages/prefect/deployments/steps/pull.py", line 28, in set_working_directory
    os.chdir(directory)
FileNotFoundError: [Errno 2] No such file or directory: '/app'
whereas scheduled runs do not hit this error
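The traceback ends in `os.chdir(directory)` raising `FileNotFoundError`, which suggests the pull step ran somewhere `/app` does not exist, even though `WORKDIR /app` in the Dockerfile creates it in the image. A minimal sketch of that failure and of a pre-check, using a temporary path as a stand-in for `/app`; the helper names are hypothetical and this is not Prefect's own code:

```python
import errno
import os
import tempfile


def working_directory_exists(directory: str) -> bool:
    """Return True if `directory` exists and is a directory."""
    return os.path.isdir(directory)


def try_chdir_missing() -> int:
    """Attempt os.chdir into a missing path and return the errno raised."""
    with tempfile.TemporaryDirectory() as tmp:
        missing = os.path.join(tmp, "app")  # hypothetical stand-in for /app
        try:
            os.chdir(missing)  # fails, so the current directory is unchanged
        except FileNotFoundError as exc:
            return exc.errno
    return 0


raised_errno = try_chdir_missing()
print(raised_errno == errno.ENOENT)
```

Running a check like `working_directory_exists("/app")` inside the container that actually executes the flow run could help confirm whether the failing runs are using the expected image.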