Marwan Sarieddine
08/21/2020, 8:29 PM
josh
08/21/2020, 8:34 PM
Marwan Sarieddine
08/21/2020, 8:35 PM
> I believe the K8s agent only propagates environment variables to the first job it creates
That is correct - I see the first job pod in a Running state, then the second job pod in an Error state. I guess as a follow-up - why are there two job pods that get created?
> Are you providing both a scheduler and worker spec to the environment?
No I am not - I thought with a DaskKubernetesEnvironment I only need to specify the dask-worker pod spec
josh
08/21/2020, 8:36 PM
Marwan Sarieddine
08/21/2020, 8:37 PM
josh
08/21/2020, 8:37 PM
Marwan Sarieddine
08/21/2020, 8:38 PM
I created a job.yaml file and assigned it to scheduler_spec_file, but I am still facing the same issue for some reason. Below is the job.yaml spec:
apiVersion: batch/v1
kind: Job
metadata:
  name: prefect-dask-job
  labels:
    app: prefect-dask-job
spec:
  template:
    metadata:
      labels:
        app: prefect-dask-job
    spec:
      containers:
        - name: flow
          image: xxxxx
          imagePullPolicy: IfNotPresent
          command: ["/bin/sh", "-c"]
          args:
            [
              'python -c "import prefect; prefect.environments.execution.load_and_run_flow()"',
            ]
          env:
            - name: PREFECT__CLOUD__GRAPHQL
              value: PREFECT__CLOUD__GRAPHQL
            - name: PREFECT__CLOUD__AUTH_TOKEN
              value: PREFECT__CLOUD__AUTH_TOKEN
            - name: PREFECT__CONTEXT__FLOW_RUN_ID
              value: PREFECT__CONTEXT__FLOW_RUN_ID
            - name: PREFECT__CONTEXT__NAMESPACE
              value: PREFECT__CONTEXT__NAMESPACE
            - name: PREFECT__CONTEXT__IMAGE
              value: PREFECT__CONTEXT__IMAGE
            - name: PREFECT__CLOUD__USE_LOCAL_SECRETS
              value: "false"
            - name: PREFECT__ENGINE__FLOW_RUNNER__DEFAULT_CLASS
              value: "prefect.engine.cloud.CloudFlowRunner"
            - name: PREFECT__ENGINE__TASK_RUNNER__DEFAULT_CLASS
              value: "prefect.engine.cloud.CloudTaskRunner"
            - name: PREFECT__ENGINE__EXECUTOR__DEFAULT_CLASS
              value: "prefect.engine.executors.DaskExecutor"
            - name: PREFECT__LOGGING__LOG_TO_CLOUD
              value: "true"
            - name: PREFECT__LOGGING__LEVEL
              value: "INFO"
            - name: PREFECT__DEBUG
              value: "true"
            - name: DASK_DISTRIBUTED__SCHEDULER__WORK_STEALING
              value: "True"
            - name: PREFECT__LOGGING__EXTRA_LOGGERS
              value: PREFECT__LOGGING__EXTRA_LOGGERS
            - name: AWS_ACCESS_KEY_ID
              valueFrom:
                secretKeyRef:
                  key: AWS_ACCESS_KEY_ID
                  name: aws-s3-secret
            - name: AWS_SECRET_ACCESS_KEY
              valueFrom:
                secretKeyRef:
                  key: AWS_SECRET_ACCESS_KEY
                  name: aws-s3-secret
          resources:
            requests:
              cpu: "100m"
            limits:
              cpu: "100m"
      restartPolicy: Never
the worker spec
kind: Pod
metadata:
  labels:
    app: prefect-dask-worker
spec:
  containers:
    - args:
        - dask-worker
        - --no-dashboard
        - --death-timeout
        - '60'
        - --nthreads
        - '1'
        - --nprocs
        - '1'
        - --memory-limit
        - 4GB
      env:
        - name: AWS_ACCESS_KEY_ID
          valueFrom:
            secretKeyRef:
              key: AWS_ACCESS_KEY_ID
              name: aws-s3-secret
        - name: AWS_SECRET_ACCESS_KEY
          valueFrom:
            secretKeyRef:
              key: AWS_SECRET_ACCESS_KEY
              name: aws-s3-secret
      image: xxxxx
      imagePullPolicy: IfNotPresent
      name: dask-worker
      resources:
        limits:
          cpu: 1000m
          memory: 4G
        requests:
          cpu: 1000m
          memory: 4G
  restartPolicy: Never
here is the call to DaskKubernetesEnvironment:
DaskKubernetesEnvironment(
    min_workers=1,
    max_workers=2,
    scheduler_spec_file="specs/job.yaml",
    worker_spec_file="specs/worker_spec.yaml",
)
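For reference, a minimal sketch of how the storage and environment pieces could be attached to a flow at registration time with these 0.13-era APIs - the flow body, bucket name, and project name below are placeholders, not taken from this thread:

from prefect import Flow, task
from prefect.environments import DaskKubernetesEnvironment
from prefect.environments.storage import S3

@task
def say_hello():
    print("hello")

# Placeholder flow body - the real flow in this thread is not shown
with Flow("simple-flow") as flow:
    say_hello()

# S3 storage for the pickled flow; "my-bucket" is a placeholder bucket name
flow.storage = S3(bucket="my-bucket")

# Same environment call as above, with custom scheduler and worker specs
flow.environment = DaskKubernetesEnvironment(
    min_workers=1,
    max_workers=2,
    scheduler_spec_file="specs/job.yaml",
    worker_spec_file="specs/worker_spec.yaml",
)

flow.register(project_name="my-project")  # placeholder project name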
josh
08/21/2020, 9:39 PM
Marwan Sarieddine
08/21/2020, 9:40 PM
kubectl logs pod/prefect-job-ad7ab790-795nm
[2020-08-21 21:39:29] INFO - prefect.S3 | Downloading simple-flow-4/2020-08-21t21-37-27-970539-00-00 from infima-etl-flows
[2020-08-21 21:39:31] ERROR - prefect.S3 | Error downloading Flow from S3: An error occurred (403) when calling the HeadObject operation: Forbidden
An error occurred (403) when calling the HeadObject operation: Forbidden
Traceback (most recent call last):
File "/usr/local/bin/prefect", line 8, in <module>
sys.exit(cli())
File "/usr/local/lib/python3.8/site-packages/click/core.py", line 829, in __call__
return self.main(*args, **kwargs)
File "/usr/local/lib/python3.8/site-packages/click/core.py", line 782, in main
rv = self.invoke(ctx)
File "/usr/local/lib/python3.8/site-packages/click/core.py", line 1259, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
File "/usr/local/lib/python3.8/site-packages/click/core.py", line 1259, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
File "/usr/local/lib/python3.8/site-packages/click/core.py", line 1066, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/usr/local/lib/python3.8/site-packages/click/core.py", line 610, in invoke
return callback(*args, **kwargs)
File "/usr/local/lib/python3.8/site-packages/prefect/cli/execute.py", line 80, in cloud_flow
raise exc
File "/usr/local/lib/python3.8/site-packages/prefect/cli/execute.py", line 69, in cloud_flow
flow = storage.get_flow(storage.flows[flow_data.name])
File "/usr/local/lib/python3.8/site-packages/prefect/environments/storage/s3.py", line 92, in get_flow
raise err
File "/usr/local/lib/python3.8/site-packages/prefect/environments/storage/s3.py", line 87, in get_flow
self._boto3_client.download_fileobj(
File "/usr/local/lib/python3.8/site-packages/boto3/s3/inject.py", line 678, in download_fileobj
return future.result()
File "/usr/local/lib/python3.8/site-packages/s3transfer/futures.py", line 106, in result
return self._coordinator.result()
File "/usr/local/lib/python3.8/site-packages/s3transfer/futures.py", line 265, in result
raise self._exception
File "/usr/local/lib/python3.8/site-packages/s3transfer/tasks.py", line 255, in _main
self._submit(transfer_future=transfer_future, **kwargs)
File "/usr/local/lib/python3.8/site-packages/s3transfer/download.py", line 340, in _submit
response = client.head_object(
File "/usr/local/lib/python3.8/site-packages/botocore/client.py", line 316, in _api_call
return self._make_api_call(operation_name, kwargs)
File "/usr/local/lib/python3.8/site-packages/botocore/client.py", line 635, in _make_api_call
raise error_class(parsed_response, operation_name)
botocore.exceptions.ClientError: An error occurred (403) when calling the HeadObject operation: Forbidden
pod/prefect-job-ad7ab790-795nm - it doesn't have the AWS env variables ...
josh
08/21/2020, 9:42 PM
There is a default AWS_CREDENTIALS secret that is set up to solve issues like this: https://docs.prefect.io/core/concepts/secrets.html#default-secrets
Marwan Sarieddine
08/21/2020, 9:43 PM
josh
08/21/2020, 9:44 PM
flow.storage = S3(..., secrets=["AWS_CREDENTIALS"])
and those credentials will automatically be used when pulling the flow
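As a minimal sketch of that approach, assuming (per the default-secrets docs linked above) that AWS_CREDENTIALS is a JSON document with ACCESS_KEY and SECRET_ACCESS_KEY keys - the key values, bucket name, and flow below are placeholders:

from prefect import Client, Flow
from prefect.environments.storage import S3

# Store the AWS keys as the AWS_CREDENTIALS default secret in Prefect Cloud;
# the values here are placeholders, not real credentials
client = Client()
client.set_secret(
    name="AWS_CREDENTIALS",
    value={
        "ACCESS_KEY": "<aws-access-key-id>",
        "SECRET_ACCESS_KEY": "<aws-secret-access-key>",
    },
)

# Tell the flow's S3 storage to use that secret when downloading the flow;
# "my-bucket" is a placeholder bucket name
flow = Flow("simple-flow")  # placeholder flow
flow.storage = S3(bucket="my-bucket", secrets=["AWS_CREDENTIALS"])

Marwan Sarieddine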
08/21/2020, 10:02 PM
$ kubectl logs pod/prefect-job-e74114b4-26n6v
[2020-08-21 21:55:08] INFO - prefect.S3 | Downloading simple-flow-8/2020-08-21t21-54-33-484226-00-00 from infima-etl-flows
Traceback (most recent call last):
File "/usr/local/bin/prefect", line 8, in <module>
sys.exit(cli())
File "/usr/local/lib/python3.8/site-packages/click/core.py", line 829, in __call__
return self.main(*args, **kwargs)
File "/usr/local/lib/python3.8/site-packages/click/core.py", line 782, in main
rv = self.invoke(ctx)
File "/usr/local/lib/python3.8/site-packages/click/core.py", line 1259, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
File "/usr/local/lib/python3.8/site-packages/click/core.py", line 1259, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
File "/usr/local/lib/python3.8/site-packages/click/core.py", line 1066, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/usr/local/lib/python3.8/site-packages/click/core.py", line 610, in invoke
return callback(*args, **kwargs)
File "/usr/local/lib/python3.8/site-packages/prefect/cli/execute.py", line 80, in cloud_flow
raise exc
File "/usr/local/lib/python3.8/site-packages/prefect/cli/execute.py", line 69, in cloud_flow
flow = storage.get_flow(storage.flows[flow_data.name])
File "/usr/local/lib/python3.8/site-packages/prefect/environments/storage/s3.py", line 101, in get_flow
return cloudpickle.loads(output)
TypeError: an integer is required (got type bytes)
EDIT - sorry, I suppose I should have posted this issue thread in the prefect-community channel ... just noticed I am on the wrong channel