Ilya Galperin
08/08/2022, 8:13 PM
We are using the prefecthq/prefect:2.0.2-python3.8 image and trying to execute the following flow from this tutorial on a Kubernetes cluster pointing to an S3 storage block. When doing so, we receive the following error:
RuntimeError: File system created with scheme 's3' from base path 's3://<mybucket>' could not be created. You are likely missing a Python module required to use the given storage protocol.
It looks like others who have experienced this have had to manually run pip install s3fs in their execution environment to get S3 external storage working with the Kubernetes execution environment. Is this the recommended deployment pattern for now? If so, is there a plan to start including these dependencies in the prefect filesystems package? It seems strange that we’d need a custom image for something that is supposed to already be tightly integrated with Prefect 2.0 and a fundamental requirement for using k8s infrastructure.
import prefect
from prefect import task, flow, get_run_logger
from prefect.filesystems import S3

s3_block = S3.load("aws-s3")

@task
def hello_world():
    logger = get_run_logger()
    text = "hello from orion_flow!"
    logger.info(text)
    return text

@flow(name="orion_flow")
def orion_flow():
    logger = get_run_logger()
    logger.info("Hello from Kubernetes!")
    hw = hello_world()
    return
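The error above says a Python module needed for the s3 protocol is likely missing. One rough way to confirm that from inside the execution environment is to test whether the module can be found at all. This is only a sketch: the helper name missing_modules is made up for illustration, and it assumes s3fs is the module in question, as suggested in this thread.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be found in this environment.

    Hypothetical helper; intended for top-level names such as "s3fs" only.
    """
    return [name for name in names if importlib.util.find_spec(name) is None]


# Assumption: s3fs is the module needed for s3:// paths, per the thread above.
missing = missing_modules(["s3fs"])
if missing:
    print("Not importable here:", ", ".join(missing))
else:
    print("s3fs is importable in this environment")
```

Running this in the same image or pod as the flow run should show whether the module is actually available where the flow executes.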
Anna Geller
Ilya Galperin
08/08/2022, 8:44 PM
It looks like EXTRA_PIP_PACKAGES does not work for kubernetes-job infrastructure types. Is that incorrect? I’ve tried the following, but it doesn’t seem to have any effect.
infrastructure:
  type: kubernetes-job
  env: {'EXTRA_PIP_PACKAGES': 's3fs'}
Anna Geller
Ilya Galperin
08/09/2022, 4:36 PM
I checked whether s3fs is getting installed on the pod using the EXTRA_PIP_PACKAGES argument, and it does not seem to be. If I comment out the pip list command here and run python -m prefect.engine instead, I continue to get the error described above. Am I maybe doing something wrong?
name: orion_flow_demo
description: null
version: 5de7c987b95f4eea5c6bff3a8585460b
tags:
- kubernetes
parameters: {}
schedule: null
infrastructure:
  type: kubernetes-job
  env: {'EXTRA_PIP_PACKAGES': 's3fs'}
  labels: {}
  name: null
  command:
  - pip
  - list
  # - python
  # - -m
  # - prefect.engine
  image: prefecthq/prefect:2.0.2-python3.8
  namespace: prefect2
  service_account_name: null
  image_pull_policy: null
  cluster_config: null
  job:
    apiVersion: batch/v1
    kind: Job
    metadata:
      labels: {}
    spec:
      template:
        spec:
          parallelism: 1
          completions: 1
          restartPolicy: Never
          containers:
          - name: prefect-job
            env: []
  customizations: []
  job_watch_timeout_seconds: 5
  pod_watch_timeout_seconds: 60
  stream_output: true
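As an alternative to reading through pip list output, a small script run in the pod could report whether each package named in EXTRA_PIP_PACKAGES actually ended up installed. This is a sketch under assumptions: the function name check_extra_packages is hypothetical, and it assumes the variable holds bare, space-separated package names (no version specifiers).

```python
import os
from importlib import metadata


def check_extra_packages(env=os.environ):
    """Map each package named in EXTRA_PIP_PACKAGES to its installed version.

    A value of None means the package is not installed in this environment.
    Hypothetical helper; assumes bare, space-separated package names.
    """
    requested = env.get("EXTRA_PIP_PACKAGES", "").split()
    status = {}
    for name in requested:
        try:
            status[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            status[name] = None
    return status


for package, version in check_extra_packages().items():
    print(package, "->", version if version is not None else "NOT INSTALLED")
```

If the variable is set on the pod but the package shows as not installed, that would suggest the install step never ran for this job, rather than the variable failing to reach the container.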
Anna Geller
Ilya Galperin
08/09/2022, 5:25 PM