Ilya Galperin

08/08/2022, 8:13 PM
Hi all - we are running prefecthq/prefect:2.0.2-python3.8 and trying to execute the following flow from this tutorial on a Kubernetes cluster pointing to an S3 storage block. When doing so, we receive the following error:
RuntimeError: File system created with scheme 's3' from base path 's3://<mybucket>' could not be created. You are likely missing a Python module required to use the given storage protocol.
It looks like others who have hit this have had to manually run pip install s3fs in their execution environment to get S3 external storage working with the Kubernetes execution environment. Is this the recommended deployment pattern for now? If so, is there a plan to start including these dependencies in the prefect filesystems package? It seems strange that we’d need a custom image for something that is supposed to already be tightly integrated with Prefect 2.0 and is a fundamental requirement for using k8s infrastructure.
import prefect
from prefect import task, flow, get_run_logger
from prefect.filesystems import S3

s3_block = S3.load("aws-s3")


@task
def hello_world():
    logger = get_run_logger()
    text = "hello from orion_flow!"
    logger.info(text)
    return text


@flow(name="orion_flow")
def orion_flow():
    logger = get_run_logger()
    logger.info("Hello from Kubernetes!")
    hw = hello_world()
    return
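
One quick way to confirm whether the module named in the error is actually present is to run a small check inside the same image or pod. This is only a sketch: the helper name missing_modules is hypothetical, and it only tests importability, not whether the S3 block itself is configured correctly.

```python
import importlib.util
from typing import Iterable, List


def missing_modules(names: Iterable[str]) -> List[str]:
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # s3fs is the module the error message above points at.
    missing = missing_modules(["s3fs"])
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("s3fs is importable")
```

If s3fs shows up as missing inside the pod, the S3 block cannot build its file system there, which would match the RuntimeError above.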

Anna Geller

08/08/2022, 8:36 PM

Ilya Galperin

08/08/2022, 8:44 PM
My understanding is that EXTRA_PIP_PACKAGES does not work for kubernetes-job infrastructure types. Is that incorrect? I’ve tried the following, but it doesn’t seem to have any effect.
infrastructure:
  type: kubernetes-job
  env: {'EXTRA_PIP_PACKAGES': 's3fs'}

Anna Geller

08/09/2022, 10:30 AM
It should work for Kubernetes too - LMK if not

Ilya Galperin

08/09/2022, 4:36 PM
Hi Anna, I’ve tried the following deployment to check whether s3fs gets installed on the pod via the EXTRA_PIP_PACKAGES environment variable, and it does not seem to be. If I comment out the pip list command here and run python -m prefect.engine instead, I continue to get the error described above. Am I maybe doing something wrong?
name: orion_flow_demo
description: null
version: 5de7c987b95f4eea5c6bff3a8585460b
tags:
- kubernetes
parameters: {}
schedule: null
infrastructure:
  type: kubernetes-job
  env: {'EXTRA_PIP_PACKAGES': 's3fs'}
  labels: {}
  name: null
  command:
  - pip
  - list
  # - python
  # - -m
  # - prefect.engine
  image: prefecthq/prefect:2.0.2-python3.8
  namespace: prefect2
  service_account_name: null
  image_pull_policy: null
  cluster_config: null
  job:
    apiVersion: batch/v1
    kind: Job
    metadata:
      labels: {}
    spec:
      template:
        spec:
          parallelism: 1
          completions: 1
          restartPolicy: Never
          containers:
          - name: prefect-job
            env: []
  customizations: []
  job_watch_timeout_seconds: 5
  pod_watch_timeout_seconds: 60
  stream_output: true

Anna Geller

08/09/2022, 5:19 PM
Doesn't seem so - I'd suggest opening a GitHub issue. Since the package gets installed at runtime, I don't see why it shouldn't be possible.
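
For context, "installed at runtime" means something in the container has to read EXTRA_PIP_PACKAGES at startup and run pip before the flow starts. Below is a minimal sketch of that idea, not Prefect's actual startup code: the function name build_pip_install_command is hypothetical, and the space-separated package list is an assumption.

```python
import os
import shlex
import sys
from typing import List, Mapping, Optional


def build_pip_install_command(env: Mapping[str, str]) -> Optional[List[str]]:
    """Return a pip command for EXTRA_PIP_PACKAGES, or None if it is unset or empty."""
    # Assumption: packages are given as a space-separated list, e.g. "s3fs boto3".
    packages = shlex.split(env.get("EXTRA_PIP_PACKAGES", ""))
    if not packages:
        return None
    return [sys.executable, "-m", "pip", "install", *packages]


if __name__ == "__main__":
    command = build_pip_install_command(os.environ)
    print(command if command else "EXTRA_PIP_PACKAGES is not set; nothing to install")
```

One thing that may be worth checking when filing the issue: in Kubernetes, a container-level command replaces the image's ENTRYPOINT. If the install step lives in the image entrypoint (an assumption here, not something confirmed in this thread), setting command in the job would skip it.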

Ilya Galperin

08/09/2022, 5:25 PM
Will do, thank you Anna.