Espen Overbye
07/11/2020, 10:51 AM
Chris White
07/12/2020, 11:06 PM
KubernetesJobEnvironment allows you to specify a job spec for the flow which can include mounted volumes: https://docs.prefect.io/api/latest/environments/execution.html#kubernetesjobenvironment
Espen Overbye
07/14/2020, 12:35 PM
josh
08/10/2020, 7:01 PM
Espen Overbye
08/10/2020, 7:02 PM
# flows/my_flow.py
from prefect import task, Flow
from prefect.environments.storage import GitHub, Docker
from prefect.environments import LocalEnvironment, KubernetesJobEnvironment
import pathlib


@task
def get_data():
    return [1, 2, 3, 4, 5]


@task
def print_data(data):
    print(data)


with Flow("test") as flow:
    data = get_data()
    print_data(data)

flow.storage = GitHub(
    repo="airmine-ai/prefect",  # name of repo
    path="flows/my_flow.py",  # location of flow file in repo
    secrets=["GITHUB_ACCESS_TOKEN"],  # name of personal access token secret
)

flow.environment = KubernetesJobEnvironment(
    job_spec_file="job.yaml",
    metadata={"image": "airmineacrprod.azurecr.io/griddr-prefect:2020-08-11"},
)
# job.yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: prefect-job-UUID
  labels:
    app: prefect-job-UUID
    identifier: UUID
spec:
  template:
    metadata:
      labels:
        app: prefect-job-UUID
        identifier: UUID
    spec:
      containers:
        - name: flow
          image: airmineacrprod.azurecr.io/griddr-prefect:2020-08-10
          imagePullPolicy: IfNotPresent
          volumeMounts:
            - name: azure
              mountPath: /mnt/data
      restartPolicy: Never
      volumes:
        - name: azure
          azureFile:
            secretName: griddeddata
            shareName: gridded-data
            readOnly: false
Agent deployed with prefect agent install
josh
08/10/2020, 7:06 PM
"airmineacrprod.azurecr.io/griddr-prefect:2020-08-11" is there a file called job.yaml? Generally, with storage options in the past, the job spec file is loaded at registration time and shipped with the flow pickle to wherever you store it. However, I'm not 100% sure how this plays with file-based storage (like GitHub), because the flow is not initialized until it goes to run, at which point it requires the job spec YAML to be present there.
Espen Overbye
08/10/2020, 7:09 PM
josh
08/10/2020, 7:16 PM
job_spec.yaml? That error is showing it initializing your environment and attempting to load the YAML file it has listed under self.job_spec_file, and that value is initialized as:
self.job_spec_file = os.path.abspath(job_spec_file) if job_spec_file else None
job_spec.yaml is being passed into the environment somewhere
Espen Overbye
08/10/2020, 7:29 PM
josh
08/10/2020, 7:33 PM
Espen Overbye
08/10/2020, 7:39 PM
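The path handling josh quotes can be illustrated with a small standard-library sketch. It assumes job.yaml sits next to the flow file and that the process loading the flow starts in a different directory; the thread confirms neither. resolve_from_cwd mirrors the quoted expression, while resolve_from_flow_file is a hypothetical helper for comparison and is not part of Prefect.

```python
import os
import tempfile
from pathlib import Path


def resolve_from_cwd(job_spec_file):
    """Mirror the quoted expression: a relative name is resolved against the cwd."""
    return os.path.abspath(job_spec_file) if job_spec_file else None


def resolve_from_flow_file(flow_file, job_spec_file):
    """Hypothetical helper (not Prefect API): resolve against the flow file's directory."""
    return str(Path(flow_file).resolve().parent / job_spec_file)


# Assumed layout: job.yaml lives beside flows/my_flow.py in a checkout,
# while the process that loads the flow runs from an unrelated directory.
with tempfile.TemporaryDirectory() as checkout, tempfile.TemporaryDirectory() as workdir:
    flows_dir = Path(checkout) / "flows"
    flows_dir.mkdir()
    flow_file = flows_dir / "my_flow.py"
    flow_file.write_text("# placeholder flow file\n")
    (flows_dir / "job.yaml").write_text("kind: Job\n")

    previous_cwd = os.getcwd()
    os.chdir(workdir)
    try:
        cwd_path = resolve_from_cwd("job.yaml")
        anchored_path = resolve_from_flow_file(flow_file, "job.yaml")
        cwd_exists = os.path.exists(cwd_path)
        anchored_exists = os.path.exists(anchored_path)
    finally:
        # Restore the cwd before the temporary directories are cleaned up.
        os.chdir(previous_cwd)

print("cwd-relative path exists:", cwd_exists)
print("flow-file-relative path exists:", anchored_exists)
```

Under these assumptions the cwd-relative lookup misses the file and the anchored lookup finds it. Whether this is what happens with GitHub storage is left open in the thread; either way, the job spec file has to exist wherever the flow is actually loaded.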