# prefect-community
k
Hi all! I am running my Prefect 2 flows via Prefect Cloud on a GKE Autopilot k8s cluster using Docker images. We are working on automating our deployments and have a few questions that the docs don't make clear to me, so I want to check that my intuition is correct. We are moving our deployments from the CLI to Python and have the following working for requesting custom k8s configurations.
from prefect2_flow import extract_load_transform
from prefect.deployments import Deployment
from prefect.infrastructure import KubernetesJob
from prefect.orion.schemas.schedules import CronSchedule


schedule = CronSchedule(cron="15 * * * *", timezone="UTC")

deployment = Deployment.build_from_flow(
    flow=extract_load_transform,
    name="test_hourly_elt",
    parameters={
        'collection_duration': '1h',
        'uri': 'https://google.com',
    },
    skip_upload=True,
    schedule=schedule,
    tags=["test"],
    version=2,
    work_queue_name="test-kubernetes",
    infrastructure=KubernetesJob(
        finished_job_ttl=30,
        image="us-central1-docker.pkg.dev/.../prefect2-flows/elt:latest",
        image_pull_policy="Always",
        namespace="test-prefect2",
        pod_watch_timeout_seconds=180,
        job={
            "apiVersion": "batch/v1",
            "kind": "Job",
            "metadata": {"labels": {}},
            "spec": {
                "template": {
                    "spec": {
                        "parallelism": 1,
                        "completions": 1,
                        "restartPolicy": "Never",
                        "containers": [
                            {
                                "name": "prefect-job",
                                "env": [],
                                "resources": {
                                    "requests": {
                                        "memory": "5Gi",
                                        "cpu": "2",
                                        "ephemeral-storage": "1Gi",
                                    }
                                }
                            }
                        ],
                    }
                }
            },
        }
    )
)


if __name__ == "__main__":
    deployment.apply()
1. Is it okay to set skip_upload to true? Since we are using Docker, I'm not sure what the benefit of uploading our project to GCS is.
2. The Prefect API has an is_schedule_active parameter, but it doesn't look like it has made it over to the Python API: when I add it to the call above, I get an error about passing more parameters than allowed. Is this something I can contribute to or add functionality for?
3. The k8s job configuration is pretty verbose just to request resources above the default. Is there a better way to set these that I am missing?
Thank you for the time and any help provided!
c
Hi Keith,
1. Yes. Because you are using an image, it's safe to set skip_upload. The upload step is platform- and execution-agnostic, but since your flow code is already part of your image, it is redundant for your use case.
2. Which API parameter are you referring to on the Prefect API side? Do you have a link to the code? You're correct that it's not in the deployment spec, but I can't say for certain whether it's missing intentionally or unintentionally.
3. You can define and pass in a job template / infrastructure block:
https://discourse.prefect.io/t/how-to-customize-kubernetes-jobs-with-kubernetesflowrunners-customizations/1147
https://discourse.prefect.io/t/creating-and-deploying-a-custom-kubernetes-infrastructure-block/1531
The tl;dr for 3:
from prefect.infrastructure import KubernetesJob

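# Build the infrastructure block from a Job manifest stored in a YAML file
k8s_job = KubernetesJob(
    namespace="prefect2",
    image="prefecthq/prefect:2.3.0-python3.9",
    job=KubernetesJob.job_from_file("modified_run_job.yaml"),
)

# Save the block under the name "k8sdev" so it can be reused later
k8s_job.save("k8sdev")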
Alternatively, you can drop the full job spec and pass only the customizations you need on top of the default:
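# Passed as the customizations argument of KubernetesJob (JSON Patch operations).
# imagePullSecrets belongs to the pod spec, i.e. under /spec/template/spec.
customizations=[
    {
        "op": "add",
        "path": "/spec/template/spec/imagePullSecrets",
        "value": [{"name": "dockerhub"}],
    },
],

# or, to also set resource limits (resources is a per-container field):
customizations=[
    {
        "op": "add",
        "path": "/spec/template/spec/imagePullSecrets",
        "value": [{"name": "dockerhub"}],
    },
    {
        "op": "add",
        "path": "/spec/template/spec/containers/0/resources",
        "value": {"limits": {"memory": "8Gi", "cpu": "4000m"}},
    },
],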
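To make the effect of such customizations concrete, here is a minimal sketch that applies JSON Patch "add" operations to a trimmed-down Job manifest by hand. It is only an illustration: apply_add_ops is a hypothetical helper written for this example (it is not part of Prefect and supports only "add", without JSON Pointer escaping), and base_job is a shortened stand-in for the default manifest. The paths point at the pod template and at the first container, since that is where Kubernetes defines imagePullSecrets and resources.

```python
import copy


def apply_add_ops(manifest, customizations):
    """Apply JSON Patch "add" operations to a deep copy of a manifest.

    Hypothetical helper for illustration only; handles just the "add" op.
    """
    patched = copy.deepcopy(manifest)
    for op in customizations:
        if op["op"] != "add":
            raise ValueError(f"unsupported op: {op['op']}")
        *parents, last = op["path"].strip("/").split("/")
        target = patched
        # Walk down to the parent of the location being added
        for key in parents:
            target = target[int(key)] if isinstance(target, list) else target[key]
        if isinstance(target, list):
            # "-" means append; otherwise insert at the given index
            index = len(target) if last == "-" else int(last)
            target.insert(index, op["value"])
        else:
            target[last] = op["value"]
    return patched


# Shortened stand-in for the default Job manifest (assumption for this sketch)
base_job = {
    "apiVersion": "batch/v1",
    "kind": "Job",
    "spec": {
        "template": {
            "spec": {
                "restartPolicy": "Never",
                "containers": [{"name": "prefect-job", "env": []}],
            }
        }
    },
}

customizations = [
    {
        "op": "add",
        "path": "/spec/template/spec/imagePullSecrets",
        "value": [{"name": "dockerhub"}],
    },
    {
        "op": "add",
        "path": "/spec/template/spec/containers/0/resources",
        "value": {"limits": {"memory": "8Gi", "cpu": "4000m"}},
    },
]

patched = apply_add_ops(base_job, customizations)
pod_spec = patched["spec"]["template"]["spec"]
print(pod_spec["imagePullSecrets"])
print(pod_spec["containers"][0]["resources"])
```

The original base_job is left untouched, so the same base can be patched differently per deployment.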
k
Thank you for the great response @Christopher Boyd, a lot of this I can certainly implement to make things smoother! The alternative version for (3) looks much cleaner and easier to maintain, so I'm going to adopt that method. For (2), I am referring to the Orion API's is_schedule_active flag. Here are a couple of references to the flag itself and to some test deployments:
• Flag toggling - https://github.com/PrefectHQ/prefect/blob/main/src/prefect/orion/api/deployments.py#L232
• Part of CLI response - https://github.com/PrefectHQ/prefect/blob/main/docs/concepts/deployments.md?plain=1#L449
• API create deployment with the is_schedule_active flag turned off - https://github.com/PrefectHQ/prefect/blob/main/tests/orion/models/test_deployments.py#L128
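Until the flag is available on the Python Deployment object, one possible workaround is to toggle the schedule through the REST API after the deployment is applied. The sketch below only builds the request and does not send it. It assumes the Orion API exposes POST endpoints named set_schedule_active and set_schedule_inactive under /deployments/{id}, which is my reading of the API file linked above and should be verified against your Prefect version. The helper name, the local API URL, and the deployment ID are placeholders.

```python
import urllib.request


def schedule_toggle_request(api_url, deployment_id, active, api_key=None):
    """Build (but do not send) a request that toggles a deployment's schedule.

    Hypothetical helper; the endpoint names are an assumption to verify.
    """
    action = "set_schedule_active" if active else "set_schedule_inactive"
    url = f"{api_url.rstrip('/')}/deployments/{deployment_id}/{action}"
    headers = {}
    if api_key:
        # Prefect Cloud needs an API key; a local Orion server does not
        headers["Authorization"] = f"Bearer {api_key}"
    return urllib.request.Request(url, method="POST", headers=headers)


# Placeholder values for illustration
request = schedule_toggle_request(
    api_url="http://127.0.0.1:4200/api",
    deployment_id="00000000-0000-0000-0000-000000000000",
    active=False,
)
print(request.get_method(), request.full_url)
```

Sending it would be a matter of passing the request to urllib.request.urlopen once the URL and credentials are confirmed.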
c
I’ll take a look and see what’s missing here. You’re absolutely welcome to open a pull request or issue as well; GitHub is the best place to report issues like this in the future.
d
@Ofek Katriel FYI