Scott Pedersen
07/26/2024, 1:09 AM
Marvin
07/26/2024, 1:09 AM
To have a `KubernetesClusterConfig` work pool request CPU and memory for a pod, you can modify the relevant sections in the job template. Here are the steps:
1. Edit the JSON Representation:
Navigate to the advanced settings in your work pool and edit the JSON representation of the base job template.
2. Add Variables for CPU and Memory Requests:
Add the following sections under the `variables` section to define the CPU and memory requests:
```json
"cpu_request": {
  "title": "CPU Request",
  "description": "The CPU allocation to request for this pod.",
  "default": "default",
  "type": "string"
},
"memory_request": {
  "title": "Memory Request",
  "description": "The memory allocation to request for this pod.",
  "default": "default",
  "type": "string"
}
```
3. Update the Containers Section:
Then update the `containers` section under `job_configuration` with the following configuration:
```json
{
  "containers": [
    {
      "...": "...",
      "resources": {
        "requests": {
          "cpu": "{{ cpu_request }}",
          "memory": "{{ memory_request }}"
        }
      }
    }
  ]
}
```
This setup ensures that deployments with this work pool will request the specified CPU and memory.
You can find more details in the Prefect Kubernetes Deployment Guide.
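As an aside (not from the thread itself): Kubernetes expects resource quantities in a particular format, e.g. `100m` for CPU millicores and `128Mi` for memory. A rough client-side sanity check for values you put in these variables, offered as an illustrative sketch only (the pattern is simplified and not the full Kubernetes quantity grammar):

```python
import re

# Simplified pattern for common Kubernetes quantities: an integer or decimal
# number plus an optional SI / binary suffix. Not the complete grammar.
QUANTITY = re.compile(r"^\d+(\.\d+)?(m|k|M|G|T|Ki|Mi|Gi|Ti)?$")

def is_valid_quantity(value: str) -> bool:
    """Return True if `value` looks like a Kubernetes resource quantity."""
    return bool(QUANTITY.match(value))

for value in ["100m", "128Mi", "1", "1Gi", "lots"]:
    print(value, is_valid_quantity(value))
```

Running a check like this before updating the work pool can catch typos such as `128mb` before Kubernetes rejects the manifest.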
I hope this helps. If you have any other questions or need further assistance, feel free to ask.
Scott Pedersen
07/26/2024, 1:12 AM
Marvin
07/26/2024, 1:13 AM
Below is an example `KubernetesClusterConfig` base job template with the necessary additions to request CPU and memory for a pod.
### Example Base Job Template
```json
{
  "job_configuration": {
    "namespace": "my-namespace",
    "customizations": {
      "env": [
        {
          "name": "ENVIRONMENT_VARIABLE",
          "value": "value"
        }
      ],
      "containers": [
        {
          "image": "my-custom-image:latest",
          "name": "my-container",
          "resources": {
            "requests": {
              "cpu": "{{ cpu_request }}",
              "memory": "{{ memory_request }}"
            },
            "limits": {
              "cpu": "500m",
              "memory": "512Mi"
            }
          }
        }
      ]
    }
  },
  "variables": {
    "cpu_request": {
      "title": "CPU Request",
      "description": "The CPU allocation to request for this pod.",
      "default": "100m",
      "type": "string"
    },
    "memory_request": {
      "title": "Memory Request",
      "description": "The memory allocation to request for this pod.",
      "default": "128Mi",
      "type": "string"
    }
  }
}
```
### Explanation
1. Job Configuration:
- Namespace: Set the Kubernetes namespace to create jobs within.
- Customizations:
  - Containers: Define the container configuration, including the image and resource requests.
  - Resources: Specify the requests for CPU and memory using the `{{ cpu_request }}` and `{{ memory_request }}` variables.
2. Variables:
- Add `cpu_request` and `memory_request` under the `variables` section to allow for customization of these values.
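If you prefer to script this edit rather than use the UI, the insertion can be sketched in Python as below; the `base_job_template` shown is a stripped-down stand-in, not a real work pool's template:

```python
import json

# A stripped-down base job template; a real one has many more fields.
base_job_template = {
    "variables": {"type": "object", "properties": {}},
    "job_configuration": {"namespace": "my-namespace"},
}

# The two variable definitions from the steps above.
new_variables = {
    "cpu_request": {
        "title": "CPU Request",
        "description": "The CPU allocation to request for this pod.",
        "default": "100m",
        "type": "string",
    },
    "memory_request": {
        "title": "Memory Request",
        "description": "The memory allocation to request for this pod.",
        "default": "128Mi",
        "type": "string",
    },
}

# Merge the new variables into the template's properties and print the result.
base_job_template["variables"]["properties"].update(new_variables)
print(json.dumps(base_job_template, indent=2))
```

The resulting JSON can then be pasted back into the work pool's advanced settings.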
### Deployment Guide
For more detailed steps and examples on deploying Kubernetes clusters with Prefect, refer to the Prefect Kubernetes Deployment Guide.
If you have any more questions or need further assistance, feel free to ask.
Marvin
07/26/2024, 1:17 AM
this is what I see in the base job template and how the KubernetesClusterConfig and job_configuration have been set up can you please use this to add the cpu_request and memory_request variables?
```json
{ "variables": { "type": "object", "properties": { .... }, "definitions": { "KubernetesClusterConfig": { "type": "object", "title": "KubernetesClusterConfig", "required": [ "config", "context_name" ], "properties": { "config": { "type": "object", "title": "Config", "description": "The entire contents of a kubectl config file." }, "context_name": { "type": "string", "title": "Context Name", "description": "The name of the kubectl context to use." }, "description": "Stores configuration for interaction with Kubernetes clusters.\n\nSee `from_file` for creation.", "secret_fields": [], "block_type_slug": "kubernetes-cluster-config", "block_schema_references": {} } }, "description": "Default variables for the Kubernetes worker.\n\nThe schema for this class is used to populate the `variables` section of the default\nbase job template." }, "job_configuration": { "env": "{{ env }}", "name": "{{ name }}
```
Scott Pedersen
07/26/2024, 1:21 AM
```json
{
  "variables": {
    "type": "object",
    "properties": {
      ....
    },
    "definitions": {
      "KubernetesClusterConfig": {
        "type": "object",
        "title": "KubernetesClusterConfig",
        "required": [
          "config",
          "context_name"
        ],
        "properties": {
          "config": {
            "type": "object",
            "title": "Config",
            "description": "The entire contents of a kubectl config file."
          },
          "context_name": {
            "type": "string",
            "title": "Context Name",
            "description": "The name of the kubectl context to use."
          }
        },
        "description": "Stores configuration for interaction with Kubernetes clusters.\n\nSee `from_file` for creation.",
        "secret_fields": [],
        "block_type_slug": "kubernetes-cluster-config",
        "block_schema_references": {}
      }
    }
  },
  "job_configuration": {
    "env": "{{ env }}",
    "name": "{{ name }}",
    "namespace": "{{ namespace }}",
    "job_manifest": "{{ job_manifest }}",
    "stream_output": "{{ stream_output }}",
    "cluster_config": "{{ cluster_config }}"
  }
}
```
Marvin
07/26/2024, 1:21 AM
Here is your base job template extended with the `cpu_request` and `memory_request` variables:
### Extended Base Job Template
```json
{
  "variables": {
    "type": "object",
    "properties": {
      ...,
      "cpu_request": {
        "type": "string",
        "title": "CPU Request",
        "description": "The CPU allocation to request for this pod.",
        "default": "100m"
      },
      "memory_request": {
        "type": "string",
        "title": "Memory Request",
        "description": "The memory allocation to request for this pod.",
        "default": "128Mi"
      }
    },
    "definitions": {
      "KubernetesClusterConfig": {
        "type": "object",
        "title": "KubernetesClusterConfig",
        "required": [
          "config",
          "context_name"
        ],
        "properties": {
          "config": {
            "type": "object",
            "title": "Config",
            "description": "The entire contents of a kubectl config file."
          },
          "context_name": {
            "type": "string",
            "title": "Context Name",
            "description": "The name of the kubectl context to use."
          }
        },
        "description": "Stores configuration for interaction with Kubernetes clusters.\n\nSee `from_file` for creation.",
        "secret_fields": [],
        "block_type_slug": "kubernetes-cluster-config",
        "block_schema_references": {}
      }
    }
  },
  "job_configuration": {
    "env": "{{ env }}",
    "name": "{{ name }}",
    "namespace": "{{ namespace }}",
    "job_manifest": "{{ job_manifest }}",
    "stream_output": "{{ stream_output }}",
    "cluster_config": "{{ cluster_config }}",
    "containers": [
      {
        "resources": {
          "requests": {
            "cpu": "{{ cpu_request }}",
            "memory": "{{ memory_request }}"
          }
        }
      }
    ]
  }
}
```
### Explanation
1. Variables Section:
- Added `cpu_request` and `memory_request` to the `properties` section with their respective descriptions and default values.
2. Job Configuration Section:
- Updated the `containers` subsection of the `job_configuration` to include `cpu_request` and `memory_request`.
With these adjustments, your deployments can now request the specified CPU and memory resources as defined in the variables.
For more detailed steps on Prefect deployment on Kubernetes, refer to the Prefect Kubernetes Deployment Guide.
If you need further assistance, feel free to ask.
Nate
07/26/2024, 1:50 AM
Nate
07/26/2024, 1:50 AM
Marvin
07/26/2024, 1:51 AM
You can use the `--job-variable` flag for runs kicked off via the CLI:
```bash
prefect deployment run \
  --id "<DEPLOYMENT_ID>" \
  --job-variable MY_NEW_ENV_VAR=42 \
  --job-variable HELLO=THERE
```
- Using Job Variables in Automations: Runs kicked off via automation actions can use job variables, including ones rendered from Jinja templates.
3. Examples:
- `env` Variable Example:
```python
import os

from prefect import flow, task


@task
def do_something_important(not_so_secret_value: str) -> None:
    print(f"Doing something important with {not_so_secret_value}!")


@flow(log_prints=True)
def some_work():
    environment = os.environ.get("EXECUTION_ENVIRONMENT", "local")
    print(f"Coming to you live from {environment}!")
    not_so_secret_value = os.environ.get("MY_NOT_SO_SECRET_CONFIG")
    do_something_important(not_so_secret_value)
```
4. Further Details:
- For more advanced configurations and examples, refer to the relevant sections of the Overriding Work Pool Job Variables documentation.
By following these guidelines, you should be able to effectively override job variables for your deployments.
If you have any more questions or need additional assistance, feel free to ask.
Scott Pedersen
07/26/2024, 2:19 AM
Scott Pedersen
07/26/2024, 2:24 AM
Nate
07/26/2024, 2:38 AM
Scott Pedersen
07/26/2024, 3:10 AM
We are using `variables` then applying them to `containers` in the `job_configuration`. We have `annotations` I would like to add as well, just not sure where they should go.
Scott Pedersen
07/27/2024, 11:31 PM
```json
{
  "op": "add",
  "path": "/spec/template/metadata",
  "value": {
    "annotations": {
      "sumologic.com/sourceCategory": source_category,
      "cluster-autoscaler.kubernetes.io/safe-to-evict": "false"
    }
  }
}
```
Marvin
07/27/2024, 11:32 PM
Here is the base job template with those annotations added:
```json
{
  "variables": {
    "type": "object",
    "properties": {
      "cpu_request": {
        "type": "string",
        "title": "CPU Request",
        "description": "The CPU allocation to request for this pod.",
        "default": "100m"
      },
      "memory_request": {
        "type": "string",
        "title": "Memory Request",
        "description": "The memory allocation to request for this pod.",
        "default": "128Mi"
      }
    },
    "definitions": {
      "KubernetesClusterConfig": {
        "type": "object",
        "title": "KubernetesClusterConfig",
        "required": ["config", "context_name"],
        "properties": {
          "config": {
            "type": "object",
            "title": "Config",
            "description": "The entire contents of a kubectl config file."
          },
          "context_name": {
            "type": "string",
            "title": "Context Name",
            "description": "The name of the kubectl context to use."
          }
        },
        "description": "Stores configuration for interaction with Kubernetes clusters.\n\nSee `from_file` for creation.",
        "secret_fields": [],
        "block_type_slug": "kubernetes-cluster-config",
        "block_schema_references": {}
      }
    }
  },
  "job_configuration": {
    "env": "{{ env }}",
    "name": "{{ name }}",
    "namespace": "{{ namespace }}",
    "job_manifest": "{{ job_manifest }}",
    "stream_output": "{{ stream_output }}",
    "cluster_config": "{{ cluster_config }}",
    "containers": [
      {
        "resources": {
          "requests": {
            "cpu": "{{ cpu_request }}",
            "memory": "{{ memory_request }}"
          }
        }
      }
    ],
    "metadata": {
      "annotations": {
        "sumologic.com/sourceCategory": "source_category",
        "cluster-autoscaler.kubernetes.io/safe-to-evict": "false"
      }
    }
  }
}
```
### Explanation
1. Variables Section:
- Added `cpu_request` and `memory_request` variables under the `properties` section for CPU and memory customization.
2. Job Configuration Section:
- In the `containers` section, added the resource requests for CPU and memory using the variables.
- Added the `metadata` section with the specified annotations.
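As a side note, a JSON-patch-style `add` operation like the one posted earlier can be applied to a manifest dict with a small helper. This is an illustrative sketch, not Prefect or Kubernetes library code; `apply_add_op` and the example manifest are assumptions:

```python
def apply_add_op(doc: dict, op: dict) -> dict:
    """Apply a JSON-patch-style "add" operation to a nested dict in place."""
    assert op["op"] == "add"
    parts = op["path"].strip("/").split("/")
    target = doc
    # Walk down to the parent of the final path segment, creating dicts as needed.
    for key in parts[:-1]:
        target = target.setdefault(key, {})
    target[parts[-1]] = op["value"]
    return doc

manifest = {"spec": {"template": {"spec": {"containers": []}}}}
patch = {
    "op": "add",
    "path": "/spec/template/metadata",
    "value": {"annotations": {"cluster-autoscaler.kubernetes.io/safe-to-evict": "false"}},
}
apply_add_op(manifest, patch)
print(manifest["spec"]["template"]["metadata"])
```

This mirrors where the annotations end up in the rendered job manifest: under `spec.template.metadata`.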
### References
For more information on customizing job templates and annotations, please refer to the Prefect documentation on Overriding Work Pool Job Variables.
Additionally, you can check the related GitHub issue discussing adding annotations to Kubernetes jobs.
Feel free to ask if you need any further assistance!
Scott Pedersen
07/27/2024, 11:35 PM
Can I use `flow.from_source().deploy()` to achieve this?
Marvin
07/27/2024, 11:35 PM
Yes, you can set job variables with the `flow.from_source().deploy()` method. Below is a detailed example:
### Example Code
```python
import os

from prefect import flow


@flow
def some_work():
    print("Running some work...")


# Define your job variables
job_variables = {
    "env": {
        "EXECUTION_ENVIRONMENT": os.environ.get("EXECUTION_ENVIRONMENT", "local"),
        "MY_NOT_SO_SECRET_CONFIG": os.environ.get("MY_NOT_SO_SECRET_CONFIG", "default_value"),
    },
    "cpu_request": "200m",
    "memory_request": "256Mi"
}

# Deploy the flow with overridden job variables
some_work.deploy(
    name="demo-deployment",
    work_pool_name="my-work-pool",
    job_variables=job_variables
)
```
### Explanation
1. Define the Flow: Define your Prefect flow using the `@flow` decorator.
2. Set Job Variables: Create a dictionary with the job variables you wish to override, for example environment variables and CPU and memory requests.
3. Deploy the Flow: Use the `deploy` method on the flow instance, specifying the deployment name, work pool name, and the `job_variables` dictionary.
### Reference
For more information and detailed steps on overriding job variables, including different methods and additional customization options, refer to the Overriding Work Pool Job Variables in the official Prefect documentation.
If you need further assistance or have more questions, feel free to ask!
Scott Pedersen
07/27/2024, 11:38 PM
Does the `metadata` section in the `job_configuration` require a `variable` to be declared for `source_category`?
```json
"metadata": {
  "annotations": {
    "sumologic.com/sourceCategory": "{{ source_category }}",
    "cluster-autoscaler.kubernetes.io/safe-to-evict": "false"
  }
}
```
Marvin
07/27/2024, 11:39 PM
Yes. To use a placeholder (`{{ source_category }}`) in the `metadata` section of your job configuration, you need to declare a corresponding variable in the `variables` section. This ensures that Prefect knows how to replace the placeholder with the appropriate value at runtime.
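One way to catch this kind of mismatch early is to scan the `job_configuration` for `{{ ... }}` placeholders and compare them against the declared variable names. The sketch below is illustrative only: the `template` dict is a toy stand-in, and a real base job template also ships built-in variables such as `env` and `name` that would need to be accounted for:

```python
import json
import re

# Toy template: memory/cpu are declared, source_category is not.
template = {
    "variables": {"properties": {"cpu_request": {}, "memory_request": {}}},
    "job_configuration": {
        "containers": [{"resources": {"requests": {"cpu": "{{ cpu_request }}"}}}],
        "metadata": {"annotations": {"sumologic.com/sourceCategory": "{{ source_category }}"}},
    },
}

# Collect every {{ name }} placeholder used anywhere in job_configuration.
placeholders = set(re.findall(r"\{\{\s*(\w+)\s*\}\}", json.dumps(template["job_configuration"])))
declared = set(template["variables"]["properties"])
undeclared = placeholders - declared
print(undeclared)
```

Anything left in `undeclared` is a placeholder that would not be filled from the work pool's variables.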
### Example Extended Base Job Template
Below is your base job template adjusted to include a `source_category` variable:
```json
{
  "variables": {
    "type": "object",
    "properties": {
      "cpu_request": {
        "type": "string",
        "title": "CPU Request",
        "description": "The CPU allocation to request for this pod.",
        "default": "100m"
      },
      "memory_request": {
        "type": "string",
        "title": "Memory Request",
        "description": "The memory allocation to request for this pod.",
        "default": "128Mi"
      },
      "source_category": {
        "type": "string",
        "title": "Source Category",
        "description": "The source category for Sumologic annotations.",
        "default": "default-source-category"
      }
    },
    "definitions": {
      "KubernetesClusterConfig": {
        "type": "object",
        "title": "KubernetesClusterConfig",
        "required": ["config", "context_name"],
        "properties": {
          "config": {
            "type": "object",
            "title": "Config",
            "description": "The entire contents of a kubectl config file."
          },
          "context_name": {
            "type": "string",
            "title": "Context Name",
            "description": "The name of the kubectl context to use."
          }
        },
        "description": "Stores configuration for interaction with Kubernetes clusters.\n\nSee `from_file` for creation.",
        "secret_fields": [],
        "block_type_slug": "kubernetes-cluster-config",
        "block_schema_references": {}
      }
    }
  },
  "job_configuration": {
    "env": "{{ env }}",
    "name": "{{ name }}",
    "namespace": "{{ namespace }}",
    "job_manifest": "{{ job_manifest }}",
    "stream_output": "{{ stream_output }}",
    "cluster_config": "{{ cluster_config }}",
    "containers": [
      {
        "resources": {
          "requests": {
            "cpu": "{{ cpu_request }}",
            "memory": "{{ memory_request }}"
          }
        }
      }
    ],
    "metadata": {
      "annotations": {
        "sumologic.com/sourceCategory": "{{ source_category }}",
        "cluster-autoscaler.kubernetes.io/safe-to-evict": "false"
      }
    }
  }
}
```
### Explanation
1. Variables Section:
- Added `source_category` to the `properties` section. This variable will be used to replace `{{ source_category }}` in the annotations.
2. Job Configuration Section:
- Included `source_category` in the `metadata.annotations` section.
By setting up your base job template this way, you can override the `source_category` value when deploying or running your jobs.
For more information on job variables and how to override them, refer to the Overriding Work Pool Job Variables documentation.
If you need further assistance, feel free to ask!
Scott Pedersen
07/28/2024, 12:06 AM
If `flow.from_source().deploy()` returns a `deployment_id`, is `deployment_id.apply()` still required?
Marvin
07/28/2024, 12:06 AM
When you use `flow.from_source().deploy()`, the returned `deployment_id` represents the unique identifier of the created deployment. However, calling `deployment_id.apply()` is not necessary unless you have specific requirements to manually apply or trigger the deployment.
### Explanation
After executing `flow.from_source().deploy()`, the created deployment is already registered and ready to run per the configuration specified during deployment. The `deployment_id` serves as a reference to this deployment.
### Example Code
Here's a step-by-step example using `flow.from_source().deploy()`:
```python
from prefect import flow


@flow
def some_work():
    print("Running some work...")


# Deploy the flow
deployment_id = some_work.deploy(
    name="demo-deployment",
    work_pool_name="my-work-pool",
    job_variables={
        "cpu_request": "200m",
        "memory_request": "256Mi",
        "env": {
            "EXECUTION_ENVIRONMENT": "local",
            "MY_NOT_SO_SECRET_CONFIG": "default_value"
        }
    }
)

# Optionally, apply the deployment if you need to trigger it manually
# deployment_id.apply()
```
### When to Use `deployment_id.apply()`
- Manual Trigger: If you want to trigger the deployment manually after creation.
- Further Configuration or Updates: If further adjustments or manual interventions are needed post-deployment.
### References
For more details, refer to the Prefect Deployment Tutorial and the Concepts of Deployments.
If you have any more questions or need additional assistance, feel free to ask!
Marvin
07/28/2024, 12:07 AM
When you use `flow.from_source().deploy()`, the returned `deployment_id` represents the unique identifier of the created deployment, and calling `deployment_id.apply()` is not required for the deployment to be active or registered. The deployment is already registered and ready to be scheduled or triggered when `deploy()` is called.
### Example Code
Deploying a flow and assigning a deployment ID:
```python
from prefect import flow


@flow
def some_work():
    print("Running some work...")


# Deploy the flow
deployment_id = some_work.deploy(
    name="demo-deployment",
    work_pool_name="my-work-pool",
    job_variables={
        "cpu_request": "200m",
        "memory_request": "256Mi",
        "env": {
            "EXECUTION_ENVIRONMENT": "local",
            "MY_NOT_SO_SECRET_CONFIG": "default_value"
        }
    }
)
```
### When to Use `deployment_id.apply()`
You generally do not need to call `deployment_id.apply()` after using `deploy()` unless you have specific use cases such as:
- Manual Triggering: If you want to manually trigger the deployment right after creation without waiting for its scheduled time.
- Further Configuration: If additional manual steps are required post-deployment for customization or specific business logic.
For most common scenarios, calling `.deploy()` is sufficient to register and activate the deployment.
### References
For additional information on deploying flows and managing deployments, refer to the Prefect Deployment Tutorial and the Concepts of Deployments.
Feel free to ask if you need further clarification or assistance!
Scott Pedersen
07/28/2024, 12:14 AM
We use `S3.load(storage_block)` to load our flow code and our image is stored in Artifactory. I would like to know if I have set up the `from_source` and `deploy` method call correctly and whether you can give me any suggestions?
```python
prefect_deployment = my_flow.from_source(
    source=s3_storage,
    entrypoint="flow:my_flow"
).deploy(
    name=deployment_name,
    work_pool_name=workspace_settings.work_pool_name,
    work_queue_name=workspace_settings.kubernetes.namespace,
    job_variables=job_variables,
    tags=tags,
    schedule=schedule,
    is_schedule_active=bool(schedule),
    parameters=deployment_parameters,
    image=image,
    build=False
)
```
Marvin
07/28/2024, 12:14 AM
Your use of the `from_source` and `deploy` methods to load your flow code from S3 and use an image stored in Artifactory seems correct. Here's a detailed breakdown and validation of your setup:
### Setup Validation
1. Loading Flow Code from S3:
- Using `source=s3_storage` to point to your S3 storage block is correct.
- The `entrypoint="flow:my_flow"` specifies the entry point for your flow, which is also correct.
2. Deploying the Flow:
- You are populating all the necessary parameters, such as `name`, `work_pool_name`, `work_queue_name`, `job_variables`, `tags`, `schedule`, `is_schedule_active`, `parameters`, and `image`.
- Setting `image=image` to specify the Docker image stored in Artifactory.
- Using `build=False` to indicate not to build a new image, which is appropriate since you have a pre-built image.
### Example Code with Comments
Below is your provided code with slight adjustments and comments for clarity:
```python
from prefect import flow
from prefect.filesystems import S3


# Define your flow
@flow
def my_flow():
    print("Running my flow...")


# Load the S3 storage block
s3_storage = S3.load("your-s3-storage-block")

# Set other required deployment parameters
deployment_name = "my-deployment"
workspace_settings = {
    "work_pool_name": "my-work-pool",
    "kubernetes": {
        "namespace": "my-namespace"
    }
}
job_variables = {
    "cpu_request": "200m",
    "memory_request": "256Mi",
    "env": {
        "EXECUTION_ENVIRONMENT": "local",
        "MY_NOT_SO_SECRET_CONFIG": "default_value"
    }
}
tags = ["example"]
schedule = None  # or set your schedule object
deployment_parameters = {"param1": "value1"}
image = "your-artifactory-image:latest"

# Deploy the flow using from_source and deploy methods
prefect_deployment = my_flow.from_source(
    source=s3_storage,
    entrypoint="flow:my_flow"
).deploy(
    name=deployment_name,
    work_pool_name=workspace_settings["work_pool_name"],
    work_queue_name=workspace_settings["kubernetes"]["namespace"],
    job_variables=job_variables,
    tags=tags,
    schedule=schedule,
    is_schedule_active=bool(schedule),
    parameters=deployment_parameters,
    image=image,
    build=False
)
```
### Key Points
1. Source: Ensure `s3_storage` is correctly configured to point to your S3 bucket containing your flow's source code.
2. Entry Point: Validate that the entry point `flow:my_flow` matches the flow definition in your code.
3. Image: Confirm that your Docker image in Artifactory is publicly available or accessible by the Prefect workers executing the flow runs.
4. Work Pool and Queue: Verify that the work pool (`work_pool_name`) and work queue (`work_queue_name`) configurations match your infrastructure setup.
### References
For additional guidance on deploying and managing flows in Prefect, refer to the following documentation:
- Deploying Flows
- Upgrading from Agents to Workers
- Prefect Deployment Guide
Feel free to ask if you need any further clarification or assistance!
Scott Pedersen
07/29/2024, 1:30 AM
`source_category` is the variable defined in the base job template and `my_source_category` is the value:
```python
job_variables = {
    "cpu_request": "200m",
    "memory_request": "256Mi",
    "source_category": my_source_category,
    "env": {
        "EXECUTION_ENVIRONMENT": "local",
        "MY_NOT_SO_SECRET_CONFIG": "default_value"
    }
}
```
Marvin
07/29/2024, 1:30 AM
Yes, you can add `source_category` to the `job_variables` dictionary in the same way as other variables. By including `"source_category": my_source_category` in the `job_variables`, you ensure that the `source_category` variable defined in the base job template is populated with the appropriate value.
### Example Code
```python
job_variables = {
    "cpu_request": "200m",
    "memory_request": "256Mi",
    "source_category": "my_source_category",  # Add this line
    "env": {
        "EXECUTION_ENVIRONMENT": "local",
        "MY_NOT_SO_SECRET_CONFIG": "default_value"
    }
}
```
### Full Context with Deployment
Here’s how it fits into your full script with the deployment:
```python
from prefect import flow
from prefect.filesystems import S3


# Define your flow
@flow
def my_flow():
    print("Running my flow...")


# Load the S3 storage block
s3_storage = S3.load("your-s3-storage-block")

# Set other required deployment parameters
deployment_name = "my-deployment"
workspace_settings = {
    "work_pool_name": "my-work-pool",
    "kubernetes": {
        "namespace": "my-namespace"
    }
}
job_variables = {
    "cpu_request": "200m",
    "memory_request": "256Mi",
    "source_category": "my_source_category",  # Add the source_category variable here
    "env": {
        "EXECUTION_ENVIRONMENT": "local",
        "MY_NOT_SO_SECRET_CONFIG": "default_value"
    }
}
tags = ["example"]
schedule = None  # or set your schedule object
deployment_parameters = {"param1": "value1"}
image = "your-artifactory-image:latest"

# Deploy the flow using from_source and deploy methods
prefect_deployment = my_flow.from_source(
    source=s3_storage,
    entrypoint="flow:my_flow"
).deploy(
    name=deployment_name,
    work_pool_name=workspace_settings["work_pool_name"],
    work_queue_name=workspace_settings["kubernetes"]["namespace"],
    job_variables=job_variables,
    tags=tags,
    schedule=schedule,
    is_schedule_active=bool(schedule),
    parameters=deployment_parameters,
    image=image,
    build=False
)
```
### Summary
By adding `"source_category": "my_source_category"` to the `job_variables`, you correctly populate the `source_category` variable in your base job template.
For additional information on job variables and deployment configurations, refer to the Prefect documentation on Overriding Work Pool Job Variables.
If you have any more questions or need further assistance, feel free to ask!
Scott Pedersen
08/01/2024, 6:29 AM
I would like to override the fields in the `job_manifest` section in the `base_job_template`, for example `"cpu": "{{ cpu_request }}"` with `"1"`, from within Python code. How do I do this? Can it be done as part of the `Flow.deploy()` method?
```json
"job_manifest": {
  "kind": "Job",
  "spec": {
    "template": {
      "spec": {
        "containers": [
          {
            "env": "{{ env }}",
            "args": "{{ command }}",
            "name": "prefect-job",
            "image": "{{ image }}",
            "resources": {
              "limits": {
                "cpu": "{{ cpu_limit }}",
                "memory": "{{ mem_limit }}"
              },
              "requests": {
                "cpu": "{{ cpu_request }}",
                "memory": "{{ mem_request }}"
              }
            },
            "imagePullPolicy": "{{ image_pull_policy }}"
          }
        ],
        "completions": 1,
        "parallelism": 1,
        "restartPolicy": "Never",
        "serviceAccountName": "{{ service_account_name }}"
      }
    },
    "backoffLimit": 0,
    "ttlSecondsAfterFinished": "{{ finished_job_ttl }}"
  },
  "metadata": {
    "labels": "{{ labels }}",
    "namespace": "{{ namespace }}",
    "generateName": "{{ name }}-"
  },
  "apiVersion": "batch/v1"
},
```
Marvin
08/01/2024, 6:29 AM
Scott Pedersen
08/01/2024, 6:30 AM
Marvin
08/01/2024, 6:31 AM
To override `"cpu": "{{ cpu_request }}"` with `"1"` in the `job_manifest` section of the base job template directly from within Python code using the `Flow.deploy()` method, you need to customize the `job_configuration` while deploying.
Unfortunately, direct modifications to the `job_manifest` in the `Flow.deploy()` method aren't supported natively by Prefect; you would typically use environment variables or template values defined under `job_variables`.
However, you can define `job_variables` comprehensively to make the required overrides.
### Example Code
Here's an example of how you can set up the deployment:
```python
from prefect import flow
from prefect.filesystems import S3


# Define your flow
@flow
def my_flow():
    print("Running my flow...")


# Load the S3 storage block
s3_storage = S3.load("your-s3-storage-block")

# Set your job variables to override the CPU request with "1"
job_variables = {
    "cpu_request": "1",
    "memory_request": "256Mi",
    "env": {
        "EXECUTION_ENVIRONMENT": "local",
        "MY_NOT_SO_SECRET_CONFIG": "default_value"
    }
}

# Deploy the flow
prefect_deployment = my_flow.from_source(
    source=s3_storage,
    entrypoint="flow:my_flow"
).deploy(
    name="demo-deployment",
    work_pool_name="my-work-pool",
    work_queue_name="my-namespace",
    job_variables=job_variables,
    tags=["example"],
    schedule=None,  # or your schedule object
    is_schedule_active=False,  # or True if you want the schedule active
    parameters={"param1": "value1"},
    image="your-artifactory-image:latest",
    build=False
)
```
### Key Points
- Job Variables: By defining `cpu_request` as `"1"` in `job_variables`, this value will replace `{{ cpu_request }}` in the `job_manifest` during deployment.
- Base Job Template: Ensure that the template placeholders match the keys in the `job_variables`.
### References
For more information on job variables and deployment, refer to the Overriding Work Pool Job Variables in the official Prefect documentation.
If you have any more questions or need further assistance, feel free to ask.
Scott Pedersen
08/01/2024, 6:38 AM
`job_variables` doesn't seem to contain the same structure as `job_manifest` in the deployment configuration. It looks like the below:
```json
{
  "env": {
    ....
  },
  "image": "prefect2-image",
  "cpu_limit": "1.0",
  "namespace": "flows-namespace",
  "cpu_request": "1.0",
  "memory_limit": "1Gi",
  "memory_request": "1Gi",
  "finished_job_ttl": 86400,
  "service_account_name": "service-account"
}
```
any ideas why?
Marvin
08/01/2024, 6:38 AM
The `job_variables` are not directly reflecting the structure of the `job_manifest`. This discrepancy is because `job_variables` are used to populate the placeholders within `job_manifest` rather than directly altering the structure of `job_manifest`.
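To make that relationship concrete, here is a sketch of how a flat `job_variables` mapping fills the nested `job_manifest` placeholders. The `render` helper below is illustrative only, not Prefect's actual templating code:

```python
import re

def render(node, variables):
    """Recursively substitute {{ name }} placeholders throughout a manifest."""
    if isinstance(node, dict):
        return {k: render(v, variables) for k, v in node.items()}
    if isinstance(node, list):
        return [render(v, variables) for v in node]
    if isinstance(node, str):
        # Unknown placeholders are left as-is rather than dropped.
        return re.sub(
            r"\{\{\s*(\w+)\s*\}\}",
            lambda m: str(variables.get(m.group(1), m.group(0))),
            node,
        )
    return node

job_manifest = {
    "spec": {"template": {"spec": {"containers": [
        {"resources": {"requests": {"cpu": "{{ cpu_request }}", "memory": "{{ memory_request }}"}}}
    ]}}}
}
job_variables = {"cpu_request": "1.0", "memory_request": "1Gi"}

rendered = render(job_manifest, job_variables)
print(rendered["spec"]["template"]["spec"]["containers"][0]["resources"]["requests"])
```

The flat keys in `job_variables` end up deep inside the rendered manifest, which is why the two structures look nothing alike.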
### Adjusting `job_variables` for Correct Structuring
Here's how you can correctly populate `job_variables` to ensure they override the placeholders in `job_manifest`:
### Example Code with Proper `job_variables`
```python
from prefect import flow
from prefect.filesystems import S3


# Define your flow
@flow
def my_flow():
    print("Running my flow...")


# Load the S3 storage block
s3_storage = S3.load("your-s3-storage-block")

# Set job variables to configure the job manifest correctly
job_variables = {
    "env": {
        "EXECUTION_ENVIRONMENT": "local",
        "MY_NOT_SO_SECRET_CONFIG": "default_value"
    },
    "image": "your-artifactory-image:latest",
    "cpu_limit": "1.0",
    "namespace": "flows-namespace",
    "cpu_request": "1.0",  # Requesting 1 CPU
    "memory_limit": "1Gi",
    "memory_request": "1Gi",
    "finished_job_ttl": 86400,
    "service_account_name": "service-account",
    "labels": {
        "example_label": "label_value"
    }
}

# Deploy the flow
prefect_deployment = my_flow.from_source(
    source=s3_storage,
    entrypoint="flow:my_flow"
).deploy(
    name="demo-deployment",
    work_pool_name="my-work-pool",
    work_queue_name="my-namespace",
    job_variables=job_variables,
    tags=["example"],
    schedule=None,  # or your schedule object
    is_schedule_active=False,  # or True if you want the schedule active
    parameters={"param1": "value1"},
    build=False
)
```
### Make Sure Placeholders Match
Ensure that the placeholders in your `job_manifest` within the base job template correspond to the `job_variables` keys:
json
"job_manifest": {
"kind": "Job",
"spec": {
"template": {
"spec": {
"containers": [
{
"env": "{{ env }}",
"args": "{{ command }}",
"name": "prefect-job",
"image": "{{ image }}",
"resources": {
"limits": {
"cpu": "{{ cpu_limit }}",
"memory": "{{ memory_limit }}"
},
"requests": {
"cpu": "{{ cpu_request }}",
"memory": "{{ memory_request }}"
}
},
"imagePullPolicy": "{{ image_pull_policy }}"
}
],
"completions": 1,
"parallelism": 1,
"restartPolicy": "Never",
"serviceAccountName": "{{ service_account_name }}"
}
},
"backoffLimit": 0,
"ttlSecondsAfterFinished": "{{ finished_job_ttl }}"
},
"metadata": {
"labels": "{{ labels }}",
"namespace": "{{ namespace }}",
"generateName": "{{ name }}-"
},
"apiVersion": "batch/v1"
}
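As a quick sanity check before deploying, you can scan the manifest for `{{ ... }}` placeholders and compare them against your `job_variables` keys. This is a hypothetical helper, not part of Prefect:

```python
# Hypothetical pre-deployment check: find every "{{ name }}" placeholder
# in the manifest JSON and report any without a matching job_variables key.
import re

manifest_json = (
    '{"spec": {"containers": [{"image": "{{ image }}", '
    '"resources": {"requests": {"cpu": "{{ cpu_request }}"}}}]}}'
)
job_variables = {"image": "my-image:latest"}  # cpu_request deliberately missing

placeholders = set(re.findall(r"\{\{\s*(\w+)\s*\}\}", manifest_json))
missing = sorted(placeholders - job_variables.keys())
print(missing)  # -> ['cpu_request']
```

Any key reported here would either need a value in `job_variables` or a default in the template's `variables` section.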
### Explanation
- `{{ cpu_request }}` and `{{ memory_request }}`: These placeholders in the `job_manifest` are replaced by the `cpu_request` and `memory_request` values from `job_variables`.
- Make sure all other placeholders, such as `{{ env }}`, `{{ image }}`, and `{{ namespace }}`, have corresponding values in the `job_variables` dictionary.
### References
For more details on configuring job manifests and job variables, please consult the Prefect documentation on Overriding Work Pool Job Variables.
If you need further clarification or have more questions, feel free to ask!
Scott Pedersen
08/01/2024, 6:42 AM
Marvin
08/01/2024, 6:42 AM
Yes, you can remove the unneeded fields from your `base_job_template`. This will effectively make them optional, and they won't be required in the `job_variables` dictionary.
### Modifying the Base Job Template
To remove fields from your `base_job_template`, you simply need to delete the unnecessary placeholders and their corresponding configuration. Here's how you can streamline the `base_job_template`:
1. Identify which fields you want to keep and which ones to remove.
2. Edit the `base_job_template` accordingly.
### Example Base Job Template Without Unnecessary Variables
json
{
"variables": {
"type": "object",
"properties": {
"cpu_request": {
"type": "string",
"title": "CPU Request",
"description": "The CPU allocation to request for this pod.",
"default": "100m"
},
"memory_request": {
"type": "string",
"title": "Memory Request",
"description": "The memory allocation to request for this pod.",
"default": "128Mi"
},
"env": {
"type": "object",
"title": "Environment Variables",
"description": "Environment variables for the container.",
"default": {}
}
},
"definitions": {
"KubernetesClusterConfig": {
"type": "object",
"title": "KubernetesClusterConfig",
"required": ["config", "context_name"],
"properties": {
"config": {
"type": "object",
"title": "Config",
"description": "The entire contents of a kubectl config file."
},
"context_name": {
"type": "string",
"title": "Context Name",
"description": "The name of the kubectl context to use."
}
},
"description": "Stores configuration for interaction with Kubernetes clusters.\n\nSee `from_file` for creation.",
"secret_fields": [],
"block_type_slug": "kubernetes-cluster-config",
"block_schema_references": {}
}
}
},
"job_configuration": {
"env": "{{ env }}",
"name": "{{ name }}",
"namespace": "{{ namespace }}",
"job_manifest": {
"kind": "Job",
"spec": {
"template": {
"spec": {
"containers": [
{
"env": "{{ env }}",
"args": "{{ command }}",
"name": "prefect-job",
"image": "{{ image }}",
"resources": {
"limits": {
"cpu": "{{ cpu_limit }}",
"memory": "{{ memory_limit }}"
},
"requests": {
"cpu": "{{ cpu_request }}",
"memory": "{{ memory_request }}"
}
},
"imagePullPolicy": "{{ image_pull_policy }}"
}
],
"completions": 1,
"parallelism": 1,
"restartPolicy": "Never",
"serviceAccountName": "{{ service_account_name }}"
}
},
"backoffLimit": 0,
"ttlSecondsAfterFinished": "{{ finished_job_ttl }}"
},
"metadata": {
"labels": "{{ labels }}",
"namespace": "{{ namespace }}",
"generateName": "{{ name }}-"
},
"apiVersion": "batch/v1"
},
"stream_output": "{{ stream_output }}",
"cluster_config": "{{ cluster_config }}"
}
}
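If you prefer to trim the template programmatically rather than hand-editing the JSON, a small helper can prune the `variables` schema down to just the keys you use. This is an illustrative sketch; `prune_variables` is not a Prefect function:

```python
# Illustrative helper (prune_variables is not a Prefect function): keep
# only the variable definitions you actually intend to expose.
import json

def prune_variables(template, keep):
    """Return a deep copy whose 'properties' contains only the `keep` keys."""
    pruned = json.loads(json.dumps(template))  # deep copy via JSON round-trip
    props = pruned["variables"]["properties"]
    pruned["variables"]["properties"] = {k: v for k, v in props.items() if k in keep}
    return pruned

template = {
    "variables": {
        "type": "object",
        "properties": {
            "cpu_request": {"type": "string", "default": "100m"},
            "memory_request": {"type": "string", "default": "128Mi"},
            "stream_output": {"type": "boolean", "default": True},
        },
    },
    "job_configuration": {},
}

slim = prune_variables(template, {"cpu_request", "memory_request"})
print(sorted(slim["variables"]["properties"]))  # -> ['cpu_request', 'memory_request']
```

Remember to remove the matching `{{ ... }}` placeholders from `job_manifest` as well, so nothing is left unresolved.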
### Updating the Deployment
Then, you can deploy your flow with only those variables you intend to override:
```python
from prefect import flow
from prefect.filesystems import S3

# Define your flow
@flow
def my_flow():
    print("Running my flow...")

# Load the S3 storage block
s3_storage = S3.load("your-s3-storage-block")

# Set job variables to override the required ones
job_variables = {
    "cpu_request": "1",
    "memory_request": "256Mi",
    "env": {
        "EXECUTION_ENVIRONMENT": "local"
    }
}
```
Marvin
08/01/2024, 6:42 AM
- Required Fields: Keep only the fields you intend to populate in the `base_job_template`.
- Optional Variables: Remove variables from the `variables` section that you do not intend to override.
- Deploy: When deploying, only the variables defined in `job_variables` will be used to populate the relevant placeholders.
This way, you simplify your deployment process by focusing only on the variables you need to override.
For more detailed guidance, refer to the Prefect documentation on job variable overrides.
If you need further assistance, feel free to ask!
Scott Pedersen
08/01/2024, 6:50 AM
Marvin
08/01/2024, 6:50 AM
Yes, you can remove the placeholders from the `job_manifest` in the `base_job_template` that you don't intend to populate. This will simplify the job manifest and the deployment configuration.
### Example Simplified Base Job Template
Here's an example of the simplified base job template:
json
{
"variables": {
"type": "object",
"properties": {
"cpu_request": {
"type": "string",
"title": "CPU Request",
"description": "The CPU allocation to request for this pod.",
"default": "100m"
},
"memory_request": {
"type": "string",
"title": "Memory Request",
"description": "The memory allocation to request for this pod.",
"default": "128Mi"
},
"cpu_limit": {
"type": "string",
"title": "CPU Limit",
"description": "The CPU limit for this pod.",
"default": "500m"
},
"memory_limit": {
"type": "string",
"title": "Memory Limit",
"description": "The memory limit for this pod.",
"default": "512Mi"
},
"env": {
"type": "object",
"title": "Environment Variables",
"description": "Environment variables for the container.",
"default": {}
},
"image": {
"type": "string",
"title": "Image",
"description": "The Docker image to use.",
"default": "prefect2-image"
},
"namespace": {
"type": "string",
"title": "Namespace",
"description": "The Kubernetes namespace.",
"default": "default"
},
"finished_job_ttl": {
"type": "integer",
"title": "Finished Job TTL",
"description": "Time to live for finished jobs in seconds.",
"default": 86400
},
"service_account_name": {
"type": "string",
"title": "Service Account Name",
"description": "The service account name.",
"default": "default"
}
}
},
"job_configuration": {
"env": "{{ env }}",
"job_manifest": {
"kind": "Job",
"spec": {
"template": {
"spec": {
"containers": [
{
"env": "{{ env }}",
"name": "prefect-job",
"image": "{{ image }}",
"resources": {
"limits": {
"cpu": "{{ cpu_limit }}",
"memory": "{{ memory_limit }}"
},
"requests": {
"cpu": "{{ cpu_request }}",
"memory": "{{ memory_request }}"
}
},
"imagePullPolicy": "IfNotPresent"
}
],
"restartPolicy": "Never",
"serviceAccountName": "{{ service_account_name }}"
}
},
"ttlSecondsAfterFinished": "{{ finished_job_ttl }}"
},
"metadata": {
"namespace": "{{ namespace }}",
"generateName": "prefect-job-"
},
"apiVersion": "batch/v1"
}
}
}
### Key Points
- Keep Only Relevant Placeholders: Ensure that only the placeholders you intend to populate (`cpu_request`, `memory_request`, `cpu_limit`, `memory_limit`, `env`, `image`, `namespace`, `finished_job_ttl`, `service_account_name`) are present in `job_manifest`.
- Remove Unnecessary Placeholders: Any other placeholders that you do not intend to populate should be removed from both `variables` and `job_manifest`.
### Updated Deployment Code
Make sure your deployment code matches the simplified base job template:
```python
from prefect import flow
from prefect.filesystems import S3

# Define your flow
@flow
def my_flow():
    print("Running my flow...")

# Load the S3 storage block
s3_storage = S3.load("your-s3-storage-block")

# Set job variables to configure the job manifest correctly
job_variables = {
    "cpu_request": "1.0",
    "memory_request": "1Gi"
}
```
Marvin
08/01/2024, 6:50 AM
This keeps the `job_manifest` simplified so that it concisely matches the `job_variables` you are dealing with in your deployment.
For more details, refer to the Prefect documentation on Job Variables.
If you have any more questions or need further assistance, feel free to ask!
Scott Pedersen
08/01/2024, 6:55 AM
Marvin
08/01/2024, 6:56 AM
If you do not provide values in `job_variables`, Prefect will either use the defaults specified in your base job template, or it may leave the placeholders as they are in the job manifest. This can lead to the job manifest being incomplete or containing placeholders that aren't replaced.
### Default Handling in the Base Job Template
1. Defaults in Variables Section:
- If you have specified default values for any variables in the `variables` section of your base job template, those default values will be used when you do not provide overrides in your `job_variables`.
- Example:
json
"cpu_request": {
"type": "string",
"title": "CPU Request",
"description": "The CPU allocation to request for this pod.",
"default": "100m"
}
2. No Defaults Provided:
- If there are no default values in the `variables` section and you do not provide values in `job_variables`, the placeholders will remain unresolved in the final job manifest, which can cause issues during deployment.
### Example Base Job Template with Defaults
Here’s an example to illustrate how you can handle defaults:
json
{
"variables": {
"type": "object",
"properties": {
"cpu_request": {
"type": "string",
"title": "CPU Request",
"description": "The CPU allocation to request for this pod.",
"default": "100m"
},
"memory_request": {
"type": "string",
"title": "Memory Request",
"description": "The memory allocation to request for this pod.",
"default": "128Mi"
},
"cpu_limit": {
"type": "string",
"title": "CPU Limit",
"description": "The CPU limit for this pod.",
"default": "500m"
},
"memory_limit": {
"type": "string",
"title": "Memory Limit",
"description": "The memory limit for this pod.",
"default": "512Mi"
},
"env": {
"type": "object",
"title": "Environment Variables",
"description": "Environment variables for the container.",
"default": {}
},
"image": {
"type": "string",
"title": "Image",
"description": "The Docker image to use.",
"default": "prefecthq/prefect:latest"
},
"namespace": {
"type": "string",
"title": "Namespace",
"description": "The Kubernetes namespace.",
"default": "default"
},
"finished_job_ttl": {
"type": "integer",
"title": "Finished Job TTL",
"description": "Time to live for finished jobs in seconds.",
"default": 86400
},
"service_account_name": {
"type": "string",
"title": "Service Account Name",
"description": "The service account name.",
"default": "default"
}
}
},
"job_configuration": {
"env": "{{ env }}",
"job_manifest": {
"kind": "Job",
"spec": {
"template": {
"spec": {
"containers": [
{
"env": "{{ env }}",
"name": "prefect-job",
"image": "{{ image }}",
"resources": {
"limits": {
"cpu": "{{ cpu_limit }}",
"memory": "{{ memory_limit }}"
},
"requests": {
"cpu": "{{ cpu_request }}",
"memory": "{{ memory_request }}"
}
},
"imagePullPolicy": "IfNotPresent"
}
],
"restartPolicy": "Never",
"serviceAccountName": "{{ service_account_name }}"
}
},
"ttlSecondsAfterFinished": "{{ finished_job_ttl }}"
},
"metadata": {
"namespace": "{{ namespace }}",
"generateName": "prefect-job-"
},
"apiVersion": "batch/v1"
}
}
}
Marvin
08/01/2024, 6:56 AM
python
from prefect import flow
from prefect.filesystems import S3

# Define your flow
@flow
def my_flow():
    print("Running my flow...")

# Load the S3 storage block
s3_storage = S3.load("your-s3-storage-block")

# Set job variables to override only some of the required ones
job_variables = {
    "cpu_request": "1.0",
    "memory_request": "1Gi",
    "env": {
        "EXECUTION_ENVIRONMENT": "local",
        "MY_NOT_SO_SECRET_CONFIG": "default_value"
    }
}

# Deploy the flow
prefect_deployment = my_flow.from_source(
    source=s3_storage,
    entrypoint="flow.py:my_flow"
).deploy(
    name="demo-deployment",
    work_pool_name="my-work-pool",
    work_queue_name="my-namespace",
    job_variables=job_variables,
    tags=["example"],
    schedule=None,  # Or your schedule object
    is_schedule_active=False,  # Or True if you want the schedule active
    parameters={"param1": "value1"},
    build=False
)
In this setup:
- `cpu_limit`, `memory_limit`, `image`, `namespace`, `finished_job_ttl`, and `service_account_name` will use their default values from the base job template.
- `cpu_request`, `memory_request`, and `env` are overridden with the values specified in `job_variables`.
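The fallback behaviour described above amounts to merging your overrides over the schema defaults. A minimal sketch of that resolution rule (illustrative, not Prefect's internal code):

```python
# Minimal sketch of the default-resolution rule (illustrative, not
# Prefect internals): overrides win; everything else falls back to the
# "default" declared in the template's variables schema.
schema_properties = {
    "cpu_request": {"type": "string", "default": "100m"},
    "memory_request": {"type": "string", "default": "128Mi"},
    "image": {"type": "string", "default": "prefecthq/prefect:latest"},
}

def resolve(overrides):
    """Merge user overrides over the defaults declared in the variables schema."""
    resolved = {name: spec.get("default") for name, spec in schema_properties.items()}
    resolved.update(overrides)
    return resolved

print(resolve({"cpu_request": "1.0"}))
# cpu_request comes from the override; the other keys use their defaults
```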
### Summary
- Include Defaults: Ensure you provide default values in the base job template for variables that might not always be overridden.
- Remove Placeholders: If no default or override will be provided and a placeholder is unnecessary, remove it to avoid unresolved placeholders in the rendered manifest.
If you have any more questions or need further assistance, feel free to ask!