Slackbot
11/20/2023, 10:45 AM
Marvin
11/20/2023, 10:47 AM
Marvin
11/20/2023, 10:48 AM
prefect.yaml file used to create deployments. Variables in .yaml files are denoted by quotes and double curly brackets, like so: "{{ prefect.variables.my_variable }}".
For instance, if you want to specify which branch should be cloned for a deployment, you could reference a variable holding the branch name in your prefect.yaml file.
Here's a simplified example:
yaml
pull:
  - prefect.deployments.steps.git_clone:
      # placeholder repository URL
      repository: https://github.com/my/repo.git
      branch: "{{ prefect.variables.branch_name }}"
In this example, branch_name is a Prefect variable that you would create in the Prefect UI or through the API. Its value would be the name of the branch you want to clone for the deployment.
Please note that if a template variable is the only value for a key in the job_configuration section of a work pool's base job template and no value is provided for the template variable, the key will be removed from the job_configuration section.
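The key-removal rule above can be illustrated with a small sketch. This is plain Python written for illustration only, not Prefect's actual implementation; the resolve_templates helper and the sample keys are hypothetical.

```python
import re

# Matches a value that consists of exactly one placeholder, e.g. "{{ image }}".
SOLE_PLACEHOLDER = re.compile(r"^\{\{\s*([\w.]+)\s*\}\}$")
# Matches any placeholder embedded in a longer string.
ANY_PLACEHOLDER = re.compile(r"\{\{\s*([\w.]+)\s*\}\}")


def resolve_templates(config: dict, values: dict) -> dict:
    """Hypothetical helper: fill placeholders, dropping keys left without a value."""
    resolved = {}
    for key, raw in config.items():
        if not isinstance(raw, str):
            resolved[key] = raw
            continue
        sole = SOLE_PLACEHOLDER.match(raw)
        if sole:
            name = sole.group(1)
            # The placeholder is the only value: keep the key only if a value exists.
            if name in values:
                resolved[key] = values[name]
            continue
        # Placeholder mixed with other text: substitute known names, leave others as-is.
        resolved[key] = ANY_PLACEHOLDER.sub(
            lambda m: str(values.get(m.group(1), m.group(0))), raw
        )
    return resolved


job_configuration = {
    "image": "{{ image }}",
    "namespace": "{{ namespace }}",
    "command": "run --env {{ env }}",
}
result = resolve_templates(job_configuration, {"image": "my-image:latest", "env": "prod"})
print(result)
```

Here namespace has no value and is the only content of its key, so the key disappears from the result, while image and command are filled in.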
For more detailed information, you can refer to the following Prefect documentation pages:
- Developing a new worker type
- Variables
- Block-based deployments
Let me know if you need further assistance!
Marvin
11/20/2023, 10:55 AM
prefect.yaml file. Here's how you can do it:
In your prefect.yaml file, you can reference environment variables using the {{ $VARIABLE_NAME }} syntax. For example, if you have an environment variable named BRANCH_NAME, you can reference it in your prefect.yaml file like this:
yaml
pull:
  - prefect.deployments.steps.git_clone:
      # placeholder repository URL
      repository: https://github.com/my/repo.git
      branch: "{{ $BRANCH_NAME }}"
In this case, BRANCH_NAME is an environment variable that you would set in your deployment environment. Its value would be the name of the branch you want to clone for the deployment.
Please note that if a template variable is the only value for a key in the job_configuration section and no value is provided for the template variable, the key will be removed from the job_configuration section.
For more detailed information, you can refer to the following Prefect documentation pages:
- Developing a new worker type
- Variables
- Block-based deployments
Let me know if you need further assistance!
Marvin
11/20/2023, 2:37 PM
yaml
name: Deploy on PR
on:
  pull_request:
    types: [opened]
jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout code
        uses: actions/checkout@v2
      - name: Setup Python
        uses: actions/setup-python@v2
        with:
          python-version: '3.x'
      - name: Install dependencies
        run: |
          python -m pip install --upgrade pip
          pip install -r requirements.txt
      - name: Deploy
        run: |
          # Your deployment script here
In this workflow:
- The on section specifies that the workflow should be triggered when a pull request is opened.
- The jobs section defines a single job named deploy that runs on the latest version of Ubuntu.
- The steps section defines the steps that should be executed as part of the job. In this case, the steps are:
  - Check out the code using the actions/checkout@v2 action.
  - Set up Python using the actions/setup-python@v2 action.
  - Install the dependencies by running pip install -r requirements.txt.
  - Deploy the application. You would replace # Your deployment script here with the actual commands to deploy your application.
You would need to commit this workflow file to your repository in the .github/workflows directory.
For more detailed information, you can refer to the following resources:
- Github Actions: a deep dive into pull_request - Frontside
- Events that trigger workflows - GitHub Docs
Let me know if you need further assistance!
Marvin
11/20/2023, 2:39 PM
prefect deployment run command to trigger a Prefect run.
Here's a simple example:
yaml
name: Deploy on PR
on:
  pull_request:
    types: [opened]
jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout code
        uses: actions/checkout@v2
      - name: Setup Python
        uses: actions/setup-python@v2
        with:
          python-version: '3.x'
      - name: Install dependencies
        run: |
          python -m pip install --upgrade pip
          pip install prefect
      - name: Trigger Prefect Run
        run: |
          # PREFECT_WORKSPACE is an assumed secret holding "<account>/<workspace>"
          prefect cloud login --key ${{ secrets.PREFECT_API_KEY }} --workspace ${{ secrets.PREFECT_WORKSPACE }}
          prefect deployment run "MyFlow/MyDeployment"
In this workflow:
- The on section specifies that the workflow should be triggered when a pull request is opened.
- The jobs section defines a single job named deploy that runs on the latest version of Ubuntu.
- The steps section defines the steps that should be executed as part of the job. In this case, the steps are:
  - Check out the code using the actions/checkout@v2 action.
  - Set up Python using the actions/setup-python@v2 action.
  - Install the Prefect CLI by running pip install prefect.
  - Log in to Prefect Cloud and trigger a Prefect run using the prefect deployment run command.
You would need to replace MyFlow and MyDeployment with the names of your flow and deployment, respectively. Also, you would need to store your Prefect API key as a secret in your GitHub repository and reference it in the workflow file as secrets.PREFECT_API_KEY. The PREFECT_WORKSPACE secret is an assumption of this example; it would hold your Prefect Cloud workspace handle.
For more detailed information, you can refer to the following resources:
- GitHub Actions Documentation
- Prefect CLI Documentation
Let me know if you need further assistance!
Marvin
11/21/2023, 9:32 AM
env section of your job. Then, you can reference these environment variables in your steps, either as shell variables such as $FLOW_NAME or through the env context such as ${{ env.FLOW_NAME }}.
Here's an example:
yaml
name: Deploy on PR
on:
  pull_request:
    types: [opened]
jobs:
  deploy:
    runs-on: ubuntu-latest
    env:
      FLOW_NAME: MyFlow
      DEPLOYMENT_NAME: MyDeployment
    steps:
      - name: Checkout code
        uses: actions/checkout@v2
      - name: Setup Python
        uses: actions/setup-python@v2
        with:
          python-version: '3.x'
      - name: Install dependencies
        run: |
          python -m pip install --upgrade pip
          pip install prefect
      - name: Trigger Prefect Run
        run: |
          # PREFECT_WORKSPACE is an assumed secret holding "<account>/<workspace>"
          prefect cloud login --key ${{ secrets.PREFECT_API_KEY }} --workspace ${{ secrets.PREFECT_WORKSPACE }}
          prefect deployment run "$FLOW_NAME/$DEPLOYMENT_NAME"
In this workflow:
- The env section defines two environment variables: FLOW_NAME and DEPLOYMENT_NAME.
- The Trigger Prefect Run step references these environment variables as shell variables.
You would need to replace MyFlow and MyDeployment with the names of your flow and deployment, respectively. Also, you would need to store your Prefect API key as a secret in your GitHub repository and reference it in the workflow file as secrets.PREFECT_API_KEY. The PREFECT_WORKSPACE secret is an assumption of this example; it would hold your Prefect Cloud workspace handle.
For more detailed information, you can refer to the following resources:
- GitHub Actions Documentation
- Prefect CLI Documentation
Let me know if you need further assistance!
Marvin
11/21/2023, 9:34 AM
--params option of the prefect deployment run command. The parameters should be specified as a JSON object.
Here's an example:
yaml
name: Deploy on PR
on:
  pull_request:
    types: [opened]
jobs:
  deploy:
    runs-on: ubuntu-latest
    env:
      FLOW_NAME: MyFlow
      DEPLOYMENT_NAME: MyDeployment
      PARAM1: value1
      PARAM2: value2
    steps:
      - name: Checkout code
        uses: actions/checkout@v2
      - name: Setup Python
        uses: actions/setup-python@v2
        with:
          python-version: '3.x'
      - name: Install dependencies
        run: |
          python -m pip install --upgrade pip
          pip install prefect
      - name: Trigger Prefect Run
        run: |
          # PREFECT_WORKSPACE is an assumed secret holding "<account>/<workspace>"
          prefect cloud login --key ${{ secrets.PREFECT_API_KEY }} --workspace ${{ secrets.PREFECT_WORKSPACE }}
          prefect deployment run "$FLOW_NAME/$DEPLOYMENT_NAME" --params '{"param1":"'"$PARAM1"'", "param2":"'"$PARAM2"'"}'
In this workflow:
- The env section defines two additional environment variables: PARAM1 and PARAM2.
- The Trigger Prefect Run step passes these environment variables as parameters to the Prefect flow run.
You would need to replace MyFlow, MyDeployment, value1, and value2 with the names of your flow and deployment and the values of your parameters, respectively. Also, you would need to store your Prefect API key as a secret in your GitHub repository and reference it in the workflow file as secrets.PREFECT_API_KEY. The PREFECT_WORKSPACE secret is an assumption of this example; it would hold your Prefect Cloud workspace handle.
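Hand-quoting JSON inside a shell command is easy to get wrong. As a hedged alternative, the parameters object could be assembled by a small Python helper and then handed to the CLI's parameters option. The build_params_json helper is hypothetical; the environment variable names follow the workflow above.

```python
import json
import shlex


def build_params_json(env: dict, mapping: dict) -> str:
    """Hypothetical helper: map environment variables to a JSON parameters object.

    `mapping` maps flow parameter names to environment variable names.
    Variables missing from `env` are skipped rather than sent as empty strings.
    """
    params = {param: env[var] for param, var in mapping.items() if var in env}
    return json.dumps(params, sort_keys=True)


# Example values standing in for the workflow's `env` section.
example_env = {"PARAM1": "value1", "PARAM2": "value with spaces"}
params_json = build_params_json(example_env, {"param1": "PARAM1", "param2": "PARAM2"})

# shlex.quote makes the JSON safe to paste into a shell command line.
print(shlex.quote(params_json))
```

In a real job you would pass dict(os.environ) instead of example_env. Letting json.dumps do the escaping means values that contain spaces or quotes still produce valid JSON.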
For more detailed information, you can refer to the following resources:
- GitHub Actions Documentation
- Prefect CLI Documentation
Let me know if you need further assistance!
Marvin
11/21/2023, 9:36 AM
Parameter task to pick up the parameters. Here's an example:
from prefect import Flow, Parameter, task  # Prefect 1.x API

@task
def my_task(a, b):
    # Placeholder task; replace with your own logic
    return a, b

# Define your parameters
param1 = Parameter('param1')
param2 = Parameter('param2')

# Define your flow
with Flow('MyFlow') as flow:
    # Use your parameters
    result = my_task(param1, param2)
In this example, param1 and param2 are Parameter tasks. When you run your flow with parameters, the values of these parameters will be the values you passed in.
You can then use these parameters in your tasks. In this example, my_task is a task that takes two arguments. When the flow is run, my_task will be called with param1 and param2 as arguments.
For more detailed information, you can refer to the following Prefect documentation page:
- Parameters
Let me know if you need further assistance!
Marvin
11/21/2023, 11:18 AM
Prefect 2 has no Parameter task. Unlike V1, you import parameters into your flow by declaring them as arguments of the function decorated with @flow. Here's an example:
from prefect import flow, task

@task
def my_task(a, b):
    # Placeholder task; replace with your own logic
    return a, b

# Define your flow; its arguments are its parameters
@flow(name="MyFlow")
def my_flow(param1, param2):
    # Use your parameters
    result = my_task(param1, param2)
    return result
In this example, param1 and param2 are ordinary function arguments. When you run your flow with parameters, for example my_flow(param1="a", param2="b"), the arguments take the values you passed in.
You can then use these parameters in your tasks. In this example, my_task is a task that takes two arguments. When the flow is run, my_task will be called with param1 and param2 as arguments.
For more detailed information, you can refer to the following Prefect documentation page:
- Parameters
Let me know if you need further assistance!
Marvin
11/21/2023, 11:22 AM
date. It's an optional parameter of the what_day_is_it function, which is decorated with the @flow decorator to make it a Prefect flow.
The date parameter is of type datetime, but it has a default value of None. If no value is provided when calling the what_day_is_it function (or running the flow), the current date and time (in UTC) will be used, as indicated by the line date = datetime.utcnow().
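The default handling described above can be sketched in plain Python, without Prefect. The resolve_date helper is hypothetical and only mirrors the logic: fall back to the current UTC time when nothing is passed, and parse ISO 8601 strings into datetime objects.

```python
from datetime import datetime, timezone
from typing import Optional, Union


def resolve_date(date: Optional[Union[str, datetime]] = None) -> datetime:
    """Hypothetical helper mirroring the default handling described above."""
    if date is None:
        # No value supplied: fall back to the current time in UTC.
        # (An aware datetime is used here; datetime.utcnow() returns a naive one.)
        return datetime.now(timezone.utc)
    if isinstance(date, str):
        # ISO 8601 strings are parsed into datetime objects.
        return datetime.fromisoformat(date)
    return date


parsed = resolve_date("2021-01-01T02:00:19.180906")
print(parsed.strftime("%A"), parsed.isoformat())
```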
When you call what_day_is_it("2021-01-01T02:00:19.180906"), you're passing a string to the function rather than a datetime object. Prefect uses the type hint to validate the argument and will normally convert a valid ISO 8601 string to a datetime for you. If you prefer to pass a datetime object explicitly, parse the string first. Here's how you can do it:
from datetime import datetime
from prefect import flow

@flow
def what_day_is_it(date: datetime = None):
    if date is None:
        date = datetime.utcnow()
    print(f"It was {date.strftime('%A')} on {date.isoformat()}")

date_str = "2021-01-01T02:00:19.180906"
date_obj = datetime.fromisoformat(date_str)
what_day_is_it(date_obj)
# It was Friday on 2021-01-01T02:00:19.180906
In this modified version of your code, the string date_str is parsed to a datetime object date_obj using the datetime.fromisoformat function. Then, date_obj is passed to the what_day_is_it function.
Marvin
11/21/2023, 12:53 PM
prefect.yaml file which is used by the prefect deploy command to deploy your flows. Here's an example of how you can do it:
yaml
# Generic metadata about this project
name: flows
prefect-version: 2.13.8

# build section allows you to manage and build docker images
build:
  - prefect_docker.deployments.steps.build_docker_image:
      id: build-image
      requires: prefect-docker>=0.4.0
      image_name: "{{ $PREFECT_IMAGE_NAME }}"
      tag: latest
      dockerfile: auto
      platform: "linux/amd64"

# push section allows you to manage if and how this project is uploaded to remote locations
push:
  - prefect_docker.deployments.steps.push_docker_image:
      requires: prefect-docker>=0.4.0
      image_name: "{{ build-image.image_name }}"
      tag: "{{ build-image.tag }}"

# pull section allows you to provide instructions for cloning this project in remote locations
pull:
  - prefect.deployments.steps.set_working_directory:
      directory: /opt/prefect/flows

# the definitions section allows you to define reusable components for your deployments
definitions:
  tags: &common_tags
    - "eks"
In this example, the image_name in the build section is supplied through an environment variable, denoted by quotes and double curly brackets with a leading $, like so: "{{ $PREFECT_IMAGE_NAME }}". The image_name and tag in the push section reuse the outputs of the build step through "{{ build-image.image_name }}" and "{{ build-image.tag }}", while the tag in the build section and the directory in the pull section are fixed values.
You can replace PREFECT_IMAGE_NAME with the name of your environment variable. The value of this environment variable would be the name of the image you want to build for the deployment.
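To picture what the {{ $NAME }} substitution does, here is a minimal sketch in plain Python. It is not Prefect's implementation; render_env_placeholders is a hypothetical helper, the registry name is made up, and raising on an unset variable is a choice made for this sketch rather than a statement about Prefect's behavior.

```python
import re

# Matches "{{ $NAME }}" with optional whitespace inside the braces.
ENV_PLACEHOLDER = re.compile(r"\{\{\s*\$(\w+)\s*\}\}")


def render_env_placeholders(text: str, env: dict) -> str:
    """Hypothetical helper: replace {{ $NAME }} with env[NAME], failing loudly if unset."""

    def substitute(match):
        name = match.group(1)
        if name not in env:
            raise KeyError(f"environment variable {name} is not set")
        return env[name]

    return ENV_PLACEHOLDER.sub(substitute, text)


rendered = render_env_placeholders(
    'image_name: "{{ $PREFECT_IMAGE_NAME }}"',
    {"PREFECT_IMAGE_NAME": "registry.example.com/flows"},
)
print(rendered)
```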
For more detailed information, you can refer to the following Prefect documentation pages:
- Kubernetes
- Prefect Deploy
Let me know if you need further assistance!
Marvin
11/21/2023, 12:54 PM
deployments section of the prefect.yaml file, you can specify parameters for each deployment. Here's an example:
yaml
deployments:
  - name: MyFlow
    entrypoint: my_flow.py:my_flow
    parameters:
      param1: "{{ $PARAM1 }}"
      param2: "{{ $PARAM2 }}"
    work_pool:
      name: my-work-pool  # placeholder work pool name
      job_variables:
        image: "{{ $IMAGE_NAME }}:{{ $IMAGE_TAG }}"
In this example:
- name is the name of the deployment.
- entrypoint is the path to the Python file that defines the flow, followed by a colon and the name of the flow function (my_flow is a placeholder).
- parameters is a dictionary of parameters that should be passed to the flow when it's run. In this case, param1 and param2 are filled from environment variables.
- work_pool names the work pool that runs the deployment. Under job_variables, image is the name and tag of the Docker image that should be used for the deployment, for work pool types that accept an image, such as Docker or Kubernetes.
You can replace IMAGE_NAME, IMAGE_TAG, PARAM1, and PARAM2 with the names of your environment variables. The values of these environment variables would be the name and tag of the Docker image, and the values of the parameters, respectively.
For more detailed information, you can refer to the following Prefect documentation pages:
- Kubernetes
- Prefect Deploy
Let me know if you need further assistance!
Marvin
11/21/2023, 1:00 PM
prefect.yaml file should be visible in the Prefect UI when you deploy the flow. If the parameters are not showing up, it could be due to a few reasons:
1. There might be an issue with the Prefect version you're using. Ensure you're using a version that supports parameters in the prefect.yaml file.
2. There could be an issue with the formatting or indentation of the prefect.yaml file. YAML files are sensitive to indentation, so ensure that the parameters section is properly indented.
3. If you've recently updated the prefect.yaml file, make sure to redeploy the flow for the changes to take effect.
4. If you want the Prefect API to verify the parameter values passed to a flow run against the schema defined by parameter_openapi_schema, set enforce_parameter_schema to true.
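To make item 4 concrete, here is a rough sketch of what checking parameters against a schema involves. It is plain Python with a hand-written, simplified schema, not Prefect's validation code; validate_parameters and the schema contents are hypothetical.

```python
# Simplified stand-in for an OpenAPI-style parameter schema (hypothetical contents).
schema = {
    "properties": {"param1": {"type": "string"}, "retries": {"type": "integer"}},
    "required": ["param1"],
}

# Map schema type names to Python types for a basic isinstance check.
TYPE_MAP = {"string": str, "integer": int, "boolean": bool, "number": (int, float)}


def validate_parameters(params: dict, schema: dict) -> list:
    """Return a list of human-readable problems; an empty list means valid."""
    problems = []
    properties = schema.get("properties", {})
    for name in schema.get("required", []):
        if name not in params:
            problems.append(f"missing required parameter: {name}")
    for name, value in params.items():
        if name not in properties:
            problems.append(f"unexpected parameter: {name}")
            continue
        expected = TYPE_MAP.get(properties[name].get("type"))
        if expected is not None and not isinstance(value, expected):
            problems.append(f"wrong type for {name}: {type(value).__name__}")
    return problems


print(validate_parameters({"param1": "test"}, schema))
print(validate_parameters({"param2": "test"}, schema))
```

The first call reports no problems; the second reports both the missing param1 and the unexpected param2.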
If none of these suggestions work, I would recommend reaching out to Prefect support or checking their documentation for more information.
Here's the corrected version of your YAML snippet:
yaml
- name: dbt_prod_full_refresh_test
  version: 2.0
  tags:
    - data-transformation
    - test
    - full-refresh
  description: ""
  entrypoint: flows/dbt/run_dbt.py:trigger_dbt_flow
  parameters:
    param1: "test"
Let me know if you need further assistance!
Marvin
11/21/2023, 1:23 PM
@flow decorator.
When you define a flow, you specify the parameters it expects. If you try to run a flow with a parameter that it doesn't expect, you'll get an error.
Here's an example of how you can define a flow that expects a parameter:
from prefect import flow

@flow
def my_flow(param1: str):
    print(f"param1: {param1}")

my_flow(param1="test")
In this example, my_flow is a flow that expects a parameter param1. When you run this flow with my_flow(param1="test"), it will print param1: test.
If you try to run this flow with a parameter that it doesn't expect, like my_flow(param2="test"), you'll get an error because my_flow doesn't expect a parameter param2.
So, you should check the function you're trying to run as a flow and make sure it expects the parameters you're trying to pass to it. If it doesn't, you should modify the function to expect these parameters.
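One way to run that check yourself, before involving Prefect at all, is to bind the arguments you intend to pass against the function's signature. This is a standard-library sketch; check_call is a hypothetical helper, and my_flow here is a plain function standing in for a flow.

```python
import inspect


def my_flow(param1: str):
    # Plain function standing in for a flow; no Prefect decorator is applied here.
    return f"param1: {param1}"


def check_call(func, **kwargs):
    """Hypothetical helper: report whether kwargs fit func's signature."""
    try:
        inspect.signature(func).bind(**kwargs)
    except TypeError as exc:
        return f"rejected: {exc}"
    return "accepted"


print(check_call(my_flow, param1="test"))
print(check_call(my_flow, param2="test"))
```

The first call is accepted, while the second is rejected because param2 does not match the signature.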
Let me know if you need further assistance!
Marvin
11/21/2023, 1:26 PM
clone_repository task is expecting a parameter branch, but it's not being provided when the task is called in the transform_data flow.
To pass parameters to a task within a flow, declare the parameter as an argument of the flow function and pass it on when you call the task. Here's how you can modify your flow to pass a parameter to the clone_repository task:
import git  # GitPython
from prefect import flow, task

@task(name="Clone repo", log_prints=True)
def clone_repository(branch: str = "master"):
    # username, pat, organization and repository are assumed to be defined elsewhere
    repo_url = f"https://{username}:{pat}@github.com/{organization}/{repository}.git"
    destination_dir = "abc"
    clone_repo = git.Repo.clone_from(repo_url, destination_dir, branch=branch)
    return clone_repo

@flow(name="Data Transformation", log_prints=True)
def transform_data(branch: str = "master"):
    clone_repository(branch)
In this example, branch is a parameter of the transform_data flow with a default value of "master". When you run your flow with parameters, the value of this parameter will be the value you passed in.
You can then use this parameter in your tasks. In this example, clone_repository is a task that takes one argument. When the flow is run, clone_repository will be called with branch as an argument.
Now, when you run the flow, you can pass the branch name as a parameter:
transform_data(branch="my-branch")
This will clone the repository from the 'my-branch' branch.
Let me know if you need further assistance!