Hi I am new to Prefect 2 0 I need some assistance from the c Prefect Community #best-practices

Daniel Mak

10/19/2022, 8:20 AM

Hi. I am new to Prefect 2.0 and need some assistance from the community here. I am deploying some pipelines that are scheduled to run on a daily basis. How do I pass the runtime date into my flow? I need to use it as a parameter for my pipeline. E.g. if the flow runs on Wed 19 Oct 2022 at 23:00, the date parameter 19Oct2022 is passed as a parameter into the flow and is used in the ETL script.

✅ 1

Ryan Peden

10/19/2022, 11:52 AM

Hi Daniel, A couple of options come to mind. If you just need the date when the flow run commences, would a parameter with a default value work? For example, would something like this work for you?:

import pendulum
from prefect import flow, get_run_logger


@flow(name="my flow")
def my_flow(start_date=f"{pendulum.today('UTC'):%d%b%Y}"):
    logger = get_run_logger()
    logger.info(f"Running on {start_date}")


if __name__ == "__main__":
    my_flow()

If you run that, you'll see output similar to:

06:59:35.716 | INFO    | prefect.engine - Created flow run 'agate-marten' for flow 'my flow'
06:59:35.813 | INFO    | Flow run 'agate-marten' - Running on 19Oct2022
06:59:35.830 | INFO    | Flow run 'agate-marten' - Finished in state Completed()

Pendulum is one of Prefect's dependencies, so you wouldn't need to install anything extra. You can also do the same thing with Python's built-in datetime, but it would be a bit more verbose. Using a parameter with a default value this way lets you override the parameter if you ever need to, while providing a sensible default that should (hopefully) give you what you need most of the time. You could then pass the date into tasks and subflows as needed for your ETL pipeline. You can also read both the expected and actual start dates from the flow run context, like so:

from prefect import flow, get_run_logger
from prefect.context import get_run_context


@flow(name="my flow")
def my_flow():
    logger = get_run_logger()
    context = get_run_context()
    expected_start_date = f"{context.flow_run.expected_start_time:%d%b%Y}"
    actual_start_date = f"{context.flow_run.start_time:%d%b%Y}"
    logger.info(f"Expected to run on {expected_start_date}")
    logger.info(f"Actually running on {actual_start_date}")


if __name__ == "__main__":
    my_flow()

Which results in output like:

07:13:32.949 | INFO    | prefect.engine - Created flow run 'bald-heron' for flow 'my flow'
07:13:33.059 | INFO    | Flow run 'bald-heron' - Expected to run on 19Oct2022
07:13:33.059 | INFO    | Flow run 'bald-heron' - Actually running on 19Oct2022
07:13:33.075 | INFO    | Flow run 'bald-heron' - Finished in state Completed()
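One caveat with a default written in the function signature, as in the first example: Python evaluates it once, when the function is defined (that is, when the script is loaded), not on each call. If that ever matters for how your flow is loaded, a possible workaround is to default to None and work out the date inside the function. Here is a minimal sketch using only the built-in datetime module. The helper name resolve_run_date and its now argument are made up for illustration and are not part of Prefect:

```python
from datetime import datetime, timezone
from typing import Optional


def resolve_run_date(
    start_date: Optional[str] = None,
    now: Optional[datetime] = None,
) -> str:
    """Return start_date if one was given, else the current UTC date.

    Hypothetical helper, not a Prefect API. The `now` argument exists
    only so the function can be exercised with a fixed clock.
    """
    if start_date is not None:
        # An explicit override always wins.
        return start_date
    current = now if now is not None else datetime.now(timezone.utc)
    # %b is the locale's abbreviated month name ("Oct" in the default C locale).
    return current.strftime("%d%b%Y")
```

You could call this at the top of the flow body, e.g. start_date = resolve_run_date(start_date), with the flow parameter declared as start_date=None, so the date is computed each time the flow actually runs.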

Will either of these help you accomplish your goal?

🙏 1

🙌 3

Daniel Mak

10/20/2022, 2:17 AM

yes that works for me!

Daniel Mak

10/20/2022, 2:19 AM

@Ryan Peden now I have another question. When I was trying to get the agent to retrieve the job, I got this error

Daniel Mak

10/20/2022, 8:53 AM

Daniel Mak

10/20/2022, 8:56 AM

the thing is my script is as such

if __name__ == "__main__":
    # <I pass in my parameters from a config file>
    today_date_str = kwargs.get("today_date_str")
    query_parameters = get_query_parameters(config, **kwargs)
    # <execute flow here>

Daniel Mak

10/20/2022, 8:56 AM

in that case should I still pass in these parameters to prefect flow?

Ryan Peden

10/20/2022, 1:23 PM

Yes, you should. If you don't provide default values for the params, you'll typically need to pass them when calling the flow in Python, or if you create a deployment you will need to set up parameters in the deployment. If necessary, you can disable flow parameter checks when creating your flow:

from prefect import flow


@flow(validate_parameters=False)
def my_flow(today_date_str: str, query_parameters: dict[str, str]):
    ...

Or if you'd like to keep the validation but mark the params as optional to keep Pydantic happy:

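from typing import Optional

from prefect import flow

@flow  # parameter validation stays on (it is enabled by default)
def my_flow(
    today_date_str: Optional[str] = None,  # None default makes it optional
    query_parameters: Optional[dict[str, str]] = None,
):
    ...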

That way, you're not forced to use the parameters but still get validation, so if for example you tried to call the flow with an int as your today_date_str, validation would fail.
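Type validation only checks the parameter's type, not whether the string looks like a date. If the ETL script depends on the 19Oct2022 format used earlier in this thread, a small check along these lines could run before the date is used. This is only a sketch: parse_run_date is a hypothetical helper, not part of Prefect, and it assumes the %d%b%Y format from the earlier examples:

```python
from datetime import date, datetime


def parse_run_date(today_date_str: str) -> date:
    """Parse a string such as '19Oct2022' into a date.

    Hypothetical helper, not a Prefect API. Raises ValueError when the
    string does not match the assumed %d%b%Y format.
    """
    try:
        # %b matches the locale's abbreviated month name ("Oct" in the C locale).
        return datetime.strptime(today_date_str, "%d%b%Y").date()
    except ValueError as exc:
        raise ValueError(
            f"expected a date like 19Oct2022, got {today_date_str!r}"
        ) from exc
```

Calling it at the top of the flow (or in the task that first uses the date) makes a badly formatted value fail early with a clear message, and gives the rest of the pipeline a real date object to work with.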
