Thread
#prefect-community
    Aaron Ash
    6 months ago
    would a better way to do this be to write a simple python script using the client api and override the executor and run_configs there?
    Anna Geller
    6 months ago
    I know it's all related to your one single larger question, but actually, I'm glad you created it in separate threads since it allows me to answer each question separately 😄 When you say you would like to override the executor and run config for dev vs prod, does it mean that your flow is exactly the same for both environments and those two (executor and run_config) are the only things that differ? If so, you could do something like what I did in this repo: https://github.com/anna-geller/prefect-dbt-k8s-snowflake/blob/master/flow_utilities/prefect_configs.py And then before you register your flow to a new environment, you could overwrite this single argument:
    with Flow(
        FLOW_NAME,
        executor=LocalDaskExecutor(),
        storage=set_storage(FLOW_NAME),
        run_config=set_run_config(local=True),  # flip this flag per environment
    ) as flow:
        ...  # your tasks here
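    For reference, here is a minimal sketch of what such helpers could look like (the bucket and image names below are placeholders, not the exact contents of that repo):
    from prefect.run_configs import KubernetesRun, LocalRun
    from prefect.storage import S3

    def set_storage(flow_name: str):
        # store the flow as a script in S3; bucket name is just a placeholder
        return S3(bucket="my-prefect-flows", key=f"flows/{flow_name}.py", stored_as_script=True)

    def set_run_config(local: bool = False):
        # local=True -> run on a local agent; otherwise target Kubernetes
        if local:
            return LocalRun(labels=["dev"])
        return KubernetesRun(labels=["prod"], image="my-registry/flows:latest")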
    For the executor, it's more tricky since, as I said before, it's retrieved from storage at flow runtime, but you can try doing something like:
    with Flow(
        FLOW_NAME,
        executor=LocalDaskExecutor(),
        storage=set_storage(FLOW_NAME),
        run_config=set_run_config(),
    ) as flow:
        datasets = ["raw_customers", "raw_orders", "raw_payments"]
        dataframes = extract_and_load.map(datasets)

    if __name__ == '__main__':
        # register for prod
        flow.register("prod_project")
        # register for dev
        flow.executor = LocalExecutor()
        flow.run_config = set_run_config(local=True)
        flow.register("dev_project")
    But I believe in the above the executor won't be respected, since __main__ is not evaluated at flow runtime. So probably your best bet is to define your main flow in one Python file, say:
    aaron_flow.py
    - this defines your flow structure without defining run config or executor:
    with Flow("FLOW_NAME", storage=S3(), # just example
    ) as flow:
        datasets = ["raw_customers", "raw_orders"]
        dataframes = extract_and_load.map(datasets)
    Then, you can have a file called aaron_flow_dev.py:
    from prefect.executors import LocalExecutor
    from prefect.run_configs import KubernetesRun

    from aaron_flow import flow

    flow.executor = LocalExecutor()
    flow.run_config = KubernetesRun(image="some_dev_image")
    and
    aaron_flow_prod.py
    from prefect.executors import LocalDaskExecutor
    from prefect.run_configs import KubernetesRun

    from aaron_flow import flow

    flow.executor = LocalDaskExecutor()
    flow.run_config = KubernetesRun(image="some_prod_image")
    and then you can register using the CLI without worrying about the run config and executor.
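    For example (the project names are placeholders for yours):
    prefect register --project dev_project -p aaron_flow_dev.py
    prefect register --project prod_project -p aaron_flow_prod.py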
    Aaron Ash
    6 months ago
    Thanks again @Anna Geller
    This approach with the separate *_dev.py modules looks like it's perfect for me