
Ben Welsh

01/30/2022, 10:48 PM
Is there an established pattern for including dev/prod variations in a flow file? Basically, in dev I'd like to use the default local storage on my laptop, and in prod I'd like to use a customized Docker storage with Google Artifact Registry. The hacky section of my brain is imagining some kind of if/else clause in my flow file that keys off an environment variable. But I'm not sure what that variable would be, and I wonder if there's a sturdier way to do this that I'm simply ignorant of.

Anna Geller

01/30/2022, 11:36 PM
Many users leverage projects or different cloud tenants to distinguish between dev and prod. But since you specifically asked about setting storage and run config, you could build a function that returns a different storage or run config depending on whether you run it locally or not. Check out this repo for an example.

Ben Welsh

01/31/2022, 1:09 AM
So in your example, how is the local bool value ever set to true?

Anna Geller

01/31/2022, 9:28 AM
When I run it locally, I set it to True and then run it. It's just one example of how you can go about this; there are many possibilities. For a local run you really don't need any storage or run configuration. When you trigger your script locally via the CLI, it will ignore storage and run config:
prefect run -p path/to/flow.py

Ben Welsh

01/31/2022, 12:55 PM
Thanks. I ended up hacking out an env variable of my own with this little ditty.
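# Imports for the Prefect 1.x names used below. `warn`, `scrape`, and
# `get_scrape_runner` are defined or imported elsewhere in the original file.
import os

import prefect
from prefect.executors import DaskExecutor
from prefect.run_configs import UniversalRun
from prefect.storage import Docker, Local


def get_storage(env: str = "production"):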
    """Get the storage method used by the flow.

    Args:
        env (str): the environment where the task is running. Options are 'development' and 'production'.

    Returns:
        A Prefect Storage instance.
    """
    logger = prefect.context.get("logger")
    logger.debug(f"Loading {env} storage method")
    options = {
        "development": Local(
            # The flow is stored as a file here on your laptop
            #            path="./flow.py",
            #            stored_as_script=True,
            add_default_labels=False,
        ),
        "production": Docker(
            # An image containing the flow's code, as well as our Python dependencies,
            # will be compiled when `pipenv run prefect register` is run and then
            # uploaded to our repository on Google Artifact Registry.
            registry_url="us-west2-docker.pkg.dev",
            image_name="big-local-news-267923/warn-prefect-flow/warn-act-notices-etl-flow",
            python_dependencies=["warn-scraper"],
        ),
    }
    return options[env]


with prefect.Flow(
    "WARN Act Notices ETL",
    storage=get_storage(os.getenv("PREFECT_FLOW_ENV", "production")),
    run_config=UniversalRun(
        env={
            # Print logs from our dependencies
            "PREFECT__LOGGING__EXTRA_LOGGERS": "['warn',]",
            # Print debugging level code from Prefect
            "PREFECT__LOGGING__LEVEL": "DEBUG",
        },
        # Label the flow so that only agents with a matching label pick it up
        labels=["etl"],
    ),
    executor=DaskExecutor(),
) as flow:
    # Get logger
    logger = prefect.context.get("logger")
    logger.info("Running WARN Act Notices ETL flow")

    # Get the list of all scrapers
    scraper_list = prefect.Parameter(
        "scrapers", default=warn.utils.get_all_scrapers(), required=True
    )
    delete = prefect.Parameter("delete", default=False)

    # If the delete option has been set, then delete
    runner = get_scrape_runner(delete)

    # Map `scrape` across our sources
    scrape.map(prefect.unmapped(runner), scraper_list)
One little thing I wonder: Is there some internal Prefect env indicator that lets me know if I'm on a local server or the cloud server? If so, I think I could use that to infer whether the code is in dev or prod.
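
A variation on the snippet above, offered only as a rough sketch: `options[env]` raises a bare `KeyError` if `PREFECT_FLOW_ENV` is misspelled, and both storage objects are built even though only one is used. The version below looks up a factory instead and fails with a clearer message. The factory functions and the dicts they return are hypothetical stand-ins so the example runs without Prefect installed; in the real flow they would return the `Local(...)` and `Docker(...)` instances shown above.

```python
import os
from typing import Callable, Dict, Mapping


# Hypothetical stand-ins: in the real flow file these would build and return
# the Local(...) and Docker(...) storage objects from the snippet above.
def make_dev_storage() -> dict:
    return {"kind": "local"}


def make_prod_storage() -> dict:
    return {"kind": "docker"}


# Map each environment name to a factory so that only the selected
# storage object is ever constructed.
STORAGE_FACTORIES: Dict[str, Callable[[], dict]] = {
    "development": make_dev_storage,
    "production": make_prod_storage,
}


def get_storage(env: str) -> dict:
    """Build the storage for `env`, rejecting unknown names explicitly."""
    try:
        factory = STORAGE_FACTORIES[env]
    except KeyError:
        valid = ", ".join(sorted(STORAGE_FACTORIES))
        raise ValueError(
            f"Unknown environment {env!r}; expected one of: {valid}"
        ) from None
    return factory()


def storage_from_environ(environ: Mapping[str, str] = os.environ) -> dict:
    """Read PREFECT_FLOW_ENV, defaulting to production as in the snippet above."""
    return get_storage(environ.get("PREFECT_FLOW_ENV", "production"))
```

With this shape, `storage=storage_from_environ()` would replace the `get_storage(os.getenv(...))` call in the flow definition, and a typo such as `PREFECT_FLOW_ENV=prod` would stop with a readable error instead of a `KeyError`.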

Anna Geller

01/31/2022, 1:07 PM
There is a flag in the Prefect context that tells you whether you are running locally or with a backend:
prefect.context.get("running_with_backend")
# "running_with_backend": true,
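
As a minimal sketch of how that flag could feed the dev/prod decision discussed earlier, the helper below takes any mapping that supports `.get` (for example `prefect.context`) and returns an environment name. The function name `infer_env` and the assumption that the flag is missing or falsy on a purely local run are mine, not something confirmed in this thread. It is also worth checking when the flag is populated: if it only appears while a flow run is executing, it would not help pick storage at registration time.

```python
from typing import Any, Mapping


def infer_env(context: Mapping[str, Any]) -> str:
    """Guess the environment from a Prefect-style context mapping.

    Assumption: "running_with_backend" is truthy when the run is
    orchestrated by a backend and absent or falsy otherwise.
    """
    return "production" if context.get("running_with_backend") else "development"


# Hypothetical usage inside a task, where the run context is available:
#     env = infer_env(prefect.context)
```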