
Simon

05/22/2023, 6:20 PM
Am having trouble loading a .env file using python-dotenv within Prefect. Within `Deployment.build_from_flow`, am setting `is_schedule_active=ENV == "production"` based upon whether the env is production or not. This is determined by an import from another file, `from common.config import ENV`, and in that file is:

import os

# have also tried dotenv.find_dotenv() without success
dotenv_path = os.path.abspath(os.path.join(__file__, "../../../.env"))
if not os.path.isfile(dotenv_path):
    raise RuntimeError(dotenv_path)
This loads happily in a local ipython shell. But when run as a prefect flow, the folder structure appears to be different:
Flow could not be retrieved from deployment.
Traceback (most recent call last):
  File "<frozen importlib._bootstrap_external>", line 940, in exec_module
  File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed
  File "/private/var/folders/d8/sqvg_pms3pd0xvcl2tvvxhqr0000gn/T/tmp8e6a8h47prefect/prefect_daily_internal_metric.py", line 9, in <module>
    from common.config import ENV
  File "/private/var/folders/d8/sqvg_pms3pd0xvcl2tvvxhqr0000gn/T/tmp8e6a8h47prefect/common/config.py", line 9, in <module>
    raise RuntimeError(dotenv_path)
RuntimeError: /private/var/folders/d8/sqvg_pms3pd0xvcl2tvvxhqr0000gn/T/.env
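[Editor's note: the RuntimeError path above can be reproduced by resolving the expression by hand. This sketch uses a made-up temp path (the real one is the truncated `/private/var/folders/...` directory in the traceback), since a path built relative to `__file__` follows the module wherever it is copied:]

```python
import os

# A made-up stand-in for the agent's temp checkout; the real one in the
# traceback lives under /private/var/folders/.../T/.
fake_file = "/tmp/tmp8e6a8h47prefect/common/config.py"

# os.path.join appends onto the *file* path, and abspath then collapses
# the three "..": config.py -> common -> tmp8e6a8h47prefect -> /tmp.
dotenv_path = os.path.abspath(os.path.join(fake_file, "../../../.env"))
print(dotenv_path)  # -> /tmp/.env
```

So when the agent copies the flow code into a fresh temp dir, three levels up is the temp root, where no .env file exists.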

Nate

05/22/2023, 6:32 PM
hi @Simon are you saying you're having trouble accessing your env file while creating the deployment, or while running the flow? I would think you wouldn't want the env file to be cloned when running the flow; rather, have certain values set in the env for the deployment's infrastructure, or passed in as flow run parameters

Simon

05/22/2023, 6:41 PM
The deployment gets applied without error on the server. It’s when a flow run is triggered that the error is thrown. I was hoping to avoid setting env vars in say a venv activate file on the server and instead load them from a .env file on the server.

Nate

05/22/2023, 6:46 PM
what infrastructure are you using for your deployment? each variety should have an `env` field where you can populate an arbitrary dictionary so that the env vars are directly accessible to the flow run without cloning the actual env file
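[Editor's note: a rough sketch of the approach Nate describes. The block name and env values are purely illustrative, and the Prefect calls are left as comments since they need a configured Prefect 2 installation:]

```python
import os

# Build the mapping of env vars you want every flow run to see.
# Values are read from the machine creating the deployment, so nothing
# sensitive needs to be committed to version control.
flow_run_env = {
    "ENV": os.getenv("ENV", "development"),
    "DATABASE_URL": os.getenv("DATABASE_URL", ""),
}

# With a Prefect 2 infrastructure block this dict would be attached
# roughly like so (names illustrative):
#   from prefect.infrastructure import Process
#   Process(env=flow_run_env).save("prod-process", overwrite=True)
# and later referenced at deployment time via
#   infrastructure=Process.load("prod-process")
```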

Simon

05/22/2023, 6:49 PM
For prod, four Ubuntu 22.04 machines: one running Prefect server and three running Prefect agents.

Nate

05/22/2023, 6:49 PM
sorry, I meant Infrastructure as in the prefect proper noun 🙂

Simon

05/22/2023, 6:51 PM
Ah. It will be Process infrastructure. So are you saying the env vars need to be hardcoded and in version control somewhere, rather than being loaded from the server’s actual environment?
I suspect that it is something fundamental in the move from Prefect 1 to 2 which I have yet to grasp!
This is the relevant part of the flow ‘registration’ file:
deployment = Deployment.build_from_flow(
    flow=main,
    name=PIPELINE_STANDARD_DEPLOYMENT_NAME,
    work_pool_name=PIPELINE_STANDARD_WORK_POOL_NAME,
    work_queue_name=PIPELINE_STANDARD_WORK_QUEUE_NAME,
    schedule=(
        IntervalSchedule(
            interval=timedelta(hours=24),  # Daily
            anchor_date=datetime(
                2020, 5, 1, 4, 0, tz="Europe/London"
            ),  # At 4am London time
        )
    ),
    is_schedule_active=ENV == "production",
)

if __name__ == "__main__":
    deployment.apply()
I guess my question is, can I load ENV from a .env file?
And I guess there is a follow-up question about whether the agents will have any trouble loading their .env files?

Nate

05/22/2023, 7:05 PM
hmm you shouldn't have any trouble accessing env vars at deployment creation time (while calling `build_from_flow` above) right? the issue with the trace at the beginning of this thread, I think, was the fact that you were in a temp dir, which is the default behavior when the agent submits a flow run as a new process somewhere

I'm suggesting that you should not have to:
• hardcode env values into version control
• clone your env file for a given flow run

how I would approach this using infra blocks is to create a `Process` infra block per unique environment (`prod`, `dev`, `stage`, etc.) and populate the `env` on each, so that at deployment creation time (like calling `build_from_flow` above), you can pass an additional kwarg:

infrastructure=Process.load("my-process-with-infra-specific-env-vars-set")

does that make sense?

also just a heads up, projects (currently in beta) are becoming the default way to manage deployments (replacing the infra block paradigm), so if you're just getting started, I might recommend checking them out

Simon

05/22/2023, 7:06 PM
Ha I just yesterday stripped out a bunch of our Prefect v1 projects code

Nate

05/22/2023, 7:07 PM
projects are a different proper noun in prefect 2 😅 they're basically just a place to share configuration for building potentially many deployments. there are a couple of useful files for templating in config like we've been discussing

Simon

05/22/2023, 7:15 PM
“you shouldn’t have any trouble accessing env vars at deployment creation time (while calling `build_from_flow` above) right?”
Correct. When creating the deployment, these lines in common/config.py are triggered when the import is made and execute without error:

import os
from dotenv import find_dotenv, load_dotenv

dotenv_path = os.path.abspath(os.path.join(__file__, "../../../.env"))
if not os.path.isfile(dotenv_path):
    raise RuntimeError(dotenv_path)
find_dotenv(raise_error_if_not_found=True)  # belt and braces check for this debugging

load_dotenv(dotenv_path, verbose=True)

Nate

05/22/2023, 7:19 PM
so maybe even easier than creating a separate Process infra block per execution environment would be dynamically populating the `env` field of your process infra block at deployment creation time, based on whichever environment you're trying to create a deployment for at that time:

env = {"prod_key": "prod_value"} if os.getenv("prod_indicator") else {}  # not suggesting hardcoding, just that it can be set dynamically

deployment = Deployment.build_from_flow(
    flow=main,
    name=PIPELINE_STANDARD_DEPLOYMENT_NAME+"PROD",
    work_pool_name=PIPELINE_STANDARD_WORK_POOL_NAME,
    work_queue_name=PIPELINE_STANDARD_WORK_QUEUE_NAME,
    env=env,
    schedule=(
        IntervalSchedule(
            interval=timedelta(hours=24),  # Daily
            anchor_date=datetime(
                2020, 5, 1, 4, 0, tz="Europe/London"
            ),  # At 4am London time
        )
    ),
    is_schedule_active=ENV == "production",
)

if __name__ == "__main__":
    deployment.apply()

Simon

05/22/2023, 7:26 PM
the issue with trace at the beginning of this thread I think was the fact that you were in a temp dir, which is the default behavior when the agent submits a flow run as a new process somewhere
So it is the agent which is in a temp dir, and not the server (as only the agent executes the flow run), i.e. the agent loads the file the deployment's `flow=` line has pointed it to (which in this case happens to be the same file containing the deployment code block, but no matter).
OK I think your suggestion will work well. Instead of sending copies of the .env file to the workers as well during CI/CD deployment (an ansible playbook), the env=env line will distribute whatever is in the server’s copy of the file (including secrets) to the agent at flow run preparation time?

Nate

05/22/2023, 7:28 PM
the agent process itself shouldn't be in a temp dir unless you started the process in one; I'm saying the agent would spawn new flow runs (if its deployment had Process infrastructure) in a temp dir by default
👍 1

Simon

05/22/2023, 7:32 PM
Ok I need to separately work out how to tell the agent to spawn runs in a known place as it will need to interact with the filesystem it’s running on. Don’t want to derail the issue at hand, so will have a go at finding that myself.
👍 1

Nate

05/22/2023, 7:32 PM
“the env=env line will distribute whatever is in the server’s copy of the file (including secrets) to the agent at flow run preparation time?”
whatever you pass as `env` while creating your deployment, as shown above, will become env vars directly accessible by any flow run created from that deployment

so for example, if you had a dotenv file for each execution environment, and you have dev, stg, prd, then you could do something like:
for environment in {"dev", "stg", "prd"}:
    env_dict = env_dict_from_key(environment)  # some random helper that loads your dotenv as a dict

    # call `Deployment.build_from_flow` with env_dict specified as above
    ...
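[Editor's note: `env_dict_from_key` is Nate's hypothetical helper, not a Prefect API. python-dotenv's `dotenv_values(f".env.{environment}")` does this job directly; a stdlib-only sketch of the same idea, assuming files named `.env.dev`, `.env.stg`, `.env.prd`:]

```python
from pathlib import Path

def env_dict_from_key(environment: str, base_dir: str = ".") -> dict:
    """Parse KEY=VALUE lines from e.g. `.env.prd` into a dict.

    A minimal stand-in for python-dotenv's `dotenv_values`: skips blank
    lines and `#` comments, and strips surrounding quotes from values.
    """
    env = {}
    path = Path(base_dir) / f".env.{environment}"
    for line in path.read_text().splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        env[key.strip()] = value.strip().strip("'\"")
    return env
```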

Simon

05/22/2023, 7:33 PM
ok cool. I am planning to have an entirely separate staging prefect server, but it’s nice to know it’s possible
👍 1
Thanks for the help, I very much appreciate you taking the time, and your patience

Nate

05/22/2023, 7:36 PM
Ok I need to separately work out how to tell the agent to spawn runs in a known place as it will need to interact with the filesystem it’s running on.
you can set the `working_dir` on the Process infra block (which should be better documented)
🙏 1
sure thing!