Simon
05/22/2023, 6:20 PM
Using Deployment.build_from_flow, I am setting is_schedule_active=ENV == "production" based on whether the env is production or not. This is determined by an import from another file:
from common.config import ENV
and in that file is:
import os

# have also tried dotenv.find_dotenv() without success
dotenv_path = os.path.abspath(os.path.join(__file__, "../../../.env"))
if not os.path.isfile(dotenv_path):
    raise RuntimeError(dotenv_path)
This loads happily in a local ipython shell. But when run as a prefect flow, the folder structure appears to be different:
Flow could not be retrieved from deployment.
Traceback (most recent call last):
File "<frozen importlib._bootstrap_external>", line 940, in exec_module
File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed
File "/private/var/folders/d8/sqvg_pms3pd0xvcl2tvvxhqr0000gn/T/tmp8e6a8h47prefect/prefect_daily_internal_metric.py", line 9, in <module>
from common.config import ENV
File "/private/var/folders/d8/sqvg_pms3pd0xvcl2tvvxhqr0000gn/T/tmp8e6a8h47prefect/common/config.py", line 9, in <module>
raise RuntimeError(dotenv_path)
RuntimeError: /private/var/folders/d8/sqvg_pms3pd0xvcl2tvvxhqr0000gn/T/.env
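(For context on why the error points at T/.env rather than the project root: os.path.join treats __file__ as a directory component, so the three ".." segments climb out of the temp directory the flow was copied into. A quick reproduction of the path arithmetic, using the paths from the traceback above:)
import os

# config.py as it exists inside the agent's temp working dir
cfg = "/private/var/folders/d8/sqvg_pms3pd0xvcl2tvvxhqr0000gn/T/tmp8e6a8h47prefect/common/config.py"

# config.py -> common/ -> tmp8e6a8h47prefect/ -> T/
print(os.path.abspath(os.path.join(cfg, "../../../.env")))
# /private/var/folders/d8/sqvg_pms3pd0xvcl2tvvxhqr0000gn/T/.env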
Nate
05/22/2023, 6:32 PM
Simon
05/22/2023, 6:41 PM
Nate
05/22/2023, 6:46 PM
There's an env field where you can populate an arbitrary dictionary so that the env vars are directly accessible to the flow run without cloning the actual env file.
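(A minimal sketch of the kind of infra block Nate is describing, assuming Prefect 2.x; the block name and env values are illustrative:)
from prefect.infrastructure import Process

# a Process infra block whose env dict is injected into every flow run
# that uses it, so no .env file has to travel with the flow code
Process(env={"ENV": "production"}).save("prod-process", overwrite=True)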
05/22/2023, 6:49 PM
Nate
05/22/2023, 6:49 PM
Simon
05/22/2023, 6:51 PM
# imports assumed from context (Prefect 2.x):
from datetime import timedelta

from pendulum import datetime
from prefect.deployments import Deployment
from prefect.server.schemas.schedules import IntervalSchedule

deployment = Deployment.build_from_flow(
    flow=main,
    name=PIPELINE_STANDARD_DEPLOYMENT_NAME,
    work_pool_name=PIPELINE_STANDARD_WORK_POOL_NAME,
    work_queue_name=PIPELINE_STANDARD_WORK_QUEUE_NAME,
    schedule=IntervalSchedule(
        interval=timedelta(hours=24),  # Daily
        anchor_date=datetime(2020, 5, 1, 4, 0, tz="Europe/London"),  # At 4am London time
    ),
    is_schedule_active=ENV == "production",
)

if __name__ == "__main__":
    deployment.apply()
Nate
05/22/2023, 7:05 PM
you shouldn't have any trouble accessing env vars at deployment creation time (while calling build_from_flow above) right?
the issue with the trace at the beginning of this thread I think was the fact that you were in a temp dir, which is the default behavior when the agent submits a flow run as a new process somewhere
I'm suggesting that you should not have to:
• hardcode env values into version control
• clone your env file for a given flow run
how I would approach this using infra blocks is to create a Process infra block per unique environment (prod, dev, stage etc) and populate the env on each, so that at deployment creation time (like calling build_from_flow above), you can pass an additional kwarg: infrastructure=Process.load("my-process-with-infra-specific-env-vars-set")
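(A sketch of that approach, assuming Prefect 2.x and reusing the flow main from Simon's snippet; block names and env contents are illustrative:)
from prefect.deployments import Deployment
from prefect.infrastructure import Process

# one Process infra block per environment, each carrying its own env vars
for env_name, env_vars in {
    "prod": {"ENV": "production"},
    "dev": {"ENV": "development"},
}.items():
    Process(env=env_vars).save(f"{env_name}-process", overwrite=True)

# at deployment creation time, attach the block for the target environment
deployment = Deployment.build_from_flow(
    flow=main,
    name="daily-internal-metric",  # hypothetical name
    infrastructure=Process.load("prod-process"),
)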
Simon
05/22/2023, 7:06 PM
Nate
05/22/2023, 7:07 PM
Simon
05/22/2023, 7:15 PM
> you shouldn't have any trouble accessing env vars at deployment creation time (while calling build_from_flow above) right?
Correct. When creating the deployment, these lines in common/config.py are triggered when the import is made and execute without error:
import os
from dotenv import find_dotenv, load_dotenv

dotenv_path = os.path.abspath(os.path.join(__file__, "../../../.env"))
if not os.path.isfile(dotenv_path):
    raise RuntimeError(dotenv_path)
find_dotenv(raise_error_if_not_found=True)  # belt and braces check for this debugging
load_dotenv(dotenv_path, verbose=True)
Nate
05/22/2023, 7:19 PM
you can set the env field of your process infra block at deployment creation time based on whichever environment you're trying to create a deployment for at that time:
env = {"prod_key": "prod_value"} if os.getenv("prod_indicator") else {}  # not suggesting hardcoding, just that it can be set dynamically

deployment = Deployment.build_from_flow(
    flow=main,
    name=PIPELINE_STANDARD_DEPLOYMENT_NAME + "PROD",
    work_pool_name=PIPELINE_STANDARD_WORK_POOL_NAME,
    work_queue_name=PIPELINE_STANDARD_WORK_QUEUE_NAME,
    env=env,
    schedule=IntervalSchedule(
        interval=timedelta(hours=24),  # Daily
        anchor_date=datetime(2020, 5, 1, 4, 0, tz="Europe/London"),  # At 4am London time
    ),
    is_schedule_active=ENV == "production",
)

if __name__ == "__main__":
    deployment.apply()
Simon
05/22/2023, 7:26 PM
> the issue with the trace at the beginning of this thread I think was the fact that you were in a temp dir, which is the default behavior when the agent submits a flow run as a new process somewhere
So it is the agent which is in a temp dir, and not the server (as only the agent executes the flow run), i.e. the agent loads the file the deployment `flow=` line has pointed it to (which in this case happens to be the same file containing the deployment code block, but no matter).
Nate
05/22/2023, 7:28 PM
Simon
05/22/2023, 7:32 PM
the env=env line will distribute whatever is in the server's copy of the file (including secrets) to the agent at flow run preparation time?
Nate
05/22/2023, 7:32 PM
> the env=env line will distribute whatever is in the server's copy of the file (including secrets) to the agent at flow run preparation time?
whatever you pass as env while creating your deployment, as shown above, will become env vars directly accessible by any flow run created from that deployment
so for example, if you had a dotenv file for each execution environment (dev, stg, prd), you could do something like:
for environment in {"dev", "stg", "prd"}:
    env_dict = env_dict_from_key(environment)  # some random helper that loads your dotenv as a dict
    # call `Deployment.build_from_flow` with env_dict specified as above
    ...
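(One way the hypothetical env_dict_from_key helper could look, assuming python-dotenv and one file per environment, e.g. .env.dev; the naming scheme is illustrative:)
from dotenv import dotenv_values

def env_dict_from_key(environment: str) -> dict:
    # parse e.g. ".env.prd" into a plain dict without touching os.environ
    return dict(dotenv_values(f".env.{environment}"))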
Simon
05/22/2023, 7:33 PM
Ok I need to separately work out how to tell the agent to spawn runs in a known place, as it will need to interact with the filesystem it's running on.
Nate
05/22/2023, 7:36 PM
> Ok I need to separately work out how to tell the agent to spawn runs in a known place as it will need to interact with the filesystem it's running on.
you can set the `working_dir` on the Process infra block (which should be better documented)
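(A minimal sketch of that, assuming Prefect 2.x; the directory and block name are illustrative:)
from prefect.infrastructure import Process

# pin flow runs to a known directory instead of a fresh temp dir, so
# __file__-relative lookups like the .env path resolve predictably
Process(
    working_dir="/opt/my-project",  # hypothetical path on the agent's machine
    env={"ENV": "production"},
).save("prod-process", overwrite=True)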