# prefect-community
t
Hello - I have a question about ad-hoc env vars when running a flow — is it possible to “save” them somehow so they can be re-used again, or do I have to refill them manually each time I want to create an ad-hoc run?
k
Hi @Tom Klein, would the KV Store fit your use case? You can persist them there if you use Cloud
t
hmm - so you’re saying I can persist an entire dict and then inject it as the full env var suite? is it possible to do that from the UI?
i.e. from here?
i guess what i’m asking is - what’s the best approach we should take in order to have a set of env vars. Let’s say we want to use some “persistent” defaults or values for most of them, and then alter some of them ad-hoc when we run this flow manually (since the whole purpose of this flow is to be a manually-run backfill for old data, on demand)
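The behavior being asked for here (persistent defaults plus per-run overrides) boils down to a dict merge. A minimal Python sketch of that idea, with all names and values hypothetical:

```python
# Hypothetical sketch: persistent defaults merged with ad-hoc overrides.
# DEFAULT_ENV would live somewhere durable (code, KV Store, parameter store);
# the overrides come from whatever the manual run supplies.

DEFAULT_ENV = {
    "TARGET_BUCKET": "s3://backfills",
    "BATCH_SIZE": "500",
    "DRY_RUN": "false",
}

def build_env(overrides=None):
    """Return the default env vars with any ad-hoc overrides applied."""
    return {**DEFAULT_ENV, **(overrides or {})}

# An ad-hoc backfill run that only changes one value:
env = build_env({"DRY_RUN": "true"})
```

The open question in the thread is where to keep `DEFAULT_ENV` and how to supply the overrides, which is what the rest of the discussion is about.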
k
No, this section does not persist anywhere and is just for ad-hoc triggers. But you could pass the dict to the RunConfig with:
```python
flow.run_config = RunConfig(..., env={...})
```
but this will be the same values across all flow runs. I think you can register with this and then use the UI to override for the ad-hoc backfills
a
if I recall correctly, you are on AWS, right? why not leverage AWS Secrets Manager or AWS Systems Manager Parameter Store?
t
@Anna Geller that helps with just fetching vars - but i’m talking about altering them from run to run (while still having most of them “persisted”) in a run that’s created ad-hoc manually (e.g. from the UI). Think of a credit-card / billing-address form in the browser: it all gets auto-filled whenever you buy a new thing, but you can still alter the auto-filled values (e.g. if you want to use a different credit card for a specific purchase)
also - now that i think about it - our problem is even a bit more complex, because our flow actually runs (as one of its steps) a Kubernetes job of a specific Docker image, and it’s that internal job that needs the env vars, not the flow… so we need to somehow relay them from the flow into the job
a
i’m talking about altering them from run to run (but still having most of them “persisted”)
It seems like you are looking for some external storage solution like Redis - it would allow you to do that. You shouldn't really use Prefect as a parameter store; Prefect is mostly about orchestration and execution
our flow actually runs (as one of its steps) a Kubernetes job of a specific docker image, and its that internal job that needs to have the env vars, not the flow
maybe Kubernetes secrets is the right approach here?
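If Kubernetes secrets were used as suggested here, the job spec could reference them instead of inlining the values, so they never have to pass through the flow at all. A hedged sketch of the relevant fragment of a container spec, written as a Python dict; all names are made up:

```python
# Hypothetical fragment of a Kubernetes job spec (as a Python dict) showing
# how a container can pull an env var from a Secret via secretKeyRef,
# so the sensitive value never has to be relayed through the flow itself.

def secret_env_entry(var_name, secret_name, secret_key):
    """Build one env entry that reads its value from a Kubernetes Secret."""
    return {
        "name": var_name,
        "valueFrom": {
            "secretKeyRef": {"name": secret_name, "key": secret_key},
        },
    }

container_env = [
    secret_env_entry("DB_PASSWORD", "backfill-secrets", "db-password"),
]
```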
t
hmm, in our “normal” production case (where we just deploy services etc. unrelated to data and/or orchestration) we do use the AWS param/secret store to inject env vars into the containers. But here, because it’s orchestrated by Prefect, i’m not sure what the correct approach would be - it’s the Prefect agent that’s creating the job, rather than our internal k8s deployment utility. i understand that Prefect itself isn’t a param store - it’s just that the nature of this whole flow is to be manually run on-demand (and it orchestrates an entire DAG of operations), so obviously it would be more convenient to put the params directly into Prefect rather than maintain them elsewhere as another “moving part”… especially because they change from run to run (which is, again, executed directly in Prefect)
k
I am not seeing an easy way to get it into the job. Seems like you would have to modify the AWS param/secret store values from the parent flow in this case?
a
because it’s orchestrated by Prefect i’m not sure what the correct approach would be
I don't think Prefect puts any restrictions on that. You can totally still use Parameter Store or Secrets Manager for this, and that would actually be quite useful because it makes your code easy to migrate to Prefect 2.0
and now that you mention that those values change from run to run, managing them centrally in something like a parameter store would be pretty helpful to avoid "moving parts", since they would be centrally managed - and they can even be versioned
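A sketch of what fetching those centrally managed values could look like. The store client is passed in so the helper stays backend-agnostic; with AWS it would be `boto3.client("ssm")`, whose `get_parameter(Name=..., WithDecryption=True)` is a real SSM call. The helper name and parameter paths are hypothetical:

```python
# Hypothetical sketch: fetch run parameters from a central parameter store.
# With AWS, `client` would be boto3.client("ssm"); get_parameter(Name=...,
# WithDecryption=True) is the real SSM API call. The client is injected here
# so the helper itself is not tied to any one backend.

def fetch_params(client, names):
    """Fetch each named parameter and return them as a plain dict."""
    values = {}
    for name in names:
        resp = client.get_parameter(Name=name, WithDecryption=True)
        values[name] = resp["Parameter"]["Value"]
    return values

# Usage (assuming boto3 is available):
#   ssm = boto3.client("ssm")
#   env = fetch_params(ssm, ["/backfill/TARGET_BUCKET", "/backfill/BATCH_SIZE"])
```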
t
right, but in our regular (i.e. non-data / non-Prefect) case we don’t have ad-hoc runs that are controlled with env vars (or any kind of parameter). e.g. in the service case there are some persistent env vars, and when a user makes a request to the service, the service can accept ad-hoc params. But the Prefect use-case is different: there’s a flow (that can be run scheduled or manually) with, let’s say, various ad-hoc parameters / env vars that change how the run is done - and there’s no “request” or HTTP serving etc., except what’s done from the UI or from the SDK
a
Coming back to your original question: the only way to store those environment variables would be to add them to your run configuration and re-register your flow
t
@Anna Geller ya, that doesn’t fit the use-case 😕 we’ll need to find some solution around it. anyway - i just realized that in order to inject these params/env-vars from the flow into the k8s job, we need to manually list every single env var in the JSON that describes the k8s job spec - did I get this right? is there no better way? 😮
a
it's up to you. the Prefect way of injecting env variables into your flow is via the run config, or setting them on the agent
t
well, we don’t want to set them on the agent because they are specific to this job we’re executing, right? and you’re talking about injecting the env vars into the flow - i’m talking about “passing them on” to a job that’s being executed directly as a distinct k8s job via the `RunNamespacedJob` task.
Basically all we’re trying to do is use the flow to orchestrate several k8s jobs - each having its own image… are we going about this the wrong way? e.g. we have a process that:
• pulls data from our DWH
• runs some NodeJS code to do stuff with that data (e.g. scrape websites)
• runs a python-based ML model on the data
• exports the results to S3
and up until now we ran it manually, which is error-prone, so we’re trying to migrate it to Prefect
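On the “manually list every single env var in the JSON” point above: since the job spec handed to `RunNamespacedJob` is just a Python dict, the env list can be generated from a dict rather than written out by hand. A hypothetical helper (the spec shape is the standard Kubernetes one, but everything else here is made up):

```python
import copy

# Hypothetical helpers: turn a plain dict of env vars into the list-of-dicts
# shape Kubernetes expects in a container spec, and splice it into a job body
# before handing it to something like RunNamespacedJob. Only the env handling
# is sketched; the rest of the job spec is elided.

def to_k8s_env(env_vars):
    """Convert {"KEY": "value"} into [{"name": "KEY", "value": "value"}, ...]."""
    return [{"name": k, "value": str(v)} for k, v in env_vars.items()]

def with_env(job_spec, env_vars):
    """Return a copy of a job spec with the env vars set on every container."""
    spec = copy.deepcopy(job_spec)
    for container in spec["spec"]["template"]["spec"]["containers"]:
        container["env"] = to_k8s_env(env_vars)
    return spec
```

This way the flow can carry one dict of values (defaults plus ad-hoc overrides) and relay it into the internal job without enumerating each variable in the spec by hand.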
a
again, it's totally up to you; all of those are valid ways of injecting env variables into your workflow. If you want to do it via Prefect, you can use env variables on the run config or on the agent. If you want to do it yourself, you can:
1. Inject them into your Kubernetes job template and reference this template in your run config
2. Set them within your Dockerfile or during the Docker image build process to inject them directly into the container image
3. Retrieve those custom parameter values and secrets from some third-party parameter store or secrets backend, or Prefect's KV Store
Hope that clarifies this - the rest is up to you to decide. LMK if you have any other questions I can help with.
t
@Anna Geller hmm ok - thanks 🙏 we’ll experiment with it a bit more and see if we can reach some satisfactory path. I understand that there’s a variety of options; the issue is i’m not even sure the way we’re going about this makes sense to begin with (e.g. using the raw `RunNamespacedJob` command as a way to execute a program [encapsulated in a Docker image] that was initially designed to run as a standalone process rather than as a step in an orchestrated DAG)
a
That's totally fine - Prefect can still orchestrate it, as long as you package it into a task, which you did with the Kubernetes task