# ask-community
**Nelson:**
Hi everyone! We’re trialling Prefect (good docs and user experience so far), and I’m just overwhelmed going from a single local environment to a multi-environment distributed setup (https://github.com/PrefectHQ/prefect/issues/2508#issue-613607085 will help a lot!). To parametrise resources, I tried the `config.toml` below to get different buckets per environment:

```toml
source_data_bucket = "${environments.${environment}.source_data_bucket}"
transient_data_bucket = "${environments.${environment}.transient_data_bucket}"

[environments]

    [environments.dev-nelson]
        source_data_bucket = "<s3://REDACTED>"
        transient_data_bucket = "<s3://REDACTED>"

    [environments.prod]
        source_data_bucket = "<s3://REDACTED>"
        transient_data_bucket = "<s3://REDACTED>"
```
I can print `prefect.config.transient_data_bucket` inside a task, but when it’s used as `S3Result(bucket=prefect.config.transient_data_bucket)` it fails with `Invalid bucket name ""`. How are others doing this? Note I’m providing this result as task config:
```python
@task(
    target="{date:%Y-%m-%d}/crunchbase-raw.cloudpickle",
    result=S3Result(bucket=prefect.config.transient_data_bucket),
)
def extract():  # task body elided in the original message
    ...
```
**nicholas:**
Hi @Nelson! You're able to print the config inside a task that's running in your distributed environment? Or is that just locally?
**Nelson:**
Just running locally, @nicholas.
**nicholas:**
Got it; what’s most likely happening is that `config.toml` (being modified post-installation) isn’t available in your distributed environment. I think a better option would be to load your S3 bucket names as `Parameter` tasks, which can then be passed as args to your downstream tasks, as in the sketch below.
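A minimal sketch of that `Parameter` approach against the Prefect 0.x API (the flow, task, parameter, and bucket names here are hypothetical, and the boto3 call is just illustrative):

```python
from prefect import Flow, Parameter, task

@task
def store_data(data: bytes, bucket: str) -> None:
    # Write to S3 inside the task rather than via a Result, so the
    # bucket can be a runtime value instead of a registration-time one.
    import boto3
    boto3.client("s3").put_object(Bucket=bucket, Key="crunchbase-raw", Body=data)

@task
def extract() -> bytes:
    ...

with Flow("example") as flow:
    bucket = Parameter("transient_data_bucket")
    store_data(extract(), bucket)

# Each environment supplies its own value at run time:
# flow.run(parameters={"transient_data_bucket": "dev-transient-bucket"})
```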
**Nelson:**
Hmm @nicholas, actually I’ve only ever run locally, so I’m not sure where the discrepancy is. I’m providing that value to the decorator:

```python
@task(result=S3Result(bucket=prefect.config.transient_data_bucket))
def magic():
    ...
```

So I can’t define one task decorator parameter with the output of another task, right? Generally the use case is: have different values per environment (a Dask hostname, ports, buckets, etc.). You showed a bit of how here: https://docs.prefect.io/core/concepts/configuration.html#configuration-interpolation. If it was via parameters, I’d have to define these values everywhere, when mostly they’re shared across a lot of flows, hence config made sense? But also a task output isn’t available in contexts like this, or am I missing anything? Thanks!
**nicholas:**
Ahh, I think I see the issue (and it looks like maybe an issue in the docs as well): where is `environment` set? I don’t see it in that config.
**Nelson:**
@nicholas it’s not in the docs; is it a built-in definition? I assumed not, so I launched the agent with `-e PREFECT__ENVIRONMENT=nelson-dev`.
Btw, if updating that page, I think you want `"${environments.${environment}.user}"`?
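For anyone following along: Prefect resolves its config, including `PREFECT__`-prefixed environment-variable overrides, once at import time, so a local sanity check of the interpolation might look like this (the bucket value printed is whatever the `config.toml` above defines):

```python
import os

# Must be set before prefect is imported, since config is
# resolved when the prefect module loads.
os.environ["PREFECT__ENVIRONMENT"] = "dev-nelson"

import prefect

print(prefect.config.environment)            # "dev-nelson"
print(prefect.config.transient_data_bucket)  # per-environment interpolated value
```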
**Chris White:**
Hey @Nelson - the value you provide to `bucket=` becomes hardened as an attribute of your `S3Result` when you register your Flow, so I think the pattern you’re going for here isn’t quite supported.
However, we are working on a new deployment protocol that would enable this. If you want to open a feature request for something like “dynamically configured results”, I can use that to keep you in the loop on when / how this can be achieved once it’s ready.
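To illustrate why the value gets baked in (this is plain Python evaluation order, not anything Prefect-specific): the decorator argument is evaluated once, when the module defining the flow is imported, and the resulting string is what gets serialized at registration:

```python
import prefect
from prefect import task
from prefect.engine.results import S3Result

# Evaluated immediately, at import/definition time, not at run time:
result = S3Result(bucket=prefect.config.transient_data_bucket)

@task(result=result)
def magic():
    ...

# If transient_data_bucket resolved to "" when this module was imported
# (e.g. no matching environment in the agent's config), then "" is the
# bucket stored on the registered flow, hence `Invalid bucket name ""`.
```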
**Nelson:**
Many thanks @Chris White, will do! Curious what the current recommended practice for this problem is, as we’ll have to continue with the current version.
**Chris White:**
Right now you’ll need to choose which bucket to use when you register your flow, and hardcode that value. You can register multiple Flows with different environment labels if you want them to run in different clusters, but truly dynamic buckets on Results aren’t currently supported, unfortunately.
👍 1
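A sketch of that per-environment registration pattern (project name, labels, flow and bucket names are all hypothetical; assumes a Prefect 0.x `flow.register` that accepts `labels`):

```python
from prefect import Flow, task
from prefect.engine.results import S3Result

def build_flow(bucket: str) -> Flow:
    # The bucket is fixed at build time: one flow build per environment.
    @task(result=S3Result(bucket=bucket))
    def magic():
        ...

    with Flow("crunchbase-etl") as flow:
        magic()
    return flow

for env, bucket in [("dev-nelson", "dev-transient-bucket"),
                    ("prod", "prod-transient-bucket")]:
    build_flow(bucket).register(project_name="my-project", labels=[env])
```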