t
Hey everyone! I have a question relating to Prefect Cloud and authenticating with AWS. We're attempting to add an S3 results backend to our flows but we're running into:
ClientError('An error occurred (AccessDenied) when calling the PutObject operation: Access Denied')
We have tried adding an 'AWS_CREDENTIALS' secret to our team secrets, as mentioned here https://docs.prefect.io/core/concepts/secrets.html#default-secrets, but this only worked when running locally, not in Cloud.
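For context, the default secret the docs describe is a JSON object shaped like this (values elided; a sketch of the format, not the team's actual secret):

{"ACCESS_KEY": "...", "SECRET_ACCESS_KEY": "..."}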
b
Hey Thomas, what kind of environment are your flows running in?
(ECS, Kubernetes, etc)
t
Hey Billy, we're registering the flows like this:
flow.storage = Docker(
    registry_url="",
    image_name="",
    base_image="",
    local_image=True,
)

flow.executor = LocalExecutor()
flow.run_config = KubernetesRun()
flow.register(project_name="example-project")
(so running in Kubernetes)
b
I can't speak to the prefect specific issues here (although I don't think you need LocalExecutor at all) ... but you're going to need to attach some IAM permissions to your kubernetes container.
In my org we do this by linking the appropriate IAM role to a Kubernetes Service Account, which gets attached to the prefect flow using the KubernetesRun object (roughly as in the sketch below).
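A minimal sketch of that pattern, assuming a service account named "prefect-flows" that has already been linked to an IAM role with S3 access (the name and the role binding are illustrative, not from the thread):

from prefect.run_configs import KubernetesRun

# the service account (and its IAM role annotation, e.g. via EKS IRSA) must
# already exist in the cluster; Prefect just attaches it to the flow's job
flow.run_config = KubernetesRun(service_account_name="prefect-flows")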
s
Hi Billy, I'm @Thomas Weatherston’s colleague, here are a few more points of context... We usually use a DaskExecutor but were just trying to simplify as many things as possible for debugging purposes. We could supply the aws boto credentials some other way - for example as environment variables through kubernetes secrets - but prefect secrets seem to be the recommended (and easiest) way. Inspecting the prefect code, I can see that the credentials are fetched from the context: https://github.com/PrefectHQ/prefect/blob/master/src/prefect/utilities/aws.py#L65 When we log the context in a prefect task, secrets is an empty dictionary. Interestingly, this is not the case when we run the flow locally using LocalRun. So perhaps that is our real issue - that the prefect context is not correct when inspected from within our task running in Kubernetes?
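Roughly, the lookup Sam links works like this (a paraphrase for illustration, not the actual source): with no explicit credentials, the boto client helper falls back to the AWS_CREDENTIALS entry in context.secrets, so an empty secrets dict means boto3 gets no keys:

import boto3
import prefect

def get_s3_client():
    # mirrors the fallback in prefect.utilities.aws: no credentials passed
    # in, so look them up in prefect.context.secrets["AWS_CREDENTIALS"]
    creds = prefect.context.get("secrets", {}).get("AWS_CREDENTIALS", {})
    return boto3.client(
        "s3",
        aws_access_key_id=creds.get("ACCESS_KEY"),
        aws_secret_access_key=creds.get("SECRET_ACCESS_KEY"),
    )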
b
Hey Sam. Can I ask what kind of values you are populating the AWS_CREDENTIALS with? Keys belonging to an IAM user?
s
Yes exactly, a json object with ACCESS_KEY and SECRET_ACCESS_KEY, from an IAM user. When we fetch AWS_CREDENTIALS in a Prefect task and use the boto library directly, I can successfully upload a file to our s3 results bucket.
But given that secrets in the prefect context is empty - I suspect the credentials themselves are not the issue.
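A minimal sketch of that manual check (the bucket name is assumed to be in the environment; illustrative, not their exact code):

import os

import boto3
from prefect.client import Secret

creds = Secret("AWS_CREDENTIALS").get()  # this part works: pulls the secret
s3 = boto3.client(
    "s3",
    aws_access_key_id=creds["ACCESS_KEY"],
    aws_secret_access_key=creds["SECRET_ACCESS_KEY"],
)
# hypothetical object key, just to prove PutObject succeeds with these creds
s3.put_object(Bucket=os.environ["bucket_name"], Key="connectivity-test", Body=b"test")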
b
locally, your process is likely using your profile or some other configuration, so i'm not surprised that local vs k8s gives different results. you are definitely getting an AWS error, which tells us that boto3 is not using the credentials.
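One quick way to see which credentials boto3 is actually resolving in each environment (a generic boto3 check, not something from the thread):

import boto3

# botocore reports where the resolved credentials came from, e.g. "env",
# "shared-credentials-file", or "iam-role"
creds = boto3.session.Session().get_credentials()
print(creds.method if creds else "no credentials found")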
anyway i'm sorry i'm not actually helping, I yield to the prefect knowers.
I wonder if prefect knows that this is a secret value and is therefore not writing the value to logs? that would be a smart design decision...
t
we appreciate you taking time out of your day regardless 🙂
b
good luck! curious to hear what the solution is.
s
When we are running locally we are actually using a docker container, so it should be isolated from our own profile. We'll try to reproduce with a minimal example that we can share. Yes - thanks for your time Billy 🙂
k
Hey guys, I think the team secrets should work. Are you testing this with flow.run() or are you registering and running it?
Yep, a minimal example would help me see how you're using it, if you can reproduce.
s
Hi @Kevin Kho, here is a minimal example:
import os
import sys


import prefect
from prefect.engine.results import S3Result
from prefect import task, Flow
from prefect.run_configs import KubernetesRun, LocalRun
from prefect.storage import Docker

s3_result = S3Result(bucket=os.environ["bucket_name"])


@task(log_stdout=True, result=s3_result)
def example_task():
    print(prefect.context.get("secrets"))
    return "test"


with Flow("My First Flow", result=s3_result) as flow:
    example_task()


def register():
    flow.storage = Docker(
        registry_url=os.environ["docker_repo"],
        image_name="prefect-result-test",
    )

    flow.run_config = KubernetesRun()
    flow.register(project_name=os.environ["project_name"])


def run():
    flow.run_config = LocalRun()
    flow.run()


if __name__ == "__main__":
    {"register": register, "run": run}[sys.argv[1]]()
Locally, all works as expected. When I register it in Prefect Cloud and run in Kubernetes, it does not work. Thanks for looking into this 👍
k
And you used Cloud secrets or env variables?
s
When running locally, I am using local secrets via the PREFECT__CONTEXT__SECRETS__AWS_CREDENTIALS environment variable
when running in the cloud I am using cloud secrets
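For reference, a sketch of that local setup (illustrative values; Prefect maps PREFECT__CONTEXT__SECRETS__<NAME> environment variables into context.secrets):

import json
import os

# set before the flow process starts, or export the equivalent in the
# shell / container environment
os.environ["PREFECT__CONTEXT__SECRETS__AWS_CREDENTIALS"] = json.dumps(
    {"ACCESS_KEY": "...", "SECRET_ACCESS_KEY": "..."}
)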
k
So I tried your snippet to print the secrets and it printed my secrets stored on cloud. Do you somehow have PREFECT__CLOUD__USE_LOCAL_SECRETS set to true? Maybe you can explicitly make this False?
s
I tried that and I still get this issue
To confirm, when I use Secret("AWS_CREDENTIALS").get() I get the expected creds
but in prefect.context, secrets is empty - which is where I believe S3Result looks for it
b
It doesn't look like you are adding the secret to your prefect context at any point. When you call Secret("AWS_CREDENTIALS").get(), you are getting the secret, but you are not adding it to the context. Therefore it is unavailable when the S3Result is trying to pull it.
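To illustrate Billy's point (a hedged sketch, not code from the thread): the two lookups are independent, so the first can succeed while the second stays empty:

import prefect
from prefect.client import Secret

value = Secret("AWS_CREDENTIALS").get()  # queries the secret store directly: works
print(prefect.context.get("secrets"))    # reads the local context: can still be {}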
k
I’ll ask someone who knows more than me
s
Could be the case @Billy McMonagle, I'm not sure how context/secrets interact. Seems odd that the S3Result wouldn't look at cloud secrets for the aws creds https://docs.prefect.io/core/concepts/secrets.html#default-secrets
b
True but you wouldn’t want your results to just go parsing through your entire library of secrets… defining your desired behavior more explicitly seems like a good idea. Perhaps you could read the secret in your flow definition and pass it to the S3Result boto args? Something like the sketch below.
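A sketch of that suggestion, assuming your version's S3Result accepts a boto3_kwargs dict that it forwards to the boto client (worth checking against the API reference for your Prefect version):

import os

from prefect.client import Secret
from prefect.engine.results import S3Result

# note: Secret(...).get() runs at flow-definition time here, so this only
# works where that lookup can succeed
creds = Secret("AWS_CREDENTIALS").get()
s3_result = S3Result(
    bucket=os.environ["bucket_name"],
    boto3_kwargs={
        "aws_access_key_id": creds["ACCESS_KEY"],
        "aws_secret_access_key": creds["SECRET_ACCESS_KEY"],
    },
)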
k
Hey, so this is expected; the existence of a secret in Cloud doesn't do anything on its own - you have to use it explicitly somehow. For your use case, see how to declare secrets with your storage here
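A sketch of what that might look like, assuming the secrets argument the storage classes take per the docs Kevin is pointing to (names reused from the minimal example above):

flow.storage = Docker(
    registry_url=os.environ["docker_repo"],
    image_name="prefect-result-test",
    # ask Prefect to pull this Cloud secret into prefect.context.secrets at
    # runtime, where S3Result's credential lookup can find it
    secrets=["AWS_CREDENTIALS"],
)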
s
Yes, makes sense. Thanks for your help. I shall give it a go when I'm back in the office. If you don't hear from me you can assume it worked 🙂 Thanks again 👍