# ask-marvin
j
@Marvin I'm running into an issue trying to use boto3 in a Prefect flow. The EC2 instance running Prefect has an assumed role and the relevant policies. When I execute a plain Python script on the EC2 instance, it works fine. However, when boto3 runs from the agent (on the same EC2 instance), I get "The security token included in the request is invalid". What could be wrong?
n
@Marvin oi
m
Hello! How can I assist you today? If you have any questions or need support with Python, data engineering, software development, or anything related to Prefect, feel free to ask.
n
@Marvin I'm running into an issue trying to use boto3 in a Prefect flow. The EC2 instance running Prefect has an assumed role and the relevant policies. When I execute a plain Python script on the EC2 instance, it works fine. However, when boto3 runs from the agent (on the same EC2 instance), I get "The security token included in the request is invalid". What could be wrong?
m
It seems there's not a specific solution immediately available in the public Prefect docs or on GitHub issues regarding the exact error with boto3 and the "security token included in the request is invalid" within a Prefect flow. However, based on your description, this sounds like a potential issue with IAM roles or environment configuration when Prefect is initiating the AWS calls via boto3 within a flow.

Here are a few steps you can take to troubleshoot this issue:
1. Check Environment Variables: Ensure that the Prefect job's environment has the correct AWS environment variables set.
2. IAM Role Configuration: Verify that the IAM role assumed by the EC2 instance has permissions to pass roles to the ECS Tasks.
3. IAM Policy and Trust Relationships: Ensure that the IAM policies and trust relationships are set up correctly. An IAM role needs a trust relationship that allows the entity (like ECS tasks) to assume the role.

Considering these potential issues, the solution could lie within the AWS permissions setup rather than the Prefect configuration. Although I wasn't able to find a precise solution for your problem, here are some guides that might help you troubleshoot the issue further:
- Storage options in Prefect Guide
- ECS Guide in Prefect AWS Documentation

If you haven't yet, try logging the complete stack trace from Prefect and boto3 to get a clearer picture of where the error occurs. It is also advisable to check that IAM roles and instance profiles are properly attached to the EC2 instance. Remember, reviewing the stack trace and boto3 logs often provides deeper insight into what might be misconfigured. If you are still having issues, I would suggest reaching out to Prefect's community Slack or GitHub Issues for more targeted help.
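A quick way to act on the first suggestion above (a minimal sketch, not from the thread): log which principal the flow run's environment is actually calling AWS as. If this fails with the same invalid-token error inside the flow but succeeds in a plain script, the two are resolving different credentials.

```python
# Sketch only: verify which AWS identity the flow run resolves.
import boto3
from prefect import flow, get_run_logger

@flow
def whoami():
    logger = get_run_logger()
    identity = boto3.client("sts").get_caller_identity()
    logger.info("Calling AWS as %s in account %s", identity["Arn"], identity["Account"])

if __name__ == "__main__":
    whoami()
```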
j
Because I can establish a boto3 client and make successful calls on the EC2 instance in a standalone Python script, this problem is isolated to the Prefect agent. Hmmm
n
if you're using an agent + infra block, which infra block?
maybe there's an old token hanging around in config someplace?
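One way to check for that (a sketch, not from the thread): print any AWS_* environment variables and ask botocore where the credentials it resolved came from; an instance profile is reported as `iam-role`.

```python
# Sketch only: surface anything shadowing the EC2 instance profile.
import os
import boto3

def debug_credential_source():
    # Env vars take precedence over the instance profile, so a stale
    # AWS_SESSION_TOKEN here would explain an invalid-token error.
    for key in ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY", "AWS_SESSION_TOKEN", "AWS_PROFILE"):
        print(f"{key}: {'set' if os.environ.get(key) else 'unset'}")

    creds = boto3.Session().get_credentials()
    # 'iam-role' means the instance profile; 'env' or 'shared-credentials-file'
    # means something else is being picked up first.
    print("resolved via:", creds.method if creds else "no credentials found")

if __name__ == "__main__":
    debug_credential_source()
```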
j
@Nate what is an infra block?
n
did you create a deployment? if so how?
j
For this test script I'm trying to execute, it's not a deployment - just a flow I'm executing (by typing `python3.11 test.py`) - but maybe it needs to be a deployment to be able to read the aws credentials block (which is empty because the credentials are assigned to the EC2 instance).
```python
import boto3
from prefect import flow

@flow(name="Test Run")
def test_run():
    bucket_name = 'some_bucket'
    s3 = boto3.client('s3', region_name='ap-southeast-2')
    response = s3.list_objects_v2(Bucket=bucket_name)

    if 'Contents' in response:
        # Iterate over each object and print its key (file name)
        for obj in response['Contents']:
            print(obj['Key'])
    else:
        print("No objects found in the bucket.")

if __name__ == "__main__":
    test_run()
```
👍 1
n
nice! if you're just running the python file there's no deployment required, but if you're using a worker (fka agent) to run forever and listen for new work, you would need to deploy that flow. you should be free to use those `AwsCredentials` directly in your code if you're just running a flow. do you happen to have a session token defined in that aws creds block? or perhaps an old token in your `~/.aws/credentials` on your machine?
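For what Nate describes, a minimal sketch of using the block directly inside a flow (the block name is the one j loads later in the thread; with no keys stored, the block should fall back to boto3's default credential chain, i.e. the instance profile):

```python
# Sketch only: build a boto3 client from an AwsCredentials block inside a flow.
from prefect import flow
from prefect_aws import AwsCredentials

@flow
def list_buckets():
    creds = AwsCredentials.load("aws-data-account")  # assumed block name
    s3 = creds.get_boto3_session().client("s3")
    print([b["Name"] for b in s3.list_buckets()["Buckets"]])

if __name__ == "__main__":
    list_buckets()
```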
j
I've tried using the aws credentials block and without it. Neither works. The aws credentials block has no token. There is no ~/.aws or /root/.aws folder.
n
hmm and you're saying if you just comment out the flow decorator it works fine?
j
Example using the credentials block to establish the client:
```python
from prefect_aws import AwsCredentials

aws_credentials_block = AwsCredentials.load("aws-data-account")
glue_client = aws_credentials_block.get_boto3_session().client("glue")
```
@Nate ok I think I found the issue. With the @flow decorator, executing on the server works fine. But when I execute it from my local machine (using a prefect profile that talks to an EC2 instance), it does not work. I assumed that because the workload was executing on the remote agent, it would use that machine's AWS credentials. But it must be forwarding mine somehow?
n
> the remote agent
hmm what remote agent do you mean? if you're just running the script, there isn't necessarily an agent involved. if you run the script locally, I imagine boto is using whatever creds it can find on your local
j
Sorry, the agent worker pool.
I assume the actual script execution would happen on the worker pool. But that's not the case is it 🙂
n
so workers and work pools deal with deployments. if you're just running the flow as a python script, the sdk will talk to the API and track the flow run, but it's just running as a local process; there's no worker / work pool involvement until you deploy that flow and then create flow runs from that deployment. so commonly, you might start a worker on your ec2 like
```
prefect worker start --pool ec2-subprocess
```
and use something like systemd to make sure that worker stays alive, and then when you create deployments, associate them with that work pool like
```
prefect deploy path/to/flow.py:my_flow --pool ec2-subprocess
```
note that in this case, you'd need that worker to have all the deps your flows need, which is why people reach for ECS / containers and run an ECS worker, or completely sidestep running a worker and just use our ECS push pool to have us spin up containers in your VPC when you create a flow run from a deployment, using some `AwsCredentials` you specify on that work pool
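For reference, a hedged Python alternative to the `prefect deploy` CLI above (assuming a Prefect release where `flow.from_source` / `.deploy` are available; the repo URL, entrypoint, and pool name are placeholders):

```python
# Sketch only: create a deployment on the ec2-subprocess work pool from Python.
from prefect import flow

if __name__ == "__main__":
    flow.from_source(
        source="https://github.com/your-org/your-repo.git",  # placeholder repo
        entrypoint="path/to/flow.py:test_run",               # placeholder entrypoint
    ).deploy(
        name="test-run-on-ec2",
        work_pool_name="ec2-subprocess",  # the pool the worker above listens to
    )
```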
j
Yep, makes sense. I misunderstood the mechanics because I saw the flow run in the UI and assumed that's where the workload was executed. Thanks @Nate!
n
sure thing!