
Brian Roepke

01/20/2023, 7:53 PM
I need some troubleshooting help. I created a deployment that uses S3 cloud storage, and when I run it with a local agent, it works great. I'm setting up a Linux EC2 instance and trying to get the agent to run there. I've successfully installed the libraries I need (s3fs, boto3, prefect_aws, prefect, prefect-snowflake), but I'm running into an error when I start the flow as a run from Prefect Cloud. The agent picks up the job without a problem, but I'm getting the following error (note the blank S3 path):
Downloading flow code from storage at ''

Flow could not be retrieved from deployment.
Full dump of the error in the thread. Could anyone think of a missing dependency or authentication step I missed? (Also, if I go into python3 and run
from prefect.filesystems import S3; s3_block = S3.load("s3-prefect"); print(s3_block)
that works just fine.)
Downloading flow code from storage at ''

Flow could not be retrieved from deployment.
Traceback (most recent call last):
  File "/home/ec2-user/.local/lib/python3.7/site-packages/s3fs/core.py", line 112, in _error_wrapper
    return await func(*args, **kwargs)
  File "/home/ec2-user/.local/lib/python3.7/site-packages/aiobotocore/client.py", line 358, in _make_api_call
    raise error_class(parsed_response, operation_name)
botocore.exceptions.ClientError: An error occurred (403) when calling the HeadObject operation: Forbidden

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/ec2-user/.local/lib/python3.7/site-packages/prefect/engine.py", line 266, in retrieve_flow_then_begin_flow_run
    flow = await load_flow_from_flow_run(flow_run, client=client)
  File "/home/ec2-user/.local/lib/python3.7/site-packages/prefect/client/utilities.py", line 47, in with_injected_client
    return await fn(*args, **kwargs)
  File "/home/ec2-user/.local/lib/python3.7/site-packages/prefect/deployments.py", line 174, in load_flow_from_flow_run
    await storage_block.get_directory(from_path=deployment.path, local_path=".")
  File "/home/ec2-user/.local/lib/python3.7/site-packages/prefect/filesystems.py", line 469, in get_directory
    from_path=from_path, local_path=local_path
  File "/home/ec2-user/.local/lib/python3.7/site-packages/prefect/filesystems.py", line 313, in get_directory
    return self.filesystem.get(from_path, local_path, recursive=True)
  File "/home/ec2-user/.local/lib/python3.7/site-packages/fsspec/asyn.py", line 113, in wrapper
    return sync(self.loop, func, *args, **kwargs)
  File "/home/ec2-user/.local/lib/python3.7/site-packages/fsspec/asyn.py", line 98, in sync
    raise return_result
  File "/home/ec2-user/.local/lib/python3.7/site-packages/fsspec/asyn.py", line 53, in _runner
    result[0] = await coro
  File "/home/ec2-user/.local/lib/python3.7/site-packages/fsspec/asyn.py", line 551, in _get
    rpaths = await self._expand_path(rpath, recursive=recursive)
  File "/home/ec2-user/.local/lib/python3.7/site-packages/fsspec/asyn.py", line 751, in _expand_path
    out = await self._expand_path([path], recursive, maxdepth)
  File "/home/ec2-user/.local/lib/python3.7/site-packages/fsspec/asyn.py", line 769, in _expand_path
    rec = set(await self._find(p, maxdepth=maxdepth, withdirs=True))
  File "/home/ec2-user/.local/lib/python3.7/site-packages/s3fs/core.py", line 777, in _find
    out = [await self._info(path)]
  File "/home/ec2-user/.local/lib/python3.7/site-packages/s3fs/core.py", line 1216, in _info
    **self.req_kw,
  File "/home/ec2-user/.local/lib/python3.7/site-packages/s3fs/core.py", line 340, in _call_s3
    method, kwargs=additional_kwargs, retries=self.retries
  File "/home/ec2-user/.local/lib/python3.7/site-packages/s3fs/core.py", line 139, in _error_wrapper
    raise err
PermissionError: Forbidden

Christopher Boyd

01/20/2023, 8:23 PM
Do you have getWebcredentials set on the role for your EC2 instance?
You should be able to try accessing S3 directly from the EC2 instance using boto. If you can’t do that (completely disregarding Prefect), you’re likely missing an IAM policy to allow it.
Specifically, this error is what indicates a role permission problem:
botocore.exceptions.ClientError: An error occurred (403) when calling the HeadObject operation: Forbidden

Brian Roepke

01/20/2023, 8:38 PM
Thanks, @Christopher Boyd!
I was able to download with boto3.

Christopher Boyd

01/20/2023, 8:39 PM
Do you have the head permission on that role too though ?
Those should be separate permissions on the role
You should be able to test that with head_object(), which is the operation named in the error.
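The check suggested above can be sketched in Python as follows. This is a minimal sketch, not something from the thread: the helper name head_object_status is hypothetical, and it assumes a boto3 S3 client whose ClientError exceptions carry the parsed response on a .response attribute. The bucket and key in the commented usage are the ones tested later in this thread.

```python
def head_object_status(s3_client, bucket, key):
    """Return (ok, error_code) for a HeadObject call on s3://bucket/key.

    head_object_status is a hypothetical helper for this thread, not part
    of boto3 or Prefect. It accepts any object with a boto3-style
    head_object(Bucket=..., Key=...) method.
    """
    try:
        s3_client.head_object(Bucket=bucket, Key=key)
        return True, None
    except Exception as exc:
        # botocore's ClientError exposes the parsed error on .response;
        # fall back to the exception class name for anything else.
        response = getattr(exc, "response", None) or {}
        code = response.get("Error", {}).get("Code", type(exc).__name__)
        return False, code


# Usage on the EC2 instance (requires boto3 and the instance's credentials):
#   import boto3
#   s3 = boto3.client("s3")
#   print(head_object_status(s3, "prefect-test-1", "test.txt"))
```

A result of (False, "403") would correspond to the Forbidden error in the traceback. For reference, S3 authorizes HeadObject through the s3:GetObject permission, so a policy that allows downloads of an object should normally allow this call on the same object as well.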

Brian Roepke

01/20/2023, 8:44 PM
I'm checking out the IAM role now from your link. There wasn't an explicit IAM role attached at creation.

Christopher Boyd

01/20/2023, 8:44 PM
whatever that flow or object is in the bucket, or in the s3 bucket itself

Brian Roepke

01/20/2023, 8:49 PM
The new IAM role I created has S3 full access and EC2 full access. That should cover the permissions described in the docs.
>>> object = s3.head_object(Bucket='prefect-test-1',Key='test.txt')
>>> object
{'ResponseMetadata': {'RequestId': 'C4ZBXV4KN062ZNWP', 'HostId': 'kQPT+tnlHAp0mn3DK2vQ+nziiV6dWRhYrto7vrjERzPa+5S8U13bfqHGDs0P3ZPnnJtx2RJYzDE=', 'HTTPStatusCode': 200, 'HTTPHeaders': {'x-amz-id-2': 'kQPT+tnlHAp0mn3DK2vQ+nziiV6dWRhYrto7vrjERzPa+5S8U13bfqHGDs0P3ZPnnJtx2RJYzDE=', 'x-amz-request-id': 'C4ZBXV4KN062ZNWP', 'date': 'Fri, 20 Jan 2023 20:58:05 GMT', 'last-modified': 'Fri, 20 Jan 2023 20:37:24 GMT', 'etag': '"bcd7301fca81d73fff76a2ca2f4b04ff"', 'accept-ranges': 'bytes', 'content-type': 'text/plain', 'server': 'AmazonS3', 'content-length': '48'}, 'RetryAttempts': 0}, 'AcceptRanges': 'bytes', 'LastModified': datetime.datetime(2023, 1, 20, 20, 37, 24, tzinfo=tzutc()), 'ContentLength': 48, 'ETag': '"bcd7301fca81d73fff76a2ca2f4b04ff"', 'ContentType': 'text/plain', 'Metadata': {}}

Christopher Boyd

01/20/2023, 9:00 PM
that looks good

Brian Roepke

01/20/2023, 9:01 PM
It's still throwing the same error - I think it's because it's passing an empty string... The line at the top of the stack.
Downloading flow code from storage at ''

Christopher Boyd

01/20/2023, 9:02 PM
How is your s3 block configured / saved?
As well as how the deployment is configured?

Brian Roepke

01/20/2023, 9:04 PM
Block...
prefect deployment build snow_conn_test.py:snowflake_query_flow -n lambda-tmdb-s3 -sb s3/s3-prefect -q lamdba-tmdb -o snowflake_query_flow-deployment.yaml
yaml
basically followed the tutorial verbatim
prefect deployment apply snowflake_query_flow-deployment.yaml

Christopher Boyd

01/20/2023, 9:05 PM
yea, that does look odd - what tutorial did you try? I’d like to try and reproduce

Brian Roepke

01/20/2023, 9:05 PM

Christopher Boyd

01/20/2023, 9:05 PM
Gotcha, I’ll need to walk through it. Nothing you shared looks out of sorts, so it shouldn’t be an empty string like that.

Brian Roepke

01/20/2023, 9:06 PM
Thank you @Christopher Boyd
@Christopher Boyd (sorry to bother on a weekend). I got it resolved! I found this article which contained the error I was getting (I don't know if the message changed or if I modified something to get it):
RuntimeError: File system created with scheme 's3' from base path 's3://prefect-test-1/flows/lambda-tmdb' could not be created. You are likely missing a Python module required to use the given storage protocol.
I decided to reinstall s3fs, and it turned out there was a dependency conflict that I didn't notice the first time. It was with the AWS CLI that I had installed earlier. I removed that and reinstalled s3fs, and boom. It worked! Thank you for the troubleshooting.
BTW - the storage location string is still empty regardless of it working or not.
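A quick import check along these lines might have surfaced the broken s3fs install earlier. This is a rough sketch under stated assumptions: check_imports is a hypothetical helper, and it only detects modules that fail to import, not every possible version conflict. The module names in the commented usage are taken from the traceback paths earlier in this thread.

```python
import importlib


def check_imports(module_names):
    """Try to import each module; return {name: (ok, detail)}.

    check_imports is a hypothetical helper for this thread. detail is the
    module's __version__ (when it defines one) on success, or the
    exception type and message on failure.
    """
    results = {}
    for name in module_names:
        try:
            module = importlib.import_module(name)
        except Exception as exc:
            # A missing package or a broken dependency both end up here.
            results[name] = (False, "{}: {}".format(type(exc).__name__, exc))
        else:
            results[name] = (True, getattr(module, "__version__", "unknown"))
    return results


# Usage on the machine that runs the agent:
#   for name, (ok, detail) in check_imports(
#       ["s3fs", "fsspec", "aiobotocore", "botocore", "boto3", "prefect"]
#   ).items():
#       print(name, "OK" if ok else "FAILED", detail)
```

Running pip check in the same environment is a complementary step, since it reports installed packages whose declared requirements are not satisfied.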

Christopher Boyd

01/23/2023, 2:35 PM
That’s great to hear! I’ll still take a look at the empty string, to see if that’s an issue. Would you mind confirming what version of Prefect you’re using here?

Brian Roepke

01/23/2023, 3:16 PM
2.7.8