# ask-community
e
Hey team! I am new to Prefect and I am using: • EC2 (Ubuntu) • an S3 storage block • I am building a flow that takes data from Mixpanel and dumps it into an S3 bucket. I get the following error (trace in thread)
Flow could not be retrieved from deployment.
from boto3 when I try to run the deployment on the EC2 instance. I can deploy to S3 locally and connect to the bucket from the EC2 instance using the AWS CLI. Can anyone please help me debug this?
Flow could not be retrieved from deployment.
Traceback (most recent call last):
  File "/home/ubuntu/.local/lib/python3.10/site-packages/s3fs/core.py", line 112, in _error_wrapper
    return await func(*args, **kwargs)
  File "/home/ubuntu/.local/lib/python3.10/site-packages/aiobotocore/client.py", line 358, in _make_api_call
    raise error_class(parsed_response, operation_name)
botocore.exceptions.ClientError: An error occurred (AccessDenied) when calling the GetObject operation: Access Denied

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/ubuntu/.local/lib/python3.10/site-packages/prefect/engine.py", line 262, in retrieve_flow_then_begin_flow_run
    flow = await load_flow_from_flow_run(flow_run, client=client)
  File "/home/ubuntu/.local/lib/python3.10/site-packages/prefect/client/utilities.py", line 47, in with_injected_client
    return await fn(*args, **kwargs)
  File "/home/ubuntu/.local/lib/python3.10/site-packages/prefect/deployments.py", line 166, in load_flow_from_flow_run
    await storage_block.get_directory(from_path=deployment.path, local_path=".")
  File "/home/ubuntu/.local/lib/python3.10/site-packages/prefect/filesystems.py", line 468, in get_directory
    return await self.filesystem.get_directory(
  File "/home/ubuntu/.local/lib/python3.10/site-packages/prefect/filesystems.py", line 313, in get_directory
    return self.filesystem.get(from_path, local_path, recursive=True)
  File "/home/ubuntu/.local/lib/python3.10/site-packages/fsspec/asyn.py", line 113, in wrapper
    return sync(self.loop, func, *args, **kwargs)
  File "/home/ubuntu/.local/lib/python3.10/site-packages/fsspec/asyn.py", line 98, in sync
    raise return_result
  File "/home/ubuntu/.local/lib/python3.10/site-packages/fsspec/asyn.py", line 53, in _runner
    result[0] = await coro
  File "/home/ubuntu/.local/lib/python3.10/site-packages/fsspec/asyn.py", line 561, in _get
    return await _run_coros_in_chunks(
  File "/home/ubuntu/.local/lib/python3.10/site-packages/fsspec/asyn.py", line 269, in _run_coros_in_chunks
    await asyncio.gather(*chunk, return_exceptions=return_exceptions),
  File "/usr/lib/python3.10/asyncio/tasks.py", line 408, in wait_for
    return await fut
  File "/home/ubuntu/.local/lib/python3.10/site-packages/s3fs/core.py", line 1134, in _get_file
    body, content_length = await _open_file(range=0)
  File "/home/ubuntu/.local/lib/python3.10/site-packages/s3fs/core.py", line 1125, in _open_file
    resp = await self._call_s3(
  File "/home/ubuntu/.local/lib/python3.10/site-packages/s3fs/core.py", line 339, in _call_s3
    return await _error_wrapper(
  File "/home/ubuntu/.local/lib/python3.10/site-packages/s3fs/core.py", line 139, in _error_wrapper
    raise err
PermissionError: Access Denied
z
The flow is running as a process on the EC2 instance?
e
yes
e
Well, it never actually runs, but that is what I would like to do.
z
It looks like it’s not getting the right credentials
Maybe that documentation will help?
The agent is running on the EC2 instance as well?
e
Yes, the agent seems to spin up fine and connect to Cloud.
z
And your credentials are at
~/.aws/credentials
or in the IAM metadata or?
What’s the CLI using for auth
e
The CLI is using the IAM role
It looks like boto3 is trying to use the
AWS_ACCESS_KEY_ID
and
AWS_SECRET_ACCESS_KEY
environment variables. This might be it
z
Those docs note they’ll try in order: • Env • Config file • IAM
So if those environment variables are set, they’ll be used first.
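As a quick diagnostic of that precedence, a stdlib-only sketch (not Prefect code; `aws_env_vars` is a hypothetical helper) can show which credential variables are actually visible to the process boto3 runs in:

```python
import os

# Credential-related environment variables boto3 checks
# ahead of the config file or the IAM instance role.
AWS_CRED_VARS = (
    "AWS_ACCESS_KEY_ID",
    "AWS_SECRET_ACCESS_KEY",
    "AWS_SESSION_TOKEN",
    "AWS_PROFILE",
)

def aws_env_vars(environ=os.environ):
    """Return the subset of credential variables that are actually set."""
    return {k: environ[k] for k in AWS_CRED_VARS if k in environ}

if __name__ == "__main__":
    found = aws_env_vars()
    if found:
        print("boto3 will likely use these env vars first:", sorted(found))
    else:
        print("No AWS credential env vars set; boto3 should fall back "
              "to the config file or the IAM role.")
```

Running this inside the flow-run process (e.g. from a task) shows what the subprocess actually inherits, which can differ from what an interactive shell on the same machine sees.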
e
Hmm, I deleted them from the block, so I wonder how it is setting them
z
You can unset environment variables with
env={"EXAMPLE_VARIABLE": None}
in the
Process
infrastructure block.
(i.e. if it’s set on the agent it will not be set for the subprocess)
Otherwise, environment variables are copied from the agent to the process
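The merge behaviour described above can be sketched in plain Python (an illustration of the semantics, not Prefect's actual implementation; `merge_env` is a hypothetical name): keys mapped to None are dropped rather than passed through to the subprocess.

```python
def merge_env(agent_env, overrides):
    """Combine the agent's environment with Process-block overrides.

    A value of None removes the variable from the child process,
    mirroring env={"EXAMPLE_VARIABLE": None} as described above.
    """
    merged = dict(agent_env)
    for key, value in overrides.items():
        if value is None:
            merged.pop(key, None)  # unset for the subprocess
        else:
            merged[key] = value
    return merged

# Example: strip a stale AWS key inherited from the agent
child_env = merge_env(
    {"AWS_ACCESS_KEY_ID": "stale", "PATH": "/usr/bin"},
    {"AWS_ACCESS_KEY_ID": None},
)
# child_env == {"PATH": "/usr/bin"}
```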
e
Ahh I hadn't actually done this part
I have no process infra block.
z
Ah you’re just using the default? Then you should just make sure the variables aren’t set on the agent.
e
Hmm, I actually do not see them in the env on the EC2 instance.
Are the environment variables different for the agent than for the EC2 server?
Does not look like it's set:
The s3 storage block
Env and aws variables:
z
Hm I’m a bit at a loss then, cc @Ryan Peden?
e
Hmmm, so in my flow we reference another bucket. Is there a chance we could be overwriting it?
z
The agent spawns a subprocess which then pulls the flow code from your S3 bucket and executes it — it looks to be failing during the pull step so any contents of the flow itself should not matter.
e
Can you point me to the code that runs during the pull?
I want to see if I can recreate it at least.
z
You can call the engine directly i.e. with
python -m prefect.engine <flow-run-id>
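To script that reproduction step, a small stdlib wrapper (a sketch; `engine_command` and `run_engine` are hypothetical helpers, not Prefect APIs) can build and run the same invocation for a given flow-run ID:

```python
import subprocess
import sys

def engine_command(flow_run_id):
    """Build the `python -m prefect.engine <flow-run-id>` invocation."""
    return [sys.executable, "-m", "prefect.engine", flow_run_id]

def run_engine(flow_run_id):
    """Run the engine in the foreground so the pull step (and any
    AccessDenied error) reproduces outside the agent."""
    return subprocess.run(engine_command(flow_run_id), check=False).returncode
```

Running `run_engine` in the same shell as the agent, versus a fresh shell, is one way to spot environment differences between the two.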
I’ve also added logs around this https://github.com/PrefectHQ/prefect/pull/8075
e
Hmm, I even tried adding the env variables and AWS config, and I get the same error.