I'm a bit lost / stuck regarding DockerRun on ACR,...
# prefect-server
l
I'm a bit lost / stuck regarding DockerRun on ACR, combined with Azure storage. My Docker agent requires AZURE_STORAGE_CONNECTION_STRING to pull the flow from storage, but when I pass it as an env var in the DockerRun() params, I keep getting the same error saying it can't be found. No matter what I change, I can't get rid of the error.
Failed to load and execute flow run: Exception('Azure connection string not provided. Set `AZURE_STORAGE_CONNECTION_STRING` environment variable or save connection string as Prefect secret.')
First I was working on the Local storage, but that gave me ModuleErrors and an old prefect thread explained that it should be solved through the use of Azure/Github/etc storages.
with Flow(
    "extract",
    storage=Azure(container="prefect-flows"),
    run_config=DockerRun(image="xxx.azurecr.io/extract:latest", labels=["extract"], env=env_config.dict()),
) as flow:
    print(env_config.dict(), file=stderr)
    # Prints: {'AZURE_STORAGE_CONNECTION_STRING': 'DefaultEndpointsProtocol=https;AccountName=xxxxxx==;EndpointSuffix=core.windows.net'}
a
Can you try adding this extra environment variable to your run config?
export PREFECT__CLOUD__USE_LOCAL_SECRETS=false
The reason for it is: perhaps your flow run tries to grab the Secrets from Prefect cloud and when you're using Server you need to use local Secrets. You can also check out this similar thread that explains it a bit more
l
Thanks, that did do the trick on that error. But sadly enough I get the same error as my Local storage, regarding imported modules that can't be found. My docker image has the following paths:
/app/extract_flows.py

/app/classes:
envs.py
metadata.py
But
extract_flows.py
can't import the classes by using:
from classes.envs import Class
as you normally would do, this gives me the following error:
Failed to load and execute flow run: FlowStorageError('An error occurred while unpickling the flow:\n ModuleNotFoundError("No module named \'classes\'")\nThis may be due to a missing Python module in your current environment. Please ensure you have all required flow dependencies installed.')
How should I import other files within my docker image? Does the pickle influence this?
a
How do you build your Docker image? The default home directory where Prefect looks for your flows and dependencies is
/opt/prefect
. Check this sample Dockerfile that you can use as a template
l
My Docker setup is very, very different. I'm going to try to mimic your path names and see if that works. Thanks for the example!
Hmm when I'm registering the flow now, it's crashing since it can't find the flow_utils. So for example in your example flow: https://github.com/anna-geller/packaging-prefect-flows/blob/master/flows/azure_kubernetes_run_custom_azure_image.py My flow register crashes since it can't find the
flow_utils.db
if __name__ == "__main__":
    flow.register(project_name="default")
In my case the classes utils.
extract_1  | Traceback (most recent call last):
extract_1  |   File "/opt/prefect/flows/extract_flows.py", line 6, in <module>
extract_1  |     from classes.envs import EnvConfig
extract_1  | ModuleNotFoundError: No module named 'classes'
Folder structure:
/opt/prefect/classes:
envs.py
metadata.py

/opt/prefect/config:
example.yml

/opt/prefect/flows:
extract_flows.py
@Anna Geller by any chance, do you have an idea how to fix this issue, or see that I did something wrong? Docker image:
FROM prefecthq/prefect:latest

RUN /usr/local/bin/python -m pip install --upgrade pip
WORKDIR /opt/prefect

COPY ./requirements.txt /tmp/requirements.txt
RUN pip install -r /tmp/requirements.txt

COPY ./src/classes/ /opt/prefect/classes/
COPY ./src/config/ /opt/prefect/config/
COPY ./src/utils/ /opt/prefect/utils/
COPY ./src/extract_flows.py /opt/prefect/flows/
k
You should install the additional dependencies as a Python package or add them to the Python path. Right now they can only be imported from one specific directory, and the flow might not be run from there. It’s better to just install them.
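A minimal sketch of why the import is path-dependent, using only the standard library. It builds a throwaway directory that mirrors the `classes/envs.py` layout from the thread and tries the import before and after that directory is put on `sys.path`. The temp directory and the `can_import` helper are illustrative assumptions, not part of Prefect or of the original image.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Throwaway layout mirroring the thread: <root>/classes/envs.py
root = Path(tempfile.mkdtemp())
(root / "classes").mkdir()
(root / "classes" / "__init__.py").write_text("")
(root / "classes" / "envs.py").write_text("class EnvConfig:\n    pass\n")


def can_import(module_name: str) -> bool:
    """Return True if module_name is importable with the current sys.path."""
    importlib.invalidate_caches()
    # Forget earlier attempts for the module and its top-level package.
    for name in (module_name, module_name.split(".")[0]):
        sys.modules.pop(name, None)
    try:
        importlib.import_module(module_name)
        return True
    except ModuleNotFoundError:
        return False


before = can_import("classes.envs")  # root is not on sys.path yet
sys.path.insert(0, str(root))        # comparable to setting PYTHONPATH
after = can_import("classes.envs")   # root is now searched for packages
print(before, after)
```

Installing the code as a package removes the dependence on where the process is started, which is why it is the more robust of the two options.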
l
Do you have an example for your own written functions? Another way would be to put everything in the flow file, but that's rather hard to maintain.
k
Yep! Check this. It’s not hard
a
Also, the same repo I shared has a setup.py you can use. Basically, you just need to put your src/ modules into flow_utils, or rename your src to flow_utils, and then you should be able to pretty much reuse the code from this repo, including setup.py: https://github.com/anna-geller/packaging-prefect-flows/blob/master/setup.py
Then you need to check that your imports are still fine, e.g. instead of:
from src.classes import XYZ
you would need to have:
from flow_utils.classes import XYZ
or whatever you named the package in setup.py and the folder name (doesn't have to be named
flow_utils
) - hope this is clearer now but if I confused you even further LMK 🙂
l
So confusing 😂 now the imports seem to be working, but the AZURE_STORAGE died again.
Exception: Azure connection string not provided. Set
AZURE_STORAGE_CONNECTION_STRING
environment variable or save connection string as Prefect secret.
Even though I do give both options to the flow.
run_config=DockerRun(image="xxxx.azurecr.io/extract:latest", labels=["extract"], env=env_config.dict()),
env_config.dict()
{'AZURE_STORAGE_CONNECTION_STRING': 'DefaultEndpointsProtocol=https;AccountName=xxxx==;EndpointSuffix=core.windows.net', 'PREFECT__CLOUD__USE_LOCAL_SECRETS': False}
Sometimes code has a life of its own.
k
Yeah I believe something is up with the env_config.dict and if you pass it directly I think it will work
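One thing that can be checked here, as a sketch only: environment variables are always strings once they reach a process, while the dict above carries a Python `False`. The helper below (`to_env_strings`, a hypothetical name) makes the conversion explicit before the dict is handed to `env=`. The thread does not confirm that this was the cause of the error.

```python
from typing import Any, Dict, Mapping


def to_env_strings(values: Mapping[str, Any]) -> Dict[str, str]:
    """Convert a mapping of settings into string-only environment values."""
    env: Dict[str, str] = {}
    for key, value in values.items():
        if value is None:
            continue  # skip unset settings rather than passing the text "None"
        if isinstance(value, bool):
            env[key] = "true" if value else "false"
        else:
            env[key] = str(value)
    return env


# Placeholder values in the shape of the dict shown in the thread.
settings = {
    "AZURE_STORAGE_CONNECTION_STRING": "xxx",
    "PREFECT__CLOUD__USE_LOCAL_SECRETS": False,
}
print(to_env_strings(settings))
```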
l
Hmm.. Then I would've expected it to work with this:
run_config=DockerRun(image="xxx.azurecr.io/extract:latest", labels=["extract"], env={"PREFECT__CLOUD__USE_LOCAL_SECRETS": False}),
The secret is inside my cloud as key.
Still get the same error regarding the AZURE_STORAGE_CONNECTION_STRING
k
did you try
storage=Azure(container="prefect-flows", connection_string_secret="AZURE_STORAGE_CONNECTION_STRING")
?
you can also try seeing if you can pull it from cloud with
Secret("AZURE….").get()
l
The problem is if I pass the connection_string to the storage, I won't have it available on my Agent 😕 and then the agent proceeds to get the same error.
a
I would also recommend setting this secret before starting the agent or when starting the agent as described in this topic: https://discourse.prefect.io/t/how-to-set-storage-secrets-e-g-github-access-token-on-server/70
prefect agent local start --env PREFECT__CONTEXT__SECRETS__AZURE_STORAGE_CONNECTION_STRING=xxx
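To read the variable name in that command: the double underscores separate nested config sections, so the secret ends up under `context.secrets`. The sketch below is a simplified illustration of that naming convention as I understand it for Prefect 1.x, not Prefect's actual loader; `nest_prefect_env` is a hypothetical helper, and values are left as plain strings.

```python
from typing import Any, Dict, Mapping

PREFIX = "PREFECT__"


def nest_prefect_env(environ: Mapping[str, str]) -> Dict[str, Any]:
    """Turn PREFECT__A__B=value pairs into {"a": {"b": value}}."""
    nested: Dict[str, Any] = {}
    for name, value in environ.items():
        if not name.startswith(PREFIX):
            continue
        parts = name[len(PREFIX):].split("__")
        if len(parts) < 2:
            continue  # need at least a section and a key
        # Section names are lowercased. The final key keeps its case for
        # secrets so it matches the secret name the flow asks for.
        is_secret = [p.lower() for p in parts[:2]] == ["context", "secrets"]
        keys = [p.lower() for p in parts[:-1]]
        keys.append(parts[-1] if is_secret else parts[-1].lower())
        node = nested
        for key in keys[:-1]:
            node = node.setdefault(key, {})
        node[keys[-1]] = value
    return nested


example = {
    "PREFECT__CONTEXT__SECRETS__AZURE_STORAGE_CONNECTION_STRING": "xxx",
    "PREFECT__CLOUD__USE_LOCAL_SECRETS": "false",
    "HOME": "/root",  # unrelated variables are ignored
}
print(nest_prefect_env(example))
```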
l
@Kevin Kho Hmm.. Then it's trying to look for it locally.
ValueError: Local Secret "AZURE_STORAGE_CONNECTION_STRING" was not found.
k
That line is just saying that in order to pull this storage, use the stored
AZURE_STORAGE_CONNECTION_STRING
secret. The only thing you need is to point it to Cloud so it doesn’t pull a local secret, so the Agent needs the env var
PREFECT__CLOUD__USE_LOCAL_SECRETS=false
l
Hmm.. Unsure why it's trying to find the key through a value instead of the key.. The
flow.register()
should try to find for
AZURE_STORAGE_CONNECTION_STRING
no? Instead of the value
DefaultEndpointsProtocol=https;AccountName=gensonadlsgen2;AccountKey=xxx==;EndpointSuffix=core.windows.net
Traceback (most recent call last):
  File "/opt/prefect/flows/extract_flows.py", line 45, in <module>
    flow.register(project_name="default")
  File "/usr/local/lib/python3.7/site-packages/prefect/core/flow.py", line 1715, in register
    idempotency_key=idempotency_key,
  File "/usr/local/lib/python3.7/site-packages/prefect/client/client.py", line 848, in register
    serialized_flow = flow.serialize(build=build)  # type: Any
  File "/usr/local/lib/python3.7/site-packages/prefect/core/flow.py", line 1497, in serialize
    storage = self.storage.build()  # type: Optional[Storage]
  File "/usr/local/lib/python3.7/site-packages/prefect/storage/azure.py", line 147, in build
    client = self._azure_block_blob_service.get_blob_client(
  File "/usr/local/lib/python3.7/site-packages/prefect/storage/azure.py", line 178, in _azure_block_blob_service
    conn_str=self.connection_string
  File "/usr/local/lib/python3.7/site-packages/prefect/storage/azure.py", line 162, in connection_string
    return Secret(self.connection_string_secret).get()
  File "/usr/local/lib/python3.7/site-packages/prefect/client/secrets.py", line 159, in get
    ) from exc
KeyError: 'The secret DefaultEndpointsProtocol=https;AccountName=xxxx==;EndpointSuffix=core.windows.net was not found.  Please ensure that it was set correctly in your tenant: https://docs.prefect.io/orchestration/concepts/secrets.html'
Unsure where it gets that value or key from...
Image of my secret in cloud:
k
Can I see your storage definition? This looks like you called get?
l
@Kevin Kho sorry for the late reply.
with Flow(
    "extract",
    storage=Azure(container="prefect-flows", connection_string_secret="xxxxx=="),
    run_config=DockerRun(image="xxx.azurecr.io/extract:latest", labels=["extract"]),
) as flow:
Ahh, I should've put the Cloud secret name there 😅
Great it's working now! Thanks both of you! 🙂
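For reference, a sketch of the flow definition the thread converges on, assuming Prefect 1.x module paths. `connection_string_secret` takes the name of the secret stored in Cloud, not the connection string itself. `SECRET_NAME`, `IMAGE`, and `build_flow` are illustrative names, and the imports sit inside the function only so the file can be loaded without Prefect installed.

```python
# Name of the secret as stored in Prefect Cloud (from the thread), not its value.
SECRET_NAME = "AZURE_STORAGE_CONNECTION_STRING"
# Placeholder registry path, redacted the same way as in the thread.
IMAGE = "xxx.azurecr.io/extract:latest"


def build_flow():
    """Build the 'extract' flow with Azure storage and a Docker run config."""
    # Assumption: Prefect 1.x layout for these imports.
    from prefect import Flow
    from prefect.run_configs import DockerRun
    from prefect.storage import Azure

    return Flow(
        "extract",
        # Pass the secret's name; Prefect resolves the value at build/run time.
        storage=Azure(container="prefect-flows", connection_string_secret=SECRET_NAME),
        run_config=DockerRun(image=IMAGE, labels=["extract"]),
    )
```

Registration would then be `build_flow().register(project_name="default")`, as in the earlier snippet, with the agent started so that secrets are resolved from Cloud rather than locally.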