m

matt_innerspace.io

07/07/2023, 8:15 PM
block setup - azure storage account - i can't figure out how to properly connect to an azure blobstore. From what I can tell, I need this to be able to deploy code (which exists in bitbucket) to a remote worker pool. I've entered enough information to make it work, but it fails:
raise PrefectHTTPStatusError.from_httpx_error(exc) from exc.__cause__
prefect.exceptions.PrefectHTTPStatusError: Client error '401 Unauthorized' for url 'https://api.prefect.cloud/api/accounts/<my account>/workspaces/<my workspace>/block_types/slug/azure/block_documents/name/azure-block-001?include_secrets=true'
Response: {'detail': 'Invalid authentication credentials'}
For more information check: https://httpstatuses.com/401
Is there a document somewhere that describes how to do this?
c

Christopher Boyd

07/07/2023, 8:22 PM
Hi Matt, I’d recommend using the prefect-azure collection: https://github.com/PrefectHQ/prefect-azure/tree/main/prefect_azure and https://prefecthq.github.io/prefect-azure/ The block you are using is much older, and the cloud-specific integrations have been moved to separate repositories so they can be maintained without constantly needing to update core prefect code
image.png
m

matt_innerspace.io

07/07/2023, 8:29 PM
i'm creating the block in the prefect2.0 web console.. selecting the Azure block here. Is what you mentioned above different than that? then trying to deploy it:
prefect deployment build --storage-block azure/azure-block-001/health_check --name health-test --pool default-agent-pool --work-queue aci-test --apply sample/health_flow.py:health_check_flow
c

Christopher Boyd

07/07/2023, 8:37 PM
Hi Matt, Yes - what I mentioned is separate from that. You will need to install the prefect-azure collection, and register the blocks. Then you can configure them in the UI
pip install prefect-azure
and
pip install prefect-azure[blob-storage]
prefect block register -m prefect_azure
Then they will be available in the UI for you to configure
m

matt_innerspace.io

07/07/2023, 8:40 PM
(.venv) mattm ~/dev/workflows$ prefect block register -m prefect_azure
Warning!  Failed to load collection 'prefect_azure': ModuleNotFoundError: No module named 'prefect.workers'
Unable to load prefect_azure. Please make sure the module is installed in your current environment.
c

Christopher Boyd

07/07/2023, 8:41 PM
Was that after you installed it?
m

matt_innerspace.io

07/07/2023, 8:42 PM
just noticed this in the output from the previous 2 commands you listed:
WARNING: prefect-azure 0.2.7 does not provide the extra 'blob-storage'
c

Christopher Boyd

07/07/2023, 8:43 PM
m

matt_innerspace.io

07/07/2023, 8:44 PM
ok, upgraded prefect-azure, but still seeing:
WARNING: prefect-azure 0.2.10 does not provide the extra 'blob-storage'
for this:
pip install prefect-azure[blob-storage]
c

Christopher Boyd

07/07/2023, 8:44 PM
ah, it’s blob_storage
image.png
m

matt_innerspace.io

07/07/2023, 8:46 PM
ok, over that one, seeing this now:
(.venv) mattm ~/dev/workflows$ prefect block register -m prefect_azure
Warning!  Failed to load collection 'prefect_azure': ModuleNotFoundError: No module named 'prefect.workers'
Unable to load prefect_azure. Please make sure the module is installed in your current environment.
c

Christopher Boyd

07/07/2023, 8:46 PM
What version of prefect are you using?
m

matt_innerspace.io

07/07/2023, 8:46 PM
(.venv) mattm ~/dev/workflows$ prefect --version
Warning!  Failed to load collection 'prefect_azure': ModuleNotFoundError: No module named 'prefect.workers'
2.82
c

Christopher Boyd

07/07/2023, 8:48 PM
gotcha - workers (which some of the components in prefect_azure rely on) came out after this version. You could create another venv and install fresh; alternatively, you could probably try to create an azure blob credentials block from the UI at this point, as the blob storage is not reliant on workers
We are currently at 2.10.20
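For anyone hitting the same ModuleNotFoundError, here is a minimal sketch of a check that the installed prefect actually ships the prefect.workers module before registering the prefect_azure blocks. The helper names installed_version and has_module are hypothetical and not part of Prefect; only the Python standard library is used.

```python
import importlib.util
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def has_module(module_name):
    """Return True if module_name can be located in the current environment."""
    try:
        return importlib.util.find_spec(module_name) is not None
    except ImportError:
        # Raised when a parent package (for example prefect itself) is missing.
        return False


if __name__ == "__main__":
    print("prefect:", installed_version("prefect"))
    print("prefect-azure:", installed_version("prefect-azure"))
    if not has_module("prefect.workers"):
        print("prefect.workers is missing; upgrade prefect before registering prefect_azure blocks")
```

Run inside the same venv as the prefect CLI, this should show whether an upgrade is needed before retrying prefect block register -m prefect_azure.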
m

matt_innerspace.io

07/07/2023, 8:49 PM
ok - will the UI look different? i see the same 6 options in the ui after refreshing..
c

Christopher Boyd

07/07/2023, 8:49 PM
You should be able to use the Azure Blob Storage Credentials
The UI should be fine as those blocks are registered with the UI, but if you are trying to run and load a blob storage credential locally, it might fail due to the prefect_azure module being unable to load
Here’s an entire flow that uses that specific credential for reference:
from prefect import task, flow
from prefect import get_run_logger
import pandas as pd
import os
from io import BytesIO

from prefect_azure import AzureBlobStorageCredentials
from prefect_azure.blob_storage import blob_storage_download, blob_storage_upload


def azure_creds():
    # Prefer the saved credentials block; fall back to a connection string
    # from the environment if the block cannot be loaded.
    try:
        return AzureBlobStorageCredentials.load("boydoblobbo")
    except ValueError as e:
        get_run_logger().info(f"No azure_credentials_block found: {e}")
        connection_string = os.getenv("AZURE_STORAGE_CONNECTION_STRING")
        if connection_string is None:
            # os.getenv returns None rather than raising, so check explicitly.
            get_run_logger().info("No connection string found")
            raise
        return AzureBlobStorageCredentials(connection_string=connection_string)


def load_from_azure():

    blob_storage_credentials = azure_creds()
    data = blob_storage_download(
        blob="file.csv",
        container="prefect-logs",
        blob_storage_credentials=blob_storage_credentials,
    )
    return data


@task
def read_file(data):
    return pd.read_csv(BytesIO(data))


def write_df(data):
    df = pd.DataFrame(data, columns=["output"])
    csv_data = df.to_csv()
    blob = blob_storage_upload(
        data=csv_data,
        container="prefect-logs",
        blob="csv_data",
        blob_storage_credentials=azure_creds(),
        overwrite=True,
    )
    return blob


@task
def transform_pd(df):
    results = [row["col1"] * row["col2"] for index, row in df.iterrows()]
    get_run_logger().info(f"{results=}")
    return results


@flow(log_prints=True)
def transform_flow():

    file = load_from_azure()
    df = read_file(file)
    transformed_output = transform_pd(df)
    write_df(transformed_output)


if __name__ == "__main__":
    transform_flow()
m

matt_innerspace.io

07/07/2023, 8:52 PM
i'm actually just trying to deploy code to a storage-block, so it'll run on a remote worker pool. I don't intend to read from the block at all.
prefect deployment build --storage-block azure/azure-block-001/health_check --name health-test --pool default-agent-pool --work-queue aci-test --apply health_flow.py:health_check_flow
Is this not the right way to do it?
c

Christopher Boyd

07/07/2023, 8:54 PM
That’s deprecated unfortunately
Much of this was re-factored as it was a bit difficult to use for many. The way to do this currently is either pythonically in code, or through prefect init / prefect deploy
m

matt_innerspace.io

07/07/2023, 8:56 PM
ok, i was following the thread before this one, trying to get it working - https://prefect-community.slack.com/archives/C048K1CAV7Z/p1686623786614179 is there a similar example I can follow for an azure example? Otherwise I'll try to work my way through the docs you referenced.
you should be able to upgrade your prefect version
then run
prefect init
it will walk you through an interactive process
m

matt_innerspace.io

07/07/2023, 8:58 PM
ok thanks, i'll take a look.. thanks so much for your help.
c

Christopher Boyd

07/07/2023, 8:58 PM
sec, that didn’t seem right
https://docs.prefect.io/2.10.18/tutorial/deployments/ It should be much more straightforward: prefect init will let you choose from an interactive process. Since you’re using azure, it will set up your prefect.yaml for you. You’ll likely still need to create an AzureBlobStorageCredentials block, which should just require a connection string. Then the command would just be prefect deploy, and it will walk you interactively through the process of your deployment. You can certainly pass everything via the CLI if you want and know the values, but this makes it much simpler to follow
I know it probably seems like a lot to shift at the moment, but I assure you it’s much simpler now than it’s ever been and after the first one will be a lot more intuitive
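A minimal sketch of creating that credentials block from a connection string in code rather than in the UI. It assumes prefect-azure is installed with the blob_storage extra and that the connection string lives in the AZURE_STORAGE_CONNECTION_STRING environment variable (the same variable used in the flow earlier in this thread). The helper names and the block name "my-azure-creds" are placeholders, not anything Prefect defines.

```python
import os


def connection_string_from_env(var_name="AZURE_STORAGE_CONNECTION_STRING"):
    """Read the connection string from the environment, failing loudly if unset."""
    value = os.getenv(var_name)
    if not value:
        raise RuntimeError(f"{var_name} is not set")
    return value


def save_credentials_block(block_name, connection_string):
    """Create or overwrite an AzureBlobStorageCredentials block document."""
    # Imported lazily so the helper above works without prefect-azure installed.
    from prefect_azure import AzureBlobStorageCredentials

    block = AzureBlobStorageCredentials(connection_string=connection_string)
    block.save(block_name, overwrite=True)
    return block
```

With the variable exported, calling save_credentials_block("my-azure-creds", connection_string_from_env()) should leave a block that prefect.yaml can reference by that name.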
m

matt_innerspace.io

07/07/2023, 9:04 PM
ok, i'll start over from this point and try again. again, thanks so much for your help. obviously, the docs online and examples online are confusing.
c

Christopher Boyd

07/07/2023, 9:05 PM
I certainly understand and agree - I think if you already have your flow ready to go, that last link should be what you need to get it running
m

matt_innerspace.io

07/07/2023, 9:06 PM
ok, thx.
c

Christopher Boyd

07/07/2023, 9:09 PM
Just for a last demonstration - I ran prefect init, selected azure, and provided the name; this creates almost everything you need, and the only change needed is to add the credentials field from here: https://prefecthq.github.io/prefect-azure/deployments/steps/
with that done, you can just run prefect deploy and provide your flow; your code will go to the storage you selected and be pulled from that same storage
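A rough sketch of what the push and pull sections of prefect.yaml might look like once the credentials field is added, built as plain Python dicts so it can be printed and compared with the generated file. The step names and the azure-blob-storage-credentials block slug are assumptions based on the linked prefect-azure steps page and should be checked against what prefect init actually produced. The function name azure_steps and the container, folder, and block names are placeholders.

```python
import json


def azure_steps(container, folder, credentials_block_name):
    """Build the push and pull sections of a prefect.yaml as plain dicts."""
    # Template reference that Prefect resolves to the saved credentials block.
    credentials = (
        "{{ prefect.blocks.azure-blob-storage-credentials."
        + credentials_block_name
        + " }}"
    )
    common = {
        "requires": "prefect-azure[blob_storage]",
        "container": container,
        "folder": folder,
        "credentials": credentials,
    }
    return {
        "push": [
            {"prefect_azure.deployments.steps.push_to_azure_blob_storage": dict(common)}
        ],
        "pull": [
            {"prefect_azure.deployments.steps.pull_from_azure_blob_storage": dict(common)}
        ],
    }


if __name__ == "__main__":
    # JSON is valid YAML, so this output can be compared against prefect.yaml.
    print(json.dumps(azure_steps("my-container", "health_check", "my-azure-creds"), indent=2))
```

The only part that prefect init does not fill in is the credentials entry, which is why it is the one field to add by hand.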
m

matt_innerspace.io

07/10/2023, 6:31 PM
just getting back to this now and am stuck already? what did i miss?
c

Christopher Boyd

07/10/2023, 6:38 PM
image.png
m

matt_innerspace.io

07/10/2023, 6:56 PM
Screen Shot 2023-07-10 at 2.56.09 PM.png
let me try to upgrade to 2.10.18
ok, that worked.