# marvin-in-the-wild
p
@Marvin Is there a pattern where I don't have to specify the file system type and just load it from the block type? For example, currently I have
```python
fs = SMB.load("azure-file-share")
```
while I'd like something like
```python
fs = RemoteFileShare.load("azure-file-share")
```
and to pass `"azure-file-share"` as a parameter to the flow, to future-proof the pipeline against swapping the storage backend type. Does something like this make sense?
m
There isn't a specific pattern or direct support for a dynamic filesystem loader that abstracts the actual storage backend in Prefect 2.x. However, you can achieve similar functionality by using `RemoteFileSystem` and parametrizing the backend type and settings. Here's how you can do it:
1. Install Necessary Libraries: Ensure that you have the libraries for the different filesystems installed (e.g., `s3fs` for S3, `adlfs` for Azure).
2. Dynamic Loading Using `RemoteFileSystem`: Prefect filesystems support a variety of backends, selected by the protocol in the base path. You can define a function that builds the correct filesystem from configuration.
3. Example Implementation:
```python
from prefect.filesystems import RemoteFileSystem

def load_file_system(backend: str) -> RemoteFileSystem:
    # In practice, fetch these values from Prefect secrets or
    # environment variables instead of hard-coding them here.
    configs = {
        "azure": {
            "basepath": "az://my-azure-container/path",
            # additional adlfs settings
            "settings": {"account_name": "<account-name>", "account_key": "<azure-key>"},
        },
        "s3": {
            "basepath": "s3://my-bucket/path",
            # additional s3fs settings
            "settings": {"key": "<aws-key>"},
        },
        "gcs": {
            "basepath": "gcs://my-bucket/path",
            # additional gcsfs settings
            "settings": {"token": "<gcloud-key>"},
        },
        # Add other providers as needed
    }

    config = configs.get(backend)
    if not config:
        raise ValueError(f"No configuration found for backend: {backend}")

    # RemoteFileSystem infers the protocol from the basepath; the settings
    # dict is passed through to the underlying fsspec filesystem.
    return RemoteFileSystem(basepath=config["basepath"], settings=config["settings"])

# Example usage
fs = load_file_system("azure")
```
4. Reference in Flow: `load_file_system` can now be used within your flows, allowing you to switch storage backends dynamically without changing the flow logic.
For more details and examples on using different storage backends, refer to the Prefect Storage Documentation. This approach future-proofs your pipeline by letting you switch storage backends through configuration changes rather than code changes.
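As a rough illustration of that last step, here is a minimal sketch (assuming Prefect 2.x; the flow name and `backend` parameter are illustrative) of a flow that resolves its filesystem from a parameter via the `load_file_system` helper above:
```python
from prefect import flow

@flow
def my_pipeline(backend: str = "azure"):
    # load_file_system is the helper defined above; resolving the
    # filesystem from a parameter keeps the flow logic backend-agnostic.
    fs = load_file_system(backend)
    fs.write_path("results/output.txt", b"hello")

# Swapping storage backends is then just a parameter change:
# my_pipeline(backend="s3")
```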
n
hi @Paweł Biernat with any block, you can do
```python
smb_type_block_instance = Block.load("smb/my-smb-block-name")
```
does that help?
p
that looks like what I was after, thanks!
so I tried it and it doesn't quite work
```python
fs = Block.load("smb/azure-file-share")
```
leads to
```
ValueError: Unable to find block document named azure-file-share for block type smb
```
although
```python
fs = SMB.load("azure-file-share")
```
works just fine. Maybe I'm doing it wrong...
My bad, I used the wrong block name, it's all good now!
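For reference, a minimal sketch of the resolved pattern (assuming Prefect 2.x; the flow name, block name, and `storage_block` parameter are illustrative), where the slug passed to `Block.load` follows the `<block-type-slug>/<block-name>` convention:
```python
from prefect import flow
from prefect.blocks.core import Block

@flow
def my_pipeline(storage_block: str = "smb/azure-file-share"):
    # Block.load takes a "<block-type-slug>/<block-name>" slug and returns
    # the concrete block instance, so the flow never names the storage type.
    fs = Block.load(storage_block)
    fs.write_path("output/result.txt", b"hello")

# Swapping backends means registering a new block and changing the slug:
# my_pipeline(storage_block="s3/my-s3-block")
```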