Hey Bianca,
Thanks for your quick response! 🙂 Steffen and I are working together on this topic.
> For transferring data between tasks B and C, and E and F, are you considering using remote storage? (ie: S3, GCS, Azure Blob storage, etc.?)
To be honest, we’re still exploring how data transfer between tasks works in Prefect. Due to the above-mentioned privacy constraints, using cloud resources isn’t an option for us. However, we could set up an S3-compatible object store on a local server, such as MinIO.
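To make the idea more concrete, here is roughly the hand-off we have in mind between Task B and Task C. This is only a minimal sketch under assumptions: a local directory stands in for the MinIO bucket, and `put_result`, `get_result`, `task_b`, and `task_c` are hypothetical names of ours, not Prefect or MinIO APIs. The point is that only a storage key travels between the tasks, while the data itself stays in the shared store.

```python
import json
import tempfile
from pathlib import Path


def put_result(store: Path, key: str, payload: dict) -> str:
    """Write a task result under `key` in the shared store and return the key."""
    target = store / key
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(json.dumps(payload))
    return key


def get_result(store: Path, key: str) -> dict:
    """Read a previously stored task result back from the shared store."""
    return json.loads((store / key).read_text())


def task_b(store: Path) -> str:
    # Would run on Server 1: persist the output, hand on only the key.
    return put_result(store, "run-001/task_b.json", {"rows": [1, 2, 3]})


def task_c(store: Path, key: str) -> int:
    # Would run on Server 2: fetch the output of Task B by its key.
    data = get_result(store, key)
    return sum(data["rows"])


if __name__ == "__main__":
    # Assumption: a temporary directory plays the role of the bucket.
    with tempfile.TemporaryDirectory() as tmp:
        store = Path(tmp)
        key = task_b(store)
        print(task_c(store, key))
```

In a real setup, the two helper functions would talk to the on-premises object store instead of the local filesystem, and both servers would need network access to it.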
Our initial idea was to configure a local Prefect server (e.g., within Organization 1) and use two work pools - one in each organization - to execute tasks. However, we’ve run into some challenges in setting up a workflow that spans different machines. Specifically, we’re trying to run Tasks A and B on Server 1 and Tasks C, D, and E on Server 2, all within one workflow.
We came across this discussion and this issue, which suggest that distributed workflows like this are not yet fully supported in Prefect. While this doesn’t seem to be a fundamental limitation, it might be that this use case hasn’t been a primary focus or thoroughly documented yet.
Looking forward to your insights and any recommendations you might have for addressing this setup!