Hi, I’m currently trying to evaluate the usability...
# ask-community
s
Hi, I’m currently trying to evaluate the usability of Prefect (and other orchestration tools) for a rather complex workflow that we have at work.

To give you some brief context: I’m working for a research institute and we want to chain some of our simulation models together with the models of another institute. However, due to legal constraints (export control), we have very strict limitations on what we are allowed to exchange between the two organizations. The general idea is that the models stay within each organization, to be executed on local hardware, and only input/output data is transferred. The goal is for Prefect to organize the data transfer and the continuation of the workflow.

The structure of a typical workflow can look like the attached diagram (tasks could also be subflows):

For this workflow the following privacy constraints would need to be respected:
• The source code of one organization’s tasks shall not be visible to the other organization.
• The data exchanged within one organization shall not be accessible to the other organization, e.g. Organization 1 shall not get access to the data exchanged between Task C, D & E.
• The data exchange between organizations shall be SSL-secured, e.g. from Task B to C.
• The task logs of one organization shall not be visible to the other organization. (In case of failures, log files have a high potential to expose the internals of the models.)

Given the very modular architecture of Prefect I would assume that such a setup should be possible. However, so far I could not find documentation that points me in the right direction. So, any help or ideas on this topic would be much appreciated!

P.S.: I could imagine that such a use case is also very relevant for other Prefect users, e.g. for companies in a customer-supplier relationship.
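To make the data-visibility constraints above concrete, here is a minimal sketch in plain Python (no Prefect involved). The diagram is not reproduced in this thread, so the linear chain A -> B -> C -> D -> E -> F, the assignment of Task F to Organization 1, and all names (`OWNER`, `EDGES`, `allowed_readers`, `crossing_edges`) are assumptions made purely for illustration.

```python
from typing import Dict, List, Set, Tuple

Edge = Tuple[str, str]

# Assumed task ownership and data flow (hypothetical, see lead-in).
OWNER: Dict[str, str] = {
    "A": "org1", "B": "org1",
    "C": "org2", "D": "org2", "E": "org2",
    "F": "org1",
}
EDGES: List[Edge] = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "E"), ("E", "F")]


def allowed_readers(edge: Edge, owner: Dict[str, str]) -> Set[str]:
    """Organizations that may see the data passed along this edge."""
    src, dst = edge
    return {owner[src], owner[dst]}


def crossing_edges(edges: List[Edge], owner: Dict[str, str]) -> List[Edge]:
    """Edges whose data leaves one organization and needs a secured transfer."""
    return [(src, dst) for src, dst in edges if owner[src] != owner[dst]]


if __name__ == "__main__":
    for edge in EDGES:
        print(edge, "readable by", sorted(allowed_readers(edge, OWNER)))
    print("secured transfers needed for:", crossing_edges(EDGES, OWNER))
```

Under these assumptions only the B -> C and E -> F hand-offs cross the organizational boundary; everything else would have to stay in storage that only the owning organization can read.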
b
Hey Steffen!

> Given the very modular architecture of Prefect I would assume that such a setup should be possible

After reading the requirements that you described above, I think Prefect would be a good fit. You can read more about Prefect's security posture, and hybrid execution model, here. In short, Prefect doesn't need ingress access to your environment in order to orchestrate your workflows. Your team's source code and data don't need to be shared with Prefect (or the other organization for that matter) in order for your workflows to run.
For transferring data between tasks B and C, and E and F, are you considering using remote storage (i.e., S3, GCS, Azure Blob storage, etc.)?
s
Hey Bianca,

Thanks for your quick response! 🙂 Steffen and I are working together on this topic.

> For transferring data between tasks B and C, and E and F, are you considering using remote storage (i.e., S3, GCS, Azure Blob storage, etc.)?

To be honest, we’re still exploring how data transfer between tasks works in Prefect. Due to the above-mentioned privacy constraints, using cloud resources isn’t an option for us. However, we could set up an S3-compatible object store, such as MinIO, on a local server.

Our initial idea was to configure a local Prefect server (e.g., within Organization 1) and use two Work Pools, one in each organization, to execute tasks. However, we’ve run into some challenges in setting up a workflow that spans different machines. Specifically, we’re trying to execute Task A and B on Server 1, and Task C, D, and E on Server 2, within one workflow.

We came across this discussion and this issue, which suggest that distributed workflows like this are not yet fully supported in Prefect. While this doesn’t seem to be a fundamental limitation, it might be that this use case hasn’t been a primary focus or thoroughly documented yet.

Looking forward to your insights and any recommendations you might have for addressing this setup!
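For reference, one pattern that might fit the setup described above is a small orchestrating script that triggers one deployment per organization and passes only storage keys between them. This is a rough sketch under several assumptions, not a confirmed solution. The deployment names, the `input_key`/`output_key` flow parameters, and the key layout are hypothetical, and each deployment is assumed to be bound to a work pool whose worker runs inside the owning organization. `run_deployment` is Prefect's function for triggering a deployed flow; by default it waits for the run to finish. The sketch does not address the log-visibility requirement.

```python
from typing import Callable, Dict, List, Optional, Tuple

Trigger = Callable[[str, Dict[str, str]], None]

# (deployment name, incoming exchange edge, outgoing exchange edge).
# All names here are hypothetical placeholders.
STAGES: List[Tuple[str, Optional[str], Optional[str]]] = [
    ("org1-tasks-a-b/org1", None, "b_to_c"),
    ("org2-tasks-c-d-e/org2", "b_to_c", "e_to_f"),
    ("org1-task-f/org1", "e_to_f", None),
]


def exchange_key(run_id: str, edge: str) -> str:
    """Object key in the shared exchange store (assumed layout)."""
    return f"exchange/{run_id}/{edge}"


def prefect_trigger(deployment: str, parameters: Dict[str, str]) -> None:
    """Run one deployment through Prefect and fail if it does not complete."""
    # Imported lazily so the rest of the sketch can be exercised without Prefect.
    from prefect.deployments import run_deployment

    flow_run = run_deployment(name=deployment, parameters=parameters)
    if flow_run.state is None or not flow_run.state.is_completed():
        raise RuntimeError(f"{deployment} did not complete")


def orchestrate(run_id: str, trigger: Trigger = prefect_trigger) -> None:
    """Trigger the stages in order, handing over storage keys only."""
    for deployment, incoming, outgoing in STAGES:
        parameters: Dict[str, str] = {}
        if incoming is not None:
            parameters["input_key"] = exchange_key(run_id, incoming)
        if outgoing is not None:
            parameters["output_key"] = exchange_key(run_id, outgoing)
        trigger(deployment, parameters)
```

Calling `orchestrate("run-001")` would require a reachable Prefect API and the three deployments to exist. Passing a different `trigger` callable makes it possible to dry-run the hand-over logic without Prefect. Because the keys are derived from the run ID, the orchestrator never needs to read either organization's results.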