# prefect-community
o
Okay, follow-up question. Is there any way to create a subflow by flow name, id, deployment_id or something like that? I've tried using create_flow_run_from_deployment but the flow it creates is not a subflow, so it's not suitable for a flow-of-flows scenario. I'm trying to avoid using direct flow-object references since my flows will be deployed individually, using separate docker containers etc. If it's not possible, that's alright, I'll just have to rethink my strategy in that case. I know I've asked a similar question before but the response to that was "use create_flow_run_from_deployment", which I've finally had time to attempt and it's not what I need. 🙂
m
Hey @Oscar Björhn Can I ask what the purpose behind the individual docker containers is?
o
Absolutely! We have a generic base image we call "transformation" that we load with packages such as dbt and soda (soda is a data quality framework/service), and then we have a generic image we call "integration" that we load with some basic integration packages such as singer. On top of that we have more specific integration images loaded with very specific source system packages/drivers. This comes in handy when we, for example, need to load data from an old on-prem Oracle database that requires a 6-year-old Python package/driver bundle that won't get updated, and might otherwise conflict with our other packages. So essentially, we do this to avoid conflicts and to avoid having to compromise on which packages we use. We're hardly experts at Python (or Docker, for that matter), we have a more traditional C#/SQL background, but we figure this is what Docker and microservice infrastructure was built for. Would be glad to hear it if we're doing things wrong.
Note that my description above applies to how our Prefect 1 infrastructure is set up, it hasn't been fully converted to Prefect 2 yet.
m
Brilliant! That definitely makes complete sense. Individual deployments specify the runtime environment for a flow and its dependent flows, i.e. subflows. So if each flow requires its own dependencies, each would need its own deployment to ensure the appropriate runtime environment, but that means each one is an independent flow rather than a subflow. Sorry for the stream of consciousness, I'm just getting the thoughts out of my head, haha. It's definitely an interesting process. What's the goal of the parent flow in this context, beyond just running each step in the process?
o
No worries, seems like you're getting what we're doing, which I appreciate! 🙂 The goal of our parent flow, in this case, is an attempt to cut down on having to schedule a ton of individual flows with complex dependencies. At my previous workplaces I've always ended up rolling my own flow dependency handlers that would have rules such as "don't build the marts unless all the integrations have reported success, UNLESS it's > 9 am, then run it anyway" and other rules that just keep getting more complex. Having a single main flow is my attempt at returning to some sort of simplicity, where we only have to schedule a single orchestrator flow and we let it handle first loading integrations, then testing the raw data, then loading the marts, then testing the marts before finally loading Power BI (and other exports). I should probably take a screenshot of what it looks like, give me a minute. I don't know if I'll be able to keep it this simple as we add more integrations, but so far it's been working pretty well.
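For readers following along, here is a minimal sketch of the orchestration logic described above, written in plain Python rather than Prefect code. The function names, the stage names, and the idea of passing each step in as a callable that returns True on success are all assumptions made for illustration; they are not part of anyone's actual setup in this thread. The 9 am gate is the example rule quoted from earlier hand-rolled systems, and applying it to everything downstream of the integrations is a simplification.

```python
from datetime import time
from typing import Callable, List, Tuple

Step = Callable[[], bool]  # hypothetical: a step returns True on success


def should_build_marts(integration_ok: List[bool], now: time,
                       cutoff: time = time(9, 0)) -> bool:
    """Proceed if every integration succeeded, or if it is past the cutoff."""
    return all(integration_ok) or now > cutoff


def run_orchestrator(integrations: List[Step],
                     downstream: List[Tuple[str, Step]],
                     now: time,
                     cutoff: time = time(9, 0)) -> List[str]:
    """Run integrations, then the downstream stages in order.

    Returns the names of the downstream stages that completed successfully.
    """
    # Run every integration and collect its success flag.
    results = [run() for run in integrations]

    # Gate: skip everything downstream unless the rule allows it.
    if not should_build_marts(results, now, cutoff):
        return []

    completed = []
    for name, step in downstream:
        # Stop at the first failing stage; later stages depend on it.
        if not step():
            break
        completed.append(name)
    return completed
```

In a real setup each callable would trigger and wait for a separately deployed flow; how that is done, and whether the resulting run shows up as a subflow, is exactly the open question in this thread.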
o
Neat, thank you! I'll try that out first thing tomorrow, just to make sure the created flow shows up as a subflow that you can click through to from the parent flow. Otherwise I'll rethink how I orchestrate things, for now.
It didn't make a difference compared to what I already had. Specifically, the flow created by create_flow_run_from_deployment has no visible relation to its parent flow, since it's essentially creating a new top-level flow. Thanks anyway. 🙂
👍 1
a
I see what you mean. We plan to introduce a feature (likely next year) that will show the connection between flows deployed with the orchestrator pattern. It will be very 😎, but it will take time, so I'm being fully transparent here.
🎉 1
👍 1
o
Awesome, appreciate it! Makes it easier for me to justify investing some time in a workaround when I know the feature isn't right around the corner.
🙌 1