# ask-community
d
Hi everyone, I'm a bit confused about what infrastructure subflows run on; the documentation isn't very clear on this. E.g. I have agents deployed on self-hosted Kubernetes on multiple machines running in different regions of a cloud provider (say 3 machines in EU, Asia, and US), and all the agents poll the same queue. If the parent flow runs in EU, does the subflow run on 1. The same pod, or 2. Go back through the queue, where it can be picked up by any agent in any region? To rephrase: does a subflow run fork/spawn on the same infrastructure, or go through the queue and inherit the deployment's infrastructure blocks?
r
They will run on the same pod if just imported and called; if you want new jobs/pods to be used, try run_deployment https://github.com/PrefectHQ/prefect-recipes/blob/main/flows-advanced/parent-orchestrator/pokemon_weight.py
and I agree the docs around subflows could highlight the options here a bit better
d
I think subflows should get a separate section of their own
I am looking into the example you provided, thanks. Will ask in the thread if I have more questions
thanks
👍 1
This sounds overly complicated. run_deployment requires that the deployment actually exists, so I'd need to dynamically deploy the subflow if its deployment doesn't exist, which adds clutter to the deployments page. And if the code in the subflow changes, that's a whole new host of issues. I'd either have to pre-delete all the auto-generated subflow deployments (not sure what side effects that would have on subflow run history), OR version my subflow deployments and check the version before every run to see whether a new deployment is required, OR redeploy the subflow on every run. Any solutions to this issue?
r
If you have a decent CI/CD setup, deployments shouldn't be a huge issue
I prefer keeping them separate and isolated - easier to build pipelines than importing multiple flows into one
d
yes, deployments aren't an issue 😄 but having to redeploy on every run doesn't sound right
r
why would you have to redeploy for every run though?
d
Because the code might change
the subflows are dynamic
I'll provide an example case
Main flow [scheduled hourly]: 1. API returns registered countries 2. Generate a subflow for each country, with the country name included in the subflow name 3. Run the subflows. Subflow [run only through the main flow]: 1. Get a few data points from several different APIs for the country [each API call is a task] 2. Do processing and save to the DB [final task]. Having a separate flow name per subflow is crucial, as it gives me hourly insight into each country's status on the flow page.
Just an example case
r
can you not just use parameters for this with a single country flow, and tag it with the country?
or are the APIs specific to each country? Even so, you can have a list of countries and their APIs and iterate over them (I assume you know these up front)
d
Wouldn't the first option, using parameters, generate a single flow for all the countries, i.e. all the parallel runs for different countries would be summarized under a single flow in the Prefect Cloud flow UI?
The list of countries returned is dynamic
r
no, you could have a parametrised instance of the flow per country
call country_flow with param DE|UK|etc.
d
yup, but all of them will share a single flow name
d
Yes, but that would need a different deployment for each dynamic flow
r
Ok, I don't get why you can't use a single flow and take in a country - but I am not inside your codebase/product
d
Can do that, but then the flow page will show a summarized view of all the countries in a single block in the UI
Guess I can ask the Prefect team whether subflows can be run through the queue without deployments
Thanks for your time though
r
you can always build your own view of flow runs with the API - I wouldn't constrain what you do based on how the UI currently looks
d
Yup that sounds good