Hi, I'd like to discuss parametrizing flows. At my...
# ask-community
b
Hi, I'd like to discuss parametrizing flows. At my job we have a single 'flow type', let's say (the same sequence of tasks defined in Python), for a specific feature. This flow type has different parameters per customer, leading to a list of Flows in Prefect that have the same Python functions but different parameter values (and usually different schedules). We want to add a new Flow automatically when a new set of customer parameters is added to our backend, but without having access to the `Flow` object in Python. So I was thinking I'd query an existing flow (possibly a special 'template' one), modify it and insert it again, but it feels kind of hacky. Any thoughts?
k
Hey @Bouke Krom, if the Flow is the same and only the Parameters change, you can just add a Schedule with a different set of Parameters using the GraphQL API since Parameters can be attached to a Schedule. Would that work for you?
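A rough sketch of how such a schedule update could be put together from Python. The mutation name `set_flow_group_schedule`, its input fields (`flow_group_id`, `cron_clocks`, `parameter_defaults`) and the customer values are assumptions based on the Prefect 1.x GraphQL schema, not something confirmed in this thread, so check them in the Interactive API before relying on them.

```python
import json

# Assumed mutation name and input type; verify against your Prefect API schema.
SET_SCHEDULE = """
mutation($input: set_flow_group_schedule_input!) {
  set_flow_group_schedule(input: $input) { success }
}
"""


def build_schedule_variables(flow_group_id, customers):
    """Build one cron clock per customer, each carrying its own parameters."""
    clocks = [
        {"cron": c["cron"], "parameter_defaults": c["parameters"]}
        for c in customers
    ]
    return {"input": {"flow_group_id": flow_group_id, "cron_clocks": clocks}}


# Hypothetical customers, for illustration only.
customers = [
    {"cron": "0 6 * * *", "parameters": {"customer_id": "acme"}},
    {"cron": "0 18 * * 1", "parameters": {"customer_id": "globex"}},
]

variables = build_schedule_variables("<flow-group-id>", customers)
print(json.dumps(variables, indent=2))

# With a configured client this would be sent as:
#   prefect.Client().graphql(SET_SCHEDULE, variables=variables)
```

The payload is built separately from the API call so it can be inspected or tested without a running backend.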
b
Oh thanks, I didn't know that! Let me try it in dev today and I'll let you know.
k
See this for syntax
b
Thanks, that would indeed fix the problem of making new flows (actually schedule + parameter) without having access to the full python flow object. Is there a way to set individual schedules active/inactive or can you only do that at a flow-group level?
(or add/update/delete individual schedules)
k
I think only at the flow group level unfortunately.
b
Well we could work around that by querying
{flow_group(where: {id: {_eq: $flow_id}}){
  id
  schedule
}}
modifying the schedule list as needed and then setting it again.
Is there no way to duplicate a flow?
k
Yeah that would be the way. Will double check but I don’t think there is a way to duplicate the Flow or pull it down and edit then re-register with the GraphQL API
b
OK thanks so much for your help! Prefect docs are pretty good but this kind of advice is even more valuable
k
Just talked to an engineer. What you proposed can be done. You can:
1. query for a template flow using GraphQL
2. update the name of the flow (or whatever else makes it unique)
3. register the new flow using GraphQL
`create_flow` is the mutation used to register, and you can see it in action by checking the code for `Client.register` here
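The three steps above can be sketched on the serialized flow as plain data. This assumes the template's `serialized_flow` is a JSON object with a top-level `name` key and that `create_flow` accepts an input with `project_id` and `serialized_flow`; both are assumptions drawn from how `Client.register` appears to call the API, and the template contents here are made up.

```python
import copy

# Assumed mutation shape; compare with the Client.register source before use.
CREATE_FLOW = """
mutation($input: create_flow_input!) {
  create_flow(input: $input) { id }
}
"""


def clone_serialized_flow(template, new_name):
    """Copy a serialized template flow and give the copy a unique name."""
    clone = copy.deepcopy(template)
    clone["name"] = new_name
    return clone


def build_create_flow_variables(serialized_flow, project_id):
    """Wrap the serialized flow in the input expected by the mutation (assumed)."""
    return {"input": {"project_id": project_id, "serialized_flow": serialized_flow}}


# Hypothetical, heavily trimmed template as it might come back from a query.
template = {"name": "feature-template", "tasks": [], "parameters": []}

new_flow = clone_serialized_flow(template, "feature-customer-acme")
variables = build_create_flow_variables(new_flow, "<project-id>")

# With a configured client this would be sent as:
#   prefect.Client().graphql(CREATE_FLOW, variables=variables)
```

Copying before renaming keeps the template untouched, so the same queried object can be reused for several customers.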
b
Thanks, checking it out 👀
Just to check back in: we decided to go for duplicating flows because it makes the GUI experience more convenient (easy to enable/disable schedules, for example). Actually, since the duplicating is done with a Python-based lambda, you can instantiate the Flow object from the `serialized_flow` data from the GraphQL query, then modify it to your heart's content and register it with the usual `client.register`. Btw, any reason that the `client` interface doesn't provide `get_flow` or `delete_flow` methods (while it does for flow runs)?
k
I think just because it’s not used a lot. I have never seen this use case before 😅. But `client.graphql` should be enough to put the query together. Do you need help with that?
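For reference, a sketch of what thin `get_flow` / `delete_flow` helpers on top of `client.graphql` might look like. The helper names, the `flow` query fields and the `delete_flow` mutation input are assumptions to verify against the schema; `FakeClient` is a stand-in so the sketch runs without a backend.

```python
FLOW_QUERY = """
query($flow_id: uuid!) {
  flow(where: {id: {_eq: $flow_id}}) { id name serialized_flow }
}
"""

# Assumed mutation name and input type.
DELETE_FLOW = """
mutation($input: delete_flow_input!) {
  delete_flow(input: $input) { success }
}
"""


def get_flow(client, flow_id):
    """Return the flow record for flow_id, or None if it does not exist."""
    result = client.graphql(FLOW_QUERY, variables={"flow_id": flow_id})
    flows = result["data"]["flow"]
    return flows[0] if flows else None


def delete_flow(client, flow_id):
    """Delete a flow and report whether the API signalled success."""
    result = client.graphql(DELETE_FLOW, variables={"input": {"flow_id": flow_id}})
    return bool(result["data"]["delete_flow"]["success"])


class FakeClient:
    """Stand-in for prefect.Client that answers from an in-memory dict."""

    def __init__(self, flows):
        self.flows = flows

    def graphql(self, query, variables=None):
        if "delete_flow" in query:
            removed = self.flows.pop(variables["input"]["flow_id"], None)
            return {"data": {"delete_flow": {"success": removed is not None}}}
        flow = self.flows.get(variables["flow_id"])
        return {"data": {"flow": [flow] if flow else []}}


client = FakeClient({"abc": {"id": "abc", "name": "feature-template"}})
found = get_flow(client, "abc")
deleted = delete_flow(client, "abc")
missing = get_flow(client, "abc")
```

Passing the client in as an argument means the same helpers should work with a real `prefect.Client()` instance, as long as the query and mutation shapes match the deployed schema.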
b
No, that's fine, thanks, already implemented it 👍 Just wondered why the 'crud' for flows only has a shorthand 'c' and 'rud' is less polished. Is it something you would appreciate a PR for?
k
I suppose that would make sense if you wanted to contribute. Could you post in #C0106HZ1CMS to see if the core team thinks it makes sense? They would know more than me.
b
Following up on this. What I'm trying to achieve is duplicating a flow while changing some 'metadata' (parameters, schedule, title), with just API calls (so without access to the code that is run). The flow is supposed to run as a `LocalRun` on an agent. I tried querying a template flow, deserializing it, modifying it, then registering it again. This seems to work fine, but when running, it fails on storage: `Failed to load and execute Flow's environment: ValueError('Flow is not contained in this Storage')`. It seems that on registering, the default `LocalStorage` is populated with paths from the system that calls `register`, but I want it to be populated with paths from the agent. Any ideas?
I might still fall back to the other solution: having one flow (group) with many different schedule/parameter combinations. It will make the GUI interaction a lot worse, but seems like a much easier solution from the code side.
k
So the `Local` storage takes in 2 things that might help: `stored_as_script` and `path`, which will point to a location the agent will retrieve it from. If you don’t store as script (default), the Flow gets serialized and saved locally, so of course the agent won’t find it. You need `stored_as_script=True` and the `path=path/on/agent` and this might work.
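A minimal sketch of that storage setup, assuming Prefect 1.x where `Local` lives in `prefect.storage`. The helper names and the example path are hypothetical; the path has to point at a flow script that actually exists on the agent's filesystem.

```python
def script_storage_kwargs(path_on_agent):
    """Settings that make Local storage load the flow from a script on the agent."""
    return {"path": path_on_agent, "stored_as_script": True}


def attach_script_storage(flow, path_on_agent):
    """Point a deserialized flow at a script on the agent before registering it."""
    # Imported lazily so this module still loads where Prefect is not installed.
    from prefect.storage import Local

    flow.storage = Local(**script_storage_kwargs(path_on_agent))
    return flow


# Hypothetical location of the flow script on the agent machine.
kwargs = script_storage_kwargs("/opt/flows/feature_flow.py")
print(kwargs)
```

With a flow rebuilt from the template, `attach_script_storage(flow, "/opt/flows/feature_flow.py")` would be called before `client.register`, so that the registered storage refers to the agent's path rather than a path on the machine doing the registering.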