Constantino Schillebeeckx
08/29/2023, 3:03 PM
```sh
for flow in $flows; do prefect deploy $flow -n prod; done
```
How do I define the (CRON) schedule for each of those flows? The decorator doesn't have it as an argument. Do I need to define a deployment in the flow file and do a `build_from_flow`?

redsquare
08/29/2023, 3:13 PM

Sanz Al
08/29/2023, 3:16 PM
```sh
prefect deployment build -n test_flow -p prod -q default -o ~/deployments/test_flow -t monitoring --skip-upload --cron "55 6 * * */3" ~/prefect/flows/monitoring/test_flow.py:test_flow -a
```
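As an aside, the `--cron "55 6 * * */3"` expression above fires at 06:55 on every third day-of-week. A tiny illustrative helper (not part of Prefect; just a sketch of how a `*/N` step field expands) shows why:

```python
# Expand a cron step field like "*/3" over an inclusive range.
# For the day-of-week field (0-6), "*/3" matches days 0, 3, and 6.
def expand_step(field: str, lo: int, hi: int) -> list[int]:
    if field.startswith("*/"):
        step = int(field[2:])
        return list(range(lo, hi + 1, step))
    return [int(field)]

print(expand_step("*/3", 0, 6))  # -> [0, 3, 6]  (Sunday, Wednesday, Saturday)
```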
Constantino Schillebeeckx
08/29/2023, 3:28 PM
`sh` file somewhere else - I don't think it's a great user experience. Ideally all the deployment details of a flow are abstracted away from the user.

Nate
08/29/2023, 3:35 PM
> when users author a flow, I don't want them to worry about the deployment part.
agreed! this is one of the things we were thinking about when developing the `prefect deploy` / `prefect.yaml` deployment UX! but it does sound like in your case, someone will have to pick a schedule
if you don't want to set a schedule at deployment time, you don't have to. The deployment creator can pop into the UI after `prefect deploy` and click the buttons to get a custom schedule. If you want it to be declarative, you could have definitions in your `prefect.yaml` for schedules that a user could select from and attach to their new deployment (like this)

Constantino Schillebeeckx
08/29/2023, 3:43 PM

redsquare
08/29/2023, 3:44 PM

Nate
08/29/2023, 3:45 PM

Nate
08/29/2023, 3:47 PM

Emerson Franks
08/29/2023, 3:47 PM

Constantino Schillebeeckx
08/29/2023, 3:48 PM

Nate
08/29/2023, 3:50 PM
`what` sounds like the flow they just wrote, and then `when` is the schedule definition that you have for them to select from the list and attach to their new entry in the `prefect.yaml` - do you have a problem with that deployment UX?

Constantino Schillebeeckx
08/29/2023, 3:51 PM

Constantino Schillebeeckx
08/29/2023, 3:51 PM

Nate
08/29/2023, 3:59 PM
> yep, as soon as they need a new schedule, they ping me and are like, I need a cron for 12:43
hmm it's fairly straightforward to define a schedule - just adding this would be the declarative way, if you're interested
```yaml
schedule:
  cron: 43 12 * * *
```
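In context, that snippet attaches to a deployment entry in `prefect.yaml`. A hypothetical sketch (the deployment name, entrypoint, and work pool are placeholders):

```yaml
deployments:
- name: my-flow-prod
  entrypoint: flows/my_flow.py:my_flow
  schedule:
    cron: 43 12 * * *
  work_pool:
    name: prod
```

The deployment author would only need to touch the `schedule:` block; the rest can be templated or shared.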
Emerson Franks
08/29/2023, 4:08 PM
We use `Deployment.build_from_flow` to build each flow from code, rather than .yaml. The output of the factory method is a deployment with an `.apply` method that our dispatcher invokes. So every flow that we have ends up at this set of code:
```python
def deploy(self):
    deployment = self.flow_factory.generate_flow_deployment(
        self.flow_name,
        self.flow_logic,
        self.is_schedule_active,
        self.flow_interval_in_seconds,
        description=self.description,
        flow_parameters=self.flow_parameters,
        extra_tags=self.tags,
        schedule_anchor=self.schedule_anchor,
        job_ttl=self.job_ttl)
    deployment.apply()
```
For the generated flows themselves, we have checked in template files that are leveraged by generators to create the .py files during CD. The templates are really 'simple' for the most part and just have parameters that get filled during the CD using values from the JSON config files. So for Fivetran, we have something like this:
```python
# THIS CODE IS GENERATED BY THE FIVETRAN FLOW CREATOR, DO NOT MANUALLY UPDATE
from prefect import flow, get_run_logger
from prefect.blocks.system import Secret

from fivetran_provider import FivetranProvider


@flow(timeout_seconds={timeout_seconds})
def {flow_name}():
    logger = get_run_logger()
    fivetran_key = Secret.load('five-tran-key')
    fivetran_secret = Secret.load('five-tran-secret')
    fivetran_provider = FivetranProvider(fivetran_key.get(), fivetran_secret.get(), logger)
    logger.info('running sync for connector_name: {connector_name}')
    fivetran_provider.sync_connector('{connector_id}', {timeout_seconds} - 60)


if __name__ == "__main__":
    {flow_name}()
```
This code is filled using a method like this:
```python
def create_flow_file_from_template(self):
    with open('fivetran_flow_template.py', 'r') as flow_file_template:
        flow_file = flow_file_template.read().format(
            flow_name=self.flow_name,
            timeout_seconds=self.five_tran_flow_config.timeout_seconds,
            connector_name=self.five_tran_flow_config.connector_name,
            connector_id=self.five_tran_flow_config.connector_id
        )
    with open(f'{self.flow_name}.py', 'w') as py_file:
        py_file.write(flow_file)
```
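The overall generate-then-import pipeline can be sketched with the stdlib alone; this is a minimal illustration, not the real code, and the module and function names are hypothetical stand-ins for the generated flow files:

```python
import importlib
import sys
from pathlib import Path

# Hypothetical stand-in for a generated flow file and its entrypoint name.
flow_name = "my_generated_flow"

# 1. CD writes the generated .py file (stands in for the template fill above).
Path(f"{flow_name}.py").write_text(f"def {flow_name}():\n    return 'synced'\n")

# 2. Import the generated module and look up the entrypoint by name,
#    mirroring the importlib.import_module + getattr pattern.
sys.path.insert(0, ".")
module = importlib.import_module(flow_name)
flow_fn = getattr(module, flow_name)

print(flow_fn())  # -> synced
```

From there, `flow_fn` is the object you'd hand to a deployment factory.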
You then end up needing to import the modules before you can do any of the deployment. That is done using `importlib`: calling `import_module`, then finally calling `getattr`. From there, the code is just pushed into our FlowFactory as described above.

Nate
08/29/2023, 4:16 PM
> i'm in the process of migrating over
I'll say that I'd recommend `prefect.yaml` / `prefect deploy` over the infra / storage block & `Deployment.build_from_flow` deployment UX, because with the latter you cannot leverage workers fully (the `pull` step, for example) and eventually the block/agent-based deployments will no longer be our main recommendation
as a pretty heavy user on both sides (deployment creator / work pool creator) i'll say that there's nothing I can think of that you can do with `build_from_flow` + python that you can't do with `prefect.yaml` + custom deployment steps (which can be arbitrary python)

Emerson Franks
08/29/2023, 4:19 PM
`build_from_flow` being deprecated. Most of my team really despises yaml. I'll definitely have to follow up with our AM, as this would be a pretty big show stopper for us.

Constantino Schillebeeckx
08/29/2023, 4:19 PMprefect.yaml
vs Deployment.build_from_flow
UX - I'm guessing the former is the "new" way of doing things, and the latter is a bit older? I don't have much context for feature development in 2.0 - it's made getting up to speed a bit more difficult as the docs don't really highlight the differences very well IMHONate
08/29/2023, 4:21 PM
> the former is the "new" way of doing things, and the latter is a bit older?
correct. @Emerson Franks `Deployment.build_from_flow` will be around for a while, and by the time it's deprecated, we should have an analogous python interface for people who don't want to use yaml

Constantino Schillebeeckx
08/29/2023, 4:24 PM
> as a pretty heavy user on both sides (deployment creator / work pool creator)
@Nate what exactly do you mean by work pool creator?
Nate
08/29/2023, 4:25 PM

Nate
08/29/2023, 4:26 PM
> what exactly do you mean by work pool creator?
a devops person that supports deployment authors; for example, I set up a k8s cluster for my team, who shouldn't have to worry about infra
redsquare
08/29/2023, 4:26 PM

redsquare
08/29/2023, 4:29 PM

redsquare
08/29/2023, 4:30 PM

Emerson Franks
08/29/2023, 4:31 PM
`build_from_flow` allows me to stay 100% in Python (which I guess is real code 😉) and not have to worry about yaml. We, of course, have to use yaml for things like our CI/CD pipeline and k8s deployments, but these are really static; after they are set up, we can wash our hands of yaml 🙂
+1 to not seeing why I would ever run a worker instead of k8s.

redsquare
08/29/2023, 4:31 PM

Nate
08/29/2023, 4:39 PM
> worker instead of k8s
it's not worker instead of a specific infra, it's worker instead of agent. workers/agents generally support the same infras. workers are just strongly typed (instead of agents, which try to submit work anywhere) and have extra capabilities (like executing arbitrary job setup in a `pull` step) which may not be relevant for you all.
but point taken
> would hate to have to revisit it without a good reason
i know it's a pain to migrate (i have lots of deployments to manage too 🙂) - anyways, if/when you wanna switch (again, a python interface will exist before you have to worry about it), happy to help smooth over rough edges
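For reference, the `pull` step mentioned above, along with its optional `build` / `push` siblings, is declared in `prefect.yaml`. A hypothetical sketch using step names from the `prefect-docker` and `prefect-aws` collections (image, registry, and bucket names are placeholders):

```yaml
build:
- prefect_docker.deployments.steps.build_docker_image:
    image_name: my-registry/my-flows
    tag: latest
    dockerfile: auto
push:
- prefect_aws.deployments.steps.push_to_s3:
    bucket: my-flow-bucket
    folder: flows
pull:
- prefect_aws.deployments.steps.pull_from_s3:
    bucket: my-flow-bucket
    folder: flows
```

Per the discussion above, only `pull` is really needed by a worker; building the image and pushing the code can be handled by external CI/CD instead.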
Constantino Schillebeeckx
08/29/2023, 5:02 PM

Nate
08/29/2023, 5:05 PM

Constantino Schillebeeckx
08/29/2023, 5:11 PM
`build`?)

Nate
08/29/2023, 5:20 PM
`prefect_docker.deployments.steps.build_docker_image` lives here
in general, these steps like `build_docker_image` or `push_to_s3` are defined / documented in their service's collection (e.g. `prefect-docker`, `prefect-aws`), and with the exception of the `pull` step, they're optional. if you want to build an image / push your code with other CI/CD, that's fine! the worker just needs to know where to get it at runtime

Nate
08/29/2023, 5:23 PM
> custom deployment steps
i mean that I could have a file called `my_steps.py` and write
```python
async def my_fancy_step(arg1: str, arg2: dict):
    ...  # do whatever your step should do
```
and in my `prefect.yaml` have
```yaml
- my_steps.my_fancy_step:
    arg1: "foo"
    arg2:
      key: val
```
and `prefect deploy` will run that for you at deployment time, so long as you have `my_steps` available in the runtime

Constantino Schillebeeckx
08/29/2023, 5:25 PM

Nate
08/29/2023, 5:26 PM

Marty Ko
09/07/2023, 6:59 AM
`prefect deploy` - all configurations that were done from the UI will get over-written from the prefect.yaml. Is there any way to avoid this?