How many concerns should one workflow deal with? This is a system design question.
Let’s say that I have a web application that persists its data to its own PostgreSQL Docker instance.
Then I have a workflow deployed through the Prefect orchestrator, and that workflow writes some data to, say, Snowflake.
Is it OK for my workflow to deal with cross-cutting concerns, such as updating both Snowflake (the business logic data model) and PostgreSQL (the web application data model)?
Or should the web app (e.g. Node.js) subscribe to workflow completions, get notified, and act upon them itself?
I can view the workflow as a conductor that knows all the moving parts of the system, e.g. all the different databases, but I wanted to hear from real-world usage about system design considerations, separation (or mixing) of concerns, and what the tradeoffs are.
03/03/2023, 1:50 PM
I think that’s just a matter of anti-fragility: I’d rather have one update fail and need a second run, which limits a failure to half of the work, than couple both updates together and potentially fail both.
But using Prefect, each update could just be its own flow / sub-flow, with a main flow that either executes them concurrently or sequences them with [wait_for], plus result persistence so that work that already succeeded isn’t retried (and keep your updates idempotent).
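To make the "keep your updates idempotent" part concrete, here is a minimal sketch of an upsert keyed on a run ID, so that a retried update leaves the table in the same state. It uses an in-memory SQLite database purely as a stand-in; the `run_status` table, its columns, and the `upsert_run_status` helper are hypothetical names, and the real PostgreSQL or Snowflake statement would differ (PostgreSQL has a similar `ON CONFLICT` clause, Snowflake would typically use `MERGE`).

```python
import sqlite3


def upsert_run_status(conn: sqlite3.Connection, run_id: str, status: str) -> None:
    """Write one row per run_id; repeating the call does not add rows."""
    conn.execute(
        """
        INSERT INTO run_status (run_id, status) VALUES (?, ?)
        ON CONFLICT(run_id) DO UPDATE SET status = excluded.status
        """,
        (run_id, status),
    )
    conn.commit()


# In-memory SQLite stands in for the web app's database (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE run_status (run_id TEXT PRIMARY KEY, status TEXT)")

upsert_run_status(conn, "run-1", "completed")
upsert_run_status(conn, "run-1", "completed")  # simulated retry of the same update

rows = conn.execute("SELECT run_id, status FROM run_status").fetchall()
```

Because the write is keyed on `run_id`, it doesn’t matter whether the orchestrator retries the step or not; the end state is the same either way.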
Personally, I’d keep atomic operations distinctly separate from each other, so any failures in one don’t impact the other.
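As a rough illustration of that separation, the sketch below runs each update as its own step and records the outcome per step, so a failure in one does not stop the other. This is plain Python rather than Prefect code; `run_isolated`, `update_snowflake`, and `update_postgres` are hypothetical names, and in Prefect each step would more likely be its own flow or sub-flow with its own retries.

```python
from typing import Callable, Dict


def run_isolated(steps: Dict[str, Callable[[], None]]) -> Dict[str, str]:
    """Run every step, recording success or failure for each one separately."""
    outcomes: Dict[str, str] = {}
    for name, step in steps.items():
        try:
            step()
            outcomes[name] = "ok"
        except Exception as exc:
            # A failure is recorded, and the remaining steps still run.
            outcomes[name] = f"failed: {exc}"
    return outcomes


written = []


def update_snowflake() -> None:
    # Placeholder for the business-logic write (assumption).
    written.append("snowflake")


def update_postgres() -> None:
    # Placeholder that simulates the web app database being down (assumption).
    raise ConnectionError("postgres unavailable")


outcomes = run_isolated({"postgres": update_postgres, "snowflake": update_snowflake})
```

Here the Snowflake write still goes through even though the PostgreSQL step fails first, and only the failed step needs to be retried.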