Well, first of all, what you need is exactly a workflow orchestration engine like Prefect! It will handle scheduling, task/step retries, observability, flow visualization, and more.
Do note, however, that there are several other alternatives on the market as well, most notably Airflow and Argo Workflows. The former is one of the first of its kind, so it is well established, but in my opinion it is also becoming legacy (it was built in the era of long-running Hadoop jobs, and it seriously lacks flexibility compared to the others). Argo Workflows, on the other hand, is a newer-generation, k8s-native workflow engine and does exactly what you want: run each task as a separate container. We use it at our company, and even though I really like it, I think there are two major drawbacks:
1. Because it is k8s native, workflows are defined using YAML manifests. Maintaining and deploying these requires a certain skillset that not a lot of devs have. There is a Python API too, but imo that doesn't change things much if you have a large number of workflows to maintain.
2. Each task/step runs as a separate pod. Considering pod startup time (pulling a container image from the registry and starting it), you are forced to bundle tasks together into bigger chunks to avoid too much overhead. But then you only benefit from e.g. retries at the pod level, so it's a constant balancing act…
Prefect imo solves both of these problems with its excellent flow-of-flows pattern and the fact that it's Python native.