We are close to getting our initial orchestration pipeline ported 🙂 but we're a bit confused about how to get long jobs running. tips appreciated.
setup:
-- server 'ui': running the ui container
-- server 'gpu': running a prefect agent; it registers with 'ui' so it can pick up gpu jobs.
-- server 'nb': jupyter notebooks we use to submit jobs. has a local prefect agent installed that points to 'ui'. these notebooks often die
we can do quick one-offs fine. hurray!
tricky case 1: long historic job
we want to run a ~3 day job that processes 200 files, one at a time sequentially in sorted order. the problem is that the notebook server that runs the job will periodically stop, so we really want to submit a job like
seq([ task_1(file_1), task_2(file_2), ... task_n(file_n)])
As soon as the meta-task is submitted, the notebook (and its local agent) can stop. For the next 3 days, however, we want those tasks to run one at a time, and we want to see status in the ui (incl. fails/retries). if we ever want to, we can rerun the flow to add/swap tasks.
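To make the semantics concrete, here is a minimal pure-Python sketch of what `seq(...)` means to us: process files one at a time in sorted order, retry each a fixed number of times, and stop the chain if a file permanently fails (mirroring the default "only run if upstream succeeded" behavior). `process_file` and `max_retries` are illustrative names, not Prefect API:

```python
def run_sequential(files, process_file, max_retries=2):
    """Process files one at a time, in sorted order, with per-file retries.

    Returns a dict mapping each attempted file to either
    ("success", result) or ("failed", error_repr).
    """
    results = {}
    for f in sorted(files):
        for attempt in range(max_retries + 1):
            try:
                results[f] = ("success", process_file(f))
                break
            except Exception as exc:
                if attempt == max_retries:
                    # out of retries: record the failure
                    results[f] = ("failed", repr(exc))
        if results[f][0] == "failed":
            # downstream files never run, like a default all_successful trigger
            break
    return results
```

If we understand correctly, in Prefect 1.x the same chain could be expressed inside a `Flow` by calling `set_upstream()` on each task against the previous one, then registering the flow so the agent on 'gpu' picks it up after the notebook dies; we'd appreciate confirmation that this is the intended pattern.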