Noam
09/04/2024, 1:24 PM
Nate
09/04/2024, 1:29 PM
Noam
09/04/2024, 1:54 PM
Nate
09/04/2024, 2:01 PM
Srul Pinkas
09/04/2024, 2:02 PM
run object (spamming the dashboard / adding load to prefect-DB as an independent run), right?
Nate
09/04/2024, 2:03 PM
Srul Pinkas
09/04/2024, 2:07 PM
Nate
09/04/2024, 2:11 PM
Devin McCabe
09/05/2024, 1:15 PM
Nate
09/05/2024, 3:15 PM
serve and task workers are built for static infrastructure setups
Devin McCabe
09/05/2024, 3:28 PM
A first-class solution in Prefect would look more like being able to specify CPU/GPU/memory/disk/Docker image/etc. as part of the @task decorator, like in this Flyte example. I'm not aware of a Prefect executor that supports such a method, though.
For example, you could imagine writing a simple Prefect flow that has a few tasks that do simple data munging, none of which would require much memory or special compute, plus another task that fits a model and needs a larger, GPU-accelerated instance.
Srul Pinkas
09/05/2024, 3:43 PM
result = some_task.submit(input=something, memory=xx, cpu=yy, external_resources=True)
where the resources are not part of the current flow-run resources. The scenario I have requires running 500 sub-trainings with different inputs, and then merging the results into a single ensemble. I think Metaflow also supports this in their task ("step") decorator, if I remember correctly.
Doing this without sub-flows would require a huge machine (perhaps even too big) for no reason.
Using sub-flows like you mentioned, @Nate, can be a good solution, but it does have some overhead (defining it as a separate flow and managing its deployment; spamming the UI a bit with flow-runs even though they are all tasks of a parent and have no meaning of their own; input-output JSON parsing for each sub-flow...).
Devin McCabe
09/05/2024, 3:45 PM
Srul Pinkas
09/05/2024, 3:46 PM
Nate
09/05/2024, 3:54 PM
> result = some_task.submit(input=something, memory=xx, cpu=yy, external_resources=True)
this sounds like a TaskRunner to me (i.e. we currently have dask, ray), it sounds like you might like to see something like a ModalTaskRunner that doesn't require you to engage with work pools as a means of configuring infra, but allows passing infra config at runtime (which is also possible with subflows / run_deployment via job_variables)
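A minimal sketch of the run_deployment route, assuming a deployment named "train-model/gpu" already exists and that its work pool's base job template exposes cpu and memory variables. The deployment name, the seed parameter, and the variable keys are all hypothetical; the keys that are actually accepted depend on the work pool type. The Prefect import sits inside the launching function so the payload helper can be read and checked on its own.

```python
from typing import Any


def build_job_variables(cpu: int, memory_mb: int) -> dict[str, Any]:
    """Build per-run infrastructure overrides.

    The keys "cpu" and "memory" are assumptions: valid keys are whatever
    the target work pool's base job template defines.
    """
    if cpu <= 0 or memory_mb <= 0:
        raise ValueError("cpu and memory_mb must be positive")
    return {"cpu": cpu, "memory": memory_mb}


def launch_training(seed: int, cpu: int, memory_mb: int):
    """Trigger one sub-training as its own flow run with custom resources."""
    # Imported lazily so this module can be imported without Prefect installed.
    from prefect.deployments import run_deployment

    return run_deployment(
        name="train-model/gpu",     # hypothetical "<flow>/<deployment>" name
        parameters={"seed": seed},  # hypothetical flow parameter
        job_variables=build_job_variables(cpu, memory_mb),
    )
```

Each call still creates a separate flow run, so the UI and serialization overhead raised earlier in the thread applies to this approach.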
> A first-class solution in Prefect would look more like being able to specify CPU/GPU/memory/disk/Docker image/etc. as part of the @task decorator
this is similar to how the Flow object worked in prefect 1, and can definitely see the value in doing something like that for tasks
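Purely as an illustration of the kind of task-level DX being discussed, here is a plain-Python sketch. None of these names exist in Prefect: Resources and task_with_resources are made up for the example, and the decorated functions simply run in-process. A real runner would have to read the attached request and provision matching infrastructure.

```python
from dataclasses import dataclass
from functools import wraps
from typing import Callable, Optional


@dataclass(frozen=True)
class Resources:
    """Hypothetical per-task resource request; not a Prefect class."""
    cpu: int = 1
    memory_gb: int = 1
    gpu: int = 0
    image: Optional[str] = None


def task_with_resources(**requested):
    """Hypothetical decorator that records resource needs on a function."""
    resources = Resources(**requested)

    def decorator(fn: Callable):
        @wraps(fn)
        def wrapper(*args, **kwargs):
            # No provisioning happens here; the call runs locally.
            return fn(*args, **kwargs)

        # A runner could inspect this attribute before scheduling the task.
        wrapper.resources = resources
        return wrapper

    return decorator


@task_with_resources()  # light data munging: defaults are enough
def clean(rows):
    return [r for r in rows if r is not None]


@task_with_resources(cpu=8, memory_gb=64, gpu=1)  # model fitting: larger GPU instance
def fit(rows):
    return sum(rows) / len(rows)  # stand-in for real training
```

An issue describing the desired DX could start from something like this and spell out which fields matter (CPU, GPU, memory, disk, image) and how a task runner should interpret them.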
if anyone wants to codify a specific DX ask in an issue, please feel free!
Devin McCabe
09/05/2024, 8:41 PM