Iván Sánchez
04/01/2022, 7:49 AMMatthias
04/01/2022, 8:53 AMAnna Geller
04/01/2022, 10:08 AMMatthias
04/01/2022, 11:05 AMIf your problem is serving ML models, I'd agree that Prefect is not directly intended for this. It would be better to build a custom API or use some dedicated tool for thatUnless of course, you do batch predictions (in which case Prefect is the perfect tool to do that)
Anna Geller
04/01/2022, 11:19 AMIván Sánchez
04/01/2022, 11:32 AM@task
def process_doc(doc):
ocr_engine = OCREngine() # This will unnecesarily load the model every time the task is called
for page in doc:
print(f'Processing page `{page.page_num}` from doc `{doc.name}`')
ocr_engine.process_page(page)
return doc
ocr_engine = OCREngine()
is being instantiated inside the task. This is suboptimal because it gets loaded every time the task is called. I'd like to have a pool of N
ocr_engines that work in parallel and the tasks are the process_page
of each of the instantiated ocr_engines
Anna Geller
04/01/2022, 11:57 AMIván Sánchez
04/01/2022, 12:02 PMAnna Geller
04/01/2022, 12:22 PMKevin Kho
04/01/2022, 2:03 PM