Hey @Abhas P, Prefect is mainly for batch processing and isn't meant for streaming, since there's overhead from wrapping Python code as tasks and monitoring their state. For example, there are at least 3 API calls per task run just to update the state (starting, running, finished).
For number 2, I don't think Prefect supports that. You would have to spin up a Flask API independently of Prefect and then send requests to that API. Prefect spins batch jobs up and down, so it doesn't have a mechanism for keeping models in memory.
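A minimal sketch of what that standalone Flask service could look like — the route name, port, and the stub `model` function are all hypothetical placeholders for a real model loaded once at startup so it stays in memory across requests:

```python
from flask import Flask, jsonify, request

app = Flask(__name__)

# Load the model once when the process starts; this stub stands in
# for a real predict function (e.g. a loaded scikit-learn model).
def model(features):
    return sum(features)  # placeholder prediction

@app.route("/predict", methods=["POST"])
def predict():
    # Expects JSON like {"features": [1.0, 2.0, 3.0]}
    features = request.get_json()["features"]
    return jsonify({"prediction": model(features)})

# To run standalone: app.run(port=8000)
```

A Prefect flow (or anything else) could then POST to this service without paying the model-load cost on every run.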
You could package the model into a Docker image, but the image would still have to be pulled each time a job starts.