
Prefect Community

Hey folks, I want to use Prefect for some of our AI/ML workflows (in addition to ingest, etc., which we're currently using it for) in conjunction with NVIDIA's Triton Inference Server. I want to get a sense for how to deal with long-running persistent connections across tasks (like giving them all access to the same Triton client rather than constantly recreating the connection), and how serialization of parameters and return values will work for things like preprocessing and inference results (multidimensional numpy arrays, etc.). I've run into issues with BSON ObjectIds from MongoDB documents not being serializable and needing to stringify them, but that's not really a path I want to go down for large binary content.

hi <@U0201NVNMHR> - if you want a process that farms out work to a bunch of tasks (reusing a client along the way), you could write a flow like
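```python
from prefect import flow, task, unmapped
from pydantic import BaseModel, Field


class DataModel(BaseModel):
    foo: str
    bar: int
    baz: list = Field(default_factory=list)


def get_some_service_client():
    # Placeholder: build and return your real client here (e.g. a Triton client).
    return object()


@task
async def call_my_service(item, client):
    # Use the shared client to process a single item.
    pass


@flow(log_prints=True)
async def ai_workflow(input_data: DataModel):
    print(f"got {input_data}")

    # Create the client once and reuse it in every mapped task run.
    client = get_some_service_client()

    # unmapped() passes the same client to each task run instead of iterating over it.
    # (.map is awaited here, as in Prefect 2.x async flows.)
    await call_my_service.map(input_data.baz, unmapped(client))


if __name__ == "__main__":
    import asyncio

    # Prefect validates the dict against DataModel before the flow body runs.
    asyncio.run(
        ai_workflow(
            {"foo": "marvin", "bar": 42, "baz": [1, 2, 3]}
        )
    )
```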
where you're sharing the client instance across tasks and deserializing via the model given to the flow signature
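If passing the client around as an argument gets awkward, another way to avoid recreating the connection is to cache the client per process. Here's a minimal, Prefect-free sketch of that idea. `FakeInferenceClient`, `get_client`, and the `localhost:8001` URL are hypothetical stand-ins, not part of Prefect or the Triton client library.

```python
from functools import lru_cache


class FakeInferenceClient:
    """Hypothetical stand-in for a real client (e.g. a Triton client)."""

    instances_created = 0

    def __init__(self, url: str) -> None:
        # A real client would open its connection here.
        FakeInferenceClient.instances_created += 1
        self.url = url

    def infer(self, item: int) -> int:
        # Placeholder for a real inference request.
        return item * 2


@lru_cache(maxsize=None)
def get_client(url: str = "localhost:8001") -> FakeInferenceClient:
    """Create the client on first use, then return the cached instance for that URL."""
    return FakeInferenceClient(url)


def call_my_service(item: int) -> int:
    # Each call looks the client up instead of constructing a new one.
    return get_client().infer(item)


results = [call_my_service(i) for i in [1, 2, 3]]
```

Note that the cache only lives inside one Python process, so with a task runner that spreads work across several worker processes, each worker would end up with its own client.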

not exactly sure how pydantic plays with binary formats as you mention, but perhaps this could be a starting place
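One option that sidesteps the binary serialization question (an assumption on my part, not something tested against Prefect here) is to keep large payloads out of task parameters and return values, and pass a small string key between tasks instead. The sketch below uses a local temp directory as the store; `STORE`, `save_payload`, `load_payload`, `preprocess`, and `infer` are all hypothetical names, and a shared volume or object store would take the directory's place in a real deployment. A numpy array would first be turned into bytes, for example with `numpy.save` into an in-memory buffer.

```python
import tempfile
import uuid
from pathlib import Path

# Hypothetical scratch directory standing in for shared storage.
STORE = Path(tempfile.mkdtemp(prefix="payloads-"))


def save_payload(data: bytes) -> str:
    """Write raw bytes to the store and return a small string key."""
    key = uuid.uuid4().hex
    (STORE / key).write_bytes(data)
    return key


def load_payload(key: str) -> bytes:
    """Read the bytes back from the store using the key."""
    return (STORE / key).read_bytes()


def preprocess(raw: bytes) -> str:
    # Stand-in for real preprocessing: stores the result and returns only its key.
    return save_payload(raw[::-1])


def infer(key: str) -> int:
    # Stand-in for inference: loads the payload by key and returns a small result.
    return len(load_payload(key))


payload = bytes(range(256)) * 4
key = preprocess(payload)
result = infer(key)
```

The only things crossing task boundaries here are a short string and an integer, both of which serialize without any special handling.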

I'm not sure I could rely on data arriving in batches for that, but I definitely had success with the mapping technique in Prefect 1.0 when I could use it. For inference workloads, ideally I want to maximize throughput and have each request independently traceable. I'll probably just try it live this week, but since I know there's data science folks about, I thought they might have some lessons learned in specific applications.

you’re probably right about this!
> since I know there's data science folks about, I thought they might have some lessons learned in specific applications
feel free to ask if you've specific questions about prefect / infra stuff