
Prefect Community

<@U02GMEZU18B> Are you using TurboPuffer cloud service?

Very cool. Can you share any other alternatives you've considered?

<https://www.trychroma.com/|chromadb> is my go-to for almost everything! it's open source and they also recently started hosting a managed solution

i've primarily used chromadb over time and dabbled in turbopuffer, pinecone (not a fan) and pgvector

i'd like to spend more time with <https://github.com/pydantic/pydantic-ai/blob/main/pydantic_ai_examples/rag.py|pgvector>

well I do use chromadb in a lot of places currently. i haven't updated the slackbot implementation in quite a while so tpuf was just the vectorstore i was checking out when I deployed the slackbot

i will eventually rewrite the slackbot using controlflow or pydantic ai, I just haven't gotten around to it.

imo the RAG performance bottleneck is not "who gives the best cosine similarity between documents" but instead how to
• enrich documents with metadata at ingest time
• empower the AI to filter by this metadata at query time
• combine traditional search with semantic search
which vectorstore gives me cosine similarity is not the most consequential choice in my eyes, they're all quite similar
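
A minimal sketch of those three points, assuming a toy in-memory index rather than any particular vectorstore. Everything here is hypothetical and for illustration only: `embed` is a bag-of-words stand-in for a real embedding model, and `ingest`, `search`, `where` and `alpha` are made-up names, not the API of chromadb, turbopuffer, pinecone or pgvector.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and split into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def embed(text):
    """Toy stand-in for an embedding model: bag-of-words term counts."""
    return Counter(tokenize(text))


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def keyword_score(query, text):
    """Traditional search signal: fraction of query terms found in the text."""
    terms = set(tokenize(query))
    if not terms:
        return 0.0
    return len(terms & set(tokenize(text))) / len(terms)


def ingest(raw_docs):
    """Enrich each document with metadata at ingest time."""
    return [
        {
            "text": d["text"],
            "metadata": {"source": d["source"], "year": d["year"]},
            "vector": embed(d["text"]),
        }
        for d in raw_docs
    ]


def search(index, query, where=None, alpha=0.5, k=3):
    """Filter by metadata, then rank by a blend of semantic and keyword scores.

    `where` is an exact-match metadata filter (e.g. chosen by the AI at query
    time); `alpha` weights the semantic score against the keyword score.
    """
    where = where or {}
    candidates = [
        d for d in index
        if all(d["metadata"].get(key) == value for key, value in where.items())
    ]
    query_vector = embed(query)
    scored = [
        (
            alpha * cosine(query_vector, d["vector"])
            + (1 - alpha) * keyword_score(query, d["text"]),
            d,
        )
        for d in candidates
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [d for _, d in scored[:k]]


index = ingest([
    {"text": "How to deploy a flow with a work pool", "source": "docs", "year": 2024},
    {"text": "Deploy a flow question from a user", "source": "slack", "year": 2023},
    {"text": "Release notes for the new version", "source": "docs", "year": 2024},
])
results = search(index, "deploy a flow", where={"source": "docs"})
```

The similarity function is the least interesting part here: swapping `cosine` for another store's scoring changes little, while the `where` filter and the `alpha` blend decide which documents are even considered and how they are ranked.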