# ask-marvin
@Marvin Are u a RAG bot?
thought for 9.3 seconds
Short answer: kind of. I’m Marvin, Prefect’s AI assistant. I use a mix of reasoning and retrieval:
- I can look up current Prefect docs, GitHub issues/discussions, and CLI help to ground answers.
- I also verify code and commands before suggesting them.
- I don’t have access to your private data unless you share context in the chat.

So I do use RAG-like tools, but I’m not just a simple retriever. What are you working on?
@Marvin What base model are u running on? 4o, gpt5, gemini-2.5-pro?
thought for 12.4 seconds
I don’t have visibility into the exact base model/codename. I’m exposed as Marvin with tool integrations for Prefect (docs/issues/discussions/CLI verification), and the underlying model may be updated over time. If you need a formal statement for compliance or procurement, I can route the request to the Prefect team to confirm current model details.
@Marvin can I adapt u for self use with a different repo/different base model like gemini-2.5-flash?
thought for 41.8 seconds
Short answer: you can’t “repoint Marvin” directly, but you can absolutely build your own version with your repo and a different base model (e.g., Gemini 2.5 Flash). I can outline a lightweight pattern and share starter code.

Clarifying questions
- Where do you want to use it: Slack, CLI, or web?
- What sources: a specific GitHub repo, your docs, tickets, databases?
- Any constraints: on‑prem only, budget limits, latency targets?

How to build a Marvin‑like assistant (model‑agnostic)
1) Choose your base model
- Use Gemini 2.5 Flash for low latency, or any provider you prefer.
2) Add “tools” for retrieval/grounding
- GitHub Issues/Discussions search, docs search, CLI help, etc.
- Give the model function-callable tools so it can decide when to invoke them.
3) Optional RAG
- Index your docs/repo (embeddings + vector store), then use retrieved chunks as context.
4) Orchestrate and operate
- Use Prefect to wrap calls, handle retries/timeouts, store secrets, log traces, and monitor runs.
- Deploy behind a Slack bot/webhook for a chat UI.

Minimal example (Gemini 2.5 Flash + GitHub search tool + simple router)
```python
# pip install google-generativeai requests
import os
import requests
import google.generativeai as genai

GENAI_API_KEY = os.environ["GENAI_API_KEY"]
GITHUB_TOKEN = os.environ["GITHUB_TOKEN"]

genai.configure(api_key=GENAI_API_KEY)
model = genai.GenerativeModel(
    model_name="gemini-2.5-flash",
    tools=[{
        "function_declarations": [{
            "name": "search_github_issues",
            "description": "Search GitHub issues or discussions",
            "parameters": {
                "type": "object",
                "properties": {
                    "repo": {"type": "string", "description": "owner/repo"},
                    "query": {"type": "string", "description": "search terms"},
                },
                "required": ["repo", "query"],
            },
        }]
    }],
)

def search_github_issues(repo, query, n=5):
    headers = {
        "Authorization": f"Bearer {GITHUB_TOKEN}",
        "Accept": "application/vnd.github+json",
    }
    # The issues search endpoint also surfaces discussions if enabled on the repo
    url = "https://api.github.com/search/issues"
    q = f"repo:{repo} {query}"
    r = requests.get(url, headers=headers, params={"q": q, "per_page": n})
    r.raise_for_status()
    items = r.json().get("items", [])
    return [
        {
            "title": it["title"],
            "url": it["html_url"],
            "state": it.get("state"),
            "snippet": (it.get("body") or "")[:400],
        }
        for it in items
    ]

def chat_once(user_message):
    response = model.generate_content(
        [{"role": "user", "parts": [user_message]}]
    )

    # Handle tool calls (function calling)
    if response.candidates and response.candidates[0].content.parts:
        for part in response.candidates[0].content.parts:
            if getattr(part, "function_call", None):
                fc = part.function_call
                if fc.name == "search_github_issues":
                    args = dict(fc.args)  # proto map, not a JSON string
                    results = search_github_issues(**args)
                    # Send the tool result back so the model can produce a grounded answer
                    response = model.generate_content([
                        {"role": "user", "parts": [user_message]},
                        response.candidates[0].content,
                        {"role": "user", "parts": [genai.protos.Part(
                            function_response=genai.protos.FunctionResponse(
                                name="search_github_issues",
                                response={"results": results},
                            )
                        )]},
                    ])
                    break
    return response.text

if __name__ == "__main__":
    print(chat_once("Find open issues about retries in prefecthq/prefect"))
```
Optional: wrap with Prefect for reliability and observability

```python
# pip install prefect google-generativeai requests
from prefect import flow, task

@task(retries=2, retry_delay_seconds=5)
def ask(question: str) -> str:
    return chat_once(question)

@flow
def assistant_flow(question: str) -> str:
    return ask(question)

if __name__ == "__main__":
    print(assistant_flow("What are active discussions about deployments in prefecthq/prefect?"))
```

Slack wiring sketch
- Create a Slack app (Bot token + Events API; subscribe to message.channels).
- Host a small web service that forwards Slack messages to your `assistant_flow`.
- Post replies back to the channel/thread. Prefect gives you logs, retries, and metrics.

RAG indexing (optional)
- Embed files from your repo/docs, store in a vector DB (pgvector, Pinecone, etc.).
- Add a `retrieve(query)` tool that returns top-k chunks; pass them back to the model for final synthesis.

If you share your target sources and runtime (cloud vs on‑prem) I can tailor a starter template and deployment plan, including secrets, rate limits, and cost controls.
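The Slack wiring above can be sketched transport-agnostically. This is a minimal sketch of the Events API handshake and message handling; `handle_slack_event` and `answer_question` are hypothetical names, and a real bot would call `assistant_flow` and post the reply back via `chat.postMessage`:

```python
import json

def answer_question(text: str) -> str:
    # Stand-in for assistant_flow(question) from the Prefect example.
    return f"(answer for: {text})"

def handle_slack_event(body: str) -> dict:
    """Return the JSON-serializable response your web framework should send."""
    payload = json.loads(body)

    # Slack verifies your endpoint once with a url_verification handshake:
    # echo the challenge back.
    if payload.get("type") == "url_verification":
        return {"challenge": payload["challenge"]}

    event = payload.get("event", {})
    # Ignore bot-authored messages to avoid reply loops.
    if event.get("type") == "message" and not event.get("bot_id"):
        reply = answer_question(event.get("text", ""))
        # Post `reply` to the channel via chat.postMessage; passing the
        # original ts as thread_ts keeps the answer threaded.
        return {"ok": True, "reply": reply, "thread_ts": event.get("ts")}
    return {"ok": True}
```

Mount this behind any small web service (Flask, FastAPI, or a bare WSGI handler) that passes the raw request body through.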
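The optional `retrieve(query)` tool can be sketched with an in-memory index. The `VectorIndex` class below is a stand-in for pgvector/Pinecone, and the bag-of-words `embed` is a toy stand-in for a real embedding model (e.g. a Gemini embedding endpoint); only the retrieve-top-k shape is the point:

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Toy bag-of-words "embedding" -- swap in a real embedding model in production.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

class VectorIndex:
    """In-memory stand-in for a vector DB."""

    def __init__(self):
        self.docs: list[tuple[str, Counter]] = []

    def add(self, chunk: str) -> None:
        self.docs.append((chunk, embed(chunk)))

    def retrieve(self, query: str, k: int = 3) -> list[str]:
        # Rank stored chunks by similarity to the query and return the top k.
        qv = embed(query)
        ranked = sorted(self.docs, key=lambda d: cosine(qv, d[1]), reverse=True)
        return [chunk for chunk, _ in ranked[:k]]

index = VectorIndex()
index.add("Retries are configured with the retries argument on @task.")
index.add("Deployments are created with prefect deploy.")
print(index.retrieve("how do I configure retries", k=1))
```

The retrieved chunks would be passed back to the model as a function response, exactly like the GitHub search results in the main example.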
@Tom Han https://github.com/PrefectHQ/marvin/tree/main/examples/slackbot
> can I adapt u for self use with a different repo/different base model like gemini-2.5-flash?
yea it’s built on pydantic-ai. there’s definitely lots of idiosyncratic-to-Prefect stuff here, but if you asked Claude Code or similar to clone prefecthq/marvin and generalize the design of examples/slackbot, it’d probably do a good job for you