I am really struggling with controlflow in combination with Prefect Community #marvin-ai

mira

09/29/2024, 11:06 AM

I am really struggling with controlflow in combination with the ChatGroq model. If a flow has only one task, everything works fine (str output as well as structured output), but as soon as a flow gets more tasks (>2), the orchestration freezes.

The issue that triggers this behaviour is always different, but what all attempts have in common is that a tool use fails. Sometimes it is the tool for marking a task as completed. Sometimes the model hallucinates a tool it doesn't have (Tool use failed: no tool can be called with name generate_toc_from_syllabus). For structured output, it sometimes freezes, repeating "tool use failed" indefinitely without mentioning the tool or input anymore (does it lose the information?). Or it just forgets what to do altogether and simply waits, or the orchestrator keeps explaining or greeting but the assistant doesn't answer ...

In general it is interesting to see that tasks which succeed with just 2 tasks in the flow fail as soon as the flow has 3 tasks ... Do you have any advice for me?

I tried to get the example from the docs to run (interactive language tutor), but the only way to get it to work was to additionally introduce a tool to generate the structured output (and it is still not stable). Without get_lesson:

╭─ Agent: Tutor ──────────────────────────────────────────────────────────────────────────────────╮
│                                                                                                 │
│  Great, I'm excited to create a fun and engaging French lesson for you, Mira! Let's start with  │
│  a few warm-up exercises.                                                                       │
│                                                                                                 │
│  Tool use failed:                                                                               │
│                                                                                                 │
╰─────────────────────────────────────────────────────────────────────────────────── 12:23:30 PM ─╯

The additional tool:

@cf.tool
def get_lesson(topic: str) -> Lesson:
    """create a lesson about `topic`"""
    pass

I also tried a slightly different example, which I didn't get to run at all. Example:

import controlflow as cf
from pydantic import BaseModel

class Course(BaseModel):
    subject: str
    syllabus: str
    content_table: list[str]
    exercises: list[str]

tutor = cf.Agent(
    name="Marvin",
    model="groq/mixtral-8x7b-32768",
    instructions="You are a diligent and conscientious tutor."
)

subject = "SQL"
level = "intermediate"

#@cf.tool
#def get_course(topic: str, syllabus: str, content_table: list[str], exercises: list[str])-> Course:
#    """create a course with the given `topic`, `syllabus`, `content_table` and `exercises`."""

with cf.Flow(default_agent=tutor):

    syllabus = cf.Task(
        f"Create a detailed syllabus for an interesting {level} course on {subject}.",
        result_type=str,
    )

    content_table = cf.Task(
        "Create a table of contents for the course based on the syllabus.",
        result_type=list[str],
        depends_on=[syllabus]
    )

    exercises = cf.Task(
        "Create a list of interesting exercises based on the course content table and the course level.",
        instructions="""
            - Create one exercise per topic of the content table
            - An exercise should be either a detailed exercise the user can solve with the given information
              or an interesting question in the field of the specific topic
        """,
        depends_on=[content_table],
        result_type=list[str]
    )

    course = cf.Task(
        "Put the previously created parts of the course into a course object",
        instructions="Use the previously created syllabus, content table and exercises for the course.",
        depends_on=[exercises],
        #tools=[get_course],
        result_type=Course,
    ).run()

One example of the failing tool use:

│                                                                                                 │
│  I apologize for the confusion. Here is the tool call for marking task 3bf3e5d3 as successful:  │
│                                                                                                 │
│  Tool use failed:                                                                               │
│                                                                                                 │
╰─────────────────────────────────────────────────────────────────────────────────── 11:49:05 AM ─╯
╭─ Agent: Marvin ─────────────────────────────────────────────────────────────────────────────────╮
│                                                                                                 │
│                                                                                                 │
│  Please let me know if you have any further questions or concerns.

Jeremiah

09/30/2024, 8:48 PM

Hi Mira, I'm also seeing odd hanging behavior with groq/mixtral-8x7b-32768 on the language tutor example

Jeremiah

09/30/2024, 8:49 PM

It does work well with groq/llama3-70b-8192, so I think this is an issue with the model rather than groq as a provider

Jeremiah

09/30/2024, 8:49 PM

My mixtral model attempted to call a bunch of functions that don't exist; I'm wondering if the model struggles to follow instructions well? Perhaps we need to introduce special rules for it
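One way to picture the "special rules" idea is a guard that checks each requested tool name against the tools that actually exist and, instead of failing silently, tells the model which names are valid. The following is only a minimal sketch in plain Python. `dispatch_tool_call` and `mark_task_successful` are hypothetical names made up for illustration and are not part of controlflow or the Groq API.

```python
from typing import Any, Callable


def dispatch_tool_call(
    name: str,
    args: dict[str, Any],
    tools: dict[str, Callable[..., Any]],
) -> str:
    """Run the named tool if it exists; otherwise return a corrective message.

    Hypothetical helper for illustration only, not a controlflow function.
    """
    if name not in tools:
        available = ", ".join(sorted(tools))
        return f"Tool use failed: no tool named {name!r}. Available tools: {available}"
    return str(tools[name](**args))


def mark_task_successful(result: str) -> str:
    # Stand-in for a "mark task as successful" tool (hypothetical).
    return f"task marked successful with result: {result}"


tools = {"mark_task_successful": mark_task_successful}

# A valid call is executed normally.
ok = dispatch_tool_call("mark_task_successful", {"result": "done"}, tools)

# A hallucinated tool name produces a message listing the real tools,
# which could be fed back to the model so it can retry.
bad = dispatch_tool_call("generate_toc_from_syllabus", {"syllabus": "..."}, tools)

print(ok)
print(bad)
```

Whether a message like this is enough to get mixtral back on track is an open question; the sketch only shows where such a rule could sit.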

mira

10/01/2024, 7:15 AM

Hi Jeremiah, thank you for your fast response. I should also mention how much I love your new controlflow framework! It's exactly what we need for AI-backed applications. And it is amazing how easy and straightforward controlflow lets you define the tasks 🚀. Thank you for this powerful new tool! And all my attempts with openai have worked well. I tried a bunch of the groq models:

  # model = "mixtral-8x7b-32768",
  # model = "llama-3.1-70b-versatile",
  # model = "llama3-8b-8192"
  model = "llama3-70b-8192",

and each of them indeed behaves differently. But unfortunately none of them was able to produce more than one exercise with the language tutor (if any at all). This is the error message of my last attempt with llama3-70b-8192 after the agent tried to mark the first exercise as successful:

╭─ Agent: Tutor ───────────────────────────────────────────────────────────────────────────────────╮
│                                                                                                  │
│  Tool use failed: parameters for tool call 'mark_task_fa429c74_successful' do not match the      │
│  expected schema, errors: result.content: Invalid type. Expected: string, given: object;         │
╭─ Agent: Tutor ───────────────────────────────────────────────────────────────────────────────────╮
│                                                                                                  │
│  Tool use failed: parameters for tool call 'mark_task_fa429c74_successful' do not match the      │
│  expected schema, errors: result.content: Invalid type. Expected: string, given: object;         │
│  result.exercises: Invalid type. Expected: array, given: object; result.topic: Invalid type.     │
│  Expected: string, given: object                                                                 │
│                                                                                                  │
╰────────────────────────────────────────────────────────────────────────────────────  7:59:00 AM ─╯

I also tried some different, easier examples (but with structured output and more than 2 tasks). Unfortunately, none of them worked with any groq model: they always (!) freeze because of some tool failure (a different one each time). Hopefully you can get them under control ;).
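To make the schema error above more concrete: it says that `topic` and `content` should be strings and `exercises` an array, but the model sent an object for each of them. Here is a minimal sketch in plain Python that reproduces this kind of mismatch. The field names come from the error message. `check_result` is a hypothetical helper, and the exact bad payload is an assumption, since the thread does not show what the model actually sent.

```python
from typing import Any

# Expected shape, read off the error message:
# topic and content are strings, exercises is an array.
EXPECTED: dict[str, type] = {"topic": str, "content": str, "exercises": list}


def check_result(result: dict[str, Any]) -> list[str]:
    """Return one error string per field whose value has the wrong type.

    Hypothetical helper for illustration only, not part of controlflow.
    """
    errors = []
    for field, expected_type in EXPECTED.items():
        value = result.get(field)
        if not isinstance(value, expected_type):
            errors.append(
                f"result.{field}: expected {expected_type.__name__}, "
                f"given {type(value).__name__}"
            )
    return errors


# A payload with plain values passes the check.
good = {
    "topic": "Greetings",
    "content": "Bonjour means hello.",
    "exercises": ["Translate 'hello' into French."],
}

# Assumed failure mode: the model sends schema-like objects instead of values.
bad = {
    "topic": {"type": "string"},
    "content": {"type": "string"},
    "exercises": {"type": "array"},
}

print(check_result(good))
print(check_result(bad))
```

Logging the raw tool-call arguments next to a check like this might show whether the model is echoing the schema back or nesting the values some other way.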

Jeremiah

10/01/2024, 8:46 PM

Glad to hear it (about the framework!) and sorry to hear it (about Groq) -- hopefully we can get it to perform better soon
