In order to get automatic dependency detection between tasks Prefect Community #ask-community

Taylor Brown

12/02/2024, 7:15 PM

In order to get automatic dependency detection between tasks that are invoked directly, I'll need to make sure I return something from my task other than a primitive, and pass that into the next task that depends on it? It appears that if my task returns only an int, using that int in future tasks is not detected as a dependency (unless I invoke it with .submit). That makes sense to me, as I can't imagine how else it would work, but it wasn't made clear by the documentation. (What's further surprising is that calling .visualize on my flow is smart enough to detect dependencies even when the values passed are primitives.) Example:

import random

from prefect import flow, task


@task(log_prints=True, viz_return_value=3)
def get_random_number() -> int:
    return random.randint(0, 100)


@task(log_prints=True, viz_return_value={"random_number": 3})
def get_random_number_object() -> dict[str, int]:
    return {"random_number": random.randint(0, 100)}


@task(log_prints=True)
def print_random_number(random_number: int | dict[str, int]) -> None:
    if isinstance(random_number, dict):
        random_number = random_number["random_number"]
    print(f"The random number is: {random_number}")


@flow(log_prints=True)
def random_number_flow():
    print("Invoking directly (dependency not detected):")
    print_random_number(get_random_number())

    print("Calling object-returning task (dependency auto-detected)")
    print_random_number(get_random_number_object())

    print("Invoking via submit (dependency auto-detected):")
    random_number_result = get_random_number.submit()
    print_random_number.submit(random_number_result).wait()

    print("Invoking via submit.result() (dependency not detected):")
    random_number_result = get_random_number.submit()

Bianca Hoch

12/02/2024, 11:15 PM

Hey Taylor, thanks for sharing this! Yes, typically when defining dependencies between flows and tasks, you need to pass the result from one flow/task to the other (result = any returned value from a task or flow). So something like:

@task
def task_a():
    return "Hello"

@task
def task_b(input_string):
    return f"{input_string} World"

@flow
def my_flow():
    a_result = task_a()
    b_result = task_b(a_result)

Or with .submit():

@task
def say_hello(name):
    return f"Hello {name}!"

@task
def print_result(result):
    print(type(result))
    print(result)

@flow(name="hello-flow")
def hello_world():
    future = say_hello.submit("Marvin")
    print_result.submit(future).wait()

It's worth noting that you can also define dependencies using wait_for, even if there's no data being passed from one task to another. wait_for requires that the task wait for upstream tasks to finish before execution. For example:

@flow
def my_flow():
    a_future = task_a.submit()
    b_future = task_b.submit(wait_for=[a_future])

Taylor Brown

12/02/2024, 11:18 PM

Thank you! This is clever. It looks like the automatic dependency detection only breaks down in the case where I am reliant upon an integer result though, right? If you changed your task_a to return 5, Prefect no longer knows that task_b depends on task_a. (It wasn't documented anywhere, and I used an integer result to test dependency resolution, so I almost thought Prefect couldn't detect dependencies.)

Bianca Hoch

12/04/2024, 11:33 PM

Hmm, okay, I definitely see your point. After speaking with the team about this, my understanding is that the int may be too small to track effectively. It has to do with how Python uses memory for small data.
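
The memory behavior mentioned here can be observed without Prefect. This is a minimal sketch, assuming CPython, where small integers (commonly -5 through 256) are cached and shared as single objects; that is an implementation detail rather than a language guarantee. The helper name same_object is hypothetical.

```python
import sys


def same_object(value: int) -> bool:
    """Build the same integer twice at runtime and report whether both are one object."""
    # Parsing from a string avoids compile-time constant sharing.
    first = int(str(value))
    second = int(str(value))
    return first is second


if sys.implementation.name == "cpython":
    # Small ints come from a shared cache, so both names point at one object.
    print(same_object(5))
    # Larger ints are created fresh each time, so the objects are distinct.
    print(same_object(500))
```

Because every 5 in the process is the same object, an identity-based tracker has no way to tell which task produced it, which would be consistent with the behavior reported in this thread.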

Bianca Hoch

12/04/2024, 11:34 PM

Oddly enough if I do something like this, I'm able to start seeing the dependencies:

from prefect import flow, task

@task
def task_a():
    return 500

@task(log_prints=True)
def task_b(my_number):
    print(f"The number is: {my_number} ")
    

@flow
def my_flow():
    a_result = task_a()
    b_result = task_b(a_result)
    
if __name__ == "__main__":
    my_flow()

Bianca Hoch

12/04/2024, 11:35 PM

Things start breaking down with smaller numbers, though.

Taylor Brown

12/04/2024, 11:35 PM

Oh how interesting! Was it literally just the size of the number that allowed it to work? Fascinating!

Taylor Brown

12/04/2024, 11:36 PM

I didn’t realize how Prefect was tracking these dependencies. It’s much more interesting than I had assumed!
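
To illustrate the kind of tracking discussed in this thread, here is a rough sketch of an identity-keyed registry. It is not Prefect's code: every name below (record_result, find_upstream, is_trackable) is hypothetical, and the rule for skipping small integers is an assumption based on the behavior described above.

```python
from typing import Any, Optional

# Hypothetical registry: maps id(result) to the name of the task that produced it.
_producers: dict[int, str] = {}
# Hold references so recorded ids are not reused after garbage collection.
_keep_alive: list[Any] = []


def is_trackable(obj: Any) -> bool:
    """Assumed rule: skip shared singletons and small cached integers."""
    if obj is None or isinstance(obj, bool):
        return False
    if isinstance(obj, int) and -5 <= obj <= 256:
        return False
    return True


def record_result(task_name: str, result: Any) -> Any:
    """Remember which task produced this object, if it can be identified uniquely."""
    if is_trackable(result):
        _producers[id(result)] = task_name
        _keep_alive.append(result)
    return result


def find_upstream(argument: Any) -> Optional[str]:
    """Look up the producing task by object identity, not by value."""
    return _producers.get(id(argument))


small = record_result("task_a", 5)
large = record_result("task_a", 500)
payload = record_result("get_random_number_object", {"random_number": 3})

print(find_upstream(small))    # None
print(find_upstream(large))    # task_a
print(find_upstream(payload))  # get_random_number_object
```

Under these assumptions, the dict and the larger integer can be traced back to their producing task, while the small integer cannot. That matches the pattern seen in the examples above.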
