
Ross Leung

05/25/2023, 7:36 PM
@Zanie not sure who else to tag, but I will also create an issue on Prefect’s GitHub. Possible code-breaking bug found, with complete reproducible code included in this thread. Basically, if a dataclass has a __post_init__ method that checks and initializes an attribute on self, __post_init__ gets called AGAIN when a task run gets created.
Code to reproduce the bug.
import prefect
from prefect import flow, task
from dataclasses import dataclass

@dataclass
class testDC:
    frodo: int
    baggings: int = None

    def __post_init__(self):
        print("post_init triggered")
        if self.baggings is None:
            print("self.baggings is None triggered")
            self.baggings = self.frodo * 2
        else:
            print("self.baggings should not be checked again. This should never be triggered.")
            self.baggings = 420

# Simple task that takes in an instance of the test dataclass above and prints it. Does no manipulation.
@task(log_prints=True)
def testTask(dc):
    print(dc)

# Simple flow that takes in a parameter, instantiates the test dataclass, and passes the dataclass to the task.
@flow(log_prints=True)
def testFlow(a: int):

    foo = testDC(a)
    testTask(foo)

if __name__ == "__main__":

    # Run the flow with a simple parameter
    testFlow(a=1)
    print(prefect.__version__)
Prefect log showing the post init method being called again, and thus overriding the correct value:
12:32:41.624 | INFO    | prefect.engine - Created flow run 'russet-doberman' for flow 'testFlow'
12:32:41.775 | INFO    | Flow run 'russet-doberman' - post_init triggered
12:32:41.777 | INFO    | Flow run 'russet-doberman' - self.baggings is None triggered
12:32:41.811 | INFO    | Flow run 'russet-doberman' - Created task run 'testTask-0' for task 'testTask'
12:32:41.812 | INFO    | Flow run 'russet-doberman' - Executing 'testTask-0' immediately...
12:32:41.817 | INFO    | Flow run 'russet-doberman' - post_init triggered
12:32:41.818 | INFO    | Flow run 'russet-doberman' - self.baggings should not be checked again. This should never be triggered.
12:32:41.879 | INFO    | Task run 'testTask-0' - testDC(frodo=1, baggings=420)
12:32:41.923 | INFO    | Task run 'testTask-0' - Finished in state Completed()
12:32:41.969 | INFO    | Flow run 'russet-doberman' - Finished in state Completed('All states completed.')
2.8.6
I’m using Prefect 2.8.6

Zanie

05/25/2023, 7:47 PM
You can avoid this by using quote on the parameter

Ross Leung

05/25/2023, 7:49 PM
So where should the quote go? testFlow('a'=1), testTask('foo'), or testDC('a')?

Zanie

05/25/2023, 7:50 PM
testTask(quote(foo)), then we won’t traverse your dataclass
from prefect.utilities.annotations import quote

Ross Leung

05/25/2023, 7:50 PM
oh
Thanks 🙂 So is this considered a “feature” in Prefect? I thought I read the Prefect documentation pretty extensively but I guess I missed the note on this non-Pythonic behavior?

Zanie

05/25/2023, 7:52 PM
Well, it’s not a feature that we reinitialize the dataclass
The feature is that if you pass futures from upstream tasks downstream we automatically resolve them
The future can even be assigned to an attribute of a dataclass

Ross Leung

05/25/2023, 7:53 PM
icicic
That’s why the post init was called again.

Zanie

05/25/2023, 7:53 PM
Unfortunately recursively resolving things like that requires us to copy the data structures
Yeah
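
The behavior described above can be reproduced without Prefect. The following is a minimal sketch, not Prefect's actual implementation: `resolve` is a hypothetical helper, and `concurrent.futures.Future` stands in for a Prefect future. It assumes the traversal rebuilds a dataclass from its (resolved) field values, which is enough to make `__post_init__` run a second time.

```python
from concurrent.futures import Future
from dataclasses import dataclass, fields, is_dataclass
from typing import Any, Optional

calls = []  # records every __post_init__ invocation


@dataclass
class TestDC:
    frodo: Any
    baggings: Optional[int] = None

    def __post_init__(self):
        calls.append("post_init")
        if self.baggings is None:
            self.baggings = 2
        else:
            # Reached only when a value for baggings is passed to __init__
            self.baggings = 420


def resolve(obj: Any) -> Any:
    """Hypothetical resolver: swap futures for their results, copying dataclasses."""
    if isinstance(obj, Future):
        return obj.result()
    if is_dataclass(obj) and not isinstance(obj, type):
        values = {f.name: resolve(getattr(obj, f.name)) for f in fields(obj)}
        # Rebuilding the instance calls __init__, and therefore __post_init__, again
        return type(obj)(**values)
    return obj


# A stand-in for a future returned by an upstream task
upstream = Future()
upstream.set_result(1)

original = TestDC(frodo=upstream)  # first __post_init__ call
resolved = resolve(original)       # second __post_init__ call on the copy

print(len(calls), original.baggings, resolved.frodo, resolved.baggings)
```

In this sketch the copy receives the already-initialized `baggings` value as a constructor argument, so the `else` branch runs on the second pass, matching the log in the reproduction.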

Ross Leung

05/25/2023, 7:53 PM
Thanks!!!!! You were very quick and very helpful!

Zanie

05/25/2023, 7:54 PM
We could probably avoid reinitializing things, but it’s tricky.

Ross Leung

05/25/2023, 7:54 PM
My solution is actually to not check the self.parameter and set the value to itself. I will use a different flag that doesn’t get changed, so I don’t need to use quote() throughout my code.
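
One possible reading of this workaround, sketched under assumptions: the `derive_baggings` field and the `rebuild` helper are hypothetical names, not taken from the thread or from Prefect. The idea is to branch on a flag that `__post_init__` never modifies, so that re-running it on a copy produces the same result.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class TestDC:
    frodo: int
    baggings: Optional[int] = None
    # Hypothetical flag: read by __post_init__ but never changed by it
    derive_baggings: bool = True

    def __post_init__(self):
        if self.derive_baggings:
            # Same result no matter how many times this runs
            self.baggings = self.frodo * 2


def rebuild(obj):
    """Simulate a copy that reconstructs the dataclass from its field values."""
    return type(obj)(**{f.name: getattr(obj, f.name) for f in fields(obj)})


first = TestDC(frodo=1)
second = rebuild(first)
third = rebuild(second)

explicit = TestDC(frodo=1, baggings=7, derive_baggings=False)
explicit_copy = rebuild(explicit)

print(first == second == third, explicit == explicit_copy)
```

Because the flag is copied unchanged, any number of reconstructions leaves the instance equal to the original, which is what makes `quote()` unnecessary in this sketch.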

Zanie

05/25/2023, 7:55 PM
👍 We’re also considering a setting that you could toggle to disable this behavior for dataclasses and pydantic models
Since it’s not exactly common for people to pass futures in those

Ross Leung

05/25/2023, 7:55 PM
I feel like this is a relatively “silent” behavior. I think it’s good to still create the issue on github so others can search for it?

Zanie

05/25/2023, 7:55 PM
Yeah feel free!

Ross Leung

05/25/2023, 7:55 PM
thanks!!!
have a nice day

Zanie

05/25/2023, 7:55 PM
There have been some complaints about Pydantic and there could definitely be more docs
you too

Ross Leung

05/25/2023, 7:55 PM
Prefect’s great btw, i love it
❤️ 2