Ton Steijvers

02/28/2023, 10:41 AM
Hello. When I call a task and pass a Block instance, it looks as if the task receives a copy of the block instance, not the actual reference:
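from prefect import flow, task, get_run_logger
from prefect_snowflake.database import SnowflakeConnector

# block_name is assumed to be defined elsewhere (name of the saved block)
@task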
def mytask(obj):
  get_run_logger().info(f"in mytask: {id(obj)}")  # task shows different id than flow
  with obj.get_connection() as connection:
    connection.execute_string("SELECT 1;")

@flow
def myflow():
  obj = SnowflakeConnector.load(block_name)
  get_run_logger().info(f"in flow: {id(obj)}")
  mytask(obj)
  mytask.submit(obj)
Is this behaviour by design, and can I rely on it? Are there cases where I would get the actual reference instead of a copy? _After the (SnowflakeConnector) object has been used in a task, its underlying connection might have been closed. If subsequent tasks received the same instance, this would result in a "Connection already closed" exception. Getting a copy ensures that I can safely call get_connection in every task._
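A way to avoid depending on copy-versus-reference semantics is to have each task take its own copy explicitly. The sketch below only illustrates the failure mode and that defensive copy. `FakeConnection` and `FakeConnector` are hypothetical stand-ins, not the real SnowflakeConnector, and plain functions are used in place of Prefect tasks so it runs with the standard library alone.

```python
import copy


class FakeConnection:
    """Hypothetical stand-in for a database connection."""

    def __init__(self):
        self.closed = False

    def execute(self, sql):
        if self.closed:
            raise RuntimeError("Connection already closed")
        return f"ran: {sql}"

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        self.closed = True  # leaving the with-block closes the connection
        return False


class FakeConnector:
    """Hypothetical connector that caches a single connection (an assumption)."""

    def __init__(self):
        self._connection = None

    def get_connection(self):
        if self._connection is None:
            self._connection = FakeConnection()
        return self._connection


def use_shared(connector):
    # Works once; a second call on the same instance hits the closed connection.
    with connector.get_connection() as connection:
        return connection.execute("SELECT 1;")


def use_private_copy(connector):
    # Copy first, so the caller's instance never opens (or closes) a connection.
    private = copy.deepcopy(connector)
    with private.get_connection() as connection:
        return connection.execute("SELECT 1;")
```

With this toy, calling `use_shared` twice on one connector raises on the second call, while `use_private_copy` can be called repeatedly because the original instance is never touched. The copy has to be taken from a connector that has not yet opened a connection, otherwise the closed state is copied along with it.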

Christopher Boyd

02/28/2023, 1:59 PM
Maybe I’m not quite understanding your question, but you are creating a local object variable in myflow() and then passing it as a reference to mytask. I’m fairly sure this is expected / consistent behavior unless you made it a global and were passing in a global reference?

Ton Steijvers

02/28/2023, 2:49 PM
Yes, it is a local object. In fact, the object might be the result of another task. Since it is an object, I was expecting it to be passed by reference. If I had created a "regular" (not a BaseModel) object, it would behave differently, i.e. it would get passed by reference and the id would be the same in both the flow and the task. But because the object is a pydantic BaseModel, a copy of it gets passed to the task. That is confusing, so I want to make sure the behaviour I'm seeing is expected.
Example to clarify:
from prefect import flow, task, get_run_logger

class Foo:
    pass

@task
def create_foo() -> Foo:
    foo = Foo()
    get_run_logger().info(f"created {id(foo)}")
    return foo

@task
def show_foo(foo: Foo):
    get_run_logger().info(f"showing {id(foo)}")

@flow
def myflow():
    foo = create_foo()
    show_foo(foo)
This logs:
created 4884560672
showing 4884560672
The same id is shown, so it's the same object. If Foo had instead been declared as class Foo(BaseModel), the logs show this:
created 5092601472
showing 5092876640
The ids differ: show_foo gets a copy of the object instead of a reference to the original object.

Christopher Boyd

02/28/2023, 3:02 PM
Unfortunately, I can’t say that I know the answer here. I’ll have to defer to someone else on this behavior.

Ton Steijvers

03/01/2023, 9:52 AM
I see in the prefect utilities module collections.py that it explicitly makes copies if the object is a BaseModel instance. It would be great to know the reasoning behind this and whether I can rely on it in my code.
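To illustrate how a traversal utility can produce this effect, here is a minimal stdlib-only sketch. It is not Prefect's actual code: `Model`, `Plain`, `visit` and the rebuild rule are hypothetical stand-ins for a pydantic BaseModel and a recursive visitor that reconstructs model instances from their fields while returning other objects unchanged.

```python
class Model:
    """Hypothetical stand-in for a pydantic BaseModel."""

    def __init__(self, **fields):
        self.__dict__.update(fields)


class Plain:
    """An ordinary class the visitor does not know how to rebuild."""


def visit(obj):
    """Rebuild Model instances and containers; return everything else as-is."""
    if isinstance(obj, Model):
        # Reconstructing from the visited fields yields a new object,
        # which is why the id seen by the caller and the callee differs.
        return type(obj)(**{key: visit(value) for key, value in vars(obj).items()})
    if isinstance(obj, list):
        return [visit(item) for item in obj]
    if isinstance(obj, dict):
        return {key: visit(value) for key, value in obj.items()}
    return obj  # unknown types pass through by reference


plain = Plain()
model = Model(name="block", inner=plain)

visited_plain = visit(plain)   # same object as `plain`
visited_model = visit(model)   # new object with the same field values
```

In this toy, the rebuilt model still shares its nested `Plain` attribute with the original, so a different id on the outer object does not by itself mean that everything inside was copied. Whether the real utility behaves the same way for nested state is not something this sketch establishes.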