Ton Steijvers
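02/28/2023, 10:41 AM
from prefect import flow, task, get_run_logger
from prefect_snowflake.database import SnowflakeConnector

@task
def mytask(obj):
    get_run_logger().info(f"in mytask: {id(obj)}")  # task shows different id than flow
    with obj.get_connection() as connection:
        connection.execute_string("SELECT 1;")

@flow
def myflow():
    # block_name is the name of a saved SnowflakeConnector block, defined elsewhere
    obj = SnowflakeConnector.load(block_name)
    get_run_logger().info(f"in flow: {id(obj)}")
    mytask(obj)
    mytask.submit(obj)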
Is this behaviour by design, and can I rely on it? Are there cases where I would get the actual reference instead of a copy?
_After the (SnowflakeConnector) object has been used in a task, its underlying connection might have been closed. If subsequent tasks received the same instance, this would result in a "Connection already closed" exception. Getting a copy ensures that I can safely call get_connection in every task._
Christopher Boyd
02/28/2023, 1:59 PM
myflow() then passing it as a reference to mytask. I’m fairly sure this is expected / consistent behavior unless you made it a global and were passing in a global reference?
Ton Steijvers
02/28/2023, 2:49 PM
from prefect import flow, task, get_run_logger

class Foo:
    pass

@task
def create_foo() -> Foo:
    foo = Foo()
    get_run_logger().info(f"created {id(foo)}")
    return foo

@task
def show_foo(foo: Foo):
    get_run_logger().info(f"showing {id(foo)}")

@flow
def myflow():
    foo = create_foo()
    show_foo(foo)
this logs:
created 4884560672
showing 4884560672
The same id is shown, so it's the same object.
If Foo had instead been declared as class Foo(BaseModel), the logs show this:
created 5092601472
showing 5092876640
The ids differ: show_foo gets a copy of the object instead of a reference to the original object.
Christopher Boyd
02/28/2023, 3:02 PM
Ton Steijvers
03/01/2023, 9:52 AM
collections.py that it explicitly makes copies if the object is a BaseModel instance. It would be great to know the reasoning behind this and whether I can rely on it in my code.
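For anyone reading along, here is a rough sketch of the behaviour described in this thread, using only the standard library. `Model`, `Plain`, and `resolve` are hypothetical stand-ins and not Prefect's actual code. The assumption is that task inputs pass through a traversal that rebuilds model instances from their fields and returns every other object untouched.

```python
from dataclasses import dataclass, fields
from typing import Any


@dataclass
class Model:
    """Hypothetical stand-in for a pydantic BaseModel subclass."""
    name: str


class Plain:
    """Ordinary class that gets no special handling."""


def resolve(value: Any) -> Any:
    """Walk a value; rebuild Model instances, pass everything else through."""
    if isinstance(value, Model):
        # Rebuilt from its fields, so the caller receives a new object.
        kwargs = {f.name: resolve(getattr(value, f.name)) for f in fields(value)}
        return type(value)(**kwargs)
    if isinstance(value, list):
        return [resolve(item) for item in value]
    if isinstance(value, dict):
        return {key: resolve(item) for key, item in value.items()}
    # Anything else is returned as-is, i.e. the same reference.
    return value


model, plain = Model(name="block"), Plain()
resolved_model, resolved_plain = resolve(model), resolve(plain)

print(resolved_model is model)  # False: a new object with equal field values
print(resolved_plain is plain)  # True: the original object
```

Under that assumption, the two log outputs above are what you would expect: a plain class keeps its id, while a model-like object arrives in the task as a fresh instance. Whether this is a guaranteed contract is the open question in the thread, so code that needs a private copy may be safer making one explicitly.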