# ask-community
t
Hi all. Can anyone explain how garbage collection works? For example, will the result of `produce` stay in memory until the flow finishes, or will it be removed as soon as `consume` finishes?
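```python
from prefect import task, Flow, Parameter

@task
def produce(url):
    # download_big_json is the poster's own helper (not shown)
    return download_big_json(url)

@task
def consume(big_json):
    # do_something is the poster's own helper (not shown)
    do_something(big_json)

with Flow('my_flow') as flow:
    urls = Parameter('urls')
    produced = produce.map(urls)
    consume.map(produced)
```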
k
Prefect doesn’t natively handle garbage collection, so this will stay in memory until the flow finishes. Garbage collection can sometimes be done inside the task (`del` and/or `gc.collect()`). In your situation, though, it seems like you can’t do this. You may have to store the results somewhere and pass a reference to downstream tasks, or you can look at Results: https://docs.prefect.io/core/concepts/results.html
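For illustration, the "store it and pass a reference" idea could look roughly like the sketch below. This is plain Python with no Prefect decorators, `download_big_json` is a hypothetical stub standing in for the real download, and the temp-file storage is an assumption; in a flow, both functions would presumably be wrapped with `@task`, so only the short path string travels between tasks.

```python
import json
import os
import tempfile


def download_big_json(url):
    # Hypothetical stand-in for the real download helper.
    return {"url": url, "items": list(range(1000))}


def produce(url):
    """Download the payload, write it to disk, and return only the file path."""
    big_json = download_big_json(url)
    fd, path = tempfile.mkstemp(suffix=".json")
    with os.fdopen(fd, "w") as f:
        json.dump(big_json, f)
    # Only the path is returned, so the payload is not part of the task result.
    return path


def consume(path):
    """Load the payload from disk, use it, then delete the file."""
    with open(path) as f:
        big_json = json.load(f)
    os.remove(path)
    return len(big_json["items"])
```

The trade-off is extra disk I/O in exchange for the task result being a small string instead of the full payload.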
t
Ah, so if I use results on `produce`, then `consume` will take `big_json` using `Result.read` instead of taking it from memory?
I'm asking because I found this issue, https://github.com/PrefectHQ/prefect/issues/3373, which is still open, so I assume that even when using a Result, task results are still kept in memory until the flow finishes.
k
Looks like you’re right about Results: large objects still stay in memory. I think saving the data and passing a reference is the way to go.
I've confirmed this with the team, and your understanding is right.
t
OK, thanks for the clarification. I think I'll try modifying my Result subclass so that its `read` and `write` methods return a Result instance with the `value` attribute set to a reference instead of the actual value. A little dirty, but I don't think it should cause problems.
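A rough sketch of that shape, for anyone reading later. Note that `ReferenceResult` here is a plain-Python stand-in and does not subclass Prefect's actual `Result` class; the method signatures, the JSON temp-file storage, and the `load` helper are all assumptions made for illustration.

```python
import json
import os
import tempfile


class ReferenceResult:
    """Stand-in result object whose `value` is a file path, not the payload."""

    def __init__(self, value=None, location=None):
        self.value = value
        self.location = location

    def write(self, value):
        # Persist the payload, then return a new instance that holds only the path.
        fd, path = tempfile.mkstemp(suffix=".json")
        with os.fdopen(fd, "w") as f:
            json.dump(value, f)
        return ReferenceResult(value=path, location=path)

    def read(self, location):
        # Do not load the payload here; hand back the reference only.
        return ReferenceResult(value=location, location=location)


def load(ref):
    """Hypothetical helper a downstream task would call to get the real data."""
    with open(ref.value) as f:
        return json.load(f)
```

With this shape, the downstream task has to call something like `load` itself, which is the "a little dirty" part: `value` no longer means the actual data.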