# ask-community
Tony Yun
When we have to loop a task to run multiple times, how can we avoid memory issues? I found that each time a task runs, the memory steadily goes up but never comes down, even though the variable is overwritten every time. (I'm using Prefect 2)
Mathijs Carlu
You can persist results, instead of caching them in memory (which indeed happens by default). More info on this page, particularly the sections "Persisting results" and "Caching of results in memory"
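A minimal sketch of those two settings (hypothetical task and flow names; this assumes Prefect 2.6+, where persist_result and cache_result_in_memory are available as task options):

from prefect import flow, task

# persist_result=True writes the return value to result storage on disk;
# cache_result_in_memory=False tells Prefect not to also hold the value
# in the flow run process, so it can be garbage collected after use.
@task(persist_result=True, cache_result_in_memory=False)
def heavy_batch(batch):
    ...  # returns a large object, e.g. a DataFrame

@flow
def process_all(batches):
    for batch in batches:
        heavy_batch(batch)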
Nate
Hey @Tony Yun - can you share your code where you're looping a task to run several times?
Tony Yun
hey @Nate, this is my code (this video_data is a large data frame which is memory-intensive):
    batch_size = 100
    to_fetch_channels = []
    for i, upload_id in enumerate(uploadIds):
        counter = i + 1
        to_fetch_channels.append(upload_id)
        if counter % batch_size == 0:    
            base_video_data = get_upload_details2(youtube_api_keys, to_fetch_channels)
            to_fetch_channels = []

            video_ids = [i for i in base_video_data if i not in excluded_video_data]
            
            # this video_data is a large data frame which is memory intense
            video_data = get_video_details(youtube_api_keys, video_ids)

            logger.info(f'Loading {counter}/{len(uploadIds)} video IDs to Snowflake...')

            try:
                videos_table = build_video_table(video_data)
                snowflake_load_data(
                    "video_data",
                    merge_videos("video_data"),
                    snowflake_auth,
                    videos_table
                    )
            except DataEmptyError as e:
                logger.error(e)
@Mathijs Carlu got it. Let me try that way.
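For reference, a minimal sketch of that suggestion applied to the loop above, assuming get_video_details can be decorated as a Prefect task (the loop body itself would stay the same):

from prefect import task

# Sketch only: decorating the memory-intensive call so its large DataFrame
# result is persisted to storage rather than kept in the flow run's memory.
@task(persist_result=True, cache_result_in_memory=False)
def get_video_details(youtube_api_keys, video_ids):
    ...  # unchanged body that builds and returns the large data frame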