08/02/2023, 9:32 AM
Hi Prefect devs. Continuing the cleanup of our Prefect OSS database, I have more questions, this time regarding the block_document table. Question and details in thread. TLDR: Why are there so many Local File System records in the block_document table, even though I am not using Local File System directly in my code? Is it safe to just delete them?
My Prefect production database seems to have a large number of records in the block_document table. I did a quick analysis.
select bt.name, count(1)
from block_document bd, block_type bt
where bd.block_type_id = bt.id
and bd.updated >= now() - interval '1 week'
group by bt.name;
Results are as follows:
Kubernetes Job,19
Date Time,11
Local File System,17072
NOTE: This is data for only 1 week. I decrypted the data column with the Fernet key. The decrypted values are all the same and refer to the PREFECT_DIRECTORY base path. This is painful, as the table is estimated to be 50+ GB for us and most of the values are just encrypted duplicates of the same value.
So I ran a sample flow against a fresh Prefect database.
import prefect
from prefect import flow, task

@task
def ta():
    prefect.get_run_logger().info("This is from task")

@flow
def sub_flow():
    prefect.get_run_logger().info("This is sub flow")

@flow
def main_flow():
    prefect.get_run_logger().info("This is main flow")
    # Assumed call order; the calls were missing from the pasted snippet.
    ta()
    sub_flow()

if __name__ == "__main__":
    main_flow()
This created a single record in block_document. I took a SQL dump of the entire database in text format and searched for the ID of that block_document record; there are no references to that ID in any other table.
I'm unsure why a record is being created in block_document with every flow run. I would like to know whether it is safe to run the following SQL.
select bd.* from block_document bd 
where bd.updated < now()-interval '1 week'
and is_anonymous = True
and exists (
	select 1 from block_type bt where
	bt.id = bd.block_type_id
	and bt.name = 'Local File System'
);
(with delete instead of select)
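For anyone who wants to check the filter logic before touching production, here is a minimal dry-run sketch against a throwaway in-memory SQLite database. The two-table schema is a cut-down assumption (only the columns the query touches), timestamps are stored as epoch seconds for simplicity, and `build_demo_db` and `delete_stale_anonymous` are hypothetical helper names, not anything from Prefect.

```python
import sqlite3
import time

WEEK_SECONDS = 7 * 24 * 60 * 60


def build_demo_db():
    """Create a tiny stand-in for block_type / block_document (assumed schema)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE block_type (id TEXT PRIMARY KEY, name TEXT NOT NULL);
        CREATE TABLE block_document (
            id TEXT PRIMARY KEY,
            block_type_id TEXT NOT NULL REFERENCES block_type(id),
            is_anonymous INTEGER NOT NULL,
            updated REAL NOT NULL  -- epoch seconds, a simplification
        );
        """
    )
    now = time.time()
    old = now - 30 * 24 * 60 * 60  # 30 days ago
    conn.executemany(
        "INSERT INTO block_type VALUES (?, ?)",
        [("t1", "Local File System"), ("t2", "Kubernetes Job")],
    )
    conn.executemany(
        "INSERT INTO block_document VALUES (?, ?, ?, ?)",
        [
            ("d1", "t1", 1, old),  # stale anonymous Local File System: delete
            ("d2", "t1", 1, now),  # recent anonymous: keep
            ("d3", "t1", 0, old),  # named (non-anonymous) block: keep
            ("d4", "t2", 1, old),  # different block type: keep
        ],
    )
    conn.commit()
    return conn


def delete_stale_anonymous(conn, block_type_name, max_age_seconds=WEEK_SECONDS):
    """Delete anonymous documents of one block type older than the cutoff."""
    cutoff = time.time() - max_age_seconds
    cur = conn.execute(
        """
        DELETE FROM block_document
        WHERE updated < ?
          AND is_anonymous = 1
          AND EXISTS (
              SELECT 1 FROM block_type bt
              WHERE bt.id = block_document.block_type_id
                AND bt.name = ?
          )
        """,
        (cutoff, block_type_name),
    )
    conn.commit()
    return cur.rowcount  # number of rows removed
```

Calling `delete_stale_anonymous(build_demo_db(), "Local File System")` should remove only the one stale anonymous row. On a 50+ GB Postgres table it would probably be wise to run the real delete in batches, but that is outside what this sketch shows.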
Tagging @Jake Kaplan cuz hes the man 😄

Jake Kaplan

08/02/2023, 1:40 PM
I don't think it's dangerous to delete. Would you mind filing an issue for this? I am pretty sure this is a bug. I don't see why we need a different anonymous block every time. I believe it's for the default result storage path, which I would think could get or create the same block.
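To illustrate the get-or-create idea, here is a rough sketch that keys anonymous blocks on a hash of their plaintext content, so identical storage settings map to one record. This is not how Prefect is implemented; `AnonymousBlockRegistry` and its methods are hypothetical names, and an in-memory dict stands in for the database. Hashing the plaintext matters because Fernet tokens include a timestamp and a random IV, so the encrypted copies of the same value never compare equal.

```python
import hashlib
import json
import uuid


class AnonymousBlockRegistry:
    """Hypothetical in-memory store that reuses blocks with identical content."""

    def __init__(self):
        self._ids_by_key = {}

    @staticmethod
    def _content_key(block_type, data):
        # Canonical JSON so key order in `data` does not change the hash.
        payload = json.dumps({"type": block_type, "data": data}, sort_keys=True)
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()

    def get_or_create(self, block_type, data):
        """Return the existing block ID for this content, or create one."""
        key = self._content_key(block_type, data)
        if key not in self._ids_by_key:
            self._ids_by_key[key] = str(uuid.uuid4())
        return self._ids_by_key[key]

    def __len__(self):
        return len(self._ids_by_key)
```

With this, two flow runs that both ask for a Local File System block with the same base path would get the same ID back, and only a different base path would create a second record.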


08/02/2023, 1:41 PM
I think I already filed a duplicate bug for this ages ago.
Let me try to find it.
This is not the same but similar