I'm still new to tools such as Prefect. I'm trying to automate our onboarding process, which pulls employee information from one source (API/JSON) and creates accounts in three other applications, also via API/JSON. My plan is to create a flow that pulls the data, checks whether the accounts exist, and creates them if they don't. My question is: would it be better to pass the data between the tasks, or to place the data into an intermediate store such as a file (CSV, maybe?) or a database?
Darren, I have an overall flow with sub-tasks that all pass data to each other, read and write .csv files to Google Cloud Storage (GCS), and read from and write to a database. It all depends on what I need the data for. If I just need the data in the next task, I pass it in directly. If I need to archive the data, I might put it into GCS or a database. If I need to access the data from another application, I put it into a database (CockroachDB).
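For anyone reading later, here is a minimal sketch of the "just pass it in" option for the onboarding case described above. It uses plain Python functions so it runs without any dependencies; in Prefect the step functions would presumably be decorated as tasks and the orchestrating function as a flow. All function names, app names, and sample records are hypothetical, and the API calls are replaced with in-memory stand-ins.

```python
from typing import Dict, List, Set

Employee = Dict[str, str]


def fetch_employees() -> List[Employee]:
    # Hypothetical stand-in for the HR API call that returns JSON.
    return [
        {"id": "1", "email": "ana@example.com"},
        {"id": "2", "email": "bo@example.com"},
    ]


def find_missing(employees: List[Employee], existing_emails: Set[str]) -> List[Employee]:
    # Keep only the employees that do not yet have an account.
    return [e for e in employees if e["email"] not in existing_emails]


def create_accounts(app_name: str, employees: List[Employee]) -> List[str]:
    # Hypothetical stand-in for the target app's "create account" API call.
    return [f"{app_name}:{e['email']}" for e in employees]


def onboarding_flow() -> Dict[str, List[str]]:
    # The employee list is passed between steps as an ordinary variable.
    employees = fetch_employees()
    # Assumed existing accounts per application (sample data only).
    existing: Dict[str, Set[str]] = {
        "app_a": {"ana@example.com"},
        "app_b": set(),
        "app_c": {"ana@example.com", "bo@example.com"},
    }
    created: Dict[str, List[str]] = {}
    for app_name, emails in existing.items():
        missing = find_missing(employees, emails)
        created[app_name] = create_accounts(app_name, missing)
    return created
```

Calling `onboarding_flow()` returns, per application, the accounts that would have been created. Nothing is written to disk or a database along the way.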
Does data size matter? At most, my data shouldn't be over 1 MB in total.
Good question! With data of that size, if you are just passing it to another task, I would keep it as a variable and pass it over. At under 1 MB, honestly, you could use any of the three options I described.
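If it helps to confirm the payload size, or to keep a CSV copy for archiving, something along these lines could work. This is only a sketch using the standard library: the helper names and sample records are made up for illustration, and uploading the CSV text to GCS or a database is left out.

```python
import csv
import io
import json
from typing import Dict, List


def payload_size_bytes(records: List[Dict[str, str]]) -> int:
    # Approximate size of the data as it would travel as JSON (UTF-8 bytes).
    return len(json.dumps(records).encode("utf-8"))


def to_csv_text(records: List[Dict[str, str]]) -> str:
    # Render the records as CSV text, using the first record's keys as the header.
    if not records:
        return ""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(records[0].keys()))
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


# Sample data only; real records would come from the source API.
sample_records = [
    {"id": "1", "email": "ana@example.com"},
    {"id": "2", "email": "bo@example.com"},
]
```

The data is still passed in memory between steps; the CSV text is only an extra copy for archiving.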