w
Q: What is a decent way to deal with a workflow that has a high number of really small tasks? The workflow makes about 40k HTTP requests that I've represented as 40k tasks, in order to use Prefect's retry logic and keep a record of what succeeds and fails. Obviously this doesn't work too well from the UI side of things: it doesn't present well in the UI, and it's probably a bit harder on the server than it needs to be. These tasks do go to the same worker and (I believe) reuse the same HTTP client sessions, so the HTTP requests themselves are not really inefficient. The obvious optimization is to batch requests into groups of maybe 1000 per task, or pick a task granularity that is a bit less verbose. The downside with batching is that I'm not sure Prefect has the concept of partial success, or a way to retry only the requests that failed within a task (ignoring any retry logic internal to the task). Ideally there would be a way to represent a high-volume subtask where 39,500 requests succeed and 500 fail, and then retry only the 500 failed requests. I feel like the UI doesn't show this too well, but it's pretty close (open source server). I'm using an iterator, so it does at least hide the really long list by default. Any suggestions? Am I missing a feature of Prefect that might handle this better?
n
hi @walter - on first read you seem right that there's a non-ideal tradeoff here between isolating atomic operations (for retries, caching, etc.) and macroscopic observability at high scale. i would probably recommend something like what you were mentioning: batching, and running some quick logic inside the task that runs a batch to decide the final state of that batch and retry operations if needed. for example, we have this, or you could vendor that logic, or use tenacity if you don't want to use an internal thing. happy to hear any ideas you have around representation in the UI or anything else about operating at large scales!
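To make the "quick logic inside a task that runs a batch" idea concrete, here is a minimal sketch that uses only the standard library. It is not Prefect's internal helper and not tenacity; `BatchResult`, `run_batch`, and the `send` callable are hypothetical names chosen for illustration. Each request is retried on its own, and the batch reports partial success instead of failing as a whole. Items are assumed to be hashable.

```python
import time
from dataclasses import dataclass, field


@dataclass
class BatchResult:
    """Outcome of one batch: which items succeeded and which gave up."""
    succeeded: list = field(default_factory=list)
    failed: dict = field(default_factory=dict)  # item -> repr of last error


def run_batch(items, send, max_attempts=3, delay_seconds=0.0):
    """Call send(item) for every item, retrying each item individually.

    `send` is a hypothetical callable that performs one request and
    raises an exception on failure. A failure of one item never causes
    the other items in the batch to be re-sent.
    """
    result = BatchResult()
    for item in items:
        for attempt in range(1, max_attempts + 1):
            try:
                send(item)
                result.succeeded.append(item)
                break
            except Exception as exc:  # broad on purpose in this sketch
                if attempt == max_attempts:
                    result.failed[item] = repr(exc)
                else:
                    time.sleep(delay_seconds)
    return result
```

The body of a batch task could call something like `run_batch` and then decide the final state of the batch from `result.failed`, for example raising only when the failure count crosses a threshold.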
w
Ok, ty, at least I wasn't missing anything obvious to handle this. I was playing with this a bit further and found that batching has a number of trade-offs that ended up creating more challenges. For additional context: essentially, I'm pulling data from one data source, enriching these records by calculating some values, then submitting the resulting relationships in this data to a different place. There's a lot of repeat data in subsequent runs, as this workflow is more about maintaining consistency and making sure relationships are present in the destination system.
Limitations:
1. I'm using caching to reduce duplicate requests, which gets lost with batching.
2. Failing a batch results in cascading failures and retries (perhaps not the correct term).
So sticking to one task for all requests, or a single task per request, is kind of where I've landed. For caching, I found that using Prefect's caching drastically reduced the number of requests I had to make in subsequent runs. There are different caching mechanisms, but I found that keying on the input values to a task worked well enough. Moving to batching lost the benefit of using Prefect's caching (at least easily). Retry logic on batches just resulted in far more requests: one failed request resulted in 1000 request attempts, which creates an opportunity for a cascading failure. Of course, I could handle caching or retries outside of Prefect, but it would be interesting to see what Prefect can do. At this point this is code golf.
For now, I'm sticking to individual tasks for the 40k requests. I am initiating those tasks via an iterator that's yielding the values to submit. Caching and retry logic is working well enough. To handle the UI issue, it would be interesting if there was a way to show summary info for the iterator (Succeeded, Failed, Running) that doesn't involve expanding the list of 40k tasks.
If I were to go the route of having one task represent all 40k requests, the downside is that it would involve handling caching and retry logic in a different way, outside of Prefect. It would be interesting to make a custom retry that's more recursive in a sense: the input to the task is a list, and a failure could return the list of items that haven't succeeded yet and start the retry logic with the shorter list. Additionally, I don't know if there is a concept of progress for a single task. I'll play with this a bit more. Unfortunately I think this pattern will be somewhat recurring for me.
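The "retry with the shorter list" idea can be sketched without any Prefect-specific API, just to pin down the behavior. This is an illustration under assumptions, not an existing Prefect feature: `retry_shrinking` and `send` are hypothetical names, and the loop is iterative rather than literally recursive. Each round re-sends only the items that failed in the previous round, and the function returns the summary counts that would be nice to see in a UI.

```python
from collections import Counter


def retry_shrinking(items, send, max_rounds=3):
    """Send every item, then re-send only the failures in later rounds.

    `send` is a hypothetical callable that performs one request and
    raises on failure. Returns (still_pending, summary), where summary
    holds counts of succeeded and failed items.
    """
    pending = list(items)
    summary = Counter(succeeded=0, failed=0)
    for _ in range(max_rounds):
        if not pending:
            break
        still_failing = []
        for item in pending:
            try:
                send(item)
                summary["succeeded"] += 1
            except Exception:  # broad on purpose in this sketch
                still_failing.append(item)
        # The next round only sees the shorter list of failures.
        pending = still_failing
    summary["failed"] = len(pending)
    return pending, dict(summary)
```

With 40k items where 500 fail in the first round, the second round would issue only 500 requests rather than repeating a whole batch, which avoids the cascading retries described above.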
n
thanks for the context! these thoughts would be really valuable to persist as a github discussion, are you willing to create one?
w
I can probably do that (if I don't forget by the weekend)