An Hoang

04/17/2020, 2:50 PM
Hello, I’m trying a simple flow that does linear regression in batches. The flow works when run sequentially, but when I try it with the Dask backend it causes memory problems. What is confusing is that there is ample memory per worker. Can someone help me identify the problem?
The Dask Dashboard looks normal except for higher memory consumption than expected (~12GB). But when the task finishes, it loops around and runs again indefinitely, which suggests that something failed.
Here is a version with all the code in one place: