Hi Fabian,

Thank you very much for your email.

I had the same concern about the accumulation error of one-scan approach as you do, even though mathematically they should be exactly the same. I remember about 10 years ago, I did some experimentations on both of the approaches and I did not found any noticeable differences between the results, but the sample size I tried then is not huge.

For the sample size of the order 1 trillion, I have the same concern even with the two-scan approach. For example, After 500 billion entries have been added to the sum of square, the rest may not be able to be added up anymore, since it may well with the range of accumulation errors! For a huge BAT, if the precision is top priority, we may need to use an hierarchical approach: divide the huge sample into some manageable block, compute it for each block first, and then aggregate the results.

Thanks,
Yinhe