I have a table of 93 columns and ~90M rows, which occupies a 60GB "bat" directory on disk. Mostly I'm working with just a few columns at a time. Monet has been _amazing_ for those use cases, so thanks for all your hard work. However, sometimes I need to return all columns for a small subset of rows, and I can't quite understand what Monet is doing or why it takes so long.
Here are some timings for 'select * from table order by "unsorted_numerical_column" limit N;', with all caches flushed before each query:
N=1 907ms
N=100 3105ms
N=200 6782ms
N=500 14656ms
N=1000 30100ms
N=10000 71093ms
For the last two, my 32GB of RAM is completely filled by the cache. I'm clearly missing something in my understanding, as I don't see why that should be needed. Surely you only need to read the whole of the sort column, find the N rows that come first, and then fetch only those N rows from each of the other columns? Is there anything I can do to reduce the amount of data loaded? The output of "explain" is at
https://gist.github.com/benjeffery/30e9f6173b625f5d0c31.
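To make the access pattern I expected concrete, here is a minimal Python sketch on a toy in-memory table. It only illustrates the idea (scan the sort column once, then fetch N positions from every other column) and says nothing about how Monet actually executes the query. The function name `top_n_rows` and the sample data are made up for illustration.

```python
import heapq


def top_n_rows(columns, sort_column, n):
    """Return the n rows with the smallest sort_column values, in ascending order.

    `columns` is a hypothetical column store: a dict mapping a column name
    to a list of values, where all lists have the same length.
    """
    keys = columns[sort_column]
    # Step 1: scan only the sort column to find the positions of the
    # n smallest values (no other column is touched here).
    positions = heapq.nsmallest(n, range(len(keys)), key=keys.__getitem__)
    # Step 2: fetch just those n positions from every column.
    return [
        {name: values[pos] for name, values in columns.items()}
        for pos in positions
    ]


# Toy data standing in for the real 93-column table.
table = {
    "id": [10, 11, 12, 13, 14],
    "unsorted_numerical_column": [5.0, 1.0, 4.0, 2.0, 3.0],
    "payload": ["a", "b", "c", "d", "e"],
}

result = top_n_rows(table, "unsorted_numerical_column", 2)
for row in result:
    print(row)
```

With this pattern, the data read should be roughly the full sort column plus N values from each remaining column, which is why the memory use I'm seeing surprises me.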
Thanks,
Ben