Trying to understand select performance when returning many columns

I have a table of 93 columns and ~90M rows, which is a 60GB "bat" directory on disk. Mostly I'm doing things with a few columns. Monet has been _amazing_ for those use cases, so thanks for all your hard work.

However, sometimes I need to return all columns for a small subset of rows, and I can't quite understand what Monet is doing or why it is taking a long time. Here are some timings for

    select * from table order by "unsorted_numerical_column" limit N;

with all caches flushed before each query:

    N=1        907ms
    N=100      3105ms
    N=200      6782ms
    N=500      14656ms
    N=1000     30100ms
    N=10000    71093ms

For the last two, my 32GB of RAM is completely filled by the cache. I'm clearly missing something in my understanding, as I don't see why that should be needed. Surely you only need to read the whole of the sort column and then grab only N rows from all the other columns?

Is there anything I can do to reduce the amount of data loaded?

The output of "explain" is at https://gist.github.com/benjeffery/30e9f6173b625f5d0c31

Thanks,
Ben
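
The access pattern the question expects (scan the sort column in full, then fetch only N rows from every other column) can be sketched as follows. This is only an illustration in Python over small in-memory lists, not a description of what MonetDB does internally. The helper `top_n_rows`, the sample table, and the `id` and `payload` columns are hypothetical; only the sort column name comes from the query above.

```python
import heapq


def top_n_rows(columns, sort_column, n):
    """Return the n rows with the smallest sort_column values, as dicts.

    `columns` maps a column name to a list of values (one list per column),
    which loosely mimics a column store. Hypothetical helper for illustration.
    """
    keys = columns[sort_column]
    # Step 1: scan only the sort column to find the positions of the
    # n smallest values (returned in ascending order of value).
    positions = heapq.nsmallest(n, range(len(keys)), key=keys.__getitem__)
    # Step 2: fetch just those n positions from every column.
    return [
        {name: values[pos] for name, values in columns.items()}
        for pos in positions
    ]


# Tiny made-up table standing in for the 93-column one.
table = {
    "id": [10, 11, 12, 13, 14],
    "unsorted_numerical_column": [5.0, 1.5, 9.0, 0.5, 3.0],
    "payload": ["a", "b", "c", "d", "e"],
}

rows = top_n_rows(table, "unsorted_numerical_column", 2)
for row in rows:
    print(row)
```

In this sketch only the sort column is read in full; every other column is touched at exactly N positions, which is the amount of I/O the question assumes should be sufficient.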
participants (1)
- Ben Jeffery