Hi Stefan (the non-German .de one),
Maybe a very strange idea, but it could be a better solution for you. If these operations are mmap'ed by default, they should be taken care of by a disk-based memory extension, hence being independent of RAM/swap but having all the pros of both.
You are right. In fact, this is what MonetDB is supposed to do. Memory is organized in contiguous Heap objects; this holds not only for BATs but also for hash tables. At some point, I added the possibility of using real memory-mapped files for their allocation, as a last resort.

The problem was that the Heap() API did not have a filename parameter, so a HEAPextend() would not know which file to use. In order not to break the API, I decided to initialize the filename fields at BAT creation time (this is why you may sometimes see .hhash and .thash files in a just-stopped MonetDB repository, even though hash tables are not persistent). With hindsight, this API decision was bad judgement, because whether real file mapping is used now depends on whether the filename was initialized at BAT or hash table creation. This should be the case, but it would be the first thing I would check in a debugging session. I also recall that for the Skyserver TB dataset, key checking was disabled due to similar problems.

Of course, even if we find and fix this bug, life will not be rosy for you: you would be doing random access (key uniqueness checking) into a 14GB heap that will never fit in your 2GB RAM. That goes back to your question of whether it is in fact reasonable to materialize 14GB. Questions like these led to a redesign that is much more careful with memory consumption (X100), whose spin-off development regrettably means that my time for such debugging sessions is currently limited.

Peter

PS The likely cause of the problem is a shortage of swap space. Increasing your swap file size (by a few tens of GB) should be a workaround.
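
To make the filename issue above concrete, here is a minimal Python sketch of a file-backed heap. It is not MonetDB code: the class FileBackedHeap and its methods are hypothetical names chosen for illustration. It only shows why an extend operation needs to know its backing file, which is why the filename has to be fixed when the heap is created.

```python
import mmap
import os
import tempfile


class FileBackedHeap:
    """Hypothetical contiguous heap backed by a memory-mapped file.

    Illustration only; this is not MonetDB's Heap implementation.
    """

    def __init__(self, filename, size):
        # The filename is recorded at creation time, so a later
        # extend() knows which file to grow.
        self.filename = filename
        self.size = size
        self._fd = os.open(filename, os.O_RDWR | os.O_CREAT)
        os.ftruncate(self._fd, size)
        self._map = mmap.mmap(self._fd, size)

    def extend(self, new_size):
        """Grow the heap by enlarging the file and mapping it again."""
        if new_size <= self.size:
            return
        self._map.flush()
        self._map.close()
        os.ftruncate(self._fd, new_size)  # new bytes read as zero
        self._map = mmap.mmap(self._fd, new_size)
        self.size = new_size

    def write(self, offset, data):
        self._map[offset:offset + len(data)] = data

    def read(self, offset, length):
        return self._map[offset:offset + length]

    def close(self):
        self._map.close()
        os.close(self._fd)


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "demo.thash")
    heap = FileBackedHeap(path, 4096)
    heap.write(0, b"key")
    heap.extend(8192)
    print(heap.size, heap.read(0, 3))
    heap.close()
```

Pages of such a mapping are written back to the file rather than to swap, so the heap size is bounded by disk space instead of RAM plus swap. Random access into a mapping much larger than RAM will still cause heavy paging.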