Hi Stefan! No problem, I'm still in experimental mode so trying to find the limits/bounds etc.

On Fri, 11 Feb 2005, Stefan Manegold wrote:
Edmund,
a small question (just short for now, I'll add more details later once I have more time ...)
are you using MonetDB on a 32bit or on a 64bit system?
Currently using a 32-bit Linux system as the test base.
on 32bit systems, a single BAT in MonetDB cannot contain more than 2^31 rows, and it cannot be larger than 2GB (on some OS maybe 4GB), whichever is less.
the reason is that BATs are basically "just" arrays (neglecting the heap for variable sized types for now), hence each BAT has to fit entirely into your address space, and that's obviously limited to 32bit on a 32bit system... actually, all BATs you're using at a time need to fit into 2GB/4GB...
I assume this limit holds for the heaps as well? Given this limit, an average record BLOB size of 1kB would translate to about 2M rows per BAT. This would probably not be an issue if there could be persistent BATs within BATs.

One solution would then be to partition the data into sets, and hope the queries do not die as I run them over each partitioned set. Trying to manage this using the global name space would seem to be quite a hack.

As an aside: even if I used the global namespace, and had temporary BATs to manage the collection of persistent BATs, how would the various algorithms perform? I assume I would need BAT loops to run over each sub-BAT, aggregate results, and then run a final query to get the real results. Would this just kill performance, or should it be fairly close to "one large BAT" (assuming a query would run empty on most BATs and some accelerator could hopefully determine that)?
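To make the partitioning idea concrete, here is a minimal sketch in plain Python rather than MIL. It only illustrates the pattern of looping over sub-sets, skipping those that cannot match, and combining partial results. The `Partition` class, `make_partitions`, and `count_in_range` are hypothetical names, the lists stand in for sub-BATs, and the per-partition min/max is an assumed stand-in for whatever accelerator MonetDB might offer; none of this reflects how MonetDB actually behaves or performs.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Partition:
    """Stand-in for one sub-BAT: its values plus a min/max summary."""
    values: List[int]
    lo: int
    hi: int


def make_partitions(values: List[int], rows_per_partition: int) -> List[Partition]:
    """Split the data into fixed-size chunks and record each chunk's range."""
    parts = []
    for start in range(0, len(values), rows_per_partition):
        chunk = values[start:start + rows_per_partition]
        parts.append(Partition(chunk, min(chunk), max(chunk)))
    return parts


def count_in_range(parts: List[Partition], lo: int, hi: int) -> Tuple[int, int]:
    """Count values in [lo, hi]; return (count, partitions actually scanned)."""
    total = 0
    scanned = 0
    for p in parts:
        # Skip partitions whose value range cannot overlap the query range.
        if p.hi < lo or p.lo > hi:
            continue
        scanned += 1
        total += sum(1 for v in p.values if lo <= v <= hi)
    return total, scanned


data = list(range(1000))
parts = make_partitions(data, 100)        # 10 partitions of 100 rows each
print(count_in_range(parts, 250, 260))    # (11, 1): only one partition is scanned
```

In this toy case the query touches one partition out of ten, which is the best case described above. Whether the real system gets anywhere near that depends on how the data is distributed across the partitions and on what the engine can skip.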
I'm sorry if I have to disappoint you, but holding 500M rows in a single BAT of, say, [void,int], i.e., requiring 4 bytes per row, will already hit the 2GB limit on 32bit systems.
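The arithmetic behind these figures can be checked with a short sketch. It uses only the numbers quoted in this thread (a 2GB cap, a 2^31 row cap, 4 bytes per [void,int] row, about 1kB per BLOB row); `max_rows` is a hypothetical helper for illustration, not a MonetDB function, and the 2GB is taken as 2 GiB.

```python
# Back-of-the-envelope check: how many fixed-width rows fit under the caps
# mentioned in this thread for a 32-bit system.

GIB = 2**30
ROW_LIMIT_32BIT = 2**31  # row-count ceiling quoted for 32-bit systems


def max_rows(bytes_per_row: int, size_cap: int = 2 * GIB) -> int:
    """Rows that fit, taking the smaller of the byte cap and the row cap."""
    return min(size_cap // bytes_per_row, ROW_LIMIT_32BIT)


int_rows = max_rows(4)       # [void,int]: 4 bytes per row
blob_rows = max_rows(1024)   # roughly 1 kB average BLOB per row

print(int_rows)    # 536870912, i.e. 2^29, roughly 500M rows
print(blob_rows)   # 2097152, i.e. 2^21, roughly 2M rows
```

So for any row wider than one byte, the byte cap is reached well before the 2^31 row cap.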
Sorry for that, but that's how MonetDB is designed...
Obviously, on 64bit systems, the limit is 63/64 bit address space, which should be enough for some time ... Btw, on 64bit systems MonetDB's OID's are also 64 (63) bit...
Unfortunately, this is not an option on most Intel hardware. 64 bit Xeons running a 64 bit Linux is out of bounds both on budget and on reliability without a lot of testing for us at the moment.
As I said, I'm quite busy right now (pre-release bug fixing, etc...), but I'll come back to your previous questions later (probably only tomorrow, or on Sunday, though...)
Sorry for any inconvenience that this news might have caused...
Thanks for the info so far! As I said, I am currently evaluating whether this would meet our needs, so need to know what boundaries exist, and if there are any reasonable work-arounds.

Regards!
Ed