[Monetdb-developers] X100
Is it possible to look at the X100 source code or more detailed information right now? What I'm trying to accomplish is basically the same, in Common Lisp.

I read through Peter Boncz's Ph.D. thesis as well as the X100 paper. One thing that I don't understand is whether the X100 scan retrieves vectors from _all_ participating BATs. I'm also wondering how the size of the cache (and thus the size of the vector) is determined.

Thanks, Joel
--
http://wagerlabs.com/uptick
Hi Joel,

> Is it possible to look at the X100 source code or more detailed information right now?

As Peter mentioned, the open-source aspect of the X100 is not decided at the moment.
> ... One thing that I don't understand is whether the X100 scan retrieves vectors from _all_ participating BATs.

In one next() call, the Scan operator delivers vectors from all the relevant _columns_ (we don't have BATs anymore). Naturally, the scans are buffered, so the I/O granularity is much larger than the vector size.
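The behaviour described above can be illustrated with a minimal sketch. It is not X100 code: the class `BufferedColumnScan`, its parameters, and the in-memory `table` dictionary standing in for disk are all hypothetical names chosen for illustration. It only shows the two points made in the reply: one `next()` call returns one vector per relevant column, and data is fetched in chunks much larger than a vector.

```python
from typing import Dict, List, Optional, Sequence


class BufferedColumnScan:
    """Toy scan (hypothetical, not the X100 implementation)."""

    def __init__(self, table: Dict[str, List[int]], columns: Sequence[str],
                 vector_size: int = 1024, buffer_size: int = 64 * 1024) -> None:
        if not columns:
            raise ValueError("need at least one column")
        if vector_size < 1 or buffer_size < vector_size:
            raise ValueError("buffer_size must be >= vector_size >= 1")
        self.table = table                  # stands in for on-disk columns
        self.columns = list(columns)        # only the columns the query needs
        self.vector_size = vector_size
        self.buffer_size = buffer_size
        self.num_rows = len(table[self.columns[0]])
        self.pos = 0                        # next row to deliver
        self.buf_start = 0                  # row id of the first buffered row
        self.buffers: Dict[str, List[int]] = {c: [] for c in self.columns}
        self.io_requests = 0                # counts simulated I/O calls

    def _refill(self) -> None:
        # One large read per column, covering many vectors at once.
        end = min(self.pos + self.buffer_size, self.num_rows)
        for c in self.columns:
            self.buffers[c] = self.table[c][self.pos:end]
            self.io_requests += 1
        self.buf_start = self.pos

    def next(self) -> Optional[Dict[str, List[int]]]:
        """Return one vector per column, or None when the scan is done."""
        if self.pos >= self.num_rows:
            return None
        buffered = len(self.buffers[self.columns[0]])
        if self.pos >= self.buf_start + buffered:
            self._refill()
            buffered = len(self.buffers[self.columns[0]])
        lo = self.pos - self.buf_start
        hi = min(lo + self.vector_size, buffered)
        self.pos += hi - lo
        return {c: self.buffers[c][lo:hi] for c in self.columns}
```

With a buffer of 4000 rows and vectors of 1000 rows, scanning two columns of a 5000-row table takes five `next()` calls but only two refills per column, which is the sense in which the I/O granularity exceeds the vector size.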
> I'm also wondering how the size of the cache (and thus the size of the vector) is determined.

Currently there is no automatic tuning of the vector size. Usually we use a size of ca. 1k, but for some queries and/or hardware platforms other sizes would be better. We plan to do some work on this issue in the future to make the system more complete. To find the memory system characteristics we will most probably use the "Calibrator" program developed by Stefan Manegold:
http://homepages.cwi.nl/~manegold/Calibrator/calibrator.shtml
It is also part of the MonetDB distribution and available as a module in MIL.
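As a rough illustration of how a measured cache size could be turned into a vector size, here is a hypothetical heuristic. It is not something X100 does (the reply states there is no automatic tuning), and it assumes the goal is for all vectors a query touches at once to fit in a fraction of the CPU cache. The function name `pick_vector_size` and all its parameters are made up for this sketch; the cache size would have to come from a measurement tool or the hardware documentation.

```python
def pick_vector_size(cache_bytes: int, live_vectors: int, value_bytes: int,
                     fill_fraction: float = 0.5) -> int:
    """Hypothetical heuristic: largest power-of-two vector length such that
    `live_vectors` vectors of `value_bytes`-wide values fit in
    `fill_fraction` of the cache."""
    if live_vectors < 1 or value_bytes < 1:
        raise ValueError("live_vectors and value_bytes must be positive")
    budget = int(cache_bytes * fill_fraction)       # bytes we allow ourselves
    max_values = budget // (live_vectors * value_bytes)
    if max_values < 1:
        return 1                                    # degenerate: tuple at a time
    return 1 << (max_values.bit_length() - 1)       # round down to a power of two
```

For example, half of a 32 KiB cache shared by four live vectors of 8-byte values leaves room for 512 values per vector, the same order of magnitude as the ca. 1k mentioned above.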
Thanks for your interest in our work and good luck with your project. Regards, Marcin
Marcin,

On Aug 2, 2005, at 4:58 PM, Marcin Zukowski wrote:
> The Scan operator in one next() call delivers vectors from all the relevant _columns_ (we don't have BATs anymore). Naturally, the scans are buffered so the I/O granularity is much larger than a vector size.
The X100 paper mentions that the engine is built on top of MonetDB. Are you still using BATs at the lowest level, or did you move to a different way of storing data? Could you elaborate on the new disk structure if you are no longer using BATs?

Thanks, Joel
--
http://wagerlabs.com/uptick
Joel,

> > The Scan operator in one next() call delivers vectors from all the relevant _columns_ (we don't have BATs anymore). Naturally, the scans are buffered so the I/O granularity is much larger than a vector size.
>
> The X100 paper mentions that the engine is based on top of MonetDB. Are you still using BATs at the lowest level or did you move to a different way of storing data?

BATs can still be used through a specialized Scan operator, but only if they are void-headed, i.e. physically they are a single column. MIL is additionally used as the X100 front-end and to implement features that X100 currently lacks. We are currently working on the ColumnBM storage manager, but there is no detailed description available yet. You can find an overview of the system in "MonetDB/X100 - A DBMS In The CPU Cache", available here:
ftp://ftp.research.microsoft.com/pub/debull/A05june/issue1.htm

regards, Marcin
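To make "void-headed, i.e. physically a single column" concrete, here is a small sketch. It assumes the usual MonetDB meaning of a void head: a dense, ascending sequence of object ids that starts at a base value and is computed rather than stored. The class `VoidHeadedBAT` and its method names are hypothetical and only illustrate the idea; this is not MonetDB or X100 code.

```python
from typing import List, Tuple


class VoidHeadedBAT:
    """Toy model of a BAT whose head is a virtual, dense oid sequence."""

    def __init__(self, tail: List[int], seqbase: int = 0) -> None:
        self.seqbase = seqbase      # oid of the first row; the head is implied
        self.tail = list(tail)      # the only column that is physically stored

    def head(self, position: int) -> int:
        # The head value is derived from the position, never read from storage.
        return self.seqbase + position

    def find(self, oid: int) -> int:
        # Lookup by oid is plain array indexing, no search needed.
        position = oid - self.seqbase
        if 0 <= position < len(self.tail):
            return self.tail[position]
        raise KeyError(oid)

    def pairs(self) -> List[Tuple[int, int]]:
        # Materialise (head, tail) pairs only when someone asks for them.
        return [(self.seqbase + i, v) for i, v in enumerate(self.tail)]
```

Because only the tail is stored, such a BAT can be handed to a column-oriented scan as an ordinary array of values.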
participants (2)
- Joel Reymont
- Marcin Zukowski