Re: [Monetdb-developers] Tuple reconstruction algorithm
Hi Tim, [I prefer to keep the mailing list involved to share the discussion; I hope you don't mind ... ;-) ] On Mon, Jul 04, 2011 at 07:30:36PM +0200, Tim Speier wrote:
Hello Stefan,
thanks for your fast reply.
I am especially interested in tuple reconstruction mechanisms in column-oriented databases. So I'm not a user.
I see; please also consider reading our papers (cf., http://repository.cwi.nl/search/searchrepository.php?stheme=INS1), in particular http://repository.cwi.nl/search/fullrecord.php?publnr=14290 http://repository.cwi.nl/search/fullrecord.php?publnr=11117 and/or these PhD these: http://repository.cwi.nl/search/fullrecord.php?publnr=14832 http://repository.cwi.nl/search/fullrecord.php?publnr=14301 http://repository.cwi.nl/search/fullrecord.php?publnr=16706 Basically, "even" in main memory, sequential access is fast, while random access is slow.
Further, for any set of BATs the represent the individual columns of a relational table, the BATs' "heads" (first/left column) are aligned and consist of OIDs that form a dense (i.e., increment == 1) ascending sequence of integer numbers starting at a given offset (usually 0). Thus, these OIDs can indeed be seen as tuple-/record- identifiers. Be aware, though, that these OIDs are only used internally and not available as tuple-/record- identifiers in SQL. Given their denseness properly, these OIDs are usually not materialized (read: physically stored) but rather computed on the fly, and they effectively indicate (module offset) the physical position of the respective record's attribute values in all BATs involved.
I am sorry, but it is not yet clear to me. Let me create an example: A relation R with two attributes X and Y. It consists of the following tuples:
+---+-----+ | X | Y | +---+-----+ | A | 300 | +---+-----+ | B | 100 | +---+-----+ | C | 400 | +---+-----+
Now let's create two BATs: each record gets its own OID and, my naive approach, map each attribute values to a BAT:
BAT X BAT Y +---+---+ +---+-----+ |OID|VAL| |OID|VAL | +---+---+ +---+-----+ | 0 | A | | 0 | 300 | +---+---+ +---+-----+ | 1 | B | | 1 | 100 | +---+---+ +---+-----+ | 2 | C | | 2 | 400 | +---+---+ +---+-----+
Now I could reconstruct each tuple by its OID and the index. But I would loose some advantages of the column-oriented databases.
Like which? I guess you mean having each column sorted individually?
A more sophisticated approach would be to store values in ascending order like this:
BAT X BAT Y +---+---+ +---+-----+ |OID|VAL| |OID|VAL | +---+---+ +---+-----+ | 0 | A | | 1 | 100 | +---+---+ +---+-----+ | 1 | B | | 0 | 300 | +---+---+ +---+-----+ | 2 | C | | 2 | 400 | +---+---+ +---+-----+
But how can I now reconstruct the tuples efficient now? Is there an additional index structure (as you said in your foregoing email). ?
In the first case, tuple reconstruction is cheap (almost "for free"), as columns are physically aligned, and hence, highly efficient in-order positional lookup can be exploited for tuple reconstruction. Obviously, with keeping the columns physically aligned does not allow to order them individually according to their values. Thus without further indeces, value-lookups using binary search are only possible for on sorted column; for all other columns, some secondary index structure would be required. MonetDB only used (automatically) hash-indeces for single-value (point-)lookups, but resorts to (highly optimized) sequential scans for any range lookups on non-sorted columns. In the second case, value-lookups can be done (reasonably) efficient using binary search on each column, however tuple-reconstruction (i.e., OID lookups) can no longer be done with simple highly efficient in-order positional lookups. In that case, MonetDB would (automatically on-the-fly) build a hash-index on the non-ordered (non-dense) OID columns. In general, there is no such thing as a "free lunch" ... ;-) Hope this helps you further --- don't hesitate to keep asking in case you have more questions ... Gruß, Stefan
I'm very sorry for my innocent questions. Tim
-- | Stefan.Manegold @ CWI.nl | DB Architectures (INS1) | | http://CWI.nl/~manegold/ | Science Park 123 (L321) | | Tel.: +31 (0)20 592-4212 | 1098 XG Amsterdam (NL) |
[I prefer to keep the mailing list involved to share the discussion; I hope you don't mind ... ;-) ]
Oops, I didn't notice...
http://repository.cwi.nl/search/searchrepository.php?stheme=INS1), in particular http://repository.cwi.nl/search/fullrecord.php?publnr=14290 http://repository.cwi.nl/search/fullrecord.php?publnr=11117
and/or these PhD these: http://repository.cwi.nl/search/fullrecord.php?publnr=14832 http://repository.cwi.nl/search/fullrecord.php?publnr=14301 http://repository.cwi.nl/search/fullrecord.php?publnr=16706
Wow, thank you. Especially "Self-organizing Tuple Reconstruction in Column-stores" seems to be very informative. I'm going to study ...
In the first case, tuple reconstruction is cheap (almost "for free"), as columns are physically aligned, and hence, highly efficient in-order positional lookup can be exploited for tuple reconstruction. Obviously, with keeping the columns physically aligned does not allow to order them individually according to their values. Thus without further indeces, value-lookups using binary search are only possible for on sorted column; for all other columns, some secondary index structure would be required. MonetDB only used (automatically) hash-indeces for single-value (point-)lookups, but resorts to (highly optimized) sequential scans for any range lookups on non-sorted columns.
In the second case, value-lookups can be done (reasonably) efficient using binary search on each column, however tuple-reconstruction (i.e., OID lookups) can no longer be done with simple highly efficient in-order positional lookups. In that case, MonetDB would (automatically on-the-fly) build a hash-index on the non-ordered (non-dense) OID columns.
In general, there is no such thing as a "free lunch" ... ;-)
Again, thank you very much for your fast and illuminating answer. Gruß Tim
Tim,
In fact, MonetDB is kind of unique in the sense that it *never* does tuple reconstruction. The low-level APIs tgat receive the result if a query get back tables, which are still represented as BATs (arrays in which values belonging to the same tuple are located at the same position).
Other column stores typically at some point in the query tree do stitch values together. In plans with sequential access there is no good reason to do so. The exception are data structures which receive a random access patterns, in particular hash tables. So for hash aggregation and hash join tuple reconstruction actually makes sense (VectorWise does it there).
Peter
Op 4 jul. 2011 om 19:54 heeft Stefan Manegold
Hi Tim,
[I prefer to keep the mailing list involved to share the discussion; I hope you don't mind ... ;-) ]
On Mon, Jul 04, 2011 at 07:30:36PM +0200, Tim Speier wrote:
Hello Stefan,
thanks for your fast reply.
I am especially interested in tuple reconstruction mechanisms in column-oriented databases. So I'm not a user.
I see; please also consider reading our papers (cf., http://repository.cwi.nl/search/searchrepository.php?stheme=INS1), in particular http://repository.cwi.nl/search/fullrecord.php?publnr=14290 http://repository.cwi.nl/search/fullrecord.php?publnr=11117
and/or these PhD these: http://repository.cwi.nl/search/fullrecord.php?publnr=14832 http://repository.cwi.nl/search/fullrecord.php?publnr=14301 http://repository.cwi.nl/search/fullrecord.php?publnr=16706
Basically, "even" in main memory, sequential access is fast, while random access is slow.
Further, for any set of BATs the represent the individual columns of a relational table, the BATs' "heads" (first/left column) are aligned and consist of OIDs that form a dense (i.e., increment == 1) ascending sequence of integer numbers starting at a given offset (usually 0). Thus, these OIDs can indeed be seen as tuple-/record- identifiers. Be aware, though, that these OIDs are only used internally and not available as tuple-/record- identifiers in SQL. Given their denseness properly, these OIDs are usually not materialized (read: physically stored) but rather computed on the fly, and they effectively indicate (module offset) the physical position of the respective record's attribute values in all BATs involved.
I am sorry, but it is not yet clear to me. Let me create an example: A relation R with two attributes X and Y. It consists of the following tuples:
+---+-----+ | X | Y | +---+-----+ | A | 300 | +---+-----+ | B | 100 | +---+-----+ | C | 400 | +---+-----+
Now let's create two BATs: each record gets its own OID and, my naive approach, map each attribute values to a BAT:
BAT X BAT Y +---+---+ +---+-----+ |OID|VAL| |OID|VAL | +---+---+ +---+-----+ | 0 | A | | 0 | 300 | +---+---+ +---+-----+ | 1 | B | | 1 | 100 | +---+---+ +---+-----+ | 2 | C | | 2 | 400 | +---+---+ +---+-----+
Now I could reconstruct each tuple by its OID and the index. But I would loose some advantages of the column-oriented databases.
Like which?
I guess you mean having each column sorted individually?
A more sophisticated approach would be to store values in ascending order like this:
BAT X BAT Y +---+---+ +---+-----+ |OID|VAL| |OID|VAL | +---+---+ +---+-----+ | 0 | A | | 1 | 100 | +---+---+ +---+-----+ | 1 | B | | 0 | 300 | +---+---+ +---+-----+ | 2 | C | | 2 | 400 | +---+---+ +---+-----+
But how can I now reconstruct the tuples efficient now? Is there an additional index structure (as you said in your foregoing email). ?
In the first case, tuple reconstruction is cheap (almost "for free"), as columns are physically aligned, and hence, highly efficient in-order positional lookup can be exploited for tuple reconstruction. Obviously, with keeping the columns physically aligned does not allow to order them individually according to their values. Thus without further indeces, value-lookups using binary search are only possible for on sorted column; for all other columns, some secondary index structure would be required. MonetDB only used (automatically) hash-indeces for single-value (point-)lookups, but resorts to (highly optimized) sequential scans for any range lookups on non-sorted columns.
In the second case, value-lookups can be done (reasonably) efficient using binary search on each column, however tuple-reconstruction (i.e., OID lookups) can no longer be done with simple highly efficient in-order positional lookups. In that case, MonetDB would (automatically on-the-fly) build a hash-index on the non-ordered (non-dense) OID columns.
In general, there is no such thing as a "free lunch" ... ;-)
Hope this helps you further --- don't hesitate to keep asking in case you have more questions ...
Gruß, Stefan
I'm very sorry for my innocent questions. Tim
-- | Stefan.Manegold @ CWI.nl | DB Architectures (INS1) | | http://CWI.nl/~manegold/ | Science Park 123 (L321) | | Tel.: +31 (0)20 592-4212 | 1098 XG Amsterdam (NL) |
------------------------------------------------------------------------------ All of the data generated in your IT infrastructure is seriously valuable. Why? It contains a definitive record of application performance, security threats, fraudulent activity, and more. Splunk takes this data and makes sense of it. IT sense. And common sense. http://p.sf.net/sfu/splunk-d2d-c2 _______________________________________________ Monetdb-developers mailing list Monetdb-developers@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/monetdb-developers
participants (3)
-
Peter Boncz
-
Stefan Manegold
-
Tim Speier