Running aggregates?
Hi, does MonetDB support any efficient/optimized functionality to compute running aggregates? A quick check/test suggest that aggregation functions with windows functions like below does not seem to be supported, is it? ======== SELECT somedate, somevalue, SUM(somevalue) OVER(ORDER BY somedate ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS RunningTotal FROM Table ======== Is their anything else that would provide running aggregates? Or am I (for now?) bound to calculating running aggregates using the obvious theta-self-join (expected to be "non-optimal" due to redundant work and large (huge) intermediate results), or "hijacking"/"mis-using" the bulk-version of a (to be implemented) scalar function? Thanks! Stefan -- | Stefan.Manegold@CWI.nl | DB Architectures (DA) | | www.CWI.nl/~manegold/ | Science Park 123 (L321) | | +31 (0)20 592-4212 | 1098 XG Amsterdam (NL) |
Hi Stefan, Am 26.02.2017 um 13:40 schrieb Stefan Manegold:
Hi,
does MonetDB support any efficient/optimized functionality to compute running aggregates? As far as I know only for row numbering.
A quick check/test suggest that aggregation functions with windows functions like below does not seem to be supported, is it?
======== SELECT somedate, somevalue, SUM(somevalue) OVER(ORDER BY somedate ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS RunningTotal FROM Table ========
Is their anything else that would provide running aggregates? As a tiny part of my master thesis I wrote a few primitives for MAL to support running aggregates (sum, min, max, avg, first_value). I don't know how to embed them in SQL as I did not use it. Note that the window functions are neither comprehensively tested nor used (and most likely don't have the quality MonetDB provides). At the moment semantics for unbounded and bounded windows exist (the latter is not very efficient at the moment, i.e. O(n * window_len), as it does not use group properties, i.e. subtraction).
I know that it is probably not too helpful, but who knows. Best wishes, Moritz
Hi Moritz - In case you are interested, there have been two VLDB papers recently on implementing efficient analytic functions: Efficient Processing of Window Functions in Analytical SQL Queries http://www.vldb.org/pvldb/vol8/p1058-leis.pdf Incremental Computation of Common Windowed Holistic Aggregates https://research.tableau.com/sites/default/files/p1221-wesley.pdf There was also a third one on optimisation a few years back Optimization of Analytic Window Functions http://vldb.org/pvldb/vol5/p1244_yucao_vldb2012.pdf
On Feb 26, 2017, at 05:26, Moritz Bruder
wrote: Hi Stefan,
Am 26.02.2017 um 13:40 schrieb Stefan Manegold:
Hi,
does MonetDB support any efficient/optimized functionality to compute running aggregates? As far as I know only for row numbering.
A quick check/test suggest that aggregation functions with windows functions like below does not seem to be supported, is it?
======== SELECT somedate, somevalue, SUM(somevalue) OVER(ORDER BY somedate ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS RunningTotal FROM Table ========
Is their anything else that would provide running aggregates? As a tiny part of my master thesis I wrote a few primitives for MAL to support running aggregates (sum, min, max, avg, first_value). I don't know how to embed them in SQL as I did not use it. Note that the window functions are neither comprehensively tested nor used (and most likely don't have the quality MonetDB provides). At the moment semantics for unbounded and bounded windows exist (the latter is not very efficient at the moment, i.e. O(n * window_len), as it does not use group properties, i.e. subtraction).
I know that it is probably not too helpful, but who knows.
Best wishes,
Moritz
_______________________________________________ users-list mailing list users-list@monetdb.org https://www.monetdb.org/mailman/listinfo/users-list
Best regards, Richard Wesley Senior Research Scientist
In case you are interested, there have been two VLDB papers recently on implementing efficient analytic functions: I mostly implemented those for feature completeness. They were no substantial part of my thesis (which was exploiting implicit/in-memory order for representation of order-based and nested queries), although
Hi Richard, Am 27.02.2017 um 15:48 schrieb Richard Wesley: they could be important for future work (since partitioning and order are both implicitly given).
Efficient Processing of Window Functions in Analytical SQL Queries http://www.vldb.org/pvldb/vol8/p1058-leis.pdf This one I read, which outlines a method (Removable Cumulative Aggregation) for amortized O(n) window functions with bounded preceding window. It should be relatively easy to adapt my code for window functions whose accumulator has a group property (sum, avg, count).
Again, I don't know whether my code is useful. But given that none of those exist yet in vanilla MonetDB, it might be beneficial to provide a solution that already works to some degree (at least within MAL). Regards, Moritz
participants (3)
-
Moritz Bruder
-
Richard Wesley
-
Stefan Manegold