Hi,

could someone please explain to me why the MonetDB corr() aggregate function, which IMHO is supposed to calculate the correlation between two attributes, returns the same type as (the larger of) its input type(s)?

IMHO the correlation between two attributes (or between samples of random variables) is a fractional value between -1 and 1. Consequently, MonetDB corr() on two integer columns returns an integer result, i.e., one of {-1, 0, 1} ...

Shouldn't corr(), much like avg(), var_pop(), var_samp(), stddev_samp(), and stddev_pop(), always return type double?

Thanks!

Best,
Stefan

--
| Stefan.Manegold@CWI.nl | DB Architectures (DA) |
| www.CWI.nl/~manegold/ | Science Park 123 (L321) |
| +31 (0)20 592-4212 | 1098 XG Amsterdam (NL) |
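To make the concern concrete, here is a minimal Python sketch (not MonetDB code) of what is lost when a correlation coefficient is forced into an integer type. The helper name `pearson_corr` and the sample columns are made up for illustration, and the cast is assumed to truncate toward zero; a rounding cast would lose the same information in a different way.

```python
import math

def pearson_corr(xs, ys):
    """Pearson correlation of two equally long sequences, computed in floating point."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of products of deviations; the common 1/n factor cancels in the ratio.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (math.sqrt(var_x) * math.sqrt(var_y))

# Two hypothetical integer columns.
a = [1, 2, 3, 4, 5]
b = [2, 1, 4, 3, 5]

r = pearson_corr(a, b)   # a fractional value strictly between -1 and 1
r_as_int = int(r)        # assumed truncating cast to the integer input type

print(r, r_as_int)
```

With an integer result type, every correlation strictly between -1 and 1 collapses under truncation to 0, which is why a floating-point return type seems the only useful choice here.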
Hi Stefan, not sure if this is helpful, but Niels recently made this change on the same topic: https://www.monetdb.org/bugzilla/show_bug.cgi?id=6287
On Sun, Oct 22, 2017 at 1:48 PM, Stefan Manegold wrote:
> [...]
Hi Anthony,

thanks. I am aware of this, but IMHO it does not really fix the problem, assuming I am right that corr() should always return a floating-point value, no matter what the input types are.

Also, I think the internal (MAL) implementations of corr() (and covar(), for that matter) are not entirely correct: they are too vulnerable to overflows and do not necessarily respect SQL NULL semantics.

Best,
Stefan

----- On Oct 23, 2017, at 4:27 AM, Anthony Damico ajdamico@gmail.com wrote:
> [...]
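The two concerns raised in the last message (overflow and NULL handling) can be sketched in Python as follows. This is not MonetDB's MAL implementation, which is not shown in this thread; it is only one possible approach under stated assumptions: pairs where either value is NULL (here `None`) are skipped, the result is NULL when no pairs remain or when one column is constant, and all accumulation happens in floating point with running means (a Welford-style update) rather than in integer sums of products. The function name `corr_null_aware` is hypothetical.

```python
import math

def corr_null_aware(pairs):
    """One-pass correlation over (x, y) pairs; returns None in place of SQL NULL."""
    n = 0
    mean_x = mean_y = 0.0
    m2_x = m2_y = co = 0.0   # running sums of squared / crossed deviations
    for x, y in pairs:
        if x is None or y is None:
            continue          # assumed NULL semantics: ignore incomplete pairs
        n += 1
        dx = x - mean_x       # deviation from the previous mean of x
        mean_x += dx / n
        dy = y - mean_y       # deviation from the previous mean of y
        mean_y += dy / n
        m2_x += dx * (x - mean_x)
        m2_y += dy * (y - mean_y)
        co += dx * (y - mean_y)
    if n == 0 or m2_x == 0.0 or m2_y == 0.0:
        return None           # no usable pairs, or a constant column
    return co / (math.sqrt(m2_x) * math.sqrt(m2_y))

# Hypothetical data with NULLs mixed in.
r_nulls = corr_null_aware([(1, 2), (2, 1), (None, 7), (3, 4), (4, 3), (5, None), (5, 5)])

# Values whose pairwise products would not fit in a 32-bit integer.
r_large = corr_null_aware([(2_000_000_000 * i, 3_000_000_000 * i) for i in range(1, 6)])

print(r_nulls, r_large)
```

Because the accumulators hold deviations from the running means instead of raw sums of products, intermediate values stay on the scale of the data's spread, which is the property a fixed-width integer accumulator lacks.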
Participants (2)
- Anthony Damico
- Stefan Manegold