Type of integers used to calculate a SUM
With hugeint enabled, it seems that the SUM of INT (32-bit) is done using HUGEINT (128-bit) while the SUM of BIGINT (64-bit) is done using BIGINT (64-bit). That seems weird. Is SUM supposed to use the largest type or use the same type as the input or something else? Database: MonetDB v11.33.3 (Apr2019) sql>CREATE TABLE test (bi BIGINT, i INT); sql>INSERT INTO test VALUES (0x7000000000000000, 0x70000000); sql>INSERT INTO test VALUES (0x7000000000000000, 0x70000000); sql>INSERT INTO test VALUES (1, 3); sql>INSERT INTO test VALUES (2, 4); sql>SELECT SUM(i) FROM test; -- larger than an INT -- +------------+ | L3 | +============+ | 3758096391 | +------------+ 1 tuple sql>SELECT SUM(bi) FROM test; -- larger than a BIGINT -- overflow in calculation. sql>EXPLAIN SELECT SUM(i) FROM test; +------------------------------------------------------------------------------------------------------+ | function user.s10_22():void; | | X_1:void := querylog.define("explain select sum(i) from test;":str, "default_pipe":str, 12:int); | | barrier X_74:bit := language.dataflow(); | | X_4:int := sql.mvc(); | | C_5:bat[:oid] := sql.tid(X_4:int, "sys":str, "test":str); | | X_8:bat[:int] := sql.bind(X_4:int, "sys":str, "test":str, "i":str, 0:int); | | X_17:bat[:int] := algebra.projection(C_5:bat[:oid], X_8:bat[:int]); | | X_18:hge := aggr.sum(X_17:bat[:int]); | | exit X_74:bit; | | sql.resultSet("sys.L3":str, "L3":str, "hugeint":str, 128:int, 0:int, 7:int, X_18:hge); | | end user.s10_22; | +------------------------------------------------------------------------------------------------------+ sql>EXPLAIN SELECT SUM(bi) FROM test; +-------------------------------------------------------------------------------------------------------+ | function user.s12_22():void; | | X_1:void := querylog.define("explain select sum(bi) from test;":str, "default_pipe":str, 12:int); | | barrier X_74:bit := language.dataflow(); | | X_4:int := sql.mvc(); | | C_5:bat[:oid] := sql.tid(X_4:int, "sys":str, "test":str); | | X_8:bat[:lng] := sql.bind(X_4:int, "sys":str, "test":str, "bi":str, 0:int); | | X_17:bat[:lng] := algebra.projection(C_5:bat[:oid], X_8:bat[:lng]); | | X_18:lng := aggr.sum(X_17:bat[:lng]); | | exit X_74:bit; | | sql.resultSet("sys.L3":str, "L3":str, "bigint":str, 64:int, 0:int, 7:int, X_18:lng); | | end user.s12_22; | +-------------------------------------------------------------------------------------------------------+ sql>EXPLAIN SELECT SUM(CAST(i AS HUGEINT)) FROM test; +-----------------------------------------------------------------------------------------------------------------------+ | function user.s20_22():void; | | X_1:void := querylog.define("explain select sum(cast(i as hugeint)) from test;":str, "default_pipe":str, 13:int); | | barrier X_75:bit := language.dataflow(); | | X_4:int := sql.mvc(); | | C_5:bat[:oid] := sql.tid(X_4:int, "sys":str, "test":str); | | X_8:bat[:int] := sql.bind(X_4:int, "sys":str, "test":str, "i":str, 0:int); | | X_17:bat[:int] := algebra.projection(C_5:bat[:oid], X_8:bat[:int]); | | X_18:bat[:hge] := batcalc.hge(X_17:bat[:int]); | | X_19:hge := aggr.sum(X_18:bat[:hge]); | | exit X_75:bit; | | sql.resultSet("sys.L4":str, "L4":str, "hugeint":str, 128:int, 0:int, 7:int, X_19:hge); | | end user.s20_22; | +-----------------------------------------------------------------------------------------------------------------------+ sql>EXPLAIN SELECT SUM(CAST(bi AS BIGINT)) FROM test; +-----------------------------------------------------------------------------------------------------------------------+ | function user.s8_22():void; | | X_1:void := querylog.define("explain select sum(cast(bi as bigint)) from test;":str, "default_pipe":str, 12:int); | | barrier X_74:bit := language.dataflow(); | | X_4:int := sql.mvc(); | | C_5:bat[:oid] := sql.tid(X_4:int, "sys":str, "test":str); | | X_8:bat[:lng] := sql.bind(X_4:int, "sys":str, "test":str, "bi":str, 0:int); | | X_17:bat[:lng] := algebra.projection(C_5:bat[:oid], X_8:bat[:lng]); | | X_18:lng := aggr.sum(X_17:bat[:lng]); | | exit X_74:bit; | | sql.resultSet("sys.L4":str, "L4":str, "bigint":str, 64:int, 0:int, 7:int, X_18:lng); | +-----------------------------------------------------------------------------------------------------------------------+ sql>EXPLAIN SELECT SUM(CAST(i AS INT)) FROM test; +-----------------------------------------------------------------------------------------------------------+ | function user.s10_22():void; | | X_1:void := querylog.define("select sum(cast(i as int)) from test;":str, "default_pipe":str, 12:int); | | barrier X_74:bit := language.dataflow(); | | X_4:int := sql.mvc(); | | C_5:bat[:oid] := sql.tid(X_4:int, "sys":str, "test":str); | | X_8:bat[:int] := sql.bind(X_4:int, "sys":str, "test":str, "i":str, 0:int); | | X_17:bat[:int] := algebra.projection(C_5:bat[:oid], X_8:bat[:int]); | | X_18:hge := aggr.sum(X_17:bat[:int]); | | exit X_74:bit; | | sql.resultSet("sys.L4":str, "L4":str, "hugeint":str, 128:int, 0:int, 7:int, X_18:hge); | | end user.s10_22; | +-----------------------------------------------------------------------------------------------------------+ sql>SELECT SUM(CAST(i AS INT)) FROM test; +------------+ | L4 | +============+ | 3758096391 | +------------+ 1 tuple Also, the cast in SUM(CAST(bi AS BIGINT)) seems to be optimized out while SUM(CAST(i AS HUGEINT)) has one more BAT operation than SUM(i) even though they seem to do the same thing, and the cast in SUM(CAST(i AS INT)) seems to have no effect. Di-an Jan
participants (1)
-
Jan, Di-An