
Hello, (google translate) I have a query that takes a long time : select count(distinct idvisiteur) from src_inbox__at__lm__visites; After two hours of running, it is not over , I'll kill her . if I look at the heapsize column storage ( ) sql>select * from storage() WHERE "table"='src_inbox__at__lm__visites'; +--------+----------------------------+---------------------+---------+-------------+-----------+-----------+------------+-------------+---------+--------+ | schema | table | column | type | location | count | typewidth | columnsize | heapsize | indices | sorted | +========+============================+=====================+=========+=============+===========+===========+============+=============+=========+========+ | sys | src_inbox__at__lm__visites | idvisite | varchar | 01/00/10025 | 315819936 | 5 | 2526559488 | 27755937792 | 0 | false | | sys | src_inbox__at__lm__visites | idvisiteur | varchar | 76/7667 | 315819936 | 19 | 2526559488 | 80449110016 | 0 | false | | sys | src_inbox__at__lm__visites | idvisiteuridentifie | varchar | 62/6201 | 315819936 | 2 | 1263279744 | 2222391296 | 0 | false | | sys | src_inbox__at__lm__visites | siteniveau1 | varchar | 77/7704 | 315819936 | 5 | 315819936 | 65536 | 0 | false | | sys | src_inbox__at__lm__visites | debutvisite | varchar | 30/3064 | 315819936 | 19 | 2526559488 | 10741678080 | 0 | false | | sys | src_inbox__at__lm__visites | catvisiteur | varchar | 31/3171 | 315819936 | 0 | 631639872 | 65536 | 0 | false | | sys | src_inbox__at__lm__visites | region | varchar | 67/6706 | 315819936 | 12 | 1263279744 | 2077556736 | 0 | false | | sys | src_inbox__at__lm__visites | ville | varchar | 40/4071 | 315819936 | 10 | 2526559488 | 22635741184 | 0 | false | | sys | src_inbox__at__lm__visites | os | varchar | 01/26/12673 | 315819936 | 8 | 631639872 | 65536 | 0 | false | | sys | src_inbox__at__lm__visites | browser | varchar | 01/44/14410 | 315819936 | 17 | 631639872 | 65536 | 0 | false | | sys | src_inbox__at__lm__visites | sourcevisite | varchar | 42/4234 | 315819936 | 1 | 631639872 | 65536 | 0 | false | | sys | src_inbox__at__lm__visites | idcampagne | varchar | 01/41/14100 | 315819936 | 0 | 631639872 | 65536 | 0 | false | | sys | src_inbox__at__lm__visites | prefixe | varchar | 01/22/12277 | 315819936 | 0 | 315819936 | 65536 | 0 | false | | sys | src_inbox__at__lm__visites | libcampagne | varchar | 65/6573 | 315819936 | 0 | 631639872 | 65536 | 0 | false | | sys | src_inbox__at__lm__visites | emailing | varchar | 01/04/10434 | 315819936 | 0 | 631639872 | 65536 | 0 | false | | sys | src_inbox__at__lm__visites | dateenvois | varchar | 01/22/12264 | 315819936 | 0 | 631639872 | 65536 | 0 | false | | sys | src_inbox__at__lm__visites | nbpages | int | 56/5615 | 315819936 | 4 | 1263279744 | 0 | 0 | false | | sys | src_inbox__at__lm__visites | dureevisite | time | 01/02/10211 | 315819936 | 4 | 1263279744 | 0 | 0 | false | | sys | src_inbox__at__lm__visites | id_unique | int | 01/23/12370 | 315819936 | 4 | 1263279744 | 0 | 0 | false | +--------+----------------------------+---------------------+---------+-------------+-----------+-----------+------------+-------------+---------+--------+ Is the heapsize column represents the amount of data to be placed in RAM ? I have 64GB of RAM on the server , so the idvisiteur column can not be held in RAM. Is this true? ( 64 000 000 000 < 80 449 110 016 ) I execute a vacuum (experimental command ?) : https://www.monetdb.org/Documentation/UserGuide/AutoloadingScript (Table vacuum commands (experimental!)) call vacuum('sys','src_inbox__at__lm__visites'); sql>select * from storage() WHERE "table"='src_inbox__at__lm__visites'; +--------+----------------------------+---------------------+---------+-------------+-----------+-----------+------------+------------+---------+--------+ | schema | table | column | type | location | count | typewidth | columnsize | heapsize | indices | sorted | +========+============================+=====================+=========+=============+===========+===========+============+============+=========+========+ | sys | src_inbox__at__lm__visites | idvisite | varchar | 36 | 315819936 | 5 | 2526559488 | 4906811392 | 0 | false | | sys | src_inbox__at__lm__visites | idvisiteur | varchar | 64/6435 | 315819936 | 19 | 1263279744 | 3667394560 | 0 | false | | sys | src_inbox__at__lm__visites | idvisiteuridentifie | varchar | 66/6640 | 315819936 | 2 | 1263279744 | 40042496 | 0 | false | | sys | src_inbox__at__lm__visites | siteniveau1 | varchar | 67/6743 | 315819936 | 5 | 315819936 | 73728 | 0 | false | | sys | src_inbox__at__lm__visites | debutvisite | varchar | 70/7045 | 315819936 | 19 | 2526559488 | 7576551424 | 0 | false | | sys | src_inbox__at__lm__visites | catvisiteur | varchar | 71/7152 | 315819936 | 0 | 631639872 | 73728 | 0 | false | | sys | src_inbox__at__lm__visites | region | varchar | 72/7252 | 315819936 | 12 | 1263279744 | 106561536 | 0 | false | | sys | src_inbox__at__lm__visites | ville | varchar | 73/7353 | 315819936 | 10 | 1263279744 | 1362165760 | 0 | false | | sys | src_inbox__at__lm__visites | os | varchar | 74/7453 | 315819936 | 8 | 631639872 | 73728 | 0 | false | | sys | src_inbox__at__lm__visites | browser | varchar | 75/7554 | 315819936 | 17 | 631639872 | 73728 | 0 | false | | sys | src_inbox__at__lm__visites | sourcevisite | varchar | 76/7655 | 315819936 | 1 | 631639872 | 73728 | 0 | false | | sys | src_inbox__at__lm__visites | idcampagne | varchar | 77/7761 | 315819936 | 0 | 631639872 | 73728 | 0 | false | | sys | src_inbox__at__lm__visites | prefixe | varchar | 01/00/10064 | 315819936 | 0 | 315819936 | 73728 | 0 | false | | sys | src_inbox__at__lm__visites | libcampagne | varchar | 01/01/10165 | 315819936 | 0 | 631639872 | 73728 | 0 | false | | sys | src_inbox__at__lm__visites | emailing | varchar | 01/02/10270 | 315819936 | 0 | 631639872 | 73728 | 0 | false | | sys | src_inbox__at__lm__visites | dateenvois | varchar | 01/03/10372 | 315819936 | 0 | 631639872 | 73728 | 0 | false | | sys | src_inbox__at__lm__visites | nbpages | int | 01/04/10476 | 315819936 | 4 | 1263279744 | 0 | 0 | false | | sys | src_inbox__at__lm__visites | dureevisite | time | 01/06/10600 | 315819936 | 4 | 1263279744 | 0 | 0 | false | | sys | src_inbox__at__lm__visites | id_unique | int | 01/07/10703 | 315819936 | 4 | 1263279744 | 0 | 0 | false | +--------+----------------------------+---------------------+---------+-------------+-----------+-----------+------------+------------+---------+--------+ heapsize of idvisteur = 3 667 394 560 (3,6GB) instead of 80 449 110 016 (80 GB) my query is present MUCH faster sql>select count(distinct idvisiteur) from src_inbox__at__lm__visites; +----------+ | L1 | +==========+ | 75528081 | +----------+ 1 tuple (4m 22s) HOURA !!! but WHY WHY WHY ? I want to understand Thank Pierre -- 1G6 52 route de bischwiller 67300 Schiltigheim Société de Services et de Formations en Logiciels Libres http://1g6.biz Tél : 06 64 63 70 35