Hello, I would guess you are doing a lot of deletes from the table. Monet does not delete rows physically but merely marks them as deleted. Vacuum command removes deleted rows physically and table scan does not go over 80 GB but over 3.6 GB. Hope that helps, Radovan On 10/22/2014 11:44 PM, Pierre-Adrien Coustillas wrote:
Hello,
(google translate)
I have a query that takes a long time : select count(distinct idvisiteur) from src_inbox__at__lm__visites;
After two hours of running, it is not over, I'll kill her.
if I look at the heapsize column storage () sql>select * from storage() WHERE "table"='src_inbox__at__lm__visites'; +--------+----------------------------+---------------------+---------+-------------+-----------+-----------+------------+-------------+---------+--------+ | schema | table | column | type | location | count | typewidth | columnsize | heapsize | indices | sorted | +========+============================+=====================+=========+=============+===========+===========+============+=============+=========+========+ | sys | src_inbox__at__lm__visites | idvisite | varchar | 01/00/10025 | 315819936 | 5 | 2526559488 | 27755937792 | 0 | false | | sys | src_inbox__at__lm__visites | idvisiteur | varchar | 76/7667 | 315819936 | 19 | 2526559488 | 80449110016 | 0 | false | | sys | src_inbox__at__lm__visites | idvisiteuridentifie | varchar | 62/6201 | 315819936 | 2 | 1263279744 | 2222391296 | 0 | false | | sys | src_inbox__at__lm__visites | siteniveau1 | varchar | 77/7704 | 315819936 | 5 | 315819936 | 65536 | 0 | false | | sys | src_inbox__at__lm__visites | debutvisite | varchar | 30/3064 | 315819936 | 19 | 2526559488 | 10741678080 | 0 | false | | sys | src_inbox__at__lm__visites | catvisiteur | varchar | 31/3171 | 315819936 | 0 | 631639872 | 65536 | 0 | false | | sys | src_inbox__at__lm__visites | region | varchar | 67/6706 | 315819936 | 12 | 1263279744 | 2077556736 | 0 | false | | sys | src_inbox__at__lm__visites | ville | varchar | 40/4071 | 315819936 | 10 | 2526559488 | 22635741184 | 0 | false | | sys | src_inbox__at__lm__visites | os | varchar | 01/26/12673 | 315819936 | 8 | 631639872 | 65536 | 0 | false | | sys | src_inbox__at__lm__visites | browser | varchar | 01/44/14410 | 315819936 | 17 | 631639872 | 65536 | 0 | false | | sys | src_inbox__at__lm__visites | sourcevisite | varchar | 42/4234 | 315819936 | 1 | 631639872 | 65536 | 0 | false | | sys | src_inbox__at__lm__visites | idcampagne | varchar | 01/41/14100 | 315819936 | 0 | 631639872 | 65536 | 0 | false | | sys | src_inbox__at__lm__visites | prefixe | varchar | 01/22/12277 | 315819936 | 0 | 315819936 | 65536 | 0 | false | | sys | src_inbox__at__lm__visites | libcampagne | varchar | 65/6573 | 315819936 | 0 | 631639872 | 65536 | 0 | false | | sys | src_inbox__at__lm__visites | emailing | varchar | 01/04/10434 | 315819936 | 0 | 631639872 | 65536 | 0 | false | | sys | src_inbox__at__lm__visites | dateenvois | varchar | 01/22/12264 | 315819936 | 0 | 631639872 | 65536 | 0 | false | | sys | src_inbox__at__lm__visites | nbpages | int | 56/5615 | 315819936 | 4 | 1263279744 | 0 | 0 | false | | sys | src_inbox__at__lm__visites | dureevisite | time | 01/02/10211 | 315819936 | 4 | 1263279744 | 0 | 0 | false | | sys | src_inbox__at__lm__visites | id_unique | int | 01/23/12370 | 315819936 | 4 | 1263279744 | 0 | 0 | false | +--------+----------------------------+---------------------+---------+-------------+-----------+-----------+------------+-------------+---------+--------+
Is the heapsize column represents the amount of data to be placed in RAM ? I have 64GB of RAM on the server, so the idvisiteur column can not be held in RAM. Is this true? ( 64 000 000 000 < 80 449 110 016 )
I execute a vacuum (experimental command ?) : https://www.monetdb.org/Documentation/UserGuide/AutoloadingScript (Table vacuum commands (experimental!)) call vacuum('sys','src_inbox__at__lm__visites');
sql>select * from storage() WHERE "table"='src_inbox__at__lm__visites'; +--------+----------------------------+---------------------+---------+-------------+-----------+-----------+------------+------------+---------+--------+ | schema | table | column | type | location | count | typewidth | columnsize | heapsize | indices | sorted | +========+============================+=====================+=========+=============+===========+===========+============+============+=========+========+ | sys | src_inbox__at__lm__visites | idvisite | varchar | 36 | 315819936 | 5 | 2526559488 | 4906811392 | 0 | false | | sys | src_inbox__at__lm__visites | idvisiteur | varchar | 64/6435 | 315819936 | 19 | 1263279744 | 3667394560 | 0 | false | | sys | src_inbox__at__lm__visites | idvisiteuridentifie | varchar | 66/6640 | 315819936 | 2 | 1263279744 | 40042496 | 0 | false | | sys | src_inbox__at__lm__visites | siteniveau1 | varchar | 67/6743 | 315819936 | 5 | 315819936 | 73728 | 0 | false | | sys | src_inbox__at__lm__visites | debutvisite | varchar | 70/7045 | 315819936 | 19 | 2526559488 | 7576551424 | 0 | false | | sys | src_inbox__at__lm__visites | catvisiteur | varchar | 71/7152 | 315819936 | 0 | 631639872 | 73728 | 0 | false | | sys | src_inbox__at__lm__visites | region | varchar | 72/7252 | 315819936 | 12 | 1263279744 | 106561536 | 0 | false | | sys | src_inbox__at__lm__visites | ville | varchar | 73/7353 | 315819936 | 10 | 1263279744 | 1362165760 | 0 | false | | sys | src_inbox__at__lm__visites | os | varchar | 74/7453 | 315819936 | 8 | 631639872 | 73728 | 0 | false | | sys | src_inbox__at__lm__visites | browser | varchar | 75/7554 | 315819936 | 17 | 631639872 | 73728 | 0 | false | | sys | src_inbox__at__lm__visites | sourcevisite | varchar | 76/7655 | 315819936 | 1 | 631639872 | 73728 | 0 | false | | sys | src_inbox__at__lm__visites | idcampagne | varchar | 77/7761 | 315819936 | 0 | 631639872 | 73728 | 0 | false | | sys | src_inbox__at__lm__visites | prefixe | varchar | 01/00/10064 | 315819936 | 0 | 315819936 | 73728 | 0 | false | | sys | src_inbox__at__lm__visites | libcampagne | varchar | 01/01/10165 | 315819936 | 0 | 631639872 | 73728 | 0 | false | | sys | src_inbox__at__lm__visites | emailing | varchar | 01/02/10270 | 315819936 | 0 | 631639872 | 73728 | 0 | false | | sys | src_inbox__at__lm__visites | dateenvois | varchar | 01/03/10372 | 315819936 | 0 | 631639872 | 73728 | 0 | false | | sys | src_inbox__at__lm__visites | nbpages | int | 01/04/10476 | 315819936 | 4 | 1263279744 | 0 | 0 | false | | sys | src_inbox__at__lm__visites | dureevisite | time | 01/06/10600 | 315819936 | 4 | 1263279744 | 0 | 0 | false | | sys | src_inbox__at__lm__visites | id_unique | int | 01/07/10703 | 315819936 | 4 | 1263279744 | 0 | 0 | false | +--------+----------------------------+---------------------+---------+-------------+-----------+-----------+------------+------------+---------+--------+
heapsize of idvisteur = 3 667 394 560 (3,6GB) instead of 80 449 110 016 (80 GB)
my query is present MUCH faster sql>select count(distinct idvisiteur) from src_inbox__at__lm__visites; +----------+ | L1 | +==========+ | 75528081 | +----------+ 1 tuple (4m 22s)
HOURA !!!
but WHY WHY WHY ? I want to understand
Thank
Pierre
-- 1G6 52 route de bischwiller 67300 Schiltigheim Société de Services et de Formations en Logiciels Libres http://1g6.biz Tél : 06 64 63 70 35
_______________________________________________ users-list mailing list users-list@monetdb.org https://www.monetdb.org/mailman/listinfo/users-list