[Monetdb-developers] glib - double free or corruption
Dear Monet developers,

We're having trouble with a script that uses the {} operator to compute the avg of a large (4M rows) table: we get a 'double free' error from glibc, and Monet crashes. This is using Monet 4.16.2 compiled for 64 bits with 32-bit oids on Linux. (BTW, it also happens if we use {sum}, but not if we use {count}.)

The code takes two tables, links and attr, and does the equivalent of

SELECT avg(attr.value)
FROM links, attr
WHERE links.id = attr.id
GROUP BY links.from

Here's the MIL code:

# Get the BATs
var var_attr:=bat(bat("prox_link_attr").fetch(2)).find(oid(0));
var var_attr_id:=bat(bat(var_attr).fetch(0));
var var_attr_val:=bat(bat(var_attr).fetch(1));

var var_link_id:=bat(bat("prox_link").fetch(0));
var var_link_from:=bat(bat("prox_link").fetch(1));

# Join ATTR x LINK, keep LINK.from, ATTR.val
var var_1:=var_link_id.join(var_attr_id.reverse());
var var_2:=var_1.mark(0@0);
var var_3:=var_1.reverse().mark(0@0);
var var_5:=var_2.reverse().join(var_link_from);
var var_8:=var_3.reverse().join(var_attr_val);

# GROUP BY LINK.from and compute AVG(ATTR.val)
var var_9:=var_5.reverse().join(var_8);
var_5.info().print();
var_8.info().print();
var_9.info().print();
var var_10:={avg}(var_9);

At the end of the last statement, Monet crashes with the error:

*** glibc detected *** double free or corruption (!prev): 0x0000000000632080 ***

I've run this using gdb; below is the trace, including the information printed by the .info() calls at the end of the script, and some extra info that is printed with debugmask(32 + 131072).

Any thoughts? Is there any more information that you would need to debug this problem?

Thanks a lot,

-- Agustin

------------

$ gdb --args /usr/local/Monet-venus-debug/bin/Mserver --dbname xxx ~/test.mil
GNU gdb Red Hat Linux (6.3.0.0-1.143.el4rh)
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu"...Using host libthread_db library "/lib64/tls/libthread_db.so.1".

(gdb) run
Starting program: /usr/local/Monet-venus-debug/bin/Mserver --dbname xxx /home/aschapira/test.mil
[Thread debugging using libthread_db enabled]
[New Thread 182918188384 (LWP 14037)]
[New Thread 1082132832 (LWP 14040)]
# Monet Database Server V4.16.2
# Copyright (c) 1993-2007, CWI. All rights reserved.
# Compiled for x86_64-redhat-linux-gnu/64bit with 32bit OIDs; dynamically linked.
# Visit http://monetdb.cwi.nl/ for further information.
#-----------------------------------------#
# h     t       # name
# str   str     # type
#-----------------------------------------#
[ "version", "25105" ]
[ "batId", "tmp_41" ]
[ "batCacheid", "-33" ]
[ "batParentid", "0" ]
[ "batSharecnt", "0" ]
[ "head", "oid" ]
[ "tail", "oid" ]
[ "batPersistence", "transient" ]
[ "batRestricted", "read-only" ]
[ "batRefcnt", "1" ]
[ "batLRefcnt", "1" ]
[ "batDirty", "dirty" ]
[ "batSet", "0" ]
[ "void_tid", "0" ]
[ "void_cnt", "0" ]
[ "hsorted", "65" ]
[ "hident", "t" ]
[ "hdense", "1" ]
[ "hseqbase", "0@0" ]
[ "hkey", "1" ]
[ "hloc", "4" ]
[ "hvarsized", "0" ]
[ "halign", "1008981" ]
[ "hnosorted", "0" ]
[ "hnosorted_rev", "0" ]
[ "hnodense", "0" ]
[ "hnokey[0]", "0" ]
[ "hnokey[1]", "0" ]
[ "tident", "h" ]
[ "tdense", "0" ]
[ "tseqbase", "nil" ]
[ "tsorted", "0" ]
[ "tkey", "0" ]
[ "tloc", "0" ]
[ "tvarsized", "0" ]
[ "talign", "1005793" ]
[ "tnosorted", "58570" ]
[ "tnosorted_rev", "0" ]
[ "tnodense", "0" ]
[ "tnokey[0]", "0" ]
[ "tnokey[1]", "1" ]
[ "batInserted", "0" ]
[ "batDeleted", "0" ]
[ "batFirst", "0" ]
[ "top", "4180239" ]
[ "batStamp", "-73" ]
[ "lastUsed", "24012" ]
[ "curStamp", "86" ]
[ "batCopiedtodisk", "0" ]
[ "batDirtydesc", "dirty" ]
[ "batDirtybuns", "clean" ]
[ "batBuns.free", "33441912" ]
[ "batBuns.size", "33441912" ]
[ "batBuns.maxsize", "40173552" ]
[ "batBuns.storage", "malloced" ]
[ "batBuns.filename", "41.buns" ]
[ "hheapdirty", "clean" ]
[ "theapdirty", "clean" ]
#-----------------------------------------#
# h     t       # name
# str   str     # type
#-----------------------------------------#
[ "version", "25105" ]
[ "batId", "tmp_42" ]
[ "batCacheid", "34" ]
[ "batParentid", "0" ]
[ "batSharecnt", "0" ]
[ "head", "oid" ]
[ "tail", "int" ]
[ "batPersistence", "transient" ]
[ "batRestricted", "read-only" ]
[ "batRefcnt", "1" ]
[ "batLRefcnt", "1" ]
[ "batDirty", "dirty" ]
[ "batSet", "0" ]
[ "void_tid", "-1" ]
[ "void_cnt", "0" ]
[ "hsorted", "65" ]
[ "hident", "h" ]
[ "hdense", "1" ]
[ "hseqbase", "0@0" ]
[ "hkey", "1" ]
[ "hloc", "0" ]
[ "hvarsized", "0" ]
[ "halign", "1009002" ]
[ "hnosorted", "0" ]
[ "hnosorted_rev", "0" ]
[ "hnodense", "0" ]
[ "hnokey[0]", "0" ]
[ "hnokey[1]", "0" ]
[ "tident", "t" ]
[ "tdense", "0" ]
[ "tseqbase", "0@0" ]
[ "tsorted", "0" ]
[ "tkey", "0" ]
[ "tloc", "4" ]
[ "tvarsized", "0" ]
[ "talign", "1009003" ]
[ "tnosorted", "0" ]
[ "tnosorted_rev", "0" ]
[ "tnodense", "0" ]
[ "tnokey[0]", "0" ]
[ "tnokey[1]", "0" ]
[ "batInserted", "0" ]
[ "batDeleted", "0" ]
[ "batFirst", "0" ]
[ "top", "4180239" ]
[ "batStamp", "-84" ]
[ "lastUsed", "24028" ]
[ "curStamp", "86" ]
[ "batCopiedtodisk", "0" ]
[ "batDirtydesc", "dirty" ]
[ "batDirtybuns", "clean" ]
[ "batBuns.free", "33441912" ]
[ "batBuns.size", "35202008" ]
[ "batBuns.maxsize", "42270704" ]
[ "batBuns.storage", "malloced" ]
[ "batBuns.filename", "42.buns" ]
[ "hheapdirty", "clean" ]
[ "theapdirty", "clean" ]
#-----------------------------------------#
# h     t       # name
# str   str     # type
#-----------------------------------------#
[ "version", "25105" ]
[ "batId", "tmp_43" ]
[ "batCacheid", "-35" ]
[ "batParentid", "0" ]
[ "batSharecnt", "0" ]
[ "head", "oid" ]
[ "tail", "int" ]
[ "batPersistence", "transient" ]
[ "batRestricted", "read-only" ]
[ "batRefcnt", "1" ]
[ "batLRefcnt", "1" ]
[ "batDirty", "dirty" ]
[ "batSet", "0" ]
[ "void_tid", "6122832" ]
[ "void_cnt", "0" ]
[ "hsorted", "0" ]
[ "hident", "t" ]
[ "hdense", "0" ]
[ "hseqbase", "nil" ]
[ "hkey", "0" ]
[ "hloc", "4" ]
[ "hvarsized", "0" ]
[ "halign", "1005793" ]
[ "hnosorted", "58570" ]
[ "hnosorted_rev", "0" ]
[ "hnodense", "0" ]
[ "hnokey[0]", "0" ]
[ "hnokey[1]", "1" ]
[ "tident", "h" ]
[ "tdense", "0" ]
[ "tseqbase", "0@0" ]
[ "tsorted", "0" ]
[ "tkey", "0" ]
[ "tloc", "0" ]
[ "tvarsized", "0" ]
[ "talign", "1009003" ]
[ "tnosorted", "0" ]
[ "tnosorted_rev", "0" ]
[ "tnodense", "0" ]
[ "tnokey[0]", "0" ]
[ "tnokey[1]", "0" ]
[ "batInserted", "0" ]
[ "batDeleted", "0" ]
[ "batFirst", "0" ]
[ "top", "4180239" ]
[ "batStamp", "-85" ]
[ "lastUsed", "24043" ]
[ "curStamp", "86" ]
[ "batCopiedtodisk", "0" ]
[ "batDirtydesc", "dirty" ]
[ "batDirtybuns", "clean" ]
[ "batBuns.free", "33441912" ]
[ "batBuns.size", "33441912" ]
[ "batBuns.maxsize", "40173552" ]
[ "batBuns.storage", "malloced" ]
[ "batBuns.filename", "43.buns" ]
[ "hheapdirty", "clean" ]
[ "theapdirty", "clean" ]
*** double free or corruption (!prev): 0x0000000000632080 ***

Program received signal SIGABRT, Aborted.
[Switching to Thread 182918188384 (LWP 14037)]
0x0000003a7b72e21d in raise () from /lib64/tls/libc.so.6
(gdb) bt
#0  0x0000003a7b72e21d in raise () from /lib64/tls/libc.so.6
#1  0x0000003a7b72fa1e in abort () from /lib64/tls/libc.so.6
#2  0x0000003a7b763451 in __libc_message () from /lib64/tls/libc.so.6
#3  0x0000003a7b76906e in _int_free () from /lib64/tls/libc.so.6
#4  0x0000003a7b7693b6 in free () from /lib64/tls/libc.so.6
#5  0x0000002a96665c43 in GDKfree (blk=0x632088)
    at /export/scratch0/monet/monet.GNU.64.64.d.14791/MonetDB/src/gdk/gdk_utils.mx:1121
#6  0x0000002a965ea1cf in HEAPfree (h=0x631940)
    at /export/scratch0/monet/monet.GNU.64.64.d.14791/MonetDB/src/gdk/gdk_heap.mx:264
#7  0x0000002a966dc693 in BATdelete (b=0x631828)
    at /export/scratch0/monet/monet.GNU.64.64.d.14791/MonetDB/src/gdk/gdk_storage.mx:716
#8  0x0000002a965e5e6d in BBPaddtobin (b=0x631828)
    at /export/scratch0/monet/monet.GNU.64.64.d.14791/MonetDB/src/gdk/gdk_bbp.mx:2366
#9  0x0000002a965e252a in BBPdestroy (b=0x631828)
    at /export/scratch0/monet/monet.GNU.64.64.d.14791/MonetDB/src/gdk/gdk_bbp.mx:1665
#10 0x0000002a965e12b1 in BBPreclaim (b=0x631828)
    at /export/scratch0/monet/monet.GNU.64.64.d.14791/MonetDB/src/gdk/gdk_bbp.mx:1514
#11 0x0000002a95ef5e0f in interpret_setaggr (name=0x62a038 "{avg}", argc=2, argv=0x620c18, res=0x7fbffff790, tt=0x631078, stk=1)
    at /export/scratch0/monet/monet.GNU.64.64.d.14791/MonetDB4/src/monet/monet_multiplex.mx:1819
#12 0x0000002a95e95c57 in interpret (stk=1, lt=0x625090, res=0x7fbffff790)
    at /export/scratch0/monet/monet.GNU.64.64.d.14791/MonetDB4/src/monet/monet_interpreter.mx:1202
#13 0x0000002a95e9da04 in interpret_assignment (stk=1, lt=0x625040, res=0x7fbffff790)
    at /export/scratch0/monet/monet.GNU.64.64.d.14791/MonetDB4/src/monet/monet_interpreter.mx:1842
#14 0x0000002a95e97349 in interpret_var (stk=1, lt=0x625018, res=0x7fbffff790)
    at /export/scratch0/monet/monet.GNU.64.64.d.14791/MonetDB4/src/monet/monet_interpreter.mx:1329
#15 0x0000002a95e93e30 in interpret (stk=1, lt=0x625018, res=0x7fbffff790)
    at /export/scratch0/monet/monet.GNU.64.64.d.14791/MonetDB4/src/monet/monet_interpreter.mx:832
#16 0x0000002a95e9dca3 in interpret_seqblock (stk=1, lt=0x5ca8b8, res=0x7fbffff790, scope=0)
    at /export/scratch0/monet/monet.GNU.64.64.d.14791/MonetDB4/src/monet/monet_interpreter.mx:1892
#17 0x0000002a95e93606 in interpret (stk=1, lt=0x629478, res=0x7fbffff790)
    at /export/scratch0/monet/monet.GNU.64.64.d.14791/MonetDB4/src/monet/monet_interpreter.mx:770
#18 0x0000002a95e9173d in interpret_str (stk=0, buf=0x629908 "# Get the BATs\nvar var_attr:=bat(bat (\"prox_link_attr\").fetch(2)).find(oid(0));\nvar var_attr_id:=bat(bat (var_attr).fetch(0));\nvar var_attr_val:=bat(bat(var_attr).fetch(1)); \n\nvar var_link_id:=bat(bat(\"p"..., res=0x7fbffff790)
    at /export/scratch0/monet/monet.GNU.64.64.d.14791/MonetDB4/src/monet/monet_interpreter.mx:246
#19 0x0000002a95e91bf0 in interpret_file (stk=0, lt=0x5411c8, res=0x7fbffff790)
    at /export/scratch0/monet/monet.GNU.64.64.d.14791/MonetDB4/src/monet/monet_interpreter.mx:291
#20 0x0000002a95e93f39 in interpret (stk=0, lt=0x5411c8, res=0x7fbffff790)
    at /export/scratch0/monet/monet.GNU.64.64.d.14791/MonetDB4/src/monet/monet_interpreter.mx:858
#21 0x0000002a95ebf65e in handleRequest (t=0x2a96a46840, q=0x64d5a8, res=0x7fbffff790)
    at /export/scratch0/monet/monet.GNU.64.64.d.14791/MonetDB4/src/monet/monet_queue.mx:537
---Type <return> to continue, or q <return> to quit---
#22 0x0000002a95ebfadb in doRequest (t=0x2a96a46840, preference=0x0)
    at /export/scratch0/monet/monet.GNU.64.64.d.14791/MonetDB4/src/monet/monet_queue.mx:563
#23 0x0000002a95f1108c in monetInterpreter (status=0x7fbffff7f8)
    at /export/scratch0/monet/monet.GNU.64.64.d.14791/MonetDB4/src/monet/monet_process.mx:112
#24 0x0000000000402674 in main (argc=4, av=0x7fbffff908)
    at /export/scratch0/monet/monet.GNU.64.64.d.14791/MonetDB4/src/tools/Mserver.mx:403

When I run this with debugmask(32 + 131072), I get the following debug
output as well. This is right after the calls to .info(), during the processing of {avg}:

#interpret_unpin(print) on bat(59) refcnt = 0
##60 = new tmp_74(int,int)
##BBPreclaim: bat(60) view=0 lrefs=0 ref=1 stat=1
#interpret_pin({avg}) on bat(-35) refcnt = 1
##61 = new tmp_75(oid,int)
##62 = new tmp_76(oid,void)
##65 = new tmp_101(oid,void)
##66 = new tmp_102(oid,void)
##67 = new tmp_103(oid,void)
##BBPreclaim: bat(67) view=0 lrefs=0 ref=1 stat=1
##clear 67 (tmp_103)
##uncache 67 (tmp_103)
##BBPreclaim: bat(66) view=0 lrefs=0 ref=1 stat=1
##clear 66 (tmp_102)
##uncache 66 (tmp_102)
##BBPreclaim: bat(65) view=0 lrefs=0 ref=1 stat=1
##clear 65 (tmp_101)
##uncache 65 (tmp_101)
##BBPreclaim: bat(62) view=0 lrefs=0 ref=1 stat=1
##clear 62 (tmp_76)
##uncache 62 (tmp_76)
##BBPreclaim: bat(61) view=0 lrefs=0 ref=1 stat=1
##clear 61 (tmp_75)
##uncache 61 (tmp_75)
#setaggr impl: non-optimized hash
#interpret_pin(count) on bat(61) refcnt = 2
#interpret_unpin(count) on bat(61) refcnt = 1
#interpret_pin(sum_lng) on bat(61) refcnt = 2
#interpret_unpin(sum_lng) on bat(61) refcnt = 1
... [these 4 lines repeat 20,000 times, which corresponds to the number of unique values in the HEAD of var_9] ...
##BBPreclaim: bat(65) view=0 lrefs=0 ref=1 stat=1
##clear 65 (tmp_101)
##uncache 65 (tmp_101)
##BBPreclaim: bat(61) view=0 lrefs=0 ref=1 stat=1

I hope this helps. Thanks again.
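
P.S. In case it is useful for checking results on a small sample, here is a rough Python sketch of what the script is meant to compute: the join of links and attr on id, followed by the average of attr.value per links.from. The function name and the tuple layout are made up for illustration and are not part of MonetDB; the sketch only mirrors the SQL statement at the top of this message and says nothing about how {avg} is implemented internally.

```python
from collections import defaultdict


def avg_by_group(links, attr):
    """Average attr values per link source.

    links: iterable of (id, from) pairs   -- hypothetical layout
    attr:  iterable of (id, value) pairs  -- hypothetical layout
    Returns a dict mapping each 'from' to the mean of the joined values.
    """
    # Index attr rows by id so the join is a dictionary lookup.
    values_by_id = defaultdict(list)
    for attr_id, value in attr:
        values_by_id[attr_id].append(value)

    # Accumulate [sum, count] per group while walking the links.
    totals = defaultdict(lambda: [0, 0])
    for link_id, link_from in links:
        for value in values_by_id.get(link_id, ()):
            acc = totals[link_from]
            acc[0] += value
            acc[1] += 1

    # Groups with no joined rows never enter totals, so count is never 0.
    return {group: s / c for group, (s, c) in totals.items()}


if __name__ == "__main__":
    sample_links = [(1, "a"), (2, "a"), (3, "b")]
    sample_attr = [(1, 10), (2, 20), (3, 5), (4, 99)]
    print(avg_by_group(sample_links, sample_attr))
```

Running the same tiny sample through the MIL script with {avg} and comparing against this would at least show whether the crash depends on table size.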
participants (1)
-
Agustin Schapira