[Monetdb-developers] Monet crashes with {sum} and {avg}
Hi,

I am having more difficulties with {sum} and {avg}. The problems happen on both 4.16.2 and 4.18.0, on Linux and OS X, in 32- and 64-bit builds, with and without optimizations.

I have been able to isolate the problem and write a simple test case that shows the error. The attached file has the contents of a BAT; if you save it and then run the following script:

    module(ascii_io);
    var b:=bat(str,int).import(<filename>);
    var x:={sum}(b);

then you will get the following error:

    *** glibc detected *** double free or corruption (!prev): 0x00000000006a6360 ***

and Monet will crash.

The failing free is called from line 1819 of MonetDB4/src/monet/monet_multiplex.mx. It's inside the 'interpret_setaggr' function, and it happens only with the 'non-optimized hash' implementation of setaggr starting on line 1694. (In fact, if you sort the BAT before calling {sum}, there will be no error. The same is true, for example, when you call {count}, which uses a different implementation of setaggr to process the {}.)

The failing free appears within the 'bunins_failed' label (line 1800), which in turn is reached from the loop that processes the aggregate when, I think, its call to 'bunins_unary' fails. Here's the failing code, starting on line 1694:

    BATloopFast(extent, p, q, yy) {
        [...]
        if (BATprepareHash(b))
            goto bunins_failed;
        HASHloop(b, b->hhash, hh, last) {
            r = BUNptr(b, hh);
            @:bunins_unary(r,hh)@
        }
        [...]
    }

I don't understand why @:bunins_unary(r,hh)@ should fail. I have noticed, however, that if I remove the last couple of lines from the attached file, then the call to {sum} doesn't crash. These last two lines are exactly the same as the others, so it's not their particular value that matters, but rather the number of rows in the BAT with the "Other" head value.

Do you have any suggestions? Is there any other information that I can provide to help you fix the problem?
Thanks a lot again,

-- Agustin

PS: On OS X, instead of the crash you get an error message "malloc: *** Deallocation of a pointer not malloced: 0x281ee00; This could be a double free(), or free() called with the middle of an allocated block", but for all practical purposes it's the same: the free() is called from the same place in interpret_setaggr. The difference is just in the way the Mac equivalent of glibc handles the double free.
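Since the crash depends on how many rows share a head value rather than on the values themselves, it can help to generate test files of varying size instead of trimming the attachment by hand. The following is only a sketch: `write_rows` is a hypothetical helper, not part of MonetDB, and it assumes the attached file holds one `"head",value` row per line.

```c
#include <stdio.h>

/* Hypothetical helper (not part of MonetDB): write n rows of the form
 * "head",value to out, one row per line. The one-row-per-line layout is
 * an assumption about the attached file.
 * Returns 0 on success, -1 on a write error. */
static int write_rows(FILE *out, const char *head, int value, size_t n)
{
    size_t i;

    for (i = 0; i < n; i++) {
        if (fprintf(out, "\"%s\",%d\n", head, value) < 0)
            return -1;
    }
    return 0;
}
```

Calling it once per head value ("Course", "Faculty", ..., "Other") with different counts for "Other" should make it possible to narrow down the row count at which {sum} starts to fail.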
On Wed, Aug 22, 2007 at 10:13:23AM -0400, Agustin Schapira wrote:
Hi,
Hi Agustin,

You found an interesting bug. The interpret_setaggr implementation does indeed have many cases. One set of cases uses an intermediate bat in a straightforward way (normal inserts). This is the case you trigger. As soon as your input bat has many values, the intermediate bat may need to be extended, which is what happens here.

Many of the other cases don't use the nested bat in such a clean way, but play 'dirty' tricks, such as reusing the heaps of your input bat. Therefore, at the end of interpret_setaggr the nested bats are cleaned up by restoring their initial content. As you can see, restoring the 'initial' content after normal inserts with extends is a problem.

I now have a fix for this, which I hope to check in soon.

Index: monet_multiplex.mx
===================================================================
RCS file: /cvsroot/monetdb/MonetDB4/src/monet/monet_multiplex.mx,v
retrieving revision 1.4
diff -u -r1.4 monet_multiplex.mx
--- monet_multiplex.mx 20 Feb 2007 11:32:16 -0000 1.4
+++ monet_multiplex.mx 22 Aug 2007 19:06:44 -0000
@@ -1430,7 +1430,7 @@
     BAT *b, *nested, *nested_rev, *histo = 0, *extent = 0, *ret = 0;
     size_t minpos;
     int ret_val = -1, varsize, head_type, hidden_vartype = 0;
-    int prop_key, prop_sorted = 0;
+    int prop_key, prop_sorted = 0, restore = 1;
     GDKfcn direct_call = NULL, packed_call = NULL;
     ValRecord argv_bak;
     BUN p, q, r, s;
@@ -1688,6 +1688,7 @@
     assert(batbuns->base!=(ptr)1);
     nested->batDeleted = nested->batInserted = batbuns->base;
+    restore = 0;
     if (direct_call == NULL) {
         if (GDKdebug & 131072)
             THRprintf(GDKout, "setaggr impl: non-optimized hash\n");
@@ -1814,8 +1815,16 @@
     argv[1] = argv_bak;
     /* undo all our hacks in nested before freeing it */
-    *(BATstore*) nested = nested_bak;
-    *(BAT*) nested_rev = nested_rev_bak;
+    if (restore) {
+        *(BATstore*) nested = nested_bak;
+        *(BAT*) nested_rev = nested_rev_bak;
+    } else {
+        BATstore *n = (BATstore*)nested;
+        nested->H = &n->H;
+        nested->T = &n->T;
+        nested_rev->H = &n->T;
+        nested_rev->T = &n->H;
+    }
     BBPreclaim(nested);
     if (ret_val < 0) {

Niels
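The failure mode described above can be illustrated outside MonetDB with a toy model. This is only a sketch under stated assumptions: `ToyBat`, `toy_insert` and `toy_setaggr_sum` are hypothetical names, and the real BAT structures are far more involved. The idea is that a descriptor is snapshotted, one path borrows the caller's heap and must restore the snapshot before freeing, while the other path does plain inserts that may replace the heap, so restoring the snapshot there would hand an already freed pointer to free(). The `restore` flag plays the same role as in the patch.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a BAT descriptor owning one growable heap. */
typedef struct {
    int   *base;      /* heap holding the values */
    size_t count;     /* values in use */
    size_t capacity;  /* values allocated */
} ToyBat;

/* Append v; when the heap is full, replace it with a larger one and
 * free the old one. Returns 0 on success, -1 on allocation failure. */
static int toy_insert(ToyBat *b, int v)
{
    if (b->count == b->capacity) {
        size_t ncap = b->capacity ? 2 * b->capacity : 4;
        int *nbase = malloc(ncap * sizeof *nbase);

        if (nbase == NULL)
            return -1;
        if (b->count > 0)
            memcpy(nbase, b->base, b->count * sizeof *nbase);
        free(b->base);  /* any snapshot of the old base now dangles */
        b->base = nbase;
        b->capacity = ncap;
    }
    b->base[b->count++] = v;
    return 0;
}

/* Sum n values through a scratch descriptor, either by borrowing the
 * caller's array ("dirty" path) or by plain inserts. Returns the sum,
 * or -1 on allocation failure. */
static long toy_setaggr_sum(int *vals, size_t n, int borrow_input)
{
    ToyBat nested = { NULL, 0, 4 };
    ToyBat nested_bak;
    int restore = 1;
    long sum = 0;
    size_t i;

    nested.base = malloc(nested.capacity * sizeof *nested.base);
    if (nested.base == NULL)
        return -1;
    nested_bak = nested;  /* snapshot of the initial content */

    if (borrow_input) {
        /* Borrow the caller's heap; the snapshot must be restored later
         * so that we free our own heap, not the caller's. */
        nested.base = vals;
        nested.count = nested.capacity = n;
    } else {
        /* Plain inserts may replace the heap, so the snapshot must not
         * be restored: its base may already have been freed. */
        restore = 0;
        for (i = 0; i < n; i++) {
            if (toy_insert(&nested, vals[i]) < 0) {
                free(nested.base);
                return -1;
            }
        }
    }

    for (i = 0; i < nested.count; i++)
        sum += nested.base[i];

    if (restore)
        nested = nested_bak;  /* undo the borrow before freeing */
    free(nested.base);
    return sum;
}
```

In this model, leaving `restore` at 1 on the plain-insert path would free `nested_bak.base` a second time once the heap has grown past its initial four slots, which would be consistent with the observation that the crash depends on the number of rows and not on their values.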
"Course",1 "Course",1 "Course",1 "Course",1 "Course",1 "Course",1 "Course",1 "Course",1 "Course",1 "Course",1 "Course",1 "Course",1 "Course",1 "Course",1 "Course",1 "Course",1 "Course",1 "Course",1 "Course",1 "Course",1 "Course",1 "Course",1 "Course",1 "Course",1 "Course",1 "Course",1 "Course",1 "Course",1 "Course",1 "Course",1 "Course",1 "Course",1 "Course",1 "Course",1 "Course",1 "Course",1 "Course",1 "Course",1 "Course",1 "Course",1 "Course",1 "Course",1 "Course",1 "Course",1 "Course",1 "Course",1 "Course",1 "Course",1 "Course",1 "Course",1 "Course",1 "Course",1 "Course",1 "Course",1 "Course",1 "Course",1 "Course",1 "Course",1 "Course",1 "Course",1 "Course",1 "Course",1 "Course",1 "Course",1 "Course",1 "Course",1 "Course",1 "Course",1 "Course",1 "Course",1 "Course",1 "Course",1 "Course",1 "Course",1 "Course",1 "Course",1 "Course",1 "Course",1 "Course",1 "Course",1 "Course",1 "Course",1 "Course",1 "Course",1 "Course",1 "Course",1 "Course",1 "Course",1 "Course",1 "Course",1 "Course",1 "Course",1 "Course",1 "Course",1 "Course",1 "Course",1 "Course",1 "Course",1 "Course",1 "Course",1 "Course",1 "Course",1 "Course",1 "Course",1 "Course",1 "Course",1 "Course",1 "Course",1 "Course",1 "Course",1 "Course",1 "Course",1 "Course",1 "Course",1 "Course",1 "Course",1 "Course",1 "Faculty",1 "Faculty",1 "Faculty",1 "Faculty",1 "Faculty",1 "Faculty",1 "Faculty",1 "Faculty",1 "Faculty",1 "Faculty",1 "Faculty",1 "Faculty",1 "Faculty",1 "Faculty",1 "Faculty",1 "Faculty",1 "Faculty",1 "Faculty",1 "Faculty",1 "Faculty",1 "Faculty",1 "Faculty",1 "Faculty",1 "Faculty",1 "Faculty",1 "Faculty",1 "Faculty",1 "Faculty",1 "Faculty",1 "Faculty",1 "Faculty",1 "Faculty",1 "Faculty",1 "Faculty",1 "Faculty",1 "Faculty",1 "Faculty",1 "Faculty",1 "Faculty",1 "Faculty",1 "Faculty",1 "Faculty",1 "Faculty",1 "Faculty",1 "Faculty",1 "Faculty",1 "Faculty",1 "Faculty",1 "Faculty",1 "Faculty",1 "Faculty",1 "Faculty",1 "Faculty",1 "Faculty",1 "Faculty",1 "Faculty",1 "Faculty",1 "Faculty",1 "Faculty",1 
"Faculty",1 "Faculty",1 "Faculty",1 "Faculty",1 "Faculty",1 "Faculty",1 "Faculty",1 "Faculty",1 "Faculty",1 "Faculty",1 "Faculty",1 "Faculty",1 "Faculty",1 "Faculty",1 "Faculty",1 "Faculty",1 "Faculty",1 "Faculty",1 "ResearchProject",1 "ResearchProject",1 "ResearchProject",1 "ResearchProject",1 "ResearchProject",1 "ResearchProject",1 "ResearchProject",1 "ResearchProject",1 "ResearchProject",1 "ResearchProject",1 "ResearchProject",1 "ResearchProject",1 "ResearchProject",1 "ResearchProject",1 "ResearchProject",1 "ResearchProject",1 "ResearchProject",1 "ResearchProject",1 "ResearchProject",1 "ResearchProject",1 "ResearchProject",1 "ResearchProject",1 "ResearchProject",1 "ResearchProject",1 "ResearchProject",1 "ResearchProject",1 "ResearchProject",1 "ResearchProject",1 "ResearchProject",1 "ResearchProject",1 "ResearchProject",1 "ResearchProject",1 "ResearchProject",1 "ResearchProject",1 "ResearchProject",1 "ResearchProject",1 "ResearchProject",1 "ResearchProject",1 "ResearchProject",1 "Staff",1 "Staff",1 "Staff",1 "Staff",1 "Staff",1 "Staff",1 "Staff",1 "Staff",1 "Staff",1 "Staff",1 "Staff",1 "Staff",1 "Staff",1 "Staff",1 "Staff",1 "Staff",1 "Staff",1 "Staff",1 "Staff",1 "Staff",1 "Staff",1 "Staff",1 "Staff",1 "Staff",1 "Staff",1 "Staff",1 "Staff",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 
"Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 
"Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Student",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 
"Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 
"Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1 "Other",1
-- Niels Nes, Centre for Mathematics and Computer Science (CWI) Kruislaan 413, 1098 SJ Amsterdam, The Netherlands room C0.02, phone ++31 20 592-4098, fax ++31 20 592-4312 url: http://www.cwi.nl/~niels e-mail: Niels.Nes@cwi.nl