[Monetdb-developers] @= atomtostr memory usage
Hi,

While writing some new code and seeing unexpected, but interesting, behavior, I stumbled upon the @= atomtostr code. Some simple questions about it.

It seems that the memory requested in @= atommem is rather pessimistic: for a regular integer, 24 characters are allocated. Since this integer goes up to 2^31 (10 chars) plus a sign and \0 (2 chars), why is double that amount needed if a per-type approach is available in MAL?

The actual toStr code finishes with a strlen. That is nice behavior, but if we now know what the length of the string actually is, why not call GDKrealloc on it and make *len the actual size?

With regard to memory management and memory function calls, the code isn't very efficient. Effectively all the functions seem to introduce their own buffers. From the perspective of, for example, strConcat, the value is copied several times from several positions. We could avoid this behavior if we returned a maximum length for a type: the concat function would add the length of the second string, allocate this buffer, forward the pointer to the typical atomToStr call, and, instead of allocating a new string, use the preallocated buffer. (...and yes, doing a GDKrealloc after that function wouldn't be good.)

I took the liberty of coming up with a patch for the second issue. Since snprintf returns the number of bytes written, and since we have established that no numeric type overflows the string space, we can omit the strlen, use the number of bytes + 1 from snprintf, realloc, and return the actual number of bytes representing the string.

Catches:
1) The old implementation resulted, at best, in an off-by-one between the return value and len.
2) Is there any code that depends on *len, or on the length including the \0 value?

If there are no objections, I'll check in that patch later today in head.
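To make the snprintf idea concrete, here is a minimal sketch in plain C. It is not the actual MonetDB code or the proposed patch: int_to_str, INT_STR_MAX, and the buffer/capacity parameters are hypothetical names, realloc stands in for GDKrealloc, and a 32-bit int is assumed. It only shows that the return value of snprintf already is the string length, so a trailing strlen would be redundant.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Assumption: 32-bit int. "-2147483648" is 11 characters, plus '\0'. */
#define INT_STR_MAX 12

/* Hypothetical stand-in for an integer toStr routine.
 * *buf / *cap describe a caller-owned buffer that is grown if too small.
 * Returns the string length excluding the terminating '\0', or -1 on error. */
static int int_to_str(char **buf, size_t *cap, int value)
{
    if (*buf == NULL || *cap < INT_STR_MAX) {
        char *p = realloc(*buf, INT_STR_MAX);
        if (p == NULL)
            return -1;
        *buf = p;
        *cap = INT_STR_MAX;
    }
    /* snprintf returns the number of characters it would have written,
     * not counting '\0'; if that is below *cap, it is the real length. */
    int n = snprintf(*buf, *cap, "%d", value);
    if (n < 0 || (size_t)n >= *cap)
        return -1;
    return n;  /* same value strlen(*buf) would give */
}
```

Whether the buffer should additionally be shrunk to n + 1 bytes is exactly the open question in this thread, so the sketch leaves that step out.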
From a benchmarking point of view, it might be worthwhile to test with and without the realloc.
Stefan
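The strConcat suggestion in the message above (ask the type for its maximum string length, allocate once, and let the conversion write directly into that buffer) could look roughly like the following. This is only an illustrative sketch in plain C, assuming a 32-bit int; int_to_buf, concat_str_int, and INT_STR_MAX are made-up names, and malloc stands in for the GDK allocator.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Assumption: 32-bit int, so at most 11 characters plus '\0'. */
#define INT_STR_MAX 12

/* Hypothetical conversion that writes into a caller-supplied buffer
 * instead of allocating its own. Returns the length, or -1 on error. */
static int int_to_buf(char *dst, size_t cap, int value)
{
    int n = snprintf(dst, cap, "%d", value);
    return (n < 0 || (size_t)n >= cap) ? -1 : n;
}

/* Concatenate a string and the text form of an int with one allocation
 * and no intermediate copy of the converted value. Caller frees. */
static char *concat_str_int(const char *prefix, int value)
{
    size_t plen = strlen(prefix);
    /* One buffer: prefix plus the worst-case length of the int. */
    char *out = malloc(plen + INT_STR_MAX);
    if (out == NULL)
        return NULL;
    memcpy(out, prefix, plen);
    /* The conversion writes straight after the prefix. */
    if (int_to_buf(out + plen, INT_STR_MAX, value) < 0) {
        free(out);
        return NULL;
    }
    return out;
}
```

For example, concat_str_int("id=", 42) yields the string "id=42" in a buffer that is at most a few bytes larger than needed, which is the trade-off against a later realloc.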
Stefan de Konink wrote:
> Hi,
> If there are no objections, I'll check in that patch later today in head.
> From benchmarking point of view, it might be worthwhile to test with and without the realloc.

Refrain from checking in this patch.

Martin

> Stefan
On Thu, 11 Feb 2010, Martin Kersten wrote:
> Refrain from checking in this patch.

Can the part that uses the return value of snprintf instead of strlen be used?

Your comment regarding memory fragmentation is of course valid, in the sense that there will be fragmentation because too much memory is allocated in the first place. More interestingly, though, it can only happen when there is significant concurrency. More importantly, why would fragmentation be worse than reporting more free memory over time? Has this been tested? If not, can we test it?

Stefan
participants (2)
- Martin Kersten
- Stefan de Konink