[MonetDB-users] Counting open BATs
Greetings!
In the architectural overview, I find the text: "Most memory resource problems in MonetDB are easily avoidable by minimizing your open BAT references (as discussed above) and making your access patterns cache friendly (see the radix module)."
Is there anyplace that actually describes when a BAT is considered open? Is it a scope thing (BATs are closed as soon as they are out of scope)? Is a BAT open when I append/update? Does it close when I perform a commit? If not, is there a way to specify "I am done, clean it out"?
I am noticing a large amount of open memory (approximately 500MB out of a 1GB system) being used by MonetDB. I have hand partitioned my BATs into groups using a naming convention, but I am wondering if during inserts I am somehow keeping open references to the BATs which may cause issues.
Is there a way to find out where memory is being allocated? Which BATs are open?
Also, I am setting some of the larger BATs to STORE_MMAP. I notice in the documentation that if I compress the heaps, then these must be set to STORE_MEM. Is there a reason that when uncompressing the heap, that the uncompressed version cannot be STORE_MMAP to avoid memory problems?
Regards! Ed
Greetings!
In the architectural overview, I find the text: "Most memory resource problems in MonetDB are easily avoidable by minimizing your open BAT references (as discussed above) and making your access patterns cache friendly (see the radix module)."
Is there anyplace that actually describes when a BAT is considered open?
It's also mentioned in the architectural overview. Basically, a BAT is considered open if there is any active reference to it, e.g., from a MIL variable or from a nested "BAT of BATs".
Is it a scope thing (BATs are closed as soon as they are out of scope)?
Kind of: MIL variables are destroyed at the end of the scope (i.e., {} or {||} MIL block) they were declared in; and thus also the BATs they refer to are closed (provided it's the last reference to that BAT).
Is a BAT open when I append/update? Does it close when I perform a commit?
BATs are active/open as soon as, and for as long as, you touch/access/use them, i.e., as long as there is a reference to the BAT (see above). commit() does not remove any reference.
If not, is there a way to specify "I am done, clean it out"?
There are two ways. On the one hand, you can unload a BAT using the "unload(str)" command (e.g., "unload(bbpname(b));"; see `help("unload");` for details). On the other hand, you can remove references to BATs by setting the respective variable(s) to nil ("b:=nil;"), or by limiting the scope of variables using "{}" MIL blocks.
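The reference rule described in this reply can be illustrated with a small toy model. This is only a sketch in Python, not MonetDB code: the `BatPool` class, its methods, and the BAT names are hypothetical and merely mimic the stated behaviour (a BAT is open while at least one reference exists, and a commit leaves references untouched).

```python
class BatPool:
    """Toy model (hypothetical, not a MonetDB API): a BAT counts as open
    while its reference count is above zero."""

    def __init__(self):
        self.refs = {}  # BAT name -> number of live references

    def acquire(self, name):
        # A variable (or a nested BAT of BATs) starts referring to the BAT.
        self.refs[name] = self.refs.get(name, 0) + 1

    def release(self, name):
        # A variable is set to nil or goes out of scope.
        self.refs[name] -= 1
        if self.refs[name] == 0:
            del self.refs[name]  # last reference gone: the BAT is closed

    def commit(self):
        # Models the statement above: a commit does not remove any reference.
        pass

    def open_bats(self):
        return sorted(self.refs)


pool = BatPool()
pool.acquire("part_1")   # first reference to part_1
pool.acquire("part_1")   # second reference to the same BAT
pool.acquire("part_2")
pool.commit()
after_commit = pool.open_bats()   # both BATs are still open

pool.release("part_1")
pool.release("part_2")            # last reference to part_2
after_release = pool.open_bats()  # part_1 still has one reference

pool.release("part_1")
after_all = pool.open_bats()      # nothing is open any more
```

The model only tracks references; it says nothing about when the server actually returns memory to the operating system.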
I am noticing a large amount of open memory (approximately 500MB out of a 1GB system) being used by MonetDB. I have hand partitioned my BATs into groups using a naming convention, but I am wondering if during inserts I am somehow keeping open references to the BATs which may cause issues.
I read in your previous posting about the crashing Mserver that you have about 7500 (possibly large) BATs open, because you just created them and inserted data into them. This can indeed be a problem if the total size of all these BATs gets close to the address space limit (2-4GB on your 32-bit system). Do you really need to have all these BATs active at the same time? Can you use them one by one, or at least in smaller groups? Could you give us a more detailed description of your application/approach/MIL-program? This might help us give you some more advice on how to use MonetDB more efficiently.
Is there a way to find out where memory is being allocated? Which BATs are open?
Try any of these: print(memory()); mem_printmap(); print(vm_usage()); dir(); ls();
Also, I am setting some of the larger BATs to STORE_MMAP. I notice in the documentation that if I compress the heaps, then these must be set to STORE_MEM. Is there a reason that when uncompressing the heap, that the uncompressed version cannot be STORE_MMAP to avoid memory problems?
Sorry, I cannot answer this one right now. I'll ask someone who knows... Regards, Stefan
Regards! Ed
-- | Dr. Stefan Manegold | mailto:Stefan.Manegold@cwi.nl | | CWI, P.O.Box 94079 | http://www.cwi.nl/~manegold/ | | 1090 GB Amsterdam | Tel.: +31 (20) 592-4212 | | The Netherlands | Fax : +31 (20) 592-4312 |
Hi Stefan! On Thu, 21 Apr 2005, Stefan Manegold wrote:
Could you give us a more detailed description of your application/approach/MIL-program? This might help us give you some more advice on how to use MonetDB more efficiently.
The pattern of usage: Read a row of data. Determine the "partition" to stick the data in (generates a number). In one BAT, track all such partitions. If this partition does not exist, then create the BAT, rename it, set the base, etc. Store the name in the tracking BAT. Insert the data into the data partition BAT. Every 10,000 rows, do a commit.
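The loading pattern described above can be sketched roughly as follows. This is a Python stand-in rather than the actual MIL program: `load_rows`, `partition_of`, the `part_<n>` naming scheme, and the in-memory dict and list are all assumptions made for illustration, and the commit is only counted, not performed.

```python
def load_rows(rows, partition_of, commit_every=10_000):
    """Sketch of the described pattern with plain Python containers."""
    tracking = []     # stands in for the one BAT that tracks partition names
    partitions = {}   # partition name -> rows; stands in for the named BATs
    commits = 0       # how many times a commit would have been issued

    for count, row in enumerate(rows, start=1):
        # Determine the partition number and derive a (hypothetical) name.
        name = f"part_{partition_of(row)}"
        if name not in partitions:
            # Partition does not exist yet: create it and record its name.
            partitions[name] = []
            tracking.append(name)
        # Look the partition up by name for each row; no reference is kept.
        partitions[name].append(row)
        if count % commit_every == 0:
            commits += 1

    return tracking, partitions, commits


# Small usage example: 25 rows, 3 partitions, a commit every 10 rows.
tracking, partitions, commits = load_rows(
    range(25), lambda row: row % 3, commit_every=10
)
```

Looking each partition up by name per row, instead of holding a long-lived variable for it, matches the idea of storing BAT names rather than BAT references in the tracking structure.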
Hi Ed, sorry for the delayed reply ...
From your description, after each row, I do not have a variable that is keeping the BAT open (I always use "bat("name").XXX", or keep a ref variable in a block). So, as far as I can tell, I should not have 7500 BATs open at once.
Am I misunderstanding?
No, as far as I can see from your description, everything looks fine. Unfortunately, I cannot yet see where your memory problems come from. Would it be possible to provide us with your MIL code, or at least the crucial parts? Maybe you could even send us a kind of trace or log of the actions that lead to the crash?
Or is there some underlying component/module keeping the BATs open?
No, none that I'm aware of.
Or is this just the amount of memory MonetDB will use?
Well, could be, but to be sure, I'd need more details of your usage of MonetDB (see above).
I do not have a BAT of BATs (as I was sure that this would cause problems, and from your description, will indeed).
Good. BATs of BATs are indeed considered "problematic" (if not "evil"); better use BATs of bat-names (I guess that's what you do with your one BAT that tracks all the partitions?).
Stefan
participants (2)
- Edmund Dengler
- Stefan.Manegold@cwi.nl