BAT Buffer Pool functionality
Hi,
I am reading the MonetDB code to understand the functionality of the BAT Buffer Pool (BBP). As I understand it, the BBP does not allocate memory for the actual BATs; BATs are first instantiated in heap memory using BATcreatedesc and BATnewstorage, where BATcreatedesc also registers the new BAT in the BBP. The BBP only allocates heap memory for the bookkeeping information about the BATs and for the directory structure. I understood from the code that the BBP keeps track of the HOT/COLD state of BATs and uses BBPTrim to unload them if memory is growing or the number of descriptors is increasing. Does "descriptors" mean file descriptors here?
Questions:
1) Does MonetDB allocate a large heap space internally and then reuse that space to hand out memory chunks to BATs on demand, or does it ask the operating system to extend the heap every time a new BAT is created?
2) I want to understand how the BBP manages heap memory. When a BAT is no longer needed, what does the BBP do with that memory? Does it return the memory to the OS and shrink the heap used by MonetDB, or does it hold on to the space for future allocations?
3) A typical TPCH query generates hundreds of intermediate BATs. Does MonetDB allocate memory for those BATs during query execution and return it to the operating system when the query finishes?
4) When MonetDB starts, how much initial heap memory does the system allocate? Does the BBP allocate this memory on the heap? If not, which part of the code knows which BATs to load at start time and how much heap memory needs to be allocated for them?
I found the list of all BATs in MonetDB through the bbp.get() MAL call. I also understand that the SQL catalog maintains the mapping from tables to column names and from column names to BATs in the BBP. But I could not find which BAT I should access through MAL code to get the SQL catalog mappings. I want to use MAL to see which table is composed of which columns, and which column is linked to which BAT in the BBP.
Thanks for the help. Kind Regards, Ahmad
Hi,
We appreciate that people take the effort to study code produced over a period of 20 years. However, understanding the complete software stack and its rationale is not something we can answer in a timely manner on a mailing list. The same holds for the many alternative implementations that might be, or have been, considered in the code base.
We welcome discussions to (re)solve specific critical issues, provided they materialize at the SQL level, and to a lesser extent at the MAL layer. The latter is under severe reorganization. Furthermore, any such issue is subordinate to our prime research mission and activities.
If a detailed study/explanation of the code base is needed for a planned technical improvement, I have to refer you to the professional services of MonetDB BV, which has limited manpower for such educational activities.
regards, Martin Kersten
ps. Of course we welcome our external users and developers to cast a light on the issues where appropriate.
On 6/4/13 9:22 PM, Hassan, Ahmad wrote:
Hi,
I am reading the MonetDB code to understand the functionality of the BAT Buffer Pool (BBP). As I understand it, the BBP does not allocate memory for the actual BATs; BATs are first instantiated in heap memory using BATcreatedesc and BATnewstorage, where BATcreatedesc also registers the new BAT in the BBP. The BBP only allocates heap memory for the bookkeeping information about the BATs and for the directory structure. I understood from the code that the BBP keeps track of the HOT/COLD state of BATs and uses BBPTrim to unload them if memory is growing or the number of descriptors is increasing. Does "descriptors" mean file descriptors here?
Page mapping is done automatically by the OS.
Questions:
1) Does MonetDB allocate a large heap space internally and then reuse that space to hand out memory chunks to BATs on demand, or does it ask the operating system to extend the heap every time a new BAT is created?
Heaps are mostly memory mapped files.
2) I want to understand how the BBP manages heap memory. When a BAT is no longer needed, what does the BBP do with that memory? Does it return the memory to the OS and shrink the heap used by MonetDB, or does it hold on to the space for future allocations?
Memory mapped files are released to the OS, except for a few
3) A typical TPCH query generates hundreds of intermediate BATs. Does MonetDB allocate memory for those BATs during query execution and return it to the operating system when the query finishes?
yes
4) When MonetDB starts, how much initial heap memory does the system allocate? Does the BBP allocate this memory on the heap? If not, which part of the code knows which BATs to load at start time and how much heap memory needs to be allocated for them?
It is all determined dynamically; there is no such thing as a fixed memory heap size.
I found the list of all BATs in MonetDB through the bbp.get() MAL call. I also understand that the SQL catalog maintains the mapping from tables to column names and from column names to BATs in the BBP. But I could not find which BAT I should access through MAL code to get the SQL catalog mappings. I want to use MAL to see which table is composed of which columns, and which column is linked to which BAT in the BBP.
The SQL catalog is a database by itself; the BBP has no knowledge about SQL and never will.
For a given SQL database you can find the corresponding BATs using 'SELECT location FROM storage()'
Succes, Martin
Thanks for the help.
Kind Regards, Ahmad
_______________________________________________ users-list mailing list users-list@monetdb.org http://mail.monetdb.org/mailman/listinfo/users-list
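As a rough illustration of the last answer, the sketch below groups rows shaped like the output of 'SELECT location FROM storage()' into a table-to-column-to-location mapping. It assumes the query is extended to also return schema, table, and column names; that column set, the helper name `group_by_table`, and the sample rows are all assumptions made for illustration, not output from a real database.

```python
from collections import defaultdict

# Hypothetical rows shaped like (schema, table, column, location); the values
# are made up for illustration and are not taken from a real database.
SAMPLE_ROWS = [
    ("sys", "lineitem", "l_orderkey", "10/1001"),
    ("sys", "lineitem", "l_quantity", "10/1002"),
    ("sys", "orders", "o_orderkey", "10/1010"),
]


def group_by_table(rows):
    """Return {(schema, table): {column: location}} from storage()-like rows."""
    mapping = defaultdict(dict)
    for schema, table, column, location in rows:
        mapping[(schema, table)][column] = location
    return {key: dict(cols) for key, cols in mapping.items()}


if __name__ == "__main__":
    for (schema, table), cols in group_by_table(SAMPLE_ROWS).items():
        for column, location in cols.items():
            print(f"{schema}.{table}.{column} -> {location}")
```

In practice the rows would come from a database client rather than a hard-coded list; the grouping step stays the same.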
Hi Martin,
Many thanks for the answers. Could I please get a clue about the use of mmap in MonetDB, following up on:
Memory mapped files are released to the OS, except for a few
I am analysing the memory allocation of data structures in MonetDB, and I noticed that there are 10 or fewer mmap allocations during the whole execution of a TPCH query, but thousands (even millions) of object allocations through malloc/realloc. This made me think that the TPCH tables stored on disk are mmap'ed into memory, while all other runtime memory, for the intermediate BATs and other MonetDB operations, is allocated through malloc/realloc. The observation of fewer than 10 mmap calls holds for all 22 TPCH queries at scale factor 40.
Thanks again.
Kind Regards, Ahmad
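As background for the mmap question above, here is a minimal sketch of what "memory mapped file, released to the OS" means at the operating-system level. It is not MonetDB code (MonetDB is written in C); it uses Python's standard `mmap` module, and the function name `map_write_release` is made up for this example.

```python
import mmap
import os
import tempfile


def map_write_release(size=4096):
    """Map a temporary file, write through the mapping, then unmap it.

    Returns the bytes read back from the file after the mapping is closed.
    """
    fd, path = tempfile.mkstemp()
    try:
        os.ftruncate(fd, size)        # give the file a size that can be mapped
        view = mmap.mmap(fd, size)    # file-backed, writable mapping
        view[:5] = b"hello"           # write through the mapping
        view.flush()                  # push the dirty pages to the file
        view.close()                  # unmap: the address range returns to the OS
        with open(path, "rb") as handle:
            return handle.read(5)
    finally:
        os.close(fd)
        os.remove(path)


if __name__ == "__main__":
    print(map_write_release())
```

The data lives in the file, so the process can drop the mapping at any time without losing it, which is the property that makes file-backed heaps easy to hand back to the operating system.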
Every MAL instruction comes with a bunch of malloc/free calls. It starts with its parsing and ends with the last optimizer. During query processing, mallocs may also be needed to deal with strings taken out of BATs. A simple strace (or mserver5 --memory) will show you what happens.
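To act on the strace suggestion, a small script can tally the memory-related system calls in a captured trace. The sketch below is only an illustration: the sample trace lines are invented, and the helper name `count_memory_syscalls` is hypothetical. Note that malloc is a library call, not a system call, so it never appears in strace output directly; on glibc it shows up indirectly as `brk` calls or, for large requests, as anonymous `mmap` calls.

```python
import re
from collections import Counter

# Matches the syscall name at the start of an strace line, allowing an
# optional "[pid N]" or bare "N" prefix as printed when following children.
SYSCALL_RE = re.compile(r"^(?:\[pid\s+\d+\]\s+|\d+\s+)?(\w+)\(")

# Invented sample lines in strace's usual format, for illustration only.
SAMPLE_TRACE = """\
brk(NULL) = 0x55d0c8a3b000
mmap(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f1c2a000000
[pid  4242] mmap(NULL, 1048576, PROT_READ, MAP_SHARED, 7, 0) = 0x7f1c29f00000
brk(0x55d0c8a5c000) = 0x55d0c8a5c000
[pid  4242] munmap(0x7f1c29f00000, 1048576) = 0
"""


def count_memory_syscalls(trace_text, names=("mmap", "munmap", "mremap", "brk")):
    """Count occurrences of the given syscall names in strace output."""
    counts = Counter()
    for line in trace_text.splitlines():
        match = SYSCALL_RE.match(line.strip())
        if match and match.group(1) in names:
            counts[match.group(1)] += 1
    return counts


if __name__ == "__main__":
    for name, total in sorted(count_memory_syscalls(SAMPLE_TRACE).items()):
        print(name, total)
```

Distinguishing file-backed from anonymous mappings (the `MAP_ANONYMOUS` flag and the file descriptor argument) would be a natural next step when comparing base tables against intermediates.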
participants (2)
- Hassan, Ahmad
- Martin Kersten