[MonetDB-users] Question about Vertical Partitioning and Horizontal Partitioning from newbie [ Need help ]
Hi,

I am a new user who was attracted to MonetDB by the need to run queries in the area of data mining and business intelligence. I have done some research, downloaded MonetDB and played with it, and so far I have had a great experience!

Our situation is that we have transactional data for various stores of the same retailer, going back several years, and we have several machines on which we would like to install MonetDB and run queries in parallel. (For simplicity, let's assume that we have data for 8 stores and that we have 8 machines.)

I have the following questions:

1. I understand that MonetDB automatically performs vertical partitioning on the same machine (correct me if I am wrong). Would it be possible to perform vertical partitioning across several machines? Would doing so make queries faster? Any pointers or guidance would be highly appreciated.

2. I understand that the current release of MonetDB does not yet support horizontal partitioning. However, would it be possible to perform horizontal partitioning manually and put the horizontally partitioned database on different machines (i.e. one store per machine in the example)? Could we then use the tcpip module to communicate with each machine and assemble the BATs together? How effort-intensive is this approach? Any guidance is highly appreciated.

All in all, I guess we are wondering how we can get the most out of MonetDB across multiple machines doing parallel queries.

Thanks,
Natthapol
Hi Natthapol,

Thank you for your interest in MonetDB and your support. We appreciate learning about experiences outside our lab, both negative and positive ;-)

Natthapol Wongsaroj wrote:

> 1. I understand that MonetDB automatically perform vertical partitioning on the same machine (correct me if I am wrong),

Correct.

> would it be possible to perform vertical partitioning across several machines?

At this stage this is not handled by the SQL front-end. However, it is possible to set up a multi-server environment using the MIL intermediate language.

> Will it make the query faster by doing so? Any pointers or guidance would be highly appreciated.

Unless you are an experienced programmer, I would suggest waiting and first seeing how far you get on a single system. You will also experience some performance loss when the database hot-set (the part needed in a query) greatly exceeds your primary memory. You would experience the same loss of performance in other open-source systems as well.

We are in the middle of a process where we attack the distributed processing of SQL. The target application is Skyserver, a demanding 2.5TB application in astronomy. But don't hold your breath.

> 2. As of the current release of MonetDB, I understand that it does not yet support horizontal partitioning, however, I am wondering that would it be possible to manually perform horizontal partitioning and put the horizontally partitioned database on different machines (i.e. each store on each machine in the example)?

Absolutely. But then you have to resolve the distributed query processing in middle-layer software. It is not done automatically by the current release.

> Then can we use tcpip module to communicate to each machine and assemble BAT together?

I would stay at the SQL layer and mimic multiple clients accessing the distributed database. The results (hopefully small) can be handled by a single system.

> How much effort-intensive is this approach? Any guidance is highly appreciated.

Effort depends on the complexity of your application. If you have a few well-defined queries and knowledge of JDBC, it should not take more than a few days to get something running. The benefit would indeed be a faster scan/aggregate operation. Joining and grouping over multiple sites would be more expensive.

> I guess, in all, we are wondering how can we get the most of MonetDB across multiple machines doing parallel query.

Potentially a lot, but we are working on the infrastructure to make it easy for end-users.

regards,
Martin

> Thanks, Natthapol
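For illustration only, the middle-layer approach suggested above (stay at the SQL layer, send the same query to every store's server as a separate client, and combine the small partial results on one system) might be sketched roughly as follows. This is not MonetDB code and not an official recipe: the in-memory sqlite3 databases merely stand in for one MonetDB server per store so that the sketch runs anywhere, and the `sales` table, its columns, and the function names `make_partition`, `partial_aggregate` and `global_average` are all hypothetical.

```python
import sqlite3
from concurrent.futures import ThreadPoolExecutor

# Assumption: one database per store. Each sqlite3 in-memory database is
# only a stand-in for a MonetDB server on its own machine; a real setup
# would open one SQL connection (e.g. via JDBC) per server instead.
def make_partition(store_id, amounts):
    conn = sqlite3.connect(":memory:", check_same_thread=False)
    conn.execute("CREATE TABLE sales (store_id INTEGER, amount REAL)")
    conn.executemany(
        "INSERT INTO sales VALUES (?, ?)",
        [(store_id, amount) for amount in amounts],
    )
    conn.commit()
    return conn

# Each site returns a tiny partial result (row count and sum), so only
# two numbers per store travel back to the middle layer.
PARTIAL_SQL = "SELECT COUNT(*), COALESCE(SUM(amount), 0) FROM sales"

def partial_aggregate(conn):
    return conn.execute(PARTIAL_SQL).fetchone()

def global_average(partitions):
    # Scatter: run the same scan/aggregate query on every partition in parallel.
    with ThreadPoolExecutor(max_workers=len(partitions)) as pool:
        partials = list(pool.map(partial_aggregate, partitions))
    # Gather: merge counts and sums; averaging per-store averages would
    # give the wrong answer when stores have different row counts.
    total_rows = sum(rows for rows, _ in partials)
    total_amount = sum(amount for _, amount in partials)
    return total_amount / total_rows if total_rows else None

partitions = [make_partition(1, [10.0, 20.0]), make_partition(2, [30.0])]
print(global_average(partitions))  # 20.0
```

A scan/aggregate query like this parallelizes well because each site ships back only a count and a sum. A join or a grouping whose groups span several stores would instead require moving many rows between sites, which is why those operations are expected to be more expensive.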
_______________________________________________ MonetDB-users mailing list MonetDB-users@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/monetdb-users
Martin,
Thanks for the insightful information; I really appreciate it. Initially, I was hoping to be able to run generic queries across multiple machines using MonetDB, but from what you said, it seems that this is difficult to accomplish at the moment and that you are working on it right now. I will be looking forward to the next generation of MonetDB.
Regards,
Natthapol
participants (2)
- Martin Kersten
- Natthapol Wongsaroj