Hi all,

I'm using MonetDB on two servers and am seeing some strange behavior.

The two servers have the same configuration:

CPU: Intel(R) Xeon(R) CPU E5-2430L, 6 cores, 12 threads, 2 GHz, 64-bit
Memory: 64 GB
OS: Ubuntu Server 13.04 with the 3.8.0-25-generic x86_64 kernel
MonetDB: v11.15.7

Both were bought less than 4 months ago.

The problem is that the performance of MonetDB is very unstable, even for very simple select * queries.

The experiment setup is as follows:

Data:
Server1: Table1 loaded from a 19 GB CSV file (CSV1).
Server2: Table1 loaded from another 19 GB CSV file (CSV2).

CSV1 and CSV2 have the same schema but different contents. Both are about 19 GB in size and both have 155000000 tuples.

The schema is:
Field1 VARCHAR(16), Field2 VARCHAR(100), Field3 DATE, Field4 FLOAT, Field5 VARCHAR(64), Field6 CHAR(3), Field7 CHAR(6), Field8 VARCHAR(32), Field9 INT

In addition, both tables have indices on Field1, Field2 and Field3.

Experiment query:
echo -e '\\w-1 \n \\> /dev/null \n select * from Table1;' | mclient -d expdatabase --interactive

The performance varies a lot between the servers and also between different runs on the same server. Here are the results:

Server1: 4m 12s
Server2: 305m 29s

The system conditions during the experiment:

Disk speed: I ran the command cat CSV > /dev/null. The speed is normal; it takes about 2m to 2m 30s.

System load: The machine load during the query executions was very low on both servers. The CPUs were almost completely idle and there was plenty of free memory. Each MonetDB server took about 50 GB.

To rule out possible system configuration problems, I did two more experiments.

1) I moved CSV2 to server1, stopped the MonetDB on server1, and built and started a new MonetDB database using CSV2. Then I ran the same query again on the new database. The elapsed time was 134m 8s.

2) After 1) was done, I stopped the new MonetDB, restarted the old MonetDB on server1, and ran the same query again. This time the elapsed time was 388m 37s (compared with 4m 12s in the first run).

Further disk observations: During the above two experiments, I monitored the disk read speed of MonetDB using iotop. It stayed at only about 500 to 550 KB/s, while the read speed of cat Table1.csv > /dev/null stays around 135 MB/s.

I did not change any MonetDB configuration. Is it possible that I misconfigured something?

Thanks.
Victor
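For scale, the disk figures quoted in the post above can be cross-checked with a little arithmetic. The sketch below is only an estimate under stated assumptions: it takes 1 GB = 1024 MB, uses 525 KB/s as the midpoint of the reported iotop range, and `read_seconds` is a hypothetical helper name. It says nothing about how much data MonetDB actually reads from disk for this query, which the thread does not state.

```shell
#!/bin/sh
# Hypothetical helper: seconds needed to read SIZE_MB megabytes
# at RATE_KBS kilobytes per second (integer arithmetic, rounded down).
read_seconds() {
    size_mb=$1
    rate_kbs=$2
    echo $(( size_mb * 1024 / rate_kbs ))
}

# 19 GB CSV file, assuming 1 GB = 1024 MB.
csv_mb=$(( 19 * 1024 ))

# Sequential read with cat at about 135 MB/s.
read_seconds "$csv_mb" $(( 135 * 1024 ))

# Read rate observed for MonetDB in iotop, about 525 KB/s (assumed midpoint).
read_seconds "$csv_mb" 525
```

At 135 MB/s the estimate is 144 s, about 2m 24s, which agrees with the measured cat time of 2m to 2m 30s. At 525 KB/s, reading the same volume would take 37948 s, which is more than ten hours.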
Hi Victor,

thanks for testing and reporting!

I have a few questions and comments:

Did you compile MonetDB yourself or install binary packages? If the former, did you use any configure options, and if so, which?

For "very simple" select * from table queries, I do not see a reason why a DBMS is required, and would prefer and recommend using 'cat file.csv' instead. With such queries, you mainly test the DBMS's ability to serialize (in your case) your entire 19 GB table, to send 19 GB of serialized (textual) data from server to client, and the client's ability to render that amount of data for user-readable display.

Indexes should not (and do not) play any role in such queries; certainly not in MonetDB, where the system automatically and transparently chooses when to build and use indexes.

Having said all that, I still would not expect the performance differences you reported.

Here are some more questions to analyse the problem further:

Does your schema contain string (char, varchar, etc.) columns? If so, could you share the results of "select count(distinct col_name) from table;" for each string column of both tables?

Did / could you run your query on the same table 3 times in a row and report the times of all runs?

Could you run your query prefixed with "TRACE" (on both systems / tables) and share the first result set, i.e., the performance trace?

Thanks!
Stefan
--
| Stefan.Manegold@CWI.nl | DB Architectures (DA) |
| www.CWI.nl/~manegold/ | Science Park 123 (L321) |
| +31 (0)20 592-4212 | 1098 XG Amsterdam (NL) |
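The checks requested in the reply above could be scripted roughly as follows. This is a minimal sketch, not a tested procedure: the database name, table name and string columns are taken from the original post, the function names are hypothetical, and the timing loop simply reuses the mclient invocation from the post. By default the script only prints the SQL statements; the timed runs need a running server and are executed only when the assumed environment variable RUN_QUERIES is set to 1.

```shell
#!/bin/sh
# Names taken from the original post; adjust for the other server/table.
DB=expdatabase
TABLE=Table1

# String (CHAR/VARCHAR) columns from the posted schema.
STRING_COLS="Field1 Field2 Field5 Field6 Field7 Field8"

# Print one COUNT(DISTINCT ...) statement per string column.
distinct_count_sql() {
    for col in $STRING_COLS; do
        echo "SELECT COUNT(DISTINCT $col) FROM $TABLE;"
    done
}

# Print the full-table query prefixed with TRACE.
trace_sql() {
    echo "TRACE SELECT * FROM $TABLE;"
}

# Run the full-table query three times in a row and print the
# wall-clock seconds of each run. Requires mclient and a running server.
time_three_runs() {
    for run in 1 2 3; do
        start=$(date +%s)
        # Same mclient commands as in the original post, result discarded.
        printf '%s\n' '\w-1' '\> /dev/null' "SELECT * FROM $TABLE;" \
            | mclient -d "$DB" --interactive
        end=$(date +%s)
        echo "run $run: $(( end - start )) s"
    done
}

# Default behaviour: only print the statements.
distinct_count_sql
trace_sql

# Opt-in: actually execute the timed runs.
if [ "${RUN_QUERIES:-0}" = "1" ]; then
    time_three_runs
fi
```

The printed statements can be piped into mclient -d expdatabase to run them. Note that the TRACE statement still produces the full result of the query in addition to the trace, so its output should be redirected somewhere with enough space.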