On Thu, Mar 28, 2013 at 06:41:08PM +0000, Bouslah Ahmed wrote:
Good afternoon,
I have 2 questions : - 1st question I have a csv file with 300000000 million lines. ^^^^^^^^^^^^^^^^^ Is that 300000000 aka. 300 million aka. 3*10^8 or indeed 300000000 million aka. 300 billion aka. 3*10^14 ?
when I load it in a MonetDB table, it takes more than 16 hours, whereas in Oracle it only takes 15 minutes. Where is the problem? I use mclient as follows :? mclient -u user1 -d demo -ims -s " COPY 300000009 RECORDS INTO LINEORDER FROM STDIN USING DELIMITERS '|','\n'" - < D:\SSB\LINEORDER.tbl
Is the client running on the same machine as the server? Did/could you try my earlier suggestion of loading directly from file on the server side (requires to use default admin user "monetdb"), as opposed to from STDIN, i.e., passing all data through mclient?
?for information I use :
MonetDB5
Which release? Feb2013-SP1?
Windows Memory : 16 Go
How many CPU cores / HW threads?
There are two constraints : primary key and foreign key on the table prior to loading data?
Did/could you try creating the table without constraints, load the data, and then add the constraints via ALTER TABLE statements. Measure the time for both loading the data and adding the constraints separately. With MonetDB Feb2013-SP1, on an 8-core 16 GB Linux machine, we load the lineorder table of SSBM SF-100 (~600 million rows) in ~30 minutes, and create (and validate) the keys in ~60 minutes).
2nd question : how inner joins are made in MongoDB? ^^^^^^^ I have no idea. Please ask the MongoDB people.
In MonetDB, we usually use hash joins.
cordialy.
Stefan -- | Stefan.Manegold@CWI.nl | DB Architectures (DA) | | www.CWI.nl/~manegold | Science Park 123 (L321) | | +31 (0)20 592-4212 | 1098 XG Amsterdam (NL) |