Howdy!
Did the deletes using the "BAT.delete()" function (the one fell swoop
method).
- Ran the Mserver with modules: module(mapi,blob,monettime);
- One run of the database was created by:
+----------------------------------
bat(void, str).rename("entity__aggregator__name").persists(TRUE).seqbase(oid(1));
bat(void, str).rename("entity__host__name").persists(TRUE).seqbase(oid(1));
bat(void, str).rename("entity__log__name").persists(TRUE).seqbase(oid(1));
bat(void, str).rename("entity__host_agent__guid").persists(TRUE).seqbase(oid(1));
bat(void, oid).rename("entity__host_agent__host").persists(TRUE).seqbase(oid(1));
bat(lng, oid).rename("eventlog__raw_record__host").persists(TRUE);
bat(lng, oid).rename("eventlog__raw_record__log").persists(TRUE);
bat(lng, oid).rename("eventlog__raw_record__host_agent").persists(TRUE);
bat(lng, oid).rename("eventlog__raw_record__aggregator").persists(TRUE);
bat(lng, blob).rename("eventlog__raw_record__record").persists(TRUE);
bat(lng, timestamp).rename("eventlog__raw_record__time_logged").persists(TRUE);
commit();
+----------------------------------
- Then I run some scripts to load up the first 5 tables (3-50 rows each).
- Then I run a script to import 250k rows into the last 6 tables. The oid
entries point back to the first 5 tables. Each row is inserted using
a single "execute()" where the execute consists of 6 MIL statements
(these perform "BAT.insert()" as well as lookups to find the OIDs for
various pieces of information, such as a host's name -> OID).
- Sorry, no raw data, but easily emulated with some random functions.
- The rough cleanup code:
+----------------------------------
bat("eventlog__raw_record__host").delete();
bat("eventlog__raw_record__log").delete();
bat("eventlog__raw_record__host_agent").delete();
bat("eventlog__raw_record__aggregator").delete();
bat("eventlog__raw_record__record").delete();
bat("eventlog__raw_record__time_logged").delete();
commit();
+----------------------------------
- Everything is fast til the commit, then _BAM!_, 100% CPU.
- If I do everything except bat("eventlog__raw_record__record"), and
commit, everything is good (instantaneous response).
- If I then do
Hi Ed,
first of all thanks for being interested in MonetDB ;-)
A delete of all BUNs in a BAT (i.e., "bat("bat_name").delete();") should actually work "instantly", at least for the BATs with simple types ("atoms) like INT, OID, TIMESTAMP. For complex types such as BLOB, the delete might be more expensive, due to the internal management of such atoms.
Moreover, a sequence of deletions of single BUNs ("bat("bat_name").delete(head_value);", "bat("bat_name").delete(head_value, tail_value);") will usually be more expensive than a deletion of all BUNs at once, but even then I don't expect more than an hour for 250k BUNs...
Hence, could you please provide us with the following additional information:
- Do you indeed delete all BUNs at once, i.e., using delete() with no other parameter than the BAT?
- Do the deletes on all BATs take that long, or is it only the delete on the BLOB BAT, while the others work "instantly" as expected?
- How big is each of your BLOBs (on average)?
- Are your BATs persistent or transient?
These information should help us to better analyse your problem. It would be perfect, if you could even provide us with the scripts that you use, so that we could replay your scenario.
Regards from A'dam,
Stefan
Howdy!
I'm taking some time to see what MonetDB is all about, and I have some questions re performance.
I created 6 BATs (basically BAT[LNG, X] where X = INT, OID, TIMESTAMP, BLOB) to test. I loaded up about 250k rows (loading was quite reasonable, even though I was using the Perl interface to do it one row at a time). I am currently trying to delete the BUNs by doing a series of "bat("bat_name").delete();", and it is just taking forever (currently at over an hour!!). I/O is extremely minimal, CPU is nailed to 100% (this is running the latest release candidate off of sourceforge on a Linux 2.6 Gentoo system).
Questions: Should delete be so slow? Is there a dependency between tables that is somehow forcing an N^2 or N^3 algorithm to run? Is there an equivalent to "truncate" other than destroy() and recreate?
Regards! Ed
------------------------------------------------------- SF email is sponsored by - The IT Product Guide Read honest & candid reviews on hundreds of IT Products from real users. Discover which products truly live up to the hype. Start reading now. http://ads.osdn.com/?ad_id=6595&alloc_id=14396&op=click _______________________________________________ MonetDB-users mailing list MonetDB-users@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/monetdb-users
-- | Dr. Stefan Manegold | mailto:Stefan.Manegold@cwi.nl | | CWI, P.O.Box 94079 | http://www.cwi.nl/~manegold/ | | 1090 GB Amsterdam | Tel.: +31 (20) 592-4212 | | The Netherlands | Fax : +31 (20) 592-4312 |
------------------------------------------------------- SF email is sponsored by - The IT Product Guide Read honest & candid reviews on hundreds of IT Products from real users. Discover which products truly live up to the hype. Start reading now. http://ads.osdn.com/?ad_id=6595&alloc_id=14396&op=click _______________________________________________ MonetDB-users mailing list MonetDB-users@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/monetdb-users