[MonetDB-users] Delete is slow?
Howdy! I'm taking some time to see what MonetDB is all about, and I have some questions re performance. I created 6 BATs (basically BAT[LNG, X] where X = INT, OID, TIMESTAMP, BLOB) to test. I loaded up about 250k rows (loading was quite reasonable, even though I was using the Perl interface to do it one row at a time). I am currently trying to delete the BUNs by doing a series of "bat("bat_name").delete();", and it is just taking forever (currently at over an hour!!). I/O is extremely minimal, CPU is nailed to 100% (this is running the latest release candidate off of sourceforge on a Linux 2.6 Gentoo system). Questions: Should delete be so slow? Is there a dependency between tables that is somehow forcing an N^2 or N^3 algorithm to run? Is there an equivalent to "truncate" other than destroy() and recreate? Regards! Ed
Hi Ed, first of all thanks for being interested in MonetDB ;-) A delete of all BUNs in a BAT (i.e., "bat("bat_name").delete();") should actually work "instantly", at least for the BATs with simple types ("atoms) like INT, OID, TIMESTAMP. For complex types such as BLOB, the delete might be more expensive, due to the internal management of such atoms. Moreover, a sequence of deletions of single BUNs ("bat("bat_name").delete(head_value);", "bat("bat_name").delete(head_value, tail_value);") will usually be more expensive than a deletion of all BUNs at once, but even then I don't expect more than an hour for 250k BUNs... Hence, could you please provide us with the following additional information: - Do you indeed delete all BUNs at once, i.e., using delete() with no other parameter than the BAT? - Do the deletes on all BATs take that long, or is it only the delete on the BLOB BAT, while the others work "instantly" as expected? - How big is each of your BLOBs (on average)? - Are your BATs persistent or transient? These information should help us to better analyse your problem. It would be perfect, if you could even provide us with the scripts that you use, so that we could replay your scenario. Regards from A'dam, Stefan
Howdy!
I'm taking some time to see what MonetDB is all about, and I have some questions re performance.
I created 6 BATs (basically BAT[LNG, X] where X = INT, OID, TIMESTAMP, BLOB) to test. I loaded up about 250k rows (loading was quite reasonable, even though I was using the Perl interface to do it one row at a time). I am currently trying to delete the BUNs by doing a series of "bat("bat_name").delete();", and it is just taking forever (currently at over an hour!!). I/O is extremely minimal, CPU is nailed to 100% (this is running the latest release candidate off of sourceforge on a Linux 2.6 Gentoo system).
Questions: Should delete be so slow? Is there a dependency between tables that is somehow forcing an N^2 or N^3 algorithm to run? Is there an equivalent to "truncate" other than destroy() and recreate?
Regards! Ed
------------------------------------------------------- SF email is sponsored by - The IT Product Guide Read honest & candid reviews on hundreds of IT Products from real users. Discover which products truly live up to the hype. Start reading now. http://ads.osdn.com/?ad_id=6595&alloc_id=14396&op=click _______________________________________________ MonetDB-users mailing list MonetDB-users@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/monetdb-users
-- | Dr. Stefan Manegold | mailto:Stefan.Manegold@cwi.nl | | CWI, P.O.Box 94079 | http://www.cwi.nl/~manegold/ | | 1090 GB Amsterdam | Tel.: +31 (20) 592-4212 | | The Netherlands | Fax : +31 (20) 592-4312 |
Howdy!
Did the deletes using the "BAT.delete()" function (the one fell swoop
method).
- Ran the Mserver with modules: module(mapi,blob,monettime);
- One run of the database was created by:
+----------------------------------
bat(void, str).rename("entity__aggregator__name").persists(TRUE).seqbase(oid(1));
bat(void, str).rename("entity__host__name").persists(TRUE).seqbase(oid(1));
bat(void, str).rename("entity__log__name").persists(TRUE).seqbase(oid(1));
bat(void, str).rename("entity__host_agent__guid").persists(TRUE).seqbase(oid(1));
bat(void, oid).rename("entity__host_agent__host").persists(TRUE).seqbase(oid(1));
bat(lng, oid).rename("eventlog__raw_record__host").persists(TRUE);
bat(lng, oid).rename("eventlog__raw_record__log").persists(TRUE);
bat(lng, oid).rename("eventlog__raw_record__host_agent").persists(TRUE);
bat(lng, oid).rename("eventlog__raw_record__aggregator").persists(TRUE);
bat(lng, blob).rename("eventlog__raw_record__record").persists(TRUE);
bat(lng, timestamp).rename("eventlog__raw_record__time_logged").persists(TRUE);
commit();
+----------------------------------
- Then I run some scripts to load up the first 5 tables (3-50 rows each).
- Then I run a script to import 250k rows into the last 6 tables. The oid
entries point back to the first 5 tables. Each row is inserted using
a single "execute()" where the execute consists of 6 MIL statements
(these perform "BAT.insert()" as well as lookups to find the OIDs for
various pieces of information, such as a host's name -> OID).
- Sorry, no raw data, but easily emulated with some random functions.
- The rough cleanup code:
+----------------------------------
bat("eventlog__raw_record__host").delete();
bat("eventlog__raw_record__log").delete();
bat("eventlog__raw_record__host_agent").delete();
bat("eventlog__raw_record__aggregator").delete();
bat("eventlog__raw_record__record").delete();
bat("eventlog__raw_record__time_logged").delete();
commit();
+----------------------------------
- Everything is fast til the commit, then _BAM!_, 100% CPU.
- If I do everything except bat("eventlog__raw_record__record"), and
commit, everything is good (instantaneous response).
- If I then do
Hi Ed,
first of all thanks for being interested in MonetDB ;-)
A delete of all BUNs in a BAT (i.e., "bat("bat_name").delete();") should actually work "instantly", at least for the BATs with simple types ("atoms) like INT, OID, TIMESTAMP. For complex types such as BLOB, the delete might be more expensive, due to the internal management of such atoms.
Moreover, a sequence of deletions of single BUNs ("bat("bat_name").delete(head_value);", "bat("bat_name").delete(head_value, tail_value);") will usually be more expensive than a deletion of all BUNs at once, but even then I don't expect more than an hour for 250k BUNs...
Hence, could you please provide us with the following additional information:
- Do you indeed delete all BUNs at once, i.e., using delete() with no other parameter than the BAT?
- Do the deletes on all BATs take that long, or is it only the delete on the BLOB BAT, while the others work "instantly" as expected?
- How big is each of your BLOBs (on average)?
- Are your BATs persistent or transient?
These information should help us to better analyse your problem. It would be perfect, if you could even provide us with the scripts that you use, so that we could replay your scenario.
Regards from A'dam,
Stefan
Howdy!
I'm taking some time to see what MonetDB is all about, and I have some questions re performance.
I created 6 BATs (basically BAT[LNG, X] where X = INT, OID, TIMESTAMP, BLOB) to test. I loaded up about 250k rows (loading was quite reasonable, even though I was using the Perl interface to do it one row at a time). I am currently trying to delete the BUNs by doing a series of "bat("bat_name").delete();", and it is just taking forever (currently at over an hour!!). I/O is extremely minimal, CPU is nailed to 100% (this is running the latest release candidate off of sourceforge on a Linux 2.6 Gentoo system).
Questions: Should delete be so slow? Is there a dependency between tables that is somehow forcing an N^2 or N^3 algorithm to run? Is there an equivalent to "truncate" other than destroy() and recreate?
Regards! Ed
------------------------------------------------------- SF email is sponsored by - The IT Product Guide Read honest & candid reviews on hundreds of IT Products from real users. Discover which products truly live up to the hype. Start reading now. http://ads.osdn.com/?ad_id=6595&alloc_id=14396&op=click _______________________________________________ MonetDB-users mailing list MonetDB-users@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/monetdb-users
-- | Dr. Stefan Manegold | mailto:Stefan.Manegold@cwi.nl | | CWI, P.O.Box 94079 | http://www.cwi.nl/~manegold/ | | 1090 GB Amsterdam | Tel.: +31 (20) 592-4212 | | The Netherlands | Fax : +31 (20) 592-4312 |
------------------------------------------------------- SF email is sponsored by - The IT Product Guide Read honest & candid reviews on hundreds of IT Products from real users. Discover which products truly live up to the hype. Start reading now. http://ads.osdn.com/?ad_id=6595&alloc_id=14396&op=click _______________________________________________ MonetDB-users mailing list MonetDB-users@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/monetdb-users
Hi Ed, thanks for you detailed info!
Howdy!
Did the deletes using the "BAT.delete()" function (the one fell swoop method).
right.
- Ran the Mserver with modules: module(mapi,blob,monettime);
- One run of the database was created by:
+---------------------------------- bat(void, str).rename("entity__aggregator__name").persists(TRUE).seqbase(oid(1)); bat(void, str).rename("entity__host__name").persists(TRUE).seqbase(oid(1)); bat(void, str).rename("entity__log__name").persists(TRUE).seqbase(oid(1)); bat(void, str).rename("entity__host_agent__guid").persists(TRUE).seqbase(oid(1)); bat(void, oid).rename("entity__host_agent__host").persists(TRUE).seqbase(oid(1));
bat(lng, oid).rename("eventlog__raw_record__host").persists(TRUE); bat(lng, oid).rename("eventlog__raw_record__log").persists(TRUE); bat(lng, oid).rename("eventlog__raw_record__host_agent").persists(TRUE); bat(lng, oid).rename("eventlog__raw_record__aggregator").persists(TRUE); bat(lng, blob).rename("eventlog__raw_record__record").persists(TRUE); bat(lng, timestamp).rename("eventlog__raw_record__time_logged").persists(TRUE); commit(); +----------------------------------
looks fine to me (using void's like a MonetDB expert! ;-)
- Then I run some scripts to load up the first 5 tables (3-50 rows each).
- Then I run a script to import 250k rows into the last 6 tables. The oid entries point back to the first 5 tables. Each row is inserted using a single "execute()" where the execute consists of 6 MIL statements (these perform "BAT.insert()" as well as lookups to find the OIDs for various pieces of information, such as a host's name -> OID).
As such, using BAT.insert() is fine, at least if you get your input date one-by-one. However, if you're actually bulk-loading the 250k rows into the last 6 BATs, ans you have the respective data in some plain format (e.g., text file with comma separated value), you should better consider using the bulkload features of module ascii_io. Also, the 6x250k lookups in to the first 5 BATs might be solved in a bulk-fashion, at least as far as I can interprete your problem right now. Anyway, you could provide us some more background info, e.g., the "ratio" of your inserts, we could provide you with some more help, here.
- Sorry, no raw data, but easily emulated with some random functions.
- The rough cleanup code:
+---------------------------------- bat("eventlog__raw_record__host").delete(); bat("eventlog__raw_record__log").delete(); bat("eventlog__raw_record__host_agent").delete(); bat("eventlog__raw_record__aggregator").delete(); bat("eventlog__raw_record__record").delete(); bat("eventlog__raw_record__time_logged").delete(); commit(); +----------------------------------
- Everything is fast til the commit, then _BAM!_, 100% CPU.
- If I do everything except bat("eventlog__raw_record__record"), and commit, everything is good (instantaneous response).
- If I then do
, it's fine until . - Thought: could the system be compacting the heap? But then, I would think I would be seeing I/O.
Any thoughts?
Yep, well, not me, but Niels Nes: Your info that its actually the commit on the blob that caused the problems, helped Niels to find and fix a bug in commit for complex types. The bug is fixed in the MonetDB_4-6 branch in CVS. We're not completely sure, whether this bug actaully caused your problem, but we do think so. Unfortunately, we haven't had the time/chance to replay your experiment to check on this. Hence, if you could cvs checkout/update and recompile your MonetDB, and re-run your test, you could check it yourself. Would be great. If cvs checkout/update it not a convenient option for you, we could also provide you with the respective patch, only, but also this requires that you use the source distribution of MonetDB. If neither in an option since you're using the binary distribution, you'll have to wait until we release the final MonetDB 4.6 (hopefully on Monday). In that case, we'll have to find some time to replay your experients before Monday... So much for now. I'll come back to your "Side notes:" separately in a minute.. Stefan -- | Dr. Stefan Manegold | mailto:Stefan.Manegold@cwi.nl | | CWI, P.O.Box 94079 | http://www.cwi.nl/~manegold/ | | 1090 GB Amsterdam | Tel.: +31 (20) 592-4212 | | The Netherlands | Fax : +31 (20) 592-4312 |
Greetings! CVS should be fine (I'm working with the source tarball already). I should get a chance this evening to rerun the tests. Regards! Ed On Thu, 10 Feb 2005, Stefan Manegold wrote:
Yep, well, not me, but Niels Nes: Your info that its actually the commit on the blob that caused the problems, helped Niels to find and fix a bug in commit for complex types. The bug is fixed in the MonetDB_4-6 branch in CVS. We're not completely sure, whether this bug actaully caused your problem, but we do think so. Unfortunately, we haven't had the time/chance to replay your experiment to check on this. Hence, if you could cvs checkout/update and recompile your MonetDB, and re-run your test, you could check it yourself. Would be great. If cvs checkout/update it not a convenient option for you, we could also provide you with the respective patch, only, but also this requires that you use the source distribution of MonetDB. If neither in an option since you're using the binary distribution, you'll have to wait until we release the final MonetDB 4.6 (hopefully on Monday). In that case, we'll have to find some time to replay your experients before Monday...
So much for now. I'll come back to your "Side notes:" separately in a minute..
Stefan
-- | Dr. Stefan Manegold | mailto:Stefan.Manegold@cwi.nl | | CWI, P.O.Box 94079 | http://www.cwi.nl/~manegold/ | | 1090 GB Amsterdam | Tel.: +31 (20) 592-4212 | | The Netherlands | Fax : +31 (20) 592-4312 |
Greetings! On Thu, 10 Feb 2005, Stefan Manegold wrote:
Yep, well, not me, but Niels Nes: Your info that its actually the commit on the blob that caused the problems, helped Niels to find and fix a bug in commit for complex types. The bug is fixed in the MonetDB_4-6 branch in CVS. We're not completely sure, whether this bug actaully caused your problem, but we do think so. Unfortunately, we haven't had the time/chance to replay your experiment to check on this. Hence, if you could cvs checkout/update and recompile your MonetDB, and re-run your test, you could check it yourself. Would be great. If cvs checkout/update it not a convenient option for you, we could also provide you with the respective patch, only, but also this requires that you use the source distribution of MonetDB. If neither in an option since you're using the binary distribution, you'll have to wait until we release the final MonetDB 4.6 (hopefully on Monday). In that case, we'll have to find some time to replay your experients before Monday...
Applied the patch, tried the same test. << commit(); >> after << bat("eventlog__raw_record__record").delete(); >> now takes about a second. Doing a << bat("eventlog__raw_record__record").print(); >> shows no BUNs. However, I am left with a 111MB X.theap. Now when I tried the following (as a way of getting rid of a BAT): mil>bat("eventlog__raw_record__host").persists(FALSE); mil>bat("eventlog__raw_record__log").persists(FALSE); mil>bat("eventlog__raw_record__host_agent").persists(FALSE); mil>bat("eventlog__raw_record__aggregator").persists(FALSE); mil>bat("eventlog__raw_record__record").persists(FALSE); mil>bat("eventlog__raw_record__time_logged").persists(FALSE); mil>commit(); MAPI = anonymous@localhost:50000 QUERY = commit(); ERROR = !ERROR: BBPdecref: eventlog__raw_record__record does not have pointer fixes. Is this a real error, or a bug (ie, shouldn't happen)? Aside: Anybody can suggest a better/cleaner way to get rid of a BAT (preferably leaving the definition, but as completely empty as possible file)? Regards! Ed
On Thu, Feb 10, 2005 at 08:09:42PM -0500, Edmund Dengler wrote:
Greetings!
On Thu, 10 Feb 2005, Stefan Manegold wrote:
Yep, well, not me, but Niels Nes: Your info that its actually the commit on the blob that caused the problems, helped Niels to find and fix a bug in commit for complex types. The bug is fixed in the MonetDB_4-6 branch in CVS. We're not completely sure, whether this bug actaully caused your problem, but we do think so. Unfortunately, we haven't had the time/chance to replay your experiment to check on this. Hence, if you could cvs checkout/update and recompile your MonetDB, and re-run your test, you could check it yourself. Would be great. If cvs checkout/update it not a convenient option for you, we could also provide you with the respective patch, only, but also this requires that you use the source distribution of MonetDB. If neither in an option since you're using the binary distribution, you'll have to wait until we release the final MonetDB 4.6 (hopefully on Monday). In that case, we'll have to find some time to replay your experients before Monday...
Applied the patch, tried the same test. << commit(); >> after << bat("eventlog__raw_record__record").delete(); >> now takes about a second. Doing a << bat("eventlog__raw_record__record").print(); >> shows no BUNs. However, I am left with a 111MB X.theap. The heap doesn't shrink, but inserting again a similar number should reuse the space without need for enlarging it.
Now when I tried the following (as a way of getting rid of a BAT):
mil>bat("eventlog__raw_record__host").persists(FALSE); mil>bat("eventlog__raw_record__log").persists(FALSE); mil>bat("eventlog__raw_record__host_agent").persists(FALSE); mil>bat("eventlog__raw_record__aggregator").persists(FALSE); mil>bat("eventlog__raw_record__record").persists(FALSE); mil>bat("eventlog__raw_record__time_logged").persists(FALSE); mil>commit(); MAPI = anonymous@localhost:50000 QUERY = commit(); ERROR = !ERROR: BBPdecref: eventlog__raw_record__record does not have pointer fixes.
Is this a real error, or a bug (ie, shouldn't happen)? That is a bug, I'll try to fix it.
Regards! Ed
Regards Niels -- Niels Nes, Centre for Mathematics and Computer Science (CWI) Kruislaan 413, 1098 SJ Amsterdam, The Netherlands room C0.02, phone ++31 20 592-4098, fax ++31 20 592-4312 url: http://www.cwi.nl/~niels e-mail: Niels.Nes@cwi.nl
Hi, again ;-)
Side notes: - I had to make some mods to get the monetdb.pm Perl module to work. In particular, in MIL mode you need to use double-quotes ("") rather than single quotes ('') when wrapping a parameter so that it can be translated (ie, it is a string, not a character).
Since I'm not a perl expert, I feel free to forward this to Steffen Goeldner, who is generously taking care of the Perl interface of MonetDB. Steffen, I hope, you don't mind, if I put his on your desk, do you? I would be great, I you could have a look at this problem. You can find the whole thread at http://sourceforge.net/mailarchive/forum.php?thread_id=6536996&forum_id=42505 If necessary, you can probabaly ask Ed for his patch... Thank you very much in advance! [ The remainder is not Perl-related, but I didn't want to split this mail in two. Sorry for the inconvenience...]
- I was able to rename a BAT to a name with a "." in it, but then was unable to refer to the BAT using the new name.
Thanks. This is actually also a bug that hasn't been detected, yet. Niels is also working on this one. It will be fixed in the final release. We will then disallow "strange characters" such as '.' in BAT names.
- I tried a load where I committed after every 100 rows. Noticed _huge_ I/O surges. Looking at the subdirectory, it looked like the X.theap file was being rewritten over and over from scratch. If this is a memory-mapped file, why would this occur? When you start playing with 55MB heap files, this kills performance.
For now, we cannot do much about that, and I suppose, there'll never be any change. Basically, we have to write the whole file to make sure that all changes are properly committed. But why do you consider having commits after every 100 rows, if you do load 250k rows in one go? Isn't a single commit at the end enough?
- Is it better to use "BAT.reverse().find(X)" than "BAT.search(X)"?
Well, this are two "different pairs of shoes": Actually, there is no "BAT.search(X)" in MIL, only a "str.search(str)" (or "str.search(chr)") for seaching in a single string: ======== MonetDB>help("search"); COMMAND: search(str, str) : int MODULE: str COMPILED: by adm on Thu Feb 10 20:16:51 2005 Search for a substring. Returns position, -1 if not found. COMMAND: search(str, chr) : int MODULE: str COMPILED: by adm on Thu Feb 10 20:16:51 2005 Search for a character. Returns position, -1 if not found. DEPRECATED: does not work for non-ASCII characters, use: search(str,str) ======== "BAT.find(X)", however, looks-up a value in the head-column of a BAT, and returns the respective tail value: ======== MonetDB>help("find"); COMMAND: find(BAT[any::1,any::2], any::1) : any::2 MODULE: algebra COMPILED: by adm on Thu Feb 10 20:16:52 2005 Returns the tail value 't' for which some [h,t] BUN exists in b. If no such BUN exists, an error occurs. ======== Note, find return only a single tail value. If the requested head-value occurs in multiple BUNs (tuples), ind choses a random one and return its tail value. Maybe, you actually ment "select" instead of "search": ======== MonetDB>help("select"); COMMAND: select(BAT[any::1,any::2], any::2) : BAT[any::1,any::2] MODULE: algebra COMPILED: by adm on Thu Feb 10 20:16:52 2005 Select all BUNs of a BAT with a certain tail value. Selection on NIL is also possible (it should be properly casted, e.g.: int(nil)). ======== Hence, using "find" only makes sense, if you only need the tail-value, and - you are sure that the looked-up value does exit exactly once in the head-column, - or, it occurs more than once, but you do not care, which tail-value you get. Note, "head" and "tail" in the above discussion can be swapped "at no cost", since "BAT.reverse()" does not cost anything (i.e., no CPU, no IO, no memory) in MonetDB. I hope, this answers your question. Please let us know, if you have more questions, problems, comments, etc. Regards, Stefan
Regards! Ed
-- | Dr. Stefan Manegold | mailto:Stefan.Manegold@cwi.nl | | CWI, P.O.Box 94079 | http://www.cwi.nl/~manegold/ | | 1090 GB Amsterdam | Tel.: +31 (20) 592-4212 | | The Netherlands | Fax : +31 (20) 592-4312 |
Thanks for the answers Stefan! The MonetDB system is definitely looking quite interesting. On Thu, 10 Feb 2005, Stefan Manegold wrote:
- I tried a load where I committed after every 100 rows. Noticed _huge_ I/O surges. Looking at the subdirectory, it looked like the X.theap file was being rewritten over and over from scratch. If this is a memory-mapped file, why would this occur? When you start playing with 55MB heap files, this kills performance.
For now, we cannot do much about that, and I suppose, there'll never be any change. Basically, we have to write the whole file to make sure that all changes are properly committed. But why do you consider having commits after every 100 rows, if you do load 250k rows in one go? Isn't a single commit at the end enough?
I'm looking at potentially using this on a live system, where data will be arriving at some steady rate (approximatly 10-50 items per second). The current data set has some tables with over 500M rows at the moment (and growing by >5M rows per day). After looking a bit, I can see the following solutions: (1) Only ever do bulk loads (say at the end of the day). Disadvantage, no analysis of date from the current day. Advantage, can try to use a bulk load mechanism rather than "one row at a time". (2) Leave the system "dirty" and do a commit only at the end of the day. Still trying to determine the exact concurrency model. From what I've read so far, there is only global locking and no "multi-view" similar to most databases (such as Oracle or Postgresql), so this should work fine as any uncommitted data could be seen by other processes (NOTE: I have not tested this theory yet, so please excuse if I have it wrong, only so much time to play with the system!). (3) I was thinking of using persistent sub-BATs (a BAT within a BAT). Use this to partition the data into sets (say, one sub-BAT per day). But I was seeing some comments on the mailing lists re persistent BATs within BATs no longer supported? I have had no time to dig/experiment with this option. And I would be worried about the performance of operations trying to run over a partitioned set. On the other hand, I will definitely have to do something due to the sheer size of the data. Any suggestions from the community on how to deal with a live data stream?
- Is it better to use "BAT.reverse().find(X)" than "BAT.search(X)"?
Well, this are two "different pairs of shoes":
Doh! That will teach me from trying to type via memory. I did mean uselect(), and thanks for the pointers to the differences re "definite existence" vs a set. It was the tail vs head operation, but given that "reverse()" is essentially free, I won't worry about it. Thanks for the hints! Ed
Stefan Manegold wrote:
Since I'm not a perl expert, I feel free to forward this to Steffen Goeldner, who is generously taking care of the Perl interface of MonetDB.
Steffen,
I hope, you don't mind, if I put his on your desk, do you?
That's absolutely ok.
I would be great, I you could have a look at this problem. You can find the whole thread at http://sourceforge.net/mailarchive/forum.php?thread_id=6536996&forum_id=42505
If necessary, you can probabaly ask Ed for his patch...
Yes Ed, please send me a diff -u. In the face of the imminent release date, I'd prefer to make modifications in the development trunk only, not the release branch. Would that be acceptable? Steffen
I hope, you don't mind, if I put his on your desk, do you?
That's absolutely ok.
thanks a lot. we highly appreciate your help!
I would be great, I you could have a look at this problem. You can find the whole thread at http://sourceforge.net/mailarchive/forum.php?thread_id=6536996&forum_id=42505
If necessary, you can probabaly ask Ed for his patch...
Yes Ed, please send me a diff -u. In the face of the imminent release date, I'd prefer to make modifications in the development trunk only, not the release branch. Would that be acceptable?
I guess so, at least for me. If you would manage to fix and test it before sunday, we could even consider to put it into the release, but otherwise, we better keep it in the development trunk. If you checkin after the release, you could also check-in to the release branch (MonetDB_4-6), then the changes will be in a bug fix release MonetDB 4.6.2, provide we will make one before we will release MonetDB 4.8.0. The latter is actually planned for mid April, hence, the is not much chance for a ug fix release in between, unless severe bugs in 4.6.0 will unexpectedly pop-up... (if so, the bug fix release would be made around mid March). Stefan
Steffen
-- | Dr. Stefan Manegold | mailto:Stefan.Manegold@cwi.nl | | CWI, P.O.Box 94079 | http://www.cwi.nl/~manegold/ | | 1090 GB Amsterdam | Tel.: +31 (20) 592-4212 | | The Netherlands | Fax : +31 (20) 592-4312 |
Greetings! As per request: +-------------------------------------------- diff -u ORIG.monetdb.pm NEW.monetdb.pm --- ORIG.monetdb.pm 2005-01-31 23:55:25.000000000 +0000 +++ NEW.monetdb.pm 2005-02-11 19:16:38.360595999 +0000 @@ -163,7 +163,8 @@ s/"/\\"/g; s/'/''/g; } - return "'$statement'"; + #return "'$statement'"; + return "\"$statement\""; } sub _count_param { @@ -629,7 +630,11 @@ my $rows = MapiLib::mapi_rows_affected($hdl); - if ( MapiLib::mapi_get_querytype($hdl) != 3 ) { +# if ( MapiLib::mapi_get_querytype($hdl) != 3 ) { +# $sth->{monetdb_rows} = $rows; +# return $rows || '0E0'; +# } + if ( MapiLib::mapi_get_querytype($hdl) == 0 ) { # Was a 0 meant here ??? $sth->{monetdb_rows} = $rows; return $rows || '0E0'; } +-------------------------------------------- Comments: (1) The first group was because '' demarks characters, while what seemed to be wanted in MIL is "". _However_, I have not tested this code using SQL statements, so it may be required that there is some test needed depending on MIL vs SQL mode for proper operation. (2) The second group was because I was getting a "statement not marked as Active" type error from DBI after a $sth->execute(). It is a complete hack just to get testing done, without digging to determine if this is the right correction (ie, what I did is most likely wrong and needs fixing). This may be a case of MIL vs SQL again. This can be tested via the sample MIL perl code, as this is how I found the problems. Even with the above 2 changes, the sample code does not run (fails to recognize error return codes, etc). Regards! Ed On Fri, 11 Feb 2005, Steffen Goeldner wrote:
Stefan Manegold wrote:
Since I'm not a perl expert, I feel free to forward this to Steffen Goeldner, who is generously taking care of the Perl interface of MonetDB.
Steffen,
I hope, you don't mind, if I put his on your desk, do you?
That's absolutely ok.
I would be great, I you could have a look at this problem. You can find the whole thread at http://sourceforge.net/mailarchive/forum.php?thread_id=6536996&forum_id=42505
If necessary, you can probabaly ask Ed for his patch...
Yes Ed, please send me a diff -u. In the face of the imminent release date, I'd prefer to make modifications in the development trunk only, not the release branch. Would that be acceptable?
Steffen
------------------------------------------------------- SF email is sponsored by - The IT Product Guide Read honest & candid reviews on hundreds of IT Products from real users. Discover which products truly live up to the hype. Start reading now. http://ads.osdn.com/?ad_id=6595&alloc_id=14396&op=click _______________________________________________ MonetDB-users mailing list MonetDB-users@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/monetdb-users
Edmund Dengler wrote:
This can be tested via the sample MIL perl code, as this is how I found the problems. Even with the above 2 changes, the sample code does not run (fails to recognize error return codes, etc).
Stefan, let's start with a few questions to milsample.pl: 1) The script connects as "joe"/"joe's password". It seems that the server ignores user/password in a MIL session. Indeed, I found no BAT where users may be stored - only the SQL server creates a user table. Am I right that a MIL session is 'authentication free'? 2) The script tries to connect to database=test. DBD::monetdb ignores this part of the DSN, because mapi_connect doesn't have a database parameter. Is there another method to choose the database, or is this legacy code, or a marker for future enhancements? Should we drop the database part? 3) In the test script, b:=new(int,int); gives !ERROR: interpret_assignment: unknown variable 'b'. It seems that the 'var' keyword is now required. I read somewhere that the MIL interpreter is somewhat pickier in this release. Can you point me to an up to date manual or similar? Steffen
Steffen, first of all (once again ;-) thank you very much for your work, and your "last minute" fixes. Sorry for my late replay, but I've been (and actually still am) quite busy with the final steps toward MonetDB 4.6.0 ... Anyway, ther should be at least some time for a (short) reply. Let me start with admitting, that I'm everthing but a perl expert, and there also people here (Sjoerd, Fabian) that know Mapi better then me. Nonetheless, I'll try to give some feedback, hoping the someone else (Sjoerd, Fabian, Niels?) will fill in the rest.
Edmund Dengler wrote:
This can be tested via the sample MIL perl code, as this is how I found the problems. Even with the above 2 changes, the sample code does not run (fails to recognize error return codes, etc).
Stefan, let's start with a few questions to milsample.pl:
Not being a perl expert, I didn't produce this script. Martin originally provided it, and I'm not sure who "maintained" it since then. I notice, though, that there are two versions, that are almost identical: ======== 19:06:27 manegold@draco:~/Monet/Stable/MonetDB $ l ./src/mapi/clients/perl/milsample.pl ./src/mapi/clients/perl/Tests/milsample.pl -rwxr-x--- 1 manegold monet 3217 Feb 14 17:49 ./src/mapi/clients/perl/milsample.pl* -rwxr-x--- 1 manegold monet 2589 Jan 20 18:12 ./src/mapi/clients/perl/Tests/milsample.pl* 19:06:30 manegold@draco:~/Monet/Stable/MonetDB $ diff ./src/mapi/clients/perl/milsample.pl ./src/mapi/clients/perl/Tests/milsample.pl 2,19d1 < < # The contents of this file are subject to the MonetDB Public < # License Version 1.0 (the "License"); you may not use this file < # except in compliance with the License. You may obtain a copy of < # the License at < # http://monetdb.cwi.nl/Legal/MonetDBLicense-1.0.html < # < # Software distributed under the License is distributed on an "AS < # IS" basis, WITHOUT WARRANTY OF ANY KIND, either express or < # implied. See the License for the specific language governing < # rights and limitations under the License. < # < # The Original Code is the Monet Database System. < # < # The Initial Developer of the Original Code is CWI. < # Portions created by CWI are Copyright (C) 1997-2005 CWI. < # All Rights Reserved. < 36c18 < { PrintError => 0, RaiseError => 1 }); ---
{'PrintError' =>0, 'RaiseError' => 0});
49,51c31,36 < # deliberately executing a wrong MIL statement: < $sth= $dbh->prepare("( xyz 1);\n"); < eval { $sth->execute }; print "ERROR REPORTED: $@" if $@; ---
$sth= $dbh->prepare("( xyz 1);\n"); $sth->execute() ;#|| die "Excution error:\n".$sth->{errstr}; if($sth->{err}){ print "ERROR REPORTED:".$sth->{errstr}."\n"; } print "STH:".$sth->{row}."\n"; 53c38 < $dbh->do("var b:=new(int,int);");
$dbh->do("b:=new(int,int);"); 72d56 < $sth->execute; 19:06:34 manegold@draco:~/Monet/Stable/MonetDB $ ========
Being a sample, I'd propose to remove ./src/mapi/clients/perl/milsample.pl and keep only ./src/mapi/clients/perl/Tests/milsample.pl --- well, that is keep the newest on of the tow, but place it in ./src/mapi/clients/perl/Tests/ Unless you (or anybody else) see an urgend need to have this clean-up done "immediately", I'd propose to postpone until after the release.
1) The script connects as "joe"/"joe's password". It seems that the server ignores user/password in a MIL session. Indeed, I found no BAT where users may be stored - only the SQL server creates a user table. Am I right that a MIL session is 'authentication free'?
Right, there is no authentication at all in MIL sessions; as yet, only the SQL frontend supports (and requires) authentication.
2) The script tries to connect to database=test. DBD::monetdb ignores this part of the DSN, because mapi_connect doesn't have a database parameter. Is there another method to choose the database, or is this legacy code, or a marker for future enhancements? Should we drop the database part?
Could the MAPI experts help here?
3) In the test script, b:=new(int,int); gives !ERROR: interpret_assignment: unknown variable 'b'. It seems that the 'var' keyword is now required. I read somewhere that the MIL interpreter is somewhat pickier in this release. Can you point me to an up to date manual or similar?
The major MIL changes were already done some time (almost half a year) ago, when we officially released MonetDB 4.4, but as I saw from your check-ins, you already found the respective info in http://sourceforge.net/mailarchive/forum.php?thread_id=5564755&forum_id=36229 The fact, that we did overlook the "legacy" MIL code in the milsample.pl underlines the fact, that we indeed need more tests for our automatic nightly test system, especially tests for the various MAPI clients, like perl, python, php --- I'll come back to this point after the release... Well, that's all for now. Admittedly not much info, yet, but I count on my colleagues ;-) Regards from A'dam, Stefan
Steffen
-- | Dr. Stefan Manegold | mailto:Stefan.Manegold@cwi.nl | | CWI, P.O.Box 94079 | http://www.cwi.nl/~manegold/ | | 1090 GB Amsterdam | Tel.: +31 (20) 592-4212 | | The Netherlands | Fax : +31 (20) 592-4312 |
Stefan Manegold wrote:
Steffen,
first of all (once again ;-) thank you very much for your work, and your "last minute" fixes. /me agrees
2) The script tries to connect to database=test. DBD::monetdb ignores this part of the DSN, because mapi_connect doesn't have a database parameter. Is there another method to choose the database, or is this legacy code, or a marker for future enhancements? Should we drop the database part?
Could the MAPI experts help here?
A DatabasePool which supports a transparent mapping of databases to multiple Mservers is currently in a very alpha development stage. JDBC currently is the only client that uses it. I would suggest to keep it in for now, as in the future it might get useful.
Edmund Dengler wrote:
- if ( MapiLib::mapi_get_querytype($hdl) != 3 ) { +# if ( MapiLib::mapi_get_querytype($hdl) != 3 ) { +# $sth->{monetdb_rows} = $rows; +# return $rows || '0E0'; +# } + if ( MapiLib::mapi_get_querytype($hdl) == 0 ) { # Was a 0 meant here ??? $sth->{monetdb_rows} = $rows; return $rows || '0E0'; }
(2) The second group was because I was getting a "statement not marked as Active" type error from DBI after a $sth->execute(). It is a complete hack just to get testing done, without digging to determine if this is the right correction (ie, what I did is most likely wrong and needs fixing). This may be a case of MIL vs SQL again.
Like ODBC and SQL/CLI, DBI doesn't provide separate methods for queries and DML, but only one common execute method. In this method, DBD::monetdb avoids to equip the statement handle with the full set of attributes, which are necessary only for the fetch loop, if mapi_get_querytype != 3. However, in a MIL session, mapi_get_querytype seems to return always -1. I could omit the shortcut described above in a MIL session and always establish a 'fetch-able' handle: if ( MapiLib::mapi_get_querytype($hdl) != 3 && $dbh->{monetdb_language} eq 'sql') { $sth->{monetdb_rows} = $rows; return $rows || '0E0'; } Any objections? Steffen
Edmund Dengler wrote:
- if ( MapiLib::mapi_get_querytype($hdl) != 3 ) { +# if ( MapiLib::mapi_get_querytype($hdl) != 3 ) { +# $sth->{monetdb_rows} = $rows; +# return $rows || '0E0'; +# } + if ( MapiLib::mapi_get_querytype($hdl) == 0 ) { # Was a 0 meant here ??? $sth->{monetdb_rows} = $rows; return $rows || '0E0'; }
(2) The second group was because I was getting a "statement not marked as Active" type error from DBI after a $sth->execute(). It is a complete hack just to get testing done, without digging to determine if this is the right correction (ie, what I did is most likely wrong and needs fixing). This may be a case of MIL vs SQL again.
Like ODBC and SQL/CLI, DBI doesn't provide separate methods for queries and DML, but only one common execute method. In this method, DBD::monetdb avoids to equip the statement handle with the full set of attributes, which are necessary only for the fetch loop, if mapi_get_querytype != 3. However, in a MIL session, mapi_get_querytype seems to return always -1. I could omit the shortcut described above in a MIL session and always establish a 'fetch-able' handle:
if ( MapiLib::mapi_get_querytype($hdl) != 3 && $dbh->{monetdb_language} eq 'sql') { $sth->{monetdb_rows} = $rows; return $rows || '0E0'; }
Any objections?
I'll also forward this one to the MAPI experts: Sjoerd, Fabian? Stefan
Steffen
-- | Dr. Stefan Manegold | mailto:Stefan.Manegold@cwi.nl | | CWI, P.O.Box 94079 | http://www.cwi.nl/~manegold/ | | 1090 GB Amsterdam | Tel.: +31 (20) 592-4212 | | The Netherlands | Fax : +31 (20) 592-4312 |
However, in a MIL session, mapi_get_querytype seems to return always -1. I could omit the shortcut MIL output currently is not as 'rich' as SQL output. Indeed with MIL
Stefan Manegold wrote: there is no query type for a result.
described above in a MIL session and always establish a 'fetch-able' handle:
if ( MapiLib::mapi_get_querytype($hdl) != 3 && $dbh->{monetdb_language} eq 'sql') { $sth->{monetdb_rows} = $rows; return $rows || '0E0'; }
Any objections?
I'll also forward this one to the MAPI experts: Sjoerd, Fabian?
The final say is to Sjoerd here, but I think that with MIL you even have to fetch everything what is fetchable because the output can be anything and isn't known upfront as with SQL's query type.
participants (5)
-
Edmund Dengler
-
Fabian
-
Niels Nes
-
Stefan.Manegold@cwi.nl
-
Steffen Goeldner