On Nov 6, 2009, at 11:09, Jan Flokstra wrote:
On Thursday 05 November 2009 23:45, Jan Rittinger wrote:
Hi Anandeshwar,
thanks a lot for extending pathfinder with full-text functionality.
Hi Jan,
Thanks for your comments. Anand had a development version on his PC for half a year which survived a couple of disk crashes and a couple of nasty integration problems because the compiler code changed. Because Anand moves on soon we decided to check his stuff in. We did a thorough test of his changes and after his changes did not break any test from the entire Pathfinder test collection we decided it was time to check his stuff in. We are aware we changed the original compiler stuff at a couple of places but most of them were inevitable.
Hi Jan & Anand, In my eyes your checkins should have started half a year ago (in a branch). Then the integration and recovery problems wouldn't have occurred. Inevitable does not mean that they should never be communicated before. To be honest I was quite surprised by this big checkin.
I'm however a little bit taken by surprise and your checkin in my opinion leaves some questions open:
- Why did you change the translation of PFcore_some in core.c? Could this be and old re-checkin of something changed long time ago. It could have been accidently reintroduced with one of the reintegration or crash recoveries. This is indeed no change necessary for a fulltext construction.
Are you removing this code regression?
- Why did this rewrite touch any file in the algebra folder? - There were some score handling issues which could be handled easyer in algebra. When the complete score handling and propagation in core to algebra is finished these algebra changes can be removed. If scores are not used the changes do not do anyting and when used they will not break anything.
Ok, I did not make myself clear enough. I did not mean core2alg.brg. But the following files are changed: algebra/prop/prop_unq_names.c algebra/prop/prop_ocol.c xmlimport/xml2lalg.c Why? - Are there any real changes or are only whitespaces and error checks 'accidentally' removed?
Whitespace removal is not ftcontains support as your message suggests. I do not understand which message you mean? It looks like a cryptic way of you to say the indent and whitespace layout of files are changed unnecessarily.
see above. If a checkin does something else than its message suggests then it makes things harder to understand.
Additionally disabling consistency checks without an explanation isn't a nice thing. This is indeed an error. A lot of checks have been disabled during development but they should all have be switched 'on' again. I suppose you mean the PFprop_type_of_() check where the PFoops has been disabled. There should have been an 'incomplete' message there.
yes, see also above.
- Why did you duplicate the translation of a for loop? - Isn't the score just an optional modification like the positional variable? I at least don't like this code duplication as it will make code maintenance a lot harder. - As far as I can oversee you did break the compositionality of the compilation scheme. Now the different rules provide different result schemas.
You are right about the last two points but the way I understand it it is necessary to do it this way because the ftcontains clause changes the semantics of the for loop. I heard Stefan Klinger also complain a lot about the horrible semantics of the xquery fulltext for loops. So the new rules indeed break the compositionality but they are supposed to :-(
The code in core2alg.brg however speaks otherwise. From the 220 lines of code necessary to cope with the for-score-loop 200 lines are a verbatim copy of the for-rule. Only a single if-block that introduces the score variable is added. That's what I call unncessary code duplication!
I think (mainly because of the last issue) this checkin should better go into a separate branch instead of the CVS HEAD until the translation is more mature.
We could go into a separate branch but it looks this fulltext project will be a long project. We will have a separate branch for year with lots of overhead for the CSV maintainers and our mailboxes. I think 'ifdeffing' most of the fulltext stuff will give a better overview what has changed because of xquery fulltext and what are the real problems. Also compiler developpers have a better overview when they change things where fulltext isssues are involved,
Your changes make the core->algebra translation inconsistent. As long as this is the case I object having this code in the CVS HEAD! You introduced calls to a function attach_score. This function is called whenever you might find an additional score column useful -- i.e. in an inconsistent way. If you want to introduce scores then introduce them everywhere and make the compilation scheme consistent again (with an iter|pos|item|score schema). This problematic behavior is perhaps most prominent in logical.c where you try to fix inconsistencies in a constructor instead of the code generation. A node constructor should not know anything about the translation strategy. Especially using a hard-coded column name in a constructor, that needs to cope with arbitrary column names, is bad style. Jan -- Jan Rittinger Lehrstuhl Datenbanken und Informationssysteme Wilhelm-Schickard-Institut für Informatik Eberhard-Karls-Universität Tübingen http://www-db.informatik.uni-tuebingen.de/team/rittinger