
On May 18, 2009, at 1:39 PM, Martin Kersten wrote:
Henri Asseily wrote:
Hi, where can I find some MAL samples?
There is a large collection of MAL test scripts available in the source distribution. That would inspire most.
Thanks, I just saw those. I am now looking for the scripts that have meat in them.
Aside from these examples, there is the MAL documentation.
That's the m5manual.pdf?
I'm trying to do very simple things such as storing and loading a bat, joining bats and iterating through them.
I know I can do an explain on SQL statements, but those seem incredibly convoluted even for a simple select on a single column, which should be a straightforward sql.bind(,,,0) with follow-through sql.resultSet and sql.rsColumn if I understand this whole business correctly. Instead, it's doing a couple of algebra kunion and kdifference that I don't understand.
The manual contains an appendix with a short description of the operators.
What I meant was that it doesn't make sense to me to do 2 kunion and 2 kdifference in a simple select on a single column. I understand what kunion and kdifference do, I just don't see why they're used in this case.
I'm not trying to do MAL algebra on a SQL database, I'm trying to go 100% MAL and hand-coding some simple joins and iterators.
Iterators should hardly be needed for most algebraic plans.
Here's an example: I've got a 3-word input phrase, and I want to see which of a large set of strings contain those 3 words. Furthermore, I want to value a string more if there's an exact match on the phrase rather than a partial or scattered match (words in different positions). The first thing I do is tokenize all the words from all the strings and build an inverted index that I store in the db. For each word, I have one entry for each string id and a list of positions of the word in that string. When I make the query for the 3-word phrase, I grab the 3 sublists of string ids and append them to each other. Now I've got all the strings that have any combo of those 3 words. Then I need to loop through this to determine the actual match type and make sure I find the string ids that have consecutive positions for all 3 words. Whatever technique I use (MAL, SQL with view/temp tables), I've got to iterate to match those positions. Unless you've got a better idea?
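To make the scheme above concrete, here is a minimal sketch in Python rather than MAL, so it says nothing about how MonetDB would execute it. The function names (build_index, match_type, search), the whitespace tokenizer, and the three match labels are assumptions made for illustration only; they are not taken from the thread or from any MonetDB API.

```python
from collections import defaultdict


def build_index(strings):
    """Build a positional inverted index: word -> {string id -> [positions]}."""
    index = defaultdict(dict)
    for sid, text in strings.items():
        # Assumption: naive lowercase whitespace tokenization.
        for pos, word in enumerate(text.lower().split()):
            index[word].setdefault(sid, []).append(pos)
    return index


def match_type(index, sid, words):
    """Classify one candidate string as 'exact', 'scattered', or 'partial'."""
    postings = [index.get(w, {}).get(sid, []) for w in words]
    if not all(postings):
        return "partial"  # at least one query word is missing
    later = [set(p) for p in postings[1:]]
    # Exact phrase: some position of the first word is followed by the
    # remaining words at consecutive positions.
    for start in postings[0]:
        if all(start + i + 1 in s for i, s in enumerate(later)):
            return "exact"
    return "scattered"  # all words present, but not consecutive


def search(index, phrase):
    """Return (string id, match type) pairs, best matches first."""
    words = phrase.lower().split()
    # Union of the per-word string id lists: any string with any query word.
    candidates = set()
    for w in words:
        candidates.update(index.get(w, {}))
    rank = {"exact": 0, "scattered": 1, "partial": 2}
    results = [(sid, match_type(index, sid, words)) for sid in candidates]
    return sorted(results, key=lambda r: (rank[r[1]], r[0]))
```

The position check only runs over the candidate ids produced by the union step, which is the part that needs per-row iteration; the union itself is a plain set operation.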
I would have gone SQL all the way, but I need to iterate through a bat and calculate a result for each row, then return that sorted new bat based on the result. Doing it in SQL is overly complicated when some simple MAL statements should do the job.
You pay in complexity and may win a little. You may have a look at the TRACE SELECT * ,... to see whether the performance loss you expect is real enough to justify a MAL-specific solution. Iteration at the tuple level in MAL is still not fast (<1 microsec/instr).
If iteration is that slow then I'm much better off using MAL or SQL as a first pass to do the basic joins, then pulling all this data in C and scanning through it as necessary. Does that sound reasonable?
Thanks for the info,
Henri.