User Defined Functions

This description is outdated.

An open-source system lets others extend its kernel functionality with specific types and functions. Experience shows that the need for such extensions is fairly limited. Often the built-in data types, the MAL algebra, and functional abstraction provide the toolkit needed to achieve your goal.

In the few cases where the MonetDB kernel and SQL runtime system need extensions, you need access to the MonetDB source code and proficiency in C programming, compilation, and debugging. The openness of MonetDB means that extensions are not sandboxed; they run within the system address space. Moreover, the multi-layered architecture means you have to make the functions written in C known to the MAL interpreter before they can be made known to the SQL compiler. The current setup makes this a little more cumbersome, but the added benefit is that both simple scalar functions and columnar operations can be introduced.

In this section we show how to extend SQL with a simple scalar function to reverse a string, i.e.

sql> select 'hello',reverse('hello');
+---------------+
| hello | olleh |
+---------------+

step 1. You need access to the MonetDB sources and must be able to compile and install MonetDB in a private directory.

step 2. Go to the sql/backends/monet5/UDF directory from the top directory of the sources. It provides the reverse example as a template. A group of user-defined functions is assembled in a directory like UDF. It contains files that describe the SQL signature, the MAL signature, and the C-code implementation.

step 3. Extension starts with a definition of the MAL signatures. See the example given, or browse through the files in monetdb5/modules/mal/*.mal to get a glimpse of how to write them. The MonetDB kernel documentation provides more details. The file contains the MAL snippet:

command reverse(ra1:str):str
address UDFreverse
comment "Reverse a string";

step 4. The signature says that it expects a command body implementation under the name UDFreverse, shown below. The C signature is a direct mapping: arguments are passed by reference, and the references to the return value(s) come first in the argument list. The body should return a (malloced) string to denote an exception being raised, or MAL_SUCCEED upon success.

#include "udf.h"

static str
reverse(const char *src)
{
       size_t len;
       str ret, new;

       /* The scalar function returns the new space */
       len = strlen(src);
       ret = new = GDKmalloc(len + 1);
       if (new == NULL)
              return NULL;
       new[len] = 0;
       while (len > 0)
              *new++ = src[--len];
       return ret;
}

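str
UDFreverse(str *ret, str *src)
{
       /* a missing or nil argument yields a nil result */
       if (*src == 0 || strcmp(*src, str_nil) == 0)
              *ret = GDKstrdup(str_nil);
       else
              *ret = reverse(*src);
       /* report an allocation failure as a MAL exception */
       if (*ret == NULL)
              throw(MAL, "udf.reverse", MAL_MALLOC_FAIL);
       return MAL_SUCCEED;
}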
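
To make the calling convention from step 4 concrete outside the MonetDB tree, the following is a minimal standalone sketch, not the code shipped in the UDF template. Everything prefixed with demo_ or DEMO is hypothetical: the typedef, the MAL_SUCCEED definition, and the nil marker are stand-ins for what the MonetDB headers provide, and plain malloc takes the place of GDKmalloc. It only illustrates the shape of a command body: result reference first, arguments by reference, a NULL return for success, and a heap-allocated message for an exception.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Stand-ins for MonetDB definitions (assumptions made for this sketch). */
typedef char *str;
#define MAL_SUCCEED ((str) 0)
static char demo_nil[] = "\200";        /* hypothetical nil marker */

/* Return a heap-allocated copy of s, or NULL when out of memory. */
static str
demo_strdup(const char *s)
{
       size_t n = strlen(s) + 1;
       str copy = malloc(n);

       if (copy != NULL)
              memcpy(copy, s, n);
       return copy;
}

/* Hypothetical command body: the result reference comes first,
 * followed by the argument, both passed by reference. */
str
DEMOreverse(str *ret, str *src)
{
       size_t len, i;

       assert(ret != NULL && src != NULL);
       if (*src == NULL || strcmp(*src, demo_nil) == 0) {
              /* nil in, nil out */
              *ret = demo_strdup(demo_nil);
       } else {
              len = strlen(*src);
              *ret = malloc(len + 1);
              if (*ret != NULL) {
                     for (i = 0; i < len; i++)
                            (*ret)[i] = (*src)[len - 1 - i];
                     (*ret)[len] = '\0';
              }
       }
       /* An exception is signalled by returning an allocated message. */
       if (*ret == NULL)
              return demo_strdup("DEMOreverse: could not allocate result");
       return MAL_SUCCEED;
}
```

A caller passes the addresses of its result and argument variables, checks the returned message first, and owns the result string afterwards, so it must free it. Note that the reversal is byte-wise, as in the template code above.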

step 5. The next step is to register the routine in the SQL catalog. This calls for a SQL statement to be executed once for each database. The autoload method can relieve you from loading the modules manually in the server after each restart. The UDF template contains the files 80_udf.sql and 80_udf.mal. The former contains the definition needed for SQL:

create function reverse(src string)
returns string external name udf.reverse;

step 6. The MAL interpreter should be informed about the linked-in functionality. This is also handled by the autoload feature. The MAL script, 80_udf.mal, simply includes the module:

include udf;

step 7. After all pieces are prepared, call the bootstrap program in the root of your checked-out source tree once. Thereafter, running configure, make, and make install compiles the code and places the interface files and libraries in the proper place.

Creating bulk and polymorphic operations requires much more care. In general, it is best to find an instruction that is already close to what you need. Clone it, expand it, compile it, and test it. A bulk variation of the reverse operation is included in the sample UDF template. As a last resort you can contact us on the mailing lists for further advice.
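
To give an idea of what a bulk variation involves, the sketch below applies the reversal to a whole array of strings in one call. It is an illustration under simplifying assumptions, not the bulk code from the UDF template: a plain C array stands in for a MonetDB column, nil handling is left out, and the names reverse_one and reverse_bulk are hypothetical. The point it shows is the extra bookkeeping a bulk operation needs, in particular cleaning up partial results when one element fails.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Reverse one NUL-terminated string into newly allocated memory.
 * Returns NULL when the allocation fails. */
static char *
reverse_one(const char *src)
{
       size_t len = strlen(src), i;
       char *dst = malloc(len + 1);

       if (dst == NULL)
              return NULL;
       for (i = 0; i < len; i++)
              dst[i] = src[len - 1 - i];
       dst[len] = '\0';
       return dst;
}

/* Hypothetical bulk variant: reverse n strings from in[] into out[].
 * Returns 0 on success. On allocation failure it frees the results
 * produced so far, resets those entries to NULL, and returns -1. */
int
reverse_bulk(char **out, const char *const *in, size_t n)
{
       size_t i;

       for (i = 0; i < n; i++) {
              out[i] = reverse_one(in[i]);
              if (out[i] == NULL) {
                     while (i > 0) {
                            free(out[--i]);
                            out[i] = NULL;
                     }
                     return -1;
              }
       }
       return 0;
}
```

On success the caller owns every string in out[] and frees each one when done. A real bulk operation in MonetDB works on the kernel's column structures instead of C arrays, so treat the template's bulk implementation as the reference.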