The MonetDB team at MonetDB BV is pleased to announce the Jul2021 feature release of the MonetDB suite of programs.

More information about MonetDB can be found on our website at https://www.monetdb.org/.

For details on this release, please see the release notes at https://www.monetdb.org/Downloads/ReleaseNotes.

As usual, the download location is https://www.monetdb.org/downloads/.

Jul2021 Feature Release (11.41.5)

Client Package

* The MonetDB stethoscope has been removed. There is now a separate package available via pip (monetdb_stethoscope) or as an RPM or DEB package (stethoscope) from the monetdb.org repository.

Mapi Library

* Added an optional MAPI header field that can be used to immediately set reply size, autocommit, time zone and some other options; see mapi.h. This makes client connection setup faster. Support has been added to mapilib, pymonetdb and the JDBC driver.

ODBC Driver

* A typo that made the SQLSpecialColumns function unusable was fixed.

MonetDB Common

* A bug in the grouping code has been fixed.
* Hash indexes are no longer maintained at all cost: if the number of distinct values is too small compared to the total number of values, the index is dropped instead of being maintained during updates.
* A new type, called msk, was introduced. This is a bit mask type. In a BAT with type msk, each row occupies a single bit, so 8 rows are stored in a single byte. There is no NULL value for this type.
* The function of the BAT iterator (type BATiter, function bat_iterator) has been expanded. The iterator now contains more information about the BAT, and it contains a pointer to the heaps (theap and tvheap) that are stable, at least in the sense that they will remain available even when parallel threads update the BAT and cause those heaps to grow (and therefore possibly move in memory). A call to bat_iterator must now be accompanied by a call to bat_iterator_end.
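To make the storage arithmetic of the msk type described above concrete, here is a minimal Python sketch that packs one boolean per row into bits, eight rows per byte. The function names and the least-significant-bit-first ordering are assumptions made for illustration only; this is not MonetDB's actual in-memory layout.

```python
def pack_mask(rows):
    """Pack a list of booleans into bytes, one bit per row.

    Bit order within a byte is LSB first (an assumption for this sketch).
    """
    out = bytearray((len(rows) + 7) // 8)  # 8 rows fit in a single byte
    for i, bit in enumerate(rows):
        if bit:
            out[i // 8] |= 1 << (i % 8)
    return bytes(out)


def get_bit(packed, i):
    """Return the boolean stored for row i."""
    return bool((packed[i // 8] >> (i % 8)) & 1)
```

Because every bit pattern is a valid value, such a representation has no spare pattern left to encode NULL, which matches the note that msk has no NULL value.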
* Implemented function BUNreplacemultiincr to replace multiple values in a BAT in one go, starting at a given position.
* Implemented new function BUNreplacemulti to replace multiple values in a BAT in one go, at the given positions.
* Removed function BUNinplace, just use BUNreplace, and check whether the BAT argument is of type TYPE_void before calling if you don't want to materialize.
* Implemented a function BUNappendmulti which appends an array of values to a BAT. It is a generalization of the function BUNappend.
* Changed the interface of the atom read function. It now requires an extra pointer to a size_t value that gives the current size of the destination buffer, and when that buffer is too small, it receives the size of the reallocated buffer that is large enough. In any case, and as before, the return value is a pointer to the destination buffer.
* Environment variables (sys.env()) must be UTF-8, but since they can contain file names which may not be UTF-8, there is now a mechanism to store the original values outside of sys.env() and store %-escaped (similar to URL escaping) values in the environment. The key must still be UTF-8.
* We now save the location of the min and max values when known.

MonetDB5 Server

* When using the --in-memory option, mserver5 will run completely in memory, i.e. not create a database on disk. The server can still be connected to using the name of the in-memory database. This name is "in-memory".
* By using the option "--dbextra=in-memory", mserver5 can be instructed to keep transient BATs completely in memory.

SQL Frontend

* The system view sys.ids has been updated to include some more system IDs.
* The sys.storage() function now only returns meta data, i.e. data that can be calculated without access to the column contents.
* Since STREAM tables support is removed, left over STREAM tables are dropped from the catalog.
* Fix a warning emitted by some implementations of the tar(1) command when unpacking hot snapshot files.
* Support reading the concatenation of compressed files as a single compressed file.
* COPY BINARY overhaul. Allow control over binary endianness using COPY [ (BIG | LITTLE | NATIVE) ENDIAN] BINARY syntax. Defaults to NATIVE. Strings are now \0 terminated rather than \n. Support for BOOL, TINYINT, SMALLINT, INT, LARGEINT, HUGEINT, with their respective "INTMIN" values as the NULL representation; 32 and 64 bit FLOAT/REAL, with NaN as the NULL representation; VARCHAR/TEXT, JSON and URL with \x80 as the NULL representation; UUID as fixed width 16 byte binary values, with (by default) all zeroes as the NULL representation; temporal type structs as defined in copybinary.h with any invalid value as the NULL representation.
* In the Jul2021 release the storage and transaction layers have undergone major changes. The goal of these changes is robust performance under inserts, updates and deletes, and lower transaction startup costs, allowing faster (small) queries. Where the old transaction layer duplicated a lot of data structures during startup, the new layer shares the same tree. Using object timestamps, the isolation of objects is guaranteed. On the storage side, the timestamps likewise indicate whether a row is visible to a transaction (deleted or valid). These changes also slightly alter the perceived transactional behavior. The new implementation uses structures shared among all transactions, which do not allow multiple changes of the same object. We therefore follow the first-writer-wins principle: if a transaction creates a table named 'table_name' and another transaction concurrently does the same, the later of the two will fail with a concurrency conflict error message (even if the first writer never commits). We expect most users not to notice this change, as such schema changes aren't usually done concurrently.
* There is now a function sys.current_sessionid() to return the session ID of the current session.
  This ID corresponds with the sessionid in the sys.queue() result.
* MERGE statements did not produce correct results on complex join conditions, so the implementation was reworked. As a consequence, subqueries are now disallowed in merge join conditions.
* Preserve in-query comments.
* The use of CTEs inside UPDATE and DELETE statements is now more restricted. Previously they could be used without any extra specification in the query (e.g. with "v1"("c1") as (...) delete from "t" where "t"."c1" = "v1"."c1"); however, this did not conform to the SQL standard. In order to use them, they must be specified in the FROM clause in UPDATE statements or inside a subquery.
* Added a 'schema path' property to users, specifying a list of schemas to be searched to find SQL objects such as tables and functions. The scoping rules have been updated to support this feature, and SQL objects are now looked up in the following order:
  1. On occasions with multiple tables (e.g. add foreign key constraint, add table to a merge table), the child will be searched on the parent's schema.
  2. For tables only, declared tables on the stack.
  3. 'tmp' schema if not listed on the 'schema path'.
  4. Session's current schema.
  5. Each schema from the 'schema path' in order.
  6. 'sys' schema if not listed on the 'schema path'.
  Whenever the full path is specified, i.e. "schema"."object", no search is made other than in the explicit schema.
* To update the schema path, the ALTER USER x SCHEMA PATH y; statement was added. The [SCHEMA PATH string] syntax was added to the CREATE USER statement. The schema path must be a single string where each schema must be between double quotes and separated with a single comma, e.g. '"sch1","sch2"'. For every created user, if the schema path is not given, '"sys"' will be the default schema path.
* Changes in the schema path won't be reflected on currently connected users, therefore they have to re-connect to see the change. Non-existent schemas on the path will be ignored.
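The lookup order for the schema path described above can be sketched in Python as follows. This is only an illustration of steps 3 to 6 of the list; steps 1 and 2 depend on the statement being executed and are left out. The function names are hypothetical, the parser assumes schema names contain no embedded double quotes, and duplicates are dropped here purely to keep the result readable (doing so does not change which schema matches first).

```python
import re


def parse_schema_path(path):
    """Split a schema path string such as '"sch1","sch2"' into schema names.

    Assumes schema names do not themselves contain double quotes.
    """
    return re.findall(r'"([^"]*)"', path)


def search_order(current_schema, schema_path='"sys"'):
    """Return the schemas to search for an unqualified object name.

    Covers steps 3-6 of the documented order only.
    """
    names = parse_schema_path(schema_path)
    order = []
    if "tmp" not in names:       # step 3: 'tmp' unless listed on the path
        order.append("tmp")
    order.append(current_schema)  # step 4: session's current schema
    order.extend(names)           # step 5: each schema on the path, in order
    if "sys" not in names:       # step 6: 'sys' unless listed on the path
        order.append("sys")
    # Drop repeated entries while keeping the first occurrence.
    return list(dict.fromkeys(order))
```

For example, with current schema "sch1" and schema path '"sch1","sch2"', the sketch yields tmp, sch1, sch2, sys.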
* Leftover STREAM table definitions from the Datacell extension were removed from the parser. They no longer had any effect.

Merovingian

* Deprecate `profilerstart` and `profilerstop` commands. Since stethoscope is a separate project (https://github.com/MonetDBSolutions/monetdb-pystethoscope) the installation directory is not standard anymore. `profilerstart` and `profilerstop` commands assume that the stethoscope executable is in the same directory as `mserver5`. This is no longer necessarily true since stethoscope can now be installed in a python virtual environment. The commands still work if stethoscope is installed using the official MonetDB installers, or if a symbolic link is created in the directory where `mserver5` is located.
* The exittimeout value can now be set to a negative value (e.g. -1) to indicate that when stopping the dbfarm (using monetdbd stop dbfarm), any mserver5 processes are to be sent a termination signal and then waited for until they terminate. In addition, if exittimeout is greater than zero, the mserver5 processes are sent a SIGKILL signal after the specified timeout and the managing monetdbd is sent a SIGKILL signal after another five seconds (if it didn't terminate already). The old situation was that the managing monetdbd process was sent a SIGKILL after 30 seconds, and the mserver5 processes that hadn't terminated yet would be allowed to continue their termination sequence.

Bug Fixes

* 2030: Temporary table is semi-persistent when transaction fails
* 7031: I cannot start MoentDb, because the installation path has Chinese.
* 7055: Table count returning function used inside other function gives wrong results.
* 7075: Inconsistent Results using CTEs in Large Queries
* 7079: WITH table AS... UPDATE ignores the WHERE conditions on table
* 7081: Attempt to allocate too much space in UPDATE query
* 7093: `current_schema` not in sys.keywords
* 7096: DEBUG SQL statement broken
* 7115: Jul2021: ParseException while upgrading Oct2020 database
* 7116: Jul2021: Cannot create filter functions
* 7125: MonetDB Round Function issues in the latest release
* 7126: The "lower" and "upper" functions doesn't work for Cyrillic alphabet
* 7127: Bug report: "write error on stream" that results in mclient crash
* 7128: Bug report: strange error message "Subquery result missing"
* 7129: Bug report: TypeException:user.main[19]:'batcalc.between' undefined
* 7130: Bug report: TypeException:user.main[396]:'algebra.join' undefined
* 7131: Bug report: TypeException:user.main[273]:'bat.append' undefined
* 7133: WITH <alias> ( SELECT x ) DELETE FROM ... deletes wrong tuples
* 7136: MERGE statement is deleting rows if the column is set as NOT NULL even though it should not
* 7137: Segmentation fault while loading data
* 7138: Monetdb Python UDF crashes because of null aggr_group_arr
* 7141: COUNT(DISTINCT col) does not calculate correctly distinct values
* 7142: Aggregates returning tables should not be allowed
* 7144: Type up-casting (INT to BIGINT) doesn't always happen automatically
* 7146: Query produces this error: !ERROR: Could not find %102.%102
* 7147: Internal error occurs and is not shown on the screen
* 7148: Select distinct is not working correctly
* 7151: Insertion is too slow
* 7153: System UDFs lose their indentation - Python functions broken
* 7158: Python aggregate UDF returns garbage when run on empty table
* 7161: fix priority
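As an illustration of the COPY BINARY conventions listed under SQL Frontend above, the following Python sketch encodes a 32-bit INT column and a text column the way the notes describe: integers use their minimum value as NULL, and strings are \0 terminated with \x80 standing for NULL. The function names are hypothetical, and it is an assumption of this sketch that the \x80 NULL marker is followed by the usual \0 terminator; copybinary.h remains the authoritative description of the format.

```python
import struct

INT_MIN = -2**31  # smallest 32-bit signed value, used as the NULL marker for INT

# struct format codes for a 32-bit signed integer in each byte order
_INT_FORMATS = {"little": "<i", "big": ">i", "native": "=i"}


def encode_int_column(values, endian="native"):
    """Encode a list of ints (None meaning NULL) as fixed-width 32-bit values."""
    fmt = _INT_FORMATS[endian]
    return b"".join(
        struct.pack(fmt, INT_MIN if v is None else v) for v in values
    )


def encode_text_column(values):
    """Encode a list of strings (None meaning NULL) as \\0-terminated UTF-8.

    NULL is written as the single byte \\x80 plus the terminator (assumed).
    """
    return b"".join(
        (b"\x80" if v is None else v.encode("utf-8")) + b"\0" for v in values
    )
```

A file produced with endian="little" would then be loaded with the LITTLE ENDIAN variant of the COPY BINARY syntax; note that a genuine INT value equal to INT_MIN cannot be distinguished from NULL in this representation.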