The Jul2021 documentation can be found here.
LONG TERM SUPPORT
For more information on long term support releases, visit MonetDB Solutions.
Jul2021-SP15 Bugfix Release (11.41.47)
Bug Fixes
- 7254: Commit with deletions is very slow
Jul2021-SP14 Bugfix Release (11.41.45)
Bug Fixes
- 7501: files remain in backup causing problems at restart
- 7526: deadlock, causing new connections to hang indefinitely
- 7531: loading more than 2147483647 rows gives issue
- 7546: monetdbd leaks file descriptors when starting mserver5
Jul2021-SP13 Bugfix Release (11.41.43)
MonetDB Common
- Fixed a regression where bats weren’t always cleaned up when they
weren’t needed anymore. In particular, after a DELETE FROM table query
without a WHERE clause (which deletes all rows from the table), the
bats for the table get replaced by new ones, and the old, now unused,
bats weren’t removed from the database.
Jul2021-SP12 Bugfix Release (11.41.41)
MonetDB Common
- Introduced options wal_max_dropped, wal_max_file_age and
wal_max_file_size that control the write-ahead log file rotation.
- Fixed a (rare) race condition between copying a bat (COLcopy) and
updates happening in parallel to that same bat. This may only be
an actual problem with string bats, and then only in very particular
circumstances.
Jul2021-SP11 Bugfix Release (11.41.39)
- Do a lot more error checking, mostly for allocation failures. More is
still needed, though.
MonetDB Common
- When saving the SQL catalog during a low-level commit, we should
only save the part of the catalog that corresponds to the part of the
write-ahead log that has been processed. What we did was save more,
which resulted in the catalog containing references to tables and
columns whose disk presence is otherwise only in the write-ahead log.
- A bug was fixed where the administration of which bats were in use was
interpreted incorrectly during startup, causing problems later. One
symptom that has been observed was failure to start up with a message
that the catalog tables could not be loaded.
- When memory is tight, it could happen that a bat was backed up so that it
could be saved. It was then possible that after commit the backup was
not removed so that after a restart the old backup was used instead
of the committed copy. This was fixed.
- Fixed a number of data races (race conditions).
- Fixed a reference counting problem when a BAT could not be loaded,
e.g. because of resource limitations.
- Only check for virtual memory limits when creating or growing bats,
not for general memory allocations. There is (still) too much code
that doesn’t properly handle failing allocations, so we need to avoid
those as much as possible. This mostly has an effect if there are
virtual memory size restrictions imposed by cgroups (memory.swap.max
in cgroups v2, memory.memsw.limit_in_bytes in cgroups v1).
- The low-level commit turned out to always commit every persistent bat
in the system. There is no need for that; it should only commit bats
that were changed. This has now been fixed.
- Warnings and informational messages are now sent to stdout instead of
stderr, which means that monetdbd will now log them with the tag MSG
instead of ERR.
MonetDB5 Server
- If the server is sent a SIGUSR1 signal, it prints out some useful
information to the standard output. When run under monetdbd, this
output will appear in the merovingian.log file.
- There is now a new option --set tablet_threads=N to limit the number
of threads used for a COPY INTO from CSV file query. This option can
also be set for a specific database with the monetdb command, using
the ncopyintothreads property.
Merovingian
- The command ‘monetdb snapshot write …’ caused a crash of the monetdb
program. This has been fixed.
Bug Fixes
- 7410: SIGSEGV cause database corruption
END OF OPEN SOURCE SUPPORT
Jul2021-SP10 Bugfix Release (11.41.33)
MonetDB Common
- Fixed parsing of the BBP.dir files when BAT ids grow larger than 2**24
(i.e. 100000000 in octal).
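As a quick check of the arithmetic in the note above, 2**24 written in octal is a 1 followed by eight zeros. The Python snippet below is only an illustration; the helper name to_octal is hypothetical, and nothing here reflects how the BBP.dir parser itself is written.

```python
def to_octal(n: int) -> str:
    """Return n as an octal string without Python's '0o' prefix."""
    return format(n, "o")

threshold = 2 ** 24
print(to_octal(threshold))      # 100000000 (nine octal digits)
print(to_octal(threshold - 1))  # 77777777 (eight octal digits)
```

Ids below the threshold fit in eight octal digits; the threshold itself is the first value that needs nine.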
MonetDB5 Server
- A bug was fixed where data from a client context was freed after the
context was closed. This meant that the data being freed could belong
to the next user of the context (a next client that just connected),
leading to chaos (i.e. crashes).
SQL Frontend
- When creating a hot snapshot, allow other clients to proceed, even
with updating queries.
Jul2021-SP9 Bugfix Release (11.41.31)
MonetDB Common
- When processing the WAL, if a to-be-destroyed object cannot be found,
don’t stop, but keep processing the rest of the WAL.
- A race condition was fixed where certain write-ahead log messages
could get intermingled, resulting in a corrupted WAL file.
- If opening of a file failed when it was supposed to get memory mapped,
an incorrect value was returned to indicate the failure, causing
crashes later on. This has been fixed.
- When saving a bat failed for some reason during a low-level commit,
this was logged in the log file, but the error was then subsequently
ignored, possibly leading to files that are too short or even missing.
- The write-ahead log (WAL) is now rotated a bit more efficiently by
doing multiple log files in one go (i.e. in one low-level transaction).
- Fixed a race condition that could lead to a bat being added to the SQL
catalog but not being made persistent, causing a subsequent restart
of the system to fail (and crash).
- Fixed a race condition where a hash could have been created on a
bat using the old bat count while in another thread the bat count
got updated. This would make the hash be based on too small a size,
causing failures later on.
- When extending a bat failed, the capacity had been updated already and
was therefore too large. This could then later cause a crash. This has
been fixed by only updating the capacity if the extend succeeded.
- A bug was fixed when dealing with copy-on-write memory maps. These can
occur for some bats used by the write-ahead log code when they grow
large enough.
MonetDB5 Server
- Client connections are cleaned up better so that we get fewer instances
of clients that cannot connect.
- Fixed a bug where the MAL optimizer would use the starttime of the
previous query to determine whether a query timeout occurred.
SQL Frontend
- Increased the size of a variable counting the number of changes made
to the database (e.g. in case more than 2 billion rows are added to
a table).
- Improved cleanup after failures such as failed memory allocations.
- An insert into a table from which a column was dropped in a parallel
transaction was incorrectly not flagged as a transaction conflict.
- Added some error checking to prevent crashes. Errors would mainly
occur under memory pressure.
- Fixed cleanup after a failed allocation where the data being cleaned
up was uninitialized but still used as pointers to memory that also had
to be freed.
- A bug was fixed when optimizing combining of range select
subexpressions.
- If there was an error in one of the special commands to the server
(e.g. setting the reply size for result sets), the server could get
into an infinite loop. This has been fixed.
- Fixed a double cleanup after a failed allocation in COPY INTO. The
double cleanup could cause a crash due to a race condition it enabled.
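The first item in this list mentions a change counter that had to be widened to cope with more than 2 billion rows. As a rough illustration, assuming the old counter was a signed 32-bit integer (an assumption; the note does not name the type), the Python sketch below shows where such a counter runs out. The helper wrap_int32 is hypothetical and only emulates two's-complement wraparound.

```python
INT32_MAX = 2**31 - 1  # 2147483647, the largest signed 32-bit value

def wrap_int32(n: int) -> int:
    """Emulate storing n in a signed 32-bit integer (two's complement)."""
    return (n + 2**31) % 2**32 - 2**31

print(wrap_int32(INT32_MAX))      # 2147483647
print(wrap_int32(INT32_MAX + 1))  # -2147483648
```

A wider (for example 64-bit) counter avoids the wraparound for any realistic row count.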
Merovingian
- Stop logging references to monetdbd’s logfile in said logfile.
Jul2021-SP8 Bugfix Release (11.41.27)
MonetDB Common
- A bug was fixed when upgrading a database from the Oct2020 releases
(11.39.X) or older when the write-ahead log (WAL) was not empty and
contained instructions to create new tables.
- Avoid logging failures to back up files that didn’t need to be backed
up in the first place.
- Avoid an attempt to access a file when the database is in memory.
SQL Frontend
- Fixed a busy loop in the code that applies the write-ahead log when
there are log files that cannot yet be cleaned due to active
transactions. This loop can become nasty when mserver5 is exiting.
Merovingian
- In certain cases (when an mserver5 process exits right after producing
a message) the log message was logged over and over again, causing
monetdbd to use 100% CPU. This has been fixed.
Jul2021-SP7 Bugfix Release (11.41.25)
MonetDB Common
- When destroying a bat, make sure there are no files left over in
the BACKUP directory since they can cause problems when the bat id
gets reused.
- Fixed an off-by-one error in the logger which caused older log files
to stick around longer in the write-ahead log than necessary.
- When an empty BAT is committed, skip writing (and synchronizing to
disk) the heap (tail and theap) files and write 0 for their sizes to
the BBP.dir file. When reading the BBP.dir file, if an empty BAT is
encountered, set the sizes of those files to 0. This fixes potential
issues during startup of the server (BBPcheckbats reporting errors).
- Make sure heap files of transient bats get deleted when the bat is
destroyed. If the bat was a partial view (sharing the vheap but not
the tail), the tail file wasn’t deleted.
- Various changes were made to satisfy newer compilers.
- The batDirtydesc and batDirtyflushed Boolean values have been deprecated
and are no longer used. They were both holdovers from long ago.
- Various race conditions (data races) have been fixed.
- All accesses to the BACKUP directory need to be protected by the
same lock. The lock already existed (GDKtmLock), but wasn’t used
consistently. This is now fixed. Hopefully this makes the hot snapshot
code more reliable.
MonetDB5 Server
- Various race conditions (data races) have been fixed.
Merovingian
- When multiple identical messages are written to the log, write the
first one, and combine subsequent ones in a single message.
- Fixed a leak where the log file wasn’t closed when it was reopened
after a log rotation (SIGHUP signal).
- Try to deal more gracefully with “inherited” mserver5 processes.
This includes not complaining about an “impossible state”, and allowing
such processes to be stopped by the monetdbd process.
- When a transient failure occurs during processing of a new connection to
the monetdbd server, sleep for half a second so that if the transient
failure occurs again, the log file doesn’t get swamped with error
messages.
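The first item in this list describes collapsing identical log messages: the first is written, and the repeats are combined into one message. A minimal Python sketch of that idea follows. It is not monetdbd's implementation; the function name collapse_repeats and the wording of the summary line are hypothetical.

```python
from typing import Iterable, Iterator

def collapse_repeats(messages: Iterable[str]) -> Iterator[str]:
    """Yield each message once; summarise immediate repeats in one extra line."""
    previous = None
    repeats = 0
    for msg in messages:
        if msg == previous:
            repeats += 1  # identical to the last written message: just count it
            continue
        if repeats:
            # hypothetical wording for the combined message
            yield f"(last message repeated {repeats} more time(s))"
        yield msg
        previous, repeats = msg, 0
    if repeats:
        yield f"(last message repeated {repeats} more time(s))"
```

For example, list(collapse_repeats(["a", "a", "a", "b"])) produces "a", one summary line for the two repeats, and then "b".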
Bug Fixes
Jul2021-SP6 Bugfix Release (11.41.23)
Bug Fixes
Jul2021-SP5 Bugfix Release (11.41.21)
MonetDB Common
- Fixed a race condition which could cause too large a size to be written
for a .theap file to the BBP.dir file after the correctly sized file had
been saved to disk.
- We now ignore the size and capacity columns in the BBP.dir file.
These values are essential during run time, but not useful in the
on-disk image of the database.
Merovingian
- Disabled logging into merovingian.log of the following informational message types:
“proxying client <host>:<port> for database ‘<dbname>’ to <url>” and
“target connection is on local UNIX domain socket, passing on filedescriptor instead of proxying”.
These messages were written to the log file at each connection. In most
cases this information is not used. Disabling them reduces the log file size.
Bug Fixes
Jul2021-SP4 Bugfix Release (11.41.19)
Bug Fixes
- 7267: Update after delete does not update some rows
Jul2021-SP3 Bugfix Release (11.41.15)
MonetDB Common
- Fixed race condition during backup of BATs.
- Fixed append to BATs of type msk (bit mask).
- Fix to WAL logger when a BAT gets replaced within a transaction.
SQL Frontend
Bug Fixes
- 7225: Invalid memory access when extending a BAT during appends
- 7228: COMMIT: transaction is aborted because of concurrency conflicts, will ROLLBACK instead
Jul2021-SP2 Bugfix Release (11.41.13)
Client Package
- Dumping the database now also dumps the read-only and insert-only states of tables.
MonetDB Common
- Sometimes when the server was restarted, it wouldn’t start anymore due to an error from BBPcheckbats. We finally found and fixed a (hopefully “the”) cause of this problem.
SQL Frontend
- Number parsing for SQL was fixed. If a number was immediately followed by letters (i.e. without a space), the number was accepted and the alphanumeric string starting with the letter was interpreted as an alias (if aliases were allowed in that position).
Bug Fixes
- 7163: Multiple sql.mvc() invocations in the same query
- 7167: sys.shutdown() problems
- 7184: Insert into query blocks all other queries
- 7185: GROUPING SETS on groups with aliases provided in the SELECT returns empty result
- 7186: data files created with COPY SELECT .. INTO ‘file.csv’ fail to be loaded using COPY INTO .. FROM ‘file.csv’ when double quoted string data contains the field values delimiter character
- 7191: [MonetDBe] monetdbe_cleanup_statement() with bound NULLs on variable-sized types bug
- 7196: BATproject2: does not match always
- 7198: Suboptimal query plan for query containing JSON access filter and two negative string comparisons
- 7200: PRIMARY KEY unique constraint is violated with concurrent inserts
- 7206: Python UDF fails when returning an empty table as a dictionary
Jul2021-SP1 Bugfix Release (11.41.11)
MonetDB Common
- Some deadlock and race condition issues were fixed.
- Handling of the list of free bats has been improved, leading to less thread contention.
- A problem was fixed where the server wouldn’t start with a message from BBPcheckbats about files being too small. The issue was not that the file was too small, but that BBPcheckbats was looking at the wrong file.
- An issue was fixed where a “short read” error was produced when memory was getting tight.
- When appending to a string bat, we made an optimization where the string heap was sometimes copied completely to avoid having to insert strings individually. This copying was still done too eagerly, so now the string heap is copied less frequently. In particular, when appending to an empty bat, the string heap is now not always copied whole.
SQL Frontend
- If the server has been idle for a while with no active clients, the write-ahead log is now rotated.
- A problem was fixed where files belonging to bats that had been deleted internally were not cleaned up, leading to a growing database (dbfarm) directory.
- A leak was fixed where extra bats were created but never cleaned up, each taking up several kilobytes of memory.
- [This feature was already released in Jul2021 (11.41.5), but the ChangeLog was missing] Grant indirect privileges. With “GRANT SELECT ON <my_view> TO <another_user>” and “GRANT EXECUTE ON FUNCTION <my_func> TO <another_user>”, one can grant access to “my_view” and “my_func” to another user who does not have access to the underlying database objects (e.g. tables, views) used in “my_view” and “my_func”. The grantee will only be able to access data revealed by “my_view” or conduct operations provided by “my_func”.
- Improved error reporting in COPY INTO by giving the line number (starting with one) for the row in which an error was found. In particular, the sys.rejects() table now lists the line number of the CSV file on which the record started in which an error was found.
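The last item above distinguishes the line on which a record starts from the line on which an error is found, which matters when a quoted field spans several lines. The Python sketch below shows one way to compute one-based starting line numbers for CSV records. It is an illustration using Python's csv module, not MonetDB's loader, and the function name records_with_start_line is hypothetical.

```python
import csv
import io

def records_with_start_line(text: str):
    """Yield (start_line, fields) for each CSV record, counting lines from 1."""
    reader = csv.reader(io.StringIO(text, newline=""))
    start = 1
    for fields in reader:
        yield start, fields
        # reader.line_num is the number of physical lines consumed so far,
        # so the next record starts on the following line
        start = reader.line_num + 1

sample = 'a,b\n"x\ny",2\nc,d\n'
for line_no, fields in records_with_start_line(sample):
    print(line_no, fields)
```

In the sample, the second record contains an embedded newline and occupies lines 2 and 3, so the third record is reported as starting on line 4.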
Bug Fixes
- 7140: SQL Query Plan Non Optimal with View
- 7165: ‘JOINIDX: missing ‘.’’ when running distributed join query on merged remote tables
- 7172: Unexpected query result with merge tables
- 7173: If truncate is in transaction then after restart of MonetDB the table is empty
- 7178: Remote Table Throws Error - createExceptionInternal: !ERROR: SQLException:RAstatement2:42000!The number of projections don’t match between the generated plan and the expected one: 1 != 1200
Jul2021 Feature Release (11.41.5)
Client Package
- The MonetDB stethoscope has been removed. There is now a separate package available with PIP (monetdb_stethoscope) or an RPM or DEB package (stethoscope) from the monetdb.org repository.
Mapi Library
- Added an optional MAPI header field which can be used to immediately set reply size, autocommit, time zone and some other options; see mapi.h. This makes client connection setup faster. Support has been added to mapilib, pymonetdb and the JDBC driver.
ODBC Driver
- A typo that made the SQLSpecialColumns function unusable was fixed.
MonetDB Common
- A bug in the grouping code has been fixed.
- Hash indexes are no longer maintained at all cost: if the number of distinct values is too small compared to the total number of values, the index is dropped instead of being maintained during updates.
- A new type, called msk, was introduced. This is a bit mask type. In a bat with type msk, each row occupies a single bit, so 8 rows are stored in a single byte. There is no NULL value for this type.
- The function of the BAT iterator (type BATiter, function bat_iterator) has been expanded. The iterator now contains more information about the BAT, and it contains a pointer to the heaps (theap and tvheap) that are stable, at least in the sense that they will remain available even when parallel threads update the BAT and cause those heaps to grow (and therefore possibly move in memory). A call to bat_iterator must now be accompanied by a call to bat_iterator_end.
- Implemented function BUNreplacemultiincr to replace multiple values in a BAT in one go, starting at a given position.
- Implemented new function BUNreplacemulti to replace multiple values in a BAT in one go, at the given positions.
- Removed function BUNinplace, just use BUNreplace, and check whether the BAT argument is of type TYPE_void before calling if you don’t want to materialize.
- Implemented a function BUNappendmulti which appends an array of values to a BAT. It is a generalization of the function BUNappend.
- Changed the interface of the atom read function. It now requires an extra pointer to a size_t value that gives the current size of the destination buffer, and when that buffer is too small, it receives the size of the reallocated buffer that is large enough. In any case, and as before, the return value is a pointer to the destination buffer.
- Environment variables (sys.env()) must be UTF-8, but since they can contain file names which may not be UTF-8, there is now a mechanism to store the original values outside of sys.env() and store %-escaped (similar to URL escaping) values in the environment. The key must still be UTF-8.
- We now save the location of the min and max values when known.
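The msk item in this list states that each row occupies a single bit, so eight rows share one byte, and that there is no NULL value. The Python sketch below illustrates that storage idea only. The bit order within a byte (least significant bit first) is an assumption, and pack_bits and get_bit are hypothetical helpers, not GDK functions.

```python
def pack_bits(rows):
    """Pack an iterable of booleans into bytes, 8 rows per byte."""
    out = bytearray()
    for i, bit in enumerate(rows):
        if i % 8 == 0:
            out.append(0)  # start a new byte every 8 rows
        if bit:
            out[-1] |= 1 << (i % 8)  # assumed order: least significant bit first
    return bytes(out)

def get_bit(packed, i):
    """Return row i of a packed bit column as a bool."""
    return bool((packed[i // 8] >> (i % 8)) & 1)

packed = pack_bits([True, False, True] + [False] * 5 + [True])
print(len(packed))         # 2 bytes for 9 rows
print(get_bit(packed, 8))  # True
```

Because every bit pattern is a valid true/false value, there is no spare pattern left to represent NULL, which matches the note above.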
MonetDB5 Server
- When using the --in-memory option, mserver5 will run completely in memory, i.e. not create a database on disk.
The server can still be connected to using the name of the in-memory database. This name is “in-memory”.
- By using the option “--dbextra=in-memory”, mserver5 can be instructed to keep transient BATs completely in memory.
SQL Frontend
- The system view sys.ids has been updated to include some more system IDs.
- The sys.storage() function now only returns meta data, i.e. data that can be calculated without access to the column contents.
- Since STREAM tables support is removed, left over STREAM tables are dropped from the catalog.
- Fix a warning emitted by some implementations of the tar(1) command when unpacking hot snapshot files.
- Support reading the concatenation of compressed files as a single compressed file.
- COPY BINARY overhaul. Allow control over binary endianness using COPY [ (BIG | LITTLE | NATIVE) ENDIAN] BINARY syntax.
Defaults to NATIVE. Strings are now \0 terminated rather than \n.
Support for BOOL, TINYINT, SMALLINT, INT, LARGEINT, HUGEINT, with their respective “INTMIN” values as the NULL
representation; 32 and 64 bit FLOAT/REAL, with NaN as the NULL representation;
VARCHAR/TEXT, JSON and URL with \x80 as the NULL representation; UUID as fixed width 16 byte binary values,
with (by default) all zeroes as the NULL representation; temporal type structs as defined in copybinary.h
with any invalid value as the NULL representation.
- In the Jul2021 release the storage and transaction layers have undergone major changes.
The goal of these changes is robust performance under inserts, updates and deletes, and lower transaction
startup costs, allowing faster (small) queries. Where the old transaction layer duplicated a lot of data structures
during startup, the new layer shares the same tree. Object timestamps guarantee the isolation of objects.
On the storage side, timestamps likewise indicate whether a row is visible to a transaction (deleted or valid).
These changes also slightly alter the perceived transactional behavior. The new implementation uses structures shared among all transactions, which do not allow multiple changes of the same object.
We therefore follow the first-writer-wins principle: if a transaction creates a table with name ‘table_name’
and another transaction concurrently does the same, the later of the two will fail with a concurrency conflict error
message (even if the first writer never commits). We expect most users not to notice this change, as such schema changes
aren’t usually done concurrently.
- There is now a function sys.current_sessionid() to return the session ID of the current session.
This ID corresponds with the sessionid in the sys.queue() result.
- Merge statements could not produce correct results on complex join conditions, so a renovation was made.
As a consequence, subqueries now have to be disabled on merge join conditions.
- Preserve in-query comments.
- Use of CTEs inside UPDATE and DELETE statements is now more restricted. Previously they could be used without any extra
specification in the query (e.g. with “v1”(“c1”) as (…) delete from “t” where “t”.“c1” = “v1”.“c1”), however this was not
conformant with the SQL standard. In order to use them, they must be specified in the FROM clause in UPDATE statements or inside a subquery.
- Added a ‘schema path’ property to users, specifying a list of schemas to be searched to find SQL objects such as tables
and functions. The scoping rules have been updated to support this feature and SQL objects are now looked up in the
following order: 1. On occasions with multiple tables (e.g. add foreign key constraint, add table to a merge table),
the child will be searched for in the parent’s schema. 2. For tables only, declared tables on the stack. 3. The ‘tmp’ schema if
not listed on the ‘schema path’. 4. The session’s current schema. 5. Each schema from the ‘schema path’ in order. 6. The ‘sys’ schema if not listed on the ‘schema path’. Whenever the full path is specified, i.e. “schema”.“object”, no search is made beyond the explicit schema.
- To update the schema path, the statement ALTER USER x SCHEMA PATH y; was added. The [SCHEMA PATH string] syntax was added to
the CREATE USER statement. The schema path must be a single string where each schema must be between double quotes and
separated with a single comma, e.g. ‘“sch1”,“sch2”’. For every created user, if the schema path is not given, ‘“sys”’ will be the default schema path.
- Changes in the schema path won’t be reflected on currently connected users, therefore they have to re-connect to see the change. Nonexistent schemas on the path will be ignored.
- Leftover STREAM table definitions from the Datacell extension were removed from the parser. They no longer had any effect.
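The schema path items above describe both the string format (each schema in double quotes, separated by single commas) and the lookup order for unqualified names. The Python sketch below restates those two rules. It is a simplified illustration, not the server's resolver: it covers only steps 3 to 6 of the lookup order, it does not handle schema names that themselves contain double quotes, and format_schema_path and search_order are hypothetical names.

```python
def format_schema_path(schemas):
    """Build a schema path string such as '"sch1","sch2"'."""
    return ",".join(f'"{name}"' for name in schemas)

def search_order(current_schema, schema_path):
    """Schemas consulted for an unqualified name (steps 3-6 of the note above)."""
    order = []
    if "tmp" not in schema_path:
        order.append("tmp")          # step 3: 'tmp' unless already on the path
    order.append(current_schema)     # step 4: the session's current schema
    order.extend(schema_path)        # step 5: each schema on the path, in order
    if "sys" not in schema_path:
        order.append("sys")          # step 6: 'sys' unless already on the path
    return order

print(format_schema_path(["sch1", "sch2"]))   # "sch1","sch2"
print(search_order("sch1", ["sch1", "sch2"]))
```

The formatted string is what would be passed as the SCHEMA PATH value in the ALTER USER or CREATE USER statements mentioned above.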
Merovingian
- Deprecate ‘profilerstart’ and ‘profilerstop’ commands. Since stethoscope is a separate project
(https://github.com/MonetDBSolutions/monetdb-pystethoscope) the installation directory is not standard anymore.
‘profilerstart’ and ‘profilerstop’ commands assume that the stethoscope executable is in the same directory as ‘mserver5’.
This is no longer necessarily true since stethoscope can now be installed in a Python virtual environment.
The commands still work if stethoscope is installed using the official MonetDB installers, or if a symbolic link is created in the directory where ‘mserver5’ is located.
- The exittimeout value can now be set to a negative value (e.g. -1) to indicate that when stopping the dbfarm
(using monetdbd stop dbfarm), any mserver5 processes are to be sent a termination signal and then waited for until
they terminate. In addition, if exittimeout is greater than zero, the mserver5 processes are sent a SIGKILL signal
after the specified timeout and the managing monetdbd is sent a SIGKILL signal after another five seconds
(if it didn’t terminate already). The old situation was that the managing monetdbd process was sent a SIGKILL after
30 seconds, and the mserver5 processes that hadn’t terminated yet would be allowed to continue their termination sequence.
Bug Fixes
- 2030: Temporary table is semi-persistent when transaction fails
- 7031: I cannot start MonetDB, because the installation path has Chinese.
- 7055: Table count returning function used inside other function gives wrong results.
- 7075: Inconsistent Results using CTEs in Large Queries
- 7079: WITH table AS… UPDATE ignores the WHERE conditions on table
- 7081: Attempt to allocate too much space in UPDATE query
- 7093: ‘current_schema’ not in sys.keywords
- 7096: DEBUG SQL statement broken
- 7115: Jul2021: ParseException while upgrading Oct2020 database
- 7116: Jul2021: Cannot create filter functions
- 7125: MonetDB Round Function issues in the latest release
- 7126: The “lower” and “upper” functions doesn’t work for Cyrillic alphabet
- 7127: Bug report: “write error on stream” that results in mclient crash
- 7128: Bug report: strange error message “Subquery result missing”
- 7129: Bug report: TypeException:user.main[19]:‘batcalc.between’ undefined
- 7130: Bug report: TypeException:user.main[396]:‘algebra.join’ undefined
- 7131: Bug report: TypeException:user.main[273]:‘bat.append’ undefined
- 7133: WITH ( SELECT x ) DELETE FROM … deletes wrong tuples
- 7136: MERGE statement is deleting rows if the column is set as NOT NULL even though it should not
- 7137: Segmentation fault while loading data
- 7138: Monetdb Python UDF crashes because of null aggr_group_arr
- 7141: COUNT(DISTINCT col) does not calculate correctly distinct values
- 7142: Aggregates returning tables should not be allowed
- 7144: Type up-casting (INT to BIGINT) doesn’t always happen automatically
- 7146: Query produces this error: !ERROR: Could not find %102.%102
- 7147: Internal error occurs and is not shown on the screen
- 7148: Select distinct is not working correctly
- 7151: Insertion is too slow
- 7153: System UDFs lose their indentation - Python functions broken
- 7158: Python aggregate UDF returns garbage when run on empty table
- 7161: fix priority