What’s New in MariaDB 10.3
Let me start with an announcement: From now on, MariaDB is being treated as a distinct database on modern-sql.com.
The reason for the inclusion in my club of major SQL databases0 is simple: Although MariaDB was originally described as a “branch of MySQL that is, on the user level, compatible with the main version”1, both versions have diverged considerably in the past few years. At first, the differences were mostly limited to operative aspects (including some legal aspects). Over the last two years, even the SQL dialects started to diverge notably. Treating MariaDB as a distinct product is the unavoidable consequence for me.
Furthermore, MariaDB’s popularity is still growing2 and it seems that the MariaDB team is finally embracing the SQL standard. I must actually say that “they now embrace modern SQL standards”—not the SQL-92 standard that’s been overhauled six times.3
The release of MariaDB 10.3 demonstrates this in an impressive way. Read on to see what I mean.
- System-Versioned Tables
- Sequence Generators
- Two-Phase Processing of
- Off Topic:
- Event Note: Modern SQL Training in Vienna
Have you ever had the requirement to keep the old data when using
delete? Pretty much every business application needs that—e.g. for customer master data. System-versioned tables is one of the standard SQL features that get’s it done.4
The SQL standard uses closed-open intervals5—stored in “from” and “to”
timestamp columns—to denote the period in which each row is valid. The “system” in system-versioned tables means that those time stamps are automatically maintained by the database whenever you write to such a table. System versioning is meant to be transparent for the application.
The syntax to create system-versioned tables looks like this:
CREATE TABLE … ( [ regular columns and constraints ] , valid_from TIMESTAMP(6) GENERATED ALWAYS AS ROW START NOT NULL , valid_to TIMESTAMP(6) GENERATED ALWAYS AS ROW END NOT NULL , PERIOD FOR SYSTEM_TIME (valid_from, valid_to) ) WITH SYSTEM VERSIONING
As you can see, it explicitly adds two columns to hold the validity time stamps. These are basically regular columns, and are visible to the user.6 That is the only required change that is not transparent to
selectand data-modifying statements. Next, the time stamps are logically combined into a period called
system_time. Finally, the
with system versioning clause enables the magic.
Of course you can also
alter existing tables to add system versioning. If you already have the validity stored as an closed-open interval, you can use it right away.
Once enabled, system versioning maintains the columns in the
system_time period automatically. That means that
delete doesn’t actually remove anything. It just sets the end of the validity of the affected rows to the transaction time.7
Update does the same, but it also adds new rows with the modified data and the transaction time as the start of validity. Once again,
update is basically
The SQL standard does not specify how system-versioned tables store the data physically, nor does it define data retention mechanisms.
MariaDB keeps the old data in the same table by default. However, partitioning capabilities have been enhanced so you can physically separate the current data from the old one.
select from a system-versioned table you’ll just get the current data. To access old data, the syntax of the
from clause was extended:
FROM <table> FOR SYSTEM_TIME [ AS OF <ts> | BETWEEN <ts> AND <ts> | FROM <ts> TO <ts> ] [ [AS] <new name> ]
for system_time clause immediately follows the table name in the
from clause—i.e. a new table (
AS <new name>) follows at the very end.
Of course the
as of clause delivers the data as of the specified time.
from take two time stamps. The difference between them is that
between included the upper bound while
from excludes it. The lower bound is exclusive for both of them.8
- System-Versioned Tables in the MariaDB documentation
- Temporal features in SQL:2011 [PDF]
- Make system versioning use standard compliant transaction timestamps in the MariaDB bug tracker
values clause is probably the most basic SQL feature at all. Most people know about it from its use in the
insert statement. However, the
values clause is actually a concept in its own right that’s also useful outside of
values clause is the standard SQL approach to
from. In addition to that, the
values clause has the big advantage that it can easily create multiple rows in one go:
VALUES (<columns of row 1>) , (<columns of row 2>) , …
There is no need to
union all multiple
SELECT * FROM ( VALUES ('a', 'b') ) t(a,b)
Unfortunately, MariaDB 10.3 does not support the
from clause renaming shown above. It remains the
with clause, which is somewhat bulky for this case.
Another problem with the
values clause is its lack of support in many databases. Even though MariaDB 10.3 now supports the
from still has better support among the tested databases.
values— Create Rows out of Nothing
- Assign Names to Columns without Known Names
- Crash when VALUES in derived table contains expressions in the MariaDB bug tracker
“A sequence generator is a mechanism for generating successive exact numeric values, one at a time.”9 They are similar to identity columns, but they are not tied to an
insert operation on a specific table so that they can be used for arbitrary purposes. The
next value for <sequence name> expression is used to obtain a new value.
Naturally, sequence generator can be used in a stand-alone
valuesstatement to fetch a new value.
VALUES (NEXT VALUE FOR <sequence name>)
percentile_cont functions are used to get a percentile value—e.g. the median—from an ordered set of values.
In standard SQL these functions require the
within group clause and optionally accept an
over clause. However, in MariaDB 10.3 the
overclause is also required.
PRECENTILE_DISC (0.5) WITHIN GROUP (ORDER BY x)OVER …
The difference between these two functions is how they cope with a case in which the specified percentile falls in between two rows. For example, when the middle row—holding the median value—out of four rows is needed.
Percentile_disc always returns a discrete value from the input data—even if the specified percentile falls between two rows. In that case, it returns the first one with respect to the specified
order by clause.
Percentile_cont, on the other hand, performs a weighted linear interpolation between the two values of the adjacent rows.
MariaDB 10.3 also added the proprietary
median(x) function, which is a different syntax for
percentile_disc(0.5) within group (order by x).
except are table operators similar to
union. Instead of concatenating two tables, they produce the intersection and the difference, respectively.
- INTERSECT at the MariaDB documentation
- EXCEPT at the MariaDB documentation
is [not] distinct from(
Two-Phase Processing of
This is best explained by example. Have a close look at this query.
UPDATE … SET c1 = c2 , c2 = c1
Note that both columns appear on both sides of the assignment operator (
=)—i.e. both columns are read and changed in the same statement.
In this case, the SQL standard requires that all read operations are effectively completed before the first writing. You can think of it as a read-only phase in which all right-hand side expressions are evaluated, followed by a write-only phase, which actually stores the results. If this rule is obeyed, the statement above exchanges the values in the
Nonetheless the default behavior in MariaDB (including 10.3) as well as in MySQL is different. They execute the
set clauses in the order they appear in the statement. Thus the result of the above query is that both columns contain the value previously stored in
SIMULTANEOUS_ASSIGNMENT mode introduced with MariaDB 10.3 activates standard-conforming handling of this case.
SET sql_mode = (SELECT CONCAT( @@sql_mode , ',SIMULTANEOUS_ASSIGNMENT' ) )
This is very similar to the previous topic. The main difference is that previous issue deals with cyclic references between columns of the same row, whereas this issue is about cyclic references between different rows of the same table.
The problem can appear when a data-modifying statement has a query—e.g. inside
in10—that fetches data from the very same table that the statement is writing to. Again, there could be a cyclic reference, this time between rows.
Consider the simplest possible example:
INSERT INTO t1 SELECT * FROM t1
The question is basically whether the
select can already see the rows that are just inserted. Well, the obvious answer is “no”—otherwise it would end up as infinite loop.
The problem can also appear with the other data-modifying statements:
merge. However, the problem can only occur if these statements contain a query that accesses the target table again—otherwise it cannot access different rows from the same table—cyclic references cannot emerge.
The SQL standard solves the problem in the same way as for the
setclause discussed above. It requires the execution to be effectively split into read-only and write-only phases. This time, the standard doesn’t strictly require this processing because it is an optional feature (F781, “Self-referencing operations”). Databases not supporting this feature should yield an error when you try to execute a self-referencing statement.
Indeed, MySQL and MariaDB do produce an error.
You can’t specify target table ‘t1’ for update in FROM clause
Note that this message always says “update” even when the statement was a
delete. With MariaDB 10.0, the message was rephrased:
Table ‘t1’ is specified twice, both as a target for ‘UPDATE’ and as a separate source for data
MariaDB 10.3 can execute self-referencing
Group_concat is not a standard SQL feature. The standard function for that is
listagg. However, I though I’d include this enhancement here nevertheless because
listagg offers a different solution for the same problem—how to prevent exceeding the length limitation of the results’s data type?
The SQL standard’s
listagg has the
on overflow clause that can stop adding further values before exceeding the limit. MariaDB’s
group_concat can now take an optional
limit clause to put an upper bound on the number of concatenated values.
Event Note: Modern SQL Training in Vienna
If you have read this far, you are probably interested in learning more about all the modern SQL goodness. If so, have a look at my upcoming training in Vienna (Sep 17-21).
It’s a full week long and covers all the recurring issues that I have observed in my training and consulting assignments over the years. That includes indexing in detail, basics you might be afraid to ask about, and modern SQL features such as window functions and recursions in detail.