Chapter 3: Expressions, Literals, Nullability, Sensitivity

Until this point we've only discussed simple kinds of expressions as well as variables and table columns marked with NOT NULL. These are indeed the easiest types for CQL to work with as they tend to correspond most directly to the types known to C. However, SQL provides for many more types of expressions as well as nullable types and these require handling in any language that purports to be like SQL.

Expression Examples

The usual arithmetic operators apply in CQL:

Example expressions (these are all true)

(1 + 2) * 3 == 9
+ 2 * 3 == 7
/ 3 == 2
- 5 == 2
% 5 == 1
/ 2.5 == 2
& 3 == 2 | 1
<< 2 == 4

However, before going any further it's important to note that CQL is inherently a two-headed beast. Expressions are either evaluated by transpiling to C (like the predicate of an IF statement, or a variable assignment) or by sending them to SQLIte for evaluation (like expressions inside a SELECT statement or the WHERE part of a DELETE).

CQL evaluation rules are designed to be as similar as possible but some variance is inevitable because evaluation is done in two fundamentally different ways.

Operator Precedence

The operator precedence rules in CQL are as follows; the top-most rule binds the most loosely and the bottom-most rule binds the most tightly:

ASSIGNMENT:     :=
LOGICAL_OR:     OR
LOGICAL_AND:    AND
LOGICAL_NOT:    NOT
EQUALITY:       = == != <>  IS [NOT], [NOT] IN, [NOT] LIKE,
                [NOT] MATCH, [NOT] GLOB, [NOT] BETWEEN
INEQUALITY:     <  <=  >  >=
BINARY:         << >> & |
ADDITION:       + -
MULTIPLICATION: * / %
CONCAT:         ||
COLLATE:        COLLATE
UNARY:          ~  -

The above rules are not the same as C's operator precedence rules! Instead, CQL follows SQLite's rules. Parentheses are emitted in the C output as needed to force that order.

NOTE: CQL emits minimal parentheses in all outputs. Different parentheses are often needed for SQL output as opposed to C output.

Order of Evaluation

In contrast to C, CQL guarantees a left-to-right order of evaluation for arguments. This applies both to arguments provided to the operators mentioned in the previous section as well as arguments provided to procedures.

Variables, Columns, Basic Types and Nullability

CQL needs type information for both variables in the code and columns in the database. Like SQL, CQL allows variables to hold a NULL value and just as in SQL the absence of NOT NULL implies that NULL is a legal value. Consider these examples:

-- real code should use better names than this :)
create table all_the_nullables(
  i1 integer,
  b1 bool,
  l1 long,
  r1 real,
  t1 text,
  bl1 blob
);

declare i2 integer;
declare b2 bool;
declare l2 long;
declare r2 real;
declare t2 text;
declare bl2 blob;

ALL of i1, i2, b1, b2, l1, l2, r1, r2, t1, t2, and bl1, bl2 are nullable. In some sense variables and columns declared nullable (by virtue of the missing NOT NULL) are the root sources of nullability in the SQL language. That and the NULL literal. Though there are other sources as we will see.

NOT NULL could be added to any of these, e.g.

-- real code should use better names than this :)
declare i_nn integer not null;

In the context of computing the types of expressions, CQL is statically typed and so it must make a decision about the type of any expression based on the type information at hand at compile time. As a result it handles the static type of an expression conservatively. If the result might be null then the expression is of a nullable type and the compiled code will include an affordance for the possibility of a null value at runtime.

The generated code for nullable types is considerably less efficient and so it should be avoided if that is reasonably possible.

LET Statement

You can declare and initialize a variable in one step using the LET form, e.g.

LET x := 1;

The named variable is declared to be the exact type of the expression on the right. More on expressions in the coming sections. The right side is often a constant in these cases but does not need to be.

LET i := 1;  -- integer not null
LET l := 1L;  -- long not null
LET t := "x";  -- text not null
LET b := x IS y; -- bool not null
LET b := x = y;  -- bool (maybe not null depending on x/y)

The pseudo function "nullable" removes not null from the type of its argument but otherwise does no computation. This can be useful to initialize nullable types.

LET n_i := nullable(1);  -- nullable integer variable initialized to 1
LET n_l := nullable(1L);  -- nullable long variable initialized to 1

The pseudo function "sensitive" adds @sensitive to the type of its argument but otherwise does no computation. This also can be useful to initialize nullable types.

LET s_i := sensitive(1);  -- sensitive nullable integer variable initialized to 1
LET s_l := sensitive(1L);  -- sensitive nullable long variable initialized to 1

The `@RC` special variable

CQL also has the special built-in variable @RC which refers to the most recent error code returned by a SQLite operation, e.g. 0 == SQLITE_OK, 1 == SQLITE_ERROR. @RC is of type integer not null. Specifically:

each catch block captures the error code when it is entered into its own local variable
this variable is created lazily, so it only exists if it is used
- the variable is called _rc_thrown_n where n is the catch block number in the procedure
any reference to @RC refers to the above error variable of the innermost catch block the @RC reference is in
if the @RC reference happens outside of any catch block its value is SQLITE_OK (i.e. zero).

Types of Literals

There are a number of literal objects that may be expressed in CQL. These are as follows:

String Literals

A double quoted string is a C style string literal
- the usual simple C escape sequences are supported
- the \xNN form for embedded hex characters is supported, however
- the \0NNN octal form is not supported, and
- embedded nulls in string literals (\0 or \0x00) are not supported (you must use blobs in such cases)
A single quoted string is a SQL style string literal
- No escape sequences are supported other than '' to indicate a single quote character (this is just like normal SQLite)
A sequence of single or double quoted strings separated by whitespace such as "xx" 'yy' "zz" which are concatenated to make one literal
The sequence @FILE("some_string") is a special string literal
- the value of this literal is the path of the current compiland starting at the letters in some_string, or
- the entire path of the current compiland if some_string does not occur in the path
- the purpose of the @FILE construct is to provide a partial path to a file for diagnostics that is consistent even if the file is built in various different root paths on different build machines

Blob Literals

SQLite Blob literals are supported in SQL contexts (i.e. where they will be processed by SQLite), CQL produces an error if you attempt to use a blob literal in a loose expression

Numeric Literals

All numeric literals are considered to be positive; negative numbers are actually a positive literal combined with unary minus (the negation operator)
Base 10 and hexadecimal literals are supported
Literals with a decimal point are of type REAL and stored as the C type double
Literals that can fit in a signed integer without loss, and do not end in the letter L are integer literals
Larger literals, or those ending with the letter L are long integer literals.
Literals that begin with 0x are interpreted as hex

Examples:

  1.3            -- real
  2L             -- long
  123456789123   -- long
  123            -- integer
  0x10           -- hex integer
  0x10L          -- hex long integer

The NULL literal

The use of NULL always gives a nullable result however this literal is special in that it has no storage class. NULL is neither numeric nor string itself but rather mutates into whatever it is first combined with. For instance NULL + 1 results in a nullable integer. Because NULL has no primitive type in some cases where type knowledge is required you might have to use the CAST() function to cast the NULL to a specific type such as CAST(NULL as TEXT). This construct guarantees type consistence in cases like SELECT from different sources combined with UNION ALL

Note: constructs like CAST(NULL as TEXT) are always rewritten to just NULL before going to SQLite as the cast is uninteresting except for the type information which SQLite doesn't need/use anyway.

Other Considerations

There are no boolean literals other than the integers 0 and 1.

The C pre-processor is often combined with CQL in which case the _FILE_ and _LINE_ directives may be used to create literals; they will be preprocessed into normal literals.

The use of _FILE_ can give surprising results in the presence of build systems, hence the existence of @FILE(...).

Const and Enumerations

It's possible to use named constants in CQL with nothing more than the C pre-processor features that have already appeared, however use of #define in such a way is not entirely satisfactory. For one thing, CQL will not know these constants exist in any way as they will be replaced before it ever sees them. This means CQL can't provide their values for you in the JSON output for instance.

To help with this problem, CQL includes constants, note, this is not the same as enumerated types as we'll see later. You can now write something like this:

declare enum business_type integer (
  restaurant,
  laundromat,
  corner_store = 11+3  /* math added for demo purposes only */
);

After this enum is declared, this:

select business_type.corner_store;

is the same as this:

select 14;

And that is exactly what SQLite will see, the literal 14.

You can also use the enum to define column types:

CREATE TABLE businesses (
name  TEXT,
type  business_type
);

CQL will then enforce that you use the correct enum to access those columns. For example, this is valid:

SELECT * FROM businesses WHERE type = business_type.laundromat;

While this does not type check:

SELECT * FROM businesses WHERE type = business_corp_state.delaware;

Enumerations follow these rules:

the enumeration can be any numeric type (bool, integer, long integer, real)
the values of the enumeration start at 1 (i.e. if there is no = expression the first item will be 1, not 0)
if you don't specify a value, the next value is the previous value plus one
if you do specify a value it can be any constant expression and it will be cast to the type of the enumeration (even if that is lossy)
the enumeration can refer to previous values in itself with no qualification (big = 100.0, medium = big/2, small = medium/2)
the enumeration can refer to previously defined enumerations as usual (code = business_type.restaurant)
once the enumeration is defined you refer to its members in a fully qualified fashion enum_name.member_name elsewhere

With these forms you get some additional useful output:

the JSON includes the enumerations and their values in their own section
you can use the @emit_enums directive to put declarations like this into the .h file that corresponds to the current compiland

enum business_type {
  business_type__restaurant = 1,
  business_type__laundromat = 2,
  business_type__corner_store = 14
};

Note that C does not allow for floating point enumerations, so in case of floating point values such as:

declare enum floating real (
  one = 1.0,
  two = 2.0,
  e = 2.71828,
  pi = 3.14159
);

you get:

// enum floating (floating point values)
#define floating__one 1.000000e+00
#define floating__two 2.000000e+00
#define floating__e 2.718280e+00
#define floating__pi 3.141590e+00

In order to get useful expressions in enumeration values, constant folding and general evaluation was added to the compiler; these expressions work on any numeric type and the literal null. The supported operations include:

+, -, *, /, %, |, &, <<, >>, ~, and, or, not, ==, <=, >=, !=, <, >, the cast operator and the case forms (including the iif function). These are enough to make a lot of very interesting expressions, all of which are evaluated at compile time.

Constant folding was added to allow for rich enum expressions, but there is also the const() primitive in the language which can appear anywhere a literal could appear. This allows you do things like:

create table something(
  x integer default const((1<<16)|0xf) /* again the math is just for illustration */
);

The const form is also very useful in macros:

#define SOMETHING const(12+3)

This form ensures that the constant will be evaluated at compile time. The const pseudo-function can also nest so you can build these kinds of macros from other macros or you can build enum values this way. Anywhere you might need literals, you can use const.

Named Types

A common source of errors in stored procedures is incorrect typing in arguments. For instance, a particular key for an entity might need to be LONG or even always LONG NOT NULL or LONG NOT NULL @SENSITIVE and the only way to do this in the past was maybe with some #define thing. Otherwise you have to diligently get the type right in all the places, and should it ever change, again you have to visit all the places. To help with this situation, and to make the code a little more self-describing we added named types to the language. This is a lot like typedef in the C language. They do not create different incompatible types but they do let you name things well.

You can now write these sorts of forms:

declare foo_id type long not null;

create table foo(
  id foo_id primary key autoincrement,
  name text
);

create proc inserter(name_ text, out id foo_id)
begin
  insert into foo(id, name) values(NULL, name_);
  set id := last_insert_rowid();
end;

declare function func_return_foo_id() foo_id;

declare var foo_id;

Additionally any enumerated type can be used as a type name. e.g.

declare enum thing integer (
  thing1,
  thing2
);

declare thing_type type thing;

Enumerations always get "not null" in addition to their base type. Enumerations also have a unique "kind" associated, specifically the above enum has type integer<thing> not null. The rules for type kinds are described below.

Type Kinds

Any CQL type can be tagged with a "kind" for instance real can become real<meters>, integer can become integer<job_id>. The idea here is that the additional tag, the "kind" can help prevent type mistakes in arguments, in columns and in procedure calls. For instance:

create table things(
  size real<meters>,
  duration real<seconds>
);

create proc do_something(size_ real<meters>, duration_ real<seconds>)
begin
  insert into things(size, duration) values(size_, duration_);
end;

In this situation you couldn't accidentally switch the columns in do_something even though both are real, and indeed SQLite will only see the type real for both. If you have your own variables typed real<size> and real<duration> you can't accidentally do:

  call do_something(duration, size);

even though both are real. The type kind won't match.

Importantly, an expression with no type kind is compatible with any type kind (or none). Hence all of the below are legal.

declare generic real;
set generic := size;        -- no kind may accept <meters>
set generic := duration;    -- no kind may accept <seconds>
set duration := generic;    -- no kind may be stored in <seconds>

Only mixing types where both have a kind, and the kind is different generates errors. This choice allows you to write procedures that (for instance) log any integer or any real, or that return an integer out of a collection.

These rules are applied to comparisons, assignments, column updates, anywhere and everywhere types are checked for compatibility.

To get the most value out of these constructs, the authors recommend that type kinds be used universally except when the extra compatibility described above is needed (like low level helper functions.)

Importantly, type kind can be applied to object types as well, allowing object<dict> to be distinct from object<list>.

At run time the kind information is lost. But it does find it's way into the JSON output so external tools also get to see the kinds.

Nullability

Nullability Rules

Nullability is tracked via CQL's type system. To understand whether or not an expression will be assigned a nullable type, you can follow these rules; they will hopefully be intuitive if you are familiar with SQL:

The literal NULL is, of course, always assigned a nullable type. All other literals are nonnull.
In general, the type of an expression involving an operator (e.g., +, ==, !=, ~, LIKE, et cetera) is nullable if any of its arguments are nullable. For example, 1 + NULL is assigned the type INTEGER, implying nullability. 1 + 2, however, is assigned the type INTEGER NOT NULL.
IN and NOT IN expressions are assigned a nullable type if and only if their left argument is nullable: The nullability of the right side is irrelevant. For example, "foo" IN (a, b) will always have the type BOOL NOT NULL, whereas some_nullable IN (a, b) will have the type BOOL.
- NOTE: In CQL, the IN operator behaves like a series of equality tests (i.e., == tests, not IS tests), and NOT IN behaves symmetrically. SQLite has slightly different nullability rules for IN and NOT IN. This is the one place where CQL has different evaluation rules from SQLite, by design.
The result of IS and IS NOT is always of type BOOL NOT NULL, regardless of the nullability of either argument.
For CASE expressions, the result is always of a nullable type if no ELSE clause is given. If an ELSE is given, the result is nullable if any of the THEN or ELSE expressions are nullable.
- NOTE: The SQL CASE construct is quite powerful: Unlike the C switch statement, it is actually an expression. In this sense, it is rather more like a highly generalized ternary a ? b : c operator than a C switch statement. There can be arbitrarily many conditions specified, each with their own result, and the conditions need not be constants; typically, they are not.
IFNULL and COALESCE are assigned a NOT NULL type if one or more of their arguments are of a NOT NULL type.
In most join operations, the nullability of each column participating in the join is preserved. However, in a LEFT OUTER join, the columns on the right side of the join are always considered nullable; in a RIGHT OUTER join, the columns on the left side of the join are considered nullable.
As in most other languages, CQL does not perform evaluation of value-level expressions during type checking. There is one exception to this rule: An expression within a const is evaluated at compilation time, and if its result is then known to be nonnull, it will be given a NOT NULL type. For example, const(NULL or 1) is given the type BOOL NOT NULL, whereas merely NULL or 1 has the type BOOL.

Nullability Improvements

CQL is able to "improve" the type of some expressions from a nullable type to a NOT NULL type via occurrence typing, also known as flow typing. There are three kinds of improvements that are possible:

Positive improvements, i.e., improvements resulting from the knowledge that some condition containing one or more AND-linked IS NOT NULL checks must have been true:

IF statements:

IF a IS NOT NULL AND c.x IS NOT NULL THEN
  -- `a` and `c.x` are not null here
ELSE IF b IS NOT NULL THEN
  -- `b` is not null here
END IF;

CASE expressions:

CASE
  WHEN a IS NOT NULL AND c.x IS NOT NULL THEN
    -- `a` and `c.x` are not null here
  WHEN b IS NOT NULL THEN
    -- `b` is not null here
  ELSE
    ...
END;

IIF expressions:

IIF(a IS NOT NULL AND c.x IS NOT NULL,
  ..., -- `a` and `c.x` are not null here
  ...
)

SELECT expressions:

SELECT
  -- `t.x` and `t.y` are not null here
FROM t
WHERE x IS NOT NULL AND y IS NOT NULL

Negative improvements, i.e., improvements resulting from the knowledge that some condition containing one or more OR-linked IS NULL checks must have been false:

IF statements:

IF a IS NULL THEN
  ...
ELSE IF c.x IS NULL THEN
  -- `a` is not null here
ELSE
  -- `a` and `c.x` are not null here
END IF;

IF statements, guard pattern:

IF a IS NULL RETURN;
-- `a` is not null here

IF c.x IS NULL THEN
  ...
  THROW;
END IF;
-- `a` and `c.x` are not null here

CASE expressions:

CASE
  WHEN a IS NULL THEN
    ...
  WHEN c.x IS NULL THEN
    -- `a` is not null here
  ELSE
    -- `a` and `c.x` are not null here
END;

IIF expressions:

IIF(a IS NULL OR c.x IS NULL,
  ...,
  ... -- `a` and `c.x` are not null here
)

Assignment improvements, i.e., improvements resulting from the knowledge that the right side of a statement (or a portion therein) cannot be NULL:
- SET statements:
```
SET a := 42;
-- `a` is not null here
```
NOTE: Assignment improvements from FETCH statements are not currently supported. This may change in a future version of CQL.

There are several ways in which improvements can cease to be in effect:

The scope of the improved variable or cursor field has ended:

IF a IS NOT NULL AND c.x IS NOT NULL THEN
  -- `a` and `c.x` are not null here
END IF;
-- `a` and `c.x` are nullable here

An improved variable was SET to a nullable value:

IF a IS NOT NULL THEN
  -- `a` is not null here
  SET a := some_nullable;
  -- `a` is nullable here
END IF;

An improved variable was used as an OUT (or INOUT) argument:

IF a IS NOT NULL THEN
  -- `a` is not null here
  CALL some_procedure_that_requires_an_out_argument(a);
  -- `a` is nullable here
END IF;

An improved variable was used as a target for a FETCH statement:

IF a IS NOT NULL THEN
  -- `a` is not null here
  FETCH c INTO a;
  -- `a` is nullable here
END IF;

An improved cursor field was re-fetched:

IF c.x IS NOT NULL THEN
  -- `c.x` is not null here
  FETCH c;
  -- `c.x` is nullable here
END IF;

A procedure call was made (which removes improvements from all globals because the procedure may have mutated any of them; locals are unaffected):

IF a IS NOT NULL AND some_global IS NOT NULL THEN
  -- `a` and `some_global` are not null here
  CALL some_procedure();
  -- `a` is still not null here
  -- `some_global` is nullable here
END IF;

CQL is generally smart enough to understand the control flow of your program and infer nullability appropriately; here are a handful of examples:

IF some_condition THEN
  SET a := 42;
ELSE
  THROW;
END IF;
-- `a` is not null here because it must have been set to 42
-- if we've made it this far

IF some_condition THEN
  SET a := 42;
ELSE
  SET a := 100;
END IF;
-- `a` is not null here because it was set to a value of a
-- `NOT NULL` type in all branches and the branches cover
-- all of the possible cases

IF a IS NOT NULL THEN
  IF some_condition THEN
    SET a := NULL;
  ELSE
    -- `a` is not null here despite the above `SET` because
    -- CQL understands that, if we're here, the previous
    -- branch must not have been taken
  END IF;
END IF;

IF a IS NOT NULL THEN
  WHILE some_condition
  BEGIN
    -- `x` is nullable here despite `a IS NOT NULL` because
    -- `a` was set to `NULL` later in the loop and thus `x`
    -- will be `NULL` when the loop repeats
    LET x := a;
    SET a := NULL;
    ...
  END;
END IF;

Here are some additional details to note regarding conditions:

For positive improvements, the check must be exactly of the form IS NOT NULL; other checks that imply a variable or cursor field must not be null when true have no effect:
```
IF a > 42 THEN
  -- `a` is nullable here
END IF;
```
NOTE: This may change in a future version of CQL.
For multiple positive improvements to be applied from a single condition, they must be linked by AND expressions along the outer spine of the condition; uses of IS NOT NULL checks that occur as subexpressions within anything other than AND have no effect:
```
IF
  (a IS NOT NULL AND b IS NOT NULL)
  OR c IS NOT NULL
THEN
  -- `a`, `b`, and `c` are all nullable here
END IF;
```
For negative improvements, the check must be exactly of the form IS NULL; other checks that imply a variable or cursor field must not be null when false have no effect:
```
DECLARE equal_to_null INT;
IF a IS equal_to_null THEN
  ...
ELSE
  -- `a` is nullable here
END IF;
```
For multiple negative improvements to be applied from a single condition, they must be linked by OR expressions along the outer spine of the condition; uses of IS NULL checks that occur as subexpressions within anything other than OR have no effect:
```
IF
  (a IS NULL OR b IS NULL)
  AND c IS NULL
THEN
  ...
ELSE
  -- `a`, `b`, and `c` are all nullable here
END IF;
```

Forcing Nonnull Types

If possible, it is best to use the techniques described in "Nullability Improvements" to verify that the value of a nullable type is nonnull before using it as such.

Sometimes, however, you may know that a value with a nullable type cannot be null and simply wish to use it as though it were nonnull. The ifnull_crash and ifnull_throw "attesting" functions convert the type of an expression to be nonnull and ensure that the value is nonnull with a runtime check. They cannot be used in SQLite contexts because the functions are not known to SQLite, but they can be used in loose expressions. For example:

CREATE PROC square_if_odd(a INT NOT NULL, OUT result INT)
BEGIN
  IF a % 2 = 0 THEN
    SET result := NULL;
  ELSE
    SET result := a * a;
  END IF;
END;

-- `x` has type `INT`, but we know it can't be `NULL`
let x := call square_if_odd(3);

-- `y` has type `INT NOT NULL`
let y := ifnull_crash(x);

Above, the ifnull_crash attesting function is used to coerce the expression x to be of type INT NOT NULL. If our assumptions were somehow wrong, however—and x were, in fact, NULL—our program would crash.

As an alternative to crashing, you can use ifnull_throw. The following two pieces of code are equivalent:

CREATE PROC y_is_not_null(x INT)
BEGIN
  let y := ifnull_throw(x);
END;

CREATE PROC y_is_not_null(x INT)
BEGIN
  DECLARE y INT NOT NULL;
  IF x IS NOT NULL THEN
    SET y := x;
  ELSE
    THROW;
  END IF;
END;

Expression Types

CQL supports a variety of expressions, nearly everything from the SQLite world. The following are the various supported operators; they are presented in order from the weakest binding strength to the strongest. Note that the binding order is NOT the same as C, and in some cases it is radically different (e.g. boolean math)

UNION and UNION ALL

These appear only in the context of SELECT statements. The arms of a compound select may include FROM, WHERE, GROUP BY, HAVING, and WINDOW. If ORDER BY or LIMIT ... OFFSET are present, these apply to the entire UNION.

example:

select A.x x from A inner join B using(z)
union all
select C.x x from C
where x = 1;

The WHERE applies only to the second select in the union. And each SELECT is evaluated before the the UNION ALL

select A.x x from A inner join B using(z)
where x = 3
union all
select C.x x from C
where x = 1
order by x;

The ORDER BY applies to the result of the union, so any results from the 2nd branch will sort before any results from the first branch (because x is constrained in both).

Assignment

Assignment only occurs in the UPDATE statement or in the SET statement. In both cases the left side is a simple target and the right side is a general expression. The expression is evaluated before the assignment.

example:

SET x := 1 + 3 AND 4;  -- + before AND then :=

Logical OR

The logical OR operator does shortcut evaluation, much like the C || operator (not to be confused with SQL's concatenation operator with the same lexeme).

The truth table for logical OR is as follows:

A	B	A OR B
0	0	0
0	1	1
0	NULL	NULL
1	0	1
1	1	1
1	NULL	1
NULL	0	NULL
NULL	1	1
NULL	NULL	NULL

Logical AND

The logical AND operator does shortcut evaluation, much like the C && operator, so if the left side is zero the result is 0 and the right side is not evaluated.

The truth table for logical AND is as follows:

A	B	A AND B
0	0	0
0	1	0
0	NULL	0
1	0	0
1	1	1
1	NULL	NULL
NULL	0	0
NULL	1	NULL
NULL	NULL	NULL

BETWEEN and NOT BETWEEN

These are ternary operators. The general forms are:

  expr1 BETWEEN expr2 AND expr3
  expr1 NOT BETWEEN expr2 AND expr3

Note that there is an inherent ambiguity in the language because expr2 or expr3 could be logical expressions that include AND. CQL resolves this ambiguity by insisting that expr2 and expr3 be "math expressions" in the grammar. These expressions may not have ungrouped AND or OR operators.

Examples::

-- oh hell no (syntax error)
a between 1 and 2 and 3;

-- all ok
a between (1 and 2) and 3;
a between 1 and (2 and 3);
a between 1 and b between c and d; -- binds left to right
a between 1 + 2 and 12 / 2;

Logical NOT

The one operand of logical NOT must be a numeric. NOT 'x' is illegal.

Non-ordering tests `!=`, `<>`, `=`, `==`, `LIKE`, `GLOB`, `MATCH`, `REGEXP`, `IN`, `IS`, `IS NOT`

These operations do some non-ordered comparison of their two operands.

IS and IS NOT never return NULL, So for instance X IS NOT NULL gives the natural answer. x IS y is true if and only if: 1. both x and y are NULL or 2. if they are equal.
The other operators return NULL if either operand is NULL and otherwise perform their usual test to produce a boolean
!= and <> are equivalent as are = and ==
strings and blobs compare equal based on their value, not their identity (i.e. not the string/blob pointer)
objects compare equal based on their address, not their content (i.e. reference equality)
MATCH, GLOB, and REGEXP are only valid in SQL contexts, LIKE can be used in any context (a helper method to do LIKE in C is provided by SQLite, but not the others)
MATCH, GLOB, REGEXP, LIKE, and IN may be prefixed with NOT which reverses their value

 NULL IS NULL  -- this is true
(NULL == NULL) IS NULL  -- this is also true because NULL == NULL is not 1, it's NULL.
(NULL != NULL) IS NULL  -- this is also true because NULL != NULL is not 0, it's also NULL.
'xy' NOT LIKE 'z%'` -- this is true

Ordering comparisons <, >, <=, >=

These operators do the usual order comparison of their two operands.

If either operand is NULL the result is NULL
Objects and Blobs may not be compared with these operands
Strings are compared based on their value (as with other comparisons) not their address
Numerics are compared as usual with the usual promotion rules

NOTE: CQL uses strcmp for string comparison. In SQL expressions the comparison happens in whatever way SQLite has been configured. Typically general purpose string comparison should be done with helper functions that deal with collation and other considerations. This is a very complex topic and CQL is largely silent on it.

Bitwise operators <<, >>, &, |

These are the bit-manipulation operations. Their binding strength is VERY different than C so beware. And notably the & operator has the same binding strength as the | operator so they bind left to right, which is utterly unlike most systems. Many parentheses are likely to be needed to get the usual "or of ands" patterns codified correctly. Likewise, the shift operators << and >> are the same strength as & and | which is very atypical. Consider:

x & 1 << 7;    -- probably doesn't mean what you think (this is not ambiguous, it's well defined, but unlike C)
(x & 1) << 7;   -- means the same as the above
x & (1 << 7)   -- probably what you intended

Note that these operators only work on integer and long integer data. If any operand is NULL the result is `NULL.

Addition and Subtraction +, -

These operators do the typical math. Note that there are no unsigned numerics so it's always signed math that is happening here.

operands are promoted to the "biggest" type involved as previously described (bool -> int -> long -> real)
only numeric operands are legal (no adding strings)
if any operand is NULL the result is NULL

Multiplication, Division, Modulus *, /, %

These operators do the typical math. Note that there are no unsigned numerics so it's always signed math that is happening here.

operands are promoted to the "biggest" type as previously described (bool -> int -> long -> real)
only numeric operands are legal (no multiplying strings)
if any operand is NULL the result is NULL

EXCEPTION: the % operator doesn't make sense on real values, so real values produce an error.

Unary operators -, ~

Unary negation (-) and bitwise invert (~) are the strongest binding operators.

The ~ operator only works on integer types (not text, not real)
the usual promotion rules otherwise apply
if the operand is NULL the result is NULL

CASE Expressions

The case expression has two major forms and provides a great deal of flexibility in an expression. You can kind of think of it as the C ?: operator on steroids.

set x := 'y';
select case x
  when 'y' then 1
  when 'z' then 2
  else 3
end;

In this form the case expression (x here) is evaluated exactly once and then compared against each when clause. Every when clause must be type compatible with the case expression. The then expression that corresponds to the matching when is evaluated and becomes the result. If no when matches then the else expression is used. If there is no else and no matching when then the result is null.

If that's not general enough, there is an alternate form:

set y := 'yy';
set z := 'z';
select case
  when y = 'y' then 1
  when z = 'z' then 2
  else 3
end;

The second form, where there is no value before the first when keyword, each when expression is a separate independent boolean expression. The first one that evaluates to true causes the corresponding then to be evaluated and that becomes the result. As before, if there is no matching when clause then the result is the else expression if present, or null if there is no else.

The result types must be compatible and the best type to hold the answer is selected with the usual promotion rules.

SELECT expressions

Single values can be extracted from SQLite using an inline select expression. For instance:

set x_ := (select x from somewhere where id = 1);

The select statement in question must extract exactly one column and the type of the expression becomes the type of the column. This form can appear anywhere an expression can appear, though it is most commonly used in assignments. Something like this would also be valid:

if (select x from somewhere where id = 1) == 3 then
  ...
end if;

The select statement can of course be arbitrarily complex.

Note, if the select statement returns no rows this will result in the normal error flow. In that case, the error code will be SQLITE_DONE, which is treated like an error because in this context SQLITE_ROW is expected as a result of the select. This is not a typical error code and can be quite surprising to callers. If you're seeing this failure mode it usually means the code had no affordance for the case where there were no rows and probably that situation should have been handled. This is an easy mistake to make, so to avoid it, CQL also supports these more tolerant forms:

set x_ := (select x from somewhere where id = 1 if nothing -1);

And even more generally if the schema allows for null values and those are not desired:

set x_ := (select x from somewhere where id = 1 if nothing or null -1);

Both of these are much safer to use as only genuine errors (e.g. the table was dropped and no longer exists) will result in the error control flow.

Again note that:

set x_ := (select ifnull(x,-1) from somewhere where id = 1);

Would not avoid the SQLITE_DONE error code, because no rows returned is not at all the same as a null value returned.

The if nothing or null form above is equivalent to the following, but it is more economical, and probably clearer:

set x_ := (select ifnull(x,-1) from somewhere where id = 1 if nothing -1);

To compute the type of the overall expression, the rules are almost the same as normal binary operators. In particular:

if the default expression is present it must be type compatible with the select result
- the result type is the smallest type that holds both the select value and the default expression (see normal promotion rules above)
object types are not allowed (SQLite cannot return an object)
in (select ...) the result type is not null if and only if the select result type is not null (see select statement, many cases)
in (select ... if nothing) the result type is not null if and only if both the select result and the default expression types are not null (normal binary rules)
in (select ... if nothing or null) the result type is not null if and only if the default expression type is not null

Finally, the form (select ... if nothing throw) is allowed; this form is exactly the same as normal (select ...) but makes the explicit that the error control flow will happen if there is no row. Consequently this form is allowed even if @enforce_strict select if nothing is in force.

Marking Data as Sensitive

CQL supports the notion of 'sensitive' data in a first class way. You can think of it as very much like nullability; It largely begins by tagging data columns with @sensitive

Rather than go through the whole calculus, it's easier to understand by a series of examples. So let's start with a table with some sensitive data.

create table with_sensitive(
 id integer,
 name text @sensitive,
 sens integer @sensitive
);

The most obvious thing you might do at this point is create a stored proc that would read data out of that table. Maybe something like this:

create proc get_sensitive()
begin
  select id as not_sensitive_1,
        sens + 1 sensitive_1,
        name as sensitive_2,
        'x' as not_sensitive_2,
        -sens as sensitive_3,
        sens between 1 and 3 as sensitive_4
  from with_sensitive;
end;

So looking at that procedure we can see that it's reading sensitive data, so the result will have some sensitive columns in it.

the "id" is not sensitive (at least not in this example)
sens + 1 is sensitive, math on a sensitive field leaves it sensitive
name is sensitive, it began that way and is unchanged
'x' is just a string literal, it's not sensitive
-sens is sensitive, that's more math
and the between expression is also sensitive

Generally sensitivity is "radioactive" - anything it touches becomes sensitive. This is very important because even a simple looking expression like sens IS NOT NULL must lead to a sensitive result or the whole process would be largely useless. It has to be basically impossible to wash away sensitivity.

These rules apply to normal expressions as well as expressions in the context of SQL. Accordingly:

Sensitive variables can be declared:

declare sens integer @sensitive;

Simple operations on the variables are sensitive

-- this is sensitive (and the same would be true for any other math)
sens + 1;

The IN expression gives you sensitive results if anything about it is sensitive

-- all of these are sensitive
sens in (1, 2);
1 in (1, sens);
(select id in (select sens from with_sensitive));

Similarly sensitive constructs in CASE expressions result in a sensitive output

-- not sensitive
case 0 when 1 then 2 else 3 end;

-- all of these are sensitive
case sens when 1 then 2 else 3 end;
case 0 when sens then 2 else 3 end;
case 0 when 1 then sens else 3 end;
case 0 when 1 then 2 else sens end;

Cast operations preserve sensitivity

-- sensitive result
select cast(sens as INT);

Aggregate functions likewise preserve sensitivity

-- all of these are sensitive
select AVG(T1.sens) from with_sensitive T1;
select MIN(T1.sens) from with_sensitive T1;
select MAX(T1.sens) from with_sensitive T1;
select SUM(T1.sens) from with_sensitive T1;
select COUNT(T1.sens) from with_sensitive T1;

There are many operators that get similar treatment such as COALESCE, IFNULL, IS and IS NOT.

Things get more interesting when we come to the EXISTS operator:

-- sensitive if and only if any selected column is sensitive
exists(select * from with_sensitive)

-- sensitive because "info" is sensitive
exists(select info from with_sensitive)

-- not sensitive because "id" is not sensitive
exists(select id from with_sensitive)

If this is making you nervous, it probably should, we need a little more protection because of the way EXISTS is typically used. The predicates matter, consider the following:

-- id is now sensitive because the predicate of the where clause was sensitive
select id from with_sensitive where sens = 1;

-- this expression is now sensitive because id is sensitive in this context
exists(select id from with_sensitive where sens = 1)

In general: if the predicate of a WHERE or HAVING clause is sensitive then all columns in the result become sensitive.

Similarly when performing joins, if the column specified in the USING clause is sensitive or the predicate of the ON clause is sensitive then the result of the join is considered to be all sensitive columns (even if the columns were not sensitive in the schema).

Likewise a sensitive expression in LIMIT or OFFSET will result in 100% sensitive columns as these can be used in a WHERE-ish way. There is no reasonable defense against using LIMIT and testing for the presence or absence of a row as a way to wash away sensitivity so that is a weakness, but the rules that are present are likely to be very helpful.

-- join with ON
select T1.id from with_sensitive T1 inner join with_sensitive T2 on T1.sens = T2.sens

-- join with USING
select T1.id from with_sensitive T1 inner join with_sensitive T2 using(sens);

All of these expression and join propagations are designed to make it impossible to simply wash-away sensitivity with a little bit of math.

Now we come to enforcement, which boils down to what assignments or "assignment-like" operations we allow.

If we have these:

declare sens integer @sensitive;
declare not_sens integer;

We can use those as stand-ins for lots of expressions, but the essential calculus goes like this:

-- assigning a sensitive to a sensitive is ok
set sens := sens + 1;

-- assigning not sensitive data to a sensitive is ok
-- this is needed so you can (e.g.) initialize to zero
set sens := not_sens;

-- not ok
set not_sens := sens;

Now these "assignments" can happen in a variety of ways:

you can set an out parameter of your procedure
when calling a function or procedure, we require:
- any IN parameters of the target be "assignable" from the value of the argument expression
- any OUT parameters of the target be "assignable" from the procedures type to the argument variable
- any IN/OUT parameters require both the above

Now it's possible to write a procedure that accepts sensitive things and returns non-sensitive things. This is fundamentally necessary because the proc must be able return (e.g.) a success code, or encrypted data, that is not sensitive. However, if you write the procedure in CQL it, too, will have to follow the assignment rules and so cheating will be quite hard. The idea here is to make it easy to handle sensitive data well and make typical mistakes trigger errors.

With these rules it's possible to compute the the type of procedure result sets and also to enforce IN/OUT parameters. Since the signature of procedures is conveniently generated with --generate_exports good practices are fairly easy to follow and sensitivity checks flow well into your programs.

This is a brief summary of CQL semantics for reference types -- those types that are ref counted by the runtime.

The three reference types are:

TEXT
OBJECT
BLOB

Each of these has their own macro for retain and release though all three actually turn into the exact same code in all the current CQL runtime implementations. In all cases the object is expected to be promptly freed when the reference count falls to zero.

Reference Semantics

Stored Procedure Arguments

in and inout arguments are not retained on entry to a stored proc
out arguments are assumed to contain garbage and are nulled without retaining on entry
if your out argument doesn't have garbage in it, then it is up to you do release it before you make a call
When calling a proc with an out argument CQL will release the argument variable before the call site, obeying its own contract

Local Variables

assigning to a local variable retains the object, and then does a release on the previous object
this order is important; all assignments are done in this way in case of aliasing (release first might accidentally free too soon)
CQL calls release on all local variable when the method exits

Assigning to an `out` parameter or a global variable

out, inout parameters, and global variables work just like local variables except that CQL does not call release at the end of the procedure

Function Return Values

Stored procedures do not return values, they only have out arguments and those are well defined as above. Functions however are also supported and they can have either get or create semantics

Get Semantics

If you declare a function like so:

declare function Getter() object;

Then CQL assumes that the returned object should follow the normal rules above, retain/release will balance by the end of the procedure for locals and globals or out arguments could retain the object

Create Semantics

If you declare a function like so:

declare function Getter() create text;

then CQL assumes that the function created a new result which it is now responsible for releasing. In short, the returned object is assumed to arrive with a retain count of 1 already on it. When CQL stores this return value it will:

release the object that was present at the storage location (if any)
copy the returned pointer without further retaining it this one time

As a result if you store the returned value in a local variable it will be released when the procedure exits (as usual) or if you instead store the result in a global or an out parameter the result will survive to be used later.

Comparison

CQL tries to adhere to normal SQL comparison rules but with a C twist.

`OBJECT`

The object type has no value based comparison, so there is no <, > and so forth.

The following table is useful. Let's suppose there are exactly two distinct objects 'X' and 'Y'

true expressions: X = X X <> Y Y = Y Y <> X X IN (X, Y) X NOT IN (Y)

false expressions: X = Y X <> X Y = X Y <> Y X NOT IN (X, Y)

null expressions: null = null X <> null x = null null <> null Y <> null y = null

null = null resulting in null is particular surprising but consistent with the usual SQL rules. And again, as in SQL, the IS operator returns true for X IS Y even if both are null.

`TEXT`

Text has value comparison semantics but normal string comparison is done only with strcmp which is of limited value. Typically you'll want to either delegate the comparison to Sqlite (with (select x < y)) or else use a helper function with a suitable comparison mechanism.

For text comparisons including equality:

true: if and only if both operands are not null and the comparison matches (using strcmp) false: if and only if both operands are not null and the comparison does not match (using strcmp) null: if and only if at least one operand is null

EXAMPLE: 'x' < 'y' is true because strcmp("x", "y") < 0

The IS and IS NOT operators behave similarly to equality and inequality, but never return null. If X is some value that doesn't happen to be null then we have the following:

true: null is null X is X X is not null null is not X false: null is not null X is not X X is null null is X

The IN and NOT IN operators also work for text using the same value comparisons as above. Additionally there are special text comparison operators such as LIKE, MATCH and GLOB. These comparisons are defined by SQLite.

`BLOB`

Blobs are compared by value (equivalent to memcmp) but have no well-defined ordering. The memcmp order is deemed not helpful as blobs usually have internal structure hence the valid comparisons are only equality and inequality. You can use user defined functions to do better comparisons of your particular blobs if needed.

The net comparison behavior is otherwise just like strings.

Sample Code

Out Argument Semantics

DECLARE FUNCTION foo() OBJECT;

CREATE PROC foo_user (OUT baz OBJECT)
BEGIN
  SET baz := foo();
END;

void foo_user(cql_object_ref _Nullable *_Nonnull baz) {
  *(void **)baz = NULL; // set out arg to non-garbage
  cql_set_object_ref(baz, foo());
}

Function with Create Semantics

DECLARE FUNCTION foo() CREATE OBJECT;

CREATE PROCEDURE foo_user (INOUT baz OBJECT)
BEGIN
  DECLARE x OBJECT;
  SET x := foo();
  SET baz := foo();
END;

void foo_user(cql_object_ref _Nullable *_Nonnull baz) {
  cql_object_ref x = NULL;

  cql_object_release(x);
  x = foo();
  cql_object_release(*baz);
  *baz = foo();

cql_cleanup:
  cql_object_release(x);
}

Function with Get Semantics

DECLARE FUNCTION foo() OBJECT;

CREATE PROCEDURE foo_user (INOUT baz OBJECT)
BEGIN
  DECLARE x OBJECT;
  SET x := foo();
  SET baz := foo();
END;

void foo_user(cql_object_ref _Nullable *_Nonnull baz) {
  cql_object_ref x = NULL;

  cql_set_object_ref(&x, foo());
  cql_set_object_ref(baz, foo());

cql_cleanup:
  cql_object_release(x);
}

Expression Examples​

Operator Precedence​

Order of Evaluation​

Variables, Columns, Basic Types and Nullability​

LET Statement​

The @RC special variable​

Types of Literals​

String Literals​

Blob Literals​

Numeric Literals​

The NULL literal​

Other Considerations​

Const and Enumerations​

Named Types​

Type Kinds​

Nullability​

Nullability Rules​

Nullability Improvements​

Forcing Nonnull Types​

Expression Types​

UNION and UNION ALL​

Assignment​

Logical OR​

Logical AND​

BETWEEN and NOT BETWEEN​

Logical NOT​

Non-ordering tests !=, <>, =, ==, LIKE, GLOB, MATCH, REGEXP, IN, IS, IS NOT​

Ordering comparisons <, >, <=, >=​

Bitwise operators <<, >>, &, |​

Addition and Subtraction +, -​

Multiplication, Division, Modulus *, /, %​

Unary operators -, ~​

CASE Expressions​

SELECT expressions​

Marking Data as Sensitive​

Reference Semantics​

Stored Procedure Arguments​

Local Variables​

Assigning to an out parameter or a global variable​

Function Return Values​

Get Semantics​

Create Semantics​

Comparison​

OBJECT​

TEXT​

BLOB​

Sample Code​

Out Argument Semantics​

Function with Create Semantics​

Function with Get Semantics​