Introducing General Purpose Error Tracing

CG/SQL Team

CG/SQL Team

Maintainer of CG/SQL

Today we made a couple of minor changes in the code generation to take care of some lingering issues.

The first is that when you did a throw inside a catch to basically rethrow the error, you would lose the error code if something had succeeded within the catch handler.

The old codegen looked something like this:

catch_start_1: {
printf("error\n");
cql_best_error(&_rc_)
goto cql_cleanup;
}

The problem being that while the printf above is fine and well, if you did any SQL operation then _rc_ would be clobbered and you'd end up throwing an unrelated error code. cql_best_error would at least make sure it was a failure code (SQLITE_ERROR) but the original error code was lost.

The new code looks like this:

catch_start_1: {
_rc_thrown_ = _rc_;
printf("error\n");
_rc_ = cql_best_error(_rc_thrown_);
goto cql_cleanup;
}

So now if there are db operations, the original return code is still preserved. Note: you still lose sqlite3_errmsg() because SQLite doesn't know that cleanup logic is running.

This brings us to the second new thing: general purpose error traces.

Error checking of result codes happens very consistently in CQL output. The usual pattern looks something like this:

_rc_ = cql_exec(_db_,
"SAVEPOINT base_proc_savepoint");
if (_rc_ != SQLITE_OK) goto cql_cleanup;

or if it's inside a try block a little different... very little actually

// try
{
_rc_ = cql_exec(_db_,
"RELEASE SAVEPOINT base_proc_savepoint");
if (_rc_ != SQLITE_OK) goto catch_start_8;
// ... the rest of the try block
}

Basically if the local _rc_ doersn't match the necessary condition we goto the appropriate error label... either the relevant catch block or else the procedure's cleanup code.

We generalize this a bit now so that it looks like this:

if (_rc_ != SQLITE_OK) { cql_error_trace(); goto cql_cleanup; }
-- or, in a catch...
if (_rc_ != SQLITE_OK) { cql_error_trace(); goto catch_start_8; }

Now the default implementation of cql_error_trace() is in cqlrt.h which you can and should customize. I'll be writing more about that later but suffice to say you're supposed to replace cqlrt.h and cqlrt.c with suitable runtime helpers for your environment while keeping cqlrt_common.h and cqlrt_common.c fixed.

So for instance, your cqlrt.h could look like this:

#ifndef CQL_TRACING_ENABLED
#define cql_error_trace()
#else
// whatever tracing you want, for example this might help in test code.
#define cql_error_trace() \
fprintf(stderr, "Error at %s:%d in %s: %d %s\n", __FILE__, __LINE__, _PROC_, _rc_, sqlite3_errmsg(_db_))
#endif

So then when you need to debug problems involving lots of error recovery you can watch the entire chain of events easily.

Note that there are some useful variables there:

In any procedure _db_ is the current database and _rc_ is the most recent return code from SQLite. __FILE__ and __LINE__ of course come from the preprocessor. and _PROC_ (one underscore) is now generated by the compiler. Every procedure's body now begins with:

#undef _PROC_
#define _PROC_ "the_current_procedure"

So by defining your own cql_error_trace macro you can cause whatever logging you need to happen. Note this can be very expensive indeed because this happens a lot and even the string literals needed are a significant cost. So generally this should be off for production builds and enabled as needed for debug builds.

The default implementation is just an empty block

#define cql_error_trace()

But the hook is enough to light up whatever logging you might need, and you can use sqlite3_errmsg() before that message is gone.

Good hunting.

More Flexible Cursor Patterns Using "Boxing"

CG/SQL Team

CG/SQL Team

Maintainer of CG/SQL

I was reviewing the update posting that just went out and I realized I'd forgotten to mention another big ticket item. So consider this an appendix to the update.

In some cases we started seeing a need to "ship cursors around" a little bit more flexibly. Note shipping values around is already doable so this new work is largely about being able to create a "statement cursor" in one procedure and consume it safely elsewhere. The general pattern looks like this:

Declare a statement cursor as usual, maybe something like this:

declare C cursor for select * from shape_source;
-- or
declare C cursor for call proc_that_returns_a_shape();

Make an object that can hold a cursor:

declare obj object<T cursor>;

Where T is the name of a shape. It can be a table name, or a view name, or it can be the name of the canonical procedure that returns the result. You really want this to be some kind of global name though. Something you can get with a #include in various places. In this case choices for T might be shape_source the table or proc_that_returns_a_shape the procedure.

Remember you can always make a fake procedure that returns a result to sort of typedef a shape name. e.g.

declare proc my_shape() (id integer not null, name text);

The procedure here my_shape doesn’t have to actually ever be created, in fact it’s probably better if it doesn’t. You won’t call it, you’re just using its hypothetical result as a shape. This could be useful if you have several procedures like proc_that_returns_a_shape that all return my_shape.

At this point you could use the cursor maybe something like:

loop fetch C
begin
-- do stuff with C
end;

Those are the usual patterns and they let you consume statement cursors sort of “up” from where it was created, but what if you want some worker procedures that consume a cursor there is no good way to pass your cursor down again. Well, there wasn't. Now there is. Let's go back to that box object creation and use it

-- recap: declare the box that holds the cursor (T changed to my_shape for this example)
declare obj object<my_shape cursor>;
-- box the cursor into the object (the cursor shape must match the box shape)
set obj from cursor C;

The variable obj can now be passed around as usual. Then, later, you can "unbox" it to get a cursor back. Like so

-- unboxing a cursor from an object
declare D cursor for obj;

These primitives will allow cursors to be passed around with managed lifetime. Example:

-- consumes a cursor
create proc cursor_user(box object<my_shape cursor>)
begin
declare C cursor for box; -- the cursors shape will be my_shape matching box
loop fetch C
begin
-- do something with C
end;
end;
-- captures a cursor and passes it on
create proc cursor_boxer()
begin
declare C cursor for select * from something_like_my_shape;
declare box object<my_shape cursor>
set box from cursor C; -- produces error if shape doesn't match
call cursor_user(box);
end;

Importantly, once you box a cursor the underlying SQLite statement’s lifetime is managed by the box object with normal retain/release semantics so timely release becomes imperative.

With this pattern it's possible to, for instance, consume some of the rows in one procedure and the rest in another procedure.

Now, the main reason for doing this is if you have some standard helper methods that can get a cursor from a variety of places and process it. But remember, that boxing isn’t the usual pattern at all and returning cursors in a box, while possible, should be avoided in favor of the simpler pattern of doing your select or call at the end to compute the result as we do now, if only because then then lifetime is very simple in all those cases. Durably storing a boxed cursor could lead to all manner of problems -- it's just like holding on to a sqlite3_stmt * for a long time. Actually "just like" is an understatement, it's exactly the same as holding on to a statement for a long time with all the same problems because that is exactly what's going on here.

So, good generalization, but possibly less Pit of Success, especially with complex box patterns. So watch the sharp edges.

One Month Update

CG/SQL Team

CG/SQL Team

Maintainer of CG/SQL

It's hard to believe it's been a month since the welcome message went up. We were happy to see interest right away and even a bunch of forks but most of all pull requests. A sweeping change to modernize the cql.y grammar was much appreciated. That $1 stuff was very old school (I'm showing my age now).

Here's a quick summary of what's been going on:

  • @mingodad gave us an implementation of check and collate column attributes (the check attribute on tables should be easy to add from here)
  • the select function form should never return objects, only SQLite types, enforced
  • @attribute(cql:suppress_result_set) was added to save code gen for procedures that don't need the C result set wrappers
  • cql_cursor_diff_col and cql_cursor_diff_val methods were added to report what's different about two cursors (highly useful in test code)
  • cql_cursor_format was added so you can quickly convert any cursor into columns and values as string for debug output (no matter the shape)
  • sqlite3_changes was added to the builtin list so you don't have to use declare select function to use it anymore
  • cql_get_blob_size was added so you can see how big your blobs are (useful for diagnostics)
  • trim, rtrim and ltrim were added to the builtin list so you can use them without declare select function
  • the builtin function ifnull_crash was added so that nullables that have already checked can be safely typecast to not null
  • the bug we saw in demo video number 2 where some foreign keys were not properly linked up in autotest code was fixed (yay videos)
  • time functions are now known to be not null for a bunch of simple cases such as 'now' arguments
  • you can use the cast(.. as ..) operator on numeric types outside of the SQL context
  • @mingodad replaced all the positional references by named references in cql.y (yes! thank you!)
  • several minor bug fixes
  • the railroad diagrams were updated

NOTE: I often refer to "sugar" in the below. This is short for syntatic sugar which, in case you're not familiar with the term, refers to a syntatically more pleasing way of writing a concept that is otherwise totally doable with normal syntax. Many languages have sugar for forms that are common -- for brevity, clarity, and/or correctness.

And now a few notes on The Big Stuff

We often add new features to the language to facilitate the writing of tests. The tests have a lot of boilerplate often setting up and calling the same procedures again and again with slightly different arguments. Long argument lists and long insert column lists are especially problematic as these can be very error prone. Here good language constructs are very helpful. We've found good test constructs are often invaluable in production code as well, though in our experience the tests often have a lot more repitition that needs refactoring than production code. To that end we added some very useful things in the last month:

Declare cursors in the shape of a procedure's arguments and use them

The most common way to create a cursor is from a select statement but you can also make a cursor that can hold values for you by declaring it to be LIKE something else with a shape. A classic example is:

declare C cursor like some_table;

Now C has the same columns and types as some_table

Many procedures have a result type that is also a shape, for instance any procedure that ends with a select statement has a result shape defined by the columns of the select statement. You could always do this sort of thing:

declare C cursor like some_proc;

Meaning make C a cursor whose shape is whatever some_procreturns, which is of course exactly the kind of cursor you need to capture the result of some_proc.

Now we add:

declare C cursor like some_proc arguments;

The idea being that the arguments of some_proc are also a shape (unless it has none). With this done you want to use that cursor to call the procedure -- that being sort of the whole point. So we add this:

call some_proc(from C);

How do we use this effectively? Hold on just a second -- for that answer we need one more big tool to really help the syntax.

Loading cursors and inserting columns

Loading up a cursor is done with syntax that is very much like an insert statement. An example might be something like this:

fetch C(x,y,z) from values(1,2,3);

This is simple enough but it becomes more problematic if there are many values and especially if the values have complex names. To make this a little less error prone CQL now has this sugar form for fetch, insert, and soon update cursor (like maybe before you see this blog). The more readable form is:

fetch C using
1 x,
2 y,
3 z;

This form has the values next to their names just like in a select statement, like all sugars, it is automatically rewritten to the normal form.

Likewise

insert into some_table using
1 id,
'fred' first_name,
'flintstone' last_name,
'bedrock' home_town,
'dino' favorite_pet,
'wilma' life_partner;

becomes

insert into some_table(id, first_name, last_name, home_town, favorite_pet, life_partner)
values(1, 'fred', 'flintstone', 'bedrock', 'dino', 'wilma');

except the sugar form is much less error prone. This form doesn't generalize to many values but the single row case is super common.

Since this form is automatically rewritten SQLite will never see the sugar syntax, it will get the normal syntax.

NOTE: the insert rewrite is coming later today, and will likely be live by the time you read this.

Putting these together

Let's suppose you have to write a test. You have a procedure test_subject that takes some arguments plus you have another helper procedure test_setup that puts seed data in the right places for your subject. But there are many variations and a lot of what you do between variations is the same. How can you write this economically making it clear what is different between variations without a lot of fuss. Well you can do something like this:

-- use defaults for all the named values
-- use 'seed' for everything else that isn't named
create proc default_setup_args(seed integer not null)
begin
declare args cursor like test_setup arguments;
fetch args using
1334 primary_id,
98012 secondary_id,
'foo' useful_name,
'bar' other_useful_name,
1 fast_mode
@dummy_seed(seed);
out args;
end;

With the above you can easily see which values go to which arguments

Your test setup can now look something like this:

declare setup_args cursor like test_setup arguments;
fetch setup_args from call default_setup_args(1999);
update cursor setup_args using
0 fast_mode; -- override fast mode for this test
call test_setup(from setup_args);

To call the test subject you probably need some of those setup arguments and maybe some more things.

create proc default_subject_args(like default_setup_args, other_thing bool not null)
begin
declare args cursor like test_subject arguments;
fetch args using
primary_id primary_id, -- this came from the default_setup_args result
secondary_id secondary_id, -- so did this
useful_name name, -- the field names don't have to match
fast_mode fast_mode,
other_thing other_thing;
out args;
end;

Then the test code

declare test_args cursor like test_subject arguments;
fetch test_args from call default_subject_args(0);
call test_subject(from test_args);

Importantly, the cursor set operations are all by name so the order doesn't matter. Which means even if there are many arguments you don't have to worry that you got them in the wrong order or that they are the wrong type. Effectively you have a simple call by name strategy and you can easily read off the arguments. You could do something similarly brief with helper functions to provide the default arguments but then you can't readily re-use those arguments in later calls or for verification so this way seems a lot more useful in a test context.

When it comes time to validate, probably your test subject is returning a cursor from a select that you want to check. A slightly different call will do the job there.

Cursor Differencing

With the setup above you can verify results very easily. Let's change it a little bit:

-- same as before, with a cursor
declare results cursor for call test_subject(from test_args);
-- get the first row
fetch results;
declare expected cursor like results;
fetch expected using
setup_args.primary_id primary_id,
setup_args.useful_name name,
test_args.other_thing other_thing
@dummy_seed(1999); -- dummy values for all other columns
-- make a macro like EXPECT_CURSOR_EQ(x,y) for this
-- if the cursors are different the result is a string with the first
-- different column name and the left and right values ready to print
call ExpectNull(cql_cursor_diff_val(expected, result));

ExpectEqual could be

create proc ExpectNull(t text)
begin
if t is not null then
call printf('%s\n', t); -- or whatever
throw;
end if;
end;

All that testing support comes from:

  • cursors in the shape of arguments
  • cleaner fetch/insert syntax
  • cursors passed as arguments
  • cursor differences

It kills a lot of boilerplate resulting in tests that are much clearer.

And that's what's been going on for the last month in CG/SQL land.

If you got this far thanks for reading. If you didn't get this far, you aren't reading this anyway so thanking you is moot =P

Stay safe.

Rico for CG/SQL

P.S. most of these fragments don't actually compile because of missing schema and maybe the odd typo. If there is interest I'll make a demo that works soup to nuts.

Welcome

CG/SQL Team

CG/SQL Team

Maintainer of CG/SQL

Hello everyone!

Thank you for visiting the CG/SQL's blog page. If you would like to read the very first blog announcing the project, please go over to the Facebook's Engineering post published in early Octover 2020.

Looking forward to working with all of you!

Sincerely, CG/SQL Team