I posted recently about making kdb+ go faster. I forgot to mention shunning TCP and using shared memory as the transport layer.
This has been around for ages and is the technology which underpins low-latency approaches like Pete Lawrey’s Chronicle Queue. You have choices about whether to make the file durable or in-memory only.
We could simply use a shared library to write from kdb to such a thing, and enable simple interoperability with clients written in other languages. However, using a foreign format means that tools like -11! no longer work, and requires us to implement replacements (and possibly deserialisers if the data-format is different). I think there’s a way to square that circle.
A key feature of the transport is that we need a notification mechanism to let readers know that more data has been appended and is safe to read. Simply using ftruncate on the log file isn’t enough: there’s a period between the change in the file-size being observable and the subsequent write completing in which a reader could read nonsense data. I wouldn’t want to issue that many ftruncate system calls anyway, to be honest. I’m sure this is a reason why it’s important for readers of a standard TP’s log file not to read past .u.i.
I think the way we can do this is to write a standard tickerplant header, and then a carefully crafted, kdb-compatible message that at a very specific offset contains a 64-bit integer value describing the offset of the most recent message in the log. We keep the first page of the log file mapped into memory and once new messages have been appended, we write the offset of the latest over the top of the existing value. We can even use io_uring to write to different offsets within the same file and batch-up both data and update, avoiding mmap, which has its detractors.
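To make the write path concrete, here is a minimal sketch in Python (standing in for a client written in another language). It assumes POSIX (`os.pwrite`), that each payload is an already-serialised kdb+ message, and that the value lives at byte 24, as in the header layout described in this post. The names `LogWriter`, `append` and the initial value of 0 ("nothing published yet") are hypothetical. Python gives no memory-ordering guarantees, so a real writer in C would want a release store (or the io_uring route) for the final step.

```python
import mmap
import os
import struct

# Header bytes as laid out in this post: log-file header, mixed list of
# length 2, symbol atom `.mg.hdr`, long-atom type byte. 24 bytes in total.
HEADER_PREFIX = bytes.fromhex(
    "ff01000000000000" "000002000000" "f5" "2e6d672e68647200" "f9"
)
OFFSET_POS = len(HEADER_PREFIX)   # 24: where the 64-bit value lives
HEADER_LEN = OFFSET_POS + 8       # 32


class LogWriter:
    """Hypothetical writer: append data first, publish its offset second."""

    def __init__(self, path):
        self.fd = os.open(path, os.O_RDWR | os.O_CREAT | os.O_TRUNC, 0o644)
        # Assumption: 0 means "no message published yet".
        os.write(self.fd, HEADER_PREFIX + struct.pack("<q", 0))
        # Shared, writable mapping covering the header only.
        self.page = mmap.mmap(self.fd, HEADER_LEN)
        self.end = HEADER_LEN

    def append(self, payload: bytes) -> int:
        start = self.end
        os.pwrite(self.fd, payload, start)   # 1. the message itself
        self.end += len(payload)
        # 2. only now overwrite the published offset, in place
        self.page[OFFSET_POS:HEADER_LEN] = struct.pack("<q", start)
        return start

    def close(self):
        self.page.close()
        os.close(self.fd)
```

The ordering is the whole point: the message bytes are written before the offset changes, so a reader that trusts the offset should not see a half-written message.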
Readers can use an approach like busy-polling to check for an updated value and then safely read any new messages.
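A reader's side of this might look like the sketch below, again in Python as a stand-in for a non-kdb client. The function names and the `max_spins` bound are hypothetical, and the position of the value (byte 24) is taken from the header layout in this post. It only covers the polling step; decoding the messages up to the published offset is left to whatever kdb+ deserialiser the client has.

```python
import mmap
import os
import struct

OFFSET_POS = 24               # position of the 64-bit value in the header
OFFSET_END = OFFSET_POS + 8


def open_offset_page(path):
    """Map the start of the log read-only; the mapping outlives the fd."""
    fd = os.open(path, os.O_RDONLY)
    try:
        return mmap.mmap(fd, OFFSET_END, access=mmap.ACCESS_READ)
    finally:
        os.close(fd)


def read_offset(page):
    """Read the currently published little-endian 64-bit offset."""
    return struct.unpack("<q", page[OFFSET_POS:OFFSET_END])[0]


def wait_for_update(page, last_seen, max_spins=1_000_000):
    """Busy-poll until the published offset differs from last_seen.

    Returns the new offset, or None if max_spins polls saw no change.
    """
    for _ in range(max_spins):
        current = read_offset(page)
        if current != last_seen:
            return current
    return None
```

A caller would remember the last offset it processed, call `wait_for_update` with it, and then read everything between the two offsets from the log file.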
The Magic Header
There are any number of ways you could compose the header. Essentially, you want to ensure that the “last message offset” sits at an 8-byte-aligned offset from the start of the file (and thus the first memory page). This permits aligned writes and reads of the datum.
Let’s have a look at what the kdb+ function would look like:
.mg.hdr:{[O]} // "O" for offset
Write a test log-file:
q)jfd:hopen .[`:log;();:;()]
q)jfd enlist (`.mg.hdr;0x0 sv 0xbebafecaefbeadde)
7i
The resulting 32 bytes break down as follows (byte offsets in decimal):
ff01000000000000 // bytes 0-7: log-file header
000002000000 // bytes 8-13: mixed-list length 2
f5 // byte 14: symbol atom
2e6d672e68647200 // bytes 15-22: .mg.hdr (null-terminated)
f9 // byte 23: long atom
deadbeefcafebabe // bytes 24-31: the 64-bit offset value
We see that the value we’re going to overwrite again and again appears at a 24-byte offset from the start of the file.
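As a cross-check on that layout, the same bytes can be assembled outside kdb+. The sketch below is Python, and the function name `magic_header` is hypothetical. It rebuilds the header from the pieces listed above and asserts that the value lands on an 8-byte boundary.

```python
import struct


def magic_header(offset_value: int) -> bytes:
    """Assemble the 32-byte header described above."""
    prefix = b"".join([
        bytes.fromhex("ff01000000000000"),  # log-file header
        bytes.fromhex("000002000000"),      # mixed list, length 2
        b"\xf5",                            # symbol atom
        b".mg.hdr\x00",                     # null-terminated symbol text
        b"\xf9",                            # long atom
    ])
    # The 64-bit value must start on an 8-byte boundary (here: byte 24).
    assert len(prefix) % 8 == 0
    # kdb+ stores the long little-endian; "<Q" packs the raw bit pattern.
    return prefix + struct.pack("<Q", offset_value)


header = magic_header(0xBEBAFECAEFBEADDE)
print(len(header))           # total header length in bytes
print(header[24:].hex())     # the value as it appears on disk
```

If more fields are added to the header, the same assertion is a cheap way to confirm the value is still aligned.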
We could write all sorts of other interesting things into the header, like hostname, time-opened, etc., so long as the offset value is at an 8-byte boundary.
In steady-state running, the tickerplant appends standard kdb+ upd messages to the log file.
The compatibility with kdb+ only really extends to the use of -11! over data at rest. A standard kdb+ instance would not be able to “subscribe” to this kind of shared memory transport without custom code to read messages on the fly. However, for replay, all that a client needs is a dummy implementation of a monadic function named .mg.hdr.