Ayende @ Rahien

My name is Oren Eini
Founder of Hibernating Rhinos LTD and RavenDB.

How timers works in the CLR

time to read 3 min | 490 words

One of the coolest things about the CoreCLR being open sourced is that I can trawl through the source code and read random parts of the framework. One of the reasons to do this is to understand the implementation concerns, not just the design, which then allows us to produce much better output.

In this case, I was investigating a hunch, and I found myself deep inside the code that runs the timers in .NET. The relevant code is here, and it is clearly commented as well as quite nice to read.

I’m just going to summarize a few interesting things I found in the code.

There is actually only one single real timer for the entire .NET process. I started out thinking this is handled via CreateTimerQueueTimer on Windows, but I couldn’t find a Linux implementation. Reading the code, the CLR actually implements this directly via this code. Simplified, it does the following:


This has some interesting implications. It means that timers are all going to be fired from the same thread, at the same time (not quite true, see below), and that there is likely going to be a problem with very long timers (a timer for three months from now will overflow int32, for example).

The list of timers is held in a linked list, and every time it is awakened, it runs through the list, finding the timer to trigger, and the next time to be triggered. The code in this code path is called with only a single timer, which is then used in the managed code for actually implementing the managed timers. It is important to note that actually running the timer callback is done by queuing that on the thread pool, not executing it on the timer thread.
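The list scan described above can be sketched roughly as follows. This is a minimal illustration and not the CLR's code: `timer_node`, `fire_due_timers` and `NO_TIMER_PENDING` are hypothetical names, and "queuing on the thread pool" is reduced to bumping a counter.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical timer record; the real CLR structures differ. */
typedef struct timer_node {
    uint64_t due_ms;           /* absolute due time in milliseconds */
    uint64_t period_ms;        /* 0 means one-shot */
    int fired_count;           /* stand-in for "queued on the thread pool" */
    struct timer_node *next;
} timer_node;

#define NO_TIMER_PENDING UINT64_MAX

/* Walk the whole list once: mark every due timer as fired and return how long
 * the single timer thread may sleep before the next timer is due. */
uint64_t fire_due_timers(timer_node *head, uint64_t now_ms) {
    uint64_t next_due = NO_TIMER_PENDING;
    for (timer_node *t = head; t != NULL; t = t->next) {
        if (t->due_ms == NO_TIMER_PENDING) continue;   /* retired one-shot */
        if (t->due_ms <= now_ms) {
            assert(t->fired_count >= 0);
            t->fired_count++;  /* the callback itself would run elsewhere */
            t->due_ms = t->period_ms ? now_ms + t->period_ms : NO_TIMER_PENDING;
        }
        if (t->due_ms < next_due) next_due = t->due_ms;
    }
    return next_due == NO_TIMER_PENDING ? NO_TIMER_PENDING : next_due - now_ms;
}
```

The sketch uses 64 bit milliseconds on purpose. With a signed 32 bit millisecond count, anything beyond roughly 24.8 days would not fit, which is the overflow concern mentioned above.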

On the managed side, there are some interesting comments explaining the expected usage and data structures used. There are two common cases, one is the use of timeout, in which case this is typically discarded before actual use, and the other is having the recurring timers, which tend to happen once in a long while. So the code favors adding / removing timers over actually finding which need to be executed.

Another thing to note is that this adding / removing / changing / iterating over timers is protected by a single lock. Every time the unmanaged timer wakes, it queues the callback on the thread pool, and then FireNextTimers is called, which takes a look, iterates over all the timers, and queues all the due timers to be executed on the thread pool.

This behavior is interesting, because it has some impact on commonly used cases. But I’ll discuss that in my next post.

Reverse engineering the Smaz compression library

time to read 9 min | 1737 words

Smaz is a small library for compressing small strings. It is strange to talk about reverse engineering such a project, because Smaz is open source and is available on GitHub. Nevertheless, the code on GitHub is actually only part of the story. It is dense, and it was generated by a script that isn't included. That means that when we read the code, we are missing some key components. In particular, we are looking at an end result without knowing how it was produced.

Compression algorithms take advantage of repetitions in text to reduce the overall size. State of the art algorithms typically use Lempel-Ziv and Huffman coding to compress the data. But they typically make no assumptions about the data, so they require a bit of length before they can gather enough state to pick up steam and start reducing the output length.

That means that for small strings (up to a hundred bytes or so), there is no real benefit in standard compression algorithms, and often you'll see an increase in the space taken. Smaz is meant to solve that for most common cases. It does so by using a prepared dictionary of common terms that can be easily compressed.

The first thing that you'll encounter when you read the Smaz code is this:


This goes on for about 50 lines, and at first glance, it looks utterly incomprehensible. We'll ignore this for now, and read further. The next item of interest is:


The cb postfix stands for code book, and rcb stands for reverse code book.

It took me a while to figure it out, but the idea in the code book entries is that this is actually a hash table. Each string in the codebook is actually multiple entries, composed of a single byte length, the term text, and then the index in the codebook. There are possibly multiple entries in each value in the array, and we use the length of each entry to index into them.

Let us see how this works:

int smaz_compress(char *in, int inlen, char *out, int outlen) {
    unsigned int h1,h2,h3=0;
    int verblen = 0, _outlen = outlen;
    char verb[256], *_out = out;

    while(inlen) {
        int j = 7, needed;
        char *flush = NULL;
        char *slot;

        h1 = h2 = in[0]<<3;
        if (inlen > 1) h2 += in[1];
        if (inlen > 2) h3 = h2^in[2];
        if (j > inlen) j = inlen;

This code just sets up a temporary buffer, and hashes the current input term's bytes. Then things start getting interesting.

        /* Try to lookup substrings into the hash table, starting from the
         * longer to the shorter substrings */
        for (; j > 0; j--) {
            switch(j) {
            case 1: slot = Smaz_cb[h1%241]; break;
            case 2: slot = Smaz_cb[h2%241]; break;
            default: slot = Smaz_cb[h3%241]; break;
            }
            while(slot[0]) {
                if (slot[0] == j && memcmp(slot+1,in,j) == 0) {
                    /* Match found in the hash table,

The value j is the current size we are checking, and we are using the hash value to get a particular slot in the hash table. Note that only the first 3 bytes are actually hashed. After the appropriate slot is found, we check whether the term length is a match, and if so, whether the actual strings match. If they don't, we run the following code:

slot += slot[0]+2;

This was a bit confusing at first, but what is basically going on here is that we move to the next entry in this slot. This is pretty neat, since it covers both empty slots (defined in the code as "") and multi value slots, and takes advantage of the fact that C strings end with \0 without having to specify it explicitly.
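To make the slot layout concrete, here is a minimal sketch of scanning a single slot. `slot_lookup` is a hypothetical helper, not part of Smaz, and the entries used to exercise it are made up rather than taken from the real codebook.

```c
#include <assert.h>
#include <string.h>

/* Scan one hash slot laid out as repeated [length][term bytes][codebook index]
 * records, terminated by the string's trailing NUL byte.
 * Returns the codebook index of the matching term, or -1 if there is none. */
int slot_lookup(const char *slot, const char *in, int len) {
    assert(len > 0);
    while (slot[0]) {
        if (slot[0] == len && memcmp(slot + 1, in, (size_t)len) == 0)
            return (unsigned char)slot[slot[0] + 1];
        slot += slot[0] + 2;   /* skip length byte, term, and index byte */
    }
    return -1;
}
```

For a made-up slot such as "\002th\005\003the\001", looking up "the" with length 3 skips the first record (its length is 2), matches the second, and returns 1. An empty slot ("") falls straight through the loop and returns -1.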

So what happens if we have a match?


In this case, we check if there are any verbatim bytes (bytes that we haven't been able to compress, and that are stored in a buffer). If there are such bytes, we do the following:

  • Compute the space needed for the verbatim value.
  • Set the next flush point in the buffer for verbatim values to the current output location.
  • Move the output pointer after the location where the verbatim string will be written.
  • Reduce the length of the remaining output size by the verbatim length.

Note that we aren't actually going to write anything yet. First, we need to emit the newly captured match to the code book:


This is done by first checking if there is space (if there isn't, we return an error).

Then we write a byte to the output buffer. But this is a very strange byte.

What is this "slot[slot[0]+1]"? Well, remember the structure that we talked about for the hash table entries? The first byte is a length, and the last byte is the index into the code book. So what we are actually doing is indexing into the entry to get the last value, which is the code book index, which we write to the output.

The rest is pretty much just book keeping information. We move the input buffer pointer according to the just discovered term, etc.


If no match was found, we just add the current byte to the verbatim buffer, and move on to the next one.

Now, let us look at this out label, and what it does:


Basically, it is a repeat of the earlier code when finding a match. If we have a large enough verbatim string, we need to flush it, so this takes care of it. The actual flushing is interesting:


If there is a single byte, we write 254 (single verbatim byte marker), then we write the byte. If there is more than a byte, we write 255 (variable length size), the length, then the verbatim string.

It is interesting to note that this can be written after the output byte has been written. It took me a while to understand this. Nothing wrong with it, but I like sequential writes much better.
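The verbatim encoding described above can be sketched like this. It follows the description in this post (254 for a single byte, 255 followed by the length otherwise); `flush_verbatim` is a hypothetical helper, and the exact length encoding in the real Smaz source may differ.

```c
#include <assert.h>
#include <string.h>

/* Write a run of uncompressed bytes using the two markers described above.
 * Returns the number of bytes written, or -1 if out is too small. */
int flush_verbatim(const unsigned char *verb, int len,
                   unsigned char *out, int outlen) {
    assert(len >= 1 && len <= 255);
    int needed = (len == 1) ? 2 : 2 + len;
    if (outlen < needed) return -1;
    if (len == 1) {
        out[0] = 254;                  /* single verbatim byte marker */
        out[1] = verb[0];
    } else {
        out[0] = 255;                  /* variable length marker */
        out[1] = (unsigned char)len;
        memcpy(out + 2, verb, (size_t)len);
    }
    return needed;
}
```

Note the overhead: one extra byte for a single verbatim byte, and two extra bytes for anything longer.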

The decompression is very simple. Read a byte from the compressed input. If it is 254, then the next byte should be copied verbatim to the output. If the byte is 255, then read the next byte, which is the length, and copy the next length bytes to the output. Finally, if the byte isn't 255 or 254, it is an index into the code book table, and the value should be taken from there.

I wrote a managed implementation of this, which allows me to play much more easily with the codebook definition. This was because there are a bunch of terms in the Smaz codebook that aren't really relevant for what we need. In particular, it looks like it was trained to find the most common values from a set of HTML documents; it has entries for "<div>", "><", etc. Instead of going with the 254 values that can be compressed, I pruned the list a bit, and ended up with just 245 items.
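A minimal sketch of that decoding loop, under the format as described in this post. `toy_decompress` and its codebook parameter are hypothetical; the real Smaz keeps its reverse codebook in a static table.

```c
#include <assert.h>
#include <string.h>

/* Decode the format described above. Returns the number of bytes written to
 * out, or -1 on truncated input, an unknown code, or a full output buffer. */
int toy_decompress(const unsigned char *in, int inlen,
                   const char *const *codebook, int codebook_size,
                   char *out, int outlen) {
    int written = 0;
    assert(codebook_size <= 254);         /* 254 and 255 are reserved markers */
    while (inlen > 0) {
        const char *src;
        int n, consumed;
        if (in[0] == 254) {               /* one verbatim byte follows */
            if (inlen < 2) return -1;
            src = (const char *)in + 1; n = 1; consumed = 2;
        } else if (in[0] == 255) {        /* length byte, then verbatim bytes */
            if (inlen < 2 || inlen < 2 + in[1]) return -1;
            src = (const char *)in + 2; n = in[1]; consumed = 2 + n;
        } else {                          /* index into the codebook */
            if (in[0] >= codebook_size) return -1;
            src = codebook[in[0]]; n = (int)strlen(src); consumed = 1;
        }
        if (outlen - written < n) return -1;
        memcpy(out + written, src, (size_t)n);
        written += n;
        in += consumed;
        inlen -= consumed;
    }
    return written;
}
```

With a made-up three entry codebook { " ", "the", "e" }, the input bytes 1, 0, 254, 'Q' decode to "the Q".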

I was then able to change the compression format to be:

  • If it is a value under 245, use the term from the codebook.
  • If it is a value higher than 245, it is a verbatim value whose actual length needs to be figured out by subtracting 245 from it.

This allows me to save a byte if we have more than two verbatim bytes.

Smaz is a really simple and clean library, but it isn't doing anything very complex. It is a static shared dictionary approach, without the use of more advanced techniques. I have previously written about more complex compression systems for small strings, such as FemtoZip. You can find the previous post here (there was a whole series of them).

In my next post, I'll try to compare the two options.

Decompression code & Discussion

time to read 4 min | 763 words

As I said, this one is a good & small interview question. It is a good one because it is short and relatively simple to handle, but it should show a lot of things about your code. To start with, it shows how you handle being faced with a non trivial task that most people are not that familiar with.

Implement a Decompress(Stream input, Stream output, byte[][] dictionary) routine for the following protocol:

Prefix byte – composed of the highest bit + 7 bits len

If the highest bit is not set, this is uncompressed data, copy the next len bytes from the input to the output.

If the highest bit is set, this is compressed data, the next byte will be the index in the dictionary, copy len bytes from that dictionary entry to the output.

I couldn’t resist doing this myself, and I came up with the following:

public void Decompress(Stream input, Stream output, byte[][] dictionary)
{
    var tmp = new byte[128]; // len is 7 bits, so 128 bytes always suffice
    while (true)
    {
        var readByte = input.ReadByte();
        if (readByte == -1)
            break; // clean end of input
        var prefix = (byte) readByte;
        var compressed = (prefix & 0x80) != 0;
        var len = prefix & 0x7f;

        if (compressed == false)
        {
            // Read may return fewer bytes than requested, so loop until done
            while (len > 0)
            {
                var read = input.Read(tmp, 0, len);
                if(read == 0)
                    throw new InvalidDataException("Not enough data to read from compressed input stream");
                len -= read;
                output.Write(tmp, 0, read);
            }
        }
        else
        {
            readByte = input.ReadByte();
            if(readByte == -1)
                throw new InvalidDataException("Not enough data to read from compressed input stream");
            output.Write(dictionary[readByte], 0, len);
        }
    }
}

Things to pay attention to: Low memory allocations, error handling, and handling of partial reads from the stream.

But that is just part of the question. After reading the protocol and implementing it, the question turns to what the protocol says about this kind of compression scheme. The use of just 7 bits to store len drastically limits the compression utility in a general format. It also requires an external dictionary, which most compression formats don’t use; they use the actual compressed text itself as the dictionary. Of course, I’ve been reading about compression algorithms for a while now, so that isn’t that fair. But I would expect people to note that the 7 bit len limits the compression usability.

And hopefully, with a bit of a hint, they should note that the external dictionary is useful for small data sets where the repetitions are actually between entries, not within a single entry.

Interview challenges: Decompress that

time to read 1 min | 137 words

I’m always looking for additional challenges that I can ask people who interview at Hibernating Rhinos, and I ran into an interesting challenge just now.

Implement a Decompress(Stream input, Stream output, byte[][] dictionary) routine for the following protocol:

Prefix byte – composed of the highest bit + 7 bits len

If the highest bit is not set, this is uncompressed data, copy the next len bytes from the input to the output.

If the highest bit is set, this is compressed data, the next byte will be the index in the dictionary, copy len bytes from that dictionary entry to the output.

After writing the code, the next question is going to be, what are the implications of this protocol? What is it going to be good for? What is it going to be bad for?

The dark sides of Lucene

time to read 6 min | 1022 words

I’ve been using Lucene for the past six or seven years, and after my last post, I thought it would be a good idea to talk a bit about the kind of things that it isn’t doing well. We’ve been using it extensively in RavenDB for the past 5 years, and I think that I have a pretty good understanding of it. We used to have one of Lucene.NET committers working at Hibernating Rhinos, so I’ve a high level of confidence that I’m not just stupidly not using it properly, too.

Probably the part that caused us the most pain with Lucene was the fact that it isn’t transactional. That is, it is quite easy to get into situations where the indexes are corrupted. That makes it… challenging to use in a database that needs to ensure consistency. The problem is that this is really not a use case that Lucene is well suited for. In order to ensure that data is saved, we have to commit often. In order to ensure good performance, we want to commit less often, but then we will lose the changes if we crash. For that matter, Lucene doesn’t make any attempt to actually flush the data properly, relying on the OS to do that, so a system crash can cause you to lose data even though you “committed” it.

Fun times, I can tell you that.

Next, we have the issue of what Lucene calls updates. Updates in Lucene are actually just delete/add, and they don’t maintain the same document id (more on that later). Because of that, you usually have to have an additional field in the index that would be your primary key, and you handle updates by first deleting then adding things. That is quite strange, to be fair, and it means that you can’t “extend” an index entry, you have to build it from scratch every time.

Speaking of this, let us talk a bit about deletes. Ignoring for the moment the absolutely horrendous decision to do deletes through the reader, let us talk about how they are actually done. Deletes are recorded in a separate file, and that means that the moment you have any deletes (or, as I mentioned, updates), all the internal statistics are wrong. We run into this quite often with RavenDB when we are doing things like facets or suggestions. For example, if you request a suggestion for a user name, it will happily give you suggestions for deleted users, even though we deleted them in Lucene.

It will go away eventually, when it is ready to optimize the index by merging all the files, but in the meantime, it makes for interesting bug reports. Speaking of merging, that is another common issue that you have to deal with. In order to ensure optimal performance, you have to be on top of the merge policy. This results in some interesting issues. For RavenDB’s purposes, we do a writer commit after every indexing batch. That means that if you are writing to RavenDB slowly enough, we do a commit after every document write. That results in a lot of segments, and the merge policy would have to do a lot of merges. The problem here is that merges have two distinct costs associated with them.

First, and obviously, you are going to need to write (again) all of the documents in all of the segments you are merging. That is very similar to doing merges in LevelDB (indeed, in general Lucene’s file format is remarkably similar to SST). Next, and arguably more interesting / problematic from our point of view, is the fact that it also kills all of the caches. Let me try to explain. Lucene uses a lot of caches to speed things up; in fact, most of the sorting is done by using the caches, for example. That works really well when we are querying normally, because segments are immutable, which makes for great caching. But on a merge, not only have we just invalidated all of our caches, we now need to read, again, all of the data that we just wrote, so we would be able to use it. That can be… costly. And both things can introduce stalls into the system.

The major problem externally with merges is that the document id changes, and that means that you cannot rely on them. It would be much easier if you could send an id out into the world, and get it back later and do something with it, but that isn’t possible with Lucene.

Next, and not really an operational issue like the rest, Lucene’s multi threaded behavior is… a hammer to an egg, in most cases. By that I mean code like this:


I mean, it is certainly functional, but it is pretty ugly.

Now, don’t get me wrong, I think that Lucene is pretty neat. But there are some really dark corners there. For example, the actual searching, go ahead and try to find where that is done in Lucene. It is very easy to get lost between all of the different aspects: weights, sorters, queries and various enumerators. For fun, a lot of that runs at hard to figure out times, making the actual query run time interesting to try to figure out.

As a good example, let us take the simplest possible query, TermQuery. Go ahead, try to find where it is actually doing the query for matching terms in this code: https://github.com/apache/lucene/blob/LUCENE_2_1/src/java/org/apache/lucene/search/TermQuery.java

That actually happens here: https://github.com/apache/lucene/blob/LUCENE_2_1/src/java/org/apache/lucene/search/TermScorer.java#L79, and it is effectively a side effect of calling reader.termDocs(term), which limits the matches only to those with the same term. Trying to track down where exactly things happen can be… interesting.

Anyway, this post is getting too long, and I want to get back to figuring out how Lucene does its thing without dwelling too much in the dark…

What Lucene does, a look under the hood

time to read 4 min | 777 words

Lucene is a search engine library, which is great. But as it turns out, there is a lot going on there. After working with it for several years, I can say with confidence that it is a pretty awesome library. But surprisingly, a lot of the effort that went into it doesn’t seem to be talked about / visible to people not trawling through the code. I think that this is a pretty good testament to how successful it is. That, and the fact that it is now the base line against which all other search libraries & engines are compared.

What I wanted to talk about today was the kind of things that Lucene is doing that don’t seem to get much publicity. I think that Spolsky said it best:

Back to that two page function. Yes, I know, it's just a simple function to display a window, but it has grown little hairs and stuff on it and nobody knows why. Well, I'll tell you why: those are bug fixes. One of them fixes that bug that Nancy had when she tried to install the thing on a computer that didn't have Internet Explorer. Another one fixes that bug that occurs in low memory conditions. Another one fixes that bug that occurred when the file is on a floppy disk and the user yanks out the disk in the middle. That LoadLibrary call is ugly but it makes the code work on old versions of Windows 95.

I remember just how much impact that article made on me at the time. And Lucene’s codebase bears out those words. Lucene is a search engine library, which basically means that it does:

  • Indexing
  • Querying
  • Maintenance

One of the major areas of maturity in Lucene is how it optimized indexing. You can see it in the code. For example, Lucene goes to a great deal of trouble to avoid allocating memory willy nilly. Instead, pretty much everything there is done via object pools. This helps reduce the memory pressure when doing a lot of indexing and can save a lot of GC cycles.

Another is the concept of multiple threads for indexing. A lot of Lucene is built around this idea; it has a lot of per thread state that is meant to ensure that you don’t have to deal with concurrency yourself. The idea is that you can take an IndexWriter and write to it concurrently, then call commit. A lot of the work to do with indexing is CPU intensive, so that makes a lot of sense, and Lucene nicely isolates you from all of that work. There is DocumentWriterPerThread, so you can see really nice scaling effects as you throw more threads & hardware at the problem.

Usually, when people start messing with Lucene, they do that by writing analyzers, and you are sort of exposed to the memory constraints by being encouraged to use ReusableTokenStream, etc. It also has a nice pipeline architecture for doing the indexing work with filters.

On the querying side, Lucene does a lot of work to ensure that things just work. It has a Boolean Model for searches, and Vector Space Model for ranking. Writing your own Query classes is pretty easy too, once you understand how things work, and again, this is another common place for people to extend Lucene. But there is a lot going on behind the scenes. Lucene does a lot of caching on a segment basis, and it is quite nice, since segments are immutable, it means that you can get pretty good usage out of that.

That gives it a lot of its speed, and it means that over time, things are actually going to be faster, because more parts of the segments are in memory and cached.

Finally, we have all of the other work that Lucene does. In practice, it means things like merging segments (hopefully in the background), and keeping the overall system humming along. Unfortunately, that is also one of the most common places for people to start tinkering with when they run into perf problems. That is anything but trivial, and optimizing it is something that requires a lot of expertise and understanding about the specific scenario you have.

And on top of that, you have everything else that already works on top of Lucene. Which is quite a lot.

As I said earlier, that is a very impressive piece of technology. That doesn’t mean that it doesn’t have its own set of problems, but that is something that I’ll discuss in detail in my next post.

Sorting with Lucene

time to read 2 min | 352 words

I talked about the Lucene formula and how it calculates things using tf-idf for best matches. Now I want to talk about the actual sorting implementation. As it turned out, the default sorting (by relevancy) is really simple. All you need is to get the relevant score for a query, then you shove the results through a heap with a specified size. The heap will take care of maintaining the top results.
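The bounded heap idea can be sketched as follows. This is a generic fixed size min-heap, not Lucene's own priority queue; `hit`, `top_hits`, `top_hits_offer` and `top_hits_pop` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

typedef struct { int doc; float score; } hit;

typedef struct {
    hit *items;        /* caller-provided storage for `capacity` hits */
    size_t capacity;
    size_t count;
} top_hits;

static void swap_hits(hit *a, hit *b) { hit t = *a; *a = *b; *b = t; }

/* Restore the min-heap property below index i. */
static void sift_down(top_hits *h, size_t i) {
    for (;;) {
        size_t l = 2 * i + 1, r = l + 1, m = i;
        if (l < h->count && h->items[l].score < h->items[m].score) m = l;
        if (r < h->count && h->items[r].score < h->items[m].score) m = r;
        if (m == i) return;
        swap_hits(&h->items[i], &h->items[m]);
        i = m;
    }
}

/* Offer one scored document; only the `capacity` best scores are retained. */
void top_hits_offer(top_hits *h, int doc, float score) {
    assert(h->capacity > 0);
    if (h->count < h->capacity) {
        size_t i = h->count++;
        h->items[i].doc = doc;
        h->items[i].score = score;
        while (i > 0 && h->items[i].score < h->items[(i - 1) / 2].score) {
            swap_hits(&h->items[i], &h->items[(i - 1) / 2]);
            i = (i - 1) / 2;
        }
    } else if (score > h->items[0].score) {
        h->items[0].doc = doc;          /* evict the weakest retained hit */
        h->items[0].score = score;
        sift_down(h, 0);
    }
}

/* Remove and return the weakest retained hit; draining yields ascending scores. */
hit top_hits_pop(top_hits *h) {
    assert(h->count > 0);
    hit weakest = h->items[0];
    h->items[0] = h->items[--h->count];
    sift_down(h, 0);
    return weakest;
}
```

The root always holds the weakest of the retained hits, so a new result only has to beat the root to get in. That keeps the cost per result at O(log n) for a heap of size n, no matter how many documents match.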

So far, that is pretty simple to understand. But the question is, how do you do sorting on a field value? The answer is, not easily.


GetStringIndex() does something very interesting. It returns a string index, which gives us:

  • A string array containing all the distinct (sorted) values for this index.
  • An int array with all the documents in the index, with the position of the value of that field in the string value array

Now we can compare documents by the position of their field value in the sorted value array, which gives us pretty good sorting. Unfortunately, this also requires us to load all the values into memory. Let us see another example, which would probably be easier to follow:

Sorting by an integer is done like this:


Get an array (whose size matches the number of documents) holding the field value of each document. We can then sort things easily, because accessing the relevant field value only requires us to have the document id to index into the array.

The reason Lucene does this is that it uses an inverted index, which maps field values to documents, and it has no easy way of going from a document back to its field value. So it is easier to read all the values into memory and work with them there. I don’t like it, but off hand, I can’t think of a better way to handle this.
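Both cases come down to the same trick: keep one array indexed by document id, and compare documents through it. A minimal sketch, where `sort_docs_by_ordinal` is a hypothetical helper and an insertion sort stands in for whatever Lucene actually uses:

```c
#include <assert.h>
#include <stddef.h>

/* Sort document ids in place by a per-document key array. For a string field
 * the key is the value's position in the sorted distinct-value array; for an
 * integer field it is the value itself. Insertion sort keeps the sketch short
 * and stable. */
void sort_docs_by_ordinal(int *docs, size_t n, const int *order) {
    assert(docs != NULL || n == 0);
    for (size_t i = 1; i < n; i++) {
        int doc = docs[i];
        size_t j = i;
        while (j > 0 && order[docs[j - 1]] > order[doc]) {
            docs[j] = docs[j - 1];
            j--;
        }
        docs[j] = doc;
    }
}
```

Suppose the distinct values are { "apple", "banana", "cherry" } and the per-document array is { 2, 0, 1, 0 }, meaning document 0 holds "cherry" and documents 1 and 3 hold "apple". Sorting the documents 0..3 then yields 1, 3, 2, 0 without ever comparing a string.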

The Lucene formula: TF & IDF

time to read 4 min | 721 words

The basis of how Lucene works is tf–idf, short for term frequency–inverse document frequency. You can read about it here. It is pretty complex, but it is explained in detail in the Lucene docs. It all boils down to:


For now, I want to look at the tf() function. The default implementation is in DefaultSimilarity.Tf, and is defined as:


That isn’t really helpful, however. Not without knowing what freq is.  And I’m pretty sure that Sqrt isn’t cheap, which probably explains this:


So it caches the score function, and it appears that the tf() is purely based on the count, not on anything else. Before I’ll go and find out what is done with the score cache, let’s look at the weight value. This is calculated here:


And idf stands for inverse document frequency.  That is defined by default to be:


And the actual implementation as:


And the caller for that is:


So, we get the document frequency of a term, that is, in how many documents this term shows, and the number of all the documents, and that is how we calculate the idf.
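For reference, the two default functions discussed so far can be written out as follows. This is how I understand the classic DefaultSimilarity to define them; treat the exact form as an assumption and check it against the version of Lucene you are reading.

```latex
\mathrm{tf}(\mathit{freq}) = \sqrt{\mathit{freq}}
\qquad
\mathrm{idf}(t) = 1 + \ln\!\left(\frac{\mathit{numDocs}}{\mathit{docFreq}(t) + 1}\right)
```

As a worked example under that assumption: with numDocs = 10, a term that appears in a single document gets idf = 1 + ln(10 / 2) ≈ 2.61, while a term that appears in nine documents gets idf = 1 + ln(10 / 10) = 1. Rare terms therefore weigh more than common ones.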

So far, so good. But how about that doc frequency? This is one of the things that we store in the terms info file. But how does that get calculated?

I decided to test this with:


This should generate 3 terms, two with a frequency of 1 and one with a frequency of 2. Now, to figure out how this is calculated. That part is a real mess.

During indexing, there is a manually maintained hash table that contains information about each unique term, and when we move from one document to another, we write the number of times each term appeared in the document. For fun, this is written to what I think is an in memory buffer for safe keeping, but it is really hard to follow.

Anyway, I now know enough about how that works for simple cases, where everything happens in a single segment. Now let us look at what happens when we use multiple segments. It is actually quite trivial. We just need to sum the frequency of each term across all segments. This gets more interesting when we involve deletes. Because of the way Lucene handles deletes, it can’t really handle this scenario, and deleting a document does not remove its frequency counts for the terms that it had. That is the case until the index does a merge, and fixes everything that way.

So now I have a pretty good idea about how it handled the overall term frequency. Note that this just gives you the number of times this term has been seen across all documents. What about another important quality, the number of times this term appears in a specific document? Those are stored in the frq file, and they are accessible during queries. This is then used in conjunction with the overall term frequency to generate the boost factor per result.

Peeking into Lucene indexing

time to read 3 min | 516 words

Continuing my trip into the Lucene codebase, now I’m looking into how indexing actually happens. Interestingly enough, that is something that we never really had to look at before.

It is quite clear that Lucene is heavily geared toward utilizing additional threads for improving overall indexing speed. You can see it in the amount of per thread state that exists all over the indexing code. That is also meant to reduce memory consumption, as far as I can see.


The real stuff happens in the ProcessDocument() where we have a chain of DocFieldConsumerPerThread and TermsHashConsumerPerThread which actually do the work.

Then the real work is happening on a per field level in DocInverterPerField, where the analyzer is actually called. The process in which the analyzers return values for the text is interesting. There are “attributes” that are added to the stream, per token, and they are used to get the relevant values. I assume that this allows for different levels of analyzers.


This way, you can have analyzers that don’t deal with offsets or positions, etc. And the actual processing of this is done in a chain that appears to be:

  • Freq Prox
  • Term Vectors

But that isn’t something that I really care about now. I want to see how Lucene writes the actual terms. This is being written in the TermsInfoWriter, which took some time to find.


Terms are stored in a prefix compressed mode (sorted, obviously), and there is the actual terms, and an index into the terms, which allows for faster seeking into the file. This is actually done here:


This is a single term written to the file. A lot of the stuff Lucene does (prefixes, VInt, etc) is done in the name of conserving space, and it reminds me greatly of LevelDB’s SST. In fact, the way terms are stored is pretty much an SST, except that this happens to be on multiple files. Pretty much the entire behavior and all the implications are the same.
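To illustrate the two space saving tricks mentioned here, this is a minimal sketch of prefix compression combined with a variable length integer. The VInt layout (7 bits per byte, high bit set on all but the last byte) follows Lucene's documented format. `write_term_delta` is a hypothetical simplification that leaves out the field number and the other per-term data Lucene writes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Variable-length integer: 7 payload bits per byte, low-order group first.
 * The high bit is set on every byte except the last.
 * Returns the number of bytes written (at most 5). */
size_t write_vint(uint32_t value, unsigned char *out) {
    size_t n = 0;
    while (value >= 0x80) {
        out[n++] = (unsigned char)((value & 0x7F) | 0x80);
        value >>= 7;
    }
    out[n++] = (unsigned char)value;
    return n;
}

/* Encode `term` relative to the previous term in sorted order as
 * VInt(shared prefix length), VInt(suffix length), suffix bytes.
 * The caller must provide a large enough output buffer. */
size_t write_term_delta(const char *prev, const char *term, unsigned char *out) {
    assert(prev != NULL && term != NULL && out != NULL);
    size_t shared = 0;
    while (prev[shared] && prev[shared] == term[shared]) shared++;
    size_t suffix_len = strlen(term + shared);
    size_t n = write_vint((uint32_t)shared, out);
    n += write_vint((uint32_t)suffix_len, out + n);
    memcpy(out + n, term + shared, suffix_len);
    return n + suffix_len;
}
```

Writing "searching" right after "search" takes 5 bytes (6, 3, "ing") instead of 9 or more. That is why this only works when the terms are kept sorted.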

It also means that searching on this is fast, because the data is sorted, but pretty complex. And it also explains a lot about the actual exposed API that it has. I think that I have a pretty good idea on how things work now. I want to now go back up and look at specific parts of how it works… but that is going to be the next post.

The Lucene disk format

time to read 3 min | 577 words

I realized lately that I wanted to know a lot more about exactly how Lucene is storing data on disk. Oh, I know the general stuff about segments and files, etc. But I wanted to know the actual bits & bytes. So I started tracing into Lucene and trying to figure out what it is doing.

And, by the way, the only thing that the Lucene.NET codebase is missing is this sign:


At any rate, this is how Lucene writes the segment file. Note that this is done in a CRC32 signed file:


And the info write method is:


Today, I would probably use a JSON file for something like that (bonus point, you know if it is corrupted and it is human readable), but this code was written in 2001, so that explains it.

This is the format of a segment file, and the segments.gen file is generated using:


Moving on to actually writing data, I created ten Lucene documents and wrote them. Then just debugged through the code to see what will happen. It started by creating _0.fdx and _0.fdt files. The .fdt is for fields, the fdx is for field indexes.

Both of those files are used when writing the stored fields. This is the empty operation, writing an unstored field.


This is how fields are actually stored:


And then it ends up in:


Note that this particular data goes in the fdt file, while the fdx appears to be a quick way to go from a known document id to the relevant position in the fdt file.

As I was going through the code, I did some searches, and found a very detailed explanation of the actual file format in the docs. That is really nice and quite informative. However, just seeing how the “let us take the documents and make them searchable” part works is quite interesting. Lucene has a lot of chains of responsibilities going through. And it is also quite interesting to see the design choices that were made.

Unfortunately, Lucene is very much wedded to its file format, and making changes to it isn’t going to be possible, which is a shame, since it impacts quite a lot of the way Lucene works in general.

