Ayende @ Rahien

Hi!
My name is Oren Eini
Founder of Hibernating Rhinos LTD and RavenDB.
You can reach me by phone or email:

ayende@ayende.com

+972 52-548-6969

, @ Q c

Posts: 18 | Comments: 87

filter by tags archive

Reviewing LevelDBPart V, into the MemTables we go

time to read 2 min | 263 words

You can read about the theory of Sorted Strings Tables and Memtables here. In this case, what I am interested in is going a bit deeper into the leveldb codebase, and understanding how the data is actually kept in memory and what is it doing there.

In order to do that, we are going to investigate MemTable. As it turned out, this is actually a very simple data structure. A MemTable just hold a SkipList, whish is a sorted data structure that allows O(log N) access and modifications. The interesting thing about Skip List in contrast to Binary Trees, is that it is much easier to create a performant solution of concurrent skip list (either with or without locks) over a concurrently binary tree.

The data in the table is just a list of key & value (or delete marker). And that means that searches through this can give you three results:

  • Here is the value for the key (exists)
  • The value for the key was remove (deleted)
  • The value is not in the memory table (missing)

It is the last part where we get involved with the more interesting aspect of LevelDB (and the reason it is called leveldb in the first place). The notion that you have multiple levels. The mem table is the first one, and then you spill the output out to disk (the Sorted Strings Table). Now that I figure out how simple MemTable is really is, I am going to take a look at the leveldb log, and then dive into Sorted Strings Table.

More posts in "Reviewing LevelDB" series:

  1. (26 Apr 2013) Part XVIII–Summary
  2. (15 Apr 2013) Part XVII– Filters? What filters? Oh, those filters…
  3. (12 Apr 2013) Part XV–MemTables gets compacted too
  4. (11 Apr 2013) Part XVI–Recovery ain’t so tough?
  5. (10 Apr 2013) Part XIV– there is the mem table and then there is the immutable memtable
  6. (09 Apr 2013) Part XIII–Smile, and here is your snapshot
  7. (08 Apr 2013) Part XII–Reading an SST
  8. (05 Apr 2013) Part XI–Reading from Sort String Tables via the TableCache
  9. (04 Apr 2013) Part X–table building is all fun and games until…
  10. (03 Apr 2013) Part IX- Compaction is the new black
  11. (02 Apr 2013) Part VIII–What are the levels all about?
  12. (29 Mar 2013) Part VII–The version is where the levels are
  13. (28 Mar 2013) Part VI, the Log is base for Atomicity
  14. (27 Mar 2013) Part V, into the MemTables we go
  15. (26 Mar 2013) Part IV
  16. (22 Mar 2013) Part III, WriteBatch isn’t what you think it is
  17. (21 Mar 2013) Part II, Put some data on the disk, dude
  18. (20 Mar 2013) Part I, What is this all about?

Comments

Duarte Nunes

The writes being serialized, they only need to worry about concurrent readers, right? Also, how is the trimming of the list done?

In the article you link to, the author states that "Periodically, the MemTable is flushed to disk as an SSTable", but that doesn't seem to be the case. First, because that would give a window where updates would be lost, and also because we saw in a previous post that first a write is applied to the log, and only then to the memtable. Am I missing something?

Matt Warren

@Duarte

The log and SSTables are 2 different files, see http://leveldb.googlecode.com/git/doc/impl.html for more info. Also the flusing happens in the background after a new log has been started, so no updates are lost (see "Level 0" in the link above.

Comment preview

Comments have been closed on this topic.

FUTURE POSTS

  1. Buffer allocation strategies: A possible solution - 2 days from now
  2. Buffer allocation strategies: Explaining the solution - 3 days from now
  3. Buffer allocation strategies: Bad usage patterns - 4 days from now
  4. The useless text book algorithms - 5 days from now
  5. Find the bug: The concurrent memory buster - 6 days from now

There are posts all the way to Sep 11, 2015

RECENT SERIES

  1. Find the bug (5):
    20 Apr 2011 - Why do I get a Null Reference Exception?
  2. Production postmortem (10):
    03 Sep 2015 - The industry at large
  3. What is new in RavenDB 3.5 (7):
    12 Aug 2015 - Monitoring support
  4. Career planning (6):
    24 Jul 2015 - The immortal choices aren't
View all series

Syndication

Main feed Feed Stats
Comments feed   Comments Feed Stats