Ayende @ Rahien

Hi!
My name is Oren Eini
Founder of Hibernating Rhinos LTD and RavenDB.
You can reach me by phone or email:

ayende@ayende.com

+972 52-548-6969

, @ Q c

Posts: 5,972 | Comments: 44,511

filter by tags archive

Reviewing LevelDBPart XVII– Filters? What filters? Oh, those filters…


I was all set to finish this series, when I realized that I missed something important, I didn’t covered, and indeed, I don’t currently understand, what filters are, and how are they used.

As usual, once I actually got to the part of the code where this is actually handled (instead of ignoring it), it made a lot more sense:

image

And that, following with the rest of the code that I read makes a lot more sense.

A SST is a sorted table, essentially. We have the keys and the blocks, etc. And we can find a value very quickly. But what if you could do this even more cheaply?

Using a bloom filter is a good way to never get false negative, and it will reduce the amount of work we have to do if we can’t find the key in the SST drastically (only need to read the filter value, don’t even need to do any additional checks). Quite elegant, even if I say so myself.

There is one thing to pay attention to and that is that you can define your own comparator, in which case you must also define you own filter policy. If you use a comparator that ignore casing, you also need to provider a filter that ignore casing.

More posts in "Reviewing LevelDB" series:

  1. (26 Apr 2013) Part XVIII–Summary
  2. (15 Apr 2013) Part XVII– Filters? What filters? Oh, those filters…
  3. (12 Apr 2013) Part XV–MemTables gets compacted too
  4. (11 Apr 2013) Part XVI–Recovery ain’t so tough?
  5. (10 Apr 2013) Part XIV– there is the mem table and then there is the immutable memtable
  6. (09 Apr 2013) Part XIII–Smile, and here is your snapshot
  7. (05 Apr 2013) Part XI–Reading from Sort String Tables via the TableCache
  8. (04 Apr 2013) Part X–table building is all fun and games until…
  9. (02 Apr 2013) Part VIII–What are the levels all about?
  10. (29 Mar 2013) Part VII–The version is where the levels are
  11. (28 Mar 2013) Part VI, the Log is base for Atomicity
  12. (27 Mar 2013) Part V, into the MemTables we go
  13. (26 Mar 2013) Part IV
  14. (22 Mar 2013) Part III, WriteBatch isn’t what you think it is
  15. (21 Mar 2013) Part II, Put some data on the disk, dude

Comments

Igor Kalders

Thanks for highlighting the concept of Bloom Filter, which I was ignorant of up until now :)

Frank Quednau

Kelly Sommers once had a piece on blom filters, to explain how it works... http://kellabyte.com/2013/01/24/using-a-bloom-filter-to-reduce-expensive-operations-like-disk-io/

Andrew

An interesting article which summarizes LevelDB quite nicely. The visuals really helped me understand the different levels and the log interaction

Comment preview

Comments have been closed on this topic.

FUTURE POSTS

  1. Reducing parsing costs in RavenDB - one day from now

There are posts all the way to Aug 04, 2015

RECENT SERIES

  1. Production postmortem (5):
    29 Jul 2015 - The evil licensing code
  2. Career planning (6):
    24 Jul 2015 - The immortal choices aren't
  3. API Design (7):
    20 Jul 2015 - We’ll let the users sort it out
  4. What is new in RavenDB 3.5 (3):
    15 Jul 2015 - Exploring data in the dark
  5. The RavenDB Comic Strip (3):
    28 May 2015 - Part III – High availability & sleeping soundly
View all series

Syndication

Main feed Feed Stats
Comments feed   Comments Feed Stats