Ayende @ Rahien

Hi!
My name is Ayende Rahien
Founder of Hibernating Rhinos LTD and RavenDB.
You can reach me by phone or email:

ayende@ayende.com

@

Posts: 5,947 | Comments: 44,544

filter by tags archive

Reviewing LevelDBPart IV


his is a bit of a side track. One of the things that is quite clear to me when I am reading the leveldb code is that I was never really any good at C++. I was a C/C++ developer. And that is a pretty derogatory term. C & C++ share a lot of the same syntax and underlying assumption, but the moment you want to start writing non trivial stuff, they are quite different. And no, I am not talking about OO or templates.

I am talking about things that came out of that. In particular, throughout the leveldb codebase, they are very rarely, if at all, allocate memory directly. Pretty much the whole codebase rely on std::string to handle buffer allocations and management. This make sense, since RAII is still the watch ward for good C++ code. Being able to utilize std::string for memory management also means that the memory will be properly released without having to deal with it explicitly.

More interestingly, the leveldb codebase is also using std::string as a general buffer. I wonder why it is std::string vs. std::vector<char>,  which would bet more reasonable, but I guess that this is because most of the time, users will want to pass strings as keys, and likely this is easier to manage, given the type of operations available on std::string (such as append).

It is actually quite fun to go over the codebase and discover those sort of things. Especially if I can figure them out on my own Smile.

This is quite interesting because from my point of view, buffers are a whole different set of problems. We don’t have to worry about the memory just going away in .NET (although we do have to worry about someone changing the buffer behind our backs), but we have to worry a lot about buffer size. This is because at some point (80Kb), buffers graduate to the large object heap, and stay there. Which means, in turn, that every time that you want to deal with buffers you have to take that into account, usually with a buffer pool.

Another aspect that is interesting with regards to memory usage is the explicit handling of copying. There are various places in the code where the copy constructor was made private, to avoid this. Or a comment is left about making a type copy-able intentionally. I get the reason why, because it is a common failing point in C++, but I forgot (although I am pretty sure that I used to know) the actual semantics of when/ how you want to do that in all cases.

More posts in "Reviewing LevelDB" series:

  1. (26 Apr 2013) Part XVIII–Summary
  2. (15 Apr 2013) Part XVII– Filters? What filters? Oh, those filters…
  3. (12 Apr 2013) Part XV–MemTables gets compacted too
  4. (11 Apr 2013) Part XVI–Recovery ain’t so tough?
  5. (10 Apr 2013) Part XIV– there is the mem table and then there is the immutable memtable
  6. (09 Apr 2013) Part XIII–Smile, and here is your snapshot
  7. (05 Apr 2013) Part XI–Reading from Sort String Tables via the TableCache
  8. (04 Apr 2013) Part X–table building is all fun and games until…
  9. (02 Apr 2013) Part VIII–What are the levels all about?
  10. (29 Mar 2013) Part VII–The version is where the levels are
  11. (28 Mar 2013) Part VI, the Log is base for Atomicity
  12. (27 Mar 2013) Part V, into the MemTables we go
  13. (26 Mar 2013) Part IV
  14. (22 Mar 2013) Part III, WriteBatch isn’t what you think it is
  15. (21 Mar 2013) Part II, Put some data on the disk, dude

Comments

Simon Skov Boisen

I would say that the group of objects that you make none-copyable in C++ sorta is in the same camp as those you use finalizers on in .NET, e.g. network-connections or file-handles.

Artem

What's the difference between CLR buffer going to LOG versus C++ buffer going to C heap? Is main problem same in both cases?

Ayende Rahien

Artem, In C++, you have far more granular control over where that memory is. That greatly alleviate the issues. Sure, in both cases, you may get heap fragmentation, but if that is a problem, you can destroy the heap and create a new one. In .NET, that isn't possible.

Comment preview

Comments have been closed on this topic.

FUTURE POSTS

No future posts left, oh my!

RECENT SERIES

  1. RavenDB Sharding (3):
    22 May 2015 - Adding a new shard to an existing cluster, splitting the shard
  2. The RavenDB Comic Strip (2):
    20 May 2015 - Part II – a team in trouble!
  3. Challenge (45):
    28 Apr 2015 - What is the meaning of this change?
  4. Interview question (2):
    30 Mar 2015 - fix the index
  5. Excerpts from the RavenDB Performance team report (20):
    20 Feb 2015 - Optimizing Compare – The circle of life (a post-mortem)
View all series

RECENT COMMENTS

Syndication

Main feed Feed Stats
Comments feed   Comments Feed Stats