Ayende @ Rahien

Hi!
My name is Oren Eini
Founder of Hibernating Rhinos LTD and RavenDB.
You can reach me by phone or email:

ayende@ayende.com

+972 52-548-6969


If you throttle me any more I am going to throttle you back!


It is interesting to note that for a long while, what we were trying to do with RavenDB was make it use fewer and fewer resources. One of the reasons for that is that using fewer resources is obviously better, because we aren’t wasting anything.

The other reason is that we have users running us on 512 MB / 650 MHz Celeron 32-bit machines. So we really need to be able to fit into a small box (and also leave enough processing power for the user to actually do something with the machine).

We have gotten really good at doing that, actually.

The problem is that we also have users running RavenDB on standard server hardware (32 GB / 16 cores, RAID and what not) in which case they (rightly) complain that RavenDB isn’t actually using all of their hardware.

Now, being conservative about resource usage is generally good, and we do have the configuration in place which can tell RavenDB to use more memory. It is just that this isn’t polite behavior.

In most cases RavenDB shouldn’t require anything special for you to run it; we want it to be truly a zero-admin database. The solution? Take into account the system state and increase the amount of work that we do to get things done. And yes, I am aware of the pitfalls.

As long as there is enough free RAM available, we will increase the number of documents that we are going to index in a single batch. That is subject to some limits (for example, if we just created a new index on a big database, we need to make sure we aren’t trying to load it entirely into memory), and it knows how to reserve some room for other things, and how to throttle down as well as up.
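To make the idea concrete, here is a minimal sketch of that kind of autotuning. It is written in Python for brevity and is not RavenDB's actual code: the class name, the default sizes, the reserve threshold and the doubling/halving policy are all assumptions made for illustration. The free memory figure is passed in by the caller, so the sketch doesn't depend on any particular OS API.

```python
from dataclasses import dataclass


@dataclass
class BatchSizer:
    """Hypothetical indexing batch autotuner (illustrative only, not RavenDB's code)."""

    initial_size: int = 512        # assumed conservative default batch size
    max_size: int = 131_072        # assumed hard ceiling, e.g. for a new index on a big database
    reserve_bytes: int = 1 << 30   # assumed amount of RAM to leave free for other things

    def __post_init__(self) -> None:
        # Start from the conservative default and adjust from there.
        self.current = self.initial_size

    def next_size(self, free_bytes: int, last_batch_was_full: bool) -> int:
        """Return the number of documents to index in the next batch."""
        if free_bytes < self.reserve_bytes:
            # Memory is getting tight: throttle down, but never below the default.
            self.current = max(self.initial_size, self.current // 2)
        elif last_batch_was_full:
            # Plenty of free RAM and more work waiting: throttle up, within the ceiling.
            self.current = min(self.max_size, self.current * 2)
        return self.current
```

The caller would sample the free memory before each batch and ask for the next size. On a small box the size simply stays at the default, and on a big server it climbs until it hits the ceiling or the reserve.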

This post was written before I had the chance to actually test this on a production-size dataset, but I am looking forward to seeing how it works.

Update: Okay, that is encouraging. It looks like what we did just made things over 7 times faster. And this isn’t a micro benchmark; this is when you throw this at a multi-GB database with full-text search indexing.

Next, we need to investigate what we are going to do about multiple running indexes and how this optimization affects them. Fun :)


Comments

Paul Betts

Be careful of this strategy of using the free memory to make decisions on memory allocation - http://blogs.msdn.com/b/oldnewthing/archive/2012/01/18/10257834.aspx

Ayende Rahien

Paul, I am well aware of this, and yes, we took that into account.

Ayende Rahien

Paul, did you notice that I linked to that exact article in the post?

Martin Larsen

How did you take it into account?

Ayende Rahien

Martin, we aren't making blind guesses. We are going to use as much memory as we can, but we will stop if we reach the predefined limit. That means that if there are other apps running and using memory, it doesn't affect us. We aren't trying to use more memory, we are simply trying not to use too much.

Itamar

Someone is probably going to mention this soon anyway, so I thought I'd go ahead and do it myself.

Oftentimes, using fewer resources is considered "being green". For example, this web CMS is written entirely in C++ to minimize the use of resources: http://cppcms.com/wikipp/en/page/rationale

Question is how far you are willing to go with it. I'm pretty sure moving from your RDBMS of choice to any NoSQL that better fits your needs will save 10 times the energy you had been using so far...

Patrick Huizinga

@Itamar: In that case I would like to refer to the quote "Premature optimization..."

You won't know whether it costs more or less energy to cache data in memory instead of spinning up those platters every time until you have actually measured it.

Gian Maria

Usually database engines try to allocate as much memory as they can, unless some configuration sets an upper limit.

This is how SQL Server or Oracle Database works, and it is perfectly good, because usually they run on a dedicated server. If you need to limit resource usage, just set up a limit.

Having unused memory is not useful :).
