Ayende @ Rahien

Hi!
My name is Oren Eini
Founder of Hibernating Rhinos LTD and RavenDB.
You can reach me by phone or email:

ayende@ayende.com

+972 52-548-6969


The cost of abstraction - Part II

time to read 2 min | 263 words

Sasha pointed out that I should also test what happens when you are using multiple implementations of the interface vs. direct calls. This is important because of JIT optimizations for interface calls that always resolve to the same implementation.

Here is the code:

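using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;

// NullList (the second IList<string> implementation) is not shown in this post.
class Program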
{
	public static void Main(string[] args)
	{
		//warm up
		PerformOp(new List<string>(101));
		PerformOp(new NullList());
		List<long> times = new List<long>();

		Stopwatch startNew = new Stopwatch();
		for (int i = 0; i < 50; i++)
		{
			startNew.Start();
			PerformOp(new List<string>(101));
			PerformOp(new NullList());
			times.Add(startNew.ElapsedMilliseconds);
			startNew.Reset();
		}
		// average elapsed milliseconds per timed run
		Console.WriteLine(times.Average());
	}

	private static void PerformOp(List<string> strings)
	{
		for (int i = 0; i < 100000000; i++)
		{
			strings.Add("item");
			if(strings.Count>100)
				strings.Clear();
		}
	}

	private static void PerformOp(NullList strings)
	{
		for (int i = 0; i < 100000000; i++)
		{
			strings.Add("item");
			if (strings.Count > 100)
				strings.Clear();
		}
	}
}
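
The listing above is the direct-call variant. The interface variant behind the IList<T> figure is not reproduced in the post, so the following is only a minimal sketch of what it presumably looks like: a single PerformOp typed against IList<string> and called with both implementations. The NullList body, the InterfaceCallSketch class name, the iterations parameter and the reduced iteration count are all assumptions made for illustration, not the original test code.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Diagnostics;

// Hypothetical stand-in for the post's NullList (its source is not shown):
// an IList<string> that silently discards everything.
public class NullList : IList<string>
{
    public int Count { get { return 0; } }
    public bool IsReadOnly { get { return false; } }
    public string this[int index]
    {
        get { throw new ArgumentOutOfRangeException("index"); }
        set { }
    }
    public void Add(string item) { }
    public void Clear() { }
    public bool Contains(string item) { return false; }
    public void CopyTo(string[] array, int arrayIndex) { }
    public int IndexOf(string item) { return -1; }
    public void Insert(int index, string item) { }
    public bool Remove(string item) { return false; }
    public void RemoveAt(int index) { }
    public IEnumerator<string> GetEnumerator() { yield break; }
    IEnumerator IEnumerable.GetEnumerator() { return GetEnumerator(); }
}

public static class InterfaceCallSketch
{
    // One method typed against the interface. Because it is called with two
    // different concrete types, Add/Count/Clear cannot be bound to a single
    // implementation at this call site.
    public static void PerformOp(IList<string> strings, int iterations)
    {
        for (int i = 0; i < iterations; i++)
        {
            strings.Add("item");
            if (strings.Count > 100)
                strings.Clear();
        }
    }

    public static void Main()
    {
        // Assumption: far fewer iterations than the post's 100,000,000 per call.
        const int iterations = 1000000;

        // Warm up so JIT compilation is not part of the measurement.
        PerformOp(new List<string>(101), iterations);
        PerformOp(new NullList(), iterations);

        Stopwatch watch = Stopwatch.StartNew();
        PerformOp(new List<string>(101), iterations);
        PerformOp(new NullList(), iterations);
        watch.Stop();
        Console.WriteLine("IList<string>: {0} ms", watch.ElapsedMilliseconds);
    }
}
```

Timings from a sketch like this depend heavily on the runtime version and build configuration, so they should not be expected to match the numbers in the post.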

And the results are:

  • IList<T> - 5789 milliseconds
  • List<string>, NullList - 3941 milliseconds

Note that this is significantly higher than the previous results, because now we run the loop two hundred million times.

An individual method call here costs:

  • IList<T> - 0.0000289489 milliseconds
  • List<string>, NullList - 0.0000197096 milliseconds

We now have a major difference between the two calls. When we had a single implementation of the interface, the difference between the two was 0.00000406 milliseconds.

With multiple implementations, the difference between the two methods is 0.0000092393 milliseconds (about 9 nanoseconds). That is much higher than before, and still far less than a microsecond.
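
For reference, the per-call figures can be reproduced approximately from the rounded totals, treating each loop iteration as one call the way the post does. Each timed run covers two loops of 100,000,000 iterations. The nanosecond conversions are added here for readability and are not in the original.

```latex
\begin{align*}
\text{iterations per run} &= 2 \times 100{,}000{,}000 = 2 \times 10^{8} \\
\text{interface calls:}\quad \frac{5789\ \text{ms}}{2 \times 10^{8}}
  &\approx 2.89 \times 10^{-5}\ \text{ms} \approx 28.9\ \text{ns} \\
\text{direct calls:}\quad \frac{3941\ \text{ms}}{2 \times 10^{8}}
  &\approx 1.97 \times 10^{-5}\ \text{ms} \approx 19.7\ \text{ns} \\
\text{difference} &\approx 28.9\ \text{ns} - 19.7\ \text{ns} = 9.2\ \text{ns}
\end{align*}
```

The small mismatch in the later digits against the quoted values suggests the post divided unrounded averages, but the result is the same to three significant figures.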


Comments

James Kovacs

I agree with Oren here. Think about performance in the big:

  • Are my remote calls chunky enough?

  • Can I scale out across a farm?

  • Am I caching enough (or too much) data and do I need to cache at all?

  • etc.

Don't worry about performance in the small:

  • Interfaces vs. abstract base class vs. direct calls

  • Hand-optimizing sort algorithms

  • Crazy locking/threading constructs

  • etc.

Performance in the big is hard to fix later. You can't just magically make millions of small, chatty remote calls turn into a few dozen. You can almost magically realize that you picked a poor sorting algorithm (based on perf traces) and switch to a more optimal one.

Let's take Oren's example of interfaces vs. direct calls. If your perf traces indicate that the call through the interface is your bottleneck, how hard is it to change IList to List and create a similar NullList method? I would think 5 minutes on the outside including running your tests. On the other hand, how hard is it to maintain tons of duplicated methods throughout your codebase on the chance that interface calls are too slow? I don't even want to think about it.

Besides, if you spend all your time thinking about performance in the small, you'll never have time to think about performance in the big - which is where it really matters!

Sasha Goldshtein

James,

I think you've joined the conversation after it's gotten slightly out of context. I have not suggested that you shouldn't use interfaces because they are bad for performance. It would be a ridiculous thing to say. Neither did I suggest that you shouldn't care about "performance in the big" because you care about "performance in the small".

I care about performance all the way, throughout the entire cycle. I care about performance when designing my remote interfaces to be chunky instead of chatty; but I also care about performance when choosing the right collection for the job or optimizing object layout.

Sometimes refactoring for performance is easy, sometimes it's hard. The same can be said about refactoring in general. And another point I tried to convey is that sometimes decoupling, dependency injection and other techniques which are good for correctness might not work out just as well as far as performance is concerned.

Sasha
