Ayende @ Rahien

Hi!
My name is Ayende Rahien
Founder of Hibernating Rhinos LTD and RavenDB.
You can reach me by phone or email:

ayende@ayende.com

+972 52-548-6969

@

Posts: 5,947 | Comments: 44,540

filter by tags archive

The cost of abstraction - Part II


Sasha pointed out that I should also test what happens when you are using multiply implementation of the interface, vs. direct calls. This is important because of JIT optimizations with regards to interface calls that always resolve to the same instance.

Here is the code:

class Program
{
	public static void Main(string[] args)
	{
		//warm up
		PerformOp(new List<string>(101));
		PerformOp(new NullList());
		List<long> times = new List<long>();

		Stopwatch startNew = new Stopwatch();
		for (int i = 0; i < 50; i++)
		{
			startNew.Start();
			PerformOp(new List<string>(101));
			PerformOp(new NullList());
			times.Add(startNew.ElapsedMilliseconds);
			startNew.Reset();
		}
		Console.WriteLine(times.Average());
	
	}

	private static void PerformOp(List<string> strings)
	{
		for (int i = 0; i < 100000000; i++)
		{
			strings.Add("item");
			if(strings.Count>100)
				strings.Clear();
		}
	}

	private static void PerformOp(NullList strings)
	{
		for (int i = 0; i < 100000000; i++)
		{
			strings.Add("item");
			if (strings.Count > 100)
				strings.Clear();
		}
	}
}

And the results are:

  • IList<T> - 5789 milliseconds
  • List<string>, NullList - 3941 millseconds

Note that this is significantly higher than the previous results, because no we run the results two hundred million times.

Individual method call here costs:

  • IList<T> - 0.0000289489
  • List<string>, NullList - 0.0000197096

We now have a major difference between the two calls. When we had a single impl of the interface, the difference between the two was: 0.00000406 milliseconds.

With multiple implementations, the difference between the two methods is: 0.0000092393 milliseconds. Much higher than before, and still less than a microsecond.


Comments

James Kovacs

I agree with Oren here. Think about performance in the big:

  • Are my remote calls chunky enough?

  • Can I scale out across a farm?

  • Am I caching enough (or too much) data and do I need to cache at all?

  • etc.

Don't worry about performance in the small?

  • Interfaces vs. abstract base class vs. direct calls

  • Hand-optimizing sort algorithms

  • Crazy locking/threading constructs

  • etc.

Performance in the big is hard to fix later. You can't just magically make millions of small, chatty remote calls turn into a few dozen. You can almost magically realize that you picked a poor sorting algorithm (based on perf traces) and switch to a more optimal one.

Let's take Oren's example of interfaces vs. direct calls. If your perf traces indicate that the call through the interface is your bottleneck, how hard is it to change IList to List and create a similar NullList method? I would think 5 minutes on the outside including running your tests. On the other hand, how hard is it to maintain tons of duplicated methods throughout your codebase on the chance that interface calls are too slow? I don't even want to think about it.

Besides, if you spend all your time thinking about performance in the small, you'll never have time to think about performance in the big - which is where it really matters!

Sasha Goldshtein

James,

I think you've joined the conversation after it's gotten slightly out of context. I have not suggested that you shouldn't use interfaces because they are bad for performance. It would be a ridiculous thing to say. Neither did I suggest that you shouldn't care about "performance in the big" because you care about "performance in the small".

I care about performance all the way, throughout the entire cycle. I care about performance when designing my remote interfaces to be chunky instead of chatty; but I also care about performance when choosing the right collection for the job or optimizing object layout.

Sometimes refactoring for performance is easy, sometimes it's hard. The same can be said about refactoring in general. And another point I tried to convey is that sometimes decoupling, dependency injection and other techniques which are good for correctness might not work out just as good as far as performance is concerned.

Sasha

Comment preview

Comments have been closed on this topic.

FUTURE POSTS

No future posts left, oh my!

RECENT SERIES

  1. RavenDB Sharding (2):
    21 May 2015 - Adding a new shard to an existing cluster, the easy way
  2. The RavenDB Comic Strip (2):
    20 May 2015 - Part II – a team in trouble!
  3. Challenge (45):
    28 Apr 2015 - What is the meaning of this change?
  4. Interview question (2):
    30 Mar 2015 - fix the index
  5. Excerpts from the RavenDB Performance team report (20):
    20 Feb 2015 - Optimizing Compare – The circle of life (a post-mortem)
View all series

RECENT COMMENTS

Syndication

Main feed Feed Stats
Comments feed   Comments Feed Stats