The cost of abstraction - Part II
Sasha pointed out that I should also test what happens when you use multiple implementations of the interface, versus direct calls. This is important because the JIT can optimize interface calls that always resolve to the same concrete type.
Here is the code:
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;

class Program
{
    public static void Main(string[] args)
    {
        // warm up
        PerformOp(new List<string>(101));
        PerformOp(new NullList());

        List<long> times = new List<long>();
        Stopwatch startNew = new Stopwatch();
        for (int i = 0; i < 50; i++)
        {
            startNew.Start();
            PerformOp(new List<string>(101));
            PerformOp(new NullList());
            times.Add(startNew.ElapsedMilliseconds);
            startNew.Reset();
        }
        Console.WriteLine(times.Average());
    }

    private static void PerformOp(List<string> strings)
    {
        for (int i = 0; i < 100000000; i++)
        {
            strings.Add("item");
            if (strings.Count > 100)
                strings.Clear();
        }
    }

    private static void PerformOp(NullList strings)
    {
        for (int i = 0; i < 100000000; i++)
        {
            strings.Add("item");
            if (strings.Count > 100)
                strings.Clear();
        }
    }
}
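Neither NullList nor the interface-dispatching version of PerformOp appears in the snippet above; they presumably come from Part I. Here is a minimal sketch of what they could look like, assuming NullList is a no-op IList<string> implementation (the member bodies are my guesses, not the original code):

using System;
using System.Collections;
using System.Collections.Generic;

// Hypothetical reconstruction: same Add/Count/Clear surface as List<string>,
// but every operation is a no-op.
public class NullList : IList<string>
{
    public int Count { get { return 0; } }
    public bool IsReadOnly { get { return false; } }
    public void Add(string item) { } // no-op
    public void Clear() { }          // no-op

    // The remaining IList<string> members are never hit by the benchmark.
    public string this[int index]
    {
        get { throw new NotSupportedException(); }
        set { throw new NotSupportedException(); }
    }
    public int IndexOf(string item) { return -1; }
    public void Insert(int index, string item) { throw new NotSupportedException(); }
    public void RemoveAt(int index) { throw new NotSupportedException(); }
    public bool Contains(string item) { return false; }
    public void CopyTo(string[] array, int arrayIndex) { }
    public bool Remove(string item) { return false; }
    public IEnumerator<string> GetEnumerator() { yield break; }
    IEnumerator IEnumerable.GetEnumerator() { return GetEnumerator(); }
}

And the interface version of PerformOp, which replaces the two concrete overloads (rather than joining them) for the IList<T> measurement:

// Member of Program in the interface-based run (the 5789 ms case):
// every call here dispatches through IList<string>.
private static void PerformOp(IList<string> strings)
{
    for (int i = 0; i < 100000000; i++)
    {
        strings.Add("item");
        if (strings.Count > 100)
            strings.Clear();
    }
}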
And the results are:
- IList<T> - 5789 milliseconds
- List<string>, NullList - 3941 milliseconds
Note that these times are significantly higher than the previous results, because now we run the operation two hundred million times.
An individual method call here costs:
- IList<T> - 0.0000289489 milliseconds
- List<string>, NullList - 0.0000197096 milliseconds
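(These per-call figures are just the averaged totals divided by the two hundred million loop iterations per run: 5789 ms / 200,000,000 ≈ 0.0000289 ms and 3941 ms / 200,000,000 ≈ 0.0000197 ms; the extra digits presumably come from the unrounded averages.)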
We now have a major difference between the two calls. When we had a single implementation of the interface, the difference between the two was 0.00000406 milliseconds.
With multiple implementations, the difference between the two methods is 0.0000092393 milliseconds. Much higher than before, and still well under a microsecond.
Comments
I agree with Oren here. Think about performance in the big:
- Are my remote calls chunky enough?
- Can I scale out across a farm?
- Am I caching enough (or too much) data, and do I need to cache at all?
- etc.
Don't worry about performance in the small:
- Interfaces vs. abstract base classes vs. direct calls
- Hand-optimizing sort algorithms
- Crazy locking/threading constructs
- etc.
Performance in the big is hard to fix later. You can't just magically make millions of small, chatty remote calls turn into a few dozen. But you can almost magically realize, based on perf traces, that you picked a poor sorting algorithm and switch to a better one.
Let's take Oren's example of interfaces vs. direct calls. If your perf traces indicate that the call through the interface is your bottleneck, how hard is it to change IList<T> to List<T> and create a similar NullList method? I would think 5 minutes on the outside, including running your tests. On the other hand, how hard is it to maintain tons of duplicated methods throughout your codebase on the chance that interface calls might be too slow? I don't even want to think about it.
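To make that concrete, here is a sketch of the refactor (based on the benchmark above, not on James's own code): a single signature edit plus an overload.

// Before: every call site dispatches through the interface.
private static void PerformOp(IList<string> strings) { /* loop as above */ }

// After: concrete overloads; the JIT can bind the List<string> calls
// directly and potentially inline them.
private static void PerformOp(List<string> strings) { /* loop as above */ }
private static void PerformOp(NullList strings) { /* loop as above */ }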
Besides, if you spend all your time thinking about performance in the small, you'll never have time to think about performance in the big - which is where it really matters!
James,
I think you've joined the conversation after it's gotten slightly out of context. I have not suggested that you shouldn't use interfaces because they are bad for performance; that would be a ridiculous thing to say. Nor did I suggest that you shouldn't care about "performance in the big" because you care about "performance in the small".
I care about performance all the way, throughout the entire cycle. I care about performance when designing my remote interfaces to be chunky instead of chatty; but I also care about performance when choosing the right collection for the job or optimizing object layout.
Sometimes refactoring for performance is easy, sometimes it's hard; the same can be said about refactoring in general. Another point I tried to convey is that sometimes decoupling, dependency injection, and other techniques that are good for correctness may not work out as well where performance is concerned.
Sasha