Ayende @ Rahien

UberProf performance improvements, beware of LINQ query evaluation

This is a diff from the performance improvement effort of UberProf. The simple addition of .ToList() has significantly improved the performance of this function:

[image: the diff that adds .ToList()]

Why?

Before adding ToList(), each time we ran one of our aggregation functions on the statements enumerable, we forced a re-evaluation of the filtering (which can be quite expensive). By adding ToList(), I make the filtering run only once.
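To make the effect concrete, here is a minimal sketch. It is not the UberProf code: the ExpensiveFilter predicate, the int data and the three aggregations are all stand-ins I made up, assuming the real function runs several aggregations over one filtered sequence. It counts how often the filter runs with and without ToList().

```csharp
using System.Collections.Generic;
using System.Linq;

public static class DeferredEvaluationDemo
{
    // Counts how many times the (hypothetical) filter predicate is invoked.
    public static int FilterCalls;

    private static bool ExpensiveFilter(int value)
    {
        FilterCalls++;
        return value % 2 == 0;
    }

    // Three aggregations over a lazy query: the filter is re-run for each one.
    public static int CountCallsLazy(IEnumerable<int> source)
    {
        FilterCalls = 0;
        IEnumerable<int> statements = source.Where(x => ExpensiveFilter(x)); // nothing runs yet
        _ = statements.Count();
        _ = statements.Count(x => x > 2);
        _ = statements.Sum();
        return FilterCalls;
    }

    // Same aggregations after ToList(): the filter runs once, up front.
    public static int CountCallsMaterialized(IEnumerable<int> source)
    {
        FilterCalls = 0;
        List<int> statements = source.Where(x => ExpensiveFilter(x)).ToList();
        _ = statements.Count();
        _ = statements.Count(x => x > 2);
        _ = statements.Sum();
        return FilterCalls;
    }
}
```

With a ten-element source, the lazy version invokes the filter once per element per aggregation, while the materialized version invokes it once per element in total.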

There is another pretty obvious performance optimization that can be done here. Can you see it? And why did I choose not to implement it?

Comments

12/24/2009 11:25 AM by Rik Hemsley

I'd say the obvious 'optimization' would be to take counts of statements which are transactions, which are cached and which are neither.

Something like this...

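// Locals captured by the lambda must be initialized before use.
long cachedCount = 0, transactionCount = 0, neitherTransactionNorCachedCount = 0;

// Each is assumed to be a ForEach-style extension method on IEnumerable<T>.
statements.Each(s =>
{
    cachedCount += s.IsCached ? 1 : 0;
    transactionCount += s.IsTransaction ? 1 : 0;
    neitherTransactionNorCachedCount += (s.IsCached || s.IsTransaction) ? 0 : 1;
});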

Not sure why you'd avoid doing it, but I'm on holiday and my brain's not in gear. That's my excuse, anyway.
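The single-pass idea can also be sketched with a plain foreach, which needs no Each extension. StatementInfo is a hypothetical stand-in for whatever type the statements sequence really holds. Only the IsCached and IsTransaction property names come from the snippet above.

```csharp
using System.Collections.Generic;

// Hypothetical stand-in for the real statement type.
public sealed class StatementInfo
{
    public bool IsCached { get; set; }
    public bool IsTransaction { get; set; }
}

public static class StatementCounter
{
    // Walks the sequence once and returns all three counts together.
    public static (long Cached, long Transaction, long Neither) Count(
        IEnumerable<StatementInfo> statements)
    {
        long cached = 0, transaction = 0, neither = 0;
        foreach (var s in statements)
        {
            if (s.IsCached) cached++;
            if (s.IsTransaction) transaction++;
            if (!s.IsCached && !s.IsTransaction) neither++;
        }
        return (cached, transaction, neither);
    }
}
```

A statement that is both cached and a transaction is counted in both of the first two totals, which matches the lambda version.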

12/24/2009 12:15 PM by John St. Clair

"group by" rather than count, and a join/group by for the aggregate number of statements?

12/24/2009 03:38 PM by Anthony Dehirst

NumberOfStatements is

statements.Count() - NumberOfCachedStatements - NumberOfTransactionsStatements;

I guess that you didn't do it as it would make the code a little messier as you couldn't use the inline constructor. I do hope that there is a better reason.

12/24/2009 04:39 PM by JJoos

I agree with Rik, and I wouldn't do it because the performance issues are probably in the first part.

12/24/2009 05:27 PM by John Chapman

This is actually one of the reasons I find var to be mostly evil. var allows developers to ignore the real type. It changes the way you think. The two pieces of code do VERY different things, yet to an untrained eye that's not clear at all. Largely I blame var for this. If the type were explicitly declared in that statement, I believe fewer people would misunderstand how it is being used.

I personally never use var in my code. That also means I never use anonymous types. I don't mind the extra typing involved. Typing doesn't take a long time. Forming the right solution to the problem is usually by far the biggest cost, so typing cost is in the noise for me. Readability adds more.

I realize I'm in the minority on this subject.

12/25/2009 12:10 AM by firefly

@Mike Memoization is a very powerful technique for performance optimization... Unfortunately it's something that can take a while for an OO programmer to wrap their head around. Unless you are coming from a functional background :) Then it becomes second nature.

12/25/2009 09:57 AM by Mike Chaliy

@firefly, agreed, but LINQ is full of such techniques... It's all about laziness. So I believe we are already prepared for this kind of stuff.

12/29/2009 03:00 AM by Konstan

@John: would you be happier if you saw IEnumerable<statement> instead of var?

var is good because it allows the compiler to pick the best matching extension method (for example, IEnumerable and IQueryable both have a "Where" extension method, but the first performs filtering on the client whereas the second does it on the server: shorter code which performs better, a win-win situation)
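A small sketch of that binding behaviour, using only LINQ to Objects. AsQueryable() over an in-memory list stands in for a real query provider here (that is my assumption for the demo), so both variants actually run in memory, and only the choice of Where overload differs.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class WhereBindingDemo
{
    // var keeps the static type IQueryable<int>, so Where binds to Queryable.Where.
    public static bool VarStaysQueryable(IEnumerable<int> data)
    {
        var source = data.AsQueryable();
        object filtered = source.Where(x => x > 1);
        return filtered is IQueryable<int>;
    }

    // Declaring IEnumerable<int> makes Where bind to Enumerable.Where instead,
    // so the filter would run on the client even over a real provider.
    public static bool ExplicitEnumerableStaysQueryable(IEnumerable<int> data)
    {
        IEnumerable<int> source = data.AsQueryable();
        object filtered = source.Where(x => x > 1);
        return filtered is IQueryable<int>;
    }
}
```

The first method returns true and the second returns false: the declared type of the variable, not the runtime type of the object, decides which Where is called.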

01/05/2010 02:26 AM by david

@Ayende: so, are you going to fill in the "just back from holiday, and brain slow to fire up" and the "too scared to comment in case I look foolish" amongst us? :-)

01/05/2010 05:41 AM by Ayende Rahien

Not sure that I am following you here

01/05/2010 09:45 AM by david

You asked 3 questions in your posting. I don't know the definitive answer to them. To aid those, like me, who come along in the future and read posts like this, it would be beneficial to see questions and answers -- rather than just questions.

Comprendes?

01/12/2010 03:51 AM by MF

Would .ToArray() be slightly faster? Or am I missing something?

01/22/2010 05:40 PM by Dathan Bennett

@MF, ToArray() might be slightly faster, but list item access by index is constant-time, so the difference (if there is an appreciable one) is probably trivial.
