Ayende @ Rahien

Refunds available at head office

Tracking NH Prof performance problem on aggressively active application

I mentioned before that NH Prof has a problem dealing with a lot of activity from the profiled application. In essence, until it catches up with what the application is doing, there isn't much that you can do with the profiler.

I consider this a bug, and I decided to track the issue down with some verifiable results, rather than just relying on hunches. The results were... interesting. The actual problem was in the place that I expected it to be, but it was more complex than expected. Let me try to explain.

The general architecture of NH Prof is shown here:

[Image: the general architecture of NH Prof]

And the actual trouble from the UI perspective is in the last stage, actually updating the UI. UI updates are done directly on a (very) rich view model, which supports a lot of the functionality that NH Prof provides.

Because, as usual, we have to synchronize all access to the UI, we actually update the UI through a dispatcher class. The relevant code is this:

private void SendBatchOfUpdates()
{
	while (true)
	{
		int remainingCount;
		Action[] actions;
		lock (locker)
		{
			actions = queuedActions
				.Take(100)
				.ToArray();

			for (int i = 0; i < actions.Length; i++)
			{
				queuedActions.RemoveAt(0);
			}
			remainingCount = queuedActions.Count;
		}

		if (actions.Length == 0)
		{
			Thread.Sleep(500);
			continue;
		}

		dispatcher.Invoke((Action)delegate
		{
			using(dispatcher.DisableProcessing())
			{
				var sp = Stopwatch.StartNew();
				foreach (var action in actions)
				{
					action();
				}
				Debug.WriteLine(remainingCount + "\t" + sp.ElapsedMilliseconds);
			}
		});
	}
}

We queue all the operations on the view model, batch them, and execute them against the UI. This is the behavior as it stands right now (that is, with the problem). Looking at the code, it should be obvious that there is one glaring issue. (Well, it should have been obvious; it wasn't to me until I dug deeper into it.)

If the number of queued actions is greater than the batch size, we are going to try to drain the queue. But as long as we add actions to the queue faster than we can process them, we keep hammering the UI: we keep it busy with updates all the time, and never give it time to process user input.

Note the debug line at the end; from its output, I can generate the following graph:

[Image: graph of batch processing duration vs. remaining queue size]

The duration line shows the number of milliseconds it took to process a batch of 100 items; the red line shows the number of items remaining in the queue that we still need to drain. You can clearly see the spikes where more work was added to the queue before we could deal with it.

You can also see that doing the actual UI work is slow. In many cases it takes hundreds of milliseconds to process a batch, and at times it takes seconds to run. That is because, again, we have a very rich view model, and a lot of the UI is bound to it.

Before I can go forward, I need to describe a bit how NH Prof actually talks to the UI. It does so by raising events when interesting things happen. My input model is a stream of events, and initially that mapped nicely to updating the UI as the information arrived.

Let us say that we opened a session, executed a single SQL statement, and then closed the session. What are the implications for UI marshalling?

  1. Create new session
  2. Add new statement
  3. Add alert, not running in transaction
  4. Update statement with parsed SQL
  5. Update statement with row count
  6. Add alert, unbounded row set
  7. Update statement duration
  8. Close session
  9. Update session duration

So for the simplest scenario, we have to go to the UI thread nine times. Even when we batch those calls, we still end up going to the UI thread too often, and updating the UI frequently. That, as I mentioned, can take time.

Now, a very easy fix for the problem is to move the thread sleep above the if statement. That would introduce a delay in talking to the UI, giving the UI time to catch up with user input. This would keep the application responsive during periods when there are a lot of actions to perform in the UI.
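A minimal sketch of that "sleep first" variant, extracted into a self-contained class (the `invokeOnUi` delegate stands in for the dispatcher call; names are illustrative, not the actual NH Prof code):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;

// Sketch of the "sleep first" variant. Moving the sleep above the
// empty-check means every iteration pauses, so the UI thread always
// gets a window to process user input between batches.
public class BatchSender
{
    private readonly List<Action> queuedActions = new List<Action>();
    private readonly object locker = new object();

    public void Enqueue(Action action)
    {
        lock (locker) queuedActions.Add(action);
    }

    // Pulls up to batchSize actions off the queue; pure logic, no UI.
    public Action[] TakeBatch(int batchSize)
    {
        lock (locker)
        {
            var actions = queuedActions.Take(batchSize).ToArray();
            queuedActions.RemoveRange(0, actions.Length);
            return actions;
        }
    }

    public void SendBatchOfUpdates(Action<Action[]> invokeOnUi)
    {
        while (true)
        {
            Thread.Sleep(500); // now runs on every iteration, not only when the queue is empty

            var actions = TakeBatch(100);
            if (actions.Length == 0)
                continue;

            invokeOnUi(actions); // e.g. dispatcher.Invoke around the whole batch
        }
    }
}
```

The trade-off is exactly the one described below: the UI always gets breathing room, but the drain rate is capped at one batch per half second.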

The downside is that while this works, it means a slower stream of updates into the application. That would actually be an acceptable solution, except that this is the same situation we find ourselves in when processing an offline profiling session. When those are big, it can take a really long time to process them.

Therefore, this is not a good solution.

A much better solution would be to reduce the number of times that we go to the UI. We can cut the number of UI round trips significantly by changing the process to:

  1. Create new session
  2. Add new statement (with alerts, parsed SQL, row count and duration)
  3. Close session (with duration)

That requires some changes to the way that we work, but those are actually good changes.
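A sketch of what the consolidated notification might look like (the types, field names, and sample values here are illustrative, not NH Prof's actual model):

```csharp
using System;
using System.Collections.Generic;

// Instead of pushing each per-statement change (parsed SQL, row count,
// alerts, duration) to the UI as it happens, the backend accumulates
// them on a snapshot and notifies the UI once, when the statement is
// complete: one trip to the UI thread instead of five.
public class StatementSnapshot
{
    public string ParsedSql;
    public int RowCount;
    public long DurationMs;
    public readonly List<string> Alerts = new List<string>();
}

public class SessionEvents
{
    public int UiNotifications; // counts trips to the UI thread

    // All of this runs on the backend thread, off the UI.
    public StatementSnapshot BuildStatement()
    {
        var s = new StatementSnapshot();
        s.ParsedSql = "SELECT ...";
        s.RowCount = 5000;
        s.Alerts.Add("not running in transaction");
        s.Alerts.Add("unbounded row set");
        s.DurationMs = 120;
        return s;
    }

    // Only this call crosses to the UI thread.
    public void NotifyStatementAdded(StatementSnapshot snapshot)
    {
        UiNotifications++;
    }
}
```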

This is not quite enough, though. We can do better than that when we realize that we actually only show a single session at a time. So most often, we can reduce the work to:

  1. Create new session
  2. Close Session (with duration, number of statements, alerts aggregation, etc)

If we also add batching to the mix, we can reduce it further to:

  1. Create new session & close it (with duration, number of statements, alerts aggregation, etc)

We have to support both of them, because we need to support both streaming & debugging scenarios.

We don't need to tell the UI about individual statements; we can give them to the UI when it asks for them, that is, when you select a session in the sessions list.

And now, when we are looking at a session, we can raise an event only for that session, when a statement is added to it. As an added bonus, we are going to batch that as well, so we will update the UI about it every half second or so. At that point, we might as well aggregate the changes: instead of sending just the new statement, we can send a snapshot of the currently viewed session. The idea here is that if several statements were added, we can reduce the amount of UI work by making a single set of modifications to the UI.
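The snapshot idea above might be sketched roughly like this (all names are illustrative; a 500ms timer or the batching loop would drive the flush):

```csharp
using System;
using System.Collections.Generic;

// Changes to the currently viewed session accumulate off the UI
// thread; every ~0.5 seconds a flush hands them over as one snapshot,
// so ten new statements cost one UI update instead of ten.
public class ViewedSessionPublisher
{
    private readonly object locker = new object();
    private readonly List<string> pendingStatements = new List<string>();
    public int SnapshotsSent;

    // Called from the backend thread whenever a statement is added
    // to the session the user is currently looking at.
    public void OnStatementAdded(string statement)
    {
        lock (locker) pendingStatements.Add(statement);
    }

    // Called periodically on the UI side; returns everything that
    // accumulated since the last flush as a single snapshot.
    public string[] FlushSnapshot()
    {
        string[] snapshot;
        lock (locker)
        {
            if (pendingStatements.Count == 0) return new string[0];
            snapshot = pendingStatements.ToArray();
            pendingStatements.Clear();
        }
        SnapshotsSent++;
        return snapshot; // one set of UI modifications for the whole batch
    }
}
```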

Thoughts?

Oh, and do you care that I am doing detailed design for NH Prof problems on the blog?

Comments

Mr_Design
02/13/2009 02:57 AM by
Mr_Design

Since I purchased a license and have skin in the game, I'm happy with you doing design on the blog.

Andrew Peters
02/13/2009 03:40 AM by
Andrew Peters

Please tell me this:

queuedActions.RemoveAt(0);

isn't what I think it is.

Kyle Szklenski
02/13/2009 03:47 AM by
Kyle Szklenski

Do I care? Yes. Do I want you to continue? Yes. This stuff is great - you can't get a more detailed analysis of problems on the net, especially not with a real-time, real-life application that is being developed. It shows some incredible balls on your part. :) I salute you, sir!

Kyle Szklenski
02/13/2009 03:54 AM by
Kyle Szklenski

Andrew, I guess it depends on what queuedActions' type is. If it is a non-contiguous collection, removing the first one like that won't be updating any indices, so there's no performance issue. It's also possible that it's a custom queue that has its own custom RemoveAt, but if that were the case, I would think Oren would write it a little differently. There's probably not really a performance issue anyway - that type of thing can really fly when there's only 100 items in the list (and he's guaranteed that), and it's clearly the issues that Oren brought up initially that would be the performance concern. I would still move it to removing actions from the backend backwards, but that's just because I tend to always iterate through collections backwards. It saves a lot of headaches.

Ayende Rahien
02/13/2009 04:01 AM by
Ayende Rahien

Andrew,

Okay, it isn't what you think it is.

Now tell me what the problem is?

If you are going to tell me that this shifts the entire list, I know.

I don't really care for this particular implementation.

Peter Morris
02/13/2009 05:13 AM by
Peter Morris

What does Action do? Update the GUI? If so I would change it to update some view data instead. The GUI should poll the view data whenever it is ready. This way the ViewData can get updated as often + as quickly as it needs to, and the GUI can lock, copy out the data it needs, then bind to the copy as frequently as it can manage.

It would be easier with a simple test case though :-)

Peter Morris
02/13/2009 05:17 AM by
Peter Morris

In addition :-) The GUI then also only needs to copy out of the view data what it is currently showing, like a virtual data source.

Avish
02/13/2009 09:53 AM by
Avish

I agree with Peter, perhaps it'd be best to not rebind on every model update, but rather to raise some sort of "dirty" event. The view will then repoll the dirty parts of the model when it sees fit, and only those that it wishes to show.

Ayende Rahien
02/13/2009 12:03 PM by
Ayende Rahien

Avish,

The problem is that now I have to track all of my UI changes and actively force them, rather than let the binding do that.

Chris Cyvas
02/13/2009 01:19 PM by
Chris Cyvas

I'm diggin' it. I have skin in the game too. But, this is pretty interesting stuff regardless of that.

Shane Courtrille
02/13/2009 02:03 PM by
Shane Courtrille

We're here for the most part for your views on software development. It doesn't get any more real than the thought process behind how you are developing your commercial application. Keep going!

Peter Morris
02/13/2009 02:10 PM by
Peter Morris

You can still bind, but to a snapshot of the view data. Get a notification whenever it updates, lock, snapshot the bit you want to show, unlock, bind to the snapshot.

Peter Seale
02/13/2009 04:26 PM by
Peter Seale

Re: NHProf blogging,

It's better that you post what you care about, rather than trying to manufacture posts when that design issue is bugging you. For an analogue, see the recent posts on CodingHorror vs the recent posts on blog.stackoverflow.com vs fakeplasticrock.com - all three by Jeff Atwood (and no I don't read all three) - the point is, if you compare the three blogs, it's clear that CodingHorror isn't the one he's passionate about. Which is really unfortunate for his 124k subscribers :)

dave-ilsw
02/13/2009 07:09 PM by
dave-ilsw

I think it's great that you're willing to do detailed design for NH Prof problems on your blog.

gnash
02/14/2009 01:20 AM by
gnash

We just released 0.4.1.6 of Retlang that makes asynchronous communication with Windows Forms much easier. The new class is called FormFiber. It will give you all of the batching, scheduling, etc. of Retlang but on the Form thread. It uses ISynchronizeInvoke, instead of using the Dispatcher, but I believe the interface still exists on WPF Forms.

http://code.google.com/p/retlang/

Peter Morris
02/15/2009 09:52 AM by
Peter Morris

No need to poll is there? What about this

01: Have a manual reset event in your queue. Set it when you add your first item, reset it when the last item is removed.

02: Add an item to the ThreadPool that is waiting for the manual reset event.

03: When the event is signalled a thread is scheduled to update the GUI. At the end of updating the GUI you queue another item to the ThreadPool.

This way you'll get an instant update to the GUI when new data arrives, but will only update the GUI as often as it can handle it.

Ayende Rahien
02/16/2009 01:09 AM by
Ayende Rahien

Peter,

That is actually what is going on right now.

That is a problem, because a push model doesn't work for the UI in this scenario.

I'll talk about this a bit more in an additional post

Peter Morris
02/16/2009 09:45 AM by
Peter Morris

I think you are misunderstanding what I am saying. I am not suggesting pushing the data. What I am suggesting is

01: Thread which adds the data notifies there is data available by ensuring a manual reset event remains signalled.

02: The thread which you are currently polling with (which is wasteful when there is no data to process) is replaced with a thread pool wait, so it doesn't cost a thread.

03: When the event is signalled and a thread is assigned you update the GUI with a small window of the latest data.

04: Once finished binding the GUI reschedules another thread pool wait.

By polling you are pulling from the data, but still pushing in the GUI. By using this approach you are eliminating the wasteful polling and also ensuring that the GUI pulls and nothing is ever pushed into it and it updates as often as it can.

If you poll then the duration might be fine on your PC but too frequent on a slower PC with a rubbish graphics card, or not frequent enough on a really powerful PC.
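[Editor's note: the scheme Peter describes could be sketched roughly like this, using `ManualResetEvent` and `ThreadPool.RegisterWaitForSingleObject`; all names are illustrative, and the `updateGui` delegate stands in for the dispatcher call.]

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

// The producer signals a ManualResetEvent; a thread-pool wait (costing
// no dedicated thread while the queue is empty) fires the drain, which
// re-registers itself only after the GUI update has finished.
public class EventDrivenQueue
{
    private readonly object locker = new object();
    private readonly List<Action> queuedActions = new List<Action>();
    private readonly ManualResetEvent dataAvailable = new ManualResetEvent(false);

    public void Enqueue(Action action)
    {
        lock (locker)
        {
            queuedActions.Add(action);
            dataAvailable.Set(); // 01: signal that data is available
        }
    }

    public Action[] Drain()
    {
        lock (locker)
        {
            var actions = queuedActions.ToArray();
            queuedActions.Clear();
            dataAvailable.Reset(); // last item removed: un-signal
            return actions;
        }
    }

    // 02: wait on the thread pool instead of burning a polling thread.
    public void ScheduleDrain(Action<Action[]> updateGui)
    {
        ThreadPool.RegisterWaitForSingleObject(dataAvailable,
            delegate
            {
                updateGui(Drain());       // 03: one GUI update per signal
                ScheduleDrain(updateGui); // 04: reschedule once the GUI is done
            },
            null, Timeout.Infinite, true /* executeOnlyOnce */);
    }
}
```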

Lee Campbell
02/16/2009 03:00 PM by
Lee Campbell

From a purely WPF point of view, why is the code like this:

dispatcher.Invoke((Action)delegate
{
	// From here on in I am blocking the UI thread. :-(
	using (dispatcher.DisableProcessing())
	{
		var sp = Stopwatch.StartNew();
		foreach (var action in actions)
		{
			action();
		}
		Debug.WriteLine(remainingCount + "\t" + sp.ElapsedMilliseconds);
	}
});

when I would have thought you would get the result you are chasing from this:

foreach (var action in actions)
{
	// Non-blocking call. This queues the action to be performed by the
	// dispatcher. Optionally provide a Priority for the action to
	// provide better responsiveness.
	dispatcher.BeginInvoke(action);
}

From a design point of view, if your presentation model is too verbose, can you work with only "summary objects", which represent a more digestible summary of the data at a higher level? Drilling in offers more information. Using the Group/Sort/Filter functionality of things like CollectionViewSource, the power moves back to the XAML on how to present the data.

See leecampbell.blogspot.com/.../...ispatchers-to.html for more info on Responsive UIs in WPF.

HTH

Comments have been closed on this topic.