Performance counters suck

A while ago we added monitoring capabilities to RavenDB via performance counters. The intent was to give our users the ability to easily see exactly what is going on with RavenDB.

The actual usage, however, was a lot more problematic.

The performance counters API can just hang, effectively killing us (since we try to initialize it as part of the database setup routine).
They require specific system permissions, and can fail without them.
They get corrupted, for mysterious reasons, and then you need to reset them all.
Even after you have created them, they can still die on you for no apparent reason.

I would have been willing to assume that we are doing something really stupid. Except that SignalR had similar issues.

What is worse, it appears that using performance counters requires iterating over the list of printers on the machine.

This is really ridiculous. It has reached the point where I am willing to give them up entirely, if only I had something to replace them with.

For now, I have thrown a lot of try/catch blocks and a background thread at the problem to hide it, but it is ugly and brittle.
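To make the workaround concrete, here is a minimal sketch of the general idea: run the risky initialization on a background thread, give up after a timeout, and fall back to counters that do nothing. It is written in Python purely for illustration and is not the RavenDB code; `init_counters_safely`, `NoOpCounters`, `factory` and the timeout value are all hypothetical names.

```python
import threading


class NoOpCounters:
    """Hypothetical fallback used when the real counters are unavailable."""

    def increment(self, name):
        pass  # deliberately does nothing


def init_counters_safely(factory, timeout_seconds=5.0):
    """Run `factory` on a daemon thread and return its result.

    If the factory raises, or has not finished within the timeout
    (for example because the underlying API hangs), a no-op fallback
    is returned instead so that startup can continue.
    """
    result = {}

    def worker():
        try:
            result["counters"] = factory()
        except Exception as exc:  # permissions, corruption, ...
            result["error"] = exc

    # A daemon thread does not keep the process alive if it never returns.
    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    thread.join(timeout_seconds)
    return result.get("counters", NoOpCounters())
```

The brittleness is visible even in the sketch: a hung initializer is abandoned rather than cancelled, so the stuck thread stays around until the process exits.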

Tags:

bugs

Comments

26 Dec 2013
13:41 PM

rémi BOURGAREL

I had the same problem as Wallace. After a lot of digging and a Stack Overflow question (http://stackoverflow.com/questions/19536688/performancecounter-hang-when-using-vs-2012-iisexpress8), I found this solution: http://support.microsoft.com/kb/300956.

Is there any alternative?

26 Dec 2013
14:38 PM

Boris Modylevsky

We had similar problems with them, but they are even worse than you describe:
1. Their names are limited in length.
2. When you use them as percentages, the order of their creation actually matters.
3. It's a nightmare to collect counters from multiple computers.

Finally, we moved to Splunk, reporting to it over TCP. It is an amazingly fast and reliable system. But not a cheap one...

26 Dec 2013
17:07 PM

Rafal

Hi, I've had problems with Windows perf counters too, but a few years ago I got rid of them completely and came up with an alternative solution that has the additional benefit of being much easier to set up and maintain.

The solution is based on NLog library, which is used by my company's software as the primary logging tool. Performance data is logged just as any other log messages, and then directed through NLog configuration to a collector program. Performance data is sent over UDP, so the communication overhead is mainly taken by network hardware. Apart from that, there's a dedicated local event aggregator plugged into NLog that collects very frequent events and calculates some stats on them before forwarding to the network - this is to reduce UDP traffic for high freq events.
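A rough sketch of what such a local aggregator might look like, in Python for illustration only. It is not the NLog plugin described here; `EventAggregator`, `record` and `flush` are hypothetical names, and the sketch leaves out thread safety and the UDP forwarding itself.

```python
class EventAggregator:
    """Collects high-frequency samples locally and emits one summary per flush."""

    def __init__(self):
        self._stats = {}

    def record(self, name, value):
        """Fold a single sample into the running summary for `name`."""
        s = self._stats.setdefault(
            name, {"count": 0, "total": 0.0, "min": value, "max": value}
        )
        s["count"] += 1
        s["total"] += value
        s["min"] = min(s["min"], value)
        s["max"] = max(s["max"], value)

    def flush(self):
        """Return the summaries gathered so far and start a new window."""
        summaries, self._stats = self._stats, {}
        return summaries
```

Calling `flush()` on a timer and forwarding only its result means the network sees one message per metric per interval, no matter how many events were recorded in between.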

This way all performance counters can be configured externally with NLog config file, without stopping the application, and remote/centralized monitoring is very easy.

Apart from that, I have implemented a data collector/perf monitor application based on the well-known RRDTool utility. This is just a prototype, but it has worked in production for two years without too many problems. It's open source, with the code available at https://github.com/lafar6502/cogmon. Sorry for the complete lack of documentation. I'm using this for monitoring application and system performance, plus some business process KPIs.

If you're interested and want to know more pls email me.

27 Dec 2013
06:16 AM

Ayende Rahien

Remi, I have pointed users to that on several occasions, but that is really stupid. I don't want to have the ops burden of having to do this.

27 Dec 2013
20:05 PM

Robert Mircea

Maybe it's a good idea to look outside the .NET world at how others have solved the problem of metrics. Etsy's StatsD is one of the most popular ways to log performance data.

https://github.com/etsy/statsd/

They have a very simple way of collecting performance data including from C# apps and they integrate with lots of backends for dashboarding. The most widely used is Graphite (some screenshots: http://graphite.wikidot.com/screen-shots).
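For a sense of how simple the collection side is, here is a small sketch of a StatsD-style client in Python. StatsD accepts plain-text lines of the form `name:value|type` over UDP, port 8125 by default. The helper names and the metric names are made up for illustration.

```python
import socket


def format_metric(name, value, kind):
    """Build a StatsD line, e.g. 'requests:1|c' (c=counter, ms=timer, g=gauge)."""
    return f"{name}:{value}|{kind}"


def send_metric(line, host="127.0.0.1", port=8125):
    """Fire-and-forget one metric line over UDP.

    Errors are swallowed on purpose so that monitoring can never
    take the application down.
    """
    try:
        with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
            sock.sendto(line.encode("ascii"), (host, port))
    except OSError:
        pass


# Hypothetical usage: count a request and report an indexing duration.
send_metric(format_metric("requests", 1, "c"))
send_metric(format_metric("index.time", 320, "ms"))
```

Because UDP is connectionless, the sender neither blocks on nor learns about a missing collector, which is both the appeal and the limitation of this approach.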

Another idea is to expose metrics data as a JSON feed/web service in the same way Java's Metrics library (http://metrics.codahale.com/) does. If you take the time, you will discover the wealth of stats Metrics computes for you. There is a .NET port of it named Metrics.NET, but I've never used it in a production scenario because StatsD + Graphite is so cool.

29 Dec 2013
09:27 AM

Ayende Rahien

Robert, Thank you very much, I'll be looking very closely at Metrics.NET

29 Dec 2013
15:20 PM

Robert Mircea

Please don't dismiss StatsD and Graphite just because they are not natively .NET. They are infrastructure indeed and live around your platforms, but it would be extremely beneficial for RavenDB to have the capability to report internal counters so they can be analyzed in the entire context (e.g. plot counters on graphs along with deployments, OS data, etc.).

The number of statistical, aggregation, customisation functions supported by Graphite is simply astonishing and only professional paid monitoring solutions match what Graphite can do. I don't see any reason for example not to use them yourself/your company to monitor your cloud infrastructure for RavenHQ (e.g: plot performance by week/day/time of day, display requests/sec of any type, etc). Even Metrics (java version) has a Graphite reporter.

Just take a look at some resources to discover their power:

http://www.codinginstinct.com/2013/03/metrics-and-graphite.html
http://www.slideshare.net/itnig/collecting-metrics-with-graphite-and-statsd
http://codeascraft.com/2010/12/08/track-every-release/
http://obfuscurity.com/2012/05/A-Precautionary-Tale-for-Graphite-Users

30 Dec 2013
07:07 AM

Ayende Rahien

Robert, I think that you are missing a crucial point. There is a big difference between the reporting (StatsD, Graphite, etc.) and the actual metrics. What I need to do right now is collect the metrics; how I report them is a separate issue.

30 Dec 2013
22:16 PM

Robert Mircea

My point is that, as I see it, there is a fine line between basic metrics collection (timing durations, checkpointing or counting) and having additional logic inside the metrics library that computes stats while the application runs, such as histograms (median and other percentiles) or event rates (e.g. per 1 sec/1 min/5 min, etc.). Codahale's Metrics library takes this approach.
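As a rough illustration of the "logic inside the library" side of that line, here is a tiny in-process histogram in Python that answers percentile queries itself. It is only a sketch and not how Metrics is implemented: it keeps every sample in memory, whereas a production library would typically keep a bounded sample. The class and method names are hypothetical.

```python
import math


class Histogram:
    """Keeps all samples in memory and computes percentiles on demand."""

    def __init__(self):
        self.samples = []

    def update(self, value):
        self.samples.append(value)

    def percentile(self, p):
        """Nearest-rank percentile for 0 < p <= 100; None when empty."""
        if not self.samples:
            return None
        ordered = sorted(self.samples)
        rank = math.ceil(p * len(ordered) / 100)
        return ordered[max(rank, 1) - 1]
```

With this in place the application can report a median or a 95th percentile without any external server, at the cost of doing the bookkeeping in its own process.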

The other approach is to adopt a more lightweight, basic collection of metrics inside the application and delegate the statistical calculations to infrastructure by sending metrics as raw data, either one by one or in batches at regular intervals, to a central server. This is the approach of the StatsD client library and, to some extent, of Windows performance counters.

Of course I distinguish between collection and reporting; the sites I linked were merely trying to highlight the idea of optimizing the system as a whole vs. a local optimum. Anyway, I can't see any reason why you can't mix both techniques in order to address metrics collection and basic stats (maybe in your own admin dashboard or API endpoints) and also give your users the possibility to have more insight into their platforms as a whole by playing nicely with some established infrastructure like Graphite or Cacti. I might've missed your point if you were just looking for metrics library API design, concepts or details about their implementation, but I've taken my chance. :)

Naturally, you know better what you are after and the last judgement is yours. I'm glad that you found something helpful or just interesting in Metrics.Net to solve this problem that you've raised in the post.

31 Dec 2013
09:40 AM

Ayende Rahien

Robert, My main issue is that we cannot just rely on an external source, which may or may not be available. We have to be able to provide the full information to the user in a self contained package.

02 Jan 2014
13:32 PM

Harry

Why not use EventSource/ETW and log events if needed?

02 Jan 2014
13:35 PM

Ayende Rahien

Harry, very complex to use.

02 Jan 2014
14:51 PM

Chris Marisic

@Ayende I was looking at that new logging stuff from Microsoft and my response directly on the msdn or codeplex post where they announced it was WTF. They made the most ridiculous system possible. I seriously don't know how they could make a worse experience for actually using it. I have no idea how good it is in use, but as a developer it is atrocious.

03 Jan 2014
10:39 AM

Harry

@Ayende EventSource/ETW is a lot easier to use now with the newly released libraries, e.g. the EventSource and EventTrace libraries available via NuGet. You probably already know this.

But yes it is probably more complex if your scenario is very simple. I guess multi-platform is also an issue.

08 Jan 2014
20:45 PM

Stefano Ricciardi

Another vote for the statsd/graphite combination. It might not be what you need in your scenario, but when all you need is real-time metrics that combo is hard to beat.

Comments have been closed on this topic.

Oren Eini

CEO of RavenDB