﻿<?xml version="1.0" encoding="utf-8"?><rss version="2.0"><channel><title>Ayende @ Rahien</title><link>http://ayende.com</link><description>Ayende @ Rahien</description><copyright>Copyright (C) Ayende Rahien  2004 - 2021 (c) 2026</copyright><ttl>60</ttl><item><title>Grant Fritchey commented on Make a distinction: Errors vs. Alerts</title><description>Excellent post. One of the biggest problems I see with implementation of monitoring software, any monitoring software, is that people don't tune the alerts to maximize signal to noise. I wrote about it here: http://www.simple-talk.com/sql/database-administration/preventing-problems-in-sql-server/</description><link>http://ayende.com/136194/make-a-distinction-errors-vs-alerts#comment15</link><guid>http://ayende.com/136194/make-a-distinction-errors-vs-alerts#comment15</guid><pubDate>Sat, 19 Nov 2011 13:21:22 GMT</pubDate></item><item><title>Fero commented on Make a distinction: Errors vs. Alerts</title><description>I have been using the Elmah error module for a long time, and it works really well, but only on ASP.NET and ASP.NET MVC. 
So I then developed an extension for Elmah that can be used with any project type: Silverlight, Console, WPF, WCF. Here is the source: https://github.com/vincoss/vinco-logging-toolkit.

A later update will be called Elmah.Everywhere</description><link>http://ayende.com/136194/make-a-distinction-errors-vs-alerts#comment14</link><guid>http://ayende.com/136194/make-a-distinction-errors-vs-alerts#comment14</guid><pubDate>Fri, 18 Nov 2011 09:29:43 GMT</pubDate></item><item><title>Scooletz commented on Make a distinction: Errors vs. Alerts</title><description>@Will, @Alwin
Yes, it would be nice, whether using IObservable or some other way to describe the requirement for an alert, but what about scaling such a solution? What if errors occur on different machines and are logged to different log files, databases, or whatever?
@Ayende, where do you store such information? How do you want to filter the stream of events from multiple servers?</description><link>http://ayende.com/136194/make-a-distinction-errors-vs-alerts#comment13</link><guid>http://ayende.com/136194/make-a-distinction-errors-vs-alerts#comment13</guid><pubDate>Fri, 18 Nov 2011 07:00:57 GMT</pubDate></item><item><title>Alwin commented on Make a distinction: Errors vs. Alerts</title><description>Will, couldn't you do something like that with Reactive Framework (Rx)? You know, with Throttle and such...</description><link>http://ayende.com/136194/make-a-distinction-errors-vs-alerts#comment12</link><guid>http://ayende.com/136194/make-a-distinction-errors-vs-alerts#comment12</guid><pubDate>Fri, 18 Nov 2011 02:01:02 GMT</pubDate></item><item><title>Will Gant commented on Make a distinction: Errors vs. Alerts</title><description>There should also be a SqlException inside angle brackets &amp;lt; &amp;gt; to the right of the For (it's intended to be a generic method). It might have gotten interpreted as HTML.</description><link>http://ayende.com/136194/make-a-distinction-errors-vs-alerts#comment11</link><guid>http://ayende.com/136194/make-a-distinction-errors-vs-alerts#comment11</guid><pubDate>Thu, 17 Nov 2011 22:17:49 GMT</pubDate></item><item><title>Will Gant commented on Make a distinction: Errors vs. Alerts</title><description>There should only be one period after the For(). Ayende's site handled the code just fine, but it figures I'd make at least one syntax error.</description><link>http://ayende.com/136194/make-a-distinction-errors-vs-alerts#comment10</link><guid>http://ayende.com/136194/make-a-distinction-errors-vs-alerts#comment10</guid><pubDate>Thu, 17 Nov 2011 22:15:56 GMT</pubDate></item><item><title>Will Gant commented on Make a distinction: Errors vs. Alerts</title><description>It would be nice if there was a package that let you fluently configure a policy for how your app handles errors based on type and contents. 
I'd love to be able to do something like:

For&lt;System.Data.SqlClient.SqlException&gt;().
    .InTimeSpan().Minutes(5)
    .Occurs(10)
    .CompareBy(CompareBy.StackTrace | CompareBy.Host)
    .Where(ex=&gt; ex.Message.Contains("Timeout"))
    .Act(ex=&gt;{SendPanicMessage(ex);});

That way, I could filter errors by type, contents, how close together they are, etc, and tell it what to do with them. I have no idea off the top of my head how one might implement this and make it perform well (especially across multiple machines), but something like this would be awful handy. The intent of the above is to send a panic message when 10 or more SqlExceptions with the word "Timeout" in their message occur in a five minute timespan, from the same device with the same stacktrace. (This is just a first brush - someone that is actually skilled at making fluent interfaces could make this a good deal cleaner and more expressive).
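
A minimal sketch of the counting logic described above, assuming a single process with in-memory state (so it does not address the multi-machine case). It is written in Python rather than C#, and AlertPolicy, record, and every parameter name are hypothetical, not an existing package:

```python
from collections import defaultdict, deque


class AlertPolicy:
    """Hypothetical sketch: call `action` once `threshold` matching errors
    from the same host and stack trace are seen within `window_seconds`."""

    def __init__(self, threshold, window_seconds, matches, action):
        self.threshold = threshold
        self.window_seconds = window_seconds
        self.matches = matches          # predicate on the error message
        self.action = action            # called with the message that tripped the alert
        self.seen = defaultdict(deque)  # (host, stack_trace) -> recent timestamps

    def record(self, timestamp, host, stack_trace, message):
        """Register one error; return True if this one triggered the alert."""
        if not self.matches(message):
            return False
        times = self.seen[(host, stack_trace)]
        times.append(timestamp)
        # Drop occurrences that have fallen out of the time window.
        while timestamp - times[0] > self.window_seconds:
            times.popleft()
        if len(times) >= self.threshold:
            self.action(message)
            times.clear()  # start counting afresh after alerting
            return True
        return False


# Ten "Timeout" errors in five minutes from the same host and stack trace.
alerts = []
policy = AlertPolicy(
    threshold=10,
    window_seconds=5 * 60,
    matches=lambda message: "Timeout" in message,
    action=alerts.append,
)
```

Keeping one deque of timestamps per (host, stack trace) pair means each new error costs only the work of discarding expired entries, but the state lives in one process, which is why a central collector would still be needed behind a load balancer.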

I think you'd almost have to chuck the exceptions off into a message queue or something though - you wouldn't want the logic to check all this stuff to be running inside your app. It would also probably need to be pushed to a central location to handle the load-balancing scenario. Further, if you were to chuck this into a database somewhere, you could report on the frequency of the errors. That might be handy for building a triage list for a development roadmap before the clients get involved.

I also hope that the code doesn't get turned into (worse) indecipherable gibberish in the act of posting it.</description><link>http://ayende.com/136194/make-a-distinction-errors-vs-alerts#comment9</link><guid>http://ayende.com/136194/make-a-distinction-errors-vs-alerts#comment9</guid><pubDate>Thu, 17 Nov 2011 22:12:04 GMT</pubDate></item><item><title>Chris Wright commented on Make a distinction: Errors vs. Alerts</title><description>@Simon
You can use those strategies for some things.

But consider this pattern of behavior: you're talking to a service and it usually responds in 50ms, with 99.9% of calls finishing in 250ms. But now 50% of its calls are over 2 seconds.

If you have an alarm for a single call taking 2 seconds, you'll probably get pinged every ten or fifteen thousand calls. This isn't actionable, or even a problem.

You want alarming on aggregate behavior, not individual requests. Now you're adding a fair bit of complexity around this call.
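
As a rough illustration of alarming on aggregate behavior, here is a small Python sketch. SlowCallMonitor and its parameters are hypothetical names, and the sample size and thresholds are assumptions chosen to mirror the numbers above:

```python
from collections import deque


class SlowCallMonitor:
    """Hypothetical sketch: alarm on the share of slow calls, not on one call."""

    def __init__(self, sample_size=1000, slow_seconds=2.0, max_slow_fraction=0.5):
        self.slow_seconds = slow_seconds
        self.max_slow_fraction = max_slow_fraction
        # One flag per recent call: True if it was slow. Old flags drop off.
        self.recent = deque(maxlen=sample_size)

    def record(self, duration_seconds):
        self.recent.append(duration_seconds > self.slow_seconds)

    def should_alarm(self):
        # Wait for a full sample so a single slow call cannot trip the alarm.
        if len(self.recent) != self.recent.maxlen:
            return False
        return sum(self.recent) / len(self.recent) >= self.max_slow_fraction
```

This only covers one process; it says nothing yet about combining samples from several machines.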

For extra credit, what if you have half a dozen machines running behind a load balancer and want to alert based on the aggregate logs?</description><link>http://ayende.com/136194/make-a-distinction-errors-vs-alerts#comment8</link><guid>http://ayende.com/136194/make-a-distinction-errors-vs-alerts#comment8</guid><pubDate>Thu, 17 Nov 2011 21:43:44 GMT</pubDate></item><item><title>Phil commented on Make a distinction: Errors vs. Alerts</title><description>@Will Gant
I really like log4net.  It's open source (not from Microsoft) and has different logging levels, which can be changed at run time.  As a plus it only requires a single assembly reference.
</description><link>http://ayende.com/136194/make-a-distinction-errors-vs-alerts#comment7</link><guid>http://ayende.com/136194/make-a-distinction-errors-vs-alerts#comment7</guid><pubDate>Thu, 17 Nov 2011 17:54:51 GMT</pubDate></item><item><title>Will Gant commented on Make a distinction: Errors vs. Alerts</title><description>I was kind of hoping there was a non-microsoft open source package that handles that well. My experience with the Enterprise Library has been that it just requires so much configuration and tinkering to get working that it isn't worth the effort. I'll admit that this impression is probably a bit dated though - they may have improved since the last time I worked with their stuff.</description><link>http://ayende.com/136194/make-a-distinction-errors-vs-alerts#comment6</link><guid>http://ayende.com/136194/make-a-distinction-errors-vs-alerts#comment6</guid><pubDate>Thu, 17 Nov 2011 17:29:57 GMT</pubDate></item><item><title>Rafal commented on Make a distinction: Errors vs. Alerts</title><description>Very often catching exceptions and logging them is not enough, sometimes an alert should be raised if nothing happens for some time - for example when some service responsible for receiving messages from a queue dies quietly or gets stuck. Also, performance problems will not be detected by analyzing exceptions in the log file.
IMHO the log files should be used to find the cause of a problem, but alerts should be raised based on other criteria, like high-level application/system-level statistics and deviations from values considered normal.
Examples: measuring the 'queue latency' (the time messages spend in a queue before being processed), web server request queue length, unusual deviations in business process statistics like the number of documents processed or the number of tasks completed per minute, etc. 
Usually you should identify the key indicators of system (mis)behavior and select the ones that are important to the users (they don't care about the server disk queue length, but they care a lot about GUI response time or the time it takes for a document to travel between two systems). Sometimes it's good to implement checkpoints in the business process, for example making sure that all documents that arrive in the system are dealt with within 3 days (if not, it means that there's an error somewhere). </description><link>http://ayende.com/136194/make-a-distinction-errors-vs-alerts#comment5</link><guid>http://ayende.com/136194/make-a-distinction-errors-vs-alerts#comment5</guid><pubDate>Thu, 17 Nov 2011 16:39:38 GMT</pubDate></item><item><title>Daniel Lidström commented on Make a distinction: Errors vs. Alerts</title><description>@Will Gant: I believe Microsoft's Enterprise Library has a block for this purpose.</description><link>http://ayende.com/136194/make-a-distinction-errors-vs-alerts#comment4</link><guid>http://ayende.com/136194/make-a-distinction-errors-vs-alerts#comment4</guid><pubDate>Thu, 17 Nov 2011 16:07:32 GMT</pubDate></item><item><title>Will Gant commented on Make a distinction: Errors vs. Alerts</title><description>Strange coincidence. I'm trying to work out a better error handling strategy where I work right now, as we have a lot of error messages coming in that are just noise. Like your example, we've made a habit of ignoring the errors, often to our detriment (when the error is reported by a customer, it's now a marketing problem, not just a software problem). I've managed to get rid of a few of the big ones, but we're still getting far too many errors that are simply not useful - I don't know how we're going to fix this so that we are only notified when the error is worth being notified about.

Is there a package that makes the handling of errors cleaner and more policy-driven? That would be nice.</description><link>http://ayende.com/136194/make-a-distinction-errors-vs-alerts#comment3</link><guid>http://ayende.com/136194/make-a-distinction-errors-vs-alerts#comment3</guid><pubDate>Thu, 17 Nov 2011 16:06:02 GMT</pubDate></item><item><title>Simon Skov Boisen commented on Make a distinction: Errors vs. Alerts</title><description>Shouldn't the use of error-severity categories solve a problem like the one the first of your customers had? Only log it as an error when the service has been unresponsive 8 times, else log it as info or debug?</description><link>http://ayende.com/136194/make-a-distinction-errors-vs-alerts#comment2</link><guid>http://ayende.com/136194/make-a-distinction-errors-vs-alerts#comment2</guid><pubDate>Thu, 17 Nov 2011 15:40:34 GMT</pubDate></item><item><title>Joseph Daigle commented on Make a distinction: Errors vs. Alerts</title><description>The corollary to this is that error handling cannot be an afterthought in your system in order to do proper alerting. Alerting is typically on par with any other feature or user story that must be designed and tested. The only difference is that the "user" of this feature is typically a sys admin or a devops team member.</description><link>http://ayende.com/136194/make-a-distinction-errors-vs-alerts#comment1</link><guid>http://ayende.com/136194/make-a-distinction-errors-vs-alerts#comment1</guid><pubDate>Thu, 17 Nov 2011 12:23:27 GMT</pubDate></item></channel></rss>