We got a feature request that we don’t intend to implement, but I thought the reasoning is interesting enough for a blog post. The feature request:
If there is a critical error or major issue with the current state of the database, for instance when the data is not replicated from Node C to Node A due to some errors in the database or network it should send out mail to the administrator to investigate on the issue. Another example is, if the database not active due to some errors then it should send out mail as well.
On its face, the request is very reasonable. If there is an error, we want to let the administrator know about it, not hide it in some log file. Indeed, RavenDB has the concept of alerts just for that reason, to surface any issues directly to the admin ahead of time. We also have a mechanism in place that lets the admin receive alerts without manually checking the RavenDB Studio: SNMP. The Simple Network Management Protocol is designed specifically to enable this kind of monitoring, and RavenDB exposes a lot of state through it that you can act upon in your monitoring system.
Inside your monitoring system, you can define rules that will alert you. Send an SMS if the disk space is low, or email on an alert from RavenDB, etc. The idea of actively alerting the administrator is something that you absolutely want to have.
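To make the idea of rules concrete, here is a minimal sketch in Python of what a monitoring system does with them. This is not RavenDB code and not the API of any particular monitoring product; the metric names (disk_free_percent, ravendb_alert_count) and the channel names are made up for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Metrics = Dict[str, float]


@dataclass
class Rule:
    name: str                              # human readable description
    condition: Callable[[Metrics], bool]   # True when the rule should fire
    channel: str                           # hypothetical channel: "sms", "email", ...


def evaluate(rules: List[Rule], metrics: Metrics) -> List[Tuple[str, str]]:
    """Return (channel, rule name) for every rule whose condition holds."""
    return [(rule.channel, rule.name) for rule in rules if rule.condition(metrics)]


# Hypothetical rules matching the examples in the text above.
rules = [
    Rule("disk space is low", lambda m: m["disk_free_percent"] < 10, "sms"),
    Rule("RavenDB raised an alert", lambda m: m["ravendb_alert_count"] > 0, "email"),
]

# Metrics as they might be collected from the server (values invented).
fired = evaluate(rules, {"disk_free_percent": 4.0, "ravendb_alert_count": 0})
for channel, name in fired:
    print(f"notify via {channel}: {name}")
```

The point of the sketch is the split of responsibilities: the database only supplies the numbers, while the rules, thresholds and delivery channels live in the monitoring system, where they can be changed without touching the database.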
Having RavenDB send those emails, not so much. RavenDB exposes monitoring endpoints and alerts; it doesn’t act or report on them. That is the role of your actual monitoring system. You can set up Zabbix or talk to your Ops team, which likely already has a monitoring system installed.
Let’s talk about the reason that RavenDB isn’t a monitoring system.
Sending email is actually really hard. What sort of email provider do you use? What options are required to set up a connection? Do you need an X509 certificate or a user/pass combo? What happens if we can’t send the email? That is leaving aside the fact that actually getting the email delivered is hard enough. Spam, SPF, DKIM and DMARC are where things start. In short, that is a lot of complications that we’ll have to deal with.
For that matter, what about SMS integration? Surely that would also help. But no one uses SMS today, we want WhatsApp integration, and Telegram, and … You get the point.
Then there are social issues. How will we decide if we need to send an email or not? There should be some policy, and ways to configure it. If we don’t have that, we’ll end up sending either too many emails (which will get flagged / ignored) or too few (why aren’t you telling me about XYZ issue?).
A monitoring system is built to handle those sorts of issues. It can aggregate reports, give you a single email with the current status, open issues for you to fix, and do a whole lot more that is simply outside the purview of RavenDB. There is also the most critical alert of all: if RavenDB is down, it will not be able to report that it is down, because it is down.
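That last point is why the check has to live outside the database. Below is a minimal Python sketch of such an external watchdog, assuming a probe function that you supply (for example an HTTP request or an SNMP poll issued from a different machine). The names make_watchdog and tick and the threshold of three are illustrative choices, not part of RavenDB or of any monitoring product.

```python
from typing import Callable


def make_watchdog(probe: Callable[[], bool], threshold: int = 3) -> Callable[[], bool]:
    """Build a tick() function that runs one probe per call.

    tick() returns True exactly once per outage: on the call where the
    number of consecutive failed probes reaches `threshold`.
    """
    failures = 0

    def tick() -> bool:
        nonlocal failures
        try:
            healthy = probe()
        except Exception:
            # A probe that blows up (timeout, connection refused, ...)
            # is treated the same as an unhealthy answer.
            healthy = False
        failures = 0 if healthy else failures + 1
        return failures == threshold

    return tick


# Simulated probe results: one healthy answer, a four-probe outage, recovery.
answers = iter([True, False, False, False, False, True])
tick = make_watchdog(lambda: next(answers), threshold=3)
results = [tick() for _ in range(6)]
print(results)
```

Requiring several consecutive failures before alerting is one simple way to avoid paging someone over a single dropped packet, and firing only once per outage keeps the inbox from filling up. A real monitoring system offers far more nuanced versions of both policies.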
The proper way to handle this is to set up integration with a monitoring system, so we will not be implementing this feature request.