During the RavenDB courses* in the past few weeks, I was talking with one of the attendees and I came up with what I think is a great analogy.
* Yes, courses, in the past 4 weeks, I’ve given the RavenDB course 3 times. That probably explains why I don’t remember which course it was.
What are your success metrics? From Opening a Champagne Bottle To Hiding Under the Bed with Said Bottle?
The first success metric is when you have enough users (and, presumably, revenue) to cross the threshold to the Big Boys League. Let us call this the 25,000 users range. That is the moment when you throw a party, go to the store and grab a whole case of champagne bottles and make fancy speeches. Of course, the problem with success is that you can have too much of it. A system that does just fine (maybe creeks a little ) on a 25,000 users is going to behave pretty differently when you have 100,000 users. That is the moment when you find your engineers under the bed, with a half empty bottle of champagne and muttering things about Out Of Capacity errors and refusing to come out until we fire all the users.
In just about any system, you need to define the success points. Because Twitter was very luck that it managed to grow even though it had so many problems when its user base exploded. It is far more likely that users will figure out that your service is… well, your engineers are drunk and hiding under the bed, so the service looks accordingly.
And yes, you can try to talk to people about SLA, and metrics and capacities. But I have found that an image like that tend to give you a lot more focused answers. Even if a lot of the time the answer is “I don’t know”. That is a place to start, because this make it a lot more acute than just “how many req/sec do we need to support?”.