Oren Eini

CEO of RavenDB

a NoSQL Open Source Document Database

Get in touch with me:

oren@ravendb.net +972 52-548-6969

Posts: 7,640
|
Comments: 51,254
Privacy Policy · Terms
filter by tags archive
time to read 4 min | 606 words

I lately come into the conclusion that I need a few new terms to describe a few common ways to talk about the way that I structure different components in my applications.

Those are Centralized Service, Distributed Service and Localized Component. Mostly, I use them as a way to express the distribution semantics of the item in question.

As you can probably guess, I am using the term service to refer to something that we make remote calls to, while I am using the term component to refer to something that is running locally.

Centralized Service is probably the classic example of a web service. It is a server that is running somewhere to which we make remote calls to. As far as the system is built, there is only one such server. It may be implemented with clustering or load balancing, but logically (and quite often, physically), it is a single server that is processing requests. This is probably the easiest model to work with, since it more or less remove the entire question of concurrency conflicts from the system. Internally, the Centralized Service is using transactions or locks to ensure coherency in the face of concurrency.

Distributed Service is a built upfront to run on a set of server, and the need to handle concurrency conflicts is built into the design of the system. That may be done using sharding, Paxos or other methods. Usually, we build Distributed Service for very high scalability / reliability cases, since it tend to be the more complex solution. An example of a Distributed Service would be DNS, where we explicitly design the system to be resilient to failure, but accept the more complex concurrency issues (slow updates).

Localized Component is a solution to the chatty interface problem. There are quite a few scenarios where we need to make calls to a separated subsystem, but the cost of network traffic completely outweigh the cost of actually performing the operation on the other side. In this case, we may switch from a Centralized Service to a Localized Component. What this means is that instead of executing the operation on the other side, we perform it locally.

In practice, this means that we need to design our system in a way that any data that we would like to have is structured in such a way that it can be brought locally or retrieved very cheaply. An example of such a system appears in this post, although that is a fairly complex one. A more common situation is a component that deals with a set of rule, and we simply need to get the rule from the rule repository and execute it locally.

Another alternative for Localized Components is to structure it in such a way that retrieving and persisting the data is cheap, and processing it is done on locally. That way, the sharing of the data and the actual processing of the data are two distinct issues, which can be resolved separately. A common issue that needs to be resolved with Localized Components is consistency, if the component allow writes, how do other instances of the component, running on different machine get notified about it.

I tend to avoid Distributed Services in favor of Centralized Services and Localized Components, which tend to be easier to work with. It is also easier to lean on existing infrastructure then write an implementation from scratch. For example, I am using Rhino DHT (which I consider to be a Distributed Service) to handle a lot of the complexity inherit to one of those.

time to read 5 min | 997 words

imageI am going to talk about a few ways of trying to organize a project, mostly as a way to lay out the high level requirements for a feature or a module.

I consider filling in one of those to be the act of actually sitting down and thinking about the concrete design of the system. It is not final design, and it is not set in stone, but in general it forces me to think about things in a structured way.

It is not a very hard to do, so let us try to do this for the authentication part of the application. Authentication itself is a fairly simple task. In real corporate environment, I’ll probably need integration with Active Directory, but I think that we can do with simple username & pass in the sample.

Module: Authentication

Tasks: Authenticate users using name/pass combination

Integration: Publish notifications for changes to users

Scaling constraints:

Up to 100,000 users, with several million authentication calls per day.

Physical layout:

Since the system need to handle small amount of users, we have to separate deployment options, Centralized Service* and Localized Component*. Both options are going to be developed, to show both options.

Feature

Description

SLA

authenticate user

Based on name & password
Lock a user if after 5 failed logins

less than 50 ms per authentication request 99.9% of the time, for 100 requests per second per server

create new user

User name, password, email

less than 250 ms per change password request 99.9% of the time, for 10 requests per second globally

change password

 

less than 250 ms per change password request 99.9% of the time, for 10 requests per second globally

reset password

 

less than 250 ms per change password request 99.9% of the time, for 10 requests per second globally

enable / disable user

disable / enable the option to login to the system

less than 250 ms per change password request 99.9% of the time, for 10 requests per second globally

You should note that while I don’t expect to have that many users in the system, or have to handle that load, for the purpose of the sample, I think it would be interesting to see how to deal with such requirements.

The implications of this spec sheet is that the system can handle about 8.5 million authentication requests per day, and about a 3/4 of a million user modifications requests.

There are a few important things to observe about the spec sheet. It is extremely high level, it provide no actual implementation semantics but it does provide a few key data items. First, we know what the expected data size and load are. Second, we know what the SLAs for those are.

* Centralized Service & Localized Component are two topics that I’ll talk about in the future.

time to read 3 min | 506 words

image

The sample application that I am going to build is going to be a prison management application. I am going to take this post as a chance to talk about it a bit, discuss the domain and then I’ll talk about the overall architecture in more details.

The domain of a prison is actually fairly simple, you have an inmate, and the sole requirement is that you would keep him (it tend to be overwhelmingly him, rather than her) in lawful custody.

The term lawful custody has a lot of implications, which are, in more or less their order of importance:

  • The inmate is in custody, that is, he didn’t manage to run away.
  • Custody is lawful, that is, you have legal authorization to keep him in jail. Usually that means an order by a judge, or for the first 24 hours, by a police officer.
  • Lawful custody itself means that you:
    • keep the inmate fed
    • in reasonable conditions (sleeping quarters, sanitation, space)
    • access to medical facilities. Indeed, in most prisons the inmates get better health care, especially for emergencies, than the people living in most big cities.
    • ability to communicate with lawyers and family

The devil, however, is in the details. I am pretty sure that I could sit down and write about 250 pages of high level spec for things that are absolutely required for a system that run a prison and still not get everything right.

In practice, at least in the prisons I served at, we did stuff using paper, some VB6 apps & Access, and in one memorable occasion, an entire set of small prisons where running on what amounted to a full blown application written using Excel macros.

Anyway, what I think that I’ll do is start with a few modules in the system, not try to build a full blown system.

The modules that I‘ll start with would be:

  • Staff – Managing the prison’s staff. This is mostly for authentication & authorization for now.
  • Roster – Managing the roster of the prisoners, recording Countings, etc.
  • Legal – Managing the legal side of the prisoners, ensuring that there are authorizations for all the inmates, court dates, notifications, etc.
  • Escort – Responsible for actually taking the inmates out for court, medical evacs, releasing inmates, etc.

That is enough for now, for that matter, it is a huge workload already, but that is about the only way in which I can actually have a chance to show a big enough system and the interactions between all the parts.

More on Macto

time to read 2 min | 202 words

Looking at the responses that I got, I think that there is some basic misunderstanding about the goal of the sample. Some people seems to want this to be a usable product, some even went ahead and specified some… interesting requirements.

Unlike Storefront, I don’t intend to create something that would be a useful component to take and use, for the simple reason that I don’t think that it would allow to show anything really interesting. Any generic component or system has to either make too much assumptions (constrained) or not enough (open ended). I don’t care to have to hand wave things too much.

Given that I am a domain expert on exactly two things, and that I am not going to create Yet Another Bug Tracking Application, I think that it is fairly obvious what I have to build.

Macto is going to be a prison management system.

I am going to use it to demonstrate several topics that I have been dealing with lately, among them the concepts & features architecture and how to build scalable systems.

I’ll let you stew on that and will post more details about Macto tomorrow.

time to read 1 min | 198 words

It looks like what people would really like to see is an end to end sample of all the things that I have been talking about lately.

The problem with doing that is the scope that we are talking about here, it is pretty big, and there are some interconnected parts that would be hard to look at in isolation. To make things a little bit more interesting, building a “best practice” application is dependent on far too many variables.

Given all of that, I decided that I might as well copy a good idea and try to emulate Rob Conery’s Storefront series of webcasts. What I absolutely refuse to do, however, is to create Yet Another Online Shop.

Hence, I have another forum, dedicated to this specific task, where you can discuss what you want to see.  We need to decide what is the application, what are the requirements, etc.

Please have the conversation in the forum, which would make it easier to track things.

I think that I can promise at least two or three webcasts to come out of it, including the full source code.

FUTURE POSTS

No future posts left, oh my!

RECENT SERIES

  1. API Design (10):
    29 Jan 2026 - Don't try to guess
  2. Recording (20):
    05 Dec 2025 - Build AI that understands your business
  3. Webinar (8):
    16 Sep 2025 - Building AI Agents in RavenDB
  4. RavenDB 7.1 (7):
    11 Jul 2025 - The Gen AI release
  5. Production postmorterm (2):
    11 Jun 2025 - The rookie server's untimely promotion
View all series

Syndication

Main feed ... ...
Comments feed   ... ...