Ayende @ Rahien

World’s Smallest No SQL database

I used the following in a lecture called “Why you should never write your own database”. It has never been run, tested, or anything, but it serves as a good way to discuss the challenges involved in building real-world databases.

Here is the server side code:

    // Namespaces used: System, System.Collections.Concurrent, System.Net, System.Net.Http, System.Web.Http
    public class NoSqlDbController : ApiController
    {
        // Process-wide, in-memory store; keys compare case-insensitively.
        static readonly ConcurrentDictionary<string, byte[]> data =
            new ConcurrentDictionary<string, byte[]>(StringComparer.InvariantCultureIgnoreCase);

        public HttpResponseMessage Get(string key)
        {
            byte[] value;
            if (data.TryGetValue(key, out value) == false)
                return new HttpResponseMessage(HttpStatusCode.NotFound);

            return new HttpResponseMessage
            {
                Content = new ByteArrayContent(value)
            };
        }

        public void Put(string key, [FromBody]byte[] value)
        {
            // Last write wins: insert the value or overwrite the existing one.
            data.AddOrUpdate(key, value, (_, __) => value);
        }

        public void Delete(string key)
        {
            byte[] value;
            data.TryRemove(key, out value);
        }
    }

And the client side code:

    // Namespaces used: System, System.Net, System.Net.Http, System.Threading.Tasks
    public class NoSqlDbClient
    {
        private readonly HttpClient[] clients;

        public NoSqlDbClient(string[] urls)
        {
            // One HttpClient per shard.
            clients = new HttpClient[urls.Length];
            for (var i = 0; i < urls.Length; i++)
            {
                clients[i] = new HttpClient { BaseAddress = new Uri(urls[i]) };
            }
        }

        // Picks the shard for a key. GetHashCode() can be negative, so the sign bit
        // is masked off to keep the array index in range.
        private HttpClient ClientFor(string key)
        {
            return clients[(key.GetHashCode() & int.MaxValue) % clients.Length];
        }

        public Task PutAsync(string key, byte[] data)
        {
            return ClientFor(key).PutAsync("?key=" + Uri.EscapeDataString(key), new ByteArrayContent(data));
        }

        public Task DeleteAsync(string key)
        {
            return ClientFor(key).DeleteAsync("?key=" + Uri.EscapeDataString(key));
        }

        public async Task<byte[]> GetAsync(string key)
        {
            var r = await ClientFor(key).GetAsync("?key=" + Uri.EscapeDataString(key));
            // A missing key comes back as 404; report it as null rather than an empty body.
            if (r.StatusCode == HttpStatusCode.NotFound)
                return null;
            return await r.Content.ReadAsByteArrayAsync();
        }
    }

And yes, that is a fully functional, scale-out capable, sharding-enabled No SQL Key/Value store in less than 60 lines of code.
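
One of the challenges hiding in that client is the routing itself. `key.GetHashCode() % clients.Length` relies on a hash that .NET does not promise to keep stable between processes or runtime versions, it hashes case-sensitively while the server compares keys case-insensitively, and changing the number of shards sends most keys to a different node. What follows is only a minimal sketch of that last point, under stated assumptions: `ShardingSketch`, `StableHash`, `ShardFor` and `CountMovedKeys` are hypothetical names that are not part of the code above, the hash is a plain 32-bit FNV-1a swapped in for `GetHashCode()`, the `users/N` keys are made-up sample data, and upper-casing the key only approximates `InvariantCultureIgnoreCase`.

```csharp
using System;
using System.Text;

// Hypothetical helper, not part of the original client: illustrates modulo sharding
// with a hash that stays the same across processes and machines.
public static class ShardingSketch
{
    // 32-bit FNV-1a over the UTF-8 bytes of the upper-cased key.
    // Upper-casing is an assumption that roughly mirrors the server's
    // case-insensitive dictionary; it is not an exact match for
    // StringComparer.InvariantCultureIgnoreCase.
    public static uint StableHash(string key)
    {
        const uint offsetBasis = 2166136261;
        const uint prime = 16777619;

        uint hash = offsetBasis;
        foreach (byte b in Encoding.UTF8.GetBytes(key.ToUpperInvariant()))
        {
            hash ^= b;
            hash = unchecked(hash * prime);
        }
        return hash;
    }

    // Same "hash modulo shard count" routing as the client, minus the negative-index risk.
    public static int ShardFor(string key, int shardCount)
    {
        return (int)(StableHash(key) % (uint)shardCount);
    }

    // Counts how many sample keys ("users/0", "users/1", ...) would be routed
    // to a different shard after the shard count changes.
    public static int CountMovedKeys(int keyCount, int oldShards, int newShards)
    {
        int moved = 0;
        for (int i = 0; i < keyCount; i++)
        {
            string key = "users/" + i;
            if (ShardFor(key, oldShards) != ShardFor(key, newShards))
                moved++;
        }
        return moved;
    }
}
```

Calling `ShardingSketch.CountMovedKeys(10000, 4, 5)` reports how many of 10,000 sample keys land on a different shard once a fifth node is added. With an evenly distributed hash, a key stays put only when its hash modulo 4 equals its hash modulo 5, which is about one key in five, so "throw another shard" means moving most of the data.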

Posted By: Ayende Rahien

Comments

07/03/2013 09:42 AM by Starfish

So why not?

07/03/2013 10:23 AM by Ayende Rahien

Starfish, A REALLY long list of reasons. I have a full lecture on that.

07/03/2013 01:00 PM by Igor Kalders

The most interesting thing is not "why not", but "why did Ayende still"?

07/03/2013 01:53 PM by timay

How would you scale this thing out? How would you shard that?

07/03/2013 02:08 PM by Erik

Any chance that lecture was recorded somewhere? =)

07/03/2013 02:54 PM by Damien Guard

I wouldn't call it a database without at least persistence.

[)amien

07/03/2013 03:27 PM by Ayende Rahien

Igor, I wanted to talk about the implications of those design decisions.

07/03/2013 03:28 PM by Ayende Rahien

Timay, Throw another shard. You have sharding built into the client already.

07/03/2013 03:28 PM by Ayende Rahien

Damien, There are a lot of in-memory-only DBs.

07/03/2013 04:01 PM by tobi

Nice idea and an excellent teaching tool.

07/03/2013 04:31 PM by Shankar

And is this lecture available for viewing/listening? For sale?

07/03/2013 10:21 PM by Greg Young

"Timay, Throw another shard. You have sharding built into the client already."

And rebuild the entire cluster. Cool :)

07/04/2013 06:46 AM by Ayende Rahien

Shankar, The conference was: http://www.usievents.com/en/conferences/12-paris-usi-2013. It was recorded, so it should be available at some point.

07/04/2013 06:47 AM by Ayende Rahien

Greg, Sure. I was constrained by the fact that I had to fit both client & server code into a single PPT slide. That doesn't leave a lot of room for playing around, you know. And that gives you a whole new level of things to talk about regarding scaling, exactly on this issue.

07/04/2013 11:52 AM by Justin Van Patten

Consider using OrdinalIgnoreCase. See http://msdn.microsoft.com/en-us/library/ms973919.aspx

07/04/2013 03:39 PM by Bob

Clearly, you shouldn't write your own database. But there are still those of us who actually know how to program.

07/05/2013 01:30 PM by Tecfield .

Regarding Justin Van Patten's comment: using OrdinalIgnoreCase speeds up the process; however, it will work correctly only if you think just about English. OrdinalIgnoreCase will not work correctly in some other languages (refer to the Turkish I example at the same link you mentioned). We also had that issue at work, and we needed to update all our string comparisons to 'InvariantCultureIgnoreCase' in order to avoid such issues. Since the author of the code wanted to come up with a minimal code snippet, I think his choice of 'InvariantCultureIgnoreCase' is perfect for covering other languages, since not all readers code (just) English apps.

07/08/2013 10:08 AM by Rei Roldan

Ayende, you are nuts... But we <3 you :)

07/08/2013 02:45 PM by Andre

Cool, but I agree with Damien on persistence; calling this a database seems misleading. By this definition, would you call any ConcurrentDictionary, Dictionary or List a database too? Or what is the difference here?

07/29/2013 11:25 AM by Hugo Kornelis

@Andre: I checked some common definitions of "database": Wikipedia: "an organized collection of data". Merriam-Webster online: "a usually large collection of data organized especially for rapid search and retrieval (as by a computer)". Dictionary.com unabridged: "a comprehensive collection of related data organized for convenient access, generally in a computer". Collins English Dictionary: "a systematized collection of data that can be accessed immediately and manipulated by a data-processing system for a specific purpose" or "any large store of information". American Heritage (R) Science Dictionary: "A collection of data arranged for ease and speed of search and retrieval by a computer".

So I'd guess the answer to your question would be "yes".

08/05/2013 05:05 PM by Fred

Good food for thought.

Comments have been closed on this topic.