World’s Smallest No SQL database
I used the following in a lecture called “Why you should never write your own database”. It has never been run or tested, but it serves as a good way to discuss the challenges involved in building real-world databases.
Here is the server side code:
public class NoSqlDbController : ApiController
{
    static readonly ConcurrentDictionary<string, byte[]> data =
        new ConcurrentDictionary<string, byte[]>(StringComparer.InvariantCultureIgnoreCase);

    public HttpResponseMessage Get(string key)
    {
        byte[] value;
        if(data.TryGetValue(key, out value) == false)
            return new HttpResponseMessage(HttpStatusCode.NotFound);

        return new HttpResponseMessage
        {
            Content = new ByteArrayContent(value)
        };
    }

    public void Put(string key, [FromBody]byte[] value)
    {
        data.AddOrUpdate(key, value, (_, __) => value);
    }

    public void Delete(string key)
    {
        byte[] value;
        data.TryRemove(key, out value);
    }
}
And the client side code:
public class NoSqlDbClient
{
    private readonly HttpClient[] clients;

    public NoSqlDbClient(string[] urls)
    {
        clients = new HttpClient[urls.Length];
        for (var i = 0; i < urls.Length; i++)
        {
            clients[i] = new HttpClient { BaseAddress = new Uri(urls[i]) };
        }
    }

    public Task PutAsync(string key, byte[] data)
    {
        // GetHashCode() can be negative; mask off the sign bit so the index stays in range.
        var client = clients[(key.GetHashCode() & int.MaxValue) % clients.Length];
        return client.PutAsync("?key=" + key, new ByteArrayContent(data));
    }

    public Task DeleteAsync(string key)
    {
        var client = clients[(key.GetHashCode() & int.MaxValue) % clients.Length];
        return client.DeleteAsync("?key=" + key);
    }

    public async Task<byte[]> GetAsync(string key)
    {
        var client = clients[(key.GetHashCode() & int.MaxValue) % clients.Length];
        var r = await client.GetAsync("?key=" + key);
        return await r.Content.ReadAsByteArrayAsync();
    }
}
And yes, that is a fully functional, scale out capable, sharding enabled No SQL Key/Value store in less than 60 lines of code.
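One of the challenges this invites discussing is shard selection. The client routes each key by taking `key.GetHashCode()` modulo the number of servers, but `string.GetHashCode()` is case-sensitive while the server's dictionary is not, and .NET does not promise the same hash value across processes or runtime versions. The following is only a sketch of one possible alternative, not something from the lecture: `ShardPicker` and `PickShard` are hypothetical names, and the choice of FNV-1a over the upper-cased UTF-8 bytes is an assumption made for illustration. Upper-casing only approximates the server's `InvariantCultureIgnoreCase` comparison.

```csharp
using System;
using System.Text;

public static class ShardPicker
{
    // Hypothetical helper, not part of the original client.
    // Computes a 32-bit FNV-1a hash over the key's upper-cased UTF-8 bytes,
    // so the result does not depend on the process or runtime version.
    public static int PickShard(string key, int shardCount)
    {
        if (key == null)
            throw new ArgumentNullException(nameof(key));
        if (shardCount <= 0)
            throw new ArgumentOutOfRangeException(nameof(shardCount));

        // Normalize case so "users/1" and "USERS/1" land on the same shard,
        // roughly mirroring the server's case-insensitive dictionary.
        byte[] bytes = Encoding.UTF8.GetBytes(key.ToUpperInvariant());

        uint hash = 2166136261u;            // FNV-1a offset basis
        unchecked
        {
            foreach (byte b in bytes)
            {
                hash ^= b;
                hash *= 16777619u;          // FNV prime; wrap-around is intended
            }
        }

        // Unsigned arithmetic keeps the index in [0, shardCount).
        return (int)(hash % (uint)shardCount);
    }
}
```

A client could then pick its `HttpClient` with `clients[ShardPicker.PickShard(key, clients.Length)]`. This addresses only routing consistency; it does nothing about what happens to existing data when the number of servers changes.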
Comments
So why not?
Starfish, A REALLY long list of reasons. I have a full lecture on that.
The most interesting thing is not "why not", but "why did Ayende still"?
How would you scale this thing out? How would you shard that?
Any chance that lecture was recorded somewhere? =)
I wouldn't call it a database without at least persistence.
[)amien
Igor, I wanted to talk about the implications of those design decisions.
Timay, Throw another shard. You have sharding built into the client already.
Damien, There are a lot of in memory only dbs.
Nice idea and an excellent teaching tool.
Is this lecture available for viewing/listening? For sale?
"Timay, Throw another shard. You have sharding built into the client already."
And rebuild the entire cluster. Cool :)
Shankar, The conference was: http://www.usievents.com/en/conferences/12-paris-usi-2013 It was recorded, so it should be available at some point.
Greg, Sure. I was constrained by the fact that I had to fit both client & server code into a single PPT slide. That doesn't leave a lot of room for playing around, you know. And that gives you a whole new level of things to talk about regarding scaling, exactly on this issue.
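To put a rough number on the "rebuild the entire cluster" point, here is a small sketch, under stated assumptions, of how many keys change shard when a server is added under `hash % N` routing. `ReshardingCost` and `CountMovedKeys` are hypothetical names, and the integers `0..keyCount-1` stand in for evenly distributed hash values rather than real key hashes.

```csharp
using System;

public static class ReshardingCost
{
    // Hypothetical helper: counts how many simulated keys would be routed to a
    // different shard after the cluster grows from oldShards to newShards,
    // assuming the client picks a shard with "hash % shardCount".
    public static int CountMovedKeys(int keyCount, int oldShards, int newShards)
    {
        if (oldShards <= 0 || newShards <= 0)
            throw new ArgumentOutOfRangeException("Shard counts must be positive.");

        int moved = 0;
        for (int hash = 0; hash < keyCount; hash++)
        {
            // A key "moves" when its old and new shard indexes differ.
            if (hash % oldShards != hash % newShards)
                moved++;
        }
        return moved;
    }
}
```

For example, `ReshardingCost.CountMovedKeys(100000, 4, 5)` returns 80000: going from four shards to five reroutes four out of five keys. Since the client never moves any data, reads for those keys would go to a server that does not hold them.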
Consider using OrdinalIgnoreCase. See http://msdn.microsoft.com/en-us/library/ms973919.aspx
Clearly, you shouldn't write your own database. But there are still those of us who actually know how to program.
Regarding Justin Van Patten's comment: using OrdinalIgnoreCase speeds up the process; however, it works correctly only if you are thinking just about English. OrdinalIgnoreCase will not work correctly for some other languages (refer to the Turkish I example on the same link you mentioned). We have also had that issue at work, and we needed to update all our string comparisons to 'InvariantCultureIgnoreCase' in order to avoid such issues. Since the author of the code wanted to come up with a minimal code snippet, I think his choice of 'InvariantCultureIgnoreCase' is perfect for covering other languages, since not all readers code (just) English apps.
Ayende, you are nuts... But we <3 you :)
Cool, but I agree with Damien on persistence; calling this a database seems misleading. By this definition, would you call any ConcurrentDictionary, Dictionary or List<T> a database too? Or what is the difference here?
@Andre: I checked some common definitions of "database": Wikipedia: "an organized collection of data". Merriam-Webster online: "a usually large collection of data organized especially for rapid search and retrieval (as by a computer)". Dictionary.com unabridged: "a comprehensive collection of related data organized for convenient access, generally in a computer". Collins English Dictionary: "a systematized collection of data that can be accessed immediately and manipulated by a data-processing system for a specific purpose" or "any large store of information". American Heritage (R) Science Dictionary: "A collection of data arranged for ease and speed of search and retrieval by a computer".
So I'd guess the answer to your question would be "yes".
Good food for thought.
Good example