Oren Eini

CEO of RavenDB

a NoSQL Open Source Document Database

time to read 3 min | 421 words

In software engineering, the singleton pattern is a design pattern that restricts the instantiation of a class to one object. This is useful when exactly one object is needed to coordinate actions across the system.

More about this pattern.

I won’t show code or diagrams here. If you don’t know the Singleton pattern, you probably don’t have any business reading this series. Go hit the books and then come back for the rest of my review.

Off the top of my head, I can’t think of a single pattern that has been as denigrated as the Singleton pattern. It has been the bane of testers everywhere, and just about any Singleton implementation has had to become thread safe, given the demands that we usually make of our apps.

That basically means that any time you use a Singleton, you have to be damn sure that your code is thread safe. When it isn’t, this becomes really painful. That alone would be a huge mark against it, since making code thread safe is hard. But Singletons also got a bad rep because they create hidden dependencies that are hard to break. Probably the most famous of them are HttpContext.Current and DateTime.Now.

Singleton may have a tattered reputation and wounded dignity, but it is still a crucially important pattern. Most of the issues that people have with the Singleton aren’t with the notion of the single instance, but with the notion of a global static gateway, which means that it becomes very hard to modify for things like tests, and it is easy to create code that is very brittle in its dependencies on its environment.

A common workaround is to separate the notion of accessing the value from the single nature of the value. So you typically inject the value in, and something else, usually the container, is in charge of managing the lifetime of the object.
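To make that split concrete, here is a minimal sketch. Every name in it (IClock, SystemClock, FixedClock, InvoiceService, TinyContainer) is hypothetical and made up for illustration; TinyContainer is a toy stand-in for whatever real container you use, not the API of any actual library. The consumer only asks for an IClock, and the "there is exactly one" decision lives somewhere else.

```csharp
using System;
using System.Collections.Concurrent;

// Hypothetical abstraction over the DateTime.Now style of global access.
public interface IClock
{
    DateTime UtcNow { get; }
}

public sealed class SystemClock : IClock
{
    public DateTime UtcNow => DateTime.UtcNow;
}

// A test double: possible only because the consumer does not reach for a static.
public sealed class FixedClock : IClock
{
    public FixedClock(DateTime now) { UtcNow = now; }
    public DateTime UtcNow { get; }
}

// The consumer receives its dependency and has no idea whether it is a singleton.
public class InvoiceService
{
    private readonly IClock _clock;

    public InvoiceService(IClock clock) { _clock = clock; }

    public bool IsOverdue(DateTime dueDateUtc) => _clock.UtcNow > dueDateUtc;
}

// Toy container (an assumption for this sketch): it owns the lifetime decision.
public sealed class TinyContainer
{
    private readonly ConcurrentDictionary<Type, Lazy<object>> _singletons =
        new ConcurrentDictionary<Type, Lazy<object>>();

    public void RegisterSingleton<T>(Func<T> factory) where T : class
    {
        // Lazy<T> defaults to thread-safe, run-once initialization.
        _singletons[typeof(T)] = new Lazy<object>(() => factory());
    }

    public T Resolve<T>() where T : class => (T)_singletons[typeof(T)].Value;
}
```

In production you would register SystemClock once and resolve it everywhere; in a test you just hand InvoiceService a FixedClock and skip the container entirely.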

Common use cases for Singletons include caches (which would be pretty bad if they didn’t stick around) and NHibernate’s Session Factory (which is very expensive to create).

Recommendation: The notion of having just a single instance of an object is still very important, especially when you use that single instance to coordinate things. It does mean that you have multi threading issues, but those can be solved. It is a very useful pattern, but you have to watch for the pitfalls (a global static accessor that is used all over the place is one of the major ones).

time to read 16 min | 3026 words

A customer had a problem. They were mostly using the RavenDB HiLo algorithm for saving documents to the database, which is very fast & cheap. That client, however, chose to use the identity method, which means that RavenDB will assign the value.

This is usually used if you need to have sequential values. The identity is actually being managed internally by RavenDB, and that works perfectly fine.

Except… What happens when you add replication to the mix? The documents with the identity values are replicated to the secondary server, and there we don’t have the identity value, we just have the docs being written with their full id (users/1, users/2, users/3, etc).

So far, so good. But what happens when you have a failover and you need to write to the secondary, and you use the identity? Well, RavenDB ain’t stupid, and it won’t overwrite the users/1 document. Instead, it will search for the next available opening from the smallest identity value generated and use that. The code looks like this:

    private long GetNextIdentityValueWithoutOverwritingOnExistingDocuments(string key, 
        IStorageActionsAccessor actions, 
        TransactionInformation transactionInformation)
    {
        long nextIdentityValue;
        do
        {
            nextIdentityValue = actions.General.GetNextIdentityValue(key);
        } while (actions.Documents.DocumentMetadataByKey(key + nextIdentityValue, transactionInformation) != null);
        return nextIdentityValue;
    }

This works, great. Except when you have a large number of documents that have already been written, because then we check every single id, one at a time, until we find a free one. Instead of the brute force search, we now use the following approach:

    public long GetNextIdentityValueWithoutOverwritingOnExistingDocuments(string key,
        IStorageActionsAccessor actions,
        TransactionInformation transactionInformation,
        out int tries)
    {
        long nextIdentityValue = actions.General.GetNextIdentityValue(key);

        if (actions.Documents.DocumentMetadataByKey(key + nextIdentityValue, transactionInformation) == null)
        {
            tries = 1;
            return nextIdentityValue;
        }
        tries = 1;
        // there is already a document with this id, this means that we probably need to search
        // for an opening in potentially large data set. 
        var lastKnownBusy = nextIdentityValue;
        var maybeFree = nextIdentityValue*2;
        var lastKnownFree = long.MaxValue;
        while (true)
        {
            tries++;
            if(actions.Documents.DocumentMetadataByKey(key + maybeFree, transactionInformation) == null)
            {
                if (lastKnownBusy + 1 == maybeFree)
                {
                    actions.General.SetIdentityValue(key, maybeFree);
                    return maybeFree;
                }
                lastKnownFree = maybeFree;
                maybeFree = Math.Max(maybeFree - (maybeFree - lastKnownBusy) / 2, lastKnownBusy + 1);

            }
            else
            {
                lastKnownBusy = maybeFree;
                maybeFree = Math.Min(lastKnownFree, maybeFree*2);
            }
        }
    }

This can figure out the first free item in a range of a billion documents in under 100 tries, which I am pretty sure is good enough.
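If you want to play with the search outside of RavenDB, the following is a rough, storage-free sketch of the same doubling-then-bisecting idea. The class name IdentityGapSearch and the isBusy predicate are assumptions made up for this example; isBusy stands in for the document lookup, and the sketch skips the step that writes the new identity value back. It also assumes the busy ids form one contiguous run starting at the first value, which is what the replication scenario above produces.

```csharp
using System;

public static class IdentityGapSearch
{
    // isBusy(id) plays the role of "a document with this id already exists".
    public static long FindFirstFree(long start, Func<long, bool> isBusy, out int tries)
    {
        tries = 1;
        if (!isBusy(start))
            return start;

        long lastKnownBusy = start;
        long maybeFree = start * 2;
        long lastKnownFree = long.MaxValue;
        while (true)
        {
            tries++;
            if (!isBusy(maybeFree))
            {
                // A free id directly after a busy one is the opening we want.
                if (lastKnownBusy + 1 == maybeFree)
                    return maybeFree;

                // Free, but possibly too far out: move halfway back toward the busy side.
                lastKnownFree = maybeFree;
                maybeFree = Math.Max(maybeFree - (maybeFree - lastKnownBusy) / 2, lastKnownBusy + 1);
            }
            else
            {
                // Still busy: keep doubling, but never go past a known free id.
                lastKnownBusy = maybeFree;
                maybeFree = Math.Min(lastKnownFree, maybeFree * 2);
            }
        }
    }
}
```

Calling FindFirstFree(1, id => id <= 1000000000L, out tries) simulates a billion existing documents: the doubling phase needs about 30 probes to overshoot, and the bisection needs at most two probes per halving after that, so the total stays below 100.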

time to read 3 min | 414 words

Create objects based on a template of an existing object through cloning.

More about this pattern.

This is what it looks like:

Prototype Example

Surprisingly enough, there are very few useful concrete examples of this, even in the literature. A lot of the time you see references to ConcreteImplA and ConcreteImplB.

The original impetus for the Prototype pattern was actually:

  • avoid subclasses of an object creator in the client application, like the abstract factory pattern does.
  • avoid the inherent cost of creating a new object in the standard way (e.g., using the 'new' keyword) when it is prohibitively expensive for a given application.

That is actually quite interesting. As I mentioned in the Factory Method analysis post, I like the notion of using a Factory Delegate (and thus avoiding subclassing) quite a lot. This is usually useful for behavioral objects that contain little state (it would be more accurate to say that their state is behavior, such as a class that mostly contains delegate members for different things). But for those sorts of things, you usually don’t really need to modify them after the fact, so there isn’t much of a prototype here.

The second reason is not relevant for most things today. The cost of new is so near zero as to be effectively meaningless.

But something that isn’t mentioned about this pattern is that it is very useful for multi threading. The notion of being able to handle a cloned object that can be modified independently of its original is key in things like caches, as you can see in the code above. We make heavy use of that internally inside RavenDB, for example, although we chose a slightly more complex (and performant) route.

A key observation about this is that Prototype assumes long-lived objects, because otherwise there wouldn’t be a prototype instance to clone from. In a wide variety of applications today, that is simply not the case. Most of our objects live only for a single request. And anything whose lifetime is longer than a single request is usually persisted to stable storage, rendering the basis for the Prototype pattern’s existence moot.

Recommendation: This is still a useful pattern for a limited number of scenarios. In particular, the ability to hand out a copy of the instance from a cache means that we don’t have to worry about multi threading. That said, beyond this scenario, I haven’t found many other uses for this.
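As a rough illustration of the cache scenario, here is a small sketch. CachedDocument and DocumentCache are hypothetical names invented for this example, and this is deliberately the simple route, not how RavenDB does it internally. The cache keeps a private prototype and only ever hands out clones, so callers can mutate what they get without locks and without affecting anyone else.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;

// Hypothetical cached item: an id plus some mutable fields.
public sealed class CachedDocument
{
    public string Id { get; set; }
    public Dictionary<string, string> Fields { get; set; }

    // The Prototype part: produce an independent copy of this instance.
    public CachedDocument Clone()
    {
        return new CachedDocument
        {
            Id = Id,
            Fields = new Dictionary<string, string>(Fields)
        };
    }
}

public sealed class DocumentCache
{
    private readonly ConcurrentDictionary<string, CachedDocument> _prototypes =
        new ConcurrentDictionary<string, CachedDocument>();

    public void Put(CachedDocument doc)
    {
        // Clone on the way in, so later changes by the caller do not leak into the cache.
        _prototypes[doc.Id] = doc.Clone();
    }

    public CachedDocument Get(string id)
    {
        CachedDocument prototype;
        // Clone on the way out; the stored prototype itself is never mutated.
        return _prototypes.TryGetValue(id, out prototype) ? prototype.Clone() : null;
    }
}
```

The price is an allocation and a copy on every Get, which is the trade you make for not having to think about who else is holding the same instance.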

time to read 6 min | 1123 words

Define an interface for creating an object, but let the classes that implement the interface decide which class to instantiate. The Factory method lets a class defer instantiation to subclasses.

More on this pattern.

Here is some sample code:

    public class MazeGame {
      public MazeGame() {
         Room room1 = MakeRoom();
         Room room2 = MakeRoom();
         room1.Connect(room2);
         AddRoom(room1);
         AddRoom(room2);
      }

      protected virtual Room MakeRoom() {
         return new OrdinaryRoom();
      }
    }

This pattern is quite useful, and is in fairly moderate use. For example, you can take a look at WebClient.GetWebRequest, which is an exact implementation of this pattern. I like this pattern because it lets me keep to the Open Closed Principle: I don’t need to modify the class, I can just inherit and override the method to change things.

Still, this is the classic method. I like to mix things up a bit and not use a virtual method. Instead, I do things like this:

    public class MazeGame {
       public Func<Room> MakeRoom = () => new OrdinaryRoom();
    }

This allows me to change how we are creating the room without even having to create a new subclass. In fact, it allows me to change this per instance.
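To show the per instance part, here is a self-contained sketch. Room, OrdinaryRoom and MazeGame follow the sample above, but their bodies are filled in by me for illustration, and MagicRoom, Describe, Build and Rooms are made-up additions that are not part of the original sample.

```csharp
using System;
using System.Collections.Generic;

public class Room
{
    public virtual string Describe() => "a room";
}

public class OrdinaryRoom : Room
{
    public override string Describe() => "an ordinary room";
}

// Hypothetical second room type, only here so there is something to swap in.
public class MagicRoom : Room
{
    public override string Describe() => "a magic room";
}

public class MazeGame
{
    // The factory delegate: a public field with a sensible default.
    public Func<Room> MakeRoom = () => new OrdinaryRoom();

    private readonly List<Room> _rooms = new List<Room>();

    public IReadOnlyList<Room> Rooms => _rooms;

    // Hypothetical helper: creates rooms through whatever MakeRoom currently points to.
    public void Build(int roomCount)
    {
        for (int i = 0; i < roomCount; i++)
            _rooms.Add(MakeRoom());
    }
}
```

A plain `new MazeGame()` builds ordinary rooms, while `new MazeGame { MakeRoom = () => new MagicRoom() }` builds magic ones, and the two instances can live side by side with no MagicMazeGame subclass in sight.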

I make quite a heavy use of this in RavenDB, for example. The DocumentConventions class is basically built of nothing else.

Recommendation: Go for the lightweight Factory Delegate approach. As with all patterns, use with caution and watch for overuse & abuse. In particular, if you need to manage state between multiple delegates, fall back to the overriding approach, because you can keep the state in the subclass.

time to read 4 min | 653 words

The intent of the Builder design pattern is to separate the construction of a complex object from its representation. By doing so, the same construction process can create different representations.

More about this pattern.

The sample code that usually comes with this pattern is something like this:

    PizzaBuilder hawaiianPizzaBuilder = new HawaiianPizzaBuilder();
    Cook cook = new Cook();
    cook.SetPizzaBuilder(hawaiianPizzaBuilder);
    cook.ConstructPizza();
    // create the product
    Pizza hawaiian = cook.GetPizza();

I find this sort of code to be extremely verbose and hard to read. Especially when we have a lot of options and things to do. Fluent Interfaces, however, are just an instance of the Builder pattern, and they are basically adding a modern API look & feel to the way we are actually constructing objects. Another thing to remember is that we are dealing with C#, and we have things like object initializers to do a lot of the heavy lifting for building objects. You should use that, for most cases.

NHibernate, for example, has the notion of a Builder, using the NHibernate.Cfg.Configuration object. It allows us to put all of the construction / validation code in one spot, and then the actual runtime code in a different place (where it can assume a lot about its invariants). It also allows us to do a lot of interesting things, like serializing the builder object (to save building time), which is something that is usually hard or impossible to do with real objects.

That said, you should be careful of code like the one listed above. What you have there is an overly abstract system that requires multiple steps just to get somewhere. If you find yourself feeding builders into builders, please stop and think about what you are doing. If you got there, you have not simplified the construction process.

Recommendation: This is still a very useful pattern. It should absolutely not be used if all you need to do is set some values. Reserve the Builder pattern for cases where you actually have logic and behavior associated with the building process.
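For contrast with the Cook sample, here is a small fluent sketch of a builder that has some actual logic attached to the building process. FluentPizzaBuilder, its methods and the validation rules are all invented for this example (it is unrelated to the PizzaBuilder type in the sample above). The point is only that the rules live in Build(), so the resulting Pizza can assume its invariants.

```csharp
using System;
using System.Collections.Generic;

// The product: read-only once built, so runtime code can rely on its invariants.
public sealed class Pizza
{
    internal Pizza(string dough, IReadOnlyList<string> toppings)
    {
        Dough = dough;
        Toppings = toppings;
    }

    public string Dough { get; }
    public IReadOnlyList<string> Toppings { get; }
}

public sealed class FluentPizzaBuilder
{
    private string _dough = "regular";
    private readonly List<string> _toppings = new List<string>();

    public FluentPizzaBuilder WithDough(string dough)
    {
        _dough = dough;
        return this;
    }

    public FluentPizzaBuilder AddTopping(string topping)
    {
        _toppings.Add(topping);
        return this;
    }

    // All construction / validation logic sits in one spot (the rules are made up).
    public Pizza Build()
    {
        if (string.IsNullOrWhiteSpace(_dough))
            throw new InvalidOperationException("A pizza needs a dough.");
        if (_toppings.Count == 0)
            throw new InvalidOperationException("A pizza needs at least one topping.");

        // Copy the list so later builder calls cannot change an already built pizza.
        return new Pizza(_dough, _toppings.ToArray());
    }
}
```

Usage reads as `new FluentPizzaBuilder().WithDough("thin").AddTopping("ham").AddTopping("pineapple").Build()`. If there were no rules to enforce in Build(), an object initializer would do the same job with less code.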

time to read 1 min | 156 words

I’ll be spending the last week of November in London, at the Skills Matter offices.

On the 26 Nov, I’ll be giving a 3-day NHibernate course.

And on the 29 Nov, I’ll be giving 2 full days of RavenDB awesomeness. This course is scheduled to run around the same time as the RavenDB 1.2 release, which leads me to the In The Brains session I’ll be giving along the way.

What is new in RavenDB 1.2

RavenDB 1.0 was exciting and fun. RavenDB 1.2 builds on top of that and adds a whole host of really nice features.
Come to hear about the new Changes API, or how you can use evil patching to make the database bow to your wishes. Learn how you can add encryption and compression to your database in a few minutes, and watch how operational tasks became even simpler. In short, come and see all of the new stuff for RavenDB!

time to read 11 min | 2166 words

In my Abstract Factory post, I mentioned that I really don’t like the pattern, and in particular, code like this:

    static IGUIFactory CreateOsSpecificFactory()
    {
       string sysType = ConfigurationSettings.AppSettings["OS_TYPE"];
       if (sysType == "Win") 
       {
           return new WindowsFactory();
       } 
       else 
       {
           return new MacFactory();
       }
    }

One of the comments mentioned that this might not be ideal, but it is still better than:

    if(RunningOnWindows)
    {
        // code
    }
    else if(RunningOnMac)
    {
       // code
    }
    else if(RunningOnLinux)
    {
       // code
    }

And I agree. But I think that, as the comment mentioned, a far better alternative would be using the container. You can do this using:

    [OperationSystem("Windows")]
    public class WindowsFactory : IGUIFactory
    {
    }

    [OperationSystem("Linux")]
    public class LinuxFactory : IGUIFactory
    {
    }

    [OperationSystem("Mac")]
    public class MacFactory : IGUIFactory
    {
    }

Then you just need to wire things through the container. Among other things, this means that we respect the open / closed principle. If we need to support a new system, we can just add a new class, we don’t need to modify code.
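What that wiring might look like depends entirely on your container, so take the following as a hand-rolled sketch of the idea using plain reflection. The OperationSystemAttribute definition, its Name property and the GuiFactoryLocator class are assumptions I made up for this example; only the attribute usage and the three factory classes come from the snippet above.

```csharp
using System;
using System.Linq;
using System.Reflection;

// Assumed shape of the attribute used above: it just carries the OS name.
[AttributeUsage(AttributeTargets.Class)]
public sealed class OperationSystemAttribute : Attribute
{
    public OperationSystemAttribute(string name) { Name = name; }
    public string Name { get; }
}

public interface IGUIFactory { }

[OperationSystem("Windows")]
public class WindowsFactory : IGUIFactory { }

[OperationSystem("Linux")]
public class LinuxFactory : IGUIFactory { }

[OperationSystem("Mac")]
public class MacFactory : IGUIFactory { }

// Hypothetical locator: scans the assembly once per call and picks the match.
public static class GuiFactoryLocator
{
    public static IGUIFactory Create(string osName)
    {
        Type match = typeof(IGUIFactory).Assembly.GetTypes()
            .Where(t => t.IsClass && !t.IsAbstract && typeof(IGUIFactory).IsAssignableFrom(t))
            .FirstOrDefault(t =>
            {
                var attribute = t.GetCustomAttribute<OperationSystemAttribute>();
                return attribute != null && attribute.Name == osName;
            });

        if (match == null)
            throw new NotSupportedException("No IGUIFactory is marked for '" + osName + "'.");

        // Requires a public parameterless constructor on the factory type.
        return (IGUIFactory)Activator.CreateInstance(match);
    }
}
```

Supporting another system is then a matter of adding one more class with the attribute on it; the locator never changes. A real container would also cache the scan and handle constructor dependencies, which this sketch ignores.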

Remember, the Go4 book was written in the age of C++. Reflection didn’t exist, and that means that a lot of the patterns do by hand things that can now happen automatically.

time to read 5 min | 901 words

The essence of the Abstract Factory pattern is to "Provide an interface for creating families of related or dependent objects without specifying their concrete classes".

More about this pattern.

Here is some sample code:

    static IGUIFactory CreateOsSpecificFactory()
    {
        string sysType = ConfigurationSettings.AppSettings["OS_TYPE"];
        if (sysType == "Win") 
        {
            return new WindowsFactory();
        } 
        else 
        {
            return new MacFactory();
        }
    }

I am in two minds about this pattern. On the one hand, we have pretty damning evidence that this has been really bad for the industry at large. For details, you can see the Why I Hate Frameworks post. When I first saw that, shortly after reading the Go4 for the first time, I was in tears from laughing. But the situation he describes is true, accurate and still painful today.

Case in point, WCF suffers from a serious overuse of abstract factories. For example, IInstanceProvider (and I just love that in order to wire that in you usually have to implement IServiceBehavior).

As the I Hate Frameworks post mentioned:

Each hammer factory factory is built for you by the top experts in the hammer factory factory business, so you don't need to worry about all the details that go into building a factory.

Awesome, or not, as the case may be.

Then again, it is a useful pattern. The problem is that in the general case, creating objects that create objects (that create even more objects) is a pretty good indication that your architecture is already pretty hosed. You should strive for an architecture that has a minimal number of levels, and an abstract factory is a whole new level even on its own.

Recommendation: Avoid if you can. If you run into a place where you think you need this, consider whether you can simplify your architecture to the point where it is not required.
