Oren Eini

CEO of RavenDB

a NoSQL Open Source Document Database

Get in touch with me:

oren@ravendb.net +972 52-548-6969

time to read 3 min | 431 words

Recently we got a bug report about the performance of Windsor when registering a large number of components (thousands). I decided to sit down and investigate this, and found something troublesome.

Internally, registering a component would trigger a check of all registered components that were waiting for a dependency. If you had a lot of components waiting for dependencies, registering a new component degenerated to an O(N^2) operation, where N was the number of components with waiting dependencies.

Luckily, there was no real requirement for an O(N^2) operation, and I was able to change that to an O(N) operation.

Huge optimization win, right?

In numbers, we are talking about 9.2 seconds to register 500 components with no matching dependencies. After the optimization, that dropped to 500 milliseconds. But with a larger number of components, this is still a problem.

After optimization, registering 5,000 components with no matching dependencies took 44.5 seconds. That is better than before (when no one had the patience to wait and find out the number), but I think we can improve upon it.

The problem is that we are still paying that O(N) cost for each registration. Now, to support systems that already use Windsor, we can’t really change the way Windsor handles registrations by default, so I came up with the following syntax, which safely changes the way Windsor handles registration:

var kernel = new DefaultKernel();
using (kernel.OptimizeDependencyResolution())
{
    for (int i = 0; i < 500; i++)
    {
        kernel.AddComponent("key" + i, typeof(string), typeof(string));
    }
}

Using this method, registering 5,000 components drops down to 2.5 seconds.
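To make the idea behind the using block concrete, here is a minimal sketch of deferring the per-registration check until a scope is disposed. It is not Windsor's implementation: SketchKernel, ChecksPerformed and DeferredScope are made-up names, and the "check" only counts how many waiting components would have been examined.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a container kernel; not Windsor's actual code.
public class SketchKernel
{
    private readonly List<string> waiting = new List<string>();
    private bool deferChecks;

    // Counts how many waiting components were examined, to make the cost visible.
    public int ChecksPerformed { get; private set; }

    public void AddComponent(string key)
    {
        waiting.Add(key);
        if (!deferChecks)
            RecheckWaitingComponents(); // O(N) work on every registration
    }

    // Returns a scope; re-checks are skipped until the scope is disposed.
    public IDisposable OptimizeDependencyResolution()
    {
        deferChecks = true;
        return new DeferredScope(this);
    }

    private void RecheckWaitingComponents()
    {
        // A real container would try to satisfy dependencies here.
        ChecksPerformed += waiting.Count;
    }

    private sealed class DeferredScope : IDisposable
    {
        private readonly SketchKernel kernel;

        public DeferredScope(SketchKernel kernel)
        {
            this.kernel = kernel;
        }

        public void Dispose()
        {
            kernel.deferChecks = false;
            kernel.RecheckWaitingComponents(); // one pass over everything at the end
        }
    }
}
```

In this sketch, registering 500 components one by one examines 125,250 entries in total, while registering them inside the scope examines 500. Existing callers that never open the scope keep the old behavior.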

I then spent additional time finding all the other nooks and crannies where optimizations hid, bringing the time down to 1.4 seconds.

Now, I have to say that the cost does not grow linearly. Registering 20,000 components will take about 25 seconds. This is not a scenario that we worry overmuch about.

The best thing about the non-linear curve is that for 1,000 components, which is what we do care about, registration takes 240 milliseconds. Most applications don’t get to have a thousand components, anyway.

There are also other improvements made in the overall runtime performance of Windsor, but those would be very hard to notice outside of a tight loop.

time to read 2 min | 293 words

Some would say that it is about time, and I would agree. Windsor might not be the OSS project that spent the longest time in a pre-release state (I think that honor belongs to Hurd), but it spent enough time in that state to at least deserve an honorable mention.

That was mostly because, although Windsor was production ready for the last three or four years or so, most of the people making use of it were happy to make use of the trunk version.

If you look, you won’t find Windsor 1.0, only release candidates for 1.0. As I believe I mentioned, Windsor has been production ready for a long time, and for the full release we decided to skip the 1.0 designator, which doesn’t really fit, and go directly to 2.0.

The last Windsor release (RC3) was almost a year and a half ago, and in the meantime, much has improved in Windsor land. Building upon the already superb engine and facilities, we have fitted Windsor to the 3.5 release of the .NET Framework, created a full-fledged fluent API to support easy configuration, allowed more granular control over the behavior of the container when selecting components and handlers, and improved overall performance.

All in all, pretty good stuff, even if I say so myself. Just to give you an idea, the list of changes from the previous release goes on for quite a while, so I am going to let the short list above stand in its place.

You can get the new release from the SourceForge site.

time to read 3 min | 427 words

I read Glenn's post about MEF not supporting open generic types with something resembling shock. The idea that it isn't supporting this never even crossed my mind; it was a given that this is a mandatory feature for any container in .NET land.

Just to give you an idea, what this means is that you can't register Repository<T> and then resolve Repository<Order>. In 2006, I wrote an article for MSDN detailing what has since become a very common use of this pattern. Generic specialization is not something that I would consider optional; it is one of the most common usage patterns of containers in .NET land. IRepository<T> is probably the most common example that people quote, but there are others as well.
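For readers who have not seen the pattern, here is a minimal sketch of what resolving an open generic registration involves. TinyContainer is a made-up toy, not the code of Windsor, Object Builder or MEF; the only framework calls it relies on are Type.GetGenericTypeDefinition, Type.MakeGenericType and Activator.CreateInstance.

```csharp
using System;
using System.Collections.Generic;

public class Order { }

public interface IRepository<T>
{
    Type EntityType { get; }
}

public class Repository<T> : IRepository<T>
{
    public Type EntityType { get { return typeof(T); } }
}

// Hypothetical toy container, for illustration only.
public class TinyContainer
{
    private readonly Dictionary<Type, Type> registrations = new Dictionary<Type, Type>();

    public void Register(Type service, Type implementation)
    {
        registrations[service] = implementation;
    }

    public object Resolve(Type service)
    {
        Type implementation;

        // Exact match first: closed types registered directly.
        if (registrations.TryGetValue(service, out implementation))
            return Activator.CreateInstance(implementation);

        // Fall back to the open generic definition, e.g. IRepository<>.
        if (service.IsGenericType)
        {
            Type openService = service.GetGenericTypeDefinition();
            if (registrations.TryGetValue(openService, out implementation))
            {
                // Close Repository<> over the requested type arguments.
                Type closed = implementation.MakeGenericType(service.GetGenericArguments());
                return Activator.CreateInstance(closed);
            }
        }

        throw new InvalidOperationException("No registration for " + service);
    }
}
```

Usage under these assumptions: register typeof(IRepository<>) against typeof(Repository<>) once, then call Resolve(typeof(IRepository<Order>)) and cast the result to IRepository<Order>. The sketch needs the CLR Type of the request to close the generic, which is exactly the information a string-keyed lookup does not carry.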

This is not a simple feature, let me make that clear. Not simple at all. I should know, I implemented that feature for both Object Builder and Windsor. But it is not one that I would consider optional.

I am even more disturbed by the actual reasoning behind not supporting this. It is a technical limitation of MEF, because internally all components are resolved by string matching rather than by CLR Types. This decision severely limits the things that MEF can do. Not supporting what is (in my opinion) a pretty crucial feature is one example of that, but there are other implications. It means that you can't really do things like polymorphic resolution, and that your choices in extending the container are very limited, because the container isn't going to carry the information that is required to make those decisions.
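A small sketch of why the key type matters for polymorphic resolution. This is not how MEF or Windsor store registrations; TypeKeyedLookup and StringKeyedLookup are made-up helpers that only contrast what each kind of key lets you ask.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public interface IShape { }
public class Circle : IShape { }
public class Square : IShape { }

// Hypothetical helper: with Type objects we can ask the runtime about inheritance.
public static class TypeKeyedLookup
{
    public static List<Type> FindImplementations(IEnumerable<Type> registered, Type service)
    {
        return registered.Where(t => service.IsAssignableFrom(t)).ToList();
    }
}

// Hypothetical helper: with plain strings, all we can do is compare text.
public static class StringKeyedLookup
{
    public static List<string> FindImplementations(IEnumerable<string> registered, string contract)
    {
        return registered.Where(c => c == contract).ToList();
    }
}
```

Given Circle, Square and string as registered types, asking the first helper for IShape finds both shapes. Given the names "Circle" and "Square", asking the second helper for "IShape" finds nothing, because the relationship between the names is not recorded anywhere.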

I would advise the MEF team to rethink the decision to base component resolution on strings. At this point in time, it is still possible to change things (and yes, I know it isn't as easy as I make it seem). Not supporting open generic types is bad, but not having the ability to do so, and the reason for that (not keeping CLR Type information), are even worse. I get that MEF needs to work with DLR objects as well, but that means that MEF makes the decision to have lousier support for CLR idioms for the benefit of the DLR.

Considering the usage numbers for both of them, I can't see this being a good decision. It is certainly possible to support them both, but if there are any tradeoffs that have to be made, I would suggest that it should be the DLR, and not the CLR, that takes the second-class role.

time to read 2 min | 373 words

One of the most annoying things that we have to do during development is updating configuration files. That is why convention over configuration is such a successful concept. The problem is what to do when you can mostly use the convention, but need to supply configuration values as well.

Well, one of the nice things about Windsor is the ability to merge several sources of information transparently. Given this configuration:

<configuration>
	<configSections>
		<section name="castle"
			type="Castle.Windsor.Configuration.AppDomain.CastleSectionHandler, Castle.Windsor" />
	</configSections>
	<castle>
		<facilities>
			<facility id="rhino.esb" >
				<bus threadCount="1"
					 numberOfRetries="5"
					 endpoint="msmq://localhost/demo.backend"
					 />
				<messages>
				</messages>
			</facility>
		</facilities>
		<components>
			<component id="Demo.Backend.SendEmailConsumer">
				<parameters>
					<host>smtp.gmail.com</host>
					<port>587</port>
					<password>*****</password>
					<username>*****@ayende.com</username>
					<enableSsl>true</enableSsl>
					<from>*****@ayende.com</from>
					<fromDisplayName>Something</fromDisplayName>
				</parameters>
			</component>
		</components>
	</castle>
</configuration>

And this auto registration:

var container = new WindsorContainer(new XmlInterpreter());
container.Register(
	AllTypes.Of(typeof (ConsumerOf<>))
		.FromAssembly(typeof(Program).Assembly)
	);

We now get the benefit of both convention and configuration. We can let the convention pick up anything that we need, and configure just the values that we really have to configure.

time to read 3 min | 582 words

PostSharp is an AOP framework that works using byte code weaving. That is, it re-writes your IL to add behaviors to it. From my point of view, it is like having the cake (interception, byte code weaving) and eating it (I haven't even looked at the PostSharp source code, just used the binary release).

My initial spike with it went very well. Here it is:

[Serializable]
public class Logger : OnFieldAccessAspect
{
    public override void OnGetValue(FieldAccessEventArgs eventArgs)
    {
        Console.WriteLine(eventArgs.InstanceTag);
        Console.WriteLine("get value");
        base.OnGetValue(eventArgs);
    }

    public override InstanceTagRequest GetInstanceTagRequest()
    {
        return new InstanceTagRequest("logger", new Guid("4f8a4963-82bf-4d32-8775-42cc3cd119bd"), false);
    }

    public override void OnSetValue(FieldAccessEventArgs eventArgs)
    {
        int i = (int?)eventArgs.InstanceTag ?? 0;
        eventArgs.InstanceTag = i + 1;
        Console.WriteLine("set value");
        base.OnSetValue(eventArgs);
    }
}

This is an aspect that runs on each field access. It is not really useful, but it helps to show how things work. A couple of things that I think are insanely useful:

  • Aspects are instantiated at compile time, allowed time to set themselves up, then serialized to a resource in the assembly. At runtime, they are deserialized and ready to run. The possibilities this gives you are amazing.
  • InstanceTag is a way to keep additional data per aspect.

Now, let us assume that I want to add the aspect to this code:

[Logger]
public class Customer
{
    public string Name { get; set; }
}

Note, there is no field. (Well, there is, it is generated by the compiler). Now we compile and run the PostSharp post compile step. With that, we can now investigate what is going on.

image

As you can see, we are deserializing the attribute and storing it in a field that we can now access. Let us check the Customer implementation now:

image

We have the logger field, which is used for something, but we also have ~get~<Name>k__BackingField and ~set~<Name>k__BackingField. <Name>k__BackingField (and I would love to hear the story behind that) is the compiler-generated field that was created for us. The ~get~... and ~set~... methods are generated by PostSharp. Before we look at them, we will look at the implementation of Name.

image

Where it used to call the field directly, now it is doing this via a method call. And now we can look at those method calls.

image

There is a lot going on here. We create a new field access event arg, call the aspect method, and return the value. Note that the state (instance tag) is stored in the object as well, for each field access.
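Since the screenshots are not reproduced here, the following is a hand-written approximation of the shape described above. It is not PostSharp's generated code: FieldAccessArgs, CountingAspect, TrackedCustomer, GetName, SetName and WriteCount are made-up names, and the aspect is created directly instead of being deserialized from an assembly resource.

```csharp
using System;

// Hypothetical stand-in for the event args passed to the aspect.
public class FieldAccessArgs
{
    public object StoredValue;   // value being read from or written to the field
    public object InstanceTag;   // per-instance state owned by the aspect
}

// Hypothetical aspect: counts writes in the instance tag, as the Logger above does.
public class CountingAspect
{
    public virtual void OnGetValue(FieldAccessArgs args) { }

    public virtual void OnSetValue(FieldAccessArgs args)
    {
        int writes = (int?)args.InstanceTag ?? 0;
        args.InstanceTag = writes + 1;
    }
}

public class TrackedCustomer
{
    // The woven code keeps the aspect in a static field (here simply constructed).
    private static readonly CountingAspect aspect = new CountingAspect();

    private string nameField;     // plays the role of the compiler-generated backing field
    private object instanceTag;   // the aspect's state lives on the object itself

    public string Name
    {
        get { return GetName(); }  // field access replaced by a method call
        set { SetName(value); }
    }

    public int WriteCount { get { return (int?)instanceTag ?? 0; } }

    private string GetName()
    {
        var args = new FieldAccessArgs { StoredValue = nameField, InstanceTag = instanceTag };
        aspect.OnGetValue(args);
        instanceTag = args.InstanceTag;
        return (string)args.StoredValue;
    }

    private void SetName(string value)
    {
        var args = new FieldAccessArgs { StoredValue = value, InstanceTag = instanceTag };
        aspect.OnSetValue(args);
        instanceTag = args.InstanceTag;
        nameField = (string)args.StoredValue;
    }
}
```

Setting Name twice on one TrackedCustomer leaves WriteCount at 2 for that instance only, which is the per-object state that the instance tag provides.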

It looks very well done.

time to read 4 min | 613 words

In my previous post I introduced the basis of context as an architectural pattern. Now I want to talk about how we can implement that using Windsor and a new extensibility point: IModelInterceptorsSelector.

The interface is defined as:

/// <summary>
/// Select the appropriate interceptors based on the application specific
/// business logic
/// </summary>
public interface IModelInterceptorsSelector
{
    /// <summary>
    /// Select the appropriate interceptor references.
    /// The interceptor references aren't necessarily registered in model.Interceptors.
    /// </summary>
    /// <param name="model">The model to select the interceptors for</param>
    /// <returns>The interceptors for this model (in the current context) or a null reference</returns>
    /// <remarks>
    /// If the selector is not interested in modifying the interceptors for this model, it 
    /// should return a null reference and the next selector in line would be executed (or the default
    /// model.Interceptors).
    /// If the selector returns a non-null value, this is the value that is used, and model.Interceptors is ignored. If this
    /// is not the desired behavior, you need to merge your interceptors with the ones in model.Interceptors yourself.
    /// </remarks>
    InterceptorReference[] SelectInterceptors(ComponentModel model);

    /// <summary>
    /// Determine whether the specified model has interceptors.
    /// The selector should only return true from this method if it has determined that this is
    /// a model that it would likely add interceptors to.
    /// </summary>
    /// <param name="model">The model</param>
    /// <returns>Whether this selector is likely to add interceptors to the specified model</returns>
    bool HasInterceptors(ComponentModel model);
}

And registering it in the container is simply:

container.Kernel.ProxyFactory.AddInterceptorSelector(selector);

Interceptors are the basis of AOP, but traditionally, you didn't get a lot of choices in how you compose your interceptors at runtime. Using IModelInterceptorsSelector makes it extremely easy to modify the selection of interceptors based on relevant business logic.

Let us take the following example. We have a warehouse service that we want to add caching to. However, we can't use the cache if the request comes from the fulfillment service. First, we define the caching interceptor; then, we define the logic that controls adding or removing it.

public class WarehouseCachingInterceptorSelector : IModelInterceptorsSelector
{
    public InterceptorReference[] SelectInterceptors(ComponentModel model)
    {
        if(model.Service!=typeof(IWarehouse))
            return null;
        if(Origin.IsFromFulfillment)
            return null;
        return new InterceptorReference[]{new InterceptorReference(typeof(WarehouseCachingInterceptor)), };
    }

    public bool HasInterceptors(ComponentModel model)
    {
        return model.Service == typeof (IWarehouse);
    }
}

And now we get caching for everything except for fulfillment. And we get this in a clean and very easy to understand way. :-D

time to read 3 min | 527 words

In my previous post I introduced the basis of context as an architectural pattern. Now I want to talk about how we can implement that using Windsor and a new extensibility point: IHandlerSelector.

The interface is defined as:

/// <summary>
/// Implementors of this interface can extend the way the container performs
/// component resolution based on some application specific business logic.
/// </summary>
/// <remarks>
/// This is the sibling interface to <seealso cref="ISubDependencyResolver"/>.
/// This is dealing strictly with root components, while the <seealso cref="ISubDependencyResolver"/> is dealing with
/// dependent components.
/// </remarks>
public interface IHandlerSelector
{
    /// <summary>
    /// Whether the selector has an opinion about resolving a component with the
    /// specified service and key.
    /// </summary>
    /// <param name="key">The service key - can be null</param>
    /// <param name="service">The service interface that we want to resolve</param>
    bool HasOpinionAbout(string key, Type service);

    /// <summary>
    /// Select the appropriate handler from the list of defined handlers.
    /// The returned handler should be a member from the <paramref name="handlers"/> array.
    /// </summary>
    /// <param name="key">The service key - can be null</param>
    /// <param name="service">The service interface that we want to resolve</param>
    /// <param name="handlers">The defined handlers</param>
    /// <returns>The selected handler, or null</returns>
    IHandler SelectHandler(string key, Type service, IHandler[] handlers);
}

And registering it in the container is simply:

container.Kernel.AddHandlerSelector(selector);

A handler selector is asked if it wants to express an opinion on a particular component resolution, based on key (optional) and type. Assuming we say yes, we are called to select the appropriate handler from all the registered handlers that can satisfy that request.

Let us say that we want to recover from the database being down by serving an implementation that reads only from the cache. We can implement it thusly:

public class DataAccessHandlerSelector : IHandlerSelector
{
	bool databaseIsDown = false;

	public DataAccessHandlerSelector()
	{
		DatabaseMonitor.OnChangedState += 
			state => databaseIsDown = state == DatabaseState.Down;
	}

	public bool HasOpinionAbout(string key, Type service)
	{
		return databaseIsDown && service == typeof(IRepository);
	}

	public IHandler SelectHandler(string key, Type service, IHandler[] handlers)
	{
		return handlers.Where(x => x.ComponentModel.Implementation == typeof(CacheOnlyRepository)).First();
	}
}

Now we automatically replace, based on our own logic and the current context, the type of component that the container should resolve.

I am giving the example of detecting infrastructure change, but as important, and as interesting, is the ability to easily use this in order to select services in a multi tenant environment. We can use this approach to perform service overrides all over the place in a way that is natural, easy and extremely powerful.
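As a rough illustration of the multi-tenant case, here is a sketch of a selector that picks a per-tenant implementation. The ComponentModel, IHandler and IHandlerSelector types below are cut-down stand-ins that mirror only the members used in this post, so that the sample compiles on its own; TenantHandlerSelector, SimpleHandler, the pricing types and the currentTenant delegate are all made up for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Stand-ins mirroring only the members this post uses; not the real Castle types.
public class ComponentModel { public Type Implementation; }
public interface IHandler { ComponentModel ComponentModel { get; } }
public interface IHandlerSelector
{
    bool HasOpinionAbout(string key, Type service);
    IHandler SelectHandler(string key, Type service, IHandler[] handlers);
}

public class SimpleHandler : IHandler
{
    private readonly ComponentModel model;
    public SimpleHandler(Type implementation)
    {
        model = new ComponentModel { Implementation = implementation };
    }
    public ComponentModel ComponentModel { get { return model; } }
}

public interface IPricing { }
public class DefaultPricing : IPricing { }
public class AcmePricing : IPricing { }

// Hypothetical selector: overrides one service for specific tenants.
public class TenantHandlerSelector : IHandlerSelector
{
    private readonly Type serviceType;
    private readonly Func<string> currentTenant;          // how the context is obtained is an assumption
    private readonly IDictionary<string, Type> overrides; // tenant name -> implementation type

    public TenantHandlerSelector(Type serviceType, Func<string> currentTenant,
                                 IDictionary<string, Type> overrides)
    {
        this.serviceType = serviceType;
        this.currentTenant = currentTenant;
        this.overrides = overrides;
    }

    public bool HasOpinionAbout(string key, Type service)
    {
        string tenant = currentTenant();
        // Only speak up for our service, and only for tenants that have an override.
        return service == serviceType && tenant != null && overrides.ContainsKey(tenant);
    }

    public IHandler SelectHandler(string key, Type service, IHandler[] handlers)
    {
        Type implementation = overrides[currentTenant()];
        // Returns null when the override was never registered as a handler.
        return handlers.FirstOrDefault(h => h.ComponentModel.Implementation == implementation);
    }
}
```

With the real Windsor types, such a selector would be registered through container.Kernel.AddHandlerSelector as shown earlier. Tenants without an entry in the overrides map fall through to whatever the container would have chosen by default.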

Have fun...

time to read 5 min | 805 words

I am a big believer in using context in order to drive a system. What do I mean by that?

Note, I am going to talk about the problem in general, and about implementing its solution using Windsor. The example is fictitious and is here to represent the problem in a way that allows me to talk about it in isolation; it doesn't necessarily represent good design.

It seems like just about all the applications that I had to deal with recently had to have the notion of system variability. Now, let us make it clear. System variability is a fancy name for the if statement. The problem with the if statement is that when you have a lot of them, it gets pretty tricky to understand what is going on with the system. That is why a common refactoring is replace conditional with polymorphism.

What I am usually talking about is "when we are in this condition, we should do X, otherwise, we should do Y". Let us take the simple idea of a warehouse service. If we are making a call from the web site, it is okay to return data that may not be accurate to the second. If we are calling from the fulfillment service, we need accurate, up to date results. A simple way of handling this is:

public bool ItemIsPhysicallyOnTheShelve(Guid id)
{
	if(Origin == Originators.Website)//can use caching
	{
		var result = Cache.Get<bool?>("item-on-shelve-" + id);
		if(result.HasValue)
			return result.Value;
	}
	// actual work and putting in cache
}

A more interesting example might be different business rules for order authorization, based on whether we have a strategic customer or not. In both cases, we have some context for the operation that modifies the way that we deal with this operation.

public bool IsValid(Order order, ValidationSummary summary)
{
	IRule[] rules = CurrentCustomer.IsStrategic ?
		strategicCustomerRules : normalCustomerRules;
	foreach(IRule rule in rules)
	{
		rule.Validate(order, summary);
	}
	return !summary.HasErrors; // valid only when no rule reported an error
}

One way of dealing with that is what you see in the code samples: get the state from somewhere and make decisions based on that. Another, more advanced option is to create:

  • IWarehouseService
  • DefaultWarehouseService
  • CachingWarehouseServiceDecorator

And because decorators are really annoying, we will use AOP to deal with it by creating a caching interceptor.

Now the issue is mere configuration, and I can deal with that by flipping bits in the container configuration. The second example can be solved by creating two components with different rule sets and using those. The problem is that this removes the coding issues, but creates more subtle problems that are much harder to deal with.

If I rely on the container configuration alone, I suddenly have logic there. Important business logic. That is not a good idea, I think. Especially since this means that at some point my code has to make an explicit decision about what component to use, and that breaks the infrastructure independence rule.

What this boils down to is that now I have to manage a lot of the complexity in the application using the container configuration, and tie the workings of the system to it. That works if the number of variables that I have to juggle is small, but if I have a lot of axes (the plural of axis) that are orthogonal to one another, it gets complex very fast.

My solution for that problem is to define a service and its context as a cohesive unit. That is, the concept of a service contains its interface, all of its implementations and the business logic required to select which implementation (and configuration) to choose for a given context.

In the warehouse example above, what we will have is:

  • IWarehouseService
  • DefaultWarehouseService
  • WarehouseCachingInterceptor
  • WarehouseInterceptorsSelector

Now all of those are part of the same service. The last one is where we isolate the actual decision about what type of implementation we should get. In this case, we use Windsor's IModelInterceptorsSelector to add additional, context-bound interceptors to the service.

But that is just the interceptors side. What about the selection of the appropriate rules? We can handle that using ISubDependencyResolver, where we can decide how we want to filter the rules that go into IWarehouseService based on the context. For that matter, we might have completely different warehouse implementations, VirtualWarehouseService and PhysicalWarehouseService, and need to select between them based on some business criteria. We handle that using IHandlerSelector, which makes the decision about which component to create.

Again, IHandlerSelector, IModelInterceptorsSelector and ISubDependencyResolver are all Windsor extensibility mechanisms (my next two posts will cover them in detail) that allow us to make the container aware of the context that we have in the application.

The purpose of the explicit notion of context is to allow us to deal with the variability in the application in an explicit manner. And that, in turn, means that we get much better separation of concerns.
