Ayende @ Rahien

Refunds available at head office

SQL CE Transaction Handling

Update: Yes, I am crazy! Turns out that I forgot to do "command.Transaction = tx;", and then I went and read some outdated documentation, and got the completely wrong picture, yuck! I still think that requiring "command.Transaction = tx;" is bad API design and error prone (duh!).

Someone please tell me that I am not crazy. The output of this program is:

Wrote item
Wrote item
Wrote item
3
Wrote item
Wrote item
4
Wrote item
Wrote item
5
Wrote item
Wrote item
6
Wrote item

This is wrong on so many levels...

using System;
using System.Data;
using System.Data.SqlServerCe;
using System.IO;
using System.Threading;

public class Program
{
	const string connectionString = "Data Source=test.dsf";

	public static void Main(string[] args)
	{
		File.Delete("test.dsf");
		var engine = new SqlCeEngine(connectionString);
		engine.CreateDatabase();

		using (var connection = new SqlCeConnection(connectionString))
		{
			connection.Open();
			SqlCeCommand command = connection.CreateCommand();
			command.CommandText = @"CREATE TABLE Test(Id INT IDENTITY PRIMARY KEY, Name NVARCHAR(25) NOT NULL)";
			command.ExecuteNonQuery();
		}

		ThreadPool.QueueUserWorkItem(ReadFromDb);

		using (var connection = new SqlCeConnection(connectionString))
		{
			connection.Open();
			using(IDbTransaction tx = connection.BeginTransaction(IsolationLevel.Serializable))
			{
				while(true)
				{
					using (SqlCeCommand command = connection.CreateCommand())
					{
						// Note (per the update above): command.Transaction = tx; is missing here,
						// so this INSERT does not actually run inside the transaction.
						command.CommandText = @"INSERT INTO Test(Name) VALUES('A');";
						command.ExecuteNonQuery();
					}
					Console.WriteLine("Wrote item");
					Thread.Sleep(500);
				}
			}
		}
	}

	private static void ReadFromDb(object state)
	{
		Thread.Sleep(1000);
		using (var connection = new SqlCeConnection(connectionString))
		{
			connection.Open();
			using (IDbTransaction tx = connection.BeginTransaction(IsolationLevel.Serializable))
			{
				while (true)
				{
					using (SqlCeCommand command = connection.CreateCommand())
					{
						command.CommandText = @"SELECT COUNT(*) FROM Test;";
						Console.WriteLine(command.ExecuteScalar()); 
					}
					Console.WriteLine("Wrote item");
					Thread.Sleep(500);
				}
			}
		}
	}
}
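For completeness, here is what the writer's command block looks like with the fix from the update applied (the cast is needed because the transaction is typed here as IDbTransaction):

using (SqlCeCommand command = connection.CreateCommand())
{
	// The fix per the update: explicitly enlist the command in the transaction.
	command.Transaction = (SqlCeTransaction)tx;
	command.CommandText = @"INSERT INTO Test(Name) VALUES('A');";
	command.ExecuteNonQuery();
}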

NHibernate 2.0 Alpha is out!

image

It gives me great pleasure to announce that NHibernate 2.0 Alpha 1 was released last night and can be downloaded from this location.

We call this an alpha, but many of us are using it in production, so we are quite confident in its stability. The reason that this is an alpha is that we have made a lot of changes in the last nine months (since the last release), and we want to get more real world experience before we ship this. Recent estimates are that about 100,000 lines of code have changed since the last release.

You can see the unofficial change list below. Please note that there are breaking changes in the move to NHibernate 2.0. There are also significant improvements in many areas, as you will see in a moment.

We are particularly interested in hearing about compatibility issues, performance issues and "why doesn't this drat work?" issues.

We offer support for moving to NHibernate 2.0 Alpha on the NHibernate mailing list at: nhusers@googlegroups.com (http://groups.google.com/group/nhusers).

And now, for the changes:

  • New features:
    • Add join mapping element to map one class to several tables
    • <union> tables and <union-subclass> inheritance strategy
    • HQL functions 'current_timestamp', 'str' and 'locate' for PostgreSQL dialect
    • VetoInterceptor - Cancel Calls to Delete, Update, Insert via the IInterceptor Interface
    • Using constants in select clause of HQL
    • Added [ Table per subclass, using a discriminator ] support to NHibernate
    • Added support for paging in sub queries.
    • Auto discovery of types in custom SQL queries
    • Added OnPreLoad & OnPostLoad Lifecycle Events
    • Added ThreadStaticSessionContext
    • Added <on-delete> tag to <key>
    • Added foreign-key="none" since the Parent have not-found="ignore". (not relevant to SQL Server)
    • Added DetachedQuery
    • ExecuteUpdate support for native SQL queries
    • From Hibernate:
      • Ported Actions, Events and Listeners
      • Ported StatelessSession
      • Ported CacheMode
      • Ported Statistics
      • Ported QueryPlan
      • Ported ResultSetWrapper
      • Ported  Structured/Unstructured cache
      • Ported SchemaUpdate
      • Ported Hibernate.UserTypes
      • Ported Hibernate.Mapping
      • Ported Hibernate.Type
      • Ported EntityKey
      • Ported CollectionKey
      • Ported TypedValue
      • Ported SQLExceptionConverter
      • Ported Session Context
      • Ported CascadingAction
  • Breaking changes:
    • Changed NHibernate.Expression namespace to NHibernate.Criterion
    • Changed NHibernate.Property namespace to NHibernate.Properties
    • No AutoFlush outside a transaction - database transactions are never optional; all communication with a database has to occur inside a transaction, whether you read or write data.
    • <nhibernate> section is ignored, using <hibernate-configuration> section (note that they have different XML formats)
    • Configuration values are no longer prefixed by "hibernate.", if before you would specify "hibernate.dialect", now you specify just "dialect"
    • IInterceptor changed to match the Hibernate 3.2 Interceptor - interface changed
    • Will perform validation on all named queries at initialization time, and throw if any is not valid.
    • NHibernate will return long for count(*) queries on SQL Server
    • SaveOrUpdateCopy returns a new instance of the entity without changing the original.
    • INamingStrategy interface changed
    • NHibernate.Search - Moved Index/Store attributes to the Attributes namespace
    • Changes to IType, IEntityPersister, IVersionType - of interest only to people who did crazy stuff with NHibernate.
    • <formula> must contain parenthesis when needed
    • IBatcher interface change
  • Fixed bugs:
    • Fixing bug with HQL queries on map with formula.
    • Fixed exception when the <idbag> has a <composite-element> inside; inside which, has a <many-to-one>
    • Fixed: multi criteria didn't support paging on dialects that don't support limiting the query size using SQL.
    • Fixed an issue with limit string in MsSql2005 dialect sorting incorrectly on machines with multiple processors
    • Fixed an issue with getting NullReferenceException when using SimpleSubqueryExpression within another subexpression
    • Fixed Null Reference Exception when deleting a <list> that has holes in it.
    • Fixed duplicate column name on complex joins of similar tables
    • Fixed MultiQuery force to use parameter in all queries
    • Fixed the concat function failing when a parameter contains a comma and MaxResults is used
    • Fixed failure with Formula when using the paging on MSSQL 2005 dialect
    • Fixed PersistentEnumType incorrectly assumes enum types have zero-value defined
    • Fixed SetMaxResults() returns one less row when SetFirstResult() is not used
    • Fixed Bug in GetLimitString for SQL Server 2005 when ordering by aggregates
    • Fixed SessionImpl.EnableFilter returns wrong filter if already enabled
    • Fixed Generated Id does not work for MySQL
    • Fixed one-to-one can never be lazy
    • Fixed FOR UPDATE statements not generated for pessimistic locking
  • Improvements:
    • Added Guid support for Postgre Dialect
    • Added support for comments in queries
    • Added Merge and Persist to ISession
    • Support IFF for SQL Server
    • IdBag now work with Identity columns
    • Multi Criteria now uses the Result Transformer
    • Handling key-many-to-one && not-found
    • Can now specify that a class is abstract in the mapping.
  • Guidance:
    • Prefer the Restrictions class to the Expression class for defining Criteria queries (see the sketch after this list).
  • Child projects:
    • Added NHibernate.Validator
    • Added NHibernate.Shards
    • NHibernate.Search updated to match Hibernate Search 3.0
  • Criteria API:
    • Allow Inspection, Traversal, Cloning and Transformation for ICriteria and DetachedCriteria
      • Introduced CriteriaTransformer class
      • GetCriteriaByPath, GetCriteriaByAlias
    • Added DetachedCriteria.For<T>
    • Added Multi Criteria
    • Projections can now pass parameters to the generated SQL statement.
    • Added support for calling Sql Functions (HQL concept) from projections (Criteria).
    • Added ConstantProjection
    • Added CastProjection
    • Can use IProjection as a parameter to ICriterion
  • Better validation for proxies:
    • Now supports checking for internal fields as well
    • Updated Castle.DynamicProxy2.dll to have better support for .NET 2.0 SP1
  • SQLite:
    • Support for multi query and multi criteria
    • Supporting subselects and limits
    • Allowed DROP TABLE IF EXISTS semantics
  • PostgreSQL (Npgsql):
    • Enable Multi Query support for PostgreSQL
  • FireBird:
    • Much better overall support
  • Batching:
    • Changed logging to make it clearer that all commands are sent to the database in a single batch.
    • AbstractBatcher now uses the Interceptor to allow the user to intercept and change the SQL before it is prepared
  • Error handling:
    • Better error message on exception in getting values in Int32Type
    • Better error message when using a subquery that contains a reference to a non-existent property
    • Throws a more meaningful exception when calling UniqueResult<T>() with a value type and the query returns null.
    • Overall better error handling
    • Better debug logs
  • Refactoring:
    • Major refactoring internally to use generic collections instead of the non generic types.
    • Major refactoring to the configuration and the parsing of hbm files.
  • Factories:
    • Added ProxyFactoryFactory
    • Added BatchingBatcherFactory
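To illustrate the guidance item above, here is a minimal sketch of a Criteria query using the new Restrictions class (the Customer entity and its properties are hypothetical; the namespace comment reflects the breaking change listed above):

using System.Collections;
using NHibernate;
using NHibernate.Criterion; // this namespace replaces NHibernate.Expression in 2.0

// 'Customer' is a hypothetical mapped entity, used only for illustration.
public class Customer
{
	public virtual int Id { get; set; }
	public virtual string Name { get; set; }
	public virtual int Age { get; set; }
}

public class CriteriaExample
{
	public static IList AdultsNamed(ISession session, string name)
	{
		return session.CreateCriteria(typeof(Customer))
			.Add(Restrictions.Eq("Name", name)) // was Expression.Eq before the rename
			.Add(Restrictions.Gt("Age", 18))
			.List();
	}
}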

The quote generation DSL

I am doing some work on the DSL book right now, and I ran into this example, which is simply too delicious not to post about.

Assume that you have the following UI, which you use to let a salesperson generate a quote for your system.

image

This is much more than just a UI issue, to be clear. You have a fully fledged logic system here. Calculating the total cost is the easy part; first you have to understand what you need.

Let us define a set of rules for the application; it will be clearer when we have the list in front of us:

  • The Salary module requires a machine per every 150 users.
  • The Taxes module requires a machine per 50 users.
  • The Vacations module requires the Scheduling Work module.
  • The Vacations module requires the External Connections module.
  • The Pension Plans module requires the External Connections module.
  • The Pension Plans module must be on the same machine as the Health Insurance module.
  • The Health Insurance module requires the External Connections module.
  • The Recruiting module requires a connection to the internet, and therefore requires a firewall from the recommended list.
  • The Employee Monitoring module requires the CompMonitor component.

Of course, this fictitious sample is still too simple; we could probably sit down and come up with fifty or so more rules that we need to handle. Just handling the second level dependencies (External Connections, CompMonitor, etc.) would be a big task, for example.

Assume that you have not a single such system, but 50 of them. I know of a company that spent 10 years and has 100,000 lines of C++ code (that implements a poorly performing Lisp machine, of course) to solve this issue.

My solution?

specification @vacations:
	requires @scheduling_work
	requires @external_connections
	
specification @salary:
	users_per_machine 150
	
specification @taxes:
	users_per_machine 50

specification @pension:
	same_machine_as @health_insurance

Why do we need a DSL for this? Isn't this a good candidate for a data storage system? It seems to me that we could have expressed the same ideas with XML (or a database, etc.) just as easily. Here is the same concept, now expressed in XML.

<specification name="vacation">
	<requires name="scheduling_work"/>
	<requires name="external_connections"/>
</specification>

<specification name="salary">
	<users_per_machine value="150"/>
</specification>

<specification name="taxes">
	<users_per_machine value="50"/>
</specification>

<specification name="pension">
	<same_machine_as name="health_insurance"/>
</specification>

That is a one to one translation of the two, so why do I need a DSL here?

Personally, I think that the DSL syntax is nicer, and the amount of work to get from a DSL to the object model is very small compared to the work required to translate to the same object model from XML.
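For reference, here is a minimal sketch of the kind of object model that either format would be translated into; every name in it is an assumption for illustration:

using System.Collections.Generic;

// Hypothetical target object model for the specifications above.
public class ModuleSpecification
{
	public string Name;
	public readonly List<string> Requires = new List<string>();
	public int? UsersPerMachine;  // null when the module imposes no machine ratio
	public string SameMachineAs;  // null when there is no co-location constraint
}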

That is mostly a personal opinion, however. For a purely declarative DSL, we are comparable with XML in almost everything. It gets interesting when we decide that we don't want this purity. Let us add a new rule to the mix, shall we?

  • The Pension Plans module must be on the same machine as the Health Insurance module, if the user count is less than 500.
  • The Pension Plans module requires a distributed messaging backend, if the user count is greater than 500.

Trying to express that in XML can be a real pain. In fact, it means that we are trying to shove programming concepts into the XML, which is always a bad idea. We could try to put this logic in the quote generation engine, but that is complicating it with no good reason, tying it to the specific application that we are using, and in general making a mess.

Using our DSL (with no modification needed), we can write it:

specification @pension: 
	if information.UserCount < 500: 
		same_machine_as @health_insurance 
	else: 	
		requires @distributed_messaging_backend

As you can imagine, once you have run all the rules in the DSL, you are left with a very simple problem to solve, with all the parameters well known.

In fact, throughout the process, there isn't a single place of overwhelming complexity.

I like that.

Attack of the virtual machines

I just found out that I have too many virtual machines around. Right now I am working with three, plus the host machine. I have the Fusion VM with Windows 2008 on the Mac as well, come to think of it.

image

A web server in 30 lines of code

Just found myself writing that, and it was amusing.

import System.Net
import System.IO

if argv.Length != 2:
	print "You must pass [prefix] [path] as parameters"
	return

prefix = argv[0]
path = argv[1]

if not Directory.Exists(path):
	print "Could not find ${path}"
	return

listener = HttpListener()
listener.Prefixes.Add(prefix)
listener.Start()

while true:
	context = listener.GetContext()
	file = Path.GetFileName(context.Request.RawUrl)
	fullPath = Path.Combine(path, file)
	if File.Exists(fullPath):
		context.Response.AddHeader("Content-Disposition","attachment; filename=${file}")
		bytes = File.ReadAllBytes(fullPath)
		context.Response.OutputStream.Write(bytes, 0, bytes.Length)
		context.Response.OutputStream.Flush()
		context.Response.Close()
	else:
		context.Response.StatusCode = 404
		context.Response.Close()

Craig Neuwirt has a blog

Well, after a long time of bugging him about it, I finally decided to give Craig the first Hostile Blogging Award. So Craig has a blog now, which is wonderful.

Who is Craig and why should you care to read what he is thinking about?

  • A friend
  • Committer to both Castle Project and Rhino Tools
  • Main guy behind Binsor 2.0
  • Main guy behind Zero Config WCF
  • All around interesting guy

Subscribed, and very excited.


Hibernating Rhinos #8 - Going Distributed & Building our own Bus

image

Well, I was toying with the idea for about a month or so, and finally I got around to actually recording & editing it.

Highlights:

  • Vastly improved sound quality. I think you'll enjoy it.
  • Vastly extended in time & scope. For some reason, this screencast is longer than many full length movies. We also write our own bus implementation from scratch, and discuss the implementation details there.
  • This is more of a low level discussion, not a high level architectural discussion about why you want a bus (well, I do talk about it a bit, but mostly we implement the bus).
  • The first 45 minutes are dedicated to moving from an old style RPC approach to an async batching bus that still uses request / reply. The rest is dedicated to building the one way, message passing, queue based service bus.
    • There are some interesting challenges there, and I hope you'll make sense of my grunts as I write the code.
    • The last hour or so of the screencast is live coding, and you get to see how I revert some design decisions as they turn out to be problematic.

The technical details:

  • Total length: an hour and forty minutes(!)
  • Size: 160 MB
  • Code starts at 04:31

Go to download page

Annoying

To record a full screencast and realize, 90% of the way through, that the microphone wasn't recording.

WCF Async without proxies

I don't like generated proxies for web services; they are generally ugly and not fun to work with. However, up until recently I believed that I had to deal with them if I wanted to use the async operations for web services. As it turns out, I was wrong.

We can actually define a WCF service interface like this:

[ServiceContract]
public interface IAsyncBus
{
	[OperationContract(AsyncPattern = true)]
	IAsyncResult BeginProcess(IMessage[] request, AsyncCallback callback, object asyncState);

	IMessage[] EndProcess(IAsyncResult result);
}

Now you can work with it using:

IAsyncBus bus = new ChannelFactory<IAsyncBus>().CreateChannel();
IAsyncResult ar = bus.BeginProcess(...);
//do work
bus.EndProcess(ar);

The problem with that is that on the server side, I also have to do things in an async manner. This is sometimes appropriate, but it tends to be a major PITA for a lot of things.

As it turns out, we can solve the issue with aliasing. In the interface dll, we can define:

[ServiceContract]
public interface IBus
{
	[OperationContract]
	IMessage[] Process(IMessage[] request);
}

[ServiceContract(Name="IBus")]
public interface IAsyncBus
{
	[OperationContract(AsyncPattern = true)]
	IAsyncResult BeginProcess(IMessage[] request, AsyncCallback callback, object asyncState);
	IMessage[] EndProcess(IAsyncResult result);
}

Now, you can create an instance of IAsyncBus to communicate with IBus directly. On the server side, we implement IBus and handle the message in a synchronous manner. Easy, simple, and doesn't require any proxies :-)
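A minimal sketch of what that synchronous server side might look like (the actual handling logic is elided; only the shape matters here):

// Hypothetical synchronous implementation of the IBus contract above.
public class Bus : IBus
{
	public IMessage[] Process(IMessage[] request)
	{
		// handle each incoming message synchronously
		// and return whatever replies apply
		return new IMessage[0];
	}
}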

Code Generation and the Open Closed Principle

Vladan Strigo made a really inspiring comment about code generation in the NH Users list.

I didn't want to use codesmith for that because it would de-OCP-fy me in my future efforts :)

That makes a lot of sense, and manages to touch on what bothers me the most about code gen as an architectural approach. It gives up a very important concept, and that affects a lot of the stuff that depends on it.

The problem of over aggressive caching

Following the recent profiling effort, I decided to put far more aggressive caching into SvnBridge.

I set it up so it would cache the full revision from TFS on any query, and then serve it from the cache. When I ran it against the test server, it worked beautifully. Then I ran into this issue:

image

I am pretty sure that this is not going to be an acceptable scenario. To be exact, I would find it acceptable if it were a one time cost, but the problem is that this is a cost that you have to pay per revision, and that is unacceptable. The major problem is that this uses the underlying QueryItems() method, which returns all of the results, including those from previous revisions. This means that on a busy server (like tfs03), the cost of doing such a query is high.

The number of files returned is actually pretty small (910 in this case), but I assume that it has to check all the files on the server for permissions before it allows me to get them.

I wonder how Rhino Security would handle this situation; it wouldn't even get the data out of the DB, and the query enhancement is pretty lightweight. I assume it would be pretty fast.

Anyway, this is obviously a bad approach. For now, I made it load only the path (and its descendants) that we need, which means that we don't get the same benefit of preemptive caching and might talk to the server a bit too much. However, it turns out that SvnBridge and SVN make requests in a way that makes this style of caching work fairly well. We always ask for the directory before asking for the descendants, and we have fairly long conversations about the same revision, so that is a good candidate.

That isn't optimal for big projects, with a lot of files and a lot of activity, however. Because of the way I handle it now, we download the entire project metadata for each revision; that can be a lot for those kinds of projects, and having to download it each and every time is a waste.

SvnBridge already contains a very smart piece of code (the UpdateDiffCalculator class) that can figure out the differences between two revisions and only get the items that it needs. The problem is that the caching layer is built mainly in order to support that class.

I think that I'll need to get a bit smarter about this in the future, but for now it seems to be doing the work very well.


SvnBridge and Multiple TFS Servers

The most highly rated request for SvnBridge was adding support for multiple TFS servers without having to run multiple instances of SvnBridge, which was a hassle.

Today I finished working on the implementation, and the UI got a new check box:

image

Funny, but this took almost three days to implement.

Yes, I don't do WinForms much, and I am pretty sure somewhere there is a way to do checkboxes, but I like writing code.*

What actually took a lot of time was working on the path routing inside SvnBridge, so it would send the proper response to the client, which would return it right back to it. The SVN WebDAV protocol is full of indirection, and that is putting it mildly.

At any rate, this now works, and you can use a single instance.

Now, how do I do that? Well, using the Microsoft Mind Reading Patent technology, of course:

image

I mean, don't you know that telepathy enabled software is the next big thing?

Now, all you have to do is start SvnBridge, and you can access any TFS server in your possession.

Just issue this command to get the SvnBridge code:

svn checkout http://localhost:8081/tfs03.codeplex.com/SvnBridge

Note the tfs03.codeplex.com in the URL? This has nothing to do with how we specify which server we use. I am using mind reading technology, I remind you.

If we want to specify a port, we will use:

svn checkout http://localhost:8081/my-team-system-server:8080/MyProject

SvnBridge will also auto detect http and https, so you don't have to worry about this.

* No, I didn't spend three days writing my CheckboxControl.

Find out the right directory

Another interesting challenge. Given the following API:

public interface IFileRepository
{
	void Save(IFile file);

	IFile[] GetFiles(string path, Recursion recursion);
}

public enum Recursion
{
	None,
	OneLevel,
	Full
}

And assuming that you are saving to a relational database, how are you going to build the GetFiles method?

Oh, and just to make things interesting, Save(IFile) may only perform a single SQL INSERT statement.
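One possible approach (my sketch, not necessarily the intended answer): store the full path in a single column, which keeps Save down to a single INSERT, and express each recursion level as a LIKE pattern. All table and column names here are assumptions.

// Hypothetical sketch: a Files table with a Path column, queried per recursion level.
private static string BuildGetFilesSql(Recursion recursion)
{
	switch (recursion)
	{
		case Recursion.None:
			// only the directory itself
			return "SELECT * FROM Files WHERE Path = @path";
		case Recursion.OneLevel:
			// direct children: exactly one more path segment, no deeper
			return "SELECT * FROM Files WHERE Path LIKE @path + '/%' " +
			       "AND Path NOT LIKE @path + '/%/%'";
		default: // Recursion.Full
			// the directory and everything under it
			return "SELECT * FROM Files WHERE Path = @path OR Path LIKE @path + '/%'";
	}
}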

ReSharper is smarter than me

Given the following block of code:

if (presenter.GetServerUrlFromRequest!=null)
	GetServerUrlFromRequest.Checked = presenter.GetServerUrlFromRequest.Value;
else
	GetServerUrlFromRequest.Checked = true;

ReSharper gave me the following advice:

image

And turned the code to this piece:

GetServerUrlFromRequest.Checked = !(presenter.GetServerUrlFromRequest!=null) || 
presenter.GetServerUrlFromRequest.Value;

And while it has the same semantics, I actually had to decipher the code to figure out what it was doing.

I chose to keep the old version.
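For what it's worth, assuming presenter.GetServerUrlFromRequest is a bool?, the null-coalescing operator would have expressed the same logic more readably than either version:

// Equivalent to both versions above, assuming the property is a Nullable<bool>.
GetServerUrlFromRequest.Checked = presenter.GetServerUrlFromRequest ?? true;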

Profiling with dotTrace

I have a tiny feature and a bug fix that I want to implement before I focus solely on improving SvnBridge performance. This is a really quick analysis of a single scenario.

Start dotTrace, set the application to profile, and then start it. There are a lot of options, but the default has always been good for me.

image

In the application, prepare it for the scenario that you are going to perform. (In SvnBridge's case, this means just setting up the server to talk to):

image

Perform some actions against the application:

image

When you are done, hit Get Snapshot:

image

Now, I tend to go to the hotspots and check what is costing me.

image

The first line, the WaitMessage call, is fine; this is the WinForms client, which is not doing much at the moment.

The second I can't really figure out; it is starting a thread, but that shouldn't take so long. I am pretty sure that this is a case of mis-measuring, or just me not understanding this, but never mind that.

The rest of the calls, which take a huge amount of time, are... remote calls. I guess I should stop talking to TFS :-)

Adaptive Domain Models with Rhino Commons

Udi Dahan has been talking about this for a while now. As usual, he makes sense, but I am working in a different enough context that it takes time to assimilate it.

At any rate, we have been talking about this for a few days, and I finally sat down and decided that I really need to look at it with code. The result of that experiment is that I like this approach, but am still not 100% sold.

The first idea is that we need to decouple the service layer from our domain implementation. But why? The domain layer is under the service layer, after all; surely the service layer should be able to reference the domain. The reasoning here is that the domain model plays several different roles in most applications. It is the preferred way to access our persistent information (though it should not be aware of persistence), it is the central place for business logic, it is the representation of our notions about the domain, and much more that I am probably leaving aside.

The problem is that there is a dissonance between these requirements. Let us take a simple example of an Order entity.

image

As you can see, Order has several things that I can do. It can accept a new line, and it can calculate the total cost of the order.

But those are two distinct responsibilities that are based on the same entity. What is more, they have completely different persistence related requirements.

I talked about this issue here, over a year ago.

So, we need to split the responsibilities, so that we can take care of each of them independently. But it doesn't make sense to split the Order entity, so instead we will introduce purpose driven interfaces. Now, when we want to talk about the domain, we can view certain aspects of the Order entity in isolation.

This leads us to the following design:

image

And now we can refer to the separate responsibilities independently. Doing this based on the type opens up the non invasive API approaches that I talked about before. You can read Udi's posts about it to learn more about the concepts. Right now I am more interested in discussing the implementation.

First, the unit of abstraction that we work with is IRepository&lt;T&gt;, as always.

The major change is introducing the idea of a ConcreteType on the repository. Now it will try to use the ConcreteType instead of the given typeof(T) that it was created with. This affects all queries done with the repository (of course, if you don't specify ConcreteType, nothing changes).

The repository got a single new method:

T Create();

This allows you to create new instances of the entity without knowing its concrete type. And that is basically it.

Well, not really :-)

I introduced two other concepts as well.

public interface IFetchingStrategy<T>
{
	ICriteria Apply(ICriteria criteria);
}

IFetchingStrategy can interfere in the way queries are constructed. As a simple example, you could build a strategy that forces eager loading of the OrderLines collection when the IOrderCostCalculator is being queried, as in the sketch below.
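A minimal sketch of such a strategy, assuming OrderLines is the mapped name of the collection:

using NHibernate;

// Hypothetical strategy: whenever IOrderCostCalculator is queried,
// fetch the OrderLines collection eagerly in the same query.
public class EagerOrderLinesFetchingStrategy : IFetchingStrategy<IOrderCostCalculator>
{
	public ICriteria Apply(ICriteria criteria)
	{
		return criteria.SetFetchMode("OrderLines", FetchMode.Eager);
	}
}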

There is no complex configuration involved in setting up an IFetchingStrategy. All you need to do is register your strategies in the container, and let the repository do the rest.

However, doesn't this mean that we now need to explicitly register repositories for all our entities (and for all their interfaces)?

Well, yes, but no. Technically we need to do that, but we have help: EntitiesToRepositories.Register. We can just put the following line somewhere in the application startup and we are done.

EntitiesToRepositories.Register(
	IoC.Container, 
	UnitOfWork.CurrentSession.SessionFactory, 
	typeof (NHRepository<>),
	typeof (IOrderCostCalculator).Assembly);

And this is it; you can start working with this new paradigm with no extra steps.
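Hypothetical usage after that registration call (the names follow the example above; the resolution call itself is a sketch):

// Resolve the repository for the role interface; queries will run
// against the concrete Order type behind the scenes.
IRepository<IOrderCostCalculator> calculators =
	IoC.Container.Resolve<IRepository<IOrderCostCalculator>>();
IOrderCostCalculator calculator = calculators.Create();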

As a side benefit, this really paves the way to complex multi tenant applications.

Elegant Code

I just finished writing this, and I find it very pleasing to look at:

public class EntitiesToRepositories
{
	public static void Register(
		IWindsorContainer windsorContainer,
		ISessionFactory sessionFactory,
		Type repository,
		params Assembly[] assemblies
		)
	{
		if (typeof(IRepository<>).IsAssignableFrom(repository))
			throw new ArgumentException("Repository must be a type inheriting from IRepository<T>, " +
				"and must be an open generic type. Sample: typeof(NHRepository<>).");

		foreach (IClassMetadata meta in sessionFactory.GetAllClassMetadata().Values)
		{
			Type mappedClass = meta.GetMappedClass(EntityMode.Poco);
			if (mappedClass == null)
				continue;
			foreach (Type interfaceType in mappedClass.GetInterfaces())
			{
				if (IsDefinedInAssemblies(interfaceType, assemblies) == false)
					continue;
				windsorContainer.Register(
					Component.For(typeof(IRepository<>).MakeGenericType(interfaceType))
						.ImplementedBy(repository.MakeGenericType(interfaceType))
						.CustomDependencies(Property.ForKey("ConcreteType").Eq(mappedClass))
					);
			}
		}
	}

	private static bool IsDefinedInAssemblies(Type type, Assembly[] assemblies)
	{
		return Array.IndexOf(assemblies, type.Assembly) != -1;
	}
}

Upfront Optimizations

Developers tend to be micro-optimizers by default, in most cases. In general, it is accepted that this is a Bad Thing. This is a very common quote:

image

In my last project, I wasn't willing to allow discussion of the performance of the application until we got to the final QA stages. (We found exactly two bottlenecks in the application, by the way, and it cost us 1 hour and 1 day to fix them.) You could say that I really believe that premature optimization is a problem.

However, the one thing that I will think about in advance (hopefully far in advance) is reducing the number of remote calls.

image

This is something that you should think of in advance, because a remote call is several orders of magnitude more expensive than just about anything else that you can do in your application, except maybe huge prime generation.

In general, you need to think about two things:

  • How can we reduce remote calls?
  • How can we fail (in development) when we exceed some number of remote calls per unit of work?

What I have found is that batching makes for a really nice model for reducing remote calls, and that failing on a high number of remote calls is the most effective way to keep their number down. A sketch of that idea follows.
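Here is a minimal sketch of failing in development when the budget is exceeded; every name is an assumption, and the real hook would live wherever your remote calls already go through:

using System;

// Hypothetical remote-call budget: count calls per unit of work and
// fail loudly in development builds when the budget is exceeded.
public static class RemoteCallBudget
{
	[ThreadStatic] private static int callCount;
	private const int MaxCallsPerUnitOfWork = 10; // tune per application

	public static void StartUnitOfWork()
	{
		callCount = 0;
	}

	public static void OnRemoteCall()
	{
		callCount++;
#if DEBUG
		if (callCount > MaxCallsPerUnitOfWork)
			throw new InvalidOperationException(
				"Exceeded " + MaxCallsPerUnitOfWork + " remote calls in a single unit of work");
#endif
	}
}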

This is generally not something that you can retrofit into a system; the model of work is completely different. You can try, but you will end up with a Frankenstein that is going to be slow and hard to work with.

So, at any rate, this is what I wanted to say. Ignore performance until very late in the game, but do think about remote calls and the distribution of the application components as early as possible.

No, this is not BDUF; it is setting the architecture for the application, and having the right fundamental approach as early as possible.


The devil is in the details

image

Quite often, I hear about a new approach or framework, and they are interesting concepts, with obviously trivial implementations.

In fact, the way I usually learn about a new approach is by writing my own spike to explore how such a thing can work. (As an alternative, I will go and read the code, if it is an OSS project.)

The problem is that obviously trivial implementations tend to ignore a lot of the fine details. Even for the trivial things, there are a lot of details to consider before you can call something production ready.

Production ready is a tough concept to define, but it generally means that you have taken into account things like:

  • Performance
  • Error handling
  • Security

And many, many more.

One of the release criteria for Rhino Mocks, as a simple example, is the ability to mock IE COM interfaces. That is actually much tougher than it looks (go figure out what RVA is and come back to tell me).

But mocking is a complex issue as it is, so let us take unit testing as an example; it is a very simple concept, after all. How hard is it to build? Let us see:

using System;
using System.Reflection;

public class MyUnitTestingFramework
{
	public static void Main(string[] args)
	{
		foreach(string file in args)
		{
			foreach(Type type in Assembly.LoadFrom(file).GetTypes())
			{
				foreach(MethodInfo method in type.GetMethods())
				{
					if(IsTest(method)==false)
						continue;
					try
					{
						object instance = Activator.CreateInstance(type);
						method.Invoke(instance, new object[0]);
					}
					catch(Exception e)
					{
						Console.WriteLine("Test failed {0}, because: {1}", method, e);
					}
				}
			}
		}
	}
	
	public static bool IsTest(MethodInfo method)
	{
		foreach(object att in method.GetCustomAttributes(true))
		{
			if(att.GetType().Name == "Test")
				return true;
		}
		return false;
	}
}

Yeah, I have created Rhino Unit!

Well, hardly.

I can write an OR/M, an IoC container, and proxies in an hour (each, not all of them). Knowing how to do this is important, but doing so for real is generally a mistake. I am currently using a homegrown, written-in-an-hour IoC container, and I really don't like it. I keep hitting its limitations, and the cost of actually getting a container up to par with the standard expectations is huge. I know that I can't use a written-in-an-hour OR/M without sedation.

Here is the general outline of the cost to feature ratio for doing this on any of the high end tools:

image

There really isn't much complexity involved, just a lot of small details that you need to take care of, which is where all the cost goes.

One thing that I am not telling you is that I didn't show the entire graph. Here is the same graph, this time with more of the timeline shown. After a while, the complexity tends to go away (if it doesn't, the project fails and dies; see the previous graph).

image

Nevertheless, the initial cost of rolling your own like this is significant. This is also the reason why I wince every time someone tells me that they just write their own everything. (Yeah, I know, I am not one who can point fingers. I think that I mentioned before that I feel no overwhelming need to be self-consistent.)

My own response to those graphs tends to follow the same path almost invariably:

  • Build my own spike code to understand the challenges involved in building something like this.
  • Evaluate existing players against my criteria.
  • Evaluate compatibility with the rest of my stack.

If it passes my criteria, I'll tend to use it over building my own. If it doesn't, well, at least I have a good reason to build my own.

Programming to learn is a good practice, one that is important not to lose. At the same time, it is crucial to remember that you shouldn't take out the toys to play in front of the big boys unless you carry a big stick and are not afraid to use it.