Ayende @ Rahien


Must resist... decoding

Rob Conery has posted an interesting poem. I wouldn't really mind, except that he posted it in binary, which meant that I really had to figure it out.

Can you resist the urge

1110111 1101000 1111001 100000 1100100 1101001 1100100 100000 1111001 1101111 1110101 100000 1101101 1100001 1101011 1100101 100000 1101101 1100101 100000 1100100 1100101 1100011 1101111 1100100 1100101 100000 1100010 1101001 1101110 1100001 1110010 1111001 100000 1100100 1100001 1110100 1100001 111111
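For the curious, the decoding itself is trivial; here is a minimal C# sketch (each space-separated group is a base-2 ASCII code; the string below is just the first three groups from above, so as not to spoil the message):

string binary = "1110111 1101000 1111001"; // the first three groups of the message
foreach (string bits in binary.Split(' '))
{
	// parse the group as a base-2 number and cast it to its ASCII character
	Console.Write((char) Convert.ToInt32(bits, 2));
}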

Quotes from class

I am not sure why, but we had some hilarious quotes today at class:

When I talked about Erlang and showed an Erlang echo program in about 20 lines of code:
Student: Echo is a single word in DOS

I was cackling about that for a good five minutes, but this is the one that really had me in stitches:

Student: My wife made my code cleaner, she put periods instead of semicolons and straightened the indentation. She didn't understand why I am so messy when I write, nor why I wasn't happy that she "cleaned it up"


ReSharper: Showing the flag

And this is for the guys from JetBrains, for making such a good and essential product.

Now all I need is an IDE from them, and I would be happy.


Binsor 2.0

All credit should go to Craig Neuwirt, for some amazing feats of syntax. He has managed to extend Binsor to include practically all of Windsor's configuration schema. This is important because previously we had to resort to either manual / ugly stuff or go back to XML (see: manual & ugly).

This is important because some of the more interesting things that you can do with Windsor are done using the facilities, and Craig has made sure that Binsor will support the main ones in a natural manner, including out of the box support for the standard configuration model.

Let us start by using the standard configuration model for a moment. The Active Record Integration Facility expects to be configured from XML, but we can configure it like this:

facility arintegration, ActiveRecordFacility:
	configuration:
		@isWeb = true, isDebug = true
		assemblies = [ Assembly.Load("My.Assembly") ]
		config(keyvalues, item: add):
			show_sql = true
			command_timeout = 5000
			connection.isolation = 'ReadCommitted'
			cache.use_query_cache = false
			dialect = 'NHibernate.Dialect.MsSql2005Dialect'
			connection.provider = 'NHibernate.Connection.DriverConnectionProvider'
			connection.driver_class = 'NHibernate.Driver.SqlClientDriver'
			connection.connection_string_name = 'MyDatabase'
		end
	end

To compare, here is the equivalent XML configuration.

<facility id="arintegration" 
	type="Castle.Facilities.ActiveRecordIntegration.ActiveRecordFacility, Castle.Facilities.ActiveRecordIntegration" 
	isWeb="true" 
	createSchema="true">
 <assemblies>
	<item>My.Assembly</item>
 </assemblies>
 <config>
	<add key="hibernate.cache.use_query_cache" value="true" />
	<add key="hibernate.connection.isolation" value="ReadCommitted" />
	<add key="hibernate.show_sql" value="false" />
	<add key="hibernate.dialect" value="NHibernate.Dialect.MsSql2005Dialect" />
	<add key="hibernate.connection.driver_class" value="NHibernate.Driver.SqlClientDriver" />
	<add key="hibernate.connection.connection_string_name" value="MyDatabase" />
	<add key="hibernate.connection.provider" value="NHibernate.Connection.DriverConnectionProvider" />
 </config>
</facility>

We don't get any significant reduction in the number of lines here, mostly because we need to specify a lot of items for the facility to work, but we do get a significant reduction in the noise that we have to deal with.

But, frankly, this isn't the best sample. Let us take a look at the shortcuts that Binsor now provides, shall we?

Event Wiring

XML configuration:

<component
	id="SimpleListener"
	type="Castle.Facilities.EventWiring.Tests.Model.SimpleListener, Castle.Facilities.EventWiring.Tests" />

<component
	id="SimplePublisher"
	type="Castle.Facilities.EventWiring.Tests.Model.SimplePublisher, Castle.Facilities.EventWiring.Tests" >
	<subscribers>
		<subscriber id="SimpleListener" event="Event" handler="OnPublish"/>
	</subscribers>
</component>

And the Binsor configuration to match it:

component simple_listener, SimpleListener

component simple_publisher, SimplePublisher:
	wireEvent Publish:
		to @simple_listener.OnPublish

Now that is far clearer, isn't it?
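For context, the publisher / listener pair could look something like this. This is my own sketch, not the actual Castle test model, assuming a plain .NET event and a matching handler signature:

using System;

public class SimplePublisher
{
	// the facility subscribes listeners to this event when the component is resolved
	public event EventHandler Publish;

	public void Trigger()
	{
		if (Publish != null)
			Publish(this, EventArgs.Empty);
	}
}

public class SimpleListener
{
	public bool WasNotified;

	// wired by the facility as the handler for the publisher's event
	public void OnPublish(object sender, EventArgs e)
	{
		WasNotified = true;
	}
}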

Factory Support

XML configuration:

<component id="mycompfactory" 
      type="Company.Components.MyCompFactory, Company.Components"/>

<component id="mycomp" 
	type="Company.Components.MyComp, Company.Components"
	factoryId="mycompfactory" 
	factoryCreate="Create" />

Binsor configuration:

component mycompfactory, MyCompFactory

component mycomp, MyComp:
	createUsing @mycompfactory.Create
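The factory itself is just a plain class; something along these lines (my sketch of what the test model presumably looks like):

public class MyComp { }

public class MyCompFactory
{
	// Windsor calls this method to obtain MyComp instances,
	// instead of invoking MyComp's constructor directly
	public MyComp Create()
	{
		return new MyComp();
	}
}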

Other new features:

  • Integrating Castle Resources, so now we can use Binsor over several files, which can be located anywhere, including assembly resources, for instance.
  • Easy support for startable components and lifestyles.
  • Extension points for future needs.

Analyzing a DSL implementation

You were just handed a strange DSL implementation. It does stuff, and it may be cool, but you have no idea how it works.

Craig Neuwirt recently did a major overhaul of Windsor ( I am going to post soon with the details of how cool it is now ), but I am now faced with the question: how do you grok such a thing? I thought that it would be useful to put up a list of what I am doing to understand how the internals now work.

It goes without saying, but the nitpickers will ask, that a DSL implementation is code like any other, and as such, it can have good & bad implementations. I think that what Craig has done is amazing. It makes for interesting reading; I actually had to take notes, to make sure that I will not miss any of the cool stuff.

  • Understand the domain.
    This is critical, because if the DSL is chicken scratching to you, you won't be able to get the implementation.
  • Get the tests.
    First things first, make sure that you have tests that ensure that you can understand what the DSL is supposed to do. In addition to that, it allows you to debug a simplified scenario repeatedly, so you can walk through that and figure out what is going on.
  • Take the simplest scenario and run it, then open the resulting DLL in Reflector.
    This is important, because you need to be able to see the differences between the DSL and the actual executed code. This allows you to understand at a deep level what the various elements of the language do.
  • Identify a transformation, and follow how it is being built throughout the code
    It will allow you to understand the concepts and patterns used through the code.
  • Follow each transformation throughout its own code path only.
    Fairly often you can get sidetracked by trying to put the entire program in your head; this is usually not possible. Track a single path through the code to completion. After that, you can start tracking other paths, but it is important to have an understanding of the full cycle first, and then expand from that.

Hm, not really that different from how I would approach an unfamiliar code base, come to think about it.

Cross Site Scripting

So I had to do it today. I had two pages, in two unrelated domains (foo.com and bar.com), and I had to open a page from one and interact with it. Security constraints disallow this, unfortunately. There are all sorts of ways around it, mostly focusing on proxies, but I didn't want to get into that for a simple page, so I decided to write my own stupid method to do it.

From foo.com, the calling page:

var url = "http://www.bar.com/someImportantPage.castle?id=15&onCloseRedirectTo=" +
	encodeURIComponent(window.location.href + "&returnUrl=" + encodeURIComponent(window.location.href));
window.open(url);

And using JS injection for the called page (I have some limited control there), I put:

if(window.opener)
{
	var oldClose = window.close;
	window.close = function()
	{
		if(window.opener && window.returnValue )
		{
			var url = decodeURIComponent($.getURLParam('onCloseRedirectTo')) + 
							"&idToAdd=" + window.returnValue;
			window.opener.location.href = url;
		}
		oldClose();
	};
}

And voila, it works. I'll leave the how as an exercise for the reader. Suffice to say that if you want to add a local iframe to the mix, you can even get it to work in an "ajaxian" fashion.

Generics vs. Duplication

I need to express the same logic over two different types. The code is literally the same in 90% of the cases, with minor modifications for each type. Seeking to avoid copy & paste programming, I decided to use some generics:

public class BaseSpecificationController<TSpecification, TSpecificationResult, TEntity> : BaseController
	where TSpecification : BaseSpecification<TEntity, TSpecificationResult>
	where TSpecificationResult : BaseSavedResult

The thing is, I got there, and it really is saving a lot of duplicated code. But still, I can't help but dislike such code; it seems like it is cheating.
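To be fair, the payoff shows up at the usage site, where a concrete controller collapses to little more than a declaration (the names here are hypothetical):

public class JobOpeningSearchController :
	BaseSpecificationController<OpeningSpecification, SavedOpeningResult, JobOpening>
{
	// all the shared search / persistence plumbing comes from the base class
}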


Memorable Quotes

From the Castle Dev Mailing list:

Ken Egozi: When someone would build a generator that can generate a domain model by listening to a white-board design meeting, that would be a craic.

David Marzo: There is already one -> good software developer ;-)

More generics gotchas, argh!

I have been working with .NET 2.0 for a long time now, and generics still manage to trip me up.

Consider this piece of code:

public interface IFoo<TFromInterface> { }

public class Foo<TFromClass> : IFoo<TFromClass> { }

[Test]
public void More_Generics_Gotcha()
{
	Assert.AreEqual(typeof(IFoo<>), 
		typeof(Foo<>).GetInterfaces()[0]);
}

This test fails. The returned System.Type instance is IFoo<TFromClass>, an open generic type constructed over the class's own type parameter, which is not the same as the generic type definition IFoo<> itself. Now I need to apologize to Windsor; it wasn't its fault all along.
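The way to make the comparison pass is to go from the open type back to its generic type definition:

[Test]
public void More_Generics_Gotcha_Passes()
{
	Assert.AreEqual(typeof(IFoo<>),
		typeof(Foo<>).GetInterfaces()[0].GetGenericTypeDefinition());
}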

Specifying Specifications

Specification: An explicit statement of the required characteristics for a matching entity / object.

Specifications are very useful concepts when you need to think about searches. They are useful because they separate what you are searching for from how you are searching.

In the example to the right, we have a specification for a job opening. We can use it to search for matching job openings. But couldn't we do it directly? Yes we could, but then we would be dealing with both what and how at the same time.

Let us say that we want to search for an opening matching a male candidate, age 28, shall we? How do we handle that?

Pretty simple, actually:

OpeningSpecification spec = new OpeningSpecification();
spec.Age = 28;
spec.Gender = Gender.Male;
spec.FindAll(openingRepository);

The specification is responsible for translating that into a query that returns the correct result, and this translation can be wildly different from the specification itself. Take the age, for example. We want openings that match a person whose age is 28, but an opening doesn't have a DesiredAge property; it has the min / max ages that it allows.

That means that searching for a 28 year old male requires excluding every opening whose min / max age range would reject someone who is 28. But we don't care about that when we use the specification, we just want to find the matching openings.

From an implementation perspective, we use the property setters as a way to incrementally build the query. Here is how I built the Age property:

[Property]
public virtual int? Age
{
	get { return age; }
	set
	{
		age = value;
		if (value == null)
			return;
		// the opening's allowed age range must include the candidate's age
		criteria
			.Add(Expression.Ge("MaxAge", age) || Expression.IsNull("MaxAge"))
			.Add(Expression.Le("MinAge", age) || Expression.IsNull("MinAge"));
	}
}

As an aside, consider the usage of operator overloading there. I completely forgot to use that before, and that is a shame, since I was the one who added it to NHibernate. Andrew Davey reminded me how useful this is.

You may notice that I have a [Property] attribute on the specification's Age property, and that the specification carries some unusual properties such as Id, Name and Username. Those are used to persist the specification, so I can save it and then load it again, which is extremely useful if I want to save a search for "Males over 30" and re-run it later.

Another advantage here is that I can easily construct a specification from another object. You can see the FromCandidate() method on the specification above; it builds a specification from the candidate that matches the job openings that the candidate is eligible for.
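FromCandidate() itself need not be anything fancy; conceptually it just maps the candidate's state onto the specification's criteria-building setters (a sketch, not the actual code):

public static OpeningSpecification FromCandidate(Candidate candidate)
{
	OpeningSpecification spec = new OpeningSpecification();
	// each setter contributes its own criteria, as the Age setter above does
	spec.Age = candidate.Age;
	spec.Gender = candidate.Gender;
	return spec;
}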

All in all, a very useful concept, and it gets even nicer if you throw MonoRail into the mix, but this is a matter for another post.

Querying Complexity

When it comes to building search screens, there really isn't anything that I would rather use over NHibernate. I am just astounded at how I can slice a very complex scenario in a very narrow & clear fashion:

criteria.Add(Expression.Disjunction()
		.Add(Expression.Disjunction()
					.Add(Expression.Eq("City1.id", value))
					.Add(Expression.Eq("City2.id", value))
					.Add(Expression.Eq("City3.id", value))
					.Add(Expression.Eq("City4.id", value))
				)
		.Add(Expression.Conjunction()
					.Add(Expression.IsNull("City1.id"))
					.Add(Expression.IsNull("City2.id"))
					.Add(Expression.IsNull("City3.id"))
					.Add(Expression.IsNull("City4.id"))
			)
	);
cloned.Add(Expression.Disjunction()
			.Add(Expression.In("PreferredArea1", ToArray(Areas)))
			.Add(Expression.IsNull("PreferredArea1"))
	);
cloned.Add(Expression.Disjunction()
			.Add(Expression.In("PreferredArea2", ToArray(Areas)))
			.Add(Expression.IsNull("PreferredArea2"))
	);

Now imagine roughly 20 such cases, all of which can be added / removed dynamically... I am going to write a far longer post about the specification pattern and how useful it is.

Update: Andrew reminded me that there are easier ways to do this, using operator overloading. Let us take the first example and see how it goes:

criteria.Add(
	(
		Expression.Eq("City1.id", value) ||
		Expression.Eq("City2.id", value) ||
		Expression.Eq("City3.id", value) ||
		Expression.Eq("City4.id", value)
	) || (
		Expression.IsNull("City1.id") &&
		Expression.IsNull("City2.id") &&
		Expression.IsNull("City3.id") &&
		Expression.IsNull("City4.id")
	)
);

Twitter Snapshots with Scott Bellware

I never really got Twitter, but here are a few snapshots that I took from Scott Bellware's twitter feed (not safe for work); he decided to run an all-night radio show from there. He has been talking about someone with a familiar name, but apparently in a different universe altogether.

I couldn't help but post a few snapshots of things that happened:

image

image

image

The IoC mind set: Validation

I think I have mentioned that I believe having a good IoC container at your disposal can really shift the way you structure your code and architect solutions.

Let us take validation as an example. The two libraries that jump to mind are Castle.Components.Validation and the Validation Application Block. They both have the same general approach: you put an attribute on a property, and then you use another class to validate that the object meets its validation spec.

This approach works, and it is a really nice approach for very simple input validation. Required fields, length of a string, etc. Being able to specify that declaratively is very nice. But it breaks down when you have more complex scenarios. What do I mean by that? After all, both libraries offer extensibility points to plug in your own validators...

Well, as it turns out, when we go beyond the realm of simple input validation, we reach into the shark-filled waters of business rules validation. Those rules are almost never applied across the board. Usually, they are very targeted.

Business rule: A customer may only open a service call for products that the customer has purchased

Expressing this rule in code is not very hard, but it doesn't benefit much from the above mentioned libraries.*

Nevertheless, you want to have some sort of structure in your validation code, and you want to be able to add new validations easily. Here is the back-of-the-envelope solution for this. Hopefully it will make it clearer why I think that having IoC makes these things so easy to handle.

public interface IValidatorOf<T>
{
   void Validate(T instance, ErrorSummary errorSummary);
}

public class Validator
{
	IKernel kernel;//ctor injection, not shown
	
	public ErrorSummary Validate<T>(T instance)
	{
		ErrorSummary es = new ErrorSummary();
		foreach(IHandler handler in kernel.GetHandlers(typeof(IValidatorOf<T>)))
		{
			((IValidatorOf<T>)handler.Resolve(CreationContext.Empty)).Validate(instance, es);
		}
		return es;
	}
}

And yes, that is the entire thing. Come to think about it, I think that we need to add GetAllComponentsFor<T>() to Windsor, but that is another matter. Now, if I want to add a new validation, I implement IValidatorOf<T> with the right T, and that is it. It gets picked up by the container automatically, and the rest of your code just needs to call Validator.Validate(myServiceCall); and handle the validation errors.
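For the business rule above, a validator could look something like this. It is a sketch: the ServiceCall entity, the repository, and the ErrorSummary.AddError() call are my assumptions, not an actual API:

public class ServiceCallValidator : IValidatorOf<ServiceCall>
{
	IPurchaseRepository purchases; //ctor injection, not shown

	public void Validate(ServiceCall instance, ErrorSummary errorSummary)
	{
		// a customer may only open a service call for products that they purchased
		if (purchases.FindPurchase(instance.Customer, instance.Product) == null)
			errorSummary.AddError("Product", "The customer has not purchased this product");
	}
}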

* Actually, a standard way to report validation errors is very nice.

Dependency Injection doesn't cut it anymore

So, I showed how you can write an IoC container in 15 lines of code; obviously this means that containers such as Windsor, at ~52,000 lines of code, are tremendously bloated and not worth using, right?

Well, no. And the reason is that DI is now just one of the core functions of a container; it is the other things that it does that make it so useful. Since the discussion started from why you don't need IoC in Ruby, I decided to also explore the ideas in the Ruby frameworks. Typical Ruby IoC code seems to look like this:

registry.register( :component ) do |reg| 
  c = Component.new
  c.service = reg.some_other_service
  c
end

This is very similar to the idea of the demo container that I built. Basically, it is using the block (anonymous delegate) to defer the creation to some other time. On the surface, it is a viable approach, but it breaks down as soon as you start dealing with even moderately complex applications.

I will get back to it in a minute, but let us walk through the core things that an IoC container should provide.

  • Dependency Injection
  • No required dependency on the container from the services
  • Lifestyle management
  • Aspect Oriented Programming

The above is what I consider the core minimum for an IoC container. It is not really that hard to build, and using Dynamic Proxy I can probably get it all working in about three hours. But it wouldn't be very usable. (And yes, it will have facilities :-) )

The reason that I use IoC is not to encourage testing, it is not to break dependencies, it is not to get separation of concerns. Those are well and good, but if I had to use a container that behaved like the one listed above, I would go crazy.

I am using IoC because it makes all of the above so easy. It makes it harder to create a manual dependency than to do the wiring through Windsor. As a matter of fact, I went crazy about a year ago just from having to deal with specifying service declarations. That is why I have Binsor in place; it took care of all that complexity.

Now testability, separation of concerns and no dependencies are no longer things that I have to work toward, they are already there, right out of the box.

My current project has well over 250 components that are managed through Windsor. I have no clue as to their dependencies, and I don't really care about that. Other people on my team keep adding components, and they aren't even aware that they have IoC there. They just know that if they put the IFooService in the ctor, they will get it when the code is running. And if they add IBarService and BarServiceImpl, they will be available to any component that uses them.

Can you imagine trying to deal with 250+ dependency-resolving blocks (or anonymous delegates)? Just writing them down would be a chore, and then they would have to be maintained, and adding anything becomes fragile. I can't imagine having something break because I added a parameter to the constructor, or a settable property that I need. I can't imagine having to do anything beyond just creating a class to get it registered in the container.

Well, I can imagine those; I just can't imagine suffering through them. It would be like trying to pull a particular fish from the image on the right: possible, but very painful.

Auto-wiring is what makes IoC a mandatory part of my application, because it means that I don't need to manage the dependency cloud at all. I let the container do it for me. It means that once it has been set up, you no longer need to think about it or be aware of it. I once forgot that I was using an IoC container for three months, while using it each and every day.

Then you get to the AoP concepts, of which I particularly like automatic transaction support and cross thread synchronization.

To conclude, I think that if you are using an IoC container merely for dependency injection, you are missing a lot of its benefits. Having the ability to just throw complexity at a tool and have it managed for you is a key factor in how you design your applications.

Building an IoC container in 15 lines of code

I am writing this post as a result of a discussion in the ALT.Net mailing list about DI containers in Ruby. Since I promised that I would build such a thing, it was interesting to see how far I could minimize an IoC container.

The result is, as I said, 15 lines of significant code (ignoring blank lines or lines with just curly braces).

You can see the sample model on the right. It is a fairly typical model, I believe. I want to get the login controller instance without having to worry about the dependencies, and I want to do it in a non-invasive way.

How can we handle that?

Well, it is pretty simple, as a matter of fact, here is the full source code for the container:

public class DemoContainer
{
	public delegate object Creator(DemoContainer container);

	private readonly Dictionary<string, object> configuration
		= new Dictionary<string, object>();
	private readonly Dictionary<Type, Creator> typeToCreator
		= new Dictionary<Type, Creator>();

	public Dictionary<string, object> Configuration
	{
		get { return configuration; }
	}

	public void Register<T>(Creator creator)
	{
		typeToCreator.Add(typeof(T), creator);
	}

	public T Create<T>()
	{
		return (T) typeToCreator[typeof(T)](this);
	}

	public T GetConfiguration<T>(string name)
	{
		return (T) configuration[name];
	}
}

Not really hard to figure out, right? And the client code is just as simple:

DemoContainer container = new DemoContainer();
//registering dependencies
container.Register<IRepository>(delegate
{
	return new NHibernateRepository();
});
container.Configuration["email.sender.port"] = 1234;
container.Register<IEmailSender>(delegate
{
	return new SmtpEmailSender(container.GetConfiguration<int>("email.sender.port"));
});
container.Register<LoginController>(delegate
{
	return new LoginController(
		container.Create<IRepository>(),
		container.Create<IEmailSender>());
});

//using the container
Console.WriteLine(
	container.Create<LoginController>().EmailSender.Port
	);

I should probably mention that while this handles dependency injection quite nicely, it is absolutely not what I would consider an appropriate container to use. More on that in the next post.

Test Once: The other side of continuous integration

I am sharing this story because it is funny, in a sad way. And because I don't post enough bad things about myself. I also want to make it clear that this story is completely and utterly my own fault for not explaining in full what we were doing to the other guy.

A few days ago I was pairing with a new guy on some code that we had to write to generate Word documents. We ran into a piece of code that had fairly complex requirements. (String parsing & XPath, argh!)

I thought that this would be a good way to get across the point of unit testing, since the other guy didn't get all the edge cases when I explained them (about 7, IIRC). So we sat down and wrote a test, and we made it pass, and we wrote another test, and made it pass, etc. It was fun, it made it very clear what we were trying to achieve, etc. This post is not about the values of unit tests.

So we were done with the feature, and I moved on to the next guy, trying to understand how he can make javascript do that (I still don't know, but the result is very pretty). About two hours afterward, I updated my working copy with the latest source and tried to find the test that we had written. It was not there.

Hm...

I asked him where the test was, and his answer was: "I deleted it, we were finished with that functionality, after all." *

* Nitpicker corner: I went and hit myself on the head for being stupid, he had no way of knowing, because I didn't explain nearly enough. I did afterward, and then I wrote the tests again.

Well, it started life as an IoC Container, but then we added this bit...

I hate planning. This comes from long experience with all the problems that arise when you have faulty planning.

So, today I intended to have the class write an IoC container, but somehow, still not sure how, we ended up building a trivial OR/M implementation:

https://rhino-tools.svn.sourceforge.net/svnroot/rhino-tools/trunk/SampleApplications/Course/SampleORM

Let me say first that this is sample code, written in the span of a few hours, off-the-cuff coding, etc. Not meant for production... It doesn't even have an identity map or a unit of work. But it shows that it takes only a few hours to make a significant improvement in the way you work, if you are not already using the best tools out there.

And allow me to reiterate that this was written by me and the students in about two and a half hours. I hope that this will take care of the "can't handle it" muttering.

Teaching Reflection

I am going to teach Reflection to my students tomorrow. Instead of going by the book, I think that we will implement an IoC container. Nothing like going into the nuts & bolts of it in order to make people understand how stuff works.
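The heart of such an exercise is resolving constructor dependencies through reflection. Here is a minimal sketch of the idea (my names, not anything we actually wrote in class):

using System;
using System.Collections.Generic;
using System.Reflection;

public class ReflectionContainer
{
	// maps a service type to its concrete implementation
	private readonly Dictionary<Type, Type> typeMap = new Dictionary<Type, Type>();

	public void Register(Type service, Type implementation)
	{
		typeMap[service] = implementation;
	}

	public object Resolve(Type service)
	{
		Type implementation = typeMap.ContainsKey(service) ? typeMap[service] : service;
		// take the first constructor and recursively satisfy its parameters
		ConstructorInfo ctor = implementation.GetConstructors()[0];
		ParameterInfo[] parameters = ctor.GetParameters();
		object[] args = new object[parameters.Length];
		for (int i = 0; i < parameters.Length; i++)
			args[i] = Resolve(parameters[i].ParameterType);
		return ctor.Invoke(args);
	}
}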

They are Morts, so this ties in very well with the discussion that I had over the weekend about Mort's abilities.

I'll report on Friday...

Rhino Mocks: Void methods using Expect.Call

One of the things that I dislike about Rhino Mocks is the disconnect between methods that return a value and methods that do not. The former are handled quite naturally using the Expect.Call() syntax, but the latter have to use the LastCall syntax, which is often a cause of confusion.

There is little to be done there, though; that is a (valid) constraint placed there by the compiler, can't do much about it, right?

Jim Bolla had a different idea, and had a great suggestion, so now we can write this:

IServer mockServer = mocks.CreateMock<IServer>();
Expect.Call(delegate { mockServer.Start(); }).Throw(new InvalidConfigurationException());
// rest of the test

It is also a good way to start preparing Rhino Mocks for C# 3.0.

Many thanks to Jim.

Again, the code is in the repository, and will be released soon.

Lucene as a data repository

The issue of user-driven entity extensibility came up in the Castle users mailing list, and a very interesting discussion started. The underlying problem is fairly well known: we want to allow our end users to extend the schema.

The scenarios that I usually think of are about extending the static schema of the application, like adding a CustomerExternalNumber field to the Customer entity, or adding a MyOwnEntity custom entity. This can be solved in a number of ways, from meta tables to a schema that looks like this:

image

I am usually suspicious of such methods, and would generally prefer to go with the option of simply extending the schema at runtime by adding additional tables for the user extensions.

The issue that came up in the list was quite different; the need was to extend each entity instance. Let us take bug tracking, for instance. We need to allow the user to add different fields to each bug. Then we need to allow searching on those extra fields, and each user can define their own fields.

Lucene came up as a way to store those extra fields, and then I had a light bulb moment. Lucene, by its nature, is a good place to store semi-structured data. The basic unit of storage in Lucene is the Document, and a document is composed of a set of fields, which can be indexed, stored or both. Hibernate Search (and NHibernate Search) uses this ability to store entity information in Lucene, which means that we can retrieve information directly from Lucene, hitting the DB only for the missing information.

Extending this idea to also hold extra, user-defined information in the Lucene store is fairly natural, and extremely interesting to me. It means that I can give my users what they want (full extensibility) while keeping things very simple & clean from my point of view. Searching is built in, and easy enough that you can give the users the ability to run direct queries against it. In fact, you can even use NHibernate Search to allow even better scaling of the searching capabilities.
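To make that concrete, here is a sketch of how the extra fields could be stored, assuming the Lucene.Net 2.x API (the field names are mine):

using Lucene.Net.Analysis.Standard;
using Lucene.Net.Documents;
using Lucene.Net.Index;

Document doc = new Document();
// the link back to the entity in the database
doc.Add(new Field("bug-id", "42", Field.Store.YES, Field.Index.UN_TOKENIZED));
// user-defined fields, unknown at compile time, one Field per extra value
doc.Add(new Field("reported-by-phone", "555-1234", Field.Store.YES, Field.Index.UN_TOKENIZED));
doc.Add(new Field("affected-version", "2.0.1", Field.Store.YES, Field.Index.TOKENIZED));

IndexWriter writer = new IndexWriter("bugs-index", new StandardAnalyzer(), true);
writer.AddDocument(doc);
writer.Close();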

Reporting is also easy enough: you pull the data out, into your entities, and report off of that. If you want something more generic, it is very easy to turn a Lucene query into a DataSet, which you can then hand to the reporting engine.

Exciting idea.

Pattern Matching in Boo

Looks like it is a good thing that I picked up that Erlang book; Boo just got pattern matching. I am looking at the code, and but for the grace of Erlang, I would be completely lost.

Trusting the benchmark

I was scanning this article when I noticed this bit: a benchmark showing superior performance to another dynamic proxy implementation.

I should mention in advance that I am impressed. I have built one from scratch, and am an active member of DP2 (and DP1, when it was relevant). That is not something that you approach easily. It is hard, damn hard. And the edge cases will kill you there.

But there are two problems that I have with this benchmark. They are mostly universal for benchmarks, actually.

First, it tests an unrealistic scenario: you never use a proxy generation framework with caching turned off. Well, not unless you want out of memory exceptions.

Second, it tests the wrong thing. DP2's focus is not on generation efficiency (that is important, but not the main focus); the main focuses are correctness (DP2 produces verifiable code) and runtime performance. You only generate a proxy type once, which means that if doing more work in the generation phase makes the proxy run faster at runtime, you do it.

Again, it is entirely possible that the LinFu proxies are faster at runtime as well, but I have no idea, and I don't mean to test that. Benchmarks are meaningless most of the time; you need to test for performance yourself.

Rhino ETL: Importing Data into MS CRM

Okay, so this is the "coding in anger" part for Rhino ETL. I need to import files into MS CRM entities. The files are standard CSV files, with the usual corruption of values that such files have. The CRM is accessed through its web services, although I am keeping the option of direct DB access in reserve, in case I can't get the web services to perform fast enough.

The first problem that I had was that the MS CRM web services are not simple services; they accept entities that are defined in their WSDL, not simple values. That put me in a complexity spin for a while, until I remembered that I am not working in my own little language, I am working on .NET. A quick trip to Visual Studio and an Add Web Reference + compile later, I had integrated MS CRM access into Rhino ETL.

Here is how it was done:

import CrmProxy.Crm from CrmProxy

Basically, it means that I now have a dll that contains the proxy definitions for the web service, and I imported it. So it is incredibly easy to use.

Then, it was a matter of reading the file. Rhino ETL is integrated with the FileHelpers library, and I couldn't really be happier about it. There are several reasons for that, but the main one is that I ran into something that the library couldn't handle, and I fixed that in 10 minutes, without changing the library code. Speaking of software that I like, this is one of the main criteria that I use to evaluate a piece of software: what happens when I step off the ledge? With FileHelpers, I can extend it so easily that I really don't care about that.

Anyway, here is a part of the class definition for our file: 

[DelimitedRecord(","), IgnoreFirst]
class Customer:
      [FieldConverter(ConverterKind.Date, "dd/MM/yyyy")] 
      UpdateDate as date
      Id as int
      Name as string
      ResponsibleEmployee as Nullable of int
      [FieldConverter(Rhino.ETL.FileHelpersExtensions.DateTimeConverterWithNullValue, "dd/MM/yyyy","00/00/0000")] 
      ReceptionDate as Nullable of date

As you can see, there isn't much to it except defining the fields, types, etc.

source CustomersFile:
     execute:
            file = Read(typeof(Customer)).From(Configuration.CustomerFile)
            file.OnError(ErrorMode.SaveAndContinue)
            for customer in file:
                  print "Source ${customer.Id}"
                  SendRow( Row.FromObject(customer) ) 
            if file.HasErrors:
                  file.OutputErrors(Configuration.CustomerErrorsFile)
                  AddError("Errors have been written to ${Configuration.CustomerErrorsFile}")

Here I read from the file, use Row.FromObject() to translate an entity into a row, and then send it forward. One amazing thing here is that FileHelpers will generate an errors file for me on demand, and that one is clear, concise and actually useful. Compared to the amount of effort that I know is required to pull reasonable errors from SSIS file input, that is a great pleasure.

Anyway, if you missed that, I am very happy about FileHelpers.

Another thing to point out is the Configuration.CustomerFile, etc. The Configuration object is dynamically populated from a config file that you can pass to Rhino ETL (command line arg), which is a simple XML file in this format:

<configuration>
	<CustomerErrorsFile>D:\customers_errors.txt</CustomerErrorsFile>
</configuration>

Why XML? Because this seems like the kind of file that I would want to manipulate with tools like xmlpoke, etc., so it is easier to work with. It is also a flat configuration scheme that doesn't have any semantics other than the simple key/value pairs.

So, now that I have the data, I can send it to the destination:

destination Crm:
      initialize:
            Parameters.Srv = CrmService(
                  Url: Configuration.Url,
                  Credentials: NetworkCredential(
                        Configuration.Username,
                        Configuration.Password,
                        Configuration.Domain),
                  CallerIdValue: CallerId(CallerGuid: Guid(Configuration.CallerId)),
                  UnsafeAuthenticatedConnectionSharing: true,
                  PreAuthenticate: true
                  )

      onRow:
            theAccount = account(
                  accountnumber: Row.Id.ToString(),
                  name: Row.Name,
                  telephone1: Row.Phone,
                  telephone2: Row.Cellular,
                  telephone3: Row.AdditionalPhone,
                  fax: Row.Fax,
                  accountreceptiondate: CrmDateTime(Value: Row.ReceptionDate.ToString("yyyy-MM-ddT00:00:00")),
                  address1_city: Row.City
                  )
            result = Parameters.Srv.Create(theAccount)
            print "Created account ${Row.Id} -> ${result}"

      cleanUp:
            Parameters.Srv.Dispose()

As you can see, we have the initialize method, which creates the service; then we instantiate an account instance, fill it with the required parameters, and go to town. Also notable is the easy translation from CLR types to CRM types, as in the case of accountreceptiondate.

All in all, the only difficulty that I had during this was making heads or tails of the information in the file, which is exactly where I want the difficulty to lie when I am dealing with ETL processes.

Functional Longing

I am reading the Erlang book right now (post coming as soon as I finish it), but so far I have managed to grasp the idea of heavy use of recursion and functions for everything. It is very different from imperative languages, but I think I can make the shift.

Joe Functional, however, may have some trouble going the other way around. The following were found in production (but I'll not comment any further on their origin):

public static int CharCount(string strSource, string strToCount, bool IgnoreCase)
{
    if (IgnoreCase)
    {
        return CharCount(strSource.ToLower(), strToCount.ToLower(), true);
    }
    return CharCount(strSource, strToCount, false);
}

Let us ignore the inconsistent naming convention and the misleading function name, and really think about what this does...

public static string ToSingleSpace(string strParam)
{
    int index = strParam.IndexOf("  ");
    if (index == -1)
    {
        return strParam;
    }
    return ToSingleSpace(strParam.Substring(0, index) + strParam.Substring(index + 1));
}

The pattern continues. At least I can hazard a guess about what this one does, but I wouldn't want to pipe this post through it.

public static string Reverse(string strParam)
{
    if ((strParam.Length != 1) && (strParam.Length != 0))
    {
        return (Reverse(strParam.Substring(1)) + strParam.Substring(0, 1));
    }
    return strParam;
}

"Reverse a string" is something that I like to ask in interviews, but I don't suppose that this implementation will be found sufficient.