Ayende @ Rahien

It's a girl

Microsoft CRM woes

I have wasted the entire day trying to troubleshoot some idiotic issues with MS CRM.

Allow me to reiterate my previous statements on Microsoft CRM. It is an application built on feet of clay, marketed as a development platform and undeserving of consideration as anything but a vanilla install.

Highlights of the CRM:

  • Trying to update an entity via a web service call. You get a "platform error" message, with no indication of why the call is failing. Fast forward a day, and it turns out that a PreUpdate callout is throwing an exception, which the CRM just swallows before aborting.
  • Trying to update an entity via a web service call. The call completes successfully, but the update is never made. Fast forward a day or two, and it turns out that a PostUpdate callout is throwing an exception, which the CRM swallows; it appears to continue, then discards the supposedly saved data.
  • Changing the type of a field in the entity will ensure that you cannot do an import/export from that entity ever again. You have to do a reinstall.

Yuck!

Tricky Code

Without running the code, what is the result of this masterpiece?

using System;
using System.Collections.Generic;

class Program
{
	static void Main(string[] args)
	{
		DoSomething("a","b");
	}

	public static void DoSomething<T>(IList<T> set)
	{
		Console.WriteLine(set.Count);
	}

	public static void DoSomething<T>(params T[] items)
	{
		List<T> set = new List<T>();
		foreach (T t in items)
		{
			if (t == null)
				continue;
			set.Add(t);
		}
		DoSomething(set);
	}
}

It surprised the hell out of me until I figured out what was going on, then I was very amused. This code works exactly as it should, yet it produces a very different result from the expected one.

Low pain tolerance

I am coming back to a project that I haven't been a part of for about six months. It was in active development during that time, and I was happy to get back to it, since it is a really good code base, and a fun project besides. The team that handles this project is top notch, but my first reaction was something on the order of: "how could you let it turn so hard!"

Then I toned it down a bit and started a Big Refactoring. About fifteen minutes later, I was done, which didn't match my initial response at all, and took some thinking. Why did I react this way? Why did it take ~15 minutes to turn something that I found awkward into something that was really pleasant to work with?

How could the other team members, all of whom are very good, have gotten into this situation?

The answer is both that I overreacted and that the expectations that I had from the project were high. We have made working on this project a very smooth experience, but as time passed, it started to get awkward to work with. By that time, I wasn't around, and the developers just dealt with that. It isn't painful or hard, it is not bad or annoying. It is just that it began to get awkward, in a project that was really smooth sailing.

Those 15 minutes were spent mostly in breaking apart a few services, setting up a registration rule in Binsor and relaxing in the contented glow of knowing that Things Just Work once again.

I came to the conclusion that different pain tolerance levels are responsible for my reaction. I have very high expectations from my code, and I expected that it would continue to be as easy as it was at the start. The devs on the team would likely have performed the same actions as I did, at a later date. But I found it painful now.

It is especially interesting in light of the recent discussion about code size and scaling a project higher.

I think that having a lower pain tolerance level is a good thing, if kept in check.

The applicability of DSL

When is a DSL applicable? When will using a DSL make our job easier?

There are several layers to the answer, depending on the type of DSL that we want and the context in which we want to apply it.

When building a technical DSL, we will generally use it as a bootstrapping and configuration mechanism, to make it easier to modify and change a part of the system.

In general, those DSL are focused on enabling recurring types of tasks, usually of a one-off nature. Configuration is a common scenario, since it is probably the simplest to explain and to start with, but there are many other examples. Build scripts come to mind; in fact, scripting in general is a common area for technical DSL. Combining the power of a flexible language with a DSL directed at the task at hand makes for a powerful tool.

Another interesting task for DSL is mapping layers. We have the destination in mind, usually some domain object or DTO, and we get some input that we transform to that object. Here we use the DSL for ease of modification, and for the ease of just adding new handlers.

Again, technical DSL are shortcuts to code and a way of avoiding pain. You can do everything you do in a technical DSL using code. The DSL should make it easier, but the main benefit is a one-of-a-kind solution to a type of task.

Note that this one-of-a-kind solution doesn't mean throw-away code. It means that you would usually have a singular need in an application. Configuring the IoC container is a need that you have once per application, for example, but it is a critically important part of the application, and something that you go back to. For the last year and a half or so, we have used Binsor to do just that, as a DSL that can configure the IoC container for us. It gave us very good flexibility, in that it allowed us to define per-project conventions very easily.

Technical DSL are usually the glue that holds stuff together.

What about business DSL?

Here, the situation is different. Business DSL are almost always about rules and the actions to be taken when those rules are met.

This sounds like a very narrow space, doesn't it? But let me state it another way: the place of a business DSL is to define policy, while the application code defines the actual operations. A simple example would be defining the rules for order processing. Those, in turn, will affect the following domain objects:

  • Discounts
  • Payment plans
  • Shipping options
  • Authorization Rules

The application code then takes this and acts upon it.

Policy is usually the place where we make most changes, while the operations of the system are mostly fixed. We are also not limited to a single DSL per application, in fact, we will probably have several, both technical and business focused DSL. Each of those will handle a specific set of scenarios (processing orders, authorizing payments, suggesting new products, etc).
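The split between policy and operations can be sketched in plain C#. Everything named here (Order, OrderPolicy, OrderProcessor, the When method) is hypothetical and invented for illustration; the When calls stand in for what a business DSL script would express, while OrderProcessor plays the part of the fixed application code.

```csharp
using System;
using System.Collections.Generic;

public class Order
{
    public decimal Total;
    public bool IsPreferredCustomer;
    public decimal DiscountPercent;
    public string Shipping = "standard";
}

// Policy: condition/action pairs. In a real system a DSL script would supply these.
public class OrderPolicy
{
    private readonly List<KeyValuePair<Predicate<Order>, Action<Order>>> rules =
        new List<KeyValuePair<Predicate<Order>, Action<Order>>>();

    public OrderPolicy When(Predicate<Order> condition, Action<Order> action)
    {
        rules.Add(new KeyValuePair<Predicate<Order>, Action<Order>>(condition, action));
        return this;
    }

    public void ApplyTo(Order order)
    {
        foreach (var rule in rules)
        {
            if (rule.Key(order))
                rule.Value(order);
        }
    }
}

// Operations: fixed application code that acts on whatever the policy decided.
public static class OrderProcessor
{
    public static decimal AmountToCharge(Order order, OrderPolicy policy)
    {
        policy.ApplyTo(order);
        return order.Total * (100 - order.DiscountPercent) / 100;
    }
}

public static class PolicyDemo
{
    public static void Main()
    {
        var policy = new OrderPolicy()
            .When(o => o.IsPreferredCustomer, o => o.DiscountPercent = 5)
            .When(o => o.Total > 1000, o => o.Shipping = "free");

        var order = new Order { Total = 1200m, IsPreferredCustomer = true };
        Console.WriteLine(OrderProcessor.AmountToCharge(order, policy));
        Console.WriteLine(order.Shipping);
    }
}
```

Changing a discount or a shipping threshold touches only the rules handed to OrderPolicy; OrderProcessor stays as it is, which is the property that makes policy a good candidate for a DSL.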

What about building the entire system as a set of DSL?

That may be a very interesting approach to the task. In this scenario, we invert the usual ratio of application code to DSL, and decide that the application code that we build will be mostly about infrastructure concerns and the requirements of the DSL. I would typically like to use this type of approach in backend processing systems. Doing UI on top of a DSL is certainly possible, but at this point, I think that we will hit the point of diminishing returns, mostly because UI are usually complex, and you want to be able to handle them appropriately. A complex DSL is a programming language, and at that point, you would probably want to use a programming language to work with rather than a DSL.

There is an exception to that, assuming that your application code is in Boo. Then you are working with a programming language, and you can build a technical DSL that will work in concert with the actual frameworks that you are using. Rails is a good example of this approach.

Assuming that you write your application code in a language that is not suited for DSL, however, you would probably want to have a stricter separation of the two approaches: using DSL to define policy, and using the application code to define the framework and the operations that can be executed by the system.

Building such a system turns out to be almost trivial, because all you need to do is apply the operations (usually fairly well understood), and then you can play around with the policy at will. If you have done your job well, you'll likely have the ability to sit down with the customer and define the policy and have them review it at the same time.

I wonder how and why you would test those...

The Record/Replay/Verify model

Daniel and I are having an interesting discussion about mock frameworks, and he just posted this: What's wrong with the Record/Reply/Verify model for mocking frameworks.

Daniel also pulled this quote from the Rhino Mocks documentation:

Record & Replay model - a model that allows for recording actions on a mock object and then replaying and verifying them. All mocking frameworks uses this model. Some (NMock, TypeMock.Net, NMock2) use it implicitly and some (EasyMock.Net, Rhino Mocks) use it explicitly.

Daniel goes on to say:

I find this record/replay/verify model somewhat unnatural.

I suggest that you read the entire post, since this is a response to it. I just want to point out that I still hold this view. Mock frameworks all use this model, because verification is a core part of mocking.

The problem that I have with what Daniel is saying is that we seem to have a fundamental difference in opinion about what mocking is. What Daniel calls mocks I would term stubs. Indeed, Daniel's library, by design, is not a mock framework library. It is a framework for providing stubs.

Going a little bit deeper than that, it seems that Daniel is mostly thinking about mocks in tests as a way to make the test pass. I am thinking of those in terms of testing the interactions of an object with its collaborators.

This project does a good job of showing off how I think mocking should be used, and it represents the accumulation of several years of knowledge and experience in testing and using mocks. It also represents several spectacular failures in using both (hint: mocking the database when going to production may not be a good idea), from which I learned quite a bit.

Mocking can be abused to produce hard-to-change tests, but so can other methods. Deciding to throw the baby out with the bath water seems to be a waste to me. There is a lot of guidance out there about the correct application of mocking, including how to avoid over-specified tests. The first that comes to mind is the Hibernating Rhinos #1 web cast, which talks about Rhino Mocks and its usage.

Rhino Mocks can be used in the way that Daniel is describing. I would consider this appropriate at certain times, but I think that to say that this is the way it should be is a mistake. The Rhino Mocks web cast should do a good job of not only showing some of the more interesting Rhino Mocks features, but also where to use them.

To conclude, the record, replay, verify model stands at the core of mocking. You specify what you think should happen, you execute the code under test, and then you verify that the expected happened. Taking this to the limit would produce tests that are hard to work with, but I am of the opinion that taking a worst practice and starting to draw conclusions from that is not a good idea.
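The three phases can be shown without any framework at all. What follows is a deliberately tiny hand-written mock, not how Rhino Mocks is implemented; IMailSender, MockMailSender and Registration are hypothetical names made up for the sketch.

```csharp
using System;
using System.Collections.Generic;

public interface IMailSender
{
    void Send(string to);
}

// A hand-rolled mock: calls made before Replay() are recorded as expectations,
// calls made afterwards are checked against them.
public class MockMailSender : IMailSender
{
    private readonly Queue<string> expected = new Queue<string>();
    private bool replaying;

    public void Replay()
    {
        replaying = true;
    }

    public void Send(string to)
    {
        if (!replaying)
        {
            expected.Enqueue(to); // record phase: remember the expectation
            return;
        }
        if (expected.Count == 0 || expected.Peek() != to)
            throw new InvalidOperationException("Unexpected call: Send(" + to + ")");
        expected.Dequeue();       // replay phase: the expected call arrived
    }

    public void Verify()
    {
        if (expected.Count > 0)
            throw new InvalidOperationException(
                "Expected call was not made: Send(" + expected.Peek() + ")");
    }
}

// Code under test: its interaction with the collaborator is what gets checked.
public class Registration
{
    private readonly IMailSender sender;

    public Registration(IMailSender sender)
    {
        this.sender = sender;
    }

    public void Register(string email)
    {
        sender.Send(email);
    }
}

public static class MockDemo
{
    public static void Main()
    {
        var mock = new MockMailSender();
        mock.Send("user@example.org");                       // record
        mock.Replay();
        new Registration(mock).Register("user@example.org"); // replay
        mock.Verify();                                       // verify
        Console.WriteLine("all expectations met");
    }
}
```

A stub would stop at handing back canned answers; the Verify step is what turns this into a test of the interaction itself.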

Multi threading challenge: can you spot a bug?

One of the problems of multi threading is that there are a lot of intricacies that you have to deal with. Recently I ran into issues that dictated that I would have to write an AsyncBulkInserterAppender for log4net.

One of the reasons that I want to do that is to avoid locking the application if the database is down or the logging table is locked. I just had a recent issue where this caused a problem.

When I implemented that, I started to worry about what would happen if the database is locked for a long duration. There is a chance that this async logging would block for a long time, and then another async batch would start, also blocking, etc. Eventually, it will fill the thread pool and halt the entire system.

This is the approach I ended up with. It should ensure that there are at most two threads writing to the database at a time. Since I wrote it, I have already found at least two bugs in there. It looks fine now, but I can't think of any good way to really test that.

I am afraid that multi threading can't really be tested successfully. This is something where code review is required.

Here is the code:

protected override void SendBuffer(LoggingEvent[] events)
{
	// we accept some additional complexity here
	// in favor of better concurrency. We don't want to
	// block all threads in the pool (train) if we have an issue with
	// the database. Therefore, we perform thread sync to ensure
	// that only a single thread will write to the DB at any given point
	ThreadPool.QueueUserWorkItem(delegate
	{
		lock (syncLock)
		{
			eventsList.AddLast(events);
			if (anotherThreadAlreadyHandlesLogging)
				return;
			anotherThreadAlreadyHandlesLogging = true;
		}

		while (true)
		{
			LoggingEvent[] current;
			lock (syncLock)
			{
				if (eventsList.Count == 0)
				{
					anotherThreadAlreadyHandlesLogging = false;
					return;
				}
				current = eventsList.First.Value;
				eventsList.RemoveFirst();
			}
			PerformWriteToDatabase(current);
		}
	});
}

Code base size, complexity and language choice

Via Frans, I got into these two blog posts:

In both posts, Steve & Jeff attack code size as the #1 issue that they have with projects. I read the posts with more or less disbelieving eyes. Some choice quotes from them are:

Steve: If you have a million lines of code, at 50 lines per "page", that's 20,000 pages of code. How long would it take you to read a 20,000-page instruction manual?

Steve: We know this because twenty-million line code bases are already moving beyond the grasp of modern IDEs on modern machines.

Jeff: If you personally write 500,000 lines of code in any language, you are so totally screwed.

I strongly suggest that you go over them (Steve's post is long, mind you), and then return here to my conclusions.

Frans did a good job discussing why he doesn't believe this to be the case. He takes a different tack than mine, but that is mostly business as usual between us. I think that the difference is more a matter of semantics and overall approach than the big gulf it appears to be at times.

I want to focus on Steve's assertion that at some point, code size makes a project exponentially harder. 500,000 LOC is the number he quotes for the sample project that he is talking about. Jeff took that number and asserted that at that point you are "totally screwed".

Here are a few numbers to go around:

  • Castle: 386,754
  • NHibernate: 245,749
  • Boo: 212,425
  • Rhino Tools: 142,679

Total LOC: 987,607

I think that this is close enough to one million lines of code to make no difference.

This is the stack on top of which I am building my projects. I am often in & out of those projects.

1 million lines of code.

I am often jumping into those projects to add a feature or fix a bug.

1 million lines of code.

I somehow manage to avoid getting "totally screwed", interesting, that.

Having said that, let us take a look at the details of Steve's post. As it turns out, I fully agree with a lot of the underlying principles that he bases his conclusion on.

Duplication patterns - Java/C# don't have the facilities to avoid duplication that other languages do. Let us take the following trivial example. I ran into it a few days ago, and I couldn't find a way to remove the duplication without significantly complicating the code.

DateTime start = DateTime.Now;
// Do some lengthy operation
TimeSpan duration = DateTime.Now - start;
if (duration > MaxAllowedDuration)
{
    SlaViolations.Add(duration, MaxAllowedDuration, "When performing XYZ with A,B,C as parameters");
}

I took this example to Boo and extended the language to understand what SLA violation means. Then I could just put the semantics of the operations, without having to copy/paste this code.

Design patterns are a sign of language weakness - Indeed, a design pattern is, most of the time, just a structured way to handle duplication. Boo's [Singleton] attribute demonstrates well how I would like to treat such needs. Write it once and apply it everywhere. Do not force me to write it over and over again, then call it a best practice.

There is value in design patterns, most assuredly. Communication is a big deal, and having a structured way to go about solving a problem is important. That doesn't excuse code duplication, however.

Cyclomatic complexity is not a good measure of the complexity of a system - I agree with this as well. I have seen unmaintainable systems with very low CC scores. It was just that changing anything in the system required a bulldozer to move the mountains of code involved. I have seen very maintainable systems that had a high degree of complexity in parts. CC is not a good indication.

Let us go back to Steve's quotes above. It takes too long to read a million lines of code. IDEs break down at 20 million lines of code.

Well, in the code bases above, I can clearly and readily point out many sections that I have never read, and I have no idea how they are written or what they are doing. I never read those million lines of code.

As for putting 20 million lines of code in the IDE...

Why would I want to do that?

The secret art of having to deal with large code bases is...

To avoid dealing with large code bases.

Wait, did I just agree with Steve? No, I still strongly disagree with his conclusions. It is just that I have a very different approach than he seems to have for this.

Let us look at a typical project structure that I would have:

[Image: diagram of a typical project structure]

Now, I don't have the patience (or the space) to do it in a true recursive manner, but imagine that each of those items is also composed of smaller pieces, and each of those are composed of smaller parts, etc.

The key is that you only need to understand a single part of the system at a time. You will probably need to know some of the infrastructure, obviously, but you don't have to deal with it.

Separation of concerns is the only way to create maintainable software. If your code base doesn't have SoC, it is not going to scale. What I think that Steve has found was simply the scaling limit of his approach in a particular scenario. That approach, in another language, may increase the amount of time it takes to hit that limit, but it is there nevertheless.

Consider it the inverse of the usual "switch the language for performance" scenario, you move languages to reduce the amount of things you need to handle, but that scalability limit is there, waiting. And a language choice is only going to matter about when you'll hit it.

I am not even sure about the assertion that 150,000 lines of dynamic language code would be that much better than 500,000 lines of Java code. I think that this is utterly the wrong way to look at it.

Features mean code, no way around it. If you state that code size is your problem, you also state that you cannot deliver the features that the customer will eventually want.

My current project is ~170,000 LOC, and it keeps growing as we add more features. We haven't even had a hitch in our stride so far in terms of the project complexity. I can go in and figure out what each part of the system does in isolation. If I can't see this in isolation, it is time to refactor it out.

On another project, we have about 30,000 LOC, and I don't want to ever touch it again.

Both projects, to be clear, use NHibernate, IoC, DDD (to a point). The smaller project has much higher test coverage as well, and a much higher degree of reuse.

The bigger project is much more maintainable (as a direct result of learning what made the previous one hard to maintain).

To conclude, I agree with many of the assertions that Steve makes. I agree that C#/Java encourage duplication, because there is no way around it. I even agree that having to deal with a large amount of code at a time is bad. What I don't agree with is saying that the problem is with the code. The problem is not with the code, the problem is with the architecture. That much code has no business being in your face.

Break it up into manageable pieces and work from there. Hm... I think I have heard that one before...

NHibernate: Querying Many To Many associations using the Criteria API

Over a year ago I was asked how we can query a many to many association with NHibernate using the criteria API. At the time, that was not possible, but the question came up again recently, and I decided to give it another try.

First, let us recall the sample domain model:

Blog
    m:n Users
    1:m Posts
        n:m Categories
        1:m Comments

And what we want to do, which is to find all Posts where this condition is met:

Blog.Users include 'josh' and Categories includes 'NHibernate' and a Comment.Author = 'ayende'.

At the time, it wasn't possible to express this query using the criteria API, although you could do this with HQL. Doing this with HQL, however, meant that you were back to string concat for queries, which I consider bad form.

I did mention that a year has passed, right?

Now it is possible, and easy, to do this using the criteria API. Here is the solution:

DetachedCriteria blogAuthorIsJosh = DetachedCriteria.For<User>()
	.Add(Expression.Eq("Username", "josh"))
	.CreateCriteria("Blogs", "userBlog")
	.SetProjection( Projections.Id())
	.Add(Property.ForName("userBlog.id").EqProperty("blog.id"));

DetachedCriteria categoryIsNh = DetachedCriteria.For(typeof(Category),"category")
    .SetProjection(Projections.Id())
    .Add(Expression.Eq("Name", "NHibernate"))
    .Add(Property.ForName("category.id").EqProperty("postCategory.id"));

session.CreateCriteria(typeof (Post),"post")
    .CreateAlias("Categories", "postCategory")
    .Add(Subqueries.Exists(categoryIsNh))
    .CreateAlias("Comments", "comment")
    .Add(Expression.Eq("comment.Name", "ayende"))
    .CreateAlias("Blog", "blog")
    .Add(Subqueries.Exists(blogAuthorIsJosh))
    .List();

And this produces the following SQL:

SELECT This_.Id              AS Id1_3_,
       This_.Title           AS Title1_3_,
       This_.TEXT            AS Text1_3_,
       This_.Postedat        AS Postedat1_3_,
       This_.Blogid          AS Blogid1_3_,
       This_.Userid          AS Userid1_3_,
       Blog3_.Id             AS Id7_0_,
       Blog3_.Title          AS Title7_0_,
       Blog3_.Subtitle       AS Subtitle7_0_,
       Blog3_.Allowscomments AS Allowsco4_7_0_,
       Blog3_.Createdat      AS Createdat7_0_,
       Comment2_.Id          AS Id4_1_,
       Comment2_.Name        AS Name4_1_,
       Comment2_.Email       AS Email4_1_,
       Comment2_.Homepage    AS Homepage4_1_,
       Comment2_.Ip          AS Ip4_1_,
       Comment2_.TEXT        AS Text4_1_,
       Comment2_.Postid      AS Postid4_1_,
       Categories7_.Postid   AS Postid__,
       Postcatego1_.Id       AS Categoryid,
       Postcatego1_.Id       AS Id3_2_,
       Postcatego1_.Name     AS Name3_2_
FROM   Posts This_
       INNER JOIN Blogs Blog3_
         ON This_.Blogid = Blog3_.Id
       INNER JOIN Comments Comment2_
         ON This_.Id = Comment2_.Postid
       INNER JOIN Categoriesposts Categories7_
         ON This_.Id = Categories7_.Postid
       INNER JOIN Categories Postcatego1_
         ON Categories7_.Categoryid = Postcatego1_.Id
WHERE  EXISTS (SELECT This_0_.Id AS Y0_
               FROM   Categories This_0_
               WHERE  This_0_.Name = @p0
                      AND This_0_.Id = Postcatego1_.Id)
       AND Comment2_.Name = @p1
       AND EXISTS (SELECT This_0_.Id AS Y0_
                   FROM   Users This_0_
                          INNER JOIN Usersblogs Blogs3_
                            ON This_0_.Id = Blogs3_.Userid
                          INNER JOIN Blogs Userblog1_
                            ON Blogs3_.Blogid = Userblog1_.Id
                   WHERE  This_0_.Username = @p2
                          AND Userblog1_.Id = Blog3_.Id);

I am pretty sure that this is already in 1.2, but I don't have that handy to check.

Looking for a DSL idea

I need to start writing the second part of the book soon. This one is supposed to take a DSL implementation through all the interesting stages that I would like to cover. However, I am not sure yet what the subject of the DSL will be.
I need something that has enough scope to last for about a hundred pages, is complex enough to expose the usual problems of writing a DSL, and is not tied to a specific domain so strongly that it would be hard for outsiders to grasp.

I am also interested in knowing what kinds of patterns and problems you would like me to cover.

My current line of thinking is to build several DSL around the backend of an online store. That domain should be fairly familiar to all, and it is rich enough to offer a lot of things to discuss. It is also a good place to discuss several types of DSL.

I was thinking of the following DSL:

  • Message routing / dispatch DSL
  • Order processing DSL
  • Then we will extend that to be a generic rule engine DSL
  • Perhaps creating a testing DSL for those DSL or for the domain itself

The main problem that I have now is that building those DSL is very simple, and I don't think that it would be enough to cover everything that we need in order to build DSL. In that light, I am looking for either more ideas or challenges on the use of the above mentioned DSL.

Deploying from source control

This is just a quick note for public review. You are probably aware that I am doing deployments by doing a "svn up && build". I am now thinking about how we can apply the same idea to deploying DSL. This ensures, at the very least, that our DSL are under source control. But that has led me to another thought: if we are enforcing SCM for the DSL, why not enforce unit testing as well?

Part of the loading process of a DSL can be loading the DSL and its unit tests, executing the unit tests and only accepting them if they all passed. Failure in the unit tests or lack of unit tests would cause the DSL load process to fail.
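As a rough illustration of that gate, here is a minimal sketch in C#. DslLoader, DslLoadException and the shape of the tests dictionary are all assumptions made up for the example; a real loader would compile a script and discover its tests from source control rather than receive them as delegates.

```csharp
using System;
using System.Collections.Generic;

public class DslLoadException : Exception
{
    public DslLoadException(string message) : base(message) { }
}

public static class DslLoader
{
    // 'compile' produces the DSL instance; 'tests' maps a test name to a
    // check that returns true when the test passes.
    public static T Load<T>(string name, Func<T> compile,
                            IDictionary<string, Func<T, bool>> tests)
    {
        // No tests at all is treated the same as a failing test.
        if (tests == null || tests.Count == 0)
            throw new DslLoadException(name + ": no unit tests found, refusing to load");

        T dsl = compile();
        foreach (var test in tests)
        {
            if (!test.Value(dsl))
                throw new DslLoadException(name + ": unit test '" + test.Key + "' failed");
        }
        return dsl; // only reached when every test passed
    }
}

public static class LoaderDemo
{
    public static void Main()
    {
        // The "DSL" here is just a discount rule expressed as a delegate.
        var tests = new Dictionary<string, Func<Func<decimal, decimal>, bool>>
        {
            { "large orders get 10% off", rule => rule(200m) == 180m },
            { "small orders are unchanged", rule => rule(50m) == 50m }
        };

        Func<decimal, decimal> discounts = DslLoader.Load<Func<decimal, decimal>>(
            "discounts",
            () => total => total > 100m ? total * 0.9m : total,
            tests);

        Console.WriteLine(discounts(200m));
    }
}
```

Treating a missing test suite as a load failure is the strict reading of the idea; a softer variant could log a warning instead.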

Thoughts?

Boo: Design By Contract in 20 lines of code

Now, before Greg hurls a modopt on me, I want to be clear that this isn't the same thing that Spec# is doing. But it is a very cool way to specify constraints that must always be valid when a method exits.

Here is the code:

[AttributeUsage(AttributeTargets.Class)]
class EnsureAttribute(AbstractAstAttribute):
	
	expr as Expression
	
	def constructor(expr as Expression):
		self.expr = expr
		
	def Apply(target as Node):
		type as ClassDefinition = target
		for member in type.Members:
			method = member as Method
			continue if method is null
			block = method.Body
			method.Body = [|
				block:
					try:
						$block
					ensure:
						assert $expr
			|].Block

And the usage:

[ensure(name is not null)]
class Customer:
	name as string
		
	def constructor(name as string):
		self.name = name
	
	def SetName(newName as string):
		name = newName

Now, any attempt to set the name to null will cause an assertion exception. This technique is quite powerful, and very easy to use. A few years ago I wrote a design by contract implementation for Boo that was far more ambitious (handling inheritance, etc). I remember it being much more complicated, and while things like quasi quotation do make it easier, it is not that big a change.

I think that mostly it is the way I write code now. Striving for simplicity is something that I have been trying to apply recently, and I think it works.
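To make the transformation concrete, here is roughly what the Customer class looks like once every method body has been wrapped, written out by hand in C#. This is a sketch of the shape of the rewritten code, not actual compiler output: Boo's ensure block corresponds to finally, the CheckInvariant helper and the InvalidOperationException are stand-ins for the assert, and wrapping the constructor as well assumes that the AST treats constructors as methods.

```csharp
using System;

public class Customer
{
    private string name;

    public Customer(string name)
    {
        try
        {
            this.name = name;      // original constructor body
        }
        finally
        {
            CheckInvariant();      // assert name is not null
        }
    }

    public void SetName(string newName)
    {
        try
        {
            name = newName;        // original method body
        }
        finally
        {
            CheckInvariant();      // assert name is not null
        }
    }

    // Stand-in for the assert statement that the attribute injects.
    private void CheckInvariant()
    {
        if (name == null)
            throw new InvalidOperationException("Assertion failed: name is not null");
    }
}

public static class ContractDemo
{
    public static void Main()
    {
        var customer = new Customer("ayende");
        try
        {
            customer.SetName(null);
        }
        catch (InvalidOperationException e)
        {
            Console.WriteLine(e.Message);
        }
    }
}
```

Note that the check runs after the body, so the field has already been changed by the time the assertion fires; the contract reports the violation, it does not roll it back.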

Meta Methods

A meta-method is a shortcut into the compiler; it is a method that accepts AST nodes[1] and returns an AST node.

Let us implement this very simple scenario, the assert statement. Now, because Boo already has that, we will use “verify” as the method name. Here is the full method implementation:

[Meta]
static def verify(expr as Expression):
	return [|
		unless $expr:
			raise $(expr.ToCodeString())
	|]

We are using quasi quotation to save us typing. This is a static method decorated with the [Meta] attribute, and accepting an AST expression. This is all you need in order to create a meta-method. When you have a meta-method, you can call it, like this:

verify 1 == 2

Now the interesting tidbit happens. When the compiler sees a call to a meta-method, it doesn’t emit the code to call this method at runtime. Instead, during compilation, the meta-method is executed. We pass it the AST of the arguments of the method code (including anonymous blocks), and then we replace this method call with the result of calling the meta-method.

It is important that you understand that after compilation, where in the code we had this:

verify 1 == 2

The actual compiled bits will have this:

unless 1 == 2:
	raise "1 == 2"

Please go over it again, to make sure that you understand how it works. It is similar to text substitution macros in C and C++, but this is actual code that is running during compilation that gets to output any code that it wants back into the compilation process, not mere text preprocessing. In addition to that, we are dealing directly with the compiler’s AST, not just copying lines of text.

This seems to be something that a lot of people have a hard time grasping. The compiler will ask you, at compilation time, what kind of transformation you want to do on the code. It will then take the result of the transformation (the method return value) and put it where the method call used to be.

The Boo code above can also be translated to the following C#, which is a bit more explicit about what is going on:

[Meta]
public static UnlessStatement verify(Expression expr)
{
	UnlessStatement unless = new UnlessStatement();
	unless.Condition = Expression.Lift(expr);
	RaiseStatement raise = new RaiseStatement();
	raise.Exception = Expression.Lift(expr.ToCodeString());
	unless.Statements.Add(raise);
	return unless;
}

Both have the same exact semantics.

We have actually used meta-methods before, when we implemented the “when” keyword for the scheduling DSL. Meta-methods are used in DSL quite often. They are usually the first step that we need to take into the compiler when we run into the limits of what the compiler gives us out of the box.


[1] An AST node is a generic term for all the types that compose the abstract syntax tree of the language.

If it walks like a duck and it quacks like a duck

Then it must be an IQuackFu.

IQuackFu is Boo’s answer to the Method Missing / Message Not Understood from dynamic languages. Since Boo is a statically typed language[1], and since method missing is such a nice concept to have, we use this special interface to introduce this capability.

You are probably confused, because I didn’t even explain what method missing is. Let us go back and look at an example, shall we? We want to look at the following XML:

<People>
	<Person>
		<FirstName>John</FirstName>
	</Person>
	<Person>
		<FirstName>Jane</FirstName>
	</Person>
</People>

Now we want to display the first names in the XML. We can do it using XPath, but the amount of code required makes this awkward. We can also generate some sort of strongly typed wrapper around it, assuming that we have a schema for this; if we don’t have one already, we can use a tool to generate it…

Doesn’t it look like a lot of work? We can also do this:

doc = XmlObject(xmlDocument.DocumentElement)
for person as XmlObject in doc.Person:
	print person.FirstName

But we are using a generic object here, how can this work? This works because we intercept the calls to the object and decide how to answer them at runtime. This is the meaning of the term “method missing”. We “catch” the method missing and decide to do something smart about it (like returning the data from the xml document).

At least, this is how it works in dynamic languages. For a statically typed language, the situation is a bit different; all method calls must be known at compile time. That is why Boo introduced the idea of IQuackFu. Let us check the implementation of XmlObject first, and then we will discuss how it works:

class XmlObject(IQuackFu):
	_element as XmlElement

	def constructor(element as XmlElement):
		_element = element

	def QuackInvoke(name as string, args as (object)) as object:
		pass # ignored

	def QuackSet(name as string, parameters as (object), value) as object:
		pass # ignored

	def QuackGet(name as string, parameters as (object)) as object:
		elements = _element.SelectNodes(name)
		if elements is not null:
			return XmlObject(elements[0]) if elements.Count == 1
			return XmlObject(e) for e as XmlElement in elements

	override def ToString():
		return _element.InnerText

We didn’t implement QuackInvoke and QuackSet, because they are not relevant to the example at hand; I think that QuackGet will make the point. Now, just to complete the picture, we will rewrite the first code sample, the one that uses XmlObject, the way the compiler will output it.

doc = XmlObject(xmlDocument.DocumentElement)
for person as XmlObject in doc.QuackGet("Person"):
	print person.QuackGet("FirstName")

The way it works is this: when the compiler finds that it can’t resolve a method (or a property) in the usual way, it checks whether the type implements the IQuackFu interface. If it does implement IQuackFu, the compiler translates the call into the equivalent IQuackFu call (QuackGet, QuackSet or QuackInvoke).

The XmlObject example shows only a tiny part of the possibilities. Convention based methods are an interesting idea[2] that is widely used in Ruby. Here is an example that should be immediately familiar to anyone who dabbled in Rails’ ActiveRecord:

user as User = Users.FindByNameAndPassword("foo", "bar")

Which will be translated by the compiler to:

user as User = Users.QuackInvoke("FindByNameAndPassword", "foo", "bar")

The Users’ QuackInvoke method will parse the “method name” and issue a query by name and password.
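As a rough sketch of what "parse the method name" could look like, here is the same convention written in Python rather than Boo. Everything in it is hypothetical: the in-memory `Users` class, the dictionary rows, and the rule that the name is split on "And" are assumptions for illustration, not how ActiveRecord or any real QuackInvoke implementation works.

```python
class Users:
    """Hypothetical in-memory 'table' with convention-based finders."""

    def __init__(self, rows):
        self._rows = rows

    def __getattr__(self, name):
        # Only names that follow the FindBy<Field>And<Field> convention are handled.
        if name.startswith("_") or not name.startswith("FindBy"):
            raise AttributeError(name)
        # "FindByNameAndPassword" -> ["name", "password"]
        # (naive: a field whose name contains "And" would be split incorrectly)
        fields = [part.lower() for part in name[len("FindBy"):].split("And")]

        def finder(*values):
            if len(values) != len(fields):
                raise TypeError(f"{name} expects {len(fields)} arguments")
            criteria = dict(zip(fields, values))
            for row in self._rows:
                if all(row.get(key) == value for key, value in criteria.items()):
                    return row
            return None  # no row matched

        return finder


users = Users([
    {"name": "foo", "password": "bar"},
    {"name": "baz", "password": "qux"},
])
user = users.FindByNameAndPassword("foo", "bar")
print(user)  # {'name': 'foo', 'password': 'bar'}
```

A real implementation would build a database query from the parsed field names instead of scanning a list, but the parsing step is the same idea.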

You can do some very interesting things with IQuackFu...


[1] Well, it is statically typed unless you explicitly tell the compiler that you want late bound semantics. Aside from working against IDispatch COM interfaces, I have rarely found that ability useful. One case I did find it useful, however, was when I wanted to introduce Context Parameters, which we will discuss in a few pages.

[2] For the adventurous sorts, you can also do something called Lazy Methods, in which you generate a method if and only if it is being called. This is an interesting exercise in extending the compiler, but for all intents and purposes, IQuackFu answers this need very well.

Web development with training wheels?

This quote has me floored:

Well, I'm an asp.net developer, not really a web developer. It is like web development with training wheels, only the training wheels are really heavy, uneven, and make riding the bike harder

Statically typed? Compiler checked? Ha!

Just a nod toward the people that cling to static typing with both hands, their teeth and the tail:

RouteTable.Routes.Add(new Route
{
	Url = "admin/[controller]/[action]",
	Defaults = new
	{
		Controller = "Admin",
		Acton = "Index"
	},
	Validation = new
	{
		Conrtoller = "Admin|Users|Categories"
	},
	RouteHandler = typeof(MvcRouteHandler)
});

Now, instead of abusing the language to get this, can we get first class support for these things?

Cross Site Scripting and letting the framework deal with it

Rob Conery asks how the MS MVC platform should handle XSS attacks. In general, I feel that frameworks should do their best to be secure by default. This means that, by default, you should encode everything that comes from the user to the app. People seem to think that encoding inbound data will litter your DB with encoded text that isn’t searchable and consumable by other applications.

That may be the case, but consider, what exactly is getting encoded? Assuming that this is not a field that requires rich text editing, what are we likely to have there?

Text, normal text, text that can roundtrip through HTML encoding without modifications.

HTML style text in most of those form fields is actually rare. And if you need to have some form of control over it, you can always handle the decoding yourself. Safe by default is a good approach. In fact, I have a project that uses just this approach, and it is working wonderfully well.

Another approach for that would be to make outputting HTML encoded strings very easy. In fact, it should be so easy that it would be the default approach for output strings.

Here, the <%= %> syntax fails. It translates directly to Response.Write(), which means that you have to take an extra step to get secured output. I would suggest changing, for MS MVC, the output of <%= %> to output HTML encoded strings, and providing a secondary way to output raw text to the user.

In MonoRail, Damien Guard has been responsible for pushing us in this direction. He had pointed out several places where MonoRail was not secure by default. As a direct result of Damien's suggestions, Brail has gotten the !{post.Author} syntax, which does HTML encoding. This is now considered the best practice for output data, as well as my own default approach.

Due to backward compatibility reasons, I kept the following syntax valid: ${post.Author}, mainly because it is useful for doing things like outputting HTML directly, such as in: ${Form.HiddenField("user.id")}. For the same reason, we cannot automatically encode everything by default, which is controversial, but very useful.

Regardless, having a very easy way ( !{post.Author} ) to do things in a secure fashion is a plus. I would strongly suggest that the MS MVC team do the same. Not a "best practice", not "suggested usage", simply force it by default (and allow an easy way out when needed).
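To illustrate the two-syntax idea, here is a tiny sketch in Python. It is emphatically not how Brail is implemented: the `render` function, the regular expression, and the restriction to simple placeholder names (no dotted paths such as post.Author) are all assumptions made for the example. It only shows the principle of an encoded placeholder next to a raw one.

```python
import html
import re

# Matches !{name} (encoded output) and ${name} (raw output).
_PLACEHOLDER = re.compile(r"([!$])\{(\w+)\}")


def render(template, values):
    """Substitute placeholders; '!' HTML-encodes the value, '$' emits it as-is."""

    def substitute(match):
        marker, name = match.group(1), match.group(2)
        text = str(values[name])
        return html.escape(text) if marker == "!" else text

    return _PLACEHOLDER.sub(substitute, template)


page = render(
    "<p>!{author}</p>${hidden_field}",
    {
        "author": "<script>alert(1)</script>",
        "hidden_field": '<input type="hidden" name="user.id" value="1">',
    },
)
print(page)
```

The user supplied author comes out as inert text, while the markup the application itself generated passes through untouched. The safe form is no harder to type than the raw one, which is the whole point.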

Making diagrams for dummies

I have been getting a lot of questions about how I make the diagrams for the blog.

I got some very strange suggestions from people, from having a full blown art department dedicated to producing those to drawing the diagrams on physical paper and then taking pictures of that on a wooden table.

The truth is far more boring, I am afraid.

I generally use the following tools to produce the diagrams:

  • Google Image Search
  • Power Point
  • MS Paint
  • Visual Studio

I use Visual Studio's class diagram designer to create the class diagrams, usually from empty projects, not the real ones. Then I copy the image to Power Point, where I do some mixing & matching.

Power Point is really nice in this regard, because it offers a rich set of effects that even a graphical dummy like me can use effectively.

I use Google Image Search to find relevant images, and then drag them into Power Point as well, making the same style of effects there. Then, if needed, I move it to MS Paint for final processing, and from there, to the blog.

This approach works well for someone who needs three tries to draw a straight line using a ruler. It also doesn't take very long, which is important as well.

AOP: Be aware where your point cuts are

So, this issue caused some head scratching today. We are using Windsor's Automatic Transaction Management with NHibernate's flush-on-commit option, so if a transaction doesn't commit, nothing is written to the database.

Anyway, this is a story about refactoring, and what it showed us. We performed the following refactoring:

image

Some things that are important to understand: the LoginController is decorated with [Transactional], and there is a [Transaction] attribute on CreateUserLoggedInAuditRecord.

When it was on the controller, it just worked. When we moved it to its own class, it didn't work. To be rather more exact, it worked, it just never committed the transaction. That was weird. After some head scratching I found out that I forgot to put [Transactional] on the UsageRegistrationImpl. With a small smile of geeky triumph, I ran the code again. It didn't save.

That was really worrying, and I had no idea what was going on. Since this is rarely popular, I repeatedly ran the code, hoping that something would turn up and that no one would pull the old quote about insanity.

After a few repetitions, I suddenly saw the light.


It had to do with where I placed the pointcut. A pointcut, in AOP terms, is where the AOP can interfere with the running code. Let us take a look at how it worked when we used the LoginController directly. Because we (well, the transaction facility) asked the container to create an interceptor for it, we got the following classes at runtime:

image

The login controller is the original class, the login controller proxy was generated at runtime, and any invocation of any of its methods would fire the transaction interceptor, so it would get a chance to create/rollback/commit a transaction if needed. Since those methods are virtual, this means that even if I am calling methods on the same class, they will be intercepted correctly.

Now, when I moved to the interface + implementing class, we got a different behavior. We now use the interface as the pointcut in order to inject behavior. It looks like this:

image

Windsor will create a proxy interface implementation that would call the AOP interceptors and will forward to the UsageRegistrationImpl.

The problem was with the RegisterUserLoggedIn method. It was similar to this:

public virtual void RegisterUserLoggedIn(string username)
{
	// do other things
	CreateUserLoggedInAuditRecord(username);
}

[Transaction]
public virtual void CreateUserLoggedInAuditRecord(string username)
{
	//do database stuff
}

Given the story so far, you can obviously see the problem. When we call the CreateUserLoggedInAuditRecord() method, we call it from the UsageRegistrationImpl class, so we never pass through any of the pointcuts.

When we used the method from the controller directly, we made a virtual method call, which was intercepted by the proxy subclass. In this case, however, we were using the interface as our pointcut, so the internal call simply bypassed the whole thing.
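The difference between the two kinds of proxies can be shown with a small sketch. This is plain Python, not Windsor or DynamicProxy, and the class names, the snake_case method names, and the `calls` list standing in for the transaction interceptor are all hypothetical. It only models the dispatch: a subclass proxy sees self-calls, a forwarding (interface style) proxy does not.

```python
calls = []  # records every time the hypothetical interceptor fires


class UsageRegistration:
    def register_user_logged_in(self, username):
        # Self-call: dispatched on `self`, whatever object that happens to be.
        self.create_audit_record(username)

    def create_audit_record(self, username):
        pass  # database work would go here


class SubclassProxy(UsageRegistration):
    """Class proxy: overrides the method, so self-calls are intercepted too."""

    def create_audit_record(self, username):
        calls.append("subclass proxy")
        super().create_audit_record(username)


class InterfaceProxy:
    """Interface proxy: a separate object that forwards to the real target."""

    def __init__(self, target):
        self._target = target

    def register_user_logged_in(self, username):
        self._target.register_user_logged_in(username)

    def create_audit_record(self, username):
        calls.append("interface proxy")
        self._target.create_audit_record(username)


SubclassProxy().register_user_logged_in("ayende")
InterfaceProxy(UsageRegistration()).register_user_logged_in("ayende")
print(calls)  # ['subclass proxy']
```

Only the subclass proxy ever records the call. With the forwarding proxy, the inner call goes from the target straight to itself, so the interceptor never gets a chance to run.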

That was an interesting lesson, and one that I'll need to remember for the future.

Moq: Mocking in C# 3.0

Daniel has a very interesting post about how mocking can work in C# 3.0 (I don't like the term Linq for Mock, if you haven't noticed).

I wonder if he has used Rhino Mocks. From the description of mock frameworks in the post, it looks like he hasn't.

At any rate, the syntax that he has there is quite interesting:

var mock = new Mock<IFoo>();
mock.Expect(x => x.DoInt(It.Is<int>(i => i % 2 == 0))).Returns(1);

The use of lambda as a way to specify expectations is cool, and as a way to specify constraints, flat out amazing.

You can do most of that with Rhino Mocks right now, I feel forced to point out.

IFoo foo = mocks.DynamicMock<IFoo>();
Expect.Call( () => foo.DoInt(0) )
	.Callback( (int i) => i % 2 == 0 )
	.Return(1);

We even have an Is.Matching constraint, that we can use instead:

Expect.Call( () => foo.DoInt(0) )
	.Constraints( Is.Matching<int>( i => i % 2 == 0) )
	.Return(1);

I guess we will just need to see what kind of cool stuff we still have in store for it.

Great job, Daniel.

I suggest that you check this out. The Quick Start gives a few more examples, and it is looking really nice.

By the way, for the horde of people who would like to "port" those examples to Rhino Mocks, we have a wiki... :-) 

Hibernating Rhinos 7: Rhino Igloo


I got quite a few requests for more information on this, beyond the short documentation in this post. I was reluctant to do this, because I feel that Rhino Igloo represents a compromise that I am not very happy with. Consider this the product of a developer longing for MonoRail while having to deal with the WebForms world.

I am not making excuses for this project. It is meant to serve a very specific goal, and it has done that very successfully. It is also extremely opinionated and may not fit what you want to do.

  • Length: 57 minutes
  • Download size: 79 MB
  • Code starts at: 11:14

You can download the screen cast here. There is a secret message there, let us see if you can spot it.

Update: looks like the file I uploaded is corrupted, I'll upload a new one soon, in the meantime, I removed it from the download page.

Update 2: Uploaded a good version, it is available now.

Do you trust your compiler? Really trust your compiler?

There is a discussion in the alt.net mailing list right now about how far you can and should trust your compiler. I think that this is interesting, because this piece of code of mine is on its way to production:

public Guid Create<T>(T newEntity)
{
	using (CrmService svc = GetCrmService())
	{
		object cheatCompiler = newEntity;
		Guid guid = svc.Create((BusinessEntity) cheatCompiler);
		return guid;
	}
}

This is part of an implementation of an interface in an assembly that cannot reference BusinessEntity.

I am feeling good with this.

Choices...

IFooFactoryFactoryFactoryFactory vs. Factory<Factory<Factory<Factory<Factory<IFooFactory>>>>>

Discuss...

How to get around in Boo

When you are not sure how to do something in Boo, try doing it like you would with C# (with the obvious syntax changes). In most cases, it will work. It may not be the best way to do something, however.

Keep this a secret. I may get thrown out of the Boo Associated Hackers community if this gets out, and where would I be without my BAH! membership?