Ayende @ Rahien

It's a girl

Macto, Authorization decisions

image

Authorization is one of the few ubiquitous requirements; you are going to have to handle it in pretty much every system that you build.

The users are the staff, and the securables are the inmates. The problem is that we have fairly different authorization requirements for different parts of the system.

For example, any authenticated user can look up pretty much any inmate’s data (except for medical records, of course), but changing a release date is something that only Legal can do. Only the staff of the enclosure that the inmate is located in can Sign Out the inmate. Actually releasing an inmate requires Legal and On Duty Officer approval, etc.

However, during weekends, the on duty staff assume responsibility for the entire prison. That is, an officer from enclosure C can Sign Out an inmate from enclosure A if that officer is the on duty officer.

There are a few more complications, but we will ignore them for now. One thing is fairly clear: we have complex authorization requirements, and they are different for each part of the system. For that matter, the way we make security decisions is itself different.

And since authorization decisions are synchronous (you can make them async, sort of, but at a very high cost), performance is a critical concern. This is especially true because there is a strong tendency to make authorization checks many times.

Given that, and given the complexity inherent to authorization, I think that we can sidestep the entire problem by changing the rules of the game.

Most authorization systems would have you do something like:

auth.IsAllowed(CurrentUser, "/Inmate/Move", inmate);

And rely on the system to do its magic this way. The problem with this approach is that we provide the authorization system with very little information to work with. That means that in order to make authorization decisions, the system has to have a way to access other data (such as which enclosures the current user is in charge of, where the inmate is, what his status is, etc.).

The problem then becomes more an issue of data retrieval complexity than of authorization rule complexity. I think that we can avoid this by designing the system with more flexibility, providing the required data to the authorization system explicitly.

What do I mean? Well, just take a look:

auth.IsAllowed(
	new SignOutAuthorization
	{
		OfficerRank = CurrentUser.Rank,
		OfficerRoles = CurrentUser.GetAllCurrentRoles(),
		OfficerEnclosures = CurrentUser.GetEnclosuresUserIsResponsibleFor(),
		InmateEnclosure = inmate.Enclosure,
		InmateStatus = inmate.Status,
	}
);

In other words, we are explicitly providing the authorization system with all the data that it needs for a particular task. That means, in turn, that we can now execute the authorization decision completely locally, without having to go somewhere to fetch the data. It also opens the option of using a DSL to build the authorization rules, which will make things more dynamic and easier to work with.
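To make the idea concrete, here is a minimal sketch of what such a locally executing rule might look like. The class and member names mirror the example above, and the specific rule logic is my assumption, not a quote from any real system; in particular, the weekend rule is folded into an "OnDutyOfficer" role for simplicity.

```csharp
using System;
using System.Linq;

// Hypothetical request object: the caller supplies every piece of data the
// rule needs, so the check runs entirely in memory.
public class SignOutAuthorization
{
    public string[] OfficerRoles = new string[0];
    public string[] OfficerEnclosures = new string[0];
    public string InmateEnclosure;
    public string InmateStatus;
}

public static class Auth
{
    // No repository or database access here; the decision is purely local.
    public static bool IsAllowed(SignOutAuthorization request)
    {
        if (request.InmateStatus != "InCustody")
            return false;

        // Staff can sign out inmates from their own enclosures...
        if (request.OfficerEnclosures.Contains(request.InmateEnclosure))
            return true;

        // ...and the on duty officer covers the entire prison (the weekend
        // rule from above, folded into a role for the sake of the sketch).
        return request.OfficerRoles.Contains("OnDutyOfficer");
    }
}
```

Because the rule only touches the request object, it can be unit tested trivially, and it could later be replaced by a DSL-driven implementation without changing the call site.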

Reviewing NerdDinner: The Select N+1 pitfall

During my review of NerdDinner, I came across the following piece of code:

image

And I knew that there was a Select N+1 problem here. This is quite similar to the problem that I described here. RSVPs is a lazily loaded collection; as such, calling Any() on it will result in a database query.

Now the question was whether or not it is used somewhere in a loop. ReSharper will help us figure that one out:

image

The first usage location is here:

image

And this certainly doesn’t look like it can cause a Select N+1, right? Just to be sure, I checked where it is called from on the client side as well:

image

This looks all right, we will have a query, but it is only one per request.

Great, now let us take a look at the second usage. You can tell that this is going to be a problem just by the .ascx suffix. The code is pretty simple:

image

I have a moral issue with the view generating a call to the database, but at least it isn’t being done in a loop.

I am certain, however, that we are going to find this .ascx file in a grid, which will cause a loop. This .ascx file is being used in Details.aspx, like this:

image

However, it is not actually being used in a loop.

There is no Select N+1 issue, which renders this entire post moot.

However, while it is not an active Select N+1, it is a dormant one. It would be very easy for a new requirement to make use of the RSVPStatus.ascx file in a loop and cause the issue.
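To see why the dormant problem matters, here is a small simulation (not NerdDinner code; the fake data context below just counts round trips, and all names are mine) showing how calling Any() on a lazy collection inside a loop multiplies queries, while answering the question as part of the original query keeps it to one:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Stand-in for a data context: every lazy-collection access costs one query.
public class FakeDb
{
    public int QueryCount;
    private readonly Dictionary<int, int> rsvpCounts = new Dictionary<int, int>();

    public FakeDb(params int[] dinnersWithRsvps)
    {
        for (int id = 0; id < 10; id++)
            rsvpCounts[id] = dinnersWithRsvps.Contains(id) ? 1 : 0;
    }

    public IEnumerable<int> DinnerIds { get { return rsvpCounts.Keys; } }

    // Lazy style: one query per dinner (the Select N+1 shape).
    public bool AnyRsvpsLazy(int dinnerId)
    {
        QueryCount++;
        return rsvpCounts[dinnerId] > 0;
    }

    // Set style: one query answers the question for all dinners at once.
    public HashSet<int> DinnersWithRsvps()
    {
        QueryCount++;
        return new HashSet<int>(
            rsvpCounts.Where(kv => kv.Value > 0).Select(kv => kv.Key));
    }
}
```

Rendering something like RSVPStatus.ascx in a grid over ten dinners issues ten queries the lazy way, but only one the set-based way; that is the difference between a dormant and an active Select N+1.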

Reviewing NerdDinner

Scott Hanselman has asked me to see if I can port NerdDinner to NHibernate, now that Linq for NHibernate has reached RTM. I thought that I might as well take the time to look at the code and do a code review.

I would like to say a few things before actually even starting to look at the code. I am not going to review the code as a sample application; I am going to review it as if it were a standard production project. I am assuming that (remember, I haven’t seen the code yet) there are things there that are done that way explicitly because they are easier to explain in a sample app than they would be in a more complex project.

The first thing that I opened was HomeController, which I am going to show you in full.

image

There are two things that I don’t like here.

First, the empty actions (return View() is empty as far as I am concerned), resulting in what is, for all intents and purposes, an empty controller (one that contains only empty actions). In other words, the only reason that this controller exists is because that is the way ASP.NET MVC wants it. A brief overview of the codebase shows me that there are more than a few methods like that.

I would deal with something like that completely on the view side, having a routing action that would check for the view directly and just render it, rather than creating empty actions and controllers.
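A sketch of what I mean, with tiny stand-ins for the framework types so the snippet is self-contained (in a real app, Controller and ViewResult come from System.Web.Mvc, and a catch-all route such as "pages/{viewName}" would supply the view name; all names here are mine):

```csharp
// Minimal stand-ins for the framework types, for illustration only.
public class ViewResult
{
    public string ViewName;
}

public abstract class Controller
{
    protected ViewResult View(string viewName)
    {
        return new ViewResult { ViewName = viewName };
    }
}

// One catch-all controller replaces a pile of empty actions: the route
// decides which view gets rendered, and no per-page controller is needed.
public class PagesController : Controller
{
    public ViewResult Render(string viewName)
    {
        // A production version would verify the view actually exists
        // and return a 404 otherwise.
        return View(viewName);
    }
}
```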

Second, [HandleErrorWithELMAH]: such things should not be on individual controllers (and that attribute appears on all controllers). It should be on a base controller, because right now it is a violation of DRY, and it is easy to forget.
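A sketch of the fix. The attribute and Controller below are stand-ins (the real ones come from the ELMAH integration and System.Web.Mvc), but the inheritance mechanics are the point: declare the attribute once on a base class and every controller picks it up.

```csharp
using System;

// Stand-ins for illustration; MVC filter attributes are inherited by default.
[AttributeUsage(AttributeTargets.Class, Inherited = true)]
public class HandleErrorWithELMAHAttribute : Attribute { }

public abstract class Controller { }

// Declared once, on the base class...
[HandleErrorWithELMAH]
public abstract class BaseController : Controller { }

// ...and every derived controller gets it for free, with no way to forget it.
public class HomeController : BaseController { }
```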

Then there are the tests for HomeController, which are… less than useful, shall we say?

image

Next, let us take a look at the following interface:

image

I have a serious problem with the Save() method. I had to look at the implementation to figure out what it is doing. It should be called FlushToDatabase, or SubmitChanges. Save() is a totally unexpected name for a method whose implementation is:

image

You are not, actually, saving anything.

As an aside, I don’t like seeing things like this one:

image

Linq to SQL actually has a way of handling cascading deletes, and it should have been used.

For that matter, coming back to the tests, there aren’t any persistence tests. Since there is some complexity involved in the queries (geo-located queries), I would have liked to see at least those queries covered.

I don’t want to rehash the poor man’s IoC story, but this is really pissing me off:

image

I mean, if you want to do poor man’s IoC, go ahead. But please don’t create this bastard child.

This is more about code style than hygiene, but I don’t like it:

image

Those classes should be in separate files; they should not be in AccountController.

Overall, I don’t see too many issues with the codebase. The only thing that pissed me off was the Mad Man IoC.

Macto, defining Centralized Service, Distributed Service and Localized Component

I have lately come to the conclusion that I need a few new terms to describe some common ways that I structure different components in my applications.

Those are Centralized Service, Distributed Service and Localized Component. Mostly, I use them as a way to express the distribution semantics of the item in question.

As you can probably guess, I am using the term service to refer to something that we make remote calls to, while I am using the term component to refer to something that is running locally.

Centralized Service is probably the classic example of a web service. It is a server running somewhere to which we make remote calls. As far as the system is concerned, there is only one such server. It may be implemented with clustering or load balancing, but logically (and quite often, physically), it is a single server that is processing requests. This is probably the easiest model to work with, since it more or less removes the entire question of concurrency conflicts from the system. Internally, the Centralized Service uses transactions or locks to ensure coherency in the face of concurrency.

Distributed Service is built upfront to run on a set of servers, and the need to handle concurrency conflicts is built into the design of the system. That may be done using sharding, Paxos or other methods. Usually, we build a Distributed Service for very high scalability / reliability cases, since it tends to be the more complex solution. An example of a Distributed Service would be DNS, where we explicitly design the system to be resilient to failure, but accept the more complex concurrency issues (slow updates).

Localized Component is a solution to the chatty interface problem. There are quite a few scenarios where we need to make calls to a separate subsystem, but the cost of network traffic completely outweighs the cost of actually performing the operation on the other side. In this case, we may switch from a Centralized Service to a Localized Component. What this means is that instead of executing the operation on the other side, we perform it locally.

In practice, this means that we need to design our system in such a way that any data we would like to have is structured so that it can be brought locally or retrieved very cheaply. An example of such a system appears in this post, although that is a fairly complex one. A more common situation is a component that deals with a set of rules, where we simply need to get the rule from the rule repository and execute it locally.

Another alternative for Localized Components is to structure them in such a way that retrieving and persisting the data is cheap, and processing it is done locally. That way, sharing the data and actually processing the data are two distinct issues, which can be resolved separately. A common issue that needs to be resolved with Localized Components is consistency: if the component allows writes, how do other instances of the component, running on different machines, get notified about them?
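A minimal sketch of that consistency concern, assuming a simple notification channel; here a plain .NET event stands in for whatever messaging infrastructure would actually carry the change between machines, and all names are my own invention:

```csharp
using System;
using System.Collections.Generic;

// A Localized Component holding rules in memory; writes raise a notification
// that other instances use to drop their now-stale copy.
public class LocalRuleCache
{
    private readonly Dictionary<string, string> rules =
        new Dictionary<string, string>();

    public event Action<string> RuleChanged;

    public string Get(string id)
    {
        string rule;
        return rules.TryGetValue(id, out rule) ? rule : null;
    }

    public void Put(string id, string rule)
    {
        rules[id] = rule;
        var handler = RuleChanged;
        if (handler != null)
            handler(id); // in a real system: publish over the wire
    }

    public void Invalidate(string id)
    {
        rules.Remove(id); // the next Get falls back to the cheap retrieval path
    }
}
```

Wiring `instanceA.RuleChanged += instanceB.Invalidate;` (via the real transport) is the whole consistency story for this style of component: writes are allowed anywhere, and stale local copies are simply dropped and re-fetched.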

I tend to avoid Distributed Services in favor of Centralized Services and Localized Components, which tend to be easier to work with. It is also easier to lean on existing infrastructure than to write an implementation from scratch. For example, I am using Rhino DHT (which I consider to be a Distributed Service) to handle a lot of the complexity inherent in one of those.

Macto, a module features spec sheet for authentication

image

I am going to talk about a few ways of trying to organize a project, mostly as a way to lay out the high level requirements for a feature or a module.

I consider filling in one of those to be the act of actually sitting down and thinking about the concrete design of the system. It is not the final design, and it is not set in stone, but in general it forces me to think about things in a structured way.

It is not very hard to do, so let us try to do it for the authentication part of the application. Authentication itself is a fairly simple task. In a real corporate environment, I’ll probably need integration with Active Directory, but I think that we can do with a simple username & password in the sample.

Module: Authentication

Tasks: Authenticate users using name/pass combination

Integration: Publish notifications for changes to users

Scaling constraints:

Up to 100,000 users, with several million authentication calls per day.

Physical layout:

Since the system only needs to handle a small amount of users, we have two separate deployment options: Centralized Service* and Localized Component*. Both options are going to be developed, to show both approaches.

Features, descriptions and SLAs:

  • authenticate user: based on name & password; lock a user after 5 failed logins. SLA: less than 50 ms per authentication request 99.9% of the time, at 100 requests per second per server.
  • create new user: user name, password, email. SLA: less than 250 ms per request 99.9% of the time, at 10 requests per second globally.
  • change password. SLA: less than 250 ms per request 99.9% of the time, at 10 requests per second globally.
  • reset password. SLA: less than 250 ms per request 99.9% of the time, at 10 requests per second globally.
  • enable / disable user: disable / enable the option to login to the system. SLA: less than 250 ms per request 99.9% of the time, at 10 requests per second globally.

You should note that while I don’t expect to have that many users in the system, or to have to handle that load, for the purpose of the sample I think it would be interesting to see how to deal with such requirements.

The implication of this spec sheet is that the system can handle about 8.5 million authentication requests per day (100 requests/second per server × 86,400 seconds ≈ 8.6 million) and around three quarters of a million user modification requests (10 requests/second × 86,400 seconds ≈ 860,000).

There are a few important things to observe about the spec sheet. It is extremely high level; it provides no actual implementation semantics, but it does provide a few key data items. First, we know what the expected data size and load are. Second, we know what the SLAs for those are.

* Centralized Service & Localized Component are two topics that I’ll talk about in the future.

Tags:

Published at

Originally posted at

Comments (13)

What is maintainable?

Frans Bouma left the following in this post, which I found quite interesting:

The person who spend a lot of time inside the code base obviously finds it logical and knows his way through the codebase and where to make changes what to avoid etc. For that person, the codebase is maintainable. It is different for a person who's new to the codebase. If that person has to spend a lot of time figuring out where what is and most importantly: why, it's already a sign things aren't as maintainable as it should.

I have to say that I disagree quite strongly with this definition.

Maintainable is a value that can only be applied by someone who is familiar with the codebase. If that someone finds it hard to work on the codebase, it is hard to maintain. If someone with no knowledge of a codebase finds it hard to work with, tough luck, but that doesn’t say anything about the maintainability of the codebase.

A codebase is more than just a code sample, it is conventions, ways of doing things, overall architecture and a sense of how to do things.

Now, there is a separate value, related to how easy it is to bring someone up to speed on the codebase, but that does not relate directly to maintainability.


A short note about NHibernate and Silverlight

I got a few questions about NHibernate and Silverlight. That is actually a very easy thing to answer.

Don’t even try. They don’t get along. In fact, they aren’t ever going to get along.

Silverlight doesn’t have System.Data.IDbConnection, and you can safely assume that it is somewhat important to NHibernate.

So, running NHibernate inside a Silverlight application, presumably in order to access a local database is out. But I don’t think that this is what most people actually had in mind when they ask about NHibernate and Silverlight. They want to know about NHibernate on the server and Silverlight on the client.

And that is easy enough to answer as well: it is going to work just like any client / server system. All the same rules apply.


Macto, or How To Build a Prison

image

The sample application that I am going to build is going to be a prison management application. I am going to take this post as a chance to talk about it a bit, discuss the domain and then I’ll talk about the overall architecture in more details.

The domain of a prison is actually fairly simple: you have an inmate, and the sole requirement is that you keep him (it tends to be overwhelmingly him, rather than her) in lawful custody.

The term lawful custody has a lot of implications, which are, in more or less their order of importance:

  • The inmate is in custody, that is, he didn’t manage to run away.
  • Custody is lawful, that is, you have legal authorization to keep him in jail. Usually that means an order by a judge or, for the first 24 hours, by a police officer.
  • Lawful custody itself means that you:
    • keep the inmate fed
    • keep him in reasonable conditions (sleeping quarters, sanitation, space)
    • give him access to medical facilities. Indeed, in most prisons the inmates get better health care, especially for emergencies, than the people living in most big cities.
    • let him communicate with lawyers and family

The devil, however, is in the details. I am pretty sure that I could sit down and write about 250 pages of high level spec for things that are absolutely required for a system that runs a prison and still not get everything right.

In practice, at least in the prisons I served at, we did stuff using paper, some VB6 apps & Access, and on one memorable occasion, an entire set of small prisons was running on what amounted to a full blown application written using Excel macros.

Anyway, what I think that I’ll do is start with a few modules in the system, not try to build a full blown system.

The modules that I’ll start with would be:

  • Staff – Managing the prison’s staff. This is mostly for authentication & authorization for now.
  • Roster – Managing the roster of the prisoners, recording Countings, etc.
  • Legal – Managing the legal side of the prisoners, ensuring that there are authorizations for all the inmates, court dates, notifications, etc.
  • Escort – Responsible for actually taking the inmates out for court, medical evacs, releasing inmates, etc.

That is enough for now; for that matter, it is a huge workload already, but that is about the only way in which I can actually have a chance to show a big enough system and the interactions between all its parts.

Unethical behavior

Ben is pointing out something that I find flat out infuriating, a TFS MVP had removed a comment talking about SVN & Git from his blog with the following explanation:

“No offense, but I deleted your comment.  I make way too much $$ on Team System training & consulting to go publicly plugging alternative options.”

I am… disappointed. I started to write shocked, but it is not the first time that I have seen stuff like that happen.

What bothers me even more: if you can’t deal with criticism of something that you are doing for a living, how can you call yourself a consultant in the field? How can you actually point out the options if you refuse to even look at them or engage in conversation about them?

This makes me seriously doubt the professionalism of the person in question, to say the least.

image

Independent expert? I don’t think so.

Real world answers? Hardly.

Update: Ben Day has posted a reply about this.

Pitfalls

Mike Scott has pointed out a bug in this code, relating to the use of DateTime.Now vs. DateTime.UtcNow, which can cause inaccurate wait durations around daylight saving time changes.

That made me think for a bit. There are things that just naturally make me highly suspicious, because they are a common source of bugs.

Using a WCF service in a using block, creating a session factory, a foreach loop with database calls, a select without a limit clause.
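The WCF-in-a-using-block item deserves a word: Dispose() on a WCF client channel calls Close(), which itself throws if the channel is faulted, masking the original exception. Here is the usual close-or-abort pattern as a sketch, with a tiny stand-in interface (and a test double) instead of the real ICommunicationObject so the snippet runs standalone:

```csharp
using System;

// Stand-in mirroring the shape of WCF's ICommunicationObject.
public interface IChannelLike
{
    void Close(); // may throw if the channel is faulted
    void Abort(); // never throws; tears the channel down
}

public static class ChannelHelper
{
    // Instead of `using (client) { ... }`: close on success, abort on failure,
    // and let the original exception propagate unmasked.
    public static void Call(IChannelLike client, Action work)
    {
        try
        {
            work();
            client.Close();
        }
        catch
        {
            client.Abort();
            throw;
        }
    }
}

// Test double recording what happened to the channel.
public class FakeChannel : IChannelLike
{
    public bool Closed, Aborted;
    public void Close() { Closed = true; }
    public void Abort() { Aborted = true; }
}
```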

What are the things that cause a flashing neon sign to go through your head when you see them in code? The stuff that you know will have to be re-read twice to make sure it is bug free.

More on Macto

Looking at the responses that I got, I think that there is some basic misunderstanding about the goal of the sample. Some people seem to want this to be a usable product; some even went ahead and specified some… interesting requirements.

Unlike Storefront, I don’t intend to create something that would be a useful component to take and use, for the simple reason that I don’t think that it would allow me to show anything really interesting. Any generic component or system has to either make too many assumptions (constrained) or not enough (open ended). I don’t care to have to hand wave things too much.

Given that I am a domain expert on exactly two things, and that I am not going to create Yet Another Bug Tracking Application, I think that it is fairly obvious what I have to build.

Macto is going to be a prison management system.

I am going to use it to demonstrate several topics that I have been dealing with lately, among them the concepts & features architecture and how to build scalable systems.

I’ll let you stew on that and will post more details about Macto tomorrow.

NHibernate Linq 1.0 released!

The most requested feature for NHibernate for the last few years has been Linq support, and it gives me great pleasure to announce the release of NHibernate Linq 1.0 RTM, which you can download from here.

NHibernate Linq support is based on the existing, proven in production, Linq provider in NHibernate Contrib. The plans to overhaul that and merge it into NHibernate proper for the next release are still active, but the project team feels most strongly that production quality Linq support is something that we ought to provide for our users now.

This Linq release supports just about anything that you can do with the criteria API. We do not support group joins or subqueries in select clauses, since they aren’t supported by the criteria API. NHibernate’s Linq support has been tested (in production!) for the last couple of years, and most people find it more than sufficient for their needs.

We do plan for expanding the Linq support to support more, but the decision has been made that it makes absolutely no sense not to provide an interim, highly capable, release in the meantime for our users.
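To give a feel for the kind of query shape the provider handles, here is a sketch. In the real provider the entry point is session.Linq&lt;T&gt;(); below, a plain in-memory IQueryable stands in for the session so the snippet runs standalone, and the Dinner class and method names are mine:

```csharp
using System;
using System.Linq;

public class Dinner
{
    public string Title;
    public string City;
}

public static class DinnerQueries
{
    // The same expression would run against session.Linq<Dinner>() and be
    // translated to SQL (via the criteria API) by the provider.
    public static string[] TitlesIn(IQueryable<Dinner> dinners, string city)
    {
        return (from d in dinners
                where d.City == city
                orderby d.Title
                select d.Title).ToArray();
    }
}
```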

Enjoy, and happy linqing.

Oh, and I almost forgot, which would have been a shame. Many thanks to Tuna and Chad, for doing the bulk of the work on the provider.

Macto: An end to end sample

It looks like what people would really like to see is an end to end sample of all the things that I have been talking about lately.

The problem with doing that is the scope that we are talking about here, it is pretty big, and there are some interconnected parts that would be hard to look at in isolation. To make things a little bit more interesting, building a “best practice” application is dependent on far too many variables.

Given all of that, I decided that I might as well copy a good idea and try to emulate Rob Conery’s Storefront series of webcasts. What I absolutely refuse to do, however, is to create Yet Another Online Shop.

Hence, I have set up another forum, dedicated to this specific task, where you can discuss what you want to see. We need to decide what the application is, what the requirements are, etc.

Please have the conversation in the forum, which would make it easier to track things.

I think that I can promise at least two or three webcasts to come out of it, including the full source code.

Blog posts ideas, remapped

Wow,

I got a lot of suggestions for new blog posts. I am not promising that I will do all of them, but I am going to try.

The experiment went well enough that I decided to create a dedicated forum for this, which will allow people to also vote for individual post topics and give me a better idea of what people want to see.

I also moved all the previous suggestions to the new forum, and I think that I’ll make it a regular feature.

Book Review: Watch on the Rhine

Watch on the Rhine (Posleen War Series #7)

I just finished listening to this book, and it is… quite an interesting one. The basic premise of the book is enough to ensure that it would be interesting:

After the first [alien invasion] enemy landings in 2004, the German chancellor decides, despite fierce opposition, to rejuvenate survivors of the Waffen SS. Eager to redeem their tarnished honor, these veterans display the same steadfastness and fortitude that they did in Russia and Normandy.

I think that just from that, you can understand why it is interesting by default. I have to say, Ringo and Kratman managed to set a very believable world, and the handling of the topic was superb. I am going to have another post about Ringo’s style vs. Weber’s style, so I’ll skip a lot of that discussion in this post.

This is a Good Book, although I have to say, I find it much easier to accept alien invasions than the Judas Maccabiah SS brigade (which appear in the book).

I wonder about the effect of Heinlein on Ringo’s writing. Some of the themes woven throughout the plot are definitely Heinleinisms, the parasitic pacifist and peace through superior firepower in particular.

I want to say that the book’s portrayal of the civilian attitude to the military mindset is unrealistic, but I have to say that unfortunately it isn’t so. There are some really stupid people out there.

The one thing that I find totally unrealistic in the book, however, is that political pressure, mostly from environmentalist groups and the like, was able to basically castrate the army. I have no idea how the German political and military game is played, but in most places, there is Peace and there is War. And you don’t mess around with the army in a time of war; the army tends to push back on that, and hard.


NHibernate and NDepend – skimming the surface

Well, you are probably already aware of the long discussion that I had about it. Patrick was kind enough to help me get to the actual information that I was interested in, and now I can try to talk with concrete data.

The post is based on skimming the following matrix:

image

But the reason that I am skimming is that the matrix is just too big to really go over by hand. The matrix shows dependencies between different types.

Here is an area that seems to be rich in cycles:

image

Let me list them out for you:

  • FutureCriteriaBatch and FutureQueryBatch have a cyclic relationship with SessionImpl. There isn’t a problem here: since they are considered to be friend classes, they would probably have been defined as inner classes of SessionImpl if that class weren’t already big enough. So we consider them to be part and parcel of the same thing, since they are tied together very tightly.
  • SessionFactoryImpl has a cyclic relationship with SessionImpl and StatelessSessionImpl. Again, we don’t consider this a problem; those classes are tied together by the nature of their existence. SessionFactoryImpl has to be able to create the sessions, and they need to access it to be able to touch everything that isn’t session scoped.
  • MultiCriteriaImpl has a cyclic relationship with SessionImpl for much the same reason: SessionImpl is its creator, after which MultiCriteriaImpl uses SessionImpl to do a lot of its work.
  • CriteriaImpl and its inner class, CriteriaImpl.Subcriteria, are also listed as a cycle. In this case, again, I don’t have an issue, especially since the inner class relationship already has the concept of a cycle built into it.

Here are a few other snapshots:

image

On the face of it, it looks like there are a lot of cycles in NHibernate.Type. But again, digging a bit deeper, we can see that there is actually a good reason for them. Custom types need to be able to create additional types, and it is no great surprise that they use the TypeFactory to do so, or that the TypeFactory can also create them.

Another matrix that looks bad at first glance is this:

image

Except that every single one of those cyclic dependencies is between a parent class and an inner class.

Now, I don’t want to say that there aren’t cyclic dependencies in NHibernate that shouldn’t be there and need to be removed. But in an (admittedly short) study, I wasn’t able to find such a case. And the cases that I found had pretty good reasons for behaving the way they do.

Another observation worth noting is that NHibernate is structured using a recursive spoke and wheel approach.

image

What I mean by that is that we have different types of components, each doing its own specific task, and then we have what, for lack of a better word, I’ll call orchestrators, which manage a bunch of those components and coordinate how they work together.

This continues on recursively, we have orchestrators for orchestrators, until we end up at the pinnacle, with ISession and ISessionFactory.

This design makes it pretty easy to add new functionality along existing lines, and adding entirely new functionality is just a matter of deciding where to put things.

One of the things that I most like with NHibernate is that so many features can be implemented using very high level concepts. For example, Futures were extremely easy to implement, because we could take advantage of all the underlying infrastructure.

An implication of this is that when I need to expose something from SessionImpl, for example, something that was itself built mostly using SessionImpl, I am going to have a cycle.

I suggest taking a look at FutureCriteriaBatch and its association to SessionImpl to look at why we have those cycles. I don’t believe that you will find them to be a problem after you look at the actual code.

Now, let see how big an ants’ nest I kicked this time.

Update: Patrick mentioned in the comments that I should also pay attention to direct & indirect usage. The problem is that I don’t find it very useful, for the aforementioned reasons. For that matter, when I looked at a few of the “trouble spots” with direct & indirect usage, I was still not able to point at something that I would consider a problem.

The graph certainly looks scary, doesn’t it?

image

But the problem is that I can’t find much new information when I dig in. Here are a few examples:

image

We already talked about SessionImpl and FutureCriteriaBatch, and it doesn’t surprise me that IType and ISessionImplementor are tied together. ISessionImplementor needs types to do its work, and IType needs ISessionImplementor to access the services that NHibernate offers internally.

Another one:

image

Again, nothing that I didn’t know already.

In fact, I’ll go a bit further and say that even for indirect only cycles in NHibernate’s codebase, the things I said above about direct cycles still hold.

Extending SubText to report Future Posts

Since I am making so much use of future posts recently, I decided that it would be interesting to have a sidebar that shows the future posts. The problem is that I really don’t want to mess around with SubText.

This is not a slight against SubText, it has served me well for a long time. It is simply that for what I wanted, the number of steps that I would have to go through is way too long:

  • Get the relevant source
  • Compile it on my machine
  • Figure out SubText’s architecture and where I should make my changes
  • Upload a new version of the blog engine

It is possible, but it just takes too long. Moreover, it would most certainly break the next time that I updated SubText, because I would forget all about it.

But that is not the only way to get stuff done. Here is my solution:

Update: I actually had a bug here related to time zone handling, fixed now.

Update 2: I used the 3.5 TimeZone semantics for this, but my server is running 2.0, use an okay hack instead. And fixed potential issue with < & > in the titles.

<%@ Page Language="C#" %>
<%@ OutputCache Duration="60" VaryByParam="None" %>

<%@ Import Namespace="System.Data.SqlClient" %>
<%@ Import Namespace="System.Configuration" %>
<ul>
<%
string connectionString = ConfigurationManager.ConnectionStrings["subtextData"].ConnectionString;
using(SqlConnection connection = new SqlConnection(connectionString))
{
	connection.Open();
	
	using(SqlCommand cmd = connection.CreateCommand())
	{
		cmd.CommandText = @"
select top 15 Id, DateSyndicated, Title from subtext_Content
where DateSyndicated > @nowIsrael
order by DateSyndicated
";
		// .NET 2.0 lacks the 3.5 TimeZoneInfo API, so hardcode Israel's UTC offset
		DateTime nowIsrael = DateTime.UtcNow.AddHours(3);
		
		cmd.Parameters.AddWithValue("nowIsrael",nowIsrael);
		
		using(SqlDataReader reader = cmd.ExecuteReader())
		{
			if(reader.HasRows == false)
			{
				%><li>Queue is empty</li><%
			}
			while(reader.Read())
			{
				DateTime dateSyndicated = (DateTime)reader["DateSyndicated"];
				string title = (string) reader["Title"];
				string formattedSyndication;
				// use the same UTC+3 hack as above, so both sides of the subtraction agree
				TimeSpan timeLeft = dateSyndicated - DateTime.UtcNow.AddHours(3);
				
				if (timeLeft.Days > 7)
					formattedSyndication = timeLeft.Days / 7 + " weeks " + timeLeft.Days % 7 + " days";
				else if (timeLeft.Days > 0)
					formattedSyndication = timeLeft.Days + " days";
				else if (timeLeft.Hours > 0)
					formattedSyndication = timeLeft.Hours + " hours";
				else if (timeLeft.Minutes > 0)
					formattedSyndication = timeLeft.Minutes + " minutes";
				else
					formattedSyndication = "In a moment";
				
				%>
				<li><%= formattedSyndication %> <br /> <%= Server.HtmlEncode(title) %></li>
				<%
			}			
		}
	}
}
%>
</ul>

And then it was just a matter of changing the template to include the following JavaScript:

$('#futurePosts').load('http://ayende.com/Blog/FuturePosts.aspx');

And that is it. It works, it is safe, it can’t really break anything, and it will probably survive blog upgrades.

Feature by feature

I commented before that a feature is usually composed of several classes, running in different places and at different times, all working together to achieve a single goal. This post is meant to expand on this notion a bit.

Let us go back to the type of architecture that I favor right now. Note that this is a physical diagram only, without actually going into the details about how each of those is implemented.

image

Storage, for example, may be a relational database, a key value store, a distributed hash table or something else, depending on the requirements and constraints that I have.

More than anything, this diagram represents physical distribution and setup behavior. For example, you really want to back up all the storage servers, but an application server can just have an image made and that would be it.

The main reason for having a separation between web & app servers is mostly so we can place the web server in the DMZ and the app server in a more trusted location. But I digress.

I mentioned before that I don’t really like layering anymore, and that I tend to think hard before I create assemblies.

Here is a sample of some of the ideas that I have been working on lately. The idea is to formalize a lot of the notions that I talked about in my concepts and features post. 

image

You can think about this as a logical extension to the way people build composite applications. We have the infrastructure, the idea of the base services that are being provided by the application, and then we have features.

Each feature is a complete story, which means that a feature will contain, in the same place, all the parts required to make it work. For example, as you can see in the project image, we have the Search feature, which contains some UI, the logic to control the UI on the client side and the server side, and helper classes to manage all of that. In the Finders folder we have additional functionality that is specific to this particular feature.

The infrastructure knows that it needs to pull all of that together, so we base everything on conventions. For example, we have a routing convention for aspx pages, and conventions for finding services or controllers.
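To make the convention idea concrete, here is a minimal sketch of what a convention-based controller lookup could look like. All of the names here (`ControllerConventions`, the `{Feature}/{Page}.aspx` to `{Page}Controller` mapping) are my own illustration; the post doesn’t describe the actual convention used.

```csharp
using System;
using System.Linq;
using System.Reflection;

public static class ControllerConventions
{
    // Map "/Search/Users.aspx" to a type named "UsersController" living in a
    // namespace that ends with ".Search". Returns null when nothing matches.
    public static Type FindControllerFor(string pagePath, Assembly featuresAssembly)
    {
        string[] parts = pagePath.Trim('/').Split('/');
        string feature = parts[0];
        string page = parts[parts.Length - 1].Replace(".aspx", "");
        return featuresAssembly.GetTypes()
            .FirstOrDefault(t => t.Name == page + "Controller" &&
                                 t.Namespace != null &&
                                 t.Namespace.EndsWith("." + feature));
    }
}
```

The point of this style is that dropping a `UsersController` class into the Search feature’s folder is all it takes; no registration code needs to change.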

From a layering perspective, there isn’t one. Inside a feature, you can do pretty much whatever you want. There are usually some conventions around to help build things properly, but those are mostly about just getting things working, not about supporting layering. And I don’t have a problem with breaking the rules inside a feature.

Features that affect each other are rare. It usually only happens when we have to do things like “when you search for an order and you go to a particular order, we want to always go back to the search page, no matter how deep we went”.

Cross feature communication is done using a pub/sub mechanism, most of the time.
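A minimal in-memory pub/sub mechanism for this kind of cross feature communication could look like the sketch below. The names (`EventAggregator`, `Subscribe`, `Publish`) are mine, not from the post, and a real implementation would need to think about threading and unsubscription.

```csharp
using System;
using System.Collections.Generic;

// Features publish and subscribe to messages by type, without ever
// referencing each other directly.
public class EventAggregator
{
    private readonly Dictionary<Type, List<Delegate>> subscribers =
        new Dictionary<Type, List<Delegate>>();

    public void Subscribe<TMessage>(Action<TMessage> handler)
    {
        List<Delegate> handlers;
        if (subscribers.TryGetValue(typeof(TMessage), out handlers) == false)
            subscribers[typeof(TMessage)] = handlers = new List<Delegate>();
        handlers.Add(handler);
    }

    public void Publish<TMessage>(TMessage message)
    {
        List<Delegate> handlers;
        if (subscribers.TryGetValue(typeof(TMessage), out handlers) == false)
            return; // no feature cares about this message
        foreach (Action<TMessage> handler in handlers)
            handler(message);
    }
}
```

The Search feature can publish, say, an “order selected” message, and a navigation feature subscribes to it; neither knows the other exists.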

Most of the ideas that I outline here are composite application notions, just applied in a more general fashion. The end result is that you gain all the usual benefits of composite applications. You don’t step on another feature’s toes when you add stuff, and you have a stable base from which to work. Each feature can be developed independently and is only worried about its own problems. The infrastructure is built during the first few features, but afterward it is mainly a stable platform on top of which we build.

I should note that this notion is recursive. That is, say we have a feature that keeps getting expanded. At that point, we will probably treat it like the entire application, and build the infrastructure in place so we can drop additional features into the root feature. A common example of that would be supporting additional authentication mechanisms in the Authentication feature.

Book Review: By Heresies Distressed

By Heresies Distressed

Hands down, David Weber is my favorite author. He has the ability to create rich worlds that are complete, logically consistent and interesting. While Weber is mostly known for the Harrington series, which I also really like, I have to say that the Prince Roger books (March Upcountry, etc) are the best military action series that I have read, and that the Safehold series is the best political action series.

Of the two, I actually think that I prefer the Safehold series, although it is a very close match, and I’ll likely change my mind whenever there is a new book in the series.

All of that said, this review is actually about the latest book in the Safehold series, which also includes Off Armageddon Reef and By Schism Rent Asunder. In a single sentence, I can tell you that Weber has managed to capture my interest all over again. His ability to weave so many concurrent plot lines is the key part of the high level of enjoyment (and quality) that I derive from the books.

The one problem that I have, just as with the previous one, is that it stops too early! If I were smart, I would probably drop the series for a decade or so, and wait for Weber to pump out enough books that I could get them all in one shot.

That is not to say that the books are too long, or full of fluff. It is just that Weber is painting a big picture, and that takes time. Unfortunately, it means that by the time the book ends, I was left with quite a desire to know what the hell is going to happen next.

The Tale of the Lazy Architect

So, there is an interesting debate in the comments of this post, and I thought that it might be a good time to talk again about the way I approach building and architecting systems.

Let me tell you a story about a system I worked on. It started as a Big Project for a Big Company, and pretty much snowballed into getting more and more features and requirements as time went by. I started out as the architect and team lead for this project, and I still consider it to be one of the best projects that I have worked on and the one that I hold up to compare others to.

Not that it didn’t have its problems; for one, I wasn’t able to put my foot down hard enough, and we used SSIS on this project. After that project, however, I made a simple decision: I am never going to touch SSIS again. You can’t pay me enough to do it (well, you can pay me to do migrations away from SSIS).

Anyway, I am getting sidetracked. I was on the project for 9 months, until its 1.0 release. At that point, we were over delivering on the spec, we were on schedule and actually under budget. The codebase was around 130,000 lines of code, and consisted of a huge amount of functionality. I was then moved to a horribly nasty project that ended up with me quitting after the first deliverable. The team lead was changed and a few new people were added; in the next 6 months, the code size doubled, velocity remained more or less fixed (and high) and the team released to production on schedule with very few issues.

I lost touch with the team for a while, but when I reconnected with them, they had switched the entire team again, this time bringing it fully in house. They were still working on it, and a code review that I did revealed no significant deterioration in the codebase. In fact, I was able to go in and look at pieces that were written years after I left the project, and follow the logic and behavior as if no time had passed and the team had never changed.

Oh, and just to make things even more interesting, there were no tests for the entire thing. Not a single one. I did mention that we did frequent releases and had a low number of bugs, right? Yes, I am painting a rosy picture, but I did say that I consider this to be the best project that I was on, now didn’t I?

The question arises, how did this happen? And the thing that is responsible for this more than anything else was the overall architecture of the system.

Broadly, it looks like this:

image

Except that this image is not to scale, here is an image that should give you a better idea about the scales involved:

image

That tiny red piece there at the bottom? That is the application infrastructure. Usually, we are talking about very few classes. In NH Prof’s case, for example, there are fewer than five classes that compose the infrastructure for the main functionality (listener, bus and event processor, and probably another one or two, if you care).

The entire idea was based around something like this:

image

We had provided a structure for the way that the application was built. The infrastructure was aware of this structure, enforced it and used it. The end result was that we ended up with a lot of “boxes”, for lack of a better word, where we could drop functionality, and it would just pick it up. For example, adding a new page with all new functionality usually consisted of a few things that had to be changed:

  • Create the physical page & markup
  • Create the controller for the page
  • Optional: Create associated page specific markup
  • Optional: Create associated page specific Json WebService

If new functionality was required in the application core itself, it was usually segregated into one of a few functional areas (external integration, business services, notifications, data), and we had checklists for those as well. It was all wired up in such a way that the steps to get something working were:

  • Create new class
  • Start using new class (no new allowed, expose as ctor parameter)
  • Done
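The steps above can be sketched roughly as follows. The type names here are mine, purely for illustration; the post doesn’t name the container or the actual services.

```csharp
// Step 1: create the new class (an interface plus an implementation).
public interface INotificationSender
{
    void Send(string to, string message);
}

public class SmtpNotificationSender : INotificationSender
{
    public void Send(string to, string message)
    {
        // actual SMTP details elided
    }
}

// Step 2: start using it. No `new` allowed: declare the dependency as a
// constructor parameter and let the convention-wired container supply it.
public class OrderApprovalService
{
    private readonly INotificationSender notifications;

    public OrderApprovalService(INotificationSender notifications)
    {
        this.notifications = notifications;
    }

    public void Approve(int orderId)
    {
        notifications.Send("legal", "Order " + orderId + " approved");
    }
}
```

Because the container resolves constructor parameters automatically, there is no third wiring step; that is the whole point of the “Done” bullet.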

The end result was that within each feature box, if you weren’t aware of the underlying structure, it would look like a mess. There was no organization at the folder (or namespace) level, because that wasn’t how we thought about those things. What we had was a feature based organization, and a feature usually spanned several layers of the application and dealt with radically different parts.

Note: Today, I would probably discard this notion of layering and go with a slightly different organization pattern for the code, but at the time, I had a strict divide between different parts of the application.

Anyway, because we worked with it that way, the way that we actually approached source code organization was at the feature level. And a feature usually spanned all parts of the application and had to be dealt with at all layers. So let us take the notion of tracking down a feature, and what composed it. We usually started from the outer shell (UI) and moved inward, simply following the references. A key part of that was the ability to utilize R#’s capabilities for source code browsing.

I’ll talk a bit more about this in a future post, because it is not really important for this one.

What is important is that we had two very distinct attributes to the system:

  • You rarely had to change code. Most of the time, you added code.
  • There was a well defined structure to the application’s features.

The end result was that we produced a lot of code, all to the same pattern, more or less, and all of it isolated from the other code at the same level.

It worked, throughout several years of development, several personnel changes and at least two complete team changes. I just checked, and the team has added new features since the last time I visited the site.

System rollout nightmares

The Finance Department in the Tel Aviv Municipality has decided to go on a (probably illegal) strike (Hebrew link). Since I don’t live in Tel Aviv, nor am I a local blogger, there is an actual reason for me pointing it out.

The reason for them being on strike, especially since it looks like it is not a legal one? The rollout of a new system is currently going on, and it is painful enough for them to refuse to do any work.

I wonder how bad it is for them to get to that point…

Blog posts ideas

Here is something a little different, I am going to open the suggestion box for blogging ideas.

Just jot them down here, and I’ll try to get to them.

Answering to NHibernate codebase quality criticism

Patrick has a post analyzing the changes to NHibernate 2.1, which I take exception to. Specifically, to this:

Now, the code base is completely entangled and it must be a daily pain to maintain the code. There is a total lack of componentization and each modification might potentially affect the whole code base.

Um… not!

The problem with the way things are seen from Patrick’s point of view is that his metrics and the metrics used by the NHibernate team are not identical. As a matter of fact, they are widely divergent. This is part of the reason that I tend to use automatic metrics only as tracer bullets in most code review scenarios.

In particular, Patrick uses the notion of namespaces as a layering mechanism; when the codebase doesn’t follow that convention, the output is broken. NHibernate doesn’t use namespaces as a layering mechanism, hence the results that Patrick is showing.

Just to give you an idea, we have tens of new features in the new release, and the number of issues has decreased. We are able to make changes, including major ones, with no major problems, and with safety. For that matter, I actually take offence at the notion that NHibernate isn’t componentized. It is highly componentized, and we are actually working on making it even more so, by adding additional extension points (like the pluggable proxies, object factories, and more).

There is no question about NHibernate being complex, but in the last year or so we have taken on several new committers, drastically increased the number of new features that we ship and have maintained a high code quality. For that matter, we were able to finally resolve some long standing code quality issues (three cheers for the death of the NHibernate.Classic.QueryParser, and many thanks to Steve Strong for the ANTLR HQL parser)!

There are a few other issues that I would like to point out as well:

Methods added or changed represent 67% of the code base!

A lot of those are the result of internal refactoring that we do, a good example is the move to strongly typed collections internally, or renaming classes or methods to better match their roles. Those are, generally speaking, safe changes that we can do without affecting anyone outside of NHibernate Core. That helps us keep the high level of code quality over time.

30 public types were removed which can then constitute a breaking change.

Again, here we have a more complex scenario than it appears upfront. In particular, NHibernate uses a different method for specifying what a breaking change is. By default, we try to make NHibernate extensible, so that if you need to make changes to it, you can. That means we make a distinction between Published for Inheritance, Published for Use and Public.

For example, we guarantee compatibility for future versions if you inherit from EmptyInterceptor, but not if you implement IInterceptor. By the same token, we guarantee compatibility if you use ISession, but not if you try to implement it. And then there are the things that are public, extensible by the user, but which we do not guarantee will be compatible between versions. Dialects are a good example of that.
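The mechanics behind that distinction can be shown in miniature. These are stand-in types of my own, not NHibernate’s actual members (the real IInterceptor has far more methods): a base class with no-op virtual defaults shields subclasses from new interface members, while direct implementors break on every addition.

```csharp
// Stand-in types, not NHibernate's real API.
public interface IAuditHook
{
    void OnSave(object entity);
    // a future release may add members here; classes implementing the
    // interface directly stop compiling, subclasses of the base class don't
}

// "Published for Inheritance": every member gets a safe no-op default.
public abstract class EmptyAuditHook : IAuditHook
{
    public virtual void OnSave(object entity) { }
}

// User code overrides only what it cares about, and survives additions.
public class CountingAuditHook : EmptyAuditHook
{
    public int Saves;
    public override void OnSave(object entity) { Saves++; }
}
```

The library can still treat every hook uniformly through the interface, while only the base class, not the interface, is part of the inheritance compatibility contract.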

This approach means that we have far greater control over the direction of NHibernate, and it has served us well throughout 4 major releases of the project.

As for how we measure code quality in NHibernate? We measure it by:

  • Peer code review (especially for things that are either tricky or complex)
  • Reported bugs
  • Ease of change over time
  • Quality of deliverable product

So far, even if I say so myself, I have to say that there is a good indication that we are successful in maintaining a high quality codebase. Complex, yes, but high quality, easy to change and easy to work with.

Don’t castrate your architecture

Note: I thought about trying to find a more PC title for this post, but I don’t think that I can really find a good one that expresses the same emotional content and has the punch that this title has.

A few weeks ago I got my first interesting cold call (someone calling me from the number on the blog without any previous acquaintance). Cold calling doesn’t happen very often (maybe 5 – 7 times in the 5 years I have had the number there), but that was the first time that I actually got to talk to someone about an interesting and relevant problem.

Anyway, they were building a multi tier system using NHibernate, and they were running into problems with lazy loading over WCF. I quickly pointed out that I really don’t like this approach; beyond its technological faults, it has serious architectural issues.

But let us try to get some idea of how they structured their application. It looked something like this:

image

I think that you can figure out how a normal request would work, but let me spell it out for you.

image

Please note that this diagram shows the communication between tiers, that is, each of those is a separate machine.

The idea, as far as I could understand, was to use NHibernate in the Data tier to get entities from the database and then send them to the Business tier, where the business logic processing would be done, after which they might be sent back to the Data tier, which would write them to the database.

I was… at a loss for some time, trying to find a way to explain how screwy this architecture was. It quickly became evident that while the guy on the other side of the line wasn’t aware of my reservations, he certainly felt the pain of this type of architecture:

  • Slow response times
  • Lazy loading over WCF
  • Need to handle change tracking on the Business tier

Those are the problems that they had already experienced directly. I can add a few more:

  • Anemic domain model (by design!)
  • Required manual caching
  • Required distributed transactions
  • Cascading failure scenarios

I think that I’ll stop here, but I am pretty sure that I can come up with a bigger list if I put my mind to it.

My questions about how they came up with this architecture were mostly answered with: “that is how we do things” and “security”.

To add insult to injury, the developers are naturally running it all on a single machine, so they aren’t actually seeing what is going on there.

Here is how I would build such a system:

image

Note that I don’t have a Business tier or a Data tier, I have an Application tier. From my experience, even under fairly strict regulatory compliance rules, application servers can call the database, so the security aspect is covered. What we are actually doing, however, is a far more significant change.

We no longer need to worry about:

  • lazy loading (NHibernate does it for us)
  • change tracking (NHibernate does it for us)
  • caching (NHibernate does it for us, mostly)

And we gain:

  • a reduced number of hops
  • no need for distributed transactions
  • a reduced number of failure points
  • a chance to build a true domain model

As I mentioned before, I don’t even like the distinction between BAL and DAL, even when they are layers, instead of tiers. Trying to make them into tiers is going to cause quite a lot of pain. In essence, and the main thing that is being missed here, is that you are going to have to build some infrastructure to deal with the data at the Business tier. That may be just simple change tracking and DTC support, but it is likely that you’ll need more than that for real world applications. Caching and lazy loading are both topics that you’ll need to deal with, and neither is going to be an easy task.
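To give a feel for what “build some infrastructure to deal with the data at the Business tier” means, here is a toy dirty-checker. It is purely illustrative and nothing like production grade; NHibernate’s ISession already does this (plus lazy loading, caching, cascades and much more) out of the box.

```csharp
using System;
using System.Collections.Generic;

// A toy change tracker: snapshot each entity on attach, compare on demand.
// A real one must also handle collections, references, deletes, flushing...
public class ChangeTracker<TEntity>
{
    private readonly Dictionary<TEntity, string> snapshots =
        new Dictionary<TEntity, string>();
    private readonly Func<TEntity, string> takeSnapshot;

    public ChangeTracker(Func<TEntity, string> takeSnapshot)
    {
        this.takeSnapshot = takeSnapshot;
    }

    public void Attach(TEntity entity)
    {
        snapshots[entity] = takeSnapshot(entity);
    }

    public bool IsDirty(TEntity entity)
    {
        return snapshots[entity] != takeSnapshot(entity);
    }
}
```

Multiply this by caching, lazy loading and distributed transactions, and you get a sense of the infrastructure bill that the Business-tier-over-WCF design signs you up for.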

This is what NHibernate is meant to do. People keep looking at NHibernate and seeing the Row <-> Entity conversion, but that is just the very tip of a very big iceberg.

image

Most of the complexity within NHibernate is in things like caches, lazy loading and change tracking. That is where you are going to see the really significant time and complexity savings.

When you are forcing an architecture into that mode, you are basically removing a lot of the functionality already in the box, and forcing yourself to create it from scratch.

Instead of castrating your abilities, make sure that your architecture matches them; don’t play to your weaknesses, play to your strengths.

NHibernate 2.1 is out!

NHibernate 2.1 GA (RTM) was released, Download it!

There are many good changes (check the release notes) and I strongly recommend moving to it.

I would like to personally thank Fabio for the tremendous amount of work that he put into it, as well as to all the other members of the team.