Ayende @ Rahien

It's a girl

Multiple Cascading Drop Down Lists

I recently found myself needing to handle a case where I had three drop down lists and needed to set up a cascading relationship between them:

 ThreeDropDowns.png

Using the CascadingDropDown extender makes this a breeze, except that I had an additional requirement: the benefits drop down should be filtered by both policy and insurance. So, when a policy is selected, all the matching insurance types are filled in, as well as all the matching benefits for the policy. When an insurance is selected, the benefits list should contain only the benefits for that insurance type.

At first I tried to simply set up a new extender from benefits to policies. That worked, until I selected a value from the insurance list, after which trying to unselect a value would leave me with a disabled drop down. After a bit of fumbling, I came up with this code, which runs on the onChanged client-side event of the Insurances drop down list:

function updateBenefitsDropDownIfDisabled()
{
    if($('<%=Insurances.ClientID %>').selectedIndex == 0)
    {
        setTimeout("updateBenefitsByPolicy();",1);
    }
}

The reason that I am using setTimeout here is that I want updateBenefitsByPolicy to run after all the associated event handlers for the event have run (including the one that disabled it). Here is how updateBenefitsByPolicy works:

function updateBenefitsByPolicy()
{
    // the benefits drop down whose cascading behavior needs to be refreshed
    var dropDown = $('<%=Benefits.ClientID%>');
    var behaviors = Sys.UI.Behavior.getBehaviors(dropDown);
    for(var i=0;i<behaviors.length;i++)
    {
        var behave = behaviors[i];
        if(behave._name == "CascadingDropDownBehavior" &&
            behave._parentControlID == '<%=Policies.ClientID %>')
        {
            // clear the cached parent values so the next update goes back to the server
            behave._lastParentValues = null;
            behave._clearItems();
            var updating = new Option();
            updating.text = '<asp:Literal runat="server" Text="<%$ Resources:App, Updating %>"/>';
            addToDropDown(dropDown,updating);
            // manually trigger the cascade, as if the policies drop down had changed
            behave._onParentChange(null,null);
        }
    }
}

It uses internal variables from the CascadingDropDown behavior, so it is a hack, but it works. It cycles through the list of behaviors for this element, searching for a CascadingDropDown whose parent is the policies drop down, then it clears the cache (forcing it to go back to the server) and manually forces an update on the cascade behavior.

Not the most elegant code, but it works...

ORM and when query plans go bad

D. Mark Lindell has a few questions about scaling ORMs:

  1. How can dynamic SQL ORMs deal with the fact that your database server (a.k.a SQL Server) can decide at any point that it is going to use an alternate query plan. A simple index HINT on the join syntax can fix this problem but how is my ORM going to handle this?
  2. How come there is no talk about scaling these ORMs. No, I'm not talking about scaling the database. A layer between the ORM and the database execution.

The answer to the first question is easy: you use an index HINT when you need it. NHibernate makes it very easy to plug in your own SQL (and I have several posts about that - here is one) when the default approach is not sufficient. The 1.2 release goes beyond just letting you specify your SQL for queries; it lets you specify custom SQL for just about anything (Entity - CRUD, Collection - CRUD, etc).
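
For example, here is a minimal sketch of a native SQL query carrying an index hint (the entity, table, index, and parameter names are made up for illustration, and the placeholder syntax is the 1.x-era native SQL style):

DateTime cutoff = new DateTime(2000, 1, 1);
IList policies = session
    .CreateSQLQuery(
        "select {p.*} from Policies {p} with (index(IX_Policies_EffectiveDate)) where EffectiveDate >= :cutoff",
        "p",
        typeof(Policy))
    .SetDateTime("cutoff", cutoff)
    .List();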

I am not sure that I understand the second question at all, though.

Advanced: Extending NHibernate Proxies

There was some interest in extending the way NHibernate deals with the entities, and I have just committed those changes into the trunk. Notice that since NHibernate 1.2.0 is at feature freeze right now, this will not be in 1.2.0 RTM, but in a later version.

At any rate, let us see how we can extend NHibernate so it offers an automatic implementation of INotifyPropertyChanged. Some caveats: it works only for entities that have lazy loading enabled (the default for 1.2.0) and that were retrieved from NHibernate.

You can see the full test here, but let me go through the implementation. First here is the test case:

[Test]
public void CanImplementNotifyPropertyChanged()
{
    using (ISession s = OpenSession())
    {
        Blog blog = new Blog("blah");
        Assert.IsFalse(blog is INotifyPropertyChanged);
        s.Save(blog);
        s.Flush();
    }

    using (ISession s = OpenSession())
    {
        Blog blog = (Blog)s.Load(typeof(Blog), 1);
        INotifyPropertyChanged propertyChanged = (INotifyPropertyChanged)blog;
        string propChanged = null;
        propertyChanged.PropertyChanged += delegate(object sender, PropertyChangedEventArgs e)
        {
            propChanged = e.PropertyName;
        };

        blog.BlogName = "foo";
        Assert.AreEqual("BlogName", propChanged);
    }
}

As you can see, when we create the entity manually, it does not implement INotifyPropertyChanged. But when we load the object from NHibernate, not only does it implement INotifyPropertyChanged, it behaves correctly as well. The key is in extending the Proxy Factory that NHibernate uses for lazy loading.

We need to specify the proxy factory type that we would like to use:

protected override void BuildSessionFactory()
{
    cfg.SetProxyFactoryClass(typeof(DataBindingProxyFactory));
    base.BuildSessionFactory();
}

And now, let us see how the DataBindingProxyFactory works:

public class DataBindingProxyFactory : CastleProxyFactory
{
    public override INHibernateProxy GetProxy(object id, ISessionImplementor session)
    {
        try
        {
            CastleLazyInitializer initializer = new DataBindingInterceptor(
                _persistentClass, id, _getIdentifierMethod, _setIdentifierMethod, session);

            object generatedProxy = null;

            ArrayList list = new ArrayList(_interfaces);
            list.Add(typeof(INotifyPropertyChanged));
            System.Type[] interfaces = (System.Type[])list.ToArray(typeof(System.Type));
            if (IsClassProxy)
            {
                generatedProxy = _proxyGenerator.CreateClassProxy(_persistentClass, interfaces, initializer, false);
            }
            else
            {
                generatedProxy = _proxyGenerator.CreateProxy(interfaces, initializer, new object());
            }

            initializer._constructed = true;
            return (INHibernateProxy)generatedProxy;
        }
        catch (Exception e)
        {
            log.Error("Creating a proxy instance failed", e);
            throw new HibernateException("Creating a proxy instance failed", e);
        }
    }
}

Here we do two things: we specify a different interceptor, which inherits from the default interceptor that handles lazy loading, and we specify that the proxy NHibernate returns should also implement INotifyPropertyChanged. Now we need to see how the DataBindingInterceptor works:

public class DataBindingInterceptor : CastleLazyInitializer
{
    private PropertyChangedEventHandler subscribers = delegate { };

    public DataBindingInterceptor(System.Type persistentClass, object id,
        MethodInfo getIdentifierMethod, MethodInfo setIdentifierMethod, ISessionImplementor session)
        : base(persistentClass, id, getIdentifierMethod, setIdentifierMethod, session)
    {
    }

    public override object Intercept(IInvocation invocation, params object[] args)
    {
        if (invocation.Method.DeclaringType == typeof(INotifyPropertyChanged))
        {
            PropertyChangedEventHandler propertyChangedEventHandler = (PropertyChangedEventHandler)args[0];
            if (invocation.Method.Name.StartsWith("add_"))
            {
                subscribers += propertyChangedEventHandler;
            }
            else
            {
                subscribers -= propertyChangedEventHandler;
            }
            return null;
        }
        object result = base.Intercept(invocation, args);
        if (invocation.Method.Name.StartsWith("set_"))
        {
            subscribers(this, new PropertyChangedEventArgs(invocation.Method.Name.Substring(4)));
        }
        return result;
    }
}

We extend CastleLazyInitializer, which handles the usual lazy loading logic for NHibernate, and when a method is intercepted, we check whether it is a call to a method from the INotifyPropertyChanged interface. If it is, we handle the add/remove of the event subscriber. But that is not the interesting part.

The interesting part is that after we let the method be processed normally (which is what the call to base.Intercept() does), we check to see if this is a property setter, and raise the appropriate event if it is.

The client code for this is very natural, and it took about 70 lines of code to add this functionality.
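
To give a sense of what that buys you, here is a hedged sketch of the consuming side (the session, form, and control names are illustrative, not from the test): because the loaded proxy implements INotifyPropertyChanged, ordinary WinForms data binding through a BindingSource picks up changes made to the entity in code.

Blog blog = (Blog)session.Load(typeof(Blog), 1);

// the BindingSource listens for PropertyChanged on the proxy,
// so setting blog.BlogName elsewhere refreshes the bound text box
BindingSource blogSource = new BindingSource();
blogSource.DataSource = blog;
blogNameTextBox.DataBindings.Add("Text", blogSource, "BlogName");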

Who let that smart client into my database?

Some of the comments about my Lazy Load post included discussion about how to handle Lazy Load scenarios in such cases. The answer is simple: I don't. I don't handle this scenario because I don't like the idea of a client application going directly against my database. This puts too much responsibility at the client end, and leaves the server as a dumb data container. It also means that users can now connect to the database directly, which I really don't like.

My default architecture for a smart client application is a client application that talks to a set of web services, usually with one service per form, or per piece of functionality. In those cases, I don't worry about the usual SOA concerns; those are web services dedicated to the app. The application can make calls to the web services, and those calls make it explicit that a boundary is being crossed.
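
What such a per-form service tends to look like, as a hedged sketch (WCF attributes here, but plain ASMX works just as well; all the names are made up):

[ServiceContract]
public interface ICustomerDetailsService
{
    // exactly the operations the CustomerDetails form needs, nothing more
    [OperationContract]
    CustomerDetailsDto GetCustomerDetails(int customerId);

    [OperationContract]
    void SaveCustomerDetails(CustomerDetailsDto details);
}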

 

More on Lazy Loading vs. Pre-loading in O/R mapping scenarios

Frans responded to my post about Lazy Loading. He disagrees with the comparison between lazy loading and paging, because...

The difference is in the fact that the first (paging) goes through the same channels for every fetch of new data while the second (lazy loading) first uses the normal channel to fetch data and after that uses functionality 'under the hood' to get the additional data.

This thus means that the lazy loading functionality bypasses the logic you would otherwise use to obtain data.

That actually depends on how you define the logic for obtaining the data. Usually lazy loading occurs inside an aggregate root, so I don't believe that there is that much logic involved in getting the data. In both cases, it is the O/RM that is invoked to load the data from the DB and hook up the correct associations.

Most O/RMs provide you with a way to hook into this mechanism. In NHibernate, you can use IInterceptor.OnLoad to execute logic whenever an object is loaded, regardless of how it was loaded. I don't see this as a problem.
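
As a minimal sketch of that hook (shown against the EmptyInterceptor convenience base class that later NHibernate versions ship; on 1.2 you implement IInterceptor in full, with the same OnLoad signature):

public class LoadLoggingInterceptor : EmptyInterceptor
{
    public override bool OnLoad(object entity, object id, object[] state,
                                string[] propertyNames, IType[] types)
    {
        // runs for every loaded object, whether it came from a query or from lazy loading
        Console.WriteLine("Loaded {0}#{1}", entity.GetType().Name, id);
        return false; // the entity state was not modified
    }
}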

I am curious about the definition of logic to get the data, by the way. About the only thing that I can think of is security, where you may not be allowed to see the entire graph, but lazy loading may enable that, if you got the object model wrong. This is more an issue of proper design than anything else, in my opinion. An aggregate root should be the object you secure, and not its children.

It's therefore of upmost importancy that you realize if your application is really suitable for lazy loading or not and if not, don't use it. A good O/R mapping solution always lets you do pre-fetching/eager loading of entity graphs without lazy loading.

No argument there, although my approach would be the other way around. A good place to avoid lazy loading is in remoting scenarios, for the same reason you don't want to memory map a file on a network share and page it in.

"Read The Code" is not a valid answer

Scott Bellware has a post titled Just Read the Code (or, Let them Eat Cake)

That open source software can be opened and read is a great quality of open source.  But reading code and understanding code are two different things.

Here is a simple example from my own code: With.Cache. It is a small class, but in order to understand what it is doing you have to grok Disposable Actions, Semi Statics, and the Local Data concept. And then you need to find out where the properties on this class are used, to understand what it is doing.

There is a reason that my primary tool for exploring a code base that I am not familiar with is ReSharper for C# and grep for anything else. And that is in a code base that I consider pretty good. Code doesn't stand in isolation; in order to understand what it is doing, you need to understand how it is used, and what it is supposed to be doing.

I have had code bases thrown at me when I wasn't ducking fast enough, and trying to untangle that mess leaves scars. I learned SQL as a self-defence measure.

But then "community" in open source jargon usually refers the the community of people involved in the development of the project rather than the community at large...

Yes, that is true, but it is important to understand the difference in the target audience. To the Castle team, I can talk using shorthand terminology, because I can assume prior knowledge. This makes communication efficient, and the code base more understandable if you understand the idioms used. If you don't, you are going to have a hard time until you do. In other words, you have to already have an understanding of the code in order to understand the code.

Now, there is a not-so-fine line between the implementation and the interface of the code. If I am making something that I will use later, then I will be as smart as possible in the implementation, so I can be as stupid as possible when using the interface. This is the way it should be, in my opinion, but it does make the implementation harder to grok, because it is doing more.

To quote Joel Spolsky:

Back to that two page function. Yes, I know, it's just a simple function to display a window, but it has grown little hairs and stuff on it and nobody knows why. Well, I'll tell you why: those are bug fixes. One of them fixes that bug that Nancy had when she tried to install the thing on a computer that didn't have Internet Explorer. Another one fixes that bug that occurs in low memory conditions. Another one fixes that bug that occurred when the file is on a floppy disk and the user yanks out the disk in the middle. That LoadLibrary call is ugly but it makes the code work on old versions of Windows 95.

If you are going into a non-trivial code base, you should expect to have to invest some time in understanding it. A good code base will help you do that, but again, it requires some knowledge before you can do it.

From Spolsky again:

I think the best way to read somebody else's code is just to SLOW DOWN... it's like deciphering a code, not like reading. Most people have trouble reading code because their eyes are used to reading at a certain speed from reading text written in human languages. But code is much more dense than English, and contains 'secrets' that need to be deciphered by looking elsewhere...

Documentation helps in this regard. Not for the specifics of the code, but to document the what and the how. I am not a fan of documenting the code, but documenting the way you are supposed to use it is important. Once you understand how it is supposed to be used, you can look at the implementation and understand how it enables this usage.

Lazy loading: The Good, The Bad, And The Evil Witch

Frans Bouma commented about lazy loading:

In general lazy loading is more of a burden than a blessing.

The reason for this is that it leaks persistent storage access to different tiers via the lazy loadable associations. If you want to prevent your UI developers to utilize lazy loading, or are sending entities across the wire to a service, how are you preventing that lazy loading is called under the hood? We support 2 models, one has lazy loading, the other one doesn't (and is more geared towards disconnected environments).

You don't really miss lazy loading in the second model really, as long as you have prefetch paths to specify prefetching what you want (also into an existing graph) The thing is that the model then forces you to write more service oriented software: make the call to the data producer and tell the data producer (or repository, whatever you want to call it) what to get and you get the data and work with it. there's no leaky lazy loading under the hood bypassing the repository, you need to call the dataproducer to get the data, period.

The problem with this approach is that it presupposes that you are going to provide all the fetch modes in the application. I find that this is quite a burden in my applications, because I explicitly don't want to deal with those concerns 90% of the time. I never quite understood the desire to protect the application from the UI developers, but that is another issue.

I have heard lazy loading described as virtual memory for data, and I agree with this description. Lazy loading is a tremendous convenience compared to manual management of the data. If you walk into a serious performance conversation, you will often hear terms such as memory locality, minimizing paging, etc. This is after quite a bit of time with operating systems that take care of that transparently. I believe that there are still people out there who do their own manual paging, but those are usually guys like Oracle and MS SQL Server.

Those are willing to take on the burden of managing paging themselves, because they have better knowledge of what is happening, and they need this type of control. There is a reason there is such a thing as SQL OS.

For most applications, even the very big ones, trying to do that is foolhardy in the extreme. You certainly want to be aware of what is going on, and you will certainly get significant performance improvements by reducing the paging patterns of the application, but you don't want to manage it yourself.

I view lazy loading in the same manner. It is something that I really don't want to live without. It is something that can cause some really bad performance if you are misusing it, but so can any tool in the toolbox. (You had better believe that a hammer can do some serious damage.)

While I don't think that we are at the same level of maturity for lazy loading as we are for paging, I definitely see this as the way we are headed. Manually loading the stuff we need is cumbersome; it is much easier to let the tools do their job, and hint them in the right direction when they don't do the right thing.
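
In NHibernate terms, such a hint can be as small as asking for an association to be fetched eagerly in the one query where lazy loading would hurt; a minimal sketch (the entity and property names are made up):

// one hint, for one query; the rest of the application keeps lazy loading
IList customers = session
    .CreateQuery("from Customer c left join fetch c.Orders where c.Region = :region")
    .SetString("region", "EU")
    .List();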

SSIS: I know better than you do

If you haven't guessed it by now, I am not fond of SSIS. The latest installment is probably a "feature". Assume the following: I develop an SSIS package on my local machine, testing it against a local database.

Now, I want to run the package against a remote database. I did the Right Thing and put the connection string in a data source, so I change that and run the package from Visual Studio. Imagine my surprise when Visual Studio does a full cycle, including reporting the number of rows that it copied. Everything seems to be fine, until I checked the database itself.

About half the tables were empty, and I am still not sure why. The best guess that I can make is that it was caching the previous database credentials and writing to those, since I found the data in the local database. Argh!!

SSIS: You really don't need all this data

Here is SSIS deciding that "No, you really don't need to move this data."

NoNeedForThatData.png

I really like the "smarts" that went into that engine. I make sure that I keep busy and don't move on to othe areas of the application, but dedicate my full an undivided attention to SSIS, as it is apperantly should be.

Would it be too hard to provide a deterministic engine? SSIS will run only (random) parts of the sequence container if I right click it and tell it to "Execute Container". Doing a full debug seems to work, but it executes stuff that I don't want to run right now. Urgh!

Quick & Dirty CodeGen

I needed to get some code that would map an XML file to a database table. Not being particularly fond of doing it by hand, I whipped out this statement:

select  '
      if node.SelectSingleNode("' + column_name + '/text()") is not null:
            row["' + column_name + '"] = node.SelectSingleNode("' + column_name + '/text()").Value
      else:
            row["' + column_name + '"] = DBNull.Value'
from    information_schema.columns
where   table_name = 'Content'
I am doing about 60% of my code gen with SQL and Regex, I think.

Wish: Reflector.PDB

Here is an idea for a kick-ass Reflector plugin. We already have Reflector plugins for outputting entire projects, but what I would really like to see is a plugin that takes it one (big) step further and generates the PDB as well.

A PDB file is what allows Visual Studio to debug; it contains the correlation between the compiled code and the source code, enabling stepping into the code. The important idea here is that Reflector is capable of producing both the source code and the PDB, which would allow us to debug into assemblies that we don't have the source to.

The big benefit of generating the source and compiling ourselves is that we don't need to do it for the whole chain. The immediate use of this would be to finally see what black magic is making the view state put values in completely random places.

Business Objects vs. Entities: What is the diff?

Recently I have noticed several discussions about the differences between entities and business objects. I have my own opinions on this matter, naturally, but before I get to them, let us try to define what each term means - and I find I can't quite put it into words.

An entity is something that has a valid domain meaning, and is usually persisted to durable storage. I am not so clear on the definition of a business object, and I am not sure that my entity definition is very clear, either.

Options?

 

On Orthogonal Frameworks

One of the interesting aspects of open source projects is that there are quite a few directions that the developers and the community would like to go in, all at once. Even in healthy communities, people sometimes want to do Really Weird Stuff: for a specific scenario or use case, they would like to do something that goes against the grain of the project, or something that really should be outside the scope of the project, etc.

The usual response to this is to allow users to supply their own implementation of a framework component. NHibernate is a great example of that, where you can almost always override the default NHibernate behavior with your own. This means that in terms of flexibility, it is possible to go in at almost any level and decide that you would like this behavior to work differently for you. For a tool like NHibernate, this is a very important feature, because people have some very interesting ideas about how a database schema should look.

Why am I talking to you about this? Because I find, with the exception of WCF, that most commercial frameworks do not give nearly as much leeway, and they have "certain ideas" about what you have to do in order to use them. Right now, I would like to talk about the Entity Framework's requirements for base classes / interface implementations.

Again, a base class or optional interfaces are a great way to extend the way you are using a given framework, but the heavyweight approach is flawed from the start. It makes a lot of assumptions about the way you use it, and uses tooling to hide the ugliness. There shouldn't be ugliness in the first place, but I digress.

Let us take change tracking as a simple example: the Entity Framework requires that the object manage its own changed state. Personally, I feel that this is a violation of the Single Responsibility Principle. NHibernate handles this by comparing the old/new values of the object, and deciding accordingly whether it should save it. This has some performance implications in limited scenarios*. Having the object track its state is preferable, because then you can just do if(entity.Changed) Save(entity).

I decided to spike what it would take to build an orthogonal implementation of this with NHibernate. ~120 lines of code later, I had NHibernate inject change tracking behavior into the object, which flags it as changed whenever a property setter is called.

Let me repeat that: 120 lines of code, zero changes to the model (still Plain Old C# Classes), new behavior. For kicks, I have added support for INotifyPropertyChanged behavior, which was an additional 10 lines of code.
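
The patch itself is not in this post, but the shape of the spike is the same trick as the DataBindingInterceptor shown earlier; a hedged reconstruction (not the actual code) would have the proxy factory also add System.ComponentModel.IChangeTracking to the proxied interfaces, and the interceptor look roughly like this:

public class ChangeTrackingInterceptor : CastleLazyInitializer
{
    private bool isChanged;

    public ChangeTrackingInterceptor(System.Type persistentClass, object id,
        MethodInfo getIdentifierMethod, MethodInfo setIdentifierMethod, ISessionImplementor session)
        : base(persistentClass, id, getIdentifierMethod, setIdentifierMethod, session)
    {
    }

    public override object Intercept(IInvocation invocation, params object[] args)
    {
        // answer IChangeTracking calls ourselves, without touching the entity
        if (invocation.Method.DeclaringType == typeof(IChangeTracking))
        {
            if (invocation.Method.Name == "get_IsChanged")
                return isChanged;
            isChanged = false; // AcceptChanges()
            return null;
        }

        object result = base.Intercept(invocation, args);

        // any property setter marks the entity as changed
        if (invocation.Method.Name.StartsWith("set_"))
            isChanged = true;

        return result;
    }
}

With that in place, the consuming code really is just if (((IChangeTracking)entity).IsChanged) Save(entity);.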

This is also an approach that is taken by guys who don't have a designer team just waiting to generate a ton of code to get things started. It is based on clever code and a flexible architecture. The problem with the code gen approach is that it lets you cover holes in the tool you are using, but those holes will come back to bite you when you need to make something that the author of the tool didn't think about.

* Loading a large number of objects and updating a few, then persisting. That is partly what Evict() is for.

** The new behavior is not part of NHibernate; if there is interest, I can post the patch.

Convenient & Easy & Slow vs Convenient & Hard & Fast

Here is an interesting quote from the guys who built Twitter:

Once you hit a certain threshold of traffic, either you need to strip out all the costly neat stuff that Rails does for you...

Disclaimer: I am completely ignorant about Ruby and Rails, so maybe I should just shut up.

One thing that I have noticed is that all that "neat" stuff can be very costly in its default implementation, but it doesn't have to be. Let us take a look at dasBlog's macros, for instance, shall we? The initial implementation was very easy to use (and probably to implement), but was very slow due to heavy use of uncached reflection.

One way to get over that is to drop the notion of macros entirely, and do it all by hand. That would make them much harder to use, of course. Another way is to optimize the macros themselves. Recent versions of dasBlog have done just that, by using cached reflection, and the numbers I have seen quoted talk about a 100% improvement in performance, with no difference for the user. From experience, this kind of optimization can be as hard as the actual functionality, but the tradeoffs across the board are more than worth it.
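
The cached reflection trick itself is nothing exotic; a hedged sketch of the idea (not dasBlog's actual code, and with locking omitted for brevity) is to pay for the GetMethod lookup once per macro method, and only pay for the Invoke on subsequent calls:

public static class MacroInvoker
{
    private static readonly Dictionary<string, MethodInfo> cache = new Dictionary<string, MethodInfo>();

    public static object Invoke(object macro, string methodName, object[] args)
    {
        string key = macro.GetType().FullName + "." + methodName;
        MethodInfo method;
        if (cache.TryGetValue(key, out method) == false)
        {
            method = macro.GetType().GetMethod(methodName);
            cache[key] = method; // the expensive lookup happens only once
        }
        return method.Invoke(macro, args);
    }
}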

Assuming that we have reached the next scaling point, where cached reflection is still too slow, what do we do then? Beyond cached reflection, we have runtime code generation, which is more complex still, but offers the speed of the native platform. Here we get to thinking about tradeoffs: it may be that the complexity would cost more than the convenience is worth, but most often the complexity is localized to a narrow set of functionality, and it enables quite a bit of cost saving.

Again, I have no idea about Rails and the kind of problems he is talking about, but if I were facing this issue, I would probably try to see if I could JIT the Ruby code into native code, and run that. There is quite a bit that can be done if the scenario is focused enough that you can make assumptions about the way it is used (remove edge cases, basically).

Dynamic Proxy 2: Mixins

I needed a rest from dealing with SSIS and data migration issues, so I decided to put some time into real code. I just finished adding support for Mixins to Dynamic Proxy 2. Here is the first test that passed; the name should tell you quite a bit about what is required to make it work.

[Test]
public void CanCreateSimpleMixinWithoutGettingExecutionEngineExceptionsOrBadImageExceptions()
{
    ProxyGenerationOptions proxyGenerationOptions = new ProxyGenerationOptions();
    proxyGenerationOptions.AddMixinInstance(new SimpleMixin());
    object proxy = generator.CreateClassProxy(
        typeof(object), proxyGenerationOptions, new AssertInvocationInterceptor());

    Assert.IsTrue(proxy is ISimpleMixin);

    ((ISimpleMixin)proxy).DoSomething();
}

The generated proxy looks something like this one:

public class ObjectProxyefb7dccd21fe43b5b2d13c788dce3bdb : ISimpleMixin
{
    public IInterceptor[] __interceptors;
    public ISimpleMixin __mixin_Castle_DynamicProxy_Test_Mixins_ISimpleMixin;
    public static MethodInfo tokenCache1 = ((MethodInfo) methodof(ISimpleMixin.DoSomething, ISimpleMixin));
    public static Type typeTokenCache = typeof(object);

    public ObjectProxyefb7dccd21fe43b5b2d13c788dce3bdb(ISimpleMixin mixin1, IInterceptor[] interceptorArray1)
    {
        this.__mixin_Castle_DynamicProxy_Test_Mixins_ISimpleMixin = mixin1;
        this.__interceptors = interceptorArray1;
    }

    public override int DoSomething()
    {
        object[] objArray = new object[0];
        InvocationDoSomething_1 g_ = new InvocationDoSomething_1(
            this.__mixin_Castle_DynamicProxy_Test_Mixins_ISimpleMixin,
            this.__interceptors, typeTokenCache, tokenCache1, objArray, this);
        g_.Proceed();
        return (int) g_.ReturnValue;
    }

    [Serializable]
    public sealed class InvocationDoSomething_1 : AbstractInvocation
    {
        public ISimpleMixin target;

        public InvocationDoSomething_1(ISimpleMixin mixin1, IInterceptor[] interceptorArray1, Type type1, MethodInfo info1,
            object[] objArray1, object obj1) : base(mixin1, obj1, interceptorArray1, type1, info1, objArray1)
        {
            this.target = mixin1;
        }

        public sealed override void InvokeMethodOnTarget()
        {
            int num = this.target.DoSomething();
            base.ReturnValue = num;
        }
    }
}

 

The code is in the repository, if you feel like taking it out for a spin.

Linq for NHibernate: Functions

Bobby Diaz is still producing some amazing stuff. The new addition is native support for SQL functions. Take a look at this code:

DateTime date = new DateTime(1960, 1, 1);

var query = ( from e in db.Employees
              where db.Methods.Year(e.BirthDate) >= date.Year && db.Methods.Len(e.FirstName) > 4
              select e.FirstName )
            .Aggregate(new StringBuilder(),
                       (sb, name) => sb.Length > 0 ? sb.Append(", ").Append(name) : sb.Append(name));

Console.WriteLine("Birthdays after {0}:", date.ToString("yyyy"));
Console.WriteLine(query);

Check out Bobby's post for the full details. I am going to write a full overview of the current state of Linq for NHibernate soon. The situation so far is looking very nice.

Give me the code!

Here is an amusing thought.

I am explicitly a code guy (vs. a designer guy).

  • My persistence format is code
  • My configuration is code
  • My database is generated from code (see the sketch below)
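
That last point is the one that tends to raise eyebrows, so here is a hedged sketch of what it means with NHibernate's hbm2ddl tooling (the exact method overloads have shifted a little between versions, so treat this as approximate):

Configuration cfg = new Configuration();
cfg.AddAssembly(typeof(Blog).Assembly);     // the mappings live alongside the code
new SchemaExport(cfg).Create(false, true);  // generate the DDL and run it against the database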

Bug Reports For Dummies

No, you won't learn how to build a good bug report here. As long as you don't send me a screenshot of a screenshot of the application inside a Word document, that is fine by me.

You may learn what not to do from this, though. Check out this bug. I just ran into a bug in Brail; I didn't feel like fixing it right then, so I opened a bug for myself about it. I was annoyed because I had fixed a few bugs of a similar nature in the past few days. Then I tried taking the most recent trunk version of Brail into the project, and it worked. I really should start paying more attention to what I am doing.

Unit Testing Trivialities?

The following code is a MonoRail action. It doesn't do much, but relies on the framework to do most of its work.

public void ShowCustomerDetails([ARFetch("id")] Customer customer)
{
    if(customer==null)
        RenderSharedView("common/notfound");
    PropertyBag["customer"] = customer;
}

What are the arguments for/against testing this method directly?

That might take a while...

Yesterday I started a run of a process that should run every 5 minutes. I got annoyed when it took too long and went home. I just got back to see this number:

SSIS_ORA_ENT.png

Maybe I need to rethink the architecture. It is still working on it. The really scary thing is that the select count(*) that I also left running didn't finish in ~16 hours or so. And just to clarify, this is an OLTP DB, not a data warehousing one.

Comments on DNR #226: Entity Framework

I have listened with interest to the DNR episode about the Entity Framework, and I am still not convinced about what they are doing. Daniel says that after explaining the grand vision to the CodeBetter guys, they agreed that this is what they wanted, but he doesn't disclose this grand vision. Update: I went back and listened again; Daniel is talking about Jeffrey Palermo specifically, not the rest of the CodeBetter guys. Sorry for the confusion.

A unified logical model for entities is nice in theory, but I have already talked about why this is a hard problem to solve. Furthermore, it looks like they are focusing on the grand vision too much, and leaving aside the real users right now.

There are going to be a lot of users who buy into the Entity Framework, lock, stock & barrel. But if it is hard to use for the common scenarios, it is going to generate a lot of ill will toward the framework. As far as I understand, they are nearly at feature freeze for Orcas, and there seem to be a lot of rough edges all over the place.

That aside, here are a few more comments from the episode:

  • The Entity Framework is cheating. They are moving the burden of the actual query generation from the ORM layer to the database provider layer. Basically, it is sending the query AST to the database provider. The problem with that is that it makes it much harder to build an EF-compatible database provider. You would need to handle not only the wire protocol, but also an optimizing provider for this AST for your database. I am pretty sure that SQL Server will have a good one, and Oracle will have a working one, but that leaves a lot of other databases out of the loop.

  • Something that I haven't heard so far is optimizations. There are scenarios where I have to write my own SQL to get the data back; is this something that the EF supports?

  • Code generation - I have seen the amount of code that is required to make the EF happy. It is not pretty. You need to implement a lot of interfaces, handle property changes, etc. This is a lot of code that really shouldn't be there.
    I just spiked what it would take for NHibernate to handle the INotifyPropertyChanged implementation for the entities, and that one is less than 100 lines of code. And that is for doing it across the board!

  • Automatic association wiring. By that I mean that if I do customer.Orders.Add(new Order()), the Order's customer is set to the customer we just gave it. This is typically a thorny issue with regard to the mismatch between the relational model and the OO model; there are no one-way associations in the relational model, and there are only one-way associations in the OO model. The Entity Framework supports this with special collections and lambda expressions. NHibernate had the same about a year ago in NHibernate Generics.
    I stopped supporting this functionality a while ago, because I do not believe that this is a good way to handle associations in the model. There are several reasons for that.

    • There are various scenarios where I want a one-way association, usually for transient instances that I want to use for business logic calculations.

    • There is business logic associated with associations :-).
      If we take the customer.Orders example, adding a new order should verify that the customer has the credit to pay for it. Where does this logic exist now?
      The best practice for NHibernate has always been to have an AddOrder(Order o) method, which handles the business logic and the wiring of the associations (see the sketch after this list).

  • Innovation: using both Linq and eSql to get dynamic queries.
    This one really annoys me. There is zero innovation in the ability to have several ways to query the same source, using the best way for the scenario. As far as I have seen, the dynamic querying capabilities of the EF are fairly weak, relying on either string concatenation or building the expression tree manually. Neither of which is very conducive to maintainability.

  • Lazy Loading - it is possible that I got it wrong, but it seems like Daniel said that lazy loading is not something that is supported in EF. (31:40)

    We never make a query unless you know it is going to happen, very explicit.
    On the surface, this is very good, but what exactly does that mean? If it means that I cannot do this:

    Customer customer = EntityFramework.GetCustomer(15); //not the real way of doing it
    foreach(Order order in customer.Orders)
       Console.WriteLine(order);

    Then this puts the burden of knowing what to bring in the hands of the developer, and that is a real PITA to handle explicitly. This is not a good place to be in. This is one of the basic features of an O/RM; is it really missing?

  • Daniel also talks about dropping the relational model and storing the entity model directly in SQL Server. This smells a lot like an OODB to me, and I wonder if it is a good idea. The major issue with OODBs is that the tools to work with them are limited. Just about anything can work with relational data; very little can work against some random entity model. Microsoft certainly has the resources to do this, of course, and Daniel mentions that several teams are working on it, but I don't think that we will see it any time soon.
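
Coming back to the association wiring point above, here is a hedged sketch of the AddOrder(...) approach (the types and the credit rule are illustrative, not from any real code base):

public class Customer
{
    private readonly IList orders = new ArrayList();
    private decimal availableCredit;

    public virtual void AddOrder(Order order)
    {
        // the business rule lives next to the association it guards
        if (order.Total > availableCredit)
            throw new InvalidOperationException("The customer does not have enough credit for this order");

        // both sides of the association are wired in one place
        order.Customer = this;
        orders.Add(order);
        availableCredit -= order.Total;
    }
}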

 

Refactoring the DailyWTF

It is not often that I quote the Daily WTF as an example of good code, but today's article is very interesting. Alex is talking about hard coding vs. soft coding, where hard coding means moving things into code, and soft coding means moving things into configuration, rule engines, business integration layers, etc.

I have seen systems whose configuration literally dwarfs the complexity of the system itself. The system can literally do everything. My canonical example of this is that I once needed to build a program that reads a file and sends it to a web service. The program had to be generic, so it was. During a boring integration session, I configured it to be a web server as well, just because I could.

The example given for bad code in the Daily WTF article is this:

private void attachSupplementalDocuments()
{
  if (stateCode == "AZ" || stateCode == "TX") {
    //SR008-04X/I are always required in these states

    attachDocument("SR008-04X");
    attachDocument("SR008-04XI");
  }

  if (ledgerAmnt >= 500000) {
    //Ledger of 500K or more requires AUTHLDG-1A

    attachDocument("AUTHLDG-1A");
  }

  if (coInsuredCount >= 5  && orgStatusCode != "CORP") {
    //Non-CORP orgs with 5 or more co-ins require AUTHCNS-1A

    attachDocument("AUTHCNS-1A");
  }
}

And you know what, this isn't such bad code; it is certainly better than the alternative paths that Alex has shown. It is still not good code, however.

The main issue that I have is that this method violates the Single Responsibility Principle. I have three separate rules in this method, and three separate reasons to want to modify it. Given such a task, I would have refactored it to:

public interface ISupplementDocuments
{
    void Supplement(Policy policy);
}

Where each if statement in the above code is a class that implements this interface. Now, the attachDocuments method looks like:

private void attachSupplementalDocuments()
{
    foreach(ISupplementDocuments supplementer in this.supplementers)
         supplementer.Supplement(this);
}

Now I can make use of IoC in order to gain both benefits: clear-cut code, and flexibility in deployment and runtime. The end result is well-factored code that can be easily understood and worked on in isolation.
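
For concreteness, here is a hedged sketch of one of the three rules pulled out into its own supplementer (the Policy members are assumed from the original example, not real code):

public class LargeLedgerSupplementer : ISupplementDocuments
{
    public void Supplement(Policy policy)
    {
        // Ledger of 500K or more requires AUTHLDG-1A
        if (policy.LedgerAmount >= 500000)
            policy.AttachDocument("AUTHLDG-1A");
    }
}

Adding a fourth rule then becomes a new class plus a container registration, rather than another if block in a growing method.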

ORM:1, Hand Coding:0

Alexey has an interesting story about the big cost savings (money, time, simpler architecture) NHibernate has enabled for his team.

I recall doing an integration project once where I was able to move the database from an intermediary SQL Server to DB2 on an AS/400 by changing a few lines in the configuration files (the client decided not to go with the approach, though, since it didn't fit their ESB plans).
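
The kind of change involved is a hedged sketch like this, assuming the swap was done through the NHibernate configuration (the property keys and dialect/driver class names are from memory, so treat them as approximate):

cfg.SetProperty("hibernate.dialect", "NHibernate.Dialect.DB2400Dialect");
cfg.SetProperty("hibernate.connection.driver_class", "NHibernate.Driver.DB2Driver");
cfg.SetProperty("hibernate.connection.connection_string", db2ConnectionString);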

 

Paul Graham: Microsoft Is Dead, Take 2

Paul Graham has posted a clarification of his earlier statement about Microsoft's demise:

What I meant was not that Microsoft is suddenly going to stop making money, but that people at the leading edge of the software business no longer have to think about them.

Whatever it is that he is taking, that is good stuff.

We have to make a separation here between Microsoft as a platform builder and Microsoft as a service / application provider. In this case, I do believe that he is talking more about the service/application provider side of Microsoft. Assuming that I decide to build a Web 2.0 site for Social Bathroom Painting(TM), do I really need to fear Microsoft? Or to worry about Microsoft moving into my market and gobbling it all up?

Probably not, but not for the reasons that Graham paints. The issue with Microsoft is their size: what a startup sees as a viable market to cater to is not something that Microsoft even sees on their radar. Unless you can present someone at Microsoft with a number with a lot of zeros in it, they aren't interested in trying. But wait until the market is big enough to appear on their radar, and you have another matter. Go ask Telligent (Community Server) what they think of Microsoft moving into their territory.