Ayende @ Rahien

Refunds available at head office

Answering to NHibernate codebase quality criticism

Patrick has a post analyzing the changes to NHibernate 2.1, which I take exception to. Specifically, to this:

Now, the code base is completely entangled and it must be a daily pain to maintain the code. There is a total lack of componentization and each modification might potentially affect the whole code base.

Um… not!

The problem with the way things are seen from Patrick’s point of view is that his metrics and the metrics used by the NHibernate team are not identical. As a matter of fact, they are widely divergent. This is part of the reason that I tend to use automatic metrics only as tracer bullets in most code review scenarios.

In particular, Patrick uses the notion of namespaces as a layering mechanism, when the codebase doesn’t follow that convention, the output is broken. NHibernate doesn’t use namespaces as a layering mechanism, hence, the results that Patrick is showing.

Just to give you an idea, we have tens of new features in the new release, and the number of issues have decreased. We are able to make changes, including major ones, with no major problems, and with safety. For that matter, I actually take offence at the notion that NHibernate isn’t componentized. It is highly componentized, and we are actually working on making this even more so, by adding additional extension points (like the pluggable proxies, object factories, and more).

There is no question about NHibernate being complex, but in the last year or so we have taken on several new committers, drastically increased the number of new features that we ship and have maintained a high code quality. For that matter, we were able to finally resolve some long standing code quality issues (three cheers for the death of the NHibernate.Classic.QueryParser, and many thanks to Steve Strong for the ANTLR HQL parser)!

There are a few other issues that I would like to point out as well:

Methods added or changed represent 67% of the code base!

A lot of those are the result of internal refactoring that we do, a good example is the move to strongly typed collections internally, or renaming classes or methods to better match their roles. Those are, generally speaking, safe changes that we can do without affecting anyone outside of NHibernate Core. That helps us keep the high level of code quality over time.

30 public types were removed which can then constitute a breaking change.

Again, here we have a more complex scenario than appears upfront. In particular, NHibernate uses a different method for specifying what a breaking change is. By default, we try to make NHibernate extensible, so if you need to make changes to it, you can do that. That means we make a distinction between Published for Inheritance, Published for Use and Public.

For example, we guarantee compatibility for future version if you inherit from EmptyInterpcetor, but not if you implement IInterceptor. By the same token, we guarantee compatibility if you use ISession, but not if you try to implement it. And then there are the things that are public, extensible by the user, but which we do not guarantee which be compatible between versions. Dialect are a good example of that.

This approach means that we have a far greater control over the direction of NHibernate, and it has stood for us throughout 4 major releases of the project.

As for how we measure code quality in NHibernate? We measure it in:

  • Peer code review (especially for things that are either tricky or complex)
  • Reported bugs
  • Ease of change over time
  • Quality of deliverable product

So far, even if I say so myself, I have to say that there is good indication that we are successful in maintaining high quality codebase. Complex, yes, but high quality, easy to change and easy to work with.

Comments

Fabio Maulo
07/21/2009 02:17 PM by
Fabio Maulo

Thanks to clarify.

Btw, again, "NHibernate" is a good name to promote commercial products, especially when the target of the products is the high quality/quantity of projects using NHibernate.

Mark Lindell
07/21/2009 02:29 PM by
Mark Lindell

Are you claiming that using either namespaces or assembly dependencies is not valid for the NHibernate code base? Is NDepend some how not relevant to the NHibernate code base? How would one go about analyzing the coupling in the code base? Please inform a Padawan.

Ayende Rahien
07/21/2009 02:39 PM by
Ayende Rahien

Mark,

Analyzing namespace dependencies as a way to measure coupling isn't going to work for NH, no. The underlying assumption just doesn't hold there.

As for assembly dependencies, that is more complex, especially since a lot of stuff in NH is done dynamically.

Patrick Smacchia
07/21/2009 03:51 PM by
Patrick Smacchia

Fabio you wrote:

Btw, again, "NHibernate" is a good name to promote commercial products, especially when the target of the products is the high quality/quantity of projects using NHibernate.

Fabio what I like to do is more analyzing from 10.000 foots with NDepend the OSS code base to show publicly which decisions seasoned developers behind the product are taking. The goal is not to criticize but to expose facts usually buried in the code base itself like, namespaces entangling.

In particular, Patrick uses the notion of namespaces as a layering mechanism, when the codebase doesn’t follow that convention, the output is broken.

On this I follow Mark Lindell: If you don't use namespaces to componentized and create layer inside your unique assembly then which other concrete artifact are you relying on? You wrote:

(NHibernate) is highly componentized, and we are actually working on making this even more so, by adding additional extension points (like the pluggable proxies, object factories, and more).

But you say it is highly componentized from the user point-of –view, not from the NHibernate insider point-of –view. Think of the new developers coming on NHibernate. How can he understand what depends on what and why? what is high level/low level? where to begin to add the new feature lambda? how to decide if a new namespace must be created or if a types better goes inside an existing namespace? What was the mental state of the previous developer that decided to structure the code this way?...

Take the namespace NHibernate.Util: it is using directly AND used by directly 8 other namespaces that are: NHibernate.Type; NHibernate.SqlCommand; NHibernate.Properties; NHibernate.Impl ; NHibernate.SqlTypes; NHibernate.Engine ; NHibernate.Dialect ; NHibernate.Dialect.Function

Can you then answer immediately which namespaces, which type inside these namespaces, is at the lowest level then? Or you don’t believe in layered architecture and then this question become irrelevant. But if you don’t believe in layered architecture then in which architectural tenets do you believe?

To my assertion: it must be a daily pain to maintain the code.

You answer: Um… not!

…and I believe you. But did you answer that especially because you know perfectly the code base and can show immediately an answer to any question asked above. If you or Fabio or someone else important quit the NHibernate project, without a concrete componentization, think about all the re-engineering work that must be done by the new developer generation.

My concern is mainly that NHibernate developers are respected developers in the community. You know your job well, you know your code base well, and then likely you can do without clear layering (personally I can’t). But when you claim publicly that clear layering is not important (by clear I mean not layered in your mind but something visible inside the code base itself), then many are going to follow you on this and you are propagating the idea that spaghetti code is not a bad thing. My opinion is that spaghetti is the biggest pain for most projects maintained today.

Using concrete artifact to componentize/to layer/to get rid of spaghetti is a need and IMHO namespaces is the right artifact to use, because class is to fine-grained and assembly is physical and not logical (not to mention that namespaces offers also the hierarchical componentization possible).

Concerning breaking change, a public type that was here and is not anymore IS a breaking anyway, no matter the level of interpretation you use. If some code bases are relying on a public type not here anymore then they will be de-facto syntactically broken.

Anyway, congratulation for this new major release :o)

DaRage
07/21/2009 04:10 PM by
DaRage

NHibernate is spaghetti?!! too bad I can't refund my NHibernate in Action book.

caustic
07/21/2009 04:26 PM by
caustic

I wouldn't get too worked up over that post. It's very common for people to think they can measure "quality" when they first get started with software metrics, but it's a misguided endeavor.

Even the visionaries in the field will tell you that any metric is useful for measuring the change the unit under measure either over time or in comparison to another, similar unit. Thinking that you can define a magical number that says something is "good" or "bad" is a misunderstanding of the whole spirit of software metrics.

It's a newbie mistake.

Ayende Rahien
07/21/2009 04:42 PM by
Ayende Rahien

which other concrete artifact are you relying on?

We are not relying on concrete artifacts, you can pretty much look at a class and tell where it belongs.

How can he understand what depends on what and why?

You read the code, generally, you only need to read the code path that you are interested in improving. I usually suggest trying to find the relevant interface and go through that.

what is high level/low level?

That is not a valid distinction, actually. Mostly because of the usage of a lot of template XYZ in many places. As a good example, hydrating an entity is work that actually encompass quite a bit of NHibernate. Just off the top of my head, it involve IUserType, IInterceptor, IGetter, ISetter, IIdGenerator, ICacheProvider, IXyzEvent and IPersister

I am using interfaces as a good way to talk about the different parts of NHibernate. There is no way to say that something is higher or lower than the other.

Just to give you an idea, entity hydration (or dehydration) is the part that deals directly with IDbCommand. It is also pretty important location for things like cache coherency and concurrency control.

where to begin to add the new feature lambda?

feature lambda? I don't understand that.

As for new feature, you start them based on what exactly you want to do.

For example, one of the hardest thing that I had to do was to implement Multi Query support, I did that by following the code path for single query execution and then applying that for multiples. Building Future support on top of that was just a matter of adding a field in SessionImpl to hold the currently waiting queries.

Sorry, there isn't a checklist for new features (there is for additional capabilities, like adding new DB, or extending a dialect with new functions, etc), the dev has to think.

how to decide if a new namespace must be created or if a types better goes inside an existing namespace?

Based on the same decision tree that you would use in any other codebase. Generally, we do this for organizational purposes, not for layering.

What was the mental state of the previous developer that decided to structure the code this way?

Um? Why do I care about that?

Take the namespace NHibernate.Util: it is using directly AND used by directly 8 other namespaces that are: NHibernate.Type; NHibernate.SqlCommand; NHibernate.Properties; NHibernate.Impl ; NHibernate.SqlTypes; NHibernate.Engine ; NHibernate.Dialect ; NHibernate.Dialect.Function

And your point is? As I said, we aren't using namespaces as layering, so there is no problem there.

If you want to show a problem, take a look at how many things have circular reference to SessionImpl, but we don't really consider this to be a serious issue, because SessionImpl is actually a misnomer, it is orchestration more than anything else, the real stuff happen in the event listeners.

Ayende Rahien
07/21/2009 05:00 PM by
Ayende Rahien

Here is the second part of the reply:

But if you don’t believe in layered architecture then in which architectural tenets do you believe?

I have a problem with this question, because it has the hidden assumption that there IS such a thing as always valid architectural tenet. But for a lot of scenarios, I love Independent Components, mostly, where components may be a single class or a set of classes.

But did you answer that especially because you know perfectly the code base and can show immediately an answer to any question asked above.

Which is why I mentioned that we had several new committers in the last year. Just to give you an idea, Steve Strong, which did tremendous amount of work for us, started out without any insider knowledge about how NH was put together at all!

Another strong indication is the number of patches that we get, or the fact that people are able to send us very direct and suitable patches for specific problems.

by clear I mean not layered in your mind but something visible inside the code base itself

I really don't like things being forced upon me, be it to make the compiler happy or to make some architectural document happy.

In most cases, I tend to create infrastructure components and on top of them put specific implementations of a component interface.

My opinion is that spaghetti is the biggest pain for most projects maintained today.

While I would agree with you here, I think that our definition of what spaghetti is are quite different. From my perspective, if I can reach and change one thing, without having to touch 15 others, there is no spaghetti.

namespaces is the right artifact to use

That is your opinion, but when you try to use this as a metric on a project where this is not the case, you are going to get garbage. GIGO.

a public type that was here and is not anymore IS a breaking anyway

Yes, except that we tend to break those to things that we worry about and thinks that if you do you are expected to know how to fix.

Patrick Smacchia
07/21/2009 05:11 PM by
Patrick Smacchia

Caustic, if you read the following post I wrote 3 weeks ago I just come to the exact same conclusion as you:

codebetter.com/.../...g-fabricated-complexity.aspx

From the post:

I would prefer to avoid giving the NDepend user the feeling that fighting fabricated complexity is only a matter of reducing the value of a naked index

As every time I shed light to some non-positive facts buried in a code base I get the same critic again and again: You believe too much in metrics basically (which I don't, read my post)

Is it my fault if no-one can extract from the NHibernate code base itself a clean boxes and arrows Direct Acyclic Graph, even if where box can be anything (like set of classes) and not necessarily namespaces as I advocate for.

When one answers to the lack of clean layering with...

As for assembly dependencies, that is more complex, especially since a lot of stuff in NH is done dynamically.

...then if you have a structure more complex than a clean DAG, isn't it the sign that there is a lot of fabricated complexity? What is simpler than a DAG to structure a software? What are the reasons to have an implementation more complex than a DAG?

Also I believed that the whole purpose of the "dynamic, IoC" things was to achieve loose-coupling by removing static dependencies. And in NHibernate code base 95% of classes depends statically on every other classes, that's a fact.

Frans Bouma
07/21/2009 05:18 PM by
Frans Bouma

"We are not relying on concrete artifacts, you can pretty much look at a class and tell where it belongs."

Interesting. Looking at the 'NHibernate' folder in the 2.1 GA trunk, I see a subfolder 'Exceptions' and I also see exception classes in the 'NHibernate' folder. That doesn't add up: why aren't the exceptions in the NHibernate folder in the Exceptions folder?. I think that things like that is what Patrick means: if you want to have full control over a big code base, you have to be really careful what you do with all that code: where do you place a class, which namespace are you creating for the group of classes it's in etc.

There are more examples, like the interceptor interface not in the intercept folder etc. You can't come with the story that that structure is dictated by the Java people

What surprises me a bit is the defensive tone. Sure, they're just metrics and you can look at them in a million ways and every way gives a different picture. That's not to say that there's room for improvement, perhaps a big room (except the boundaries of stay backwards compatible for most parts). Why not take this knowledge Patrick dug up and use it to your own advantage to create a better nhibernate? There will come a day that you and/or Fabio will call it a day and someone else has to take over.

Ayende Rahien
07/21/2009 05:25 PM by
Ayende Rahien

Patrick,

Is it my fault if no-one can extract from the NHibernate code base itself a clean boxes and arrows Direct Acyclic Graph

No, but it is a problem when you think that this is a goal.

Going back to the entity hydration scenario, this is, by its very nature, a cyclic problem.

Just consider the case of late eager loading of an association, which may be a one to many or many to one. And that is really without even taking into account the more complex scenarios when you have complex data types that then have associations.

Not all problems can be solved with DAG, at least not without additional and artificial complexity that wouldn't really serve much.

Also I believed that the whole purpose of the "dynamic, IoC" things was to achieve loose-coupling by removing static dependencies.

That depend on what dependencies you are talking about, not all are made equal, and they shouldn't be treated as such.

Ayende Rahien
07/21/2009 05:30 PM by
Ayende Rahien

Frans,

You can't come with the story that that structure is dictated by the Java people

anonsvn.jboss.org/.../Interceptor.java

A lot of the structure does come from Java, actually.

What surprises me a bit is the defensive tone.

The problem is that the basis for the metrics are not valid, which paints an inaccurate picture.

There will come a day that you and/or Fabio will call it a day and someone else has to take over.

That has actually happened at least 3 times already, actually. We don't see a problem yet.

Patrick Smacchia
07/21/2009 06:26 PM by
Patrick Smacchia

First part:

That is not a valid distinction, actually. (answer to what is high level/low level?)

As a good example, hydrating an entity is work that actually encompass quite a bit of NHibernate.

There is no way to say that something is higher or lower than the other.

These remarks suggest that you see NHibernate as a code base where every aspects of the product interact with every other aspect. In one word: Monolithic.

Going back to the entity hydration scenario, this is, by its very nature, a cyclic problem.

We all cop with cyclic problem. Abstractions can be used to break static dependency cycles. The domain being cyclic is not a reason for cyclic implementation (hopefully!).

feature lambda? I don't understand that.

Feature lambda = any upcoming feature that will come in your pipe in the future.

I am using interfaces as a good way to talk about the different parts of NHibernate.

Interface is Abstraction and Abstraction is used to de-couple caller and called implementations. This is because interface is something less complex and less subject to changes than implementation that we use them. But once again, 95% of the NHibernate implementations depend statically on all other implementations.

Patrick Smacchia
07/21/2009 06:27 PM by
Patrick Smacchia

Second part:

Um? Why do I care about that? (that = the mental state of the previous developer that decided to structure the code this way)

Because if one cannot recover the mental state of the former developer, then one is about to miss former developer intentions. By modifying the legacy code without a total understanding of former intentions, one will necessarily increase the entropy. Refactoring is not an answer to that. If one doesn’t understand former intentions, one cannot do better code than it was.

components may be a single class or a set of classes

I agree, component is often something bigger than a class. My point is that if you see NHibernate.dll as a single component then it is a giant components that contains 58.082 Lines of Code, 1 456 classes, 277 interfaces and 47 enumeration. My understanding of components is that developers use them to divide and conquer. From my experience, I estimate that 2.000 LoC should be a maximum limit for the size for a component.

In most cases, I tend to create infrastructure components and on top of them put specific implementations of a component interface.

Unfortunately at first glance I cannot identify any infrastructure & interface components in NHibernate.dll. I hope you have some but if they are not made obvious, then there is fabricating complexity anyway.

I think that our definition of what spaghetti is are quite different. From my perspective, if I can reach and change one thing, without having to touch 15 others, there is no spaghetti.

Well, we have the same perspective. If you can reach and change one thing, without having to touch 15 others, it means that you can take this thing, put it in a temporary VS project, work with it, test it, tweak it… and then put the modified version into the original code base. How to do that if the piece of code statically depends on 95% of other classes? By mocking 95% of the code?

There will come a day that you and/or Fabio will call it a day and someone else has to take over.

That has actually happened at least 3 times already, actually. We don't see a problem yet.

There is the diseconomy of scale phenomenon. We (me, Ayende, Frans) should keep in mind that we are working on medium sized code base LlblGen (120K LoC) NHibernate (58K LoC) NDepend (85K LoC). But nobody on earth could cop with a 1M LoC code base without a clean structure with concrete layers and clear components separation.

Developers in large corporation are working on Millions LoC projects that result from hundreds of man-years. I relentlessly advocate that structuring, layering, organizing, componentizing, dividing, separating, segregating, levelizing, partitioning…, in a word caring for components and dependencies (boxes and arrows), all these things are good and will serve them. But you wrote:

No, but it is a problem when you think that this is a goal. (The goal I mentioned was to be able to extract from the NHibernate code base itself a clean boxes and arrows Direct Acyclic Graph).

DaRage
07/21/2009 06:41 PM by
DaRage

So you're saying there is not concrete organization, philosophy, structure to the code base and developers just know how to do things and the only thing you're backing your argument with is that it worked well so far.

Demis
07/21/2009 06:42 PM by
Demis

We'll I personally think that NHibernate's days are numbered. It will hang around to support legacy databases schemas but I think its role as the number 1 orm for building persistent domain models in .NET will quickly get overtaken by a provider that has proper support for LINQ.

It requires too much configuration and knowledge of internals to be considered the top choice for rapid application development. In cases where you have total control of your domain model you really should be assuming (as much as possible) convention over configuration. Technologies like db4o require a lot less effort to achieve the same result.

Internally we're using our own in-house LINQ provider which stores your object models just as you would expect and you're able to configure non-default behaviour with attributes. As it was built from the ground-up to be a LINQ provider the query API also comes naturally as its based on LINQ.

Demis
07/21/2009 06:52 PM by
Demis

I personally think if you can't measure your code quality in an automated way you can't improve it in the long run.

Peer reviews and other manually gathered metrics are at best subjective and at worst become stale the day they're gathered.

I mean if its built in such a way that it trips up all code-quality metric software, it either has to be changed or a custom automated code-quality tool needs to be written if it ever has a chance to maintain a high-level of quality in the long term.

Frans Bouma
07/21/2009 07:19 PM by
Frans Bouma

"A lot of the structure does come from Java, actually."

Due to the origin of the code base, you mean? Ok, but that's not an excuse for leaving a bad structure in place for so many years. (I can't find relevance in the link you posted, I think I overlook an important aspect, but I don't see how a decision made by the java team is important for you guys, at least not today. It might have been in the early days, but not anymore).

"The problem is that the basis for the metrics are not valid, which paints an inaccurate picture."

As I said, one can use metrics in any way they want and always paint the picture they want to paint. But that's in this case not the point. You and your fellow alt.net friends push a certain agenda where among other things, low coupling is one of the things which are considered very important. If you look at this nhibernate codebase, one can't deny the fact that the concept of 'low coupling' is simply absent.

I'm not saying that preaching low coupling is bad, on the contrary, low coupling is good. The problem is that if person X preaches low coupling while at the same time the code base person X works on is a codebase with very dense coupling between each and every class, it makes person X less believable: why is person X pushing a point while X isn't doing what he preaches?

That's the defensive tone here and the reason I find it a bit ironic. ;) No-one is saying your code sucks and it falls apart. It just makes you a bit less believable as a person who knows what he's doing when your code is so tightly coupled as the nhibernate sourcecode.

Removing it is very hard, but at the same time, it's also a sign of a healthy codebase that these kind of things are addressed and CAN be addressed. High coupling is a problem, and you'll be bitten by the downsides sooner or later.

Again, nobody's perfect, and no-one is criticizing your ability to write software, the point is that if someone points you at a 'less optimal situation', you also could pick up that advice and address it. After all, you can jump up and down till you're blue in the face, but the high coupling in the code base and the disorganization in the class structure aren't going away because of that. :)

Like the excuses about namespace usage. These things are there for one reason only: organizing the code. If someone's choice of namespaces is less optimal, why not address it and create a better situation? Who's going to be hurt by that? :)

Ayende Rahien
07/21/2009 07:32 PM by
Ayende Rahien

DaRage,

Ha?

I didn't say that. What I said that there isn't a namespace or folder based organization, because this isn't how we tend to think about the source code.

There is a very well defined structure, that allow us to rapidly navigate, explore and modify it. It happens to be that most of the codebase is structured around the code itself, there is a reason I kept mentioning the interfaces, it is the basic building block and the implementations compose how you think about NHibernate itself.

Ayende Rahien
07/21/2009 07:46 PM by
Ayende Rahien

Frans,

that's not an excuse for leaving a bad structure in place for so many years.

I disagree that this is a bad structure in the first place.

I don't see how a decision made by the java team is important for you guys, at least not today

Maintaining close compatability to the Java version is actually very important for us, it means that we can port new features, that porting Hibernate contrib projects is possible, and it simplify things like documentation, knowledge transfer etc.

one can't deny the fact that the concept of 'low coupling' is simply absent.

That isn't what is being pointed out, however. The problem, as it was stated, was that there is no layering between the namespaces.

As for low coupling inside the codebase, we fix that whenever this is a problem, but it isn't a problem in most cases.

Well, I tell a lie, the class that I want most to be decoupled from is System.String, but that is another issue.

codebase with very dense coupling between each and every class

That is quite a gross extrageration. Yes, there are parts of NHibernate that are being tightly coupled, but it isn't remotedly close to what you are trying to paint.

why is person X pushing a point while X isn't doing what he preaches?

Um, different contextes, perhaps?

Regardless for everything else, there are different criteria for infrastructure code and business level code.

For that matter, if you would like to see some ugly code of mine, check out this guy:

rhino-tools.svn.sourceforge.net/.../Sender.cs
And yet I have no intentions of actually making any sort of change there.

it's also a sign of a healthy codebase that these kind of things are addressed and CAN be addressed

Changes has to be motivated by something, if there is no problem, there is no reason for change. In fact, there are good reasons NOT to change.

Change for the sake of change, and that is what I would call trying to refactor something that cause no problem, is nothing but waste.

the point is that if someone points you at a 'less optimal situation', you also could pick up that advice and address it

You should read what I actually responded to, it isn't less optimal, it is 'daily pain' that I objected to.

Who's going to be hurt by that?

All of us:

  • next time someone is going to port something and have to figure out where things have moved

  • moving between namespaces is a breaking change

  • it is harder to track changes to a particular file if it was moved

Ayende Rahien
07/21/2009 08:47 PM by
Ayende Rahien

In one word: Monolithic.

That is a complex issue to answer, mostly because the term monolithic has bad conotations associated with it.

But to the point, internally NHibernate is built of a lot of small components that have rich interaction behavior.

Abstractions can be used to break static dependency cycles

And what would it give me? Abstraction for the sake of abstraction?

If you make the assumption that dependency cycle is bad, maybe. I don't follow that, and I have no problem with making them when it is making things simpler.

95% of the NHibernate implementations depend statically on all other implementations

That is interesting number, and it doesn't match what I know of the codebase.

What is the basis of that, and is it based on namespaces of classes?

Because if one cannot recover the mental state of the former developer

If I need the mental state, I lost.

one will necessarily increase the entropy

No, that is why we have tests for. It is actually really hard to break NHibernate, we got pretty much the entire thing covered every which way.

If one doesn’t understand former intentions, one cannot do better code than it was.

If I can't understand what is going on from the code, it is a problem in itself, period.

My point is that if you see NHibernate.dll as a single component

You can do that, in which case you have one class (NHibernate.Cfg.Configuration) and two interfaces (ISessionFactory, ISession) that you actually care about. Anything else is for advance stuff.

Unfortunately at first glance I cannot identify any infrastructure & interface components

Infrastrucure in NH is the configuration and session factory (mostly), with the interfaces being, well... the interfaces.

If you can reach and change one thing, without having to touch 15 others, it means that you can take this thing, put it in a temporary VS project, work with it, test it, tweak it… and then put the modified version into the original code base.

Huh? In what project can you do that, very few classes live in isolation.

Even if we assume a strictly layered project, you are going to have dependencies on lower level stuff, and you can't just move a bit of code around in that case.

How to do that if the piece of code statically depends on 95% of other classes?

Until I know how you got to that number, I can't answer that.

NHibernate (58K LoC)

How did you get to that number? On the 2.1 source code, I get 143 KLOC using: http://paste-it.net/public/lb69595/

Counting things like the contrib projects, I would say that it exceeds the .5 MLOC

(boxes and arrows),

I don't care about boxes and arrows, I care about being able to understand things. It doesn't follow that you have to have the boxes & arrows to make things understandable.

I'll have another post tomorrow talking about that in detail, though.

Patrick Smacchia
07/21/2009 08:56 PM by
Patrick Smacchia

95% of the NHibernate implementations depend statically on all other implementations

That is interesting number, and it doesn't match what I know of the codebase.

Just have a look at the matrix code base in my blog post. From any 2 arbitrary namespaces those 2 are involved in a static dependency cycle. (except NHibernate.Id.Enhanced, NHibernate.Proxy.Poco, NHibernate.DebugHelpers)

s3.amazonaws.com/.../DSM.png

NHibernate (58K LoC)

I got it from the only comparable measure of Lines of Code in the .NET world: # of sequence points. As defined in VisualStudio, in NCover and in NDepend as well:

http://www.ndepend.com/Metrics.aspx#NbLinesOfCode

Ayende Rahien
07/21/2009 09:05 PM by
Ayende Rahien

From any 2 arbitrary namespaces those 2 are involved in a static dependency cycle.

snort, wrong metric, then.

Your data is about namespaces, but you are talking about types.

Can you get the data about types instead of namespaces?

As I mentioned, trying to apply this sort of metric for namespaces isn't going to get us anywhere.

of sequence points.

Okay, that explains the disparity, do I read it correct in that usually it is 1 loc per 7 sequence points? Or is it the other way around?

Dylan Smith
07/21/2009 09:27 PM by
Dylan Smith

The way I read it, I think he's trying to say 7 IL instructions per 1 sequence point.

Just to add my 2 cents on a few points here:

"Changes has to be motivated by something, if there is no problem, there is no reason for change. In fact, there are good reasons NOT to change.

Change for the sake of change, and that is what I would call trying to refactor something that cause no problem, is nothing but waste."

The point that I see trying to be made is that they are not recommending change for the sake of change, but rather change to improve maintainability (in their opinion obviously). In your opinion what are valid reasons for change? Are "code smells" valid reason for change? Even if all functionality is working as intended, and we haven't experienced any pain associated with those code smells yet? (I'm trying to ask this question separately from the other question of whether there is an actual code smell present here, since there is obvious disagreement on that point).

As far as recovering the mental state of the previous developer. It seems to me that both Ayende and Patrick are in agreement, just wording it differently. It sounds to me that Ayende is saying he can in fact recover the mental state of the former developer by reading the code and determining the intentions from there...Ayende just isn't calling it the "mental state", but it sounds like you are both talking about the same thing.

Ayende Rahien
07/21/2009 09:31 PM by
Ayende Rahien

Dylan,

As far as I see it, if there is no pain associated with something, for the users or developers, then touching a piece of code is a waste.

However, if there is pain, and if this is a smelly part that you have to touch, then you should do something about it.

Patrick Smacchia
07/22/2009 12:02 AM by
Patrick Smacchia

Here is the type entangling matrix you were looking for, a 1456x1456 matrix where black cells represent dependency cycles between classes.

codebetter.com/.../TypesDSM.jpg

Thinks are indeed better than when just considering namespaces. Now roughly 40% of classes are entangled.

1 loc per 7 sequence points

No, 1 sequence points = 1 LoC. This way you get a LoC measure independent from the coding style, the comment, the language...

Doing so is the only way to avoid counting physical LoC which is highly subjective and cannot be used for comparing code base size.

cbp
07/22/2009 01:03 AM by
cbp

Hi,

I have read these comments with interest.

@Patrick

Statistical metrics are most definitely useful, but especially in a handwritten analysis they do need to be more thoroughly tempered by deeper insight into the code. As Ayende has pointed out, the NHibernate team have chosen to utilise namespaces in such a way that makes a namespace dependency matrix misleading as a code quality metric. This fact needs to be analysed more thoroughly - the pros and cons need to be weighed up. Without this analysis the dependency matrix is not at all useful.

The same argument can be applied to the "Breaking Changes" analysis. Of the public types that were removed, what is the likelihood that NHibernate's consumers are actually using those public types? How well have these breaking changes been documented and how accessible is that documentation? How easy is it to fix the breaking change - has a type simply been renamed or are deeper structural changes required?

@Ayende

I guess we really need to see what are the benefits of your utilisation of namespaces.

One must remember the following about NHibernate: it is a mature; it is a port of Hibernate for Java; it is open-source and free.

As Ayende has pointed out, all of these points affect our judgement of the code base. Being a port of Hibernate, and being a free product, aligning namespaces and types with the Java code base is a huge benefit for the NHibernate Team and this factor needs to be taken into account when analysing metrics.

@Demis

A mature system will always have legacy-like components that do not use the latest hyped-up technology or best-practice. These systems will always have more complex dependencies ("Number of Users" is a dependency; "Documentation" is a dependency). On the other hand, a mature system provides many benefits: larger developer community; more likely to be maintained; feature-rich etc. Whatever technology you believe will swiftly overtake NHIbernate will have a long way to go in building up the same level of community, features and capabilities.

Hudson Akridge
07/22/2009 01:14 AM by
Hudson Akridge

Demis,

We'll I personally think that NHibernate's days are numbered. It will hang around to support legacy databases schemas but I think its role as the number 1 orm for building persistent domain models in .NET will quickly get overtaken by a provider that has proper support for LINQ.

Which provider, to your knowledge, has robust Linq support, and has even close to the number of features and mapping support NHibernate does? I'm not trying to be sarcastic, I'm legitimately interested in which ORM or ORMs specifically you're thinking of that have a shot at de-throning NHibernate.

It requires too much configuration and knowledge of internals to be considered the top choice for rapid application development. In cases where you have total control of your domain model you really should be assuming (as much as possible) convention over configuration. Technologies like db4o require a lot less effort to achieve the same result.

Check out Fluent NHibernate (FluentNhibernate.org) Convention over configuration in NHibernate.

Jonatan Berggren
07/22/2009 06:31 AM by
Jonatan Berggren

Demin:

It requires too much configuration and knowledge of internals to be considered the top choice for rapid application development.

I am currently building a webb application based on S#arp architecture. The project makes use of fluent nhibernate for configuration with automapper. Now during development I have it setup so that it updates the database schema on Application_start and fills it with som test data. This means that when I create a new entity class, I only have to recompile and browse to the site for it to setup the database for me. No manual mapping required. If I want to change to mapping in any way, I just have to setup what I want to change and not the entire mapping. I'm loving it!

NHibernate FTW.

Frans Bouma
07/22/2009 07:50 AM by
Frans Bouma

"> why is person X pushing a point while X isn't doing what he preaches?

Um, different contextes, perhaps? Regardless for everything else, there are different criteria for infrastructure code and business level code."

huh? So the principles you and your fellow alt.net friends push are subject to context? I never heard anyone say anything about 'context' when someone's code was bashed to pieces because it violated some principle. That's the ironic part here.

Principles aren't bound to context, they're either a principle you live by or you don't live by it. Not supporting principles is fine though, being pragmatic is often better than stubbornly following a principle, but that's precisely why one has to be very careful when bashing other people's code.

This one was funny though:

"> one will necessarily increase the entropy

No, that is why we have tests for. It is actually really hard to break NHibernate, we got pretty much the entire thing covered every which way."

that's impossible. An o/r mapper has an infinite number of use cases, and you can't cover every single case, left alone a substantial part of that. As the code is very fragmented in nhibernate, I really wonder if you have ever looked at proving the code to be correct. Mind you: tests test two things: 1) if the algorithm is correct for the particular situation the test is made for and 2) if the implementation of the algorithm is correct for the particular situation the test is made for. They don't prove anything on any generic scale.

That's not to say there are many bugs in the code, the more people use it the less bugs there will be left, however claiming that you have everything covered because you use tests is simply naive.

Frans Bouma
07/22/2009 07:53 AM by
Frans Bouma

"Which provider, to your knowledge, has robust Linq support, and has even close to the number of features and mapping support NHibernate does? I'm not trying to be sarcastic, I'm legitimately interested in which ORM or ORMs specifically you're thinking of that have a shot at de-throning NHibernate."

I guess you also want poco? Genom-e or EUSS come to mind. EUSS is free and open source, Genom-e isn't free but their linq provider is very good. For non-poco, well, you know the answer ;)

Ayende Rahien
07/22/2009 07:57 AM by
Ayende Rahien

I never heard anyone say anything about 'context'

Then you weren't listening very well, or the context was well defined.

Principles aren't bound to context

Of course they are.

that's impossible

Tell it to our users, we have very few (as in single digit number, out of hundreds of thousands of users) things that break between releases.

I really wonder if you have ever looked at proving the code to be correct.

We have been over this before, we aren't trying to prove it to be correct, only that it works.

DaRage
07/22/2009 03:56 PM by
DaRage

Patrick: namespaces are 95% entagled.

Ayende: we don't structure by namespaces but with types (whatever that means)

Patrick: types are 40% entagled.

Ayende: ?

Ayende Rahien
07/22/2009 04:35 PM by
Ayende Rahien

Ayende doesn't get paid to write the blog, so he does other things as well.

I'll get to it soon.

Bob C.
07/22/2009 05:51 PM by
Bob C.

Look at this big tangled image with arrows pointing every which way, the code must be bad...blah, blah, blah. God I am so sick of the NDepend posts on CodeBetter.

Dan
07/22/2009 09:15 PM by
Dan

@Demis

... we're using our own in-house LINQ provider which stores your object models just as you would expect and you're able to configure non-default behaviour with attributes. As it was built from the ground-up to be a LINQ provider the query API also comes naturally as its based on LINQ.

I'm assuming you spent at least a few weeks working in-depth with a mature ORM before deciding to roll your own.

How much of the following have you / will you need to implement from the ("ground-up")?

  • session / UOW management

  • identity map

  • loading and persisting all types of association (one-to-many, many-to-one, one-to-one and many more)

  • lazy loading

  • eager fetching

  • inheritance mapping

  • second level caching

  • change tracking etc etc. All from the "ground-up".

How did you justify the cost of the work your team has put into this?

iLude
07/23/2009 03:37 AM by
iLude

@Frans

huh? So the principles you and your fellow alt.net friends push are subject to context? I never heard anyone say anything about 'context' when someone's code was bashed to pieces because it violated some principle. That's the ironic part here.

Come on Frans, Straw man arguments? I do not recall anyone using bad metrics as the basis for those "bashings". The ones that were worth reading discussed poor code design and showed or references better ways of doing things. I have not heard anyone here point out any concrete examples where these code metrics, being discussed here, show an area of NHibernate that shares equally bad design choices. I'm not saying that those areas do not exist, I am not an NHibernate contributor and only have done passing examinations of the code. But when I did, I did not find it overly taxing to understand the areas I was looking at and how the code worked.

If you are going to question whether Oren, Fabio or any other alt.netter practices what they preach, cite examples in their code. It would be a far more enlightening discussion than this one.

The simple fact is the existing contributors find the structure acceptable and no one has seen any economic benefit in changing it. But if you and Patrick would like to take a bit your cash and fund a startup that employed Oren, Fabio and Steve full time to make these changes and get us all LINQ support, I would be one of the first to submit my resume to join them. Hell, If I didn't have a house and a baby on the way I would do it for free, because the experience of working with them or many of the other alt.netters would be worth it.

PandaWood
07/23/2009 05:30 AM by
PandaWood

I took issue with a number of things being said here.

Without meaning to be personal, the content of posts by Frans and Patrick are laced with logical fallacies. Most of them seem to be an effort to feel or sound superior.

These add nothing to the discussion and worse, focus on non-issues that have nothing to do with the subject.

1) Frans - "claiming that you have everything covered because you use tests is simply naive"

No one said everything was covered. Claiming that someone said something they didn't is dishonest.

2) Frans - "What surprises me a bit is the defensive tone"

This is irrelevant. Anyone who argues will probably sound "defensive" if they are defending. I imagine that this makes the sayer feel just a tad superior.

3) Patrick - "These remarks suggest that you see NHibernate as a code base where every aspects of the product interact with every other aspect. In one word: Monolithic."

I think this is called "Slippery Slope" argument.

Here, let me try it: Patrick, these remarks suggest you see NHibernate as a massive product with great potential. In one word: 'Perfectly awesome'.

Notice, I can't even count, but I won't let that affect my superior sounding remarks. Why stop there: my remarks suggest that I am the chosen messiah. In one word "God".

4) Frans - "You and your fellow alt.net friends push a certain agenda..."

"Ad Hominem" - attacking the arguer rather than the argument. Waste of time.

I'm interested in whether the points made by Ayende in this blog have merit or are possibly false.

Please don't waste our time by using flawed "debating tactics".

Frans Bouma
07/23/2009 08:19 AM by
Frans Bouma

@iLude: "If you are going to question whether Oren, Fabio or any other alt.netter practices what they preach, cite examples in their code. It would be a far more enlightening discussion than this one."

Where are the Solid principles in the nhibernate code? If you preach low coupling, while there's a lot of coupling between types and namespaces, there IS high coupling. Perhaps not through a secret view system no-one but the nhibernate developers know about, but there is when you use normal measurements. Again: I don't care about that, I just find it ironic that apparently the holier than thou attitude of alt.net-ers gives them free passes for not doing what they tell others to do. ;)

"The simple fact is the existing contributors find the structure acceptable and no one has seen any economic benefit in changing it. But if you and Patrick would like to take a bit your cash and fund a startup that employed Oren, Fabio and Steve full time to make these changes and get us all LINQ support, I would be one of the first to submit my resume to join them. Hell, If I didn't have a house and a baby on the way I would do it for free, because the experience of working with them or many of the other alt.netters would be worth it."

Why would I give away my money to employ people to build what I've already build myself? It might not occur to you but I funded my 9 months of Linq provider development from my own pocket. Perhaps I'll donate my linq provider, perhaps not, that's not decided yet.

Though, downplaying the essence of a good architecture because of 'there's no economic value' is missing the point: a good architecture which is maintainable has a far greater economic value than an architecture which isn't maintainable. Of course, Oren will now say that the nhibernate code is maintainable, but perhaps only for the few people who have done a lot of porting of the java code. Perhaps they're also stuck in the situation where they have to live with the bad design decisions of the java version, however nhibernate is an o/r mapper which has all the features the majority of users will ever need, so if another feature has to be ported over, you can do that by simply writing the code from scratch, there's little left to write.

@PandaWood: whatever.

Ayende Rahien
07/23/2009 08:23 AM by
Ayende Rahien

a good architecture which is maintainable has a far greater economic value than an architecture which isn't maintainable.

Yes, I would certainly agree with that statement.

Oren will now say that the nhibernate code is maintainable

See previous line

but perhaps only for the few people who have done a lot of porting of the java code

Nope, we have a mid size team of committers, and a lot of people who submit patches, so I am pretty compfortable with the idea that we are doing well from that perspective.

And that also include my own personal experience in learning how parts of NHibernate that I have never seen before work.

PandaWood
07/23/2009 12:28 PM by
PandaWood

Frans > "Of course, Oren will now say that the nhibernate code is maintainable, but perhaps only for the few people who have done a lot of porting of the java code."

If a code base requires the original developers, then it's not maintainable.

That is surely the meaning of maintainable software?

If Ayende believes it's maintainable, then it follows that he does not believe the original developers (those who did a lot of java porting) are required.

Frans Bouma
07/23/2009 12:43 PM by
Frans Bouma

"If Ayende believes it's maintainable, then it follows that he does not believe the original developers (those who did a lot of java porting) are required."

With all due respect, but it's not his call to make. The person who spend a lot of time inside the code base obviously finds it logical and knows his way through the codebase and where to make changes what to avoid etc. For that person, the codebase is maintainable. It is different for a person who's new to the codebase. If that person has to spend a lot of time figuring out where what is and most importantly: why, it's already a sign things aren't as maintainable as it should.

NHibernate is a bit of an odd project if you can say so: the reasons why the code is constructed the way it is, is often not clear because it's a port of a different code base and the developers of that code base aren't closely involved in nhibernate. This requires reverse engineering of the code to the mindset where you can see why things are the way they are. So doing 'maintenance' might not be what it means on a codebase which isn't a port. IMHO, the dependency on hibernate might be one which isn't preferable in the long run, as java and .net aren't alike. You already see that with Linq: because hibernate doesn't have a linq requirement, nhibernate struggles to get a linq provider, for a long time now. The sad thing is: more and more people will demand a mature linq provider to be there, if not this year than at least next year.

Ayende Rahien
07/23/2009 12:48 PM by
Ayende Rahien

Panda,

For that matter, the original developers are the Java guys, none of which are contributing to NHibernate.

iLude
07/23/2009 04:33 PM by
iLude

@Frans

"If you preach low coupling, while there's a lot of coupling between types and namespaces"

I in no way wish to speak for Oren, but I believe he has made it abundantly clear that looking at NH from a namespace coupling stand point is not going to work. Your argument is flawed in the sense that you imply that namespace decoupling is the only form of low cohesion. It is not it is one way of measuring it, but far from the only way.

Metrics are like accounting. If I have a widget that is 50% of the way through being manufactured, what is the value of that widget? Is it worth the cost of the labor and materials used in its making to this point? Is it worth what a scrap dealer will pay you for it to recycle the materials? Is it worth 50% of the market value of the final product? The correct answer is that all three are right answers. Each provides a different view of that widget's value. But if you are looking at whether the changes you made to the process of making that widget to this point are saving you money, then the scrap value is useless in answering that question. Its the same regardless of the changes you made.

No one is saying that Patrick's tool is useless. Just that in the case of NH its measuring the wrong values to be useful.

" Again: I don't care about that, I just find it ironic that apparently the holier than thou attitude of alt.net-ers gives them free passes for not doing what they tell others to do. ;)"

In Oren's defense when he critics others code, he far more often than not provides at least examples of a way he considers better, if not a completely working replacement implementation. I find this neither ironic nor holier than thou. I find it enlightening and educational. I've never paid Oren a dime for any of this, but what I do give him is a whole lot of respect and thanks for making me a better programmer. As for other alt.netters they get the same bar to live up to. If you don't think its right show me a better way. The ones that don't do not get a free pass from me the don't get read. :-)

"It might not occur to you but I funded my 9 months of Linq provider development from my own pocket."

Yes I am well aware of who you are and what you've written. And I respect you for it and respect your opinion when you make valid arguments. But in this case I believe you are making fallacious arguments and feel I should call you on it. Hopefully you will see that I do not do this to be vindictive. Only to allow you the attempt to improve them so that they do not continue to come off as self serving or self gratifying.

"...downplaying the essence of a good architecture because of 'there's no economic value' is missing the point..."

Oh where to begin! The entire paragraph files in the face of the economic principle of division of labor. Why should NH cut off the upstream code and documentation manufacture coming from the Hibernate team? As soon as the do they must replace those resources or lose the value that they bring. NH is a free product so there is no market incentive to improve the product the incentive comes from having working software to build other solutions that do have market value. Fabio, Oren, Steve and the other contributors do what they do because it gives them a product that solves a need. It so happens that this need is shared by many others and they can use NH as well and benefit from the labor of the NH team.

"...however nhibernate is an o/r mapper which has all the features the majority of users will ever need, so if another feature has to be ported over, you can do that by simply writing the code from scratch, there's little left to write."

A paper and pencil provide all the features of an accounting process the majority of users will ever need, does this mean that they should not use quickbooks? Or perhaps they should write their own accounting package, if they need a feature they can just code it from scratch. I mean really how long can it really take?

iLude
07/23/2009 05:15 PM by
iLude

@Frans

"because hibernate doesn't have a linq requirement, nhibernate struggles to get a linq provider, for a long time now. The sad thing is: more and more people will demand a mature linq provider to be there, if not this year than at least next year."

I do not believe that a linq provider is missing because it is difficult to build one using the existing code structure. Its missing because linq providers are very very difficult to write, as you yourself know. You have more than 14 posts on your blog about doing it and very few of those make it sound like alot of fun. And I must assume that your code base passes through NDepend with flying colors and makes it show rainbows and unicorns. :-) So by your logic a linq provider should have been easy to code up from scratch.

More to the point a linq provider could be created that output HQL or ICriteria. This provider could be bolted on to NHibernate with very minimal effort. That does not make the effort to build the provider any less. But if you go back and review Oren and Steve's posts on the subject it was decided that this was not the route that the NH team wanted to take because, shock of all shocks, there are areas in NH's codebase that are not as cleanly designed as they could be. And as such Steve is working to correct those areas. And its is being done for exactly the reasons you state, because in the long run it makes sense to improve those poorly designed areas.

None of this proves any of your arguments. NDepends 95% metric is still useless when it comes to NH. It provides useful values in determining where there are areas of NH that can be improved. It simply makes the claim that NH is 95% shit when measured this way. Since its clear to many that NH is not 95% shit, it must be the the measurements are wrong. And as such should be a challenge to Patrick to review how much coupling is really present in NH and figure out a way to make his product gather metrics that bring value and show areas that do suffer from poor design.

Would this not bring more market value to Patrick's product? Allowing him to use it on a wider range or codebases and bring useful metrics that allow for focused improvement to those codebases?

PandaWood
07/23/2009 11:31 PM by
PandaWood

Frans said > "With all due respect, but it's not his call to make. The person who spend a lot of time inside the code base obviously finds it logical..."

Ayende not just making a call. He's saying 3 things. Please don't ignore any of them:

Q: "is NH maintainable?"

1) Ayende's own code review and opinion is "yes"

2) Non- committers are submitting patches

  • Ayende: "Another strong indication is the number of patches that we get, or the fact that people are able to send us very direct and suitable patches for specific problems."

  • if multiple non-committers can submit many patches, this is some evidence that the code is "maintainable" - being mindful of the definition that includes "doesn't require original developers"

3) "In the last year or so we have taken on several new committers"

  • this is further evidence that the code is maintainable (the significance of which might depend on exactly how many committers there are). If many people are able to become committers, this is some good indication the code can be picked up easily ie maintainable

This is not a touchdown, but there's a pretty good indication.

From the other side, I've heard some metrics.

I love metrics but there's no definite link between a number and "maintainability" - without some judgement.

Some of the claims (NH is 95% cluttered) have been questioned eg

Ayende: "Analyzing namespace dependencies as a way to measure coupling isn't going to work for NH"

I haven't heard this answered with a decent argument - or did I miss it?

Points (2) and (3) are significant and can't be ignored. You must answer it. try something like "those guys just happen to be mates with the original developers" or "actually, there's only been 5 patches in 5 months, that's nothing".

I'm giving you hints here. Can you pick it up and make a convincing argument? I'm genuinely interested.

At the moment, the proof is in the pudding, and it sounds like regardless of metrics, NH is proving to be maintainable. Assuming Ayended's numbers and facts are true.

firefly
07/24/2009 03:26 AM by
firefly

First let me say that I enjoy this discussion tremendously. Especially when I get to hear it from some of the people that I have high respect for :)

Personally I think one of the most important standard for any project to follow is that internally it should be consistent across the board. A good design doesn't have to be intuitive to a new comer, in fact I would argue there is no such thing. Unless the designer borrow the concept else where. It should, however, always be intuitive to the "regular".

Comments have been closed on this topic.