Lucene.NET is UGLY

If you have ever had to go through the Lucene.NET code base, I am sure you’ll agree that it is quite ugly. It does a lot of low level stuff, which is almost always nasty; it is a port of code from another language and framework, which means that it isn’t idiomatic .NET code; and it has a lot of… strange things going on there.

  • Exceptions are used far too often.
  • There is a strong tendency to delegate things in such a way that make it hard to figure out where things are actually happening.
  • The big stick approach to thread safety (slap a lock on it).
  • Some really horrible things with regards to mutable shared state with IndexInputs.
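
To give a feel for the “big stick” item, here is a minimal illustrative sketch (not actual Lucene.NET code) of the pattern I am complaining about: a single coarse lock wrapped around everything, including the expensive part.

using System.Collections.Generic;

public class DocFreqCache
{
    private readonly object syncRoot = new object();
    private readonly Dictionary<string, int> cache = new Dictionary<string, int>();

    // The "big stick": everything, including the expensive computation,
    // happens while holding a single coarse lock, so concurrent readers
    // serialize behind whoever got there first.
    public int GetDocFreq(string term)
    {
        lock (syncRoot)
        {
            int freq;
            if (cache.TryGetValue(term, out freq) == false)
            {
                freq = ComputeDocFreq(term); // expensive work, done under the lock
                cache[term] = freq;
            }
            return freq;
        }
    }

    private static int ComputeDocFreq(string term)
    {
        return term.Length; // stand-in for the real work
    }
}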

Here is a good example of many of the issues that I talk about:

https://github.com/apache/lucene.net/blob/trunk/src/core/Search/FieldCacheImpl.cs#L207

Read this method, and I think you’ll understand.

Then again, you can see methods of similar or greater complexity in RavenDB, for example, see here:

https://github.com/ayende/ravendb/blob/1.2/Raven.Database/Indexing/IndexingExecuter.cs#L60

My main problem with the Lucene.NET codebase is that it feels alien, it isn’t .NET code, and it shows.

Then again, Lucene is also quite beautiful, but I’ll talk about this in my next post.


Bug Fixes in OSS environment

A user reported a bug in RavenDB. We tracked that bug down to a race condition in a 3rd party library, which then forced us to fix the bug there, and then do the dependency roll up:

image

Sigh…

Then again, we could do all of that ourselves.

Reviewing NAppUpdate

I was asked to review NAppUpdate, a simple framework for providing auto-update support to .NET applications, available here. Just to make things more interesting, the project lead of NAppUpdate is Itamar, who works for Hibernating Rhinos, and we actually use NAppUpdate in the profiler.

I treated it as an implementation detail and never actually looked at it closely before, so this is the first time that I am actually going over the code. On first impression, there is nothing that makes me want to hurl myself into the ocean from a tall cliff:

image

Let us dig deeper, and almost on the first try, we hit something that I seriously dislike.

image

Which leads to this:

public static class Errors
{
    public const string UserAborted = "User aborted";
    public const string NoUpdatesFound = "No updates found";
}

And I wouldn’t mind that, except that those are used like this:

image

There are actually quite a lot of issues with this small code sample. To start with, LatestError? Seriously?!

LatestError evokes strong memories of GetLastError() and all the associated fun with that.

It doesn’t give you the ability to report multiple errors, and it is a string, so you can’t put an exception into it (more on that later).

Also, note how this works with the callback and the return code. Both of which have a boolean for success/failure. That is wrong.

That sort of style was valid for C, but in .NET we actually have exceptions, and they are actually quite a nice way to handle things.
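
To make that concrete, here is a minimal sketch of the direction I would prefer over the bool return value plus LatestError combination. The names here are hypothetical, this is not the actual NAppUpdate API:

using System;
using System.Collections.Generic;

public class UpdateCheckFailedException : Exception
{
    public UpdateCheckFailedException(string message, Exception inner)
        : base(message, inner)
    {
    }
}

public class UpdateChecker
{
    // "No updates found" is not an error, it is just an empty result.
    // Real failures carry their full context in an exception, instead of
    // a bool return value plus a shared LatestError string.
    public IList<string> CheckForUpdates(string feedUrl)
    {
        try
        {
            return DownloadAndParseFeed(feedUrl);
        }
        catch (Exception e)
        {
            throw new UpdateCheckFailedException(
                "Could not check for updates from " + feedUrl, e);
        }
    }

    private static IList<string> DownloadAndParseFeed(string feedUrl)
    {
        return new List<string>(); // stand-in for the actual download & parse
    }
}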

Worse than that, it means that you have to check the return value, then go to LatestError and check what is going on there, except… what happens if there was an actual error?

image

Note the todo, it is absolutely correct. You really can’t just call ToString() on an exception and get away with it (although I think that you should). There are a lot of exceptions where you simply won’t get the required information. ReflectionTypeLoadException, for example, asks you to look at the LoaderExceptions property, and there are other such things.
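
Here is the sort of special casing I mean, as an illustrative sketch (not code from NAU):

using System;
using System.Reflection;
using System.Text;

public static class ExceptionFormatter
{
    public static string Format(Exception e)
    {
        var sb = new StringBuilder();
        sb.AppendLine(e.ToString());

        // ToString() alone loses the interesting part of this exception.
        var typeLoad = e as ReflectionTypeLoadException;
        if (typeLoad != null)
        {
            foreach (Exception loaderException in typeLoad.LoaderExceptions)
                sb.AppendLine("Loader exception: " + loaderException);
        }

        return sb.ToString();
    }
}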

In the same error handling category, this bugs me:

image

This is an exception that implements the serialization constructor, but doesn’t have the [Serializable] attribute, which all exceptions should have.
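
For reference, the standard shape of an exception that serializes correctly looks roughly like this (hypothetical name, shown just to make the point):

using System;
using System.Runtime.Serialization;

[Serializable]
public class UpdateProcessFailedException : Exception
{
    public UpdateProcessFailedException()
    {
    }

    public UpdateProcessFailedException(string message)
        : base(message)
    {
    }

    public UpdateProcessFailedException(string message, Exception inner)
        : base(message, inner)
    {
    }

    // This constructor is what actually lets the exception cross
    // AppDomain / remoting boundaries in one piece.
    protected UpdateProcessFailedException(SerializationInfo info, StreamingContext context)
        : base(info, context)
    {
    }
}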

Moving on, a large part of what NAU does is to check a remote source for updates, then apply them. I liked this piece of code:

public class AppcastReader : IUpdateFeedReader
{
    // http://learn.adobe.com/wiki/display/ADCdocs/Appcasting+RSS

    #region IUpdateFeedReader Members

    public IList<IUpdateTask> Read(string feed)
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(feed);
        XmlNodeList nl = doc.SelectNodes("/rss/channel/item");

        List<IUpdateTask> ret = new List<IUpdateTask>();

        foreach (XmlNode n in nl)
        {
            FileUpdateTask task = new FileUpdateTask();
            task.Description = n["description"].InnerText;
            //task.UpdateTo = n["enclosure"].Attributes["url"].Value;
            task.UpdateTo = n["enclosure"].Attributes["url"].Value;

            FileVersionCondition cnd = new FileVersionCondition();
            cnd.Version = n["appcast:version"].InnerText;
            task.UpdateConditions.AddCondition(cnd, BooleanCondition.ConditionType.AND);

            ret.Add(task);
        }

        return ret;
    }

    #endregion
}

This is integrating with an external resource, and I like that this is simple to read and understand. I don’t have a lot of infrastructure going on in here that I have to deal with just to get what I want done.

There is a more complex feed reader for an internal format that allows you to use the full option set of NAU, but it is a bit complex (it does a lot more, to be fair), and again, I dislike the error handling there.

Another thing that bugged me on some level was this code:

image

The issue, and I admit that this is probably academic, is what happens if the string is large. I try to avoid exposing an API that might force users to materialize a large data set in memory. This has implications for the working set, the large object heap, etc.

Instead, I would probably expose a TextReader or even a Stream.
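
Something along these lines, for example. This is a sketch of the shape I have in mind, not the actual NAU interface:

using System.Collections.Generic;
using System.IO;
using System.Xml;

public interface IStreamingFeedReader
{
    // Accepting a reader means the caller can stream the feed from disk or
    // the network without ever materializing it as one big string.
    IList<string> Read(TextReader feed);
}

public class StreamingAppcastReader : IStreamingFeedReader
{
    public IList<string> Read(TextReader feed)
    {
        var items = new List<string>();
        using (XmlReader reader = XmlReader.Create(feed))
        {
            while (reader.ReadToFollowing("item"))
                items.Add(reader.ReadOuterXml()); // stand-in for building the real update task
        }
        return items;
    }
}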

Coming back to the error handling, we have this:

FileDownloader fd = null;
if (Uri.IsWellFormedUriString(url, UriKind.Absolute))
    fd = new FileDownloader(url);
else if (Uri.IsWellFormedUriString(baseUrl, UriKind.Absolute))
    fd = new FileDownloader(new Uri(new Uri(baseUrl, UriKind.Absolute), url));
else if (Uri.IsWellFormedUriString(new Uri(new Uri(baseUrl), url).AbsoluteUri, UriKind.Absolute))
    fd = new FileDownloader(new Uri(new Uri(baseUrl), url));

if (fd == null)
    throw new ArgumentException("The requested URI does not look valid: " + url, "url");

My gripe is with the last line. Instead of doing it like this, I would have created a new Uri and let it throw the error. It is likely that it will have much more accurate error information about the actual reason this Uri isn’t valid.
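
Roughly what I mean, as an illustration rather than an actual patch against NAU:

using System;

public static class DownloadUriResolver
{
    public static Uri Resolve(string baseUrl, string url)
    {
        // If url is absolute, this either succeeds or throws a UriFormatException
        // that explains *why* it is invalid, instead of a generic
        // "does not look valid" ArgumentException.
        if (Uri.IsWellFormedUriString(url, UriKind.Absolute))
            return new Uri(url, UriKind.Absolute);

        // Same story for the relative case: let the Uri constructor report
        // exactly which part it choked on.
        return new Uri(new Uri(baseUrl, UriKind.Absolute), url);
    }
}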

On the gripping hand, we have this guy:

image

This is the abstraction that NAU manages: the Prepare() method is called to do all the actual work (downloading files, for example) and Execute() is called when we are done and just want to apply the actual update.
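
Since the screenshot doesn’t reproduce here, this is roughly the shape being described (hypothetical signatures, not the actual NAU interface):

public interface IUpdateTask
{
    // Does the heavy lifting up front: downloading files, checking conditions, etc.
    void Prepare();

    // Applies the already prepared update; ideally fast and hard to interrupt.
    void Execute();
}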

I know that I am harping on this, but it is really important. Take a look at another error handling issue (this is representative of the coding style inside this particular class) in the RegistryTask:

image

So, I have an update that is failing at a customer site. What error do I have? How do I know what went wrong?

Worse, here is another snippet from the FileUpdateTask:

image

For some things, an exception is thrown; for others, we just return false.

I can’t follow the logic for this, and trying to diagnose issues in production can be… challenging.

In summary, I went through most of the actually important bits of NAppUpdate, and like in most reviews, I focused on the stuff that can be improved. I didn’t touch on the stuff it does well, which is, well, doing its job.

We have been using NAppUpdate for the last couple of years, with very few issues. It quite nicely resolves the entire issue of auto updates, to the point where, in our code, we only need to call a few methods and it takes care of the entire process for us.

Major recommendations from this codebase:

  • Standardize the error handling approach. It is important.
  • And even more important, logging is crucial to be able to diagnose issues in the field. NAU should log pretty much everything it does and why it does it.

This will allow later diagnosis of issues with relative ease, vs. “I need you to reproduce this and then break into the debugger”.

That ain’t no Open Source that I see here

Some things just piss me off. But before I get to what pissed me off this time, let me set the scene.

We usually ask candidates applying for a job at Hibernating Rhinos to submit some piece of code that they wrote. They get additional points if their code is an Open Source project.

Some people have some… issues with the concept. The replies I got were:

  • I don’t know if I can send you the code, I’ll have to ask my employer. (Which seems a really silly thing to do, considering you want to show the code to some other company that you want to hire you.)
  • Here is the code, but don’t tell anyone. (Those usually get deleted immediately after I send them a scathing reply about things like IP and how important it is to respect that).
  • Here is my last course code. (Which is what actually triggered this post).

Here is the deal, if you aren’t coding for fun, you are not suitable for a developer position in Hibernating Rhinos. Just to give you some idea, currently we have the following pet projects that I am aware of:

  • Jewish Sacred Books repository – display / commentary
  • Jewish Sacred Books repository – search / organization (Note that the two are by two different people and are totally unrelated.)
  • Music game app for Android, iOS and WP7
  • Personal finance app
  • Auto update library for .NET
  • Various OSS projects

And probably other stuff that I am not aware of. (Just for the record, those are things that they are working on in their own time, not company time. And not because I or anyone else told them to.)

Why is this relevant? Because I keep getting people who think submitting some random piece of code that they have from their latest university course is a good way to show their mad code skillz.

I mean, sure, that might do it, but consider carefully what sort of projects you are usually given as part of university courses. They are usually very small, focusing on just one aspect, and they are totally driven by whatever the crazy professor thinks is a valid coding standard. Usually, that is NOT a good candidate for sending as code to a job interview.

I am going to share just one line from a codebase that I recently got:

private void doSwap(ref Album io_Album1, ref Album io_Album2)

The code is in C#, in case you are wondering. And you can probably learn a lot about the state of the codebase from just this line of code. Among my chief complaints:

  • Violating the .NET framework naming guidelines (method name).
  • Violating the .NET framework naming guidelines (argument names).
  • Swapping parameters, seriously?! What, are you writing your own sort routine? And yeah, the answer is yes.
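
Just for contrast, the idiomatic version of that line would look something like this (illustrative only):

// The same method, named and cased per the framework guidelines:
// private static void Swap(ref Album first, ref Album second)
//
// Or, since there is nothing Album specific about swapping, a generic version:
private static void Swap<T>(ref T first, ref T second)
{
    T temp = first;
    first = second;
    second = temp;
}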

When I pinged the author of the code, he replied that this was because of the course requirements. They had strict Hungarian-style notation guidelines, and io_ stands for an input & output parameter.

They had other guidelines (you may not use foreach, for example) that explained some strangeness in the codebase.

But that isn’t really the point. I can understand crazy coding standards, what I can’t understand is why someone would submit something that would raise so many red flags so quickly as part of a job application process.

This is wasting everyone’s time. And that is quite annoying.


Send me a patch for that

This post is in reply to Hadi’s post. Please go ahead and read it.

Done? Great, so let me try to respond, this time, from the point of view of someone who regularly asks for patches / pull requests.

Here are a few examples.

To make things more interesting, the project that I am talking about now is RavenDB which is both Open Source and commercial. Hadi says:

Numerous times I’ve seen reactions from OSS developers, contributors or merely a simple passer by, responding to a complaint with: submit a patch or well if you can do better, write your own framework. In other words, put up or shut up.

Hadi then goes on to explain exactly why this is a high barrier for most users.

  • You need to familiarize yourself with the codebase.
  • You need to understand the source control system that is used and how to send a patch / pull request.

And I would fully agree with Hadi that those are stumbling blocks. I can’t speak for other people, but in our case, that is the intention.

Nitpicker corner: I am speaking explicitly and only about features here. Bugs get fixed by us (unless the user already submitted a fix as well).

Put simply, there is an issue of priorities here. We have a certain direction in which we want to take the project. And in many cases, users want things that are out of scope for us for the foreseeable future. Our options then become:

  • Sorry, ain’t going to happen.
  • Sure, we will push aside all the work that we intended to do to do your thing.
  • No problem, we added that to the queue, expect it in 6 – 9 months, if we will still consider it important then.

None of which is an acceptable answer from our point of view.

Case in point: facets support in RavenDB was something that was requested a few times. We never did it because it was out of scope for our plan; RavenDB is a database server, not a search server, and we weren’t really sure how complex this would be or how to implement it. Basically, this was an expensive feature that wasn’t in the major feature set that we wanted. The answer that we gave people was “send me a pull request for that”.

To be clear, this is basically an opportunity to affect the direction of the project in a way you consider important. What ended up happening is that Matt Warren took up the task and created an initial implementation, which was then subject to intense refactoring and finally got into the product. You can see the entire conversation about this here. The major difference along the way is that Matt did all the research for this feature, and he had working code. From there, the balance changed. It was no longer an issue of expensive research and figuring out how to do it. It was an issue of having working code and refactoring it so it matched the rest of the RavenDB codebase. That wasn’t expensive, and we got a new feature in.

Here is another story, a case where I flat out didn’t think it was possible. About two years ago Rob Ashton had a feature suggestion (ad hoc queries with RavenDB). Frankly, I thought that this was simply not possible, and after a bit of back and forth, I told Rob:

Let me rephrase that.
Dream up the API from the client side to do this.

Rob went away for a few hours, and then came back with a working code sample. I had to pick my jaw off the floor using both hands. That feature got a lot of priority right away, and is a feature that I routinely brag about when talking about RavenDB.

But let me come back again to the common case, a user requesting something that isn’t in the project plan. Now, remember, requests are cheap. From the point of view of the user, it doesn’t cost anything to request a feature. From the point of view of the project, it can cost a lot. There is research, implementation, debugging, backward compatibility, testing and continuous support associated with just about any feature you care to name.

And our options whenever a user makes a request that is out of line with the project plan are:

  • Sorry, ain’t going to happen.
  • Sure, we will push aside all the work that we intended to do to do your thing.
  • No problem, we added that to the queue, expect it in 6 – 9 months, if we will still consider it important then.

Or, we can also say:

  • We don’t have the resources to currently do that, but we would gladly accept a pull request to do so.

At that point, the user is faced with a choice. He can either say:

  • Oh, well, it isn’t important to me.
  • Oh, it is important to me, so I had better do that.

In other words, it shifts the prioritization to the user, based on how important that feature is to them.

We recently got a feature request to support something like this:

session.Query<User>()
   .Where(x => searchInput.Name != null && x.User == searchInput.Name)
   .ToArray();

I’ll spare you the details of just how complex it is to implement something like that (especially when it can also be things like searchInput.Age > 18). But the simple workaround for that is:

var q = session.Query<User>();
if (searchInput.Name != null)
  q = q.Where(x => x.User == searchInput.Name);

q.ToArray();

Supporting the first one is complex, and there is a simple workaround that the user can use (and I like the second option from the point of view of readability as well).

That sort of thing gets a “a pull request for this feature would be appreciated”. Because the alternative to that is to slam the door in the user’s face.

Why RavenDB is OSS

There are actually several reasons for that, not the least of which is that I like working on OSS projects. But there is also another reason for that, which is very important for adoption:

image

Joel Spolsky has an interesting article about just this topic.

When I sit down to architect a system, I have to decide which tools to use. And a good architect only uses tools that can either be trusted, or that can be fixed. "Trusted" doesn't mean that they were made by some big company that you're supposed to trust like IBM, it means that you know in your heart that it's going to work right. I think today most Windows programmers trust Visual C++, for example. They may not trust MFC, but MFC comes with source, and so even though it can't be trusted, it can be fixed when you discover how truly atrocious the async socket library is. So it's OK to bet your career on MFC, too.

You can bet your career on the Oracle DBMS, because it just works and everybody knows it. And you can bet your career on Berkeley DB, because if it screws up, you go into the source code and fix it. But you probably don't want to bet your career on a non-open-source, not-well-known tool. You can use that for experiments, but it's not a bet-your-career kind of tool.

I have used the same logic myself in the past, and I think it is compelling.

The down side of porting

I got to this code as a result of a profiling session (shown below):

image

And here is how I got there:

image

Well, the fun part is that by the time I found it, it got fixed :-)

Open Source Success Metrics

I got into a very interesting discussion with Rob Conery recently, talking about, among other things, the CodePlex Foundation. You can hear it in TekPub’s Podcast #2, when it is out later this week. Tangential to our discussion, but very important, is how an Open Source project owner defines success for their project.

Usually, when people try to judge whether an open source project is successful or not, they look at the number of users and whether the general response to the project is positive. But I am involved in a lot of open source projects, so you might say that I have an insider’s view on how the project owners see this.

Now, let us be clear, I am talking about OSS projects that are led by an individual or a group of individuals. There are different semantics for OSS projects that are led by a company.

There are several reasons for individuals to be involved in OSS projects, those are usually:

  • Scratch an itch – I want to do something challenging/fun.
  • Need it for myself – This is something that the owner started because they needed it themselves and open sourced it for their own reasons.
  • Reputation building – Creating this project would give me some street cred.
  • I use it at work – Usually applicable for people joining an existing project, they use it at work and contribute stuff that they require.

We must also make a distinction between OSS adoption and OSS contribution. Typically, maybe a tenth of a percent of the adopters will also be contributors. Now, let us talk about the owner’s success metric again. Unless the owner started the project to gain reputation, the number of people adopting your project is pretty meaningless. I think that DHH, the creator of Ruby on Rails, did a great job describing the owner sentiment:

I'm not in this world to create Rails for you. I'm in this world to create Rails for me and if you happen to like that version of Rails that I'm creating for me, then you are going to have a great time.

Even if you are interested in OSS for reputation building, after a while it stops being a factor. Either you got your reputation, or you never will. A good example of a project started to gain reputation is Rhino Mocks. And you know what, it worked. But I think that it is safe to say that I am no longer just that Rhino Mocks guy, so the same rules apply as for the other motivations.

So, what does it mean, from an owner perspective, when you ask if an OSS project is successful or not?  The answer is simple: it does what I need it to do.

I asked a popular OSS project owner the following question: what would your response be if I told you that I am taking out a full page ad in the Times magazine proclaiming your stuff to be the best thing since sliced bread? You’ll get a million users out of that.

His response was: Bleh, no!

Tying it back to the discussion that I had with Rob, I feel that much of the confusion with regards to the CodePlex Foundation’s role comes from trying to talk to projects led by individuals as if they were commercial enterprises. The goals are just completely different, and in many cases, adding more users to the project will actually be bad for it, because it puts a higher support cost on the project team.

In the .NET ecosystem, most of the projects aren’t being led by a company. They are led by individuals. That is an important distinction, and understanding it would probably make it clear why the most common response for the CodePlex Foundation was: What is in it for me?

Impleo – a CMS I can tolerate

If you head out to http://hibernatingrhinos.com/, you will see that I finally had the time to set up the corporate site. This is still very early and I have a lot of content to add there, but it is a start.

Impleo, the CMS running the site, doesn’t have any web based interface; instead, it is built explicitly to take advantage of Windows Live Writer and similar tools. The “interface” for editing the site is the MetaWeblog API. This means that in order to edit the site, there isn’t any Wiki syntax to learn, or XML files to edit, or anything of this sort.

You have a powerful editor at your fingertips, one that properly handles things like adding images and other content. This turns the whole experience around. I usually find writing documentation boring, but I am used to writing in WLW, it is fairly natural to do, and it removes all the pain from the equation.

One of the things that I am trying to do with it is to set up a proper documentation repository for all my open source projects. This isn’t something new, and it is something that most projects have a hard time doing. I strongly believe in making things simple, in reducing friction. What I hope to do is to be able to accept documentation contributions from the community for the OSS projects.

I think that having a full fledged rich text editor in your hands is a game changer, compared to the usual way OSS projects handle documentation. Take a look at what is needed to make this work: it should take three minutes to get started, no learning curve, no “how do they do this”.

So here is the deal, if you would like to contribute documentation (which can be anything that would help users with the projects), I just made things much easier for you. Please contact me directly and I’ll send you the credentials to be able to edit the site.

Thanks in advance for your support.

The law of unintended consequences

A while ago I got a tattoo on my forearm (did I mention that DevTeach is wild?). Here is what it looks like:

image

To preempt the nitpickers, this was my logo first, I reused it for other stuff afterward (including Rhino Mocks & NH Prof), but it is mine. I did not tattoo NH Prof’s logo.

So a while ago I was buying something in the supermarket and the clerk asked me why I had this tattoo. I told her that I like rhinos, and she said, but it is not a rhino.

Took me a while to realize that from her perspective, it looked like this:

image

Do you see it? You may need to tilt your head a bit to see it.

I guess that if need be, I can now prove that I am an OSS aficionado.

Open Source development model

As someone who does a lot of Open Source stuff, I find myself in an interesting position on the CodePlex Foundation mailing list. I am the one who keeps talking about letting things die on the vine if they aren’t successful on their own.

I am going to try to put a lot of discussion into a single (hopefully) coherent post. Most of the points that I am going to bring up are from the point of view of an OSS project that got traction already (has multiple committers, community, outside contribution).

One of the oft repeated themes of the conversation in the CPF mailing list is that the aim is to encourage OSS adoption and contributions to OSS in businesses and corporations.

That sounds nice, but I don’t really get why.

From the business side: if a business doesn’t want to use OSS, then it is at a competitive disadvantage compared to its competitors that do make use of it, since OSS projects tend to make great infrastructure and provide a high quality base to work from. If you choose to develop things in house, it is going to cost you a lot. And you are likely going to end up with an inferior quality solution.

This is not to disparage anyone’s effort, but an OSS project that has traction behind it is likely to have a lot more eyes & attention on it than a one off solution. The Java side has demonstrated that quite clearly.

Even in the .Net world, I can tell you that I am aware of Fortune 50 companies making use of things like NHibernate or Castle. They can most certainly fund building a project of similar size, but it doesn’t make economic sense to do so.

From the project side, if you have enough traction, you don’t generally worry about the OSS-fearing businesses. It is their loss, not the project’s.

It would be more accurate to say that the project won’t feel any pain if a business decides not to use it. Remember that unlike commercial software, OSS projects don’t really have an incentive to “sell” more & more.

There is the incentive to grow bigger (for ego reasons, if nothing else), get more people involved, add more features, etc. But unless there is some business model behind it (and in the .NET world, there are very few projects with a business model behind them), growing the project usually means problems for the project team.

As a simple example, the Rhino Mocks mailing list has an average of 140 messages per month. I had to scale down my own involvement in the mailing list because it took too much of my time. The NHibernate Users mailing list is crazy, averaging a thousand messages a month this year alone.

That is even assuming that I want traction for a project, which isn’t always the case. As a good example, I have a lot of stuff that I put out as one-use only solutions. Rhino Igloo is a good example of that, a WebForms MVC framework that we needed for a single project. I built it as OSS, we get contributions for it once in a while. But if it gets to be *very* active, I am going to find myself in a problem, because I don't really want to maintain it anymore.

But in general, for most projects I do want to have more contributors. On the CPF mailing list, the issue of getting contributions from companies was brought up as problematic. I disagree, since I don’t find that the problems that were brought up (getting corporate and legal sign-off for contributing work, getting people to adopt OSS for commercial uses) have any relevance whatsoever to getting more contributors. By far, most contributions that we get for the projects I am involved in are from people making commercial use of them.

But usually, I don’t really care about adoption. I have 15 - 20 OSS projects that I have either founded or am a member of; in exactly one of them did I care about adoption (Rhino Mocks), and that was 5 years ago, mainly because I thought it would give me some credentials when I was looking for a job (and it did).

For all the rest, I am working on those because I need them to solve a problem. I get the benefit that other people are going to look at them and contribute if they feel like it, but mostly, I am working on OSS to solve a problem, the number of users in a project isn't something that I really care about.

There were three scenarios that were discussed in the mailing list that I want to address in particular.

A company would like to pay you 5 times your normal rates, but they have a “no OSS” policy, thus losing the contract.

I have to say that this scenario never happened to me. Oh, I had to talk with the business plenty of times. It is easy to show them why OSS is the safer choice.

Today, that is fairly easy. I can point out stats like this: http://www.ohloh.net/p/nhibernate and note that trying to build something like NH is going to cost you on the order of 130 years and ~15 million dollars. I can tell them that going with Microsoft’s data access method is a good way to throw good money at upgrading their data access methodology every two years. I can point them to a whole host of people making good use of it.

I have lots of arguments to use. And they tend to work, quite well, in fact. I may need to talk to the lawyers, but that has generally been a pretty straightforward deal. So no, I don’t lose clients because of a “no OSS” rule.

Besides, you know what, if they are willing to pay me 5 times my normal rate, I am going to be very explicit about making my preferences known and explaining the benefits. Afterward, they are the client; if they want to pay me gobs of money, I am not going to complain even if I am going to use NIH as the root namespace.

Corporate developers have a problem getting permission to use OSS projects in their product or project.

I have seen it happen a few times in the past, but it is growing rarer now. The main problem was never legal, it was the .NET culture more than anything else. The acceptance of OSS as a viable source of software had more to do with team leads and architects accepting that than any legal team putting hurdles in the path.

Oh, you still need to talk to legal beforehand, but you are going to do that when bringing in a 3rd party component anyway. (You do make sure to run any commercial legal agreements through the legal department, right? You need to know that there aren’t any hooks involved there.)

Corporate developers have a problem getting permission to contribute to OSS projects.

Once OSS is adopted, I have never run into an issue where legal stopped the contribution of a patch. There are damn good reasons for the business to want this, after all. To that manager, I am going to say: “Look, we can maintain it ourselves, or we give it to the project and they maintain/fix/debug/improve it. We get great credit and we gain a lot for work we would have done anyway.”

A few final thoughts, OSS projects are a dime a dozen. In the .Net space alone there are thousands. Most of them, I have to say, aren’t really that interesting. Out of those thousands of projects, there are going to be a few that would get traction, attract additional committers, outside contributions and a community.

I think it would be safe to say that there are around fifty such projects in the .Net space. There is nothing particularly noble or important in an OSS project that requires special treatment. If it gets enough attention, it will live on. If it doesn’t, who cares (except maybe the author)?

The CodePlex Foundation, however it may end up, is going to be dealing with the top fifty or so projects, simply because trying to reach the long tail of OSS projects is a futile task. I mentioned what I think would be good ways of helping the top projects (resources, coaching, money).

Another approach would be to turn it around, the CPF can focus on building a viable business model for OSS projects. A healthy OSS project is one that makes money for the people who contribute to it. It may be directly or indirectly, but if it isn’t going to do that, it isn’t going to live long. A labor of love would keep one or two committers working on a project, but it wouldn’t generally sustain a team.

Finally, something that I think seems to get lost in all the noise around it, Open Source projects are about the code. I hear a lot about legal issues, making business adopt OSS, etc. I don’t see discussion about the main thing.

Show me the code!

CodePlex Foundation

I wanted to drop a few words about CodePlex Foundation.

To preempt the snarky comments, no I have no knowledge about the foundation beyond what was made public.

The CodePlex Foundation is apparently about:

Enabling the exchange of code and understanding among software companies and open source communities.

That is good on several levels.

  • It is another stepping stone in making OSS an acceptable solution in the .NET ecosystem.
  • It is explicitly set up to encourage the use of OSS in commercial settings.

It is still way too early to tell, but my hope is that it will become a platform on top of which OSS projects and contributors can build commercially viable solutions. Working on OSS is hard when you have to donate all your time and energy.

Ideally, I would like to see the foundation work to make that happen. I have some ideas about this, such as sponsoring some of the projects outright, or contributing resources for things like build servers, tech writers, support systems, etc.

Can you make money on OSS tooling in the .NET world?

The answer is yes. Here is a chart of downloads and orders of NH Prof.

image

I am fairly happy with the way NH Prof sells. I think it could be better, but I want to see what happens to sales when I release the v1.0 version (which will be soon).

Nitpicker corner: Numbers have been removed from the chart for a reason.

The state of Open Source in the .NET ecosystem: A five year summary

I have been forcibly reminded lately that I have been doing this for quite some time. In fact, I have been working with Open Source on the .Net platform for over 5 years now. And a few conversations with friends have given me quite a retrospective on the state of OSS.Net.

5 years ago, it was 2004, .Net 1.1 was still a hot thing, and Stored Procedures on top of datasets were still a raging debate. Open source was considered a threat and Steve Ballmer was busy blasting at any OSS project that showed up. The very existence of, or need for, OSS on the .NET platform was frequently questioned.

I remember trying to find work in 2005, after over a year of actively working on Open Source projects and with Rhino Mocks making steady but sure progress in the .NET TDD community and not being able to leverage that experience into job interviews. It was only commercial experience that counted for the gate keepers.

The last 5 years have been quite interesting in the .NET ecosystem from the OSS world. It has gotten to the point where the use of OSS tools, frameworks and platforms is no longer a strange exception, but is quite common place.

There are several data points on which I am basing this statement:

  • Books about OSS projects are commonly published.
  • Microsoft is doing a lot to encourage OSS on the .Net platform.
  • NHibernate’s download numbers are consistently above ten thousand a month, usually closer to or above twenty thousand a month.
  • I released Windsor 2.0 not even two weeks ago, and it has over 1,200 downloads already.
  • The number of messages on the NHibernate users mailing list is usually above a thousand per month.
  • My NHibernate course sold out and I have to do a repeat course to satisfy demand.

And then, there is my own experience, both as a member of the community and as a consultant. I see OSS being used quite often. A lot of my engagements are about OSS tools and frameworks, and I am rarely the person to introduce them into the company.

I think that there are several factors at play here, but most of it comes down to maturity. The OSS players in the .NET world have had some time to work on things, and most established projects have been around for years. NHibernate is 6 years old, Castle is 5, Rhino Mocks 4. It is now the Open Source world that represents stability. With Microsoft replacing their data access strategy every two years, it might be best to use NHibernate, because it has been around for a long time already.

There is also the issue of maturity in the ecosystem itself. It has become quite clear that it is acceptable and even desirable to use OSS projects. And we have companies making explicit decisions to support Open Source projects (iMeta’s decision to donate 3 dev months is just one example, although the most prominent one). Recently I was working with a client on strategies for Open Sourcing their software, and how to manage a good Open Source project. At another client, a decision was reached to put all the infrastructure stuff out as Open Source, even newly developed code, because it is infrastructure. Infrastructure is seen as a commodity, and as such, there is little value in trying to make it unique.

There is a lot of value, however, in making it Open Source and accepting improvements from others. And I was able to point out to that client how outside contributions to the infrastructure have enabled us to do things that we would otherwise have had to do ourselves.

Things are changing, I don’t think that we are at the balance point yet, but I think that we are seeing a very big shift, happening very very slowly. And from the Open Source perspective, things are looking quite good.

Castle Windsor 2.0 RTM Released

Some would say that it is about time, and I would agree. Windsor might not be the OSS project that spent the longest time in a pre-release state (I think that honor belongs to Hurd), but it spent enough time there to at least deserve an honorary mention.

That was mostly because, although Windsor was production ready for the last three or four years or so, most of the people making use of it were happy to make use of the trunk version.

If you look, you won’t find a Windsor 1.0, only release candidates for 1.0. As I believe I mentioned, Windsor has been production ready for a long time, and for the full release we decided to skip the 1.0 designator, which doesn’t really fit, and go directly to 2.0.

The last Windsor release (RC3) was almost a year and a half ago, and in the meantime, much has improved in Windsor land. Building on the already superb engine and facilities, we have fitted Windsor to the 3.5 release of the .Net framework, created a full fledged fluent API to support easy configuration, allowed more granular control over the behavior of the container when selecting components and handlers, and improved overall performance.
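
To give a taste of the fluent registration API, here is a minimal example. I am writing this from memory, so treat the exact member names as approximate:

using Castle.MicroKernel.Registration;
using Castle.Windsor;

public interface ICustomerRepository { }
public class CustomerRepository : ICustomerRepository { }

public static class ContainerBootstrapper
{
    public static IWindsorContainer Bootstrap()
    {
        var container = new WindsorContainer();

        // Register a component in code instead of XML configuration.
        container.Register(
            Component.For<ICustomerRepository>()
                .ImplementedBy<CustomerRepository>()
                .LifeStyle.Transient);

        return container;
    }
}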

All in all, pretty good stuff, even if I say so myself. Just to give you an idea, the list of changes from the previous release goes on for quite a while, so I am going to let the short listing above stand in its place.

You can get the new release from the source forge site.

Ayende's Open Source Project Maturity Model

From my point of view, there is a very easy model for gauging the maturity of an Open Source project. Look at the answers on the project's mailing list. The questions do not matter.

The model is simple, the higher the percentage of answers given by non project members, the more mature the project is.

NHProf, Open Source, Licensing and a WTF in a good sense

I have a Google alert set up for NH Prof.

I got the following alert.

image

I was willing to mutter a few choice curses and let it go, because there really isn't much that you can do about this. But then I followed up on the rest of the thread.

image

Um... thanks? I mean, I sure appreciate the sentiment.

But the fun continues...

image

And the replies...

image

Honestly, I wouldn't believe it if I didn't see it with my own eyes.

acexman & cluka, thanks.

Oufti, I don't think that I like you very much.

A note to Microsoft: Agile or open source doesn’t excuse it being crap

I explicitly don’t want to go over the exact scenario that this is relating to. I want to talk about a general sentiment that I got from several people from Microsoft a few times, which I find annoying.

It can be summed up pretty easily by this quote:

You all know that we work on the Agile process here, right? We get something out (perhaps a little early) and then improve it. Codeplex is for open source and continuous improvement with community feedback.

The context is a response to a critique about an unacceptable level of quality in something Microsoft put out. Again, I do not want to discuss the specifics. I want to discuss the sentiment; I got answers in a similar spirit from several Microsoft people recently, and I find it annoying in the extreme.

Agile doesn’t mean that you start with crap, call it organic fertilizer and try to tell me that it will improve in the future. Quality is supposed to be built in, it is the scope that you grow incrementally, not the product quality.

I actually find the open source comment to be even more annoying. Open source does not mean that you get someone else to do your dirty work. And if you take something and call it open source, it doesn’t mean that you are not going to get called on the carpet for the quality of whatever you released.

Calling it open source does not mean that the community is accountable for its quality.

The Common Service Locator library

Glenn and Chris have already gotten the word out, but this is an important piece of news.

The idea is based on a post that Jeremy Miller had about a month ago, about having a common, shared interface across several IoC implementations. That would allow library authors to make use of the benefits of a container without taking a dependency on a particular implementation.

The alternative is for each library to create its own abstraction layer. A good example of that is NServiceBus' IBuilder interface, or ASP.NET MVC's IControllerFactory. That is just annoying, especially if you are integrating more than a single such framework.

This project was created with the aid of most of the IoC container authors on the .NET framework, and we have adapters for Windsor, Unity and Spring.NET already, with adapters for the rest of the containers coming soon.

What is it not?

It is not meant to be a complete container abstraction. The reason that the interface is so small (and you wouldn't believe the amount of time that it took to settle on exactly what is going to be there) is that this is not supposed to be the container that you are using. This is explicitly designed to be a read only interface that allows a library to use the container, not some uber container interface (which doesn't really make sense considering the differences between the containers).

Why Service Locator vs. Container? Again, the design is focused on enabling integration scenarios, more than anything else.

I still recommend avoiding explicit service locator usage whenever possible, and relying on the container and dependency injection instead. This is not always possible, which is what this library is supposed to solve.
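
To make the intended usage concrete, here is a minimal sketch of both sides. The type names are from the CommonServiceLocator bits as I remember them, so double check against the actual package:

using Microsoft.Practices.ServiceLocation;

public interface IMessageHandler
{
    void Handle(object message);
}

// Library side: depends only on the shared abstraction, not on any specific container.
public class MessageDispatcher
{
    public void Dispatch(object message)
    {
        var handler = ServiceLocator.Current.GetInstance<IMessageHandler>();
        handler.Handle(message);
    }
}

// Application side: pick a concrete container, wrap it with the matching
// adapter (Windsor, Unity, Spring.NET, ...) and plug it in once at startup.
public static class Bootstrapper
{
    public static void WireUp(IServiceLocator containerAdapter)
    {
        ServiceLocator.SetLocatorProvider(() => containerAdapter);
    }
}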

Who should use this?

If you are releasing a piece of code that is taking a dependency on a container, you should consider that as an option. It is generally considered to be a better idea than to force your choice of containers on the clients. The OAuth library for .NET is going to move to this, for exactly this reason.

I am from Microsoft, can I use this?

The code is released under the MS-PL, and all of the work on the project was done by Microsoft employees. That was quite intentional, since it allows Microsoft projects to make use of it (this is Microsoft IP). If you are a Microsoft employee, you can make use of it. In fact, according to Chris, the next version of the Enterprise Library is going to use this as the container abstraction, so you'll be able to use the Enterprise Library with Windsor and StructureMap and Spring.NET and Ninject and AutoFac and all the rest of the things that I forgot.

I can use only Microsoft products, can I use this?

See previous question.

Where can I find the code

The codeplex site is here: http://www.codeplex.com/CommonServiceLocator

On jQuery & Microsoft

No, I am not going to bore you with another repetition of the news. Yeah, Microsoft is going to bundle jQuery with Visual Studio and the ASP.Net MVC. That is important, but not quite as important as something else that I didn't see other people pointing out.

This is the first time in a long time that I have seen Microsoft incorporating an Open Source project into their product line.

I am both thrilled and shocked.

How to expose an OSS build server?

I just finished setting up a build server for Rhino Tools. Ideally, I want it to be publicly accessible, and have people download the build artifacts after each build. However, CC.Net is not something that you want to just expose to the web. It has no security model (any random Joe can just start a build, hence a DoS risk).

Any suggestions?

I should note that anything that involves a significant amount of time is going to be answered with: "Great, when can you help me do that?"

It is alive! CodePlex has Subversion Access

It is so much fun to see things that I worked on coming alive. The official announcement is here, with all the boring details. You can skip all of that and go read the code directly using SVN by hitting: https://svnbridge.svn.codeplex.com/svn

Swap svnbridge for your own project name, and you are done. Note that this is https. And yes, it should work with git-svn as well.

Way cool!

Persistent DSL caching issues

A while ago I talked about persistent DSL caching. I was asked why my solution was not a built-in part of Rhino DSL.

The reason for that is that this is actually not such a simple problem. Let me point out a few of the issues that are non-obvious.

  • Need to handle removal of scripts
  • Need to handle updating scripts
  • Need to handle new scripts

Those are easy, sort of, but what about this one?

  • Need to handle DSL updates

When you are in development mode, you really need to know that changing the way the DSL behaves would also invalidate any cache.
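
Just to illustrate one way of handling that last point, here is a sketch (not what Rhino DSL actually does) of a cache key that changes whenever either the script or the DSL engine changes:

using System;
using System.IO;
using System.Security.Cryptography;
using System.Text;

public static class DslCacheKey
{
    // If either the script text or the DSL engine assembly changes, the key
    // changes, and the previously cached compilation is simply not found.
    public static string For(string scriptPath, Type dslEngineType)
    {
        string scriptText = File.ReadAllText(scriptPath);

        // In development you would probably want the assembly file timestamp
        // as well, since the version number rarely changes between builds.
        string dslVersion = dslEngineType.Assembly.FullName;

        using (var sha1 = SHA1.Create())
        {
            byte[] hash = sha1.ComputeHash(Encoding.UTF8.GetBytes(scriptText + "|" + dslVersion));
            return BitConverter.ToString(hash).Replace("-", "");
        }
    }
}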

I like to keep a very high bar of quality for the software I make, and there is a fine distinction between one-off attempts and reusable ones. One-off attempts can be hackish and stupid. Reusable implementations should be written properly.

And no, there isn’t anything overly complex here. It just takes time to cover all the bases.

Anyone feel like submitting a patch?

So where were you?

This post has really pissed me off:

It makes me sick to my stomach to think of all the good .NET projects that are now abandoned (or soon will be) because Microsoft seduced their authors away from doing anything that would actually benefit the .NET community.

Excuse me?!

Who exactly said that I owe something to anybody? Who exactly said that any of the guys who went to work for Microsoft (many of whom I consider friends) owe you something? The entire post is a whine about "I can't get the software I want for free".

Well, guess what, no one said it has to be free. Software has no right to be free. If anyone wants to stop dedicating a significant amount of their time to free stuff, that is their decision, for their own reasons. Rhino Mocks is estimated at nine million dollars by Ohloh; I might decide to stop working on it tomorrow, and you don't get a chance to protest that, or even to complain. Put simply, where exactly are your efforts? Where is your money and time?

Because unless you are a customer (in the sense that money exchanged hands), you got stuff for free, and now you complain because people aren't willing to keep doing that anymore?

Now, leaving that aside, to the best of my knowledge, Castle, SubText, dasBlog and SubSonic are all alive and well and have received attention from the respective "seduced" authors.