Ayende @ Rahien

filter by tags archive

architecture (628) rss
bugs (451) rss
challenges (137) rss
community (391) rss
databases (482) rss
design (905) rss
development (673) rss
hibernating-practices (75) rss
miscellaneous (593) rss
performance (398) rss
programming (1125) rss
raven (1490) rss
ravendb.net (580) rss
reviews (184) rss

2026
- May (2)
- April (5)
- February (4)
- January (5)
2025
- December (8)
- November (4)
- October (4)
- September (10)
- August (6)
- July (7)
- June (7)
- May (10)
- April (10)
- March (10)
- February (7)
- January (12)
2024
- December (3)
- November (2)
- October (1)
- September (3)
- August (5)
- July (10)
- June (4)
- May (6)
- April (2)
- March (8)
- February (2)
- January (14)
2023
- December (4)
- October (4)
- September (6)
- August (12)
- July (5)
- June (15)
- May (3)
- April (11)
- March (5)
- February (5)
- January (8)
2022
- December (5)
- November (7)
- October (7)
- September (9)
- August (10)
- July (15)
- June (12)
- May (9)
- April (14)
- March (15)
- February (13)
- January (16)
2021
- December (23)
- November (20)
- October (16)
- September (6)
- August (16)
- July (11)
- June (16)
- May (4)
- April (10)
- March (11)
- February (15)
- January (14)
2020
- December (10)
- November (13)
- October (15)
- September (6)
- August (9)
- July (9)
- June (17)
- May (15)
- April (14)
- March (21)
- February (16)
- January (13)
2019
- December (17)
- November (14)
- October (16)
- September (10)
- August (8)
- July (16)
- June (11)
- May (13)
- April (18)
- March (12)
- February (19)
- January (23)
2018
- December (15)
- November (14)
- October (19)
- September (18)
- August (23)
- July (20)
- June (20)
- May (23)
- April (15)
- March (23)
- February (19)
- January (23)
2017
- December (21)
- November (24)
- October (22)
- September (21)
- August (23)
- July (21)
- June (24)
- May (21)
- April (21)
- March (23)
- February (20)
- January (23)
2016
- December (17)
- November (18)
- October (22)
- September (18)
- August (23)
- July (22)
- June (17)
- May (24)
- April (16)
- March (16)
- February (21)
- January (21)
2015
- December (5)
- November (10)
- October (9)
- September (17)
- August (20)
- July (17)
- June (4)
- May (12)
- April (9)
- March (8)
- February (25)
- January (17)
2014
- December (22)
- November (19)
- October (21)
- September (37)
- August (24)
- July (23)
- June (13)
- May (19)
- April (24)
- March (23)
- February (21)
- January (24)
2013
- December (23)
- November (29)
- October (27)
- September (26)
- August (24)
- July (24)
- June (23)
- May (25)
- April (26)
- March (24)
- February (24)
- January (21)
2012
- December (19)
- November (22)
- October (27)
- September (24)
- August (30)
- July (23)
- June (25)
- May (23)
- April (25)
- March (25)
- February (28)
- January (24)
2011
- December (17)
- November (14)
- October (24)
- September (28)
- August (27)
- July (30)
- June (19)
- May (16)
- April (30)
- March (23)
- February (11)
- January (26)
2010
- December (29)
- November (28)
- October (35)
- September (33)
- August (44)
- July (17)
- June (20)
- May (53)
- April (29)
- March (35)
- February (33)
- January (36)
2009
- December (37)
- November (35)
- October (53)
- September (60)
- August (66)
- July (29)
- June (24)
- May (52)
- April (63)
- March (35)
- February (53)
- January (50)
2008
- December (58)
- November (65)
- October (46)
- September (48)
- August (96)
- July (87)
- June (45)
- May (51)
- April (52)
- March (70)
- February (43)
- January (49)
2007
- December (100)
- November (52)
- October (109)
- September (68)
- August (80)
- July (56)
- June (150)
- May (115)
- April (73)
- March (124)
- February (102)
- January (68)
2006
- December (95)
- November (53)
- October (120)
- September (57)
- August (88)
- July (54)
- June (103)
- May (89)
- April (84)
- March (143)
- February (78)
- January (64)
2005
- December (70)
- November (97)
- October (91)
- September (61)
- August (74)
- July (92)
- June (100)
- May (53)
- April (42)
- March (41)
- February (84)
- January (31)
2004
- December (49)
- November (26)
- October (26)
- September (6)
- April (10)

Feb 27 2013

Rob’s RavenDB Sprint

time to read 1 min | 104 words

Tweet Share Share 5 comments

Tags:

Rob Ashton is a great developer. We invited him to Hibernating Rhinos as part of his Big World Tour. I had the chance to work with him in the past on RavenDB, and I really liked working with him, and I liked the output even better. So we prepared some stuff for him to do.

This is the status of those issues midway through the second day.

And yes, I am giddy.

Feb 20 2013

Open Source Application Review: BitShuva Radio

time to read 39 min | 7645 words

Tweet Share Share 11 comments

Tags:

As part of my ongoing reviews efforts, I am going to review the BitShuva Radio application.

BitShuva Radio is a framework for building internet radio stations with intelligent social features like community rank, thumb-up/down songs, community song requests, and machine learning that responds to the user's likes and dislikes and plays more of the good stuff.

I just cloned the repository and opened it in VS, without reading anything beyond the first line. As usual, I am going to start from the top and move on down:

We already have some really good indications:

There is just one project, not a gazillion of them.
The folders seems to be pretty much the standard ASP.NET MVC ones, so that should be easy to work with.

Some bad indications:

Data & Common folders are likely to be troublesome spots.

Hit Ctrl+F5, and I got this screen, which is a really good indication. There wasn’t a lot of setup required.

Okay, enough with the UI, I can’t really tell if this is good or bad anyway. Let us dive into the code. App_Start, here I come.

I get the feeling that WebAPI and Ninject are used here. I looked in the NinjectWebCommon file, and found:

Okay, I am biased, I’ll admit, but this is good.

Other than the RavenDB stuff, it is pretty boring, standard and normal codebase. No comments so far. Let us see what is this RavenStore all about, which leads us to the Data directory:

So it looks like we have the RavenStore and a couple of indexes. And the code itself:

   1: public class RavenStore

   2: {

   3:     public IDocumentStore CreateDocumentStore()

   4:     {

   5:         var hasRavenConnectionString = ConfigurationManager.ConnectionStrings["RavenDB"] != null;

   6:         var store = default(IDocumentStore);

   7:         if (hasRavenConnectionString)

   8:         {

   9:             store = new DocumentStore { ConnectionStringName = "RavenDB" };

  10:         }

  11:         else

  12:         {

  13:             store = new EmbeddableDocumentStore { DataDirectory = "~/App_Data/Raven" };

  14:         }

15:

  16:         store.Initialize();

  17:         IndexCreation.CreateIndexes(typeof(RavenStore).Assembly, store);

  18:         return store;

  19:     }

  20: }

I think that this code need to be improved, to start with, there is no need for this to be an instance. And there is no reason why you can’t use EmbeddableDocumentStore to use remote stuff.

I would probably write it like this, but yes, this is stretching things:

   1: public static class RavenStore

   2: {

   3:     public static IDocumentStore CreateDocumentStore()

   4:     {

   5:         var store = new EmbeddableDocumentStore

   6:             {

   7:                 DataDirectory = "~/App_Data/Raven"

   8:             };

9:

  10:         if (ConfigurationManager.ConnectionStrings["RavenDB"] != null)

  11:         {

  12:             store.ConnectionStringName = "RavenDB";

  13:         }

  14:         store.Initialize();

  15:         IndexCreation.CreateIndexes(typeof(RavenStore).Assembly, store);

  16:         return store;

  17:     }

  18: }

I intended to just glance at the indexes, but this one caught my eye:

This index effectively gives you random output. It will group by the count of documents, and since we reduce things multiple times, the output is going to be… strange.

I am not really sure what this is meant to do, but it is strange and probably not what the author intended.

The Common directory contains nothing of interest beyond some util stuff. Moving on to the Controllers part of the application:

So this is a relatively small application, but an interesting one. We will start with what I expect o be a very simple part of the code .The HomeController:

   1: public class HomeController : Controller

   2: {

   3:     public ActionResult Index()

   4:     {

   5:         var userCookie = HttpContext.Request.Cookies["userId"];

   6:         if (userCookie == null)

   7:         {

   8:             var raven = Get.A<IDocumentStore>();

   9:             using (var session = raven.OpenSession())

  10:             {

  11:                 var user = new User();

  12:                 session.Store(user);

  13:                 session.SaveChanges();

14:

  15:                 HttpContext.Response.SetCookie(new HttpCookie("userId", user.Id));

  16:             }

  17:         }

18:

  19:         // If we don't have any songs, redirect to admin.

  20:         using (var session = Get.A<IDocumentStore>().OpenSession())

  21:         {

  22:             if (!session.Query<Song>().Any())

  23:             {

  24:                 return Redirect("/admin");

  25:             }

  26:         }

27:

  28:         ViewBag.Title = "BitShuva Radio";

  29:         return View();

  30:     }

  31: }

There are a number of things in here that I don’t like. First of all, let us look at the user creation part. You look at the cookies and create a user if it isn’t there, setting the cookie afterward.

This has the smell of something that you want to do in the infrastructure. I did a search for “userId” in the code and found the following in the SongsController:

   1: private User GetOrCreateUser(IDocumentSession session)

   2: {

   3:     var userCookie = HttpContext.Current.Request.Cookies["userId"];

   4:     var user = userCookie != null ? session.Load<User>(userCookie.Value) : CreateNewUser(session);

   5:     if (user == null)

   6:     {

   7:         user = CreateNewUser(session);

   8:     }

9:

  10:     return user;

  11: }

12:

  13: private static User CreateNewUser(IDocumentSession session)

  14: {

  15:     var user = new User();

  16:     session.Store(user);

17:

  18:     HttpContext.Current.Response.SetCookie(new HttpCookie("userId", user.Id));

  19:     return user;

  20: }

That is code duplication with slightly different semantics, yeah!

Another issue with the HomeController.Index method is that we have direct IoC calls (Get.As<T>) and multiple sessions per request. I would much rather do this in the infrastructure, which would also give us a place for the GetOrCreateUser method to hang from.

SongsController is actually an Api Controller, so I assume that it is called from JS on the page. Most of the code there looks like this:

   1: public Song GetSongForSongRequest(string songId)

   2: {

   3:     using (var session = raven.OpenSession())

   4:     {

   5:         var user = GetOrCreateUser(session);

   6:         var songRequest = new SongRequest

   7:         {

   8:             DateTime = DateTime.UtcNow,

   9:             SongId = songId,

  10:             UserId = user.Id

  11:         };

  12:         session.Store(songRequest);

  13:         session.SaveChanges();

  14:     }

15:

  16:     return GetSongById(songId);

  17: }

GetSongById will use its own session, and I think it would be better to have just one session per request, but that is about the sum of my comments.

One thing that did bug me was the song search:

   1: public IEnumerable<Song> GetSongMatches(string searchText)

   2: {

   3:     using (var session = raven.OpenSession())

   4:     {

   5:         return session

   6:             .Query<Song>()

   7:             .Where(s =>

   8:                 s.Name.StartsWith(searchText) ||

   9:                 s.Artist.StartsWith(searchText) ||

  10:                 s.Album.StartsWith(searchText))

  11:             .Take(50)

  12:             .AsEnumerable()

  13:             .Select(s => s.ToDto());

  14:     }

  15: }

RavenDB has a really good full text support. And we could be using that, instead. It would give you better results and be easier to work with, to boot.

Overall, this is a pretty neat little app.

Oct 12 2012

This code ain’t production ready

time to read 1 min | 191 words

Tweet Share Share 8 comments

Tags:

Greg Young has a comment on my Rhino Events post that deserves to be read in full. Go ahead, read it, I’ll wait.

Since you didn’t, I’ll summarize. Greg points out numerous faults and issues that aren’t handled or could be handled better in the code.

That is excellent, from my point of view, if only because it gives me more stuff to think about for the next time.

But the most important thing to note here is that Greg is absolutely correct about something:

I have always said an event store is a fun project because you can go anywhere from an afternoon to years on an implementation.

Rhino Events is a fun project, and I’ve learned some stuff there that I’ll likely use again letter on. But above everything else, this is not production worthy code .It is just some fun code that I liked. You may take and do whatever you like with it, but mostly I was concerned with finding the right ways to actually get things done, and not considering all of the issues that might arise in a real production environment.

Oct 11 2012

Introducing Rhino.Events

time to read 3 min | 529 words

Tweet Share Share 7 comments

Tags:

open source

After talking so often about how much I consider OSS work to be indicative of passion, I got bummed when I realized that I didn’t actually did any OSS work for a while, if you exclude RavenDB.

I was recently at lunch at a client, when something he said triggered a bunch of ideas in my head. I am afraid that I made for poor lunch conversation, because all I could see in my head was code and IO blocks moving around in interesting ways.

At any rate, I sat down at the end of the day and wrote a few spikes, then I decided to write the actual thing in a way that would actually be useful.

What is Rhino Events?

It is a small .NET library that gives you embeddable event store. Also, it is freakishly fast.

How fast is that?

Well, this code writes a fairly non trivial events 10,000,000 (that is ten million times) to disk.

It does this at a rate of about 60,000 events per second. And that include the full life cycle (serializing the data, flushing to disk, etc).

Rhino.Events has the following external API:

As you can see, we have several ways of writing events to disk, always associating to a stream, or just writing the latest snapshot.

Note that the write methods actually return a Task. You can ignore that Task, if you wish, but this is part of how Rhino Events gets to be so fast.

When you call EnqueueEventAsync, we register the value in a queue and have a background process write all of the pending events to disk. This means that we actually have only one thread that is actually doing writes, which means that we can batch all of those writes to get really nice performance from being able to handle all of that.

We can also reduce on the number of times that we have to actually Flush to disk (fsync), so we only do that when we run out of things to write or at a predefined times (usually after a full 200 ms of non stop writes. Only after the information was fully flushed to disk will we set the task status to completed.

This is actually a really interesting approach from my point of view, and it makes the entire thing transactional, in the sense that you can wait to be sure that the event has been persisted to disk (and yes, Rhino Events is fully ACID) or you can fire & forget it, and move on with your life.

A few words before I let you go off and play with the bits.

This is a Rhino project, which means that it is a fully OSS one. You can take the code and do pretty much whatever you want with it. But I , or Hibernating Rhinos, will not be providing support for that.

You can get the bits here: https://github.com/ayende/Rhino.Events

Oct 04 2012

Thou shall not do threading unless you know what you are doing

time to read 6 min | 1104 words

Tweet Share Share 10 comments

Tags:

I had a really bad couple of days. I am pissed, annoyed and angry, for totally not technical reasons.

And then I run into this issue, and I just want to throw something really hard at someone, repeatedly.

The issue started from this bug report:

   1: NetTopologySuite.Geometries.TopologyException was unhandled   2:   HResult=-2146232832   3:   Message= ... trimmed ...   4:   Source=NetTopologySuite   5:   StackTrace:   6:        at NetTopologySuite.Operation.Overlay.Snap.SnapIfNeededOverlayOp.GetResultGeometry(SpatialFunction opCode)   7:        at NetTopologySuite.Operation.Union.CascadedPolygonUnion.UnionActual(IGeometry g0, IGeometry g1)   8:        at NetTopologySuite.Operation.Union.CascadedPolygonUnion.Worker.Execute()   9:        at System.Threading.ExecutionContext.RunInternal(ExecutionContext executionContext, ContextCallback callback, Object state, Boolean preserveSyncCtx)  10:        at System.Threading.ExecutionContext.Run(ExecutionContext executionContext, ContextCallback callback, Object state, Boolean preserveSyncCtx)  11:        at System.Threading.ExecutionContext.Run(ExecutionContext executionContext, ContextCallback callback, Object state)  12:        at System.Threading.ThreadHelper.ThreadStart()

At first, I didn’t really realized why it was my problem. I mean, it is NTS problem, isn’t it?

Except that this particular issue actually crashed ravendb (don’t worry, it is unstable builds only). The reason it crashed RavenDB? An unhandled thread exception.

What I can’t figure out is what on earth is going on. So I took a look at the code, have a look:

I checked, and this isn’t code that has been ported from the Java code. You can see the commented code there? That is from the Java version.

And let us look at what the execute method does?

So let me see if I understand. We have a list of stuff to do, so we spin out threads, reclusively, then we wait on them. I think that the point was to optimize things in some manner by parallelizing the work between the two halves.

Do you know what the real killer is? If we assume that we have a geometry with just 20 items on it, this will generate twenty two threads.

Leaving aside the issue of not handling errors properly (and killing the entire process because of this), the sheer cost of creating the threads is going to kill this program.

Libraries should be made to be thread safe (I already had to fix a thread safety bug there), but they should not be creating their own threads unless it is quite clear that they need to do so.

I believe that this is a case of a local optimization for a specific scenario, it also carry all of the issues associated with local optimizations. It solves one problem and opens up seven other ones.

Oct 03 2012

Lucene is beautiful

time to read 1 min | 191 words

Tweet Share Share 4 comments

Tags:

open source

So, after I finished telling you how much I don’t like the lucene.net codebase, what is this post about?

Well, I don’t like the code, but then again, I generally don’t like to read low level code. The ideas behind Lucene are actually quite amazingly powerful in their simplicity.

At its core, Lucene is just a set of sorted dictionaries on disk (greatly simplified, I know). Everything else is build on top of that, and if you grok what is going on there, you would be quite amazed at the number of things that this has made possible.

Indexing in Lucene is done by a pipeline of documents and fields and analyzers, which all participate together to generate those dictionaries. Searching in lucene is done by traversing those dictionaries in various ways, and combining the results in interesting ways.

I am not going to go into details about how it works, you can read all about that here. The important thing is that once you have grasped the essential structure inside lucene, the rest are just details.

The concept and the way the implementation fell out are quite beautiful.

Oct 02 2012

Lucene.NET is UGLY

time to read 1 min | 200 words

Tweet Share Share 17 comments

Tags:

open source

If you ever had to go through the Lucene.NET code base, I am sure that you’ll agree that the code base is quite ugly. It does a lot of low level stuff, which is almost always nasty, it is a port of a code from another language and framework, which means that it isn’t idiomatic code, and it has a lot of… strange things going on there.

Exceptions are used far too often.
There is a strong tendency to delegate things in such a way that make it hard to figure out where things are actually happening.
The big stick approach to thread safety (slap a lock on it).
Some really horrible things with regards to mutable shared state with IndexInputs.

Here is a good example of many of the issues that I talk about:

https://github.com/apache/lucene.net/blob/trunk/src/core/Search/FieldCacheImpl.cs#L207

Read this method, and I think you’ll understand.

Then again, you can see methods of similar or greater complexity in RavenDB, for example, see here:

https://github.com/ayende/ravendb/blob/1.2/Raven.Database/Indexing/IndexingExecuter.cs#L60

My main problem with the Lucene.NET codebase is that it feels alien, it isn’t .NET code, and it shows.

Then again, Lucene is also quite beautiful, but I’ll talk about this in my next post.

Sep 21 2012

Bug Fixes in OSS environment

time to read 1 min | 73 words

Tweet Share Share 9 comments

Tags:

A user reported a bug in RavenDB. We tracked that bug into a race condition in a 3rd party library, which then forced us to fix the bug, and then do the dependency roll up:

Sigh…

Then again, we could do all of that ourselves.

Jul 05 2012

Reviewing NAppUpdate

time to read 9 min | 1680 words

Tweet Share Share 11 comments

Tags:

I was asked to review NAppUpdate, a simple framework for providing auto-update support to .NET applications, available here. Just for to make things more interesting, the project leader of NAppUpdate is Itamar, who works for Hibernating Rhinos, and we actually use NAppUpdate in the profiler.

I treated it as an implementation detail and never actually looked at it closely before, so this is the first time that I am actually going over the code. On first impression, there is nothing that makes me want to hurl myself to the ocean from a tall cliff:

Let us dig deeper, and almost on the first try, we hit something that I seriously dislike.

Which leads to this:

public static class Errors
{
    public const string UserAborted = "User aborted";
    public const string NoUpdatesFound = "No updates found";
}

And I wouldn’t mind that, except that those are used like this:

There are actually quite a lot of issues with this small code sample. To start with, LastestError? Seriously?!

LatestError evoke strong memories of GetLastError() and all the associated fun with that.

It doesn’t give you the ability to multiple errors, and it is a string, so you can’t put an exception into it (more on that later).

Also, note how this work with the callback and the return code. Both of which have a boolean for success/failure. That is wrong.

That sort of style was valid for C, but .NET, we actually have exceptions, and they are actually quite a nice way to handle things.

Worse than that, it means that you have to check the return value, then go the LatestError and check what is going on there, except… what happen if there was an actual error?

Note the todo, it is absolutely correct. You really can’t just call a ToString() on an exception and get away with it (although I think that you should). There are a lot of exceptions where you would simply won’t get the required information. ReflectionException, for example, ask you to look at the LoaderExceptions property, and there are other such things.

In the same error handling category, this bugs me:

This is an exception that implements the serialization exception, but doesn’t have the [Serializable] attribute, which all exceptions should have.

Moving on, a large part of what NAU does is to check remote source for updates, then apply it. I liked this piece of code:

public class AppcastReader : IUpdateFeedReader
{
    // http://learn.adobe.com/wiki/display/ADCdocs/Appcasting+RSS

    #region IUpdateFeedReader Members

    public IList<IUpdateTask> Read(string feed)
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(feed);
        XmlNodeList nl = doc.SelectNodes("/rss/channel/item");

        List<IUpdateTask> ret = new List<IUpdateTask>();

        foreach (XmlNode n in nl)
        {
            FileUpdateTask task = new FileUpdateTask();
            task.Description = n["description"].InnerText;
            //task.UpdateTo = n["enclosure"].Attributes["url"].Value;
            task.UpdateTo = n["enclosure"].Attributes["url"].Value;

            FileVersionCondition cnd = new FileVersionCondition();
            cnd.Version = n["appcast:version"].InnerText;
            task.UpdateConditions.AddCondition(cnd, BooleanCondition.ConditionType.AND);

            ret.Add(task);
        }

        return ret;
    }

    #endregion
}

This is integrating with an external resources, and I like that this is simple to read and understand. I don’t have a lot of infrastructure going on in here that I have to deal with just to get what I want done.

There is a more complex feed reader for an internal format that allows you to use the full option set of NAU, but it is a bit complex (it does a lot more, to be fair), and again, I dislike the error handling there.

Another thing that bugged me on some level was this code:

The issue, and I admit that this is probably academic, is what happen if the string is large. I try to avoid exposing API that might force users to materialize a large data set in memory. This has implications on the working set, the large object heap, etc.

Instead, I would probably exposed a TextReader or even a Stream.

Coming back to the error handling, we have this:

FileDownloader fd = null;
if (Uri.IsWellFormedUriString(url, UriKind.Absolute))
    fd = new FileDownloader(url);
else if (Uri.IsWellFormedUriString(baseUrl, UriKind.Absolute))
    fd = new FileDownloader(new Uri(new Uri(baseUrl, UriKind.Absolute), url));
else if (Uri.IsWellFormedUriString(new Uri(new Uri(baseUrl), url).AbsoluteUri, UriKind.Absolute))
    fd = new FileDownloader(new Uri(new Uri(baseUrl), url));

if (fd == null)
    throw new ArgumentException("The requested URI does not look valid: " + url, "url");

My gripe is with the last line. Instead of doing it like this, I would have created a new Uri and let it throw the error. It is likely that it will have much more accurate error information about the actual reason this Uri isn’t valid.

On the gripping hand, we have this guy:

This is the abstraction that NAU manages, the Prepare() method is called to do all the actual work (downloading files, for example) and Execute() is called when we are done and just want to do the actual update.

I know that I am harping about this, but this is really important. Take a look at another error handling issue (this is representative of the style of coding inside this particular class) in the RegistryTask:

So, I have an update that is failing at a customer site. What error do I have? How do I know what went wrong?

Worse, here is another snippet from the FileUpdateTask:

For some things, an exception is thrown, for other, we just return false.

I can’t follow the logic for this, and trying to diagnose issues in production can be… challenging.

In summary, I went through most of the actual important bits of NAppUpdate, and like in most reviews, I focused on the stuff that can be improved. I didn’t touch on the stuff it does well, and that is its job.

We have been using NAppUpdate for the last couple of years, with very few issues. It quite nicely resolves the entire issue of auto updates to the point were in our code, we only need to call a few methods and it takes cares of the entire process for us.

Major recommendations from this codebase:

Standardize the error handling approach. It is important.
And even more important, logging is crucial to be able to diagnose issues in the field. NAU should log pretty much everything it does and why it does it.

This will allow later diagnosal of issues with relative ease, vs. “I need you to reproduce this and then break into the debugger”.

Jul 02 2012

That ain’t no Open Source that I see here

time to read 4 min | 634 words

Tweet Share Share 46 comments

Tags:

open source

Some things just piss me off. But before I get to what pissed me off this time, let me set the scene.

We usually request from candidates applying for a job in Hibernating Rhinos to submit some piece of code that they wrote. They get additional points if their code is an Open Source project.

Some people have some… issues with the concept. The replies I got were:

I don’t know if I can’t send you the code, I’ll have to ask my employer. (Which seems really silly thing to do, considering you want to get the code to show it to some other company that you want to hire you).
Here is the code, but don’t tell anyone. (Those usually get deleted immediately after I send them a scathing reply about things like IP and how important it is to respect that).
Here is my last course code. (Which is what actually triggered this post).

Here is the deal, if you aren’t coding for fun, you are not suitable for a developer position in Hibernating Rhinos. Just to give you some idea, currently we have the following pet projects that I am aware of:

Jewish Sacred Books repository – display / commentary
Jewish Sacred Books repository – search / organization (Note that the two are by two different people and are totally unrelated.)
Music game app for Android, iOS and WP7
Personal finance app
Auto update library for .NET
Various OSS projects

And probably other stuff that I am not aware of. (Just for the record, those are things that they are working on their own time, not company time. And not because I or anyone else told them).

Why is this relevant? Because I keep getting people who think submitting some random piece of code that they have from their latest university course is a good way to show their mad code skillz.

I mean, sure, that might do it, but consider carefully what sort of projects you are usually given as part of university courses. They are usually very small, focusing on just one aspect, and they are totally driven by whatever the crazy professor think is a valid coding standard. Usually, that is NOT a good candidate for sending a code to a job interview.

I am going to share just one line from a codebase that I recently got:

private void doSwap(ref Album io_Album1, ref Album io_Album2)

The code is in C#, in case you are wondering. And you can probably learn a lot about the state of the codebase from just this line of code. Among my chief complaints:

Violating the .NET framework naming guidelines (method name).
Violating the .NET framework naming guidelines (argument names).
Swapping parameters, seriously?! What, are you writing your own sort routine? And yeah, the answer is yes.

When I pinged the author of the code, he replies that this was because of the course requirements. They had a strict polish notation guidelines, and io_ is for a input & output parameter.

They had other guidelines (you may not use foreach, for example) that explained some strangeness in the codebase.

But that isn’t really the point. I can understand crazy coding standards, what I can’t understand is why someone would submit something that would raise so many red flags so quickly as part of a job application process.

This is wasting everyone’s time. And that is quite annoying.

Oren Eini

Oren Eini

CEO of RavenDB

Rob’s RavenDB Sprint

Open Source Application Review: BitShuva Radio

This code ain’t production ready

Introducing Rhino.Events

Thou shall not do threading unless you know what you are doing

Lucene is beautiful

Lucene.NET is UGLY

Bug Fixes in OSS environment

Reviewing NAppUpdate

That ain’t no Open Source that I see here

FUTURE POSTS

RECENT SERIES

RECENT COMMENTS

Syndication

Main feed
Comments feed