Ayende @ Rahien

It's a girl

Rob’s Sprint: Idly indexing

During Rob Ashton’s visit to our secret lair, we did some work on hard problems. One of those problems was the issue of index prioritization. As I have discussed before, this isn’t really easy to do, because of the IO costs associated with not indexing properly.

With Rob’s help, we have defined the following:

  • An auto index can be set to idle if it hasn’t been queried for a time.
  • An index can be forced to be idle by the user.
  • An index that was automatically set to idle will be set to normal on its first query.

What are the implications of that? An idle index will not be indexed by RavenDB during the normal course of things. Only when the database is idle for a period of time (by default, about 10 minutes with no writes) will we actually get it indexing.

Idle indexes will continue indexing as long as there is no other activity that requires their resources. When such activity starts, they will complete their current run and go back to waiting for the database to become idle again.

But wait, there is more. In addition to introducing the notion of idle indexes, we have also created another two types of indexes. The first is pretty obvious, the disabled index will use no system resources and will never take part in indexing. This is mostly there so you can manually shut down a single index. For example, maybe it is a very expensive one and you want to stop it while you are doing an import.

More interesting, however, is the concept of an abandoned index. Even idle indexes can take some system resources, so we have added another level beyond that: an abandoned index is one that hasn’t been queried in 72 hours. At that point, RavenDB is going to avoid indexing it even during idle periods. It will still get indexed, but only if enough time has passed since the last time it was indexed.
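
The rules above can be sketched as a small decision function. This is only an illustration, not RavenDB's actual implementation: the type and member names (IndexPriority, IndexState, IndexSchedulingSketch) are hypothetical, and the abandonedInterval parameter is an assumption, since the post only says "a long enough time". The 10 minute and 72 hour values are the ones quoted above.

```csharp
using System;

public enum IndexPriority { Normal, Idle, Disabled, Abandoned }

// Hypothetical bookkeeping for a single index.
public sealed class IndexState
{
    public IndexPriority Priority { get; set; }
    public bool SetByUser { get; set; }        // true when a user forced the priority
    public DateTime LastQueried { get; set; }
    public DateTime LastIndexed { get; set; }
}

public static class IndexSchedulingSketch
{
    // Thresholds quoted in the post.
    public static readonly TimeSpan DatabaseIdleAfter = TimeSpan.FromMinutes(10);
    public static readonly TimeSpan AbandonedAfter = TimeSpan.FromHours(72);

    // Decides whether an index takes part in the current indexing run.
    // abandonedInterval is an assumption, the post gives no number for it.
    public static bool ShouldIndex(IndexState index, DateTime now,
                                   DateTime lastWrite, TimeSpan abandonedInterval)
    {
        switch (index.Priority)
        {
            case IndexPriority.Normal:
                return true;
            case IndexPriority.Idle:
                // Idle indexes only run once the database itself has gone quiet.
                return now - lastWrite >= DatabaseIdleAfter;
            case IndexPriority.Abandoned:
                // Abandoned indexes run only when they have become stale enough.
                return now - index.LastIndexed >= abandonedInterval;
            default:
                return false; // Disabled indexes never run.
        }
    }

    // An index that was idled automatically goes back to normal on its first query.
    // The post only states this rule for idle indexes, so that is all we handle here.
    public static void OnQuery(IndexState index, DateTime now)
    {
        index.LastQueried = now;
        if (index.Priority == IndexPriority.Idle && !index.SetByUser)
            index.Priority = IndexPriority.Normal;
    }
}
```

Keeping the decision in one pure function like this makes the four states easy to reason about; a user-forced idle index stays idle after a query because SetByUser is checked before promoting it.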

Next, we will discuss why this feature was a crucial step on the way to killing temporary indexes.

Rob’s RavenDB Sprint

Rob Ashton is a great developer. We invited him to Hibernating Rhinos as part of his Big World Tour. I had the chance to work with him in the past on RavenDB, and I really liked working with him, and I liked the output even better. So we prepared some stuff for him to do.

This is the status of those issues midway through the second day.

image

And yes, I am giddy.

RavenDB Course on DVD

Want to learn RavenDB? In addition to the docs and the videos you can now order the RavenDB DVD course!

The DVD set contains over 12 hours of recorded RavenDB training, which allows you to learn RavenDB at your own pace.
This two-day course spans 7 DVDs and includes deep discussion of domain modeling, RavenDB design and optimization, a deep dive into the best practices for designing and building good RavenDB applications, the scale out story, and much more.

You can find the full details: http://ravendb.net/buy/dvd

You can also get our awesome new tshirt.

RavenDB And Not Having Foreign Keys

This is something that we hear quite often in the mailing list, and I thought that I would spend the time to answer it in full. There tend to be two kinds of FK references in an RDBMS, the essential ones and the ones the DBA added just to make my life a living hell.

The first kind includes things that are essential for internal consistency within a single aggregate. For example, the OrderLines.OrderID FK reference to Orders.ID is quite important, since an order line without an order is meaningless. But what about the association between OrderLine.ProductID and Products.ID?

An OrderLine can most certainly exist without the product. In fact, an OrderLine has already copied into it all of the properties of the product that are important for the order. But because of the FK happy nature of most DBAs (and developers, for that matter), we have a FK reference between the two. The problem is that it actually is perfectly fine to remove a product that we are no longer selling from the store.

Yes, there are order lines for that product, but they have been completed ages ago. With a RDBMS and a FK, you cannot do that. So you resort to hacks like IsDeleted = true, which in practice gives you the exact same behavior as a deleted product, except that the FK is happy. Your application has a 50/50 chance to work or not work with that.

With RavenDB, we make distinctions between internal consistency, which is maintained inside the same document, and external references, which can come and go as they please. You cannot have an order line in RavenDB without an order, because the order is where the order line exists. But you can most certainly remove the product that an order line refers to, because that is outside the scope of the order, it is a separate aggregate.
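
To make the distinction concrete, here is a minimal sketch of what such a document model might look like. The class and property names are hypothetical, chosen only for illustration. The order lines live inside the order document, while the product is referenced by id only, with the details the order needs copied in.

```csharp
using System.Collections.Generic;
using System.Linq;

// Lives inside the Order document, it cannot exist without its order.
public class OrderLine
{
    // External reference: just a string id. The product document
    // may be deleted later without affecting this order.
    public string ProductId { get; set; }

    // The product details the order cares about, copied at order time.
    public string ProductName { get; set; }
    public decimal PricePerUnit { get; set; }
    public int Quantity { get; set; }
}

// The aggregate root, stored as a single document.
public class Order
{
    public Order()
    {
        Lines = new List<OrderLine>();
    }

    public string Id { get; set; }
    public List<OrderLine> Lines { get; set; }

    // Works without loading any product documents.
    public decimal Total()
    {
        return Lines.Sum(line => line.PricePerUnit * line.Quantity);
    }
}
```

Because everything the order needs is inside the document, computing the total never has to touch the products, whether they still exist or not.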

Index Prioritization in RavenDB: Problems

Every now and then we get a request for index prioritization in RavenDB. The requests are usually in the form of:

I have an index (or a few indexes) that are very important, and I would like them to be updated before any other indexes.

That is a really nice request, but it ignores a lot of really hard implementation details.

In any prioritization scheme, there is the risk of starvation. If index A has to complete before index B, what happens if we have enough writes that index A is always busy? That means that index B will never get to index, and it will fall further & further behind. There are well known algorithms to handle this scenario, mostly from the OS thread scheduling point of view.

You could incrementally increase the priority of an index every time that you skipped updating it, until at some point it has higher priority than all the other indexes and gets its moment in the sun. That is workable if all you are working with are threads, and there isn’t a significantly different execution environment for a thread to run.
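
As a rough illustration of that aging idea (and only of the idea, this has nothing to do with how RavenDB schedules indexes), here is a minimal sketch. The AgedItem and AgingScheduler names are made up for the example.

```csharp
using System;
using System.Collections.Generic;

public sealed class AgedItem
{
    public string Name;
    public int BasePriority;
    public int TimesSkipped;

    // Every skipped round bumps the effective priority by one.
    public int EffectivePriority
    {
        get { return BasePriority + TimesSkipped; }
    }
}

public static class AgingScheduler
{
    // Picks the item with the highest effective priority (ties go to the
    // earlier item in the list), resets its age and ages everyone else.
    public static AgedItem PickNext(IList<AgedItem> items)
    {
        if (items.Count == 0)
            throw new ArgumentException("Nothing to schedule", "items");

        AgedItem best = items[0];
        foreach (var item in items)
        {
            if (item.EffectivePriority > best.EffectivePriority)
                best = item;
        }

        foreach (var item in items)
        {
            if (ReferenceEquals(item, best))
                item.TimesSkipped = 0;
            else
                item.TimesSkipped++;
        }
        return best;
    }
}
```

With an index A at priority 5 and an index B at priority 0, A is picked six times in a row, and then B gets its moment in the sun on the seventh pick, so B is delayed but never starved.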

For RavenDB indexes, there is actually a major difference in the execution environment depending on when you are running. We have a lot of optimizations inside RavenDB to avoid IO. In particular, we do a lot of work so indexes do not have to wait for their input: we do parallel IO, optimized insert hooks, and a whole bunch of stuff like that. All of those assume that all of the indexes are going to run together, however.

We already have the feature that a very slow index will be allowed to run while the rest of the indexes are keeping up, but that is something that we really try to avoid (we give it a grace period of 3/4 as much time as all of the other indexes combined). That is because the moment you have out of sync indexes, all that hard work is basically going to be wasted. You are going to be needing to load the documents to be indexed multiple times, creating more load on the server. Keeping the documents that were already indexed waiting in memory for the low priority index to work on is also not a good idea, since that is going to cause RavenDB to consume potentially a LOT more memory.

I have been thinking about this for a while, but it isn’t an easy decision. What do you think?

Hibernating Rhinos Practices: A Sample Project

I have previously stated that one of the things that I am looking for in a candidate is the actual candidate code. Now, I won’t accept “this is a project that I did for a client / employer”, and while it is nice to be pointed at a URL from the last project the candidate took part in, it is not a really good way to evaluate someone’s abilities.

Ideally, I would like to have someone that has an OSS portfolio that we can look at, but that isn’t always relevant. Instead, I decided to send potential candidates the following:

Hi,

I would like to give you a small project, and see how you handle that.

The task at hand is to build a website for Webinars questions. We run bi-weekly webinars for our users, and we want to do the following:

  • Show the users a list of our webinars (The data is here: http://www.youtube.com/user/hibernatingrhinos)
  • Show a list of the next few scheduled webinars (in the user’s own time zone)
  • Allow the users to submit questions, comment on questions and vote on questions for the next webinar.
  • Allow the admin to mark specific questions as answered in a specific webinar (after it was uploaded to YouTube).
  • Manage Spam for questions & comments.

The project should be written in C#, beyond that, feel free to use whatever technologies that you are most comfortable with.

Things that we will be looking at:

  • Code quality
  • Architecture
  • Ease of modification
  • Efficiency of implementation
  • Ease of setup & deployment

Please send us the link to a Git repository containing the project, as well as any instructions that might be necessary.

Thanks in advance,

     Oren Eini

This post will go live about two weeks after I started sending this to candidates, so I am not sure yet what the response would be.

Open Source Application Review: BitShuva Radio

As part of my ongoing review efforts, I am going to review the BitShuva Radio application.

BitShuva Radio is a framework for building internet radio stations with intelligent social features like community rank, thumb-up/down songs, community song requests, and machine learning that responds to the user's likes and dislikes and plays more of the good stuff.

I just cloned the repository and opened it in VS, without reading anything beyond the first line. As usual, I am going to start from the top and move on down:

image

We already have some really good indications:

  • There is just one project, not a gazillion of them.
  • The folders seem to be pretty much the standard ASP.NET MVC ones, so that should be easy to work with.

Some bad indications:

  • Data & Common folders are likely to be troublesome spots.

Hit Ctrl+F5, and I got this screen, which is a really good indication. There wasn’t a lot of setup required.

image

Okay, enough with the UI, I can’t really tell if this is good or bad anyway. Let us dive into the code. App_Start, here I come.

image

I get the feeling that WebAPI and Ninject are used here. I looked in the NinjectWebCommon file, and found:

image

Okay, I am biased, I’ll admit, but this is good.

Other than the RavenDB stuff, it is a pretty boring, standard and normal codebase. No comments so far. Let us see what this RavenStore is all about, which leads us to the Data directory:

image

So it looks like we have the RavenStore and a couple of indexes. And the code itself:

public class RavenStore
{
    public IDocumentStore CreateDocumentStore()
    {
        var hasRavenConnectionString = ConfigurationManager.ConnectionStrings["RavenDB"] != null;
        var store = default(IDocumentStore);
        if (hasRavenConnectionString)
        {
            store = new DocumentStore { ConnectionStringName = "RavenDB" };
        }
        else
        {
            store = new EmbeddableDocumentStore { DataDirectory = "~/App_Data/Raven" };
        }

        store.Initialize();
        IndexCreation.CreateIndexes(typeof(RavenStore).Assembly, store);
        return store;
    }
}

I think that this code needs to be improved. To start with, there is no need for this to be an instance class. And there is no reason why you can’t use EmbeddableDocumentStore to talk to a remote server as well.

I would probably write it like this, but yes, this is stretching things:

public static class RavenStore
{
    public static IDocumentStore CreateDocumentStore()
    {
        var store = new EmbeddableDocumentStore
            {
                DataDirectory = "~/App_Data/Raven"
            };

        if (ConfigurationManager.ConnectionStrings["RavenDB"] != null)
        {
            store.ConnectionStringName = "RavenDB";
        }
        store.Initialize();
        IndexCreation.CreateIndexes(typeof(RavenStore).Assembly, store);
        return store;
    }
}

I intended to just glance at the indexes, but this one caught my eye:

image

This index effectively gives you random output. It will group by the count of documents, and since we reduce things multiple times, the output is going to be… strange.

I am not really sure what this is meant to do, but it is strange and probably not what the author intended.

The Common directory contains nothing of interest beyond some util stuff. Moving on to the Controllers part of the application:

image

So this is a relatively small application, but an interesting one. We will start with what I expect to be a very simple part of the code, the HomeController:

public class HomeController : Controller
{
    public ActionResult Index()
    {
        var userCookie = HttpContext.Request.Cookies["userId"];
        if (userCookie == null)
        {
            var raven = Get.A<IDocumentStore>();
            using (var session = raven.OpenSession())
            {
                var user = new User();
                session.Store(user);
                session.SaveChanges();

                HttpContext.Response.SetCookie(new HttpCookie("userId", user.Id));
            }
        }

        // If we don't have any songs, redirect to admin.
        using (var session = Get.A<IDocumentStore>().OpenSession())
        {
            if (!session.Query<Song>().Any())
            {
                return Redirect("/admin");
            }
        }

        ViewBag.Title = "BitShuva Radio";
        return View();
    }
}

There are a number of things in here that I don’t like. First of all, let us look at the user creation part. You look at the cookies and create a user if it isn’t there, setting the cookie afterward.

This has the smell of something that you want to do in the infrastructure. I did a search for “userId” in the code and found the following in the SongsController:

private User GetOrCreateUser(IDocumentSession session)
{
    var userCookie = HttpContext.Current.Request.Cookies["userId"];
    var user = userCookie != null ? session.Load<User>(userCookie.Value) : CreateNewUser(session);
    if (user == null)
    {
        user = CreateNewUser(session);
    }

    return user;
}

private static User CreateNewUser(IDocumentSession session)
{
    var user = new User();
    session.Store(user);

    HttpContext.Current.Response.SetCookie(new HttpCookie("userId", user.Id));
    return user;
}

That is code duplication with slightly different semantics, yeah!

Another issue with the HomeController.Index method is that we have direct IoC calls (Get.A<T>) and multiple sessions per request. I would much rather do this in the infrastructure, which would also give us a place for the GetOrCreateUser method to hang from.

SongsController is actually an Api Controller, so I assume that it is called from JS on the page. Most of the code there looks like this:

public Song GetSongForSongRequest(string songId)
{
    using (var session = raven.OpenSession())
    {
        var user = GetOrCreateUser(session);
        var songRequest = new SongRequest
        {
            DateTime = DateTime.UtcNow,
            SongId = songId,
            UserId = user.Id
        };
        session.Store(songRequest);
        session.SaveChanges();
    }

    return GetSongById(songId);
}

GetSongById will use its own session, and I think it would be better to have just one session per request, but that is about the sum of my comments.

One thing that did bug me was the song search:

public IEnumerable<Song> GetSongMatches(string searchText)
{
    using (var session = raven.OpenSession())
    {
        return session
            .Query<Song>()
            .Where(s =>
                s.Name.StartsWith(searchText) ||
                s.Artist.StartsWith(searchText) ||
                s.Album.StartsWith(searchText))
            .Take(50)
            .AsEnumerable()
            .Select(s => s.ToDto());
    }
}

RavenDB has really good full text search support, and we could be using that instead. It would give you better results and be easier to work with, to boot.

Overall, this is a pretty neat little app.

A pull request with all of the taxes already paid, gimme that again!

I recently merged a pull request from Barry Hagan. The actual pull request is pretty boring, to tell you the truth, exposing a Lucene feature through the RavenDB API.

What really impressed me was how complete the pull request was.

What do I mean by complete?

A feature is not just supporting it in the engine. For this particular pull request, Barry has done the following:

  • Supported this in RavenDB Database Core.
  • The HTTP API.
  • The C# Client API.
  • The strongly typed C# Client API.
  • Included support for dynamic indexes.
  • Updated the query optimizer.
  • Exposed this feature in the UI.

Basically, the only things that I had to do were git pull and then review the code.

Very nicely done.

Bug hunting in a massively multi threaded environment

We got a really nasty bug report from a user. Sometimes, out of the blue, RavenDB would throw an error:

System.InvalidOperationException: Collection was modified; enumeration operation may not execute.
at System.Collections.Generic.List`1.Enumerator.MoveNextRare()
at Raven.Json.Linq.RavenJArray.WriteTo(JsonWriter writer, JsonConverter[] converters) in c:\Builds\RavenDB-Stable\Raven.Abstractions\Json\Linq\RavenJArray.cs:line 174
at Raven.Json.Linq.RavenJObject.WriteTo(JsonWriter writer, JsonConverter[] converters) in c:\Builds\RavenDB-Stable\Raven.Abstractions\Json\Linq\RavenJObject.cs:line 275
at Raven.Json.Linq.RavenJObject.WriteTo(JsonWriter writer, JsonConverter[] converters) in c:\Builds\RavenDB-Stable\Raven.Abstractions\Json\Linq\RavenJObject.cs:line 275

Unfortunately, this error only happened once in a while, usually when the system was under load. But they weren’t able to provide a repro for that.

Luckily, they were able to tell us that they suspected that this is related to the replication support. I quickly set up a database with replication and wrote the following code to try to reproduce this:

using(var store = new DocumentStore
{
    Url = "http://localhost:8080",
    DefaultDatabase = "hello"
}.Initialize())
{
    using(var session = store.OpenSession())
    {
        session.Store(new ReadingList
        {
            UserId = "test/1",
            Id = "lists/1",
            Books = new List<ReadingList.ReadBook>()
        });
        session.SaveChanges();
    }
    Parallel.For(0, 100, i =>
    {
        while (true)
        {
            try
            {
                using (var session = store.OpenSession())
                {
                    session.Advanced.UseOptimisticConcurrency = true;
                    session.Load<ReadingList>("lists/1")
                            .Books.Add(new ReadingList.ReadBook
                            {
                                ReadAt = DateTime.Now,
                                Title = "test " + i
                            });
                    session.SaveChanges();
                }
                break;
            }
            catch (ConcurrencyException)
            {
                
            }
        }
    });
}

And that reproduced the bug! Hurrah! Done deal, we can move on, right?

Except that the bug was only there when we had a massive number of threads hitting the db at once, and trying to figure out what was actually going on there was next to impossible using standard debugging tools. Instead, I reached down to my tracing toolbelt and started pulling stuff out. First, we identified that the problem occurred when iterating over RavenJArray, which is our own object, so we added the following:

        ConcurrentQueue<StackTrace>  addStackTraces = new ConcurrentQueue<StackTrace>();

        public void Add(RavenJToken token)
        {
            if (isSnapshot)
                throw new InvalidOperationException("Cannot modify a snapshot, this is probably a bug");

            addStackTraces.Enqueue(new StackTrace(true));

            Items.Add(token);
        }

And this one (where the exception is raised):

public override void WriteTo(JsonWriter writer, params JsonConverter[] converters)
{
    writer.WriteStartArray();

    if (Items != null)
    {
        try
        {
            foreach (var token in Items)
            {
                token.WriteTo(writer, converters);
            }
        }
        catch (InvalidOperationException e)
        {
            foreach (var stackTrace in addStackTraces)
            {
                Console.WriteLine(stackTrace);
            }
            throw;
        }
    }

    writer.WriteEndArray();
}

The idea was that we would actually be able to see what is going on there. By tracking down who added items to this particular instance, I hoped that I would be able to figure out why we have an instance that is shared among multiple threads.

When I had that, it was pretty easy to see that it was indeed the replication bundle that was causing the issue. The problem was that the replication bundle was modifying an inner array inside the document metadata. We protected the root properties of the metadata from concurrent modifications, and most of the time, it works just fine. But the problem was that now we had a bundle that was modifying a nested array, which wasn’t protected.
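
The failure mode itself is easy to show in isolation. The following is a deterministic, single threaded simulation of the race, using a plain List<string> as a stand-in for the nested array; the names are made up for the example. Writing from a private copy is just one possible way out, not necessarily the fix that went into RavenDB.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class NestedArraySketch
{
    // Returns true when "serialization" blows up with
    // "Collection was modified; enumeration operation may not execute."
    public static bool WriteWhileModifying(bool writeFromSnapshot)
    {
        // Stand-in for the nested array inside the document metadata.
        var nested = new List<string> { "a", "b" };

        // Either enumerate the shared instance directly, or a private copy of it.
        IEnumerable<string> toWrite = writeFromSnapshot
            ? new List<string>(nested)
            : nested;

        var output = new StringBuilder();
        try
        {
            foreach (var item in toWrite)
            {
                output.Append(item);
                // Simulates another thread adding to the shared
                // instance while we are in the middle of writing it.
                nested.Add("added-concurrently");
            }
            return false;
        }
        catch (InvalidOperationException)
        {
            return true;
        }
    }
}
```

Enumerating the shared list fails as soon as the enumerator notices the modification, while enumerating the copy completes regardless of what happens to the original in the meantime.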

This is one of those bugs that are really hard to catch:

  • My understanding of the code said that this is not possible, since I believed that we protected the nested properties as well*.
  • This bug will surface if and only if:
    • You have the replication bundle enabled.
    • You have a great deal of concurrent modifications (with optimistic concurrency enabled) to the same document.
    • You are unlucky.

I was grateful that the user figured out the replication connection, because I already sat on that bug previously, and there was no way I could figure out what is going on unless I had the trace to point me to where the actual problem was.

Rhino ETL Union Operation

Yes, it is somewhat of a blast from the past, but I just got asked how to create a good Union All operation for Rhino ETL.

The obvious implementation is:

public class UnionAllOperation : AbstractOperation
{
    private readonly List<IOperation> _operations = new List<IOperation>();

    public override IEnumerable<Row> Execute(IEnumerable<Row> rows)
    {
        foreach (var operation in _operations)
            foreach (var row in operation.Execute(null))
                yield return row;
    }

    public UnionAllOperation Add(IOperation operation)
    {
        _operations.Add(operation);
        return this;
    }
}

The problem is that this does everything synchronously. The following code is a better impl, but note that this is notepad code, with all the implications of that.

public class UnionAllOperation : AbstractOperation
{
    private readonly List<IOperation> _operations = new List<IOperation>();

    public override IEnumerable<Row> Execute(IEnumerable<Row> rows)
    {
        var blockingCollection = new BlockingCollection<Row>();
        var tasks = _operations.Select(currentOp => Task.Factory.StartNew(() =>{
                foreach(var row in currentOp.Execute(null))
                {
                    blockingCollection.Add(row);
                }
                blockingCollection.Add(null); // free the consumer thread
            }));

        Row r;
        while(true){
            if(tasks.All(x=>x.IsFaulted || x.IsCanceled || x.IsCompleted)) // all done
                break;
            r = blockingCollection.Take();
            if(r == null)
                continue;
            yield return r;
        }
        while(blockingCollection.TryTake(out r)) {
            if(r == null)
                continue;
            yield return r;
        }
        Task.WaitAll(tasks.ToArray()); // raise any exceptions that were thrown during execution
    }

    public UnionAllOperation Add(IOperation operation)
    {
        _operations.Add(operation);
        return this;
    }
}

Usual caveats apply, notepad code, never actually run it, much less tested / debugged it.

Feel free to rip into it, though.

Dale Newman made some improvements, the most important of which is making sure that we aren’t going to evaluate the tasks several times (oops! I told ya it was notepad code), and now it looks like this:

/// <summary>
/// Combines rows from all operations.
/// </summary>
public class UnionAllOperation : AbstractOperation {

    private readonly List<IOperation> _operations = new List<IOperation>();

    /// <summary>
    /// Executes the added operations in parallel.
    /// </summary>
    /// <param name="rows"></param>
    /// <returns></returns>
    public override IEnumerable<Row> Execute(IEnumerable<Row> rows) {

        var blockingCollection = new BlockingCollection<Row>();

        Debug("Creating tasks for {0} operations.", _operations.Count);

        var tasks = _operations.Select(currentOp => Task.Factory.StartNew(() => {
            Trace("Executing {0} operation.", currentOp.Name);
            foreach (var row in currentOp.Execute(null)) {
                blockingCollection.Add(row);
            }
            blockingCollection.Add(null); // free the consumer thread
        })).ToArray();

        Row r;
        while (true) {
            if (tasks.All(x => x.IsFaulted || x.IsCanceled || x.IsCompleted)) {
                Debug("All tasks have been canceled, have faulted, or have completed.");
                break;
            }

            r = blockingCollection.Take();
            if (r == null)
                continue;

            yield return r;

        }

        while (blockingCollection.TryTake(out r)) {
            if (r == null)
                continue;
            yield return r;
        }

        Task.WaitAll(tasks); // raise any exceptions that were thrown during execution

    }

    /// <summary>
    /// Initializes this instance
    /// </summary>
    /// <param name="pipelineExecuter">The current pipeline executer.</param>
    public override void PrepareForExecution(IPipelineExecuter pipelineExecuter) {
        foreach (var operation in _operations) {
            operation.PrepareForExecution(pipelineExecuter);
        }
    }

    /// <summary>
    /// Add operation parameters
    /// </summary>
    /// <param name="ops">operations delimited by commas</param>
    /// <returns></returns>
    public UnionAllOperation Add(params IOperation[] ops) {
        foreach (var operation in ops) {
            _operations.Add(operation);
        }
        return this;
    }

    /// <summary>
    /// Add operations
    /// </summary>
    /// <param name="ops">an enumerable of operations</param>
    /// <returns></returns>
    public UnionAllOperation Add(IEnumerable<IOperation> ops) {
        _operations.AddRange(ops);
        return this;
    }

}
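
One thing to watch for with the null sentinel: a producer that throws never adds its sentinel, so the consumer can end up waiting in Take() for a row that never comes. A possible way around that is to let the collection itself signal completion. The following is a standalone sketch of that idea, not Rhino ETL code: ParallelUnion and UnionAll are hypothetical names, and it works on plain Func<IEnumerable<T>> sources so that it can run without the Rhino ETL types.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public static class ParallelUnion
{
    // Runs every source on its own task and yields rows as they arrive.
    public static IEnumerable<T> UnionAll<T>(IEnumerable<Func<IEnumerable<T>>> sources)
    {
        var rows = new BlockingCollection<T>();

        // ToArray() so the tasks are started exactly once.
        var tasks = sources.Select(source => Task.Factory.StartNew(() =>
        {
            foreach (var row in source())
                rows.Add(row);
        })).ToArray();

        if (tasks.Length == 0)
            yield break;

        // Once every producer has finished (completed, faulted or canceled),
        // close the collection so the consumer loop below can end.
        Task.Factory.ContinueWhenAll(tasks, _ => rows.CompleteAdding());

        // Blocks while waiting for rows, ends after CompleteAdding()
        // has been called and the collection has been drained.
        foreach (var row in rows.GetConsumingEnumerable())
            yield return row;

        // Rethrows anything the producers threw, as an AggregateException.
        Task.WaitAll(tasks);
    }
}
```

Inside the operation, each source would presumably be something like () => operation.Execute(null). Rows from different sources come out interleaved in no particular order, and since there is no sentinel, a null row is just another row.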

RavenDB has the best users

image

Talk about making my job easy. Thank you Barry, and I am sorry it took me 17 minutes to get the fix out.

Defensive coding is your friend

We just had a failing test:

image

As you can see, we assumed that Fiddler is running when it isn’t. Here is the bug:

image

Now, this is great when I am testing things out and want to check what is going over the wire using Fiddler, but I always have to remember to revert this change, otherwise we will have a failing test and a failing build.

That isn’t very friction free, so I added the following:

image

Now the code is smart enough to not fail the test if we didn’t do things right.

Software architecture with nail guns

As you probably know, I get called quite a lot to customers to “assist” in failing or problematic software projects. Maybe the performance isn’t nearly what it should be, maybe it is so very hard to make changes, maybe it is… one of the thousand and one things that can go wrong, and usually does.

Internally, I divide those projects into two broad categories: The stupid and the nail guns.

I rarely get called to projects that fall under the stupid category. When it happens, it is usually because someone new came in, looked at the codebase and called for help. I love working with stupid code bases. They are easy to understand, if hard to work with, and it is pretty obvious what is wrong. And the team is usually very receptive about getting advice on how to fix it.

But I usually am called for nail gun projects, and those are so much more complex…

But before I can talk about them, I need to explain first what I mean when I say “nail gun projects”. Consider an interesting fact. Absolutely no one will publish an article saying “we did nothing special, we had nothing out of the ordinary, and we shipped roughly on time, roughly on budget and with about the expected feature set. The client was reasonably happy.” And even if someone would post that, no one would read it.

Think about your life, as an example. You wake up, walk the dogs, take kids to school, go to work, come back from work, fall asleep reading this sentence, watch some TV, eat along the way, sleep. Rinse, repeat.

Now, let us go and look at the paper. At the time of this writing, those were the top stories at CNN:

Hopefully, there is a big disconnect between your life and that sort of news.

Now, let us think about the sort of posts, articles and books that you have been reading. You won’t find any book called “Delivering OK projects”.

And most of the literature about software projects is on one of two ends: we did something incredibly hard, and we did it well, or we did something (obvious, usually) and we failed really badly. People who read those books (either kind) tend to almost blindly adopt the suggested practices, usually without looking at that section called “When it is appropriate to do what we do”.

Probably the best example is the waterfall methodology, which originated in the 1970 paper “Managing the Development of Large Software Systems” by Winston W. Royce.

From the paper:

…the implementation described above is risky and invites failure

As you can imagine, no one actually listened, and the rest is history.

How about those nail guns again?

Well, imagine that you are a contractor, and here are your tools of the trade:

They are good tools, and they served you well for a while. But now you are reading about “Nail guns usage for better, faster and more effective framing or roofing”. In the study, you read how there was a need to nail 3,000 shingles, and how, using a nail gun, the team completed the task more efficiently than it could have with a standard hammer.

Being a conscientious professional, you heed the advice and immediately buy the best nail gun you can find:

(This is just a random nail gun picture, I don’t know what brand, nor really care.)

And indeed, a nail gun is a great tool when you need to nail a lot of things very fast. But it is a highly effective tool that is extremely limited in what it can do.

But you know that a nail gun is 333% more efficient than the hammer, so you throw the hammer away. And then you get a request: Can you hang this picture on the wall, please?

It would be easy with a hammer, but with a nail gun:

It isn’t the stupid / lazy / ignorant people that go for the nail gun solutions.

It is the really hard working people, the guys who really try to make things better. Of course, what usually happens is this:

 

And here we get back to the projects that I usually get called for. Those are projects that were created by really smart people, with the best of intentions, and with the clear understanding that they want to get quality stuff done.

The problem is that they are using Nail Guns for the architecture. For example, let us just look at this post. And the end is already written.


RavenDB 2.01 Stable Release

Just about a month after the 2.0 release, we have a minor stable release for RavenDB containing a lot of bug fixes, a few minor features and some new cool stuff.

Before I move on to anything else, everyone who is using 2.0 should be upgrading to 2.01. There have been a number of issues that were fixed between 2.0 and 2.01, and some of them are pretty important.

You can see the full list here, the highlights are below:

http://hibernatingrhinos.com/builds/ravendb-stable/2260

Bug Fixes:

  • Fixed a race condition with replication when doing rapid updates to the same document / attachment.
  • Fixed issues of using the RavenDB Profiler with a different version of jQuery.
  • Fixed bug where disposing of one changes subscription would also dispose others.
  • Fixed replication not working when using an API key.
  • HTTP Spec Comp: Support quoted etags in headers.
  • Fixed a problem with map/reduce indexes moving between single step and multi step reduce would get duplicate data.
  • Fixed an error when using encryption and reducing the document size.
  • Support facets on spatial queries.
  • Fixed unbounded load when loading items to reduce in multi step reduce.
  • Fixed bulk insert on IIS with authentication.
  • Fixed Last-Modified date not being updated in embedded mode.

Features

  • Added SQL Replication support.
  • Will use multiple index batches if we have a slow index.
  • More aggressive behavior with regards to releasing memory on the client under low memory conditions.
  • Adding debug ability for authentication issues.
  • Implemented server side fulltext search results highlighting.
  • Moved expensive suggestions operation init to a separate thread.
  • Allow to define suggestions as part of the index creation.
  • Better facets paging.
  • Expose better view of internal storage concerns.
  • Support TermVector options on index fields.
  • RavenDB-894: When authenticating from the studio using an API key, set a cookie. This allows us to use standard browser commands after authenticating using the studio just once.
  • Periodic backup can now backup to a local path as well.
  • Adding debug information for when we filter documents for replication.
  • Async API improvements:
    • AnyAsync
    • StoreAsync

On failing tests

I made a change (deep in the guts of RavenDB), and then I ran the tests, and I got this:

image

I love* it when this happens, because it means that there is one root cause that I need to fix, really obvious and in the main code path.

I hate it when there is just one failing test, because it means that this is an edge condition or something freaky like that.

* obviously I would love it more if there were no failing tests.

Hibernating Rhinos Practices: Design

One of the things that I routinely get asked is how we design things. And the answer is that we usually do not. Most things do not require complex design. The requirements we set pretty much dictate how things are going to work. Sometimes, users make suggestions that turn into a light bulb moment, and things shift very rapidly.

But sometimes, usually with the big things, we actually do need to do some design upfront. This is usually true in the complex / user facing parts of our projects. The Map/Reduce system, for example, was mostly re-written in RavenDB 2.0, and that only happened after multiple design sessions internally, a full stand alone spike implementation and a lot of coffee, curses and sweat.

In many cases, when we can, we will post a suggested design on the mailing list and ask for feedback. Here is an example of such a scenario:

In this case, we didn’t get to this feature in time for the 2.0 release, but we kept thinking and refining the approach for that.

The interesting thing is that in those cases, we usually “design” things by doing the high level user visible API and then just letting it percolate. There are a lot of additional things that we would need to change to make this work (backward compatibility being a major one), so there is a lot of additional work to be done, but that can be done during the work. Right now we can let it sit, get users’ feedback on the proposed design and get the current minor release out of the door.

Hibernating Rhinos Practices: Pairing, testing and decision making

We actually pair quite a lot, either physically (most of our stations have two keyboards & mice for that exact purpose) or remotely (Skype / Team Viewer).


And yet, I would say that for the vast majority of cases, we don’t pair. Pairing is usually called for when we need two pairs of eyes to look at a problem, for non trivial debugging and that is about it.

Testing is something that I deeply believe in, at the same time that I distrust unit testing. Most of our tests are actually system tests that test the system end to end. Here is an example of such a test:

[Fact]
public void CanProjectAndSort()
{
    using(var store = NewDocumentStore())
    {
        // Store a single account with a nested profile.
        using(var session = store.OpenSession())
        {
            session.Store(new Account
            {
                Profile = new Profile
                {
                    FavoriteColor = "Red",
                    Name = "Yo"
                }
            });
            session.SaveChanges();
        }
        // Sort and project on nested properties, waiting until the index is no longer stale.
        using(var session = store.OpenSession())
        {
            var results = (from a in session.Query<Account>()
                           .Customize(x => x.WaitForNonStaleResults())
                           orderby a.Profile.Name
                           select new {a.Id, a.Profile.Name, a.Profile.FavoriteColor}).ToArray();

            Assert.Equal("Red", results[0].FavoriteColor);
        }
    }
}

Most of our new features are built first, and only then get tests. Mostly because it is more efficient to get things done by experimenting a lot without having tests to tie you down.

Decision making is something that I am trying to work on. For the most part, I have things that I feel very strongly about. Production worthiness is one such scenario, and I get annoyed if something is obviously stupid, but a lot of the time decisions fall into the either/or category, or are truly matters of preference. I still think that too much goes through me, including things that probably should not. I am trying to encourage things so that I am not in the loop so much. We are making progress, but we aren’t there yet.

Note that this post is mostly here to serve as a point of discussion. I am not really sure what to put in here, the practices we do are pretty natural, from my point of view. And I would appreciate any comments asking for clarifications.

RavenDB Webinars

The last RavenDB Webinar was really nice, so I decided to make this a biweekly event. I hope this will give us an even closer relationship with our user group.

Also, this will allow us to generate some truly awesome feedback loops.


Hibernating Rhinos Practices: We are hiring again

As part of this series, I wanted to take the time and let you know that we are hiring full time developers again.

This is applicable solely for developers in Israel.

We are working with C# (although I’ll admit that sometimes we make it scream a little bit):

image

Candidates should be able to provide a project (and preferably more than one) that we can look at to see their code. It has got to be your code. If it ain’t yours (if it is code that you wrote for an employer, or if it is a university code project), I don’t wanna see it.

We are talking about a full time employee position, working on RavenDB, Uber Profiler, RavenFS and a bunch of other stuff that I don’t want to talk about yet.

Ping me with your CV if you are interested.