Oren Eini

CEO of RavenDB

a NoSQL Open Source Document Database

time to read 2 min | 238 words


Well, I planned to do it last week, but it got delayed for personal reasons.

But here it is, Linq to SQL Profiler is now out of beta, and I personally think it is awesome. Using the profiler, you gain valuable insight about the actual data access pattern of your application (which is usually abstracted away by the Linq to SQL framework). But the profiler goes beyond just dumping a heap of data on you, it takes it several steps further by:

  • Tying together queries and code, you can go directly from any query to the line of code that generated it, no more scratching about “what caused this query”.
  • Analyzing your data access patterns, alerting you about bad practices and suggesting how to fix them.
  • Providing detailed reports on your application’s database usage, perfect for sending to the DBA for optimization.

I got some really nice feedback from people about it:

image

image

Happy linqing

time to read 5 min | 950 words

Software processes have always been a popular topic of discussion in our industry. Those discussions can get quite heated, with advocates of the “stable / stale” Waterfall method pointing fingers toward “rapid / rabid” Agile methods, with the CMMI people throwing documents around and Lean people standing on the sidelines muttering about Waste.

This isn’t a post about a specific software process; I’ll defer that to another day. Instead, I want to focus on a flaw in the basic building blocks of many* software building processes.

They ignore the actual building of the software.

That may sound ridiculous on the face of it. After all, how can a software process ignore the act of building software? But take a look at the following diagrams:

image

If you pay attention, you’ll notice that those processes talk about everything except how to actually build software. They talk about people, about requirements, about managing customers, about a whole lot of things, but not about the part where you have people sitting down and writing code. In most of those, in fact, that part is usually defined as one of these:

image

Why is that a problem? After all, isn’t there a big distinction between software engineering (we know what to do, now let us do it) and project management (getting to know what we need to do, and verifying that we did it right)? Those processes deal primarily with project management and leave the engineering part to be defined in a way that fits the particular project. Surely that is better, right? In theory, it might be. But there is a big problem when you have a software process that ignores the software engineering aspects of building software.

The problem is that in many cases, there are hidden assumptions that are going to hammer you down the road if you use a certain process with engineering practices that don’t fit it. Take a look at the following chart, showing a team’s velocity over time. Does this look familiar?

image

The term I heard used for this is Scrum Wall, but I have seen similar results in other processes as well. The best description for that is Allan Kelly’s:

You hit the Scrum wall when you adopt Scrum and everything goes well, then, after a few Sprints things don’t work any more - to use an English expression, they go pear shaped. You can’t keep your commitments, you can’t release software, your customers get annoyed and angry, it looks like Scrum is broken.

This is what happens when you adopt Scrum without technical practices such as Test Driven Development, continuous integration and refactoring. When teams adopt the Scrum process, they go faster, show progress, things look good... and then the quality becomes a problem. Now the team are fighting through quick sand.

The code quality is poor and developers are expected to continue to make progress. Maybe the Scrum Master/Project Manager reverts to past behavior and demands overtime and weekend working. Maybe the team start busting a gut to keep their commitments. Either way the team is heading for burn-out.

The major issue is that focusing so much effort and time on project management, with what amounts to willful ignorance of the technical and engineering practices, will inevitably lead to disaster. The process of building software is inextricably linked to the engineering practices involved in building the software. Moreover, some technical practices are actively harmful in some scenarios and life savers in others.

Many Agile and post-Agile processes focus on short cycles, each of them producing something with a distinct value to the customer. That may be an iteration, a commit or a feature, where the goal is to increase the velocity over time so we can provide as much value to the customer in as short a time as possible. What those processes ignore are things like technical debt, large scale refactoring and non functional efforts. Oh, you see those things mentioned on the edges, but they aren’t something that is dealt with head on, as a core issue to consider.

There is a bit more to that, actually. The software engineering practices and the project management strategies are linked and of paramount importance when the time comes to decide how the software should actually be built. No, this is not a tautology. We just need to take into account Conway’s law and expand on it a bit.

Any organization that designs a system will inevitably produce a design whose structure is a copy of the organization's communication structure.

Part of the design process of a project should include designing the team structure, the project management strategy and the software engineering practices in order to align the end result with what is desired. Ignoring this leads to imbalance in the project, and if that imbalance is big enough, and goes on for long enough, the project is going to rip itself apart.

* Nitpicker corner: I said many, not all. Don’t bother listing software processes that deal with it. I had a reason to explicitly list the processes that I did.

time to read 2 min | 269 words

Well, it arrived two days ago, but I only just finished doing the bare bones installation. My new laptop is a Lenovo Thinkpad W510 (4319-29G, if anyone really cares). I have been a happy Thinkpad user for a long time, so it seemed to be a natural choice. One thing that I did right off the bat, though, was to replace the HD that it came with with an SSD.

It seems to be a pretty nice machine, so far.

image

I find it highly amusing that the slowest part here is actually the memory.

One thing that I do find unforgivable, however, is the install from scratch experience. Since I replaced the HD, I installed the machine using Win7, and after a successful installation, the computer could neither connect to the internet nor use USB. That meant that in order to get drivers (you know, so I can connect to the internet or use USB) I had to burn a CD. It felt so 90s.

Even more unforgivable?

 image

After being able to go on the net and download all the drivers for my machine, I still can’t get USB to work. There don’t seem to be any USB drivers for my model, which I find extremely puzzling.

At a minimum, I would expect Lenovo to include a drivers disk.

time to read 6 min | 1174 words

Let us look at the following pieces of code:

public void Consume(MyBooksRequest message)
{
    var user = session.Get<User>(message.UserId);
    
    bus.Reply(new MyBooksResponse
    {
        UserId = message.UserId,
        Timestamp = DateTime.Now,
        Books = user.CurrentlyReading.ToBookDtoArray()
    });
}

public void Consume(MyQueueRequest message)
{
    var user = session.Get<User>(message.UserId);

    bus.Reply(new MyQueueResponse
    {
        UserId = message.UserId,
        Timestamp = DateTime.Now,
        Queue = user.Queue.ToBookDtoArray()
    });
}

public void Consume(MyRecommendationsRequest message)
{
    var user = session.Get<User>(message.UserId);

    bus.Reply(new MyRecommendationsResponse
    {
        UserId = message.UserId,
        Timestamp = DateTime.Now,
        Recommendations = user.Recommendations.ToBookDtoArray()
    });
}

Looking at this, I see that I have a requirement to get my books, my queues and my recommendations. It appears that getting each datum is going to result in 2 queries: the first to load the User, and the second to lazy load the actual collection that we want to return.

An almost trivial optimization would be to eliminate the lazy loading, right? That would reduce the cost from 6 queries to just 3.

However, that assumption would be wrong. The following client code:

bus.Send(
    new MyBooksRequest
    {
        UserId = userId
    },
    new MyQueueRequest
    {
        UserId = userId
    },
    new MyRecommendationsRequest
    {
        UserId = userId
    });

Produces this SQL:

-- statement #1
enlisted session in distributed transaction with isolation level: Serializable

-- statement #2
SELECT user0_.Id          as Id2_0_,
       user0_.Name        as Name2_0_,
       user0_.Street      as Street2_0_,
       user0_.Country     as Country2_0_,
       user0_.City        as City2_0_,
       user0_.ZipCode     as ZipCode2_0_,
       user0_.HouseNumber as HouseNum7_2_0_
FROM   Users user0_
WHERE  user0_.Id = 1 /* @p0 */

-- statement #3
SELECT currentlyr0_.[User] as User1_1_,
       currentlyr0_.Book   as Book1_,
       book1_.Id           as Id0_0_,
       book1_.Name         as Name0_0_,
       book1_.ImageUrl     as ImageUrl0_0_,
       book1_.Image        as Image0_0_,
       book1_.Author       as Author0_0_
FROM   UsersReadingBooks currentlyr0_
       left outer join Books book1_
         on currentlyr0_.Book = book1_.Id
WHERE  currentlyr0_.[User] = 1 /* @p0 */

-- statement #4
SELECT queue0_.[User]  as User1_1_,
       queue0_.Book    as Book1_,
       queue0_.[Index] as Index3_1_,
       book1_.Id       as Id0_0_,
       book1_.Name     as Name0_0_,
       book1_.ImageUrl as ImageUrl0_0_,
       book1_.Image    as Image0_0_,
       book1_.Author   as Author0_0_
FROM   UsersWaitingBooks queue0_
       left outer join Books book1_
         on queue0_.Book = book1_.Id
WHERE  queue0_.[User] = 1 /* @p0 */

-- statement #5
SELECT recommenda0_.[User]  as User1_1_,
       recommenda0_.Book    as Book1_,
       recommenda0_.[Index] as Index3_1_,
       book1_.Id            as Id0_0_,
       book1_.Name          as Name0_0_,
       book1_.ImageUrl      as ImageUrl0_0_,
       book1_.Image         as Image0_0_,
       book1_.Author        as Author0_0_
FROM   UsersRecommendedBooks recommenda0_
       left outer join Books book1_
         on recommenda0_.Book = book1_.Id
WHERE  recommenda0_.[User] = 1 /* @p0 */


-- statement #7
commit transaction

That seems strange, can you figure out why?

Bonus points for figuring out whether it would be worth it to do the eager load optimization or not.

time to read 2 min | 295 words

Let us look at the following pieces of code:

public void Consume(MyBooksRequest message)
{
    var user = session.Get<User>(message.UserId);
    
    bus.Reply(new MyBooksResponse
    {
        UserId = message.UserId,
        Timestamp = DateTime.Now,
        Books = user.CurrentlyReading.ToBookDtoArray()
    });
}

public void Consume(MyQueueRequest message)
{
    var user = session.Get<User>(message.UserId);

    bus.Reply(new MyQueueResponse
    {
        UserId = message.UserId,
        Timestamp = DateTime.Now,
        Queue = user.Queue.ToBookDtoArray()
    });
}

public void Consume(MyRecommendationsRequest message)
{
    var user = session.Get<User>(message.UserId);

    bus.Reply(new MyRecommendationsResponse
    {
        UserId = message.UserId,
        Timestamp = DateTime.Now,
        Recommendations = user.Recommendations.ToBookDtoArray()
    });
}

Looking at this, I see that I have a requirement to get my books, my queues and my recommendations. Looking at the code, can you guess how many queries are being generated to get those?

And can you suggest an optimization?

time to read 2 min | 224 words

One of the things that makes working with the profiler easier is the fact that it gives you not just information, but information in context.

I was working with an app using Rhino Service Bus, and it really bothered me that I couldn’t immediately figure out what the trigger for a session was. When using ASP.Net or WCF, the profiler can show the URL that triggered the request, but when we are not using a URL based mechanism, that turns out to be much harder.

So I set out to fix that. You can see the results below:

image

This session was generated by a message batch containing messages for MyBooks, MyQueue, etc.

The integration is composed of two parts. First, from the profiler perspective, you now have the ProfilerIntegration.CurrentSessionContext property, which allows you to customize how the profiler detects the current context.

The second part is the integration from the application framework itself. You can see how I did that for Rhino Service Bus, which will dynamically detect the presence of the profiler and fill in the appropriate values. The result makes it a lot easier to track down what is going on.

time to read 3 min | 497 words

Well, I already covered how you can handle this challenge several times in the past, so I’ll not do that again. What I actually did is quite different. Instead of having to deal with the complexity (which is possible) I decided to remove it entirely.

The solution to my problem is to simplify the model:

image

Which is represented as the following physical data model:

image

Now, querying for this is about as simple as you can get:

select user0_.Id          as Id2_0_,
       book2_.Id          as Id0_1_,
       user0_.Name        as Name2_0_,
       user0_.Street      as Street2_0_,
       user0_.Country     as Country2_0_,
       user0_.City        as City2_0_,
       user0_.ZipCode     as ZipCode2_0_,
       user0_.HouseNumber as HouseNum7_2_0_,
       book2_.Name        as Name0_1_,
       book2_.ImageUrl    as ImageUrl0_1_,
       book2_.Image       as Image0_1_,
       book2_.Author      as Author0_1_,
       queue1_.[User]     as User1_0__,
       queue1_.Book       as Book0__,
       queue1_.[Index]    as Index3_0__
from   Users user0_
       inner join UsersWaitingBooks queue1_
         on user0_.Id = queue1_.[User]
       inner join Books book2_
         on queue1_.Book = book2_.Id
where  user0_.Id = 1 /* @p0 */

De-normalizing the model has significantly improved my ability to work with it.

time to read 2 min | 301 words

This issue came up in my Alexandria sample app. I wanted to have a Queue of books for each user, and each book has a collection of its authors. The model looks something like this:

image

And the physical data model:

image

So far, so good, and relatively simple and straightforward to work with, right?

Except, what kind of a query would you want to make to get all the books in the user’s queue? Here is the code that uses your results:

bus.Reply(new MyQueueResponse
{
    UserId = message.UserId,
    Timestamp = DateTime.Now,
    Queue = books.Select(book => new BookDTO
    {
        Id = book.Id,
        Image = book.Image,
        Name = book.Name,
        Authors = book.Authors.Select(x => x.Name).ToArray()
    }).ToArray()
});

Hint: note that we need to bring the Book’s image as well, and pay attention to the number of joins you require as well as the number of queries.

time to read 1 min | 121 words

Okay, the previous time I asked this question, I decided that buying a big desktop machine would be preferable, but I am currently traveling and I am pained at the speed of my current laptop.

Therefore, I am going to need a new one. I am the happy owner of a MacBook Pro, but I can’t really justify purchasing a new Mac given the cost of a new Mac vs. a comparable PC. The Mac tax just doesn’t pay for itself at those levels for me.

Any recommendations? I want something 15” with enough horsepower to be a major development machine. If there is something like the Mac’s touchpad, I would love that.

Any ideas?

time to read 3 min | 461 words

As it turns out, there are a LOT of issues with this code:

public class QueueActions : IDisposable
{
    UnmanagedDatabaseConnection database;
    public string Name { get; private set; }

    public QueueActions(UnmanagedDatabaseConnectionFactory factory)
    {
         database = factory.Create();
         database.Open(()=> Name = database.ReadName());
    }

   // assume proper GC finalizer impl

    public void Dispose()
    {
          database.Dispose();
    }
}

And the code using this:

using(var factory = CreateFactory())
{
   ThreadPool.QueueUserWorkItem(_ =>
   {
          using(var actions = new QueueActions(factory))
          {
               actions.Send("abc");     
          }
    });
}

To begin with, what happens if we close the factory between the first and second lines in the QueueActions constructor?

We already have an unmanaged resource, but when we try to open it, we are going to get an exception. Since the exception is thrown from the constructor, the using block never gets an instance to work with, so Dispose will NOT be invoked and the connection will not be disposed.

Furthermore, and this is the reason for this blog post, Dispose itself can also fail.
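One common way to guard against the constructor problem is to clean up inside the constructor itself and then rethrow. The following is only a minimal sketch under stated assumptions: FakeConnection, SafeQueueActions and Demo are hypothetical stand-ins invented for illustration, not the actual Rhino Queues or Esent types, and the failure in Open is simulated.

```csharp
using System;

// Hypothetical stand-in for the unmanaged connection in the post.
public sealed class FakeConnection : IDisposable
{
    private readonly bool failOnOpen;
    public bool IsDisposed { get; private set; }

    public FakeConnection(bool failOnOpen) { this.failOnOpen = failOnOpen; }

    public void Open()
    {
        // Simulates the factory having been closed before Open() runs.
        if (failOnOpen)
            throw new InvalidOperationException("factory was already closed");
    }

    public void Dispose() { IsDisposed = true; }
}

// Hypothetical variant of QueueActions that releases the connection
// when its own constructor fails.
public sealed class SafeQueueActions : IDisposable
{
    private readonly FakeConnection database;

    public SafeQueueActions(FakeConnection connection)
    {
        database = connection;
        try
        {
            database.Open();
        }
        catch
        {
            // The caller's using block never receives this instance,
            // so nobody else will call Dispose for us.
            database.Dispose();
            throw;
        }
    }

    public void Dispose() { database.Dispose(); }
}

public static class Demo
{
    // Returns true when a failed constructor still released the connection.
    public static bool FailedConstructorReleasesConnection()
    {
        var connection = new FakeConnection(failOnOpen: true);
        try
        {
            new SafeQueueActions(connection).Dispose();
        }
        catch (InvalidOperationException)
        {
            // Expected: the constructor rethrows after cleaning up.
        }
        return connection.IsDisposed;
    }
}
```

Note that this only covers the constructor path. It does nothing for the case where the cleanup call itself throws, which would replace the original exception.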

Here is the actual stack trace that caused this blog post:

Microsoft.Isam.Esent.Interop.EsentErrorException: Error TermInProgress (JET_errTermInProgress, Termination in progress)
at Microsoft.Isam.Esent.Interop.Api.Check(Int32 err) in Api.cs: line 1492
at Microsoft.Isam.Esent.Interop.Api.JetCloseTable(JET_SESID sesid, JET_TABLEID tableid) in Api.cs: line 372
at Microsoft.Isam.Esent.Interop.Table.ReleaseResource() in D:\Work\esent\EsentInterop\Table.cs: line 97
at Microsoft.Isam.Esent.Interop.EsentResource.Dispose() in EsentResource.cs: line 63
at Rhino.Queues.Storage.AbstractActions.Dispose() in AbstractActions.cs: line 146 
