Ayende @ Rahien

Unnatural acts on source code

When you pit RavenDB & SQL Server against one another…

Here is how it works. I hate benchmarks, because they are very easily manipulated. Whenever I am testing performance stuff, I am posting numbers, but they are usually in reference to themselves (showing improvements).

That said…

Mark Rodseth .Net Technical Architect at Fortune Cookie in London, UK and he did a really interesting comparison between RavenDB & SQL Server. I feel good about posting this because Mark is a totally foreign agent (hm…. well, maybe not that Smile ) but he has no association with RavenDB or Hibernating Rhinos.

Also, this post really made my day.

Mark setup a load test for two identical applications, one using RavenDB, the other one using SQL Server. The results:

SQL Load Test
Transactions: 111,014 (Transaction = Single Get Request)
Failures: 110,286 (Any 500 or timeout)

And for RavenDB ?

RavenDB Load Test
Transactions: 145,554 (Transaction = Single Get Request)
Failures: 0 (Any 500 or timeout)

And now that is pretty cool.

Limit your abstractions: Application Events–Proposed Solution #2–Cohesion

In my previous post, I spoke about ISP and how we can replace the following code with something that is easier to follow:

image_thumb3_thumb

I proposed something like:

public interface IHappenOn<T>
{
   void Inspect(T item);
}

Which would be invoked using:

container.ExecuteAll<IHappenOn<Cargo>>(i=>i.Inspect(cargo));

Or something like that.

Which lead us to the following code:

public class CargoArrived : IHappenedOn<Cargo>
{
  public void Inspect(Cargo cargo)
  {
    if(cargo.Delivery.UnloadedAtDestination == false)
      return;
      
    // handle event
  }
}

public class CargoMisdirected : IHappenedOn<Cargo>
{
  public void Inspect(Cargo cargo)
  {
    if(cargo.Delivery.Misdirected == false)
      return;
      
    // handle event
  }
}

public class CargoHandled : IHappenOn<HandlingEvent>
{
   // etc
}

public class EventRegistrationAttempt : IHappenedOn<HandlingEventRegistrationAttempt>
{
  // etc
}

But I don’t really like this code, to be perfectly frank. It seems to me like there isn’t really a good reason why CargoArrived and CargoMisdirected are located in different classes. It is likely that there is going to be a lot of commonalities between the different types of handling events on cargo. We might as well merge them together for now, giving us:

public class CargoHappened : IHappenedOn<Cargo>
{
  public void Inspect(Cargo cargo)
  {
    if(cargo.Delivery.UnloadedAtDestination)
      CargoArrived(cargo);
      
    
    if(cargo.Delivery.Misdirected)
      CargoMisdirected(cargo);
      
  }
  
  public void CargoArrived(Cargo cargo)
  {
    // handle event
  }
  
  public void CargoMisdirected(Cargo cargo)
  {
    //handle event
  }
}

This code put a lot of the cargo handling in one place, making it easier to follow and understand. At the same time, the architecture gives us the option to split it to different classes at any time. We aren’t going to end up with a God class for Cargo handling. But as long as it make sense, we can keep them together.

I like this style of event processing, but we can probably do better job at if if we actually used event processing semantics here. I’ll discuss that in my next post.

Tags:

Published at

Originally posted at

Comments (5)

Limiting your abstractions: Reflections on the Interface Segregation Principle

I have found two definitions for ISP (in this post, I assume that you know what ISP is, if you don’t, look at the links below).

This one I can fully agree with:

Clients should not be forced to depend upon interfaces that they don't use.

And this one I strongly disagree with:

Many client specific interfaces are better than one general purpose interface

For fun, both of those links are from Object Mentor.

It would be more accurate to say that I don’t so much disagree with the intent of the second statement, but that I disagree with the wording. One (literal) way of understanding the second statement is to say that we want specific interfaces over generic ones. And I would strongly disagree. An interface is an abstraction (for the most part), and I want to reduce the amount that I have in my system.

Let us look at an example, based on my recent posts in this series.

image_thumb[3]_thumb

In my previous post, I said that I want to remove both CargoWasHandled and CargoWasMisdirected in favor of IHappenOnCargoInspection. But what about the two other methods?

Would ISP force me to have IHappenOnEventRegistrationAttempt and IHappenOnCargoHandling?

Sure, those are many specific interfaces, but are they really better than something like IHappenOn<T>? Is there something truly meaningful that gets lost when we have the generic interface?

I would say that this isn’t the case, further more, I would actually state that having a single interface will lead to more maintainable code, because there are less abstractions in the code. There is a codebase that is more cohesive and easier to understand.

But I’ll talk more about cohesion on my next post.

Tags:

Published at

Originally posted at

Send me a patch for that

This post is in reply to Hadi’s post. Please go ahead and read it.

Done? Great, so let me try to respond, this time, from the point of view of someone who regularly asks for patches / pull requests.

Here are a few examples.

To make things more interesting, the project that I am talking about now is RavenDB which is both Open Source and commercial. Hadi says:

Numerous times I’ve seen reactions from OSS developers, contributors or merely a simple passer by, responding to a complaint with: submit a patch or well if you can do better, write your own framework. In other words, put up or shut up.

Hadi then goes on to explain exactly why this is a high barrier for most users.

  • You need to familiarize yourself with the codebase.
  • You need to understand the source control system that is used and how to send a patch / pull request.

And I would fully agree with Hadi that those are stumbling blocks. I can’t speak for other people, but in our case, that is the intention.

Nitpicker corner here: I am speaking explicitly and only about features here. Bugs gets fixed by us (unless the user already submitted a fix as well).

Put simply, there is an issue of priorities here. We have a certain direction for the project that we want to take it. And in many cases, users want things that are out of scope for us for the foreseeable future. Our options then become:

  • Sorry, ain’t going to happen.
  • Sure, we will push aside all the work that we intended to do to do your thing.
  • No problem, we added that to the queue, expect it in 6 – 9 months, if we will still consider it important then.

None of which is an acceptable answer from our point of view.

Case in point, facets support in RavenDB was something that was requested a few times. We never did it because it was out of scope for our plan, RavenDB is a database server, not a search server and we weren’t really sure how complex this would be and how to implement this. Basically, this was an expensive feature that wasn’t in the major feature set that we wanted. The answer that we gave people is “send me a pull request for that”.

To be clear, this is basically an opportunity to affect the direction of the project in a way you consider important. What ended up happening is that Matt Warren took up the task and created an initial implementation. Which was then subject to intense refactoring and finally got into the product. You can see the entire conversation about this here. The major difference along the way is that Matt did all the research for this feature, and he had working code. From there the balance change. It was no longer an issue of expensive research and figuring out how to do it. It was an issue of having working code and refactoring it so it matched the rest of the RavenDB codebase. That wasn’t expensive, and we got a new feature in.

Here is another story, a case where I flat out didn’t think it was possible. About two years ago Rob Ashton had a feature suggestion (ad hoc queries with RavenDB). Frankly, I thought that this was simply not possible, and after a bit of back and forth, I told Rob:

Let me rephrase that.
Dream up the API from the client side to do this.

Rob went away for a few hours, and then came back with a working code sample. I had to pick my jaw off the floor using both hands. That feature got a lot of priority right away, and is a feature that I routinely brag about when talking about RavenDB.

But let me come back again to the common case, a user request something that isn’t in the project plan. Now, remember, requests are cheap. From the point of view of the user, it doesn’t cost anything to request a feature. From the point of view of the project, it can cost a lot. There is research, implementation, debugging, backward compatibility, testing and continuous support associated with just about any feature you care to name.

And our options whenever a user make a request that is out of line for the project plan are:

  • Sorry, ain’t going to happen.
  • Sure, we will push aside all the work that we intended to do to do your thing.
  • No problem, we added that to the queue, expect it in 6 – 9 months, if we will still consider it important then.

Or, we can also say:

  • We don’t have the resources to currently do that, but we would gladly accept a pull request to do so.

And that point, the user is faced with a choice. He can either:

  • Oh, well, it isn’t important to me.
  • Oh, it is important to me so I have better do that.

In other words, it shift the prioritization to the user, based on how important that feature is.

We recently got a feature request to support something like this:

session.Query<User>()
   .Where(x=> searchInput.Name != null && x.User == searcInput.Name)
   .ToArray();

I’ll spare you the details of just how complex it is to implement something like that (especially when it can also be things like: (searchInput.Age > 18). But the simple work around for that is:

var q = session.Query<User>();
if(searchInput.Name != null)
  q = q.Where(x=> x.User == searcInput.Name);

q.ToArray();

Supporting the first one is complex, there is a simple work around that the user can use (and I like the second option from the point of view of readability as well).

That sort of thing get a “A pull request for this feature would be appreciated”. Because the alternative to that is to slam the door in the user’s face.

Limit your abstractions: Application Events–Proposed Solution #1

In my previous post, I explained why I really don’t the following.

image_thumb[3]

public class CargoInspectionServiceImpl : ICargoInspectionService 
{
  // code redacted for simplicity

 public override void InspectCargo(TrackingId trackingId)
 {
    Validate.NotNull(trackingId, "Tracking ID is required");

    Cargo cargo = cargoRepository.Find(trackingId);
    if (cargo == null)
    {
      logger.Warn("Can't inspect non-existing cargo " + trackingId);
      return;
    }

    HandlingHistory handlingHistory = handlingEventRepository.LookupHandlingHistoryOfCargo(trackingId);

    cargo.DeriveDeliveryProgress(handlingHistory);

    if (cargo.Delivery.Misdirected)
    {
      applicationEvents.CargoWasMisdirected(cargo);
    }

    if (cargo.Delivery.UnloadedAtDestination)
    {
      applicationEvents.CargoHasArrived(cargo);
    }

    cargoRepository.Store(cargo);
 }
}

Now, let us see one proposed solution for that. We can drop the IApplicationEvents.CargoHasArrived and IApplicationEvents.CargoWasMisdirected, instead creating the following:

public interface IHappenOnCargoInspection
{
   void Inspect(Cargo cargo);
}

We can have multiple implementations of this interface, such as this one:

public class MidirectedCargo : IHappenOnCargoInspection
{
   public void Inspect(Cargo cargo)
   {
        if(cargo.Delivery.Misdirected == false)
             return;

        // code to handle misdirected cargo.
   }
}

In a similar fashion, we would have a CargoArrived implementation, and the ICargoInspectionService would be tasked with managing the implementation of IHappenOnCargoInspection, probably through a container. Although I would probably replace it with something like:

container.ExecuteOnAll<IHappenOnCargoInspection>(i=>i.Inspect(cargo));

All in all, it is a simple method, but it means that now the responsibility to detect and act is centralized in each cargo inspector implementation. If the detection of misdirected cargo is changed, we know that there is just one place to make that change. If we need a new behavior, for example, for late cargo, we can do that by introducing a new class, which implement the interface. That gives us the Open Closed Principle.

This is better, but I still don’t like it. There are better methods than that, but we will discuss them in another post.

Tags:

Published at

Originally posted at

Comments (7)

Limit your abstractions: Application Events–what about change?

In my previous post, I showed an example of application events and asked what is wrong with them.

image

public class CargoInspectionServiceImpl : ICargoInspectionService 
{
  // code redacted for simplicity

 public override void InspectCargo(TrackingId trackingId)
 {
    Validate.NotNull(trackingId, "Tracking ID is required");

    Cargo cargo = cargoRepository.Find(trackingId);
    if (cargo == null)
    {
      logger.Warn("Can't inspect non-existing cargo " + trackingId);
      return;
    }

    HandlingHistory handlingHistory = handlingEventRepository.LookupHandlingHistoryOfCargo(trackingId);

    cargo.DeriveDeliveryProgress(handlingHistory);

    if (cargo.Delivery.Misdirected)
    {
      applicationEvents.CargoWasMisdirected(cargo);
    }

    if (cargo.Delivery.UnloadedAtDestination)
    {
      applicationEvents.CargoHasArrived(cargo);
    }

    cargoRepository.Store(cargo);
 }
}

This is very problematic code, from my point of view, for several reasons. Look at how it allocate responsibilities. IApplicationEvents is supposed to execute the actual event, but deciding when to execute the event is left for the caller. I said several reasons, but this is the main one, all other flows from it.

What happen when the rules for invoking an event change? What happen when we want to add a new event?

The way this is handled is broken. It violates the Open Closed Principle, it violates the Single Responsibility Principle and it frankly annoys me.

Can you think about ways to improve this?

I’ll discuss some in my next post.

Tags:

Published at

Originally posted at

Comments (22)

RavenDB workshop in NDC, Oslo 4-5th June

The RavenDB workshop is coming to Oslo, Norway in June!

Join us to an intensive RavenDB hands-on workshop just before the great NDC conference starts.

During the first day of this workshop we will get to know RavenDB and its core concepts, get comfortable with its API, learn how to build and customize indexes, and how to correctly model data for use in a document database.

After getting familiar with all the basics in the first day, during the second day we will build on that knowledge to properly grok Map⁄Reduce, Multi–maps and other advanced usages of indexes, learn how to extend RavenDB and the various options of scaling out.

More details on the workshop and the conference can be found here.

Tags:

Published at

Originally posted at

Limit your abstractions: Application Events–the wrong way

In my previous post, I have taken a few interfaces from a DDD sample application and called the application procedural and hard to maintain. In this post, I want to show you exactly why.

We will start with examining this interface, and how it is used:

image_thumb[3]

public class CargoInspectionServiceImpl : ICargoInspectionService 
{
  // code redacted for simplicity

 public override void InspectCargo(TrackingId trackingId)
 {
    Validate.NotNull(trackingId, "Tracking ID is required");

    Cargo cargo = cargoRepository.Find(trackingId);
    if (cargo == null)
    {
      logger.Warn("Can't inspect non-existing cargo " + trackingId);
      return;
    }

    HandlingHistory handlingHistory = handlingEventRepository.LookupHandlingHistoryOfCargo(trackingId);

    cargo.DeriveDeliveryProgress(handlingHistory);

    if (cargo.Delivery.Misdirected)
    {
      applicationEvents.CargoWasMisdirected(cargo);
    }

    if (cargo.Delivery.UnloadedAtDestination)
    {
      applicationEvents.CargoHasArrived(cargo);
    }

    cargoRepository.Store(cargo);
 }
}

Can you see what I find painful in this code?

Tags:

Published at

Originally posted at

Comments (34)

Limit your abstractions: Analyzing a DDD application

Abstractions have a cost. You should limit them. That seems like an obvious statement, but in a recent discussion I had, I realized that I didn’t articulate things in quite the proper way before.

Let me see if I can explain better now. One of the problems in typical applications is that we don’t really think before we introduce abstractions. For the purpose of this discussion, an abstraction in an interface. Let us take a look at a sample DDD application.

Nitpicker corner: No, I am not saying avoid using interfaces.

I took all the interesting interfaces out of the application, you can see them here:

image

Take a look at those interfaces. They bother me. They bother me because each of them represent an abstraction that is specific for a particular problem. In other words, it represent an non-abstracted abstraction, if that make any sense.

As I said, this is taken a DDD sample application. It isn’t a big one (and no, I didn’t review the actual code to see if it is a good one), but the interfaces that it has reveal a common problem, namely, interface explosion, or over abstraction. I removed any infrastructure / persistence stuff that was in the app, so you are looking just at the business interfaces, mind.

The problem is that the way this application is structured, it is highly procedural and had to maintain.

Huh?! I can hear you say, procedural? This is a DDD app, just look at the names, we have services and facades and events and… those are all good things. This design is pretty much text book. How can you say that this design is hard to maintain?

I’ll answer this question (and propose answers) in this series of posts. In the meantime, feel free to look at the code (it is Java, in its origin, and I simply modified it to C# for easier working) and make your own conclusions.

Tags:

Published at

Originally posted at

Comments (55)

Embracing RavenDB

RavenDBliconBurgandyIn fact, for over 30 years or so, the Database Wars have been settled, the relational databases have won the fight, and the decision left was which relational database to use. Everyone “knows” that NoSQL is something that Google invented to handle the amount of data and users they have, and that is it somehow related to scaling to Google or Facebook levels.

And since most of us aren’t working on applications that have millions of users, we get to keep using the tried and true methods, no need to bother with something that is only relevant at extreme high scale. The relational database has served us well, and can continue serving us in the future. Learning SQL was a very smart investment, after all.

Let us go back a bit to those Database Wars that I mentioned. In the 70s there were actually quite a lot of different types of databases competing with each other, anything from ISAM to DBase to relational databases. And as you can see, the relational database has won such a decisive victory that it rules undisputed for over a generation.

But one thing that we have to remember is that those Database Wars were fought on a drastically different ground than the one we have today.

1980 vs. 2012

In 1980, a 10 MB hard disk (that is in megabytes, about 2 songs or 3 pictures) cost around 4,000 $ US. Adjusting for inflation, that comes at about 11,000 $ US in today’s dollars. Just to compare, today a 10 MB of disk space would cost you about half a dollar cent. And those aren’t the only changes, of course. Computation speed, memory sizes and networks are all many orders of magnitude faster and cheaper than they were in the 80s.

Even more interesting is the differences in the type of applications being built. In the 80s, a typical application had exactly one user. Multi users’ applications had to support… 3 users. All of them at the same time! The UI paradigms were drastically different, as well. At the time, the master – details form was the top of the hill, the uncontested king of good UI design. But today… there are usually so many items and active elements on a single web page today as there were in entire applications then.

Why the history lesson, you ask? Why, to give you some perspective on the design choices that led to the victory of the relational databases. Space was at a premium, the interaction between the user and the application closely modeled the physical layout of the data in the database. That made sense, because there really were no other alternatives given the environment that existed at the time.

That environment is no longer here, and the tradeoffs made when 30 MB would cost as much as an annual salary are no longer relevant. In particular, one immensely annoying aspect of relational database is very much a problem today. Relational databases trade off read speed in favor of write speed. Because it made perfect sense to make this trade off when disk space was so costly, and making users wait an extra second was no big deal.

In separate studies conducted by both Google and Amazon, they found that even additional 100ms added to the latency of a page severely impacted their bottom line. And yet relational database tradeoff read speed (having to do joins and extra loads) for write speed (having to write small amount of data).

Another problem that pops up frequently with relational database is their inability to handle complex data types, such as collections or nested objects. Oh, you can certainly map that to a relational database, but that requires additional tables, and each additional table is going to make it that much more complex to query the data, work with it and display it to the user.

For many years, I have been working with customers on optimizing their applications using relational databases, and I’ve seen the same problems occur over and over again. That led me to the belief that the NoSQL databases aren’t suitable just for extremely scalable scenarios. NoSQL databases make sense for a wide range of options, and this realization led me to RavenDB.

In my company, we are using RavenDB as the backend database for everything from a blog, our ordering and purchasing systems, the daily build server and many more.

The major advantages that we found weren’t the ability to scale (although that exists), it is the freedom that it gives us in terms of modeling our data and changing our minds.

Tags:

Published at

Originally posted at

Comments (62)

Ask Ayende: What about the QA env?

Matthew Bonig asks, with regards to a bug in RavenDB MVC Integration (RavenDB Profiler) that caused major slow down on this blog.:

I'd be very curious to know how this code got published to a production environment without getting caught. I would have thought this problem would have occurred in any testing environment as well as it did here. Ayende, can you comment on where the process broke down and how such an obvious bug was able to slip through?

Well, the answer for that comes in two parts. The first part  is that no process broke down. We use our own assets for final testing of all our software, that means that whenever there is a stable RavenDB release pending (and sometimes just when we feel like it) we move our infrastructure to the latest and greatest.

Why?

Because as hard as you try testing, you will never be able to catch everything. Production is the final test ground, and we have obvious incentives of trying to make sure that everything works.  It is dogfooding, basically. Except that if we get a lemon, that is a very public one.

It means that whenever we make a stable release, we can do that with high degree of confidence that everything is going to work, not just because all the tests are passing, but because our production systems had days to actually see if things are right.

The second part of this answer is that this is neither an obvious bug nor one that is easy to catch. Put simply, things worked. There wasn’t even an infinite loop that would make it obvious that something is wrong, it is just that there was a lot of network traffic that you would notice only if you either had a tracer running, or were trying to figure out why the browser was suddenly so busy.

Here is a challenge, try to devise some form of an automated test that would catch something like this error, but do so without actually testing for this specific issue. After all, it is unlikely that someone would have written a test for this unless they run into the error in the first place. So I would be really interested in seeing what sort of automated approaches would have caught that.

Tags:

Published at

Originally posted at

Comments (19)

Bug Hunt: What made this blog slow?

A while ago the blog start taking 100% CPU on the client machines. Obviously we were doing something very wrong there, but what exactly was it?

We tracked down the problem to the following code:

image_thumb

image_thumb[1]

As you can probably guess, the problem is that we have what is effective an infinite loop. On any Ajax request, we will generate a new Ajax request. And that applies to our own requests as well.

The fix was pretty obvious when we figured out what was going on, but until then…

image

Tags:

Published at

Originally posted at

Comments (14)

Bug Hunt: What made this blog slow?

A while ago the blog start taking 100% CPU on the client machines. Obviously we were doing something very wrong there, but what exactly was it?

We track down the problem to the following code, can you figure out what the problem?

image

image

Tags:

Published at

Originally posted at

Comments (11)

Northwind Starter Kit Review: Conclusion

This is a review of the Northwind Starter Kit project, this review revision 94815 from Dec 18 2011.

A while ago I said:

Seriously?!  22(!) projects to do a sample application using Northwind?

And people took me up to task about it. The criticism was mostly focused on two parts:

  • I didn’t get that the project wasn’t about Northwind, but about being a sample app for architectural design patterns.
  • I couldn’t actually decide that a project was bad simply by looking at the project structure and some minor code browsing.

I am sad to say that after taking a detailed look at the code, I am even more firmly back at my original conclusion.  I started to do a review of the UI code, but there really is no real need to do so.

The entire project, as I said in the beginning, is supposed to be a sample application for Northwind. Northwind is a CRUD application. Well, not exactly, it is supposed to be an example of an Online Store, which is something much bigger than just Northwind. But it isn’t.

Say what you will, the Northwind Starter Kit is a CRUD application. It does exactly that, and nothing else. It does so in an incredibly complicated fashion, mind, but that is what it does.

Well, it doesn’t do updates, or deletes, or creates. So it is just an R application (I certainly consider the codebase to be R rated, not for impressionable developers).

If you want to have a sample application to show off architectural ideas, make sure that the application can actually, you know, show them. The only thing that NSK does is loading stuff from the database, try as I might, I found no real piece of business logic, no any reason why it is so complicated.

So, to the guys who commented on that, it isn’t a good project. If you like it, I am happy for you, there are also people who loves this guy:

Personally, I would call pest control.

Ask Ayende: Handling filtering

With regards to my quests against repositories, Matt asks:

…if my aggregate root query should exclude entities that have, for example, and IsActive = false flag, I also don't want to repeatedly exclude the IsActive = false entities. Using the repository pattern I can expose my Get method where internally it ALWAYS does this.

The problem with this question is that it make a false assumption, then go ahead and follow on that false assumption. The false assumption here is that the only way to handle the IsActive = false in by directly querying that. But that is wrong.

With NHibernate, you can define that with a where condition, or as a filter. With RavenDB, you can define that inside a query listener. You can absolutely set those things up as part of your infrastructure, and you won’t need to create any abstractions for that.

Tags:

Published at

Originally posted at

Comments (17)

Northwind Starter Kit Review: That CQRS thing

This is a review of the Northwind Starter Kit project, this review revision 94815 from Dec 18 2011.

It is obvious from reading the code that there was some attention given to CQRS. Unfortunately, I can’t really figure out what for.

To start with, both the Read Model and the Domain Model are actually sitting on the same physical location. If you are doing that, there is a 95% chance that you don’t need CQRS. If you have that, you are going to waste a lot of time and effort and are very unlikely to get anything from it.

In the case of NSK, here is the domain model vs. the read model for the customer.

imageimage

I marked the difference.

I am sorry, there is nothing that justify a different model here. Just needless complexity.

Remember, our job is to make things simpler, not make it hard to work with the application.

Northwind Starter Kit Review: It is all about the services

This is a review of the Northwind Starter Kit project, this review revision 94815 from Dec 18 2011.

Okay, enough about the data access parts. Let us see take a look at a few of the other things that are going on in the application. In particular, this is supposed to be an application with…

Domain logic is implemented by means of a Domain Model, onto a layer of services adds application logic. The model is persisted by a DAL designed around the principles of the "Repository" patterns, which has been implemented in a LINQ-friendly way.

Let us try to figure this out one at a time, okay?

The only method in the domain model that have even a hint of domain logic is the CalculateTotalIncome method. Yes, you got it right, that is a method, as in singular. And that method should be replaced with a query, it has no business being on the domain model.

So let us move to the services, okay? Here are the service definitions in the entire project:

image

Look at the methods carefully. Do you see any action at all? You don’t, the entire thing is just about queries.

And queries should be simple, not abstracted away and made very hard to figure out.

The rule of the thumb is that you try hard to not abstract queries, it is operations that you try to abstract. Operations is where you usually actually find the business logic.

Northwind Starter Kit Review: From start to finishing–tracing a request

This is a review of the Northwind Starter Kit project, this review revision 94815 from Dec 18 2011.

One of the things that I repeatedly call out is the forwarding type of architecture, a simple operation that is hidden away by a large number of abstractions that serves no real purpose.

Instead of a controller, let us look at a web service, just to make things slightly different. We have the following:

image

Okay, let us dig deeper:

image

I really like the fact that this repository actually have have FindById method, which this service promptly ignores in favor of using the IQueryable<Customer> implementation. If you want to know how that is implemented, just look (using the EF Code First repository implementations, the others are fairly similar):

image

 

All in all, the entire thing only serves to make things harder to understand and maintain.

Does anyone really think that this abstraction adds anything? What is the point?!

Ask Ayende: Life without repositories, are they worth living?

With regards to my quests against repositories, Matt asks:

For example, you dismiss the repository pattern, but what are the alternatives? For example, in an ASP.NET web application you have controllers. I do NOT want to see this code in my controllers:

var sessionFactory = CreateSessionFactory();

using (var session = sessionFactory.OpenSession()) { using (var transaction = session.BeginTransaction()) { // do a large amount of work

// save entities
session.SaveOrUpdate(myEntity);

transaction.Commit();

} }

That is ugly, repetitive code. I want in my service methods to Get, update, save, and not have to worry about the above.

This is a straw dummy. Set up the alternative as nasty and unattractive as possible, then call out the thing you have just set up as nasty and unattractive. It is a good tactic, except that this isn’t the alternative at all.

If you go with the route that Matt suggested, you are going to get yourself into problems. Serious ones. But that isn’t what I recommend. I talked about this scenario specifically in this post. This is how you are supposed to set things up. In a way that doesn’t get in the way of the application. Everything is wired in the infrastructure, and we can just rely on that to be there. And in your controller, you have a Session property that get the current property, and that is it.

For bonus points, you can move your transaction handling there as well, so you don’t need to handle that either. It makes the code so much easier to work with, because you don’t care about all those external concerns, they are handled elsewhere.

Tags:

Published at

Originally posted at

Comments (53)

Northwind Starter Kit Review: If you won’t respect the database, there will be pain

This is a review of the Northwind Starter Kit project, this review revision 94815 from Dec 18 2011.

The database is usually a pretty important piece in your application, and it likes to remind you that it should be respected. If you don’t take care of that, it will make sure that there will be a lot of pain in your future. Case in point, let us look at this method:

image

It looks nice, it is certainly something that looks like a business service. So let us dig down and see how it works.

image

It seems like a nice thing, the code is clear, and beside the bug where you get 100% discount if you buy enough and the dissonance between the comment and the code, fairly clear. And it seems that we have service logic and entity logic, which is always nice.

Except that this piece of code issues the following queries (let us assume a customer with 50 orders).

1 Query to load the customer, line 34 in this code. And now let us look at line 35… what is actually going on here:

image

Okay, so we have an additional query for loading the customer’s orders. Let us dig deeper.

image

And for each order, we have another query for loading all of that order’s items. Does it gets worse?

image

Phew! I was worried here for a second.

But it turns out that we only have a Select N+2 here, where N is the number of orders that a customer has.

What do you want, calculating the discount for the order is complicated, it is supposed to take a lot of time. Of course, the entire thing can be expressed in SQL as:

SELECT 
  SUM((UnitPrice * Quantity) * (1 - Discount) Income
FROM OrderItems o
WHERE o.OrderID in (
  SELECT Id FROM Orders
  WHERE CustomerId = @CustomerId
)

But go ahead and try putting that optimization in. The architecture for the application will actively fight you on that.

Ask Ayende: Repository for abstracting multiple data sources?

With regards to my recommendation to not use repositories, Remyvd asks:

… if you have several kind of data sources in different technologies, then it would be nice if you have one kind of interface. Also when an object (like Customer) is combined from data out of different data sources, the repository is for me a good place to initialize the object and return it. How would you solve this cases?

My answer is: System.ArgumentException: Question assumes invalid state.

More fully, this is one of those times where, in order to actually answer the question, we have to correct the question. Why do I say that?

Well, the question makes the assumption that actually combining the customer entity out of different data stores is desirable. Having made that assumption, it proceed to see what is the best way to do that. I am not going to recommend a way to do that, because the underlying assumption is wrong.

If your Customer information is stored in multiple data stores, you have to ask yourself, is it actually the same thing in all places? For example, we may have Customer entity in our main database, Customer Billing History in the billing database, Customer credit report accessible over a web service, etc. Note what happens when we start actually drilling down into the entity design. It suddenly becomes clear that that information is in different data stores for a reason.

Those aren’t the druids you are looking for might be a good quote here. The fact that the information is split usually means that there is a reason for that. The information is handled differently, usually by different teams and applications, it deals with different aspects of the entity, etc.

Trying to abstract that away behind a repository layer loses that very important distinction. It also forces us to do a lot of additional work, because we have to load the customer entity from all of the different data stores every time we need it. Even if most of the data that we need is not relevant for the operation at hand.

If would be much easier, simpler and maintainable to actually expose the idea of the multiple data stores to the application at large. You don’t end up with a leaky abstraction and it is easy to see when and how you actually need to combine the different data stores, and what the implications of that are for the specific scenarios that requires it.

Tags:

Published at

Originally posted at

Comments (43)

Northwind Starter Kit Review: Refactoring to an actual read model

This is a review of the Northwind Starter Kit project, this review revision 94815 from Dec 18 2011.

In my previous post, I talked about how the CatalogController.Product(id) action was completely ridiculous in the level of abstraction that it used to do its work and promised to show how to do the same work on an actual read model in a much simpler fashion. Here is the code.

image

There are several things that you might note about this code:

  • It is all inline, so it is possible to analyze, optimize and figure out what the hell is going on easily.
  • It doesn’t got to the database to load data that it already has.
  • The code actually does something meaningful.
  • It only do one thing, and it does this elegantly.

This is actually using a read model. By, you know, reading from it, instead of abstracting it.

But there is a problem here, I hear you shout (already reaching for the pitchfork, at least let me finish this post).

Previously, we have hidden the logic of discontinued products and available products behind the following abstraction:

image

Now we are embedding this inside the query itself, what happens if this logic changes? We would now need to go everywhere we used this logic and update it.

Well, yes. Except that there are two mitigating factors.

  • This nice abstraction is never used elsewhere.
  • It is easy to create our own abstraction.

Here is an example on how to do this without adding additional layers of abstractions.

image

Which means that our action method now looks like this:

image

Simple, easy, performant, maintainable.

And it doesn’t make my head hurt or cause me to stay up until 4 AM feeling like an XKCD item.

A meta post about negative code reviews

A lot of people seems to have problems whenever I post a code review. The general theme of the comments is mostly along the lines of:

  1. You are an evil person and a cyber bully to actually do those sort of things and humiliate people.
  2. You have something against the author of the personally, and you set out to avenge them.
  3. There is no value in doing a negative code review.
  4. You never do any good reviews, only bad ones.
  5. You only tells us what is wrong, but not what is right.

There are a bunch of other stuff, but those are the main points.

For point 1 & 2, I have the same answer:

I talk about the code, not the person. I am actually very careful with my phrasing whenever I do this sort of a review. The code or the architecture is wrong, not the person. There is a difference, and a big one.

For point 3:

Most of those code reviews are generated because someone asks me to do them. And given some of the responses to the posts, I feel that they serve a very good purpose. Here is one such comment. I redacted the project name, because it doesn’t really matter, but the point stands:

I grew up on *** as it was my first layered application, almost 5 years ago and I personally believe that the effort ****** has put into this project during the years is simply amazing. The architecture reflect the most common architectural design patterns and represents almost the concepts expressed by Fowler and Evans. *** is not a project to teach you how to work with Northwind and it is not a project designed exclusively for Nhb. It is designed to show how a layered application should be architected in .NET and there is also a book wrote around this project.

My professional opinioned, backed by over a decade of practical experience and work in the field tells me that the project in question is actually not a good template for an application. And I feel that by pointing out exactly why, I am doing a service to the community.

Look, it is actually quite simple. The major reason that I do those negative code reviews is because I keep seeing the same types of mistakes repeated over and over again at customer sites. And the major reason is that those projects follow best practices as they see it. The problem is that they usually ignore the context of those best practices, so it becomes a horrible mess.

What is worse, there is the issue of the non coding architect, when you have someone that doesn’t actually have responsibility for the output making decisions about how it is going to be built. And those things are actually hard to fight against, precisely because they are considered to be best practices. One of the reasons that I am pointing out the problems in those projects is to serve as a reference point for other people when they need a way to escape the over architecture.

For point 4:

It takes something out of the ordinary to get me to actually post something to the blog. The barrier for a negative code review is how much is annoys me. The barrier for a positive code review is how much it impresses me. It is easier to annoy me to impress me, admittedly, but I quite frequently do good code reviews.

The problem is that most good code bases are actually fairly boring. That is pretty much the definition of a good codebase, of course Smile, so there isn’t really that much to talk about.

For point 5:

That usually come up when I do negative code reviews, “you only show the bad stuff, it doesn’t teach us how to do it right”. Well, that is pretty much the point of a negative code review. To show the bad stuff so you would know how to avoid it. There are literally thousands of posts in this blog, and quite a few of them are actually devoted to discussions on how to do it right. I have very little inclination to repeat that advice again in every post.

Even a blog post must obey the single responsibility principle.

Tags:

Published at

Originally posted at

Comments (61)

Northwind Starter Kit Review: Data Access review thoughts

This is a review of the Northwind Starter Kit project, this review revision 94815 from Dec 18 2011.

In my last few posts, I have gone over the data access strategy used in NSK. I haven’t been impressed. In fact, I am somewhere between horrified, shocked and amused in the same way you feel when you see a clown slipping on a banana peel.  Why do I say that? Let us trace a single call from the front end all the way to the back.

The case in point, CatalogController.Product(id) action. This is something that should just display the product on the screen, so it should be fairly simple, right? Here is how it works when drawn as UML:

image

To simplify things, I decided to skip any method calls on the same objects (there are more than a few).

Let me show you how this looks like in actual code:

image

Digging deeper, we get:

image

We will deal with the first method call to CatalogServices now, which looks like:

image

I’ll skip going deeper, because this is just a layer of needless abstraction on top of Entity Framework and that is quite enough already.

Now let us deal with the second call to CatalogServices, which is actually more interesting:

image

Note the marked line? This is generating a query. This is interesting, because we have already loaded the product. There is no way of optimizing that, of course, because the architecture doesn’t let you.

Now, you need all of this just to show a single product on the screen. I mean, seriously.

You might have noticed some references to things like Read Model in the code. Which I find highly ironic. Read Models are about making the read side of things simple, not drowning the code in abstraction on top of abstraction on top of abstraction.

In my next post, I’ll show a better way to handle this scenario. A way that is actually simpler and make use an of actual read model and not infinite levels of indirection.

Northwind Starter Kit Review: The parents have eaten sour grapes, and the children’s teeth are set on edge

This is a review of the Northwind Starter Kit project, this review revision 94815 from Dec 18 2011.

In my previous posts, I focused mostly on the needlessness of the repositories implementation and why you want to avoid that (especially implementing it multiple times). In this post, I want to talk about other problems regarding the data access. In this case, the sudden urge to wash my hands that occurred when I saw this:

image

I mean, you are already using an OR/M. I don’t like the Repository implementation, but dropping down to SQL (and unparapetrized one at that) seems uncalled for.

By the way, the most logical reason for this to be done is to avoid mapping the Picture column to the category, since the OR/M in use here (EF) doesn’t support the notion of lazy properties.

Again, this is a problem when you are trying to use multiple OR/Ms, and that is neither required nor really useful.

Okay, enough about the low level data access details. On my next post I’ll deal with how those repositories are actually being used.