time to read 5 min | 871 words

Well, I just added the last piece of what I consider the major features in Binsor: component references. I'm a big fan of decorators and chains of responsibility, which means that I tend to create a lot of references between objects.

In Binsor, it is as natural as this:

import Rhino.Commons

Component(default_repository, IRepository, NHRepository)

customer_repository = Component(customer_repository,
       IRepository of Customer,
       CustomerValidationRepository of Customer)

# notice that I'm using @ here for a component reference
customer_repository.inner = @default_repository

Just by defining the component, it is automagically exposed for references as @component_name. You can also define it in the Component call itself as a compile-time literal, instead of a string. I can even use generic types there (which is a weakness in Boo), although not in a very nice syntax.
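For contrast, wiring the same decorator in Windsor's XML configuration would look roughly like this, using the ${id} syntax that Windsor uses for component references (the type and assembly names here are abbreviated assumptions for illustration):

<component id="customer_repository"
           service="Rhino.Commons.IRepository`1[[Customer]], Rhino.Commons"
           type="CustomerValidationRepository`1[[Customer]]">
       <parameters>
              <inner>${default_repository}</inner>
       </parameters>
</component>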

Update: The syntax weakness for generics in Boo was fixed (about 12 hours from the moment I posted about it), so the code above is very clean again. I know why I like OSS now!

With this done, I'm now in that blissful world where you got what you wanted and everything is fuzzy around the edges. I got three major things done in the last 12 hours: Windsor's IOC-29, Binsor (which required IOC-29), and NHibernate batching. I haven't done this amount of coding in what feels like ages.

Damn, that feels good.

The source can be found here

time to read 11 min | 2026 words

In the ongoing battle between yours truly and XML, the good side has scored a point!

I just finished implementing most of the usable functionality in Binsor, which is a Boo DSL* directed at configuring Windsor. Before I get into the details, take a look at the most minimal XML configuration possible for Windsor:

<?xml version="1.0" encoding="utf-8" ?>
<configuration>
       <components>
              <component id="default.repository"
                              service="Rhino.Commons.IRepository`1, Rhino.Commons"
                              type="Rhino.Commons.NHRepository`1, Rhino.Commons"/>
       </components>
</configuration>

Now, let us take a look at the same configuration using Binsor:

import Rhino.Commons

Component("default_repository", IRepository, NHRepository)

This is it!

What is even more fun is that you get to treat the component as if it were a real object, instead of a lump of configuration data that you need to work with. Take a look at how you do it for Windsor using XML here.

Take a look at how you do it using Binsor:

import Rhino.Commons

email = Component("email_sender", ISender, EmailSender)
email.Host = "example.dot.org"

It can't get any more natural than that.
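For comparison, here is roughly what that same component looks like in Windsor's XML configuration (the namespace and assembly names are assumptions for illustration; Windsor sets the Host property from the matching <parameters> entry):

<component id="email_sender"
           service="MyApp.ISender, MyApp"
           type="MyApp.EmailSender, MyApp">
       <parameters>
              <Host>example.dot.org</Host>
       </parameters>
</component>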

Of course, the problems with the XML configuration for Windsor only start when you go crazy with it and build an object graph with a depth of no less than five. I stopped trying to make it work when I got to 15 different XML files, mostly containing the same information.

Now, if I want to add the same component to listen to each of my servers, I can simply do this:

import Rhino.Commons

servers = ["server1", "server2", "server3"]

for server in servers:
       listener = Component("listener_for_${server}", IListener, WSListener)
       listener.Server = server

If I want to make a change to the configuration, I can make it in one place, and all the listeners are updated.
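Once the container is up, each generated component can be resolved by its interpolated key. A minimal sketch, assuming the standard Windsor container API (the bootstrap call is a hypothetical helper):

// Castle.Windsor is assumed to be referenced.
// IWindsorContainer.Resolve(string key) returns object, so a cast is needed.
IWindsorContainer container = CreateConfiguredContainer(); // hypothetical bootstrap
IListener listener = (IListener)container.Resolve("listener_for_server1");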

Did I mention already that it is debuggable?!

* Maybe calling it a DSL is taking it a bit too far, though. It is a single class that has some fancy tricks in it, and a lot of Boo magic.

time to read 3 min | 517 words

As of about 90 minutes ago, NHibernate has batching support. :-D

All the tests are green, but there may be things that broke in exciting ways, so I encourage you to try it out and see if you can break it. This functionality exists only for SQL Server, and only on .Net 2.0 (for complaints, go directly to the ADO.Net team).

You can enable this functionality by adding this to your hibernate configuration.

<add key="hibernate.batch_size" value="10" />
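For context, here is a sketch of where that setting lives in a full application config file, assuming the classic name/value <nhibernate> configuration section used at the time:

<configuration>
       <configSections>
              <section name="nhibernate"
                       type="System.Configuration.NameValueSectionHandler, System" />
       </configSections>
       <nhibernate>
              <!-- send up to 10 INSERT/UPDATE/DELETE statements per round-trip -->
              <add key="hibernate.batch_size" value="10" />
       </nhibernate>
</configuration>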

Setting this size to a very large number and treating NHibernate as an OO tool for bulk inserts is still not recommended.

My previous tests showed about a 50% performance benefit over normal calls, so I decided to take the new code for a spin using NHibernate's perf tests. They are fairly simple, but they are at least an indication of what is going on. The tests I ran were all run against a local instance of SQL Server, with the log level set to WARN. The tests just compare similar operations using NHibernate and direct ADO.Net, usually inserts / deletes in increasing amounts. (For reference, I'm running the Simultanous() test from the PerformanceTest fixture.)

I should also mention that these are by no means real benchmarks; they are more in the way of an indication.

With no batching:

[Chart: insert/delete timings, NHibernate vs. direct ADO.Net, no batching]

As you can see, there isn't much of a performance difference between the two. NHibernate has about 15% overhead, most of which can be put down to background noise, especially in the lower ranges.

Let us try with a batch size of 25, shall we?

[Chart: insert/delete timings, NHibernate vs. direct ADO.Net, batch size 25]

Now the roles are reversed, and it is NHibernate that is faster. In fact, in this benchmark it was on average 25% - 30% faster than the direct ADO.Net code (without batching). Just for kicks, I ran the benchmark with a batch size of 256, and got about a 30% - 45% improvement.

[Chart: insert/delete timings, NHibernate vs. direct ADO.Net, batch size 256]

All in all, I think that I like this :-D

As a side note, most of the performance in an ORM is not on the INSERT / UPDATE / DELETE side of things, but rather in how smart the engine is in SELECTing the data. Issuing a thousand unnecessary SELECTs is going to be a performance hog no matter what you do.
time to read 6 min | 1015 words

Another thing that came up in conversation today is a very good example of where the dataset model breaks down. Take a look at the following (highly simplified) tables:

[Diagram: simplified employee / salary type tables]

Each type of employee gets a different salary calculation for overtime per the (current) salary type:

           | Global     | Hourly               | Global + Hourly
Manager    | None       | 15% over hourly rate | 15% over hourly rate, to a max of 20% of the monthly salary over a calendar month
Employee   | None       | 12% over hourly rate | 12% over hourly rate
Freelance  | $10 / hour | 9% over hourly rate  | $10 / hour or 9% over hourly rate, whichever is lower

Given the above business rules, how do you propose to write the AddOverTime() method using the dataset model in a maintainable fashion? There will be additional employee types and additional salary types in the future.
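For contrast, here is a minimal sketch (my addition, with hypothetical type and member names) of the polymorphic design that the dataset model cannot express: each employee type owns its overtime rule, so a new employee type becomes a new class rather than another branch in a shared AddOverTime() switch.

using System;

public enum SalaryType { Global, Hourly, GlobalPlusHourly }

public abstract class Employee
{
    public decimal HourlyRate;
    public decimal MonthlySalary;

    // Each employee type encapsulates its own overtime calculation.
    public abstract decimal OvertimePay(decimal hours, SalaryType salaryType);
}

public class Freelancer : Employee
{
    public override decimal OvertimePay(decimal hours, SalaryType salaryType)
    {
        decimal flatRate = 10m * hours;                    // $10 / hour
        decimal hourlyBonus = HourlyRate * 0.09m * hours;  // 9% over hourly rate
        switch (salaryType)
        {
            case SalaryType.Global: return flatRate;
            case SalaryType.Hourly: return hourlyBonus;
            default: return Math.Min(flatRate, hourlyBonus); // whichever is lower
        }
    }
}

The salary types could be pulled out into their own strategy classes as well; the sketch keeps them as an enum only to stay close to the table above.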

time to read 3 min | 439 words

Joel wrote about Ruby's performance, and DHH replied with a post showing how he outsourced the performance-intensive functions. To note, my only experience with Ruby is writing a very few Watir tests, so I can't really say anything about Ruby's performance first hand. I agree with DHH that this is a good thing, but I wonder how to handle this in situations where the performance-critical part is something that is core to the business logic.

I'm not talking about general stuff like image resizing, encryption, or Bayesian filtering (and I think you are crazy if you are writing your own of those for production). What I am talking about is an application that is not mostly data entry and pre-calculated reports (a bug tracking system being a good example of the latter).

Let us assume a package delivery application, which lets the user choose the route by which to send their packages. Off the top of my head, you need to calculate the cost (time & money) of moving the package along each route while taking into account service level agreements, legal responsibilities, contract issues, past history, validity dates, etc. This gets complex very fast, and the amount of data that you need to consider is fairly big, especially if you need to consider business rules like "if the customer sent more than 10 packages a month for the last 3 months, give 4% off", etc.

You can do this on the backend, pre-calculating the most common (or all) routes and their costs, but it may very well be that you simply have too many parameters to do this pre-calculation (or are prevented from doing so for business reasons).

Assuming that I had a web application in Ruby on Rails, and I wanted to make the "choose a route" page work, how would I go about building it? This is mainly a CPU-bound task, with a limited amount of data to fetch and process, but I can't easily drop down to C for it. This is a task that involves quite a bit of business logic (just finding out whether a contract is valid may be a complex process, for instance), which I would have to duplicate in C (I may be able to hand the data to the C program from Ruby in a usable form, but I doubt it) in order to gain the necessary performance.

So, given this scenario (and, of course, assuming that doing this in Ruby is not performant enough), what are the options that I have?

time to read 2 min | 387 words

Take a guess: what is going to be the result of the following code?

Enum one = DayOfWeek.Sunday;
Enum two = DayOfWeek.Sunday;
Assert.IsTrue(one == two);

Update: Tomas Restrepo hit the nail on the head in the comments.

System.Enum is a reference type. The first line actually translates to:

L_0001: ldc.i4.0
L_0002: box [mscorlib]System.DayOfWeek
L_0007: stloc.0

There is boxing going on here, and then the == is doing reference equality. I actually learned about this from Tomas a while ago, when we were discussing weird-ass questions.

This is a freakish issue because Enum inherits from ValueType, so you would expect it to keep value type semantics. Check the comments for Tomas' explanation.
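To make the behavior concrete, here is a small sketch (my addition, not from the original post) contrasting the reference comparison with the value-based alternatives:

using System;

class EnumEqualityDemo
{
    static void Main()
    {
        Enum one = DayOfWeek.Sunday;  // boxes the value
        Enum two = DayOfWeek.Sunday;  // boxes again, into a different object

        Console.WriteLine(one == two);                        // False: compares the two box references
        Console.WriteLine(one.Equals(two));                   // True: compares the boxed values
        Console.WriteLine((DayOfWeek)one == (DayOfWeek)two);  // True: unboxes, then value comparison
    }
}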

time to read 4 min | 660 words

After last night's post about the performance benefits of SqlCommandSet, I decided to give the ADO.Net team a headache and release the results in a reusable form.

The relevant code can be found here, as part of Rhino Commons. Besides exposing the batching functionality, it is a very elegant (if I say so myself) way of exposing functionality that the original author decided to mark private / internal.
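The core of the trick, sketched below with hypothetical wrapper and delegate names: look up the internal type once, create an instance, and bind delegates to its internal methods, so every subsequent call is a plain delegate invocation rather than a MethodInfo.Invoke.

using System;
using System.Data.SqlClient;
using System.Reflection;

public class SqlCommandSetWrapper
{
    private delegate void AppendHandler(SqlCommand command);
    private delegate int ExecuteNonQueryHandler();

    // The internal type lives in the same assembly as SqlConnection.
    private static readonly Type commandSetType =
        typeof(SqlConnection).Assembly.GetType("System.Data.SqlClient.SqlCommandSet");

    private readonly AppendHandler append;
    private readonly ExecuteNonQueryHandler executeNonQuery;

    public SqlCommandSetWrapper()
    {
        object instance = Activator.CreateInstance(commandSetType, true);
        // Bind delegates to the internal methods once; every call afterwards
        // is a plain delegate invocation instead of reflection.
        append = (AppendHandler)Delegate.CreateDelegate(
            typeof(AppendHandler), instance,
            commandSetType.GetMethod("Append", BindingFlags.Instance | BindingFlags.NonPublic));
        executeNonQuery = (ExecuteNonQueryHandler)Delegate.CreateDelegate(
            typeof(ExecuteNonQueryHandler), instance,
            commandSetType.GetMethod("ExecuteNonQuery", BindingFlags.Instance | BindingFlags.NonPublic));
    }

    // The real Rhino Commons class also exposes Connection, Dispose, etc.;
    // they are omitted here for brevity.
    public void Append(SqlCommand command) { append(command); }
    public int ExecuteNonQuery() { return executeNonQuery(); }
}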

I really liked the declaration of this as well:

[ThereBeDragons("Not supported by Microsoft, but has major performance boost")]
public class SqlCommandSet : IDisposable

The usage is very simple:

SqlCommandSet commandSet = new SqlCommandSet();
commandSet.Connection = connection;
for (int i = 0; i < iterations; i++)
{
       SqlCommand cmd = CreateCommand(connection);
       commandSet.Append(cmd);
}
int totalRowCount = commandSet.ExecuteNonQuery();

As a note, I spiked a little test of adding this capability to NHibernate, and it seems to be mostly working; I got 4 (out of 694) tests failing because of it. I didn't check performance yet.

time to read 3 min | 490 words

I have ranted before about the annoying trend from Microsoft of welding the hood shut in most of the interesting places. One particularly painful piece is the command batching implementation in .Net 2.0 for SQL Server. It is extremely annoying mainly because the implementation benefits go to those who are going to be using DataSets (ahem, not me), but are not available to anyone outside of Microsoft. (See topic: OR/M, NHibernate, etc.)

Today, I decided to actually check what the performance differences are all about. In order to do this, I opened the (wonderful, amazing) Reflector and started digging. To my surprise, I found that the batching implementation seems to be centralized around a single class, System.Data.SqlClient.SqlCommandSet (which is internal, of course, to prevent it from being, you know, useful).

Since the class, and all its methods, are internal to System.Data, I had to use Reflection to pry them out into the open. I noticed that the cost of reflection was fairly high, so I converted the test to use delegates, which significantly improved performance. The query I ran was a very simple one:

INSERT INTO [Test].[dbo].[Blogs] ([blog_name]) VALUES (@name)

With @name = 'foo' as the parameter value. The table is a simple Id (identity), Blog_Name (nvarchar(50)).
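For reference, the table described above would look roughly like this (column casing, nullability, and the primary key are my assumptions):

-- In the Test database:
CREATE TABLE [dbo].[Blogs]
(
       [id] INT IDENTITY(1, 1) PRIMARY KEY,
       [blog_name] NVARCHAR(50) NULL
);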

Note: Before each test, I truncated the table, to make sure it is not the additional data that is causing any slowdown.

The Results:

[Chart: ticks per number of inserts, batched vs. non-batched, large batch sizes]

The X axis is the number of inserts made; the Y axis is the number of ticks that the operation took. As you can see, there is quite a performance difference, even for small batch sizes. There is a significant difference between batching and not batching, and the reflection / delegate calls are not a big cost in this scenario.

Here is the cost of a smaller batch:

[Chart: ticks per number of inserts, batched vs. non-batched, smaller batch]

This shows a significant improvement even for more real-world loads, even when we use Reflection.

I just may take advantage of this to implement a BatchingBatcher for NHibernate; it looks like it can provide a good performance benefit, although it will probably not affect SELECT performance, which is usually the bigger issue.

You can get the code here: BatchingPerfTest.txt
