Benchmarks are useless, yes, again


I was told about the existence of ORMBattle.NET a few minutes ago. The site describes itself like this:

ORMBattle.NET is devoted to direct ORM comparison. We compare quality of essential features of well-known ORM products for .NET framework, so this site might help you to:

Compare the performance of your own solution (based on a particular ORM listed here) with the peak performance that can be reached on this ORM, and thus, likely, to improve it.

Choose the ORM for your next project taking its performance and LINQ implementation quality into account.

NHibernate is faring rather poorly in the tests there, and I thought that I might respond to that.

First, I want to point out that the company behind the website is Xtensive, which makes another ORM product. This is not to discredit them; they make it very clear who is behind the site and that they have an ORM product in the benchmark. And they were the ones who contacted me about it.

The problem with benchmarks, especially when you are trying to compare a wide variety of products that have different feature sets and different capabilities, is that they are essentially useless. The problem is that in order to be able to measure anything useful, you have to resort to the common denominators, and they are pretty bad.

In the case of the benchmark scenarios used for NHibernate, it shows the problem rather clearly.

I don’t even need to think about anything to know that this is going to perform badly. There are several reasons for that. First, NHibernate was never intended to be a batch processing tool. It is an OLTP tool. The benchmark (all the tests in the benchmark) is aimed specifically at measuring batch processing. That is problem number one.

Second, NHibernate actually contains a lot of features that are aimed to give us great performance in many scenarios, including batch processing. For example, in this case, using a stateless session alone would generate a significant performance boost, not to mention that there is no use of batching whatsoever.
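To make that concrete, here is a minimal sketch of what a bulk insert through a stateless session with ADO.NET batching might look like. It is an illustration under stated assumptions, not the benchmark's code: the `Simplest` class shape, the `BulkInsertSketch`, `BuildFactory` and `InsertMany` names, and the batch size of 100 are all made up for the example, and it assumes a mapping for `Simplest` with assigned ids is already registered on the configuration. Whether statements are actually batched also depends on the driver and the id generator in use.

```csharp
using NHibernate;
using NHibernate.Cfg;

// Hypothetical entity; only the name "Simplest" comes from the benchmark queries.
public class Simplest
{
    public virtual long Id { get; set; }
    public virtual long Value { get; set; }
}

// Hypothetical helper class, for illustration only.
public static class BulkInsertSketch
{
    // Assumes the mapping for Simplest has already been added to cfg.
    public static ISessionFactory BuildFactory(Configuration cfg)
    {
        // "adonet.batch_size": number of statements sent per round trip.
        // 100 is an arbitrary value chosen for this sketch.
        cfg.SetProperty(NHibernate.Cfg.Environment.BatchSize, "100");
        return cfg.BuildSessionFactory();
    }

    public static void InsertMany(ISessionFactory factory, int count)
    {
        // A stateless session keeps no first-level cache and does no
        // dirty checking, so it avoids most of the per-entity bookkeeping.
        using (IStatelessSession session = factory.OpenStatelessSession())
        using (ITransaction tx = session.BeginTransaction())
        {
            for (int i = 0; i < count; i++)
            {
                // Assumes an "assigned" id generator in the mapping.
                session.Insert(new Simplest { Id = i, Value = i });
            }
            tx.Commit();
        }
    }
}
```

The design choice here is simply to keep one transaction around the whole loop and let the configured batch size decide how many statements go over the wire at once.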

These two tests also show examples of “this isn’t how we do things”. For one thing, calling the database in a loop is a bug. For another, I would produce the same result, with greatly improved performance, using executable DML:

session.CreateQuery("update Simplest s set s.Value = s.Value + 1").ExecuteUpdate();

session.CreateQuery("delete Simplest").ExecuteUpdate();

And so on, and so forth.

To summarize, because I have talked about this before: trying to compare different products without taking into account their differences is a flawed approach. Moreover, this benchmark is intentionally trying to measure something that NHibernate was never meant to perform. We don’t need to try to optimize those things, because they are meaningless; we have much better ways to resolve things.

And you know what? The problem is that even if you did put those in, they would still be invalid. Benchmarks tend to ignore such things as the impact of the built-in caching features, or the optimization options that are available in the mapping.

In short, if you want me to “admit” that NHibernate isn’t a batch processing tool, I will do so gladly; it was never meant to be. And benchmarks that try to show how it does in batch processing are going to show it being slow. For real world application development, however, NHibernate is a great fit, and shows excellent performance.

Oh, and because management told me that I must, if you find perf problems with NHibernate, that is why we have the NHibernate Profiler for you :-)


Tags:

NHibernate

Comments

15 Aug 2009
06:51 AM

junior programmer

What tool do you recommend for batch processing?

15 Aug 2009
08:15 AM

fanny

While I do agree that things can be done better in the code for comparing with NHibernate, I do not like that you dismiss the whole project as useless because you can't compare different products with different capabilities.

All these products are meant to solve only one, very common problem that most software projects have: data persistence.

And all the developers (and their managers) know that the performance of the solution greatly influences the performance of the whole application.

It's just too easy to say that you should not compare.

Join the project, modify and add some tests, join the battle and show that NHibernate is better!

15 Aug 2009
08:59 AM

Davy Landman

The thing I found most interesting was that the high score for DataObjects.Net ... I hadn't heard about it, and it turns out, the creators of the website are also the developers of that product ( http://ormbattle.net/index.php/about.html) ....

So the benchmarks are made in the way the developers of that product think an ORM should perform well... In my opinion this is a bit biased...

I'm not saying they're hiding their association, but still it would have been a good idea to attract some outside members to set up the benchmark suite.

15 Aug 2009
09:14 AM

Davy Landman

Should have read better, you did notice it... (and they did contact you)

15 Aug 2009
09:51 AM

Ray

That's a rather disgraceful way to advertise your own tool. I will never use a product of a company that uses such dirty tricks.

15 Aug 2009
09:52 AM

Rafal

fanny, I think comparing ORM 'performance' is a flawed idea because all ORMs will have very similar performance in typical scenarios such as load, update or insert; at least they should, because they execute almost the same SQL. The test cases should be trickier, checking additional performance-enhancing features such as caching, query batching, intelligent prefetching and so on. Now this would give us an idea of what benefits can be expected by choosing an ORM over raw ADO.Net. I would also like to see a comparison of ORM 'usability', like the ease of setup and use in existing projects, 'integrability' with other platform components, learning curve, ugliness of data access code etc.

15 Aug 2009
09:56 AM

Jimmy

We use the delete case with the foreach in our project, just because we did not find any alternative. And of course we suffer the bad performance.

From now on we'll use

session.CreateQuery("delete Simplest where xxx = yyy").ExecuteUpdate();

Thanks for the tip.

With NHibernate you can do things in several ways, and lots of them are wrong.

I'd really like to see somewhere a list of common NHibernate beginner mistakes, because NHibernate is so not beginner friendly.

15 Aug 2009
10:02 AM

Dave Mertens

@junior programmer: Are you sure you really need to download all that data from your persistent storage, do a little update and then upload it all again?

We use MSSQL2008 in about 98% of our projects. We don't write generic products, so we don't need cross-platform database support, and that means you can optimize for your database. But for every database there is one very important step: do as much as possible on the database server itself.

Take an import for example. I parse the import file and put all records in a database and use SqlBulkCopy.WriteToServer to send it all to a global temporary table. Then we perform multiple queries on that table and finally update the target table.

We use NHibernate in OLTP applications like websites, desktop applications and middleware (ESB, WCF, etc). We don't have much batch processing there.

@Fanny: There is nothing wrong with comparing ORM frameworks, but use REAL tests (like you do with your unit tests) and not a batch update benchmark. Why didn't they write a simple blog application and write an implementation for every ORM framework they want to compare? Then you can write stories and measure how long a certain use case takes: creating a blog, creating a blog entry, selecting the last 10 entries for syndication, etc. Then put all those timings in a matrix table and you have a good comparison of ORM tools, because you benchmark something real.

If you want high performance, you don't use an ORM framework to import a massive CSV file. It can certainly make things easier, but then YOU have to trade performance for convenience.

15 Aug 2009
10:32 AM

Alex Yakunin

Oren, can I quote my e-mail conversation with you? I disagree with you, and it explains my position very well.

15 Aug 2009
11:02 AM

Frans Bouma

Alex: I also mailed you because you use simply unfair code. I also mailed you how to fix them so you get much higher results with our framework for example.

I've seen these kinds of sites before, and although I don't like the results we got (which I know are due to your lack of understanding of our framework, so I am not surprised the results of other frameworks like nhibernate are also poor), I agree with Oren in this: they're actually pretty useless.

Not only can I show you your testing method is seriously flawed, it's also a case of what would the user use in practise? Like Oren showed in this post: if a user has to do ABC, he will use the feature of the framework which suits ABC best. Which very likely will perform much better, because it 1) fits the job to do better as the feature is tailored for the job and 2) doesn't fall into the trap of doing ABC very inefficiently to begin with

However at the end of the day, I can only shrug and move on. I thank you for the 100 extra test queries for linq which showed 3 bugs in our provider which caused many queries to fail (which is a signal you're not benching very good) so I'll fix these for our customers. But you have to understand: these kind of sites have little value, as what Oren described: they don't show how a framework will behave in real-life, in real world projects with real world problems: like updating entities using an expression. Some frameworks can do that with 1 query (like we do, and nhibernate too, can yours?), so a user will do that with ... 1 query in practise not with a loop!

Despite what you might think: customers don't suddenly think "Oh! I have to use dataobjects.net!", as they haven't done that in massive droves in the past, so why would they do that now? You can pat yourself on the back for the most feature-rich framework there is (and I think many of my fellow o/r mapper developers will agree with that), but honestly... if you want to sell your work, you have to invest time in documentation, features your customers want and need, instead of comparison sites which have little value other than "oh look Ma, in my own benchmarks I'm the best!"-kind of ego boosting ;).

15 Aug 2009
11:25 AM

Frans Bouma

Oh, and Alex, where are these 'CUD' tests / calculation methods ? You babble about pages, but you don't use paging code and the Materialization 'test' is equal to the 'Query' test (except a where clause)... which makes no difference for systems other than the ones who use prepared commands.

You see, prepared commands are only useful for these kind of tests, in real life, you don't need them that much, as you need many different commands per batch anyway if you're saving many entities (as they're often changed on different places) or you have many different connections, and a prepared command is gone once the connection is gone :)

Let's hope for the O/R mapper users in the world that we, O/R mapper writers, aren't forced to add optimized code paths for these silly loops, like videocard driver writers have to do to their driver code to show up best in the used benchmarks (which thus says nothing about reallife performance, only that the code does great in the particular benchmark).

15 Aug 2009
11:36 AM

Alex Yakunin

Frans, I just e-mailed you about unfair code. Certainly we'll fix all the issues in these particular test scenarios, but won't use anything like single statement table updates in these tests, because they test different case. I'll explain this here further.

And I confirm for LLBLGen there are some, but until now Oren didn't point to any issue to fix for NHibernate.

Concerning NHibernate-like frameworks: do you agree EF is NHibernate-like (POCO, similar DataContext behavior, etc.)? I didn't get any claims related to its tests yet (but let's wait, of course). Anyway, I'm 99.9% sure we know how such frameworks work very well.

15 Aug 2009
11:37 AM

Alex Yakunin

Sorry, there must be "And I confirm there are some issues to fix for LLBLGen, but until now Oren didn't point to any issue to fix for NHibernate."

15 Aug 2009
11:49 AM

Alex Yakunin

And I'll reply you later in the evening about everything else (e.g. paging). Should get some sleep :(

About prepared statements: for now we don't use them at all, because as you said, a prepared statement is gone when the connection is closed. Later - maybe. But for now I'm not sure about this. They don't significantly reduce query time if it is a query whose plan can be cached (in our case this is always true).

AFAIK only OpenAccess uses prepared statements ("sp_prepexec" is frequently shown in SQL Profiler output on its tests).

15 Aug 2009
11:59 AM

Ayende Rahien

Fanny,

The underlying premise of the project is flawed. It is trying to measure raw database calls, and that is just not something that is interesting in any OR/M that uses Unit of Work.

A real benchmark would be building a real world application on top of each, and optimize each story, not each specific test that they created. None of the created tests would ever be used in real code.

15 Aug 2009
12:03 PM

Frans Bouma

"About prepared statements: for now we don't use them at all"

err... how can you then be faster than sqlclient (see 30K batch tests for example) ?

That's batching, nothing else. And for batching there 's more clever code to be used. You know, the only FAIR way to test what you want without doing batching is to create for every insert a new session/context, open a transaction, save, commit/flush context/session, repeat.

Anyway, I'll mail you about the legal side of things about this, as management here isn't very pleased with you using our brand name for your marketing and want our name, results and any hint towards our framework removed from your site.

15 Aug 2009
12:10 PM

Ayende Rahien

Alex,

Sure, you can use the email conversation.

15 Aug 2009
12:16 PM

Ayende Rahien

Alex,

I did point out issues to fix in your NH tests, they are trying to test the wrong thing.

As such, I don't see much to bother with anything else.

When your underlying premise is flawed, you get GIGO

But here is a hint: set the batch size to a high level and see what is going on.

15 Aug 2009
12:39 PM

Frans Bouma

Alex, you have email (about the removal of us from your marketing site)

Also, I found the derivatives calculations, but these are wrong: you insert/update/delete in a batch (single session/context, 1 commit/flush), and therefore can use optimized codepaths for that. However, you can't use that for extrapolation over single client actions. E.g. doing 30K updates using a single context is a batch of 30K updates by a single client. This isn't the same as 30,000 clients doing 1 update each, although you seem to think it is and your site seems to suggest the tests are about that. They're not (otherwise you can't be faster than sqlclient).

15 Aug 2009
14:51 PM

Fabio Maulo

@Alex

"Anyway, I'm 99.9% sure we know how such frameworks work very well."

But you are showing the code, and seeing it, I'm sure you don't know how NH works. We can change the NH results without touching the wrong code of the test (only touching the cfg).

@Oren, have a look at the NH conf. they are using and compare it to some of their results.

Btw, good work @Alex. Time will tell us what the future of DO.NET is.

Good luck.

15 Aug 2009
18:20 PM

fanny

I'll be again out of the blame chorus here.

Just to clarify things I'm not in anyway related to DO.Net and discovered it today with the benchmark site.

I'm an NHibernate user (and btw an NHibernate Profiler user too).

In the past I used several frameworks like Ibatis.Net and LLBLGen.

Reading the comments seems to me that this site scares...

It has touched one pain points we have when using ORM.

The tests may be biased in favor of creators ORM so what?

They released the source code and said you are free to modify and test by your own.

You do not like the code? Change it.

You would like a more real life example: add it. We would like it too.

We would love to have it because we could see all the application of best practices not all of us may be aware of.

On the same damn example.

Now we also have tools to stress test an application and see how it profiles with many different clients.

This would show how to apply an ORM in particular domain and how easy (or not) all the ORM in the .Net space solve it the same situation.

And Ayende, yes: NHibernate is hard (except for you) if you want to use it in the best-performing way, and you know that, as you created a tool for this (and btw I bought it).

You are creating an example app if I am not mistaken, use it as the example to test with the other ORM.

Just do not tell me that if I need to do some batch updates I must not use the ORM I am already using because is not intended to do that.

And Frans: your reaction with this threatening "legal department stuff" removal doesn't help you either. At least for the public image your tool has gained.

The only problem this site has was that it came from a "contestant" ORM producer so it is obviously in favor of that particular ORM.

This can be (and I would say it should be) changed.

But the "battle" is good.

Every battle shows us how to improve: Frans proved it in catching bugs.

Benchmarks are good, they are not all, but it is worth to have them.

15 Aug 2009
20:01 PM

Ayende Rahien

fanny,

I think you are missing the main point.

It isn't a case of needing to optimize some code. It is a case of a benchmark that is specifically built to show off non real world scenarios that benchmark well on one ORM.

There is nothing to learn from trying to pass such a thing, except bad ways of building a benchmark.

15 Aug 2009
21:28 PM

firefly

@fanny, NHibernate and LLBLGen are two proven ORM frameworks. They both perform well in real-world applications, and that is the most important test there is. While it's certain that Oren and Frans can tweak the tests to suit their frameworks, why bother getting caught up in this little game? In the end they are probably comparable to each other, with some edge cases.

Like most have already pointed out, this is a petty marketing game being played by the benchmark owner.

"This would show how to apply an ORM in particular domain and how easy (or not) all the ORM in the .Net space solve it the same situation."

The question is how would we define this particular domain to suit the general audience? Then again, does it really matter? Unless the ORM, or any framework, is crawling on its knees, a few milliseconds won't matter in most cases. While performance matters, premature optimization tends to be a waste of time. Oren already touched on the subject of premature optimization elsewhere on this site and I happen to agree with him.

Since you mentioned that you are an NHibernate user: are you satisfied with its performance? If not, is NHibernate the bottleneck? Did NProf help your project in any way? Is there something else that you wish NProf would do for you? As a current NHibernate user, those are the kind of real-world stories that I am much more interested in, rather than some useless benchmark.

16 Aug 2009
01:07 AM

Jonesy

@fanny:

"Reading the comments seems to me that this site scares...

It has touched one pain points we have when using ORM."

Are you joking? It's a misleading website posted by someone who has a clear bias and everything to gain by misleading people. It has nothing to do with pain points, it has to do with dishonesty and false advertising.

"And Frans: your reaction with this threatening "legal department stuff" removal doesn't help you either. At least for the public image your tool has gained."

So you would be ok with building a brand and then have a competing company use your brand (dishonestly) to make its product look better? I don't think many companies would agree with your point of view.

16 Aug 2009
08:12 AM

Frans Bouma

"And Frans: your reaction with this threatening "legal department stuff" removal doesn't help you either. At least for the public image your tool has gained."

I'm not threatening anything. Our management just asked them to remove us from their site, KINDLY. That you think we threatened them is your imagination. We didn't threaten anyone with anything; why would we do that? It only costs money and time.

What doesn't help us is that this misleading FUD site keeps spreading lies about what o/r mapper is the fastest. We don't want to be in this challenge.

I'll give you a couple of examples why we don't want to be in this challenge:

1) our linq provider doesn't support Skip() without Take(). This is because our framework doesn't support skipping, only paging (in various ways, but that's another story). Adding Skip() to all linq queries makes ALL of them fail, and we therefore would score a 0. That must signal our linq provider is really bad, doesn't it?

2) what's the difference between:

// a) pseudocode, you get the drill
for (int i = 0; i < 30000; i++)
{
    using (var s = new Session())
    {
        s.StartTransaction();
        var someEntity = new SomeEntity() { Id = i, Value = i };
        s.Save(someEntity);
        s.Commit();
    }
}

vs.

// b) pseudocode, you get the drill
using (var s = new Session())
{
    s.StartTransaction();
    for (int i = 0; i < 30000; i++)
    {
        var someEntity = new SomeEntity() { Id = i, Value = i };
        s.Save(someEntity);
    }
    s.Commit();
}

a) mimics 30000 users saving 1 entity each: a real-life (albeit a bit lamely implemented) scenario. b) mimics a bulk insert, only used rarely, in import tools which can't use the native import system of the database at hand.

The 'battle' uses b) because it can show how fast DO is compared to other systems which are optimized for a). You know how you can see that? because it's faster than plain ADO.NET. How can it be faster than 30000 times a plain SqlCommand.ExecuteNonQuery() call? I know how, because it uses an optimized path for b): a prepared query (it's described below the figures).

3) If you're asked to update the salary of all employees of department 3 with 10%, and you have just a sql prompt, would you do that with:

a) an insert in a temp table, a cursor over the temp table updating the rows and then a cursor over the temptable which updates the real rows in the main table?

b) you would use a single UPDATE statement?

The battle uses approach a) as it then again can optimize for bulk updates, however SQL offers more advanced techniques to do this, and some frameworks optimized therefore for b). But again these frameworks look like poor performers in this 'battle'.

Do you know all this when you just look at the numbers? no. The numbers show that there's a clear winner. That's not a surprise, this battle has no other winner but the product of the owner of the website. I.o.w.: there's nothing to win and absolutely nothing to gain from any other framework to be on that website.

That's why we asked kindly to get our name, the code of our framework and the results of our framework to get removed. After all, they claim to be 'honest', so I don't see why they won't do what we kindly asked them.

16 Aug 2009
09:35 AM

Frans Bouma

I looked again into the code, and saw that the SqlClient code was actually included.

Now, here's the deal: it uses Prepare() already. So the numbers of the sql client tests are already created using prepared statements. This means that you can't get faster than that in single query execution sequences, you can only get as quick as that. But as the sqlclient test uses a tight loop over a prepared SqlCommand, you really can't get quicker than that, unless you batch statements.

Batching can be done on sqlserver by packing multiple queries together using ';' and eventually preparing them. It's more efficient if the query is equal every time, so the plan can be re-used for the packed query. You can only utilize this if you're doing batch inserts; if you're mimicking a lot of different actions on the DB you can't use batches, as they're not useful: there's nothing to batch. As I described above, if you move the loop outside the session/context creation, the batching isn't possible anymore and you will see very different numbers, which will mimic real world scenarios instead.

There's another way of batching on sqlserver, but that way isn't public. It's the way SqlDataAdapter batches statements using internal code. (using SqlCommandSet).

Lots of databases don't support batching of multiple statements, like Oracle. It's however nowhere mentioned on this 'honest' site. ;)

16 Aug 2009
09:56 AM

Ayende Rahien

Frans,

FYI, NHibernate supports batching for both SQL Server and Oracle using SqlCommandSet (and the equivalent for ORA).

You might want to take a look at how this is done and copy it.

Admittedly, [ThereBeDragons] was applied :-)

16 Aug 2009
10:48 AM

Frans Bouma

Heh :). Btw I didn't know odp.net had a batch pipeline under the hood.

I checked if we can implement it in the pipeline (after all, it processes elements in a queue anyway), but it's a little tricky as we have features which take place right after a single entity is saved, like validation after refresh, auditing and error recovery for a single entity failure during save is then out of the window (if packing 10 statements together makes statement 9 fail, how is this reported back so the calling code can recover and retry entity 9... ).

So only if the user doesn't want to use all that, AND bulk inserts are possible, it might be useful. Sounds like an edge case (as, like you said in the blogpost, in OLTP you're more interested in a lot of features around the persistence; that's what makes using an o/r mapper useful, not bulk inserts, and SqlBulkCopy is much faster than any o/r mapper).

hehe I see your comments on the class, my thoughts exactly :)

I've to look into whether this is worth the effort in our case (I'm not going to copy it, it's LGPL-ed code after all), considering the consequences for various features we have (which all have to be disabled or not used). Also reflection would destroy medium trust, some people like to have that due to shared hosting providers.

It's a trade off really, and I think in general users don't really want to give up the features for the batching just for a few extra cycles, as there are always other ways to make it even quicker (like bulk copies).

16 Aug 2009
11:58 AM

Ayende Rahien

Frans,

NH uses the UoW model, so we don't have the concept of partial update, just a transaction failure.

With NH, batch updates disable some things, too, mostly related to update validation checking (if you are trying to update instead of insert, frex).

The class is actually taken from Rhino Tools, which is a BSD license, so you can use that instead of the one in NH.

But while this isn't a common scenario, it does show up fairly often.

For example, imagine a feature like copying an entity (deep copy), which may result in many statement. Batching that produces nice perf boost.

16 Aug 2009
12:43 PM

Alex Yakunin

Ok guys, I see we've made some of you boil. First of all, let's stay cool ;)

Secondly, I'm now writing the answers to all the criticism I got from you. They'll be published in the FAQ @ ormbattle.net in a few hours. That's just because I dislike the idea of answering the same questions N times.

16 Aug 2009
12:51 PM

Frans Bouma

Good point. And thanks for the hint about the different version :).

I'll look into it later this year.

16 Aug 2009
13:10 PM

Fred

Yep...Time machine please...6 years ago ..

https://www.hibernate.org/157.html

I do totally agree with Frans, and the same should apply with NHibernate :

It is a shame to publish on a website this kind of information.

Who is stupid enough today to pick an 'out of nowhere' commercial framework and use it, just because the provider said it is the best using its own benchmarks? Do you really think that .Net developers are that stupid?

At least, now .net has its "storm" benchmark...

16 Aug 2009
13:28 PM

Set

"Do you really think that .Net developers are that stupid?"

Managers are...

16 Aug 2009
14:05 PM

Alex Yakunin

The first answer:

ormbattle.net/.../...ur-tests-are-unrealistic.html

Others are upcoming.

16 Aug 2009
14:10 PM

firefly

Boil? Hardly. Disgusted? Definitely. That's not the kind of feeling you want from a could-have-been potential customer.

Petty little marketing games disgust me. Programmers, or any other sensible beings out there, are sensitive to things like that :) . So I am glad that both Oren and Frans didn't fall for this dirty little trap.

I am an NHibernate user. While I haven't used LLBLGen, I would have no problem recommending it to customers that ask me about it. Why? Because both Oren and Frans impress me with their real knowledge and honest conversation. They showed me how much they know about their own products. If a feature is missing, it's missing for a reason. So if I ever need support from them I know I can expect a clear and reasonable explanation of why abc was done the xyz way. It's something that I can appreciate. They won't give me the runaround because they don't need to.

16 Aug 2009
14:52 PM

Frans Bouma

"The first answer:

ormbattle.net/.../...ur-tests-are-unrealistic.html"

You don't seriously believe that, do you, Alex?

Thanks Fred for digging up that old post. While we're at it:

ayende.com/.../PissedOffByVanatecOpenAccess.aspx

and my personal favorite:

blog.hibernate.org/.../11#versant

Two examples of other attempts to 'show' how 'crap' the competition is of product X, which actually didn't work that well...

16 Aug 2009
15:47 PM

Alex Yakunin

"Because both Oren and Frans impress me with their real knowledge and honest conversation."

That's one more pitfall you're falling into. Can I show you they have a good knowledge, but as anyone, make some serious mistakes? Guys, sorry about this :)

1) Frans has written an excellent sequence of articles about implementing a LINQ provider. I've read many of them, but frankly speaking, they didn't help us a lot, mainly because we've chosen a different way. The proposed one has lots of shortcomings you simply don't know about because you didn't try. And I, as a guy who has walked the same path, can show this is true: see their results on the LINQ implementation tests. Just take a slightly more complex case (subqueries, etc.), and it fails. Oren, actually I can hardly even imagine why you released such a "LINQ".

2) Read weblogs.asp.net/.../...rst-doing-is-for-later.aspx - Frans, I'm sorry again for being unkind there. Anyway, look on the facts, not the tone.

3) Oren, let's discuss your replies:

First case:

Me: Btw, NHibernate really drops query rate with the amount of already fetched instances. From ~ 1K/sec for 100 instances and to ~ 100/sec for 30K instances. I feel there is definitely some bug.

Ayende: Again, this is fully expected. NHibernate uses the Unit of Work model.

Can someone explain me, why:

NH's UoW model leads to a dramatic performance decrease on read-only operations?
What kind of internals are there? I.e. I know perfectly well how to implement UoW without such dramatic waste. Read e.g. about the STS implementation in .NET 4.0 - obviously, it provides much stronger guarantees, but the performance decrease there is ~ constant (4...8 times). So it's simply hard to imagine how to implement it in such a way!

Second case:

Ayende about our fetch test: Fetch is invalid usage, it is not something that you will ever there except as a bug.

Me: Why does it exist then? ;)

Ayende: For single use scenario. Calling it in a loop is a bug, you use a query for that.

I agree we test this on unusual scenario, but I explained why this is ok in this test. See ormbattle.net/.../...ur-tests-are-unrealistic.html

But let's return to the numbers: Oren, you're behaving like you don't see an obvious fact:

NH materializes entities so sloooowly (at least on LINQ queries) that its materialization performance is actually comparable to the fetch performance of the leaders! So basically, if you do the same just with fetches on other frameworks, you'll be at most 2 times slower than with specially written queries fetching them in bulk in NH. What does this show?

a) No one ever profiled this in NH

b) Ayende is not aware of this flaw

But he recommends this as a panacea to people. Imagine, even if I rewrote this test for NH to use just a single query, it would still LOSE to SqlClient! (you'd get materialization performance: 16399 op/sec, but SqlClient's fetch performance is 21129 op/sec).

So frankly speaking, I think NH users should simply say thanks to us for these tests. They show there are at least 2 huge flaws. I'd bet they'll be fixed in 6 months, if Ayende isn't too disappointed ;)

16 Aug 2009
15:55 PM

Alex Yakunin

Finally, I can't help mentioning the following: you guys have put so much effort in here... What is your goal? DISCREDITING THE RESULTS SHOWN AT ORMBattle.NET.

Ok, it's obvious that at least NH simply fails there. And actually there are two paths to fix this:

Complex: spend a few months on profiling its performance (instead of NHProf, yep ;) ) and make it work at least ~ as well as EF. I know it can, because they're very similar in terms of architecture & proposed approach - POCO, always "disconnected" graphs (sessions), AFAIK, data can be stored inside entity fields, etc.
Simple: say it does not reflect reality in order to discredit it, and hope this pretty false statement will be accepted by people just because of your good reputation.

16 Aug 2009
15:59 PM

Alex Yakunin

Frans, about Versant's advertisement and comparisons: you just mention FEATURE-BASED COMPARISONS. Have you read http://ormbattle.net/index.php/summary.html ?

Here I wrote why fully feature-based comparisons are initially false, and why we test just very very basic stuff.

16 Aug 2009
16:16 PM

Frans Bouma

@Alex, where have I insulted you? So why are you insulting me? ->

"Can I show you they have a good knowledge, but as anyone, make some serious mistakes? Guys, sorry about this :)"

Which mistakes did I make? I didn't implement silly code paths for some 'benchmark' of some competitor's fud site?

I'm really starting to get fed up by this crap. Remove our name and the results of LLBLGen Pro from your FUD spreading marketing site _NOW_. Ok? Not tomorrow, NOW. I've kindly asked you that yesterday, and also today a couple of times. We don't like being used as a pawn in a competitors ego boosting and self-promotion.

16 Aug 2009
16:22 PM

Alex Yakunin

Who is stupid enough today to use an 'out of nowhere' commercial Framework, and use it, just because the provider said it is the best using its own benchmarks. Do you really think that .Net developers are that stupid ?

... Managers are.

It looks like .NET developers & managers are stupid enough to use something that shows worst performance on very basic tests in comparison to other frameworks.

Concerning DO: yes, it performs very well on these tests now. But I think it can easily lose its position there in the near future:

All you need to compete with the leaders on CUD tests is batching. Have I shown you here it's nearly as usual as future queries? Why did only we and the OpenAccess developers notice it can be used? Btw, they use an even faster option: they automatically build DELETE FROM ... WHERE Id IN (...) queries for removals, but we (currently) use just batches consisting of a set of regular DELETE FROM ... WHERE Id = @Id statements. So OpenAccess is #1 on the entity removal test. It does removals almost 1.5 times faster than DO, and 6+ times faster than anyone else. Is this information useless?
You can be #1 on the materialization test. We compete well with EF there (although it is definitely #1 on this test) just because our materialization pipeline is likely the most optimized one on the market, but in general it's really hard for DO to compete with EF-like frameworks on this test: if everything is optimized well, it all depends on per-entity memory consumption here. So POCO entities must rule on this test. DO entities aren't POCO entities. That's it.
It's difficult to reach a good result on the LINQ tests. E.g. EF loses some points there mainly because it doesn't support equality comparisons for references (fields of Entity type; or, better to say, comparisons of non-primitive types) and has a few other strange limitations like the impossibility of using First/Single in subqueries. But in general 6 months are enough to get a 100 out of 100 score here; at least our team spent 6 months on the LINQ provider for DO.
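
The two removal strategies mentioned in the first point can be sketched roughly as follows. This is a minimal illustration using Python's `sqlite3` module, not code from DO, OpenAccess, or the benchmark; the `item` table, the helper names, and the chunk size of 500 are assumptions made for the example. The returned statement count stands in for round trips to a real database server.

```python
import sqlite3


def make_db(n):
    """Create an in-memory table with ids 1..n (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE item (id INTEGER PRIMARY KEY, value INTEGER)")
    conn.executemany(
        "INSERT INTO item (id, value) VALUES (?, ?)",
        [(i, i * 10) for i in range(1, n + 1)],
    )
    conn.commit()
    return conn


def delete_one_by_one(conn, ids):
    """One DELETE ... WHERE id = ? per entity; returns statements issued."""
    for entity_id in ids:
        conn.execute("DELETE FROM item WHERE id = ?", (entity_id,))
    conn.commit()
    return len(ids)


def delete_with_in_list(conn, ids, chunk_size=500):
    """One DELETE ... WHERE id IN (...) per chunk; returns statements issued."""
    statements = 0
    for start in range(0, len(ids), chunk_size):
        chunk = ids[start:start + chunk_size]
        placeholders = ", ".join("?" for _ in chunk)
        conn.execute(f"DELETE FROM item WHERE id IN ({placeholders})", chunk)
        statements += 1
    conn.commit()
    return statements


def remaining(conn):
    """Number of rows left in the table."""
    return conn.execute("SELECT COUNT(*) FROM item").fetchone()[0]


ids = list(range(1, 1001))
print(delete_one_by_one(make_db(1200), ids))    # statements: one per entity
print(delete_with_in_list(make_db(1200), ids))  # statements: one per chunk
```

Both functions remove the same rows; the difference is how many statements the server has to parse and execute, which is where the speedup claimed above would come from.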

Finally, you must notice DO isn't explicitly shown as #1 here; moreover, it's easy to prove we didn't develop tests specially for it. If anyone finds something contradicting this, please write to us, and we'll remove the test or part almost immediately and update the results.

16 Aug 2009
16:34 PM

Alex Yakunin

Which mistakes did I make? I didn't implement silly code paths for some 'benchmark' of some competitor's fud site?

Frans, you precisely know the mistake you've made. It's explained in my first post there.

Remove our name and the results of LLBLGen Pro from your FUD spreading marketing site _NOW_. Ok? Not tomorrow, NOW.

So I must ask you here again (since you asked to not quote you), don't you find yourself hiding important information from your customers? In fact, you prefer to hide the truth to sell your product better. Do you know what this is called?

16 Aug 2009
16:37 PM

Alex Yakunin

@ Frans, concerning removal: it will be done tomorrow, if you want. I just need some help with this.

16 Aug 2009
16:43 PM

Ayende Rahien

Alex,

Side note: Are you aware that your tone is becoming more and more offensive?

As for the results, as I mentioned, they reflect fake scenarios that have nothing to do with the real world.

As such, the benchmark results are worse than useless, they are misleading. I am willing to give you the benefit of doubt and assume that they are not intentionally misleading, but frankly, your behavior is... not encouraging that assumption.

I am not even going to try to answer your points, you are reiterating things that were already answered. Please read the thread and take a look at the STORM reply from Hibernate. Pretty much everything else applies here as well.

As for the FAQ page, when I gave you permission to use my reply to you, I assumed you would do it in an informal context, such as a comment or a blog post.

I don't think that the quotes in the FAQ page are appropriate and I am asking that you would remove them. The quotes, as shown, are out of context, both in terms of the actual discussion and the tone of discussion. Please remove them from the FAQ page, as I withdraw my permission to use them.

16 Aug 2009
16:50 PM

Fred

Well,

I know LLBL from being a customer. I know NHibernate and have been using it since the first release. Neither is perfect.

But if your product is the best product (and it is not only related to performances), then it will be the market reality.

I don't care that Linq NH doesn't pass 99,9% of your tests : they do not represent real life examples.

When I look at your result, I only doubt of one thing : that you have understood the other products.

Let's be honest: You have just promoted your solution using a low moral solution. Not only is it stupid because it is too big and obvious to work, but it is also insulting for many people (commercial or oss) because you use their work as a leverage.

As far as I am concerned, I will probably use both LLBL and NHibernate, with a preference for the second, but I will never work with or spend a cent on your company or your product.

16 Aug 2009
17:17 PM

Frans Bouma

"So I must ask you here again (since you asked to not quote you), don't you find yourself hiding important information from your customers? In fact, you prefer to hide the truth to sell your product better. Do you know what this is called?"

Important information? Contrary to your product, our framework comes with a 600+ pages manual and the knowledge that many thousands of projects (for example critical applications in banks and oil companies) use the runtime successfully every day, for many years already.

So, I don't think they'll lose anything when they can't read up how a competitor deliberately made our framework look really bad.

About the serious mistake I made... sorry but I don't follow you. Anyway, enough...

"concerning removal: it will be done tomorrow, if you want. I just need some help with this."

Yes, we want that, how many times do I have to repeat that, Alex? Thanks for removing us.

You must know that this whole mess really ruined my weekend and I still feel crap. I know I shouldn't be, after all it's marketing and who cares about marketing. But let me assure Alex, I won't forget this. When LLBLGen Pro v3 comes out with solid model first/ schema management etc. support for NHibernate, EF, L2S, LLBLGen Pro RT, Euss, genom-e and likely some others too, I will not spend a single second writing support code for your framework. You deliberately misused my life's work, and although I shouldn't, it feels like a personal attack. That's perhaps me, but so be it.

16 Aug 2009
17:22 PM

Mike G

@Alex

While your tests may prove that your product is more performant on those exact tests, I think this was a mistake in a few ways:

From what I've seen, the major ORM developers are generally very cooperative and respectful of each other's efforts. Doing a hand-picked speed comparison on a (misleading) marketing site probably just got you removed from the community as far as future cooperation is concerned.
That site appears to be set up as an impartial test results site. When I stumbled across that site, I saw no obvious implication that this was a site run by the creators of one of the products being tested. ORMBattle.Net? If you didn't intend to deceive, why not put this under your normal DataObjects.Net domain? This approaches astroturfing territory...bad.
There's nothing wrong with contributing ideas by debating techniques which may lead to better performance. Doing this in a public forum (such as this one) does everyone a lot of good. Skipping the debate and just showing raw benchmark scores does no good at all except from a marketing perspective... it is meant to impress managers who like to make major decisions based on charts.
ORM performance is about a lot more than how long it takes to do a single entity update, fetch, delete, etc...for example, how does it make sure that only what needs to get updated, gets updated? How does it minimize database chattiness? These are issues that I would imagine are difficult to benchmark as there are so many different possible scenarios, but they are as important, if not more important than raw "scores" on simple CRUD operations.

I don't know if your tests were meant to be an honest comparison between the different ORM frameworks out there, maybe they were...? But the way you go about it definitely shows, to borrow a phrase from the American legal system, the appearance of impropriety...which is bad in and of itself.

16 Aug 2009
17:25 PM

Alex Yakunin

Side note: Are you aware that your tone is becoming more and more offensive?

Side note: Do you see the same happens from your side? From my point of view, you say "everything is wrong", and now I must defend my position. Particularly, on such examples.

As for the results, as I mentioned, they reflect fake scenarios that have nothing to do with the real world.

Don't you see such statements are just FUD as well?

As such, the benchmark results are worse than useless, they are misleading.

Ok, have you seen my above post about at least two flaws in NH? Is this useless?

I am not even going to try to answer your points, you are reiterating things that were already answered.

FUD again.

Please read the thread and take a look at the STORM reply from Hibernate. Pretty much everything else applies here as well.

You mean this: https://www.hibernate.org/157.html ?

Yes, I've read it. And, frankly speaking, I didn't find much similarity except the fact that it was made by another ORM vendor (that's not the flaw: let's agree this is acceptable, if the benchmark is honest).

There is a large part about fixing bugs related to NH usage in the STORM test. I ASKED YOU PERSONALLY TO STUDY OUR TEST FOR NH TO AVOID ANY MISUSE OF NH THERE! I understand we aren't experts in NH, so I asked an expert to do this. What did I get? Mainly, generic answers like "your test is wrong" - I already mentioned this once here. And this whole blog post.

The other left part there is this one: "With such small datasets, accessed repeatedly, the database is able to completely cache results in memory; the benchmark never actually involves any real disk access (watch your hard drive while STORM runs). We never get to see what happens once the dataset is too large to fit in memory, or is being updated by another transaction. We never get to see what happens with the database is under load. In fact, this benchmark involves no concurrency at all! We never get to see any joins or any of those other things that happen in realistic use cases. Furthermore, these kinds of benchmarks are often run against a local database, which gives results that are absolutely meaningless once the database is installed on a physically separate machine."

I can explain (and I already explained this in the test suite description, but you don't see this) why MINIMIZING database load is GOOD for an ORM test rather than BAD: any ORM adds some overhead to all the database operations; to measure it well (i.e. to expose JUST IT), you must MINIMIZE the database load. That's what we do.

In fact, since we measure the efficiency of an intermediate layer, we must make all the layers behind it operate as fast as possible - to measure just the efficiency of that intermediate layer. Is this clear, or must I provide an example like with CPUs, add and multiply?
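
The measurement idea described here can be sketched as follows: keep the database as cheap as possible (in-memory, tiny dataset), then time the same query with and without the extra layer. This is only an illustration of the argument, not the ORMBattle test code; the `Item` class, the `item` table, and all function names are assumptions, and the absolute numbers depend entirely on the machine.

```python
import sqlite3
import time
from dataclasses import dataclass


@dataclass
class Item:
    """Hypothetical entity that the 'intermediate layer' materializes."""
    id: int
    value: int


def make_db(n):
    """In-memory database, so the storage layer is as cheap as possible."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE item (id INTEGER PRIMARY KEY, value INTEGER)")
    conn.executemany(
        "INSERT INTO item (id, value) VALUES (?, ?)",
        [(i, i * 10) for i in range(1, n + 1)],
    )
    conn.commit()
    return conn


def fetch_raw(conn):
    """Baseline: plain tuples straight from the driver."""
    return conn.execute("SELECT id, value FROM item").fetchall()


def fetch_materialized(conn):
    """Same query, plus the layer under test: one object per row."""
    return [Item(*row) for row in conn.execute("SELECT id, value FROM item")]


def ops_per_second(fn, conn, repeats=5):
    """Rows processed per second over several repeats."""
    start = time.perf_counter()
    count = 0
    for _ in range(repeats):
        count += len(fn(conn))
    elapsed = max(time.perf_counter() - start, 1e-9)
    return count / elapsed


conn = make_db(10_000)
raw = ops_per_second(fetch_raw, conn)
materialized = ops_per_second(fetch_materialized, conn)
print(f"raw: {raw:.0f} rows/sec, materialized: {materialized:.0f} rows/sec")
```

The gap between the two figures is the overhead of the extra layer. With a slow or remote database both figures would be dominated by the database itself, which is the point being argued; the counter-argument in the Hibernate text is that real applications run in exactly that regime.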

As for the FAQ page, when I gave you permission to use my reply to you, I assumed you would do it in an informal context, such as a comment or a blog post. I don't think that the quotes in the FAQ page are appropriate and I am asking that you would remove them. The quotes, as shown, are out of context, both in terms of the actual discussion and the tone of discussion. Please remove them from the FAQ page, as I withdraw my permission to use them.

Ok, so I understood you allowed me to use your quotes here? :)

My first intention was to simply do this to show there is nothing like "out of context". But that's too long for a single post. So actually I don't see anything like "out of context" here. That FAQ page reflects exactly the parts we were talking about.

Anyway, since this isn't really important, I'll remove them all. I'm really tired of arguing about stuff like "please remove my ..., I don't permit you to use it!".

16 Aug 2009
17:41 PM

Alex Yakunin

If you didn't intend to deceive, why not put this under your normal DataObjects.Net domain? This approaches astroturfing territory...bad.

We explicitly declared this almost everywhere. The site has a separate domain name for two reasons:

It wasn't made to show DO4 is the best. And there is a long true story about this on the "About" page. Yes, currently it markets DO4 well. But that's just currently. I'm not afraid if DO becomes an average tool there in the future. We'll simply try to keep its results good from this point.

Btw, we didn't even set up any explicit placing there. Again, for the same reason. I'd like such a benchmark site to run with or without DO4 results on it. And running this web site on the X-tensive domain would discredit it further.

We seriously thought (and still think) about making such benchmarks fully community based. E.g. shortly the code will be shared @ Google Code SVN (currently there is just a .RAR with it). We haven't done this yet just because of lack of time.

So honestly, we are even ready to remove DO4 results from it altogether (but I reserve the right to mention them on our own web site + hold a public vote on adding DO4 back there).

If you have any ideas on how this (i.e. a fully public & credible benchmark) could be achieved, just share them here. I fully believe our test suite is done well - again, with or without DO4.

16 Aug 2009
17:44 PM

Alex Yakunin

Hmm... More than two reasons, but anyway, it's a good explanation.

In short, I believe such a web site \ benchmark must exist. We are ready to give it over to community control. And any ideas on how to achieve this are much appreciated.

16 Aug 2009
17:47 PM

Alex Yakunin

There's nothing wrong with contributing ideas by debating techniques which may lead to better performance. Doing this in a public forum (such as this one) does everyone a lot of good. Skipping the debate and just showing raw benchmark scores does no good at all except from a marketing perspective...

I almost immediately published link to this post there: ormbattle.net/.../...st-suite-and-its-results.html

I'm also for complete openness in such things. And (actually) that's why I try to bring every conversation out here. Asking Ayende and Frans to run all of them there looked simply impossible.

16 Aug 2009
17:54 PM

Alex Yakunin

ORM performance is about a lot more than how long it takes to do a single entity update, fetch, delete, etc...for example, how does it make sure that only what needs to get updated, gets updated? How does it minimize database chattiness? These are issues that I would imagine are difficult to benchmark as there are so many different possible scenarios, but they are as important, if not more important than raw "scores" on simple CRUD operations.

As you see, our CUD tests partially show how to minimize database chattiness. Btw, we could do the same with the query test (there are future queries in some frameworks), but found this isn't fully honest, since it requires the use of a special API that leads to side effects. So likely, there will be a separate test for this shortly.

And I understand it's quite difficult to make a test that will be accepted by all of us.

If it is too complex, people will say "it's just a particular case". Vendors will try to optimize their framework just for the scenario tested there (all of us know what graphics card vendors do to lead in complex benchmarks - i.e. they fake the results there in every possible way).
If it is too crude... Well, you see what happens right now. Although the operations are really the simplest of the simple, and anyone can check this, I must answer why it isn't specially designed for us.

16 Aug 2009
17:56 PM

Mike G

@Alex

"We explicitly declared this almost everywhere."

I look at the main page and the summary results page and see this mentioned nowhere. Again, I'm not suggesting impropriety, but this has very obvious appearances of impropriety, so much so that I wouldn't blame readers for completely distrusting the results.

I think it's awesome to have a community based benchmark suite for the simple reason that it could improve the science of benchmarking, and may lead to benchmarks that are actually useful in a real-world context. However, I would expect this to be run and maintained by a completely impartial 3rd party, not by someone with such a vested interest in the outcome of the benchmarks!

To go down the "Bouma" road on this, I would suggest that the benchmarks themselves are not what is important, but rather the science that backs up the benchmarks. This thread seemed to have a promising start insofar as a discussion of what a "useful" benchmark is was introduced. Unfortunately, because of the whole appearance of impropriety issue, things turned to venom rather quickly which ended any useful discussion. A site dedicated to discussion of what a useful ORM benchmark is would be absolutely wonderful. Even some benchmark implementations would be nice, which you have obviously put quite a bit of effort into already.

16 Aug 2009
18:07 PM

Chris Nicola

As for the results, as I mentioned, they reflect fake scenarios that have nothing to do with the real world.

Don't you see such statements are just FUD as well?

I am not even going to try to answer your points, you are reiterating things that were already answered.

FUD again.

Errr... neither of those is any of fear, uncertainty or doubt. This conversation has unfortunately degraded to uselessness, though to be fair to both Oren and Frans, with the attitude you came in here with I'm not sure much else could have been hoped for.

I suspect some of that has been miscommunication, but much of it has been caused by your aggressive attitude towards them and their work. A fact which can be directly traced back to you creating a site literally called "ORMBattle". I think it is pretty clear someone was looking for a fight.

I believe such a web site \ benchmark must exist.

One of the reasons benchmarks aren't that useful is that typically performance is only important if your software isn't meeting a necessary SLA/requirement. Thousands of people use NH and LLBL and it clearly meets this need for them. Largely because they use it correctly and for the types of real-world scenarios that matter.

Even if meaningful benchmarks exist they probably won't matter much to customers. They will be concerned with other issues like the framework's flexibility and extensibility, or its community and support. You need to win on building a community, depth of knowledge, and proven reliability as Frans, Oren and others do. You should want to be seen as a go-to for insight into ORM issues. Issues in ORM are often subtle, tricky and complex as much as they can be simple. Proving you can be a bit faster is just such a small part of it.

You will not win customers by taking an aggressive approach like this. Trying to prove your product "beats" out others will only label you as biased and misleading and if that is FUD so be it, you brought it on yourself.

Even if, as you say, your benchmarks are not misleading, they will still be interpreted as such by most of the communities who now feel as if you are attacking. Remember developers care more about community, knowledge and support than they do about benchmarks which can say one thing on one day and something totally different on another.

16 Aug 2009
18:55 PM

Fabio Maulo

@Alex

If you are serious you should open that site to public comments, as this and other sites are open.

Allow users to leave their opinions about your tests, about the product they are using and about your product and, obviously, keep each opinion public without any kind of moderation.

Note: public means without a login on your site.

Your site has only your words as "the owner of the truth".

16 Aug 2009
19:08 PM

Jonesy

+1 for never, ever, ever using this product, and recommending switching to something else if any of my customers are stupid enough to be already using it.

Way to go Alex, in addition to making your company look shady from a marketing perspective, you have made your product itself look bad by making a fool of yourself arguing with two people (Oren and Frans) who are much, much, much, much, much smarter than you are.

16 Aug 2009
19:11 PM

firefly

"it's quite difficult to make a test that will be accepted by all of us.

If it is too complex, people will say "it's just a particular case". Vendors will try to optimize their framework just for the scenario tested there (all of us know what graphics card vendors do to lead in complex benchmarks - i.e. they fake the results there in every possible way).
If it is too crude... Well, you see what happens right now. Although the operations are really the simplest of the simple, and anyone can check this, I must answer why it isn't specially designed for us."

You've said it yourself. A benchmark like this is useless because it's either too complex to judge or too general to have any value. In engineering there is always a design trade-off. Each vendor chooses to optimize their product differently for different scenarios.

Let's take the auto business for example. To show that one car can do 200 mph vs another that can only do 110 mph is meaningless to the customer. What matters more is how well each company supports their product in the long run. How much will it cost me for an oil change? How reliable is the car? In other words, critical stuff that any benchmark will fail to show.

Unless you are so confident that you have achieved an absolute breakthrough and that your product will beat all the others in a smackdown. If that is the case why don't you put some money down if people like Frans and Oren can prove that you are wrong? Since I am sure they don't want to work for free :)))

If in the end, with the necessary benchmark, all products seem to be comparable to each other, then what does that mean to me and the community? A big duh :) tell us something we didn't know.

16 Aug 2009
19:17 PM

Samus

@Alex:

Frans said:

"When LLBLGen Pro v3 comes out with solid model first/ schema management etc. support for NHibernate, EF, L2S, LLBLGen Pro RT, Euss, genom-e and likely some others too, I will not spend a single second writing support code for your framework."

Pwned.

16 Aug 2009
20:46 PM

Alex Yakunin

If you are serious you should open that site to public comments as this, and others sites, are open.

We'll do this. Frankly speaking, initially there were comments. But we found the two Joomla! modules we used for this did more harm than good. Basically, there were serious issues with appearance we couldn't accept.

But they'll be back again shortly. For now there are only public forums.

By making a fool of yourself arguing with two people (Oren and Frans) who are much, much, much, much, much smarter than you are.

I won't even comment this ;)

If in the end with the necessary benchmark all product seem to be comparable to each other then what does that mean to me and the community.

Do you see they are comparable now? I don't see this. I see a huge imbalance there, which only proves that many of them were not profiled at all.

Tell us something we didn't know.

Did you know:

NH is 25...30 times slower on materialization than EF? So reading a large object sequence in it will take 30 times more time.
NH runs a simple query fetching 1 instance 100+ times slower when ~ 10K instances are already fetched into the Session? No one else exposes similar behavior.

Be honest, you didn't. Yep, these are ugly truths. If you'd like to hide them, the simplest way is to say "there is nothing new here, don't pay attention to this". Just discredit it.

And I'm not speaking about LINQ tests.

If that is the case why don't you put some money down if people like Frans and Oren can prove that you are wrong?

I suspect they'll continue doing this anyway - there is no need for bets. Moreover, you didn't say what exactly you mean by saying "you're wrong". Please state something particular, that will be easy to prove, not everything ;)

16 Aug 2009
20:46 PM

Alex Yakunin

Contrary to your product, our framework comes with a 600+ pages manual and the knowledge that many thousands of projects (for example critical applications in banks and oil companies) use the runtime successfully every day, for many years already.

I won't discuss the 600+ pages manual - I believe it's good, and actually I don't fully understand why you're trying to mix this in here. ORMBattle is about performance and LINQ. You're writing about your manual and marketing results - exactly what ORMBattle isn't about.

In fact, you're saying "there is evidence this is an HQ product" - have I ever said it isn't? Yes, but it isn't the best on our test suite. Actually, if your product is so good, in your place I'd simply accept this, and explain why others are better and what the advantages of the architecture you have developed are.

You deliberately misused my life's work, and although I shouldn't, it feels like a personal attack.

Obviously, it isn't. Did I say something bad about your product here? I said you were wrong and showed a few examples of this. Why? You're (as well as Ayende) trying to discredit my own work by all possible means here; moreover, people believe you, and you're relying on this, trying to use the tone "believe me, that's true". What did you expect?

Certainly, I don't dislike you personally. It's all about the opinion here.

Btw, it's even a bit funny you answer to me in such a tone. Actually, I don't expect this. I like to argue - it's a kind of sport for me. And I say "I was wrong" with ease, when I feel this. But currently I feel myself completely differently - and that's only because of your arguments.

Anyway, I don't like the idea of maintaining such a fight here further. I'll provide links to particular test discussions here tomorrow, as well as place to discuss the ideas on any other tests. I think this will be much more productive.

Saying "everything is wrong" is not an argument.

I don't care that Linq NH doesn't pass 99,9% of your tests : they do not represent real life examples.

What's "real life" is actually very difficult to define.

We'll enumerate the tests (and name the cases) failing for NH. We didn't do this for other products just because it would be disparaging. But since NH is fully open source, I hope it is ok to publish this info. So anyone will be able to judge on their own whether this is "real life" or not. Currently I can only assure you lots of tests there are pretty simple.

When I look at your result, I only doubt of one thing : that you have understood the other products.

That's why we contacted their vendors or experts. But I doubt we understood them so wrong.

You have just promoted your solution using a low moral solution. Not only is it stupid because it is too big and obvious to work, but it is also insulting for many people (commercial or oss) because you use their work as a leverage.

I just mentioned we're ready to remove DO4 from the results there altogether. If this will solve the problem with the trustworthiness of the results, it's definitely ok. I believe such a test simply must exist.

Otherwise, what you see is what a particular vendor shows to you. So do you really think this is good? There are TPC-C and many other tests (btw, frequently testing very similar operations) for databases, but no credible tests for the next layer. And you, as a customer, are taking the position of your favorite vendor. You're so impressed that you protect an idea that is attractive for just a few vendors, not customers.

Ok, forget about DO4 positions there. Let's think we already removed it. Do you still think such a benchmark is useless?

16 Aug 2009
20:46 PM

Alex Yakunin

Btw, it's interesting to know what kind of framework would win in our performance tests, yes? It must be:

POCO-based. Its RAM overconsumption must be close to zero. This is needed to win materialization test.
As simple as it's possible, but complex enough to support reference fields and simplest collections. Again, that's about materialization test.
It must communicate with the DB quite efficiently. No chattiness at all. That's about the CUD test.
Provide very good LINQ support.
Be highly optimized. Briefly, if you never profiled your framework, you have almost zero chances to be even #2...#3 there. We know leader of each test seems to be highly optimized at least for this particular scenario.

Looks like all this suits well for NH, yes? But it isn't winner there. EF wins materialization test - because their materialization is written pretty well + they use POCO. OpenAccess wins removal test because they optimized this almost ideally. And the product I can't name here either wins or is #2...#3 on other tests - I hope you understand this couldn't be a result of some particular optimization for these tests, or "just specific tests". Otherwise it would win everything.

16 Aug 2009
21:05 PM

Alex Yakunin

I look at the main page and the summary results page and see this mentioned nowhere.

It's mentioned at about page: http://ormbattle.net/index.php/about.html

A site dedicated to discussion of what a useful ORM benchmark is would be absolutely wonderful. Even some benchmark implementations would be nice, which you have obviously put quite a bit of effort into already.

I fully agree with you here.

Btw, we tried to find some standard benchmarks for ORM tools, and actually found nothing credible. E.g. I remember one testing paging implementation, but they didn't check even query plans there. Obviously, if they're too different (= one is just completely wrong), there is no reason to test paging at all.

So if there is a test for paging (Take/Skip), we'll definitely take it into account. And there must be a separate test for paging without Take/Skip - that's frequently used as well.

Anyway, we'll try to make more credible out of this - I believe this is more important what we could get for DO out of this. Obviously, we must remove DO for some period there (3...6 months?), and forward full control over the test suite code to the community. We'll think how to achieve this.

16 Aug 2009
21:06 PM

Alex Yakunin

Sorry, I mistyped: "Anyway, we'll try to make more credible site out of this".

16 Aug 2009
21:35 PM

firefly

"I suspect they'll continue doing this anyway - there is no need for bets. Moreover, you didn't say what exactly do you mean saying "you're wrong". Please state something particular, that will be easy to proof, not everything ;)"

To prove that you are wrong for nothing meaning they are being dragged into your little game which is a waste of time.

"Do you see they are comparable now? I don't see this. I see a huge disbalance there, which only proves that many of them were not profiled at all."

Currently the test is being tweaked toward your framework, whether it's intentional or not. Hence the imbalance result. Like I said unless there is a breakthrough in your framework with the necessary optimization toward each framework the end result will be similar for the benchmark. So if you are confident why don't you put a price on the fact that after the necessary tweaking your product will still beat the other hand down? If not I would rather not see the time of the community being wasted.

16 Aug 2009
21:57 PM

cowgaR

I haven't read the comments here, but here's what I did...

read this article's topic
clicked the ormbattle link
didn't see any version number in contesting ORMs (like subsonic 3 has LINQ provider already)
saw the result, one (unknown to me to this day, I knew all the others) ORM stood out completely as a big winner
clicked "about us"
read:

__ We are experts in ORM tools for .NET, relational databases and related technologies. Moreover, our company has a product participating there (DataObjects.Net).

laughed, and closed the browser

16 Aug 2009
22:12 PM

alwin

Aren't comparisons like these useless? I mean, whatever ORM you use, it can be tweaked and finetuned. With the end result that the time spent in SQL server is much larger than the time spent in the ORM itself, and that ORM time is negligible. I think this is true for every good ORM out there.

Every ORM can reduce roundtrips (I love NH futures), and has options for optimizing the SQL code being generated.

People are better off looking at features/ease of use/support/price of the ORM. Don't worry about performance. After a bit of configuring and optimizing your ORM of choice, you can get almost the same SQL output, and performance, as all the other ORM's.

16 Aug 2009
22:51 PM

Johnston

So the purpose of this was just having such a benchmark available? It's pure altruism regarding ORM? I think that's really noble of you.

Just so it's crystal clear: you would have created this separate website and posted the results of all these tests if your product had been the clear loser across the board?

That's very, very believable! I'll take 100 copies of your software! I can't wait to see the tripling of performance of my code!!! I'm using lots of exclamation points so it's clear that I'm being 100% sincere!!!!!

16 Aug 2009
23:20 PM

cowgaR

@Alex

__I can explain why MINIMIZING database load is GOOD for ORM test rather then BAD...

In fact, since we measure the efficiency of intermediate layer, we must make all the layers behind it operating as fast as possible - to measure just the efficiency of that intermediate layer.

well I'm having fun here (very interesting discussion) but you're a bit wrong here

2 different ORMs will issue 2 different SQL statements to a database (trying to solve the same real-life problem), say ORM-A will issue 3 statements whereas ORM-B can do it in one statement, because its LINQ provider (join support etc), or CRUD handling is more advanced (say it will only issue an UPDATE statement on the particular changed columns, not on ALL like Lightspeed is doing for example (amazing ORM btw, v3 will improve this))...

therefore the DB engine (say SQL 2008) will spend say 85ms (big numbers just for the record) in the first case, 15ms in the second case...and multiply this.

That is what counts in the real world.

(so the formatted sql and the possibility to alter/change that, thanks to NHProfiler you know which side I'm on, so you should measure the efficiency of the DB engine as well, because it might be stressed by some dumb ORM these days;-)

17 Aug 2009
00:23 AM

Ayende Rahien

I ASKED YOU PERSONALLY TO STUDY OUR TEST FOR NH TO AVOID ANY MISUSE OF NH THERE!

And, as I replied to you when you asked, the main problem isn't with the code (there are severe problems there, but that isn't the issue).

The problem is that the premise of the benchmark is flawed. You are trying to test _bulk data manipulation_, a scenario which is just not something that NHibernate (or most OR/Ms) is designed to solve.

As such, there really isn't a point in trying to fix code, it is the idea of the test itself that you should fix.

If you want to create a more realistic benchmark, you should try one of the suggestions above. Write an application and test that using things like web load test.

In other words, your benchmark does something like this:

C++

int[] arr = new int[100];

for(int i=0;i<100;i++)

arr[5] = i;

int[] arr = new int[100];

for(int i=0;i<100;i++)

arr[5] = i;

And then pointing at the perf difference related to array bounds checking in a contrived scenario.

There isn't a point in trying to fix anything, the scenario that you are using is broken.

to measure it well (i.e. to expose JUST IT), you must MINIMIZE the database load.

That is just nonsense. Trying to minimize database load means that you are removing real world concerns, which completely changes the behavior of your application.

It is like trying to show fuel consumption in city driving for a Formula 1 car. The test itself is meaningless for the designed goals.

Just so I'll make the example clearer. No one cares about the fuel consumption at 20 KPH, all they care about is fuel consumption at 200 KPH. And trading off one for the other is a good design goal.

since this isn't really important, I'll remove them all

Thank you.

17 Aug 2009
00:32 AM

Ayende Rahien

NH is 25...30 times slower on materialization than EF?

No, it isn't. Your test is loading 20,000 objects per loop iteration, were you aware of that?

NH runs a simple query fetching 1 instance 100+ times slower when ~ 10K instances are already fetched into the Session?

Didn't I cover that already?

NH uses the UoW model, which means that we manage all loaded objects internally.

That is expected and acceptable, for the simple reason that NH ISession isn't supposed to be used for bulk data manipulation, that is why we have IStatelessSession.

Oh, and as an aside, you might want to track down what SQL is being called for each test, the results should be quite educational.

17 Aug 2009
00:36 AM

Ayende Rahien

it's interesting to know what kind of framework would win in our performance test

And that is why your method is flawed. You mention nothing at all about things that are important for real world concerns like lazy loading or eager loading options, caching semantics, transactional behavior, dirty checking, automatic persistence, and more.

Those aren't just bullet points, they truly change the way you write software.

17 Aug 2009
04:24 AM

Alex Yakunin

Your test is loading 20,000 objects per loop iteration, were you aware of that?

Wrong. There are results for 1K (first page), 5K and 30K (in .XLSX). NH is slower by at least 25 times.

17 Aug 2009
04:37 AM

Alex Yakunin

NH uses the UoW model, which means that we manage all loaded objects internally.

Do you know that almost any other framework supports UoW and "manages loaded objects internally" as well?

Yes, normally there is a state associated with each loaded entity. But:

1) It is stored in dictionary-like object (or, better, an object internally relying on hash table). Thus state search time is constant - i.e. it does not depend on count of already loaded objects.

2) I don't understand what must happen there to make each query 100 times slower on a large set of instances. Is NH iterating over all the instances before each query (e.g. checking whether any of them has changed)? Actually, it looks like exactly this happens.

If you aren't sure there is a REAL problem, I'll show you a sequence for NH LINQ queries separately:

N (number of performed queries): Performance (op/sec)

100: 925

1000: 646

5000: 301

10000: 180

30000: 68

So performance has degraded ~ 15 times. You still don't see the problem? No other framework exposes the same behavior.

17 Aug 2009
05:07 AM

Alex Yakunin

And that is why your method is flawed. You mention nothing at all about things that are important for real world concerns like lazy loading or eager loading options, caching semantics, transactional behavior, dirty checking, automatic persistence, and more.

Yes, I do not. But:

1) DO supports lazy & eager loading for all the fields by default. The entity tested on this test could be an entity with LL field with just one attribute added. Do you know we maintain "layered" field state? I.e. in fact, we know what was loaded initially, and what was changed? How its presence affects on results. Almost nohow.

Anyway, I believe there must be a separate test for this (LL); moreover, LL must not lead to much more costly while it's off. So in general, I think results must be ~ the same independently of LL support in a particular framework. If this isn't true, likely, there is a bug.

2) Caching was intentionally turned off for every framework, and we wrote about this. There must be a separate test for this. If it weren't off for some framework, we'd compare teleportation with regular transportation (i.e. two completely different technologies).

3) Transactional behavior, dirty checking: the same. E.g. DO relates any acquired entity state to a particular transaction, and can automatically re-load it on demand, if current transaction has changed. There are transactional methods - this ensures any operation you run on entities or persistent services will lead to a transaction, if necessary.

And what? All this stuff doesn't make it to operate slower than others.

4) Automatic persistence: oh, that's what exist in many of tested frameworks. Presence of this feature must not significantly affect on results.

Those aren't just bullet points, they truly change the way you write software.

Yes, I agree they must be implemented, but this test does not rely on these features. Moreover, I'm sure the presence of these features must not significantly affect the test results. And actually the test shows this: there are some frameworks that implement all of them (I'm 100% sure DO does this), and their test results aren't bad at all.

This shows they can be implemented in a way that does not lead to serious performance degrade.

Any other "that's why your method is flawed"?

17 Aug 2009
05:11 AM

Alex Yakunin

Sorry, corrections:

"How its presence affects on results?"
"Moreover, LL must not lead to much more costly operations while it's off."

17 Aug 2009
05:19 AM

Alex Yakunin

So if you are confident why don't you put a price on the fact that after the necessary tweaking your product will still beat the other hand down?

I'm ready to bet on this: i.e. I'm 100% sure some of the tested frameworks can beat DO after the necessary tweaking on CUD, query & materialization tests (not speaking about LINQ tests - this is obviously true, but likely, I'll be waiting for too long ;) - it's complex as well).

So who wants? I can personally explain what must be done to achieve this.

17 Aug 2009
05:21 AM

Alex Yakunin

didn't see any version number in contesting ORMs (like subsonic 3 has LINQ provider already)

Tested version of any framework is mentioned in its own page.

See e.g. http://ormbattle.net/index.php/subsonic.html

17 Aug 2009
07:30 AM

Arnis L.

What a surprise - DataObjects is a leader. Seems like another ad for me.

17 Aug 2009
07:50 AM

Unholy

What a surprise - DataObjects is a leader. Seems like another ad for me.

Of course it is ad. All record of all product is ad :)

17 Aug 2009
08:11 AM

Phillip

Well i know what i wont be recommending now. DataObjects.

17 Aug 2009
08:14 AM

Frans Bouma

@Alex, Instead of rehashing your own opinion here, you also could have invested 2 minutes to remove us from your website already. It's still up there.

The main problem with your way you look at things, Alex, is that you deliberately want to run these tight loops as 'tests' and use them to create other information like the CUD numbers.

I can prove to you that that route is not correct: because the updates are all done in bulk, you can also use bulk update statements. You ignore them because you then can't run the tight loops which show how great you optimized the batching pipeline. However, for the purpose of 'updating massive entities with the same expression' (as the tight loop implements, as you use 1 context for 30K updates!), using the bulk update statements in some o/r mapper frameworks, the update statements are done much faster: 1 query and they're done. Extrapolating that to CUD numbers therefore would make them the clear 'winners'. Therefore what you use as 'CUD' numbers isn't going to fly.

You don't use bulk update statements (which are just another way of optimizing what you optimized yourself inside the pipeline!) because you find that a 'hack' as you told me. That's your opinion of course, but as you and you alone call the shots on that website, your opinion is the only one that counts. So you can manipulate the tests to get the results you want. And you do that.
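
The difference between the two update styles being argued about here can be sketched in a few lines. This is a minimal illustration in Python with an in-memory SQLite database, not code from any of the ORMs discussed; the `Simplest` table name is borrowed from the query quoted elsewhere in this thread, and the helper names (`make_db`, `update_row_by_row`, `update_in_bulk`) are hypothetical.

```python
import sqlite3


def make_db(rows):
    """Create an in-memory table with `rows` rows (hypothetical test data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Simplest (Id INTEGER PRIMARY KEY, Value INTEGER)")
    conn.executemany(
        "INSERT INTO Simplest (Id, Value) VALUES (?, ?)",
        [(i, i) for i in range(rows)],
    )
    conn.commit()
    return conn


def update_row_by_row(conn):
    """Tight-loop style: one UPDATE statement per entity."""
    ids = [row[0] for row in conn.execute("SELECT Id FROM Simplest")]
    statements = 0
    for entity_id in ids:
        conn.execute(
            "UPDATE Simplest SET Value = Value + 1 WHERE Id = ?", (entity_id,)
        )
        statements += 1
    conn.commit()
    return statements


def update_in_bulk(conn):
    """Bulk style: a single UPDATE statement applies the same expression."""
    conn.execute("UPDATE Simplest SET Value = Value + 1")
    conn.commit()
    return 1


def values(conn):
    """Return all values ordered by Id, to compare the end state."""
    return [row[0] for row in conn.execute("SELECT Value FROM Simplest ORDER BY Id")]
```

Both functions leave the table in the same state; only the number of statements sent differs, which is why a benchmark that allows one style but not the other ends up measuring the rule rather than the tool.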

Another signal that you compare apples with oranges is the 'Query' test. You have absolutely no clue how our framework works, looking at the code. For starters you fetch inside a transaction. But we don't use a central session object, Alex. That's a deliberate choice. Secondly, your code starts fetch queries on a linq metadata object in a loop. But that simply creates a new expression tree, executes it and returns the object. By contrast, the code for DO creates a single transaction on the already live Session and fetches all queries on that same object. This isn't the same thing as you now can re-use cached objects, be it a query, an expression tree evaluation, anything.

Is this mentioned on your website, that we have a very fast pipeline and therefore don't suffer from problems where queries have to be re-evaluated every time, you know, like in 'Real Life'? The website, ironically, mentions that you have set up the tests the way they are to make sure the overhead in the o/r mapper is maximized. However, by re-using session/context objects in a loop, you precisely don't do that, as you can re-use resources, query parsing results etc., which is not giving a good picture. Flawed, one would say.

17 Aug 2009
08:14 AM

Frans Bouma

(part 2)

On to the linq queries. Your '100' number suggests a percentage. The numbers show that 60% of the linq queries thrown at our linq provider failed. However, does that say that 60% of all queries one would write normally would fail? Your website suggests this, (as the linq tests are carefully created, am I right?) but this is utter nonsense. For starters, you exploit lack of support of some features deliberately. For example the linq queries never show a single FK-PK comparison. That might be because you don't support FK fields. Adding 25 FK-PK comparison queries would give you a 25% failure rate. We do support FK fields, in fact our linq provider doesn't support entity instance comparisons, as we don't think it's something that will be used much (and our linq provider is out for more than a year now and we never had that request), as we have fk-pk comparisons, which are closer to what people would do in sql anyway.

Yet, you keep your set of entity comparison queries, which makes sure a lot of the queries fail. You also have queries with multiple Skip() calls. Skipping is something one would rarely use in real life, as one would normally use paging. Did you want to test paging? If so, why didn't you write a couple of them? Oh, Skip(10).Take(50) isn't paging as it is used in real life.

I can go on and on, but enough time wasted. As a final remark, I wanted to add that what I wrote about the personal attack was to make you feel how you ruined my weekend. That you don't seem to understand a single bit of it, that's clear to me. C'est la vie.

You still have something to remove, btw.

17 Aug 2009
08:37 AM

Alex Yakunin

Instead of rehashing your own opinion here, you also could have invested 2 minutes to remove us from your website already. It's still up there.

We're working on this now. We'll make all the fixes in the tests suggested by others and remove LLBLGen with today's update.

I can prove to you that that route is not correct: because the updates are all done in bulk, you can also use bulk update statements. You ignore them because you then can't run the tight loops which show how great you optimized the batching pipeline.

I'm writing FAQ post about this. Briefly,

We'll add tests for single entity updates
Tests for multiple entity updates will be there.

Obviously, "reality" is somewhere in the middle: some transactions run just 1 update (btw, a rare case), others run many of them. So in general, the true CUD average is ~ SingleUpdateTime * SingleUpdateProbability + MultipleUpdateTime * MultipleUpdateProbability. E.g. 20% * SingleUpdateTime + 80% * MultipleUpdateTime.

Obviously, this is a number between SingleUpdateTime and MultipleUpdateTime. Moreover, I'm 100% sure SingleUpdateTime will be approximately the same for all ORM tools, ~ 5..10K op/sec. So this number won't dramatically change real life results (CUD average).
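
The weighted average described above can be written out as a small function. This is a sketch only: the function name `blended_cud_time` is hypothetical, and the numbers in the usage line are invented inputs, not measurements from the site.

```python
def blended_cud_time(single_time, multi_time, single_probability):
    """Weighted average of per-operation cost for the two update styles.

    single_probability is the share of operations that run as single
    updates; the remainder is assumed to run as multi-entity updates.
    """
    if not 0.0 <= single_probability <= 1.0:
        raise ValueError("single_probability must be between 0 and 1")
    multi_probability = 1.0 - single_probability
    return single_probability * single_time + multi_probability * multi_time


# Hypothetical inputs: 20% single updates, 80% multi-entity updates.
example = blended_cud_time(single_time=10.0, multi_time=2.0, single_probability=0.2)
```

The result always lies between the two inputs, so the argument turns entirely on which probabilities are realistic, and that is exactly the point being disputed in the replies.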

17 Aug 2009
08:51 AM

Set

It reminds me one day, some people claimed that using dataset was better than using an ORM because some "benchmark" showed it. In fact they even claimed that it was faster than datareader...

And thus, one should use Dataset over datareader or ORM tools. It was lovely and dumb.

17 Aug 2009
08:58 AM

GiorgioG

I have to say Alex, it's clear that you're in love with your software...and as we all know, love is blind.

Stop wasting our time with your drivel, clean up your site, put a GIANT disclaimer on the main page saying you have a conflict of interest (i.e. you built D.O. and the tests in its favor) and let these two gentlemen (Frans/Ayende) get back to work.

a happy LLBLGen Pro customer & NHibernate user.

17 Aug 2009
09:27 AM

Alex Kofman

Note, that authors of this web site don't claim that you must use ORM with the best performance. It is just a source of performance-related information, that can be useful in some cases.

OK tests are not perfect, but they are not completely wrong. One can suggest another test. How would good tests on ORM performance look like?

17 Aug 2009
09:30 AM

Alex Yakunin

For example the linq queries never show a single FK-PK comparison.

To be fixed. We support this, but likely, in many tests it was written as order.Customer==customer, that isn't supported by you.

17 Aug 2009
09:34 AM

Ayende Rahien

Wrong. There are results for 1K (first page), 5K and 30K (in .XLSX). NH slower at least 25 times.

You are using the following code:

while (i < count)

foreach (var o in session.Linq<Simplest>())

There is no paging, and NHibernate doesn't do lazy materialization. You are loading every single row (and there are 20,000 of them) for each while loop iteration.

The code makes me think that you believe that it will do lazy materialization, though.

17 Aug 2009
09:42 AM

Ayende Rahien

Alex,

I am not believing those numbers for several reasons.

a) Your code and your statements about the code show a big disconnect.

b) It is _not important_, NH isn't trying to optimize loading of a large set of queries. It doesn't matter because any optimization that we may try will be swallowed by the DB time.

c) NH ISession can most certainly handle a large number of objects in a query, but even then, it is not something that you are supposed to do.

d) State search is O(1), but NHibernate manages query persistence transparently. What this means is that the following code will work with NHibernate.

tx = s.BeginTransaction();

foo = (Foo)s.Get(typeof(Foo), 1);

foo.Name = "ayende";

var list = s.CreateQuery("from Foo f where f.Name = 'ayende'").List();

Assert.Contains(list, foo);

tx.Commit();

When you perform a query, NHibernate will check if it needs to synchronize the state of loaded objects with the database (automatic flush) based on a set of rules that aren't really important right now.

Your test happens to force NHibernate to make a check of every object loaded to memory, that is an O(N) operation, by definition.

Now you get it?
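
The cost model described in this comment can be illustrated with a toy session. The sketch below is in Python and is not NHibernate's implementation; `ToySession`, `Entity` and `checks_for` are hypothetical names. It only models the two claims made above: lookup by id is O(1), and a flush check before each query touches every loaded entity.

```python
class Entity:
    def __init__(self, entity_id, name):
        self.id = entity_id
        self.name = name
        self._snapshot = name  # value as originally loaded

    def is_dirty(self):
        return self.name != self._snapshot


class ToySession:
    """Toy unit of work: identity map plus optional flush-before-query."""

    def __init__(self, auto_flush=True):
        self.identity_map = {}  # id -> Entity, constant-time lookup
        self.auto_flush = auto_flush
        self.dirty_checks = 0   # number of entities inspected so far

    def load(self, entity_id, name):
        entity = Entity(entity_id, name)
        self.identity_map[entity_id] = entity
        return entity

    def query(self):
        if self.auto_flush:
            # Inspect every loaded entity so that pending changes would be
            # visible to the query: O(N) in the number of loaded entities.
            for entity in self.identity_map.values():
                self.dirty_checks += 1
                entity.is_dirty()
        return []  # the query result itself is irrelevant to the cost model


def checks_for(loaded, queries, auto_flush):
    """Count dirty checks for `queries` queries with `loaded` entities in memory."""
    session = ToySession(auto_flush=auto_flush)
    for i in range(loaded):
        session.load(i, "name")
    for _ in range(queries):
        session.query()
    return session.dirty_checks
```

With 10,000 entities loaded, ten queries cost 100,000 checks in this model, while the same ten queries with the flush check disabled cost none, which matches the shape of the slowdown being discussed.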

17 Aug 2009
09:44 AM

Ayende Rahien

All this stuff doesn't make it to operate slower than others.

Doesn't matter the perf implication, I mention those because you brought up selection criteria for OR/M that were nonsense.

17 Aug 2009
09:45 AM

houl

so many words about "dirty marketing games"

so many words about "useless of this comparison"

so many words about "DO got anti-ad for their own"

looks funny...

Frans, cowgaR, GiorgioG, firefly, Fred, Ayende, and others... if discussed site and its idea is useless, if this is silly, if this only leads to depreciation of DO as product - why all of you spend so many time and words here? I would just have a one-minute fun looking at ormbattle.net and then return to my work, if like you've said - "it is not real life" - then we have nothing to worry about, because we are in real life :)

But such hysteria which i can see here, makes me think, that nothing hurts like the truth... Otherwise you would not spent here so many time.

17 Aug 2009
09:46 AM

Ayende Rahien

true CUD average is ~ SingleUpdateTime * SingleUpdateProbability + MultipleUpdateTime * MultipleUpdateProbability. E.g. 20% * SingleUpdateTime + 80% * MultipleUpdateTime.

That is extremely inaccurate. Care to name a single real world scenario where you need to update/insert 20K of records that isn't an import task?

17 Aug 2009
09:47 AM

Alex Yakunin

Agree. Ayende, all you just wrote about materialization is lie. Or you don't study the test code well. 1 min, writing reply.

17 Aug 2009
09:54 AM

Ayende Rahien

Alex,

Let me get this straight, are you telling me how NHibernate works?

I am willing to admit that I may not have the best knowledge about it, but I do believe that 6 years of working extensively with it does carry with them the capacity to understand how NHibernate operates.

17 Aug 2009
09:56 AM

Alex Yakunin

So about materialization test for NH:

Look up the last method here: code.ormbattle.net/ (I paste the full code, not just a piece of it).

As you see, it may really stop "in the middle". On the other hand, if you take a look on its invocation: code.ormbattle.net/ , you'll find it's always invoked for the same count of entities as this query must return.

So it doesn't matter if there is lazy materialization or not. DOES NOT MATTER AT ALL.

To prove this, I paste a single query from SQL profiler it runs per each measurement:

SELECT this_.Id as Id6_0_, this_.Value as Value6_0_ FROM Simplest this_

Any more ideas why it's so slow?

17 Aug 2009
09:57 AM

Alex Yakunin

Let me get this straight, are you telling me how NHibernate works?

Oh, I like this phrase :) LOL, but yes.

17 Aug 2009
10:06 AM

Alex Yakunin

As you see, we've just especially checked this in SQL Profiler.

Care to name a single real world scenario where you need to update/insert 20K of records that isn't an import task?

Can you stop talking about 20K entities? We tested everything for many cases: code.ormbattle.net/

Scoreboard page reflects results for 1K entities.

Moreover, I precisely explained why big numbers are good here: ormbattle.net/.../...ur-tests-are-unrealistic.html

And concerning the batching: DO4 batches CUD updates using 25 statement buckets. Is 25 statements a good number for you? Results for such sequences will be nearly the same. I tell you once more, forget about 20-30K.

Agreed

This was about "But such hysteria which i can see here, makes me think, that nothing hurts like the truth... Otherwise you would not spent here so many time."

17 Aug 2009
10:08 AM

Alex Yakunin

you brought up selection criteria for OR/M that were nonsense.

Yep. They're so "nonsense" that immediately exposed at least two issues related to NH?

17 Aug 2009
10:11 AM

Alex Yakunin

State search is O(1), but NHibernate manages query persistence transparently. What this means is that the following code will work with NHibernate.

Nice. DO has supported this stuff since v1.0. I'm curious, did you suspect DO4 or EF doesn't implement this?

Do they expose the same 100 times slowdown as NH on big numbers?

Ok, but now it's clear why it's so slooow. NH traverses the whole object graph before each query to find out what's changed, yep?

If so, is there any way to disable this for this test? Actually I'd prefer to get the results without this "feature", if this is possible.

17 Aug 2009
10:15 AM

Frans Bouma

Houl: "Otherwise you would not spent here so many time."

You don't get it. The site pretends it shows actual real life measurements, or better: the measurements can be used as real life figures. If not, the site is useless to begin with.

However the 'real life' measurements on the website are not real life measurements, but artificial tight loops.

Furthermore, if Alex would have removed us saturday, and had simply corrected the nhibernate code as he should, things wouldn't have gotten so dramatic.

Alex: (FK-PK compares) "To be fixed. We support this, but likely, in many tests it was written as order.Customer==customer, that isn't supported by you."

And by EF.

Linq tests should focus on what statements are supported and what not, should use normal C# keyword syntax and should come with a huge disclaimer: the linq scope is so huge that anyone can write 1000 queries which fail on any provider, also yours (as yours fails on 2, which means if I re-use that failing aspect in 100 queries you won't succeed in 1 single query). But, gee, you didn't do that. Just lucky? (while we were unlucky because you didn't use a single fk-pk compare?), I doubt it.

Houl, that's what this is about. You don't have to agree, I just hope you understand now.

To all, I also hope we all realize the more attention this marketing campaign is given, the more it proves to be effective: giving attention to DO. /ignore is a better alternative.

17 Aug 2009
10:20 AM

fanny

I admire much both Ayende and Frans but is not for this that I will not consider others opinion if they have valid arguments (just if they are exposed in polite and fair manner).

It's also true that I respect more arguments coming from the people that really invested all their careers on the ORM field from long time and already answered long time ago some questions users like me are could have now.

So in some way the I trust their choice.

So let me recap to people that read down to here (as we do at the 102 episode of a soap opera).

1) Alex and others from an ORM vendor started a site called "ormbattle.net" that as the name implies was supposed to compare the performance of different ORM vendors in the .Net space.

Staying in "war" terms this was perceived more like an unfair invasion of others territories without previous war declaration.

2) The tests were unfair and did not compare in the right way.

Both Ayende and Frans seems to have proved that.

The fact that the site was coming from one of the vendor let many of us suppose that was a marketing site more than a fair battle.

This statement was contested by Alex that said was willing to arrive to the removal of their own product to prove that and to give the site to the community.

First suggestion: do it immediately to prove your fairness.

3) On the site the author's connection with one of the contestant (the winner) was not that self evident.

Second suggestion: until you do not release to the community put the DataObject author's origin more evident on the home page,like in the first lines (I mean just after "ORM Comparison Goals").

4) Admittedly by Alex, he likes to argue in a bit aggressive manner ("I like to argue - it's a kind of sport for me") and the comments degenerated pretty quickly from both parts and all ended up "I'm smarter than you..." stuff.

5) Indeed there is an ongoing technical discussion between the lines that is interesting and it is worth to continue debating on that in a polite and cooperative manner.

6) It seems very hard to do a real comparison of different products but a more "real-life" test application would be doable.

The quality of a product cannot be measured only on batch statements but also on the features it provides in the real world.

7) A site that is supposed to compare products in fair mode would also let users to post comments and have their opinion showed up.

This fairness would be proved also from the immediate removal of Ayende statements from the FAQ (they are still there) as the LLBLGen mentions (still there too).

I, as a user, would much like a site where Orm are compared by their features and progress together in a cooperative manner.

And in honor to Frans I must also say that there is something that is hardly measurable with unit test but it is really important for choosing a product and this is the quality the support a product has.

From my experience, if we could measure this, LLBLGen would beat all the others hands down.

17 Aug 2009
10:20 AM

Ayende Rahien

Alex,

You don't understand how NHibernate is working, that is the issue.

On the first iteration on a Linq query we are loading all the results into memory, then give you a List of them.

It doesn't matter if you iterate only on a few, NHibernate doesn't do lazy materialization.

Your code is loading ALL rows into memory, convert them to objects, but only iterate over a few of them.

The iteration part is _free_.

You are making assumptions that are simply invalid for NHibernate, and then trying to draw conclusions from those.

GIGO
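
The eager versus lazy materialization distinction in this exchange can be made concrete with a small model. This is a Python sketch under stated assumptions, not a description of how any particular ORM is implemented; `fetch_rows`, `eager_query`, `lazy_query` and `consume` are hypothetical names, and the rows are made-up data.

```python
def fetch_rows(n):
    """Stand-in for a result set of n rows (hypothetical data)."""
    return [(i, i * 10) for i in range(n)]


def eager_query(rows, counter):
    """Build every object up front and hand back a complete list."""
    result = []
    for row in rows:
        counter["built"] += 1
        result.append({"id": row[0], "value": row[1]})
    return result


def lazy_query(rows, counter):
    """Build each object only when the caller asks for the next one."""
    for row in rows:
        counter["built"] += 1
        yield {"id": row[0], "value": row[1]}


def consume(query, rows, take):
    """Iterate over the first `take` results and report how many objects were built."""
    counter = {"built": 0}
    for index, _ in enumerate(query(rows, counter)):
        if index + 1 >= take:
            break
    return counter["built"]
```

Stopping after 10 of 1,000 rows builds 1,000 objects in the eager model and 10 in the lazy one; iterating over all 1,000 builds 1,000 objects in both. Which of those two cases the benchmark actually exercises is the question the two sides disagree on in the surrounding comments.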

17 Aug 2009
10:20 AM

Alex Yakunin

And by EF.

That doesn't matter. No one is ideal. You're just thinking about how to be no worse than EF? Then it's your choice.

As I've mentioned, we think it's good to test this because:

If you compare keys, you're "binding" your queries to underlying entity & key structure, that's obviously bad. If you'll change this some day, you must change all the queries.
It's simply more readable.
And finally, LINQ to objects allows this. So we think it's reasonable to test this.

17 Aug 2009
10:23 AM

Ayende Rahien

Alex,

A quick peek at the NHibernate documentation, Flush, will show you the answer.

session.FlushMode = FlushMode.CommitOnly;

17 Aug 2009
10:23 AM

Alex Yakunin

Your code is loading ALL rows into memory, convert them to objects, but only iterate over a few of them. The iteration part is _free_.

Wrong again. Our code iterates _all of them_, thus it doesn't matter if iteration is free or not.

That's why we included query execution time in this test: if we didn't do this, frameworks such as NH would get an unreasonably high score on it, because they do all the dirty work (materialization) on execution and afterwards just iterate the list.

17 Aug 2009
10:26 AM

Alex Yakunin

You are making assumptions that are simply invalid for NHibernate, and then trying to draw conclusions from those.

Do you see assumptions aren't wrong?

Concerning session.FlushMode = FlushMode.CommitOnly - we'll add this to test. But as you see, this isn't obvious (other frameworks don't expose the same behavior on queries, yep?). Thanks a lot for the help here.

17 Aug 2009
10:27 AM

houl

Frans: "You don't get it. The site pretends it shows actual real life measurements ... "

So what? :) i know that there are a lot of sites, that pretend that real men have at least 25 cm. If i would try to bring them the truth about norma=1x cm, it would take all my life and i suppose without any success :)

How many there were comparisons of intel and amd - with a lot of discussions about stream calculations "good" for intel and game-like calculations, suitable for amd? About NVidia and Radeon - almost the same story, just about shaders calculations precision...

When we speak about Internet, i prefer position "Do not like? Just ignore" and recommend it to anybody :)

17 Aug 2009
10:28 AM

Ayende Rahien

Alex,

I really don't know how to try to answer that.

The "test" that you have is loading all objects to memory. You pay for reading the entire table for each loop iteration.

I am not seeing how it is related to the iteration numbers, they are not meaningful.

What is meaningful is that you load the entire table each and every time.

17 Aug 2009
10:29 AM

Ayende Rahien

Alex,

NHibernate contains a lot of stuff that can be used. Expecting to pick them up without reading the documentation is... unrealistic.

17 Aug 2009
10:35 AM

Alex Yakunin

What is meaningful that you load the entire table each and every time.

Let's look on it closer again:

Instance count: performance (op/sec)

100: 14267

1000: 16399

5K: 16884

10K: 16650

30K: 16437

So as you see, peak performance is @ 5K entities. The whole table.

Do you see it is lower for 100 entities? Do you understand it will be much lower for 10 entities (remember, we include query exec. time)?

17 Aug 2009
10:37 AM

Alex Yakunin

Ok, so what I want to say: since we're showing maximal possible performance (i.e. upper limit), the test we run here is definitely ok. I agree there can be some mistakes, and I think that's what we must discuss.

17 Aug 2009
10:41 AM

Alex Yakunin

But your attempt to say "everything is wrong" is nothing more than LOL - especially with _such results_.

I'd seriously think about this, if guys from EF team said the same (because they lead e.g. on materialization test, + show good LINQ support). But telling us "the test is wrong" and simultaneously being the worst player there... Do you see this is a bit ironical?

17 Aug 2009
10:41 AM

Ayende Rahien

Alex,

Your numbers are close enough together to be statistically meaningless.

17 Aug 2009
10:42 AM

Alex Yakunin

Sorry, of course, not the worst one... But anyway.

17 Aug 2009
10:43 AM

Alex Yakunin

Your numbers are close enough together to be statistically meaningless.

ARE YOU HUMAN AT ALL?

5 ALMOST IDENTICAL NUMBERS ARE "statistically meaningless"???

17 Aug 2009
10:44 AM

Alex Yakunin

Ok, sorry, I'm just a bit emotional now ;)

17 Aug 2009
10:58 AM

sigh

Alex, your a tool.

17 Aug 2009
11:03 AM

Alex Yakunin

But, gee, you didn't do that. Just lucky? (while we were unlucky because you didn't use a single fk-pk compare?)

Well, not "just lucky", of course. We have a large set of LINQ tests (~ 700), and most of these queries were taken from those tests and adapted to the Northwind model. So our LINQ tests aren't ideal from this point of view, but on the other hand, we didn't want to use standard tests like "101 LINQ Samples" - most players would succeed there.

Anyway, we're ready to add any tests you'd like. Just say what exactly must be added.

P.S. We just added PK-FK comparison tests. LLBLGen got few more scores. These tests pass for DO4 & EF as well.

17 Aug 2009
11:04 AM

cowgaR

@Alex:

__
Tested version of any framework is mentioned in its own page.

See e.g. http://ormbattle.net/index.php/subsonic.html

let me educate you a bit. Subsonic 3.0 (current public version is 3.0.0.3) has a LINQ provider.

It might not be the best there is, but it certainly is better than a few of the contestants in your "test" (both NH and Lightspeed will have much better LINQ providers soon).

So how it is possible Subsonic scored 0 in LINQ test?

Thank you for honest answer.

17 Aug 2009
11:07 AM

cowgaR

__Subsonic got zero score on this sequence because it does not support references (thus we were unable to compile the test for it).

oh I see, it is on the page (I thought it would be written somewhere further down in very small (or even smaller) letters, which would be more appropriate ;-) but "customers" are lured by the numbers in the table, where you should at least put an asterisk!

at least! not to mention this statement is a joke...

17 Aug 2009
11:08 AM

Alex Yakunin

Alex, your a tool.

Well, it depends. Yep, I've spent lots of time here yesterday and today. But have you seen how many people are @ ORMBattle.NET? "We have 27 guests and 4 members online" right now.

Anyway, I always prefer to spend additional time but make everything fully clear.

17 Aug 2009
11:14 AM

Alex Yakunin

So how it is possible Subsonic scored 0 in LINQ test?

Yes, we've been testing 3.0.0.3 - there is a mistake on its page.

The problem we've faced with it is that it doesn't support references. I.e. I can't declare order.Category field. I can declare only order.CategoryID. But we expect support of reference properties is a "default" feature.

That's why we couldn't get them compiled. We thought about implementing such reference properties as:

public Category Category { get { throw ...; } }

just to make the tests compile & run. But for now we haven't achieved this.

Could you recommend something to fix this?

Here is our current LINQ test code: code.ormbattle.net/ (T4 template that renders the test for any ORM).

17 Aug 2009
11:16 AM

Alex Yakunin

Subsonic got zero score on this sequence because it does not support references (thus we were unable to compile the test for it).

I just explained, why. We're fully ready to fix this ASAP (e.g. right now or tomorrow).

17 Aug 2009
11:21 AM

firefly

"so many words about..." Most of the words come from one shameless person that don't know when to stop. Clearly he is unfit in the development world. I think he would do well in Washington :). That might be a better fit for him.

I, and probably many other silent observers, knew from the outset the outcome of this narrowed discussion. Benchmarks like this are useless, as Oren said at the beginning of this post. Discussions like this are a waste of time because they get people to focus on a few irrelevant points. Still, it was necessary for Oren and Frans to step up and point out what was wrong, since Alex doesn't know when to stop.

"But as you see, this isn't obvious (other frameworks don't expose the same behavior on queries, yep?)." See how he play dumb? Ain't that cute? :) Apparently what obvious to him is his own framework.

Anyway I rest my case. All that needed to be said have been said in the first few posts. All the other post are just there to point out the obvious that obviously wasn't so obvious to one person.

17 Aug 2009
11:29 AM

Frans Bouma

"> And by EF.

That doesn't matter. No one is ideal. You're just thinking about how to be not worse than EF? Than it's your choice."

No, I was referring to the fact that it is convenient for you to use only entity comparisons as you knew they would all fail on EF.

So the tests didn't do:

I have a requirement (fetch X filtered on Y) and implement that in the way the framework supports and test that (as the user would do)

but you did:

I want to use ABC, and if framework X doesn't support that, all the better.

That's not testing real life stuff, that's just making deliberate choices so others would suck. Of course you don't agree, that seems to be in your DNA (remember years ago you argued my framework was crap because using a zip tool would compress the framework much more than yours? :D)

for the people who have no idea what I mean:

// entity comparison:

var q = from o in ctx.Order where o.Customer==c select o;

// vs. fk-pk comparisons

var q = from o in ctx.Order where o.CustomerId ==c.CustomerId select o;
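To make the two query styles concrete, here is a small sketch run against in-memory collections (LINQ to Objects) rather than an ORM. The `Customer` and `Order` classes and the sample data are assumptions for illustration; whether a given ORM can translate the first form to SQL is exactly what is being argued about.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

class Customer
{
    public string CustomerId;
}

class Order
{
    public int OrderId;
    public string CustomerId;   // foreign key value
    public Customer Customer;   // reference property
}

class Program
{
    static void Main()
    {
        var alfki = new Customer { CustomerId = "ALFKI" };
        var bonap = new Customer { CustomerId = "BONAP" };
        var orders = new List<Order>
        {
            new Order { OrderId = 1, CustomerId = "ALFKI", Customer = alfki },
            new Order { OrderId = 2, CustomerId = "BONAP", Customer = bonap },
            new Order { OrderId = 3, CustomerId = "ALFKI", Customer = alfki },
        };
        Customer c = alfki;

        // Entity comparison: compares the referenced object itself.
        var byEntity = from o in orders where o.Customer == c select o.OrderId;

        // FK-PK comparison: compares the key values.
        var byKey = from o in orders where o.CustomerId == c.CustomerId select o.OrderId;

        Console.WriteLine(string.Join(",", byEntity)); // 1,3
        Console.WriteLine(string.Join(",", byKey));    // 1,3
    }
}
```

In memory both forms select the same orders; the difference only shows up when a provider has to translate the expression.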

"- And finally, LINQ to objects allows this. So we think it's resonable to test this."

So, did you include some tests using Reverse(), SequenceEqual, Aggregate, ElementAt( Oh, we actually support that one), etc. ? No? why not? Linq to objects seem to do just fine.

"That's why we included query execution time into this test: if we won't do this, just frameworks as NH would get unreasonably high score on it, because they did all the dirty job (materialization) on execution, and further just iterated the list."

Are you saying that your framework keeps open the datareader during enumeration? I find that hard to believe.

@Houl: "So what? :) i know that there are a lot of sites, that pretend that real men have at least 25 cm. If i would try to bring them the truth about norma=1x cm, it would take all my life and i suppose without any success :)"

haha :D Ok, good points.

"P.S. We just added PK-FK comparison tests. LLBLGen got few more scores. These tests pass for DO4 & EF as well."

I doubt EF will succeed, they don't support FK fields. I also don't see how testing our code is still relevant, you are in the process of removing our name/code/results, remember?

""We have 27 guests and 4 members online" right now."

You want to educate a lot of people how the state of O/R mapping is today on .NET? I then would not be satisfied with 27 guests. Let me put it this way: one post on this blog or for example on my blog will reach thousands of readers immediately and many thousands of readers in the following days through google searches.

Even though the numbers are low, I still would like to see you putting less effort in rehashing the same statements but to remove us from your marketing campaign. That's this site, and also 'LLBLGen' as keyword in adsense on google. Look at it this way, Alex: Oren and I don't need competitors' names, their benchmark results or other stuff to explain to users what we have to offer. You apparently have to. That might suggest you are doing something erm... wrong. (and do remember, you're on the market for many years already. Lack of time isn't really the case)

17 Aug 2009
12:19 PM

Alex Yakunin

See how he play dumb? Ain't that cute? :) Apparently what obvious to him is his own framework.

Let me remind you of the whole conversation:

1) We noticed NHibernate query performance degrades with the count of fetched entities in our test.
2) Ayende said "it's OK that default NHibernate query cost has O(CountOfFetchedEntities) factor, because it traverses the whole graph"
3) I said "Ok, can I do something to get rid of this? Note that this isn't obvious."
4) He answered "yes", and published a solution.
5) I said we'll fix this then.

So what do you mean by saying I'm playing dumb? If you mean this is obvious, can you name any of the frameworks tested above that exposes the same behavior?

If you're saying this is an "expected feature", but don't name where else it can be found, well... That's what I'd call shameless.

17 Aug 2009
12:29 PM

Alexis Kochetov

Oren, we're trying to implement the IStatelessSession pattern within our performance tests and faced an issue: it's impossible to use LINQ queries with it. There is no overload for the Linq<T>(this ISession) method that accepts IStatelessSession; IStatelessSession does not implement ISession.

Is there any workaround?

17 Aug 2009
12:30 PM

Alex Yakunin

I want to use ABC, and if framework X doesn't support that, all the better.

No, we didn't. Our goal was to create ~ the same LINQ test for all ORM tools. And if some ORM (e.g. Subsonic) requires a specific model & test, we couldn't handle this so far. Now we're trying to fit the needs of everyone (i.e. there will be a more complex T4 template).

Anyway, LLBLGen got almost nothing because of it.

Btw, crying about "they frequently use extension methods instead of C# syntax, and that's why nothing works there " on your web site just proves the fact that 1 step aside from very basic queries leads to a failure.

EF supports everything. We do too. I'd expect that you, as the author of a long series of articles on LINQ providers, simply MUST provide an HQ implementation of one.

17 Aug 2009
12:33 PM

Alex Yakunin

Are you saying that your framework keeps open the datareader during enumeration? I find that hard to believe.

Yes, DO4 does this for MSSQL (i.e. utilizes MARS), although this is provider dependent.
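The contrast with eager loading can be sketched with a C# iterator. `StreamRows` is a hypothetical stand-in for a provider that keeps its reader open and produces objects on demand; it is not DO4 code and does not use MARS, it only illustrates deferred materialization.

```csharp
using System;
using System.Collections.Generic;

class Program
{
    // Counts how many rows were turned into objects.
    static int materialized = 0;

    // Hypothetical streaming query: each row is turned into an object
    // only when the consumer asks for it (deferred execution via yield).
    static IEnumerable<string> StreamRows(int rowCount)
    {
        for (int i = 0; i < rowCount; i++)
        {
            materialized++;
            yield return "row " + i;
        }
    }

    static void Main()
    {
        int seen = 0;
        foreach (string row in StreamRows(5000))
        {
            seen++;
            if (seen == 3) break; // stop early
        }

        // Only the rows that were actually consumed were materialized.
        Console.WriteLine("materialized: " + materialized); // materialized: 3
    }
}
```

With this shape, timing only the enumeration would include materialization, whereas for an eager provider it would not, which is why the two need to be measured over the same span.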

17 Aug 2009
12:37 PM

Alex Yakunin

So, did you include some tests using Reverse(), SequenceEqual, Aggregate, ElementAt( Oh, we actually support that one), etc. ? No? why not? Linq to objects seem to do just fine.

No, we didn't. If some LINQ feature requires an implementation with unexpectedly high execution cost, obviously it is better to say "not supported here".

As you see, this isn't related to reference properties.

Btw, we don't support ElementAt ;) Will be added to tests.

17 Aug 2009
12:39 PM

Alex Yakunin

I then would not be satisfied with 27 guests. Let me put it this way: one post on this blog or for example on my blog will reach thousands of readers immediately and many thousands of readers in the following days through google searches.

I think it's ok to have that number of readers during the first working day. Frans, I'm aware your blog is very popular, as well as Oren's ;)

So I think this is a good result for us ;)

17 Aug 2009
12:43 PM

Alex Yakunin

30 simultaneous visitors = ~ 2K visitors per day (just extrapolation). So that's ok.
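The extrapolation depends on how long a visit lasts. A minimal sketch of the arithmetic, assuming steady traffic over 24 hours and using Little's law (concurrent visitors = arrival rate × average visit length); the 5-minute figure is an arbitrary example, not a measurement.

```csharp
using System;

class Program
{
    // Little's law rearranged: arrivals per day = concurrent * minutesPerDay / visit length.
    static double VisitorsPerDay(double concurrent, double avgVisitMinutes)
    {
        const double minutesPerDay = 24 * 60;
        return concurrent * minutesPerDay / avgVisitMinutes;
    }

    static void Main()
    {
        // Average visit length implied by "30 simultaneous = ~2K per day":
        // 30 * 1440 / 2000, about 21.6 minutes.
        double impliedMinutes = 30 * 24 * 60 / 2000.0;
        Console.WriteLine(impliedMinutes);

        // With 5-minute visits the same 30 concurrent visitors would mean 8640 per day.
        Console.WriteLine(VisitorsPerDay(30, 5));
    }
}
```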

17 Aug 2009
12:48 PM

Alex Yakunin

'LLBLGen' as keyword in adsense on google.

Do you know it's ok to use competitor's name in many many countries? Check out their trade name policy.

Comparison-based advertisement is legal in many countries as well (likely, the main reason for its legalization was that it is good for customers). Just FYI.

17 Aug 2009
12:54 PM

Frans Bouma

"Btw, crying about "they frequently use extension methods instead of C# syntax, and that's why nothing works there " on you web side just proves the fact that 1 step aside from very basic queries lead to a failure. "

based on what does that prove anything besides your rich imagination? I have hundreds of groupby queries which all work in many complex situations, yet none of your queries work. Apparently some oversight, as groupby() is very tricky to implement and a code path you use isn't optimal, HOWEVER, in real life who writes the group by queries in linq with extension methods? Almost no-one.

If you are saying we only support very basic queries, you are flat out lying. e.g.

var q = from o in metaData.Order

    group o by new { CombinedName = o.Customer.City + o.Customer.Country } into g

    select new { Key = g.Key , Sum = g.Sum(o => (o.OrderDetails.Sum(od => od.Quantity * od.UnitPrice))) };

runs fine.

How many queries do you run with this:

var q = from e in metaData.Employee

    select new

    {

        FooId = e.EmployeeId,

        OtherId = e.ReportsTo,

        NestedElements = from o in e.Orders

                         where o.OrderDetails.Any()

                         select new

                         {

                             BarId = o.OrderId,

                             OtherId = o.EmployeeId,

                             Cnt = o.OrderDetails.Count()

                         }

    };

var q = from c in metaData.Customer

    join o in 

        (

            from order in metaData.Order

            group order by order.CustomerId into g

            select new { CustomerId = g.Key, Frequency = (int?)g.Count()}

        ) on c.CustomerId equals o.CustomerId into co

    from v in co.DefaultIfEmpty()

    where c.Country=="USA"

    select new { c.CustomerId, Frequency = v.Frequency??0};

runs fine.

"EF supports everytihng. We too. I'd expect you're, as author of long cycle of articles related to LINQ provider, simply MUST provide an HQ implementation of it."

EF supports everything? No, it doesn't support contains and some other extension methods. Writing 100 of these queries will score them a 0.

As I've said earlier, your queries showed some little bugs (3) which likely tripped up a lot of the tests to fail, mostly due to code paths which aren't used that much if at all. We'll fix them this week. Not that it matters, your '100' number is subjective and misleading anyway.

17 Aug 2009
13:30 PM

Frans Bouma

"Do you know it's ok to use competitor's name in many many countries? Check out their trade name policy.

Comparison-based advertisement is legal in many countries as well (likely, the main reason for its legalization was that it is good for customers). Just FYI."

yes, I know. I was just asking a question. That you are not willing to remove our brandname as word in adsense, says everything about you and your moral standards. The only other company who had similar tactics and moral standards was Vanatec.

17 Aug 2009
14:11 PM

Set

Using other brand names should be done with caution.

If I recall well, that website is not and far from it.

About google ads, they remove them on request if I recall well else it'd put google open to litigation.

17 Aug 2009
14:18 PM

Ayende Rahien

Alexis,

No, Linq queries don't work with IStatelessSession at the moment.

Off the top of my head, I can't think of a reason that they cannot be made to work.

Writing an equivalent for IStatelessSession would probably just mean copying the Linq(this ISession) method for IStatelessSession.
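The "copy the extension method" idea can be sketched as follows. `IToySession`, `IToyStatelessSession` and the in-memory implementation are hypothetical stand-ins, not NHibernate's interfaces or its Linq provider; the sketch only shows that a second overload with a different receiver type is enough for the call to compile.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-ins; these are NOT NHibernate's interfaces.
interface IToySession { IEnumerable<T> LoadAll<T>(); }
interface IToyStatelessSession { IEnumerable<T> LoadAll<T>(); }

static class ToyLinqExtensions
{
    public static IQueryable<T> Linq<T>(this IToySession session)
    {
        return session.LoadAll<T>().AsQueryable();
    }

    // The "copy": same body, different receiver type.
    public static IQueryable<T> Linq<T>(this IToyStatelessSession session)
    {
        return session.LoadAll<T>().AsQueryable();
    }
}

class InMemoryStatelessSession : IToyStatelessSession
{
    private readonly List<object> items = new List<object> { 1, 2, 3, "x" };

    public IEnumerable<T> LoadAll<T>() { return items.OfType<T>(); }
}

class Program
{
    static void Main()
    {
        IToyStatelessSession session = new InMemoryStatelessSession();
        var big = from n in session.Linq<int>() where n > 1 select n;
        Console.WriteLine(string.Join(",", big)); // 2,3
    }
}
```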

17 Aug 2009
14:21 PM

Ayende Rahien

ARE YOU HUMAN AT ALL?

That is... an interesting debating tactic.

Please be aware that continuing in this vein will get you blocked, I am willing to have a debate, I am NOT willing to have mud slinging in my blog.

5 ALMOST IDENTICAL NUMBERS ARE "statistically meaningless"???

You don't get the meaning of statistically meaningless?

Let me put it this way, you have two numbers, N and R.

N is the number of items you work on.

R is the number of operations per second.

If for wildly varying values of N the value of R is pretty much the same, it means that there is no correlation between N and R.
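The figures posted earlier in this thread can be checked directly. This sketch assumes one "op" is one materialized entity, so N / R is the time of one pass over N entities; it prints the implied time per pass, the spread of R, and the Pearson correlation between N and R.

```csharp
using System;
using System.Linq;

class Program
{
    // Pearson correlation coefficient of two equally long samples.
    static double Pearson(double[] x, double[] y)
    {
        double mx = x.Average(), my = y.Average();
        double cov = 0, vx = 0, vy = 0;
        for (int i = 0; i < x.Length; i++)
        {
            cov += (x[i] - mx) * (y[i] - my);
            vx += (x[i] - mx) * (x[i] - mx);
            vy += (y[i] - my) * (y[i] - my);
        }
        return cov / Math.Sqrt(vx * vy);
    }

    static void Main()
    {
        // Instance counts and op/sec as quoted in the thread.
        double[] n = { 100, 1000, 5000, 10000, 30000 };
        double[] r = { 14267, 16399, 16884, 16650, 16437 };

        for (int i = 0; i < n.Length; i++)
        {
            double ms = n[i] / r[i] * 1000.0;
            Console.WriteLine("N={0}: R={1} op/sec, implied time per pass = {2:F1} ms", n[i], r[i], ms);
        }

        Console.WriteLine("min/max R = {0:F2}", r.Min() / r.Max());
        Console.WriteLine("Pearson(N, R) = {0:F2}", Pearson(n, r));
    }
}
```

A roughly constant R means the time per pass scales with N, which is the point both sides end up agreeing on.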

17 Aug 2009
14:27 PM

alwin

Thanks fanny for the soap recap :)

17 Aug 2009
14:28 PM

Dinesh Gajjar

Another product trying to work up its sales via attacking well named brands :). We had this with nCover a few days ago and now this.

Looks like nHibernate is becoming new target like what Microsoft used to be :)

Alex, do you comment on blogs for living ? Get a life. Anyone with SANE intelligence is not going to believe your broken english :)

Ayende, I think it's best to close this entry for comments, this will put guys like Alex out of job :)

17 Aug 2009
15:25 PM

Alex Yakunin

That you are not willing to remove our brandname as word in adsense, says everything about you and your moral standards. The only other company who had similar tactics and moral standards was Vanatec.

That's not about moral standards. That's just about advertisement. It's legal, and I really think it's ok. I'd accept the same activity against any of our own brands.

17 Aug 2009
15:29 PM

Alex Yakunin

If for wildly varying values of N the value of R is pretty much the same, it means that there is no correlation between N and R.

Exactly ;) Obviously, I didn't mean number there won't be significantly different from 14K-16K, moreover, likely it will be nearly the same (= nearly as bad as it was for 1K, 5K and so on). So "no correlation" is quite meaningful here.

17 Aug 2009
15:32 PM

Alex Yakunin

Not that it matters, your '100' number is subjective and misleading anyway.

Frans, I agree it's very subjective. We've made really complex test suite.

But isn't it good? Just think about this. E.g. from the point of customers.

P.S. I'm removing DO from this web site. LLBLGen is also leaving it.

17 Aug 2009
15:33 PM

Alex Yakunin

For those who are interested: there is one more thread related to NH benchmarks: ayende.com/.../benchmark-cheating-a-how-to.aspx

17 Aug 2009
15:33 PM

Ayende Rahien

Alex,

do you understand why there is no correlation?

That's not about moral standards. That's just about advertisement. It's legal,

And do you understand the difference between moral and legal?

17 Aug 2009
15:36 PM

Alex Yakunin

Just think about this. E.g. from the point of customers.

I mean if such test exists, any player there may easily score its own LINQ implementation.

Yes, it's complex to get 100 out of 100 there. Moreover, I think we must modify the test in such a way that no one will be able to get it. Like with 3D benchmarks: getting topmost results must be almost impossible.

Finally, we're ready to change these tests in the way community will suggest.

17 Aug 2009
15:37 PM

Alex Yakunin

And do you understand the difference between moral and legal?

Yes. I think using competing brands in ads is moral and legal. Because it's good for competition => good for customers.

17 Aug 2009
15:38 PM

Alex Yakunin

do you understand why there is no correlation?

I just agreed with you there is _no correlation_, but mentioned exactly this is meaningful (= you have ~ constant operation ratio).

17 Aug 2009
15:43 PM

Alex Yakunin

I think using competing brands in ads is moral and legal.

Err... I mean "must be legal worldwide". Just IMHO, of course.

I understand this depreciates the brands, but on the other hand, this allows new ones to grow up faster => more intensive competition.

I have nearly the same opinion on many other points, e.g. patents (know-how is ok, but patents are not, at least in their current state). So generally if something is good for the competition & morally ok, it must be legal.

But that's another story...

17 Aug 2009
16:07 PM

Frans Bouma

"I understand this depreciates the brands, but on the other hand, this allows new ones to grow up faster => more intensive competition."

You're on the market since 2003? or somewhat around that. Since that day, you've gained some marketshare but eventually lost it.

If I may, you mainly lost it due to the lack of progression in your own library (the 3.x branch) in favor of some super-duper framework you've been working on for a long time (v4).

You won't get it back. Not by a long shot. Let me explain: in the past 5-6 years, numerous frameworks have seen the light and have died off, just a few remained. That's logical market progression: several big players (2-3) fight for the top spot, the rest has marginal market share (<10%). Before MS released their frameworks, it was open, but you left more or less the market. MS released their frameworks, and no matter what you think of it, in the end there only will be the following frameworks:

Entity framework
Some open source offering which does things differently. Best cards are in the hands of nhibernate, but their linq provider has to be better.

The rest will die or will have marginal market share. The simple reason for this is that the EF has millions of funding and a very large team of developers, and in the end they will get their act together and fix all the problems in the framework. It's not a competition really, they're included in .NET, you don't have to buy a framework from a 3rd party. If you don't agree with how the EF works, you can use an oss offering, also free, e.g. nhibernate. NHibernate will too in the end implement every single feature there is to implement, simply because all other problems are already solved.

that's also why we didn't decide to compete on framework features, but on the designer level. MS will never catch up at that level and we already have a big advantage in that area.

So where does that leave you? You might have a big framework with lots of features, and perhaps on paper you look better than whatever is out there, but frankly, it doesn't really matter: the people will choose what everybody else is using, what the known brands are. Some small group will pick a small player, but that's it. And you see that happening already today.

I'm not saying this to make you feel bad, I'm not that kind of person, trust me. The reason I explain this to you is that you might, even a little, get a bit of insight in what is happening today in the O/R mapping/data-access market and how the situation is and where you made your serious mistake. No low-level marketing tactic with competitor brandname advertising, rigged benchmarks and other crap will help you with that, markets don't work that way.

Alex, believe in what you can do yourself, instead of basing your success on the downfall of others. The only way to succeed is to take your own destiny in your own hands. Making others look bad in some 'test'/battle/benchmark isn't helping you one bit: it won't get you the marketshare you need to survive against Redmond, as you compete on the framework level, which is a lost race.

17 Aug 2009
17:06 PM

Alex Yakunin

Frans, I clearly understand that's quite difficult to get any significant share now. And all the things you wrote are absolutely correct.

But, as you might find, we're looking a bit further than other ORM vendors: in fact, we ship a database integrated with an ORM tool. Not sure if you know this - DO4 is capable of compiling queries against its own storage (with open API) & executing them there. Yes, this part isn't fully ready yet - e.g. this works just in RAM for now. But it works as it must work within a fully featured database almost everywhere (i.e. there is index statistics, query optimizer relying on it & so on).

Why I think it's important?

MS will never do the same. They sell SQL Server very well, so there are no any reasons to work on such a tight integration there.
Can you imagine how they'll deliver it to e.g. Silverlight? We can, because our engine is fully managed. Do you know any database working @ Silverlight?
Our final AIM is to provide a transparent .NET API for any relational storage. Our own or not - it does not matter. SQL or not - again, this does not matter. E.g. technically we can support BerkeleyDB (no queries except index operations) or Azure Table Services.

So why we focus just on ORM part now? I clearly see we can't market a solution with integrated database right now. We must prove everything works with regular, well-known databases.

Further this must allow us to propose the people using DO4 to use our integrated database (e.g. instead of slow SQLite). And certainly, we'll work on support of such platforms that others simply can't support so easily: Silverlight, Azure, Amazon Simple DB & so on.

That's why we care so much about full transparency: you can do absolutely the same things on our own storage as on any other. Schema upgrade, queries - everything is the same. Sync feature must be a last chain binding all this stuff together.

But as you see, for now we're showing just basic stuff. And I bet it's visible it works well ;)

17 Aug 2009
17:09 PM

Alex Yakunin

"AIM" -> aim, LOL :) I don't use AIM ;)

Many mistakes, but I hope I illustrated our position.

17 Aug 2009
17:13 PM

Alex Yakunin

If I may, you mainly lost it due to the lack of progression in your own library (the 3.x branch) in favor of some super-duper framework you've been working on for a long time (v4).

Yes, you're right. Frankly speaking, I was more than happy this summer, because a HUGE machine, on which we've spent about 2 years, has finally started to move. Of course it's clear the most complex part is what happens now. But until you see the results, it's really difficult to motivate people & stay motivated yourself...

So I wish you reach the same point with LLBLGen 3 faster. For me that was really a kind of dream ;) Maybe that's why I'm a bit nervous protecting it ;)

17 Aug 2009
20:04 PM

Alex Yakunin

Today's results: http://ormbattle.net/index.php/blog.html

17 Aug 2009
20:16 PM

Frans Bouma

Interesting to see that 'openaccess' of all frameworks gets the crown handed over to them... who would have thought...

17 Aug 2009
21:31 PM

Alex Yakunin

Well, they're really good from the point of CUD operations - their bulk deletes are simply ideal. We didn't know about this, so this was a surprise even for us :)

But materialization currently is their painful place: they consume RAM much faster than e.g. EF & Lightspeed. Their results for 10K / 30K entities are ~ 2 times worse on "big" tests. Likely, just because they've already "wasted" the whole L2 cache, although competitors are still there.

I hope this explains something :)

17 Aug 2009
21:35 PM

Noah

I think Alex's site states it best over here: ormbattle.net/.../...ur-tests-are-unrealistic.html

He mentions that it's not fair to test a CPUs ADD operation by using MULTIPLY because that would work better on CPUs that support MULTIPLY.

I think that sums up this debate. Alex wants to compare everything by what he believes to be the lowest common denominator. However, I think you have to agree, Alex, that it's not really fair to compare based on lowest common denominators because that's not how things are used. Going back to your example of CPUs, when choosing a CPU you would typically do it based on the application. So for a device that will do a lot of multiplication, would you want to consult a test that specifically excludes that option?

My point is just that this isn't how things work in real life! I know you want to show lots of numbers and in a certain sense your rationale makes sense, Alex, but I think you have to rethink exactly who might read this and what these numbers might mean to them.

Especially with software, which is an art as much as a science, how you use a tool is just as important as which tool you use. I've seen hacks that trick out really crappy frameworks to do the right thing fast, and I've seen great frameworks used in the stupidest of ways.

I think that there is no way to really test 2 frameworks the same way since the way they are used is different. Oren has mentioned this multiple times, basically telling Alex that "You don't do it that way in NHibernate". Even the application route that Oren suggested (build an application and test that) isn't completely fair because different apps should use different frameworks.

Anyway, that's my 2c.

17 Aug 2009
21:36 PM

Alex Yakunin

Let's comment EF results as well:

Perfect materialization. I bet they can make it ~ 1.5 times faster, if they'd decide. Just because we could (although we have a bit more "fat" entities).
No CUD batches = chatty communication. No good marks.
Query compilation time is simply huge. Likely, because they compile everything to eSQL. Maybe I'm wrong, but I feel this is just a remnant they got from ObjectSpaces.
On the other hand, they provide compiled query API. It saves them on queries ;)
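The benefit of a compiled-query API can be illustrated with plain expression trees. This is a sketch of the general caching idea only, using `Expression.Compile` from the base class library; the cache, its string key and `GetCompiled` are assumptions for illustration and are not EF's compiled query API.

```csharp
using System;
using System.Collections.Generic;
using System.Linq.Expressions;

class Program
{
    static int compilations = 0;

    static readonly Dictionary<string, Func<int, bool>> cache =
        new Dictionary<string, Func<int, bool>>();

    // Compile the expression once per key and reuse the delegate afterwards.
    static Func<int, bool> GetCompiled(string key, Expression<Func<int, bool>> query)
    {
        Func<int, bool> compiled;
        if (!cache.TryGetValue(key, out compiled))
        {
            compiled = query.Compile(); // the expensive step
            compilations++;
            cache[key] = compiled;
        }
        return compiled;
    }

    static void Main()
    {
        int matches = 0;
        for (int i = 0; i < 1000; i++)
        {
            var isEven = GetCompiled("isEven", x => x % 2 == 0);
            if (isEven(i)) matches++;
        }

        Console.WriteLine("compilations: " + compilations); // compilations: 1
        Console.WriteLine("matches: " + matches);           // matches: 500
    }
}
```

The compile cost is paid once and amortized over every later execution, which is why a benchmark that includes compilation in each run and one that does not can rank the same tool very differently.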

17 Aug 2009
21:55 PM

firefly

"So what do you mean saying I'm playing a dumb? If you mean this is obvious, can you name any framework from tested above that expose the same behavior?

If you're saying this is an "expected feature", but don't name where else it can be found, well... That's what I'd call shameless."

Do you know what irrelevance means?

You started with an irrelevant and also erroneous argument to begin with. Then you get people caught up in trying to correct your erroneous statement. By focusing on trying to correct the erroneous statement you divert people's attention from dismissing your argument, which is bogus to begin with.

Let's say that I drive a car with a gray interior, and you say that my car is hotter in the summer because I have a black interior. Well, the whole argument is baseless to begin with. My car's interior color has little to do with whether it's hot or not. Your benchmark numbers have little to do with the real performance of the frameworks involved.

Yet you get people caught up in arguing about the fact that the car interior is in fact gray. After that is said and done you'll act all innocent and play the nice guy and say "well, I didn't know, I'll correct that". If you didn't know that NHibernate works a certain way, maybe you should have investigated it further before doing your benchmark. Instead now you say it's not a common feature? That's the same as saying gray is not a common color for a car interior. In short, it's just another diversion tactic of yours.

Fact is, correcting your mistake makes you look like a nice guy, so you have time to circle back to your baseless argument. It's a diversion tactic that politicians love to use. They get people so caught up in the details that they forget the big picture. So please quit trying to get people to focus on this and that, while the main issue here is for you to abandon the whole "useless benchmark" altogether.

17 Aug 2009
21:59 PM

Alex Yakunin

However, I think you have to agree, Alex, that it's not really fair to compare based on lowest common denominators because that's not how things are used.

I agree with this. And since we can't make any assumptions on how the things are used, we just provide plain output for each basic operation. No places. I think this is at least more honest than some average points.

I.e. we just tell if each basic feature can be fast or not. Places would break everything.

Especially with software, which is an art as much as a science, how you use a tool is just as important as which tool you use. I've seen hacks that trick out really crappy frameworks to do the right thing fast, and I've seen great frameworks used in the stupidest of ways.

Fully agree.

On the other hand, I'm almost fully sure a pretty simple tool built by a few newcomers has zero chance of winning these tests:

Getting some scores in LINQ tests means you must be an expert. I hope Frans & Oren can prove this :) It simply can't be achieved right on the first attempt. Man-years must pass till you get LINQ passing most of these tests.
To get a high score on CUD tests, you must solve tons of problems. I can enumerate just some of them: full query parameterization, running only queries with cached plans, caching SQL for these queries or being able to build it fast with new parameters (there are batches!), batches themselves, IN optimization (OpenAccess results for removals are impressive, yes?), and the usual topological sorting, version checks, etc... Can you believe someone can "simply write" all this stuff?
To get a high score on the materialization test, you must profile it for a relatively long time... Clearly knowing what can and must be optimized. Here it is all about RAM consumption & count of allocations. You must know all about .NET application performance to achieve good results. Ok, just one example: we use our own IntDictionary(Of T). It is a dictionary resolving Int32 keys to T. Up to 4 times faster than a regular dictionary (no GetHashCode(), fast equals, special hash table structure). Internally we use Int32 TypeIds, and this helps to resolve them faster. Gives us an additional 5-10% on such tests.

So... That's why we tested just leading tools. At least as we thought. Other players... Frankly speaking, have almost zero chances.

17 Aug 2009
22:08 PM

Alex Yakunin

The fact is, correcting your mistake makes you look like a nice guy, so you have time to circle back to your baseless argument. It's a diversion tactic that politicians love to use.

I agreed we've made a mistake there (i.e. used NH in wrong way). As well mentioned the mistake isn't obvious: no one except NH exposes this behavior, so we really couldn't assume this. Of course, if you don't assume we read the whole NH documentation ;)

Moreover, it is written @ ORMBattle.NET (see "About") that _we are humans_, thus we can make some mistakes, so if you're an expert, please help us to fix them.

Be sure, if they exist, they aren't intentional.

Anyway, now all we could fix (yes, there were some wrong proposals even from Oren - do you know he's human as well?) is fixed. NH got ~ 2 times better score. That's it.

If there is something else that can be done for it, just say this ;)

17 Aug 2009
22:10 PM

Alex Yakunin

If you're saying this is an "expected feature", but don't name where else it can be found, well... That's what I'd call shameless.

Btw, so can you name any other tool exposing such an "expected" behavior? ;) You just almost called me "politician", so I'd like to know if you aren't ;)

17 Aug 2009
22:11 PM

JJ Rock

Seriously, did any of you guys get any real work done today? ;-P

17 Aug 2009
22:14 PM

Alex Yakunin

So please quit trying to get people to focus on this and that while the main issue here is for you to abandon the whole "useless benchmark" altogether.

Funny. Have you noted I was talking right about "useless benchmark", but you've just popped up an absolutely irrelevant part of discussion. Moreover, you said that I'm trying to get people focused on it.

Ok, I like the guys loving to argue :) But not this way. Let's close this topic?

17 Aug 2009
22:20 PM

Alex Yakunin

Seriously, did any of you guys get any real work done today? ;-P

I wasted all the day here ;) But I'm a chief, you know ;)

Worse that most part of DO4 team was reading this stuff today ;) And, as I suspect, mainly to ensure I'm polite enough ;)

17 Aug 2009
22:22 PM

firefly

Alex, just go back and read my last post again and all the other posts then think real long and hard, you'll figure out an answer on your own.

18 Aug 2009
04:06 AM

Alex Yakunin

firefly, you're giving the best advices here. Short and clear. At least one thing is pretty clear: no answers from you.

Let me remember some of your phrases here:

Me: "it's quite difficult to make a test that will be accepted by all of us. If it will be too complex ...; if it will be simple ..."

You: "You've said it yourself. Benchmark like this is useless because ..."

Do you see I said a completely different thing, but you rephrased it as you wanted?

Currently the test is being tweaked toward your framework, whether it's intentional or not.

...

So if you are confident why don't you put a price on the fact that after the necessary tweaking your product will still beat the other hand down?

Have you seen I accepted this? Would you still like to bet?

so can you name any other tool exposing such an "expected" behavior? ;) You just almost called me "politician", so I'd like to know if you aren't ;)

No answer.

So one of conclusions I made: guys like you must be ignored. No proofs, no answers, no responsibility, just so important "IMHO".

18 Aug 2009
04:06 AM

houl

hihiks :)

so many argues about speed... from people who are engaged in developing of most draggy type of IT products :)

Assembler is fast. All of other - is compromise :)

Sorry for a bit of offtopic, but really, this discussion is so funny :) if only some of participants would cease heat a bit... :)

18 Aug 2009
04:37 AM

Alex Yakunin

Just remembered an obvious example of why e.g. materialization performance is important:

SqlDataReader is able to fetch about 2M rows per second on this test
PostgreSQL and some other readers normally fetch just 500K rows per second.

So 4 times difference. If you'd look at SqlDataReader code, it's quite optimized. There is simply nothing to make faster. Others are not (lots of boxing, etc.).

So it's obvious the team developing SqlClient & SQL Server protocol spent noticeable time on such optimization.

Fast materialization in ORM is the same as fast IDataReader in ADO.NET. So I think there are tons of reasons for optimizing materialization.

So numbers to keep:

SqlDataReader is able to fetch about 2M rows per second on this test
EF materializes up to 600K entities per second relying on SqlDataReader (so it is ~ 3 times slower)
NH materializes just ~ 40K entities per second, at least via LINQ. I understand many people find this acceptable. But that's what pushes them & others to use stored procs & server side logic.

18 Aug 2009
04:45 AM

Ayende Rahien

Alex,

You seem unable to follow that there are different needs for an ORM vs. IDataReader.

IDataReader is used for _bulk data manipulation_, ORM is used for OLTP

18 Aug 2009
05:10 AM

Alex Yakunin

Who said this? If you, I fully understand, why:

EF can be used for bulk data manipulation. LINQ 2 SQL - as well (we know it must be faster than EF on materialization).
NH can not.

18 Aug 2009
05:13 AM

Ayende Rahien

Alex,

OR/M is for OLTP.

OLTP is very rarely about bulk data manipulation.

18 Aug 2009
06:02 AM

Alexis Kochetov

Oren, thank you for the comment about IStatelessSession. It was implemented in our current tests.

18 Aug 2009
06:59 AM

Alex Yakunin

Ok, Oren, I just compared incomparables: teleportation with transportation, just for you.

Entity:

    [HierarchyRoot]
    [Index("Value")]
    public class Simplest : Entity
    {
      [Field, Key]
      public long Id { get; private set; }

      [Field]
      public long Value { get; set; }
    }

As you see, there are 2 indexes now - I made this intentionally, to make the test at least a bit complex for RDBMS. I think most of real life cases related to executable DML are even more complex from this point.

Teleportation: I tried SqlClient query + commit of this one:

    "UPDATE [dbo].[Simplest] " +
    "SET [Simplest].[Value] = -[Simplest].[Value] " +
    "WHERE [Simplest].[Value] >= 0"

All rows pass WHERE condition.

Transportation: this code on DO4:

      using (var ts = Transaction.Open()) {
        var query = Query.Execute(() => Query<Simplest>.All);
        foreach (var o in query) {
          var value = o.Value;
          if (value >= 0)
            o.Value = -value;
        }
        ts.Complete();
      }

Result:

Teleportation: 48K rows (updates) / second
Transportation: 15K rows / second
Total rows updated in this test: 1M.

I think if you find 10+ times difference on materialization "acceptable" for OLTP (as well as 2...4 times difference on CUD tests), this 3 times difference is much more than acceptable. Moreover, I suspect real life results will be even closer (take a look at that tiny entity; + index was fully fitting in RAM).

18 Aug 2009
07:07 AM

Ayende Rahien

Alex,

Are you truly arguing cursor based vs. set based efficiencies here?

Oh, and RPC vs. Local Work?

18 Aug 2009
07:13 AM

Alex Yakunin

Btw, the larger the set we take, the smaller the difference. On a 10M set teleportation will be ~ 2+ times slower, but transportation will run nearly the same.

The reason is obvious: the larger are indexes, the more costly their updates are. But SQL parsing & communication time is ~ constant.

So about importance of our tests:

CUD tests: important, if you deal with pretty limited sets of data. You'll still feel the difference on millions of records. But definitely not on billions.
Fetch, query: nearly the same.
Materialization: important in any case. If query is returning a part of the index, low-level RDBMS performance here is constant and very high (~ up to 20M of index entries per second by our tests - it's easy to measure this, just calculate an aggregate). So generally materialization is the main limiting factor in bulk reads.

18 Aug 2009
07:17 AM

Alex Yakunin

Are you truly arguing cursor based vs. set based efficiencies here?

Oren, look at the numbers. It's pretty easy to prove they must be ~ of same efficiency on large data arrays.

If you don't believe this is true, think about distributed databases. There is no "local" at all.

18 Aug 2009
07:22 AM

Hendry Luk

On the other hand, in all fairness, I don't think the point of the test is to intentionally make it OLAP. It merely magnifies OLTP operations by a magnitude of hundreds of thousands of times to observe how NH performs under high load, compared to other ORM products. Maybe the approach is not designed quite properly for NH, but probably someone could advise how to do it (a high-load OLTP test) to provide a more accurate result for NH.

18 Aug 2009
07:52 AM

Alexis Kochetov

Hendry, Oren, we are going to implement TPC-C or TPC-E performance tests where all framework features will be allowed because it's business logic bench with huge DB workload.

First stage is to make infrastructure for these tests (different implementations for particular ORMs should be rather similar). What do you think about it? Please reply.

18 Aug 2009
07:59 AM

Ayende Rahien

Alex,

Your "test" is flawed once again, remove the idea, and try again. Try something that resembles real world practices vs. a specially crafted thingie to show how it behaves.

18 Aug 2009
08:00 AM

Ayende Rahien

Alex,

I suggest reading about the fallacies of distributed computing.

Your code would result in 1M+1 remote calls. Put it on a reasonable network, and assume a 2 ms ping time, and your timing is going to be so far off it isn't even funny.
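
As a rough illustration of the arithmetic behind this objection, here is a back-of-the-envelope estimate. It takes the 1M+1 call count and the 2 ms round trip at face value and assumes the calls are strictly sequential; the symbols N and t are introduced here only for the sketch, and nothing below is a measurement.

$$
T_{\text{latency}} = N \cdot t = (10^{6} + 1) \times 2\,\text{ms} \approx 2000\,\text{s} \approx 33\,\text{min}
$$

For comparison, the 15K rows / second reported for the local run corresponds to about 10^6 / 15000 ≈ 67 s for the same 1M rows, so under these assumptions network latency alone would be roughly 30 times the measured time. If the updates were instead sent in batches of 25 (the DO batch size mentioned elsewhere in this thread), the count would drop to about 40,000 calls, or roughly 80 s of latency.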

18 Aug 2009
08:01 AM

Ayende Rahien

Hendry,

If that is the case, then the test should be using _different sessions_, not a single session

18 Aug 2009
08:02 AM

Ayende Rahien

Alex,

Persistence by reachability isn't costly at all.

I think you are confusing dirty checking with that.

As for implementing a different dirty checking method, the hooks are there, and it is quite easy to do.

18 Aug 2009
08:16 AM

Ayende Rahien

Alexis,

That sounds like a more reasonable approach for this, but see my comments about the PetShop benchmark stink when 1.0 came out

18 Aug 2009
08:24 AM

Alex Yakunin

Oren, I see you don't want to answer just "Yes" or "No" ;)

Your code would result in 1M+1 remote calls.

We've already discussed this. I described this here: ormbattle.net/.../...ur-tests-are-unrealistic.html

Put it on a reasonable network, and assume a 2 ms ping time, and your timing is going to be so far off it isn't even funny.

Ok, I see you're intentionally trying to hide pure NH performance by other numbers (I agree they might appear in high-stress conditions, but not everywhere), constantly saying "this is real, not what you show to us".

I'll repeat myself: "So about importance of our tests:

CUD tests: important, if you deal with pretty limited sets of data. You'll still feel the difference on millions of records. But definitely not on billions.
Fetch, query: nearly the same.
Materialization: important in any case. If query is returning a part of the index, low-level RDBMS performance here is constant and very high (~ up to 20M of index entries per second by our tests - it's easy to measure this, just calculate an aggregate). So generally materialization is the main limiting factor in bulk reads."

Do you agree at least with these statements?

Persistence by reachability isn't costly at all.

It requires the same ~ O(N) scan time. So it's of the same complexity.

I think you are confusing dirty checking with that

True - I imagined just graph traversal & forgot about this. So this must be much more computationally complex, but with the same ~ O(N) dependency.

18 Aug 2009
10:19 AM

Mats Helander

Ok, I see you're intentionally trying to hide pure NH performance by other numbers (I agree they might appear in high-stress conditions, but not everywhere), constantly saying "this is real, not what you show to us".

That's not hiding it - that's putting the numbers into exactly the kind of context that you replied to me saying you agreed with was needed and you were working to fix.

I'll repeat myself: "So about importance of our tests:

CUD tests: important, if you deal with pretty limited sets of data. You'll still feel the difference on millions of records. But definitely not on billions.
Fetch, query: nearly the same.
Materialization: important in any case. If query is returning a part of the index, low-level RDBMS performance here is constant and very high (~ up to 20M of index entries per second by our tests - it's easy to measure this, just calculate an aggregate). So generally materialization is the main limiting factor in bulk reads."

And I will also repeat myself: So what if the room for overall application performance improvement is, say, 5%? Then the points you mention above can convincingly demonstrate in tests that your framework is 100 times faster than mine, yet still you do not manage to improve overall application performance by more than 1%.

You do agree with my logic, no?

/Mats
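
One way to put numbers on this argument is Amdahl's law. As a sketch, assume (hypothetically) that the mapper accounts for a fraction p = 0.05 of total run time and is made s = 100 times faster; both symbols and both values are introduced here for illustration only, reading "room for improvement" as the mapper's share of the run time.

$$
S_{\text{overall}} = \frac{1}{(1 - p) + \frac{p}{s}} = \frac{1}{0.95 + 0.0005} \approx 1.052
$$

Under these assumptions the whole application gets about 5% faster at best, however large s becomes, because the time saved is p(1 - 1/s) and can never exceed p itself.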

18 Aug 2009
11:46 AM

Mats Helander

Hehe, again my comment was too long - well good thing I have my new blog then ;-)

Oren, I know you are every bit as familiar as I am with the potential for spectacle surrounding "PetShop" benchmarks, but please hear me out...I think you may possibly agree with me that there's a kind of Pet Shop benchmark that could be very useful.

Please let me know what you think:

matshelander.blogspot.com/.../...ti-benchmark.html

18 Aug 2009
11:47 AM

Alex Yakunin

You do agree with my logic, no?

Yes, I agree with this. Let's think about other expenses we have now:

Index operations. I wrote Oren any high-performance server is equipped well enough to reply most of queries w/o HDD seeks. Either the whole database or its working set is cached in RAM. So almost no delay here: such operations can be performed with at least 100K/sec rate per second.
Network latencies. If you're caring about performance, 10Gbit channels are right for you. Latencies start from 2.5 microseconds there. Don't paste any links, since they're easy to find. So again, not an issue.
Network stack, or protocol latency. ~ 50 microseconds is latency reachable for WCF, so I suspect specialized protocols, such as VIA for SQL Server, provide much better one. 10-20 microseconds is fully ok for us here.

So what else is left? I don't know. It seems I mentioned all show stoppers.

So querying an enterprise storage with 10K op/sec in a single thread is more than possible (likely, even 20-30K op/sec isn't a limit). But NH can utilize only 1/10 of it on queries. I'm not speaking about CUD: ~ 10K commands per second * batch size (25 for DO) = 250K updates per second. So here we're easily reaching the limits of our framework. The same is true for NH.

So... Have I shown that all I've shown must be visible even on very large storages (e.g. 0.1...1 TB)? Think about the smaller ones. I admit, 90-95% of databases that are currently used are <100Gb.

Finally, think about the future. Network latency isn't a big problem. RAM isn't as well. But CPU speed is - i.e. they grow up only because of parallelism. But AFAIK neither NH nor any other ORM currently utilizes this (we plan).

18 Aug 2009
11:48 AM

Alex Yakunin

"at least 100K/sec rate per second"

"at least 100K op/sec rate" :)

18 Aug 2009
12:09 PM

Ayende Rahien

Alex,

All the numbers you throw around are maybe accurate for lab situations.

In practice, the numbers are tens of milliseconds.

18 Aug 2009
12:32 PM

Mats Helander

It seems I mentioned all show stoppers

Well, the database does work, even if it is in RAM - faster than touching disk, of course, but still work. Incidentally, this work can be considered part of the actual "application logic", which if used as an overreaching term (as contrasted to just running a loop) I would say can be used to cover the bulk of the showstoppery you left out.

But you do raise an interesting point with regards to the network throughput and latency (and an interesting fact - I have to say "wow" to that 2.5 microsec latency!! :-O)

"If you're caring about performance, 10Gbit channels are right for you"

That is just the point - do you care that much about performance? Do you care forever about improving performance or will there be some point in a given application where performance is good enough and other aspects become more important?

What I propose is that for a lot of people, the following applies: 1) The performance of the application is good enough with just a 1Gig Network (anything less and I agree performance is apparently so unimportant that we probably don't need to discuss it at all!) and 2) that if they were to measure it, the actual persistence framework code steals a fairly small bit of the overall application performance, making performance optimizations in the persistence framework a complete non-concern in their case.

Finally, think about the future. Network latency isn't a big problem. RAM isn't as well. But CPU speed is

Or it could be that, given the requirements of an application, the CPU speed isn't a problem either. This is what I suggest is already the case for many apps. Yes, I realize that improving performance results in economic benefits - but the whole point here is that other efforts (such as improving maintainability, etc) may yield even more positive economic benefits after a point when sufficient performance has been acquired.

(Cont)

18 Aug 2009
12:33 PM

Mats Helander

You DO have a good and important point in your observation that as network latency and RAM sizes improve (and disc access times improve as well, for that matter), then we will see less resources drained here, and as a consequence the percentage swallowed by the mapper will be larger. But what exactly are those percentages, now and in the future? Without tests, we just don't know. With all due respect, just doing the maths doesn't tell enough - perhaps you have done real stress testing on big iron stuff, and then you already know this, but if not - prepare to be surprised when comparing to your math based projections ;-)

In fact, it does seem to me that some of your arguments/math examples are slightly bitowards inflating the importance of the mapper (when testing only data access, increasing the data volumes may seem like you are making the test more realistic, but it also serves to cement an application abstraction where the application apparently only does data access, which is very misleading)

So I still say actual tests - Pet Shop style, preferably (see my new blog post at matshelander.blogspot.com/.../...ti-benchmark.html ) are needed to put any of these numbers into any kind of useful perspective.

So, finally:

You do agree with my logic, no?

Yes, I agree with this.

Let me just make completely sure that you do agree: you do agree, then, that until you actually complement with a test that shows the mapper in context of a real application doing something real, it is impossible to know the actual importance of the numbers from your benchmark? Since, again, if the overall overhead is low enough (say 5%) then any performance benefit is likely pointless (until future breakthroughs in physics that will for some reason be strangely constrained to applying to the network and RAM but not to CPU speed (?) which may completely change the equations). I also wonder if you agree that math-based estimates and extrapolations (such as you have provided so far), even if the math itself is decent, are hardly a substitute for a real test? I realize that not yet having had the time to do the tests I ask, you try to provide the next best thing, which is a reasonable estimate. My problem with this is that the whole point of the benchmark you have created (and currently are using in a marketing drive it seems) may in fact stand or fall with the tests I suggest. So the best thing would perhaps have been to not publish at all until you had the tests to put the numbers into context, but imo the next best thing is then NOT to come with estimates, but to confess that currently the relevance of your numbers is entirely unknown and you will have to return when you have actually run some tests.

I hope I don't sound harsh, but you did mention arguing is a sport for you - well, it is for me too ;-)

/Mats

18 Aug 2009
12:37 PM

Mats Helander

slightly bitowards

should be: slightly biased towards

18 Aug 2009
12:39 PM

Roger Alsing

@Alex,I'm sorry but you are solving the wrong problem...

Materialization and Query Performance is NOT the problem with O/R mapping.

The problems are:

It's hard to create mappings (e.g. in XML).

A lot of people mess up with Lazy Load and cause ripple load havoc in their systems.

Error messages are often hard to understand since the inner workings of a mapper is quite complex.

The community around NH solves some of those problems.

e.g. Fluent NH or Castle AR (which by the way has the best and most descriptive error messages of all frameworks that I have ever seen)

IMO, you are just throwing development money away when you optimize for materialization / query performance since the mapper performance is pretty much dwarfed by all other factors in a big app (as already stated by others).

It's quite a neat PR ploy though..

//Roger

18 Aug 2009
15:30 PM

Alex Yakunin

Materialization and Query Performance is NOT the problem with O/R mapping.

Let me just make completely sure that you do agree: you do agree, then, that until you actually complement with a test that shows the mapper in context of a real application doing something real, it is impossible to know the actual importance of the numbers from your benchmark?

Yes, I understand we can prove this only by other tests. Theory is just theory. Ok, any ideas on relatively simple, but "real life" tests except TPC-X (likely, we must implement them anyway) are welcome.

The problems are:

Agree with all. Frankly speaking, misusing of a particular ORM is something like normal... And I agree, NH is good from this point: there is a huge community.

IMO, you are just throwing development money away when you optimize for materialization

Let's see... I like to develop efficient things. As well as show them :)

18 Aug 2009
15:41 PM

Frans Bouma

Heh, leave it to the puzzle boys to nail the point home :D

@Alex, What I wondered today was: now that DO isn't mentioned in the results, what's the incentive for the site now? It was a marketing campaign, but as DO isn't in the list of results anymore (and visitors will only peek at the green/red squares), what's the point of throwing time at this (== money) at all? Or are you re-adding DO in 2 weeks from now? I mean, I can't imagine you're doing all this to give telerik or the EF team more attention...

@Mats, glad you reanimated your blog ;) Interesting read indeed.

18 Aug 2009
17:29 PM

houl

Frans: "what's the point of throwing time at this"

i can suppose at least three variants:

1) it is just interesting and has educational value - if something in other ORM is faster - then there is obvious possibility for improvements in this "something" - so speed comparisons allow to reveal points of interests in improvements of algorithms/solutions

2) DO team can "kill three rabbits with one shot" - create optimal benchmark suite for ORMs, have permanent public attention on them, and always know current situation with ORMs speed comparing them to DO - so if any niche of speed-critical projects for ORM appears in future - DO will become monopolistic in a moment in that niche without any competitors - if all of you continue to think in the way you've demonstrated here about no need of high-speed materialization etc :)

3) imho speed optimizations are directly connected to the whole code perfection. "Fully optimized" and "Perfect" - are almost synonyms - including code style, transparency for understanding, bugs absence, etc. So the company that has the fastest solution can be almost 100% sure that it really has the best product - if features list is the same of course. Now look at #2 - about "compare others with their own" and permanently maintain it to be fastest.

18 Aug 2009
17:34 PM

houl

Frans: "I mean, I can't imagine you're doing all this to give telerik or the EF team more attention... "

sometimes russians can be absolutely unpredictable in their generosity :-D

18 Aug 2009
17:38 PM

houl

Ayende: "All the numbers you throw around are maybe accurate for lab situations."

Atmosphere in server rooms sometimes much closer to labs, not to real life ;)

18 Aug 2009
17:49 PM

houl

And by the way - the more speed reserve you have - the more complicated/useful features you can implement.

I think everybody will agree that for example real-time full speech recognition would be rather easy task if we had 10^12 times more calculation power and RAM space in usual PC :)

So it is also a good strategy - gain the same solution which works twice faster, and then "spend" its surplus speed for additional features, which will make this product more attractive than any other :)

19 Aug 2009
03:42 AM

Alex Yakunin

imho speed optimizations are directly connected to the whole code perfection.

That's what we were talking about on the first page of ORMBattle.NET. Not absolutely true, but in wide majority of cases - yes.

Or think about LINQ at least.

And, guys... I just thought it simply can't be true that you're saying ORM performance isn't important. And asked Google.

"Ayende NHibernate performance" - first link: ayende.com/.../NHibernatePerformanceConcerns.aspx

Few quotes:

"The first problem that Darrel has is with the memory consumtion. I will start right off by saying that I have been a devot user of NHibernate for over two years now, and I'm building big, complex systems using it. Memory consumtion was never an issue with NHibernate." - think about our materialization test, which results, as I wrote, are mainly related to memory consumption.
"Optimizing NHibernate's performance is almost solely focused on reducing the amount of queries ..." CUD batches are a good example of reducing amount of queries, yes?

I'm almost sure, Oren, I can find at least 10 more articles in your blog related to NH performance. I just don't want to spend more time on this.

And in the end of all: "nhibernate performance" = 0.5M results in Google. That's definitely not important.

19 Aug 2009
04:03 AM

Ayende Rahien

Alex,

Your materialization tests have as much relation to the real world as a swimming competition in the Sahara.

And I think you are missing the point with reducing the # of queries. You are forcing thousands of queries and then try to optimize that.

Don't start with thousands of queries, your life will be better

19 Aug 2009
04:22 AM

Alex Yakunin

Oren, I answered to this many, many times. Say something new, please ;) You provided zero facts & evidences. Comparisons like "swimming competition in the Sahara" may work on public conversation, but never works @ forums & blogs.

19 Aug 2009
04:38 AM

Alex Yakunin

Ok, my remark about "real life performance" in your "swimming in Sahara" style, if you want:

If I buy tomatoes, I'm interested in their calorie content, but not in calorie content of some soup with tomatoes. But you're suggesting all tomato producers must publish calorie content of some soup with tomatoes. Why? Just because this will make difference between your fat and our slim ones less visible.

19 Aug 2009
04:59 AM

Ayende Rahien

Alex,

Because what you are publishing is not the tomatoes' calorie values, you are publishing the nutrient values in the water used to grow a different set of tomatoes.

19 Aug 2009
05:01 AM

Ayende Rahien

@All,

This thread is getting repetitive, and I don't see any value in continuing the same discussion over & over again.

Comments are closed.

Oren Eini

CEO of RavenDB