Oren Eini

CEO of RavenDB

a NoSQL Open Source Document Database


time to read 1 min | 134 words

We start by killing ViewData. I don't want it, and I want to fail hard if someone is trying to use it:

image

Likewise in the views, I hate ViewData.Model, so I remove it from there as well:

image

For JavaScript, I don't want to use <script/> tags. They are prone to path problems, and I might need to go back and change them later (to add compression, combine scripts, etc). I use this approach:

image

The page title is set by the view, the way it should be:

image

Or, I just set it up in the master page, if I don't care that much about changing it.

time to read 1 min | 94 words

I thought it would be obvious, but I am currently cleaning up an application that has mixed them in a very annoying way.

I decided to track the reason for this coupling and I found this on the sample controller in ASP.Net MVC:

image

I am pretty sure that setting the page title is a presentation concern, as such, I don't want to see this in the controller, and I certainly don't want to see it in the base sample from which everyone is going to start.

time to read 2 min | 352 words

This is a bit from the docs for NH Prof, which I am sharing in order to get some peer review.

An unbounded result set occurs when we perform a query without explicitly limiting the number of returned results (using SetMaxResults() with NHibernate, or using TOP or LIMIT clauses in the SQL). Usually, this means that the application is assuming that a query will only return a few records. That works well in development and testing, but it is a time bomb in production.

The query suddenly starts returning thousands upon thousands of rows and in some cases, it is returning millions of rows. This leads to more load on the database server, the application server and the network. In many cases, it can grind the entire system to a halt, usually ending with the application servers crashing with out of memory errors.

Here is one example of a query that will trigger the unbounded result set warning:

session.CreateQuery("from OrderLines lines where lines.Order.Id = :id")
       .SetParameter("id", orderId)
       .List();

If the order has many line items, we are going to load all of them, which is probably not what we intended. A very easy fix for this issue is to add pagination:

session.CreateQuery("from OrderLines lines where lines.Order.Id = :id")
	.SetParameter("id", orderId)
	.SetFirstResult(0)
	.SetMaxResults(25)
	.List();

Now we are assured that we only need to handle a predictable number of records, and if we need to work with all of them, we can page through the records as needed. But there is another common occurrence of unbounded result sets: directly traversing the object graph, as in this example:

var order = session.Get<Order>(orderId);
DoSomethingWithOrderLines(order.OrderLines); 

Here, again, we are loading the entire set (in fact, it is identical to the query we issued before) without regard to how big it is. NHibernate does provide robust handling of this scenario, using filters.

var order = session.Get<Order>(orderId);
var orderLines = session.CreateFilter(order.OrderLines, "")
	.SetFirstResult(0)
	.SetMaxResults(25)
	.List();
DoSomethingWithOrderLines(orderLines);

This allows us to page through a collection very easily, and saves us from having to deal with unbounded result sets and their consequences.
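The "page through the records as needed" idea can be sketched without a database. This is a minimal sketch, not NH Prof or NHibernate code: `Paging.InPages` and the `fetchPage` delegate are hypothetical names, and `fetchPage` stands in for any bounded query (for example, one built with SetFirstResult() and SetMaxResults()).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class Paging
{
    // fetchPage(firstResult, maxResults) is a hypothetical stand-in for a bounded query,
    // e.g. query.SetFirstResult(firstResult).SetMaxResults(maxResults).List<T>().
    public static IEnumerable<T> InPages<T>(Func<int, int, IList<T>> fetchPage, int pageSize)
    {
        if (pageSize <= 0)
            throw new ArgumentOutOfRangeException("pageSize");

        for (int first = 0; ; first += pageSize)
        {
            IList<T> page = fetchPage(first, pageSize);
            foreach (T item in page)
                yield return item;

            // A short (or empty) page means there is nothing left to read.
            if (page.Count < pageSize)
                yield break;
        }
    }
}

public static class Program
{
    public static void Main()
    {
        // In-memory stand-in for the order lines table.
        List<int> orderLines = Enumerable.Range(1, 60).ToList();
        int roundTrips = 0;

        Func<int, int, IList<int>> fetchPage = (first, max) =>
        {
            roundTrips++;
            return orderLines.Skip(first).Take(max).ToList();
        };

        int seen = Paging.InPages(fetchPage, 25).Count();
        Console.WriteLine(seen + " lines in " + roundTrips + " bounded queries");
        // prints "60 lines in 3 bounded queries"
    }
}
```

Each round trip returns at most 25 rows, so memory use stays predictable even when the total number of rows is not.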

time to read 2 min | 399 words

This is a bit from the docs for NH Prof, which I am sharing in order to get some peer review.

A common misconception when using a database is that transactions are needed only to orchestrate several write statements. In fact, every operation that the database performs is done inside a transaction. This includes both queries and writes (update, insert, delete).

When we don't define our own transactions, we fall back into implicit transaction mode, in which every statement sent to the database runs in its own transaction, resulting in a higher performance cost (database time to build and tear down transactions) and reduced consistency.

Even if we are only reading data, we want to use a transaction, because using a transaction ensures that we get a consistent result from the database. NHibernate assumes that all access to the database is done under a transaction, and strongly discourages any use of the session without a transaction.

Example of valid code:

using(var session = sessionFactory.OpenSession()) 
using(var tx = session.BeginTransaction()) 
{ 
	// execute code that uses the session 
	tx.Commit(); 
} 

Leaving aside the safety issue of working with transactions, the assumption that transactions are costly and that we need to optimize them away is a false one. As already mentioned, databases are always running in a transaction, and they have been heavily optimized to work with transactions. The question is whether this happens per statement or per batch. There is some amount of work that needs to be done to create and dispose of a transaction, and having to do it per statement is actually more costly than doing it per batch.

It is possible to control the number and type of locks that a transaction takes by changing the transaction isolation level (and indeed, a common optimization is to reduce the isolation level).

NHibernate treats the call to Commit() as the time to flush all changed items from the unit of work to the database. Without an explicit Commit(), it has no way of knowing when it should do that. A call to Flush() is possible, but it is generally strongly discouraged, because it is usually a sign that you are not using transactions properly.

I strongly suggest that you use code similar to the one shown above (or use another approach to transactions, such as TransactionScope, or Castle's Automatic Transaction Management) in order to handle transactions correctly.

time to read 2 min | 378 words

This is a bit from the docs for NH Prof, which I am sharing in order to get some peer review.

Select N+1 is a data access anti-pattern, in which we are accessing the database in one of the least optimal ways. Let us take a look at a code sample, and then discuss what is going on. I want to show the user all the comments from all the posts, so they can delete all the nasty comments. The naïve implementation would be something like:

// SELECT * FROM Posts
foreach (Post post in session.CreateQuery("from Post").List()) 
{
     //lazy loading of comments list causes: SELECT * FROM Comments where PostId = @p0
    foreach (Comment comment in post.Comments) 
    {
        //do something with comment
    }
}


In this example, we load a list of posts (the first select) and then traverse the object graph. However, each access to the lazily loaded collection causes NHibernate to go to the database again and load the comments of one post at a time. This is incredibly inefficient, and the NHibernate Profiler will generate a warning whenever it encounters such a case. The solution for this example is simple: we simply force an eager load of the collection up front.

Using HQL:

var posts = session
	.CreateQuery("from Post p left join fetch p.Comments")
	.List();

Using the criteria API:

session.CreateCriteria(typeof(Post)) 
	.SetFetchMode("Comments", FetchMode.Eager) 
	.List();


In both cases, we will get a join and only a single query to the database. Note that this is the classic appearance of the problem. It can also surface in other scenarios, such as calling the database in a loop, or more complex object graph traversals. In those cases, it is generally much harder to see what is causing the issue.

NHibernate Profiler will detect those scenarios as well, and give you the exact line in the source code that causes this SQL to be generated. Other options for solving this issue are MultiQuery and MultiCriteria, which are also used to solve the issue of Too Many Queries.
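To make the cost difference concrete, here is a minimal sketch that only counts round trips. `FakeDatabase` is a hypothetical in-memory stand-in, not NHibernate; the SQL in the comments mirrors the statements discussed above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical in-memory stand-in for a database; it only counts round trips.
public class FakeDatabase
{
    private readonly Dictionary<int, string[]> commentsByPost = new Dictionary<int, string[]>
    {
        { 1, new[] { "first!", "nice post" } },
        { 2, new[] { "spam" } },
        { 3, new string[0] },
    };

    public int RoundTrips { get; private set; }

    // SELECT * FROM Posts
    public List<int> LoadPostIds()
    {
        RoundTrips++;
        return commentsByPost.Keys.ToList();
    }

    // SELECT * FROM Comments WHERE PostId = @p0
    public string[] LoadComments(int postId)
    {
        RoundTrips++;
        return commentsByPost[postId];
    }

    // SELECT ... FROM Posts LEFT JOIN Comments ...
    public Dictionary<int, string[]> LoadPostsWithComments()
    {
        RoundTrips++;
        return new Dictionary<int, string[]>(commentsByPost);
    }
}

public static class SelectNPlusOneDemo
{
    public static int LazyRoundTrips()
    {
        var db = new FakeDatabase();
        foreach (int postId in db.LoadPostIds())
            db.LoadComments(postId); // one extra query per post
        return db.RoundTrips;
    }

    public static int EagerRoundTrips()
    {
        var db = new FakeDatabase();
        db.LoadPostsWithComments(); // everything in one query
        return db.RoundTrips;
    }

    public static void Main()
    {
        Console.WriteLine("lazy: " + LazyRoundTrips() + ", eager: " + EagerRoundTrips());
        // prints "lazy: 4, eager: 1"
    }
}
```

With three posts the lazy version issues 1 + 3 queries. The count grows with the number of posts, while the eager version stays at one.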

time to read 3 min | 583 words

So, I talked a bit about the architecture and the actual feature, but let us see how I have actually built & implemented this feature.

I want to point out that this is the actual code that goes into the actual product. It is also one of the more complex ones, because of the possible state changes.

public class UnboundedResultSetStatementProcessor : IStatementProcessor
{
	public void BeforeAttachingToSession(SessionInformation sessionInformation, 
		FormattedStatement statement)
	{
	}

	public void AfterAttachingToSession(SessionInformation sessionInformation, 
		FormattedStatement statement, OnNewAction newAction)
	{
		if(statement.CountOfRows!=null)
		{
			CheckStatementForUnboundedResultSet(statement, newAction);
			return;
		}
		bool addedAction = false;
		statement.ValuesRefreshed += () =>
		{
			if(addedAction)
				return;
			addedAction = CheckStatementForUnboundedResultSet(statement, newAction);
		};
	}

	public bool CheckStatementForUnboundedResultSet(FormattedStatement statement,
		 OnNewAction newAction)
	{
		if (statement.CountOfRows == null)
			return false;

		// we are discounting statements returning 1 or 0 results because
		// those are likely to be queries on either PK or unique values
		if (statement.CountOfRows < 2)
			return false;

		// we don't check for select statement here, because only selects have row count
		var limitKeywords = new[] { "top", "limit", "offset" };
		foreach (var limitKeyword in limitKeywords)
		{
			//why doesn't the CLR have Contains() that takes StringComparison ??
			// a limit keyword means the query is bounded: no warning, but we are done checking
			if (statement.RawSql.IndexOf(limitKeyword, StringComparison.InvariantCultureIgnoreCase) != -1)
				return true;
		}

		newAction(new ActionInformation
		{
			Severity = Severity.Suggestion,
			Title = "Unbounded result set"
		});
		return true;
	}

	public void ProcessTransactionStatement(TransactionMessageBase tx)
	{
	}
}

And now the test:

[TestFixture]
public class Ticket_51_UnboundedResultSet : IntegrationTestBase
{
	[Test]
	public void Will_issue_alert_for_unbounded_result_sets()
	{
		ExecuteScenarioInDifferentAppDomain<LoadPostsUsingCriteriaQuery>();

		var statement = observer.Model.RecentStatements.Statements
			.OfType<StatementModel>()
			.First();
		Assert.AreEqual(1, statement.Actions.Count);
		Assert.AreEqual("Unbounded result set", statement.Actions[0].Title);
	}
}

And, just for fun, the scenario that we are testing:

public class LoadPostsUsingCriteriaQuery : IScenario
{
    public void Execute(ISessionFactory factory)
    {
        using (var session = factory.OpenSession())
        using (var tx = session.BeginTransaction())
        {
            session.CreateCriteria(typeof(Post))
                .List();

            tx.Commit();
        }
    }
}

And this is it. That is all you have to do to implement a new feature. This makes building the application much easier, because at each point in time, we have to deal with only one thing. It is the aggregation of everything put together that is actually of value.

Also, notice that I heavily optimized my workflow for tests and scenarios. I can write just what I want to happen, not caring about how this is actually happening. Optimizing the ease of testing is another architectural concern that I consider very important. If we don't deal with that, the tests would be a PITA to write, so either they wouldn't get written, or we would get tests that are hard to read.

Also, notice that this is a full integration test: we execute the entire backend, and we test the actual view model that the UI is going to display. I could have tested this using standard unit testing, but in this case, I chose to see how everything works from start to finish.
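The keyword check in the processor, and the Contains() overload that its comment wishes for, can be sketched in isolation. This is a minimal sketch: `SqlText`, `ContainsText` and `HasLimitKeyword` are hypothetical names, not part of NH Prof. It also shows a side effect of plain substring matching, which can hit identifiers that merely contain a keyword.

```csharp
using System;

public static class SqlText
{
    // Same keyword list as the processor above.
    private static readonly string[] LimitKeywords = { "top", "limit", "offset" };

    // IndexOf-based stand-in for a Contains() overload that accepts a StringComparison.
    public static bool ContainsText(this string source, string value, StringComparison comparison)
    {
        return source.IndexOf(value, comparison) != -1;
    }

    public static bool HasLimitKeyword(string rawSql)
    {
        foreach (string keyword in LimitKeywords)
        {
            if (rawSql.ContainsText(keyword, StringComparison.InvariantCultureIgnoreCase))
                return true;
        }
        return false;
    }
}

public static class Program
{
    public static void Main()
    {
        Console.WriteLine(SqlText.HasLimitKeyword("SELECT TOP 25 * FROM Posts"));   // True
        Console.WriteLine(SqlText.HasLimitKeyword("select * from Posts"));          // False
        // Plain substring matching also hits identifiers that merely contain a keyword.
        Console.WriteLine(SqlText.HasLimitKeyword("select * from BusStops"));       // True
    }
}
```

A stricter variant could match whole words only. Whether that is worth the extra complexity depends on how often such identifiers show up in practice.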

time to read 2 min | 349 words

The back end in NH Prof is responsible for intercepting NHibernate's events, making sense of all the mess, applying best practices suggestions and forwarding to the front end for display.

Second, it is also a good example of how I apply the Open Closed Principle at the architecture level. With NH Prof, there are multiple extension points that I can use to add new features.

Here is a schematic of how things work:

image

Not shown here is the NHibernate Listener (of which, of course, I have several), which is publishing events to the bus. Those events are first handled by the low level message handlers, which publish new events on the bus.

Those are interesting only in the sense that they translate the low level details into events with semantics that we can use in the app itself. Most of those events, as you probably guessed, end up in the Model Building part. This part is responsible for turning a set of unstructured events into a coherent structure.

Along with the model building, we have another extension point here, best practices analysis. Those are implemented as a set of classes that we plug into the model building part. If we want to add a new best practice, we need to create a new class, register it, and we are done.

Here is the checkin for implementing the unbounded row set (which is ticket #51):

image

We add a new class (and a test for the class), register it in the BusFacility and in this case I actually had to fix a bug in the tested scenario, which loaded the wrong item.

I'll post more details about the actual implementation of Unbounded Result Set soon, but I wanted to talk about the architecture that enables this. Because we structured the whole thing around a common core that we can use, anything that fits the core (and most things do) doesn't require any special effort. Apply a new behavior, done.
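The "create a new class, register it, and we are done" extension point can be sketched roughly as follows. All names here (`Statement`, `IStatementCheck`, `ModelBuilder`) are hypothetical and only illustrate the shape of the idea; they are not the NH Prof types, and the real registration goes through the BusFacility mentioned above.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical, simplified view of an intercepted statement.
public class Statement
{
    public string Sql { get; set; }
    public int? RowCount { get; set; }
}

// The extension point: every best practice check implements this.
public interface IStatementCheck
{
    // Returns a warning title, or null when the statement looks fine.
    string Inspect(Statement statement);
}

public class UnboundedResultSetCheck : IStatementCheck
{
    public string Inspect(Statement statement)
    {
        if (statement.RowCount == null || statement.RowCount < 2)
            return null;
        foreach (string keyword in new[] { "top", "limit", "offset" })
        {
            if (statement.Sql.IndexOf(keyword, StringComparison.OrdinalIgnoreCase) != -1)
                return null;
        }
        return "Unbounded result set";
    }
}

// The core stays closed for modification; new behavior arrives through Register().
public class ModelBuilder
{
    private readonly List<IStatementCheck> checks = new List<IStatementCheck>();

    public void Register(IStatementCheck check)
    {
        checks.Add(check);
    }

    public List<string> Analyze(Statement statement)
    {
        var warnings = new List<string>();
        foreach (IStatementCheck check in checks)
        {
            string warning = check.Inspect(statement);
            if (warning != null)
                warnings.Add(warning);
        }
        return warnings;
    }
}

public static class Program
{
    public static void Main()
    {
        var builder = new ModelBuilder();
        builder.Register(new UnboundedResultSetCheck());
        var warnings = builder.Analyze(new Statement { Sql = "select * from Posts", RowCount = 120 });
        Console.WriteLine(string.Join(", ", warnings)); // prints "Unbounded result set"
    }
}
```

Adding another best practice in this sketch means writing one more `IStatementCheck` and one more `Register()` call, with no change to `ModelBuilder` itself.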

time to read 1 min | 125 words

One of the things that I am doing with NH Prof is not only giving you visibility into what NHibernate is doing, but also trying to automate my own experience in analyzing best practices and problematic usages of NHibernate.

NH Prof will detect usage patterns and warn against bad practices and suggest how to deal with them. The first one that I implemented was detecting Select N+1, and the feedback from the beta group was "Wow! I didn't even know that we had this problem, but casual use with the profiler immediately showed it."

Here is the newest feature, detecting & warning about unbounded result sets:

image

And the actual warning:

image

time to read 1 min | 97 words

That was actually a hard feature to implement, since this is not something that NHibernate just gives out. Nevertheless, by trawling through the codebase long enough, I was able to figure out how to do this.

image

As an aside, one of the requested features for NH Prof was to be able to get DB level stats. I am not going to do this for v1.0, but I now have a pretty firm idea about how to implement this. We will have to see how many people request this information.

time to read 2 min | 234 words

Did you know that Windows came with an embedded database?

Did you know that this embedded database is the power behind Active Directory & Exchange?

Did you know that this is actually part of Windows' API and is exposed to developers?

Did you know that it requires no installation and has zero administration overhead?

Did you know there is a .Net API?

Well, the answer for all of that is that you probably didn't know that, but it is true!

The embedded database is called Esent, and the managed library for this API was just released.

This is an implementation of ISAM DB, and I have been playing around with it for the last few days. It isn't as nice for .Net developers as I would like it to be (but Laurion is working on that).

I think making this public is a great thing, and the options that this opens up are quite interesting. I took that for a spin and came up with this tiny bit of code that allows me to store JSON documents:

https://rhino-tools.svn.sourceforge.net/svnroot/rhino-tools/branches/rhino-divandb

It is not done, not nearly done, but the fact that I could rely on the embedded DB to do so made my life so much easier. I wish I knew about that when I played with Rhino Queues, it would have made my life so much simpler.
