Oren Eini

CEO of RavenDB

a NoSQL Open Source Document Database

Get in touch with me:

oren@ravendb.net +972 52-548-6969

Posts: 7,640
|
Comments: 51,263
Privacy Policy · Terms
filter by tags archive
time to read 3 min | 464 words

So, I am sitting there quietly trying to get EF Prof to work in a way that I actually like, when all of a sudden I realize that I am missing something very important, I can’t see the generated queries parameters in the profiler.

Looking closely, I started investigating what appear to be a possible SQL injection issue with EF. My issue was that this query:

entities.Posts.Where(x=>x.Title == “hello”)

Generated the following SQL:

SELECT
1 AS [C1],
[Extent1].[Id] AS [Id],
[Extent1].[Title] AS [Title],
[Extent1].[Text] AS [Text],
[Extent1].[PostedAt] AS [PostedAt],
[Extent1].[BlogId] AS [BlogId],
[Extent1].[UserId] AS [UserId]
FROM [dbo].[Posts] AS [Extent1]
WHERE N'hello' = [Extent1].[Title]

It literally drove me crazy. Eventually I tried this query:

var hello = "hello";
entities.Posts.Where(x=>x.Title==hello);

Which generated:

SELECT
1 AS [C1],
[Extent1].[Id] AS [Id],
[Extent1].[Title] AS [Title],
[Extent1].[Text] AS [Text],
[Extent1].[PostedAt] AS [PostedAt],
[Extent1].[BlogId] AS [BlogId],
[Extent1].[UserId] AS [UserId]
FROM [dbo].[Posts] AS [Extent1]
WHERE [Extent1].[Title] = @p__linq__1

This was more like it.

It seems (and Julie Lerman confirmed it) that EF is sticking constant expressions directly into the SQL, and treating parameters differently.

I am not quite sure why, but from security standpoint, it is obviously not a problem if it does so for constants. It have a lot less hair now, though.

time to read 2 min | 276 words

I was sitting with Julie Lerman today, and we got into a discussion on how to provide more information to EF users. It appears that there is much need for that.

This is what I do, more or less, so we decided to tackle that problem in the bar. Some drinks later, we had a working version of EF Prof that was actually able to intercept all queries coming from the Entity Framework. Initial testing also shows that I’ll be able to provide much more information about EF than I’ll be able to do with Linq to SQL.

Unfortunately, the current way of intercepting EF traffic is extremely invasive, and I don’t really like it. I consider it a good proof of concept, but I am going to spend some of next week trying to figure out a less invasive approach.

In the meanwhile, take a look (not the final look & feel):

image

image

And here is the project that we profile:

image

It supports all the usual niceties, so you get stack trace support, tracking selects (including lazy loading), updates, inserts & deletes. And I tested it on both EF 3.5 and EF 4.0.

I expect to have a private beta starting next week…

time to read 2 min | 297 words

Every now and then I do a quick check on the EF blog, just to see what there status is. My latest peek had caused me to gulp. I am not sure where the EF is going with things, I just know that I don’t really like it.

For a start, take a look at the follow sample from their Code Only mapping (basically Fluent NHibernate):

.Case<Employee>(
e => new {
manager = e.Manager.Id
thisIsADiscriminator = “E”
}
)

There are several things wrong here: “manager” and “thisIsADiscriminator” are strings for all intent and purposes. The compiler isn’t going to check them, they aren’t there to do something, they are just there to avoid being a literal string. But they are strings.

Worse, “thisIsADiscriminator” is a magic string.

Second, and far more troubling, I am looking at this class definition and I cringe:

public class Category{
public int ID {get;set;}
public string Name {get;set;}
public List<Product> Products {get;set;}
}

The problem is quite simple, this class has no hooks for lazy loading. You have to load everything here in one go. Worse, you probably need to load the entire graph in one go. That is scaring me on many levels.

I am not sure how the EF is going to handle it, but short of IL Rewrite techniques, which I don’t think the EF is currently using, this is a performance nightmare waiting to happen.

time to read 4 min | 708 words

Update: It appears that I am wrong, and NHibernate can support this functionality by eagerly loading the association at load time. You can do by specifying lazy="false" (and optionally, outer-join="true") on the many to one association.

Yesterday I presented an interesting problem that pop up with any OR/M that supports inheritance and lazy loading.

Let us say that we have the following entity model:

image_thumb

Backed by the following data model:

image_thumb[2]

As you can see, we map the Animal hierarchy to the Animals table, and we have a polymorphic association between Animal Lover and his/her animal. Where does the problem starts?

Well, let us say that we want to load the animal lover. We do that using the following SQL:

SELECT Name,
	 Animal,
	 Id
FROM AnimalLover
WHERE Id = 1 /* @p0 */

And now we have an animal lover instance:

var animalLover = GetAnimalLoverById(1);
var isDog = animalLover.Animal is Dog;
var isCat = animalLover.Animal is Cat;

Can you guess what would be the result of this code?

The answer is that both isDog nor isCat would be… false.

But how is that?

To answer that question, let us take a look at the SQL that was used to load the animal lover, and let us take a look at a typical example of hydrating entities. I am using Davy’s DAL here to show off the problem, because the code is simple and it demonstrate that the problem is not unique to a particular OR/M, but is shared among all of them (Davy’s DAL doesn’t even support inheritance, for example).

private void SetReferenceProperties<TEntity>(
	TableInfo tableInfo, 
	TEntity entity, 
	IDictionary<string, object> values)
{
	foreach (var referenceInfo in tableInfo.References)
	{
		if (referenceInfo.PropertyInfo.CanWrite == false)
			continue;
		
		object foreignKeyValue = values[referenceInfo.Name];

		if (foreignKeyValue is DBNull)
		{
			referenceInfo.PropertyInfo.SetValue(entity, null, null);
			continue;
		}

		var referencedEntity = sessionLevelCache.TryToFind(
			referenceInfo.ReferenceType, foreignKeyValue);
			
		if(referencedEntity == null)
			referencedEntity = CreateProxy(tableInfo, referenceInfo, foreignKeyValue);
								   
		referenceInfo.PropertyInfo.SetValue(entity, referencedEntity, null);
	}
}

Take a look at what the code is doing, we are currently processing the Aminal property on the AnimalLover class. And we try to find an Animal that was loaded with a primary key matching to the value of the Animal column in the AnimalLovers table.

When we can’t find it, we have to create a lazy loading proxy for the referenced entity. And here is where the conundrum kicks into play. When we have inheritance, we have a real problem here. What is the type of the referenced entity?

From the model, we know that it must be a derivation of Animal of some sort, and we have its PK, but we have no way of knowing which without going to the database for it.

So what are we going to do? Because we don’t have enough information to create a lazy loading proxy of the appropriate type, we actually generate a lazy loading proxy of the type that we do know about, Animal.

But what about when it is being loaded?

Well, that is where the lack of #become in .NET becomes painful, we already have an instance, and we can’t change its types. And we can’t replace the reference on the AnimalLover because someone might have grab a reference to the animal before the lazy load.

The way to handle it is by turning the lazy loading proxy into a real one. We load a new instance that represent the entity, now with the correct type, since we query the DB to find out what it is (along with the rest of the entity’s data).

And the lazy loading proxy that we originally used is now loaded, and any call made on it will be forwarded to the new instance that was loaded.

animalLover.Animal stays an AnimalProxy, and cannot be cast to a Dog or a Cat, even if the actual row it is pointing to is a Dog or a Cat.

time to read 1 min | 193 words

Here is an interesting problem that pop up with any OR/M that supports inheritance and lazy loading.

Let us say that we have the following entity model:

image

Backed by the following data model:

image

As you can see, we map the Animal hierarchy to the Animals table, and we have a polymorphic association between Animal Lover and his/her animal. Where does the problem starts?

Well, let us say that we want to load the animal lover. We do that using the following SQL:

SELECT Name,
	 Animal,
	 Id
FROM AnimalLover
WHERE Id = 1 /* @p0 */

And now we have an animal lover instance:

var animalLover = GetAnimalLoverById(1);
var isDog = animalLover.Animal is Dog;
var isCat = animalLover.Animal is Cat;

Can you guess what would be the result of this code?

I’ll post the actual answer tomorrow…

time to read 2 min | 315 words

Continuing to shadow Davy’s series about building your own DAL, this post is about the Executing Custom Queries.

The post does a good job of showing how you can create a good API on top of what is pretty much raw SQL. If you do have to create your own DAL, or need to use SQL frequently, please use something like that rater than using ADO.Net directly.

ADO.Net is not a data access library, it is the building blocks you use to build a data access library.

A few comments on queries in NHibernate. NHibernate uses a more complex model for queries, obviously. We have a set of abstractions between the query and generated SQL, because we are supporting several query options and a lot more mapping options. But while a large amount of effort goes into translating the user desires to SQL, there is an almost equivalent amount of work going into hydrating the queries. Davy’s support only a very flat model, but with NHibernate, you can specify eager load options, custom DTOs or just plain value queries.

In order to handle those scenarios, NHibernate tracks the intent behind each column, and know whatever a column set represent an entity an association, a collection or a value. That goes through a fairly complex process that I’ll not describe here, but once the first stage hydration process is done, NHibernate has a second stage available. This is why you can write queries such as “select new CustomerHealthIndicator(c.Age,c.SickDays.size()) from Customer c”.

The first stage is recognizing that we aren’t loading an entity (just fields of an entity), the second is passing them to the CustomerHealthIndicator constructor. You can actually take advantage of the second stage yourself, it is called Result Transformer, and you can provide you own and set it a query using SetResultTransformer(…);

time to read 2 min | 260 words

Continuing to shadow Davy’s series about building your own DAL, this post is about the Lazy Loading. Davy support lazy loading is something that I had great fun reading, it is simple, elegant and beautiful to read.

I don’t have much to say about the actual implementation, NHibernate does things in much the same way (we can quibble about minor details such as who holds the identifier and other stuff, but they aren’t really important). A major difference between Davy’s lazy loading approach and NHibernate’s is that Davy doesn’t support inheritance. Inheritance and lazy loading plays some nasty games with NHibernate implementation of lazy loaded entities.

While Davy can get away with loading the values into the same proxy object that he is using, NHibernate must load them into a separate object. Why is that? Let us say that we have a many to one association from AnimalLover to Animal. That association is lazy loaded, so we put a AnimalProxy (which inherit from Animal) as the value in the Animal property. Now, when we want to load the Animal property, NHibernate has to load the entity, at that point, it discovers that the entity isn’t actually an Animal, but a Dog.

Obviously, you cannot load a Dog into an AnimalProxy. What NHibernate does in this case is load the entity into another object (a Dog instance) and once that instance is loaded, direct all methods calls to the new instance. It sounds complicated, but it is actually quite elegant and transparent from the user perspective.

time to read 4 min | 760 words

Continuing to shadow Davy’s series about building your own DAL, this post is about the Session Level Cache.

The session cache (first level cache in NHibernate terms) exists to support one main scenario, in a single session, a row in the database is represented by a single instance. That means that the session needs to track all the instances that it loads and be able to search through them. Davy does a good job covering how it is used, and the implementation is quite similar to the way it is done in NHibernate.

Davy’s implementation is to use a nested dictionary to hold instances per entity type. This is done mainly to support  RemoveAllInstancesOf<TEntity>(), a method that is unique to Day’s DSL. The reasoning for that method are interesting:

When a user executes a custom DELETE statement, there is no way for us to know which entities were actually removed from the database. But if any of those deleted entities happen to remain in the SessionLevelCache, this could lead to buggy application code whenever a piece of code tries to retrieve a specific entity which has already been removed from the database, but is still present in the SessionLevelCache. In order to deal with this scenario, the SessionLevelCache has a ClearAll and a RemoveAllInstancesOf method which you can use from your application code to either clear the entire SessionLevelCache, or to remove all instances of a specific entity type from the cache.

Personally, I think this is using an ICBM to crack eggshells. But I am probably being unfair. NHibernate has much the same issue, if you issue a Delete via SQL or HQL queries, NHibernate doesn’t have a way to track what was actually delete and deal with it. With NHibernate, it doesn’t tend to be a problem for the session level cache, mostly because of usage habits than anything else. The session used to do so rarely have to deal with entities loaded that were deleted by the query issue (and if it does, the user needs to handle that by calling Evict() on all the objects manually). NHibernate doesn’t try to support this scenario explicitly for the session cache. It does support this very feature for the second level cache.

It make sense, though. With NHibernate, in the vast majority of cases deleting is going to be done using NHibernate itself, rather than a special query. With Davy’s DAL, the usage of SQL queries for deletes is going to be much higher.

Another interesting point in Davy’s post is the handling of queries:

When a custom query is executed, or when all instances are retrieved, there is no way for us to exclude already cached entity instances from the result of the query. Well, theoretically speaking you could attempt to do this by adding a clause to the WHERE statement of each query that would prevent cached entities from being loaded. But then you might have to add the cached entity instances to the resulting list of entities anyways if they would otherwise satisfy the other query conditions. Obviously, trying to get this right is simply put insane and i don’t think there’s any DAL or ORM that actually does this (even if there was, i can’t really imagine any of them getting this right in every corner case that will pop up).

So a good compromise is to simply check for the existence of a specific instance in the cache before hydrating a new instance. If it is there, we return it from the cache and we skip the hydration for that database record. In this way, we avoid having to modify the original query, and while we could potentially return a few records that we already have in memory, at least we will be sure that our users will always have the same reference for any particular database record.

This is more or less how NHibernate operates, and for much the same reasoning. But there is a small twist. In order to ensure query coherency between the data base queries and in memory entities, NHibernate will optionally try to flush all the items in the session level cache that have been changed that may be affected by the query. A more detailed description of this can be found here, Davy’s DAL doesn’t do automatic change tracking, so this is not a feature that can be easily added with this prerequisite.

time to read 4 min | 652 words

Continuing to shadow Davy’s series about building your own DAL, this post is about hydrating entities.

Hydrating Entities is the process of taking a row from the database and turning it into an entity, while de-hydrating in the reverse process, taking an entity and turning it into a flat set of values to be inserted/updated.

Here, again, Davy’s has chosen to parallel NHibernate’s method of doing so, and it allows us to take a look at a very simplified version and see what the advantages of this approach is.

First, we can see how the session level cache is implemented, with the check being done directly in the entity hydration process. Davy has some discussion about the various options that you can choose at that point, whatever to just use the values from the session cache, to update the entity with the new values or to throw if there is a change conflict.

NHibernate’s decision at this point was to assume that the entity that we have is correct, and ignore any changes made in the meantime to the database. That turn out to be a good approach, because any optimistic concurrency checks that we might want will run when we commit the transaction, so there isn’t much difference from the result perspective, but it does simplify the behavior of NHibernate.

Next, there is the treatment of reference properties, what NHibernate call many-to-one associations. Here is the relevant code (editted slightly so it can fit on the blog width):

private void SetReferenceProperties<TEntity>(
	TableInfo tableInfo, 
	TEntity entity, 
	IDictionary<string, object> values)
{
	foreach (var referenceInfo in tableInfo.References)
	{
		if (referenceInfo.PropertyInfo.CanWrite == false)
			continue;
		
		object foreignKeyValue = values[referenceInfo.Name];

		if (foreignKeyValue is DBNull)
		{
			referenceInfo.PropertyInfo.SetValue(entity, null, null);
			continue;
		}

		var referencedEntity = sessionLevelCache.TryToFind(
			referenceInfo.ReferenceType, foreignKeyValue);
			
		if(referencedEntity == null)
			referencedEntity = CreateProxy(tableInfo, referenceInfo, foreignKeyValue);
								   
		referenceInfo.PropertyInfo.SetValue(entity, referencedEntity, null);
	}
}

There are a lot of things going on here, so I’ll take them one at a time.

You can see how the uniquing process is going on. If we already have the referenced entity loaded, we will get it directly from the session cache, instead of creating a separate instance of it.

It also shows something that Davy’s promised to touch in a separate post, lazy loading. I had an early look at his implementation and it is pretty. So I’ll skip that for now.

This piece of code also demonstrate something that is very interesting. The lazy loaded inheritance many to one association conundrum.  Which I’ll touch on a future post.

There are a few other implications of the choice of hydrating entities in this fashion. For a start, we are working with detached entities this way, the entity doesn’t have to have a reference to the session (except to support lazy loading). It also means that our entities are pure POCO, we handle it all completely externally to the entity itself.

It also means that if we would like to handle change tracking (with Davy’s DAL currently doesn’t do), we have a much more robust way of doing so, because we can simply dehydrate the entity and compare its current state to its original state. That is exactly how NHibernate is doing it. This turn out to be a far more robust approach, because it is safe in the face of method modifying state internally, without going through properties or invoking change tracking logic.

I wanted to also touch about a few things that makes the NHibernate implementation of the same thing a bit more complex. NHibernate supports reflection optimization and multiple ways of actually setting the values on the entity, is also support things like components and multi column properties, which means that there isn’t a neat ordering between properties and columns that make the Davy’s code so orderly.

time to read 2 min | 224 words

Continuing the series about Davy’s build your own DAL, this post is talking about CRUD functionality.

One of the more annoying things about any DAL is dealing with the repetitive nature of talking to the data source. One of Davy’s stated goal in going this route is to totally eliminate those. CRUD functionality shouldn’t be something that you have to work with for each entity, it is just something that exists and that you get for free whenever you are using it.

I think he was very successful there, but I also want to talk about his method in doing so. Not surprisingly, Davy’s DAL is taking a lot of concepts from NHibernate, simplifying them a bit and then applying them. His approach for handling CRUD operations is reminiscent of how NHibernate itself works.

He divided the actual operations into distinct classes, called DatabaseActions, and he has things like FindAllAction, InsertAction, GetByIdAction.

This architecture gives you two major things, maintaining things is now far easier, and playing around with the action implementations is far easier. It is just a short step away from Davy’s Database Actions to NHibernate’s event & listeners approach.

This is also a nice demonstrations of the Single Responsibility Principle. It makes maintaining the software much easier than it would be otherwise.

FUTURE POSTS

No future posts left, oh my!

RECENT SERIES

  1. API Design (10):
    29 Jan 2026 - Don't try to guess
  2. Recording (20):
    05 Dec 2025 - Build AI that understands your business
  3. Webinar (8):
    16 Sep 2025 - Building AI Agents in RavenDB
  4. RavenDB 7.1 (7):
    11 Jul 2025 - The Gen AI release
  5. Production postmorterm (2):
    11 Jun 2025 - The rookie server's untimely promotion
View all series

Syndication

Main feed ... ...
Comments feed   ... ...