Designing the Entity Framework 2nd level cache

One of the things that I am working on is another commercial extension to EF, a 2nd level cache. At first, I planned to implement something similar to the way NHibernate does this: two layers of caching, one for entity data and a second for query results, where I would store only scalar information and ids.

That turned out to be quite hard. In fact, it turned out to be hard enough that I almost gave up on it. Sometimes I feel that extending EF is like hitting your head against the wall: eventually either you collapse or the wall falls down, but either way you are left with a headache.

At any rate, I eventually figured out a way to get EF to tell me about entities in queries and now the following works:

// will hit the DB
using (var db = new Entities(conStr))
{
    db.Blogs.Where(x => x.Title.StartsWith("The")).FirstOrDefault();
}

// will NOT hit the DB (assuming the first query loaded the blog with Id 1),
// will use the cached entity data instead
using (var db = new Entities(conStr))
{
    db.Blogs.Where(x => x.Id == 1).FirstOrDefault();
}

The ability to handle such scenarios is an important part of what makes the 2nd level cache useful, since it means that you aren’t limited to just caching a query, but can perform far more sophisticated caching. Because entities are cached individually, a change to one entity only needs to touch that entity's entry, which means better cache freshness and a lot fewer unnecessary cache cleanups.
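To make the idea a bit more concrete, here is a minimal sketch of a two-layer cache of the kind described above: one layer holds entity data by id, the other holds query results as lists of ids. This is not the actual extension code and it does not touch EF at all. `SecondLevelCache`, `StoreQueryResult`, `TryGetById`, `TryGetQuery`, `Evict`, the `Blog` record and the string query keys are all hypothetical names that I am assuming purely for illustration.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical entity, used only for this sketch.
public record Blog(int Id, string Title);

// Hypothetical two-layer cache; NOT the real extension API.
public class SecondLevelCache
{
    // Layer 1: entity data, keyed by primary key.
    private readonly Dictionary<int, Blog> _entities = new Dictionary<int, Blog>();

    // Layer 2: query results stored as ids only, keyed by a query string.
    private readonly Dictionary<string, List<int>> _queries = new Dictionary<string, List<int>>();

    // Called after a query actually hit the database.
    public void StoreQueryResult(string queryKey, IEnumerable<Blog> results)
    {
        var ids = new List<int>();
        foreach (var blog in results)
        {
            _entities[blog.Id] = blog; // every entity becomes reusable on its own
            ids.Add(blog.Id);
        }
        _queries[queryKey] = ids;
    }

    // A by-id lookup can be served by any earlier query that loaded the entity.
    public bool TryGetById(int id, out Blog blog) => _entities.TryGetValue(id, out blog);

    // Rebuild a cached query result from the stored ids.
    public bool TryGetQuery(string queryKey, out List<Blog> blogs)
    {
        blogs = new List<Blog>();
        if (!_queries.TryGetValue(queryKey, out var ids)) return false;
        foreach (var id in ids)
        {
            // In this sketch, a missing entity simply makes the cached query a miss.
            if (!_entities.TryGetValue(id, out var blog)) return false;
            blogs.Add(blog);
        }
        return true;
    }

    // Evicting one entity leaves all other cached entities untouched.
    public void Evict(int id) => _entities.Remove(id);
}

public static class Demo
{
    public static void Main()
    {
        var cache = new SecondLevelCache();

        // Pretend this came back from the DB for Title.StartsWith("The").
        cache.StoreQueryResult("Blogs:TitleStartsWith:The", new[] { new Blog(1, "The first blog") });

        // A different query (by id) is answered from the entity layer.
        Console.WriteLine(cache.TryGetById(1, out var hit) ? $"cache hit: {hit.Title}" : "cache miss");
        // prints "cache hit: The first blog"
    }
}
```

The point of splitting things this way is that the query layer never duplicates entity data, so evicting or refreshing a single entity does not require throwing away everything that was cached alongside it.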

Next, I need to handle partially cached queries, cached query invalidation and a bunch of other minor details, but the main hurdle seems to have been dealt with (I am willing to lay odds that I will regret this statement).