Designing the Entity Framework 2nd level cache


One of the things that I am working on is another commercial extension to EF, a 2nd level cache. At first, I thought to implement something similar to the way NHibernate does this, that is, to create two layers of caching, one for entity data and the second for query results where I would store only scalar information and ids.

That turned out to be quite hard. In fact, it turned out to be hard enough that I almost gave up on it. Sometimes I feel that extending EF is like hitting your head against the wall: eventually either you collapse or the wall falls down, but either way you are left with a headache.

At any rate, I eventually figured out a way to get EF to tell me about entities in queries and now the following works:

// will hit the DB
using (var db = new Entities(conStr))
{
    db.Blogs.Where(x => x.Title.StartsWith("The")).FirstOrDefault();
}

// will NOT hit the DB, will use cached data for that
using (var db = new Entities(conStr))
{
    db.Blogs.Where(x => x.Id == 1).FirstOrDefault();
}

The ability to handle such scenarios is an important part of what makes the 2nd level cache useful, since it means that you aren’t limited to just caching a query, but can perform far more sophisticated caching. It means better cache freshness and far fewer unnecessary cache cleanups.
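To make the idea concrete, here is a minimal sketch of a two-layer cache: one layer holds entity data keyed by id, the other holds query results as lists of ids. This is not how the actual extension hooks into EF. `SecondLevelCache`, `Query`, `GetById`, the string query key and the `load` delegates (which stand in for the real database call) are all hypothetical names made up for illustration, and the sketch ignores invalidation, expiration and thread safety.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical entity, mirroring the Blogs set used in the post.
public class Blog
{
    public int Id { get; set; }
    public string Title { get; set; }
}

// Hypothetical two-layer cache: entity data keyed by id, query results stored as ids only.
public class SecondLevelCache
{
    private readonly Dictionary<int, Blog> entities = new Dictionary<int, Blog>();
    private readonly Dictionary<string, List<int>> queries = new Dictionary<string, List<int>>();

    // Counts how often the cache had to fall back to the "database" delegate.
    public int DatabaseHits { get; private set; }

    public List<Blog> Query(string queryKey, Func<IEnumerable<Blog>> load)
    {
        List<int> ids;
        if (queries.TryGetValue(queryKey, out ids) && ids.All(id => entities.ContainsKey(id)))
            return ids.Select(id => entities[id]).ToList(); // rebuilt from the entity cache

        DatabaseHits++;
        List<Blog> results = load().ToList();
        foreach (Blog blog in results)
            entities[blog.Id] = blog; // every entity seen in a query is cached by id
        queries[queryKey] = results.Select(b => b.Id).ToList();
        return results;
    }

    public Blog GetById(int id, Func<int, Blog> load)
    {
        Blog blog;
        if (entities.TryGetValue(id, out blog))
            return blog; // served from the entity cache, no database hit

        DatabaseHits++;
        blog = load(id);
        if (blog != null)
            entities[id] = blog;
        return blog;
    }
}

public static class Demo
{
    // Returns the number of database hits after a title query followed by an id lookup.
    public static int Run()
    {
        var table = new List<Blog> { new Blog { Id = 1, Title = "The Cache" } };
        var cache = new SecondLevelCache();

        cache.Query("Title StartsWith 'The'", () => table.Where(b => b.Title.StartsWith("The")));
        cache.GetById(1, id => table.FirstOrDefault(b => b.Id == id));

        return cache.DatabaseHits;
    }
}
```

In `Demo.Run`, the title query goes to the in-memory "table" and caches the blog it returns, so the later lookup by id is answered from the entity cache. A cache that stored only whole query results could not do that, because the second query is a different query.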

Next, I need to handle partially cached queries, cached query invalidation and a bunch of other minor details, but the main hurdle seems to have been dealt with (I am willing to lay odds that I will regret this statement).


Tags:

O/R Mappers

Comments

13 Jan 2010
16:05 PM

Ayende Rahien

Mike,

There are two main problems with the EF CachingProvider.

a) it is very invasive.

b) it does caching in a very brute force manner, that is, it handles all caching using the queries. Having separate entity and query caches makes the process significantly more efficient.

13 Jan 2010
16:17 PM

Mike Chaliy

Ayende, it seems that this will be your second provider. Have you considered extracting the common infrastructure for them and open sourcing it?

13 Jan 2010
16:35 PM

Ayende Rahien

Mike,

How will this be a second provider?

And I intend to make this into a commercial product.

13 Jan 2010
17:03 PM

Mike Chaliy

My guess was that first one was for EfProf.

Regarding open source vs. commercial product: I had to ask :). Actually, I am facing having to implement multitenancy for EF, and the only way is to use yet another plugged-in provider. I am just frustrated with the number of stupid classes that a minimal implementation of the provider requires. So I believe this is a good candidate for sharing. I hope I will be able to share this.

13 Jan 2010
19:41 PM

Rafal

Hi, I think the most natural place for caching query results is the database server, not the application. And for some reason database servers don't generally do it (at least MS SQL doesn't). The rationale is that in frequently modified tables the cache would be trashed way too often; besides, it would work only if the application was sending mostly identical queries over and over, which is rarely the case. Why do you think caching query results in the application would help, especially when the cache is generic and built into the db access interface?

13 Jan 2010
19:49 PM

Stephen

Is EF not extensible or are you just unaware of its api? nhibernate would of course be easier for you considering you know a considerable amount about its guts.

Seems to me that if EF wasn't extensible then you wouldn't be able to do any of these things period.. a .. 'takes a long while but I got there' sounds more like learning the api.

14 Jan 2010
03:20 AM

Ayende Rahien

Rafal,

The problem is that the whole idea of a cache is to avoid hitting the DB server.

Much of the perf boost is the fact that you don't have to go and hit a remote server, and DB servers tend to be awfully busy, anyway.

Most DBs already do what you describe, by loading stuff to memory, so that isn't an issue, but the database has to also worry about things like ACID, which is a whole different kettle of fish.

And in most apps, you DO perform a lot of repeated queries. Take this blog, for instance.

We have the set of queries to show this page, and the set of pages that are being used at any one time. Caching those results would be very helpful in the long run, since you have a high cache hit probability.

14 Jan 2010
03:24 AM

Ayende Rahien

Stephen,

EF has very few real extension points. The major one is the provider, but the problem is that by the time you reach the provider it is usually too late to do the sort of things that you want to do.

With NHibernate, I have many extension points, at various levels, so I can not only pick the level of granularity that I have but also the particular location for the extension. That means that extending NHibernate to do something is six orders of magnitude easier, even if I want to do something that NHibernate was never meant to do.

14 Jan 2010
08:09 AM

Demis Bellot

I'm not sure why you would bother implementing a cache inside an ORM. The most efficient cache is kept at the application response level (i.e. page output cache, dto response, etc).

If for some reason you want to cache at the data level, you're better off using a dedicated cache service or DHT, which can scale horizontally, be shared by multiple app servers, support expiration, etc.

14 Jan 2010
11:01 AM

Ayende Rahien

Demis,

I don't think that you understand what I have in mind. You might want to read a bit about NHibernate's 2nd level cache to understand how this works.

14 Jan 2010
12:12 PM

Frank Quednau

Hmm, 6 orders of magnitude easier...If the required amount of time to accomplish task A is proportional to the involved complexity, and also assuming that easiness of an action is directly related to its complexity, and let's further assume that the constant between required amount of time and complexity is 1, a feature that is implemented in 1 minute in NH takes about a million minutes in EF, or almost 2 years.

Careful with the superlatives there :)

14 Jan 2010
12:35 PM

Ayende Rahien

Frank,

Good catch :-), but you get my drift, I assume

29 Jan 2010
14:24 PM

Chris

"and also assuming that easiness of an action is directly related to its complexity" Don't you mean inversely? You have lots of assumptions there ;)

23 Feb 2010
14:50 PM

Frank Quednau

"Don't you mean inversely?"

nope, I merely assumed a relationship without specifying its kind. ;)

Oren Eini

CEO of RavenDB