RavenDB & HTTP Caching

RavenDB’s Client API uses the session / unit of work model internally. That means that this code will only go to the database once:

session.Load<User>("users/1");
session.Load<User>("users/1");
session.Load<User>("users/1");

All three calls will also return the same instance. This is just the identity map at work; in NHibernate, it is also called the first level cache or the session level cache.
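
To make the identity map concrete, here is a minimal sketch, written in Python rather than the client's C#. The `Session` class, its `load` method and `fake_fetch` are hypothetical stand-ins for illustration, not RavenDB's actual implementation.

```python
class Session:
    """Toy unit-of-work session with an identity map (hypothetical, not RavenDB's API)."""

    def __init__(self, fetch):
        self._fetch = fetch          # callable that really goes to the database
        self._identity_map = {}      # document id -> instance already loaded

    def load(self, doc_id):
        # Only go to the database the first time an id is requested.
        if doc_id not in self._identity_map:
            self._identity_map[doc_id] = self._fetch(doc_id)
        return self._identity_map[doc_id]


db_calls = []


def fake_fetch(doc_id):
    """Simulated database round trip that records every call."""
    db_calls.append(doc_id)
    return {"id": doc_id, "name": "example"}


session = Session(fake_fetch)
first = session.load("users/1")
second = session.load("users/1")
third = session.load("users/1")
```

After the three loads, `db_calls` holds a single entry and all three variables refer to the same object, which is the behavior described above.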

Having implemented that, the natural next question was: what about the second level cache? NHibernate’s second level cache is complicated (it takes an hour just to explain how exactly it works, and that is while skipping all the actual implementation details).

For a while, my response was that we don’t actually need that: RavenDB is fast enough that we don’t need caching. Except that I forgot about the Fallacies of Distributed Computing, the first three of which are:

The network is reliable.
Latency is zero.
Bandwidth is infinite.

More specifically, caching can help with the third fallacy: when you are querying potentially large documents (or a large set of documents), you are going to spend most of your time on the network, sending bytes to and fro.

Avoiding that is what we actually need caching for.

I was slightly depressed that I actually had to implement the same complicated logic as NHibernate for caching, so I dawdled in implementing this. And suddenly it dawned on me that as usual, I was being stupid.

RavenDB is REST based. One of the important parts of REST is that:

Cacheable
As on the World Wide Web, clients are able to cache responses. Responses must therefore, implicitly or explicitly, define themselves as cacheable or not to prevent clients reusing stale or inappropriate data in response to further requests. Well-managed caching partially or completely eliminates some client–server interactions, further improving scalability and performance.

RavenDB is an HTTP server, in the end. Why not use HTTP caching?

That required some thought, I’ll admit. It couldn’t be that simple, right?

HTTP caching is a somewhat complex topic. If you think it is not, talk to me after reading this 24-page document describing it. But in essence, I am actually using only a small bit of it.

Whenever RavenDB sends a response to a GET request (the only thing that can be safely cached), it adds an ETag header. The ETag header stands for Entity Tag, and it changes every time that the resource is changed.

RavenDB already generated ETags for documents and attachments; those are part of how we implement optimistic concurrency. Since we already had those, we could move to the next stage: have the client remember the responses for all GET requests, and when a new request is made for a URL that we have already fetched, generate an If-None-Match header for that request.

RavenDB then checks whether the ETag that the client holds matches the ETag on the server and, if so, generates a 304 Not Modified response. That instructs the client that it can safely use the cached response.
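
The client side of that exchange can be sketched roughly as follows. This is an illustration in Python under stated assumptions, not the code in RavenDB's client: `CachingClient`, `fake_server`, the `/docs/users/1` URL and the ETag values are all made up, and the transport is a plain function so the example runs without a network.

```python
class CachingClient:
    """Remembers GET responses per URL and revalidates them with If-None-Match."""

    def __init__(self, send):
        self._send = send            # callable(url, headers) -> (status, headers, body)
        self._cache = {}             # url -> (etag, body)

    def get(self, url):
        headers = {}
        cached = self._cache.get(url)
        if cached is not None:
            # We have fetched this URL before: ask the server to validate our copy.
            headers["If-None-Match"] = cached[0]
        status, response_headers, body = self._send(url, headers)
        if status == 304 and cached is not None:
            return cached[1]         # server confirmed the cached body is current
        etag = response_headers.get("ETag")
        if status == 200 and etag is not None:
            self._cache[url] = (etag, body)
        return body


# Hypothetical in-memory "server": url -> (etag, body).
documents = {"/docs/users/1": ('"1"', '{"name": "example"}')}
bodies_sent = []


def fake_server(url, headers):
    etag, body = documents[url]
    if headers.get("If-None-Match") == etag:
        return 304, {"ETag": etag}, None      # no body goes over the wire
    bodies_sent.append(url)
    return 200, {"ETag": etag}, body


client = CachingClient(fake_server)
first = client.get("/docs/users/1")
second = client.get("/docs/users/1")
```

The second call still makes a round trip, but only headers travel in both directions; the body is served from the local cache.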

In order to fully implement caching on the client, that was all we had to do. On the server side, we had to modify a few endpoints to properly generate an ETag, and a 304 if the client sent us the current value in If-None-Match. With RavenDB, this is handled very deep in the guts of the Client API, directly on top of the HTTP layer. It is on by default, and it should drastically reduce the amount of data sent across the network when the data hasn’t been modified.
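
A server endpoint along those lines might look like the sketch below. It is written in Python for illustration and rests on an assumption of mine: that the ETag can be derived from a stored version number, so it is known before the response body is built. The names `handle_get`, `store` and `render` are hypothetical, and the header comparison is simplified to a single exact match (real If-None-Match values may carry a list of tags or `*`).

```python
def handle_get(store, path, request_headers, render):
    """Answer a GET, short-circuiting with 304 when the client's ETag is current.

    store maps path -> (version, data); render(data) builds the response body.
    """
    if path not in store:
        return 404, {}, b""
    version, data = store[path]
    # Assumption: the ETag comes from the stored version, not from the body,
    # so it is available without doing the work of rendering.
    etag = '"%d"' % version
    if request_headers.get("If-None-Match") == etag:
        return 304, {"ETag": etag}, b""       # nothing rendered, nothing sent
    return 200, {"ETag": etag}, render(data)


render_calls = []


def render(data):
    """Stand-in for the expensive part of serving a request."""
    render_calls.append(data)
    return data.encode("utf-8")


store = {"/docs/users/1": (1, '{"name": "example"}')}
status1, headers1, body1 = handle_get(store, "/docs/users/1", {}, render)
status2, headers2, body2 = handle_get(
    store, "/docs/users/1", {"If-None-Match": headers1["ETag"]}, render
)
```

The first request returns 200 with a body; the second returns 304 without calling `render` at all.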

Please note that unlike NHibernate’s second level cache, we don’t need a distributed cache to ensure consistency. Each node has its own local cache, but all of them will always get valid results, thanks to RavenDB’s ETag checks. In fact, the biggest challenge was figuring out how to cheaply generate a valid ETag without performing the actual work for the request.

Tags:

Raven

Comments

11 Jan 2011
14:37 PM

Jason Meckley

So, http (2nd level) cache is on by default, just update the client and server binaries? That is frictionless :)

11 Jan 2011
14:40 PM

Ayende Rahien

Jason,

Yep, pretty much

11 Jan 2011
14:41 PM

Brian Vallelunga

This is really great, but I have an implementation question. Where is the client cache stored? Is it in the DocumentSession object, DocumentStore, or somewhere else?

I ask because in a web application the DocumentSession will likely be created and destroyed per request, making the cache not too useful, unless it's a static property of the session that sticks around.

11 Jan 2011
14:43 PM

Ayende Rahien

This is stored at the AppDomain level; it is a static property.

11 Jan 2011
15:25 PM

Patrick Huizinga

For those who get scared by RFCs, I found this Caching Tutorial to be a good read.

11 Jan 2011
15:28 PM

Yuriy

Can cache size be somehow limited?

11 Jan 2011
15:34 PM

El Jobso

Yep Oren, welcome to The Internet ;-) , a place that is essentially a Representational Resource State Transportation System.

All you need (well, 99.9%) is already there!

11 Jan 2011
15:45 PM

Ayende Rahien

Yuriy,

Internally this is a MemoryCache with the name:

"Raven.Client.Client.HttpJsonRequest.Cache"

You can configure this any way you want.

11 Jan 2011
21:43 PM

stephane

Ha, I was at a .NET user group session by Glenn Block about the next WCF API for HTTP endpoints, where he explained exactly that.

Wouldn't it be possible to use it? It is available on codeplex I think.

12 Jan 2011
01:20 AM

Cassio Tavares

Ayende, are you using any third party API to implement REST and JSON serialization?

Like stephane said, Glenn Block is heading a project to support REST over WCF. There is OpenRasta too, but I would like to hear more opinions.

I know that WCF REST doesn't support ETag yet but will in the future. OpenRasta probably already supports it.

12 Jan 2011
07:59 AM

Ayende Rahien

Stephane,

Can you send me a link to this?

12 Jan 2011
07:59 AM

Ayende Rahien

Cassio,

No, just standard HTTP calls from .NET.

12 Jan 2011
08:52 AM

Cassio Tavares

Glenn is working on this project

wcf.codeplex.com

His blog - http://codebetter.com/glennblock/

But I'm pretty sure ETag is not implemented

It is in preview version and lacks docs, but is open source. You can digest it in one morning. :)

12 Jan 2011
09:12 AM

Ayende Rahien

Cassio,

I have a working version, one that supports ETags and caching and everything.

It is not a burden to maintain, so I think I'll not use it.

12 Jan 2011
10:07 AM

Ayende Rahien

Glenn,

Take a look at how this is implemented in RavenDB:

github.com/.../HttpJsonRequest.cs

Maybe 20 lines of code, and it works. No configuration, no need to understand an extensibility mechanism.

12 Jan 2011
10:10 AM

blogs.msdn.com/gblock

Ayende I was simply responding to the question of ETags. I wasn't necessarily saying you should take a dependency on it.

12 Jan 2011
10:13 AM

blogs.msdn.com/gblock

Just as a side note, the code for the processors is going to get much cleaner / less verbose.

12 Jan 2011
10:14 AM

Ayende Rahien

Glenn,

Am I mistaken, or is the code you posted the server code?

12 Jan 2011
17:44 PM

blogs.msdn.com/gblock

Yes that is just an illustration of the server side of generating ETags.

Oren Eini

CEO of RavenDB