NHibernate and the second level cache tips
I have been tearing my hair out today, because I couldn't figure out why something that should have worked didn't (second level caching, obviously).
Along the way, I found out several things that you should be aware of. First of all, let us talk about what the feature is, shall we?
NHibernate is designed as an enterprise OR/M product, and as such, it has very good support for running in web farm scenarios. This support includes running alongside distributed caches, including immediate farm-wide updates. NHibernate goes to great lengths to ensure cache consistency in these scenarios (it is not perfect, but it is very good). A lot of the things that tripped me up today were related to just that: NHibernate was working to ensure cache consistency and I wasn't aware of it.
The way it works, NHibernate keeps three caches.
- The entities cache - the entity data is disassembled and then put in the cache, ready to be assembled to entities again.
- The queries cache - the identifiers of entities returned from queries, but not the data itself (since this is in the entities cache).
- The update timestamp cache - the last time a table was written to.
The last cache is very important, since it ensures that the cache will not serve stale results.
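To make the split between the three caches concrete, here is a toy sketch in Python. It is not NHibernate code: the dictionaries, the `User` shape and every function name are made up for illustration, and the only thing taken from the description above is what each cache holds.

```python
# Toy model only: plain dicts standing in for the three caches.
entity_cache = {}       # (entity name, id) -> disassembled plain values
query_cache = {}        # (query text, parameters) -> (timestamp, list of ids)
update_timestamps = {}  # table name -> timestamp of the last write


def disassemble(entity):
    """Reduce an entity to a tuple of plain values (hypothetical User shape)."""
    return (entity["name"], entity["email"])


def assemble(entity_id, state):
    """Build a fresh entity object from the cached plain values."""
    name, email = state
    return {"id": entity_id, "name": name, "email": email}


def cache_query_result(query, params, timestamp, entities):
    """Store entity data in the entities cache and only the ids in the queries cache."""
    for entity in entities:
        entity_cache[("User", entity["id"])] = disassemble(entity)
    query_cache[(query, params)] = (timestamp, [e["id"] for e in entities])


def load_from_cache(query, params):
    """Resolve the cached ids back into newly assembled entities."""
    _, ids = query_cache[(query, params)]
    return [assemble(i, entity_cache[("User", i)]) for i in ids]
```

Note that `load_from_cache` returns new objects each time. The cache never hands out the instance that was originally stored, only the data needed to rebuild it.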
Now, when we come to actually using the cache, we have the following semantics.
- Each session is associated with a timestamp on creation.
- Every time we put query results in the cache, the timestamp of the executing session is recorded.
- The timestamp cache is updated whenever a table is written to, but in a tricky sort of way:
- When we perform the actual writing, we write a value that is somewhere in the future to the cache. So all queries that hit the cache now will find their cached results invalid, and then hit the DB to get the new data. Since we are in the middle of a transaction, they would wait until we finish the transaction. If we are using a low isolation level, and another thread / machine attempts to put the old results back in the cache, it wouldn't hold, because the update timestamp is in the future.
- When we perform the commit on the transaction, we update the timestamp cache with the current value.
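The two-step update of the timestamp cache can be sketched as a small Python model. This is an illustration of the semantics listed above, not NHibernate's implementation: the logical clock, the `LOCK_TIMEOUT` offset and the class and method names are all assumptions.

```python
import itertools

_clock = itertools.count(1)  # hypothetical logical clock: 1, 2, 3, ...
LOCK_TIMEOUT = 1000          # hypothetical "somewhere in the future" offset


def next_timestamp():
    return next(_clock)


class UpdateTimestamps:
    """Toy model: remembers the last write time per table."""

    def __init__(self):
        self._last_write = {}

    def pre_invalidate(self, table):
        # The write is executed but not committed: stamp the table in the future.
        self._last_write[table] = next_timestamp() + LOCK_TIMEOUT

    def invalidate(self, table):
        # The transaction committed: stamp the table with the current time.
        self._last_write[table] = next_timestamp()

    def is_up_to_date(self, tables, result_timestamp):
        # Cached results are valid only if stamped after every table's last write.
        return all(self._last_write.get(t, 0) < result_timestamp for t in tables)


stamps = UpdateTimestamps()
cached_before = next_timestamp()  # results cached before any write
valid_before_write = stamps.is_up_to_date(["Users"], cached_before)

stamps.pre_invalidate("Users")    # write executed, transaction still open
late_put = next_timestamp()       # another thread re-caches old results now
valid_during_tx = stamps.is_up_to_date(["Users"], late_put)

stamps.invalidate("Users")        # commit
cached_after = next_timestamp()   # results cached after the commit
valid_late_put = stamps.is_up_to_date(["Users"], late_put)
valid_after_commit = stamps.is_up_to_date(["Users"], cached_after)
```

In this model the old results that were put back during the transaction stay invalid even after the commit, because their timestamp is older than the commit timestamp. Only results cached after the commit pass the check.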
Now, let us think about the meaning of this, shall we?
If a session has performed an update to a table, committed the transaction and then executed a cached query, the results that query puts in the cache are not valid. That is because the timestamp written to the update cache is the transaction commit timestamp, while the query timestamp is the session's timestamp, which obviously comes earlier.
The update timestamp cache is not updated until you commit the transaction! This is to ensure that you will not read "uncommitted values" from the cache.
Another gotcha is that if you open a session with your own connection, it will not be able to put anything in the cache (all its cached queries will have invalid timestamps!)
In general, those are not things that you need to concern yourself with, but I spent some time today just trying to get tests for the second level caching working, and it took me a while to realize that in the tests I didn't use transactions and I used the same session for querying as for performing the updates.
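The same-session problem can be reproduced in a toy Python model. Again, this is a sketch under assumptions rather than NHibernate code: `Session`, `commit_write` and `cached_query` are invented names, and the model covers only the rule that cached query results carry the timestamp of the session that executed them.

```python
import itertools

_clock = itertools.count(1)  # hypothetical logical clock
last_write = {}              # table -> timestamp of the last committed write
query_cache = {}             # query -> (session timestamp, rows)


class Session:
    def __init__(self):
        self.timestamp = next(_clock)  # fixed when the session is created

    def commit_write(self, table):
        # Committing a write stamps the table with the current time.
        last_write[table] = next(_clock)

    def cached_query(self, query, table, load_rows):
        hit = query_cache.get(query)
        if hit is not None and last_write.get(table, 0) < hit[0]:
            return "hit", hit[1]
        rows = load_rows()  # go to the "database"
        # The cached entry is stamped with the session's creation time.
        query_cache[query] = (self.timestamp, rows)
        return "miss", rows


def load_users():
    return ["alice", "bob"]


same = Session()
same.commit_write("Users")
first = same.cached_query("from User", "Users", load_users)[0]    # "miss"
second = same.cached_query("from User", "Users", load_users)[0]   # "miss" again

fresh = Session()  # opened after the commit
third = fresh.cached_query("from User", "Users", load_users)[0]   # "miss", repopulates
fourth = fresh.cached_query("from User", "Users", load_users)[0]  # "hit"
```

The session that wrote to the table can never produce a cache hit for that table in this model, since everything it caches is stamped earlier than its own commit. A session opened after the commit behaves as expected, which is why the tests need a separate session for querying.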
Comments
"The last cache is very important, since it ensures that the cache will not serve stale results."
You talk about web farms. What if a second server writes data to the db while the first server is reading data (but as it's cached, it will read from the cache)? The first server's session won't get a notification that the second server has updated the data in the db, meaning that when data is read by a thread on the first server, it will read stale data which IS already updated in the db. You always have stale data; however, in this case, nothing says when the data in the first server's cache is refreshed with data from the db. I.o.w. the staleness can go on for a long time (as long as data is kept in the cache).
Not really.
The sync of the caches is left to the cache implementation.
If we are talking about something like Memcached, we have no issue with syncing the cache, because all the machines in the farm see the same cache.
Other cache implementations can send notifications, but that is outside the scope of what NHibernate does.
If that's true, objects need to be serializable, don't they? As we're talking about multiple appdomains.
I wonder what's more efficient: relying on the cache of the db server or transporting objects back/forth using serialization layers.
No, they don't need that.
This is because NHibernate doesn't save the entity in the cache. Doing so would open you to race conditions.
NHibernate saves the entity data alone, which is usually composed of primitive data (that is what the DB can store, after all).
In general, it is more efficient to hit a cache server, because those are very easily scalable to high degrees, and there is no I/O involved.
I did not think you could use the querycache with memcache.
Is it possible now to use memcache?
Paul,
Yes, you can.
It is a cache implementation just like all the rest.
Let's consider a trivial example:
IDbConnection conn = myApp.GetOpenConnection();
ISession session = sessions.OpenSession(conn);
Since we have a session with a custom-provided connection, I can't use the cache at all in NHibernate?
Or did I get it wrong?
Yes, if you are doing it this way, which is rarely recommended.
If you want a custom connection provider, you can implement the IConnectionProvider interface.
Thanks for the quick reply.
It raises another question: how to handle a multiple database scenario? (That's why we manually provide the connection to the session.) There is no easy way (or at least I don't know of any) to handle multiple databases in one solution with one config file.
You generally create two session factories for it.
Hi Oren,
I am new to 2nd level caching in NHibernate but have come across this exact problem today, where the 'UseSecondLevelCacheForSecurityQuestions' fixture in Rhino.Security.Tests is failing. The fixture is using the technique you outlined above.
When debugging I can see from the output that the IsAllowed() call is calling into the db not the cache.
Is there any way to access the cache provider directly, or is there something else I'm potentially missing?
Well, you can generally just access the cache object itself.
I think it is accessible from the session factory.
I can't find it in the session factory. How did you debug the problem you had above? It seems difficult to get any visibility on cache hits...
I am toying with the idea of writing a DebugHashTableCacheProvider to log the internals...
Any ideas?
After some painful debugging I have found that, although the second IsAllowed() call produces identical SQL and, from what I can see, identical QueryKey objects, it somehow computes a different hashcode for the key than the one contained in the hashtable.
I have had a look at the QueryKey implementation of GetHashCode override in nhibernate source and cannot fathom why the two would be different.
Both calls are in different sessions/transactions and I am not using a custom provided connection.
Am I missing something here?
Stuart,
Can you produce a failing test case?
Yes. Rhino.Security.Tests.AuthorizationServiceFixture.UseSecondLevelCacheForSecurityQuestions() fails. I simply replaced the CacheProvider with one of my own to enable me step into the code to see what was happening. But the test fails no matter which CacheProvider I use.
I tested this against SQL Server 2005, as SQLite throws an ADO exception (it doesn't like the SQL supplied).
I am using a trunk build of nhibernate...
BTW, the error with SQLite is interesting. It was failing with a "no such table" error which, after some research, turns out to be because once you close the connection to the in-memory db you cannot re-open a connection to it. This is disappointing if true... I changed the data source to point to a file on disk and this fixed the SQLite exception.
Still no resolution on the cache miss though...
Stuart, I had the same problem with SQLite in-memory and NHibernate 1.2: "no such table".
After some research I found that the SQLite connection is closed during transaction commit, so subsequent queries could not find any tables. The NHibernate docs (section 10.7, Connection Release Modes) have an interesting note: "As of NHibernate 1.2.0, if your application manages transactions through .NET APIs such as System.Transactions library, ConnectionReleaseMode.AfterTransaction may cause NHibernate to open and close several connections during one transaction, leading to unnecessary overhead and transaction promotion from local to distributed. Specifying ConnectionReleaseMode.OnClose will revert to the legacy behavior and prevent this problem from occuring." After adding key="hibernate.connection.release_mode" value="on_close" to the config file, all my tests work with SQLite in memory with no problems.
Thanks Maxim, that worked...
Maxim, actually, my mistake: this did not work. I had that setting configured already.