N+1 queries are hardly a feature
I ran into this article, which talks about N+1 issues from a “fresh” perspective. In it, there is a quote by DHH that goes like this:
If you have N+1 query it means you’re executing one SQL query per element so if you have 50 emails in an inbox, that’d be 50 SQL calls, right? That sounds like a bug. Well in a Russian doll caching setup, it’s not a bug, it’s a feature. The beauty of those individual calls are that they’re individually cached, on their own timeline, and that they’re super simple.
Now, I have been involved with OR/Ms and databases for over a decade, and I have spent large parts of my career resolving performance problems in database-driven applications.
In a word, the idea that a larger number of simpler queries is better is nonsense. In particular, it completely ignores the cost of going to the database. Sure, a more complex query may require the database to do additional work, and if you are using caching, you won’t have the data in neat “cache entry per row” form. But in practice, this approach leads to applications doing hundreds of queries per page view, absolute reliance on the cache, and tremendous cost at startup.
In RavenDB, most queries are so fast that when we measured, the network component was the largest cost we had to face.
Let us do the numbers, shall we? Assume the best case: the database machine and the web machine are in the same datacenter, where the cost of a roundtrip is about 0.5 ms. Now, let us go back to the 50 emails example above. We need to send 50 queries to the database. Even assuming a perfect database that takes no time at all to answer, that is still 25 ms wasted just on roundtrip times.
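The arithmetic above can be sketched in a few lines (a back-of-envelope estimate, assuming the 0.5 ms in-datacenter roundtrip figure and a database that answers instantly):

```python
# Back-of-envelope roundtrip cost. The 0.5 ms figure is the assumed
# in-datacenter latency; the database itself is assumed to be free.
ROUNDTRIP_MS = 0.5

def network_cost_ms(queries: int) -> float:
    """Total time spent purely on network roundtrips."""
    return queries * ROUNDTRIP_MS

print(network_cost_ms(50))  # N+1 style: 50 separate queries -> 25.0 ms
print(network_cost_ms(1))   # one well-formed query -> 0.5 ms
```

Even with a zero-cost database, the N+1 version pays 50x the network tax.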
And the problem is that you usually end up with a lot more than 50 queries per page when you adopt this kind of silliness. You typically see hundreds or thousands of them, and the database isn’t really able to answer in no time, so expect much bigger delays in practice.
But wait, I hear you say, this is all about caching, you are ignoring that part. Well, no, I’m not. If you are using a distributed cache, the costs over the network are exactly the same. So this is only relevant if you are using a cache on the same server, but then you run into issues when different caches on different machines hold different information. Not to mention that you are now in the wonderful world of having to worry about cache invalidation strategies and aligning them with the business requirements.
And for fun, what happens when one of your cache nodes goes down? You got it, you just created a DoS attack on your own database.
On the other hand, you can actually create proper queries, get the data in as few roundtrips as possible and trust the database engine to do its work.
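As a minimal sketch of the difference, here is the N+1 pattern next to a single query, using an in-memory SQLite database with hypothetical inbox/email tables (the schema and names are illustrative, not from the post):

```python
import sqlite3

# Illustrative schema: one inbox, three emails.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE inboxes (id INTEGER PRIMARY KEY, owner TEXT);
    CREATE TABLE emails  (id INTEGER PRIMARY KEY, inbox_id INTEGER, subject TEXT);
    INSERT INTO inboxes VALUES (1, 'dhh');
    INSERT INTO emails  VALUES (1, 1, 'hi'), (2, 1, 'hello'), (3, 1, 'hey');
""")

# N+1: one query for the ids, then one query per email.
ids = [r[0] for r in conn.execute(
    "SELECT id FROM emails WHERE inbox_id = ?", (1,))]
n_plus_1 = [conn.execute(
    "SELECT subject FROM emails WHERE id = ?", (i,)).fetchone()[0]
    for i in ids]

# Single roundtrip: let the database engine do the work.
one_query = [r[0] for r in conn.execute(
    "SELECT subject FROM emails WHERE inbox_id = ?", (1,))]

assert n_plus_1 == one_query  # same data; 4 roundtrips vs 1
```

With a real network in between, each of those per-row queries pays the full roundtrip cost discussed above.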
I'm in complete agreement that this is daft, but presumably the argument is that those 50 queries are pipelined or otherwise parallelized, and not done completely sequentially, so it doesn't really cost you 25ms to do them.
Then you get into some kind of argument about whether the complexity risk of such reliance on parallelization is better or worse than the alternatives, and the whole thing degenerates into subjectivity.
I think there may be more to consider -- the article is referring specifically to Ruby on Rails fragment caching, in which case it is a rendered HTML fragment that is placed into cache. In a typical [ruby/rails]+[postgres/mysql] stack it is [ruby/rails] that is likely to be the slowest-performing runtime component, so the stated benefits of granular fragment caching likely have more to do with avoiding HTML rendering (as opposed to DB roundtrips, although a cached fragment would avoid the DB roundtrip too). Even so, it would seem to me that we could have both granular, per-item HTML caching AND optimal data access.
Can you get the best of both worlds by caching the query by resulting entity id's and separately a cache of entities by id?
That way your uncached hit is one query, your cached hit is two cache gets (one for list of ids, one for list of entities by id). 1 + 1 instead of N + 1.
If you're in a rare update situation like the ruby example, you then have the option to cache either the whole list at once or still do the one by one item output cache, while retaining a low number of roundtrips.
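The “1 + 1” pattern described above might be sketched like this, with plain dicts standing in for a real cache and placeholder functions for the database queries (all names here are hypothetical):

```python
# "1 + 1" caching sketch: cache the list of ids under a query key, and
# entities under their individual ids. Dicts stand in for a real cache.
query_cache = {}   # query key -> list of entity ids
entity_cache = {}  # entity id -> entity

def fetch_ids_from_db(inbox_id):
    """Placeholder for the one real id query (one roundtrip)."""
    return [1, 2, 3]

def fetch_entities_from_db(ids):
    """Placeholder for one batched entity query (one roundtrip)."""
    return {i: f"email-{i}" for i in ids}

def get_inbox(inbox_id):
    key = ("inbox", inbox_id)
    ids = query_cache.get(key)
    if ids is None:                 # uncached hit: one DB roundtrip
        ids = fetch_ids_from_db(inbox_id)
        query_cache[key] = ids
    missing = [i for i in ids if i not in entity_cache]
    if missing:                     # batch-load only the cache misses
        entity_cache.update(fetch_entities_from_db(missing))
    return [entity_cache[i] for i in ids]
```

A warm cache costs two multi-key lookups rather than N sequential gets; a cold cache costs at most two DB roundtrips.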
I agree completely. There were times when writing SQL was a requirement. Nowadays you let the ORM do the job and wonder why the application is so slow, just to see that the ORM does a lot more than expected. Everything has its place and has its limits. There are times when an ORM solves the problem, and sometimes a SQL statement that is properly written and optimized solves the problem, and sometimes both. A good engineer should determine when to use what.
No, it's not. You can send multiple keys at once, and get many answers in one roundtrip.
David, N+1 issues are 99% sequential, so they are not parallelized. They also usually happen very much behind the scenes, without you being aware of them.
Chris, Sure, but in this case, you'll still have to do multiple cache lookups, and go over the network for each of them. And I find it hard to believe that string processing (HTML generation), even in something like Ruby, can be slow enough to outweigh a network call.
Chris, You can sort of do that, yes. NHibernate does this for certain things, but it is a pretty complex setup, and you have to be explicitly aware of it.
It also doesn't help you to reduce the number of cache queries.
Maciej, Not really, no. Not unless you build your code specifically to do that. Otherwise, you'll have sequential queries to the cache.
And if you wrote your code to do a multi key lookup, you might as well do the proper query to the DB.
Yes, it does influence how you write your queries, but so does every optimization.
Depends on the database. For relational ones, the cache scales much better and you can have many cache servers, while overloading a single db may be a problem. For others I agree, a distributed non-relational db may handle all of this for you.
Totally agree with the post. IMO this is a no brainer.
A good example is http/2 multiplexing etc ...
DHH is talking specifically about Rails, and it does fetch multiple values with one round-trip in this situation.
<%= render(partial: 'posts/post', collection: @posts, cached: true) %> will fetch all cached fragments at once and materialize relations only for fresh posts.
1- Don't use N+1 queries 2- This is a path that's all too easy to stray from in relational+ORM 3- Use RavenDB
Did I get that right? ;)
I wonder how people were able to create any applications at all in the 90's, having only SQL and windows api. Should be impossible with today's approach.
Rafal, People were writing very different types of applications in the 90s.
Pavel, Great, that is actually nice, but it still means that you have to remember to do that in the app. In that case, you are better off writing the proper query in the first place.
As a note, I'm seeing stuff like this (http://blog.bigbinary.com/2016/03/09/rails-5-makes-partial-redering-from-cache-substantially-faster.html):
Which makes me cringe, because it is just a horrible thing to be proud of.
Well, you have to remember to include related records as well, but true, caching is in fact harder even with the many conveniences Rails provides (like including the template digest in the cache key). For example, if the template uses helpers defined in an outside module, you have to remember to update the template to avoid serving stale data.

The numbers from the article you cited are really weird. I did a small benchmark rendering 100 issues from 10 projects (each one displayed 10 text fields and the project name), and with >80% cache hits the lazy-load version was comparable with the eager-load one, outputting 2 megs of HTML in ~100ms. With a 100% cache hit rate, response time was ~15ms regardless of the DB access approach.

To be fair, I think it's quite rare to get performance advantages when using lazy loading (and if the cache is busted you are completely screwed; Russian doll caching doesn't have any protection against the dog pile effect), but maybe Basecamp is one such application. One should be really careful generalizing DHH's views about programming.
In the blog's example, (N+1) is in fact (2N + 1). One fetch for the master record. N fetches for the detail ids. And another N fetches for the detail records.
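Counting the roundtrips for that shape (one master fetch, N id fetches, N detail fetches, as described above):

```python
# Roundtrip count for the lazy-loading shape described above:
# 1 master query + N id fetches + N detail fetches = 2N + 1.
def lazy_roundtrips(n_details: int) -> int:
    return 1 + n_details + n_details

print(lazy_roundtrips(50))  # 101 roundtrips for the 50-email example
```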