Ayende @ Rahien

http://ayende.com
Copyright (C) Ayende Rahien 2004 - 2021 (c) 2026

Ayende Rahien commented on RavenDB indexing optimizations, Step II–Pre Fetching
Mon, 17 Dec 2012 10:37:25 GMT
http://ayende.com/160290/ravendb-indexing-optimizations-step-ii-pre-fetching#comment9

Rafal, we have several ways of doing that. We expose a number of performance counters, and we also provide the /admin/stats and /databases/DB_NAME/stats endpoints, which expose a lot of detail about the internals of how RavenDB works.

Rafal commented on RavenDB indexing optimizations, Step II–Pre Fetching
Mon, 17 Dec 2012 10:27:38 GMT
http://ayende.com/160290/ravendb-indexing-optimizations-step-ii-pre-fetching#comment8

Thanks for the explanation, Ayende. In case anyone thought so, I'm not nitpicking, just curious about how Raven manages its resources during periods of high load. And another question: what is your idea for monitoring Raven's performance? I'm talking about automated, continuous collection of key performance data, like number of updates/sec, number of docs indexed/sec, cache size/hit ratio, indexing lag, number of sessions, transactions, Esent performance, memory, etc. I've recently been quite busy with monitoring application and server performance in the Windows ecosystem and was wondering how Raven does these things compared, for example, to MS SQL. And btw, I have had some pretty nice results using NLog for collecting performance data, which might be useful for RavenDB too.

Ayende Rahien commented on RavenDB indexing optimizations, Step II–Pre Fetching
Mon, 17 Dec 2012 09:00:44 GMT
http://ayende.com/160290/ravendb-indexing-optimizations-step-ii-pre-fetching#comment7

Rafal, docs loaded for indexes are not actually cached. And we have steps in place to avoid starvation: we move to higher and higher batch sizes, optimizing our IO throughput along the way.
And I am talking about things like adding an index, or what happens after a restart, etc.

Ayende Rahien commented on RavenDB indexing optimizations, Step II–Pre Fetching
Mon, 17 Dec 2012 08:58:54 GMT
http://ayende.com/160290/ravendb-indexing-optimizations-step-ii-pre-fetching#comment6

Rafal, consider what happens when you have existing data in the database and you add an index. You don't have all of the previously created documents in memory. Also, indexing by most recently modified means that you run into a LOT of issues with just tracking what you indexed and what you didn't, especially when you add the notion of updates _during_ indexing.

Matt Warren commented on RavenDB indexing optimizations, Step II–Pre Fetching
Fri, 14 Dec 2012 09:43:11 GMT
http://ayende.com/160290/ravendb-indexing-optimizations-step-ii-pre-fetching#comment5

@Rafal Take a look at the post in the queue. It's titled "RavenDB indexing optimizations, Step III–Skipping the disk altogether", so I think it'll answer some of your questions.

Rafal commented on RavenDB indexing optimizations, Step II–Pre Fetching
Thu, 13 Dec 2012 20:08:08 GMT
http://ayende.com/160290/ravendb-indexing-optimizations-step-ii-pre-fetching#comment4

Oops, my response disappeared somehow. So, let's try again:
1. If your indexing can't keep up with the rate of modifications and there's starvation, then it doesn't matter how you order documents for indexing. You won't be able to index them anyway, and some will always 'starve'.
2. But if you start with the wrong order and you have to load documents because they are not in the cache, then you pay a double performance penalty: the cost of loading the data and the even greater cost of throwing away already cached documents.
3. IMHO, in normal operation you should never have to load documents to be indexed. They should always be in the cache already. So I'm not sure why Ayende is talking about the cost of loading documents. Maybe this applies to batch processing or the initial data load.

Chris commented on RavenDB indexing optimizations, Step II–Pre Fetching
Thu, 13 Dec 2012 18:45:56 GMT
http://ayende.com/160290/ravendb-indexing-optimizations-step-ii-pre-fetching#comment3

@Rafal You would also have to be mindful of "starvation" of the older documents. If you have a steady stream of new documents coming in, eventually you have to just say "enough guys, I've got to go back and get these other documents in."

Rafal commented on RavenDB indexing optimizations, Step II–Pre Fetching
Thu, 13 Dec 2012 14:04:43 GMT
http://ayende.com/160290/ravendb-indexing-optimizations-step-ii-pre-fetching#comment2

... and the cache wouldn't be polluted with older documents loaded there just for indexing.

Rafal commented on RavenDB indexing optimizations, Step II–Pre Fetching
Thu, 13 Dec 2012 13:35:12 GMT
http://ayende.com/160290/ravendb-indexing-optimizations-step-ii-pre-fetching#comment1

I wonder why you have to load any data at all. If the docs have just been inserted or modified, they should be in memory, so you can index them without any loading. Maybe you should index the most recently modified document first and catch up with the remaining ones later? This way the 'hottest' document would be indexed first, without any additional loading cost.
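
Editor's note on the thread: Ayende's reply (comment 7) says that indexing moves "to higher and higher batch sizes" to avoid starvation. The thread does not describe how that is done, so the following is only a minimal sketch of the general idea, assuming a simple doubling policy. The class name, the limits, and the reset rule are all hypothetical and are not taken from RavenDB.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class BatchSizeTuner:
    """Hypothetical batch-size tuner; NOT RavenDB's actual algorithm."""

    initial: int = 128    # assumed starting batch size
    maximum: int = 16384  # assumed upper bound on the batch size
    current: int = 0

    def __post_init__(self) -> None:
        self.current = self.initial

    def record_batch(self, docs_fetched: int) -> int:
        """Adjust the batch size after a batch and return the next size."""
        if docs_fetched >= self.current:
            # The batch came back full, so more work is probably waiting:
            # double the size (capped) to amortize IO over more documents.
            self.current = min(self.current * 2, self.maximum)
        elif docs_fetched == 0:
            # Nothing left to index: fall back to the small default.
            self.current = self.initial
        return self.current


def simulate(pending: int, tuner: BatchSizeTuner) -> list[int]:
    """Drain `pending` documents and return the size of each batch taken."""
    taken = []
    while pending > 0:
        size = min(tuner.current, pending)
        taken.append(size)
        pending -= size
        tuner.record_batch(size)
    return taken
```

Under this assumed policy, a backlog is drained in progressively larger batches, so older documents are reached in fewer round trips even while new writes keep arriving, which is the starvation concern Chris raises in comment 3.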