Rob’s Sprint: Idly indexing


Feb 28 2013



During Rob Ashton’s visit to our secret lair, we did some work on hard problems. One of those problems was the issue of index prioritization. As I have discussed before, this isn’t easy to do, because of the IO costs associated with not indexing properly.

With Rob’s help, we have defined the following:

An auto index can be set to idle if it hasn’t been queried for a time.
An index can be forced to be idle by the user.
An index that was automatically set to idle will be set to normal on its first query.
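
The three rules above can be read as a small state machine. What follows is only a sketch of that reading in Python, not RavenDB's actual code: the `IndexState` type, the function names, and the one hour `IDLE_AFTER` threshold are all made up for illustration (the post only says "for a time"), and it assumes that a query does not wake an index the user forced to idle.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from enum import Enum


class Priority(Enum):
    NORMAL = "normal"
    IDLE = "idle"


@dataclass
class IndexState:
    name: str
    is_auto: bool
    last_query: datetime
    priority: Priority = Priority.NORMAL
    forced: bool = False  # True once a user explicitly forced the priority


# Hypothetical threshold: the post does not give the real value.
IDLE_AFTER = timedelta(hours=1)


def age_index(index: IndexState, now: datetime) -> None:
    """Rule 1: demote an auto index that has not been queried for a while."""
    if index.is_auto and not index.forced and now - index.last_query >= IDLE_AFTER:
        index.priority = Priority.IDLE


def force_idle(index: IndexState) -> None:
    """Rule 2: the user can force any index to idle."""
    index.priority = Priority.IDLE
    index.forced = True


def on_query(index: IndexState, now: datetime) -> None:
    """Rule 3: a query wakes an automatically idled index.

    Assumption: an index that was forced to idle stays idle.
    """
    index.last_query = now
    if index.priority is Priority.IDLE and not index.forced:
        index.priority = Priority.NORMAL
```

Keeping the `forced` flag separate from the priority is what lets the first query tell an automatic demotion apart from a user decision.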

What are the implications of that? An idle index will not be indexed by RavenDB during the normal course of things. Only when the database has been idle for a period of time (by default, about 10 minutes with no writes) will we actually start indexing it.

Idle indexes will continue indexing as long as no other activity requires their resources. When such activity shows up, they will complete their current run and then wait for the database to become idle again.
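
To make the "finish the current run, then wait" behavior concrete, here is a rough Python sketch. It is an illustration under assumptions, not how RavenDB schedules work internally: `run_idle_indexing`, `last_write_at`, and `clock` are hypothetical names, and a "run" is modeled as a plain callable. The only number taken from the post is the roughly 10 minute default.

```python
from datetime import datetime, timedelta
from typing import Callable, Iterable

# "About 10 minutes with no writes" is the default mentioned in the post.
DATABASE_IDLE_AFTER = timedelta(minutes=10)


def database_is_idle(last_write: datetime, now: datetime) -> bool:
    """The database counts as idle once no write has happened for long enough."""
    return now - last_write >= DATABASE_IDLE_AFTER


def run_idle_indexing(
    runs: Iterable[Callable[[], None]],
    last_write_at: Callable[[], datetime],
    clock: Callable[[], datetime],
) -> int:
    """Execute indexing runs for idle indexes while the database stays idle.

    Idleness is checked only between runs, so a run that has started is
    always allowed to complete before we go back to waiting.
    Returns the number of runs that were completed.
    """
    completed = 0
    for run in runs:
        if not database_is_idle(last_write_at(), clock()):
            break  # other activity showed up: wait for the next idle period
        run()
        completed += 1
    return completed
```

Checking between runs rather than during them is the design choice that matters here: a write never interrupts a run halfway, it only prevents the next one from starting.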

But wait, there is more. In addition to introducing the notion of idle indexes, we have also created two more types of indexes. The first is pretty obvious: a disabled index will use no system resources and will never take part in indexing. This is mostly there so you can manually shut down a single index. For example, maybe it is a very expensive one and you want to stop it while you are doing an import.

More interesting, however, is the concept of an abandoned index. Even idle indexes can take some system resources, so we have added another level beyond that: an abandoned index is one that hasn’t been queried in 72 hours. At that point, RavenDB is going to avoid indexing it even during idle periods. It will still get indexed, but only if enough time has passed since the last time it was indexed.
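
Putting the four levels side by side, the decision of whether an index takes part in an indexing pass might look like the Python sketch below. Treat it as a reading of the text, not the real implementation. The 72 hours come from the post; `ABANDONED_REINDEX_INTERVAL` is a made-up value, since the post only says "long enough", and the sketch assumes that abandoned indexes are only considered while the database is idle.

```python
from datetime import datetime, timedelta
from enum import Enum


class Priority(Enum):
    NORMAL = "normal"
    IDLE = "idle"
    ABANDONED = "abandoned"
    DISABLED = "disabled"


ABANDON_AFTER = timedelta(hours=72)  # from the post
# Hypothetical: the post does not say how long "long enough" is.
ABANDONED_REINDEX_INTERVAL = timedelta(hours=24)


def is_abandoned(last_query: datetime, now: datetime) -> bool:
    """An index that has not been queried in 72 hours is abandoned."""
    return now - last_query >= ABANDON_AFTER


def should_index(
    priority: Priority,
    database_idle: bool,
    last_indexed: datetime,
    now: datetime,
) -> bool:
    """Decide whether an index takes part in the current indexing pass."""
    if priority is Priority.DISABLED:
        return False  # never indexed, uses no resources
    if priority is Priority.NORMAL:
        return True  # indexed during the normal course of things
    if not database_idle:
        return False  # idle and abandoned indexes wait for a quiet database
    if priority is Priority.IDLE:
        return True
    # Abandoned (assumption: only during idle periods, and only rarely).
    return now - last_indexed >= ABANDONED_REINDEX_INTERVAL
```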

Next, we will discuss why this feature was a crucial step on the way to killing temporary indexes.


Comments

28 Feb 2013
12:12 PM

Patrik Potocki

Cool,

When will you push it into the unstable branch so we can test it out?

28 Feb 2013
14:32 PM

configurator

"an abandoned index is one that hasn’t been queried in 72 hours" - so a weekly report will never be up to date?

Also, why do idle indexes wait for 10 minutes of inactivity instead of just working only when all other indexes are up to date?

28 Feb 2013
14:45 PM

Chris Marisic

"An index that was automatically set to idle will be set to normal on its first query."

What if you want the index to always be an idle index? Like a reporting index that pulls tons of things together, or a crazy reporting map/reduce that is not relevant to OLTP functionality at all?

28 Feb 2013
15:42 PM

Rob Ashton

Chris - while not covered explicitly in the entry above, there is a flag to "force idle" and this will be exposed in the studio

28 Feb 2013
23:33 PM

Alex Spence

Can we get a way to set these flags on the index creators as well?

01 Mar 2013
00:04 AM

Ayende Rahien

Patrik, This is already available at: http://hibernatingrhinos.com/builds/ravendb-unstable-v2.5

01 Mar 2013
00:05 AM

Ayende Rahien

Configurator, You can force an index to not go into idle / abandoned mode. But in general, if you have an index that is queried weekly, you can afford to wake it up and then wait for it to catch up.

01 Mar 2013
00:06 AM

Ayende Rahien

Configurator, And the reason we wait for 10 minutes on inactivity is that we don't want to get into: "we have 1 second of rest, let us start indexing all the idle indexes, which can be VERY expensive".

01 Mar 2013
00:07 AM

Ayende Rahien

Alex, No, you can't do that at creation, but you can do that immediately after.

01 Mar 2013
01:18 AM

Alex Spence

In my still limited experience with Raven, specifically trying to work with bundles like replication and versioning, I have noticed that it's not very straightforward to accomplish certain functionality without using the studio.

This specific feature is not that big of a deal to us, but we would really love to see functionality like this be configurable without going through the UI.

01 Mar 2013
07:18 AM

Ayende Rahien

Alex, ALL of RavenDB's functionality is exposed via a REST interface, and you can do absolutely everything the studio does. After all, the studio just uses HTTP to talk to RavenDB itself; it is not a privileged client.

01 Mar 2013
07:18 AM

Ayende Rahien

Alex, In other words, anything that you can do through the UI can be done in code, and pretty easily, at that.

01 Mar 2013
11:48 AM

Damian Hickey

RavenDB already caches compiled indexes ( https://github.com/ayende/ravendb/blob/master/Raven.Database/Linq/QueryParsingUtils.cs#L334 , discussion https://groups.google.com/d/msg/ravendb/hsMc4lLnaXU/h0WRLOYog9EJ ) which makes second and subsequent test runs that create that index much faster.

I'm wondering if it would be possible to configure the indexes to be lazily compiled? That is, compiled and loaded when first queried?

Am currently doing system acceptance tests where we have an increasing number of indexes and am experiencing some time pain (20-30s +) on single test runs.

01 Mar 2013
11:51 AM

Ayende Rahien

Damian, There is really no cost in doing the compilation (it happens once, and that is it.)

01 Mar 2013
11:52 AM

Ayende Rahien

Oh, you are talking about the cost _per test run_, right? I was thinking about production runs, actually. In that case, can't you handle this via the index compilation caching that we already have?

01 Mar 2013
12:16 PM

Damian Hickey

Yes, the cost per test run, where I run _one test at a time_, in the usual TDD(-ish) scenario. The index compilation caching (which is great) only kicks in when I run 2 or more tests per session. http://i.imgur.com/38DF0fc.png - second test benefits from the caching.

My other approach is to be able to supply a predicate to my application so the test fixture can configure it to only create indexes that are going to be used. But that means my acceptance test fixtures need to know what indexes may be required which I find to be leaky. (I take a different approach with my unit tests, no problems there)

Yes, it's a development pain and not a production issue. I may be an edge case though.

01 Mar 2013
12:24 PM

Ayende Rahien

Damian, In that case, how about implementing on disk caching for this?

01 Mar 2013
12:37 PM

Damian Hickey

Yeah, that sounds good too. Generate a hash from the source, use it as the CompilerParameters.OutputAssemblyName and if the assembly already exists on disk (in a location that will exist between test sessions i.e. users temp dir) load it.

Or something like that :)

01 Mar 2013
12:41 PM

Damian Hickey

Actually, that may be a nice-to-have from a production pov. An index that is deleted and then re-created, assuming it is exactly the same, would be slightly faster. Don't know how often that would happen though really.

01 Mar 2013
13:05 PM

Ayende Rahien

Damian, We have 2K+ tests, most of them with some form of indexes. We run them a LOT. Any saving there would be useful in general.

01 Mar 2013
14:27 PM

Damian Hickey

Cool. Created the issue: http://issues.hibernatingrhinos.com/issue/RavenDB-969


Oren Eini

CEO of RavenDB
