Oren Eini

aka Ayende Rahien

CEO of RavenDB

a NoSQL Open Source Document Database

Copyright ©️ Ayende Rahien 2004 — 2025

Jun 14 2013

And some people will INSIST on shooting their own foot off

Because, clearly, that is what is missing: a RavenDB GetAll extension method.

35 comments
Tags:
  • raven
  • wtf?!

Comments

Damien
14 Jun 2013
09:43 AM

Creating it in the first place is a bit WTF. Deciding to hold all of the results in a List and only return it after all of the calls complete, despite being inside an IEnumerable method just... elevates it to another level.

Patrick Huizinga
14 Jun 2013
09:52 AM

I can somewhat understand wanting to get all documents. But: var results = new List<T>(); Really..?

Btw, what do you think of my addition? public static IEnumerable<T> GetRange<T>(this IDocumentStore documentStore, int start, int count) { var results = new List<T>(); for (int i = 0; i < count; i++) { results.Add(documentStore.GetAll<T>().ElementAt(start + i)); } return results; }

:trollface:

Patrick Huizinga
14 Jun 2013
09:53 AM

Ugh, no preview and no edit >.< Let's see if this works:

public static IEnumerable<T> GetRange<T>(this IDocumentStore documentStore, int start, int count)
{
    var results = new List<T>();
    for (int i = 0; i < count; i++)
    {
        results.Add(documentStore.GetAll<T>().ElementAt(start + i));
    }
    return results;
}

:trollface:

Joel
14 Jun 2013
11:12 AM

Can someone clarify what's wrong with this, please? I'm new to RavenDB and understand the basic dos and don'ts, but a rundown of why this is bad would be great, for myself as well as anyone else, especially those who might come to this page after googling 'ravendb getall'.

Ayende Rahien
14 Jun 2013
11:24 AM

Joel, look at unbounded result sets, as well as the real reason why we don't allow this in RavenDB. Basically, what happens if you have 1 million results?

Wyatt Barnett
14 Jun 2013
11:42 AM

For the record I agree with the design impetus for making this so. Then again, sometimes one just wants to get all the Ts and many times you know you won't have 1m or even 1000 records in a collection but you could well have more than 128 and you don't want to write a pager loop to handle it.

Now, I recall seeing somewhere there was a new 'stream me all the T' api option but that doesn't help people on older versions.

Duckie
14 Jun 2013
11:53 AM

I have some collections with many small documents, and I just need all of them, easily. As I work a lot with moving/importing data (~2000 docs), I had to use the same workaround. Forcing users to do stupid things themselves and then blaming them is, I find, quite silly.

David Zidar
14 Jun 2013
12:15 PM

I agree that most of the time you don't want unbounded result sets. But there are legitimate reasons for wanting to retrieve all the data in a collection. For instance when exporting data in some other format or when generating a sitemap.xml with all pages and such.

There are exceptions to every rule.

Scott Scowden
14 Jun 2013
13:19 PM

I agree, there are definitely cases that you need more than 1024 records. Even worse, when using a hosted RavenDB, you can't easily change this value to retrieve more.

For example, I need to list all Zip Codes in a state to allow users to multi-select them.

Not saying his implementation is good, but there are definitely cases where it's needed.

Frank
14 Jun 2013
13:23 PM

@Duckie,

Having to move/import data in batches already sounds like a "workaround". Sending a message to the target system as soon as the entity represented by the document changes would turn that batch process into a real-time interface and remove the need to query all documents.

Kijana Woodard
14 Jun 2013
13:38 PM

Yield return would at least prevent complete waste when the calling code does Take(x).

The "pager code" is pretty simple to write and is a good warning that you are doing something potentially dangerous.

Trying to make GetAll generic and reusable is much much more difficult. What I've seen is that soon you want to add a Where condition, then you want custom skip/take, then you want to get the Statistics, then you want to Include some other document, then you want to WaitForStale...

Soon this GetAll method and its overloads are a pretty substantial API for which each combination of parameters has exactly one usage in the system.

And then there's this: http://ayende.com/blog/161249/ravendbs-querying-streaming-unbounded-results
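
The "pager code" and yield return points above can be sketched together. This is a minimal, self-contained illustration, not RavenDB code: `fetchPage` is a hypothetical stand-in for one bounded (skip, take) query, and `Paging.ReadAllPaged` is an invented name. A real RavenDB 1.0 version would likely also need to account for skipped results, as a later comment in this thread points out.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class Paging
{
    // fetchPage(skip, take) is a hypothetical stand-in for one bounded query
    // against the database. It is an assumption, not a RavenDB API.
    public static IEnumerable<T> ReadAllPaged<T>(
        Func<int, int, IReadOnlyList<T>> fetchPage, int pageSize)
    {
        // Validate eagerly; the iterator below only runs once enumeration starts.
        if (fetchPage == null) throw new ArgumentNullException(nameof(fetchPage));
        if (pageSize <= 0) throw new ArgumentOutOfRangeException(nameof(pageSize));
        return Iterate(fetchPage, pageSize);
    }

    private static IEnumerable<T> Iterate<T>(
        Func<int, int, IReadOnlyList<T>> fetchPage, int pageSize)
    {
        var skip = 0;
        while (true)
        {
            var page = fetchPage(skip, pageSize);
            foreach (var item in page)
                yield return item;      // hand items out as they arrive

            if (page.Count < pageSize)  // a short page means we reached the end
                yield break;
            skip += page.Count;
        }
    }
}

public static class PagingDemo
{
    public static void Main()
    {
        var source = Enumerable.Range(0, 300).ToList(); // pretend collection
        var requests = 0;
        Func<int, int, IReadOnlyList<int>> fetchPage = (skip, take) =>
        {
            requests++;
            return source.Skip(skip).Take(take).ToList();
        };

        var firstTen = Paging.ReadAllPaged(fetchPage, 128).Take(10).ToList();
        Console.WriteLine($"{firstTen.Count} items, {requests} request(s)");
        // prints "10 items, 1 request(s)"

        requests = 0;
        var all = Paging.ReadAllPaged(fetchPage, 128).ToList();
        Console.WriteLine($"{all.Count} items, {requests} request(s)");
        // prints "300 items, 3 request(s)"
    }
}
```

Because pages are pulled lazily, a caller that does Take(10) pays for a single request, while a caller that enumerates everything pays for every page, visibly and on purpose.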

Kijana Woodard
14 Jun 2013
13:41 PM

@Scott - Each zip code has its own document? I would think they would be grouped into far fewer docs.

@duckie - import/export could be done via the smuggler api. It would be interesting to see what Studio is doing here and emulate that.

João Bragança
14 Jun 2013
14:17 PM

What's wrong with this? I mean theoretically a windows server can 'scale' up to 4TB of memory. That way you don't have to pay developers to think and write good code!

Ayende Rahien
14 Jun 2013
15:01 PM

Wyatt, What is the actual user scenario that requires all the data, when the data can be many thousands of records?

Ayende Rahien
14 Jun 2013
15:02 PM

Duckie, we have explicit support for bulk inserts / reads. That prevents you from loading everything into memory.

Ayende Rahien
14 Jun 2013
15:02 PM

Scott, why are you storing all the zip codes as separate documents?

Daniel Lang
14 Jun 2013
15:15 PM

... and I don't understand why you don't understand it. There are scenarios beyond OLTP web applications where you just need this: GetAll(). I'm using it heavily in a desktop application that runs on RavenDB embedded. I know the performance implications of every other approach and yes, I think GetAll is the best in our situation. I'm sure there are other valid use cases as well, which you could have addressed with a better implementation of the streaming API.

Duckie
14 Jun 2013
15:25 PM

Ayende, I need all the data in memory so I can use whatever LINQ commands, filtering, querying, sorting etc. I want. Performance here is not an issue at all. I have loads of data I need to manipulate.

Ayende Rahien
14 Jun 2013
15:26 PM

Duckie, Whatever for? Filtering, querying & sorting are db tasks, not in memory tasks.

jdn
14 Jun 2013
15:35 PM

@Duckie, @Daniel:

Don't worry. Ayende has been wrong about this from the start but implemented this auto-handcuff for marketing reasons.

There are sound technical reasons for wanting GetAll(). There used to be a way to override the "dumb by default" behavior in RavenDB, not sure if it is still in the code base or not.

Judah Gabriel Himango
14 Jun 2013
16:11 PM

I wonder how many hundreds or thousands of apps are actually efficient because RavenDB forced them to be, and forced lazy developers to do proper paging and/or document structure.

RavenDB has forced me to think about performance from the start, when normally I'd be lazy about it with SQL+O/RM.

Kijana Woodard
14 Jun 2013
16:16 PM

@Daniel and @jdn

Sure. And it's pretty easy to roll yourself with the exact "flavor" you need (from my other comment). A GetAll in the API doesn't add much value to the common case.

For "embedded and not that much data and I understand" scenarios, I personally have used LoadStartingWith and avoid the query issues altogether.

LoadStartingWith + the new Streaming API + Smuggler + roll your own while loop = a lot of ways to handle these situations without having a simple, but dangerous, method exposed on the api.

Kijana Woodard
14 Jun 2013
16:20 PM

Also, Dynamic Reporting takes care of another set of cases: http://ayende.com/blog/162339/ravendbs-dynamic-reporting

Facets solve for still others.

The difference being that these choices address specific concerns regarding working with the entire dataset instead of exposing a seemingly simple api method and hoping the user understands the intersection between the subtleties of what they are actually trying to achieve and what the api is actually doing.

jdn
14 Jun 2013
16:23 PM

@Kijana:

If I say "Select * from", I want select *.

If I want "select top 1024 from", then I will write that.

"LoadStartingWith + the new Streaming API + Smuggler + roll your own while loop = " a pain in the kiester.

At some point, it went from "running with scissors" to "crawling with pillows."

Tim Murphy
14 Jun 2013
16:28 PM

@Judah is quite right that Raven makes you think about performance and therefore paging.

My only beef is I think an exception should be thrown if the number of documents requested is greater than the default 128.

Kijana Woodard
14 Jun 2013
16:31 PM

@jdn - Sure. If I was writing sql, fine. The problem is we're using abstractions on top of abstractions.

Code like that GetAll extension method is one of the primary reasons so many people (DBAs) say "EF Sucks". EF is fine, but once you abstract away what's going on past a certain point, it will just lead to painful "surprises" down the road.

I once worried about this and typed up a post for the forum. I then realized that the while loop to page the results was shorter than the post I was writing.

Duckie
14 Jun 2013
16:32 PM

Ayende, the DB cannot do what I want without a lot of investment in time. I just need my data out, so I can work with it myself.

I understand the desire for optimal use of RavenDB by limiting the API, but forcing users to do stupid things is .. stupid.

Maybe just make a method called query.GetAllWhileUnderstandingThisIsStupid() ..

Kijana Woodard
14 Jun 2013
16:34 PM

@Tim, you mean if the total document count is greater than 128 and you haven't specified a Take?

I like to explicitly define a Take for all queries, but I'd probably say log WARN instead of throw.

Foo
14 Jun 2013
18:20 PM

This reminds me of a technical lead in a Fortune 500 company explaining to me how having a web service exposing something like public dataset execute(string query, string connectionstring) was great to speed up development and deployments. Yes you can, no you shouldn't.

João Bragança
14 Jun 2013
18:50 PM

@David

The 'I might need to get everything because of sitemap' is questionable. Google doesn't NEED sitemap to index your site. You just need to ensure that all of your pages are reachable from the bookmark url. Oren's blog has lots of dynamic content too, a lot more than 1024 posts - see the sidebar. But of course it is all indexed by google. Someone should write an article about this...

Duckie
14 Jun 2013
20:02 PM

Sitemaps are not only about making a list of links for indexing, but also about showing Google the structure of the site. Besides, if they want to expose a sitemap, why is this questionable?

Fact is, if you want to load many documents into memory, you have to do special stuff with RavenDB, no matter what valid reason you might have for it.

This is what users experience / what I experienced.

You only get a limited number of records. You increase this. You run into the maximum limit of records. You start paging it out, but you run into the maximum queries per session exception. You increase the number of allowable requests, or you create multiple sessions.

Since streams were added, it is of course easier to do.
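
The sequence described above comes down to simple arithmetic: the number of paged requests is the document count divided by the page size, rounded up. The sketch below uses the page sizes mentioned in this thread (128 by default, 1024 as the cap). The per-session request limit of 30 is an assumption here, based on the commonly documented default, so check it against your own configuration. `RequestBudget` and `PagesNeeded` are invented names.

```csharp
using System;

public static class RequestBudget
{
    // Number of paged requests needed to read totalDocs documents.
    public static int PagesNeeded(int totalDocs, int pageSize)
    {
        if (totalDocs < 0) throw new ArgumentOutOfRangeException(nameof(totalDocs));
        if (pageSize <= 0) throw new ArgumentOutOfRangeException(nameof(pageSize));
        return (totalDocs + pageSize - 1) / pageSize; // integer ceiling division
    }

    public static void Main()
    {
        const int assumedSessionLimit = 30; // assumption, not taken from this thread

        foreach (var total in new[] { 2000, 5000, 1000000 })
        {
            foreach (var pageSize in new[] { 128, 1024 })
            {
                var pages = PagesNeeded(total, pageSize);
                var verdict = pages <= assumedSessionLimit ? "fits" : "exceeds";
                Console.WriteLine(
                    $"{total} docs at {pageSize}/page: {pages} requests ({verdict} one session)");
            }
        }
    }
}
```

With these numbers, 2,000 documents take 16 requests at 128 per page or 2 at 1024, while 5,000 documents at 128 per page take 40 requests, which is over the assumed limit for a single session.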

Sarmaad
15 Jun 2013
14:08 PM

At the beginning I had the same thoughts, but now, no way. I'd rather write a while loop than just blindly get all documents.

I found myself asking: do I need this here, is the model designed correctly, or should this be a map/reduce?

Don't change a thing.

Karg
17 Jun 2013
20:27 PM

We actually have some legacy APIs that we've converted over to use RavenDB on the back end, but we still have to maintain the non-paged methods.

We have the following (better) extension method to get all. It obeys skipped results and returns an IEnumerable<T> so you can avoid materializing the whole thing if you're just operating over the whole set.

This is with Raven 1.0, we'll use Streams when we upgrade.

http://pastebin.com/AqaAu6DC

Sean Kearon
18 Jun 2013
11:15 AM

I'm using embedded in a desktop application and I have to agree completely with @Daniel here. "GetAll" is absolutely essential for my use cases, as is ensuring that the query does not wait for any stale results.

I'm also using 1.0 currently, but will likely move to streams when I get time to upgrade.

Jon Canning
20 Jun 2013
14:16 PM

Oh dear, how embarrassing, I know it's wrong but I needed a quick hack and had just read this:

http://stackoverflow.com/questions/11268955/retrieving-entire-data-collection-from-a-raven-db

I put it on my blog in case I needed it again; honestly didn't expect anyone to find it! I'll remove it for fear of encouraging others.

Comments have been closed on this topic.
