Challenge: What killed the application?


Apr 28 2010


I have been doing a lot of heavy performance testing on Raven, and I ran into a lot of very strange scenarios. I found a lot of interesting stuff (a runaway cache causing OutOfMemoryException, unnecessary re-parsing, etc.). But one thing that I wasn't able to resolve was the concurrency issue.

In particular, Raven would slow down and crash under load. I scoured the code, trying to figure out what was going on, but I couldn’t figure it out. It seemed that after several minutes of executing, request times would grow longer and longer, until finally the server would start raising errors on most requests.

I am ashamed to say that it took me a while to figure out what was actually going on. Can you figure it out?

Here is the client code:

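// Requires: using System.IO; using System.Threading.Tasks;
// Post every JSON file in Docs to the server, one request per file, in parallel.
Parallel.ForEach(Directory.GetFiles("Docs", "*.json"), file =>
{
    PostTo("http://localhost:9090/bulk_docs", file);
});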

The Docs directory contains about 90,000 files, and there is no concurrent connection limit. The average processing time for each request when running in single-threaded mode was 100-200 ms.
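The post does not show PostTo, nor how the connection limit was lifted. As a rough sketch only, assuming a blocking helper built on HttpWebRequest (the usual choice in .NET at the time), the client side might have looked something like the following. The class name BulkClient, the method RemoveConnectionLimit, and the content type are hypothetical; only the PostTo name and its two arguments come from the code above.

```csharp
using System.IO;
using System.Net;

static class BulkClient
{
    // Assumption: one way to get "no concurrent connection limit" is to raise
    // the per-host default (normally 2 for client apps on .NET Framework).
    public static void RemoveConnectionLimit()
    {
        ServicePointManager.DefaultConnectionLimit = int.MaxValue;
    }

    // Hypothetical PostTo: blocks the calling thread until the server replies.
    public static void PostTo(string url, string file)
    {
        byte[] body = File.ReadAllBytes(file);

        var request = (HttpWebRequest)WebRequest.Create(url);
        request.Method = "POST";
        request.ContentType = "application/json";
        request.ContentLength = body.Length;

        using (Stream stream = request.GetRequestStream())
        {
            stream.Write(body, 0, body.Length);
        }

        // Synchronous wait for the response; the response body is ignored here.
        using (request.GetResponse())
        {
        }
    }
}
```

The point of the sketch is that each call holds its thread for the full duration of the request, which is worth keeping in mind when thinking about the question below.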

That should be enough information to figure out what is going on.

Why did the application crash?


Comments

28 Apr 2010
09:19 AM

Henning Anderssen

The files in Directory.GetFiles("Docs","*.json") are in the same directory as http://localhost:9090/bulk_docs, so you have an ever-increasing file count?

28 Apr 2010
09:20 AM

Peter Ibbotson

Wild guess is that it ran out of IP source port numbers?

28 Apr 2010
09:23 AM

Directory.GetFiles("Docs","*.json") should be Directory.EnumerateFiles("Docs","*.json") if you want to be Parallel.

28 Apr 2010
10:09 AM

Ayende Rahien

Henning,

No, there is no association between the two.

28 Apr 2010
10:10 AM

Ayende Rahien

Peter,

No, we haven't got that. But I have run into this before.

It usually only pops up when using HTTPS or authenticated connections, though.

LS,

Actually, no, we parallelize the action, not the enumeration, but thanks for letting me know about the new API.

28 Apr 2010
10:25 AM

Henning Anderssen

Your test client is sending more requests than the server can handle; maybe you're using some sort of queue on the server which overflows.

Wild guessing from my side.

28 Apr 2010
10:27 AM

Is it because the directory contains too many files?

28 Apr 2010
10:32 AM

Frank Quednau

Depending on how your test is set up, could it be that Parallel.ForEach and RavenDB are getting worker threads from the same thread pool?

28 Apr 2010
10:40 AM

manningj

hit OOM because the server was buffering all the post'ed files? It's gotta get the whole request (including file contents) into memory before passing it along AFAIK

28 Apr 2010
11:09 AM

Rafal

What about the underlying database? Maybe it had some concurrency problems: deadlocks, transaction timeouts, or running out of pooled connections?

28 Apr 2010
11:10 AM

Tim van der Weijde

A wild guess: does the Directory.GetFiles() method return a non-generic collection instead of a generic one? If so, you should cast it.

28 Apr 2010
11:15 AM

Paul

It effectively DoS'd the server by uploading too many files at one time (there were more parallel threads running on the client than the server could accept, so they started to time out).

28 Apr 2010
11:21 AM

Richard Dingwall

90,000 files @ 100-200ms each, no limit on the degrees of parallelization - lemme guess you had around 8,000 threads active, with 1MB stack allocated each, and hit OOM?

28 Apr 2010
11:46 AM

Barry

Was it getting the same set of files ..

28 Apr 2010
11:49 AM

Uriel Katz

Richard: it is single-threaded.

You hit the max sockets / file descriptors per process or system.

msdn.microsoft.com/.../ms739169%28VS.85%29.aspx

The default is 64.

28 Apr 2010
11:49 AM

Dan Finucane

Unless you modify the registry to increase the limit, WININET makes at most two distinct connections to the same remote host, so you are only going to benefit from two threads. The other threads are going to block waiting for one of the two connections, and if you have more than two processors in your system you are going to spin up more and more threads out of the .NET thread pool, all of them blocking and taking up 1-2 MB of virtual address space.

28 Apr 2010
12:12 PM

tobi

The thread-pool was spawning more and more threads (max by default is 250) because from its perspective the work was IO bound (waiting on the posts). It tries to saturate the CPU by spawning more threads.

28 Apr 2010
12:32 PM

Shaun

Is PostTo doing an async post? I can't imagine how Parallel.ForEach would be bogging down the server since it limits the number of parallel tasks to the number of cores that you have. So if you are doing synchronous POST requests, it is only going to be posting 2-4 requests at a time, which is obviously not a lot.

28 Apr 2010
12:35 PM

jonnii

Is it something to do with the fact that you're posting to the same uri over and over again?

I can imagine a scenario where at some point you decide to persist the documents, by recursively walking the documents to be written and because there are so many you end up blowing the stack somehow.

28 Apr 2010
13:17 PM

otsdr

Does it have anything to do with TIME-WAIT? msdn.microsoft.com/en-us/library/ms819739.aspx

28 Apr 2010
13:35 PM

Dag

HttpWebRequest.KeepAlive was set to its default "true" value?

28 Apr 2010
14:21 PM

tobi

I am impressed because many creative solutions have been posted. By coincidence I faced the same issue 5 minutes ago. It was the thread pool. Breaking in the debugger and executing ThreadPool.SetMaxThreads in the immediate window helped, so I did not have to restart my long-running batch job.

28 Apr 2010
17:08 PM

Dan Finucane

The .NET thread pool does not create a thread unless there is a processor/core on your system that is doing nothing. If there are no processors available, the thread pool puts your request in a queue. You shouldn't use ThreadPool.SetMaxThreads. The problem is that a thread is created and it blocks immediately when WININET already has two connections to a given host. When it blocks, the processor it was running on is freed and the thread pool takes a request out of its queue and schedules a thread. You end up with all these threads blocked, each taking up 1-2 MB of virtual address space, and they are all waiting for the same WININET resource to become available.

28 Apr 2010
17:11 PM

Dan Finucane

This article discusses the WININET limit: http://support.microsoft.com/kb/183110

28 Apr 2010
18:11 PM

Francisco Velazquez

Maybe it could be a problem with the max HTTP connections per server:

stackoverflow.com/.../improving-performance-of-...

28 Apr 2010
20:59 PM

Felix

Don't know if Parallel.ForEach uses some sort of I/O completion port, but if so, I would think that the blocking time spent waiting for the socket reply (the HTTP request) will be used to open other file handles, and eventually you run out of the maximum file handles available. If I remember well, file handles are limited to some not so large count, in order to prevent buggy/malicious software from harming the system.

29 Apr 2010
07:47 AM

Derek Fowler

Are you enumerating the entire contents of bulk_docs for every request to check your filename is unique?

30 Apr 2010
12:20 PM

Mark

Because you dumped 90,000 tasks into the Parallel Framework task scheduler?

30 Apr 2010
13:03 PM

Ayende Rahien

Actually, it handled that really nicely.

Oren Eini

CEO of RavenDB
