Ayende @ Rahien

Hi!
My name is Ayende Rahien
Founder of Hibernating Rhinos LTD and RavenDB.

How to get REALLY fast benchmarks


I was very excited when I tried this code, it produced amazing performance:

private static void LoadDataFor(string searchPattern)
{
    foreach (var file in Directory.GetFiles("Docs", searchPattern).OrderBy(x=>x))
    {
        var sp = Stopwatch.StartNew();
        var httpWebRequest = (HttpWebRequest)WebRequest.Create("http://localhost:8080/bulk_docs");
        httpWebRequest.Method = "POST";
        using(var requestStream = httpWebRequest.GetRequestStream())
        {
            var readAllBytes = File.ReadAllBytes(file);
            requestStream.Write(readAllBytes, 0, readAllBytes.Length);
        }
        Console.WriteLine("{0} - {1}", Path.GetFileName(file), sp.Elapsed);
    }
}

Can you figure out what the fatal flaw in this code is?


Comments

Marco De Sanctis

Uhm... perhaps you're not flushing the request stream, and thus you're only measuring how long it takes to write the data into the stream's internal buffer?

Demis Bellot

What @Marco said, plus you're not checking the response so you don't know if the request was successful.

Marco De Sanctis

Uh oh... nope, sorry, I made a mistake... The request stream is within a using block, so you're actually flushing it! It's Sunday morning here in Italy, and my brain started in Safe Mode today, LOL ;-)

Philipp

Another try (my last comment was flagged as spam): It might be that the server doesn't get the request until you invoke GetResponse due to buffering.

Not related to the performance issue: Stopping the Stopwatch might be a good idea, or directly encapsulating the whole measurement. I did something similar here:

http://tinyurl.com/39op9j7

Sam

If you're trying to benchmark the overall performance of loading all the documents in the directory, then I'm guessing that the 'var sp = Stopwatch.StartNew();' should be outside the foreach loop.

Otherwise the elapsed time is going to get reset on each iteration.
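Sam's variant could be sketched like this (a hypothetical, stripped-down sketch, not code from the post; "*.json" stands in for the real searchPattern and the upload itself is elided):

```csharp
using System;
using System.Diagnostics;
using System.IO;
using System.Linq;

class TotalTiming
{
    static void Main()
    {
        // Start the timer once, before the loop, to measure the whole run.
        var total = Stopwatch.StartNew();
        foreach (var file in Directory.GetFiles("Docs", "*.json").OrderBy(x => x))
        {
            // ... upload the file here, as in the post ...
        }
        total.Stop();
        Console.WriteLine("All files took {0}", total.Elapsed);
    }
}
```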

Merill Fernando

Marco was right, you need to call GetResponse to perform the upload. If not, nothing is going to be uploaded.

leonard

Should this httpWebRequest be created in the loop?

Ken Egozi

As people pointed out, GetResponse is what actually fires the HTTP request.

@Leonard - you should not reuse HttpWebRequest instances. Consecutive calls to GetResponse() will return a cached result of the first call, without re-issuing an HTTP request.

btw, I'd use WebClient as long as you do not need to fiddle too much with the returned Response in terms of headers, certificates, etc.
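Ken's WebClient suggestion might look roughly like this (a sketch, not code from the post; "*.json" is a stand-in for the real search pattern). UploadData both sends the POST body and reads the response, so the per-file timing covers the actual round trip:

```csharp
using System;
using System.Diagnostics;
using System.IO;
using System.Linq;
using System.Net;

class BulkLoad
{
    static void Main()
    {
        using (var client = new WebClient())
        {
            foreach (var file in Directory.GetFiles("Docs", "*.json").OrderBy(x => x))
            {
                var sp = Stopwatch.StartNew();
                // UploadData sends the request body and waits for the response,
                // so sp.Elapsed includes the real network round trip.
                client.UploadData("http://localhost:8080/bulk_docs", "POST",
                                  File.ReadAllBytes(file));
                Console.WriteLine("{0} - {1}", Path.GetFileName(file), sp.Elapsed);
            }
        }
    }
}
```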

manningj

Haven't tested it much (would definitely only work for sync, for instance), but:

public class TimedWebClient : WebClient
{
    public TimeSpan LastResponseTime { get; private set; }

    protected override WebResponse GetWebResponse(WebRequest request)
    {
        var stopWatch = Stopwatch.StartNew();
        var response = base.GetWebResponse(request);
        this.LastResponseTime = stopWatch.Elapsed;
        return response;
    }
}

consume like:

static void Main(string[] args)
{
    var webClient = new TimedWebClient();
    var url = "http://www.google.com";
    webClient.DownloadString(url);
    Console.WriteLine("Downloading {0} took {1} ms", url, webClient.LastResponseTime.TotalMilliseconds);
    Console.ReadLine();
}

Shane

You initialized your variable inside the foreach loop, so you are only measuring each individual request, and none of the file access or the overall time across requests.

var sp = Stopwatch.StartNew();
foreach (var file in Directory.GetFiles("Docs", searchPattern).OrderBy(x=>x))
{
    //var sp = Stopwatch.StartNew();

Aaron Carlson

You never hit Stop on the stop watch.

Magesh

The http request is not submitted until the "GetResponse()" method of the HttpWebRequest class is invoked. The code in the blog post writes the data to the local stream and discards it without initiating the actual web request.
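Putting that together, a corrected version of the method from the post might look like this (a sketch; the key change is the GetResponse call, which is what actually sends the request, plus a using block that disposes the response so the connection is released back to the pool):

```csharp
private static void LoadDataFor(string searchPattern)
{
    foreach (var file in Directory.GetFiles("Docs", searchPattern).OrderBy(x => x))
    {
        var sp = Stopwatch.StartNew();
        var httpWebRequest = (HttpWebRequest)WebRequest.Create("http://localhost:8080/bulk_docs");
        httpWebRequest.Method = "POST";
        using (var requestStream = httpWebRequest.GetRequestStream())
        {
            var readAllBytes = File.ReadAllBytes(file);
            requestStream.Write(readAllBytes, 0, readAllBytes.Length);
        }
        // GetResponse actually sends the request and waits for the server;
        // disposing the response releases the connection for reuse.
        using (var response = (HttpWebResponse)httpWebRequest.GetResponse())
        {
            sp.Stop();
            Console.WriteLine("{0} - {1} ({2})", Path.GetFileName(file), sp.Elapsed, response.StatusCode);
        }
    }
}
```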

Justin

The statement "Directory.GetFiles("Docs", searchPattern).OrderBy(x=>x)"

By default, GetFiles usually returns the files in alphabetical order anyway. Also, GetFiles is a blocking operation that waits for the entire array to be built up.

Therefore, the Stopwatch only starts measuring after those more expensive operations have already run. It also only measures one cycle of the foreach loop at a time... not the overall response time of the ENTIRE method.

Manu

You must call the Stream.Close method to close the stream and release the connection for reuse. Failure to close the stream causes your application to run out of connections.

Leyu Sisay

@Justin I think what you said above is correct, but when using Directory.GetFiles the order of the returned file names is not guaranteed, so using OrderBy is necessary.

meisinger

you have to either consume or close the ResponseStream

the very least that you have to do is call GetResponse().Close()

the "fatal flaw" being that you are not closing the underlying HTTP connection

at some point in time you are going to run out of available connections and be blocked waiting for an HTTP connection to timeout

testing and even "production" running code will work given a small number of files

Steve Py

I'd say it's the missing requestStream.Close within the loop. My guess would be that the Close operation would block the thread until the upload was complete so without it, (aside from the risk of running out of connections) your timer wouldn't reflect the actual upload performance. Hence, a false report of smashing performance.

Gavin

I would use DirectoryInfo instead of Directory to get the file listing. It does fewer security checks, and combined with FileInfo.OpenText it should speed things up when processing a significant number of files ...

Magesh

Would you care to post the solution?

Ayende Rahien

Magesh,

KAE figured that out :-)


Comments have been closed on this topic.
