Code review challenge: The concurrent dictionary refactoring–answer


Dec 31 2015


time to read 4 min | 612 words

Here is the full method that we refactored:

public void ReturnMemory(byte* pointer)
{
    var memoryDataForPointer = GetMemoryDataForPointer(pointer);

    _freeSegments.AddOrUpdate(memoryDataForPointer.SizeInBytes, x =>
    {
        var newQueue = new ConcurrentStack<AllocatedMemoryData>();
        newQueue.Push(memoryDataForPointer);
        return newQueue;
    }, (x, queue) =>
    {
        queue.Push(memoryDataForPointer);
        return queue;
    });
}

And here is what the compiler actually generates for this method, which shows where the allocations come from:

public unsafe void ReturnMemory(byte* pointer)
{
    <>c__DisplayClass9_0 CS$<>8__locals0 = new <>c__DisplayClass9_0();
    CS$<>8__locals0.memoryDataForPointer = this.GetMemoryDataForPointer(pointer);
    this._freeSegments.AddOrUpdate(CS$<>8__locals0.memoryDataForPointer.SizeInBytes,
        new Func<int, ConcurrentStack<AllocatedMemoryData>>(CS$<>8__locals0.<ReturnMemory>b__0),
        new Func<int, ConcurrentStack<AllocatedMemoryData>, ConcurrentStack<AllocatedMemoryData>>(CS$<>8__locals0.<ReturnMemory>b__1));
}

As you can see, we are actually allocating three objects here: the captured variables class generated by the compiler (<>c__DisplayClass9_0) and two delegate instances. We pay for all three on every call, regardless of whether we end up adding or updating.
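
To make the difference between capturing and non-capturing lambdas concrete, here is a minimal sketch. The class and method names (LambdaAllocationDemo, NonCapturing, Capturing) are made up for illustration and are not part of the code base discussed in this post. It relies on the current C# compiler caching delegates for lambdas that capture nothing, which is an implementation detail rather than a language guarantee.

```csharp
using System;

public static class LambdaAllocationDemo
{
    // Non-capturing lambda: it only uses its own parameter, so the compiler
    // can create the delegate once, keep it in a static field and reuse it.
    public static Func<int, int> NonCapturing()
    {
        return x => x + 1;
    }

    // Capturing lambda: it reads 'offset' from the enclosing method, so every
    // call needs a closure object to hold that value and a new delegate
    // bound to that closure.
    public static Func<int, int> Capturing(int offset)
    {
        return x => x + offset;
    }

    public static void Main()
    {
        // Same delegate instance on both calls when the compiler caches it.
        Console.WriteLine(ReferenceEquals(NonCapturing(), NonCapturing()));

        // A different delegate instance on every call.
        Console.WriteLine(ReferenceEquals(Capturing(1), Capturing(1)));
    }
}
```

With the Roslyn compiler I would expect the first line to print True and the second to print False. Both lambdas passed to AddOrUpdate above fall into the second category, because they both use memoryDataForPointer.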

The refactored code looks like this:

public void ReturnMemory(byte* pointer)
{
    var memoryDataForPointer = GetMemoryDataForPointer(pointer);

    var q = _freeSegments.GetOrAdd(memoryDataForPointer.SizeInBytes, size => new ConcurrentStack<AllocatedMemoryData>());
    q.Push(memoryDataForPointer);
}

And what actually gets called is:

public unsafe void ReturnMemory(byte* pointer)
{
    Interlocked.Increment(ref this._returnMemoryCalls);
    AllocatedMemoryData memoryDataForPointer = this.GetMemoryDataForPointer(pointer);
    if (<>c.<>9__9_0 == null)
    {
        <>c.<>9__9_0 = new Func<int, ConcurrentStack<AllocatedMemoryData>>(this.<ReturnMemory>b__9_0);
    }
    this._freeSegments.GetOrAdd(memoryDataForPointer.SizeInBytes, <>c.<>9__9_0).Push(memoryDataForPointer);
}

The field (<>c.<>9__9_0) is actually a static field, so the delegate is only allocated once, on the first call. The closure and the two per-call delegate allocations are gone.
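
If you want to play with the pattern outside of our code base, here is a small self-contained sketch of it. Segment and FreeSegmentPool are hypothetical stand-ins that I made up for this example (Segment plays the role of AllocatedMemoryData), so treat it as an illustration of the GetOrAdd approach and not as the actual implementation.

```csharp
using System;
using System.Collections.Concurrent;

// Hypothetical stand-in for AllocatedMemoryData; only the size matters here.
public sealed class Segment
{
    public int SizeInBytes { get; private set; }

    public Segment(int sizeInBytes)
    {
        SizeInBytes = sizeInBytes;
    }
}

// Hypothetical pool that keeps one stack of free segments per size.
public sealed class FreeSegmentPool
{
    private readonly ConcurrentDictionary<int, ConcurrentStack<Segment>> _freeSegments =
        new ConcurrentDictionary<int, ConcurrentStack<Segment>>();

    public void Return(Segment segment)
    {
        // The factory uses nothing from the enclosing method, so it captures
        // no state. The push happens outside the lambda, on the stack that
        // GetOrAdd returns.
        var stack = _freeSegments.GetOrAdd(segment.SizeInBytes,
            size => new ConcurrentStack<Segment>());
        stack.Push(segment);
    }

    public bool TryRent(int sizeInBytes, out Segment segment)
    {
        segment = null;
        ConcurrentStack<Segment> stack;
        return _freeSegments.TryGetValue(sizeInBytes, out stack)
            && stack.TryPop(out segment);
    }
}

public static class Program
{
    public static void Main()
    {
        var pool = new FreeSegmentPool();
        pool.Return(new Segment(4096));
        pool.Return(new Segment(4096));

        Segment rented;
        Console.WriteLine(pool.TryRent(4096, out rented)); // True
        Console.WriteLine(pool.TryRent(4096, out rented)); // True
        Console.WriteLine(pool.TryRent(4096, out rented)); // False, the stack is empty
    }
}
```

One thing to keep in mind: under a race, GetOrAdd may invoke the factory more than once, but only one of the created stacks is stored in the dictionary and that stored stack is what every caller gets back. Because the push is done on the returned stack, no segment is lost in that case.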


Tags:

challanges

Comments

01 Jan 2016
17:33 PM

HarryDev

Ha, either bug or perf... it is not always that my "perf first" mentality is right, but great.

I would make a static readonly member for the func to eliminate the branch as well.

01 Jan 2016
20:32 PM

Oren Eini

HarryDev, I think that branch prediction should pretty much null that issue, but I decided to test the difference. So I created three tests, one for the version above, one using an instance field and one using a static field. https://gist.github.com/ayende/3cf665c613e90d9320f8

Below are the results for 32 bits and 64 bits. On 32 bits, the method above is pretty much on par with the other two (the difference is very small), but on 64 bits it is significantly faster. I'm not really quite sure _why_, but those are the results I'm getting.

BenchmarkDotNet=v0.8.0.0
OS=Microsoft Windows NT 6.2.9200.0
Processor=Intel(R) Core(TM) i7-4710MQ CPU @ 2.50GHz, ProcessorCount=8
HostCLR=MS.NET 4.0.30319.42000, Arch=32-bit
Type=Program  Mode=Throughput  Platform=HostPlatform  Jit=HostJit  .NET=HostFramework  toolchain=Classic  Runtime=Clr  Warmup=5  Target=10

Method               | AvrTime     | StdDev    | op/s         |
-------------------- |------------ |---------- |------------- |
CompilerDoesWork     | 695.5009 ns | 4.1586 ns | 1,437,862.04 |
ReadOnlyField        | 679.8743 ns | 5.9000 ns | 1,470,966.85 |
ReadOnlyFieldStatic  | 686.0820 ns | 7.5863 ns | 1,457,724.95 |

BenchmarkDotNet=v0.8.0.0
OS=Microsoft Windows NT 6.2.9200.0
Processor=Intel(R) Core(TM) i7-4710MQ CPU @ 2.50GHz, ProcessorCount=8
HostCLR=MS.NET 4.0.30319.42000, Arch=64-bit [RyuJIT]
Type=Program  Mode=Throughput  Platform=HostPlatform  Jit=HostJit  .NET=HostFramework  toolchain=Classic  Runtime=Clr  Warmup=5  Target=10

Method               | AvrTime     | StdDev     | op/s         |
-------------------- |------------ |----------- |------------- |
CompilerDoesWork     | 712.0287 ns |  8.1491 ns | 1,404,617.41 |
ReadOnlyField        | 829.7725 ns | 75.5855 ns | 1,214,984.00 |
ReadOnlyFieldStatic  | 798.9948 ns | 70.2898 ns | 1,260,914.23 |

01 Jan 2016
20:33 PM

Oren Eini

Here are the results in a format that is easy to read:

https://gist.github.com/ayende/3cf665c613e90d9320f8#file-results-txt

02 Jan 2016
10:22 AM

HarryDev

I copied your test code, added some platform attributes and got the results below (also see https://gist.github.com/nietras/12941e429df2b085c0f2 ).

As you can see I don't get the big difference you get, also the variance is a lot smaller, indicating that when you ran the test something odd was going on. Did you do these tests on a laptop?

BenchmarkDotNet=v0.8.0.0
OS=Microsoft Windows NT 6.2.9200.0
Processor=Intel(R) Core(TM) i5-3475S CPU @ 2.90GHz, ProcessorCount=4
HostCLR=MS.NET 4.0.30319.42000, Arch=32-bit
Type=Program  Mode=Throughput  .NET=HostFramework  toolchain=Classic  Runtime=Clr  Warmup=5  Target=10

Method               | Platform | Jit       | AvrTime     | StdDev    | op/s         |
-------------------- |--------- |---------- |------------ |---------- |------------- |
CompilerDoesWork     | X64      | LegacyJit | 697.8024 ns | 2.8879 ns | 1,433,094.11 |
CompilerDoesWork     | X64      | RyuJit    | 702.7631 ns | 5.1605 ns | 1,423,028.09 |
CompilerDoesWork     | X86      | LegacyJit | 673.8250 ns | 4.7500 ns | 1,484,135.65 |
ReadOnlyField        | X64      | LegacyJit | 698.0743 ns | 7.4199 ns | 1,432,666.90 |
ReadOnlyField        | X64      | RyuJit    | 698.6331 ns | 5.3205 ns | 1,431,445.49 |
ReadOnlyField        | X86      | LegacyJit | 676.5499 ns | 6.0682 ns | 1,478,201.20 |
ReadOnlyFieldStatic  | X64      | LegacyJit | 707.8221 ns | 3.8842 ns | 1,412,825.46 |
ReadOnlyFieldStatic  | X64      | RyuJit    | 703.4171 ns | 5.4386 ns | 1,421,713.31 |
ReadOnlyFieldStatic  | X86      | LegacyJit | 673.0903 ns | 5.4836 ns | 1,485,778.83 |

02 Jan 2016
10:35 AM

HarryDev

I definitely didn't expect any significant difference between the versions, but it's one line of code and this eliminates the "implicit" knowledge that the lambda in question does not get allocated each time, thus reducing code "duplicity". Also, I would be somewhat concerned given the lambda is stored in a field accessed by multiple threads. So is a single threaded test showing any possible (albeit minute) issues? Probably not. I would argue code readability and knowledge is the primary argument for me. If I saw the code I would make a mental note of it as having a probable alloc issue, even though it does not for a given compiler...

I would note that the test code is very different from the code in production. The fact that you have a for-loop changes a lot with regards to code generation etc. E.g. inlining and such.

Can't wait for BenchmarkDotNet to add GC/memory alloc profiling as well. Also a source diagnoser that can show actual machine assembly code. Pretty awesome.

02 Jan 2016
10:56 AM

Oren Eini

HarryDev, We actually got into a discussion about this particular issue during an interview (with the team members, not the interviewee). That is something that we teach the devs to look and understand :-)

19 Jan 2016
10:48 AM

Matt Warren

@Harry

Can't wait for BenchmarkDotNet to add GC/memory alloc profiling as well. Also a source diagnoser that can show actual machine assembly code. Pretty awesome. Well it already has a diagnoser that can show the assembly code, see this tweet for more info https://twitter.com/matthewwarren/status/649587739915079684. BUT we need to improve our documentation to tell people about it! With regards to the GC/memory alloc profiling, yes that's coming too, see https://github.com/PerfDotNet/BenchmarkDotNet/issues/56

19 Jan 2016
10:49 AM

Matt Warren

Uggh, something went wrong with the formatting of my comment, hopefully it makes sense though!

"Well it already has a diagnoser that can show the assembly code, see this tweet for more info https://twitter.com/matthewwarren/status/649587739915079684. BUT we need to improve our documentation to tell people about it! With regards to the GC/memory alloc profiling, yes that's coming too, see https://github.com/PerfDotNet/BenchmarkDotNet/issues/56"


Oren Eini

CEO of RavenDB
