PR Review: Encapsulation stops at the assembly boundary
The following set of issues all fall into code that is used within the scope of a single assembly, and that is important. I’m writing this blog post before I’ve had the chance to talk to the dev in question, so I’m guessing about intent.
This change is likely motivated by the fact that callers are not expected to modify the resulting dictionary.
That said, this is used between different components in the same assembly, and is never exposed outside. That means that we have a much higher level of trust between the components, and reading IReadOnlyDictionary means that we need to spend more cycles trying to figure out who we are trying to protect against.
Equally important, in this case, the Dictionary methods can be called without any virtual call overhead, while the IReadOnlyDictionary needs interface dispatch to work.
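To make the trade-off concrete, here is a rough sketch of the shape of the change as I read it from the comment thread below; the DocumentsStorage class name, the backing field, and the ReadOnly-suffixed variant are placeholders, not the actual code from the PR:

```csharp
using System.Collections.Generic;

public class DocumentsStorage // placeholder class name, for illustration only
{
    private readonly Dictionary<string, long> _lastProcessedTombstones =
        new Dictionary<string, long>();

    // Roughly the reviewed change: expose the data behind IReadOnlyDictionary.
    // Every lookup made by the caller now goes through interface dispatch.
    public IReadOnlyDictionary<string, long> GetLastProcessedDocumentTombstonesPerCollectionReadOnly()
    {
        return _lastProcessedTombstones;
    }

    // Within a single assembly, returning the concrete type keeps the calls
    // direct (non-virtual) and matches the higher trust between components.
    public Dictionary<string, long> GetLastProcessedDocumentTombstonesPerCollection()
    {
        return _lastProcessedTombstones;
    }
}
```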
This is a case that is a bit more subtle. The existingData variable is passed to a method. The problem is that, in this case, no one is ever going to send null, and sending a null is actually an error.
In this case, if we did get a null, I would rather the code immediately crash with a “what just happened?” error than limp along with bad data.
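The snippet under review is shown as an image in the original post, so the following is only a reconstruction based on the discussion in the comments; the TombstoneProcessor name and the method shape are invented. The defensive ?? quietly papers over a null, while removing it, or checking explicitly, makes the bad call fail at the point of the mistake:

```csharp
using System;
using System.Collections.Generic;

public static class TombstoneProcessor // placeholder name, for illustration only
{
    public static void Process(Dictionary<string, long> existingData)
    {
        // The pattern being criticized, roughly:
        // existingData = existingData ?? new Dictionary<string, long>(); // in case it is null

        // If null is a caller bug, fail fast instead. Either let the first use
        // throw a NullReferenceException, or state the contract explicitly:
        if (existingData == null)
            throw new ArgumentNullException(nameof(existingData));

        existingData["docs/1"] = 42;
    }
}
```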
More posts in "PR Review" series:
- (19 Dec 2017) The simple stuff will trip you
- (08 Nov 2017) Encapsulation stops at the assembly boundary
- (25 Oct 2017) It’s the error handling, again
- (23 Oct 2017) Beware the things you can’t see
- (20 Oct 2017) Code has cost, justify it
- (10 Aug 2017) Errors, errors and more errors
- (21 Jul 2017) Is your error handling required?
- (23 Jun 2017) avoid too many parameters
- (21 Jun 2017) the errors should be nurtured
Comments
The second case looks like a classic "On Error Resume Next" style of code that is trying to be helpful but, I think, is generally far more harmful. It can be hard to explain to less-experienced developers why hard failure is preferable to trying to helpfully handle error cases, at least until they learn the pain of debugging a system that isn't giving you a clear signal of an error and instead just does the wrong thing.
The first case is interesting to me: I often overlook the cost of virtual dispatch for the sake of making the contracts between components clearer. I wonder if a happier middle-ground would be to use ImmutableDictionary? I much prefer the strong behaviour (you can never change this dictionary) versus the weaker contract (you can't modify this dictionary, but we don't promise nobody else will), and you'd eliminate the virtual dispatch too.
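For illustration only, a minimal sketch of what that middle ground might look like, using an invented TombstoneTracker type (and keeping in mind the performance caveats raised in the reply below):

```csharp
using System.Collections.Immutable;

public class TombstoneTracker // invented name, for illustration only
{
    private ImmutableDictionary<string, long> _lastProcessed =
        ImmutableDictionary<string, long>.Empty;

    public void Record(string collection, long etag)
    {
        // Each "mutation" produces a new dictionary; anyone holding a previous
        // instance keeps an unchanged snapshot, so what you hand out can never change.
        _lastProcessed = _lastProcessed.SetItem(collection, etag);
    }

    public ImmutableDictionary<string, long> GetLastProcessed() => _lastProcessed;
}
```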
Paul, One thing about ImmutableDictionary is that I run into a lot of perf problems with them. See: https://ayende.com/blog/164739/immutable-collections-performance
In the first case, if the code is used within a single (trusted) assembly, shouldn't the interface be internal instead of public? That way, you can safely expose a Dictionary and know that it (probably) won't be modified.
Jorge, Internal means that we have more complications when we need to deal with internal tools. Tests, other stuff that plugs into it. In general I don't like using public / internal to place such boundaries.
GetLastProcessedDocumentTombstonesPerCollection ... well, if you're going to add documentation in method names, I suggest making it more detailed ... something like 'GetLastProcessedDocumentTombstonesPerCollectionSinceThisMethodWasCalledAndTheDocumentsWereMakedAToucheOrTheLastSinceStartOf ... TheEnd' ;) Joking aside, I think a shorter name like GetLatestThombsones with doc comments is actually more readable.
Pop Catalin, Why? To start with, we usually don't use doc comments in the internals. Second, this is something that needs to be read at the call site, not at the function itself. It is much better to be clear about things.
RavenDB used to have a property named:
EnableBasicAuthenticationOverUnsecureHttpEvenThoughPasswordsWouldBeSentOverTheWireInClearTextToBeStolenByHackers
@Ayende, I think well-placed doc comments in internal code can greatly reduce source code churn for developer onboarding. Source code churn can be a steep artificial barrier to entry for new developers. You don't need to read the code and know how a settings cache is implemented (until you actually have a need to do it); a description is enough if it captures the purpose. I.e.: "stores the settings in memory and automatically reloads them whenever an update operation is intercepted from ...." This can save minutes to hours of code reading in a phase where the developer should skip it anyway.
Pop Catalin, In 95% of the cases, you can express the same thing in the name as in a doc comment. Cache.Get is a good example: what can you really say in a doc comment that isn't already expressed? https://github.com/dotnet/corefx/blob/master/src/System.Runtime.Caching/src/System/Runtime/Caching/MemoryCache.cs#L634
And things like "reloads them automatically"? That is not supposed to be on a Get method. There is a reason why comment rot is such an insidious thing, and it is easy to get things completely out of whack. That just leads to more confusion. I would rather have clear code.
And note that I'm talking about internal stuff, not things that are externally exposed.
It looks like I was part of that conversation on the BCL immutable collections four years ago. I had forgotten how badly they under-perform for your use cases. I see that you ended up with your own ImmutableAppendOnlyList for Voron. I don't suppose you have another magnificent type that fits this case?
Paul, Afraid not. We do a lot of evil stuff around buffering and reusing things to get this working.
It's the same reason source code rots: people don't take that extra 5 minutes at the end of a change to refactor and clean up their changes.
Given the choice, me too. If, however, I can have both clear code and doc comments, I would gladly take both.
Pop Catalin, Actually, the problem is that we don't look at comments when we edit the code once we know what is going on, so we never think of fixing them. And if you repeat information in many places, it is going to deteriorate quickly.
And given that you can't have your cake and eat it too...
IMHO, in the first case, it is always better to have more “readonly” and “immutable” stuff, always, always. Even for internal work. And change it only if you have to. It’s a pity that the containers you mention have performance issues, but unless you are implementing super-highly-performant services, these concerns look like premature optimization.
For the second case I prefer to do more checking. Is it contrary to failing early? Probably. But if you have many components interacting with each other, IMHO it is better for them to be more forgiving of each other than to expect things very strictly.
Having explicit READONLY is great, but definitely not at the cost of performance.
In general, in C# any collection interface is a code smell (except IEnumerable, of course). People dragging their Java/C++ wisdom into C# create undue trouble, so keep bashing them on the small points and they'll learn quicker.
If the CLR did have a decent ReadonlyDictionary, that's what would be sensible to use. Of course, ImmutableDictionary isn't a trivial replacement perf-wise, especially in edge cases. Whether it fits here is hard to say. It may well fit; memory-wise it could win over a flatter, sparser hashtable implementation.
Of course, if you don't use ImmutableDictionary widely anyway, picking it for one specific use case hurts maintainability. Every time a developer stumbles upon it, she has to STOP AND THINK. Wow, is this important? Why not a normal one? Thinking leads to errors.
Checking for null should be done using if statements. Apply the ruler to the fingers of those who disobey.
Note how ?? needs a comment? That's why. A dumb if (existingData == null) wouldn't, BECAUSE IT'S EASIER TO UNDERSTAND WITHOUT COMMENTS.
If existingData cannot be null, fine, remove it. Otherwise ask them to share a single empty instance and save on memory. An obvious plus for performance-sensitive code!
Andrzej, We are implementing super highly performant services. I'm literally counting the number of instructions and laying out my memory for efficient processing. Having something that is 10 times more expensive for purity's sake is a no-go.
As for being forgiving, that is a great way to end up with a system that has an error, limps along for a while, and then fails in mysterious ways.
Oleg, Actually, the ?? needed a comment because it wasn't something that we usually do. Not to explain what it does, but the reasoning behind it. We use it quite extensively, and it is great for such scenarios. For example:
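Purely as a hypothetical sketch, not the example from the original comment (the Indexer and IndexingOptions names are invented), one scenario where ?? reads naturally is supplying a default for an intentionally optional argument:

```csharp
public class IndexingOptions
{
    public static readonly IndexingOptions Default = new IndexingOptions();
    public int BatchSize { get; set; } = 1024;
}

public class Indexer
{
    private readonly IndexingOptions _options;

    public Indexer(IndexingOptions options = null)
    {
        // Here null is a legitimate, expected value (the caller opted out of
        // customizing), so ?? documents intent rather than hiding a bug.
        _options = options ?? IndexingOptions.Default;
    }
}
```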
Oren, so you are right if performance is the top priority. Respect! Oleg, how do you know it!? Respect to you too! :) I was a C++ developer for years. After switching to C# I constantly miss two things: “const” for the first case, which strongly promotes and spreads immutability through the system, and references for the second case, where you do not have to check for nulls. BTW, C++ is associated with pointers, but in C# it is pointers everywhere, because you have to check for nulls.
In the end, maybe some functional languages are the rescue here? (Both for immutability and performance.)
Oren, I'm not saying ?? is a bad thing, but it's not for this use case (hence the comments). Having an explicit if here would be better. Not least because semantically it's different: the assignment only happens in the edge case. The original example has existingData redundantly assigned to itself in the much-used hot path.
Andrzej, non-nullability is being added to C# in the next iteration. Frankly, I don't like it: the checks are being bolted on the side, with no guarantees and the whole infrastructure reliant on good intentions rather than fact. It may have been better integrated in C++, although the mental cost of unpicking const-related compiler errors/warnings is a burden.
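That feature later shipped as nullable reference types in C# 8. A minimal sketch of how the opt-in checks behave, with invented type and member names; note that violations surface only as compiler warnings, not runtime guarantees:

```csharp
#nullable enable
using System.Collections.Generic;

public class Consumer // invented name, for illustration only
{
    // With nullable reference types enabled, this parameter is declared as
    // never-null, but the compiler can only warn about violations.
    public void Process(Dictionary<string, long> existingData)
    {
        existingData["docs/1"] = 42;
    }

    public void Caller(Dictionary<string, long>? maybeNull)
    {
        Process(maybeNull); // compiler warning: possible null reference argument
    }
}
```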