I was wrong, reflecting on the .NET design choices
I have been rethinking some of my previous positions with regard to development, and it appears that I have been quite wrong in the past.
In particular, I’m talking about things like:
- Non-virtual by default. Representative post.
- Abstract classes vs. interfaces. Representative post.
Note that those posts are part of a much larger discussion, and both are close to a decade old. They aren’t really relevant anymore, I think, but it still bugs me, and I wanted to outline my current thinking on the matter.
C# is non-virtual by default, while Java is virtual by default. That seems like a minor distinction, but it has huge implications. It means that proxying / mocking / runtime subclassing is a lot easier with Java than with C#. In fact, a lot of frameworks that were ported from Java rely on this heavily, which made them much harder to use in C#. The most common one was NHibernate, and this was one of the chief frustrations that I kept running into.
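To make the proxying point concrete, here is a minimal sketch (the `Order` and `OrderProxy` types are hypothetical, not taken from any real framework) of why non-virtual-by-default gets in the way: a runtime-generated proxy can only intercept members it is allowed to override.

```csharp
public class Order
{
    // Overridable only because it is explicitly marked virtual; in Java,
    // every instance method behaves this way by default.
    public virtual decimal Total => ComputeTotal();

    // Non-virtual: a proxy subclass cannot intercept calls to this member,
    // which is why frameworks like NHibernate require persistent members
    // to be declared virtual.
    public decimal Tax => Total * 0.17m;

    protected virtual decimal ComputeTotal() => 0m;
}

// Roughly what a framework-generated lazy-loading proxy looks like:
public class OrderProxy : Order
{
    public override decimal Total
    {
        get
        {
            LoadFromDatabaseIfNeeded(); // the interception point
            return base.Total;
        }
    }

    private void LoadFromDatabaseIfNeeded()
    {
        // load the entity state on first access (elided)
    }
}
```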
However, given that I’m working on a database engine now, not on business software, I can see a whole different world of constraints. In particular, a virtual method call is significantly more expensive than a direct call, and that adds up quite quickly. One of the things that we routinely do is try to de-virtualize method calls using various tricks, and we are eagerly waiting for .NET Core 2.0 with de-virtualization support in the JIT (we have already started writing code to take advantage of it).
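As an illustration, one well-known manual de-virtualization trick (a sketch only, not necessarily the exact technique we use in our codebase) is to pass the behavior in through a generic parameter constrained to a struct. The JIT specializes the generic code per value type, so the interface call becomes a direct, often inlinable, call instead of a virtual one, and there is no boxing.

```csharp
public interface IComparerStrategy
{
    int Compare(int x, int y);
}

public struct AscendingComparer : IComparerStrategy
{
    public int Compare(int x, int y) => x.CompareTo(y);
}

public static class Sorter
{
    // TComparer is a struct, so comparer.Compare is a direct call after JIT
    // specialization, rather than a virtual interface dispatch.
    public static void InsertionSort<TComparer>(int[] items, TComparer comparer)
        where TComparer : struct, IComparerStrategy
    {
        for (int i = 1; i < items.Length; i++)
        {
            int current = items[i];
            int j = i - 1;
            while (j >= 0 && comparer.Compare(items[j], current) > 0)
            {
                items[j + 1] = items[j];
                j--;
            }
            items[j + 1] = current;
        }
    }
}

// Usage: Sorter.InsertionSort(values, new AscendingComparer());
```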
Another issue is that my approach to software design has changed significantly. Where I would previously do a lot of inheritance and use explicit design patterns, I’m far more inclined toward composition instead. I’m also marking very clear boundaries between My Code and Client Code. In My Code, I don’t try to maintain encapsulation or hide state, whereas with stuff that is expected to be used externally, that is very much the case. But that gives a very different feel to the API and to the usage patterns that we handle.
This also relates to abstract classes vs. interfaces, and why you should care. As a consumer, unless you are busy doing some mocking or suchlike, you likely don’t, but as a library author, that matters a lot to the amount of flexibility you get.
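Here is a hedged sketch of that flexibility, with hypothetical `StorageBase` / `IStorage` types: an abstract base class lets the library author add a member with a default implementation later without breaking existing subclasses, while adding a member to an interface (before C# 8 default interface members) breaks every implementer.

```csharp
public abstract class StorageBase
{
    public abstract void Write(byte[] data);

    // Added in a later version: existing subclasses keep compiling and
    // simply inherit the default behavior.
    public virtual void Flush() { }
}

public interface IStorage
{
    void Write(byte[] data);
    // void Flush();  // adding this would break every existing implementation
}
```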
I think that a lot of this has to do with my viewpoint, not just as an Open Source author, but as someone who runs a project where customers are using us for years on end, and they really don’t want us to make any changes that would impact their code. That leads to a lot more emphasis on backward compatibility (source, binary & behavior), and if you mess it up, you get ricochets from people who pay you money, because you have made their job harder.
Comments
"Where I would previously do a lot of inheritance and explicit design patterns, I’m far more motivated toward using composition, instead. I’m also marking very clear boundaries between My Code and Client Code. In My Code, I don’t try to maintain encapsulation, or hide state, whereas with stuff that is expected to be used externally, that is very much the case."
Sounds like you're predisposed to like Functional Programming. Haskell would be my obvious choice, but since you mentioned Java, maybe Scala is another good alternative worth looking into.
Inheritance is really just a special case of an object that implements some methods by storing delegates. The two models are mostly equivalent. Therefore, deriving from a class is equivalent to reconfiguring that class just like one would configure it with any other data inside of it (such as a string).
The delegate model is closer to composition than inheritance is, and I like it more as well. If a class needs to be injected with behavior, pass in a delegate. There is no need to derive from it, which is a clumsier way to pass in code. And you have no runtime control over inheritance overrides: you can’t programmatically decide to override or not to override.
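For example, a minimal sketch of that point (the `PriceCalculator` types here are made up purely for illustration): behavior injected as a delegate is just data, so it can be chosen or swapped at runtime, while an override is fixed when the subclass is written.

```csharp
using System;

// Inheritance style: the only way to change the discount policy is to subclass.
public class PriceCalculatorBase
{
    public decimal Calculate(decimal price) => price - Discount(price);

    protected virtual decimal Discount(decimal price) => 0m;
}

// Delegate style: the behavior is passed in as data and can be decided at runtime.
public class PriceCalculator
{
    private readonly Func<decimal, decimal> _discount;

    public PriceCalculator(Func<decimal, decimal> discount) => _discount = discount;

    public decimal Calculate(decimal price) => price - _discount(price);
}

// Usage:
// var calc = new PriceCalculator(price => price > 100m ? price * 0.1m : 0m);
// decimal total = calc.Calculate(250m);
```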
I think the main issue with virtual by default is that you, as a class author, now have to support all kinds of reconfigurations of your class. Designing for that correctly (as opposed to ignoring the issue and hoping consumers don't do anything stupid) consumes dev time and creates surface area for bugs. A terrible model on Java's part.
I think this shows that the first version of Java was not meant for rock-solid enterprise software but for smaller, less important, even throwaway software where architecture does not matter that much. Anders Hejlsberg had a very different idea!
I found this blog post quite insightful. This is not just your viewpoint but an objective truth.
So, since .NET includes a de-virtualization mechanism, and you're doing lots of such optimizations by hand, what's the point of having non-virtual methods? Just to save some compiler effort? They could all be virtual, just like in Java.
Appreciate your observations on the value of prioritizing stability to assure the long-term utility ($) of a product. Clear demarcation of what's yours and what's theirs takes effort that is not often accounted for. All sides benefit from friendly paranoia! <g>
Worthy read.
Porting software is a big issue for product management rather than for the programmers themselves. Technology is heavily relied on; changing it BREAKS things. It is known.
I agree with you that composition tends to be preferable, and the main reason is more easily configurable objects. It is far easier to upgrade or reconstruct a class that is composed from different objects than to change the inheritance tree: bigger changes will break the inheritance chain and likely the whole API.
@Rafal, no matter how good the compiler is, it only has partial information to work with. It will never be able to deal efficiently with virtual-by-default without an insane amount of effort. So it always aims for "good enough"; when you need a bit more, the non-virtual-by-default case is actually a very good thing: you don't need to go full-blown C/C++ or to other languages that compile to the JVM, as you have to do in Java.
Rafal, We are both doing de-virtualization by hand and already writing code that will be de-virtualized in the future by the new JIT. Two different things.
Sure, but: 1. Why two different things? If the JIT is the part that should do the de-virtualization, why do you need to do things by hand at all? What about compatibility with different JIT / .NET versions? 2. In the Java world, the JIT is the only place where de-virtualization happens, so you don't have to think about it too early, especially when you don't know how your classes will be used (ORMs, proxying, etc.). They assume it's too early to de-virtualize even at compilation time; the only right time is at runtime. And IMHO this makes a lot of sense. Wouldn't it be easier for you as well?
Rafal, In some cases we have more information than the JIT, so we can do better. In other places, we can lean on what it will do. Java & the JVM are very different, since they don't have structs / value types, for example.
JVM can definitely make a non-virtual call when the class hierarchy permits it, so there's no performance penalty. Even with its potential for abuse, I still believe that virtual by default is the right choice.
How does manual de-virtualization work? AFAIK, the compiler always emits a callvirt instruction for non-static method calls, regardless of whether the method is virtual or not (and regardless of whether the class is sealed or not). Are you extending the C# compiler and overriding the emission code? Are you re-JITting the code at runtime (as profilers do using the profiler API)?
Maayan, Thanks for the question. I wrote a couple of blog posts answering this in depth; they will go live by the end of the week.