Using GOTO in C#
After talking about GOTO in C, I thought that I should point out some interesting use cases for GOTO in C#. Naturally, since C# actually has proper mechanisms for resource cleanup (IDisposable and using), the situation is quite different.
Here is one usage of GOTO in RavenDB’s codebase:
This is used for micro-optimization purposes. The idea is that we put the hot path of this code first, and only jump to the rare part of the code if the list is full. This keeps the size of the method very small, which allows the JIT to inline it in many cases and can substantially improve performance.
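The pattern described here can be illustrated with a minimal sketch. This is not the RavenDB code: the FastList type, its fields, and the AddUnlikely helper are hypothetical names chosen for illustration, and the initial capacity and growth factor are assumptions.

```csharp
using System;
using System.Runtime.CompilerServices;

// Hypothetical list type, used only to illustrate the hot-path-first layout.
public sealed class FastList<T>
{
    private T[] _items = new T[4]; // assumed initial capacity
    private int _size;

    public int Count => _size;

    public T this[int index] => _items[index];

    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    public void Add(T item)
    {
        // Rare case: the backing array is full, so jump out of the hot path.
        if (_size == _items.Length)
            goto Unlikely;

        // Hot path: a store and an increment, placed first in the method.
        _items[_size++] = item;
        return;

    Unlikely:
        // Cold path: kept in a separate method so Add stays small.
        AddUnlikely(item);
    }

    private void AddUnlikely(T item)
    {
        // Assumed growth policy: double the capacity, then append.
        Array.Resize(ref _items, _items.Length * 2);
        _items[_size++] = item;
    }
}
```

Whether the goto changes the generated machine code compared with a plain if/else depends on the JIT version, which is the point debated in the comments below.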
Here is another example, which is a bit crazier:
As you can see, this is a piece of code that is full of gotos, and there is quite a bit of jumping around. The reason we are doing this is, again, performance. In particular, this method sits in a very important hot spot in our code, as you can imagine. Let’s consider a common usage of this:
var val = ReadNumber(buffer, 2);
What would be the result of this call? Well, we asked the JIT to inline the method, and it is small enough that it would comply. We are also passing a constant to the method, so the JIT can simplify it further by checking the conditions. Here is the end result in assembly:
Of course, this is the best case (and a pretty common one for us), where we know what the size would be. If we have to pass a variable, the checks need to be included, but the result is still very small.
In other words, we use GOTO to control the generated machine code as directly as we can, explicitly trading readability for code that is friendlier to the machine and therefore faster.
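For readers without the listing at hand, here is a minimal sketch of the shape such a method could take. It is an assumption-based illustration, not the RavenDB implementation: the NumberReader class, the byte[] parameter, the supported sizes, the little-endian layout, and the ThrowInvalidSize helper are all hypothetical. Only the ReadNumber name and the Successful and Error labels come from the discussion.

```csharp
using System;
using System.Runtime.CompilerServices;

// Hypothetical container class for the sketch.
public static class NumberReader
{
    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    public static int ReadNumber(byte[] buffer, int sizeOfValue)
    {
        int val;

        // When sizeOfValue is a constant at the call site and the method is
        // inlined, the JIT can drop the branches that cannot be taken.
        if (sizeOfValue == sizeof(int))
        {
            val = buffer[0] | (buffer[1] << 8) | (buffer[2] << 16) | (buffer[3] << 24);
            goto Successful;
        }
        if (sizeOfValue == sizeof(short))
        {
            val = buffer[0] | (buffer[1] << 8);
            goto Successful;
        }
        if (sizeOfValue == sizeof(byte))
        {
            val = buffer[0];
            goto Successful;
        }
        goto Error;

    Successful:
        // Single exit point for every successful branch.
        return val;

    Error:
        // The throw lives in its own method so this body stays small.
        ThrowInvalidSize(sizeOfValue);
        return -1; // never reached: the helper always throws
    }

    private static void ThrowInvalidSize(int size)
    {
        throw new ArgumentOutOfRangeException(nameof(size), size, "Unsupported value size.");
    }
}
```

With a buffer starting with the bytes 0x01, 0x02, a call such as ReadNumber(buffer, 2) returns 513 in this sketch, since the value is assembled in little-endian order.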
I'd be interested in hearing why the first example would be preferred over this:
Or even an else condition.
I mean, it's not clear to me how it actually leads to a performance optimisation. :-)
The underlying reason is that you want to keep hot code sequential, to make better use of the CPU frontend and avoid tripping over cache lines. Most of those optimizations are instances of a more general class known as code layout optimizations. I talked about it here: https://youtu.be/DD3w66Ff8Ms?t=20436
Federico, but hot code is sequential in Diego's version of the code. Diego's code actually produces exactly the same IL as the original version.
goto doesn't improve anything in this case.
In ReadNumber, if I use multiple returns instead of gotos, the assembly is the same in the constant case (at least on .NET Core 2.1).
In the variable case, I see different assembly than what's in the gist. And goto does produce more efficient assembly than multiple returns, but not by much (the only difference is that where
Sorry, didn't notice the actual code difference in the first message.
That's a simple example, and in that case there is no difference (there used to be, not long ago). But we have a much larger codebase with a much bigger surface, and these are JIT-based optimizations that shouldn't need to exist once the JIT improves. So all of them follow the same pattern, which lets us roll them back when the issues that prevent the JIT from making the right choice get fixed (which in this case is the Unlikely part: https://github.com/dotnet/coreclr/issues/6024). For example, not long ago, having multiple returns would mess up your code layout. You would use a GOTO to avoid code repetition (pop repetition), and as soon as we are sure no one is using those versions anymore, they will get rolled back in bulk.
So the reason we use one way and not the other is consistency, even though the goto-less version is equivalent.
Moreover, the second piece of code is more or less in the same vein. It was fixed in 2.1 after a few PRs spanning the 1.1, 2.0 and 2.1 releases. For our purposes, it is effectively solved, but we cannot roll it back until we no longer need to support those targets. First, they solved marking throw-only methods as cold code (which has a very important impact on highly inlined code), then they solved the multiple returns, so the goto Successful is not needed anymore, and they went the extra mile to solve that for loops as well (which are not showcased here).
Very uncommon in C# but a common practice in C++ (Linux kernel etc.). If one is surprised by such issues, here is an interesting blog post on this subject: http://250bpm.com/blog:6
I still don't see the point to this - even in the 2nd example you can just return the value instead of using goto. It also means I don't have to scroll down to see what goto actually does.
In a more complex example you should just create a method that does what is in the goto sections. Nobody has ever been able to show me a good use case example of goto that either produced more readable code or more optimized without sacrificing readability.
Anthony, method calls are costly, and we want to avoid that.
I agree with the previous opinions. I'm not convinced that using a GOTO has any advantages over a standard if/else workflow. I personally think that if/else is much easier to read and its structure is easier to follow. GOTO is a tool which might be useful in some cases (I haven't seen any), but it's an odd one which is much easier to misuse than to use properly.
As previously established, just because one .NET JIT implementation/version happens to produce the same machine code for an if and tail calls as it does for gotos, it doesn't mean that all will. When you've got a loop being run tens or hundreds of millions of times per second, eliminating individual instructions in a consistent and reliable manner across all supported versions of .NET is worth gold.
Readability is subjective. A 20-line method being inlined into a very tight loop is not somewhere readability is going to suffer if a few gotos are deployed judiciously.
And quite frankly, anyone who's scared off by a goto under those conditions has no business maintaining that code anyway.
No goto has ever gone into the codebase without a VTune profiling run showing that it improves performance by a decent enough amount. That some gotos are no longer needed on certain newer CLRs doesn't mean they aren't still needed in our codebase for clients running an older one (we have clients still using RavenDB 2.5, which is 6 years old). And definitely, the kind of codebase that should use goto out of necessity is far from the usual code. Such codebases are also rarely seen in the wild; in most cases, they are locked down behind walls.
In ReadNumber, why not just replace the "goto Error" with the code from the Error label?
@Mark Because the preparation work needed to throw an exception would interrupt the flow of execution and cause a cache miss. This way you get a single jump instruction, which plays nicely with the instruction prefetcher.