ChallengeThe code review bug that gives me nightmares–The fix
After presenting the issue of how to return items to the array pool without creating a use after free bug, I asked you how you would fix that. There are several ways to try to do that, you can use reference counting scheme, or try to use locking, etc. All of those options are expensive, what is worse, they are expensive on a routine basis, not just for the free the buffer code path.
Instead, I changed the way we are handling returning the buffer. Take a look at the following code:
This may require some explanation. I’m using a ConditionaWeakTable here, that was added to the runtime to enable dynamic properties on objects. Basically, it creates a table that you can lookup by an object to get a key. The most important feature is that the runtime ensures that the associated reference lifetime match the key object lifetime. In other words, when we add the buffer in the eviction callback, we ensure that the ReturnBuffer we register will live at least as long as the buffer.
That means that we can let the GC do the verification job. We’ll now return the buffer back to the pool only after the GC has ensured that there are no outstanding references to it. Not a lot of code, and an elegant solution. This also ensures that we are only paying the code on eviction (likely rare), and not all the time.
More posts in "Challenge" series:
- (04 Jan 2023) what does this code print?
- (14 Dec 2022) What does this code print?
- (01 Jul 2022) Find the stack smash bug… – answer
- (30 Jun 2022) Find the stack smash bug…
- (03 Jun 2022) Spot the data corruption
- (06 May 2022) Spot the optimization–solution
- (05 May 2022) Spot the optimization
- (06 Apr 2022) Why is this code broken?
- (16 Dec 2021) Find the slow down–answer
- (15 Dec 2021) Find the slow down
- (03 Nov 2021) The code review bug that gives me nightmares–The fix
- (02 Nov 2021) The code review bug that gives me nightmares–the issue
- (01 Nov 2021) The code review bug that gives me nightmares
- (16 Jun 2021) Detecting livelihood in a distributed cluster
- (21 Apr 2020) Generate matching shard id–answer
- (20 Apr 2020) Generate matching shard id
- (02 Jan 2020) Spot the bug in the stream
- (28 Sep 2018) The loop that leaks–Answer
- (27 Sep 2018) The loop that leaks
- (03 Apr 2018) The invisible concurrency bug–Answer
- (02 Apr 2018) The invisible concurrency bug
- (31 Jan 2018) Find the bug in the fix–answer
- (30 Jan 2018) Find the bug in the fix
- (19 Jan 2017) What does this code do?
- (26 Jul 2016) The race condition in the TCP stack, answer
- (25 Jul 2016) The race condition in the TCP stack
- (28 Apr 2015) What is the meaning of this change?
- (26 Sep 2013) Spot the bug
- (27 May 2013) The problem of locking down tasks…
- (17 Oct 2011) Minimum number of round trips
- (23 Aug 2011) Recent Comments with Future Posts
- (02 Aug 2011) Modifying execution approaches
- (29 Apr 2011) Stop the leaks
- (23 Dec 2010) This code should never hit production
- (17 Dec 2010) Your own ThreadLocal
- (03 Dec 2010) Querying relative information with RavenDB
- (29 Jun 2010) Find the bug
- (23 Jun 2010) Dynamically dynamic
- (28 Apr 2010) What killed the application?
- (19 Mar 2010) What does this code do?
- (04 Mar 2010) Robust enumeration over external code
- (16 Feb 2010) Premature optimization, and all of that…
- (12 Feb 2010) Efficient querying
- (10 Feb 2010) Find the resource leak
- (21 Oct 2009) Can you spot the bug?
- (18 Oct 2009) Why is this wrong?
- (17 Oct 2009) Write the check in comment
- (15 Sep 2009) NH Prof Exporting Reports
- (02 Sep 2009) The lazy loaded inheritance many to one association OR/M conundrum
- (01 Sep 2009) Why isn’t select broken?
- (06 Aug 2009) Find the bug fixes
- (26 May 2009) Find the bug
- (14 May 2009) multi threaded test failure
- (11 May 2009) The regex that doesn’t match
- (24 Mar 2009) probability based selection
- (13 Mar 2009) C# Rewriting
- (18 Feb 2009) write a self extracting program
- (04 Sep 2008) Don't stop with the first DSL abstraction
- (02 Aug 2008) What is the problem?
- (28 Jul 2008) What does this code do?
- (26 Jul 2008) Find the bug fix
- (05 Jul 2008) Find the deadlock
- (03 Jul 2008) Find the bug
- (02 Jul 2008) What is wrong with this code
- (05 Jun 2008) why did the tests fail?
- (27 May 2008) Striving for better syntax
- (13 Apr 2008) calling generics without the generic type
- (12 Apr 2008) The directory tree
- (24 Mar 2008) Find the version
- (21 Jan 2008) Strongly typing weakly typed code
- (28 Jun 2007) Windsor Null Object Dependency Facility
Hmm, I'm a bit baffled by this last post in this post series.
Although not explicitly stated I believe the original question was how to achieve caching of expensive calculation results with low memory impact (MemoryCache) and low allocation count (ArrayPool) without having additional bugs.
Maybe I was wrong in my expectation?
This post brings up a wrapper object (ReturnBuffer) which needs to be allocated in memory (so we go up in allocation) and this object will also get placed into finalizer queue by GC to ensure release of memory.
As the entire object that we are holding is actually just 32 bytes we could have just returned the copy of the original array and I assume that the performance (speed/memory) would be roughly equivalent or even in favor of the simple copy.
Frankly at this point I believe someone else wrote the post and not Oren:
- There seems to be no purpose to_joinLifetimes ('...that was added to the runtime to enable dynamic properties on objects. Basically, it creates a table that you can lookup by an object to get a key' --> I can not see Oren writing this)
- Caller which gets back the ReturnBuffer can change the content of the array
- Caller can also take direct reference to the byte from the ReturnBuffer.Buffer and keep it around longer than it is 'alive'
- Caller can swap the ReturnBuffer.Buffer with some other byte array
Maybe I missed something completely here :(
I don't think that this class is exposed to the caller. It looks like it is only used internally to avoid "use after free" bug. But to be honest I also think this looks like an overkill unless you know (from usage data) that this will be called a lot of times, so a lot of copies are avoided. But what about ArrayPool exhaustion that was mentioned on previous post?
Have you considered not using
ArrayPoolat all in this scenario? I don't know how long the cache will keep the array, or how large they'll be, but it seems to me that renting a pooled array for a long, maybe unknown time could have negative effects on the ArrayPool. Won't it have to re-allocate new arrays if it runs out of rented ones? Wouldn't it be pointless then to return a rented array that has been held so long that the pool "forgot" about it?
Would it perhaps be acceptable to just allocate an array in this case, since it will be cached and re-used anyway? This code is definitely not simple anymore and there are many ways someone might introduce subtle bugs accidentally. If it is performance critical, then of course. If I remember correctly from the original article, this is "new" code though, so who knows if it's actually needed, or perhaps premature optimization.
The pool doesn't remember the arrays that it rented out. There is no tracking of that at all. And the key use case here is to avoid overflowing the cache, which will cause us to hold the memory just long enough to get to Gen2, and then we basically significantly increase the memory costs with no benefits.
ReturnBufferobject is allocated only on eviction, which would hopefully be rare. We also expect that object to be collected quite quickly, since we tie that to the lifetime of the buffer. We rely on the GC to tell us that there are no more references to the buffer by calling the
This is basically a technique to add a finalizer to an object that you don't control. The caller never gets a
ReturnBuffer, mind. We create an immediately discard it, letting the GC resolve the lifetime issue.
This is a very nice trick, so the solution is clearly interesting. However, there is now one allocation for every cached buffer, so it might be much simpler to just allocate the initial buffer. Also, if you know that the number of cached files will be very small, you can use a dedicated pool for the buffers, and never returns them.
Well, if you write "no-allocation" code, I suspect that your objects are not going to be collected :)
I meant it conceptually "forgot", which is why I put quotation marks around it, but that obviously wasn't clear. What I mean is, if we rent all arrays that a pool has to offer, it will have to allocate new ones when further requests are made. Assume a pool can hold
nbuffers and we rent
n, then we rent another
nwhich have to be allocated anew. Then we return the first batch of
nbuffers, then the second batch. Now the pool has
2*nbuffers it could use or free, but the point is that the second batch had to be allocated and essentially the pool might as well not have existed.
If the problem as I have outlined it exists, isn't there a risk then that your cache could cause this scenario? By holding buffers obtained from a pool for such a long time (and assuming that you make heavy use of the pool, so that it runs the risk of running out of buffers), then you could cause the very thing you try to avoid: allocating large arrays, in random places (whatever requests a buffer from the pool once it is empty).
There is now an allocation for every evicted buffer, and that is a very short term allocation, usually.
That plays well to the GC strengths. In my real scenario, I cannot predict the number of the buffer or their sizes. Note that there is a big difference between a long term allocation, that reached Gen2 and one that is a short term one that is unlikely to survive the next collection.
In practice, I don't expect this to be a problem. The system will settle down quickly to have the total number of allocated buffers.
Even if technically we need less than the total allocated, we are more concerned with allocation churn than actual allocation size. What usually happens is that we are either keeping the buffer for very long, or recycle it quickly, so it gets to go back to the pool.
This solution depends on implementation details of the
ArrayPool. If it holds on to a reference for the buffer this solution will leak memory as the
ReturnBufferis never collected. I also assume you did measure that sparing the allocation of 32 bytes is worth it, as Finalizers & WeakReferences have their own overhead for the GC.
The documentation clearly states that not returning the buffer to the pool is fine, see: https://docs.microsoft.com/en-us/dotnet/api/system.buffers.arraypool-1.rent?view=net-5.0#System_Buffers_ArrayPool_1_Rent_System_Int32_
In other words, they are documented to not hold a reference, so this isn't an implementation issue. As for the actual scenario, the key here is that I'm only paying the price here iff the buffer is evicted. In most normal uses, I have no cost to pay.
That said, the scenario I had in mind for the actual usage makes use of much bigger buffers, where it is certainly worth it. Note that even for 32 bytes buffer, enough churn can cause issues, so reducing allocations in general is almost always a good thing.