Finding a set of consecutive cleared bits in a bitmap
Consider the following scenario, I have a buffer and I need to allocate from this buffer a certain size. Here is how this looks like:
The buffer is divided into cells and I want to be able to find a free cell. An common way to handle this is to use a bitmap (something called a bit array or bit vector). In the case of the buffer above, we have 64 cells to deal with.
If I use a single bit to signal if a cell is free or in use, I can handle all of that with a single uint64 value. Let’s assume that I have a big buffer and thus an array of uint64 that represent the allocation bitmap. In the case of the buffer above, the X marks a busy cell and the bitmap for this is:
I’m going to base this code on this post be Lemire. In the post, he is finding the set bits, and I need a range of the inverse. It isn’t hard to convert Lemire’s code to a range search. The complexity is that I wanted to enable search from arbitrary location. Here is the code, I’ll discuss it below:
The idea is that we use Lemire’s code to find the first set bit and see if the distance from the last set bit is enough to satisfy the range request we have. Using this function, we can tell that the 0x80520A27102E21 value has the following ranges:
- 8 consecutive cells at: 
- 7 consecutive cells at: 
- 6 consecutive cells at: [14,47]
- 5 consecutive cells at: [14,36,47]
4 consecutive cells at: [1,14,36,47,51]
- 3 consecutive cells at: [1,6,14,17,21,30,36,47,50]
- 2 consecutive cells at: [1,3,6,14,16,18,21,27,30,36,38,42,47,49,51,53]
- 1 consecutive cell at: [1,2,3,4,6,7,8,12,14,15,16,17,18,19,21,22,23,27,28,30,31,32,34,36,37,38,39,40,42,43,45,47,48,49,50,51,52,53,54]
Note that the function gives us quite a bit of information about the bitmap structure. We can tell how big a range we got back (so we can make a decision and get the optimal one). Here is an example of using this function to do best fit locality aware allocations. We have the following buffer, composed of the following bitmap: [0x80520A27102E21, 0xE0000E00020, 0x1A40027F025802C8, 0xEE6ACAE56C6C3DCC].
What we want is to find the following ranges:
- single cell near the red marker (9th cell)
- 2 cells range near the red marker (9th cell)
- 4 cells range near the yellow marker (85th cell)
- 7 cells range near the green marker (198th cell)
We want to get the best cell range for each request, where best is defined as:
- Near the requested range
- Will create the least amount of fragmentation.
I’ll try to give an answer to this in my next post, see how you do here.
A word of caution. It took a while to refactor the code above to its final form. I’ll have another post that discuss the changes as well. Consider this code a template, it hasn’t been well tested and I re-wrote it since this post several times.
I had some fun with creating a solution in C# I went for iterators, it's probably not the fasted algo, but I like the outcome, it's very readable.
Yes, this makes perfect sense. I used something similar to verify my C code :-)