Ayende @ Rahien

Hi!
My name is Ayende Rahien
Founder of Hibernating Rhinos LTD and RavenDB.
You can reach me by phone or email:

ayende@ayende.com

+972 52-548-6969

@

Posts: 5,947 | Comments: 44,541

filter by tags archive

Sparse files & memory mapped files


One of the problems with memory mapped files is that you can’t actually map beyond the end of the file. So you can’t use that to extend your file.  I had a thought and set out to check, what will happen if I create a sparse file, a file that only take space when you write to it, and at the same time, map it?

As it turn out, this actually work pretty well in practice. You can do so without any issues. Here is how it works:

using (var f = File.Create(path))
{
    int bytesReturned = 0;
    var nativeOverlapped = new NativeOverlapped();
    if (!NativeMethod.DeviceIoControl(f.SafeFileHandle, EIoControlCode.FsctlSetSparse, IntPtr.Zero, 0,
                                      IntPtr.Zero, 0, ref bytesReturned, ref nativeOverlapped))
    {
        throw new Win32Exception();
    }
    f.SetLength(1024*1024*1024*64L);
}

This create a sparse file that is 64Gb in size. Then we can map it normally:

using (var mmf = MemoryMappedFile.CreateFromFile(path))
using (var memoryMappedViewAccessor = mmf.CreateViewAccessor(0, 1024*1024*1024*64L))
{
    for (long i = 0; i < memoryMappedViewAccessor.Capacity; i += buffer.Length)
    {
        memoryMappedViewAccessor.WriteArray(i, buffer, 0, buffer.Length);
    }
}

And then we can do stuff to it.  And that include writing to yet unallocated parts of the file. This also means that you don’t have to worry about writing past the end of the file, the OS will take care of all of that for you.

Happy happy, joy joy, etc.

One problem with this method, however. It means that you have a 64Gb file, but you don’t have that much allocated. What that means in turn is that you might not have that much space available for the file. Which brings up an interesting question, what happens when you are trying to commit a new page, and the disk is out of space? Using file I/O you would get an IO error with the right code. But using memory mapped files, the error would actually turn up during access, which can happen pretty much anywhere. It also means that it is a Standard Exception Handling error in Windows, which require special treatment.

To test this out, I wrote the following so it would write to a disk that had only about 50 Gb free. I wanted to know what would happen when it run out of space. That is actually something that happens, and we need to be able to address this issue robustly. The kicker is that this might actually happen at any time, so that would really result is some… interesting behavior with regards to robustness. In other words, I don’t think that this is a viable option, it is a really cool trick, but I don’t think it is a very well thought out option.

By the way, the result of my experiment was that we had an effectively a frozen process. No errors, nothing, just a hung. Also, I am pretty sure that WriteArray() is really slow, but I’ll check this out at another pointer in time.


Comments

Damien

I assume there's meant to be more code after the "I wrote the following" paragraph, but I'm not seeing any.

Ayende Rahien

Damien, I forgot to include it, but this is basically just filling up 50 GB in random bytes.1

tobi

Sparse files also fragment like crazy. The allocation pattern on disk looks like a log structured file system.

njy
njy

@Oren: not that it is really relevant to this specific topic, but may this give you some nice idea to play with? I don't know where to send this tip.

https://github.com/antonmks/Alenka/blob/master/green.md

Comment preview

Comments have been closed on this topic.

FUTURE POSTS

No future posts left, oh my!

RECENT SERIES

  1. RavenDB Sharding (3):
    22 May 2015 - Adding a new shard to an existing cluster, splitting the shard
  2. The RavenDB Comic Strip (2):
    20 May 2015 - Part II – a team in trouble!
  3. Challenge (45):
    28 Apr 2015 - What is the meaning of this change?
  4. Interview question (2):
    30 Mar 2015 - fix the index
  5. Excerpts from the RavenDB Performance team report (20):
    20 Feb 2015 - Optimizing Compare – The circle of life (a post-mortem)
View all series

RECENT COMMENTS

Syndication

Main feed Feed Stats
Comments feed   Comments Feed Stats