Memory Mapped Files, File I/O & Performance
I have been testing several approaches for writing to files, and I thought the results were interesting enough to share. In all cases, I was writing a 128KB buffer of random data to a 256MB file.
The first thing that I wanted to try was the trivial managed memory map approach:
var buffer = new byte[128 * 1024]; // 128KB of random data

using (var mmf = MemoryMappedFile.CreateFromFile("test.bin", FileMode.Create, "test", 1024 * 1024 * 256))
{
    using (var accessor = mmf.CreateViewAccessor())
    {
        for (int i = 0; i < accessor.Capacity; i += buffer.Length)
        {
            accessor.WriteArray(i, buffer, 0, buffer.Length);
        }
        accessor.Flush();
    }
}
This completed in 3.871 seconds.
Next, I wanted to see what would happen if I used direct memory access, with CopyMemory to do the copying:
[DllImport("kernel32.dll", EntryPoint = "RtlMoveMemory")]
static extern unsafe void CopyMemory(byte* dst, byte* src, long size);

using (var mmf = MemoryMappedFile.CreateFromFile("test.bin", FileMode.Create, "test", 1024 * 1024 * 256))
{
    using (var accessor = mmf.CreateViewAccessor())
    {
        byte* p = null;
        accessor.SafeMemoryMappedViewHandle.AcquirePointer(ref p);
        fixed (byte* src = buffer)
        {
            for (int i = 0; i < accessor.Capacity; i += buffer.Length)
            {
                CopyMemory(p + i, src, buffer.Length);
            }
        }
        accessor.SafeMemoryMappedViewHandle.ReleasePointer();
        accessor.Flush();
    }
}
As you can see, this is somewhat more complex, and requires unsafe code. But it completed in 2.062 seconds, nearly twice as fast.
Then I decided to try with raw file IO:
using (var f = new FileStream("test.bin", FileMode.Create))
{
    f.SetLength(1024 * 1024 * 256);
    for (int i = 0; i < f.Length; i += buffer.Length)
    {
        f.Write(buffer, 0, buffer.Length);
    }
    f.Flush(true);
}
This is about the most trivial code that you can think of, and it completed in about 1.956 seconds. Slightly faster, but within the margin of error (note: in repeated tests, the two were consistently very close, with the raw file I/O always near the top).
So, in other words, the accessor code adds a lot of overhead when using Memory Mapped Files.
Comments
Interesting discovery ;) But are you sure the last example is 'raw io' and not 'buffered IO'? You've done a sequential write which is probably faster than random write that only happens to be sequential in your test case.
BTW C-style pointers in C# are ugly. They just don't look good mixed with LongCamelCased.ClassAndAPISymbols. In C# it would be nicer to declare pointers like:
RawPointer<byte> ptr = something
I'd like to find out where the time was spent in each test. For example, does creating the file and the view take time or is it almost instant?
I'm also surprised that copying 256 MB of memory can take 2s, no matter how many intermediate copies are being done. Memory can be accessed sequentially at >= 10 GB/s as far as I'm informed.
Use Reflector to see what WriteArray does. It is not a memcpy, but a generic function. That must cause the extreme CPU usage. I'd try using a stream on the MMF to write byte[]'s.
I'd not trust the numbers until I'd have seen the profiler results. You might end up measuring stuff you don't care about.
Using mmf.CreateViewStream gives you the same performance as raw IO:
using (var mmf = MemoryMappedFile.CreateFromFile("test.bin", FileMode.Create, "test", 1024 * 1024 * 256))
{
    using (var mmvs = mmf.CreateViewStream(0, 0 /* 0 == create a complete view */, MemoryMappedFileAccess.Write))
    {
        for (int i = 0; i < mmvs.Length; i += buffer.Length)
        {
            mmvs.Write(buffer, 0, buffer.Length);
        }
        mmvs.Flush();
    }
}
Rafal, Yes, this is buffered IO, but note that I called Flush(true), and included that in the cost of doing this.
Tobi, Note that we include the time to flush this to disk.
Ayende, I hope you ran this test under .NET 4.5. In 4.0 it may not always flush http://connect.microsoft.com/VisualStudio/feedback/details/792434/flush-true-does-not-always-flush-when-it-should
Rafal, sometimes when doing externs to unmanaged functions, I wrap pointers in a struct with LayoutKind.Sequential. That makes my wrappers a bit safer. How about that?
Scooletz, Yes, that was run under 4.5, I am aware of that bug.
I don't understand. Does the last code snippet use memory mapped files? If not, then what sense do memory mapped files make if they are 2 times slower?
Guest, The last code snippet didn't use mmap files. It was the control test.
It appears that the memory mapped scenarios do not do an "fsync", whereas the file-based control test does. This may make a significant difference, especially if you try writing files a lot larger than 256 MB.
Also, I would expect that in your intended usage scenario, flushes/fsyncs would be more frequent than every 256 MB, which - especially in combination with large (multiple GB) files - will have a significant effect on performance depending on what kind of I/O strategy you use.
@alex There's accessor.Flush() which writes the modified pages to disk, so all examples are fsynced. However, I wonder how 'Voron' will handle btree page modifications - will it fsync after every update operation?
@Rafal accessor.Flush() does not perform an "fsync", it calls into MemoryMappedView.Flush() which in turn calls "FlushViewOfFile", not the same as an "fsync". See also http://msdn.microsoft.com/en-us/library/windows/apps/aa366563(v=vs.85).aspx.
"The FlushViewOfFile function does not flush the file metadata, and it does not wait to return until the changes are flushed from the underlying hardware disk cache and physically written to disk. To flush all the dirty pages plus the metadata for the file and ensure that they are physically written to disk, call FlushViewOfFile and then call the FlushFileBuffers function."
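To make the distinction concrete, a fully durable variant of the memory-mapped benchmark would pair accessor.Flush() (FlushViewOfFile) with a P/Invoke call to FlushFileBuffers on the underlying file handle. This is a minimal sketch, not the original benchmark code; the FileStream-based CreateFromFile overload used here is an assumption about the runtime, so adjust the parameters for your target framework version:

```csharp
using System.IO;
using System.IO.MemoryMappedFiles;
using System.Runtime.InteropServices;
using Microsoft.Win32.SafeHandles;

class DurableMmfWrite
{
    // FlushFileBuffers is what actually asks the OS to push the file's
    // dirty data and metadata down to the device.
    [DllImport("kernel32.dll", SetLastError = true)]
    static extern bool FlushFileBuffers(SafeFileHandle handle);

    static void Main()
    {
        var buffer = new byte[128 * 1024]; // 128KB buffer, as in the benchmark

        using (var fs = new FileStream("test.bin", FileMode.Create, FileAccess.ReadWrite))
        using (var mmf = MemoryMappedFile.CreateFromFile(fs, "test", 1024L * 1024 * 256,
                   MemoryMappedFileAccess.ReadWrite, HandleInheritability.None, leaveOpen: true))
        using (var accessor = mmf.CreateViewAccessor())
        {
            for (long i = 0; i < accessor.Capacity; i += buffer.Length)
                accessor.WriteArray(i, buffer, 0, buffer.Length);

            accessor.Flush();                    // FlushViewOfFile: writes dirty pages
            FlushFileBuffers(fs.SafeFileHandle); // then flush metadata + device cache
        }
    }
}
```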
Alex, You are correct, except that in both cases, we also close the file handle, which will do the flushing for us, so it is the same thing, effectively.
@Ayende, as far as I am aware, closing the file handle will not cause the drive's caches to be flushed (i.e. it will not issue an "fsync" command to the device: "SYNCHRONIZE CACHE" for SCSI, "FLUSH CACHE" for IDE/ATAPI). Since on an average consumer PC, these drive caches may be as large as 8 MB and on more high end systems 16 MB, that represents the amount of data that is potentially at risk.
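For the raw file I/O control test, one way to reduce exposure to those caches is to open the stream with FileOptions.WriteThrough, which maps to FILE_FLAG_WRITE_THROUGH on Windows. This variant is an illustration, not part of the original benchmark, and note that even write-through only instructs the OS cache; whether the drive's own cache honors it depends on the hardware:

```csharp
using System.IO;

class WriteThroughDemo
{
    static void Main()
    {
        var buffer = new byte[128 * 1024]; // 128KB buffer, as in the benchmark

        // FileOptions.WriteThrough tells the OS to write through its cache
        // directly toward the device on every write.
        using (var f = new FileStream("test.bin", FileMode.Create, FileAccess.Write,
                   FileShare.None, 4096, FileOptions.WriteThrough))
        {
            f.SetLength(1024L * 1024 * 256);
            for (long i = 0; i < f.Length; i += buffer.Length)
                f.Write(buffer, 0, buffer.Length);
            f.Flush(true); // still flush file metadata at the end
        }
    }
}
```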