Windows Overlapped I/O and TPL style programming
I really like the way C# async tasks work. While building Voron, I ran into a scenario in which I could really make use of the Windows async API, which is exposed via Overlapped I/O. The problem is that those are pretty different models, and they don’t appear to want to play together very nicely.
Since I don’t feel like having those two cohabitate in my codebase, I decided to see if I could write a TPL wrapper that would provide a nice API on top of the underlying Overlapped I/O implementation.
Here is what I ended up with:
    public unsafe class Win32DirectFile : IDisposable
    {
        private readonly SafeFileHandle _handle;

        public Win32DirectFile(string filename)
        {
            // Overlapped is what enables async I/O on this handle.
            _handle = NativeFileMethods.CreateFile(filename,
                NativeFileAccess.GenericWrite, NativeFileShare.None, IntPtr.Zero,
                NativeFileCreationDisposition.CreateAlways,
                NativeFileAttributes.Write_Through | NativeFileAttributes.NoBuffering | NativeFileAttributes.Overlapped,
                IntPtr.Zero);

            if (_handle.IsInvalid)
                throw new Win32Exception();

            if (ThreadPool.BindHandle(_handle) == false)
                throw new InvalidOperationException("Could not bind the handle to the thread pool");
        }
Note that I create the file with overlapped enabled, as well as write_through & no buffering (I need them for something else, not relevant for now).
It is important to note that I bind the handle (which effectively issues a BindIoCompletionCallback under the covers, I think), so we won’t have to use events, but can use callbacks. This is a much more natural way to work when using the TPL.
Then, we can just issue the actual work:
    public Task WriteAsync(long position, byte* ptr, uint length)
    {
        var tcs = new TaskCompletionSource<object>();

        var nativeOverlapped = CreateNativeOverlapped(position, tcs);

        uint written;
        var result = NativeFileMethods.WriteFile(_handle, ptr, length, out written, nativeOverlapped);

        return HandleResponse(result, nativeOverlapped, tcs);
    }
As you can see, all the actual details are handled in the helper functions. We can just run the code we need, passing it the overlapped structure it requires. Now, let us look at those functions:
    private static NativeOverlapped* CreateNativeOverlapped(long position, TaskCompletionSource<object> tcs)
    {
        // The file position is split into its low and high 32-bit halves.
        var o = new Overlapped((int) (position & 0xffffffff), (int) (position >> 32), IntPtr.Zero, null);
        var nativeOverlapped = o.Pack((code, bytes, overlap) =>
        {
            try
            {
                switch (code)
                {
                    case ERROR_SUCCESS:
                        tcs.TrySetResult(null);
                        break;
                    case ERROR_OPERATION_ABORTED:
                        tcs.TrySetCanceled();
                        break;
                    default:
                        tcs.TrySetException(new Win32Exception((int) code));
                        break;
                }
            }
            finally
            {
                Overlapped.Unpack(overlap);
                Overlapped.Free(overlap);
            }
        }, null);
        return nativeOverlapped;
    }

    private static Task HandleResponse(bool completedSynchronously, NativeOverlapped* nativeOverlapped, TaskCompletionSource<object> tcs)
    {
        if (completedSynchronously)
        {
            // The handle is bound to the thread pool, so the completion callback
            // still runs after a synchronous success (unless
            // FILE_SKIP_COMPLETION_PORT_ON_SUCCESS was set on the handle).
            // The callback completes the task and frees the overlapped;
            // freeing it here as well would free it twice.
            return tcs.Task;
        }

        var lastWin32Error = Marshal.GetLastWin32Error();
        if (lastWin32Error == ERROR_IO_PENDING)
            return tcs.Task;

        // A real failure: no callback will ever run, so clean up here.
        Overlapped.Unpack(nativeOverlapped);
        Overlapped.Free(nativeOverlapped);
        throw new Win32Exception(lastWin32Error);
    }
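As an aside, the offset arithmetic in CreateNativeOverlapped can be checked on its own. This is only a sketch, not part of the Voron code: the OverlappedOffset class and its Split and Combine methods are names made up for illustration. It shows how a 64-bit file position maps to the two 32-bit halves that the Overlapped constructor takes, and that the round trip is lossless even when the low half comes out negative.

```csharp
using System;

// Hypothetical helper, for illustration only.
public static class OverlappedOffset
{
    // Split a 64-bit file position into the low and high 32-bit halves,
    // the same way CreateNativeOverlapped does.
    public static (int Low, int High) Split(long position)
    {
        unchecked
        {
            return ((int)(position & 0xffffffff), (int)(position >> 32));
        }
    }

    // Rebuild the 64-bit position. The low half is reinterpreted as
    // unsigned so that a negative int does not sign-extend into the high bits.
    public static long Combine(int low, int high)
    {
        unchecked
        {
            return ((long)high << 32) | (uint)low;
        }
    }
}
```

The unchecked block matters if the project is compiled with overflow checking turned on, since a low half with its top bit set does not fit in a signed int without wrapping.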
The complexity in HandleResponse is that we need to handle three cases:
- Successful completion
- Error (no pending work)
- Error that is actually success (ERROR_IO_PENDING, the work is being done in an async manner).
But that seems to be working quite nicely for me so far.
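The mapping from Win32 error codes to task states can be tried out without touching a file handle at all. Here is a minimal sketch of that: OverlappedCompletion, Complete and IsPending are hypothetical names, not part of the class above. The snippets above use the error constants without showing their definitions, so the values below are the standard ones from winerror.h.

```csharp
using System;
using System.ComponentModel;
using System.Threading.Tasks;

// Hypothetical helper, for illustration only.
public static class OverlappedCompletion
{
    // Standard Win32 error codes (winerror.h).
    public const uint ERROR_SUCCESS = 0;
    public const uint ERROR_OPERATION_ABORTED = 995;
    public const uint ERROR_IO_PENDING = 997;

    // Same switch as in the completion callback: success completes the task,
    // an aborted operation cancels it, anything else faults it.
    public static void Complete(TaskCompletionSource<object> tcs, uint code)
    {
        switch (code)
        {
            case ERROR_SUCCESS:
                tcs.TrySetResult(null);
                break;
            case ERROR_OPERATION_ABORTED:
                tcs.TrySetCanceled();
                break;
            default:
                tcs.TrySetException(new Win32Exception((int)code));
                break;
        }
    }

    // A failed call with ERROR_IO_PENDING is not a failure at all:
    // the operation was queued and the callback will fire later.
    public static bool IsPending(bool callSucceeded, int lastError)
    {
        return !callSucceeded && lastError == (int)ERROR_IO_PENDING;
    }
}
```

For example, calling OverlappedCompletion.Complete(tcs, 5) on a fresh TaskCompletionSource<object> leaves tcs.Task faulted with a Win32Exception whose NativeErrorCode is 5 (access denied), while passing 995 leaves it canceled.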
Comments
Very nice. Presumably you'll be calling WriteAsync quite often, creating lots of Task objects. All those allocations will add pressure to the GC. .NET's Socket APIs added a specialized async pattern specifically to avoid the APM's IAsyncResult allocations.
Stephen Toub wrote about how to wrap these Socket APIs in a way that allows them to be awaitable, without the Task allocations. http://blogs.msdn.com/b/pfxteam/archive/2011/12/15/10248293.aspx
I wonder if an approach along these lines would be useful for Voron.
Justin, that is something we'll be looking at, yes. We need to do a lot more stuff first.