Ayende @ Rahien

Not all bytes weigh exactly 8 bits

Or, pay attention to how you write to the disk. Here is a simple example:

static void Main(string[] args)
{
	var count = 10000000;

	Stopwatch stopwatch = Stopwatch.StartNew();

	using (var stream = CreateWriter())
	using (var bw = new BinaryWriter(stream))
	{
		for (var i = 0; i < count; i++)
		{
			bw.Write(i);
		}
		bw.Flush();
	}
	stopwatch.Stop();
	Console.WriteLine("Binary Writer: " + stopwatch.ElapsedMilliseconds);

	stopwatch = Stopwatch.StartNew();

	using (var stream = CreateWriter())
	{
		for (var i = 0; i < count; i++)
		{
			var bytes = BitConverter.GetBytes(i);
			stream.Write(bytes, 0, 4);
		}
		stream.Flush();
	}
	stopwatch.Stop();
	Console.WriteLine("BitConverter: " + stopwatch.ElapsedMilliseconds);


	stopwatch = Stopwatch.StartNew();

	using (var stream = CreateWriter())
	using (var ms = new MemoryStream())
	{
		for (var i = 0; i < count; i++)
		{
			var bytes = BitConverter.GetBytes(i);
			ms.Write(bytes, 0, 4);

		}
		var array = ms.ToArray();
		stream.Write(array, 0, array.Length);
		stream.Flush();
	}
	stopwatch.Stop();
	Console.WriteLine("Memory stream: " + stopwatch.ElapsedMilliseconds);


	stopwatch = Stopwatch.StartNew();

	using (var stream = CreateWriter())
	{
		byte[] buffer = new byte[sizeof(int) * count];
		int index = 0;
		for (var i = 0; i < count; i++)
		{
			buffer[index++] = (byte)i;
			buffer[index++] = (byte)(i >> 8);
			buffer[index++] = (byte)(i >> 16);
			buffer[index++] = (byte)(i >> 24);
		}
		stream.Write(buffer, 0, buffer.Length);
		stream.Flush();
	}
	stopwatch.Stop();
	Console.WriteLine("Single buffer: " + stopwatch.ElapsedMilliseconds);
}

private static FileStream CreateWriter()
{
	return new FileStream(Path.GetTempFileName(), FileMode.Create, FileAccess.Write, FileShare.Read, 0x10000,
		FileOptions.SequentialScan | FileOptions.WriteThrough);
}

And the results:

Binary Writer: 1877
BitConverter: 1985
Memory stream: 1702
Single buffer: 1022

Comments

Bil Simser
07/29/2008 01:31 AM

Is there a line missing in the code when you wrote it to the blog? There's no call to StartNew() in the last chunk.

Ayende Rahien
07/29/2008 05:32 AM

Bil,

Yeah, makes sense. You found a bug :-)

I updated the post accordingly.

Rik Hemsley
07/29/2008 08:49 AM

Similar results here, though MemoryStream and 'Single buffer' seem proportionally faster for some reason (tried many iterations, same results):

Binary writer: 1581
BitConverter: 1608
MemoryStream: 1016
Single buffer: 709

With the target stream being a MemoryStream rather than FileStream:

Binary writer: 362
BitConverter: 479
MemoryStream: 683
Single buffer: 349

Davy Landman
07/29/2008 09:20 AM

I find it rather logical that when you create your own buffering system with knowledge of the data size, it will be faster than the default buffering in a framework (which aims for good average performance overall).

I looked at the source of the FileStream class, and it does hold an internal buffer of 4096 bytes. When Write is called, the data is copied to the buffer, and when the buffer is full, it's flushed to the actual file handle.

So using the binary writer and bitconverter you have 10000000 copies to an internal buffer and 19532 separate flushes.

The single buffer, by contrast, bypasses the buffering of the FileStream class, so the memory isn't copied but is written directly to the file handle.

I suspect the memory stream uses a different buffering mechanism, but that's for someone else to look at?
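
The buffering behaviour described in this comment can be sketched roughly as follows. This is a toy model and not the actual FileStream source: the class name CountingBufferedWriter and its counters are made up for illustration, and it assumes that a write at least as large as the buffer is passed straight through when the buffer is empty.

```csharp
using System;

// Toy model of a buffered writer; NOT the real FileStream implementation.
// CountingBufferedWriter and its counters are hypothetical names.
class CountingBufferedWriter
{
	private readonly byte[] buffer;
	private int used;

	public int BufferCopies;  // writes copied into the internal buffer
	public int HandleWrites;  // writes that reached the (imaginary) file handle

	public CountingBufferedWriter(int bufferSize)
	{
		buffer = new byte[bufferSize];
	}

	public void Write(byte[] data, int offset, int count)
	{
		// Make room first if this write does not fit next to the buffered data.
		if (used > 0 && used + count > buffer.Length)
			Flush();

		// Assumption: a large write skips the buffer when the buffer is empty.
		if (used == 0 && count >= buffer.Length)
		{
			HandleWrites++;
			return;
		}

		Array.Copy(data, offset, buffer, used, count);
		used += count;
		BufferCopies++;
	}

	public void Flush()
	{
		if (used == 0)
			return;
		HandleWrites++;
		used = 0;
	}
}

static class Program
{
	static void Main()
	{
		var four = new byte[4];

		// Many 4-byte writes through a 4096-byte buffer.
		var small = new CountingBufferedWriter(4096);
		for (var i = 0; i < 10000; i++)
			small.Write(four, 0, four.Length);
		small.Flush();
		Console.WriteLine(small.BufferCopies + " copies, " + small.HandleWrites + " handle writes");
		// prints "10000 copies, 10 handle writes"

		// One 40,000-byte write: larger than the buffer, so it is passed through.
		var single = new CountingBufferedWriter(4096);
		single.Write(new byte[40000], 0, 40000);
		single.Flush();
		Console.WriteLine(single.BufferCopies + " copies, " + single.HandleWrites + " handle writes");
		// prints "0 copies, 1 handle writes"
	}
}
```

The counts use a deliberately small number of writes; the point is only the shape of the difference between many small buffered writes and one large pass-through write.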

Alessandro Riolo
07/29/2008 09:54 AM

This is unrelated, but a byte is not always 8 bits. There are (or mostly were, e.g. the PDP-10) many architectures where a byte has a different size.

Davy Landman
07/29/2008 09:54 AM

Given that the buffering of the FileStream doesn't slow us down with the single buffer method, maybe we could convert the int array to a byte array faster...

I created a faster variant, but it's ugly (it uses unmanaged memory) and I wouldn't use it unless this part was really a bottleneck.

	stopwatch = Stopwatch.StartNew();

	using (var stream = CreateWriter())
	{
		byte[] buffer = new byte[sizeof(int) * count];
		int[] data = new int[count];
		for (int i = 0; i < count; i++)
		{
			data[i] = i;
		}

		IntPtr tempBuffer = Marshal.AllocHGlobal(buffer.Length);
		Marshal.Copy(data, 0, tempBuffer, count);
		Marshal.Copy(tempBuffer, buffer, 0, buffer.Length);
		Marshal.FreeHGlobal(tempBuffer);

		stream.Write(buffer, 0, buffer.Length);
		stream.Flush();
	}
	stopwatch.Stop();

	Console.WriteLine("Single buffer (Marshalling): " + stopwatch.ElapsedMilliseconds);
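
A managed alternative to the Marshal round trip might be Buffer.BlockCopy, which copies raw bytes between primitive arrays without allocating unmanaged memory. This is only a sketch and was not part of the timings in this thread, so no speed claim is made for it. The class and method names are illustrative, and the bytes come out in the machine's native order (little-endian on x86, which matches the shift-based loop in the post).

```csharp
using System;

static class IntPacking
{
	// Hypothetical helper: copy an int[] into a byte[] in native byte order.
	public static byte[] ToBytes(int[] data)
	{
		var buffer = new byte[sizeof(int) * data.Length];
		// The last argument is the number of BYTES to copy, not elements.
		Buffer.BlockCopy(data, 0, buffer, 0, buffer.Length);
		return buffer;
	}

	static void Main()
	{
		var data = new int[] { 1, 256, 0x01020304 };
		byte[] bytes = ToBytes(data);

		// On a little-endian machine this prints
		// 01-00-00-00-00-01-00-00-04-03-02-01
		Console.WriteLine(BitConverter.ToString(bytes));
	}
}
```

The resulting array can then be handed to stream.Write in one call, exactly like the buffer in the snippet above.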

Davy Landman
07/29/2008 01:06 PM

Correction.

Where I wrote:

So using the binary writer and bitconverter you have 10000000 copies to an internal buffer and 19532 separate flushes.

it should read:

So using the binary writer or the bitconverter solution you have 10000000 copies to the internal buffer of the filestream and 153 separate flushes to the real file.

(I overlooked the buffer size parameter.)
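
As a rough check on this kind of estimate, the number of buffer flushes can be computed by dividing the total bytes written by the buffer size and rounding up. The helper below is hypothetical and assumes one flush per full buffer plus a final flush for any remainder. Counting four bytes per int, it gives figures that differ from the ones quoted in these comments, so all of them are best read as rough estimates.

```csharp
using System;

static class FlushEstimate
{
	// Hypothetical helper: how many times a buffer of bufferSize bytes is
	// flushed when totalBytes are written in small pieces.
	public static long CountFlushes(long totalBytes, long bufferSize)
	{
		return (totalBytes + bufferSize - 1) / bufferSize; // ceiling division
	}

	static void Main()
	{
		long count = 10000000;
		long totalBytes = count * sizeof(int); // 40,000,000 bytes

		Console.WriteLine(CountFlushes(totalBytes, 4096));    // prints 9766
		Console.WriteLine(CountFlushes(totalBytes, 0x10000)); // prints 611
	}
}
```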

tcmaster
08/03/2008 03:15 PM

It seems you really like the "var" thing.

I'm really a stupid guy, and I like to be able to figure out the meaning of code by reading it first, then debugging. But "var" does a good job of preventing this.

Ayende Rahien
08/03/2008 04:12 PM

tcmaster,

A var is always initialized, so just look at the value it is initialized with.
