Performance numbers in the pub
Originally posted at 3/31/2011
I am currently sitting with 3 guys in the pub, and the discussion naturally turned to performance. I asked all three the following question: “How many CLR objects can you create in one second?”
I got the following replies:
- 2,000 objects per second
- 50,000 objects per second
- 100,000 objects per second
Then I sat down to check:
class Program { static void Main(string[] args) { var sp = Stopwatch.StartNew(); int i = 0; while(sp.ElapsedMilliseconds < 1000) { new MyClass(); i++; } sp.Stop(); Console.WriteLine("Created {0} in {1}", i, sp.Elapsed); } } public class MyClass { public string A; public int B; public DateTime C; }
The result?
Created 7,715,305 in 00:00:01
Comments
Pretty big number you got there ;)
But the if of the while is also eating time, so the number could be greater ...
On what hardware, version of the framework did you run this test?
What happens once the GC kicks in?
@Bogdan Marian
I was getting low values not higher than 80.000 objects/s so I gave it a try to remove the checking. Now I could create 120.000.000 objects/s! on a simple laptop. Whoa!
Here's the code:
using System;
using System.Threading;
class Program
{
static void Main(string[] args)
{
}
}
// Ryan
With instances of MyClass it is around 70.000.000 objects/s
// Ryan
Can't believe I've beaten you: Created 9.117.795 in 00:00:01
Created 17,176,490 in 00:00:01
Are you sure that the object is even created and that the compiler does not optimize the loop? Check it with IL Dasm ;-)
On my notebook.
using .NET: Created 732239 in 00:00:01.0000016.
using Java: Created 26248216 in 1000ms.
I am blown away by the difference.
Joao,
There is a lot that can cause that change.
Created 11 526 398 in 00:00:01.0000041
Press any key to continue . . .
Yet more evidence that my employer buys crap computers for programmers:
4975448
There are a few things wrong with this test. I know this was a "back of the napkin in the pub" kind of converstation and the main point is "you can create a crapload of CLR objects per second". Since some people however, seem to be running with this, I figured I'd point a few of my observations out.
First, some background on how I ran these tests so that others may compare more easily.
Each test run was 1000 milliseconds.
I ran each test run 60 times and averaged the results.
I built the project in the Release x86 configuration
I'm running Windows 7 64-bit SP1 and .NET v4.0.30319
I used a console application and targeted the .NET Framework 4 Client Profile
I ran my tests detached built from the debugger (Ctrl-F5) which increased the number of objects created per second by 170%
My machine has 12GB of DDR3 and an Intel Xeon E5620 @ 2.4GHz
Observations:
The CLR is optimizing out the creation of each additional object via the call to new in the loop. Your code is actually only testing how many times your can increment a counter in one second - minus the time to create a single MyClass object
Actually storing the object references in an array that doesn't lose scope lowers the "score" by approximately 50%
Average Number of MyClass objects (int/string/Datetime) created per second that aren't stored: 14 053 546
Average Number of MyClass objects (int/string/Datetime) created per second stored in an array: 6 943 109
Moving the new MyClass call outside of the while loop so that all the loop does is increment the counter only increases "score" by 6%
Only executing the while loop and incrementer and not creating a MyClass object at all only increases the "score" by 7%
Your "MyClass" contains an integral value type (int) and two reference types (string and Datetime). Datetime is itself an object reference type which may contain other value and reference types.
Changing the definition of MyClass substantially changes the numbers (number of objs created as percent of original)
string/int/Datetime 100%
string/int 113%
string 120%
int 122%
The first run of each series was slower than the average by about 25% but was not the slowest run in the series
Here is the code I used in case anyone cares:
using System;
using System.Diagnostics;
namespace CLRObjectCreationTest
{
internal class Program
{
}
public class MyClass
{
// public DateTime C;
}
}
First run was only 25,009,970, but after that I averaged about 25.56-25.57 million per run. Not too shabby.
Hardware: i7-2600k OC to 4.6ghz, 16gb ddr3-1600, cheapie SSD, win7x64.
I made some changes to source code to use multiple cores (using TPL) as well as some other optimizations.
You can read more if it sounds interesting at blog.evereq.com/.../net-performance-and-your-ha... :)
Maybe I made a mistake but by pub yo mean draft beer, old painting on the wall....
It's really fun to read a useless challenge becoming a serious discussion (with code, blog post, technical spec and all). You guys must be the best booze pal ever.
In how many milliseconds do you think you can go from int.MinValue to int.MaxValue ? or get a StackoverfloxException ?
Whoa, performance myths on top of performance myths!
You're only measuring the stopwatch performance of here, object creation is way faster. Which leads people to all kinds of wrong conclusions, of course (java is faster, new is optimized away...)
Created 14631788 in 00:00:01
becomes
Created 202107618,669004 in 1 sec
@zach: the performance impact you get is from GC. without storing the references, its all done in generation 0 which is fast. but the objects are created either way.
@Stefan
You should at least bump reps up to 210.000.000 because now your program is lying to you :P
// Ryan
@Stefan - You are correct that the objects are created either way. I should have gone to the MSIL earlier. You can see below that it IS making the newobj call. I haven't dug enough beyond this to confirm if it is GC0 vs GC1 or the stopwatch, or the conversions, or some of the other overhead involved that is causing the slow down.
Using your code, I can confirm that I too get much higher numbers (now ~207M obj/s vs 15M obj/s) using a single thread (and around 475-500M obj/s using 8 threads).
Created 206799368.856598 in 1 sec
Here is the (relevant) C# code:
...
while (sp.ElapsedMilliseconds < lengthOfRunInMS)
{
}
sp.Stop();
.....
public class MyClass
{
}
And the resulting IL:
.entrypoint
// Code size 146 (0x92)
.maxstack 4
.locals init (int64 V_0,
// Cut a bunch of code here
// Create, store, and start stop watch
IL_000b: stloc.3
IL_000c: call class [System]System.Diagnostics.Stopwatch [System]System.Diagnostics.Stopwatch::StartNew()
IL_0011: stloc.s V_4
// Branch to while loop conditional
IL_0013: br.s IL_001f
// new MyClass
IL_0015: newobj instance void CLRObjectCreationTest.MyClass::.ctor()
IL_001a: pop
// newObj++ (Load, increment by 1, and store )
IL_001b: ldloc.3
IL_001c: ldc.i4.1
IL_001d: add
IL_001e: stloc.3
// sp.ElapsedTime
IL_001f: ldloc.s V_4
IL_0021: callvirt instance int64 [System]System.Diagnostics.Stopwatch::get_ElapsedMilliseconds()
// Load while loop guard (1000)
IL_0026: ldc.i4 0x3e8
// Convert the loop guard to an Int64
IL_002b: conv.i8
// Loop back through our while loop if the elapsed time is less than the loop guard
IL_002c: blt.s IL_0015
@remi
from int.MinValue to int.MaxValue 508ms
var sp = new Stopwatch();
sp.Start();
for(var i = int.MinValue; i < int.MaxValue; i++);
sp.Stop();
Console.WriteLine("{0} ms", sp.ElapsedMilliseconds);
Console.ReadLine();
Surprisingly StackOverflowException will take ages, I didn't want to wait for it ...
void NeverReturn()
{
NeverReturn();
}
var sp = new Stopwatch();
sp.Start();
try
{
NeverReturn();
}
catch(StackOverflowException)
{
Console.WriteLine("{0} ms", sp.ElapsedMilliseconds);
Console.ReadLine();
}
// Ryan
Interessing.
When I change NeverReturn into
static void NeverReturn1()
The program crashes immediately as expected.
Why doesn't the simple version crash as well?
I looked up the IL but I see no difference ...
// Ryan
Ryan,
This is probably because the call was optimized to be a non recursive call
@stefan Oh and I forgot to add, this still suffers from the compound object creation problem associated with the MyClass object containing an integral value type and two reference types, one which is a compound type itself. Thus we are actually counting the number of MyClass objects that can be created per second not the number of CLR objects that can be created per second.
Changing the code to this:
int reps = 1000000000;
The count moves up by about 16% to 240M obj/s for a single thread and almost a Billion obj/s (Created 924214000 in 1 sec) for 8 threads on a 2.4GHz Xeon E5620.
Kevdog,
Thats nothing... the best I got with my work computer was 479816. Yes, that's approximately half a million. I tried everthing to get a higher number but just can't.
Needless to say that my computer is a piece of s#$t. Probably second only to the guy behind me with a P4.
Luckily this is my second to last day at this job.
@zach: .net zeroes out every field on construction, so more fields = more time. additional reference fields also mean more tracking work for the GC to do, so if you want to measure only construction time (not GC time), you should avoid them. or somehow measure the main thread's time only.
but all those tests just show that it would have to be orders of magnitude slower to justify wasting a single thought on it.
except if charles is your customer, that is ;-)
@stefan: agreed. I was simply pointing out that for the pure academic version of the question "How many CLR objects can you create in one second?" we should just use new object(); rather than MyClass at all.
Here is the real test:
I figured "sp.ElapsedMilliseconds" had quite high overhead, so I set a 'granularity', only check the elapsed time every 1,000 iterations, then subtract 1,000 from the final count to say 'at least' that many were created...
My count jumped from 5.2 million, to over 42 million.
@Zach
"Datetime is itself an object reference type which may contain other value and reference types."
No, DateTime is a struct, a value type. And it has only one instance field, an ulong.
@Patrick
Correct you are, sir!
Created 11711225 in 00:00:01:00000090
Press any key to continue . . .
I thought I had a good computer.
3.579.220 objects in one sec.
Asus Core i7 with 4 gig of Ram
My average result was 14millions
Ok, I'll try to challenge :)
original code: 16M
code below: 400M
Both tested in release mode. MyClass is the original one.
Here is my code ... hopefully not forgetting anything :)
[ThreadStaticAttribute]
static int threadCounter = 0;
// 22threads, release: Created 389.068.025 in 00:00:00.9997334
// Created 408.390.521 in 00:00:00.9996904
// Created 417.618.531 in 00:00:01.0003762
static void Main(string[] args)
{
}
static void DoLoop(out int counter, out TimeSpan elapsed)
{
<thread pool = new List <thread();
}
Raff is the king! :-)
.m
Ok, there is another, slighly improved version that avoid static class and ThreadStatic.
With this one I reach 450M objects.
BTW I am using my desktop PC with a i7CPU (4x2 core) with many apps opened (5 instances of vs.net, services, etc.).
Running on a real server architecture with Xeon CPUs should give a huge boost due to larger L2 cache (CLR trys to put Gen0 in L2).
Cheers
// 22threads, release: Created 450.691.605 in 00:00:00.9997830
// Created 452.639.194 in 00:00:01.0001203
// Created 452.955.475 in 00:00:00.9995694
static void Main(string[] args)
{
}
public class Version2
{
<thread();
}
Comment preview