Ayende @ Rahien

Refunds available at head office

Optimizing “expensive” calls

Take a look at the following code:

[Image: dotTrace profiler snapshot showing the expensive calls]
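(The screenshot itself is not preserved in this copy. Pieced together from the comments below, where AddDocument renders a JSON document and its metadata to text before writing them to a file, the profiled code was roughly of this shape. Treat it as a reconstruction: the method signature and the WriteToStorage call are placeholders, not the actual source.)

    // Rough reconstruction, not the original source: AddDocument turns the
    // document and its metadata into JSON text, then persists both strings.
    public void AddDocument(JObject document, JObject metadata)
    {
        string documentText = document.ToString();   // JToken.ToString() -- ~0.8 ms per call in the profile
        string metadataText = metadata.ToString();   // the second JToken.ToString() call
        WriteToStorage(documentText, metadataText);  // hypothetical persistence step
    }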

It is obvious where we need to optimize, right?

Except… each call here takes about 0.8 milliseconds. Yes, we could probably optimize this further, but the question is: would it be worth it?

Given sub-millisecond performance, and given that implementing a different serialization format would be an expensive operation, I don't think there is enough justification to do so.

Comments

Marc Gravell
04/30/2010 10:14 AM

How complex is the data? If you do care about this, I'm pretty confident I could get you a noticeable performance increase without any change to your object model, using the (not fully released, but pretty stable) protobuf-net "v2" build.

In particular, this:

  • has a metadata abstraction layer, so you don't need to decorate your model with attributes if you want to keep it unaware of serialization

  • has full ILGenerator pre-compilation (even to a dll if you really want) for performance

(and of course the usual protobuf advantages of not needing to do as much UTF encoding or string matching)

I've got some examples where I have to do large loops, because the measurement is in the low microseconds.
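(For context, the attribute-free, precompiled combination described above looks roughly like this against the v2 RuntimeTypeModel API. The Document type, its members, and the field numbers are made up for the example; this is a sketch, not Marc's actual code.)

    using ProtoBuf.Meta;

    public class Document
    {
        public int Id { get; set; }
        public string Name { get; set; }
    }

    public static class ProtobufExample
    {
        public static void Run()
        {
            // describe the type to the model instead of decorating it with attributes
            var model = RuntimeTypeModel.Create();
            model.Add(typeof(Document), false)   // false: skip attribute-driven configuration
                 .Add(1, "Id")
                 .Add(2, "Name");

            model.CompileInPlace();              // pre-generate the serialization IL up front
            // or: model.Compile("DocumentSerializer", "DocumentSerializer.dll");

            using (var stream = System.IO.File.Create("doc.bin"))
                model.Serialize(stream, new Document { Id = 1, Name = "test" });
        }
    }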

Ayende Rahien
04/30/2010 10:20 AM

Marc :-)

The data is any arbitrary JSON document.

Marc Gravell
04/30/2010 10:24 AM

Fair enough. That won't help at all, then ;-p (feel free to delete the comment if you want to avoid any unhelpful distraction)

I do wonder, though, if I should try a second project to take my existing IL-generator and change the leaf-nodes to handle xml / json / something else. Could be interesting ;-p

Demis Bellot
04/30/2010 11:50 AM

I'm a strong believer in 'selectively' optimizing where it makes sense. I actually think it does make sense to aggressively optimize serialization, as it is ultimately used on every request, so every optimization you make improves the performance of your application as a whole. Personally, I think application performance / perceived performance is the primary goal of all successful internet software companies. For a good example of an optimized HTML5 site, view the source and traffic of Yahoo's new search page (see: inline images :):

http://search.yahoo.com/

Speed is the main reason why I'm not using JSON serialization for serializing POCOs. I don't think there is anything inherently wrong with the format, I just haven't found a JSON library with good performance. 0.8ms doesn't seem like much here, so maybe it's not worth it in this case, but bigger datasets mean longer times, and I'm used to having entire web service requests finish in under 1ms.

Using the Northwind dataset, I've benchmarked all leading serialization routines I could find in .NET here:

www.servicestack.net/.../...-times.2010-02-06.html

@Marc Gravell's protobuf-net binary serialization is the clear leader here (makes every other binary serializer redundant) while I do ok with the leading text-serializer.

"I do wonder, though, if I should try a second project to take my existing IL-generator and change the leaf-nodes to handle xml / json / something else. Could be interesting ;-p"

Yes, please do! I don't think you should worry about XML, as MS's implementation is actually pretty good, and if you use XML you're probably more concerned with interoperability than performance anyway. But there is big potential for perf gains in the JSON space - all .NET ajax apps thank you in advance for your efforts :)

Ayende Rahien
04/30/2010 11:54 AM

Marc,

That might be nice, but the real issue here is that we have a JSON document being saved into a file, nothing else.

There isn't really a place for IL gen here; it isn't object-to-JSON serialization.

Marc Gravell
04/30/2010 01:50 PM

I mistakenly assumed that this was doing a serialization step.

I'll get my coat.

Jeremy Gray
04/30/2010 01:55 PM

Whenever looking at the results of a profiler run, I remind myself that unnecessary calls are as important as slow calls, and often more so once the obvious slow calls have been addressed. With that in mind, does AddDocument need to call JToken.ToString twice?

Ayende Rahien
04/30/2010 01:57 PM

Jeremy,

Yes, it calls that on two different objects.

Jeremy Gray
04/30/2010 02:03 PM

I was sure you had that covered but just had to check. ;)

James Newton-King
04/30/2010 03:58 PM

There isn't any reflection involved here, just simple iteration over collections of known objects and then writing values as JSON text.

I think it would be possible to improve performance here a little: under the covers the JSON is being written by a class called JsonTextWriter. Like any good public API, JsonTextWriter can write to any TextWriter, checks for errors, and updates some public properties with state. Since the contents of JObject et al. are already guaranteed to be valid, and the output will always be a string, it could have its own internal writer that is optimized for writing to a string and forgoes the error checking and state tracking.

If performance is important, that is an idea of something you could look at. I haven't done it myself because I think sub-1ms is already plenty fast for average use, and in my opinion there are more important features in a serializer, but you obviously have different needs. Happy to include the source in future releases if you do write something! :)

~J
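(In public-API terms, the path described above, and the place a leaner internal writer would slot into, is roughly the following. This is a sketch; the method name and the StringBuilder sizing are arbitrary choices, not Json.NET internals.)

    using System.Globalization;
    using System.IO;
    using System.Text;
    using Newtonsoft.Json;
    using Newtonsoft.Json.Linq;

    public static class JsonTextSketch
    {
        // roughly what JToken.ToString() does today through the public API:
        // a JsonTextWriter over a StringWriter, with indented formatting
        public static string ToJsonText(JObject doc)
        {
            var sb = new StringBuilder(256);
            using (var sw = new StringWriter(sb, CultureInfo.InvariantCulture))
            using (var writer = new JsonTextWriter(sw))
            {
                writer.Formatting = Formatting.Indented;
                doc.WriteTo(writer);
            }
            return sb.ToString();
        }
    }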

Andrew
05/01/2010 02:09 AM

Can you serialize many objects at once with threads (keep some 'serialize' threads running, looking for work)?

May be overkill :)
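(A minimal sketch of that idea, assuming .NET 4's BlockingCollection as the work queue; the class name, collection names, and worker count are arbitrary.)

    using System.Collections.Concurrent;
    using System.Threading.Tasks;
    using Newtonsoft.Json.Linq;

    public static class SerializerWorkers
    {
        public static BlockingCollection<string> Start(BlockingCollection<JObject> pending, int workers)
        {
            var serialized = new BlockingCollection<string>();

            // dedicated "serialize" threads that keep looking for work
            for (int i = 0; i < workers; i++)
            {
                Task.Factory.StartNew(() =>
                {
                    foreach (var doc in pending.GetConsumingEnumerable())
                        serialized.Add(doc.ToString());
                }, TaskCreationOptions.LongRunning);
            }
            return serialized;
        }
    }

    // usage (sketch): producers call pending.Add(doc) instead of serializing inline,
    // and pending.CompleteAdding() once no more documents are coming.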

Fabio Maulo
05/01/2010 04:51 AM

Very interesting that Newtonsoft.Json is faster than BinaryFormatter.

Rafal
05/01/2010 11:51 AM

You probably need to process the JSON only because you have to make sure the ID and Version attributes are set. If so, you could just copy the document from a JSON reader to a JSON writer, adding the id and version if necessary.
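(A sketch of that approach with Json.NET's reader/writer pair. It assumes a flat top-level object, uses placeholder property names for the id and version, and doesn't handle the case where the document already carries them.)

    using System.IO;
    using Newtonsoft.Json;

    public static class JsonCopy
    {
        // stream the document from reader to writer, injecting Id/Version up front
        public static void CopyWithMetadata(TextReader source, TextWriter destination,
                                            string id, int version)
        {
            using (var reader = new JsonTextReader(source))
            using (var writer = new JsonTextWriter(destination))
            {
                reader.Read();                       // position on the outer StartObject
                writer.WriteStartObject();
                writer.WritePropertyName("Id");      // placeholder property names
                writer.WriteValue(id);
                writer.WritePropertyName("Version");
                writer.WriteValue(version);

                // copy the remaining properties token by token
                while (reader.Read() && reader.TokenType == JsonToken.PropertyName)
                {
                    writer.WritePropertyName((string)reader.Value);
                    reader.Read();                   // move to the property's value
                    writer.WriteToken(reader);       // copies the value, nested or not
                }
                writer.WriteEndObject();
            }
        }
    }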

Thomas
05/03/2010 06:17 AM

What tool do you use for creating the "report" with the expensive calls?

Ayende Rahien
05/03/2010 06:19 AM

Thomas,

This is dotTrace, a wonderful tool from JetBrains.

Thomas
05/03/2010 08:20 AM

Thank you

Ayende Rahien
05/05/2010 03:57 PM

BSON,

No, I didn't know about that, thanks, I'll try it.

Comments have been closed on this topic.