Surprising behavior when roundtripping JSON documents
We ran into a really interesting bug. The system was behaving in a totally unexpected manner for some parts of the data. For pretty much the same input, we would sometimes get the wrong result, and we couldn’t figure out why.
Here is our source data:
This is some metrics data about servers. Note that we report the CPU load for each core on the instance, and that the results are sorted based on the actual load. Here is what this looks like: the image on the left is wrong, while the image on the right is right (pun intended).
Why do we have this behavior?
Well, let’s look at the actual data, shall we?
They are… the same. Exactly the same, in fact. We can throw them into a diff engine and it will tell us that they are identical (except for the document id).
What is going on here?
Well, here is the issue: what you see is not what you get. Look at the JSON text that I have above, and compare that to the documents we see in the images. RavenDB shows the documents in a nicely formatted manner, and along the way, it messed up something pretty important.
In our case, we used an object to hold the various details about the instances, and we relied on the insertion order of the properties staying the same when reading the document. That is actually the case, and RavenDB goes to great lengths to ensure it. However…
In order to prettify the document, we call JSON.parse() and JSON.stringify() (on the client side), which gives us nicely formatted output. Along the way, however, we run into JavaScript and its “ideas” about how things should work. In particular, JavaScript treats properties whose key is a number differently from other properties. All the numeric properties come first, sorted according to their integer value, while the non-numeric properties follow in insertion order.
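To make that concrete, here is a minimal sketch of the reordering. The property names and load values are made up for illustration and are not the actual data from our documents; the only thing being demonstrated is what a JSON.parse() / JSON.stringify() round trip does to numeric keys, and that an array of the same values keeps its order.

```javascript
// Hypothetical document: per-core CPU load keyed by core number,
// written in "sorted by load" order. Names and values are illustrative.
const raw = '{"name":"web-1","3":0.92,"0":0.75,"2":0.41,"1":0.13}';

// The same round trip a prettifier would do (minus the indentation).
const roundTripped = JSON.stringify(JSON.parse(raw));

console.log(roundTripped);
// {"0":0.75,"1":0.13,"2":0.41,"3":0.92,"name":"web-1"}
// The integer-like keys moved to the front in ascending order;
// "name" kept its place relative to other non-numeric keys.

// If the order matters, an array carries it explicitly and survives the trip.
const asArray =
  '[{"core":3,"load":0.92},{"core":0,"load":0.75},' +
  '{"core":2,"load":0.41},{"core":1,"load":0.13}]';
const arrayRoundTripped = JSON.stringify(JSON.parse(asArray));

console.log(arrayRoundTripped === asArray); // true
```

Nothing here is specific to RavenDB. Any tool that passes a document through a JavaScript object will show the same effect.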
That only applies to documents that were modified in the Studio, however. The RavenDB server and client API keep the properties in insertion order. Only if you modified the document using the Studio will you get this. But because we always show the documents in the same manner, it was invisible to us for a long while.
For that matter, it took an embarrassingly long time to debug this problem, because (naturally) whenever we viewed the data, we did that with formatting, which meant that we never actually saw the differences between the raw versions of the documents.
Comments
Relying on property order in JSON objects is setting yourself up for interesting surprises at some point. Databases do have to worry about property order and the even more fun idea of duplicate entries with the same key because databases should have well-defined behaviors even for weird scenarios to be able to deliver a good user experience. Being able to roundtrip data opens up easier diffing and hashing, for example, so I understand why that's something the database should do if possible. But for someone to actually depend on the order of entries in a JSON object is inadvisable, especially when anyone wanting the order to be preserved can just use an array.
Jesper,
Case in point, when you are using JSON.Net, it relies on the fact that the metadata properties ($type, $id, $ref, etc) are first in the object. If that isn't the case, it will ignore them. That is how we actually discovered this issue. I agree that you shouldn't rely on those, but still...
I know it sounds stupid but the worst kinds of errors to debug are those that you can’t see. I’m looking at you encoding issues etc.