JSON Packing, Text-Based Formats and other stuff that comes to mind at 5 AM
This post was written at 5:30 AM. I ran into this while doing research for another post, and I couldn’t really let it go.
XML as a text-based format is really wasteful in space. But that wasn’t what really made it lose its shine. That happened when it became so complex that it stopped being human readable. For example, I give you:
After XML was thrown out of the company of respectable folks, we had JSON show up and entertain us. It is smaller and more concise than XML, and so far it has resisted the efforts to make it into some sort of uber-complex enterprisey tool.
But today I ran into quite a few efforts to do strange things to JSON. I am talking about things like JSON DB (a compressed JSON format, not an actual JSON database), JSONH, json.hpack, and friends. All of those attempt to reduce the size of JSON documents.
Let us take an example. The following is a JSON document representing one of RavenDB's builds:
This document is 4.52KB in size. Running this through JSONH gives us the following:
It reduced the document size to 2.93KB! Awesome, about a third of the size is gone. Except that this generates an utterly unreadable mess. I mean, can you look at this and figure out what the hell is going on?
I thought not. At this point, we might as well use a binary format. I happen to have a zip tool at my disposal, so I checked what would happen if I threw this through that. The end result was a file that was 1.42KB. And I had no more loss of readability than I have with the JSONH stuff.
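To make the comparison concrete, here is a rough Python sketch of the general trick these packers rely on: hoist the repeated keys out of a list of same-shaped objects and keep only the values. This is my own illustration, not JSONH's actual implementation; the `pack` and `unpack` helpers and the sample records are made up, and the numbers it prints will not match the ones above.

```python
import json
import zlib


def pack(records):
    """Flatten a list of same-keyed dicts into [key_count, *keys, *values]."""
    if not records:
        return [0]
    keys = list(records[0])
    flat = [len(keys), *keys]
    for record in records:
        flat.extend(record[key] for key in keys)
    return flat


def unpack(flat):
    """Rebuild the list of dicts from the flattened form produced by pack()."""
    count = flat[0]
    if count == 0:
        return []
    keys = flat[1:1 + count]
    values = flat[1 + count:]
    return [
        dict(zip(keys, values[i:i + count]))
        for i in range(0, len(values), count)
    ]


# Hypothetical sample data, standing in for the real build document.
builds = [
    {"id": i, "status": "SUCCESS", "branch": "master", "number": 2000 + i}
    for i in range(50)
]

plain = json.dumps(builds).encode("utf-8")
packed = json.dumps(pack(builds)).encode("utf-8")
compressed = zlib.compress(plain)

print("plain JSON :", len(plain), "bytes")
print("key-packed :", len(packed), "bytes")
print("compressed :", len(compressed), "bytes")
```

On repetitive data like this, the general-purpose compressor comes out well ahead of the key-packed form, and neither output is something you would want to read by eye.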
To be frank, I just don’t get efforts like this. JSON is a text-based, human-readable format. If you lose the human-readable portion of the format, you might as well drop directly to binary. It is likely to be more efficient, and you don’t lose anything by it.
And if you want to compress your data, it is probably better to use an actual compression tool. HTTP compression, for example, is practically free, since all servers and clients should be able to consume it now, and any tool that you use should be able to inspect traffic through it. It is also likely to generate much better results on your JSON documents than a clever format like this.
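As a minimal sketch of what HTTP compression amounts to, the snippet below gzips a JSON body the way a server would for a client that sent `Accept-Encoding: gzip`, then undoes it on the receiving side. There is no real server or client here; the `document` contents and the `headers` dict are stand-ins I made up for illustration.

```python
import gzip
import json

# Hypothetical document, pretty-printed so it stays pleasant to read.
document = {
    "project": "example",
    "builds": [{"number": n, "status": "SUCCESS"} for n in range(100)],
}
body = json.dumps(document, indent=2).encode("utf-8")

# "Server" side: compress the body and label it with Content-Encoding.
wire = gzip.compress(body)
headers = {"Content-Type": "application/json", "Content-Encoding": "gzip"}

# "Client" side: undo the encoding based on the header, nothing else.
if headers.get("Content-Encoding") == "gzip":
    received = gzip.decompress(wire)
else:
    received = wire

print(f"{len(body)} bytes of JSON, {len(wire)} bytes on the wire")
```

The point is that the document at both ends is still ordinary, readable JSON. The compression lives entirely in the transport, so nothing about the format itself has to change.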