re: The Order of the JSON, AKA–irresponsible assumptions and blind spots


I ran into this post, in which the author describes how they got ERROR 1000294 from IBM DataPower Gateway as part of an integration effort. The underlying issue was that they sent JSON to the endpoint with its fields in an order the endpoint didn’t expect.

After asking the team at the other end to fix it, the author got back an estimate of 9 people for 6 months (4.5 man-years!). The author then went and figured out that the fix for the error was a single setting buried deep inside DataPower:

Validate order of JSON? [X]

The author then proceeded to question the competence / moral integrity of the people behind the estimate.

I believe that the author was grossly unfair, at best, to the people doing the estimation. Mostly because they assumed that unchecking the box and running a single request is a sufficient level of testing for this kind of change. But also because the author never once seems to have considered why this setting might be in place:

  • The sort order of JSON has been responsible for Remote Code Execution vulnerabilities.
  • The code processing the JSON may not do so in a streaming fashion, and may therefore expect the data in a particular order.
  • Worse, the code may simply assume the order of the fields and access them by index (see the sketch after this list). Change the order of the fields, and you may reverse the Creditor and Debtor fields.
  • The code may translate the JSON to another format and send it over to another system (likely, given the legacy system mentioned).
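
To make the by-index failure mode concrete, here is a minimal sketch in Python. The field names, payloads, and the fragile_parse function are all made up for illustration; the original post shows no code:

    import json

    # Two payloads carrying the same data, differing only in field order.
    # The field names (creditor, debtor, amount) are hypothetical.
    payload_a = '{"creditor": "ACME Corp", "debtor": "John Doe", "amount": 100}'
    payload_b = '{"debtor": "John Doe", "creditor": "ACME Corp", "amount": 100}'

    def fragile_parse(raw):
        # Legacy-style code that relies on the *position* of each field
        # instead of looking the fields up by name.
        values = list(json.loads(raw).values())
        return {"creditor": values[0], "debtor": values[1], "amount": values[2]}

    print(fragile_parse(payload_a))  # creditor: ACME Corp, debtor: John Doe
    print(fragile_parse(payload_b))  # creditor: John Doe, debtor: ACME Corp -- silently swapped

No error is raised anywhere; the payment simply flows in the wrong direction.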

The setting is there to protect the system, and unchecking it means that you have to check every single one of the integration points (which may be several layers deep) to ensure that there is no explicit or implied ordering dependency on the JSON.
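
Conceptually, that checkbox boils down to a guard along these lines. This is only an assumed sketch, not the actual DataPower rule; EXPECTED_ORDER and the field names are invented for illustration:

    import json

    EXPECTED_ORDER = ["creditor", "debtor", "amount"]  # hypothetical agreed contract

    def field_order_is_valid(raw):
        # json.loads preserves the order in which keys appear in the document,
        # so the incoming order can be compared against the contract.
        return list(json.loads(raw).keys()) == EXPECTED_ORDER

    field_order_is_valid('{"creditor": "A", "debtor": "B", "amount": 1}')  # True
    field_order_is_valid('{"amount": 1, "creditor": "A", "debtor": "B"}')  # False -> reject the request

Turning the check off is a one-line change at the gateway; proving that nothing downstream depends on the old guarantee is the expensive part.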

In short, given the scope and size of the change (“Fundamentally alter how we accept data from the outside world”), I can absolutely see why they gave this number.

And yes, in 99% of the cases there isn’t likely to be any difference, but you need to validate against that nasty 1% scenario.
