Avoiding exposing identifier details to your users
A sadly commonplace “attack” on applications is called “Web Parameter Tampering”. This is the case where you have a URL such as this:
And your users “hack” you using:
And get access to another user's records.
As an aside, that might actually be considered to be hacking, legally speaking. Which makes me want to smash my head on the keyboard a few times.
Obviously, you need to run your security validation on parameters, but there are other reasons to avoid exposing the raw identifiers to the user. If you are using an incrementing counter of some kind, creating two values might leak the rate at which your data changes. For example, a competitor might create an order once a week and track the order number. That will give them a good indication of how many orders there have been in that time frame.
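To make the leak concrete, here is a minimal sketch of what such a competitor could compute. The function name, the dates, and the order numbers are all made up for illustration; the only assumption is that order numbers come from a single incrementing counter.

```python
from datetime import date


def estimate_order_rate(observations):
    """Estimate orders per day from (date, order_number) pairs.

    Assumes the pairs are sorted by date, the dates are distinct, and the
    order numbers come from a single incrementing counter.
    """
    rates = []
    for (d0, n0), (d1, n1) in zip(observations, observations[1:]):
        days = (d1 - d0).days
        # Every order placed by anyone in between bumps the counter.
        rates.append((n1 - n0) / days)
    return rates


# Hypothetical order numbers seen by placing one order per week.
samples = [
    (date(2024, 1, 1), 10450),
    (date(2024, 1, 8), 11220),
    (date(2024, 1, 15), 12060),
]

print(estimate_order_rate(samples))  # [110.0, 120.0]
```

Two small purchases a week apart are enough to read off the daily order volume, which is why the raw counter is worth hiding even when access checks are in place.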
Finally, there are other data leakage issues that you might want to take into account. For example, “users/321” means that you are likely to be using RavenDB while “users/4383-B” means that you are using RavenDB 4.0 or higher and “607d1f85bcf86cd799439011” means that you are using MongoDB.
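A rough sketch of how little effort that kind of fingerprinting takes. The patterns below are heuristics inferred only from the three example ids above, and `guess_database` is a hypothetical helper, not part of any library.

```python
import re

# Heuristic patterns inferred from the example ids in the text.
# Order matters: the more specific RavenDB 4.0 shape is checked first.
PATTERNS = [
    ("RavenDB 4.0 or higher", re.compile(r"[A-Za-z]+/\d+-[A-Z]+")),
    ("RavenDB", re.compile(r"[A-Za-z]+/\d+")),
    ("MongoDB", re.compile(r"[0-9a-f]{24}")),
]


def guess_database(identifier):
    """Return a guess at the backing database, or None if nothing matches."""
    for name, pattern in PATTERNS:
        if pattern.fullmatch(identifier):
            return name
    return None


for sample in ("users/321", "users/4383-B", "607d1f85bcf86cd799439011"):
    print(sample, "->", guess_database(sample))
```

A masked identifier matches none of these shapes, so this particular signal disappears.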
A common reaction to this is to switch your ids to use guids. I hate that option; it means that you are entering very unfriendly territory for the application. Guids convey no information to the developers working with the system and they are hard to work with, from a human point of view. They are also less nice for the database system to work with.
A better alternative is to simply mask the information when it leaves your system. Here is the code to do so:
You can see that I’m actually using AES encryption to hide the data, and then encoding it in the Bitcoin format.
That means that an identifier such as "users/1123" will result in output such as this:
bPSPEZii22y5JwUibkQgUuXR3VHBDCbUhC343HBTnd1XMDFZMuok
The length of the identifier is larger, but not overly so, and the id is even URL safe. In addition to hiding the identifier itself, we also ensure that the users cannot muck about in the value. Any change to the value will result in an error when trying to unmask it.
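As a rough illustration of the mask/unmask round trip described above, here is a minimal sketch. It is not the author's code. To stay within the Python standard library, which has no AES, it swaps in an HMAC-SHA256 keystream with encrypt-then-MAC as a stand-in for the AES encryption; a real implementation should use a vetted AEAD such as AES-GCM. The function names, the 12-byte nonce, and the key handling are all assumptions made for the example.

```python
import hashlib
import hmac
import os

# The Base58 alphabet used by Bitcoin (no 0, O, I or l).
ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def b58encode(data: bytes) -> str:
    n = int.from_bytes(data, "big")
    out = ""
    while n > 0:
        n, rem = divmod(n, 58)
        out = ALPHABET[rem] + out
    # Leading zero bytes are written as leading '1' characters.
    pad = len(data) - len(data.lstrip(b"\0"))
    return "1" * pad + out


def b58decode(text: str) -> bytes:
    n = 0
    for ch in text:
        n = n * 58 + ALPHABET.index(ch)  # ValueError on a foreign character
    pad = len(text) - len(text.lstrip("1"))
    return b"\0" * pad + n.to_bytes((n.bit_length() + 7) // 8, "big")


def _subkey(key: bytes, label: bytes) -> bytes:
    # Separate keys for encryption and authentication.
    return hmac.new(key, label, hashlib.sha256).digest()


def _xor_keystream(enc_key: bytes, nonce: bytes, data: bytes) -> bytes:
    # Stand-in for AES (assumption): HMAC-SHA256 in counter mode.
    stream = b""
    counter = 0
    while len(stream) < len(data):
        block = nonce + counter.to_bytes(4, "big")
        stream += hmac.new(enc_key, block, hashlib.sha256).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


def mask(identifier: str, key: bytes) -> str:
    nonce = os.urandom(12)  # fresh random nonce for every call
    cipher = _xor_keystream(_subkey(key, b"enc"), nonce, identifier.encode("utf-8"))
    tag = hmac.new(_subkey(key, b"mac"), nonce + cipher, hashlib.sha256).digest()[:16]
    return b58encode(nonce + cipher + tag)


def unmask(token: str, key: bytes) -> str:
    raw = b58decode(token)
    if len(raw) < 12 + 16:
        raise ValueError("masked id is too short")
    nonce, cipher, tag = raw[:12], raw[12:-16], raw[-16:]
    expected = hmac.new(_subkey(key, b"mac"), nonce + cipher, hashlib.sha256).digest()[:16]
    # Verify the tag before decrypting anything.
    if not hmac.compare_digest(tag, expected):
        raise ValueError("masked id was modified")
    return _xor_keystream(_subkey(key, b"enc"), nonce, cipher).decode("utf-8")


key = os.urandom(32)  # in practice a long-lived secret held by the server
token = mask("users/1123", key)
print(token)
print(unmask(token, key))  # users/1123
```

Because the nonce is random, masking the same identifier twice yields different tokens. Changing any character of a token makes `unmask` raise `ValueError`, which the application can turn into a plain 404.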
Comments
Hi Oren. Does using GUIDs considerably affect RavenDB performance and disk/memory space requirements, compared to autoincrement integers? (GUIDs play nicely with DDD.) Let's consider a real-world scenario where we might have a few million records with GUID ids.
Nik,
It has an impact in terms of the structure of the B+Tree. It isn't going to be a major one before you get to 100 million records. A few million won't really matter.
Got it. Thanks.
We've also used https://hashids.org/ in the past (it's not a hash, as you can decode it with the secret). It allows for multiple longs and has a lot of configuration for generating longer ids.
We also catch any exceptions that could arise from revealing the ids, returning null or an empty string, which usually results in a 404. This hides any information the attacker could deduce from their attempts.
Steve,
Please note that this system provides no actual security. It would be trivial to figure out what the values are and what the encoding scheme is.
Yes, I should have mentioned that more, but the website also says it: it's more for guarding against the accidental quick change in the URL because they see a number < 1000; data security is still needed. The "hash" is salted by default, which offers a bit more protection, but it's absolutely brute-forceable.
Nik - you can look into libraries implementing things similar to NEWSEQUENTIALID() from SQL. As long as a network card is attached, it'd generate a globally unique identifier.
Wow, this obfuscated ID problem is big. Your solution looks great for general-purpose ids. The Base58 trick looks marvelous.
For readers' information: there was a great article from Instagram about shard-aware, date-sortable incremented ids.
The IDs from your solution get very long, though. Are there ciphers that would generate shorter IDs? I see Skipjack uses 64-bit blocks instead of 128; might that work as well?
PS : please use a different captcha system to protect the comments section. How can I post if I do not accept google's terms of service?
SandRock,
A key issue here is that you need to keep enough state to validate that the id wasn't modified. For that, you need the MAC, which is 16 bytes in pretty much all standard encryption systems. And you need the nonce as well. That is what adds significantly to the cost here. You can use something else, of course, but that would have an effect on the security properties of your system.
Whether that is acceptable depends on your mode of operation.
You may want to try to use AesGcmSiv instead. That allows you to reuse the nonce (but reveals identical strings). I would rather use the longer strings, to be honest, given that this would require a much bigger analysis of the security properties.
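To put rough numbers on the size discussion in this thread, here is a small back-of-the-envelope sketch. The 16-byte MAC comes from the reply above; the 12-byte nonce is an assumption (it is the usual size for AES-GCM), and the function names are made up for illustration.

```python
import math


def base58_length(num_bytes):
    """Upper bound on the Base58 characters needed for num_bytes bytes."""
    return math.ceil(num_bytes * 8 / math.log2(58))


def masked_length(id_bytes, nonce_bytes=12, tag_bytes=16):
    """Length of a masked id laid out as nonce + ciphertext + MAC."""
    return base58_length(nonce_bytes + id_bytes + tag_bytes)


# "users/1123" is 10 bytes, so the payload is 12 + 10 + 16 = 38 bytes.
print(masked_length(len("users/1123")))  # 52

# If the nonce did not have to travel with the id, 26 bytes would remain.
print(masked_length(len("users/1123"), nonce_bytes=0))  # 36
```

Under these assumptions the estimate of 52 characters matches the length of the sample masked id in the post, and the nonce and MAC together account for 28 of the 38 bytes being encoded.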