Using data modeling to deal with shady clients

Oct 28 2019

Using data modeling to deal with shady clients

time to read 7 min | 1202 words

I run into the following Reddit’s question: My client's business sounds shady. Should I run away? And I thought it was very interesting. Here are the relevant details:

I have a client who wants me to finish developing a web application for him. Basically, his application has 2 types of users, buyers and sellers. The buyers can create an account and upload money to the website which goes into my clients bank account. The amount of money they have to spend is then logged in the database as their "available funds". The users can then purchase services from the sellers.
The sellers, when they sell a product, earn some of the funds from the buyers. So no money actually gets transferred. The database just updates to say that the buyer now has x$ less available, and the seller has an extra x$ available.
Eventually the seller can withdraw their money, at which point my client transfers it from his bank.

The core of the issue is in a comment:

Their funds that they have available is just represented as a number in a database, which himself and the developers have access to. If one wanted to, they could just log into the database, create their own seller account, load it up with funds, then withdraw from my clients bank account. Just one example.

There are a few interesting things here that I want to touch, but this question was posted to a legal advice community, so I should probably preface anything I say with the fact that I’m not a lawyer, merely doing professional software development for quite some time and have dealt numerous times with financial systems and projects.

Today, except for the money that you physically carry in your wallet or stuffed under a mattress, pretty much all money is represented as numbers in some database. And there have been cases where this have been used for theft. However, it is very rarely going to be something as obvious as merely changing numbers around in a database.

For example, let’s imagine that, as the admin of the website web application above, I follow the doomsday scenario and create my own seller account. At this point, I don’t need to move money around, I can just put 10,000,000 in the Amount field and ask for a withdrawal. It will likely work, after all. There is no need to balance the numbers. Of course, this assumes quite a few things:

There is no governance on the outgoing money.
I don’t care what would happen after.

The fun thing about money transfers in the real world is that for the most part, they are reversible. I once paid a supplier, but I hit an extra 0 and didn’t notice that until I hit the Submit key. I called the bank and cancelled the order. Another time, I switched numbers in a wire transfer and sent money to the wrote account. It took a couple of months before we discovered the issue, but we were able to reconcile everything. It was annoying, but not too hard.

The threat model here is also fairly strange. We have someone that is able to modify the data directly in the database, but can’t also call: SendMoney(me, all) ?

All of the reasons above are partly why no real financial system works in this manner. There is no single Amount field per customer and money being shuffled off between accounts using + or –. Instead, you always have a record of transactions.

You might have heard about the embezzlement by fraction, right? A programmer will direct the remainder of the result (typically partial cents) to their own account. Why do we need such a thing? After all, if the programmer is already able to write code that would move money around, why just deal with fractional cents?

The answer is that in every financial institution, you are going to have some sort of governance. You’ll run a query / report and the total amount of money in and the total amount of money out should match. If it doesn’t, you have a Big Problem and are going to have to do something about it. This is why all these systems work based on moving money around and not creating it out of thin air.

Money transfers are easily tracked, monitored and carefully audited. Doing the sort of thing that was suggested in the Reddit question falls under: Embezzlement, Theft, Money Laundering, Tax Evasion and probably a host of other laws. It isn’t some weird trick that you can get away with.

Regardless, unless the system you build is intentionally meant to evade the law, I doubt that the developers building the system would have any issues. The people operating it, of course, need to be sure that they are operating within the law, but that shouldn’t be an issue for the developers. After all, a photo sharing website is a perfectly innocent project, but can be used for several nefarious and illegal purposes. But if someone took Lychee to put up a site for a pig named Napoleon hosted on eu-west-3 (Paris), I doubt that would put the Lychee project in any legal trouble.

The title of this post promised that I would talk about data modeling, so let’s get to it, shall we? If you are still worried about the shadiness of your client, and assuming that you aren’t worried about the possibility of the shady client not paying you, what can you do?

The answer is simple, don’t have anything like an Amount field in the system. Don’t allow it to happen. Instead, build the system using the notion of monetary transactions. That is, every movement of money between accounts (deposit from outside, withdrawal or shifting balance between accounts) must be explicit recorded as such.

Here is a very simple example:

You can see the movement of the money, and the total amount that was deducted from the source account matches the deposit + commission.

With RavenDB, you can use a Map/Reduce index on top of this set of transactions to compute:

How much money does each account has.
That the total amount of money in the system balances out.

Note that in this case, the numbers that you are working on are computed, there is no way to just move things around. Instead, you have to do that using an actual transaction, which will create a traceable record.

To make things more robust, you can take things further. Make sure that on each transaction, you’ll email all the clients involved. That will ensure that there is an external record of the transactions in the system, and they can recover their own account state independently. This will protect the customer in the case of the shady client trying to do things behind their back. They have their own separate record of all the going on in their account, separate from what is store on the potentially vulnerable database.

These are fairly routine behaviors for most financial systems, and I can’t imagine a client not wanting them. They’ll also serve as a pretty good protection for the developers if there is shadiness involved.

Tweet Share Share 5 comments

Tags:

Comments

28 Oct 2019
15:23 PM

peter

This is just accounting really, though if they were a Financial Institution keeping client funds they would be required to segregate various accounts.

I work in the Financial Industry. On the macro side, we not only reconcile internally, we also reconcile with the Fed i.e. we have to match what the Fed tells us we should have in our account.

The traceability/reliability of the transaction records is the real promise of blockchain, the owner or dev would not be able to fudge the chain.

29 Oct 2019
06:05 AM

Felix Deutsch

@peter:

The traceability/reliability of the transaction records is the real promise of blockchain, the owner or dev would not be able to fudge the chain.

You don't need blockchain for that, with all the mining and whatnot, you only need a tamper proof logfile, which has been around for a long time: https://archive.nytimes.com/www.nytimes.com/library/tech/99/01/cyber/articles/10notary.html

29 Oct 2019
14:13 PM

peter

@felix

I seems the shortcoming of that solution is that you cannot isolate the exact entry that was fudged, you only can detect the whole file or document was altered.

29 Oct 2019
19:46 PM

Felix Deutsch

@peter:

I think you can walk the log from the start until you find a broken entry, no? Anyway, it is possible without the blockchain baggage (like transaction throughput, etc.), but I'm far from an expert on this: https://en.wikipedia.org/wiki/Linked_timestamping

01 Nov 2019
11:19 AM

Oren Eini

Peter,

Yes, it is accounting, but it seems like this is something that many software developers aren't familiar with, even if they need to touch related fields. I really don't like the notion of a blockchain here, though. It is really not suitable for the purpose, and there is no need to go distributed in order to create an immutable log.The immutable log here, especially if you can verify it by a 3rd party, on the other hand, is important.

Comment preview

Comments have been closed on this topic.

Oren Eini

Oren Eini

CEO of RavenDB