Complex data analysis with RavenDB

time to read 2 min | 297 words

The task today is to do some fraud detection. We have a set of a few million records of transactions, each one of them looking like this:

image

We want to check a new transaction against the history of data for this particular customer. The problem is that we have a very limited time frame to approve / reject a transaction. Trying to crawl through a lot of data about the customer may very well lead to the transaction timing out. So we prepare in advance. Here is the index I created, which summarize information about a particular customer:

I’ll have to admit that I’m winging all the actual logic here. I’m not sure what you’ll need for approving a transaction, but I pulled up a few things. Here is the result of the index for a particular customer:

image

What is most important, however, is the time it takes to get this result:

image

The idea is that we are able to gather information that can quickly determine if the next transaction is valid. For example, we can check if the current merchant is in the list of good destinations. If so, that is a strong indication it is good.

We can also check if the amount of the transaction matches previous amounts for the transaction, as well as many other checks. The key here is that we can get RavenDB to do all the work in the background and immediately produce the right result for us when needed.