Ayende @ Rahien

It's a girl

What is wrong here?

This case, it isn’t code that I am going to show, rather, I am going to show the final UI and the database structure, and let you figure:

  • What is wrong.
  • How to fix this.

Here is the database schema:

image

And here is the problem:

image

Hints, this used to work for a long time, and suddenly it doesn’t, and the customer is pissed, annoyed and threatening to sue.

Tags:

Posted By: Ayende Rahien

Published at

Originally posted at

Comments

Wojciech
02/24/2012 10:20 AM by
Wojciech

In my humble opinion: 1. I don't understand why Purchase Orders table is a separate entity. It looks like is 1-1 to Orders table therefore should be merged. 2. Tracking Id should be placed in Orders table not the other way around (unless you can send the same order more than once - so called part delivery, but that requires different database structure). 3. Order details should be somehow separated in UI from tracking details - it's a bit confusing. 4. Last but not least - spelling mistake in OrderLines schema:)

Giedrius
02/24/2012 10:24 AM by
Giedrius

Well, items subtotal does not match sum, but that does not relate to database schema posted, so that is not an issue I guess :)

Eoin C
02/24/2012 10:27 AM by
Eoin C

Cartesian Product on the Join between entries in the Tracking Table & Entries in the Purchase Orders Table causing costs to multiply up ?

Not sure what the query behind the Tracking GridView is supposed to be, but I'm guessing if it's meant to be flat the the PO & Tracking tables could be consolidated.

Oded
02/24/2012 10:30 AM by
Oded

The Order Id field is an INT(4) - Order numbers have started overflowing, so no new orders can be entered in the system (based on your "this used to work for a long time" comment).

Johannes
02/24/2012 10:35 AM by
Johannes

Like already mentioned, a cartesian product between tracking and purchase orders. Probably because the developer tried to make one single query with multiple joins. It worked fine until one customer decided to split the order into 2 shipments.

Alex Humphrey
02/24/2012 10:36 AM by
Alex Humphrey

I think Eoin is correct.

It's probably the first time that there's been two Tracking rows and two PurchaseOrders rows for the same Order.

Michel
02/24/2012 10:39 AM by
Michel

The UI shows that one Tracking relates to one PurchaseOrder. The database don't have relation between the two entities.

Roland
02/24/2012 10:44 AM by
Roland

missing ZipCode in shipping, but looking at the UI you probably just missed that

Jerrie Pelser
02/24/2012 10:50 AM by
Jerrie Pelser

Was just going to make the same comment as Michel. There is not relation between Tracking and Purchase Order in the database, but in the UI they are related to each other. Does not make sense?

The problem probably never showed up in the UI as there was always just one purchase order related to a Order, but when you created 2 purchase orders for the order it shows up as a cartesian product in the UI.

Frank Quednau
02/24/2012 10:59 AM by
Frank Quednau

You people are all wrong!

In order lines orderId is misspelled, which was revealed during the last Audit. Now the auditor is threatening to close down the system, Considering the blatant lack of quality.

Mark
02/24/2012 11:05 AM by
Mark

Hopefully you don't rename existing products (or worse still, recycle a product id for a completely new product), but if you did, the order history would show items that are different from the description at the time of order. Better to store a copy of the product name in the OrderLines table.

Phillip
02/24/2012 11:05 AM by
Phillip

@Roland, that's most likely a mock up error for the blog post and unrelated to the actual problem. Since billing is missing an 'n' in the order.

I believe Eoin is correct.

Wigy
02/24/2012 11:19 AM by
Wigy

Orders are not related to any authentication data, therefore any customer can just change the order number in the URL and peek into orders of other customers.

Ben Joyce
02/24/2012 11:26 AM by
Ben Joyce

Typo in OrderLines anyone? :) OrederId

tobi
02/24/2012 11:50 AM by
tobi

A duplicate row in PurchaseOrders is causing the join to duplicate data. The schema is crap.

Steve
02/24/2012 11:54 AM by
Steve

@Wigy I think urls just made up for the example, and even if its not, authentication could still exist checking if the order belongs to the the customer who is currently logged in.

I agree with Eoin, there is no link between the tracking and the purchase orders.

Carlos
02/24/2012 11:56 AM by
Carlos

I agree with Eoin and Alex. This is probably the first time and order is split among PurchaseOrders and this is breaking the query for the Tracking grid. The relationship should be Orders => PurchaseOrder => Tracking or consolidate PurchaseOrder with Tracking.

Daniel Lang
02/24/2012 12:22 PM by
Daniel Lang

The customer got to know another much nicer looking web application (probably Rails I guess) and is now really pissed because he must work with a screen that looks like a poorly designed mockup.

Agree with Eoin, Alex.

Michael H
02/24/2012 12:33 PM by
Michael H

+1 for Eoin

Since it appears to be normalised I don't know why the name field duplicates. It doesn't seem an issue in the example (since we cant tell what the product ID does relate to) but it could be during product updates.

Tim Skauge
02/24/2012 12:44 PM by
Tim Skauge

My guess: The order lines has a FK to products. If a product changes name in the shop the product name on the order would change too.

The order should be an immutable object (document if you will) but it is not with the current schema.

Simon
02/24/2012 01:12 PM by
Simon

Tim is right. The order is linking the the product but the product can change after the order has taken place.

Making a copy of the order at order time would fix it.

Tom Dietrich
02/24/2012 01:28 PM by
Tom Dietrich

The problem is being displayed on the mockup. Look at the Tracking section. Look at the tables these data are coming from.

Lee Atkinson
02/24/2012 01:29 PM by
Lee Atkinson

1 - Typo BillgState in Orders 2 - Missing BillingZip 3 - Perhaps there should be a ShippingCountry and BillingCountry? (Though personally, I think the address structure is too 'defined' - one only needs a three fields - address (which can take on the many forms of addresses around the world) , postal code, country code 4 - It has multiple tracking IDs - presumably so that parts of the order can be shipped independently - but there is no way to know what ordered products are sent with which shipment - and indeed whether an order has been completed - e.g. that all ordered products are shipped. 5 - It would seem odd to have multiple purchase orders per order. I guess it can happen that a customer needs to get multiple purchase orders for the products they are ordering, yet qualify for a discount on a single large order. In that case, however, I would like to know which products come under which Purchase Order. 6 - Typo OrederId in OrderLines 7 - In the UI, the order item Ids are presumably the id within the database - for the end-user this is a bit odd. Perhaps it should be a composite key of order id and a simple integer so that one ends up with 1, 2, 3, ... etc for each order? 8 - The subtotal for the order items in 25 cents too low. Perhaps it is rounding off to the cents, which could cause the company to lose quite a lot of income! 10 - In the UI, there seems to be a one-two-one relationship within between tracking and purchase order, though there is nothing to enforce that with the database schema.

As to why it worked for a long time, but not now, I'm not sure - something within the business has presumably changed - e.g. they are taking lots of little-values orders and the rounding down is losing a lot more income?

John Conan
02/24/2012 02:15 PM by
John Conan

I am sorry for my tone... but you can't possibly think that spotting typos or rounding errors is the purpose of this post. It's obviously the cartesian product between Tracking and PurchaseOrders, which related in charging the customer with $100.24 instead of $50.12, as others has already stated... Please stop pointing out "errors" from a simple mockup screen...

Stephane
02/24/2012 02:25 PM by
Stephane

Well, one thing is if you remove a product from the Product database, you won't be able to tell what the user ordered. the GUI shows the Items with their name, I'm guessing they join on the product table, which would just fail if the product has been deleted...

Matthew Shapiro
02/24/2012 02:44 PM by
Matthew Shapiro

What do you guys mean by cartesian products between the two entities?

Khalid Abuhakmeh
02/24/2012 03:02 PM by
Khalid Abuhakmeh

It's so obvious, it isn't using RavenDB.

Buildstarted
02/24/2012 03:11 PM by
Buildstarted

Tracking a separate entity from purchase orders? There's no direct link between the two so I don't see how the two can be matched up at all.

hangy
02/24/2012 03:13 PM by
hangy

Stephane, The product would never be deleted. There is no good reason to delete a product from the database unless it was entered in error (and hopefully never ordered). A correctly set-up foreign-key contstraint on OrderLines.ProductId -> Product.Id should prevent any delete statement on the Product table for any used (ie. ordered) product.

Matthew, Sounds like a math nerd term. ;) By selecting Tracking and PurchaseOrder in one statement and joining over the OrderId, the system produces "duplicate" rows - each tracking row is selected with one purchaseorder row, which leads to too much money being charged. (see the tracking grid in the mockup)

The correct way to model this could be to relate the Tracking table to the PurchaseOrder table (since that seems to be what is being tracked) instead of with the Order table.

John Sonmez
02/24/2012 03:15 PM by
John Sonmez

The problem is that the product name changed or that product was deleted.

The order should be a historical snapshot. Directly referencing the product table means that history will effectively change when products change.

Eoin C
02/24/2012 03:24 PM by
Eoin C

@Matthew/@Hangy

Well it's kinda a math nerd term ;-) But it's also a database term. It means when you perform a join with filtering criteria that results in every record on the left being paired with every record on the right

http://en.wikipedia.org/wiki/Relationaldatabase#Relationaloperations

But your right, it's basically duplication... and quite a common bug to come across debugging queries containing joins.

Because the only way to join these 2 tables to populate the tracking gridview would be to do so on Tracking.OrderId (2 occurences) JOIN PurchaseOrders.OrderId (2 occurences) resulting in 4 results.

Lee Atkinson
02/24/2012 03:31 PM by
Lee Atkinson

@John Conan - that is the largest error! I guess what is highlighted that there are so many errors, it's hard to list them all (even perhaps the largest - however as the system customer I'd be more concerned of earning less money than I expected that over charging my customers ;-)

There are lots of errors in this system, it's really scary that it made into production.

Calil
02/24/2012 03:59 PM by
Calil

Well, a little late, but my 2 cents. Shipping and Billing are entities, they should be separated from the Orders table. Also, an Order can have the same shipping and billing address (as the example shows), but with the current schema, you'll have duplicated data.

If Shipping and Billing have their own table, there should be one for Address as well. Shipping, Billing and others entities (Person, Company...) would reference this table.

Karep
02/24/2012 07:38 PM by
Karep

@tobi: What? Bad data gives bad results. How do you want to create database that when rows are duplicated results are still good!?

Daniel Schilling
02/24/2012 10:19 PM by
Daniel Schilling

So everyone agrees (or should agree) that the biggest problem is the cartesian product between Tracking and PurchaseOrders, causing the order total to be inflated.

If the database schema is correct, then the only reason I can imagine for joining these tables is for display in the view. Since the order total was obviously calculated using this joined data, that means that there's a good chance that the view is calculating the total. I don't think the view should be calculating an important piece of business data like "order total".

As far as fixing it goes, again assuming that the database schema is correct (this should be verified, though), the calculation of the order total should be fixed by moving it to a different piece of code that properly understands the relationship between Tracking and Order. Also, the Tracking grid in the UI should be split into two grids, one for tracking numbers and another for purchase orders, in order to more accurately describe the order to the user.

Alwin
02/24/2012 10:28 PM by
Alwin

What happens if you delete a product, and then want to view an old order with that product in it?

Or does the FK between OrderLine and Order prevent that from ever happening?

Theo Andersen
02/25/2012 08:33 AM by
Theo Andersen

It's because there was some problem with the order, so that there has been added a second Tracking row with 0 cost + a second PurchasedOrders row. (Perhaps the shop had an error with the order, and had to send it again)

So we now have 1 Order, with 2 Tracking rows, and 2 PurchasedOrders. When we join those to on Order, we get 4 rows.

Suppose you need the Tracking table to depend on a purchased order, and not the order? You can't track an Order that wasn't purchased anyway can you?

Kim
02/27/2012 05:06 AM by
Kim

@Alwin im not sure you should be deleting products in an e-commerce system but rather flag them as discontinued or something.

Comments have been closed on this topic.