I just got a really interesting customer inquiry, and I got their approval to share it. The basic problem is booking flights, and how to model that.
The customer suggested something like the following:
{ // customers/12345
  "Name": "John Doe",
  "Bookings": [{
    "FlightId": "flights/1234",
    "BookingId": "1asifyupi",
    "Flight": "EA-4814",
    "From": "Iceland",
    "To": "Japan",
    "DateBooked": "2012/1/1"
  }]
}
{ // flights/1234
  "PlaneId": "planes/1234", // centralized miles flown, service history
  "Seats": [
    {
      "Seat": "F16",
      "BookedBy": "1asifyupi"
    }
  ]
}
But that is probably a… suboptimal way to handle this. Let us go over the types of entities that we have here:
- Customers / Passengers
- Flights
- Planes
- Bookings
The key point here is that each of those is pretty independent. Note that for simplicity’s sake, I’m assuming that the customer is also the passenger (not true in many cases: a company may pay for your flight, so the company is the customer and you are the passenger).
The actual problem the customer is dealing with is that they have thousands of flights, tens or hundreds of thousands of seats and millions of customers competing for those seats.
Let us see if we can break it down into a model that can work for this scenario. Each customer deserves its own document, but I wouldn’t store the bookings directly in the customer document. There are many customers that fly a lot, and they are going to have a lot of bookings there. At the same time, there are many bookings that are made for a lot of people at the same time (an entire family flying).
That leaves the Customer’s document with data about the customer (name, email, phone, passport #, etc) as well as details such as # of miles traveled, the frequent flyer status, etc.
Now, we have the notion of flights and bookings. A flight is a (from, to, time, plane) tuple, which also contains the number of available seats. Note that we need to explicitly allow for overbooking, since that is a common practice for airlines.
There are several places where we have contention here:
- When ordering, we want to overbook up to a certain limit.
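To make the split concrete, here is a rough sketch of the three document shapes in Python. All of the field names (`passenger_ids`, `seat_count`, `overbooking_limit` and so on) are my own assumptions for illustration, not something the customer proposed.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Customer:
    # Only data about the customer; bookings are NOT embedded here.
    id: str
    name: str
    email: str
    miles_traveled: int = 0
    frequent_flyer_status: str = "none"  # hypothetical default value


@dataclass
class Booking:
    # A separate document, so a frequent flyer never bloats the customer document.
    id: str
    flight_id: str
    passenger_ids: List[str]  # one booking can cover an entire family
    date_booked: str


@dataclass
class Flight:
    # (from, to, time, plane) plus the seat numbers needed for overbooking.
    id: str
    origin: str
    destination: str
    departure: str
    plane_id: str
    seat_count: int
    overbooking_limit: int  # how many seats past seat_count may be sold


family = Booking(
    id="bookings/1",
    flight_id="flights/1234",
    passenger_ids=["customers/12345", "customers/12346"],
    date_booked="2012/1/1",
)
```

The point of the shape is that a booking references customers and a flight by id, so none of the three documents grows without bound.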
- When seating (usually 24 – 48 hours before the flight) we want to reserve seats.
The good thing about it is that we actually have relatively little contention on a particular flight. And the way the airline industry works, we don’t actually need a transaction between creating the booking and taking a seat on the flight.
The usual workflow goes something like this:
- A “reservation” is made for a particular itinerary.
- That itinerary is held for 24 – 48 hours.
- That itinerary is sent to the customer for approval.
- The customer approves and a booking is made; flight reservations are turned into actual booked seats.
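The workflow above can be sketched as a tiny state machine. This is only an illustration: the `Reservation` class, its status names and the 24 hour `HOLD_WINDOW` are assumptions I picked to match the steps in the list.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List

HOLD_WINDOW = timedelta(hours=24)  # assumption: the short end of the 24 - 48 hour hold


@dataclass
class Reservation:
    itinerary: List[str]      # flight ids held for this customer
    created_at: datetime
    status: str = "held"      # held -> booked, or held -> expired

    def expires_at(self) -> datetime:
        return self.created_at + HOLD_WINDOW

    def confirm(self, now: datetime) -> bool:
        """Customer approval: turn the hold into a booking if it is still valid."""
        if self.status != "held":
            raise ValueError(f"cannot confirm a reservation that is {self.status}")
        if now > self.expires_at():
            self.status = "expired"
            return False
        self.status = "booked"
        return True


on_time = Reservation(["flights/1234"], created_at=datetime(2012, 1, 1))
too_late = Reservation(["flights/1234"], created_at=datetime(2012, 1, 1))
on_time.confirm(now=datetime(2012, 1, 1, 12))   # within the hold window
too_late.confirm(now=datetime(2012, 1, 3))      # the hold has lapsed
```

Note that nothing here touches payment; the charge happens around the `confirm` step, separately from the hold on the flight.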
The good thing about this is that because a flight can have up to ~600 seats in it, we don’t really have to worry about contention on a single flight. We can just use normal optimistic concurrency and avoid more complex models. That means that we can just retry on concurrency errors and see where that leads us. Splitting the actual order into a reservation and a booking also helps, since we don’t have to coordinate between the actual charge and the reservation on the flight.
Overbooking is handled by setting a limit on how much overbooking we allow, and managing the number of booked seats vs. reserved seats. When we look at the customer data, we show the customer document, along with the most recent orders and the stats. When we look at a particular flight, we can get pretty much all of its state from the flight document itself.
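Here is a minimal sketch of what "retry on concurrency errors" looks like. The `InMemoryStore` is a stand-in I made up so the example runs on its own; a real document database would do the version check for you, and the `ReservedSeats` field name is an assumption as well.

```python
class ConcurrencyError(Exception):
    """Raised when a document changed since we loaded it."""


class InMemoryStore:
    # Hypothetical stand-in for a document store with per-document versions.
    def __init__(self):
        self._docs = {}  # doc id -> (version, document)

    def load(self, doc_id):
        version, doc = self._docs[doc_id]
        return version, dict(doc)  # hand out a copy, like a fresh read

    def save(self, doc_id, doc, expected_version):
        current_version = self._docs.get(doc_id, (0, None))[0]
        if current_version != expected_version:
            raise ConcurrencyError(doc_id)  # somebody else wrote first
        self._docs[doc_id] = (current_version + 1, dict(doc))


def reserve_seat(store, flight_id, max_attempts=5):
    """Load, modify, save; on a conflict reload the flight and try again."""
    for _ in range(max_attempts):
        version, flight = store.load(flight_id)
        flight["ReservedSeats"] += 1
        try:
            store.save(flight_id, flight, expected_version=version)
            return True
        except ConcurrencyError:
            continue  # stale read, retry against the latest version
    return False


store = InMemoryStore()
store.save("flights/1234", {"ReservedSeats": 0}, expected_version=0)
reserve_seat(store, "flights/1234")
```

With a few hundred seats per flight, the number of writers hitting the same flight document at the same instant should be small enough that a handful of retries is plenty.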
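The booked vs. reserved bookkeeping might look something like the following. The `FlightSeats` class and its method names are hypothetical; the only idea taken from the text is that we sell up to capacity plus an overbooking limit and track two counters.

```python
from dataclasses import dataclass


@dataclass
class FlightSeats:
    capacity: int            # physical seats on the plane
    overbooking_limit: int   # extra seats we are willing to sell
    booked: int = 0          # confirmed and paid for
    reserved: int = 0        # held, waiting for customer approval

    def sellable(self) -> int:
        return self.capacity + self.overbooking_limit

    def try_reserve(self) -> bool:
        # Holds count against the limit too, otherwise we could oversell.
        if self.booked + self.reserved >= self.sellable():
            return False
        self.reserved += 1
        return True

    def confirm(self) -> None:
        # A held seat becomes a booked seat; the total does not change.
        if self.reserved == 0:
            raise ValueError("no reservation to confirm")
        self.reserved -= 1
        self.booked += 1

    def release(self) -> None:
        # An expired or cancelled hold frees the seat for someone else.
        if self.reserved == 0:
            raise ValueError("no reservation to release")
        self.reserved -= 1


seats = FlightSeats(capacity=2, overbooking_limit=1)
results = [seats.try_reserve() for _ in range(4)]  # the 4th is over the limit
```

Since both counters live on the flight document, a single load of that document is enough to answer "can we still sell a seat?".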
And the plane’s stats are usually just handled via a map/reduce index on all the flights for that plane.
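As a rough illustration of the map/reduce idea (plain Python, not any particular database's index API), the plane stats fall out of the flights like this. The `Miles` field on the flight is an assumption.

```python
def map_flight(flight):
    # Map: emit one entry per flight, keyed by the plane that flew it.
    return flight["PlaneId"], {"Flights": 1, "Miles": flight["Miles"]}


def reduce_stats(left, right):
    # Reduce: combine two partial results for the same plane.
    return {
        "Flights": left["Flights"] + right["Flights"],
        "Miles": left["Miles"] + right["Miles"],
    }


def plane_stats(flights):
    totals = {}
    for flight in flights:
        plane_id, entry = map_flight(flight)
        if plane_id in totals:
            totals[plane_id] = reduce_stats(totals[plane_id], entry)
        else:
            totals[plane_id] = entry
    return totals


stats = plane_stats([
    {"PlaneId": "planes/1234", "Miles": 5000},
    {"PlaneId": "planes/1234", "Miles": 3000},
    {"PlaneId": "planes/5678", "Miles": 1200},
])
```

This way the plane document never has to be updated on every flight; the totals are derived from the flight documents.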
Now, in the real world, the situation is a bit more complex. We might give out 10 economy seats and 3 business seats to Expedia for a two-month period, so they manage that, and we have partnership agreements with other airlines, and… but I think that this is a good foundation to start building this on.