Implementing "Suggested Destinations" in a few lines of code
Today I got in my car to drive to work and realized that Waze suggested “Work” as the primary destination to select. I had noticed that before, and it is a really nice feature. Today, I got to thinking about how I would implement something like that.
That was a nice drive since I kept thinking about algorithms and data flow. When I got to the office, I decided to write about how we can implement something like that. Based on historical information, let’s suggest the likely destinations.
Here is the information we have:
The Lat & Lng coordinates represent the start location, the time is the start time for the trip, and the destination is obvious. In the data set above, we have trips to and from work, to the gym once a week, and to our parents over the weekends.
Based on this data, I would like to build recommendations for destinations. I could try to analyze the data and figure out all sorts of details. The prediction that I want to make is, given a location & time, to find where my likely destination is going to be.
I could try to analyze the data on a deep level, drawing on patterns, etc. Or I can take a very different approach and just throw some computing power at the problem.
Let’s talk in code since this is easier. I have a list of trips that look like this:
public record Trip(double Lat, double Lng, string Destination, DateTime Time);
Trip[] trips = RecentTrips(TimeSpan.FromDays(90));
Given that, I want to be able to write this function:
string[] SuggestDestination((double Lat, double Lng) location, DateTime now)
I’m going to start by processing the trips data, to extract the relevant information:
var historyByDest = new Dictionary<string, List<double[]>>();
foreach (var trip in trips)
{
    if (historyByDest.TryGetValue(trip.Destination, out var list) is false)
    {
        historyByDest[trip.Destination] = list = new();
    }
    list.Add([
        trip.Lat,
        trip.Lng,
        // time of day encoded as HHMM (08:30 -> 830); note this is not
        // minutes after midnight, which would be Hour * 60 + Minute
        trip.Time.Hour * 100 + trip.Time.Minute,
        trip.Time.DayOfYear,
        (int)trip.Time.DayOfWeek
    ]);
}
What this code does is extract details (location, day of the week, time of day, etc.) from the trip information and store them in an array. For each trip, we basically break apart the trip across multiple dimensions.
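To make the dimensions concrete, here is a minimal sketch of the same extraction applied to a single trip start. The `ToFeatures` helper and the sample coordinates are hypothetical, introduced only for illustration; the five dimensions mirror the ones in the loop above.

```csharp
using System;

// Hypothetical helper (not part of the original code): builds the same
// five-element array for one start location and time.
double[] ToFeatures(double lat, double lng, DateTime time) =>
[
    lat,
    lng,
    time.Hour * 100 + time.Minute, // HHMM encoding: 08:30 becomes 830
    time.DayOfYear,
    (int)time.DayOfWeek            // Sunday = 0 ... Saturday = 6
];

// Made-up sample: a trip starting on Monday, January 1st 2024, at 08:30.
var features = ToFeatures(32.0, 34.8, new DateTime(2024, 1, 1, 8, 30, 0));
Console.WriteLine(string.Join(", ", features));
```

Each position in the array is one dimension, so two trips can be compared position by position.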
The next step is to make the actual prediction we want, which will begin by extracting the same dimensions from the inputs we get, like so:
double[] compare = [
    location.Lat,
    location.Lng,
    now.Hour * 100 + now.Minute,
    now.DayOfYear,
    (int)now.DayOfWeek
];
Now we basically have an array of values from which we want to predict, and for each destination, an array that represents the same dimensions of historical trips. Here is the actual computation:
List<(string Dest, double Score)> scores = new();
foreach (var (dest, items) in historyByDest)
{
    double score = 0;
    foreach (var cur in items)
    {
        for (var i = 0; i < cur.Length; i++)
        {
            score += Math.Abs(cur[i] - compare[i]);
        }
    }
    score /= items.Count;
    scores.Add((dest, score));
}
scores.Sort((x, y) => x.Score.CompareTo(y.Score));
What we do here is compute the difference between the two arrays: the current start location & time compared to the start location & time of historical trips. We do that not only on the raw data but also on additional features that we extract from it.
For example, one dimension is the day of the week, and the other is the time of day. It is not sufficient to compare just the date itself.
The end result is the distance between the current trip start and previous trips for each of the destinations I have. Then I can return the destinations that most closely match my current location & time.
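Putting the pieces together, the whole flow can be sketched as one function. This is a condensed LINQ variant written for illustration, not the exact code from the post: trips are passed in as tuples rather than the `Trip` record, the `Features` helper is hypothetical, and the sample trips and coordinates are made up.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Same five dimensions as in the post (hypothetical helper name).
double[] Features(double lat, double lng, DateTime t) =>
    [lat, lng, t.Hour * 100 + t.Minute, t.DayOfYear, (int)t.DayOfWeek];

string[] SuggestDestination(
    IEnumerable<(double Lat, double Lng, string Destination, DateTime Time)> trips,
    (double Lat, double Lng) location, DateTime now)
{
    double[] compare = Features(location.Lat, location.Lng, now);
    return trips
        .GroupBy(t => t.Destination)
        .Select(g => (
            Dest: g.Key,
            // Average, per destination, of the summed absolute differences.
            Score: g.Average(t => Features(t.Lat, t.Lng, t.Time)
                .Zip(compare, (a, b) => Math.Abs(a - b)).Sum())))
        .OrderBy(x => x.Score) // smallest distance first
        .Select(x => x.Dest)
        .ToArray();
}

// Made-up history: home is at (32.10, 34.80), work at (32.05, 34.75).
var history = new List<(double Lat, double Lng, string Destination, DateTime Time)>
{
    (32.10, 34.80, "Work", new DateTime(2024, 1, 1, 8, 30, 0)),
    (32.10, 34.80, "Work", new DateTime(2024, 1, 2, 8, 35, 0)),
    (32.05, 34.75, "Home", new DateTime(2024, 1, 1, 17, 30, 0)),
    (32.10, 34.80, "Gym",  new DateTime(2024, 1, 2, 19, 0, 0)),
};

// At home on Wednesday morning: the morning trips to work are the closest match.
var suggestions = SuggestDestination(history, (32.10, 34.80), new DateTime(2024, 1, 3, 8, 40, 0));
Console.WriteLine(string.Join(", ", suggestions));
```

The function returns every known destination, ordered from the closest match to the farthest, so the caller can decide how many suggestions to show.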
Running this over a few tests shows that this is remarkably effective. For example, if I’m at home on a Saturday, I’m very likely to visit either set of grandparents. On Sunday morning, I head to the Gym or Work, but on Monday morning, it is more likely to be Work.
All of those were mostly fixed, with the day of the week and the time being different. But if I’m at my parents’ house on a weekday (which is unusual), the location would have a far greater weight on the decision, etc. Note that the code is really trivial (I spent more time generating the actual data), but we can extract some nice information from this.
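One thing to keep in mind when summing raw differences is that the dimensions live on very different scales: two nearby locations differ by fractions of a degree, while two times of day can differ by hundreds, and Saturday (6) and Sunday (0) look six days apart. A possible refinement, sketched here under assumptions, is to weight each dimension and treat the day of the week as circular. The weights below are hypothetical values picked only for illustration, not tuned numbers.

```csharp
using System;

// Distance on a wrapping dimension: with a period of 7,
// Saturday (6) and Sunday (0) are 1 apart rather than 6.
double Circular(double a, double b, double period)
{
    var d = Math.Abs(a - b) % period;
    return Math.Min(d, period - d);
}

// Hypothetical weights for [lat, lng, HHMM, day of year, day of week].
double[] weights = [1000, 1000, 1, 0.1, 50];

double WeightedDistance(double[] cur, double[] compare)
{
    double sum = 0;
    for (var i = 0; i < cur.Length; i++)
    {
        // Index 4 is the day of the week, which wraps around;
        // the other dimensions use a plain absolute difference.
        var diff = i == 4
            ? Circular(cur[i], compare[i], 7)
            : Math.Abs(cur[i] - compare[i]);
        sum += weights[i] * diff;
    }
    return sum;
}
```

Swapping this in for the inner `Math.Abs` loop would let you control how much the location counts relative to the time, at the cost of having to choose the weights.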
The entire code is here. Admittedly, it’s pretty dirty code since I wanted to test how this would actually work. At this point, I’m going to update my Curriculum Vitae and call myself a senior AI developer.
Joking aside, this approach provides a good (although highly simplified) overview of how modern AI systems work. Given a data item (image, text, etc.), you run that through the engine that outputs the embedding (those arrays we saw earlier, with values for each dimension) and then try to find its nearest neighbors across multiple dimensions.
In the example above, I explicitly defined the dimensions to use, whereas LLMs would have their “secret sauce” for this. The concept, at a sufficiently high level, is the same.
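As a rough illustration of the nearest-neighbor idea, here is a minimal sketch that finds the stored vector closest to a query vector. It uses cosine similarity, a common choice for embeddings, rather than the sum of absolute differences used earlier in this post. The labels and three-dimensional vectors are made up; real embeddings have far more dimensions and come from a model, not from hand-picked numbers.

```csharp
using System;
using System.Linq;

// Cosine similarity: 1.0 means the vectors point the same way, 0.0 means orthogonal.
double Cosine(double[] a, double[] b)
{
    double dot = 0, na = 0, nb = 0;
    for (var i = 0; i < a.Length; i++)
    {
        dot += a[i] * b[i];
        na += a[i] * a[i];
        nb += b[i] * b[i];
    }
    return dot / (Math.Sqrt(na) * Math.Sqrt(nb));
}

// Made-up "embeddings" for three items.
var items = new[]
{
    (Label: "cat",   Vector: new[] { 0.9, 0.1, 0.0 }),
    (Label: "dog",   Vector: new[] { 0.6, 0.6, 0.1 }),
    (Label: "truck", Vector: new[] { 0.0, 0.1, 0.9 }),
};
var query = new[] { 0.85, 0.15, 0.05 };

// The nearest neighbor is the item with the highest similarity to the query.
var nearest = items.OrderByDescending(x => Cosine(x.Vector, query)).First().Label;
Console.WriteLine(nearest);
```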
Comments
Can you explain the * 100 to get minutes after midnight, as opposed to * 60 ?
Oren, are you sure it's wise to post exact GPS coordinates of the location of your home, your inlaw's home, and so on?
Jorge, search some of the coords, they show that Oren is a farmer, perhaps he travels by tractor.
Peter, that's where the term RavenDB came from. Oren got tired of shooting the Ravens that tried to steal the fresh crops 🤣