Rating Has Never Been So Good

A comparison of the rating methods popular on the web, alongside a technical breakdown of a novel rating algorithm.

Stylized depiction of a binary search. A bookshelf made in excalidraw.
It's supposed to be a bookshelf, with the position of a new book being determined by a binary search

I got an ad on Instagram for an app called Beli. At first glance, it's a social media app for restaurant patrons, aimed at a niche similar to Untapped's but with an emphasis on restaurants instead of beers. Untapped is a social media app where you mark places where you've had a beer, rate the beer, give it some tags, and maybe include a friend or make a post to the app's social feed. It's also great for researching and tracking beers, and it's cool when friends use it so you can spy on their drinking habits. I used to use Untapped a lot, so I seemed like a natural fit for Beli. I'm still new to it, but so far Beli is a very well made app that I'm really excited to watch grow.

The reason for this post is that Beli does something really cool that I haven't seen before. It has a novel rating method that honestly was so fun that my partner and I ended up rating every restaurant I've been to in Denver over the past year, and we had a blast doing it. Most rating systems have you provide a value of some sort to use for ranking. Google Maps has a 5-star rating, Apple Maps has thumbs up/down for a few categories, and Untapped has a 0-to-10 scale in 0.1 increments. These systems are all flawed.

I subscribe to the same method as Dave Portnoy when it comes to giving a rating. High ratings should be very rare, and a perfect rating should only happen once in a lifetime. Restaurants, or whatever you happen to be rating, usually don't agree with this system, and they work hard to chase those perfect ratings. I've been active on Google Maps for over 10 years now, and it's so frustrating when a perfectly fine restaurant reaches out to me because I gave them a 4/5 star rating. When I drove for Uber there was a similar problem: if your driver rating fell below 4.7, you were put on probation until you got your numbers back up. Five options for rating just isn't enough. On the other end of the spectrum you have Untapped, which essentially gives you 100 different rating options. The issue is that people rarely give ratings from 3 to 7, so they still end up essentially using a 5-star system where things they like do abnormally well and things they don't like score horribly. Apple is probably my favorite system we've covered so far. They give you a few categories: overall, food & drink, customer service, and atmosphere. All are optional to rate, too. This is great because it pretty much breaks down how the 5-star system should work, and it removes the stress of one abnormally good or bad factor changing the overall rating. Steam uses a similar positive/negative system that I trust more than any other reviews.

All of the previously mentioned systems are flawed, but Beli has innovated greatly in this space because their rating system is actually a ranking system. When you add a new place to your list, it doesn't ask you for a number or a star count or which way a thumb should go; it compares the new place to the other places in your list. You rank it against the places you have already visited using a binary insertion sort. After a few comparisons you end up with a ranked list, and a linear interpolation is then run on that list to give everything a rating from 0.0 to 10.0. I'll break down this algorithm in full detail below.

This system is a lot of fun because pitting two of your favorite places against each other really makes you think about each option, and it kind of diminishes the importance of the actual numerical rating, which is a good thing. I think this system would also make buying fake reviews a lot more difficult, and fake reviews a lot more obvious to spot.

Right now The Wolf's Tailor is my top-ranked restaurant, which gives it a 10/10. Like I said earlier, a 10/10 should be reserved for that once-in-a-lifetime meal, and to its credit The Wolf's Tailor is pretty close, but since what I'm really rating is its performance compared to its peers, the rating is fine with me. It was even fun debating how this $330 meal stacked up against my favorite fast food fried chicken place.

I think this is a better ranking system for the restaurants too. If my place got a 5/10 rating from someone, I would be worried about how their visit went, but if I can see that it's simply because my neighbors are beating me in the head-to-head ranking, then I know that I need to strive to be better instead of writing off a few bad reviews.

Let's Write Some Code

If you don't want to see Python and charts, turn around now

Let's see if we can recreate this system in Python. First we need a list of things to sort:

books = [
    "The Next 100 Years",
    "Extreme Ownership",
    "Project Hail Mary",
    "The Space Barons",
    "Safe Is Not an Option",
    "Skunk Works",
    "Artemis",
    "What If?",
    "The Hitchhiker's Guide to the Galaxy",
    "Ready Player One",
]

First 10 books that I could think of

Then we pretty much just write a simple binary search algorithm that uses user input, instead of a numeric comparison, to decide which of two items ranks higher. Wikipedia is an awesome resource for getting pseudocode of an algorithm like this.

import random

random.shuffle(books)  # Just to make things more fun
sorted_books = []

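for book in books:
    low = 0  # search window is sorted_books[low:high], worst book first
    high = len(sorted_books)

    while low < high:
        mid = (low + high) // 2
        answer = input(f"Was '{book}' better than '{sorted_books[mid]}'? (y/n) ")

        if answer.lower() == "y":
            low = mid + 1  # better than mid: keep searching the upper half
        elif answer.lower() == "n":
            high = mid  # not better: keep searching the lower half
        else:
            raise ValueError("Invalid input. Please enter 'y' or 'n'.")

    sorted_books.insert(low, book)  # low is now where the book belongs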

23 lines of code not too bad 😎
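The loop above is tied to `input()`, and a single typo ends the whole session with a `ValueError`. Here is a minimal sketch of the same binary insertion pulled into functions, so the comparison can come from a prompt, a test, or a UI. The names `binary_insert`, `ask_user`, and `rank_all` are made up for illustration; they are not from Beli or any library.

```python
from typing import Callable, List, Tuple


def binary_insert(
    ranked: List[str],
    item: str,
    is_better: Callable[[str, str], bool],
) -> int:
    """Insert item into ranked (worst first); return the comparisons used."""
    low, high = 0, len(ranked)
    comparisons = 0
    while low < high:
        mid = (low + high) // 2
        comparisons += 1
        if is_better(item, ranked[mid]):
            low = mid + 1  # item beats ranked[mid]: search the upper half
        else:
            high = mid  # item loses: search the lower half
    ranked.insert(low, item)
    return comparisons


def ask_user(new: str, existing: str) -> bool:
    """Ask the same y/n question, but re-prompt on a typo instead of raising."""
    while True:
        answer = input(f"Was '{new}' better than '{existing}'? (y/n) ")
        answer = answer.strip().lower()
        if answer in ("y", "n"):
            return answer == "y"
        print("Please enter 'y' or 'n'.")


def rank_all(
    items: List[str],
    is_better: Callable[[str, str], bool],
) -> Tuple[List[str], int]:
    """Rank every item one at a time; return the list and total comparisons."""
    ranked: List[str] = []
    total = 0
    for item in items:
        total += binary_insert(ranked, item, is_better)
    return ranked, total


# Non-interactive demo: plain string comparison stands in for the user.
demo_ranked, demo_total = rank_all(["b", "d", "a", "c"], lambda a, b: a > b)
print(demo_ranked, demo_total)
```

For the interactive version, `rank_all(books, ask_user)` should behave like the original loop, just without crashing on a stray keypress.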

Since this algorithm uses binary search, the worst case for adding a new entry to a list of \(n\) items is only \(\lfloor \log_2 (n) + 1 \rfloor\) operations, which for this list of 10 books means a worst case of 4 user inputs to place a new book. My ranking of the 10 books looks like this:

Was 'The Hitchhiker's Guide to the Galaxy' better than 'Project Hail Mary'? (y/n) n
Was 'Safe Is Not an Option' better than 'Project Hail Mary'? (y/n) n
Was 'Safe Is Not an Option' better than 'The Hitchhiker's Guide to the Galaxy'? (y/n) y
Was 'Ready Player One' better than 'Safe Is Not an Option'? (y/n) n
Was 'Ready Player One' better than 'The Hitchhiker's Guide to the Galaxy'? (y/n) n
Was 'Skunk Works' better than 'Safe Is Not an Option'? (y/n) n
Was 'Skunk Works' better than 'The Hitchhiker's Guide to the Galaxy'? (y/n) n
Was 'Skunk Works' better than 'Ready Player One'? (y/n) y
Was 'What If?' better than 'The Hitchhiker's Guide to the Galaxy'? (y/n) y
Was 'What If?' better than 'Project Hail Mary'? (y/n) n
Was 'What If?' better than 'Safe Is Not an Option'? (y/n) n
Was 'The Space Barons' better than 'What If?'? (y/n) n
Was 'The Space Barons' better than 'Skunk Works'? (y/n) n
Was 'The Space Barons' better than 'Ready Player One'? (y/n) y
Was 'Extreme Ownership' better than 'The Hitchhiker's Guide to the Galaxy'? (y/n) n
Was 'Extreme Ownership' better than 'The Space Barons'? (y/n) y
Was 'Extreme Ownership' better than 'Skunk Works'? (y/n) y
Was 'The Next 100 Years' better than 'The Hitchhiker's Guide to the Galaxy'? (y/n) n
Was 'The Next 100 Years' better than 'Skunk Works'? (y/n) n
Was 'The Next 100 Years' better than 'The Space Barons'? (y/n) n
Was 'The Next 100 Years' better than 'Ready Player One'? (y/n) n
Was 'Artemis' better than 'Extreme Ownership'? (y/n) n
Was 'Artemis' better than 'The Space Barons'? (y/n) n
Was 'Artemis' better than 'Ready Player One'? (y/n) n
Was 'Artemis' better than 'The Next 100 Years'? (y/n) n

idk how to make a pretty UI on the fly for this sorry

It took me 25 inputs to sort this list, which is quite literally the worst-case scenario. The random library really did a number on me tonight. That may seem like a lot of inputs for only 10 books, but when you consider that no single book took more than 4 inputs, it isn't too bad. Since math equations aren't the easiest to visualize, we can plot the length of the list vs. the inputs required to see how the growth in inputs tapers off as the list gets longer:

A line graph titled 'Number of User Inputs Required for Each Insertion in Binary Insertion Sort' with the x-axis labeled 'Length of List at Insertion' ranging from 0 to 100 and the y-axis labeled 'User Inputs for Each Insertion' ranging from 0 to 6. The plot shows a curve starting at the origin and increasing at a decreasing rate as the length of the list increases, depicting the number of inputs required for each insertion.
Not the prettiest chart
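To put numbers behind a chart like this, here is a small sketch that computes the worst-case number of inputs for each list length. It assumes the stepped worst case from the formula above; the plotted curve may well have been drawn from the smooth \(\log_2\) version instead. The function name `worst_case_inputs` is made up for illustration.

```python
def worst_case_inputs(list_length: int) -> int:
    """Worst-case y/n questions to insert into a sorted list of this length.

    For n >= 1 this is floor(log2(n)) + 1, which equals n.bit_length().
    Inserting into an empty list needs no questions, and (0).bit_length() is 0.
    """
    return list_length.bit_length()


# Worst case for every list length from 0 to 100 (the chart's x-axis range).
per_insertion = [worst_case_inputs(n) for n in range(101)]

# Ranking 10 books means inserting into lists of length 0 through 9.
total_for_ten_books = sum(worst_case_inputs(n) for n in range(10))
print(total_for_ten_books)  # 25
```

The total of 25 matches the number of questions in the session above, which is why that run counts as the worst case.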

I would argue that a list of more than a few hundred items becomes hard for a human to manage, so I doubt users would ever have to do more than 8 or 9 inputs, even for a really large library of items (8 inputs covers a list of up to 255 items, and 9 covers up to 511).
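The other half of the system is the linear interpolation that turns a ranked list into a rating from 0.0 to 10.0. Beli's exact formula isn't public as far as I know, so this is only a minimal sketch under one assumption: ratings are spread evenly from the worst item to the best. The function name `interpolate_ratings`, its parameters, and the "Place A/B/C" entries are all hypothetical.

```python
from typing import Dict, List


def interpolate_ratings(
    ranked: List[str],
    lowest: float = 0.0,
    highest: float = 10.0,
) -> Dict[str, float]:
    """Spread ratings evenly from lowest (worst item) to highest (best item)."""
    if not ranked:
        return {}
    if len(ranked) == 1:
        # Assumption: a lone item is also the top item, so it gets the top score.
        return {ranked[0]: highest}
    step = (highest - lowest) / (len(ranked) - 1)
    return {
        item: round(lowest + position * step, 1)
        for position, item in enumerate(ranked)
    }


# Placeholder list, worst first, in the same order the sorting code produces.
example = interpolate_ratings(["Place C", "Place B", "Place A"])
print(example)
```

With `lowest=1.0`, a ranked list of ten items lands on the whole numbers 1 through 10, which is the scale used for the final book ranking at the end of this post.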

Cool Additional Things the Beli Implementation Does

Beli does two things that are really important for making this rating method work for the general population. First, they have you sort your new entry into one of three options: bad, okay, or great. I think this helps keep the comparisons grounded, since it starts the comparisons off in a region of the list that is closer to where the entry ends up. The other thing is that they have a button for when comparing two restaurants is too hard. I don't know how this actually decides a ranking, but I would imagine it is necessary for making the ranking easier.
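How Beli implements the bad/okay/great step isn't something I can see from the outside, so the following is only a guess at one way it could work: keep a separate ranked list per bucket and run the binary insertion inside the chosen bucket only. Every name here (`BUCKETS`, `insert_with_bucket`, `full_ranking`) is hypothetical, and the "too hard to compare" button is left out because its behavior is unknown.

```python
from typing import Callable, Dict, List

BUCKETS = ("bad", "okay", "great")  # ordered worst to best


def insert_with_bucket(
    buckets: Dict[str, List[str]],
    item: str,
    bucket: str,
    is_better: Callable[[str, str], bool],
) -> int:
    """Binary-insert item into its own bucket; return the comparisons used."""
    ranked = buckets[bucket]
    low, high = 0, len(ranked)
    comparisons = 0
    while low < high:
        mid = (low + high) // 2
        comparisons += 1
        if is_better(item, ranked[mid]):
            low = mid + 1  # item beats ranked[mid]: search the upper half
        else:
            high = mid  # item loses: search the lower half
    ranked.insert(low, item)
    return comparisons


def full_ranking(buckets: Dict[str, List[str]]) -> List[str]:
    """Join the buckets, worst to best, into one overall ranked list."""
    return [item for name in BUCKETS for item in buckets[name]]
```

Under this assumption the first tap replaces one or two comparisons, and every question that follows pits the new place against something from the same tier, which fits the "grounded" feeling described above.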

To wrap things up, here is my final ranking of the 10 books. Feel free to let me know how wrong I am.

Rating  Book
10      Project Hail Mary
9       Safe Is Not an Option
8       What If?
7       The Hitchhiker's Guide to the Galaxy
6       Extreme Ownership
5       Skunk Works
4       The Space Barons
3       Ready Player One
2       The Next 100 Years
1       Artemis

It's crazy that Ghost still doesn't have native tables.