Netflix and Learn: Demystifying Recommender Systems

This article is Part 1 in a quarterly series that sets out
to demystify artificial intelligence and machine learning techniques that are
common in everyday life and in the news. At Pandata, we firmly believe that
the high-level why (or why not), when, and how can and should be accessible to
all.

Seemingly customized recommendations are everywhere these
days – Netflix suggests what to watch next, Amazon recommends products,
LinkedIn highlights potential contacts, and Pandora delivers music you will
probably enjoy. While these recommendations can sometimes seem off base, in
general they reflect our interests fairly accurately and occasionally present
a welcome surprise. What is the science behind them? Companies are using a
state-of-the-art data science approach, recommender systems, to leverage large
datasets and efficiently guide the customer experience.

While there are many variations on a recommender system, at
a general level they all work by using existing information about behavior to
predict the preferences of clients or end users. One approach, referred to as
“Content-Based Filtering,” identifies characteristics of the product the
customer engaged with and finds other products with similar characteristics.
For example, if you recently purchased warm winter boots, the shopping site
may recommend a similar winter item such as wool socks. Under the hood, each
item in inventory is characterized by a text description, a series of
features, or other such descriptors. One or more algorithms then score how
similar items are to one another, so when a customer purchases an item, the
site can recommend the items closest to it. However, it can be problematic to
recommend something too similar. If I just bought a pair of winter boots, I
likely don’t need another pair.
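
For readers curious about what this looks like under the hood, here is a deliberately tiny sketch in Python. The catalog, the feature names, and every number in it are invented purely for illustration; real systems describe items with much richer data and more sophisticated similarity measures.

```python
# A minimal content-based filtering sketch. Each item is described by a short
# feature vector (all values here are made up); the items whose vectors point
# in the most similar direction get recommended.
import numpy as np

# Hypothetical inventory: features are "is footwear", "winter gear",
# "keeps you warm", "price tier".
catalog = {
    "winter boots":  np.array([1.0, 1.0, 1.0, 0.8]),
    "wool socks":    np.array([1.0, 1.0, 1.0, 0.2]),
    "sandals":       np.array([1.0, 0.0, 0.0, 0.3]),
    "cold medicine": np.array([0.0, 0.0, 0.0, 0.1]),
}

def cosine_similarity(a, b):
    """How closely two feature vectors point in the same direction (0 to 1 here)."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def most_similar(purchased, catalog, top_n=2):
    """Rank every other item by how similar its features are to the purchased item."""
    target = catalog[purchased]
    scores = {name: cosine_similarity(target, vec)
              for name, vec in catalog.items() if name != purchased}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:top_n]

print(most_similar("winter boots", catalog))
# -> [('wool socks', 0.95...), ('sandals', 0.62...)]
```

The whole idea fits in those few lines: describe each item with numbers, measure how close the descriptions are, and surface the closest ones – which is also why, left unchecked, it would happily recommend a second pair of boots.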

Recommending similar items based on item characteristics can
be useful in some circumstances. However, by just identifying similar items,
one is missing out on a voluminous and rich data source – human behavior. A
technique called “Collaborative Filtering” uses information about what items
people like or interact with to predict what any given person may prefer. As a
very simplified example, Person A bought winter boots and cold medicine.
Person B bought winter boots, cold medicine, and diapers. Based on the patterns
in behavior from Persons A and B, when Person C adds winter boots to their cart,
the site may suggest “people who bought winter boots also bought cold medicine”.
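
In code, the simplest version of this idea is just counting co-purchases. The sketch below mirrors the Persons A, B, and C example above; the baskets are made up, and a real system would work with millions of them.

```python
# A minimal "people who bought X also bought Y" sketch using co-occurrence
# counts across hypothetical shopping baskets.
from collections import Counter

baskets = [
    {"winter boots", "cold medicine"},             # Person A
    {"winter boots", "cold medicine", "diapers"},  # Person B
]

def also_bought(item, baskets, top_n=2):
    """Count how often other items appear in baskets that contain `item`."""
    counts = Counter()
    for basket in baskets:
        if item in basket:
            counts.update(basket - {item})
    return counts.most_common(top_n)

# When Person C adds winter boots to their cart:
print(also_bought("winter boots", baskets))
# -> [('cold medicine', 2), ('diapers', 1)]
```

Even this toy version captures the key shift: the recommendation comes from what other people did, not from any description of the products themselves.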

One way collaborative filtering is done is with explicit ratings,
such as movie ratings (for example, 4 stars). A popular family of algorithms
gathers all the ratings across all customers into a big table, or matrix. The
algorithm then uses matrix factorization, a mathematical way to distill that
matrix into compact representations of the users and the items.
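
As a rough illustration of what that can look like, the Python sketch below factorizes a tiny, made-up ratings matrix with plain gradient descent. The matrix, the number of latent factors, and the learning settings are all arbitrary choices for this example; production systems use far larger matrices and more sophisticated algorithms.

```python
# A minimal matrix-factorization sketch on an invented ratings matrix
# (np.nan marks movies a user has not rated). Each user and each movie is
# learned as a short vector of "latent factors" using only the observed ratings.
import numpy as np

ratings = np.array([
    [5.0,    4.0,    np.nan, 1.0],
    [4.0,    np.nan, np.nan, 1.0],
    [1.0,    1.0,    np.nan, 5.0],
    [np.nan, 1.0,    5.0,    4.0],
])  # rows = users, columns = movies

n_users, n_items = ratings.shape
k = 2  # number of latent factors per user and per movie
rng = np.random.default_rng(0)
U = rng.normal(scale=0.1, size=(n_users, k))  # user representations
V = rng.normal(scale=0.1, size=(n_items, k))  # movie representations

observed = [(u, i) for u in range(n_users) for i in range(n_items)
            if not np.isnan(ratings[u, i])]

lr, reg = 0.05, 0.02  # learning rate and regularization strength
for epoch in range(500):
    for u, i in observed:
        pu, qi = U[u].copy(), V[i].copy()
        err = ratings[u, i] - pu @ qi        # how far off the current guess is
        U[u] += lr * (err * qi - reg * pu)   # nudge the user's factors
        V[i] += lr * (err * pu - reg * qi)   # nudge the movie's factors
```

After training, each row of U summarizes one person's tastes and each row of V summarizes one movie, even though neither was ever described explicitly.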

These representations can then be used to compute
theoretical ratings for items that an individual may not have seen before. In
the end, this manifests as “People who purchased winter boots also purchased
cold medicine.” This approach differs from the previous one in that the
recommended item is not necessarily similar to the original item at all;
rather, patterns of human behavior suggest commonality.
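
Concretely, once those compact vectors exist, a theoretical rating is just the dot product of a user's vector with an item's vector. The numbers below are invented purely to show the arithmetic.

```python
# Predicting a rating for an item the user has never rated, given learned
# vectors like the U and V rows in the sketch above (values made up).
import numpy as np

user_vector = np.array([2.1, 1.0])  # one person's learned taste profile
item_vector = np.array([1.8, 0.4])  # the learned profile of an unrated movie

predicted_rating = float(user_vector @ item_vector)
print(round(predicted_rating, 2))   # 4.18 -> a strong candidate to recommend
```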

However, many of us don’t bother rating movies or products,
yet still receive solid recommendations. Explicit ratings are great but rare,
so the Collaborative Filtering technique can also be applied using what are
called “implicit ratings”. This term refers to the idea that you can infer
what a person thinks about a product from their behavior – did they click on
it and spend time on the product page, did they buy it, did they watch only
the first 3 minutes or binge the first 3 seasons? Mathematically transforming
information about how a person interacts with a product can serve as a
stand-in for ratings, although with some assumptions baked in – such as that
because they bought it, they liked it, or that they watched three seasons
rather than just falling asleep with autoplay on. While these assumptions may
not always hold, with the large volumes of data common in streaming media or
e-commerce, clear trends still emerge.
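
Purely as an illustration, the sketch below boils a handful of interaction signals down to one implicit score per item. The event fields, the weights, and the cap on time-on-page are all assumptions made for this example; in practice they would be chosen carefully and tuned against real outcomes.

```python
# Turning behavioral signals into an implicit "rating" (all weights are
# illustrative assumptions, not values from any real system).
events = [
    {"item": "winter boots", "clicked": True, "seconds_on_page": 140, "purchased": True},
    {"item": "sandals",      "clicked": True, "seconds_on_page": 8,   "purchased": False},
]

def implicit_score(event):
    """Combine clicks, time on page, and purchases into a single score."""
    score = 0.0
    if event["clicked"]:
        score += 1.0
    score += min(event["seconds_on_page"] / 60.0, 3.0)  # cap the time-on-page signal
    if event["purchased"]:
        score += 5.0  # buying is treated as the strongest sign they liked it
    return score

for e in events:
    print(e["item"], round(implicit_score(e), 2))
# winter boots 8.33
# sandals 1.13
```

Scores like these can then stand in for the explicit ratings in the matrix described earlier.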

While recommender systems can be very powerful, they are not
without potential pitfalls. A big risk is that by using like to recommend like,
suggestions fall into a silo. At best, this causes recommendations to be
boring. At worst, recommendations can reflect social bias or discrimination
present in the underlying dataset. Siloing can stem from the limitations of
content-based filtering or from the predictable, stereotyped nature of human
behavior. People who watch one horror movie probably watch multiple horror
movies; pretty soon, the algorithm is only recommending horror movies. Human
bias can also be reflected. For example, a recommender system used to suggest
college classes may, without additional modification, recommend engineering
classes to male students and early childhood education classes to female
students – based on gender-biased enrollment patterns. That said, there are
statistical and mathematical steps one can take to avoid pigeonholing, and a
truly effective recommender system includes a component to identify and
address bias and siloing.
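
One simple, illustrative example of such a step is re-ranking: after the algorithm scores its candidates, penalize each additional pick for repeating what has already been chosen. The titles, scores, and penalty below are made up, and real systems use more principled diversity and fairness methods, but the sketch shows the flavor.

```python
# A toy diversity re-ranker: each new pick is penalized for sharing a genre
# with items already chosen, so the list doesn't collapse into one silo.
candidates = [
    ("Horror Movie A", "horror", 0.95),
    ("Horror Movie B", "horror", 0.93),
    ("Horror Movie C", "horror", 0.90),
    ("Sci-Fi Movie",   "sci-fi", 0.80),
    ("Comedy Movie",   "comedy", 0.75),
]  # (title, genre, relevance score from the recommender)

def diversified(candidates, top_n=3, penalty=0.2):
    chosen, remaining = [], list(candidates)
    while remaining and len(chosen) < top_n:
        def adjusted(item):
            # subtract a penalty for every already-chosen item in the same genre
            repeats = sum(1 for c in chosen if c[1] == item[1])
            return item[2] - penalty * repeats
        best = max(remaining, key=adjusted)
        chosen.append(best)
        remaining.remove(best)
    return [title for title, _, _ in chosen]

print(diversified(candidates))
# ['Horror Movie A', 'Sci-Fi Movie', 'Comedy Movie'] instead of three horror movies
```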

Next time you are wondering how Amazon knows what shoes you like,
or how Netflix plans the perfect Friday evening, you have a recommender system to
thank. If there is an AI concept that you would like to see explained, contact hello@pandata.co.
