
Similarity recommender

What it is

A recommender answers “you liked this — what else might you like?” The simplest kind is content-based: describe each item with a few numbers, then suggest the items whose numbers sit closest to the thing you already liked. No crowd, no ratings — just similarity.

Go deeper: every item becomes a point in a feature space (here, sweetness and crunch). “Similar” means a short straight-line (Euclidean) distance between two points, the very same distance idea behind k-nearest-neighbors. The recommender ranks every other item by distance to your pick and hands back the nearest few. Content-based recommending needs no other users at all, which is why it works even on day one, sidestepping the famous “cold-start” problem that ratings-based systems struggle with.
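The ranking step can be sketched in a few lines of Python. This is a minimal illustration, not the simulation's actual code: the fruit names beyond those mentioned on this page, the 0 to 10 scale, and every feature value are invented, and `recommend` is a hypothetical helper.

```python
import math

# Hypothetical (sweetness, crunch) profiles on a 0-10 scale.
# All values are invented for illustration only.
FRUITS = {
    "apple": (6, 8),
    "pear": (7, 6),
    "grape": (8, 5),
    "banana": (9, 1),
    "lemon": (2, 3),
    "lime": (1, 3),
}


def recommend(liked, count=3, items=FRUITS):
    """Return the `count` items nearest to `liked` as (name, distance) pairs."""
    target = items[liked]
    distances = [
        (name, math.dist(target, features))  # straight-line (Euclidean) distance
        for name, features in items.items()
        if name != liked  # never recommend the item itself
    ]
    distances.sort(key=lambda pair: pair[1])  # nearest first
    return distances[:count]


for name, distance in recommend("apple"):
    print(f"{name}: {distance:.2f}")
```

With these made-up numbers, liking an apple puts the pear first, and liking a lime puts the lemon first. Different feature values or a different distance measure would change the order.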

Why care

Recommenders quietly run much of the internet — the next video, the next song, the “customers also bought” row. Understanding the content-based version shows the honest core of the idea: it is not magic or mind-reading, just a tidy measurement of how alike two things are.

The idea, intuitively

Six fruits are plotted by how sweet and how crunchy they are. Tell the recommender which one you like and it draws a line to every other fruit, then ranks them: the nearest is the strongest suggestion. Like an apple and it points you to a pear; like a lime and it sends you to a lemon, not a sweet banana. Closeness is the recommendation.

Peek at the data first

Just a tiny table of items and their features, the same summary describe_data would show, with no ratings column at all. The whole engine is the distance between these profiles.

Try it

Choose a fruit in You like (or click a dot on the chart). The green lines and the ranked list show your top suggestions, each with a match % that turns distance into a friendly score. Slide How many to suggest to widen or narrow the picks.
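One way a distance could be turned into a match % is sketched below. The page does not state the formula the simulation uses, so this linear scaling is an assumption: `match_percent`, the 0 to 10 feature scale, and the sample values are all hypothetical.

```python
import math


def match_percent(a, b, scale=10.0):
    """Map the distance between two feature tuples to a 0-100 score.

    Assumed formula: 100 * (1 - distance / longest possible distance).
    """
    # Longest possible distance: opposite corners of the feature space.
    max_distance = math.sqrt(len(a)) * scale
    distance = math.dist(a, b)
    return 100.0 * (1.0 - distance / max_distance)


# Invented (sweetness, crunch) values for illustration.
apple, pear, banana = (6, 8), (7, 6), (9, 1)
print(round(match_percent(apple, pear)))    # closer pair, higher score
print(round(match_percent(apple, banana)))  # farther pair, lower score
```

Whatever the exact formula, the score is the distance in disguise: identical profiles give 100, and the number falls as the two points move apart.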

Where it shows up

“More like this.” Similar songs, articles, or products from item features alone.
Cold start. Content-based picks work for brand-new items with zero ratings yet.
Hybrid systems. Real sites blend this with what other people liked (collaborative filtering).
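The cold-start and hybrid ideas above can be sketched together. A weighted average is only one simple way to blend the two signals, chosen here for illustration; real sites use many different schemes. `hybrid_score` and its `weight` parameter are hypothetical, and both scores are assumed to lie between 0 and 1.

```python
def hybrid_score(content_score, collab_score=None, weight=0.5):
    """Blend a content-based score with a ratings-based one.

    `weight` is the share given to the content-based score.
    """
    if collab_score is None:
        # Cold start: nobody has rated this item yet,
        # so the content-based score is all there is.
        return content_score
    return weight * content_score + (1 - weight) * collab_score


print(hybrid_score(0.8))            # brand-new item, content only
print(hybrid_score(0.8, 0.4, 0.5))  # established item, equal blend
```

A brand-new item still receives a usable score from its features alone, and the ratings signal joins in once it exists.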

Where it came from

Automated recommending grew up in the 1990s: Xerox PARC’s Tapestry (1992) and the GroupLens project (1994) pioneered collaborative filtering, while content-based methods drew directly on decades of information-retrieval work on measuring how alike two documents are. The 2006 Netflix Prize later turned recommenders into headline news.

Try it in code

In the Studio, a recommender learns each item’s profile, then recommend ranks the rest by closeness to the one you name. It is the same move you just made by hand, here using the sweetness and size columns:

data = load "fruits"
model = make_model "recommender"
train_model model, on: data, predict: "type", using: ["sweetness", "size"]
recommend model, like: "apple", count: 3


Check your understanding

How does the recommender decide which items are “similar”?
Why can a content-based recommender suggest a brand-new item that nobody has rated yet?
What does the “match %” really measure, under the friendly number?