
Embeddings: a map of meaning

What it is

An embedding turns each word into a point on a map — really, into a list of numbers. The trick is where the points land: words that mean similar things sit close together, and words that mean different things sit far apart. So “king” and “queen” are neighbours, while “banana” is off in another part of the map entirely.

Go deeper: here the map is 2-D so we can see it, but real embeddings use hundreds of numbers per word. The magic is that directions carry meaning. The arrow from “man” to “king” means roughly “make it royal,” and that same arrow added to “woman” lands close to “queen.” That’s why king − man + woman ≈ queen works: the result is rarely an exact hit, but “queen” is usually the nearest word once the three input words are set aside.
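The word math above can be sketched in a few lines of Python. The coordinates below are invented for illustration (they do not come from a trained model), and the helper name `analogy` is hypothetical. The sketch subtracts and adds coordinates, then picks the closest remaining word.

```python
import math

# Toy 2-D coordinates, made up for this example (an assumption, not real data).
# Axis 0 loosely means "royalty", axis 1 loosely means "gender".
toy_map = {
    "king":   (0.9, 0.1),
    "queen":  (0.9, 0.9),
    "man":    (0.1, 0.1),
    "woman":  (0.1, 0.9),
    "banana": (-0.8, 0.5),
}

def analogy(a, b, c, vectors):
    """Return the word nearest to a - b + c, skipping the three input words."""
    target = tuple(x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c]))
    candidates = [w for w in vectors if w not in (a, b, c)]
    # Straight-line (Euclidean) distance decides which candidate is closest.
    return min(candidates, key=lambda w: math.dist(vectors[w], target))

result = analogy("king", "man", "woman", toy_map)
print(result)
```

With real embeddings the same steps apply, only with hundreds of coordinates per word instead of two.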

Why care

Computers can’t do math on the letters “c-a-t.” Embeddings give every word a numeric home so a model can measure how related two words are, group them, and reason about words it has never seen paired before. They power search (“find pages that mean this”, not just match the words), recommendations, translation, and the language models behind chat.
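One common way to measure how related two words are is cosine similarity, which compares the directions of two vectors. The sketch below assumes three made-up vectors for "cat", "dog", and "car"; the numbers are illustrative only, and the function name `cosine_similarity` is our own.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between u and v: near 1 means same direction."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Invented 3-number embeddings (an assumption for illustration).
cat = (0.8, 0.6, 0.1)
dog = (0.7, 0.7, 0.2)
car = (-0.5, 0.1, 0.9)

cat_dog = cosine_similarity(cat, dog)
cat_car = cosine_similarity(cat, car)
print(f"cat vs dog: {cat_dog:.2f}")
print(f"cat vs car: {cat_car:.2f}")
```

A score close to 1 means the two words point the same way on the map; a score near 0 or below means they are unrelated in this toy space.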

The idea, intuitively

Imagine seating everyone at a party so friends end up near friends. Without anyone wearing a label, you’d see clusters form — the soccer crowd here, the musicians there. A model does the same with words, just by reading which words tend to show up in the same sentences. Words used in similar ways drift together; the map of meaning falls out on its own.
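The "company a word keeps" can be made concrete by counting which words share sentences. This is a minimal sketch with five invented sentences; the helper names `cooccurrence_counts` and `shared_company` are hypothetical. Real systems such as word2vec turn this kind of signal into dense vectors through training, so the counting here is only the raw ingredient, not the full method.

```python
from collections import Counter
from itertools import combinations

# A tiny invented corpus (an assumption for illustration).
sentences = [
    "the cat chased the mouse",
    "the dog chased the ball",
    "the cat ate the fish",
    "the dog ate the bone",
    "she peeled the banana",
]

def cooccurrence_counts(sentences):
    """For each word, count how often every other word shares a sentence with it."""
    counts = {}
    for sentence in sentences:
        words = set(sentence.split())
        for w in words:
            counts.setdefault(w, Counter())
        for a, b in combinations(sorted(words), 2):
            counts[a][b] += 1
            counts[b][a] += 1
    return counts

def shared_company(a, b, counts):
    """Number of distinct context words that both a and b appear with."""
    return len((set(counts[a]) & set(counts[b])) - {a, b})

counts = cooccurrence_counts(sentences)
cat_dog = shared_company("cat", "dog", counts)
cat_banana = shared_company("cat", "banana", counts)
print(cat_dog, cat_banana)
```

In this toy corpus "cat" and "dog" share more context words than "cat" and "banana", which is the pull that would draw them together on the map.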

Peek at the data first

Each word is stored as a few numbers — its coordinates on the map. We never tell the model the “group” column; it’s only here so you can check that similar words really did land together.

Try it

Choose a word in Show words near (or click any dot) and slide How many neighbours to light up its closest words. Then tick Try word math and build an analogy like king − man + woman — watch the gold dot land on the answer, with arrows showing the meaning directions being added.
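Lighting up a word's closest neighbours amounts to sorting the other words by distance and keeping the first few. The sketch below assumes a handful of invented 2-D points and a hypothetical helper `nearest_neighbours`; it is not the simulation's own code.

```python
import math

# Invented 2-D points (an assumption for illustration).
points = {
    "apple":  (0.9, 0.8),
    "pear":   (0.8, 0.9),
    "banana": (0.7, 0.6),
    "guitar": (-0.6, 0.7),
    "piano":  (-0.7, 0.8),
}

def nearest_neighbours(word, vectors, k):
    """Return the k words closest to `word`, nearest first."""
    others = [w for w in vectors if w != word]
    others.sort(key=lambda w: math.dist(vectors[word], vectors[w]))
    return others[:k]

neighbours = nearest_neighbours("apple", points, 2)
print(neighbours)
```

Raising `k` plays the same role as moving the How many neighbours slider: more of the sorted list is kept.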

Where it shows up

Search & recommendations. Find items that mean the same thing, even when they share no exact words.
Language models. One of the first things a chat model does with your words is split them into pieces called tokens and turn each piece into an embedding.
Beyond words. The same idea maps images, songs, and products into spaces where “near” means “similar.”

Where it came from

The idea that “you shall know a word by the company it keeps” comes from linguist J. R. Firth (1957). It became practical when Tomáš Mikolov and colleagues at Google released word2vec (2013), which learned word maps from raw text and showed off the king − man + woman ≈ queen trick. Stanford’s GloVe (2014) followed, and today embeddings are the foundation layer of nearly every language model.

Try it in code

Embeddings are just vectors, and you already met vectors and distance. In the Studio you can place points and measure how close they are — the same “near means similar” idea:

data = load "fruits"
describe_data data

model = make_model "recommender"
train_model model, on: data, predict: "type", using: ["sweetness", "size"]
recommend model, like: "apple", count: 3


Check your understanding

What does it mean for two words to be “close” on an embedding map?
Why does king − man + woman land near queen?
How can a model build this map without ever being told which words are related?