Tries (Prefix Trees)

⏱️ ~3-minute bite · solve the sandbox to master

0%lesson

🧒

5-Year-Old Metaphor

— The physical, real-world picture. No jargon.

🗂️ A branching filing cabinet where every shared prefix shares the same drawer.

The core idea

Store the words ["car", "cart", "cat", "bat", "ball"]in a trie. "cat" and "car" share the "ca" drawer. You only need one "c" node, one "a" node — then they branch. A node with a star (★) means a complete word ends there.

Trie for ["car","cart","cat","bat","ball"]

root

├── c

│ └── a

│ ├── r ★ (car)

│ │ └── t ★ (cart)

│ └── t ★ (cat)

└── b

└── a

├── t ★ (bat)

└── l

└── l ★ (ball)

Why search is O(L) regardless of dictionary size

To look up "cat" you traverse exactly 3 nodes: c → a → t. It doesn't matter if the trie has 100 words or 10 million — you only ever walk L steps where L is the length of the query word. A hash map gives O(1) average, but can't do prefix matching. A sorted array gives O(log n) per lookup, but prefix search requires extra work. A trie gives O(L) for both.

When to reach for a trie

Problem involves prefix matching or autocomplete
You need to find all words starting with a given prefix
You need to check if any word in a set starts with a given string
IP routing — longest prefix match in routing tables

🎛️

Interactive Sandbox

— Move something, see it react instantly.

How to use: pick an operation below, then step through it. The violet path shows your traversal. Nodes with ★ are word-ends. Green = newly created. Red = pruned. Explore all 5 operations to master.

insert 'care'Step 1 / 6

root

r★

t★

l★

[TRAVERSE]Start at root. We insert 'care' char by char.

🎯

Challenge

Explore all 5 trie operations: Insert, Search, Prefix, Delete, Autocomplete.

Try it

🎯

Why Should I Care?

— The exact interview question + the bug it kills.

Interview question: Trie vs HashMap for autocomplete?

A HashMap<String, Boolean> can answer "does this exact word exist?" in O(1), but to find all words with prefix "ca" you must scan every key — O(n) where n is dictionary size. A trie traverses only the relevant subtree, making prefix enumeration O(P + k) where P = prefix length and k = number of matches.

1	class TrieNode {
2	children = new Map<string, TrieNode>();
3	isEnd = false;
4	}
5	class Trie {
6	root = new TrieNode();
7
8	insert(word: string) {
9	let node = this.root;
10	for (const c of word) {
11	if (!node.children.has(c))
12	node.children.set(c, new TrieNode());
13	node = node.children.get(c)!;
14	}
15	node.isEnd = true;
16	}
17
18	search(word: string): boolean {
19	const node = this.#traverse(word);
20	return node?.isEnd ?? false; // must check isEnd!
21	}
22
23	startsWith(prefix: string): boolean {
24	return this.#traverse(prefix) !== null;
25	}
26
27	#traverse(s: string): TrieNode \| null {
28	let node: TrieNode = this.root;
29	for (const c of s) {
30	if (!node.children.has(c)) return null;
31	node = node.children.get(c)!;
32	}
33	return node;
34	}
35	}
36	// Time: O(L) per op · Space: O(alphabet × total_chars)

Interview question: Space complexity — when is a trie wasteful?

Each node holds a Map (or array of 26) of children. Worst case: n words with no shared prefixes (e.g. random UUIDs). You get O(n × L) nodes — similar to storing the strings themselves but with per-node overhead. A trie shines when the vocabulary has lots of shared prefixes. Radix trees (compressed tries) merge single-child chains to cut space dramatically.

Interview question: Count words with a given prefix?

Store a count field on each node incremented during insert. A prefix query becomes a single traversal returning the count at the prefix node — O(L). No DFS needed. This is a common follow-up after the basic trie implementation.

Classic bug: forgetting the isEnd check

If you return true whenever traverse(word)returns a non-null node, you get false positives. Searching for "ca" would return true even if only "car" and "cat" are in the trie, because the "a" node exists. Always check node.isEnd for exact search.

🔬

The Deep Dive

— Spec refs, engine internals, the minutiae.

Compressed trie (Radix tree / Patricia trie)

A plain trie wastes nodes on single-child chains. The word "ball" creates b → a → l → l — four nodes, three of which have only one child. A radix tree merges single-child chains into edge labels: b → a → ll★. Lookup is the same O(L) but with far fewer node allocations. This is how most production implementations work.

Radix tree for ["car","cart","cat"]

root

└── ca

├── r ★ (car)

│ └── t ★ (cart)

└── t ★ (cat)

Aho-Corasick: multi-pattern string search

A trie matches one pattern at a time. Aho-Corasick adds "failure links" — when a match fails at a node, jump to the longest proper suffix that is a valid prefix in the trie. This lets you search a text for all k patterns simultaneously in O(n + m + z) where n = text length, m = total pattern length, z = number of matches. Used in antivirus engines and network intrusion detection.

Suffix trie vs Trie

A trie indexes a set of words. A suffix trie indexes all suffixes of a single string — enabling O(L) substring search in a long document. Suffix tries are huge (O(n²) nodes for a string of length n), so in practice you use a suffix array + LCP array for the same queries with O(n log n) build time and O(n) space.

IP routing — longest prefix match

IPv4 routing tables store CIDR prefixes (e.g. 192.168.1.0/24). To forward a packet, you need the most specific (longest) matching prefix — not just any match. A binary trie over the 32-bit address gives O(32) = O(1) lookup with exact longest-prefix semantics. Hardware routers implement this in silicon as TCAM (ternary content-addressable memory), but the conceptual model is a binary trie.

Real-world appearances

Google / browser autocomplete — trie over typed prefix; top-k results ranked by weight
Spell checkers — traverse with bounded edit distance (BK-tree or trie + DFS with budget)
DNS resolution — hierarchical label lookup is conceptually a trie (. → com → example)
Git object addressing — object IDs indexed by hex prefix in .git/objects/

🎤

Interview Questions

— Real questions from real interviews — with answers.

All three are O(m) where m is the length of the word — independent of how many words are stored.

Traverse to the prefix node, then DFS/BFS to collect all words in that subtree.

Trie uses more memory but enables O(m) prefix queries; hashset is compact but can't answer 'does any word start with X?'

BFS/DFS with an edit-distance budget — explore neighbors within bounded Levenshtein distance.

Radix tree collapses single-child chains into one edge — reduces node count from O(total chars) to O(n * key branches).

🎮

Memory Game

— Quick quiz — lock the concept in long-term memory.

1/4

What is a suffix trie used for, and why isn't it used in production?