C++ In Practice - Trie Implementation
Trie (Prefix Tree) in C++11
A Trie is a tree optimized for string retrieval. Unlike a Binary Search Tree (O(L \log N)), a Trie processes a string in O(L) time, independent of the number of stored words (N).
In C++, Tries require manual memory management. Because Tries allocate thousands of small nodes, they are prone to memory fragmentation and cache misses. In high-performance scenarios, a "Static Trie" (allocated in a single std::vector) is preferred, but for didactic purposes, the dynamic pointer-based approach is standard.
Data Structure Design
The Node:
- Array vs. Map: For standard problems (lowercase English), use
TrieNode* children[26]. It guarantees O(1) transitions.std::unordered_mapadds hashing overhead. - Destruction: C++ requires a destructor to recursively free memory, or the use of
std::unique_ptr.
#include <vector>
#include <string>
struct TrieNode {
// Array for 'a'-'z'.
// Use raw pointers for teaching; use smart pointers for production.
TrieNode* children[26];
bool isEnd;
TrieNode() {
isEnd = false;
// Efficiently zero out the array
for (int i = 0; i < 26; ++i) children[i] = nullptr;
}
~TrieNode() {
for (int i = 0; i < 26; ++i) {
if (children[i]) delete children[i];
}
}
};
1. Basic Implementation (Insert, Search, StartsWith)
Complexity: Time O(L) | Space O(N \cdot L).
class Trie {
private:
TrieNode* root;
public:
Trie() {
root = new TrieNode();
}
// RAII: Clean up memory when Trie goes out of scope
~Trie() {
delete root;
}
void insert(std::string word) {
TrieNode* node = root;
for (char c : word) {
int index = c - 'a';
if (!node->children[index]) {
node->children[index] = new TrieNode();
}
node = node->children[index];
}
node->isEnd = true;
}
bool search(std::string word) {
TrieNode* node = traverse(word);
return node != nullptr && node->isEnd;
}
bool startsWith(std::string prefix) {
return traverse(prefix) != nullptr;
}
private:
TrieNode* traverse(const std::string& s) {
TrieNode* node = root;
for (char c : s) {
int index = c - 'a';
if (!node->children[index]) {
return nullptr;
}
node = node->children[index];
}
return node;
}
};
2. Advanced: Word Search II (Backtracking + Trie)
Problem: Find all words from a dictionary in a 2D board.
Optimization:
- Embed the Word: Store the full string at the leaf node of the Trie. This removes the need to pass a
string pathduring recursion, saving massive copying overhead. - Pruning: Once a word is found, remove the "End of Word" marker (or the node itself) to prevent finding duplicate instances of the same word.
#include <vector>
#include <string>
class Solution {
struct TrieNode {
TrieNode* children[26] = {nullptr};
std::string* word = nullptr; // Pointer to the actual word string
~TrieNode() {
for(auto c : children) if(c) delete c;
}
};
void insert(TrieNode* root, std::string& s) {
TrieNode* curr = root;
for (char c : s) {
int idx = c - 'a';
if (!curr->children[idx]) curr->children[idx] = new TrieNode();
curr = curr->children[idx];
}
curr->word = &s; // Store pointer to string
}
public:
std::vector<std::string> findWords(std::vector<std::vector<char>>& board,
std::vector<std::string>& words) {
TrieNode root; // Stack allocated root, auto-cleans up children
for (auto& w : words) insert(&root, w);
std::vector<std::string> result;
int m = board.size();
int n = board[0].size();
for (int i = 0; i < m; ++i) {
for (int j = 0; j < n; ++j) {
dfs(board, i, j, &root, result);
}
}
return result;
}
private:
void dfs(std::vector<std::vector<char>>& board, int r, int c,
TrieNode* node, std::vector<std::string>& result) {
char ch = board[r][c];
if (ch == '#' || !node->children[ch - 'a']) return;
TrieNode* nextNode = node->children[ch - 'a'];
// Found a word?
if (nextNode->word) {
result.push_back(*nextNode->word);
nextNode->word = nullptr; // Avoid duplicates
}
// Backtracking
board[r][c] = '#'; // Mark visited
int dirs[4][2] = {{0,1}, {0,-1}, {1,0}, {-1,0}};
for (auto& d : dirs) {
int nr = r + d[0];
int nc = c + d[1];
if (nr >= 0 && nr < board.size() && nc >= 0 && nc < board[0].size()) {
dfs(board, nr, nc, nextNode, result);
}
}
board[r][c] = ch; // Restore
}
};
3. Autocomplete (Design Problem)
Problem: Input characters one by one, return top 3 hot sentences.
Architecture:
- Trie: Stores sentences. Each node needs to know the "Top 3" sentences passing through it (Cache) or we traverse down to find them.
- HashMap: To update frequencies of full sentences quickly.
- Optimization: Storing the top 3 results in every TrieNode trades space for O(1) query time.
#include <unordered_map>
#include <vector>
#include <string>
#include <algorithm>
class AutocompleteSystem {
struct TrieNode {
std::unordered_map<char, TrieNode*> children;
std::unordered_map<std::string, int> counts; // Sentence -> Freq
};
TrieNode* root;
TrieNode* curr;
std::string inputStr;
public:
AutocompleteSystem(std::vector<std::string>& sentences, std::vector<int>& times) {
root = new TrieNode();
curr = root;
inputStr = "";
for (size_t i = 0; i < sentences.size(); ++i) {
insert(sentences[i], times[i]);
}
}
void insert(const std::string& s, int time) {
TrieNode* node = root;
for (char c : s) {
if (!node->children.count(c)) node->children[c] = new TrieNode();
node = node->children[c];
node->counts[s] += time; // Update frequency at every step
}
}
std::vector<std::string> input(char c) {
if (c == '#') {
insert(inputStr, 1);
inputStr = "";
curr = root;
return {};
}
inputStr += c;
if (curr && curr->children.count(c)) {
curr = curr->children[c];
} else {
curr = nullptr;
return {};
}
// Extract candidates from the current node
// In a real system, you'd maintain a pre-sorted list in the node
// instead of sorting a map every time.
std::vector<std::pair<int, std::string>> candidates;
for (auto& p : curr->counts) {
candidates.push_back({p.second, p.first});
}
auto comp = [](const std::pair<int, std::string>& a, const std::pair<int, std::string>& b) {
return a.first > b.first || (a.first == b.first && a.second < b.second);
};
// Partial sort to get top 3
int k = std::min((int)candidates.size(), 3);
std::partial_sort(candidates.begin(), candidates.begin() + k, candidates.end(), comp);
std::vector<std::string> res;
for (int i = 0; i < k; ++i) res.push_back(candidates[i].second);
return res;
}
};
STL Alternative: std::set (Ordered)
While C++ has no Trie, std::set<string> keeps strings sorted lexicographically.
You can perform a pseudo-prefix search using lower_bound.
Example: set = {"apple", "app", "apricot", "banana"}
Query Prefix: "apr"
lower_bound("apr")to points to"apricot".- Check if iterator valid and string starts with
"apr".
Limitations:
- Complexity: O(L log N). (Trie is O(L)).
- If N (dictionary size) is huge, Trie wins. If N is small,
std::setis cleaner code.
Summary: Trie vs. Others
| Feature | Trie |
unordered_map<string, T> |
set<string> |
|---|---|---|---|
| Exact Search | O(L) | O(L) (Avg) | O(L \log N) |
| Prefix Search | O(L) | O(N \cdot L) (Scan all) | O(L \log N) |
| Space | High (Nodes) | Medium (Strings) | Medium (Strings) |
| Autocomplete | Excellent | Poor | Decent (via Iterators) |