C++ In Practice - Combined Data Structures
Advanced O(1) Data Structures (System Design Primitives)
These problems are not merely algorithmic; they are micro-architecture design challenges. They require composing STL containers to bypass their individual limitations (e.g., combining a Hash Map's lookup with a Linked List's ordering).
1. LFU Cache (Least Frequently Used)
Requirement: O(1) get and put. Evict least frequent; tie-break with least recent.
Architecture:
- Frequencies: A Hash Map mapping
Frequency -> DoublyLinkedList<Key>. - Positions: A Hash Map mapping
Key -> Iterator(pointing inside the lists above). - Values: A Hash Map mapping
Key -> {Value, Frequency}. - MinFreq: An integer tracking the minimum frequency in the system to allow O(1) eviction.
C++11 Implementation Strategy:
We use std::list for the doubly linked list. The critical trick is maintaining minFreq. When a key is accessed, we move it from freq list to freq + 1 list. If freq list becomes empty and freq == minFreq, we increment minFreq.
#include <unordered_map>
#include <list>
#include <iterator>
class LFUCache {
int capacity;
int minFreq;
// Key -> {Value, Freq}
std::unordered_map<int, std::pair<int, int>> keyValFreq;
// Freq -> List of Keys (Front is MRU, Back is LRU for that freq)
std::unordered_map<int, std::list<int>> freqList;
// Key -> Iterator to the node in freqList
std::unordered_map<int, std::list<int>::iterator> keyIter;
public:
LFUCache(int capacity) : capacity(capacity), minFreq(0) {}
int get(int key) {
if (keyValFreq.find(key) == keyValFreq.end()) return -1;
// Update frequency
updateFreq(key);
return keyValFreq[key].first;
}
void put(int key, int value) {
if (capacity == 0) return;
if (keyValFreq.find(key) != keyValFreq.end()) {
keyValFreq[key].first = value;
updateFreq(key);
return;
}
if (keyValFreq.size() >= capacity) {
// Evict LFU (and LRU if tie)
// The back of the list is the LRU for that frequency
int evictKey = freqList[minFreq].back();
freqList[minFreq].pop_back();
keyValFreq.erase(evictKey);
keyIter.erase(evictKey);
}
// Insert new key
minFreq = 1;
keyValFreq[key] = {value, 1};
freqList[1].push_front(key);
keyIter[key] = freqList[1].begin();
}
private:
void updateFreq(int key) {
int currFreq = keyValFreq[key].second;
// Remove from current freq list
freqList[currFreq].erase(keyIter[key]);
// Update minFreq if necessary
if (freqList[currFreq].empty()) {
freqList.erase(currFreq); // Cleanup empty list
if (minFreq == currFreq) minFreq++;
}
// Move to next freq list
int nextFreq = currFreq + 1;
keyValFreq[key].second = nextFreq;
freqList[nextFreq].push_front(key);
keyIter[key] = freqList[nextFreq].begin();
}
};
2. Insert Delete GetRandom O(1)
Requirement: Strict O(1) average time.
Challenge: Arrays allow O(1) random access but O(N) deletion (shifting). Hash Maps allow O(1) deletion but no random access.
Solution: The "Swap-and-Pop" Idiom.
- Store values in
std::vector(for random index access). - Store
Value -> Indexinstd::unordered_map. - Delete: Swap the target element with the last element of the vector. Update the map index for the moved element. Pop the back.
#include <vector>
#include <unordered_map>
#include <cstdlib> // for rand()
class RandomizedSet {
std::vector<int> nums;
std::unordered_map<int, int> valToIndex;
public:
RandomizedSet() {}
bool insert(int val) {
if (valToIndex.count(val)) return false;
nums.push_back(val);
valToIndex[val] = nums.size() - 1;
return true;
}
bool remove(int val) {
if (!valToIndex.count(val)) return false;
int indexToRemove = valToIndex[val];
int lastVal = nums.back();
// Swap Logic:
// 1. Move lastVal to the hole
nums[indexToRemove] = lastVal;
valToIndex[lastVal] = indexToRemove;
// 2. Remove the last element
nums.pop_back();
valToIndex.erase(val);
return true;
}
int getRandom() {
return nums[rand() % nums.size()];
}
};
3. All Oone Data Structure
Requirement: Increment/Decrement Key count, GetMax, GetMin in O(1).
Architecture:
- Buckets: A Doubly Linked List where each node represents a
Countand contains aSetof keys with that count. Sorted by count. - Mapping:
unordered_map<Key, Iterator_to_Bucket>.
Implementation Nuance:
When incrementing, move key from CurrentBucket to NextBucket. If NextBucket (count+1) doesn't exist, create it. std::list::insert inserts before, so to insert after, use std::next(it).
#include <unordered_map>
#include <unordered_set>
#include <list>
#include <string>
class AllOne {
struct Bucket {
int count;
std::unordered_set<std::string> keys;
};
std::list<Bucket> buckets;
std::unordered_map<std::string, std::list<Bucket>::iterator> bucketOfKey;
public:
void inc(std::string key) {
// If key doesn't exist, insert into count=1 bucket
if (!bucketOfKey.count(key)) {
if (buckets.empty() || buckets.front().count != 1) {
buckets.push_front({1, {}});
}
buckets.front().keys.insert(key);
bucketOfKey[key] = buckets.begin();
return;
}
// Key exists, move to next bucket
auto curr = bucketOfKey[key];
auto next = std::next(curr);
if (next == buckets.end() || next->count != curr->count + 1) {
next = buckets.insert(next, {curr->count + 1, {}});
}
next->keys.insert(key);
bucketOfKey[key] = next;
curr->keys.erase(key);
if (curr->keys.empty()) buckets.erase(curr);
}
void dec(std::string key) {
if (!bucketOfKey.count(key)) return;
auto curr = bucketOfKey[key];
auto prev = buckets.end();
if (curr != buckets.begin()) prev = std::prev(curr);
bucketOfKey.erase(key); // Assuming removal unless moved
if (curr->count > 1) {
if (curr == buckets.begin() || prev->count != curr->count - 1) {
prev = buckets.insert(curr, {curr->count - 1, {}});
}
prev->keys.insert(key);
bucketOfKey[key] = prev;
}
curr->keys.erase(key);
if (curr->keys.empty()) buckets.erase(curr);
}
std::string getMaxKey() {
return buckets.empty() ? "" : *buckets.back().keys.begin();
}
std::string getMinKey() {
return buckets.empty() ? "" : *buckets.front().keys.begin();
}
};
4. Time Based Key-Value Store
Requirement: set(key, val, time), get(key, time). Get returns value at time_prev <= time.
Logic: Since set calls are strictly chronological (usually), the storage for each key is sorted. Use upper_bound.
#include <unordered_map>
#include <vector>
#include <string>
#include <algorithm>
class TimeMap {
// Key -> Vector of {Timestamp, Value}
std::unordered_map<std::string, std::vector<std::pair<int, std::string>>> store;
public:
void set(std::string key, std::string value, int timestamp) {
store[key].push_back({timestamp, value});
}
std::string get(std::string key, int timestamp) {
if (!store.count(key)) return "";
const auto& vec = store[key];
// Binary Search: Find first element > timestamp
// Compare pair with int? Need custom comparator or dummy pair.
// Easiest: Use a dummy pair with MAX string so strictly >
auto it = std::upper_bound(vec.begin(), vec.end(),
std::make_pair(timestamp, std::string(1, (char)127)));
if (it == vec.begin()) return "";
// Move back one step to get <= timestamp
return std::prev(it)->second;
}
};
5. Snapshot Array
Requirement: set(index, val), snap(), get(index, snap_id).
Optimization: Storing a full copy on snap() is O(N) memory/time. Too slow. We store Deltas.
Structure: vector<vector<pair<snap_id, value>>>.
Logic: get performs binary search on the history of that specific index.
#include <vector>
#include <algorithm>
#include <map>
class SnapshotArray {
int snapId = 0;
// Index -> List of {SnapID, Value}
std::vector<std::vector<std::pair<int, int>>> history;
public:
SnapshotArray(int length) {
history.resize(length);
// Initialize all indices with 0 at snap -1 or handle in get
for(auto& h : history) {
h.push_back({-1, 0});
}
}
void set(int index, int val) {
// Optimization: If value at current snapId already exists, overwrite it.
// Otherwise, push new entry.
if (history[index].back().first == snapId) {
history[index].back().second = val;
} else {
history[index].push_back({snapId, val});
}
}
int snap() {
return snapId++;
}
int get(int index, int snap_id) {
const auto& vec = history[index];
// Find iterator to first element with ID > snap_id
auto it = std::upper_bound(vec.begin(), vec.end(), std::make_pair(snap_id, 2000000000));
// Return the element just before it (closest snap_id <= requested)
return std::prev(it)->second;
}
};
Summary of Advanced Structures
| Problem | STL Primitives | Key Technique | Complexity |
|---|---|---|---|
| LFU Cache | unordered_map + list |
Iterator Map. Direct access to list nodes. | O(1) |
| Random Set | unordered_map + vector |
Swap-and-Pop. | O(1) |
All Oone |
list<Bucket> + unordered_map |
Bucket List. Move keys between buckets. | O(1) |
| Time Map | unordered_map + vector |
Binary Search (upper_bound). |
O(\log N_{key}) |
| Snapshot | vector<vector<pair>> |
Delta Storage + Binary Search. | O(\log N_{snaps}) |
Here are 5 additional "Architectural Class" problems. These problems, like the LFU Cache or Snapshot Array, require composing standard STL containers to create a custom data structure with strict time complexity requirements.
1. Range Module (Dynamic Interval Management)
Mechanism: Track a set of half-open intervals [left, right). Support adding a range, removing a range, and querying if a range is fully tracked. All operations should be roughly O(\log N).
Real-world Analogy: Memory Management Unit (MMU) tracking allocated memory pages.
Implementation Strategy:
- Data Structure:
std::map<int, int>mappingStart -> End. - Invariant: No two intervals in the map overlap or touch. If we insert
[1, 5)and[5, 10), they merge into[1, 10). - Logic:
- Add: Use
upper_boundto find the first interval that starts after our new range starts. Move iterator backward to check for overlap. Merge overlapping intervals into one, deleting the subsumed ones. - Remove: Similar to Add, but "cuts" holes in existing intervals. This might result in one interval splitting into two.
- Query:
upper_boundto find the interval starting \le query.start. Check if it covers query.end.
- Add: Use
#include <map>
#include <algorithm>
class RangeModule {
// Map: Start -> End
std::map<int, int> intervals;
public:
void addRange(int left, int right) {
auto it = intervals.upper_bound(left);
// Check previous interval: does it overlap or touch?
if (it != intervals.begin()) {
auto prev = std::prev(it);
if (prev->second >= left) {
left = prev->first;
right = std::max(right, prev->second);
intervals.erase(prev);
}
}
// Check next intervals: do they overlap?
while (it != intervals.end() && it->first <= right) {
right = std::max(right, it->second);
it = intervals.erase(it);
}
intervals[left] = right;
}
bool queryRange(int left, int right) {
auto it = intervals.upper_bound(left);
if (it == intervals.begin()) return false;
// Check the interval strictly to the left of our query start
return std::prev(it)->second >= right;
}
void removeRange(int left, int right) {
auto it = intervals.upper_bound(left);
if (it != intervals.begin()) {
auto prev = std::prev(it);
if (prev->second > left) {
// Determine if we need to split the previous interval or just trim right
int originalEnd = prev->second;
prev->second = left; // Trim
if (originalEnd > right) {
intervals[right] = originalEnd; // Split part
return;
}
}
}
while (it != intervals.end() && it->first < right) {
if (it->second <= right) {
// Fully covered, remove
it = intervals.erase(it);
} else {
// Partially covered, trim left
int end = it->second;
intervals.erase(it);
intervals[right] = end;
break;
}
}
}
};
2. Design In-Memory File System
Mechanism: Support ls, mkdir, addContent, readContent.
Architecture: A Trie-like tree structure, but nodes are Directories or Files.
Key Challenge: Clean separation of File vs Directory logic.
Implementation Strategy:
- Struct:
Nodecontainsbool isFile,string content, andmap<string, Node*> children. - Map: Using
std::mapautomatically keeps thelsoutput sorted lexicographically (a common requirement). - Traversal: A helper function
findNode(path)splits the string by/and walks the tree.
#include <vector>
#include <string>
#include <map>
#include <sstream>
#include <algorithm>
class FileSystem {
struct Node {
bool isFile = false;
std::string content = "";
std::map<std::string, Node*> children;
};
Node* root;
std::vector<std::string> split(const std::string& path) {
std::vector<std::string> parts;
std::stringstream ss(path);
std::string item;
while (std::getline(ss, item, '/')) {
if (!item.empty()) parts.push_back(item);
}
return parts;
}
public:
FileSystem() {
root = new Node();
}
std::vector<std::string> ls(std::string path) {
auto parts = split(path);
Node* curr = root;
for (const auto& part : parts) curr = curr->children[part];
if (curr->isFile) return {parts.back()};
std::vector<std::string> res;
for (const auto& pair : curr->children) res.push_back(pair.first);
return res;
}
void mkdir(std::string path) {
auto parts = split(path);
Node* curr = root;
for (const auto& part : parts) {
if (!curr->children.count(part)) curr->children[part] = new Node();
curr = curr->children[part];
}
}
void addContentToFile(std::string filePath, std::string content) {
auto parts = split(filePath);
Node* curr = root;
for (const auto& part : parts) {
if (!curr->children.count(part)) curr->children[part] = new Node();
curr = curr->children[part];
}
curr->isFile = true;
curr->content += content;
}
};
3. Dinner Plate Stacks (Infinite Stacks with PopAtStack)
Mechanism: You have an infinite row of stacks. push goes to the leftmost non-full stack. pop comes from the rightmost non-empty stack. popAtStack(index) pops from a specific stack.
Real-world Analogy: Warehouse shelving.
Implementation Strategy:
- Container:
vector<stack<int>>stores the actual data. - Availability Tracker:
set<int> available(Min-Heap behavior) stores indices of stacks that have space. - Efficiency:
push:*available.begin()gives the leftmost index. O(\log N).pop:vector.back(). O(1).popAtStack: Direct access. Add index toavailableset since it now has space. O(\log N).
#include <vector>
#include <stack>
#include <set>
class DinnerPlates {
int capacity;
std::vector<std::stack<int>> stacks;
std::set<int> available; // Indices of non-full stacks
public:
DinnerPlates(int capacity) : capacity(capacity) {}
void push(int val) {
if (available.empty()) {
// No holes, create new stack
available.insert(stacks.size());
stacks.push_back({});
}
int idx = *available.begin(); // Leftmost available
stacks[idx].push(val);
if (stacks[idx].size() == capacity) {
available.erase(idx); // Full now
}
}
int pop() {
// Remove empty stacks from the back
while (!stacks.empty() && stacks.back().empty()) {
// Also remove from available set if it was there
// (Wait, if it's empty, it MUST be in available, but we are deleting the stack entirely)
available.erase(stacks.size() - 1);
stacks.pop_back();
}
if (stacks.empty()) return -1;
return popAtStack(stacks.size() - 1);
}
int popAtStack(int index) {
if (index >= stacks.size() || stacks[index].empty()) return -1;
int val = stacks[index].top();
stacks[index].pop();
// Since we popped, this stack definitely has space now
available.insert(index);
return val;
}
};
4. Design Text Editor (Cursor Management)
Mechanism: addText(text), deleteText(k), cursorLeft(k), cursorRight(k).
Architecture: The "Two-Stack" approach (often called a Gap Buffer concept).
- Left Stack: Stores characters to the left of the cursor.
- Right Stack: Stores characters to the right of the cursor.
- Movement: Moving left means popping from Left Stack and pushing to Right Stack.
C++11 Optimization:
Instead of std::stack (which is hard to print/iterate for result), use std::string or std::deque acting as stacks. string is contiguous and better for the "return last 10 chars" requirement common in this problem.
#include <string>
#include <algorithm>
#include <deque>
class TextEditor {
std::deque<char> left; // Chars to left of cursor
std::deque<char> right; // Chars to right of cursor (reversed logic or simply push_front)
public:
void addText(std::string text) {
for (char c : text) left.push_back(c);
}
int deleteText(int k) {
int count = 0;
while (k > 0 && !left.empty()) {
left.pop_back();
k--;
count++;
}
return count;
}
std::string cursorLeft(int k) {
while (k > 0 && !left.empty()) {
right.push_front(left.back());
left.pop_back();
k--;
}
return getPreview();
}
std::string cursorRight(int k) {
while (k > 0 && !right.empty()) {
left.push_back(right.front());
right.pop_front();
k--;
}
return getPreview();
}
std::string getPreview() {
// Return last 10 chars from left
std::string res = "";
int count = 10;
auto it = left.rbegin();
while (count > 0 && it != left.rend()) {
res += *it;
it++;
count--;
}
std::reverse(res.begin(), res.end());
return res;
}
};
5. Leaderboard / Rank System (Order Statistic Tree)
Mechanism: addScore(player, score), getRank(score), getScoreAtIndex(k).
Challenge: STL std::set / std::map does not support "get index of element" or "get element at index" in O(\log N). std::distance is O(N).
Solution:
- Ideal (Production): Policy Based Data Structure (PBDS) in GCC, or Augmenting a BST (store subtree size).
- Didactic (Standard STL): We cannot achieve pure O(\log N) for both
addandgetRankusing onlystd::set.- Alternative: Use
std::vector+std::sort(O(N \log N)) orFenwick Tree(if scores are integers in reasonable range). - Wait, we can approximate: If specific rank isn't required, but "Top K" is, we use
min-heap. - Design Choice: If the range of scores is small (e.g., 0-100), use
count[score]. - Approximation for this exercise: Let's implement a system that maintains a Gap Ranking using a Hash Map +
setof scores.
- Alternative: Use
Scenario: Design a system to get the K-th largest score in O(\log N) and update scores in O(\log N). This is technically impossible with standard C++ STL. However, we can design a Tweet Counts or Rate Limiter style system instead.
Let's Pivot to: Design a Rate Limiter (Sliding Window Log)
Mechanism: Allow limit requests in the last window seconds.
Architecture: unordered_map<user, deque<timestamp>>.
Logic:
- Get user's deque.
- Pop front while
timestamp < current - window. - Return
deque.size() < limit.
#include <unordered_map>
#include <deque>
class RateLimiter {
std::unordered_map<int, std::deque<int>> userRequests;
int timeWindow;
int maxRequests;
public:
RateLimiter(int window, int maxReq) : timeWindow(window), maxRequests(maxReq) {}
bool shouldAllow(int userId, int timestamp) {
std::deque<int>& q = userRequests[userId];
// 1. Clean up old requests
while (!q.empty() && q.front() <= timestamp - timeWindow) {
q.pop_front();
}
// 2. Check capacity
if (q.size() < maxRequests) {
q.push_back(timestamp);
return true;
}
return false;
}
};
Summary of New Problems
| Problem | Key Components | Complexity Constraint |
|---|---|---|
| Range Module | std::map (Interval Merging) |
O(\log N) ops. |
| File System | Trie (map children) |
O(L) where L is path length. |
| Dinner Plates | vector<stack> + set |
Push O(\log N), Pop O(1). |
| Text Editor | deque (Two Stacks) |
Cursor Move O(K), Insert O(1). |
| Rate Limiter | map + deque |
Amortized O(1) (Clean up). |