C++ In Practice - Hash Maps

HashMaps, Two Pointers, and Array Manipulation (C++11 Deep Dive)

We are using rigorous C++11 implementations, focusing on memory management, STL specifics (std::unordered_map vs std::map), and low-level efficiency.

1. Hash Table Fundamentals (`std::unordered_map`)

In C++11, std::unordered_map is the direct equivalent of a generic HashMap.

Under the Hood:

Data Structure: An array of "buckets". Each bucket is typically a linked list (Chaining) to handle collisions.
Hash Function: Uses std::hash<Key> to map keys to size_t hash codes.
Index Calculation: bucket_index = hash_code % bucket_count.
Rehashing: When load_factor() (elements / buckets) exceeds max_load_factor() (default 1.0), the container allocates a larger bucket array (usually approx 2x), re-calculates indices for all elements, and moves them. This is an O(N) operation.

Comparison:

std::map: Red-Black Tree. Sorted keys. O(log N) operations.
std::unordered_map: Hash Table. Unsorted. O(1) average, O(N) worst-case (all keys collide).

ASCII Memory Layout (Chaining):

[ std::unordered_map Memory Layout ]
+-------------------+
| Bucket 0 (nullptr)|
+-------------------+
| Bucket 1          |---> [ Pair<K,V> ] -> [ Pair<K,V> ] -> nullptr
+-------------------+
| Bucket 2 (nullptr)|
+-------------------+
| ...               |
+-------------------+

2. Two Sum (Unsorted)

Problem: Find two numbers that add up to target. Pattern: HashMap Lookup. Goal: O(N) Time, O(N) Space.

C++11 Implementation Details:

Use std::unordered_map<int, int> to store {value, index}.
Use find() to check existence. find() returns end() if not found.
Check complement existence before inserting current element to avoid using the same element twice (e.g., target 6, current 3).

#include <vector>
#include <unordered_map>
#include <stdexcept>

std::vector<int> twoSum(const std::vector<int>& nums, int target) {
    // Map: Key = number value, Value = index in vector
    std::unordered_map<int, int> seen;
    // Reserve to prevent rehashing overhead if N is large
    seen.reserve(nums.size()); 

    for (int i = 0; i < nums.size(); ++i) {
        int complement = target - nums[i];
        
        // Check if complement exists
        auto it = seen.find(complement);
        if (it != seen.end()) {
            // Found: return indices {complement_index, current_index}
            return {it->second, i};
        }
        
        // Insert current number and index
        seen[nums[i]] = i;
    }
    
    return {}; // Or throw exception if solution guaranteed
}

3. Two Sum II (Sorted Input)

Pattern: Converging Pointers. Goal: O(N) Time, O(1) Space.

C++11 Implementation Details:

Input is const std::vector<int>&.
Pointers are integer indices left and right.
No memory allocation required.

#include <vector>

std::vector<int> twoSumSorted(const std::vector<int>& numbers, int target) {
    int left = 0;
    int right = static_cast<int>(numbers.size()) - 1;

    while (left < right) {
        // Compute sum. Caution: potential integer overflow if numbers are large.
        // For standard interview questions, int is usually safe.
        long long sum = static_cast<long long>(numbers[left]) + numbers[right];

        if (sum == target) {
            // Problem often asks for 1-based indices
            return {left + 1, right + 1};
        }
        
        if (sum < target) {
            left++; // Need larger sum
        } else {
            right--; // Need smaller sum
        }
    }
    return {};
}

4. 3Sum

Pattern: Sorting + Two Pointers. Goal: O(N²) Time, O(1) Extra Space (excluding output).

Critical Logic:

Sort: Brings duplicates together and allows Two Pointer logic.
Dedup I: Outer loop skips identical starting numbers.
Dedup II: Inner while-loop skips identical left/right numbers after a match is found.

#include <vector>
#include <algorithm>

std::vector<std::vector<int>> threeSum(std::vector<int>& nums) {
    std::vector<std::vector<int>> result;
    if (nums.size() < 3) return result;

    // 1. Sort O(N log N)
    std::sort(nums.begin(), nums.end());

    // 2. Iterate
    for (int i = 0; i < nums.size() - 2; ++i) {
        // Skip duplicates for the first element
        if (i > 0 && nums[i] == nums[i - 1]) continue;

        int left = i + 1;
        int right = nums.size() - 1;
        int target = -nums[i];

        while (left < right) {
            int sum = nums[left] + nums[right];

            if (sum == target) {
                result.push_back({nums[i], nums[left], nums[right]});
                
                // Skip duplicates for the second element
                while (left < right && nums[left] == nums[left + 1]) left++;
                // Skip duplicates for the third element
                while (left < right && nums[right] == nums[right - 1]) right--;

                left++;
                right--;
            } else if (sum < target) {
                left++;
            } else {
                right--;
            }
        }
    }
    return result;
}

5. Group Anagrams

Pattern: HashMap with Sorted Key. Goal: O(N * K log K) Time, O(N * K) Space (N=num strings, K=max string length).

Under the Hood:

Key: The sorted version of the string (canonical form).
Value: std::vector<std::string> storing all anagrams.
C++11 Move Semantics: Crucial here to avoid copying strings repeatedly when pushing into the vector.

#include <vector>
#include <string>
#include <unordered_map>
#include <algorithm>

std::vector<std::vector<std::string>> groupAnagrams(std::vector<std::string>& strs) {
    std::unordered_map<std::string, std::vector<std::string>> map;

    for (const auto& s : strs) { // O(N)
        std::string key = s;
        std::sort(key.begin(), key.end()); // O(K log K)
        map[key].push_back(s); // Copy string into vector
    }

    std::vector<std::vector<std::string>> result;
    result.reserve(map.size());
    
    for (auto& pair : map) {
        // Move the vector from the map to result to avoid deep copy
        result.push_back(std::move(pair.second));
    }

    return result;
}

6. Product of Array Except Self

Pattern: Prefix and Suffix Arrays (Accumulation). Goal: O(N) Time, O(1) Extra Space (output array doesn't count).

Logic:

output[i] = (product of all left of i) * (product of all right of i).
Pass 1: Compute prefix products directly into output.
Pass 2: Compute suffix products on the fly and multiply into output.

#include <vector>

std::vector<int> productExceptSelf(const std::vector<int>& nums) {
    int n = nums.size();
    std::vector<int> output(n);

    // Pass 1: Prefix Products
    // output[i] contains product of nums[0]...nums[i-1]
    output[0] = 1;
    for (int i = 1; i < n; ++i) {
        output[i] = output[i - 1] * nums[i - 1];
    }

    // Pass 2: Suffix Products integrated
    // R tracks the product of elements to the right
    int R = 1;
    for (int i = n - 1; i >= 0; --i) {
        output[i] = output[i] * R;
        R *= nums[i];
    }

    return output;
}

7. Contains Duplicate

Pattern: HashSet. Goal: O(N) Time, O(N) Space.

Alternative: std::sort + std::unique (O(N log N) Time, O(1) Space). Here is the O(N) Time approach using std::unordered_set.

#include <vector>
#include <unordered_set>

bool containsDuplicate(const std::vector<int>& nums) {
    std::unordered_set<int> seen;
    seen.reserve(nums.size()); // Optimization to avoid rehashes

    for (int num : nums) {
        // insert returns a pair<iterator, bool>. 
        // second is false if element already existed.
        if (!seen.insert(num).second) {
            return true;
        }
    }
    return false;
}

8. Top K Frequent Elements

Pattern: HashMap + Min-Heap (Priority Queue). Goal: O(N log K) Time, O(N) Space.