W Code - Pattern-Based DSA Learning Platform

Bhanu Bisht

Back to DSA Theory

Hashing (Hash Tables)

Master the art of O(1) lookups - Hash functions, collisions, and interview problems.

Topics

Minutes

C++

Language

Hash Table Complexity at a Glance

Operation	Average	Worst Case	When Best?
Search	O(1)	O(n)	Low load factor
Insert	O(1)	O(n)	Good hash function
Delete	O(1)	O(n)	Few collisions
Space	O(n)

What is Hashing?

Average: O(1) for all operations

Hashing is a technique for mapping keys (like names or unique IDs) to indices in a table using a hash function.

Real-life example (Indian context): Think of your Aadhar number – a unique 12-digit ID. When you go to a government office, they don't search through millions of records one by one. Instead, they use your Aadhar to directly compute where your record is stored. That's hashing!

Key Points:
- Hash table stores (key, value) pairs
- Hash function converts key → array index
- Enables O(1) average time for insert, search, delete
- Much faster than linear search O(n) or binary search O(log n)

When to use Hashing:
- ✅ Need fast lookups by key
- ✅ Counting frequency of elements
- ✅ Checking if something exists
- ✅ Two Sum type problems

C++ Code

#include <iostream>
#include <unordered_map>
using namespace std;

int main() {
    // Creating a hash table (unordered_map in C++)
    unordered_map<string, int> studentMarks;
    
    // Insert: O(1) average
    studentMarks["Rahul"] = 85;
    studentMarks["Priya"] = 92;
    studentMarks["Amit"] = 78;
    
    // Lookup: O(1) average
    cout << "Priya's marks: " << studentMarks["Priya"] << endl;  // 92
    
    // Check if key exists
    if (studentMarks.count("Rahul")) {
        cout << "Rahul found!" << endl;
    }
    
    // Delete: O(1) average
    studentMarks.erase("Amit");
    
    return 0;
}

Hash Functions (Division Method)

Hash computation: O(1) for integers

Interactive

Interactive Hash Table (Division Method)

Hash Function: h(key) = key % 7

idx: 0—

idx: 1—

idx: 2—

idx: 3—

idx: 4—

idx: 5—

idx: 6—

A hash function takes a key and returns an index in the table. The most common method is:

Division Method:
h(key) = key % m
where m is the table size (preferably a prime number).

Example:
- Table size m = 7
- Key = 25
- Hash = 25 % 7 = 4 (store at index 4)

Properties of a Good Hash Function:
- Fast - Should compute quickly
- Deterministic - Same key always gives same hash
- Uniform - Distributes keys evenly across table
- Minimizes collisions - Different keys should give different hashes

C++ Code

#include <iostream>
using namespace std;

class SimpleHashTable {
private:
    int tableSize;
    
public:
    SimpleHashTable(int size) : tableSize(size) {}
    
    // Division method hash function
    int hash(int key) {
        return key % tableSize;
    }
    
    // For string keys - sum of ASCII values
    int hashString(string key) {
        int sum = 0;
        for (char c : key) {
            sum += c;  // Add ASCII value
        }
        return sum % tableSize;
    }
};

Collision Handling: Chaining

Average: O(1 + α) where α = load factor

Interactive

Interactive Chaining (Linked List)

idx0

NULL

idx1

NULL

idx2

NULL

idx3

NULL

idx4

NULL

A collision occurs when two different keys hash to the same index. It's unavoidable!

Method 1: Separate Chaining
Each table slot holds a Linked List. Colliding keys are simply added to the list at that slot.

How it works:
1. Insert: Hash the key. Add it to the list at that index.
2. Search: Hash the key. Traverse the linked list at that index to find it.
3. Delete: Hash the key. Find and remove it from the linked list.

Pros & Cons:
- ✅ Simple to implement
- ✅ Table never "fills up"
- ❌ Extra memory for pointers
- ❌ Worst case O(n) if all keys hash to same index (long chain)

C++ Code

#include <iostream>
#include <vector>
#include <list>
using namespace std;

class HashTableChaining {
private:
    int size;
    vector<list<int>> table;
    
    int hash(int key) { return key % size; }
    
public:
    HashTableChaining(int s) : size(s), table(s) {}
    
    void insert(int key) {
        int idx = hash(key);
        table[idx].push_back(key);
    }
};

Collision Handling: Open Addressing

Average: O(1/(1-α)) | Worst: O(n) when nearly full

Method 2: Open Addressing
All entries are stored in the table array itself. On collision, we probe for the next empty slot.

Types of Probing:

1. Linear Probing:
If slot h(k) is full, try h(k)+1, h(k)+2, h(k)+3...

2. Quadratic Probing:
Try h(k)+1², h(k)+2², h(k)+3²...

3. Double Hashing:
Use second hash: h(k) + i * h₂(k)

Visual (Linear Probing):
Insert 10: hash=3 → slot 3 ✓
Insert 17: hash=3 → slot 3 occupied → try 4 ✓
Insert 24: hash=3 → slot 3,4 occupied → try 5 ✓

Pros:
- ✅ Better cache performance (contiguous memory)
- ✅ No extra pointers needed

Cons:
- ❌ Table can fill up completely
- ❌ Clustering - keys bunch together, slowing search
- ❌ Deletion is tricky (need "deleted" markers)

C++ Code

#include <iostream>
#include <vector>
using namespace std;

class HashTableOpenAddressing {
private:
    int size;
    vector<int> table;
    vector<bool> occupied;
    
    int hash(int key) {
        return key % size;
    }
    
public:
    HashTableOpenAddressing(int s) : size(s), table(s, -1), occupied(s, false) {}
    
    // Insert with linear probing
    void insert(int key) {
        int index = hash(key);
        int originalIndex = index;
        
        // Linear probing
        while (occupied[index]) {
            cout << "Collision at " << index << ", probing..." << endl;
            index = (index + 1) % size;  // Move to next slot
            
            if (index == originalIndex) {
                cout << "Table is full!" << endl;
                return;
            }
        }
        
        table[index] = key;
        occupied[index] = true;
        cout << "Inserted " << key << " at index " << index << endl;
    }
    
    // Search with linear probing
    int search(int key) {
        int index = hash(key);
        int originalIndex = index;
        
        while (occupied[index]) {
            if (table[index] == key) {
                return index;  // Found!
            }
            index = (index + 1) % size;
            
            if (index == originalIndex) break;
        }
        return -1;  // Not found
    }
    
    void display() {
        cout << "Hash Table: ";
        for (int i = 0; i < size; i++) {
            if (occupied[i]) {
                cout << "[" << i << "]:" << table[i] << " ";
            } else {
                cout << "[" << i << "]:_ ";
            }
        }
        cout << endl;
    }
};

int main() {
    HashTableOpenAddressing ht(7);
    
    ht.insert(10);  // 10 % 7 = 3
    ht.insert(17);  // 17 % 7 = 3 → collision → goes to 4
    ht.insert(24);  // 24 % 7 = 3 → collision → goes to 5
    ht.insert(31);  // 31 % 7 = 3 → collision → goes to 6
    
    cout << "\n";
    ht.display();
    
    cout << "\nSearch 17: index " << ht.search(17) << endl;  // 4
    
    return 0;
}

Load Factor & Rehashing

Rehash: O(n) | Amortized insert: O(1)

Load Factor (α) measures how full the hash table is:

``
α = (number of elements) / (table size)
`

Why it matters:
- Low α (< 0.5): Lots of empty space, very fast O(1)
- High α (> 0.7): More collisions, slower performance
- α = 1: Table is completely full!

Threshold Rule:
Keep α ≤ 0.7 (or 0.75) for good performance.

Rehashing:
When α exceeds threshold:
1. Create a new table (usually 2x the size)
2. Re-hash ALL existing elements into new table
3. Delete old table

Example:
`
Table size: 7, Elements: 5
α = 5/7 = 0.71 > 0.7 → TRIGGER REHASH!
New size: 14 (or next prime like 17)
Re-insert all 5 elements with new hash
``

Cost of Rehashing:
- Single rehash is O(n)
- But done rarely, so amortized O(1) per insert

C++ Code

#include <iostream>
#include <vector>
#include <list>
using namespace std;

class HashTableWithRehash {
private:
    int size;
    int count;
    double maxLoadFactor;
    vector<list<int>> table;
    
    int hash(int key) {
        return key % size;
    }
    
    // Rehash to a bigger table
    void rehash() {
        cout << "\n⚡ REHASHING: Load factor exceeded!" << endl;
        cout << "Old size: " << size << ", Elements: " << count << endl;
        
        int oldSize = size;
        vector<list<int>> oldTable = table;
        
        // Double the size (or use next prime)
        size = size * 2;
        table = vector<list<int>>(size);
        count = 0;
        
        // Re-insert all elements
        for (int i = 0; i < oldSize; i++) {
            for (int key : oldTable[i]) {
                insert(key);  // Re-hash into new table
            }
        }
        
        cout << "New size: " << size << endl << endl;
    }
    
public:
    HashTableWithRehash(int s = 7, double mlf = 0.7) 
        : size(s), count(0), maxLoadFactor(mlf), table(s) {}
    
    void insert(int key) {
        int index = hash(key);
        table[index].push_back(key);
        count++;
        
        // Check load factor
        double loadFactor = (double)count / size;
        cout << "Inserted " << key << " | Load Factor: " << loadFactor << endl;
        
        if (loadFactor > maxLoadFactor) {
            rehash();
        }
    }
    
    double getLoadFactor() {
        return (double)count / size;
    }
};

int main() {
    HashTableWithRehash ht(7, 0.7);  // Size 7, max load 0.7
    
    // Insert elements until rehash triggers
    ht.insert(10);
    ht.insert(20);
    ht.insert(30);
    ht.insert(40);
    ht.insert(50);  // This might trigger rehash!
    ht.insert(60);
    
    return 0;
}

HashMap vs HashSet (C++ STL)

All operations: Average O(1), Worst O(n)

C++ provides two main hash-based containers:

unordered_map - HashMap
- Stores key → value pairs
- Each key is unique
- Use when you need to associate data with keys

unordered_set - HashSet
- Stores only unique keys
- No associated value
- Use when you just need to check existence

When to use which?
- Counting frequency → unordered_map
- Two Sum → unordered_map (need index)
- Check if seen before → unordered_set
- Remove duplicates → unordered_set

Feature	unordered_map	unordered_set
Stores	Key-Value pairs	Only Keys
Duplicates	No duplicate keys	No duplicates
Access	map[key]	—
Insert	map[k] = v	set.insert(k)
Check	map.count(k)	set.count(k)
Use Case	Frequency, Lookup	Existence Check

C++ Code

#include <iostream>
#include <unordered_map>
#include <unordered_set>
#include <vector>
using namespace std;

int main() {
    // ========== UNORDERED_MAP ==========
    cout << "=== unordered_map (HashMap) ===" << endl;
    
    unordered_map<string, int> freq;
    
    // Insert / Update
    freq["apple"] = 5;
    freq["banana"] = 3;
    freq["apple"]++;  // Now apple = 6
    
    // Access
    cout << "apple count: " << freq["apple"] << endl;  // 6
    
    // Check existence (SAFE way)
    if (freq.count("mango") == 0) {
        cout << "mango not found" << endl;
    }
    
    // Iterate
    cout << "All items: ";
    for (auto& p : freq) {
        cout << p.first << ":" << p.second << " ";
    }
    cout << endl << endl;
    
    // ========== UNORDERED_SET ==========
    cout << "=== unordered_set (HashSet) ===" << endl;
    
    unordered_set<int> seen;
    
    // Insert
    seen.insert(10);
    seen.insert(20);
    seen.insert(10);  // Duplicate - ignored!
    
    cout << "Size: " << seen.size() << endl;  // 2 (not 3!)
    
    // Check existence
    if (seen.count(20)) {
        cout << "20 exists!" << endl;
    }
    
    // Remove duplicates from array using set
    vector<int> arr = {1, 2, 2, 3, 3, 3, 4};
    unordered_set<int> unique(arr.begin(), arr.end());
    cout << "Unique elements: ";
    for (int x : unique) cout << x << " ";
    cout << endl;
    
    return 0;
}

Problem: Two Sum

Time: O(n) | Space: O(n)

Problem: Given an array and a target sum, find if any two numbers add up to target. Return their indices.

Example:
``
Input: arr = [2, 7, 11, 15], target = 9
Output: [0, 1] (because arr[0] + arr[1] = 2 + 7 = 9)
``

Brute Force: Check all pairs → O(n²) ❌

Hashing Approach: For each number x, check if (target - x) exists in map → O(n) ✅

Algorithm:
1. Create empty hash map
2. For each element x at index i:
- Calculate complement = target - x
- If complement exists in map → Found!
- Else, add x to map with its index
3. Return indices or -1 if not found

C++ Code

#include <iostream>
#include <vector>
#include <unordered_map>
using namespace std;

// Returns indices of two numbers that add up to target
vector<int> twoSum(vector<int>& nums, int target) {
    unordered_map<int, int> seen;  // value -> index
    
    for (int i = 0; i < nums.size(); i++) {
        int complement = target - nums[i];
        
        // Check if complement was seen before
        if (seen.count(complement)) {
            return {seen[complement], i};  // Found!
        }
        
        // Store current number with its index
        seen[nums[i]] = i;
    }
    
    return {};  // No solution found
}

// Variation: Just check if pair exists (no indices needed)
bool hasTwoSum(vector<int>& nums, int target) {
    unordered_set<int> seen;
    
    for (int x : nums) {
        if (seen.count(target - x)) {
            return true;
        }
        seen.insert(x);
    }
    return false;
}

int main() {
    vector<int> arr = {2, 7, 11, 15};
    int target = 9;
    
    vector<int> result = twoSum(arr, target);
    
    if (!result.empty()) {
        cout << "Indices: [" << result[0] << ", " << result[1] << "]" << endl;
        cout << "Values: " << arr[result[0]] << " + " << arr[result[1]] 
             << " = " << target << endl;
    } else {
        cout << "No solution found" << endl;
    }
    
    return 0;
}

/*
Output:
Indices: [0, 1]
Values: 2 + 7 = 9
*/

Problem: First Non-Repeating Character

Time: O(n) | Space: O(k) where k = unique characters

Problem: Given a string, find the first character that doesn't repeat.

Example:
``
Input: "aabbccd"
Output: 'd' (first char with frequency 1)

Input: "aabbcc"
Output: '$' or -1 (no non-repeating char)
``

Approach (Two Pass):
1. First pass: Count frequency of each character using hash map
2. Second pass: Find first character with count == 1

C++ Code

#include <iostream>
#include <unordered_map>
#include <string>
using namespace std;

char firstNonRepeating(const string& s) {
    // Step 1: Count frequency of each character
    unordered_map<char, int> freq;
    for (char c : s) {
        freq[c]++;
    }
    
    // Step 2: Find first character with frequency 1
    for (char c : s) {
        if (freq[c] == 1) {
            return c;
        }
    }
    
    return '$';  // No non-repeating character
}

// Optimized: Using array instead of map (for lowercase only)
char firstNonRepeatingOptimized(const string& s) {
    int freq[26] = {0};  // For 'a' to 'z'
    
    // Count frequencies
    for (char c : s) {
        freq[c - 'a']++;
    }
    
    // Find first with count 1
    for (char c : s) {
        if (freq[c - 'a'] == 1) {
            return c;
        }
    }
    
    return '$';
}

int main() {
    cout << firstNonRepeating("aabbccd") << endl;      // d
    cout << firstNonRepeating("leetcode") << endl;     // l
    cout << firstNonRepeating("aabbcc") << endl;       // $
    cout << firstNonRepeating("z") << endl;            // z
    
    return 0;
}

Problem: Subarray Sum Equals K

Time: O(n) | Space: O(n)

Problem: Given an array of integers (can be negative) and target K, count subarrays that sum to K.

Example:
``
Input: arr = [1, 2, 3, -3, 1, 1, 1, 4, 2, -3], K = 3
Output: 7 subarrays sum to 3
``

Key Insight: Use Prefix Sum + HashMap

If prefixSum[i] - prefixSum[j] = K, then sum of elements from j+1 to i equals K!

Algorithm:
1. Keep running prefix sum
2. Store count of each prefix sum in map
3. At each index, check if (currentSum - K) was seen before
4. Initialize map with {0: 1} to handle subarrays starting at index 0

C++ Code

#include <iostream>
#include <vector>
#include <unordered_map>
using namespace std;

int subarraySum(vector<int>& nums, int k) {
    unordered_map<int, int> prefixCount;
    prefixCount[0] = 1;  // Important! Empty prefix has sum 0
    
    int currentSum = 0;
    int count = 0;
    
    for (int num : nums) {
        currentSum += num;  // Running prefix sum
        
        // Check if (currentSum - k) was seen before
        // If yes, those many subarrays end here with sum k
        int need = currentSum - k;
        if (prefixCount.count(need)) {
            count += prefixCount[need];
        }
        
        // Add current prefix sum to map
        prefixCount[currentSum]++;
    }
    
    return count;
}

// Returns true if ANY subarray sums to k (simpler version)
bool hasSubarraySum(vector<int>& nums, int k) {
    unordered_set<int> prefixSums;
    prefixSums.insert(0);  // Empty prefix
    
    int currentSum = 0;
    
    for (int num : nums) {
        currentSum += num;
        
        if (prefixSums.count(currentSum - k)) {
            return true;
        }
        
        prefixSums.insert(currentSum);
    }
    
    return false;
}

int main() {
    vector<int> arr1 = {1, 1, 1};
    cout << "Subarrays summing to 2: " << subarraySum(arr1, 2) << endl;  // 2
    
    vector<int> arr2 = {1, 2, 3, -3, 1, 1, 1, 4, 2, -3};
    cout << "Subarrays summing to 3: " << subarraySum(arr2, 3) << endl;  // 7
    
    return 0;
}

Problem: Group Anagrams

Sorting: O(n * k log k) | Count key: O(n * k) where k = avg string length

Problem: Given an array of strings, group all anagrams together.

Example:
``
Input: ["eat", "tea", "tan", "ate", "nat", "bat"]
Output: [["eat","tea","ate"], ["tan","nat"], ["bat"]]
``

Key Insight: All anagrams have the same sorted string!
- "eat" → "aet"
- "tea" → "aet"
- "ate" → "aet"

Approach:
1. For each string, sort it to create a "key"
2. Use this key in a hash map
3. All strings with same key are anagrams

C++ Code

#include <iostream>
#include <vector>
#include <string>
#include <unordered_map>
#include <algorithm>
using namespace std;

vector<vector<string>> groupAnagrams(vector<string>& strs) {
    unordered_map<string, vector<string>> groups;
    
    for (string& s : strs) {
        // Create key by sorting the string
        string key = s;
        sort(key.begin(), key.end());
        
        // Add to the group with this key
        groups[key].push_back(s);
    }
    
    // Collect all groups
    vector<vector<string>> result;
    for (auto& p : groups) {
        result.push_back(p.second);
    }
    
    return result;
}

// Faster approach: Use character count as key (O(n*k) vs O(n*k*log(k)))
vector<vector<string>> groupAnagramsFast(vector<string>& strs) {
    unordered_map<string, vector<string>> groups;
    
    for (string& s : strs) {
        // Create key from character counts
        int count[26] = {0};
        for (char c : s) {
            count[c - 'a']++;
        }
        
        // Convert counts to string key
        string key;
        for (int i = 0; i < 26; i++) {
            key += "#" + to_string(count[i]);
        }
        
        groups[key].push_back(s);
    }
    
    vector<vector<string>> result;
    for (auto& p : groups) {
        result.push_back(p.second);
    }
    return result;
}

int main() {
    vector<string> words = {"eat", "tea", "tan", "ate", "nat", "bat"};
    
    auto result = groupAnagrams(words);
    
    cout << "Grouped Anagrams:" << endl;
    for (auto& group : result) {
        cout << "[ ";
        for (string& s : group) {
            cout << s << " ";
        }
        cout << "]" << endl;
    }
    
    return 0;
}

/*
Output:
[ bat ]
[ tan nat ]
[ eat tea ate ]
*/

Common Mistakes & Interview Tips

Know your complexities!

Common Mistakes Beginners Make:

1. Poor Hash Selection
- Using non-prime table size
- Example: m = 10 with keys ending in 0 → all collide!

2. Ignoring Load Factor
- Not resizing leads to slow performance
- Keep α ≤ 0.7

3. Wrong Modulo in C++
``cpp
// WRONG: -3 % 7 = -3 in C++!
int bad = key % m;

// CORRECT: Always positive
int good = ((key % m) + m) % m;
`

4. Mixing map and unordered_map
- map is BST (ordered, O(log n))
- unordered_map is hash (unordered, O(1))

5. Not Handling Key Absence
`cpp
// DANGEROUS: Creates entry if key doesn't exist!
int val = myMap[key];

// SAFE: Check first
if (myMap.count(key)) {
int val = myMap[key];
}
``

6. Two Sum Mistake
- Checking complement AFTER inserting current element
- Can match an element with itself!

7. Assuming Order
- Elements in unordered containers have NO guaranteed order
- Don't rely on iteration order!

C++ Code

#include <iostream>
#include <unordered_map>
using namespace std;

int main() {
    // Mistake 1: Using [] on non-existent key creates it!
    unordered_map<string, int> mp;
    
    // BAD: Creates "ghost" entry with value 0
    if (mp["ghost"] == 0) {
        cout << "This creates the key!" << endl;
    }
    cout << "Size after bad check: " << mp.size() << endl;  // 1!
    
    mp.clear();
    
    // GOOD: Use count() or find()
    if (mp.count("ghost") == 0) {
        cout << "Key doesn't exist (no side effect)" << endl;
    }
    cout << "Size after good check: " << mp.size() << endl;  // 0
    
    // Mistake 2: Negative modulo
    int key = -15;
    int m = 7;
    cout << "Bad mod: " << key % m << endl;           // -1 (wrong!)
    cout << "Good mod: " << ((key % m) + m) % m << endl;  // 6 (correct!)
    
    // Interview Tip: Always clarify
    // - Can there be duplicates?
    // - What to return if not found?
    // - Range of input values?
    
    return 0;
}

Quick Revision Cheatsheet

Hash Table: Maps keys to values via hash function → O(1) average

Hash Function: h(k) = k % m (m should be prime)

Collision: Resolve with chaining or open addressing

Load Factor: α = n/m, keep ≤ 0.7, rehash when exceeded

unordered_map: Key→Value pairs, use for frequency counting

unordered_set: Only keys, use for existence checking

Two Sum Pattern: Check if (target - x) seen before

Subarray Sum: Prefix sum + hash map

Operation

Average

Worst Case

When Best?

O(1)

O(n)

Low load factor

Insert

O(1)

O(n)

Good hash function

Delete

O(1)

O(n)

Few collisions

Space

O(n)

#include <iostream> #include <unordered_map> using namespace std; int main() { // Creating a hash table (unordered_map in C++) unordered_map<string, int> studentMarks; // Insert: O(1) average studentMarks["Rahul"] = 85; studentMarks["Priya"] = 92; studentMarks["Amit"] = 78; // Lookup: O(1) average cout << "Priya's marks: " << studentMarks["Priya"] << endl; // 92 // Check if key exists if (studentMarks.count("Rahul")) { cout << "Rahul found!" << endl; } // Delete: O(1) average studentMarks.erase("Amit"); return 0; }

#include <iostream> using namespace std; class SimpleHashTable { private: int tableSize; public: SimpleHashTable(int size) : tableSize(size) {} // Division method hash function int hash(int key) { return key % tableSize; } // For string keys - sum of ASCII values int hashString(string key) { int sum = 0; for (char c : key) { sum += c; // Add ASCII value } return sum % tableSize; } };

#include <iostream> #include <vector> #include <list> using namespace std; class HashTableChaining { private: int size; vector<list<int>> table; int hash(int key) { return key % size; } public: HashTableChaining(int s) : size(s), table(s) {} void insert(int key) { int idx = hash(key); table[idx].push_back(key); } };

#include <iostream> #include <vector> using namespace std; class HashTableOpenAddressing { private: int size; vector<int> table; vector<bool> occupied; int hash(int key) { return key % size; } public: HashTableOpenAddressing(int s) : size(s), table(s, -1), occupied(s, false) {} // Insert with linear probing void insert(int key) { int index = hash(key); int originalIndex = index; // Linear probing while (occupied[index]) { cout << "Collision at " << index << ", probing..." << endl; index = (index + 1) % size; // Move to next slot if (index == originalIndex) { cout << "Table is full!" << endl; return; } } table[index] = key; occupied[index] = true; cout << "Inserted " << key << " at index " << index << endl; } // Search with linear probing int search(int key) { int index = hash(key); int originalIndex = index; while (occupied[index]) { if (table[index] == key) { return index; // Found! } index = (index + 1) % size; if (index == originalIndex) break; } return -1; // Not found } void display() { cout << "Hash Table: "; for (int i = 0; i < size; i++) { if (occupied[i]) { cout << "[" << i << "]:" << table[i] << " "; } else { cout << "[" << i << "]:_ "; } } cout << endl; } }; int main() { HashTableOpenAddressing ht(7); ht.insert(10); // 10 % 7 = 3 ht.insert(17); // 17 % 7 = 3 → collision → goes to 4 ht.insert(24); // 24 % 7 = 3 → collision → goes to 5 ht.insert(31); // 31 % 7 = 3 → collision → goes to 6 cout << "\n"; ht.display(); cout << "\nSearch 17: index " << ht.search(17) << endl; // 4 return 0; }

#include <iostream> #include <vector> #include <list> using namespace std; class HashTableWithRehash { private: int size; int count; double maxLoadFactor; vector<list<int>> table; int hash(int key) { return key % size; } // Rehash to a bigger table void rehash() { cout << "\n⚡ REHASHING: Load factor exceeded!" << endl; cout << "Old size: " << size << ", Elements: " << count << endl; int oldSize = size; vector<list<int>> oldTable = table; // Double the size (or use next prime) size = size * 2; table = vector<list<int>>(size); count = 0; // Re-insert all elements for (int i = 0; i < oldSize; i++) { for (int key : oldTable[i]) { insert(key); // Re-hash into new table } } cout << "New size: " << size << endl << endl; } public: HashTableWithRehash(int s = 7, double mlf = 0.7) : size(s), count(0), maxLoadFactor(mlf), table(s) {} void insert(int key) { int index = hash(key); table[index].push_back(key); count++; // Check load factor double loadFactor = (double)count / size; cout << "Inserted " << key << " | Load Factor: " << loadFactor << endl; if (loadFactor > maxLoadFactor) { rehash(); } } double getLoadFactor() { return (double)count / size; } }; int main() { HashTableWithRehash ht(7, 0.7); // Size 7, max load 0.7 // Insert elements until rehash triggers ht.insert(10); ht.insert(20); ht.insert(30); ht.insert(40); ht.insert(50); // This might trigger rehash! ht.insert(60); return 0; }

Feature

unordered_map

unordered_set

Stores

Key-Value pairs

Only Keys

Duplicates

No duplicate keys

No duplicates

Access

map[key]

—

Insert

map[k] = v

set.insert(k)

Check

map.count(k)

set.count(k)

Use Case

Frequency, Lookup

Existence Check

#include <iostream> #include <unordered_map> #include <unordered_set> #include <vector> using namespace std; int main() { // ========== UNORDERED_MAP ========== cout << "=== unordered_map (HashMap) ===" << endl; unordered_map<string, int> freq; // Insert / Update freq["apple"] = 5; freq["banana"] = 3; freq["apple"]++; // Now apple = 6 // Access cout << "apple count: " << freq["apple"] << endl; // 6 // Check existence (SAFE way) if (freq.count("mango") == 0) { cout << "mango not found" << endl; } // Iterate cout << "All items: "; for (auto& p : freq) { cout << p.first << ":" << p.second << " "; } cout << endl << endl; // ========== UNORDERED_SET ========== cout << "=== unordered_set (HashSet) ===" << endl; unordered_set<int> seen; // Insert seen.insert(10); seen.insert(20); seen.insert(10); // Duplicate - ignored! cout << "Size: " << seen.size() << endl; // 2 (not 3!) // Check existence if (seen.count(20)) { cout << "20 exists!" << endl; } // Remove duplicates from array using set vector<int> arr = {1, 2, 2, 3, 3, 3, 4}; unordered_set<int> unique(arr.begin(), arr.end()); cout << "Unique elements: "; for (int x : unique) cout << x << " "; cout << endl; return 0; }

#include <iostream> #include <vector> #include <unordered_map> using namespace std; // Returns indices of two numbers that add up to target vector<int> twoSum(vector<int>& nums, int target) { unordered_map<int, int> seen; // value -> index for (int i = 0; i < nums.size(); i++) { int complement = target - nums[i]; // Check if complement was seen before if (seen.count(complement)) { return {seen[complement], i}; // Found! } // Store current number with its index seen[nums[i]] = i; } return {}; // No solution found } // Variation: Just check if pair exists (no indices needed) bool hasTwoSum(vector<int>& nums, int target) { unordered_set<int> seen; for (int x : nums) { if (seen.count(target - x)) { return true; } seen.insert(x); } return false; } int main() { vector<int> arr = {2, 7, 11, 15}; int target = 9; vector<int> result = twoSum(arr, target); if (!result.empty()) { cout << "Indices: [" << result[0] << ", " << result[1] << "]" << endl; cout << "Values: " << arr[result[0]] << " + " << arr[result[1]] << " = " << target << endl; } else { cout << "No solution found" << endl; } return 0; } /* Output: Indices: [0, 1] Values: 2 + 7 = 9 */

#include <iostream> #include <unordered_map> #include <string> using namespace std; char firstNonRepeating(const string& s) { // Step 1: Count frequency of each character unordered_map<char, int> freq; for (char c : s) { freq[c]++; } // Step 2: Find first character with frequency 1 for (char c : s) { if (freq[c] == 1) { return c; } } return '$'; // No non-repeating character } // Optimized: Using array instead of map (for lowercase only) char firstNonRepeatingOptimized(const string& s) { int freq[26] = {0}; // For 'a' to 'z' // Count frequencies for (char c : s) { freq[c - 'a']++; } // Find first with count 1 for (char c : s) { if (freq[c - 'a'] == 1) { return c; } } return '$'; } int main() { cout << firstNonRepeating("aabbccd") << endl; // d cout << firstNonRepeating("leetcode") << endl; // l cout << firstNonRepeating("aabbcc") << endl; // $ cout << firstNonRepeating("z") << endl; // z return 0; }

#include <iostream> #include <vector> #include <unordered_map> using namespace std; int subarraySum(vector<int>& nums, int k) { unordered_map<int, int> prefixCount; prefixCount[0] = 1; // Important! Empty prefix has sum 0 int currentSum = 0; int count = 0; for (int num : nums) { currentSum += num; // Running prefix sum // Check if (currentSum - k) was seen before // If yes, those many subarrays end here with sum k int need = currentSum - k; if (prefixCount.count(need)) { count += prefixCount[need]; } // Add current prefix sum to map prefixCount[currentSum]++; } return count; } // Returns true if ANY subarray sums to k (simpler version) bool hasSubarraySum(vector<int>& nums, int k) { unordered_set<int> prefixSums; prefixSums.insert(0); // Empty prefix int currentSum = 0; for (int num : nums) { currentSum += num; if (prefixSums.count(currentSum - k)) { return true; } prefixSums.insert(currentSum); } return false; } int main() { vector<int> arr1 = {1, 1, 1}; cout << "Subarrays summing to 2: " << subarraySum(arr1, 2) << endl; // 2 vector<int> arr2 = {1, 2, 3, -3, 1, 1, 1, 4, 2, -3}; cout << "Subarrays summing to 3: " << subarraySum(arr2, 3) << endl; // 7 return 0; }

#include <iostream> #include <vector> #include <string> #include <unordered_map> #include <algorithm> using namespace std; vector<vector<string>> groupAnagrams(vector<string>& strs) { unordered_map<string, vector<string>> groups; for (string& s : strs) { // Create key by sorting the string string key = s; sort(key.begin(), key.end()); // Add to the group with this key groups[key].push_back(s); } // Collect all groups vector<vector<string>> result; for (auto& p : groups) { result.push_back(p.second); } return result; } // Faster approach: Use character count as key (O(n*k) vs O(n*k*log(k))) vector<vector<string>> groupAnagramsFast(vector<string>& strs) { unordered_map<string, vector<string>> groups; for (string& s : strs) { // Create key from character counts int count[26] = {0}; for (char c : s) { count[c - 'a']++; } // Convert counts to string key string key; for (int i = 0; i < 26; i++) { key += "#" + to_string(count[i]); } groups[key].push_back(s); } vector<vector<string>> result; for (auto& p : groups) { result.push_back(p.second); } return result; } int main() { vector<string> words = {"eat", "tea", "tan", "ate", "nat", "bat"}; auto result = groupAnagrams(words); cout << "Grouped Anagrams:" << endl; for (auto& group : result) { cout << "[ "; for (string& s : group) { cout << s << " "; } cout << "]" << endl; } return 0; } /* Output: [ bat ] [ tan nat ] [ eat tea ate ] */

#include <iostream> #include <unordered_map> using namespace std; int main() { // Mistake 1: Using [] on non-existent key creates it! unordered_map<string, int> mp; // BAD: Creates "ghost" entry with value 0 if (mp["ghost"] == 0) { cout << "This creates the key!" << endl; } cout << "Size after bad check: " << mp.size() << endl; // 1! mp.clear(); // GOOD: Use count() or find() if (mp.count("ghost") == 0) { cout << "Key doesn't exist (no side effect)" << endl; } cout << "Size after good check: " << mp.size() << endl; // 0 // Mistake 2: Negative modulo int key = -15; int m = 7; cout << "Bad mod: " << key % m << endl; // -1 (wrong!) cout << "Good mod: " << ((key % m) + m) % m << endl; // 6 (correct!) // Interview Tip: Always clarify // - Can there be duplicates? // - What to return if not found? // - Range of input values? return 0; }

Quick Revision Cheatsheet

Hash Table: Maps keys to values via hash function → O(1) average

Hash Function: h(k) = k % m (m should be prime)

Collision: Resolve with chaining or open addressing

Load Factor: α = n/m, keep ≤ 0.7, rehash when exceeded

unordered_map: Key→Value pairs, use for frequency counting

unordered_set: Only keys, use for existence checking

Two Sum Pattern: Check if (target - x) seen before

Subarray Sum: Prefix sum + hash map