Skip to main content

Command Palette

Search for a command to run...

Hash Tables Didn't Click — And I'm Okay With That

Updated
7 min read
Hash Tables Didn't Click — And I'm Okay With That
A
Backend-focused Python developer documenting technical learnings, projects, and software engineering fundamentals with an emphasis on clean code and practical problem-solving. GitHub: https://github.com/AvinashNegi1999

Day 16 into my DSA journey, and today was the first day I sat with Hash Tables.

I didn't finish it feeling confident.

Every topic before this — linked lists, stacks, queues, BST — had one thing in common. Nodes. I could draw them. I could visualize the chain. I could mentally walk through the structure.

Hash Tables have no nodes. No chain. No pointers to follow.

A key goes in. Math happens. An index comes out.

That broke my mental model completely.


What Is a Hash Table?

A Hash Table stores data as key-value pairs.

# Python's dictionary IS a hash table
person = {
    "name": "Avinash",
    "age": 26
}

print(person["name"])  # → Avinash

You give it a key, it gives you back a value. Instantly.

That "instantly" is the whole point. No searching. No looping. Direct access. That's why hash tables appear in almost every DSA interview.


How Does It Find Things So Fast?

It uses a hash function — something that converts your key into an index number.

"name"  →  hash function  →  4

Your value gets stored at index 4 in an array. When you search, it hashes the key again, jumps directly to index 4, done.

No loop. No chain to follow. Direct jump. That's O(1).


Two Properties Every Hash Function Has

1. One-Way

Key → index. Never backwards.

"name"  →  hash  →  4   ✅
4       →  hash  →  ???  ❌ impossible

This is also why passwords are stored as hashes — even the database owner can't reverse it back to your original password.

2. Deterministic

Same input always gives same output. Every time.

"name"  →  hash  →  4
"name"  →  hash  →  4
"name"  →  hash  →  4  ✅ always 4

If it gave random indexes, you'd never find your data again.


The Part That Broke My Brain: How Does a Key Become an Index?

With BST, I could trace every step visually:

Looking for 40?
→ 40 < 50, go left
→ 40 > 30, go right
→ found 40

Physical. Logical. Clear.

Hash tables don't work like that. I kept asking — but how does "name" actually become 4?

Here's the honest answer, step by step.

Say we're hashing "ab" and our array has 7 slots (index 0–6):

# letter "a"
ord("a")        →  97         # convert letter to number
97 * 232231       # multiply by prime number
(0 + 2231) % 75          # keep within 0-6
my_hash = 5

# letter "b"
ord("b")        →  98
98 * 232254
(5 + 2254) % 73
my_hash = 3

# final answer
"ab"  →  index 3

Every time you hash "ab" — you always land on 3. That's determinism.

The mental model that helped me stop overthinking it:

Forget the formula. Think of it as a machine with one job — take a word, spit out a number between 0 and 6.

"name"  →  🔧 machine  →  4
"age"   →  🔧 machine  →  1
"city"  →  🔧 machine  →  4

You don't need to memorize the formula. Just know: letters become numbers, numbers get crunched, % keeps the result inside array bounds.


What Is a Collision?

Sometimes two different keys hash to the same index.

"name"  →  hash  →  4
"city"  →  hash  →  4  ⚠️ both want index 4

That's a collision. Two ways to handle it:

Separate Chaining — store a list at each index.

index 4  →  [ ["name", "Avinash"], ["city", "Kotdwara"] ]

Linear Probing — if slot is taken, move to next empty slot.

"city"  →  index 4 taken ❌  →  index 5 empty ✅

Building It in Python

Constructor:

class HashTable:
    def __init__(self, size=7):
        self.data_map = [None] * size

7 empty slots. Prime number to reduce collisions.

Hash Function:

def __hash(self, key):
    my_hash = 0
    for letter in key:
        my_hash = (my_hash + ord(letter) * 23) % len(self.data_map)
    return my_hash

One thing that confused me here — why __hash with double underscore?

The double underscore makes it private. It's a Python convention that says: this method is internal, only used inside this class. No one outside should call it directly.

ht = HashTable()
ht.__hash("name")   # ❌ not meant to be called like this

It's not magic. It's just a naming rule that says hands off, this is internal.

Set (Insert):

def set(self, key, value):
    index = self.__hash(key)
    if self.data_map[index] is None:
        self.data_map[index] = []
    self.data_map[index].append([key, value])

Hash the key → get index → create list if empty → append pair.

Get (Search):

def get(self, key):
    index = self.__hash(key)
    if self.data_map[index] is not None:
        for pair in self.data_map[index]:
            if pair[0] == key:
                return pair[1]
    return None

Hash the key → jump to index → scan list for matching key → return value.

Keys:

def keys(self):
    all_keys = []
    for bucket in self.data_map:
        if bucket is not None:
            for pair in bucket:
                all_keys.append(pair[0])
    return all_keys

Visit every slot → collect every key → return list.


Big O

MethodTime
setO(1) average
getO(1) average
deleteO(1) average
keysO(n)

set, get, delete are O(1) — hashing jumps directly to the index, no searching.

keys is O(n) — no shortcut, must visit every slot.

Hash Table vs Array vs Linked List:

OperationArrayLinked ListHash Table
SearchO(n)O(n)O(1)
InsertO(n)O(1)O(1)
DeleteO(n)O(1)O(1)

First Interview Problem: Two Sum

Given a list of numbers and a target, return indices of two numbers that add up to the target.

nums = [2, 7, 11, 15],  target = 9
output → [0, 1]

Brute force — O(n²):

def two_sum(nums, target):
    for i in range(len(nums)):
        for j in range(i + 1, len(nums)):
            if nums[i] + nums[j] == target:
                return [i, j]

Hash table solution — O(n):

def two_sum(nums, target):
    seen = {}

    for i, num in enumerate(nums):
        complement = target - num

        if complement in seen:
            return [seen[complement], i]

        seen[num] = i

The insight: if I need 9 and I'm at 2, I need 9 - 2 = 7. Have I seen 7 before?

i=0, num=2 → need 7 → not seen → store {2: 0}
i=1, num=7 → need 2 → found! → return [0, 1]

One loop. O(1) lookup. O(n) total.


Where I'm At Honestly

This is Day 1 with hash tables. Day 16 of my overall DSA journey.

I built the code. I understand what it does at a surface level. But if you asked me to close my laptop and explain the whole thing from memory right now?

Not there yet.

Hash tables feel fundamentally different from everything I've learned so far. Linked lists, stacks, queues, BST — they all had nodes I could draw and visualize. Hash tables have no nodes. No pointers. Just a key, some math, and an index.

That mental shift is harder than I expected.


My next step is watching some Hindi YouTube explanations to see if a different perspective makes it click. Sometimes reading English technical content isn't enough — hearing it explained in your first language hits differently.

After that — Graphs.


Current Stats

Day in journey: 16 Topics completed: Linked Lists, Stacks, Queues, BST, Hash Tables (Day 1) Daily study time: 4 hours . Confidence with Hash Tables: Built it. Used it. Still processing. Next topic: Graphs


Learning DSA too? Especially if you're self-taught or came from a non-CS background — drop a comment. Would love to know what clicked for you and what didn't.

— Avinash Negi


Resources:

More from this blog

D

Data Structures and Algorithms (Python)

11 posts