Data Structures Explained in Under 10 Minutes
Data structures are a fundamental concept in computer science, crucial for anyone preparing for technical coding interviews. This article breaks down several of the most popular data structures, explaining how each one works, offering real-world analogies, and covering their time complexities.
A Quick Guide to Big O Notation
Before diving into any particular data structure, it's essential to understand a core concept: Big O notation. This is our way of measuring the speed and efficiency of algorithms, showing how well data structures perform in different scenarios.
Think of it like transportation. To get from New York to California, you could walk (taking a month), bike (a couple of weeks), drive (a few days), or fly (a few hours). Different vehicles are optimized for different scenarios, just as various data structures are. For vehicles, we measure speed in miles per hour; for data structures, we use time complexity, or Big O notation.
Here are a few key types of Big O notation to know:
1. O(1) — Constant Time
This is the fastest an algorithm can operate—think rocket ship speed. An O(1) operation takes the same amount of time regardless of the input data size. Imagine you have a grocery list and you only need to grab the first item. Whether the list has five items or 5,000, the effort is identical. You just take the first one instantly.
2. O(n) — Linear Time
This is still a very fast operation, like a supercar. With O(n), as the amount of data (n) increases, the time it takes to complete the operation grows at the same rate. For example, if you need to find a specific name in a list, the worst-case scenario involves checking every single name. A list of 10 names might require 10 checks, while a list of 1,000 names could require 1,000 checks. The increase is predictable.
3. O(n²) — Quadratic Time
This is significantly slower, like biking from New York to California. Quadratic time occurs when every item in a dataset needs to interact with every other item. Imagine a classroom where each student must shake hands with every other student. With 10 students, that's on the order of 100 handshakes; with 100 students, it's on the order of 10,000. An algorithm with O(n²) complexity is generally considered inefficient, and a better approach often exists.
4. O(log n) — Logarithmic Time
This complexity is faster than O(n) but slower than O(1)—think speed of sound. A classic example is looking up a word in a dictionary. You don't scan every page (O(n)). Instead, you open to the middle, decide if your word comes before or after, and then jump forward or backward, halving the search area each time. The search space shrinks rapidly. With logarithmic growth, as the input size increases, the number of steps grows very slowly, as the sketch below illustrates.
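To make the dictionary analogy concrete, here is a minimal Python sketch of binary search over a sorted list. The function name and sample data are purely illustrative.

```python
def binary_search(sorted_items, target):
    """Return the index of target in sorted_items, or -1 if absent.

    Each comparison halves the remaining search range, so the number
    of steps grows logarithmically with the list size: O(log n).
    """
    low, high = 0, len(sorted_items) - 1
    while low <= high:
        mid = (low + high) // 2          # open the "dictionary" in the middle
        if sorted_items[mid] == target:
            return mid
        elif sorted_items[mid] < target:
            low = mid + 1                # target is in the right half
        else:
            high = mid - 1               # target is in the left half
    return -1

words = ["apple", "banana", "cherry", "mango", "pear"]
print(binary_search(words, "mango"))   # 3
print(binary_search(words, "kiwi"))    # -1
```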
Now that we understand how performance is measured, let's explore our first data structure.
The Array
Think of an array like a row of middle school lockers. Each locker has a fixed position (an index), and you can go directly to any of them just by knowing its number. Arrays store data in contiguous memory locations, meaning all elements are placed side-by-side.
Because each element has a specific index, retrieving a value is extremely fast. However, if the array is full, adding a new element is tricky. You can't just squeeze another item in. You must create a new, larger array and copy all existing elements over before adding the new one.
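Python's built-in list behaves like a dynamic array under the hood, so it works as a rough stand-in here. This is a minimal sketch of index access versus inserting and deleting, assuming a small list of numbers as the "lockers."

```python
lockers = [10, 20, 30, 40, 50]   # contiguous "row of lockers"

# Accessing by index is O(1): we jump straight to the slot.
print(lockers[2])        # 30

# Inserting at the front is O(n): every later element shifts right.
lockers.insert(0, 5)
print(lockers)           # [5, 10, 20, 30, 40, 50]

# Deleting from the middle is O(n): later elements shift left to stay contiguous.
del lockers[1]
print(lockers)           # [5, 20, 30, 40, 50]

# Deleting the last element is O(1): nothing needs to shift.
lockers.pop()
print(lockers)           # [5, 20, 30, 40]
```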
Time Complexity
- Accessing (by index): O(1) — super efficient.
- Inserting (at a position): O(1) if you are replacing an existing item. However, inserting a value beyond the array's size requires creating a new array, which is an O(n) operation.
- Deleting: O(n), because when an element is removed, all subsequent elements must shift down to maintain the contiguous structure. Deleting the last element is O(1), since no shifting is needed.
The Linked List
Imagine a linked list as a train where each car is connected to the next. To reach a specific car, you can't teleport; you must start from the engine and move through each car until you find the one you need.
In a linked list, each element is a separate "node" containing both a value and a pointer to the next node. This structure makes linked lists more flexible than arrays for adding or removing elements, as there's no need to shift everything around. The trade-off is that accessing elements is slower since you must traverse the list from the beginning.
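Here is a minimal singly linked list sketch in Python. The Node class and helper function names are made up for illustration.

```python
class Node:
    def __init__(self, value, next_node=None):
        self.value = value
        self.next = next_node   # pointer to the next "train car"

# Build the train: 1 -> 2 -> 3
head = Node(1, Node(2, Node(3)))

# Accessing by position is O(n): walk car by car from the engine.
def get(head, index):
    node = head
    for _ in range(index):
        node = node.next
    return node.value

print(get(head, 2))    # 3

# Inserting after a node we already hold is O(1): just rewire two pointers.
def insert_after(node, value):
    node.next = Node(value, node.next)

insert_after(head, 99)   # list is now 1 -> 99 -> 2 -> 3
print(get(head, 1))      # 99

# Deleting the node after one we hold is also O(1).
def delete_after(node):
    node.next = node.next.next

delete_after(head)       # list is back to 1 -> 2 -> 3
print(get(head, 1))      # 2
```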
Time Complexity
- Accessing (by index): O(n), because you have to traverse the list.
- Inserting: O(1) if you already have a reference to the node you're inserting after. If you need to find that node first, it becomes an O(n) operation.
- Deleting: O(1) if you have a reference to the node. Otherwise, it's O(n) to search for it. A key advantage is that no resizing or element shifting is required.
The Stack
A stack is a simple structure following the Last-In, First-Out (LIFO) principle. The last element added is the first one to be removed. Think of a stack of poker chips; you must take the top one before you can get to the next. You can only access or modify elements from the top, which makes insertion and retrieval highly efficient.
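A Python list already gives LIFO behavior through append and pop, so a minimal stack sketch looks like this:

```python
chips = []            # empty stack of poker chips

# Push is O(1): add to the top.
chips.append("red")
chips.append("blue")
chips.append("green")

# Pop is O(1): the last chip pushed is the first one out.
print(chips.pop())    # green
print(chips.pop())    # blue
print(chips)          # ['red']
```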
Time Complexity
- Accessing (by index): Not a standard operation. Searching for an element requires traversing the stack, making it O(n).
- Inserting (Push): O(1), because you simply add the element to the top.
- Deleting (Pop): O(1), because you only remove the top element.
The Queue
Unlike a stack, a queue follows the First-In, First-Out (FIFO) principle. The first element added is the first to be removed. Think of standing in line at a store; the person at the front of the line is served first. New people join at the back and wait their turn. Elements are added to the back (enqueued) and removed from the front (dequeued).
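Python's collections.deque supports O(1) operations at both ends, which makes it a natural fit for a quick queue sketch:

```python
from collections import deque

line = deque()            # people waiting at the store

# Enqueue is O(1): join at the back.
line.append("Alice")
line.append("Bob")
line.append("Carol")

# Dequeue is O(1): the person at the front is served first.
print(line.popleft())     # Alice
print(line.popleft())     # Bob
print(line)               # deque(['Carol'])
```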
Time Complexity
- Accessing (by index): Similar to a stack, this is not a typical operation and would be O(n).
- Inserting (Enqueue): O(1).
- Deleting (Dequeue): O(1).
The Heap (Priority Queue)
Think of a heap as a pyramid of stacked boxes where the most important box (either smallest or largest) is always at the top. You always take from the top, and if any box is removed, the pyramid readjusts to ensure a new box takes its place at the top, maintaining the structure.
There are two types of heaps:
- Min Heap: The parent node is always smaller than its children, so the smallest element is at the top (the root).
- Max Heap: The parent node is always larger than its children, so the largest element is at the top.
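Python's heapq module implements a min heap on top of a plain list, so a minimal sketch looks like this:

```python
import heapq

boxes = []   # the "pyramid of boxes", stored as a list

# Each push is O(log n): the new element "bubbles up" into place.
for priority in [7, 3, 9, 1, 5]:
    heapq.heappush(boxes, priority)

# Peeking at the top (the smallest element) is O(1).
print(boxes[0])              # 1

# Each pop is O(log n): the heap readjusts to restore its structure.
print(heapq.heappop(boxes))  # 1
print(heapq.heappop(boxes))  # 3
```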
Time Complexity
- Accessing: O(1) to access the top element (the root). Accessing any other element is an O(n) operation.
- Inserting: O(log n). The element is added, and then it "bubbles up" to its correct position to maintain the heap property.
- Removing: O(log n). The element is removed, and the heap "bubbles down" to restore its structure.
The Hash Map
A hash map is like a mailroom where every employee has a dedicated mailbox. Instead of sorting through all the mail, you just find the mailbox number for "John" or "Sally." A hash map stores data in key-value pairs. Each key is processed by a hash function, which generates an index where the value is stored.
For example, a hash function could be the length of a string. "John" (4 letters) goes in index 4, and "Sally" (5 letters) goes in index 5. This makes lookups extremely efficient. The main challenge is a hash collision, which occurs when two keys hash to the same index (e.g., "Andy," also 4 letters). This can be resolved with techniques like chaining (creating a linked list at that index), but it can slow down performance.
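To illustrate the toy length-based hash function and chaining described above, here is a minimal sketch. It is not how Python's built-in dict actually works (that uses a proper hash function), and the bucket count and helper names are just assumptions for the example.

```python
BUCKETS = 16

def toy_hash(key):
    # Toy hash function from the example: the length of the string.
    return len(key) % BUCKETS

mailroom = [[] for _ in range(BUCKETS)]   # each bucket is a chain of [key, value] pairs

def put(key, value):
    bucket = mailroom[toy_hash(key)]
    for pair in bucket:
        if pair[0] == key:        # key already present: overwrite its value
            pair[1] = value
            return
    bucket.append([key, value])   # collision? just chain another pair in the bucket

def get(key):
    for k, v in mailroom[toy_hash(key)]:
        if k == key:
            return v
    return None

put("John", "mailbox A")
put("Sally", "mailbox B")
put("Andy", "mailbox C")    # collides with "John" (both 4 letters) and is chained

print(get("John"))    # mailbox A
print(get("Andy"))    # mailbox C
```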
Time Complexity
- Accessing (by key): O(1) on average. In the worst-case scenario (many collisions), it can become O(n).
- Inserting: O(1) on average, O(n) in the worst case.
- Deleting: O(1) on average, O(n) in the worst case.
The Binary Search Tree (BST)
A binary search tree looks similar to a family tree, but each node follows a specific ordering rule: the left child's value must be smaller than the parent's, and the right child's value must be greater. This structure makes searching, insertion, and deletion very efficient.
With each comparison, you can eliminate half of the remaining tree, similar to searching in a dictionary. However, this efficiency depends on the tree being balanced. If the tree is unbalanced (e.g., all new nodes are greater than the last), it can degenerate into a structure resembling a linked list, and performance suffers.
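Here is a minimal (unbalanced) BST sketch in Python. The class and function names are illustrative.

```python
class TreeNode:
    def __init__(self, value):
        self.value = value
        self.left = None
        self.right = None

def insert(root, value):
    """Insert value, keeping smaller values on the left, larger on the right."""
    if root is None:
        return TreeNode(value)
    if value < root.value:
        root.left = insert(root.left, value)
    else:
        root.right = insert(root.right, value)
    return root

def contains(root, value):
    """Each comparison discards half of a balanced tree: O(log n) on average."""
    if root is None:
        return False
    if value == root.value:
        return True
    if value < root.value:
        return contains(root.left, value)
    return contains(root.right, value)

root = None
for v in [8, 3, 10, 1, 6, 14]:
    root = insert(root, v)

print(contains(root, 6))    # True
print(contains(root, 7))    # False
```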
Time Complexity
- Accessing, Inserting, Deleting: O(log n) on average for a balanced tree. In the worst case (an unbalanced tree), these operations become O(n).
The Set
A set is an unordered collection of unique elements. Think of it like a collection of powerful, unique gems; you can't add the same gem twice. The order of the elements doesn't matter. Sets are often implemented using a hash table, giving them the same efficiency as a hash map for checking the existence of an element.
Sets are very useful for tasks like removing duplicates from a list or keeping track of visited nodes when traversing a graph.
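Python's built-in set covers both of these use cases; a minimal sketch:

```python
# Removing duplicates from a list.
names = ["Ada", "Grace", "Ada", "Alan", "Grace"]
unique_names = set(names)
print(unique_names)           # e.g. {'Ada', 'Grace', 'Alan'} (order is not guaranteed)

# Tracking visited nodes during a graph traversal.
visited = set()
visited.add("node_a")
visited.add("node_a")         # adding a duplicate has no effect
print("node_a" in visited)    # True (membership check is O(1) on average)
print(len(visited))           # 1
```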
Time Complexity (for Hash Set)
- Checking membership: O(1) on average, O(n) in the worst case.
- Inserting: O(1) on average, O(n) in the worst case due to hash collisions.
- Deleting: O(1) on average, O(n) in the worst case.