# CSE 1305 Algorithms and Data Structures — Complete Summary

---

## Lecture 1: Algorithm Analysis and Operation Counting

### Key Concepts
- **Algorithm**: A finite, well-defined sequence of steps to solve a problem. Properties: input, output, definiteness, finiteness, effectiveness.
- **Empirical analysis**: Run the algorithm on different inputs and measure actual runtime. Problems: depends on hardware, programming language, implementation quality, input data.
- **Theoretical analysis**: Count the number of **basic operations** executed. Independent of hardware/language.
- **Basic operations**: arithmetic operations, comparisons, assignments, array accesses, method calls (for non-recursive calls).
- **Cost model**: Assign a cost to each operation type, sum them up.
- **Operation counting**: Identify the **dominant operation** (usually comparisons or assignments) and count how many times it executes as a function of input size $n$.

### Method: Analyze an Algorithm by Operation Counting
1. **Identify the input size** $n$ (e.g., array length, number of elements).
2. **Identify the dominant operation** (the operation executed most frequently).
3. **Determine best case, worst case, and average case** scenarios.
4. **Set up a summation** for the number of executions of the dominant operation.
5. **Evaluate the summation** to get a closed-form expression in terms of $n$.
6. **State the complexity** using asymptotic notation.

### WARNING
- Do not count every operation; focus on the **dominant operation**.
- For nested loops, the innermost operation is usually dominant.
- The cost of a single operation is assumed to be $O(1)$ (constant time).

---

## Lecture 2: Big-O Notation

### Key Concepts
- **Big-O (upper bound)**: $f(n)$ is $O(g(n))$ if we can put a factor before $g(n)$ such that the result is always greater than or equal to $f(n)$ for values of $n$ above some value.
- **Predicate logic definition**: $f(n)$ is $O(g(n))$ if and only if:
  $$\exists c, n_0 \ (0 < c \land 0 < n_0 \land (\forall n \ (n \geq n_0 \Rightarrow f(n) \leq c \cdot g(n))))$$
- **Big-Omega (lower bound)**: $f(n)$ is $\Omega(g(n))$ if there exists $c > 0, n_0 > 0$ such that for all $n \geq n_0$, $f(n) \geq c \cdot g(n)$.
- **Big-Theta (tight bound)**: $f(n)$ is $\Theta(g(n))$ if $f(n)$ is both $O(g(n))$ and $\Omega(g(n))$, i.e., $f(n)$ is bounded both above and below by $c \cdot g(n)$.
- **Common complexity classes** (from fastest to slowest):
  $O(1)$, $O(\log \log n)$, $O(\log n)$, $O(\sqrt{n})$, $O(n)$, $O(n \log n)$, $O(n^2)$, $O(n^3)$, $O(2^n)$, $O(n!)$

### Method: Prove $f(n)$ is $O(g(n))$
1. **Choose** a specific $c > 0$ and $n_0 > 0$.
2. **Take** an arbitrary $n \geq n_0$.
3. **Show** that $f(n) \leq c \cdot g(n)$ holds.
4. Verify that $c$ and $n_0$ are constants independent of $n$.

### Method: Prove $f(n)$ is NOT $O(g(n))$
1. Show that for any $c > 0, n_0 > 0$, there exists an $n \geq n_0$ such that $f(n) > c \cdot g(n)$.
2. Alternatively, show that $\lim_{n \to \infty} \frac{f(n)}{g(n)} = \infty$.

### Key Formulas/Properties
- **Dropping lower-order terms**: $3n^2 + 5n + 49$ is $O(n^2)$.
- **Dropping constants**: $5n \log n$ is $O(n \log n)$.
- **Polynomial of degree $k$**: $a_k n^k + \ldots + a_1 n + a_0$ is $O(n^k)$.
- **Logarithms**: $\log_a n = \frac{\log_b n}{\log_b a}$, so all logarithms differ by a constant factor.
- **Hierarchy**: $\log n \ll n \ll n \log n \ll n^2 \ll n^3 \ll 2^n \ll n!$

### WARNING
- $f(n)$ is $O(g(n))$ does **not** mean $g(n)$ is the tightest bound; it is an upper bound. For the tightest bound, use $\Theta$.
- In proofs, you **choose** $c$ and $n_0$; you do **not** show the inequality holds for arbitrary $c, n_0$.

---

## Lecture 3: Recursion and Recurrence Equations

### Key Concepts
- **Recursion**: A method that calls itself with a smaller input.
- **Types of recursion**: Linear (one recursive call), Binary (two recursive calls), Multiple (more than two), Tail recursion (recursive call is the last action).
- **Recurrence equation**: Describes the runtime of a recursive algorithm in terms of the input size.
- **Base case**: The smallest input for which the answer is known directly.
- **Recursive step**: The relation that reduces the problem to a smaller instance.

### Method: Unfold a Recurrence Equation
1. Write the recurrence: $T(n) = \text{non-recursive cost} + T(\text{smaller input})$.
2. Substitute $T(\text{smaller input})$ using the recurrence again.
3. Repeat the substitution pattern to find the $k$-th iteration.
4. Determine $k$ from the base case condition.
5. Substitute $k$ back to get a closed-form expression.

### Method: Solve $T(n) = c + T(n/2)$ (Binary Search Pattern)
1. $T(n) = c + T(n/2) = c + (c + T(n/4)) = 2c + T(n/4)$.
2. After $k$ substitutions: $T(n) = k \cdot c + T(n/2^k)$.
3. Set $n/2^k = 1$, so $k = \log_2 n$.
4. $T(n) = c \log_2 n + T(1) = c \log_2 n + c_2$.
5. Result: $O(\log n)$.

### Method: Solve $T(n) = c + 2T(n-1)$ (Linear Recursion)
1. After $k$ substitutions: $T(n) = (2^k - 1)c + 2^k T(n-k)$.
2. Set $k = n$ (base case at $T(0)$).
3. $T(n) = (2^n - 1)c + 2^n T(0) = (2^n - 1)c + 2^n c_0$.
4. Result: $O(2^n)$.

### Method: Master Theorem (for $T(n) = aT(n/b) + cn^k$)
1. Identify $a$ (number of subproblems), $b$ (factor by which input shrinks), $k$ (polynomial degree of combining cost).
2. Compare $n^{\log_b a}$ with $n^k$:
   - If $a < b^k$ (i.e., $\log_b a < k$): $T(n) = O(n^k)$ — work increases toward leaves.
   - If $a = b^k$ (i.e., $\log_b a = k$): $T(n) = O(n^k \log n)$ — work is same at each level.
   - If $a > b^k$ (i.e., $\log_b a > k$): $T(n) = O(n^{\log_b a})$ — work increases toward root.

### WARNING
- The Master Theorem applies only to recurrences of the form $T(n) = aT(n/b) + \Theta(n^k)$.
- For recurrences like $T(n) = T(n-1) + T(n-2)$ (Fibonacci), use unfolding or other techniques.

---

## Lecture 4: Space Complexity

### Key Concepts
- **Java memory model**: Two parts — the **call stack** (method arguments, local variables, return addresses) and the **heap** (arrays, objects).
- **Space complexity** is analyzed similarly to time complexity: count the maximum amount of memory used.
- **Stack frames**: Each method call creates a stack frame containing its local variables and parameters.
- **Recursion and space**: Deep recursion creates many stack frames, increasing space complexity to $O(\text{depth})$.
- **Tail recursion**: The recursive call is the last action; can be optimized to use $O(1)$ stack space.

### Method: Analyze Space Complexity
1. Identify all variables and data structures created during execution.
2. For recursion, determine the maximum depth of the call stack.
3. Each stack frame contributes $O(1)$ space (for primitive variables and references).
4. Account for any additional data structures allocated on the heap.
5. Sum up: total space = stack space + heap space.

### Key Formulas/Properties
- **Iterative algorithm** with fixed variables: $O(1)$ space (ignoring input storage).
- **Linear recursion** (e.g., factorial): $O(n)$ space for the call stack.
- **Binary recursion** (e.g., Fibonacci without memoization): $O(n)$ space for the call stack depth.
- **Divide and conquer** (e.g., merge sort): $O(\log n)$ stack depth + $O(n)$ auxiliary space for merging.

### WARNING
- Shallow copy vs. deep copy matters: a shallow copy of a linked list shares the same nodes; changes to nodes in the copy affect the original.
- Java passes references by value: modifying the object pointed to by a reference affects the original object.

---

## Lecture 5: Arrays and Dynamic Arrays

### Key Concepts
- **Array**: Fixed-size contiguous memory allocation. Access by index in $O(1)$.
- **Dynamic array**: Resizable array that automatically grows when full (e.g., `ArrayList` in Java, typically grows by a factor of 3/2 or 2).
- **Amortized analysis**: Analyze the average cost per operation over a sequence of operations, even if individual operations can be expensive.

### Method: Amortized Analysis (Aggregate Method)
1. Identify the expensive operation (e.g., array resize).
2. Determine when it occurs (every $k$ insertions, when array fills up).
3. Sum the total cost of all $n$ insertions: $n$ (for normal insertions) + cost of all resizes.
4. The resize cost: doubling from 1 to $n$ requires $\sum_{i=0}^{\log_2 n} 2^i = 2^{\log_2 n + 1} - 1 = 2n - 1$ element copies.
5. Total cost: $n + (2n - 1) = 3n - 1$.
6. Amortized cost per operation: $\frac{3n - 1}{n} \approx O(1)$.

### Key Formulas/Properties
- **Array access/set**: $O(1)$.
- **Array search**: $O(n)$ (linear scan).
- **Dynamic array insert at end**: amortized $O(1)$.
- **Dynamic array insert at middle/beginning**: $O(n)$ (need to shift elements).
- **Dynamic array delete at end**: amortized $O(1)$.
- **Dynamic array delete at middle/beginning**: $O(n)$ (need to shift elements).
- **Growing factor must be $> 1$**: Using arithmetic progression (fixed increment) gives $O(n)$ per insertion amortized.

### WARNING
- When removing from a dynamic array, shrinking is typically done when the array is less than 1/4 full (to avoid thrashing).
- Arrays have fixed capacity; attempting to access index $\geq$ size throws an exception.

---

## Lecture 6: Linked Lists

### Key Concepts
- **Singly Linked List (SLL)**: Each node has an element and a reference to the next node.
- **Doubly Linked List (DLL)**: Each node has an element, a reference to the next node, and a reference to the previous node.
- **Circularly Linked List**: The last node's next pointer points back to the first node (or a header).
- **Sentinel nodes**: Dummy header/trailer nodes that simplify edge cases (empty list, first/last insertion).

### Method: Insert Into a Doubly Linked List
1. Create a new node.
2. Set `newNode.prev = current.prev`.
3. Set `newNode.next = current`.
4. Set `current.prev.next = newNode`.
5. Set `current.prev = newNode`.
6. Requires 4 pointer changes (with sentinels).

### Method: Delete from a Doubly Linked List
1. Set `node.prev.next = node.next`.
2. Set `node.next.prev = node.prev`.
3. Requires 2 pointer changes.

### Complexity Comparison for Linear Data Structures

| Operation | Array | SLL (with head ref) | DLL (with head + tail) | Circular SLL (with tail ref) |
|---|---|---|---|---|
| Access element $i$ | $O(1)$ | $O(n)$ | $O(n)$ | $O(n)$ |
| Set element $i$ | $O(1)$ | $O(n)$ | $O(n)$ | $O(n)$ |
| Add to head | $O(n)$ | $O(1)$ | $O(1)$ | $O(1)$ |
| Add to tail | $O(n)$ (amortized for dynamic) | $O(n)$ | $O(1)$ | $O(1)$ |
| Add at index $i$ | $O(n)$ | $O(n)$ | $O(n)$ | $O(n)$ |
| Remove from head | $O(n)$ | $O(1)$ | $O(1)$ | $O(1)$ |
| Remove from tail | $O(n)$ | $O(n)$ | $O(1)$ | $O(n)$ |
| Remove at index $i$ | $O(n)$ | $O(n)$ | $O(n)$ | $O(n)$ |
| Concatenate 2 lists | $O(n)$ | $O(1)$ with tail ref | $O(1)$ with tail ref | $O(1)$ with tail ref |
| Clone | $O(n)$ | $O(n)$ | $O(n)$ | $O(n)$ |

### WARNING
- In a SLL, adding to the tail requires $O(n)$ traversal unless you maintain a tail reference.
- In a DLL without sentinels, inserting at the head/tail requires checking if the list is empty.
- A SLL clone must create **new nodes** (deep copy), not just copy references (shallow copy).

---

## Lecture 7: Stacks and Queues

### Key Concepts
- **Stack**: LIFO (Last In, First Out) data structure. Operations: push, pop, peek, isEmpty, size.
- **Queue**: FIFO (First In, First Out) data structure. Operations: enqueue, dequeue, front, isEmpty, size.
- **Deque** (Double-ended queue): Supports insertion and deletion at both ends. Operations: addFirst, addLast, removeFirst, removeLast.

### Method: Implement a Stack Using a Dynamic Array
1. Use an array to store elements.
2. Maintain an index `top` pointing to the top element.
3. **push**: increment top, store element at `top`.
4. **pop**: return element at `top`, decrement top.
5. **peek**: return element at `top` without modifying.
6. Handle resizing when full/empty.

### Method: Implement a Queue Using a Circular Array
1. Use an array of size $N$ and two indices: `head` and `tail`.
2. **enqueue**: store at `tail`, advance `tail = (tail + 1) % N`.
3. **dequeue**: return element at `head`, advance `head = (head + 1) % N`.
4. Track `size` to distinguish full from empty.
5. Resize the array when full (double) or too empty (halve).

### Key Formulas/Properties
- **Stack implementations**:
  - Array-based: push/pop are $O(1)$ (amortized for dynamic array).
  - Linked list-based: push/pop are $O(1)$ at the head.
- **Queue implementations**:
  - Circular array: enqueue/dequeue are $O(1)$ (amortized).
  - Doubly linked list: enqueue/dequeue are $O(1)$.
  - Singly linked list with tail ref: enqueue is $O(1)$, dequeue is $O(1)$.

### WARNING
- A circular array queue with $N$ slots can store at most $N-1$ elements when using the empty/full convention. Better to track size explicitly.
- Stack overflow occurs when the stack exceeds allocated space; common with deep recursion.

---

## Lecture 8: Positional Lists and Iterators

### Key Concepts
- **Positional List**: A sequence abstraction where each element has a position (index). Supports operations like first(), last(), before(p), after(p), insertBefore(p, e), insertAfter(p, e), replace(p, e), remove(p).
- **Iterator**: Provides sequential access to elements of a collection without exposing the underlying structure.
- **Lazy iterator**: The iterator is created at a point in time and reflects subsequent changes to the collection.
- **Snapshot iterator**: Captures a copy of the collection's elements at the time of creation; subsequent changes to the collection do not affect the iterator.

### Method: Iterator on a Positional List
1. **Lazy**: Store a reference to the current position. On `next()`, move to the next position. Check for `ConcurrentModificationException` using a modification counter.
2. **Snapshot**: Store an array or copy of all elements. On `next()`, advance through the copy.

### WARNING
- Modifying a collection while iterating with a lazy iterator can cause `ConcurrentModificationException` or unexpected behavior.
- Snapshot iterators use $O(n)$ extra space but are safe against concurrent modification.

---

## Lecture 9: Trees and Tree Traversals

### Key Concepts
- **Tree**: A hierarchical data structure consisting of nodes connected by edges. A tree with $n$ nodes has $n-1$ edges.
- **Binary tree**: Each node has at most 2 children (left and right).
- **Properties**:
  - Maximum number of nodes at depth $d$: $2^d$.
  - Height of a binary tree with $n$ nodes: minimum $\lfloor \log_2 n \rfloor$, maximum $n-1$.
  - A **proper binary tree** (each node has 0 or 2 children): if internal nodes $I$, then total nodes $n = 2I + 1$.
  - A **complete binary tree** (all levels full except possibly the last, filled left to right): height is $O(\log n)$.

### Tree Traversals
- **Preorder** (root, left, right): Visit root first, then recursively traverse left and right subtrees.
- **Inorder** (left, root, right): Only meaningful for binary search trees; visits nodes in sorted order.
- **Postorder** (left, right, root): Visit children first, then the root.
- **BFS (Breadth-First / Level-Order)**: Visit all nodes at depth 0, then depth 1, then depth 2, etc. Uses a queue.

### Method: DFS Traversal (Recursive)
1. **Preorder**: Visit root, then preorder(left), then preorder(right).
2. **Inorder**: Inorder(left), visit root, then inorder(right).
3. **Postorder**: Postorder(left), postorder(right), visit root.

### Method: BFS Traversal
1. Create a queue and enqueue the root.
2. While the queue is not empty:
   - Dequeue the current node and visit it.
   - Enqueue all children of the current node (left to right).

### Key Formulas/Properties
- **Tree representations**:
  - Array-based: parent $i$ has children at $2i+1$ and $2i+2$, parent of $i$ is $\lfloor(i-1)/2\rfloor$. Efficient only for complete binary trees.
  - Linked: Node stores element, parent reference, and list of children.
  - Binary tree linked: Node stores element, parent, left child, right child.

### WARNING
- Height of a null reference is defined as $-1$ or $0$ depending on convention (course typically uses null = leaf with height 0).
- Array representation is inefficient when the tree is sparse (many empty positions in the array).

---

## Lecture 10: Binary Search Trees (BST)

### Key Concepts
- **BST property**: For every node $p$, all keys in the left subtree are $\leq p.\text{key}$, and all keys in the right subtree are $> p.\text{key}$.
- **Search**: Start at the root. If target < current, go left; if target > current, go right. Repeat until found or null.
- **Insert**: Search for the target; when a null position is found, insert the new node there.
- **Delete**: Three cases:
  1. **Leaf node**: Simply remove it.
  2. **One child**: Replace the node with its child.
  3. **Two children**: Replace with the **in-order successor** (minimum of right subtree) or **in-order predecessor** (maximum of left subtree), then delete the successor/predecessor.

### Method: Search in a BST
1. Start at the root.
2. If tree is empty, return null.
3. Compare target with current node's key.
4. If equal, return the node.
5. If target < current key, recurse on left subtree.
6. If target > current key, recurse on right subtree.

### Method: Insert in a BST
1. Search for the key.
2. If found, update (or do nothing).
3. If not found, insert a new node at the null position where the search terminated.

### Method: Delete in a BST
1. Find the node to delete.
2. Count its children (0, 1, or 2).
3. **0 children**: remove the node directly.
4. **1 child**: replace the node with its child.
5. **2 children**: find in-order successor (leftmost node in right subtree), copy its key to the node to delete, then delete the successor (which has at most 1 child).

### Key Formulas/Properties
- **Best case** height: $O(\log n)$ — balanced BST.
- **Worst case** height: $O(n)$ — degenerate tree (essentially a linked list).
- **Search/insert/delete complexity**: $O(h)$ where $h$ is the height of the tree.
- **Inorder traversal of a BST** yields keys in ascending sorted order.

### WARNING
- A BST can degrade to $O(n)$ height if keys are inserted in sorted order.
- When deleting a node with two children, the in-order successor always has at most one child (the right child), making the subsequent deletion simple.

---

## Lecture 11: Priority Queues and Heaps

### Key Concepts
- **Priority Queue (PQ)**: A collection of entries, each with a key. Supports insert(min, key), findMin(), and deleteMin().
- **List-based PQ**:
  - **Unsorted list**: insert $O(1)$, findMin/deleteMin $O(n)$.
  - **Sorted list**: insert $O(n)$, findMin/deleteMin $O(1)$.
- **Heap**: A complete binary tree that satisfies the heap-order property: the key at each node is $\leq$ (min-heap) or $\geq$ (max-heap) the keys of its children.
- **Array representation of a heap**: Node at index $i$ has children at $2i+1$ and $2i+2$, parent at $\lfloor(i-1)/2\rfloor$.

### Method: Heap Insert
1. Add the new element at the next available position (end of the array, preserving completeness).
2. **Percolate up** (bubble up): Compare with parent. If the heap order is violated, swap with parent. Repeat until the heap order is satisfied or the root is reached.
3. Time complexity: $O(\log n)$ (height of the heap).

### Method: Heap DeleteMin (Min-Heap)
1. Replace the root with the last element in the array.
2. Remove the last element.
3. **Percolate down** (sift down): Compare the current node with its children. Swap with the smaller child if the heap order is violated. Repeat until the heap order is satisfied or the node is a leaf.
4. Time complexity: $O(\log n)$ (height of the heap).

### Method: Build a Heap from an Array
1. Start from the last non-leaf node (index $\lfloor n/2 \rfloor - 1$) and go backwards to the root.
2. Apply **percolate down** at each position.
3. Time complexity: $O(n)$ (not $O(n \log n)$, because most percolate-down operations are on shallow nodes).

### Key Formulas/Properties
- **Heap operations**: insert $O(\log n)$, deleteMin $O(\log n)$, findMin $O(1)$.
- **Heap height**: $O(\log n)$ because it is a complete binary tree.
- **Heap sort**: Build heap $O(n)$, then repeatedly deleteMin $O(n \log n)$. Total: $O(n \log n)$.

### WARNING
- The heap must remain a **complete binary tree** (all levels filled left to right).
- When percolating down, always swap with the **smaller** child (min-heap) or **larger** child (max-heap).

---

## Lecture 12: Sorting Algorithms — Comparison-Based

### Key Concepts
- **Lower bound for comparison-based sorting**: Any comparison-based sorting algorithm requires at least $\Omega(n \log n)$ comparisons in the worst case. This follows from the decision tree model: there are $n!$ possible permutations, and a binary tree of depth $d$ has at most $2^d$ leaves, so $2^d \geq n! \Rightarrow d \geq \log_2(n!) \approx n \log n$.

### Insertion Sort
- **Method**: Build the sorted portion one element at a time. For each element, find its correct position in the sorted portion and insert it by shifting larger elements.
- **Time**: Best $O(n)$ (already sorted), Worst $O(n^2)$, Average $O(n^2)$.
- **Space**: $O(1)$ (in-place).
- **Stable**: Yes.
- **Good for**: Small datasets or nearly sorted data.

### Method: Insertion Sort
1. Start from index 1.
2. Compare current element with elements to its left.
3. Shift elements greater than current one position to the right.
4. Insert current element in its correct position.
5. Repeat for all elements.

### Selection Sort
- **Method**: Repeatedly find the minimum element from the unsorted portion and swap it with the first unsorted element.
- **Time**: $O(n^2)$ in all cases.
- **Space**: $O(1)$ (in-place).
- **Stable**: No (swapping can change relative order of equal elements).

### Method: Selection Sort
1. Find the minimum element in the unsorted portion.
2. Swap it with the first element of the unsorted portion.
3. Move the boundary between sorted and unsorted portions one position to the right.
4. Repeat until the entire array is sorted.

### Merge Sort
- **Method**: Divide the array into two halves, recursively sort each half, then merge the two sorted halves.
- **Time**: $O(n \log n)$ in all cases.
- **Space**: $O(n)$ (requires auxiliary array for merging).
- **Stable**: Yes.
- **Recurrence**: $T(n) = 2T(n/2) + O(n)$, solution: $O(n \log n)$.

### Method: Merge Sort
1. If the array has 0 or 1 elements, return (base case).
2. Divide the array into two halves.
3. Recursively sort each half.
4. Merge the two sorted halves:
   - Compare the front elements of each half and copy the smaller one to the result.
   - Repeat until all elements are merged.

### Quick Sort
- **Method**: Choose a pivot, partition the array into elements less than, equal to, and greater than the pivot, then recursively sort the left and right partitions.
- **Time**: Best/Average $O(n \log n)$, Worst $O(n^2)$ (when pivot is always the smallest or largest element).
- **Space**: $O(\log n)$ (recursion stack).
- **Stable**: No.
- **Pivot selection**: Random pivot or median-of-three avoids worst case on sorted data.

### Method: Quick Sort
1. Choose a pivot element.
2. Partition: rearrange elements so that elements $<$ pivot come first, then elements $==$ pivot, then elements $>$ pivot.
3. Recursively sort the left and right partitions.
4. Concatenate the results.

### Heap Sort
- **Method**: Build a max-heap, then repeatedly swap the root (max) with the last element, reduce heap size, and percolate down.
- **Time**: $O(n \log n)$ in all cases.
- **Space**: $O(1)$ (in-place).
- **Stable**: No.

### QuickSelect (Selection Algorithm)
- **Method**: Similar to quicksort, but only recurse into the partition that contains the desired $k$-th element.
- **Average time**: $O(n)$, Worst case: $O(n^2)$.
- **Use case**: Finding the $k$-th smallest element, median, or other order statistics.

### WARNING
- **Stable sort**: Insertion sort, merge sort are stable. Selection sort, heap sort, quicksort are not stable.
- Merge sort uses $O(n)$ extra space; quicksort uses $O(\log n)$ stack space.
- For small arrays ($n < 10$), insertion sort is often faster than merge/quick sort due to lower overhead.

---

## Lecture 13: Non-Comparison Sorting and QuickSelect

### Key Concepts
- **Non-comparison sorting** can break the $O(n \log n)$ lower bound because they don't rely solely on comparisons. They exploit properties of the input data.

### Counting Sort
- **Method**: Count the occurrences of each distinct key value, then reconstruct the sorted array.
- **Requirements**: Keys must be integers in a known range $[0, k]$.
- **Time**: $O(n + k)$ where $k$ is the range of key values.
- **Space**: $O(n + k)$.
- **Stable**: Yes (with proper implementation).

### Method: Counting Sort
1. Count the occurrences of each key value in a count array of size $k+1$.
2. Compute cumulative counts to determine positions.
3. Place each element in its correct sorted position.
4. Time: $O(n + k)$.

### Bucket Sort
- **Method**: Distribute elements into a number of buckets based on their key values, sort each bucket individually, then concatenate.
- **Requirements**: Keys should be uniformly distributed over a range.
- **Time**: Average $O(n + k)$, Worst $O(n^2)$ (all elements in one bucket).
- **Space**: $O(n + k)$.
- **Stable**: Depends on the sorting algorithm used for each bucket.

### Method: Bucket Sort
1. Create $k$ empty buckets.
2. Distribute elements into buckets based on their key values.
3. Sort each bucket (using insertion sort or another algorithm).
4. Concatenate all buckets in order.

### Radix Sort
- **Method**: Sort elements digit by digit, starting from the least significant digit (LSD) or most significant digit (MSD).
- **LSD Radix Sort**: Requires stable sort for each digit.
- **Time**: $O(d \cdot (n + k))$ where $d$ is the number of digits and $k$ is the base (e.g., 10 for decimal).
- **Space**: $O(n + k)$.
- **Stable**: Yes (with LSD and a stable per-digit sort).

### Method: LSD Radix Sort (for 3-digit numbers)
1. **Pass 1**: Sort by the last digit (using counting sort — stable).
2. **Pass 2**: Sort by the middle digit (using counting sort — stable).
3. **Pass 3**: Sort by the first digit (using counting sort — stable).
4. After all passes, the array is sorted.

### QuickSelect
- **Method**: Partition around a pivot, then recurse only into the relevant partition.
- **Average time**: $O(n)$ — each step eliminates a constant fraction of elements.
- **Worst case**: $O(n^2)$ — pivot is always the smallest or largest.
- **Median of medians**: Guarantees $O(n)$ worst-case by choosing a good pivot.

### WARNING
- Radix sort and counting sort require that the key values are integers in a manageable range.
- LSD radix sort requires a stable sorting algorithm for each digit pass.
- Bucket sort performance depends on the distribution of input data; uniform distribution gives best results.

---

## Lecture 14: Maps and Hash Tables

### Key Concepts
- **Map**: A collection of key-value pairs where each key appears at most once. Supports insert(key, value), remove(key), get(key), contains(key).
- **Sorted Map**: Keys are kept in sorted order. Supports range search.
- **Hash Table**: Uses a hash function to map keys to array indices. Provides average $O(1)$ operations.

### Hash Functions
- **Identity function**: $h(k) = k$ (only works if keys are small integers).
- **Division method**: $h(k) = k \bmod p$ where $p$ is a prime number (not a power of 2, not close to a power of 2).
- **Universal hashing**: Use $h(k) = ((ak + b) \bmod p) \bmod m$ for random $a, b$.
- **For strings**: Horner's method — $h(s) = (s[0] \cdot b^{n-1} + s[1] \cdot b^{n-2} + \ldots + s[n-1]) \bmod p$.

### Collision Resolution

| Method | Description | Time (average) | Time (worst) |
|---|---|---|---|
| **Separate chaining** | Each bucket is a linked list of entries with the same hash | $O(1 + \alpha)$ | $O(n)$ |
| **Linear probing** | If bucket is occupied, try the next bucket (wrap around) | $O(\frac{1}{1-\alpha})$ | $O(n)$ |
| **Quadratic probing** | If bucket is occupied, try at intervals $1^2, 2^2, 3^2, \ldots$ | $O(\frac{1}{1-\alpha})$ | $O(n)$ |
| **Double hashing** | Use a second hash function to determine the probe interval | $O(\frac{1}{1-\alpha})$ | $O(n)$ |

- **Load factor** $\alpha = \frac{n}{N}$ where $n$ is the number of entries and $N$ is the number of buckets.
- For chaining: keep $\alpha \leq 1$. For open addressing (linear probing): keep $\alpha \leq 0.5$.

### Method: Hash Table with Separate Chaining
1. Compute hash value: $h = \text{hash}(key) \bmod \text{tableSize}$.
2. **Insert**: Add a new entry at the front of the chain at index $h$.
3. **Search**: Traverse the chain at index $h$ looking for the key.
4. **Delete**: Find the entry in the chain and remove it.
5. **Resize**: When $\alpha$ exceeds a threshold, double the table size and rehash all entries.

### Method: Hash Table with Linear Probing
1. Compute hash value: $h = \text{hash}(key) \bmod \text{tableSize}$.
2. **Insert**: If bucket $h$ is occupied, try $h+1, h+2, \ldots$ (wrapping around) until an empty slot is found.
3. **Search**: Start at $h$ and probe sequentially until the key is found or an empty slot is encountered.
4. **Delete**: Mark the slot as "deleted" (tombstone) so searches can continue past it.

### WARNING
- For linear probing, **primary clustering** occurs: consecutive occupied buckets form clusters, increasing search times.
- When using linear probing, the table size should be a prime number to ensure all buckets are probed.
- In separate chaining, the order of entries in the chain does not matter for correctness but affects cache performance (front insertion is faster).

---

## Lecture 15: Balanced Search Trees — AVL Trees

### Key Concepts
- **AVL tree**: A BST where the heights of the left and right subtrees of every node differ by at most 1.
- **Balance factor** of a node = height(left child) - height(right child). Must be $-1$, $0$, or $1$.
- **Rebalancing** is done using **rotations** (tri-node restructuring).

### Rotations
- **Single rotation** (LL or RR case): One child is the heavy side, and the new node was inserted into that child's same-side subtree.
  - **LL rotation** (right-heavy child): Rotate right at the unbalanced node.
  - **RR rotation** (left-heavy child): Rotate left at the unbalanced node.
- **Double rotation** (LR or RL case): The new node was inserted into the opposite side of the heavy child.
  - **LR rotation**: First rotate left at the child, then rotate right at the unbalanced node.
  - **RL rotation**: First rotate right at the child, then rotate left at the unbalanced node.

### Method: AVL Insert
1. Perform a standard BST insert.
2. Walk back up the path to the root.
3. At each ancestor, check the balance factor.
4. If unbalanced (balance factor is $-2$ or $2$), determine the case (LL, LR, RL, RR).
5. Perform the appropriate rotation.
6. All rotations preserve the BST property.

### Key Formulas/Properties
- **AVL tree height**: $O(\log n)$ — guaranteed.
- **AVL tree operations**: insert, delete, search all $O(\log n)$.
- **Minimum nodes in an AVL tree of height $h$**: $N(h) = N(h-1) + N(h-2) + 1$ (similar to Fibonacci).
- **Rotation cost**: $O(1)$ — at most one or two rotations per insertion/deletion along the path.

### WARNING
- Only the nodes on the path from the inserted node to the root need to be checked for balance.
- After a rotation, the height of the subtree may decrease by 1, which could cause the parent to become unbalanced. Continue checking up the path.
- Rotations must maintain the BST ordering property.

---

## Lecture 16: Balanced Search Trees — (2,4) Trees and Red-Black Trees

### Key Concepts
- **(2,4) tree** (also called B-tree of order 4): Every internal node has 2, 3, or 4 children. All leaves are at the same depth. Keys are stored in nodes: a node with $k$ children has $k-1$ keys.
  - **2-node**: 1 key, 2 children.
  - **3-node**: 2 keys, 3 children.
  - **4-node**: 3 keys, 4 children.
- **Red-Black tree**: A BST where each node has an additional color field (red or black). The colors enforce approximate balance.

### (2,4) Tree Operations
- **Search**: Same as BST search.
- **Insert**:
  1. Search for the appropriate leaf position.
  2. If the leaf has space (2-node or 3-node), insert the key.
  3. If the leaf is full (4-node), split it and push the middle key up.
  4. Propagate splits upward if necessary (root may split, increasing tree height by 1).
- **Delete**: More complex; involves borrowing from siblings or merging nodes.

### Red-Black Tree Properties
1. Every node is either red or black.
2. The root is black.
3. All leaves (null references) are black.
4. If a node is red, both its children are black (no two consecutive red nodes).
5. Every path from a node to its descendant leaves has the same number of black nodes (**black-height**).

### Relationship: (2,4) Trees $\leftrightarrow$ Red-Black Trees
- A 2-node in a (2,4) tree corresponds to a single black node in a red-black tree.
- A 3-node corresponds to a black node with one red child.
- A 4-node corresponds to a black node with two red children (or a black-red-red chain).
- There are multiple red-black tree configurations that correspond to the same (2,4) tree.

### Method: Red-Black Insert
1. Insert as in a BST, colored **red**.
2. Fix up violations (at most 2 rotations needed):
   - **Case 1**: Parent is black — no violation. Done.
   - **Case 2**: Parent is red, uncle is red — recolor parent and uncle to black, grandparent to red. Propagate fix-up up.
   - **Case 3**: Parent is red, uncle is black, and the new node is on the "outside" (LL or RR) — rotate once and recolor.
   - **Case 4**: Parent is red, uncle is black, and the new node is on the "inside" (LR or RL) — rotate twice and recolor.
3. Make the root black.

### WARNING
- Red-black trees guarantee $O(\log n)$ height but are less strictly balanced than AVL trees (roughly 2:1 ratio of max to min height).
- Red-black trees typically have fewer rotations than AVL trees on average.
- In a (2,4) tree, all splits propagate upward but never require moving data down.

---

## Lecture 17: Skip Lists

### Key Concepts
- **Skip list**: A probabilistic data structure that allows $O(\log n)$ search, insert, and delete on average.
- Consists of multiple layers of linked lists. The bottom layer contains all elements; higher layers contain sparser subsets.
- Each node has a random **level** (height), determined by a probabilistic process (e.g., coin flip).
- **Search**: Start from the top layer and move right. When the next node exceeds the target, drop down one level.

### Key Formulas/Properties
- **Expected height of a node**: $O(\log n)$.
- **Search time**: $O(\log n)$ expected, $O(n)$ worst case.
- **Insert/delete time**: $O(\log n)$ expected.
- **Space**: $O(n)$ on average.
- **Simpler to implement** than AVL or red-black trees (no complex rebalancing logic).

### WARNING
- Skip lists use more memory per node (multiple pointers per node).
- The probabilistic nature means worst-case performance is theoretically possible but extremely unlikely.

---

## Lecture 18: Graphs — Representations

### Key Concepts
- **Graph**: $G = (V, E)$ where $V$ is a set of vertices and $E$ is a set of edges.
- **Directed graph**: Edges have direction $(u, v)$.
- **Undirected graph**: Edges have no direction $\{u, v\}$.
- **Weighted graph**: Each edge has an associated weight/cost.
- **Degree**: Number of edges incident to a vertex. In directed graphs: in-degree and out-degree.
- **Path**: A sequence of vertices connected by edges.
- **Cycle**: A path that starts and ends at the same vertex.
- **DAG** (Directed Acyclic Graph): A directed graph with no cycles.

### Graph Representations

| Representation | Space | Add Edge | Check Edge | Adjacent Vertices |
|---|---|---|---|---|
| **Adjacency matrix** | $O(|V|^2)$ | $O(1)$ | $O(1)$ | $O(|V|)$ |
| **Adjacency list** | $O(|V| + |E|)$ | $O(1)$ | $O(\text{degree})$ | $O(\text{degree})$ |
| **Adjacency map** | $O(|V| + |E|)$ | $O(1)$ | $O(\text{degree})$ | $O(\text{degree})$ |

- **Adjacency matrix**: A $|V| \times |V|$ matrix where `matrix[i][j] = 1` (or weight) if edge $(i, j)$ exists.
- **Adjacency list**: An array of lists, where each list contains the adjacent vertices for a given vertex.
- **Adjacency map**: Similar to adjacency list but uses maps instead of lists (useful for sparse graphs with non-integer vertex labels).

### WARNING
- Adjacency matrix is efficient for dense graphs ($|E| \approx |V|^2$) and for edge existence checks.
- Adjacency list is more space-efficient for sparse graphs ($|E| \ll |V|^2$).
- For undirected graphs, the adjacency matrix is symmetric, and the adjacency list for edge $(u,v)$ appears in both $u$'s and $v$'s lists.

---

## Lecture 19: Graph Traversals — BFS and DFS

### Key Concepts
- **BFS (Breadth-First Search)**: Explores all vertices at the current depth before moving to the next level. Uses a **queue**.
- **DFS (Depth-First Search)**: Explores as far as possible along each branch before backtracking. Uses a **stack** (or recursion).
- Both BFS and DFS visit each vertex and edge once: time complexity $O(|V| + |E|)$.
- **Connected component**: A maximal set of vertices where each pair is connected by a path.
- **Spanning tree**: A subgraph that includes all vertices and is a tree.

### Method: BFS
1. Create a queue and enqueue the starting vertex.
2. Mark the starting vertex as visited.
3. While the queue is not empty:
   - Dequeue vertex $v$.
   - For each unvisited adjacent vertex $w$:
     - Mark $w$ as visited.
     - Enqueue $w$.

### Method: DFS (Recursive)
1. Mark the current vertex as visited.
2. For each unvisited adjacent vertex $w$:
   - Recursively call DFS on $w$.

### Method: DFS (Iterative using Stack)
1. Push the starting vertex onto a stack.
2. While the stack is not empty:
   - Pop vertex $v$.
   - If $v$ is unvisited, mark it and push all unvisited adjacent vertices.

### BFS vs DFS Properties
- **BFS** finds the **shortest path** (in unweighted graphs) from the source to all reachable vertices.
- **DFS** produces a **DFS forest** and is useful for topological sorting, cycle detection, and finding connected components.
- BFS uses $O(|V|)$ space for the queue; DFS uses $O(|V|)$ space for the recursion stack.

### WARNING
- BFS requires keeping track of visited vertices to avoid infinite loops in cyclic graphs.
- DFS recursion depth can be $O(|V|)$ in the worst case (e.g., a line graph).
- In BFS, the order in which adjacent vertices are added to the queue determines the traversal order.

---

## Lecture 20: Topological Sorting

### Key Concepts
- **Topological sort**: A linear ordering of vertices in a DAG such that for every directed edge $(u, v)$, vertex $u$ comes before $v$ in the ordering.
- **Only exists for DAGs** (Directed Acyclic Graphs).
- **Applications**: Course prerequisites, build ordering, task scheduling.

### Method: Topological Sort using DFS
1. Perform DFS on the graph.
2. When a vertex is finished (postorder), add it to the front of a linked list.
3. After DFS completes, the linked list is the topological ordering.

### Method: Topological Sort using In-Degree (Kahn's Algorithm)
1. Compute the in-degree of each vertex.
2. Add all vertices with in-degree 0 to a queue.
3. While the queue is not empty:
   - Dequeue vertex $v$ and add it to the result.
   - For each adjacent vertex $w$ of $v$: decrement its in-degree.
   - If $w$'s in-degree becomes 0, enqueue $w$.
4. If the result contains all vertices, it is a valid topological sort. Otherwise, the graph has a cycle.

### WARNING
- A DAG can have multiple valid topological orderings.
- If the graph is not a DAG (contains a cycle), topological sorting is impossible.

---

## Lecture 21: Shortest Path — Dijkstra's Algorithm

### Key Concepts
- **Dijkstra's algorithm**: Finds the shortest path from a source vertex to all other vertices in a **weighted graph with non-negative edge weights**.
- Uses a **priority queue** to efficiently find the next closest unvisited vertex.
- Maintains a distance estimate $d[v]$ for each vertex $v$.

### Method: Dijkstra's Algorithm
1. Initialize $d[s] = 0$ for the source $s$, and $d[v] = \infty$ for all other vertices $v$.
2. Create a priority queue with all vertices, keyed by their distance $d[v]$.
3. While the priority queue is not empty:
   - Remove the vertex $u$ with the smallest $d[u]$ from the queue.
   - For each adjacent vertex $v$ of $u$:
     - Calculate $newDist = d[u] + weight(u, v)$.
     - If $newDist < d[v]$: update $d[v] = newDist$ and update $v$'s priority in the queue.

### Key Formulas/Properties
- **Correctness**: Dijkstra's works because edge weights are non-negative. When a vertex is removed from the PQ, its shortest distance is final.
- **Time complexity**:
  - With binary heap PQ: $O((|V| + |E|) \log |V|)$.
  - With Fibonacci heap PQ: $O(|E| + |V| \log |V|)$.
  - With adjacency matrix and no PQ: $O(|V|^2)$.
- **Cannot handle negative edge weights** — use Bellman-Ford instead.

### WARNING
- Dijkstra's algorithm fails with negative edge weights (can produce incorrect results or infinite loops).
- The algorithm computes shortest paths from one source to all destinations (single-source shortest path).
- For all-pairs shortest paths, run Dijkstra from each vertex: $O(|V| \cdot (|E| + |V| \log |V|))$.

---

## Lecture 22: Minimum Spanning Trees — Prim-Jarnik and Kruskal's

### Key Concepts
- **Spanning tree**: A subgraph that includes all vertices of $G$ and is a tree (connected, acyclic, $|V|-1$ edges).
- **Minimum Spanning Tree (MST)**: A spanning tree with the minimum possible total edge weight.
- MST exists if and only if the graph is connected (undirected).
- **Cut property**: For any cut (partition of vertices into two sets), the minimum-weight edge crossing the cut belongs to the MST.
- **Cycle property**: For any cycle in the graph, the maximum-weight edge on the cycle does not belong to the MST.

### Prim-Jarnik Algorithm
- **Method**: Start from an arbitrary vertex and grow the MST one edge at a time by adding the minimum-weight edge that connects a vertex in the MST to a vertex outside.
- Uses a priority queue to efficiently find the minimum-weight crossing edge.

### Method: Prim-Jarnik Algorithm
1. Start from an arbitrary vertex $s$, add it to the MST set.
2. Maintain a priority queue of edges connecting MST vertices to non-MST vertices, keyed by weight.
3. While there are non-MST vertices:
   - Extract the minimum-weight edge $(u, v)$ from the PQ where $u$ is in MST and $v$ is not.
   - Add $v$ to the MST set and add edge $(u, v)$ to the MST.
   - Add all edges from $v$ to non-MST vertices to the PQ.

### Kruskal's Algorithm
- **Method**: Sort all edges by weight. Add edges one by one if they don't create a cycle.
- Uses **Union-Find** (Disjoint Set Union) to efficiently check for cycles.

### Method: Kruskal's Algorithm
1. Sort all edges in non-decreasing order of weight.
2. Initialize Union-Find with each vertex in its own set.
3. For each edge $(u, v)$ in sorted order:
   - If $u$ and $v$ are in different sets (i.e., find(u) $\neq$ find(v)):
     - Add the edge to the MST.
     - Union the sets containing $u$ and $v$.
4. Stop when $|V|-1$ edges have been added.

### Union-Find
- **Operations**: makeSet(x), find(x) (returns the representative of the set containing $x$), union(x, y) (merges the sets containing $x$ and $y$).
- **Union by rank**: Attach the shorter tree to the taller tree.
- **Path compression**: During find, make every node on the path point directly to the root.
- **Time**: Nearly $O(1)$ amortized per operation with both optimizations. Total for Kruskal's: $O(|E| \log |E|)$ for sorting + $O(|E| \alpha(|V|))$ for Union-Find.

### Key Formulas/Properties
| Algorithm | Time Complexity | Space | Approach |
|---|---|---|---|
| Prim-Jarnik (binary heap) | $O((|V| + |E|) \log |V|)$ | $O(|V| + |E|)$ | Greedy, vertex-based |
| Prim-Jarnik (Fibonacci heap) | $O(|E| + |V| \log |V|)$ | $O(|V| + |E|)$ | Greedy, vertex-based |
| Kruskal's | $O(|E| \log |E|)$ | $O(|V| + |E|)$ | Greedy, edge-based |

### WARNING
- MST is only defined for **undirected** graphs.
- If the graph is not connected, the result is a Minimum Spanning Forest.
- Kruskal's and Prim-Jarnik can produce different MSTs if there are multiple edges with the same weight.
- Dijkstra's algorithm finds shortest paths; MST finds minimum total weight — they are different problems.

---

## Lecture 23: Algorithm Design Techniques — Divide and Conquer

### Key Concepts
- **Divide and Conquer**: Break a problem into smaller subproblems, solve each subproblem recursively, and combine the results.
- **Pattern**: Divide $\to$ Conquer (recurse) $\to$ Combine.
- **Examples**: Merge sort, quicksort, binary search, closest pair of points.

### Method: Apply Divide and Conquer
1. **Divide**: Split the problem into $a$ subproblems, each of size $n/b$.
2. **Conquer**: Recursively solve each subproblem.
3. **Combine**: Merge the solutions into a solution for the original problem.
4. Analyze the recurrence: $T(n) = aT(n/b) + f(n)$ where $f(n)$ is the combine cost.

### Common Recurrences
| Algorithm | Recurrence | Solution |
|---|---|---|
| Merge sort | $T(n) = 2T(n/2) + O(n)$ | $O(n \log n)$ |
| Quick sort | $T(n) = 2T(n/2) + O(n)$ (avg) | $O(n \log n)$ |
| Binary search | $T(n) = T(n/2) + O(1)$ | $O(\log n)$ |
| Closest pair | $T(n) = 2T(n/2) + O(n)$ | $O(n \log n)$ |

### WARNING
- The Master Theorem applies when the subproblems are of equal size $n/b$.
- In quicksort, the worst case recurrence is $T(n) = T(n-1) + O(n)$, giving $O(n^2)$.

---

## Lecture 24: Algorithm Design Techniques — Dynamic Programming

### Key Concepts
- **Dynamic Programming (DP)**: Solve problems by breaking them into overlapping subproblems, solving each subproblem once, and storing the results.
- **Key properties**:
  - **Optimal substructure**: An optimal solution contains optimal solutions to subproblems.
  - **Overlapping subproblems**: The same subproblems are solved multiple times in a recursive approach.
- **Two approaches**:
  - **Top-down (memoization)**: Recursive with a cache to store already-solved subproblems.
  - **Bottom-up (tabulation)**: Iteratively solve subproblems from smallest to largest, filling a table.

### Method: Apply Dynamic Programming
1. **Define the subproblems**: Identify what parameters define a subproblem.
2. **State the recurrence relation**: Express the solution to a subproblem in terms of smaller subproblems.
3. **Base cases**: Identify the smallest subproblems with known solutions.
4. **Choose approach**: Top-down (memoization) or bottom-up (tabulation).
5. **Determine computation order**: Ensure subproblems are solved before they are needed.
6. **Extract the answer**: From the final subproblem(s).

### Common DP Problems
- **Fibonacci**: $F(n) = F(n-1) + F(n-2)$, $F(0) = 0$, $F(1) = 1$. Without DP: $O(2^n)$. With DP: $O(n)$.
- **Longest Common Subsequence (LCS)**: Given two sequences, find the longest subsequence common to both. $O(m \cdot n)$ time and space.
- **Edit Distance (Levenshtein distance)**: Minimum number of insertions, deletions, and substitutions to transform one string into another. $O(m \cdot n)$.
- **Knapsack Problem**: Given items with weights and values, find the maximum value within a weight limit. $O(n \cdot W)$ where $W$ is the capacity.

### WARNING
- DP is not applicable when subproblems are independent (use divide and conquer instead).
- Space optimization: For some DP problems, only the previous row/column is needed, reducing space from $O(mn)$ to $O(n)$.
- Bottom-up avoids recursion overhead and stack overflow risk.

---

## Lecture 25: Algorithm Design Techniques — Greedy Algorithms

### Key Concepts
- **Greedy algorithm**: Makes the locally optimal choice at each step, hoping to find a global optimum.
- **Key property**: **Greedy choice property** — a globally optimal solution can be arrived at by making locally optimal choices.
- **Proof techniques**:
  - **Exchange argument**: Show that any non-greedy solution can be transformed into a greedy solution without worsening the result.
  - **Staying ahead**: Show that at each step, the greedy solution is at least as good as any other solution.

### Greedy Algorithm Examples
| Problem | Greedy Choice | Correct? |
|---|---|---|
| Activity Selection | Pick earliest finishing activity | Yes |
| Huffman Coding | Merge two lowest-frequency nodes | Yes |
| Fractional Knapsack | Pick highest value/weight ratio | Yes |
| MST (Prim/Kruskal) | Pick minimum-weight crossing edge | Yes |
| Dijkstra's Shortest Path | Pick closest unvisited vertex | Yes (non-negative weights) |
| Coin Change (US coins) | Pick largest denomination $\leq$ remaining | Yes |
| Interval Scheduling | Pick earliest finishing interval | Yes |
| 0/1 Knapsack | Pick highest value/weight ratio | **No** |

### Method: Prove Greedy Correctness
1. **Define** the greedy choice and the optimal solution.
2. **Show** that the greedy choice is safe (a global optimum exists that includes it).
3. **Show** that after making the greedy choice, the remaining subproblem still has an optimal solution.
4. Use **exchange argument** or **staying ahead** to complete the proof.

### WARNING
- Greedy does **not** always produce optimal results (e.g., 0/1 knapsack, general shortest path with negative weights).
- Always verify that the greedy choice property and optimal substructure hold before using a greedy approach.
- Counterexamples for greedy on 0/1 knapsack: picking highest value/weight ratio first can leave insufficient capacity for the optimal combination.

---

## Quick Reference: When to Use What

| Problem Type | Algorithm/Structure | Complexity |
|---|---|---|
| Search in sorted data | Binary search | $O(\log n)$ |
| Search with fast average lookup | Hash table (chaining) | $O(1)$ average |
| Search with range queries | Sorted map / BST | $O(\log n)$ |
| Maintain smallest element | Min-heap / Min-PQ | $O(\log n)$ insert, $O(1)$ findMin |
| LIFO access pattern | Stack | $O(1)$ push/pop |
| FIFO access pattern | Queue | $O(1)$ enqueue/dequeue |
| Ordered traversal of data | BST (inorder) | $O(\log n)$ avg search |
| Guaranteed balanced search | AVL tree | $O(\log n)$ worst case |
| Balanced search with fewer rotations | Red-black tree | $O(\log n)$ worst case |
| Priority-based graph traversal | Dijkstra's (non-negative weights) | $O((V+E)\log V)$ |
| Shortest path with negative weights | Bellman-Ford | $O(V \cdot E)$ |
| Unweighted shortest path | BFS | $O(V + E)$ |
| Minimum spanning tree (edge-based) | Kruskal's + Union-Find | $O(E \log E)$ |
| Minimum spanning tree (vertex-based) | Prim-Jarnik | $O((V+E)\log V)$ |
| Sort with comparison guarantee | Merge sort / Heap sort | $O(n \log n)$ |
| Sort with small data / nearly sorted | Insertion sort | $O(n^2)$ worst, $O(n)$ best |
| Sort with integer keys in small range | Counting sort | $O(n+k)$ |
| Sort with multi-digit integer keys | Radix sort (LSD) | $O(d(n+k))$ |
| $k$-th smallest element | QuickSelect | $O(n)$ average |
| Overlapping subproblems | Dynamic programming | Varies (often $O(n^2)$) |
| Locally optimal $\to$ globally optimal | Greedy | Varies (often $O(n \log n)$) |
| DAG ordering / dependencies | Topological sort | $O(V + E)$ |
| Connected components | BFS / DFS | $O(V + E)$ |
| Find all paths at each depth | BFS | $O(V + E)$ |
| Explore deep paths / backtracking | DFS | $O(V + E)$ |
