Mastering Recurrence Relations: The Substitution Method

May 26, 2026

In computer science, analyzing the execution time of a recursive algorithm usually requires solving a recurrence relation. A classic example is the recurrence:

$T(n) = 2T(n/2) + n$

This equation frequently arises in divide-and-conquer algorithms such as Mergesort, where a problem of size $n$ is split into two equal subproblems of size $n/2$, and splitting and recombining the subproblems adds a linear cost of $n$.
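As an illustration of where the two terms come from, here is a minimal Mergesort sketch that also counts the linear work. The function name `merge_sort` and the cost convention (charging $n$ units per call for the merge, and nothing for a single element) are assumptions made for this example, not part of the original text.

```python
def merge_sort(items):
    """Return (sorted copy of items, total merge cost).

    Cost convention (an assumption for this sketch): a call on n > 1
    elements is charged n units for the merge; a call on 0 or 1
    elements is free.
    """
    n = len(items)
    if n <= 1:
        return list(items), 0

    mid = n // 2
    # Two recursive calls on halves: the 2*T(n/2) term.
    left, left_cost = merge_sort(items[:mid])
    right, right_cost = merge_sort(items[mid:])

    # Merge step touches each of the n elements once: the + n term.
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])

    return merged, left_cost + right_cost + n


if __name__ == "__main__":
    result, cost = merge_sort([5, 2, 4, 7, 1, 3, 2, 6])
    print(result, cost)
```

With this convention the cost for 8 elements is $8 \log_2 8 = 24$, which matches the shape of the recurrence.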

To establish an asymptotic upper bound for $T(n)$, we can use the substitution method. This method uses mathematical induction to prove that a guessed bound holds for all inputs $n \ge n_0$, where $n_0$ is a threshold we are free to choose.

Visualizing the Recurrence Tree

To formulate a strong initial guess, we can visualize how the recurrence splits. The total work is the sum of the work done at each level of the recursion tree.

At level $i$ there are $2^i$ subproblems, each of size $n/2^i$, so every level costs $2^i \cdot n/2^i = n$ in total. Since the tree has about $\log_2 n$ levels (exactly $\log_2 n + 1$ counting the leaves, when $n$ is a power of two), the total cost is estimated to be $O(n \log n)$.
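The level-by-level sum can be checked with a short sketch. It assumes $n$ is a power of two and that each leaf (a subproblem of size 1) costs 1; the helper name `level_costs` is hypothetical.

```python
def level_costs(n):
    """Total cost of each level of the recursion tree for T(n) = 2T(n/2) + n.

    Assumes n is a power of two and that a size-1 leaf costs 1.
    Level i holds 2**i subproblems of size n / 2**i.
    """
    costs = []
    size, count = n, 1
    while size >= 1:
        costs.append(count * size)  # count subproblems, each costing `size`
        size //= 2
        count *= 2
    return costs


if __name__ == "__main__":
    costs = level_costs(8)
    print(costs)       # one entry per level
    print(sum(costs))  # total work over the whole tree
```

For $n = 8$ this yields four levels of cost 8 each, so the total is $n(\log_2 n + 1)$, which is $O(n \log n)$.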

Introduction to Algorithms (CLRS) - Chapter 4: Divide-and-Conquer and Recurrences.
Mathematical Induction for Recurrences - Stanford CS161 Lecture Notes on solving recurrences using the substitution method.

Formulating a Good Guess

Before starting the substitution method, always sketch a quick recursion tree. If the tree reveals that every level does $n$ work and there are about $\log_2 n$ levels, your guess should be $O(n \log n)$. This keeps you from wasting algebraic effort on a bound that is too tight to prove or too loose to be useful.
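To see what a guess that is too tight looks like, suppose we tried the linear bound $T(n) \le c n$ instead. This is a worked illustration, not a step from the original walkthrough. Substituting the guess for $T(n/2)$ gives:

$T(n) \le 2\left(c \frac{n}{2}\right) + n = c n + n = (c + 1) n$

The induction would need $(c + 1) n \le c n$, which is false for every $c > 0$ and $n > 0$, so the inductive step cannot be closed. The extra $\log_2 n$ factor in the guess $c n \log_2 n$ is what produces the $-c n$ term that absorbs the leftover $n$.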

Mathematical Induction Walkthrough

Step 1
Based on the recursion tree, we guess that $T(n) \le c n \log_2 n$ for some constant $c > 0$ and all $n \ge n_0$. This statement is our inductive hypothesis.
Step 2
We assume the inductive hypothesis holds for every $m$ with $n_0 \le m < n$, and in particular for $m = n/2$. This gives $T(n/2) \le c (n/2) \log_2(n/2)$. Substituting into the recurrence yields: $T(n) \le 2\left(c \frac{n}{2} \log_2\left(\frac{n}{2}\right)\right) + n$.
Step 3
Simplify the substituted inequality using the laws of logarithms. We know that $\log_2(n/2) = \log_2 n - \log_2 2 = \log_2 n - 1$. Expanding the expression gives: $T(n) \le c n (\log_2 n - 1) + n = c n \log_2 n - c n + n$.
Step 4
To complete the inductive step, we must show that the simplified expression is at most our target bound: $c n \log_2 n - c n + n \le c n \log_2 n$. This holds if and only if $-c n + n \le 0$, that is, $n(1 - c) \le 0$, which for $n > 0$ is equivalent to $c \ge 1$.
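Step 5
Show that the boundary condition holds. If we assume a base case $T(1) = 1$, the hypothesis would require $T(1) \le c \cdot 1 \cdot \log_2 1 = 0$, which contradicts the base case. Because asymptotic bounds only require the inequality to hold for $n \ge n_0$, we can set $n_0 = 2$ and use $n = 2$ and $n = 3$ as the base cases of the induction. Both are needed when $n/2$ is rounded down: $T(2)$ and $T(3)$ depend directly on $T(1)$, while every $n \ge 4$ depends only on values of at least $2$. With $T(2) = 2T(1) + 2 = 4$ and $T(3) = 2T(1) + 3 = 5$, any $c \ge 2$ satisfies both base cases, since $2c \log_2 2 = 2c \ge 4$ and $3c \log_2 3 \ge 5$. This choice also meets the requirement $c \ge 1$ from Step 4.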
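The result of the walkthrough can be sanity-checked numerically. The sketch below is not a proof; it assumes the base case $T(1) = 1$ used above and reads $n/2$ as integer division, and the helper name `bound_holds` is hypothetical.

```python
import math
from functools import lru_cache


@lru_cache(maxsize=None)
def T(n):
    """T(n) = 2*T(n // 2) + n with the assumed base case T(1) = 1."""
    if n <= 1:
        return 1
    return 2 * T(n // 2) + n


def bound_holds(c, n0, n_max):
    """Check T(n) <= c * n * log2(n) for every n in [n0, n_max]."""
    return all(T(n) <= c * n * math.log2(n) for n in range(n0, n_max + 1))


if __name__ == "__main__":
    print(bound_holds(c=2, n0=2, n_max=10_000))  # constants from the walkthrough
    print(bound_holds(c=1, n0=2, n_max=10_000))  # c = 1 is too small at n = 2
    print(bound_holds(c=2, n0=1, n_max=10_000))  # n0 = 1 fails: the bound is 0
```

The check passes for $c = 2$, $n_0 = 2$ and fails when either constant is lowered, in line with Steps 4 and 5.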

The Base Case Trap

A common pitfall in mathematical induction is mishandling the base case. If your base case is $n = 1$, then $T(1) \le c \cdot 1 \cdot \log_2 1$ evaluates to $T(1) \le 0$, which is impossible for positive running times. Always remember that asymptotic notation only requires the bound to hold for $n \ge n_0$. You can freely set $n_0 = 2$ and use $n = 2$ and $n = 3$ as the base cases for the induction (https://web.stanford.edu/class/archive/cs/cs161/cs161.1168/).
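The trap can be made concrete by computing the smallest constant each candidate base case would demand. This is a sketch under the assumptions $T(1) = 1$ and $n/2$ rounded down; `min_constant` is a hypothetical helper name.

```python
import math
from functools import lru_cache


@lru_cache(maxsize=None)
def T(n):
    """T(n) = 2*T(n // 2) + n with the assumed base case T(1) = 1."""
    if n <= 1:
        return 1
    return 2 * T(n // 2) + n


def min_constant(n):
    """Smallest c with T(n) <= c * n * log2(n), or None if no c works.

    At n = 1 the bound is c * 0 = 0 for every c, so no constant can
    cover a positive T(1).
    """
    scale = n * math.log2(n)
    if scale == 0:
        return None
    return T(n) / scale


if __name__ == "__main__":
    for n in (1, 2, 3):
        print(n, T(n), min_constant(n))
```

No constant works at $n = 1$, while $n = 2$ demands $c \ge 2$ and $n = 3$ demands slightly more than $1.05$, so $c = 2$ covers both base cases.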

The substitution method is a rigorous technique: we guess the form of the solution and prove it by mathematical induction. It is precise, but it depends on a good initial guess.

Key advantage: full mathematical rigor.
Key disadvantage: a guess can be hard to formulate for complex recurrences.

Advanced Substitution Strategies & Edge Cases

Knowledge Check

Question 1 (single choice)

Which of the following is the correct inductive hypothesis to prove that $T(n) = 2T(n/2) + n$ is $O(n \log n)$?

$T(n) \ge c n \log_2 n$ for all $n \ge n_0$

$T(n) \le c n \log_2 n$ for all $n \ge n_0$ and some constant $c > 0$

$T(n) = c n^2$ for all $n$

$T(n) < n \log_2 n$ for all $n$
