General CS Notes

  1. Algorithms
  2. Data Structures
  3. Machine Learning
  4. Problems
  5. Python Data Structures
  6. Python Memory Management
  7. The Hardware-Software Interface

General CS Notes

Problems

Subset Sum

Given a multiset of numbers $S$ and a desired total $t$, can we find a subset of $S$ that sums to $t$?

The partition problem is a special case of subset sum where $t = \frac{1}{2} sum(S)$.

Dynamic Programming

A dynamic programming approach to subset sum builds a table of $|S|$ rows and $t$ columns.01 We use a fixed ordering over $S$. We iterate over $i \in |S|$, allowing ourselves to use the first $i$ elements in $S$ to construct sums. Our inner loop $j$ checks all possible sums in $0$ through $t$.

The key insight is that we can use the table to check whether previous sums were possible. To fill in a table entry at $(i,j)$ we’re asking whether the first $i$ elements of $S$ can make the sum $j$. We’ve already computed this for the first $i-1$ elements, so when we consider the $i$th element, there are two options:

  1. We can total $j$ without using $S[i]$. Whether we can do this is recorded in the table entry at $(i-1, j)$.
  2. We can total $j$ by including $S[i]$. To do this, we’d need to be able to make $j - S[i]$ using the first $i-1$ elements. Whether we can do this is recorded in the table entry at $(i-1, j - S[i])$

In summary, our entry at $(i,j)$ is the or of two previous cells:02

$$(i, j) = (i-1, j) \lor (i-1, j - S[i])$$

At the end, we read off the table value for $(|S|, t)$.

This algorithm takes $O(|S|t)$ time and space.03 It’s called a pseudo-polynomial time algorithm because $t$ grows exponentially in the number of bits required to specify it. (For more on this, see problem specification bits below.)

Reductions

Interestingly, subset sum can also be reduced into a partition problem, too. Let $s = sum(S)$. Create a new multiset $S’$ that’s $S$ plus a new element $b = 2t - s$. This means $sum(S’) = s + b = s + 2t - s = 2t$. If we can solve the partition problem on $S’$, that means we’ll have split $S’$ into two subsets that each sum to $t$. Whichever set lacks $b$ is the solution to the original subset sum problem of $S$ and $t$.

Partition Problem

Given a multiset of numbers $S$, can we split $S$ into two subsets of equal sum?

Observations:

The partition problem is a special case of subset sum. Trivial reduction: find a subset $S$ that sums to $t = \frac{1}{2}sum(S)$. (The other subset will also sum to $t$.)

Problem Specification Bits

The number 1024 is 0b10000000000, which takes 11 bits to represent.04 $2^{10} = 1024$ and $\log_2{1024} = 10$. In other words, it takes $O(\log n)$ bits to represent a number $n$.

So, if a problem’s runtime is based on the value of an input number ($n$), we need much fewer bits $O(\log n)$ to specify that number.

Put another way, assume we only have $b$ bits to specify an input number to a problem. The value we can construct using $b$ bits is $2^b$. So we can make the runtime take exponentially longer ($2^{b+1}$) by only increasing our problem specification linearly ($b+1$).

This analysis matters because of the way we measure algorithmic complexity. We care about the amount of “work” (bits) it takes to specify a problem’s input, and how much “work” (time, space) it correspondingly takes to solve it. Our analysis measures this ratio.

Footnotes


  1. If $S$ can include negative elements, instead of building columns for $0$ through $t$, we need columns from largest negative sum of $S$ through largest positive sum of $S$. We can do this by using an offset in the table, but I haven’t thought through the details. A negative $t$ shouldn’t be a problem then because we still just lookup its value in the table. ↩︎

  2. We start by knowing the table borders: we can always make $0$ by including no elements, (i.e., $(i,0)$ is always true) and if we have no elements, we can only make $0$ (i.e., $(0,j\neq0)$ is always false). Attempts to index off the table are false. ↩︎

  3. I believe you can save fewer rows of the table to use less space, you can explore top-down to only visit necessary sums to potentially cut time, and overall you may only need to explore possible sums rather than all numbers $0$ through $t$. ↩︎

  4. The off-by-one is because it takes 10 bits to represent 1024 distinct values. But if we include 0, these are the numbers 0 – 1023. ↩︎

General CS Notes series


← previous

3. Machine Learning

post info


Published Nov 25, 2025
Disclaimer This is an entry in the garage. It may change or disappear at any point.
Tags garage
Inbound