What are Reductions? Of course the word Reduction can have different interpretations depending on the context in which it appears. The Reductions I am talking about have their roots in Theoretical Computer Science, specifically in Complexity Theory.
Reductions are actually a quite advanced topic - but don't worry, we will get there. The following chapter is intended to be an introduction to the idea behind Reductions, accessible to people without a STEM background. If you are only interested in the proof, feel free to skip ahead.
Throughout the post I will use a couple of symbols:
∀ := "for all", ∈ := "is an element of", ℕ₀ := "natural numbers including 0"
Whenever I write ℕ₀, I am writing about a set, so you might need to understand what a set is as well. Don't worry, it's super easy. Here is the explanation from Wikipedia (2026-03-05 21:33:00 CET):
"In mathematics, a set is a collection of different things; the things are called elements or members of the set and are typically mathematical objects: numbers, symbols, points in space, lines, geometric shapes, variables, or other sets. A set may be finite or infinite. There is a unique set with no elements, called the empty set; a set with a single element is a singleton."
The goal of this blog post:
You will be introduced to various concepts of Theoretical Computer Science and Mathematics. In addition to that you will understand an interesting (mathematical) proof (if this is the first proof you ever read, please let me know). I believe that understanding and writing proofs sharpens your reasoning in ways that extend well beyond mathematics. Perhaps this will (even) spark some interest.
One big branch of Computer Science is concerned with estimating how hard it is to compute solutions to problems. Problems in this context are defined by an input and require an output that fulfils certain properties, e.g. "on this map (input) I want to find the shortest path from a position A to a position B (output)". This is not the formal definition, but it will be enough for us.
It makes sense (even in practice) to think about complexity: before we try to come up with a program that solves a problem, it would be good to know whether it is even possible to compute a solution in a sane amount of time.
One of those problems I would like to introduce has to do with Graphs. A Graph consists of a set V of so-called vertices and another set E of so-called edges. An edge always connects two different vertices (for our purposes). Thus, a Graph is a pair G = (V, E). Both V and E are just sets. The vertices can be real-world elements, e.g. airports, so each v ∈ V is an airport. Furthermore, we say an edge {u, v} ∈ E is present if there is at least one two-sided flight connection between the airports u and v. This would be enough to define a graph.
One very famous graph problem is concerned with finding a so-called Hamiltonian Cycle. A cycle is a path in the Graph where we move from one vertex to another using an edge (or from one airport to another using a flight). But for that path to be a cycle, the route has to start and end in the identical vertex (or airport - you get the idea).
A cycle does not necessarily involve all vertices of a Graph, but a Hamiltonian Cycle does - there can't be a single vertex that is not part of the cycle.
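Since we will soon talk about how hard it is to *find* such a cycle, it helps to first see how easy it is to *check* one. Here is a minimal Python sketch; the graph representation and the airport codes are my own illustrative choices, not part of the original text:

```python
def is_hamiltonian_cycle(vertices, edges, tour):
    """Check whether `tour` visits every vertex exactly once and
    returns to its start using only existing edges."""
    if len(tour) != len(vertices) or set(tour) != set(vertices):
        return False
    # every consecutive pair (including last -> first) must be an edge
    return all(frozenset((tour[i], tour[(i + 1) % len(tour)])) in edges
               for i in range(len(tour)))

# a small example graph: four airports forming a square, plus one diagonal
V = {"BER", "FRA", "MUC", "HAM"}
E = {frozenset(p) for p in [("BER", "FRA"), ("FRA", "MUC"),
                            ("MUC", "HAM"), ("HAM", "BER"), ("BER", "MUC")]}

print(is_hamiltonian_cycle(V, E, ["BER", "FRA", "MUC", "HAM"]))  # True
print(is_hamiltonian_cycle(V, E, ["BER", "MUC", "FRA", "HAM"]))  # False
```

Note that checking a given tour is cheap - the expensive part is finding one.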
The problem of checking whether there exists a Hamiltonian Cycle in a Graph is very hard. It's one of the so-called NP-Complete problems. Problems that fall into this category are NP-Hard, which means (roughly) that there is no known algorithm that can solve the problem in polynomial time. Polynomial time? This means we only need an amount of operations in our computer that grows like nᵏ for some constant k, where n is the size of the problem. How big a problem is depends on the problem, because the definition of size may vary. In the airport example it would be valid to say n = |V| + |E|, meaning the size equals the number of vertices plus the number of edges of our Graph.
NP-Complete problems have two additional properties: they can be transformed into other NP-Complete problems (in fact, one such transformation is what this whole blog post is about), and while we do not know how to compute a solution in polynomial time, we can at least verify in polynomial time that a solution is correct once we have found one.
Nice, but what does "not computable in polynomial time" actually mean in practice? Let's look at an example.
Such problems are very often solved by trying out every possible solution that exists. While this might work well for small instances of our graph of airports, for bigger graphs we quickly have to check a lot of possible solutions.
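The "try everything" strategy can be sketched in a few lines. This is deliberately naive - that is the whole point - and the representation is the same illustrative one as before:

```python
from itertools import permutations

def find_hamiltonian_cycle(vertices, edges):
    """Brute force: fix one start vertex and try every ordering of the
    remaining ones -- in the worst case that is (n - 1)! candidate tours."""
    vs = sorted(vertices)
    first, rest = vs[0], vs[1:]
    for perm in permutations(rest):
        tour = [first, *perm]
        if all(frozenset((tour[i], tour[(i + 1) % len(tour)])) in edges
               for i in range(len(tour))):
            return tour
    return None

# square graph: contains the cycle BER -> FRA -> MUC -> HAM -> BER
V = {"BER", "FRA", "MUC", "HAM"}
E = {frozenset(p) for p in [("BER", "FRA"), ("FRA", "MUC"),
                            ("MUC", "HAM"), ("HAM", "BER")]}
print(find_hamiltonian_cycle(V, E))
```

For four airports this terminates instantly; the next paragraphs show why it does not stay that way.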
In Germany, my home country, there are, according to this website (2026-03-07 10:57:32 CET), a considerable number of airports. Let's suppose we pick a small subset of them, say 15 (the exact number does not matter for the argument), and we want to find out if there is a Hamiltonian Cycle in the graph, i.e. a tour visiting all of the airports. This would mean our Graph consists of

V := 15 different airports in Germany

and

E := all flight connections between those airports.

Just to be clear: not all of the airports are connected to all other airports; otherwise our problem would be trivial. Now how many possibilities do we have to check? And how long would that take? Luckily, citing Wikipedia (2026-03-08 09:30:40 CET), there is a formula for the number of Hamiltonian Cycles in a complete graph with n vertices, given as (n − 1)!/2. This leaves us with up to

(15 − 1)!/2 = 14!/2 ≈ 4.36 · 10¹⁰

permutations to check. Of course we might find a solution on the first try. But generally in Computer Science we look at the worst case, meaning we would have to check all possible permutations. Now, even if we can check one candidate in a single microsecond (10⁻⁶ seconds), we would still need about 43,589 seconds - roughly 726 minutes, or about 12 hours.
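The arithmetic is easy to reproduce. The choice of 15 airports and the one-microsecond-per-check rate are the illustrative assumptions from above:

```python
from math import factorial

n = 15                                # illustrative number of airports
candidates = factorial(n - 1) // 2    # Hamiltonian cycles in a complete graph
print(candidates)                     # 43589145600
seconds = candidates * 1e-6           # assume one microsecond per check
print(round(seconds / 60))            # ~726 minutes
print(round(seconds / 3600, 1))       # ~12.1 hours
```

Adding just three more airports multiplies the count by 16 · 17 · 18, so the running time explodes factorially.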
Proving that problems like the Hamiltonian Cycle are indeed NP-Complete is not an easy problem to solve (pun intended). This is where Reductions come into play.
A Reduction is a special proof technique, described on Wikipedia (2026-03-08 12:18:50 CET) as a technique we can use to show that a problem is at least as hard as the problem we reduce from. This is what I meant earlier by transforming problems into other problems.
We do this by finding a way to transform the input of problem A, which is the problem we know is e.g. NP-Complete, into an input of problem B, whose hardness we try to prove. So again, we try to find a way to transform an input of problem A into an input of problem B - but we must make sure that whenever an input is a yes-instance of problem A, the transformed input is a yes-instance of problem B (and vice versa). Essentially we reduce the problem A to the problem B, hence the name. Of course this transformation must be possible in polynomial time (otherwise it would destroy any estimation about the hardness of problem B).
As we know that problem A is e.g. NP-Complete, we know problem B must be at least as hard as problem A. Furthermore, we know we can reduce every other NP-Complete problem to B, because we have found a way to reduce one of them to it (this might be hard to grasp at first - think of this fact: all NP-Complete problems can be reduced to each other, thus we can reduce any such problem C first to problem A and then to problem B, therefore reducing problem A to problem B is enough).
As those problems often involve completely different inputs (and different mathematical branches), those reductions can become quite mind-bending. For instance, not all problems have anything to do with graphs, but we can still transform them into each other to show their complexity. One example of a Reduction where this happens is explained in the next chapter.
Okay, at this stage you should be equipped well enough to get the idea of the Reduction from the problem called 3-Satisfiability (short: 3-SAT) to the Vertex Cover problem.
The input is a logical formula made from so-called clauses of the form (l₁ ∨ l₂ ∨ l₃), joined together by ∧ (logical AND), where each so-called literal lᵢ is a variable xⱼ, j ∈ ℕ. Each literal can be negated by writing ¬xⱼ. The Satisfiability problem is concerned with finding out whether there is a configuration - an assignment of true/false to each variable - that satisfies the formula. For a formula like (x₁ ∨ ¬x₂ ∨ x₃), one satisfying configuration would be x₁ = true, because it makes the clause, and thus the whole formula, evaluate to true.
An example that is not satisfiable would be (x₁) ∧ (¬x₁): whichever value x₁ takes, one of the two clauses is false. Of course, in such small instances of the problem we can decide this by hand. This is, however, no longer the case for bigger instances.
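A tiny brute-force checker makes the "try all configurations" idea concrete. The encoding of a literal as a (variable, is_positive) pair is my own choice for this sketch:

```python
from itertools import product

def satisfiable(clauses, variables):
    """Try all 2^n assignments; a clause is a list of (variable, is_positive)
    pairs and is satisfied if at least one of its literals evaluates to true."""
    for values in product([False, True], repeat=len(variables)):
        assignment = dict(zip(variables, values))
        if all(any(assignment[var] == positive for var, positive in clause)
               for clause in clauses):
            return True
    return False

# (x1 or not x2 or x3) is satisfiable ...
print(satisfiable([[("x1", True), ("x2", False), ("x3", True)]],
                  ["x1", "x2", "x3"]))                          # True
# ... while (x1) and (not x1) is not
print(satisfiable([[("x1", True)], [("x1", False)]], ["x1"]))   # False
```

Note the 2ⁿ in the loop: this is exactly the exponential blow-up that makes the problem hard.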
The number 3 in 3-SAT just means that each clause contains at most three literals (this form has a special name: conjunctive normal form). For this problem, Stephen A. Cook proved in 1971, in a very long and tedious proof, that deciding whether such a formula is satisfiable is NP-Complete. So we can take NP-Completeness as a given.
In this problem we have a graph as an input, and we want to find a set of vertices such that every edge is connected to at least one vertex in the set. The number of elements in this set must be smaller than or equal to a given bound k ∈ ℕ.
This is of course only going to be a rough explanation, as I try to make this digestible for readers who hear about Reductions for the first time. Nevertheless, I think the idea and its validity will become clear.
Okay, first we remember that 3-SAT is NP-Complete, since Cook proved it in 1971. We also see that a Vertex Cover can be checked in polynomial time, as we would just have to iterate over all edges for each vertex of the candidate set, checking whether there is a connection or not. This needs fewer than n² operations for n = |V| + |E|, therefore we have O(n²) (expressed in Big-O notation), qualifying this algorithm to run in polynomial time. As we learned earlier, this is everything we need to say that Vertex Cover is in NP. If we can reduce 3-SAT to it, we know that Vertex Cover is NP-Complete as well.
So what's missing now is the Reduction. For that we aim to create a graph built from different smaller components. For every variable xᵢ of the formula we will create a small graph consisting of two vertices, labelled xᵢ and ¬xᵢ, connected by a single edge.
I will refer to them as literal graphs. For each clause of some formula we create an additional graph: a triangle whose three vertices are labelled with the three literals of the clause.
I will refer to these as clause graphs; a clause graph simply represents its clause. In the next step we connect the clause graphs with the literal graphs by creating edges whenever two vertices carry the same label. Applied to a whole formula, this yields one triangle per clause, each corner attached by an edge to the literal vertex with the same label.
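The construction just described can be written down directly. This sketch uses my own encoding: literal vertices as (variable, is_positive) pairs, clause vertices as ("c", clause_index, position) triples:

```python
def reduce_3sat_to_vertex_cover(clauses, variables):
    """Build the Vertex Cover instance for a 3-SAT formula: one edge per
    variable (literal graph), one triangle per clause (clause graph), plus
    an edge from each clause vertex to the literal vertex it is labelled with."""
    edges = []
    for var in variables:                         # literal graphs
        edges.append(((var, True), (var, False)))
    for ci, clause in enumerate(clauses):         # clause graphs
        corners = [("c", ci, i) for i in range(len(clause))]
        for i in range(len(corners)):
            for j in range(i + 1, len(corners)):  # triangle edges
                edges.append((corners[i], corners[j]))
        for i, literal in enumerate(clause):      # connecting edges
            edges.append((corners[i], literal))
    k = len(variables) + 2 * len(clauses)         # the bound, see below
    return edges, k

formula = [[("x1", True), ("x2", False), ("x3", True)],
           [("x1", False), ("x2", True), ("x3", True)]]
edges, k = reduce_3sat_to_vertex_cover(formula, ["x1", "x2", "x3"])
print(len(edges), k)   # 3 variable edges + 2*(3 + 3) clause edges = 15, k = 7
```

The loops run once per variable and once per clause, so the construction is clearly polynomial.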
Constructing this graph for a given formula is possible in polynomial time. This means that if we can now show that a formula is satisfiable exactly when the constructed graph has a Vertex Cover of the right size, we have reduced the problem successfully.
Okay, so here is what we do: if we have a satisfying configuration for our formula, we choose in each literal graph the vertex that corresponds to the value the configuration assigns. If, for instance, the configuration sets x₁ = true, we choose the vertex x₁ of its literal graph (and analogously for every other variable) in this stage of building our Vertex Cover.
Now is the right time to talk about the bound k we need to specify for the Vertex Cover problem. Let n be the number of variables and m the number of clauses of our formula; then we set k = n + 2m. For a formula with three variables and two clauses, for example, we get k = 3 + 2 · 2 = 7.
And here comes the magic: in addition to the vertices we chose earlier from the literal graphs, we choose from each clause graph all vertices whose corresponding literals are not set to true. If too many of those literals are set to true, we just pick enough of the corresponding vertices so that exactly two vertices of the clause graph are chosen (a clause in which all three literals are true falls into such a case: we still pick two of its vertices).
Now we always have a Vertex Cover of size k if the formula is satisfiable: every literal edge, every triangle, and every connecting edge has a chosen endpoint. What happens if the formula is not satisfiable? Then, under every configuration, some clause has only literals that are set to false, and to cover its connecting edges we must choose all three vertices of the respective clause graph. This means we have chosen at least one vertex too many, and our set has more than k elements. Therefore our Reduction produces a Vertex Cover of size at most k exactly when there is a satisfying configuration - if there is no valid configuration for a formula, there will be no valid Vertex Cover.
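To see the whole argument in action, the following sketch builds the reduction graph, derives the vertex set that a satisfying configuration induces, and verifies that it really covers every edge while respecting k = n + 2m. The encodings are the same illustrative ones as before, and the assignment passed in must be a satisfying one:

```python
def reduction_round_trip(clauses, variables, assignment):
    """Build the reduction graph, derive the cover that a satisfying
    assignment induces, and check that it covers every edge."""
    n, m = len(variables), len(clauses)
    edges = [((v, True), (v, False)) for v in variables]
    for ci, clause in enumerate(clauses):
        corners = [("c", ci, i) for i in range(len(clause))]
        edges += [(corners[i], corners[j])
                  for i in range(len(corners))
                  for j in range(i + 1, len(corners))]
        edges += [(corners[i], lit) for i, lit in enumerate(clause)]
    # literal graphs: pick the vertex matching the assignment
    cover = {(v, assignment[v]) for v in variables}
    for ci, clause in enumerate(clauses):
        # leave out one clause vertex whose literal is true, take the other two
        true_pos = next(i for i, (v, positive) in enumerate(clause)
                        if assignment[v] == positive)
        cover |= {("c", ci, i) for i in range(len(clause)) if i != true_pos}
    assert len(cover) <= n + 2 * m                       # respects the bound k
    assert all(a in cover or b in cover for a, b in edges)
    return cover

formula = [[("x1", True), ("x2", False), ("x3", True)],
           [("x1", False), ("x2", True), ("x3", True)]]
cover = reduction_round_trip(formula, ["x1", "x2", "x3"],
                             {"x1": True, "x2": True, "x3": False})
print(len(cover))   # 7 == n + 2m
```

The key line is the `true_pos` lookup: the connecting edge of the one clause vertex we leave out is still covered, because its literal vertex was chosen in the literal graph - which only works if the assignment really satisfies the clause.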