Max Grossmann: Convolution

Convolution

Posted: 2020-07-12 · Last updated: 2023-12-01 · Permalink

Consider two independent random variables $X$ and $Y$ characterized by probability density functions $f_{X}$ and $f_{Y}$, respectively. If we add their values, what is the distribution of the resulting random variable $X+Y$?

This question is by no means trivial, and in general, there are no trivial answers to it. In this section, let us ponder about how we could derive the distribution of this new random variable, which we name and define as $Z = X+Y$.

At first, it is always valuable to think about what we are looking for. Here, we are looking for the distribution of $Z$, but what is a "distribution"? You can look up the definition of that word, but what we mean here is that we aim to find a probability distribution (or "density") function (p.d.f.) $f_{Z}(z)$—this function describes the distribution sufficiently. It is all we need.

Consider the following: Assume you fixed some value of $z$, say $5$ and you ask for the probability density at that point. How can we solve for that?

For the moment, assume you also fixed some value of $x$, say $3$. From that, it follows trivially that $y$ must be $2$! Think about it: Our relationship $Z = X+Y$ determines the third variable if we already know the value of two variables. Therefore, for any value of $x$ (given $z$), $y$ is immediately determined.

This is a major step. But notice that there is not only one combination of $x$ and $y$ (like $3$ and $2$) that produces $z = 5$. In fact, there are infinitely many pairs of $x$ and $y$ that produce $z = 5$, but $y$ is always determined through $y = z-x$.

We could now go through all infinite values of $x$, calculate the appropriate $y$ that is necessary to produce $z = 5$ and somehow add them up so that we get the probability density at $z = 5$, $f_{Z}(5)$. However, it is not too clear how we would do that. Let us therefore consider the discrete case first.

Before we go further, consider for the moment that we have discrete distributions. Let's assume that $X$ returns $1$ with probability $25\,\%$, $2$ with probability $25\,\%$ and $4$ with probability $50\,\%$. $Y$ returns $1$ with probability $50\,\%$ and $3$ with probability $50\,\%$. The probability densities are now real probabilities. Let's look at the possible values of $Z = X+Y$:

$Z$		$X$
$Z$		1 25%	2 25%	4 50%
$Y$	1 50%	2	3	5
$Y$	3 50%	4	5	7

Once again we are interested in the probability density (here: probability) of obtaining $Z = 5$. Let's do what was suggested above: Let's walk through all $X$ values, compute the necessary $Y$ values if we want $Z = 5$ and then compute the total probability.

If we have $X = 1$ (which occurs $25\,\%$ of the time), we would need $Y = 5-1 = 4$. However, $Y$ cannot take that value (it happens $0\,\%$ of the time). Therefore, it follows that $P(X = 1 \cap Y = 4) = 0$.
If we have $X = 2$ (which occurs $25\,\%$ of the time), we would need $Y = 5-2 = 3$. $Y = 3$ happens $50\,\%$ of the time. Since the random variables are independent, the probability that both happen simultaneously is the product of the probabilities. It follows that $P(X = 2 \cap Y = 3) = 0.25\cdot 0.5 = 0.125$.
If we have $X = 4$ (which occurs $50\,\%$ of the time), we would need $Y = 5-4 = 1$. $Y = 1$ happens $50\,\%$ of the time. Since the random variables are independent, the probability that both happen simultaneously is the product of the probabilities. It follows that $P(X = 4 \cap Y = 1) = 0.5\cdot 0.5 = 0.25$.

Certainly, you would agree that the total probability of obtaining $Z = 5$ is $P(Z = 5) = 0+0.125+0.25=0.375$, right?

If $X_{i}$ are the possible values of $X$ (1, 2, 4) and if $\sum_{X_{i}}$ is the sum over these values, this probability can also be written as

$$ P(Z = 5) = \sum_{X_{i}} P(X = X_{i} \cap Y = 5-X_{i}). $$

Since $X$ and $Y$ are independent, it follows that

$$ P(Z = 5) = \sum_{X_{i}} P(X = X_{i}) \cdot P(Y = 5-X_{i}). $$

And that's basically it for the discrete case. We now return to the continuous case in which there are infinitely many values of our random variables. To derive the formula for $f_{Z}(z)$, we just have to remember two things:

A probability $P()$ translates to the "probability density" $f()$, and
the integral is the "continuous version of the sum".

Using these two principles and our previously derived formula from the discrete case, we reach the following:

$$ f_{Z}(5) = \int_{-\infty}^{\infty} f_{X}(x) \cdot f_{Y}(5-x)\,dx. $$

And for values other than $Z = 5$, we arrive at the following p.d.f. of $Z$:

$$ f_{Z}(z) = \int_{-\infty}^{\infty} f_{X}(x) \cdot f_{Y}(z-x)\,dx. $$

Congratulations. You have just derived the distribution of $Z$ (and also the formula of the convolution).

By the way, there are some awesome results that can be obtained using this integral. Most notable, however, is the fact that the sum of two normally distributed random variables is once again normally distributed, with the new mean being the sum of the means and the new variance being the sums of the variances. That is a truly stunning result also known as the "reproduktionseigenschaft" of the normal distribution. A proof is available on many websites, and we will not repeat it here.

Instead, let us "convolute" two uniform distributions!

Let the random variables $X$ and $Y$ be described by the p.d.f. $f_{X}(x) = f_{Y}(y) = 1$ in the interval $[0,1]$. Beyond that interval, the p.d.f.s have the value $0$. Therefore, $X$ and $Y$ are constrained to lie in $[0,1]$. We have to modify the above formula for $f_{Z}(z)$ to reflect that:

$$ f_{Z}(z) = \int_{0}^{1} f_{X}(x) \cdot f_{Y}(z-x)\,dx. $$

Consequently, $Z \in [0,2]$. Firstly, we can note that outside these bounds, $f_{Z}(z) = 0$.

Suppose we wanted to reach $z = 1.2$. $x \in [0,1]$, no problem: $f_{X}(x) = 1$. But what will be the value of $f_{Y}(z-x)$? Remember the above constraint: $Y$ must be an element of $[0,1]$. $f_{Y}(z-x)$ will be $1$ most of the time (if $x \in [0.2, 1]$). But it will be $0$ some time (if $x \in [0,0.2]$). Think about it: If $x = 0.1$, there is no way we could reach $z = 1.2$, because $y$ would have to be $1.1$, which is outside its domain! There just is not a corresponding valid $y$ value for some $x$ values. $f_{Y}(z-x) = 1$ if $z-x < 1$, or, equivalently, if $x \geq z-1$.

Since $f_{X}(x) = 1$ and we know where $f_{Y}(z-x)$ will be $1$, we can simplify our integral drastically:

$$ f_{Z}(z) = \int_{z-1}^{1} \underbrace{f_{X}(x)}_{1}\cdot \underbrace{f_{Y}(z-x)}_{1} \, dx = 2-z. $$

But wait a second: This is only true if $z \geq 1$. What if $z = 0.6$, for example? Well, a similar case holds true. Note that $x$ must now be restricted to $[0, 0.6]$ so that we can find a suitable value for $z-x$ that lies in $y$'s domain! If $z-x > 0$ (i.e. $x < z$), $f_{Y}(z-x)$ will be $1$. It follows that, for these values of $z$, $f_{Z}(z) = \int_{0}^{z} 1 \, dx = z$.

Ultimately, we have

$$ f_{Z}(z) = \left\{ \begin{array}{ll} 0 & \mbox{if } x < 0, \\ z & \mbox{if } 0 \leq x \leq 1, \\ 2-z & \mbox{if } 1 < x \leq 2, \\ 0 & \mbox{if } x > 2. \end{array} \right. $$

This is the triangular distribution. In other words: The sum of two uniformly distributed random variables is triangularly distributed. Pretty awesome, isn't it?