The Harris–Kleitman Inequality

Section 1.4 The Harris–Kleitman Inequality

Our goal in this section is to prove a beautiful correlation inequality, which is often referred to as Harris’ Inequality or Kleitman’s Inequality (depending on who you ask), and is a special case of several more general results that you may encounter in the future, such as the FKG Inequality and the Four Functions Theorem. On the surface, this inequality has little connection to the “intersection theorems” in the rest of this chapter. However, its relationship to the Erdős–Ko–Rado Theorem will come up in several of the exercises at the end of this chapter.

“Correlation.”

A collection \(\mathcal{F}\subseteq 2^{[n]}\) is a downset (or is downward closed) if whenever \(A\in \mathcal{F}\) and \(B\subseteq A\text{,}\) we have \(B\in \mathcal{F}\text{.}\) Analogously, \(\mathcal{F}\) is an upset if whenever \(A\in \mathcal{F}\) and \(B\supseteq A\text{,}\) we have \(B\in \mathcal{F}\text{.}\) Given a pair \(\mathcal{A},\mathcal{B}\subseteq 2^{[n]}\) of downsets, how many elements must \(\mathcal{A}\) and \(\mathcal{B}\) have in common? Since both \(\mathcal{A}\) and \(\mathcal{B}\) are “biased” towards small subsets of \([n]\text{,}\) one may expect them to have many elements in common; the following result confirms this intuition.

Theorem 1.4.1. The Harris–Kleitman Inequality [145, 168].

If \(\mathcal{A}\) and \(\mathcal{B}\) are downsets in \(2^{[n]}\text{,}\) then

\begin{equation*} \frac{|\mathcal{A}\cap\mathcal{B}|}{2^n}\geq \frac{|\mathcal{A}|}{2^n}\cdot \frac{|\mathcal{B}|}{2^n}\text{.} \end{equation*}

Proof.

The inequality that we are trying to prove can be equivalently written as

\begin{equation*} 2^n|\mathcal{A}\cap \mathcal{B}|\leq |\mathcal{A}|\cdot |\mathcal{B}|\text{.} \end{equation*}

We prove this by induction on \(n\text{.}\) The case \(n=1\) is easy, so let’s consider \(n\geq2\text{.}\) Define

\begin{equation*} \mathcal{A}^+:= \{A\subseteq [n-1]: A\cup\{n\}\in\mathcal{A}\} \end{equation*}

and

\begin{equation*} \mathcal{A}^-:= \{A\subseteq [n-1]: A\in\mathcal{A}\}\text{.} \end{equation*}

Define \(\mathcal{B}^+\) and \(\mathcal{B}^-\) analogously. Note that \(\mathcal{A}^+\) and \(\mathcal{A}^-\) are both downsets in \(2^{[n-1]}\text{.}\) Moreover, since \(\mathcal{A}\) is a downset, we have that \(A\in\mathcal{A}\) whenever \(A\cup\{n\}\in \mathcal{A}\text{;}\) this implies that \(\mathcal{A}^+\subseteq \mathcal{A}^-\text{.}\) Analogous properties hold for \(\mathcal{B}^+\) and \(\mathcal{B}^-\text{.}\) Note that

\begin{equation*} 2^n|\mathcal{A}\cap \mathcal{B}|=2^n|\mathcal{A}^+\cap \mathcal{B}^+| + 2^n|\mathcal{A}^-\cap \mathcal{B}^-| \end{equation*}

By induction, this is at least

\begin{equation*} 2|\mathcal{A}^+|\cdot |\mathcal{B}^+| + 2|\mathcal{A}^-|\cdot |\mathcal{B}^-| \end{equation*}

\begin{equation*} = \left(|\mathcal{A}^+|+|\mathcal{A}^-|\right)\left(|\mathcal{B}^+|+|\mathcal{B}^-|\right) + \left(|\mathcal{A}^+|-|\mathcal{A}^-|\right)\left(|\mathcal{B}^+|-|\mathcal{B}^-|\right) \end{equation*}

which, since \(\mathcal{A}^+\subseteq\mathcal{A}^-\) and \(\mathcal{B}^+\subseteq\mathcal{B}^-\text{,}\) is at least

\begin{equation*} |\mathcal{A}|\cdot|\mathcal{B}|\text{.} \end{equation*}

This completes the proof.

Of course, the same is true for upsets.

Corollary 1.4.2.

If \(\mathcal{A}\) and \(\mathcal{B}\) are both upsets in \(2^{[n]}\text{,}\) then

\begin{equation*} \frac{|\mathcal{A}\cap\mathcal{B}|}{2^n}\geq \frac{|\mathcal{A}|}{2^n}\cdot \frac{|\mathcal{B}|}{2^n}\text{.} \end{equation*}

Proof.

If \(\mathcal{A}\) is an upset, then \(\overline{\mathcal{A}}:=\{[n]\setminus S: S\in \mathcal{A}\}\) is a downset. Also, \(|\overline{\mathcal{A}}|=|\mathcal{A}|\) and \(|\overline{\mathcal{A}}\cap \overline{\mathcal{B}}|=|\mathcal{A}\cap \mathcal{B}|\text{.}\) The result now follows by Theorem 1.4.1.

Next, let us show that the inequality in Theorem 1.4.1 is tight.

Example 1.4.3.

Perhaps the simplest tight example for the Harris–Kleitman Inequality is to take \(\mathcal{A}=\mathcal{B}=2^{[n]}\text{;}\) in this case, both sides of the inequality are equal to one.

For a more general family of examples, take disjoint subsets \(S\) and \(T\) of \([n]\) and let \(\mathcal{A}:=2^{[n]\setminus S}\) and \(\mathcal{B}:=2^{[n]\setminus T}\text{.}\) Then \(\mathcal{A}\cap \mathcal{B} = 2^{[n]\setminus (S\cup T)}\) and so, since \(S\) and \(T\) are disjoint, it has cardinality \(2^{n-|S|-|T|}\text{.}\) Then both sides of the inequality in Theorem 1.4.1 evaluate to \(2^{-|S|-|T|}\text{.}\)

Perhaps the best way to think of the Harris–Kleitman Inequality is as a “correlation” inequality for downsets. That is, if one selects a element of \(2^{[n]}\) uniformly at random, then the probability of it being in \(\mathcal{A}\) is \(|\mathcal{A}|/2^n\) and the probability of it being in \(\mathcal{B}\) is \(|\mathcal{B}|/2^n\text{.}\) If these two events were independent, then the probability that it is contained in both \(\mathcal{A}\) and \(\mathcal{B}\text{,}\) i.e. the probability that it is in \(\mathcal{A}\cap \mathcal{B}\text{,}\) would be the product of these two probabilities: \((|\mathcal{A}|/2^n)\cdot(|\mathcal{B}|/2^n)\text{.}\) Of course, the actual probability of this event is given by \(|\mathcal{A}\cap \mathcal{B}|/2^n\text{.}\) Therefore, one can interpret this theorem as saying that, in the case that both collections are downsets, the events of a random set landing in \(\mathcal{A}\) and landing in \(\mathcal{B}\) are “positively correlated.”

This allows us to derive inequalities relating the probability of the intersection of two events to the product of their individual probabilities. This is particularly useful when the probability of this intersection is difficult to compute exactly. For example, suppose that \(\mathcal{F}\) is a random subset of \(2^{[n]}\) obtained by including each \(S\subseteq [n]\) with probability \(1/2\text{,}\) independently of all other such sets. Let \(E_1\) be the event that \(\mathcal{F}\) is intersecting and \(E_2\) be the probability that \(\mathcal{F}\) has VC-dimension at most \(37\cdot n^{1/2077}\text{.}\) If the event \(E_1\) is satisfied by \(\mathcal{F}\text{,}\) then it is also satisfied by every subcollection of \(\mathcal{F}\text{;}\) the same is true for \(E_2\text{.}\) Thus, by the Harris–Kleitman Inequality,

\begin{equation*} \mathbb{P}(E_1\cap E_2)\geq \mathbb{P}(E_1)\mathbb{P}(E_2)\text{.} \end{equation*}

Of course, the probabilities of \(E_1\) and \(E_2\) may still be difficult to compute in this case, but they are probably easier to compute than the probability of \(E_1\cap E_2\text{.}\)

Here are a couple YouTube videos related to the Harris–Kleitman Inequality:

Prev Top Next