
Section 2.3 The Kruskal–Katona Theorem

As we saw in the previous section, the Local LYM Inequality provides a lower bound on the cardinality of the shadow of a family \(\mathcal{F}\subseteq\binom{[n]}{k}\) in terms of the cardinality of \(\mathcal{F}\text{.}\) However, this lower bound is rarely optimal. Therefore, it is natural to wonder the following: Given \(1\leq k\leq n\) and \(1\leq m\leq \binom{n}{k}\text{,}\) how small can the shadow of a collection \(\mathcal{F}\subseteq \binom{[n]}{k}\) with \(|\mathcal{F}|=m\) be? As it will turn out, the size of the “ground set” \([n]\) is not very important for this problem, and so we will allow the elements of \(\mathcal{F}\) to come from \(\binom{\mathbb{N}}{k}\text{.}\)
Figure: “Shadow.”
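In what follows, it helps to have a concrete way to compute shadows on small examples. Here is a minimal sketch in Python (the function name `shadow` is our own choice, not notation from the text):

```python
def shadow(family):
    # the shadow consists of all sets obtained by deleting a single
    # element from some member of the family
    return {A - {x} for A in family for x in A}

F = [frozenset({1, 2, 3}), frozenset({1, 2, 4})]
print(sorted(sorted(B) for B in shadow(F)))  # [[1, 2], [1, 3], [1, 4], [2, 3], [2, 4]]
```

Using `frozenset` (rather than `set`) lets the members of the shadow live inside a Python set themselves.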

Example 2.3.1.

As a warm-up, consider the case \(k=2\text{;}\) that is, \(\mathcal{F}\subseteq \binom{\mathbb{N}}{2}\text{.}\) The shadow of \(\mathcal{F}\) is the collection of all sets of size one contained in an element of \(\mathcal{F}\text{.}\) Therefore, in this special case,
\begin{equation*} |\partial \mathcal{F}| = \left|\bigcup_{A\in \mathcal{F}} A\right|\text{.} \end{equation*}
So, the smallest shadow is obtained by any collection of sets of size two which covers as few points as possible. For \(m\geq1\text{,}\) if we let \(\ell\) be the unique integer so that
\begin{equation*} \binom{\ell-1}{2}\lt m\leq \binom{\ell}{2}\text{,} \end{equation*}
then, if \(|\mathcal{F}|=m\text{,}\) the union of the sets in \(\mathcal{F}\) must have cardinality at least \(\ell\text{,}\) and this bound can be achieved by taking \(\mathcal{F}\subseteq \binom{[\ell]}{2}\) with \(|\mathcal{F}|=m\text{.}\)
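This example can be checked numerically; the sketch below (with the hypothetical helper name `ell`) computes \(\ell\) from \(m\) and verifies that \(m\) pairs chosen inside \([\ell]\) cover at most \(\ell\) points:

```python
from itertools import combinations
from math import comb

def ell(m):
    # the unique l with C(l-1, 2) < m <= C(l, 2)
    l = 2
    while comb(l, 2) < m:
        l += 1
    return l

# witness for the upper bound: any m pairs from [l] cover at most l points
for m in [1, 2, 3, 6, 7, 10, 11]:
    l = ell(m)
    fam = list(combinations(range(1, l + 1), 2))[:m]
    assert len({x for e in fam for x in e}) <= l

print([ell(m) for m in [1, 2, 3, 6, 7, 10, 11]])  # [2, 3, 3, 4, 5, 5, 6]
```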
Example 2.3.1 suggests the following.

Intuition.

If you want \(\partial \mathcal{F}\) to be small, then it is wise to keep the union of sets in \(\mathcal{F}\) small. That is, one should avoid using many elements.
Let’s play with sets of cardinality three.

Example 2.3.2.

Let us think about how we would construct a family \(\mathcal{F}\subseteq \binom{\mathbb{N}}{3}\) with \(|\mathcal{F}|=m\) for small values of \(m\) which minimizes the cardinality of \(\partial \mathcal{F}\text{.}\) For \(m=1\text{,}\) the family \(\mathcal{F}\) consists of a single set of size three, and so the cardinality of the shadow is always equal to \(\binom{3}{2}\text{.}\)
Things start getting more interesting for \(m=2\text{.}\) Without loss of generality, assume that \(\mathcal{F}\) contains the set \(\{1,2,3\}\text{.}\) If the other element of \(\mathcal{F}\) is \(\{a,b,c\}\text{,}\) then
\begin{equation*} \partial \mathcal{F} = \{\{1,2\},\{1,3\},\{2,3\}\} \cup\{\{a,b\},\{a,c\},\{b,c\}\}\text{.} \end{equation*}
Therefore, if we are trying to make \(\partial\mathcal{F}\) as small as possible, we want the “overlap” between the collections \(\{\{1,2\},\{1,3\},\{2,3\}\}\) and \(\{\{a,b\},\{a,c\},\{b,c\}\}\) to be as large as possible. Clearly, this occurs when \(\{1,2,3\}\) and \(\{a,b,c\}\) have as many elements in common as possible and so it is optimal to take \(\{a,b,c\}=\{1,2,4\}\text{.}\) Note that, of course, this choice is not unique; e.g., \(\{a,b,c\}=\{2,3,7\}\) is also optimal. Note that this matches our earlier intuition that it is smart to avoid introducing new elements of \(\mathbb{N}\) into the union of sets in \(\mathcal{F}\) as much as possible.
Now, if we apply similar intuition to the case \(m=3\text{,}\) it seems to be a good idea to take three sets which have large pairwise intersections. For the sake of discussion, suppose that \(\mathcal{F}\) contains the sets \(\{1,2,3\},\{1,2,4\}\) and a third set \(\{a,b,c\}\text{.}\) Then
\begin{equation*} \partial \mathcal{F} = \{\{1,2\},\{1,3\},\{2,3\},\{1,4\},\{2,4\}\}\cup\{\{a,b\},\{a,c\},\{b,c\}\}\text{.} \end{equation*}
In particular, \(\partial \mathcal{F}\) already contains all elements of \(\binom{[4]}{2}\text{,}\) with the exception of \(\{3,4\}\text{.}\) So, if we take \(\{a,b,c\}\) to be a subset of \(\{1,2,3,4\}\text{,}\) then, at worst, the shadow will have cardinality \(6\text{.}\) On the other hand, if \(\{a,b,c\}\) contains an element outside of \(\{1,2,3,4\}\text{,}\) then at least two of the sets \(\{a,b\},\{a,c\}\) and \(\{b,c\}\) will not be contained in \(\binom{[4]}{2}\text{,}\) and so \(\partial \mathcal{F}\) will contain at least \(7\) elements. These arguments lead us to the conclusion that an optimal choice for \(\mathcal{F}\) is \(\{\{1,2,3\},\{1,2,4\},\{1,3,4\}\}\text{;}\) in fact, taking any three sets from \(\binom{[4]}{3}\) would be equally good.
Given the answer for \(m=3\text{,}\) the case \(m=4\) actually becomes trivial. If we let \(\mathcal{F}=\binom{[4]}{3}\text{,}\) then the shadow again has cardinality \(6\text{,}\) which must be optimal, as it cannot be better than the best example in the case \(m=3\text{.}\)
Now, when \(m=5\text{,}\) the best thing to do is to take \(\mathcal{F}=\binom{[4]}{3}\cup\{\{1,2,5\}\}\text{,}\) which has a shadow of cardinality \(8\text{.}\) For \(m=6\text{,}\) the best is \(\mathcal{F}=\binom{[4]}{3}\cup\{\{1,2,5\},\{1,3,5\}\}\) and for \(m=7\) it is \(\mathcal{F}=\binom{[4]}{3}\cup\{\{1,2,5\},\{1,3,5\},\{2,3,5\}\}\text{.}\)
But why, you may ask? It is a bit tedious to continue to rule out all other possibilities as we have been doing, but the vague intuition is as follows. When \(\binom{4}{3}\lt m\leq \binom{5}{3}\text{,}\) we have no choice but to introduce a fifth element of \(\mathbb{N}\text{,}\) say, the number \(5\text{,}\) into the union of sets in \(\mathcal{F}\text{.}\) Assuming that \(\mathcal{F}\) contains all of the sets in \(\binom{[4]}{3}\) (which we haven’t justified, but let’s go with it), the “new” sets that we get in the shadow are of the form \(\{a,5\}\) where \(\mathcal{F}\) contains a set of the form \(\{a,b,5\}\text{.}\) So, if we let \(\mathcal{F}'=\{\{a,b\}: \{a,b,5\}\in\mathcal{F}\}\text{,}\) then it seems reasonable that, to make the shadow of \(\mathcal{F}\) small, we would want to minimize the shadow of \(\mathcal{F}'\text{;}\) however, we already learned how to do that in Example 2.3.1.
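The optimal values discussed in this example can be confirmed by exhaustive search. The sketch below (helper names are ours) brute-forces the minimum shadow size over all families of \(m\) triples; restricting the ground set to \([6]\) is safe here, since for \(m\leq 7\) an optimal family never needs more than five points:

```python
from itertools import combinations

def shadow(family):
    # all 2-sets obtained by deleting one element from a member
    return {A - {x} for A in family for x in A}

def min_shadow(m, k=3, ground=range(1, 7)):
    # exhaustive minimum of |∂F| over all m-subsets of the k-sets on `ground`
    triples = [frozenset(c) for c in combinations(ground, k)]
    return min(len(shadow(fam)) for fam in combinations(triples, m))

print([min_shadow(m) for m in range(1, 8)])  # [3, 5, 6, 6, 8, 9, 9]
```

The printed values match the shadows of the families exhibited above for \(m=1,\dots,7\).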
Example 2.3.2 now suggests:

Intuition.

When you are forced to introduce a “new” element into the union of the sets in \(\mathcal{F}\text{,}\) it is wise to keep the union of the sets of \(\mathcal{F}\) which contain that element small.
Still, it seems that it is not obvious what the best collection is for minimizing the shadow. One thing that complicates things is that, as we saw in the above examples, the optimal construction usually isn’t unique. However, the intuition that we should “avoid introducing new elements as long as possible” does point us towards a certain construction. Consider the following.

Definition 2.3.3.

Given distinct subsets \(A=\{a_1,\dots,a_k\}\) and \(B=\{b_1,\dots,b_k\}\) of \(\mathbb{N}\) with \(a_1\lt \cdots\lt a_k\) and \(b_1\lt \cdots\lt b_k\text{,}\) we write
\begin{equation*} A\colex B \end{equation*}
to mean that \(a_j\lt b_j\) where \(j=\max\{i: a_i\neq b_i\}\text{.}\) This ordering is called the colexicographic order (or colex order, for short) on \(\binom{\mathbb{N}}{k}\text{.}\)
In other words, \(A\colex B\) if and only if there exists \(m\) such that the \(m\)th largest element of \(A\) is less than the \(m\)th largest element of \(B\) and all of the larger elements of \(A\) and \(B\) coincide.
Figure 2.3.4. If \(A=\{2,3,4,6\}\) is the green set and \(B=\{1,2,5,6\}\) is the red set, then \(A\colex B\) because the largest elements of \(A\) and \(B\) are the same, but the second largest element of \(B\) (which is \(5\)) is larger than the second largest element of \(A\) (which is \(4\)). That is, \(a_j\lt b_j\) where \(j\) is the largest index on which \(A\) and \(B\) differ.
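In code, the colex comparison amounts to comparing the two sequences written in decreasing order, largest elements first; the first disagreement decides. A small sketch (the helper name `colex_less` is ours):

```python
def colex_less(A, B):
    # A precedes B in colex iff the descending sequences of elements
    # compare lexicographically (largest element first)
    return tuple(sorted(A, reverse=True)) < tuple(sorted(B, reverse=True))

# the sets from Figure 2.3.4
print(colex_less({2, 3, 4, 6}, {1, 2, 5, 6}))  # True
print(colex_less({1, 2, 5, 6}, {2, 3, 4, 6}))  # False
```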

Example 2.3.5.

The first seven elements of \(\binom{\mathbb{N}}{2}\) under the colex order are
\begin{equation*} \{1,2\}, \{1,3\}, \{2,3\}, \{1,4\}, \{2,4\}, \{3,4\},\{1,5\}\text{.} \end{equation*}
The colex order seems to have the properties that we described above; that is, it avoids introducing large numbers for as long as possible. It also has a nice “recursive” structure: when a new number, say \(\ell\text{,}\) is introduced, if you look at the sets of size \(k-1\) obtained by deleting \(\ell\) from all sets in which \(\ell\) is the maximum element, then they are ordered according to the colex order on \(\binom{\mathbb{N}}{k-1}\text{.}\) Just for completeness, let us also mention the more familiar lexicographic order. We won’t actually use it for anything, though.

Definition 2.3.6.

Given distinct subsets \(A=\{a_1,\dots,a_k\}\) and \(B=\{b_1,\dots,b_k\}\) of \(\mathbb{N}\) with \(a_1\lt \cdots\lt a_k\) and \(b_1\lt \cdots\lt b_k\text{,}\) we write
\begin{equation*} A\lex B \end{equation*}
to mean that \(a_j\lt b_j\) where \(j=\min\{i: a_i\neq b_i\}\text{.}\) This ordering is called the lexicographic order (or lex order, for short) on \(\binom{\mathbb{N}}{k}\text{.}\)

Example 2.3.7.

If \(A\) and \(B\) are the sets from Figure 2.3.4, then \(B\lex A\) because the smallest element of \(B\) is \(1\text{,}\) which is less than the smallest element of \(A\) which is \(2\text{.}\)

Example 2.3.8.

The first seven elements of \(\binom{\mathbb{N}}{2}\) under lex order are
\begin{equation*} \{1,2\},\{1,3\},\{1,4\},\{1,5\},\{1,6\}, \{1,7\},\{1,8\}\text{.} \end{equation*}
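Both orders can be generated by sorting with the appropriate key. The sketch below reproduces the two seven-element lists above; restricting to the ground set \([9]\) is harmless, since only small elements appear this early in either order:

```python
from itertools import combinations

pairs = [tuple(c) for c in combinations(range(1, 10), 2)]
colex = sorted(pairs, key=lambda t: t[::-1])   # compare largest elements first
lex = sorted(pairs)                            # compare smallest elements first

print(colex[:7])  # [(1, 2), (1, 3), (2, 3), (1, 4), (2, 4), (3, 4), (1, 5)]
print(lex[:7])    # [(1, 2), (1, 3), (1, 4), (1, 5), (1, 6), (1, 7), (1, 8)]
```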
Our focus in this section is on proving the following theorem, which confirms our intuition that colex order is useful for minimizing the size of the shadow.

Theorem 2.3.9. (Kruskal–Katona)

For all \(k\geq1\) and \(m\geq1\text{,}\) let \(\mathcal{I}_m\) denote the collection of the first \(m\) elements of \(\binom{\mathbb{N}}{k}\) under the colex order. Then every \(\mathcal{F}\subseteq\binom{\mathbb{N}}{k}\) with \(|\mathcal{F}|=m\) satisfies \(|\partial\mathcal{F}|\geq|\partial\mathcal{I}_m|\text{.}\)

In other words, the Kruskal–Katona Theorem says that “initial segments” of the colex order have the smallest shadow. An important part of the proof is to understand how colex order works.

Example 2.3.10.

Given a set \(A\in\binom{\mathbb{N}}{k}\text{,}\) how do we determine the position of \(A\) in the colex order? For example, consider \(A=\{3,7,9\}\text{.}\) Then every set \(B\in\binom{[8]}{3}\) comes before \(A\text{.}\) Also, every set \(B\) of size \(3\) whose largest element is \(9\) and whose second largest element is at most \(6\) comes before \(A\text{.}\) Finally, sets whose largest and second largest elements are \(9\) and \(7\) whose smallest element is at most \(2\) come before \(A\text{.}\) These three disjoint classes actually describe all of the sets that come before \(A\text{.}\) So, the number of sets that come before \(A\) is
\begin{equation*} \binom{8}{3}+\binom{6}{2} + \binom{2}{1}\text{.} \end{equation*}
The argument in this example generalizes straightforwardly to the following.

Lemma 2.3.11.

Let \(A=\{a_1,\dots,a_k\}\in\binom{\mathbb{N}}{k}\) with \(a_1\lt\cdots\lt a_k\text{.}\) Then \(A\) is the \(m\)th element of \(\binom{\mathbb{N}}{k}\) under the colex order, where
\begin{equation*} m = 1+\sum_{i=1}^k \binom{a_i-1}{i}\text{.} \end{equation*}

Proof.

The sets which come before \(A\) in colex can be classified as follows. First, there are sets whose maximum element is less than \(a_k\text{.}\) The number of these sets is \(\binom{a_k-1}{k}\text{.}\) Then, you have the sets whose maximum element is equal to \(a_k\text{,}\) but whose second largest element is less than \(a_{k-1}\text{.}\) The number of these is \(\binom{a_{k-1}-1}{k-1}\text{.}\) Generally, for each \(i=1,\dots,k\text{,}\) we have the sets whose largest \(k-i\) elements are equal to \(a_k,a_{k-1},\dots,a_{i+1}\) and whose next largest element is less than \(a_i\text{.}\) The number of these sets is \(\binom{a_i-1}{i}\text{.}\) This classifies all the sets that come strictly before \(A\) in colex order, and so \(A\) is the \(m\)th element where
\begin{equation*} m = 1+\sum_{i=1}^k \binom{a_i-1}{i}. \end{equation*}
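The formula in the proof is easy to implement and to sanity-check against a brute-force enumeration (the function name `colex_rank` is ours):

```python
from itertools import combinations
from math import comb

def colex_rank(A):
    # position of A in colex: m = 1 + sum_i C(a_i - 1, i), where a_1 < ... < a_k
    a = sorted(A)
    return 1 + sum(comb(a[i] - 1, i + 1) for i in range(len(a)))

print(colex_rank({3, 7, 9}))  # 74, i.e. C(8,3) + C(6,2) + C(2,1) + 1

# cross-check against an explicit colex enumeration of the 3-sets in [9]
ordered = sorted(combinations(range(1, 10), 3), key=lambda t: t[::-1])
assert all(colex_rank(set(t)) == pos + 1 for pos, t in enumerate(ordered))
```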
Here is a simple, but surprisingly useful, corollary of Lemma 2.3.11.

Corollary 2.3.13.

For all \(k\geq1\) and \(m\geq1\text{,}\) there exists a unique sequence of positive integers \(a_1\lt a_2\lt\cdots\lt a_k\) such that
\begin{equation*} m = 1+\sum_{i=1}^k \binom{a_i-1}{i}\text{.} \end{equation*}

Proof.

For existence, let \(A\) be the \(m\)th element of \(\binom{\mathbb{N}}{k}\) under colex. Let \(a_1,\dots,a_k\) be the elements of \(A\) in increasing order. Apply Lemma 2.3.11. Uniqueness follows from the fact that colex is a total order.
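This corollary is constructive: the sequence \(a_k>\cdots>a_1\) can be recovered greedily from \(m\text{,}\) which also gives a way to “unrank” the colex order (this is an instance of what is often called the combinatorial number system; the function name is ours):

```python
from itertools import combinations
from math import comb

def colex_unrank(m, k):
    # greedily recover a_k > ... > a_1 with m - 1 = sum_i C(a_i - 1, i)
    r, A = m - 1, []
    for i in range(k, 0, -1):
        t = i - 1                     # C(i-1, i) = 0, so this start is always valid
        while comb(t + 1, i) <= r:
            t += 1                    # largest t with C(t, i) <= r
        r -= comb(t, i)
        A.append(t + 1)               # a_i = t + 1
    return set(A)

print(sorted(colex_unrank(74, 3)))  # [3, 7, 9]

# consistency with an explicit colex enumeration of the 3-sets in [7]
ordered = sorted(combinations(range(1, 8), 3), key=lambda t: t[::-1])
assert all(colex_unrank(m, 3) == set(ordered[m - 1]) for m in range(1, 36))
```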
Before trying to prove the Kruskal–Katona Theorem, it is useful to get a better understanding of what our “target” lower bound is. That is, how large is the shadow of the first \(m\) elements under colex? The answer is as follows: if \(X_m=\{a_1,\dots,a_k\}\) (with \(a_1\lt\cdots\lt a_k\)) is the \(m\)th element of \(\binom{\mathbb{N}}{k}\) under colex and \(\mathcal{I}_m\) is the collection of the first \(m\) such elements, then
\begin{equation*} |\partial\mathcal{I}_m| = \sum_{i=1}^k\binom{a_i-1}{i-1}\text{.} \end{equation*}

Proof.

Let \(\mathcal{I}_m=\{X_1,\dots,X_m\}\text{.}\) By Lemma 2.3.11, we have
\begin{equation*} X_m=\{a_1,a_2,\dots,a_k\}\text{.} \end{equation*}
Therefore, every set of \(\binom{[a_{k}-1]}{k}\) is in \(\mathcal{I}_m\text{.}\) This implies that \(\partial \mathcal{I}_m\) contains \(\binom{[a_{k}-1]}{k-1}\text{.}\) There are \(\binom{a_{k}-1}{k-1}\) such sets. Also, any set of the form \(S\cup\{a_k\}\) where \(S\in\binom{[a_{k-1}-1]}{k-1}\) is in \(\mathcal{I}_m\text{.}\) This gives us that the shadow contains all sets of the form \(T\cup\{a_k\}\) where \(T\in\binom{[a_{k-1}-1]}{k-2}\text{,}\) of which there are \(\binom{a_{k-1}-1}{k-2}\text{.}\) Generally speaking, for each \(i=1,\dots,k\text{,}\) there are \(\binom{a_{i}-1}{i-1}\) sets in \(\partial\mathcal{I}_m\) whose \(k-i\) largest elements are \(a_k,a_{k-1},\dots,a_{i+1}\) and whose next largest element is less than \(a_i\text{.}\) Summing this over all \(i\) gives the result.
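This count is easy to verify by brute force on small cases (helper names are ours):

```python
from itertools import combinations
from math import comb

def shadow(family):
    return {A - {x} for A in family for x in A}

def shadow_of_initial_segment(A):
    # predicted |∂I_m|, where X_m = A = {a_1 < ... < a_k}
    a = sorted(A)
    return sum(comb(a[i] - 1, i) for i in range(len(a)))

# check the formula against brute-force shadows of every colex initial
# segment of the 3-sets in [7]
ordered = [frozenset(t) for t in
           sorted(combinations(range(1, 8), 3), key=lambda t: t[::-1])]
for m in range(1, len(ordered) + 1):
    assert len(shadow(ordered[:m])) == shadow_of_initial_segment(ordered[m - 1])

print(shadow_of_initial_segment({2, 3, 5}))  # 9, matching the case m = 7 in Example 2.3.2
```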
This allows us to restate the Kruskal–Katona Theorem as follows: For \(k\geq1\) and \(m\geq1\text{,}\) let \(a_1,\dots,a_k\) be a sequence of positive integers such that
\begin{equation*} m=1+\sum_{i=1}^k\binom{a_i-1}{i}\text{.} \end{equation*}
Then every \(\mathcal{F}\subseteq\binom{\mathbb{N}}{k}\) with \(|\mathcal{F}|=m\) satisfies
\begin{equation*} |\partial\mathcal{F}|\geq \sum_{i=1}^k\binom{a_i-1}{i-1}\text{.} \end{equation*}
A natural strategy to prove the Kruskal–Katona Theorem is to start with a collection \(\mathcal{F}\subseteq\binom{\mathbb{N}}{k}\) and try to replace some of the sets in \(\mathcal{F}\) with sets that come earlier under the colex order. The natural way to modify a set \(A\) to make it come earlier under colex is to replace a large element \(j\in A\) with a small element \(i\notin A\text{.}\) The simplest operation for doing this is as follows:

Definition 2.3.14.

For distinct \(i,j\in\mathbb{N}\text{,}\) let \(C_{i,j}:2^{\mathbb{N}}\to 2^{\mathbb{N}}\) be defined by
\begin{equation*} C_{i,j}(A):=\begin{cases}\left(A\setminus \{j\}\right)\cup\{i\} \amp \text{ if } j\in A \text{ and } i\notin A,\\ A \amp \text{ otherwise } \end{cases} \end{equation*}
for each \(A\subseteq \mathbb{N}\text{.}\) We call \(C_{i,j}(A)\) the \((i,j)\)-compression of \(A\text{.}\)
Figure 2.3.15. The set \(A\) and the set \(C_{3,5}(A)\) for four different subsets of \([6]\text{.}\) Only the first one satisfies \(C_{3,5}(A)\neq A\) because \(5\in A\) and \(3\notin A\text{.}\)
Figure 2.3.16. An illustration of how the function \(C_{i,j}\) behaves on \(2^{\mathbb{N}}\text{.}\)
The function \(C_{i,j}\) is certainly not a bijection. If we let
\begin{equation*} \mathcal{S}_{i,j}:=\{A\subseteq\mathbb{N}: i\notin A\text{ and } j\in A\}\text{,} \end{equation*}
then \(C_{i,j}\) maps \(\mathcal{S}_{i,j}\) bijectively to \(\mathcal{S}_{j,i}\text{,}\) but it does not map any set into \(\mathcal{S}_{i,j}\text{.}\) See Figure 2.3.16. Since \(C_{i,j}\) is not a bijection, simply replacing each set \(A\) in a collection \(\mathcal{F}\) with \(C_{i,j}(A)\) can change the cardinality of \(\mathcal{F}\text{.}\) A smarter way to apply \(C_{i,j}\) to a collection of sets is as follows.

Definition 2.3.17.

For distinct \(i,j\in\mathbb{N}\) and \(\mathcal{F}\subseteq 2^{\mathbb{N}}\text{,}\) define
\begin{equation*} C_{i,j}(\mathcal{F}):=\{C_{i,j}(A): A\in \mathcal{F}\} \cup \{A\in \mathcal{F}: C_{i,j}(A)\in \mathcal{F}\}\text{.} \end{equation*}
Call \(C_{i,j}(\mathcal{F})\) the \((i,j)\)-compression of \(\mathcal{F}\).
Figure 2.3.18. The compression \(C_{1,2}\) applied to a family \(\mathcal{F}\subseteq \binom{\mathbb{N}}{3}\text{.}\)
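Definition 2.3.17 translates directly into code: each set is kept when its compression is already present in the family, and replaced by its compression otherwise (an equivalent phrasing of the union in the definition; the names below are ours):

```python
def compress_set(A, i, j):
    # the (i,j)-compression of a single set (Definition 2.3.14)
    if j in A and i not in A:
        return (A - {j}) | {i}
    return A

def compress_family(F, i, j):
    # keep A if its compression is already in F, otherwise replace it
    return {A if compress_set(A, i, j) in F else compress_set(A, i, j) for A in F}

F = {frozenset(s) for s in [{2, 3, 4}, {2, 3, 5}, {1, 3, 4}, {1, 2, 5}]}
G = compress_family(F, 1, 2)
print(sorted(sorted(A) for A in G))  # [[1, 2, 5], [1, 3, 4], [1, 3, 5], [2, 3, 4]]
assert len(G) == len(F)
```

Here \(\{2,3,4\}\) is kept (its compression \(\{1,3,4\}\) is already in \(\mathcal{F}\)), while \(\{2,3,5\}\) is replaced by \(\{1,3,5\}\text{.}\)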
The proof of the following lemma can be more or less understood by staring at Figure 2.3.16 long enough. Nevertheless, we will still give a proper proof.

Lemma 2.3.19.

For distinct \(i,j\in\mathbb{N}\) and any finite \(\mathcal{F}\subseteq2^{\mathbb{N}}\text{,}\) we have \(|C_{i,j}(\mathcal{F})|=|\mathcal{F}|\text{.}\)

Proof.

We describe an explicit bijection \(\varphi:\mathcal{F}\to C_{i,j}(\mathcal{F})\text{.}\) For \(A\in\mathcal{F}\text{,}\) let
\begin{equation*} \varphi(A)=\begin{cases}A \amp \text{ if } A\in C_{i,j}(\mathcal{F})\\ C_{i,j}(A) \amp \text{ otherwise } . \end{cases} \end{equation*}
Let’s argue that \(\varphi\) is surjective. Any set \(B\) in \(C_{i,j}(\mathcal{F})\) which is also in \(\mathcal{F}\) satisfies \(\varphi(B)=B\text{,}\) and so it is covered. If \(B\in C_{i,j}(\mathcal{F})\) and it is not in \(\mathcal{F}\text{,}\) then the set \(A=C_{j,i}(B)\) must be in \(\mathcal{F}\text{.}\) We claim that \(\varphi(A)=B\text{.}\) If not, then it must be the case that \(\varphi(A)=A\text{,}\) which only happens if \(A\in C_{i,j}(\mathcal{F})\text{.}\) But the only reason that \(A\) could be in \(C_{i,j}(\mathcal{F})\) is if \(B\in \mathcal{F}\text{,}\) which we already said is not the case; contradiction.
Now, we verify that it is injective. Suppose \(A_1,A_2\in \mathcal{F}\) and \(\varphi(A_1)=\varphi(A_2)\text{.}\) If both \(A_1\) and \(A_2\) are in \(C_{i,j}(\mathcal{F})\text{,}\) then we get \(A_1=\varphi(A_1)=\varphi(A_2)=A_2\text{.}\) If \(A_1\in C_{i,j}(\mathcal{F})\) and \(A_2\notin C_{i,j}(\mathcal{F})\text{,}\) then we have \(A_1=\varphi(A_1)=\varphi(A_2)=C_{i,j}(A_2)\text{.}\) But then \(C_{i,j}(A_2)\in \mathcal{F}\) which, by definition of \(C_{i,j}(\mathcal{F})\text{,}\) implies that \(A_2\in C_{i,j}(\mathcal{F})\text{,}\) which we said was not the case; contradiction. If neither \(A_1\) nor \(A_2\) is in \(C_{i,j}(\mathcal{F})\text{,}\) then \(C_{i,j}(A_1)=\varphi(A_1)=\varphi(A_2)=C_{i,j}(A_2)\text{.}\) Since \(A_1,A_2\notin C_{i,j}(\mathcal{F})\text{,}\) both sets lie in \(\mathcal{S}_{i,j}\text{,}\) on which \(C_{i,j}\) is injective, and so \(A_1=A_2\text{.}\) This completes the proof.
What is less clear, and more difficult to visualize, is that applying \(C_{i,j}\) to a collection \(\mathcal{F}\) does not increase the size of the shadow.

Lemma 2.3.20.

For distinct \(i,j\in\mathbb{N}\) and any finite \(\mathcal{F}\subseteq2^{\mathbb{N}}\text{,}\) we have \(|\partial C_{i,j}(\mathcal{F})|\leq|\partial\mathcal{F}|\text{.}\)

Proof.

What we will actually prove is that
\begin{equation} \partial C_{i,j}(\mathcal{F})\subseteq C_{i,j}(\partial\mathcal{F})\text{.}\tag{2.3.1} \end{equation}
This will be sufficient since, by Lemma 2.3.19, \(|C_{i,j}(\partial\mathcal{F})|=|\partial \mathcal{F}|\text{.}\)
So, we need to show that every set \(B\in \partial C_{i,j}(\mathcal{F})\) is in \(C_{i,j}(\partial\mathcal{F})\text{.}\) None of the individual steps in proving this are hard. The difficulty mainly comes from digesting what it “means” and then dealing with the multitude of cases that arise!
Let \(B\in \partial C_{i,j}(\mathcal{F})\text{.}\) Then there must exist \(\ell\notin B\) such that \(B\cup\{\ell\}\in C_{i,j}(\mathcal{F})\text{.}\) At this point, let’s divide into cases.
Case 1: \(B\cup\{\ell\}\in \mathcal{F}\) and \(B\cap \{i,j\}\neq \{j\}\text{.}\)
In this case, we have that \(B\in\partial\mathcal{F}\) and that \(C_{i,j}(B)=B\text{.}\) Therefore, \(B\in C_{i,j}(\partial\mathcal{F})\) and so (2.3.1) holds in this case.
Case 2: \(B\cup\{\ell\}\in \mathcal{F}\text{,}\) \(B\cap \{i,j\}= \{j\}\) and \(\ell\neq i\text{.}\)
Since \(\ell\neq i\text{,}\) we have \((B\cup\{\ell\})\cap \{i,j\}= B\cap\{i,j\}=\{j\}\text{.}\) Recall that \(B\cup\{\ell\}\in C_{i,j}(\mathcal{F})\text{.}\) Since \((B\cup\{\ell\})\cap \{i,j\}= \{j\}\) we have that \(C_{i,j}(B\cup\{\ell\})\neq B\cup\{\ell\}\text{.}\) So, the only way for \(B\cup \{\ell\}\) to be in both \(\mathcal{F}\) and \(C_{i,j}(\mathcal{F})\) is for \(C_{i,j}(B\cup\{\ell\}) = (B\setminus\{j\})\cup\{i,\ell\}\) to be in \(\mathcal{F}\) as well. So, both of \(B\) and \((B\setminus\{j\})\cup\{i\} = C_{i,j}(B)\) are in \(\partial \mathcal{F}\text{.}\) This implies that \(B\) is in \(C_{i,j}(\partial \mathcal{F})\text{.}\) So, (2.3.1) holds in this case.
Case 3: \(B\cup\{\ell\}\in \mathcal{F}\text{,}\) \(B\cap \{i,j\}= \{j\}\) and \(\ell= i\text{.}\)
In this case, since \(B\cup\{i\}\in\mathcal{F}\) and \(j\in B\text{,}\) we have that both \(B\) and \((B\cup\{i\})\setminus \{j\} = C_{i,j}(B)\) are in \(\partial\mathcal{F}\text{.}\) So, \(C_{i,j}(\partial \mathcal{F})\) contains both \(C_{i,j}(B)\) and \(B\) itself. So, (2.3.1) holds in this case.
Case 4: \(B\cup\{\ell\}\in C_{i,j}(\mathcal{F})\setminus\mathcal{F}\) and \(\ell\neq i\text{.}\)
In this case, we must have that \(B\cup\{\ell\}\) contains \(i\) and not \(j\) and that \(((B\cup\{\ell\})\setminus\{i\})\cup\{j\}\in\mathcal{F}\text{.}\) Note that the fact that \(B\cup\{\ell\}\) contains \(i\) and not \(j\) implies that \(\ell\neq j\text{.}\) Now, since \(\ell\neq i\text{,}\) the set \((B\setminus\{i\})\cup\{j\}\) is in \(\partial \mathcal{F}\text{.}\) So, \(C_{i,j}((B\setminus\{i\})\cup\{j\}) = B\) is in \(C_{i,j}(\partial\mathcal{F})\) and so (2.3.1) holds in this case.
Case 5: \(B\cup\{\ell\}\in C_{i,j}(\mathcal{F})\setminus\mathcal{F}\) and \(\ell=i\text{.}\)
In this case, we again must have that \(B\cup\{i\}\) contains \(i\) and not \(j\) and that \(B\cup\{j\}\in\mathcal{F}\text{.}\) So, \(B\in\partial\mathcal{F}\text{.}\) Since \(B\) contains neither \(i\) nor \(j\text{,}\) \(C_{i,j}(B)=B\in C_{i,j}(\partial\mathcal{F})\) and so (2.3.1) holds in this case as well.
We have exhausted all of the cases (and ourselves) and so the proof is complete.
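Both the size-preservation lemma and the inclusion (2.3.1) can be spot-checked on random families (a sketch, using compression helpers as above; names are ours):

```python
import random
from itertools import combinations

def shadow(F):
    return {A - {x} for A in F for x in A}

def compress_set(A, i, j):
    if j in A and i not in A:
        return (A - {j}) | {i}
    return A

def compress_family(F, i, j):
    return {A if compress_set(A, i, j) in F else compress_set(A, i, j) for A in F}

random.seed(0)
triples = [frozenset(c) for c in combinations(range(1, 7), 3)]
for _ in range(500):
    F = set(random.sample(triples, 6))
    i, j = sorted(random.sample(range(1, 7), 2))
    CF = compress_family(F, i, j)
    assert len(CF) == len(F)                               # size is preserved
    assert shadow(CF) <= compress_family(shadow(F), i, j)  # the inclusion (2.3.1)
    assert len(shadow(CF)) <= len(shadow(F))               # shadow does not grow

print("all random checks passed")
```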
The notion of compression leads us to a natural strategy for trying to prove the Kruskal–Katona Theorem. That is, start with a finite collection \(\mathcal{F}\subseteq\binom{\mathbb{N}}{k}\text{.}\) While there exists \(i\lt j\) such that \(C_{i,j}(\mathcal{F})\neq \mathcal{F}\text{,}\) replace \(\mathcal{F}\) with \(C_{i,j}(\mathcal{F})\text{.}\) By Lemma 2.3.19, this does not change the size of the collection and, by Lemma 2.3.20, this does not increase the size of the shadow. Imagine that we keep repeating this until no such \(i\) and \(j\) exist. Two natural questions arise:
  1. Does this eventually terminate?
  2. Does it eventually reach an initial segment of colex?
The answer to the first question here is “yes.” One way to see this is by defining a weight function. For a finite set \(A\subseteq \mathbb{N}\text{,}\) define the weight of \(A\) to be
\begin{equation*} w(A):=\sum_{i\in A}2^i\text{.} \end{equation*}
Then define the weight of a finite collection \(\mathcal{F}\subseteq\binom{\mathbb{N}}{k}\) to be
\begin{equation*} w(\mathcal{F})=\sum_{A\in\mathcal{F}}w(A)\text{.} \end{equation*}
The key observation is that, for \(i\lt j\text{,}\) the map \(C_{i,j}\) does not increase the weight of a set and, if \(C_{i,j}(A)\neq A\text{,}\) then the weight of \(C_{i,j}(A)\) is strictly less than that of \(A\text{.}\) Since the weight of a collection is a positive integer, and each application of \(C_{i,j}\) for \(i\lt j\) which changes the collection strictly decreases its weight, the process described above eventually terminates.
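The whole process, with the weight function acting as a termination certificate, can be sketched as follows (names ours). Note that on this particular input the process happens to land on an initial segment of colex, which is not guaranteed in general:

```python
def weight(F):
    # strictly decreases whenever a compression with i < j changes F
    return sum(2 ** x for A in F for x in A)

def compress_set(A, i, j):
    if j in A and i not in A:
        return (A - {j}) | {i}
    return A

def compress_family(F, i, j):
    return {A if compress_set(A, i, j) in F else compress_set(A, i, j) for A in F}

def fully_compress(F):
    n = max(max(A) for A in F)
    changed = True
    while changed:
        changed = False
        for j in range(2, n + 1):
            for i in range(1, j):
                G = compress_family(F, i, j)
                if G != F:
                    assert weight(G) < weight(F)  # guarantees termination
                    F, changed = G, True
    return F

F = {frozenset(s) for s in [{2, 3, 5}, {1, 4, 5}, {3, 4, 5}]}
print(sorted(sorted(A) for A in fully_compress(F)))  # [[1, 2, 3], [1, 2, 4], [1, 3, 4]]
```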
Thus, to prove the Kruskal–Katona Theorem, it suffices to consider a collection \(\mathcal{F}\) such that \(C_{i,j}(\mathcal{F})=\mathcal{F}\) for all \(i\lt j\text{.}\) The following definition describes this situation.

Definition 2.3.21.

For distinct \(i,j\geq1\text{,}\) say that \(\mathcal{F}\subseteq\binom{\mathbb{N}}{k}\) is \((i,j)\)-compressed if \(C_{i,j}(\mathcal{F})=\mathcal{F}\text{.}\) Say that \(\mathcal{F}\) is compressed if it is \((i,j)\)-compressed for all \(i\lt j\text{.}\)
Now, what about that second question? If \(\mathcal{F}\) is compressed, does it follow that \(\mathcal{F}\) is an initial segment of colex? Sadly, it does not, as the next example shows.

Example 2.3.22.

Consider
\begin{equation*} \mathcal{F}=\{\{1,2,3\},\{1,2,4\},\{1,2,5\},\{1,2,6\},\{1,2,7\}\}\text{.} \end{equation*}
Clearly, this is not an initial segment of colex. However, if you think about it, you can see that \(\mathcal{F}\) is compressed.
Indeed, if \(i\in\{1,2\}\text{,}\) then \(C_{i,j}(A)=A\) for all \(j>i\) and \(A\in \mathcal{F}\text{.}\) Also, if \(3\leq i\lt j\leq 7\) and \(A\in\mathcal{F}\text{,}\) then either \(C_{i,j}(A)=A\) or \(C_{i,j}(A)\in \mathcal{F}\text{.}\) Thus, \(C_{i,j}(\mathcal{F})=\mathcal{F}\text{.}\)
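One can also confirm this by machine (compression helpers as before; names ours):

```python
from itertools import combinations

def compress_set(A, i, j):
    if j in A and i not in A:
        return (A - {j}) | {i}
    return A

def compress_family(F, i, j):
    return {A if compress_set(A, i, j) in F else compress_set(A, i, j) for A in F}

F = {frozenset({1, 2, x}) for x in range(3, 8)}
compressed = all(compress_family(F, i, j) == F
                 for j in range(2, 8) for i in range(1, j))
print(compressed)  # True

# ...yet F is not an initial segment: the first five 3-sets in colex are
first5 = sorted(combinations(range(1, 8), 3), key=lambda t: t[::-1])[:5]
print([list(t) for t in first5])  # [[1, 2, 3], [1, 2, 4], [1, 3, 4], [2, 3, 4], [1, 2, 5]]
print(F == {frozenset(t) for t in first5})  # False
```

(Compressions \(C_{i,j}\) with \(j\gt 7\) fix \(\mathcal{F}\) trivially, since no member contains an element larger than \(7\text{.}\))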
At this point, it may feel like our strategy of trading large elements for small ones has failed. However, the situation is not as bad as it seems. There are several directions that we can go. One way is to consider more powerful types of compressions in which, instead of trading one element for another, we trade a whole set of elements for another set that comes earlier in colex. If you are sufficiently careful in how you do it, then this can work; this is the strategy followed in [186], for example. We won’t do it this way, but it is worth being aware that it is possible. Instead, we will follow an inductive argument.

Proof of Theorem 2.3.9.

We proceed by induction on \(k+m\text{.}\) As a base case, suppose that \(k=1\text{.}\) This case is easy, as every collection \(\mathcal{F}\subseteq\binom{\mathbb{N}}{1}\) has the same shadow; namely, \(\{\emptyset\}\text{.}\) In what follows, we assume that \(k\geq2\) and let \(m\geq1\text{.}\) By Corollary 2.3.13, we can let \(a_1,\dots,a_k\) be a sequence of positive integers such that
\begin{equation} m=1+\sum_{i=1}^k\binom{a_i-1}{i}\text{.}\tag{2.3.2} \end{equation}
Let \(\mathcal{F}\subseteq\binom{\mathbb{N}}{k}\) be such that \(|\mathcal{F}|=m\text{.}\) By Lemmas 2.3.19 and 2.3.20, we can assume that \(\mathcal{F}\) is compressed. Define
\begin{equation*} \mathcal{F}_1:=\{A\in\mathcal{F}: 1\in A\},\text{ and } \end{equation*}
\begin{equation*} \mathcal{F}_0:=\mathcal{F}\setminus\mathcal{F}_1\text{.} \end{equation*}
Here’s a cool thing about \(\mathcal{F}\) being compressed. Imagine that we start with a set \(A\in \mathcal{F}_0\text{.}\) Take an element \(\ell\) of \(A\) and delete it; this gives us an element of \(\partial\mathcal{F}_0\text{.}\) Now, add the element \(1\) to the resulting set. This process yields \(C_{1,\ell}(A)\text{,}\) which must be an element of \(\mathcal{F}_1\) because \(\mathcal{F}\) is compressed. The last step of this process (adding \(1\) to a set in \(\partial \mathcal{F}_0\)) is an injection from \(\partial\mathcal{F}_0\) to \(\mathcal{F}_1\text{.}\) Therefore, we have the following
\begin{equation} |\mathcal{F}_1|\geq |\partial\mathcal{F}_0|\text{.}\tag{2.3.3} \end{equation}
The above equation bounds the size of \(\mathcal{F}_1\) relative to the shadow of \(\mathcal{F}_0\text{.}\) Let’s turn this into an absolute bound.

Claim 2.3.23.

We have
\begin{equation*} |\mathcal{F}_1|\geq 1+\sum_{i=2}^k\binom{a_i-2}{i-1}\text{.} \end{equation*}

Proof.

We suppose that it is false and derive a contradiction. By (2.3.2), we get
\begin{equation*} |\mathcal{F}_0|=|\mathcal{F}|-|\mathcal{F}_1|=1+\sum_{i=1}^k\binom{a_i-1}{i} - |\mathcal{F}_1|\geq 1+\sum_{i=1}^k\binom{a_i-1}{i}-\sum_{i=2}^k\binom{a_i-2}{i-1} \end{equation*}
\begin{equation*} =1+\binom{a_1-1}{1} + \sum_{i=2}^k\left[\binom{a_i-1}{i}-\binom{a_i-2}{i-1}\right] = 1 + \binom{a_1-1}{1} + \sum_{i=2}^k\binom{a_i-2}{i}\text{.} \end{equation*}
In conclusion,
\begin{equation} |\mathcal{F}_0|\geq 1 + \binom{a_1-1}{1} + \sum_{i=2}^k\binom{a_i-2}{i}\text{.}\tag{2.3.4} \end{equation}
At first, this may seem pretty irrelevant for proving the claim. But here’s the kicker. Knowing that \(|\mathcal{F}_0|\) is large implies that its shadow is large by the induction hypothesis, which, by (2.3.3), implies that \(|\mathcal{F}_1|\) is large as well.
Here are the details. What we would like to do is simply say that (2.3.4), mixed with induction, immediately implies that
\begin{equation} |\partial\mathcal{F}_0|\geq \binom{a_1-1}{0}+\sum_{i=2}^k\binom{a_i-2}{i-1}\geq 1+\sum_{i=2}^k\binom{a_i-2}{i-1}\tag{2.3.5} \end{equation}
which proves the claim. This almost just works. However, we need to be a little bit more careful. We can’t quite apply induction immediately here because we don’t know that the sequence \(a_1,a_2-1,\dots,a_k-1\) is increasing. That is, we could have \(a_1=a_2-1\text{.}\) We can get around this issue with a little bit of tinkering. If \(a_1\geq2\text{,}\) then we can use (2.3.4) and the fact that \(a_1-1,a_2-1,\dots,a_k-1\) is increasing to derive (2.3.5), since \(\binom{a_1-2}{0}=1\text{.}\) So, we only have a problem in the case \(a_1=1\) and \(a_2=2\text{.}\) Define \(j\) to be the largest index such that \(a_j=j\text{.}\) Then, for \(2\leq i\leq j\text{,}\) we have \(\binom{a_i-2}{i}=0=\binom{a_i-1}{i}\text{.}\) So, we can rewrite (2.3.4) as
\begin{equation*} |\mathcal{F}_0|\geq 1+\binom{a_1-1}{1} + \sum_{i=2}^j\binom{a_i-1}{i} + \sum_{i=j+1}^k\binom{a_i-2}{i}\text{.} \end{equation*}
The sequence \(a_1,a_2,\dots,a_j,a_{j+1}-1,\dots,a_k-1\) is genuinely increasing, so we can derive (2.3.5) from this inequality and induction.
Finally, (2.3.3) and (2.3.5) directly imply the claim.
We are now in position to finish the proof off. Define
\begin{equation*} \mathcal{G}:=\{A\setminus \{1\}: A\in\mathcal{F}_1\}\text{.} \end{equation*}
Note that \(\mathcal{G}\subseteq\partial\mathcal{F}\) and that \(|\mathcal{G}|=|\mathcal{F}_1|\text{.}\) Define a map \(\psi:\mathcal{G}\cup\partial\mathcal{G}\to\partial\mathcal{F}\) by defining, for each \(B\) in the domain,
\begin{equation*} \psi(B):=\begin{cases}B \amp \text{ if } B\in\mathcal{G},\\ B\cup\{1\} \amp \text{ if } B\in\partial\mathcal{G}. \end{cases} \end{equation*}
Note that \(\psi\) indeed maps into \(\partial\mathcal{F}\text{:}\) if \(B\in\partial\mathcal{G}\text{,}\) say \(B=(A\setminus\{1\})\setminus\{x\}\) for some \(A\in\mathcal{F}_1\) and \(x\in A\setminus\{1\}\text{,}\) then \(\psi(B)=B\cup\{1\}=A\setminus\{x\}\in\partial\mathcal{F}\text{.}\) Moreover, \(\psi\) is injective, since sets of the first type do not contain \(1\) while sets of the second type do; therefore, \(|\partial\mathcal{F}|\geq |\mathcal{G}| + |\partial\mathcal{G}|\text{.}\) By Claim 2.3.23, the facts that \(|\mathcal{G}|=|\mathcal{F}_1|\) and \(\mathcal{G}\subseteq \binom{\mathbb{N}}{k-1}\text{,}\) and induction, we have
\begin{equation*} |\partial\mathcal{F}|\geq |\mathcal{G}| + |\partial\mathcal{G}| \geq 1+\sum_{i=2}^k\binom{a_i-2}{i-1} + \sum_{i=2}^k\binom{a_i-2}{i-2} \end{equation*}
\begin{equation*} =1+\sum_{i=2}^k\binom{a_i-1}{i-1} = \sum_{i=1}^k\binom{a_i-1}{i-1}\text{.} \end{equation*}
WOW! Right?
The following is a weak form of Theorem 2.3.9 which has a particularly elegant formulation. Given a real number \(\ell\) and an integer \(k\text{,}\) define
\begin{equation*} \binom{\ell}{k}:=\frac{\ell(\ell-1)\cdots(\ell-k+1)}{k!}\text{.} \end{equation*}

Theorem 2.3.24.

Let \(k\geq1\text{,}\) let \(\ell\geq k\) be a real number and let \(\mathcal{F}\subseteq\binom{\mathbb{N}}{k}\text{.}\) If \(|\mathcal{F}|\geq\binom{\ell}{k}\text{,}\) then \(|\partial\mathcal{F}|\geq\binom{\ell}{k-1}\text{.}\)
Let us now show how the Kruskal–Katona Theorem (in the form of Theorem 2.3.24 with \(\ell\in\mathbb{N}\)) implies the Erdős–Ko–Rado Theorem (Theorem 1.1.9). Given a family \(\mathcal{F}\) and \(t\geq1\text{,}\) let \(\partial^{(t)}\mathcal{F}\) be the collection obtained by applying the shadow operation to \(\mathcal{F}\) exactly \(t\) times. That is, \(\partial^{(1)}\mathcal{F}=\partial \mathcal{F}\) and, for \(t\geq2\text{,}\)
\begin{equation*} \partial^{(t)}\mathcal{F}:=\partial\left(\partial^{(t-1)}\mathcal{F}\right)\text{.} \end{equation*}

Another Proof of Theorem 1.1.9.

Suppose that \(\mathcal{F}\subseteq \binom{[n]}{k}\) is intersecting. Define
\begin{equation*} \mathcal{G}:=\{[n]\setminus A: A\in\mathcal{F}\}\text{.} \end{equation*}
If \(A\in \mathcal{F}\) and \(B\in \mathcal{G}\text{,}\) then we cannot have \(A\subseteq B\text{:}\) the set \(B\) is of the form \([n]\setminus A'\) for some \(A'\in\mathcal{F}\text{,}\) and so \(A\subseteq B\) would imply \(A\cap A'=\emptyset\text{,}\) contradicting the fact that \(\mathcal{F}\) is intersecting. Since every element of \(\partial^{(n-2k)}\mathcal{G}\) is a \(k\)-subset of some element of \(\mathcal{G}\text{,}\) it follows that
\begin{equation*} \mathcal{F}\cap \partial^{(n-2k)}\mathcal{G}=\emptyset\text{.} \end{equation*}
Suppose that \(|\mathcal{F}|\geq \binom{n-1}{k-1}\text{.}\) Then, since \(|\mathcal{G}|=|\mathcal{F}|\text{,}\) we also have \(|\mathcal{G}|\geq \binom{n-1}{k-1} = \binom{n-1}{n-k}\text{.}\) By Theorem 2.3.24 and the fact that \(\mathcal{G}\subseteq \binom{[n]}{n-k}\text{,}\) we have that
\begin{equation*} |\partial \mathcal{G}|\geq \binom{n-1}{n-k-1} \end{equation*}
and, by induction, for all \(1\leq t\leq n-k\text{,}\)
\begin{equation*} |\partial^{(t)}\mathcal{G}|\geq \binom{n-1}{n-k-t}\text{.} \end{equation*}
In particular, \(|\partial^{(n-2k)}\mathcal{G}|\geq \binom{n-1}{k}\text{.}\) However, since this collection is disjoint from \(\mathcal{F}\) (we already proved this above), this implies that \(|\mathcal{F}|\leq\binom{n}{k}-\binom{n-1}{k}= \binom{n-1}{k-1}\text{.}\) So, in either case, \(|\mathcal{F}|\leq\binom{n-1}{k-1}\text{,}\) and the proof is complete.
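The chain of bounds in this proof can be sanity-checked numerically. The sketch below (helper names ours) takes \(n=7\text{,}\) \(k=3\) and the star of all \(3\)-sets containing \(1\) as the intersecting family:

```python
from itertools import combinations
from math import comb

def shadow(F):
    return {A - {x} for A in F for x in A}

def iterated_shadow(F, t):
    # apply the shadow operation t times
    for _ in range(t):
        F = shadow(F)
    return F

n, k = 7, 3
star = [frozenset({1} | set(c)) for c in combinations(range(2, n + 1), k - 1)]
G = [frozenset(range(1, n + 1)) - A for A in star]

# |∂^t G| >= C(n-1, n-k-t) for 1 <= t <= n-k
for t in range(1, n - k + 1):
    assert len(iterated_shadow(G, t)) >= comb(n - 1, n - k - t)

# the star is disjoint from the (n-2k)-fold shadow of G, as in the proof
assert not set(star) & iterated_shadow(G, n - 2 * k)
print("bounds verified for n = 7, k = 3")
```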