Skip to main content

Section 6.1 Independent Sets in Triangle-Free Graphs

The first question that we will focus on is: given a triangle-free graph \(G\) on \(n\) vertices, how large of an independent set must \(G\) contain? Let’s start by obtaining a relatively easy bound of roughly \(\sqrt{n}\text{.}\) Recall that \(\alpha(G)\) is the cardinality of the largest independent set in a graph \(G\text{.}\)

Proof.

Since \(G\) is triangle-free, for every vertex \(v\text{,}\) the neighbourhood \(N(v)\) is an independent set. Therefore, if \(G\) contains a vertex of degree at least \(\sqrt{n} - \frac{1}{2}\text{,}\) then we are done.
So, we assume that the maximum degree of \(G\) is less than \(\sqrt{n} - \frac{1}{2}\text{.}\) Let \(f:V(G)\to\mathbb{N}\) be a function obtained by ordering the vertices of \(G\) by \(v_1,\dots,v_n\) arbitrarily and letting \(f(v_i)\) be the smallest integer that is not contained in \(\{f(v_j): j\lt i\text{ and } v_j\in N(v_i)\}\text{.}\) Then, by construction, the set \(f^{-1}(c)\) is independent for every \(c\in\mathbb{N}\text{.}\) Also, the neighbours of any given vertex \(v\) are coloured with at most \(d(v)\) distinct colours, and so \(f(v)\leq d(v)+1\leq \sqrt{n}+\frac{1}{2}\text{.}\) for every vertex \(v\text{.}\) Therefore, by the Pigeonhole Principle, there exists \(1\leq c\leq \sqrt{n}+\frac{1}{2}\) such that
\begin{equation*} |f^{-1}(c)|\geq \frac{n}{\sqrt{n}+\frac{1}{2}} > \sqrt{n} - \frac{1}{2} \end{equation*}
which completes the proof.

Remark 6.1.2.

The bound in the above proof can be improved a wee bit if we use “Brooks’ Theorem” (which is not covered in this course) instead of the simple greedy colouring procedure. However, the gain would not be very large.
The argument in Proposition 6.1.1 was not terribly sophisticated, but it is somewhat instructive. In particular, it tells us that a simple way of finding a large independent set in a triangle-free graph is just by taking the neighbourhood of a large degree vertex. Next, we present an argument which is based on the same fundamental principle, but contains more clever analysis.

Proof.

Let \(f:[0,\infty)\to \mathbb{R}\) be defined by
\begin{equation*} f(x)= \begin{cases}1 \amp \text{ if } x=0,\\ 1/2 \amp \text{ if } x=1,\\ \frac{x\ln(x)-x+1}{(x-1)^2} \amp \text{ otherwise } . \end{cases} \end{equation*}
We start by making some observations about \(f\text{.}\) By taking a couple of derivatives, it is not hard to that \(f\) satisfies
\begin{equation} 1-f(x)(x+1)+f'(x)(x-x^2)=0\tag{6.1.1} \end{equation}
and that \(f\) is convex (see Appendix C). Therefore, it satisfies
\begin{equation} f(x)\geq f(y) + f'(y)(x-y)\tag{6.1.2} \end{equation}
for every \(x,y\in [0,1]\text{.}\)
We show that every triangle-free graph \(G\) on \(n\) vertices has an independent set of cardinality at least \(f(2|E(G)/n)n\text{.}\) We will do this by induction on \(n\text{,}\) where the case \(n=1\) is trivial.
Define
\begin{equation*} d=2|E(G)|/n = \sum_{v\in V(G)}d(v)/n\text{.} \end{equation*}
That is, \(d\) is the average degree of \(G\text{.}\) What we are trying to prove is that \(\alpha(G)\geq f(d)n\text{.}\) For each vertex \(v\text{,}\) let
\begin{equation*} d_2(v) := \sum_{u\in N(v)}d(u) \end{equation*}
and let \(G_v\) be the graph obtained from \(G\) by deleting \(v\) and all of its neighbours. Note that the number of vertices in \(G_v\) is \(n-d(v)-1\text{.}\) Define
\begin{equation*} d_v:=2|E(G_v)|/(n-d(v)-1)\text{.} \end{equation*}
That is, \(d_v\) is the average degree of \(G_v\text{.}\)
Since \(G\) is triangle-free, we have that, for every \(v\in V(G)\text{,}\)
\begin{equation} |E(G)| = |E(G_v)| + d_2(v)\tag{6.1.3} \end{equation}
Also, for any independent set \(S\) in \(G_v\text{,}\) the set \(S\cup\{v\}\) is independent in \(G\text{.}\) Therefore, \(\alpha(G) \geq \alpha(G_v)+1\text{.}\) Applying this bound for every \(v\in V(G)\text{,}\) induction on \(|V(G)|\) and (6.1.2), we get
\begin{equation*} n\cdot \alpha(G) \geq \sum_{v\in V(G)}\left(\alpha(G_v)+1\right) \geq n + \sum_{v\in V(G)}f(d_v)(n-d(v)-1) \end{equation*}
\begin{equation*} \geq n+\sum_{v\in V(G)}[f(d) + f'(d)(d_v-d)](n-d(v)-1)\text{.} \end{equation*}
Doing a bit of rearranging on the right side (and using the fact that \(\sum_{v\in V(G)}d(v) = dn\)) yields
\begin{equation*} n+\sum_{v\in V(G)}[f(d) + f'(d)(d_v-d)](n-d(v)-1) \end{equation*}
\begin{equation*} =n + \sum_{v\in V(G)}[f(d) - f'(d)d](n-d(v)-1) + \sum_{v\in V(G)}f'(d)d_v(n-d(v)-1) \end{equation*}
\begin{equation*} =n+(f(d)-f'(d)d)(n^2-dn-n) + f'(d)\sum_{v\in V(G)}d_v(n-d-1)\text{.} \end{equation*}
By definition, we have \(d_v(n-d-1)= 2|E(G_v)|\) for all \(v\in V(G)\text{.}\) Combining this with (6.1.3) and the fact that \(|E(G)|=nd\text{,}\) we get that the above expression is equal to
\begin{equation*} n+(f(d)-f'(d)d)(n^2-dn-n) + f'(d)\sum_{v\in V(G)}(nd - 2d_2(v)) \end{equation*}
\begin{equation*} = n+f(d)(n^2-dn-n) +f'(d)(d^2n+dn) - 2f'(d)\sum_{v\in V(G)}d_2(v)\text{.} \end{equation*}
Now, \(\sum_{v\in V(G)}d_2(v)\) precisely counts the number of ways to choose a vertex \(v\) of \(G\text{,}\) choose a neighbour \(u\) of \(v\) and then choose a neighbour \(w\) of \(u\text{.}\) This is the same as the number of ways to simply choose the vertex \(u\) first and choose two of its neighbours, \(v\) and \(w\text{.}\) Thus,
\begin{equation*} \sum_{v\in V(G)}d_2(v) = \sum_{u\in V(G)}d(u)^2 \geq \frac{1}{n}\left(\sum_{u\in V(G)}d(u)\right)^2 = nd^2 \end{equation*}
by Corollary C.0.6. So, putting everything together (and dividing by \(n\)) we get that
\begin{equation*} \alpha(G)\geq f(d)n +1- f(d)(d+1) + f'(d)(d-d^2) \end{equation*}
\begin{equation*} = f(d)n \end{equation*}
Recall that \(G\) has an independent set of cardinality \(d(v)\) for any \(v\in V(G)\text{.}\) So, it has an independent set of cardinality at least the average degree \(d\text{.}\) So, by the result above,
\begin{equation*} \alpha(G)\geq \max\{d,f(d)n\}\text{.} \end{equation*}
If \(d\geq \sqrt{\frac{n\ln(n)}{2}}\text{,}\) then we are done. Otherwise, \(d\lt \sqrt{\frac{n\ln(n)}{2}}\text{,}\) and plugging this into \(f(d)n\) yields a lower bound of roughly
\begin{equation*} \approx \left(\frac{\ln(d)}{d}\right) n\geq\left(\frac{\ln\left(\sqrt{\frac{n\ln(n)}{2}}\right)}{\sqrt{\frac{n\ln(n)}{2}}}\right)n \geq\sqrt{\frac{n\ln(n)}{2}} \end{equation*}
(since \(\ln\left(\sqrt{\frac{n\ln(n)}{2}}\right)\geq \frac{1}{2}\ln(n)\)).
For \(k,\ell\in\mathbb{N}\text{,}\) the Ramsey number \(R(k,\ell)\) is the minimum \(N\) such that, in any colouring of the edges of \(K_N\) with red and blue, there is either a red copy of \(K_k\) or a blue copy of \(K_\ell\text{.}\) In this language, Theorem 6.1.3 can be thought of as saying that
\begin{equation*} R(3,k)\leq (1+o(1))\frac{k^2}{\log(k)} \end{equation*}
as \(k\to\infty\text{.}\)
Here are a couple of YouTube videos related to Shearer’s Bound: