Imagine that I pick a random independent set \(I\) according to the hard core model with fugacity \(\lambda\) and, independently, a uniformly random vertex \(v\in V(G)\text{,}\) but I don’t tell you which \(I\) and \(v\) I have picked (I am keeping them a secret, for now). Since \(\alpha_G(\lambda)\) is the expected fraction of vertices that are in \(I\text{,}\) we have that
\begin{equation*}
\alpha_G(\lambda)=\mathbb{P}(v\in I).
\end{equation*}
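If you would like to check this identity, here is a quick sketch using linearity of expectation (recall that \(v\) and \(I\) are picked independently of one another): writing \(n=|V(G)|\text{,}\)
\begin{equation*}
\mathbb{P}(v\in I)=\sum_{w\in V(G)}\mathbb{P}(v=w)\mathbb{P}(w\in I)=\frac{1}{n}\sum_{w\in V(G)}\mathbb{P}(w\in I)=\frac{\mathbb{E}(|I|)}{n}=\alpha_G(\lambda).
\end{equation*}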
Now, imagine that I reveal the number \(Y=|N(v)\cap I|\) to you, but I don’t tell you anything else about the independent set. For each \(1\leq i\leq d\text{,}\) what is the value of
\begin{equation*}
\mathbb{P}(v\in I\mid Y=i)?
\end{equation*}
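(A hint for this one, in case you are stuck: \(I\) is an independent set, so if \(Y=i\geq 1\text{,}\) then \(v\) has a neighbour in \(I\) and cannot possibly be in \(I\) itself. So the answer is \(0\) for every \(1\leq i\leq d\text{.}\))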
This one is a little trickier. What is
\begin{equation*}
\mathbb{P}(v\in I\mid Y=0)?
\end{equation*}
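Here is a sketch of one way to work it out (spoilers ahead, so try it yourself first). Suppose that, in addition to \(Y=0\text{,}\) you were told the entire set \(J=I\setminus\{v\}\text{.}\) Since \(Y=0\text{,}\) the set \(J\) contains no neighbour of \(v\text{,}\) so \(J\) and \(J\cup\{v\}\) are both independent sets, and they are the only two possibilities for \(I\text{.}\) Their weights in the hard core model are proportional to \(\lambda^{|J|}\) and \(\lambda^{|J|+1}\text{,}\) so
\begin{equation*}
\mathbb{P}(v\in I\mid I\setminus\{v\}=J)=\frac{\lambda^{|J|+1}}{\lambda^{|J|}+\lambda^{|J|+1}}=\frac{\lambda}{1+\lambda}.
\end{equation*}
Since this does not depend on \(J\text{,}\) averaging over all choices of \(J\) consistent with \(Y=0\) gives \(\mathbb{P}(v\in I\mid Y=0)=\frac{\lambda}{1+\lambda}\text{.}\)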
From the values that you have computed and the law of total probability, conclude that
\begin{equation*}
\mathbb{P}(v\in I)=\frac{\lambda}{1+\lambda}\mathbb{P}(Y=0).
\end{equation*}
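In case you want to check this conclusion, the computation is an instance of the law of total probability, using that \(0\leq Y\leq d\) because \(G\) is \(d\)-regular:
\begin{equation*}
\mathbb{P}(v\in I)=\sum_{i=0}^{d}\mathbb{P}(v\in I\mid Y=i)\mathbb{P}(Y=i)=\frac{\lambda}{1+\lambda}\mathbb{P}(Y=0)+\sum_{i=1}^{d}0\cdot\mathbb{P}(Y=i)=\frac{\lambda}{1+\lambda}\mathbb{P}(Y=0).
\end{equation*}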
The rest of the proof is based on a super nifty double-counting trick. For the second part of the double counting, we need to compute \(\mathbb{P}(Y=0)\) in another way. Recall that you still have not been told the choice of the set \(I\) or the vertex \(v\text{;}\) imagine also that I take back my earlier revelation of \(Y\text{.}\) Now, suppose that I reveal everything about \(I\) outside of the neighbourhood \(N(v)\) (but nothing about \(I\cap N(v)\)), and let \(Z\) be the number of neighbours \(u\) of \(v\) such that none of the neighbours of \(u\) are in \(I\text{.}\) Since \(G\) is triangle-free, every neighbour of a vertex \(u\in N(v)\) lies outside of \(N(v)\text{,}\) so \(Z\) is determined by the information that I have revealed. In other words, \(Z\) counts the neighbours of \(v\) which could “potentially contribute” to the random variable \(Y\text{;}\) any neighbour of \(v\) that is not counted by \(Z\) already has a neighbour in \(I\) and so has no chance of being in the independent set itself. Conditioned on the revealed information, each vertex \(u\) that is counted by \(Z\) satisfies
\begin{equation*}
\mathbb{P}(u\notin I\mid \text{the revealed information})=\frac{1}{1+\lambda}.
\end{equation*}
Can you see why? We have that
\begin{equation*}
\mathbb{P}(Y=0)=\mathbb{P}(\text{every vertex }u \text{ counted by }Z\text{ satisfies }u\notin I).
\end{equation*}
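(In case the “why” above is elusive: it is the same kind of computation as for \(\mathbb{P}(v\in I\mid Y=0)\text{,}\) just performed at \(u\) instead of \(v\text{.}\) Condition further on everything about \(I\) except the status of \(u\) itself. If \(u\) is counted by \(Z\text{,}\) then no neighbour of \(u\) is in \(I\text{,}\) so the options \(u\in I\) and \(u\notin I\) are both available and carry relative weights \(\lambda\) and \(1\text{.}\) So \(u\notin I\) has conditional probability \(\frac{1}{1+\lambda}\) no matter what the extra conditioning reveals, and averaging preserves this value.)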
Now, here is the magical thing about triangle-freeness. The vertices counted by \(Z\) are all neighbours of \(v\text{,}\) so no two of them are adjacent to one another. Because of this, conditioned on the revealed information, the events \(u\notin I\) for the vertices \(u\) counted by \(Z\) are independent of one another. So,
\begin{equation*}
\mathbb{P}(Y=0\mid \text{the revealed information})=\prod_{\substack{u\in N(v)\\u\text{ counted by }Z}}\mathbb{P}(u\notin I\mid \text{the revealed information}) = \left(\frac{1}{1+\lambda}\right)^Z.
\end{equation*}
Averaging over the revealed information (the law of total probability again), we get
\begin{equation*}
\mathbb{P}(Y=0)=\mathbb{E}\left((1+\lambda)^{-Z}\right).
\end{equation*}
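If you would like a concrete sanity check of this identity (not needed for the proof), consider the star \(K_{1,d}\) with \(v\) fixed to be its centre; the star is triangle-free, though not \(d\)-regular, and the identity only uses triangle-freeness. Each leaf has \(v\) as its only neighbour, so \(Z=d\) when \(v\notin I\) and \(Z=0\) when \(v\in I\text{.}\) The total weight of all independent sets is \(\lambda+(1+\lambda)^d\text{,}\) and so
\begin{equation*}
\mathbb{E}\left((1+\lambda)^{-Z}\right)=\frac{\lambda}{\lambda+(1+\lambda)^d}\cdot 1+\frac{(1+\lambda)^d}{\lambda+(1+\lambda)^d}\cdot(1+\lambda)^{-d}=\frac{1+\lambda}{\lambda+(1+\lambda)^d}\text{,}
\end{equation*}
which agrees with computing \(\mathbb{P}(Y=0)\) directly: the independent sets avoiding all of the leaves are \(\emptyset\) and \(\{v\}\text{,}\) of total weight \(1+\lambda\text{.}\)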
Next, by convexity of the function \(\varphi(z)=(1+\lambda)^{-z}\) we can “linearly interpolate” to get \((1+\lambda)^{-z}\leq 1-\frac{z}{d}(1-(1+\lambda)^{-d})\) for any \(0\leq z\leq d\) (this is an equality at \(z=0\) and at \(z=d\text{,}\) and convexity guarantees the inequality for all \(z\) in between; see the check after the next equation). So, since \(0\leq Z\leq d\) always, taking expectations in the identity \(\mathbb{P}(Y=0)=\mathbb{E}\left((1+\lambda)^{-Z}\right)\) gives
\begin{equation*}
\mathbb{P}(Y=0)\leq 1-\frac{\mathbb{E}(Z)}{d}(1-(1+\lambda)^{-d}).
\end{equation*}
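To double-check the interpolation claim, note that the right-hand side of the inequality \((1+\lambda)^{-z}\leq 1-\frac{z}{d}(1-(1+\lambda)^{-d})\) is the linear function of \(z\) that agrees with \(\varphi\) at both endpoints:
\begin{equation*}
1-\frac{0}{d}\left(1-(1+\lambda)^{-d}\right)=1=\varphi(0)\qquad\text{and}\qquad 1-\frac{d}{d}\left(1-(1+\lambda)^{-d}\right)=(1+\lambda)^{-d}=\varphi(d)\text{,}
\end{equation*}
and a convex function always lies on or below the chord joining two points of its graph.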
Now, here is something clever. How do we compute \(\mathbb{E}(Z)\text{?}\) Well, if we choose \(v\) uniformly at random and then choose a uniformly random neighbour \(u\) of \(v\text{,}\) then, since \(G\) is \(d\)-regular, \(u\) is equally likely to be any vertex of \(G\text{.}\) Do you see why? Anyway, since a uniformly random neighbour \(u\) of \(v\) is nothing more than a uniformly random vertex, i.e. it has the same distribution as \(v\text{,}\) the probability that \(u\) is counted by \(Z\) (in other words, the probability that no neighbour of \(u\) is in \(I\)) is the same as \(\mathbb{P}(Y=0)\text{.}\) So, the expected number of neighbours of \(v\) that are counted by \(Z\) is
\begin{equation*}
\mathbb{E}(Z) = d\cdot \mathbb{P}(Y=0).
\end{equation*}
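If you would like to see this double counting written out, here is a sketch. Say that a vertex \(w\) is “uncovered” if no neighbour of \(w\) is in \(I\text{,}\) so that \(\mathbb{P}(Y=0)\) is exactly the probability that a uniformly random vertex is uncovered. Using that \(v\) and \(I\) are picked independently, that \(w\in N(v)\) if and only if \(v\in N(w)\text{,}\) and writing \(n=|V(G)|\text{,}\) linearity of expectation gives
\begin{equation*}
\mathbb{E}(Z)=\sum_{w\in V(G)}\mathbb{P}(w\in N(v))\mathbb{P}(w\text{ is uncovered})=\sum_{w\in V(G)}\frac{d}{n}\mathbb{P}(w\text{ is uncovered})=d\cdot\mathbb{P}(Y=0).
\end{equation*}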
So, plugging this into the bound on \(\mathbb{P}(Y=0)\) derived earlier and solving for \(\mathbb{P}(Y=0)\text{,}\) we get
\begin{equation*}
\mathbb{P}(Y=0) \leq \frac{1}{2 - (1 + \lambda)^{-d}}.
\end{equation*}
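In case you want the algebra spelled out: abbreviating \(p=\mathbb{P}(Y=0)\) and substituting \(\mathbb{E}(Z)=dp\) into the earlier bound gives
\begin{equation*}
p\leq 1-p\left(1-(1+\lambda)^{-d}\right),\qquad\text{i.e.}\qquad p\left(2-(1+\lambda)^{-d}\right)\leq 1\text{,}
\end{equation*}
and dividing through by \(2-(1+\lambda)^{-d}\text{,}\) which is positive since \((1+\lambda)^{-d}<1\text{,}\) yields the bound above.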
Finally, putting everything together, we get
\begin{equation*}
\alpha_G(\lambda) = \mathbb{P}(v\in I)=\frac{\lambda}{1+\lambda}\mathbb{P}(Y=0) \leq \frac{\lambda}{1+\lambda}\cdot\frac{1}{2 - (1 + \lambda)^{-d}}=\frac{\lambda(1+\lambda)^{d-1}}{2(1+\lambda)^d-1} = \alpha_{K_{d,d}}(\lambda).
\end{equation*}
Magic, right?
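By the way, in case you are curious where the final expression comes from, here is a sketch of the computation of \(\alpha_{K_{d,d}}(\lambda)\text{.}\) Every independent set of \(K_{d,d}\) is contained entirely within one of the two sides, so the total weight of all independent sets is \(2(1+\lambda)^d-1\) (the \(-1\) corrects for counting \(\emptyset\) twice). For a fixed vertex \(w\text{,}\) the independent sets containing \(w\) consist of \(w\) together with any subset of the other \(d-1\) vertices on its side, and these have total weight \(\lambda(1+\lambda)^{d-1}\text{.}\) So \(\mathbb{P}(w\in I)=\frac{\lambda(1+\lambda)^{d-1}}{2(1+\lambda)^d-1}\) for every vertex \(w\text{,}\) and averaging over \(w\) gives
\begin{equation*}
\alpha_{K_{d,d}}(\lambda)=\frac{\lambda(1+\lambda)^{d-1}}{2(1+\lambda)^d-1}.
\end{equation*}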