Shortest Paths in Digraphs

Section 1.3 Shortest Paths in Digraphs

We are now ready to investigate an algorithmic problem coming from graph theory, known as the Shortest Path Problem. A walk in a digraph \(D\) is a sequence \(v_0,v_1,\dots,v_k\) of vertices of \(D\) such that \(v_iv_{i+1}\in A(D)\) for all \(0\leq i\leq k-1\text{.}\) It is called a path if no vertex appears more than once in the sequence. If \(\ell:A(D)\to \mathbb{R}\) is a length function, then the length of a walk \(W=v_0,v_1,\dots,v_k\) with respect to \(\ell\text{,}\) denoted \(\ell(W)\text{,}\) is the sum of the lengths of the arcs of the path; that is, it is \(\ell(W):=\sum_{i=0}^{k-1}\ell(v_iv_{i+1})\text{.}\) A natural practical problem is to find the shortest route from one vertex to another in a graph; for example, this is the problem that your GPS is trying to solve when you turn the navigation on. Of course, it would be foolish for your GPS to send you around in a loop back to an intersection that you crossed earlier in your journey; more generally, if every arc has non-negative length, then it is clear that the length of the shortest walk is equal to the length of the shortest path. Our focus in this section will be on finding shortest paths in digraphs.

Problem 1.3.1. The Shortest Path Problem.

Input: A digraph \(D\text{,}\) a length function \(\ell:A(D)\to \mathbb{R}\) and two vertices \(s,t\in V(D)\text{.}\)
Goal: Find a shortest (i.e. minimum length) path from \(s\) to \(t\) in \(D\) with respect to \(\ell\text{.}\)

If all of the lengths are equal to one, then one can easily solve this problem by generating a “breadth-first search tree” in the digraph. That is, starting from \(s\text{,}\) we first explore all of the out-neighbours of \(s\text{,}\) and then we explore all of the vertices at distance two from \(s\text{,}\) and so on, until we reach \(t\text{.}\) This strategy can also be adapted to the case where the lengths are positive integers. In this case, we could modify the digraph by replacing each arc \(uv\) with a path of length \(\ell(uv)\) in which each arc of this new path has length \(1\text{.}\) However, the running time of such an algorithm would be highly dependent on the specific choice of length function; having high lengths on the arcs would cause the size of the resulting digraph to be much greater than the size of the original input. Let us now present a famous algorithm of Dijkstra for solving this problem in a way that is less sensitive to the length function, under the (reasonable) assumption that all of the lengths are non-negative. Not only will this algorithm produce the shortest path from a vertex \(s\) to some vertex \(t\text{,}\) but it will produce the shortest path to every vertex in the digraph simultaneously.

Algorithm 1.3.2. Dijkstra’s Algorithm.

Input: A digraph \(D\text{,}\) a length function \(\ell:A(D)\to [0,\infty)\) and a vertex \(s\in V(D)\text{.}\)
Output: A path \(P_v\) from \(s\) to every vertex \(v\) in \(D\) for which such a path exists of minimum length with respect to \(\ell\) and the length \(d_s(v)\) of such a path.

Initialization: Let \(d_s:V(D)\to\mathbb{R}\cup\{\infty\}\) be defined by \(d_s(s):=0\) and \(d_s(v):=\infty\) for all \(v\in V(D)\setminus \{s\}\text{.}\) Set \(R:=\emptyset\text{.}\) Also, let \(p:V(D)\to V(D)\cup \{\text{null}\}\) be the function such that \(p(s)=s\) and \(p(v)=\text{null}\) for all \(v\in V(D)\setminus\{s\}\text{.}\) Throughout the algorithm, \(R\) will represent the set of vertices that we have “reached” so far and \(d_s(v)\) and \(p(v)\) will represent our current “best guesses” for the the length of a shortest path from \(s\) to \(v\) and the predecessor of \(v\) on such a path.

while there exists \(v\in V(D)\setminus R\) do

Choose \(v\in V(D)\setminus R\) so that \(d_s(v)\) is minimized among all such vertices.
Add \(v\) to \(R\text{.}\)
For each \(w\in V(D)\setminus R\) such that \(vw\in A(D)\text{,}\) if \(d_s(w)>d_s(v)+\ell(vw)\text{,}\) then update \(d_s(w)\) to be equal to \(d_s(v)+\ell(vw)\) and update \(p(w)\) to be equal to \(v\text{.}\)

After completing this loop, for each \(v\in R\text{,}\) let \(P_v\) be the sequence obtained by reversing \(v,p(v),p(p(v)),\dots,s\) and return all of the paths \(P_v\) and values \(d_s(v)\) for \(v\in R\text{.}\)

Theorem 1.3.3.

Dijkstra’s Algorithm is correct and has running time \(O(n^2)\) on digraphs with \(n\) vertices.

Proof.

We start by analyzing the running time. The number of iterations of the while loop is \(O(n)\text{.}\) In each iteration, it takes us \(O(n)\) steps to locate the vertex \(v\in V(D)\setminus R\) minimizing \(d_s(v)\text{.}\) After locating \(v\text{,}\) the number of comparisions and updates performed is a constant multiple of the number of outgoing arcs at \(v\text{,}\) which is also \(O(n)\text{.}\) Also, in producing the output, we generated \(n\) paths \(P_v\) for \(v\in V(D)\text{,}\) each of length \(O(n)\text{.}\) So, the total running time is \(O(n^2)\text{.}\)

Let us show that the algorithm is correct, meaning that it correctly identifies a shortest path from \(s\) to any vertex in the digraph and the length of such a path. Let \(\dist(s,v)\) be the length of the shortest path from \(s\) to a vertex \(v\) where, if no such path exists, then \(\dist(s,v):=\infty\text{.}\) In particular, we want to prove that \(d_s(v)=\dist(s,v)\) for all \(v\in V(G)\text{.}\) We claim that, at the start of every iteration of the algorithm, the following two statements hold

\(d_s(w)=\dist(s,w)\) for all \(w\in R\text{.}\)
for every \(w\in V(D)\setminus\{s\}\) such that \(p(w)\neq \text{null}\text{,}\) we have \(p(w)\in R\text{,}\) \(d_s(w)=d_s(p(w))+\ell(p(w)w)\) and the sequence \(p(w),p(p(w)),\dots\) contains \(s\text{.}\)

First, we show that these two properties hold at the beginning of the algorithm. Immediately after initialization, we have that \(R=\emptyset\) and \(p(w)=\text{null}\) for all \(v\in V(D)\setminus\{s\}\text{.}\) So, both conditions hold trivially at the beginning of the algorithm.

Now, suppose that we are at the beginning of a stage of the algorithm and that Statement 1 and Statement 2 are both currently satisfied. We need to show that these two properties will still hold after this stage of the algorithm is complete. Let \(v\) be the vertex of \(V(D)\setminus R\) that is being added to \(R\) in this stage.

We show that Statement 1 continues to hold after this stage. For any vertex \(w\in R\setminus\{v\}\text{,}\) we have that \(p(w)\) and \(d_s(w)\) are not changed during this stage and so Statement 1 continues to hold for \(w\text{.}\) So, all that is left is to show that Statement 1 holds for \(v\text{.}\) In the case that \(v=s\text{,}\) we have \(d_s(v)=0=\dist(s,v)\) and so it holds in this case. Now, suppose \(v\neq s\text{.}\) First, let us show that \(d_s(v)\geq \dist(s,v)\text{.}\) If \(d_s(v)=\infty\text{,}\) then this is certainly true and so assume \(d_s(v)< \infty\text{.}\) Since \(v\neq s\) and \(d_s(v)< \infty\text{,}\) the value of \(d_s(v)\) was updated during an earlier stage of the algorithm. So, \(p(v)\neq\text{null}\text{.}\) Thus, by applying Statement 2 to \(v\text{,}\) we have \(p(v)\in R\) and \(d_s(v)=d_s(p(v))+\ell(p(v)v)\text{.}\) Since \(p(v)\in R\) and Statement 1 holds for all previous stages, we have \(d_s(p(v))=\dist(s,p(v))\text{.}\) So,

\begin{equation*} d_s(v)=d_s(p(v))+\ell(p(v)v)=\dist(s,p(v))+\ell(p(v)v)\geq \dist(s,v). \end{equation*}

We claim that the reverse inequality holds as well. Take any path of length \(\dist(s,v)\) from \(s\) to \(v\text{.}\) Note that \(s\) is clearly the first vertex to be added to \(R\text{,}\) and so \(v\neq s\) implies that \(s\in R\setminus\{v\}\text{.}\) Since \(s\in R\setminus\{v\}\text{,}\) we can let \(x\) be the last vertex of \(R\setminus\{v\}\) on the path and \(y\notin R\setminus\{v\}\) be the vertex that comes after \(x\) on the path. Since the length function is non-negative and the shortest path from \(s\) to \(v\) contains a path from \(s\) to \(x\) and the edge \(xy\text{,}\) we have \(\dist(s,v)\geq \dist(s,x)+\ell(xy)\text{.}\) Since \(x\in R\text{,}\) we know that \(d_s(x)=\dist(s,x)\) by Statement 1 and that \(d_s(y)\leq d_s(x)+\ell(xy)=\dist(s,x)+\ell(xy)\) because of how the algorithm updated the function \(d_s\) on the out-neighbours of \(x\) on the stage in which \(x\) was added to \(R\text{.}\) Also, since \(v\) is being added to \(R\) in this stage, it was chosen to be a vertex outside of \(R\) such that \(d_s(v)\) is minimum, and so \(d_s(y)\geq d_s(v)\text{.}\) Putting this together, we get

\begin{equation*} \dist(s,v)\geq \dist(s,x)+\ell(xy)\geq d_s(y)\geq d_s(v) \end{equation*}

and so Statement 1 holds for \(w=v\text{.}\)

Now, let us show that Statement 2 continues to hold at the end of this stage. If \(vw\notin A(D)\) or the inequality \(d_s(w)\leq d_s(v)+\ell(vw)\) holds prior to adding \(v\) to \(R\text{,}\) then \(p(w)\) and \(d_s(w)\) are not changed during this stage. So, assume \(vw\in A(D)\) and \(d_s(w)> d_s(v)+\ell(vw)\text{.}\) So, right after \(v\) is added to \(R\text{,}\) the value of \(p(w)\) is updated to \(v\text{,}\) which is a (brand new) element of \(R\text{.}\) Also, \(d_s(w)\) is updated to \(d_s(v)+\ell(vw)=d_s(p(w))+\ell(p(w)w)\text{.}\) Now, let’s prove that the last property in Statement 2 holds for \(w\text{.}\) In the case that \(v=s\text{,}\) it holds trivially since \(p(w)=v=s\text{.}\) Otherwise, since \(d_s(w)> d_s(v)+\ell(vw)\) held prior to adding \(v\) to \(R\text{,}\) we have \(d_s(v)<\infty\text{.}\) This can only happen if \(p(v)\neq\text{null}\text{.}\) So, since Statement 2 held for all earlier stages of the algorithm and \(p(v)\) is not changed during the stage when \(v\) is being added to \(R\text{,}\) the sequence \(p(v),p(p(v)),\dots\) includes \(s\text{,}\) which implies the same thing for \(w\text{.}\)

By definition of the algorithm, every vertex \(v\) for which there exists a path from \(s\) to \(v\) is eventually added to \(R\text{.}\) By Statement 1 and Statement 2, we have \(d_s(v)=\dist(s,v)\) for every such vertex, and the path \(P_v\) from \(s\) to \(v\) obtained by following the sequence \(v,p(v),p(p(v)),\dots,s\) in reverse is a shortest path.

The analysis of Dijkstra’s Algorithm relied on the assumption that the lengths are non-negative. Specifically, we used that the length of any path is at least the length of any of its subpaths, which only holds when the lengths are non-negative. It is natural to wonder what happens if we allow negative lengths. It may come to some surprise that there is no known polynomial-time algorithm that can work for length functions with negative values in full generality, and there are good reasons to believe that no such algorithm exists (see Exercise 17). That being said, there do exist algorithms which can deal with negative lengths if we have some additional constraints. A cycle in a digraph \(D\) is a path \(v_0,\dots,v_k\) such that \(v_kv_0\in A(D)\text{.}\) Note that, in particular, if, \((u,v),(v,u)\in A(D)\text{,}\) then \(u,v\) is a \(2\)-vertex cycle. Given a length function \(\ell\text{,}\) the length of a cycle with respect to \(\ell\) is equal to the length of the path \(v_0,\dots,v_k\) plus the length of the arc \(v_kv_0\text{.}\) We denote the length of a cycle \(C\) with respect to \(\ell\) by \(\ell(C)\text{.}\) The key condition that we require on a length function to satisfy is described as follows.

Definition 1.3.4.

A length function \(\ell:A(D)\to\mathbb{R}\) on a digraph \(D\) is said to be conservative if no cycles of \(D\) have negative length.

Let us pause to give some intuition about why the conservative property is valuable. In the case where the weights are non-negative, we observed earlier that the length of a shortest walk from one vertex to another is equal to the length of a shortest path because one cannot make a path shorter by coming back to the same vertex multiple times. With a general length function, this may not be true because, given a path, we may be able to find a shorter walk by “inserting” a negative cycle somewhere along the route. However, for a conservative length function, there are no negative cycles, and so we preserve the property that a shortest walk is always a path. Our next aim is to present an algorithm for solving the shortest path problem for conservative length functions. The following lemma is useful for analyzing this algorithm.

Lemma 1.3.5.

Let \(D\) be a digraph and \(\ell:A(D)\to\mathbb{R}\) be a conservative length function for \(D\text{.}\) For distinct vertices \(s,w\in V(D)\) and \(k\geq1\text{,}\) suppose that \(v_0,\dots,v_r\) is a path from \(s\) to \(w\) with \(r\leq k\) arcs of minimum length among all paths from \(s\) to \(w\) with at most \(k\) arcs. Then, letting \(v=v_{r-1}\text{,}\) the path \(v_0,\dots,v_{r-1}\) from \(s\) to \(v\) minimizes the length among all paths from \(s\) to \(v\) with at most \(k-1\) arcs.

Proof.

Let \(P\) denote the path \(v_0,\dots,v_{r-1}\) and suppose, to the contrary, that there is a path \(Q=u_0,\dots,u_t\) from \(s\) to \(v\) with \(t\leq k-1\) of shorter length than \(P\text{.}\) In particular, this can only happen if \(r\geq2\) (if \(r=1\text{,}\) then \(v=v_{r-1}=v_0=s\) and there’s only one path from \(v\) to itself). If \(Q\) does not contain \(w\text{,}\) then the path obtained by appending \(w\) to the end of \(Q\) has length \(\ell(Q)+\ell(vw)<\ell(P)+\ell(vw)\) which contradicts the assumption of the lemma. So, \(w=u_i\) for some \(1\leq i\leq t-1\text{.}\) Let \(C\) be the cycle \(u_i,u_{i+1},\dots,u_t\text{.}\) We get that \(u_0,\dots,u_i\) is a path from \(s\) to \(w\) of length

\begin{equation*} \ell(Q) + \ell(vw) - \ell(C). \end{equation*}

Since \(w\) is conservative, we have \(\ell(C)\geq 0\) and so the length is at most \(\ell(Q)+\ell(vw)\) which is less than \(\ell(P)+\ell(vw)\) by our choice of \(Q\text{.}\) However, this contradicts the assumption in the lemma and completes the proof.

We now present an algorithm for solving the Shortest Path Problem in digraphs with conservative length functions. In fact, this algorithm can take any length function as an input and return either a shortest path from \(s\) to \(v\) for every vertex \(v\) or a negative length cycle (which certifies that the length function is not conservative).

Algorithm 1.3.6. The Moore-Bellman-Ford Algorithm.

Input: A digraph \(D\) with \(n\) vertices, a length function \(\ell:A(D)\to \mathbb{R}\) and a vertex \(s\in V(D)\text{.}\)
Output: A path \(P_v\) from \(s\) every vertex \(v\) in \(D\) for which such a path exists with minimum length with respect to \(\ell\) and the length \(d_s(v)\) of such a path, or a cycle \(C\) in \(D\) of negative length.

Initialization: Let \(d_s:V(D)\to\mathbb{R}\cup\{\infty\}\) be defined by \(d_s(s):=0\) and \(d_s(v):=\infty\) for all \(v\in V(D)\setminus \{s\}\text{.}\) Let \(p:V(D)\to V(D)\cup \{\text{null}\}\) be the function such that \(p(s)=s\) and \(p(v)=\text{null}\) for all \(v\in V(D)\setminus\{s\}\text{.}\) Throughout the algorithm, the quantity \(d_s(v)\) will represent the length of a shortest path from \(s\) to a \(v\) found so far and \(p(v)\) will represent the predecessor of \(v\) on the shortest path from \(s\) to \(v\) found so far.

for \(i=1,\dots,n-1\) do

for each arc \(vw\in A(D)\) do
- If \(d_s(w)>d_s(v)+\ell(vw)\text{,}\) then update \(d_s(w)\) to be equal to \(d_s(v)+\ell(vw)\) and \(p(w)\) to be equal to \(v\)

After the loops terminate, if there is an arc \(vw\) with \(d_s(w)>d_s(v)+\ell(vw)\text{,}\) then find any cycle \(C\) in the sequence obtained by reversing \(w,v,p(v),p(p(v)),\dots\) and output it. Otherwise, for each vertex \(v\) for which \(d_s(v)<\infty\text{,}\) let \(P_v\) be the sequence obtained by reversing \(v,p(v),p(p(v)),\dots,s\) and return all of the paths \(P_v\) and values \(d_s(v)\text{.}\)

Theorem 1.3.7.

The Moore-Bellman-Ford Algorithm is correct and has running time \(O(nm)\) on digraphs with \(n\) vertices and \(m\) arcs.

Proof.

The running time is \(O(nm)\) because there are \(O(n)\) iterations of the outer loop and, on each iteration, the inner loop iterates \(O(m)\) times. In producing the output, the step of checking the arcs takes time \(O(m)\text{,}\) finding the cycle takes time \(O(n)\text{.}\) The time it takes to produce the path corresponding to a given vertex \(v\) is \(O(n)\text{;}\) however, if \(v\) has no incoming arcs, then this running time decreases to \(O(1)\text{.}\) So, the time required to produce the paths is definitely \(O(mn)\text{.}\)

I haven’t gotten around to writing this proof. See page 161 of Combinatorial Optimization by Korte and Vygen for a proof.

To learn more about the Shortest Path Problem, see

Here is a YouTube video related to this section:

Prev Top Next