Breadth-first search is a great elementary algorithm for searching graphs. In many scenarios, BFS is all you need to visit every vertex in a graph. In others, it's very important that you choose the right algorithm. Today we'll be exploring depth-first search (DFS), a very similar algorithm that also explores all vertices by walking along neighboring edges; the difference lies in the order in which those vertices are explored.

First, let’s tackle the Daily Problem from the previous lecture:

This article is part of the Data Structures and Algorithms Series following the course The Analysis of Algorithms. If you missed the previous article, check that out first, along with a copy of The Algorithm Design Manual.

## Answers to the previous Daily Problem

Prove that in a breadth-first search on an undirected graph `G`, every edge is either a tree edge or a cross edge, where for a cross edge `(x,y)`, `x` is neither an ancestor nor a descendant of `y`.

It helps to define the terms tree edge and cross edge. If we are traversing the edge `(x,y)` and it's our first time visiting `y`, we say that this is a tree edge. As the problem states, a cross edge is an edge `(x,y)` where `x` is neither an ancestor nor a descendant of `y`. It also helps to define what forward and back edges are to give a complete picture.

All edges adhere to one of these four types. Tree edges only apply to newly-visited vertices; the other three apply to previously-visited vertices. When we process an edge `(x,y)` and find that `y` has already been visited, we run through this simple ruleset:

1. If the `y` in this `(x,y)` relationship is an ancestor, it is a backward edge.
2. If the `y` is a descendant, it is a forward edge.
3. If `y` is neither, it is a cross edge.
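The ruleset above can be sketched in code. This is an illustrative helper, not part of any library; it assumes we track each vertex's parent, discovery state, processed state, and entry time during the search, as we will in the DFS implementation later in this article:

```javascript
// Hypothetical bookkeeping tables, filled in as a search runs.
const PARENTS = {};
const DISCOVERED = {};
const PROCESSED = {};
const ENTRY_TIMES = {};

// Classify the edge (x, y), where y has possibly been visited before.
function classifyEdge(x, y) {
  if (PARENTS[y] === x) return "TREE"; // first visit to y, reached via x
  if (DISCOVERED[y] && !PROCESSED[y]) return "BACK"; // y is a still-open ancestor
  if (PROCESSED[y] && ENTRY_TIMES[y] > ENTRY_TIMES[x]) return "FORWARD"; // y is a finished descendant
  if (PROCESSED[y] && ENTRY_TIMES[y] < ENTRY_TIMES[x]) return "CROSS"; // neither ancestor nor descendant
  return "UNKNOWN";
}
```

For example, if `PARENTS[2] === 1`, then `classifyEdge(1, 2)` reports a tree edge; if `y` is discovered but not yet processed, it is an ancestor still on the recursion stack, so the edge is a back edge.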

So let’s go through all four and eliminate by contradiction.

Assume `G` has a back edge `(x,y)`, where `y` is a proper ancestor of `x` (and not `x`'s parent). Then `y` sits at least two levels above `x` in the BFS tree. But BFS processes every edge incident to `y` when `y` is dequeued, which happens before `x` is even discovered, so the edge `(x,y)` would have been explored from `y`'s side first, and `x` would have been discovered then, making `(y,x)` a tree edge with `y` as `x`'s parent. That contradicts our assumption, so no back edges can exist.

Assume `G` has a forward edge `(x,y)`, where `y` is a proper descendant of `x` (and not `x`'s child). But when `x` is dequeued, BFS examines the edge `(x,y)` immediately; if `y` were still undiscovered at that point, it would become `x`'s child via a tree edge. Since you cannot reach a descendant any later than that, no forward edges can exist in an undirected graph.

Assume `G` has a cross edge `(x,y)`. A cross edge simply connects two vertices, in the same or adjacent BFS layers, that were each discovered along different branches of the tree. Nothing about BFS rules this out: since BFS sweeps across each layer, two siblings (or cousins) can easily share an edge that the search discovers only after both endpoints have been visited. So cross edges are allowed.

By the process of elimination, the only edge types left are tree edges and cross edges. Since BFS is a graph-walking algorithm whose whole point is to visit every vertex, every edge that discovers a new vertex is a tree edge, and every remaining edge connects two already-discovered vertices that are neither ancestor nor descendant of each other, which is exactly a cross edge. Our proof is satisfied.
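One way to convince yourself of this result: run BFS on a small undirected graph that contains a cycle and check that every edge connects vertices whose BFS levels differ by at most one, which rules out back and forward edges. A minimal sketch, using a hypothetical plain-object adjacency list:

```javascript
// A 4-cycle: 1-2, 1-3, 2-4, 3-4.
const adjacency = { 1: [2, 3], 2: [1, 4], 3: [1, 4], 4: [2, 3] };

// Standard BFS that records each vertex's level (distance from the root).
function bfsLevels(start) {
  const level = { [start]: 0 };
  const queue = [start];
  while (queue.length > 0) {
    const v = queue.shift();
    for (const w of adjacency[v]) {
      if (level[w] === undefined) {
        level[w] = level[v] + 1; // tree edge: w discovered one layer below v
        queue.push(w);
      }
    }
  }
  return level;
}

const level = bfsLevels(1);
for (const v of Object.keys(adjacency)) {
  for (const w of adjacency[v]) {
    // Every edge spans at most one layer: tree edges span exactly one,
    // cross edges span zero or one. Back/forward edges would span two or more.
    console.assert(Math.abs(level[v] - level[w]) <= 1);
  }
}
```

Here the edge `(3,4)` ends up as a cross edge: `4` is discovered from `2`, not `3`, yet the two sit in adjacent layers.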

Now that we understand a bit more about ancestors and descendants, this should make our implementation of Depth-First Search a bit clearer (Hint: DFS only has tree edges and back edges).

## Depth-First Search: Like BFS but with a stack

As I mentioned earlier, the real difference between how BFS and DFS walk a graph is the data structure they use to make that walk: BFS uses a queue and DFS uses a stack. This means that there is some backtracking involved; the algorithm will go as far down one chain of children as it can before it turns back to dive down the next chain of successive child nodes. This backtracking is what makes DFS so powerful.

When remembering depth-first search, just remember that we’re going to visit nodes in a top-to-bottom fashion. So a tree that looks like this:

```
        8
       / \
      6   10
     / \    \
    4   5   12
```

Will read the tree with DFS in this order: `[8, 6, 4, 5, 10, 12]`. The generic algorithm for all graphs (not just trees) using DFS looks like this:

```javascript
// Bookkeeping shared with the rest of our implementation
// (see the last article for MAX_VERTICES, VISITED, PROCESSED, and PARENTS):
let count = 0;
const ENTRY_TIMES = new Array(MAX_VERTICES);
const EXIT_TIMES = new Array(MAX_VERTICES);

class Graph {
  // ... see the last article for the rest of our implementation

  dfs(currentVertex) {
    if (this.finished) return;

    VISITED[currentVertex] = true;
    count += 1;
    ENTRY_TIMES[currentVertex] = count;
    console.log(`PRE-PROCESSED ${currentVertex}`);

    // Walk this vertex's adjacency list.
    let tempVertex = this.connections[currentVertex];
    while (tempVertex) {
      const nextVertex = tempVertex.vertex;
      if (!VISITED[nextVertex]) {
        PARENTS[nextVertex] = currentVertex;
        console.log(`PROCESSED EDGE ${currentVertex}=>${nextVertex}`);
        this.dfs(nextVertex);
      } else if (
        (!PROCESSED[nextVertex] && PARENTS[currentVertex] !== nextVertex) ||
        this.isDirected
      ) {
        console.log(`PROCESSED EDGE ${currentVertex}=>${nextVertex}`);
      }
      if (this.finished) return;
      tempVertex = tempVertex.next; // advance to the next adjacent vertex
    }

    console.log(`POST-PROCESSED ${currentVertex}`);
    count += 1;
    EXIT_TIMES[currentVertex] = count;
    PROCESSED[currentVertex] = true;
  }
}
```
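For the small tree above, the same traversal can be made with an explicit stack instead of recursion, which makes the LIFO order concrete. A minimal sketch using a hypothetical plain-object adjacency list (separate from the Graph class):

```javascript
// The tree from the figure: 8 has children 6 and 10, and so on.
const adjacency = { 8: [6, 10], 6: [4, 5], 10: [12], 4: [], 5: [], 12: [] };

// Depth-first traversal with an explicit stack: keep diving into the most
// recently discovered vertex, backtracking only when we run out of
// unvisited neighbors.
function dfsOrder(start) {
  const visited = new Set();
  const stack = [start];
  const order = [];
  while (stack.length > 0) {
    const v = stack.pop();
    if (visited.has(v)) continue;
    visited.add(v);
    order.push(v);
    // Push neighbors in reverse so the leftmost child is explored first.
    for (const w of [...adjacency[v]].reverse()) {
      if (!visited.has(w)) stack.push(w);
    }
  }
  return order;
}

dfsOrder(8); // → [8, 6, 4, 5, 10, 12]
```

Swapping the stack for a queue (and `pop` for `shift`) turns this back into BFS, which is the whole point of the comparison.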

So now we know the what and the how of DFS, but why should we care to use it? Here are a couple of reasons:

• Cycle detection. As we saw in the previous Daily Problem, cycles come from back edges. Back edges are very easy to detect in DFS because backtracking is built into the algorithm. How long will this take? Only `O(n)`: a graph with no cycle is a forest with at most `n-1` edges, so after processing at most `n` edges the search must either finish or run into a back edge, and that back edge closes a cycle. This reduces to a worst case of `O(n)` to find a cycle in the graph.
• Dead ends and cutoffs. Any vertex whose removal disconnects the graph is called an articulation vertex. DFS can find these in linear time (by checking, as it backtracks, whether any descendant can reach back above a given parent via a back edge), while a brute-force approach with BFS takes quadratic time.
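The cycle-detection idea can be sketched directly (again with a hypothetical adjacency-list object, not the Graph class above): stop the DFS the moment a back edge appears.

```javascript
// DFS cycle detection on an undirected graph: an edge to an already-visited
// vertex that is not our parent is a back edge, and a back edge means a cycle.
function hasCycle(adjacency) {
  const visited = new Set();

  function dfs(v, parent) {
    visited.add(v);
    for (const w of adjacency[v]) {
      if (!visited.has(w)) {
        if (dfs(w, v)) return true; // propagate the found cycle upward
      } else if (w !== parent) {
        return true; // back edge => cycle; stop immediately
      }
    }
    return false;
  }

  for (const v of Object.keys(adjacency)) {
    if (!visited.has(v) && dfs(v, null)) return true;
  }
  return false;
}

hasCycle({ a: ["b"], b: ["a", "c"], c: ["b"] });          // a path: no cycle
hasCycle({ a: ["b", "c"], b: ["a", "c"], c: ["a", "b"] }); // a triangle: cycle
```

Note the early `return true`: because we bail out as soon as a back edge appears, the search never processes more than `n` edges, which is where the `O(n)` bound above comes from.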

### DFS for directed graphs: Topological sort

When graphs are directed, all four edge types are now possible, and each of the four cases helps us learn more about what our graph may be doing. Recall that if no back edges exist, we have an acyclic graph. Also recall that directed acyclic graphs (DAGs) possess some interesting properties.

The most important of them is that you can arrange a DAG's vertices in a linear order (like you would a linked list) such that every edge points forward in that order. Such an ordering (of which more than one can exist for a given DAG) is called a topological sort. Topological sorting is important because it guarantees you can process any vertex before its successors. Put another way, we can find shortest paths on a DAG efficiently because relaxing edges in topological order visits each vertex only after everything that can reach it.

```javascript
class Graph {
  topologicalSort() {
    for (let i = 0; i < this.vertices; i++) {
      if (!VISITED[i]) {
        this.dfs(i);
      }
    }
    // Reading the vertices in decreasing order of EXIT_TIMES -- i.e., the
    // reverse of the POST-PROCESSED log lines from dfs() -- gives a
    // topological sort order.
  }
}
```
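To see why this helps with shortest paths, here is a self-contained sketch (hypothetical weighted adjacency list, separate from the Graph class): compute a topological order by DFS, then relax each edge exactly once in that order, giving shortest paths on a DAG in `O(n + m)`.

```javascript
// Shortest paths on a weighted DAG. adjacency maps each vertex to an
// array of [neighbor, weight] pairs.
function shortestPathsDAG(adjacency, source) {
  // Step 1: topological order = vertices in decreasing DFS exit time.
  const order = [];
  const visited = new Set();
  function dfs(v) {
    visited.add(v);
    for (const [w] of adjacency[v]) {
      if (!visited.has(w)) dfs(w);
    }
    order.push(v); // post-order; reversing it yields a topological sort
  }
  for (const v of Object.keys(adjacency)) {
    if (!visited.has(v)) dfs(v);
  }
  order.reverse();

  // Step 2: relax every outgoing edge once, in topological order.
  const dist = { [source]: 0 };
  for (const v of order) {
    if (dist[v] === undefined) continue; // v is unreachable from source
    for (const [w, weight] of adjacency[v]) {
      if (dist[w] === undefined || dist[v] + weight < dist[w]) {
        dist[w] = dist[v] + weight;
      }
    }
  }
  return dist;
}

// a->b (1), a->c (4), b->c (2): the best route to c goes through b.
shortestPathsDAG({ a: [["b", 1], ["c", 4]], b: [["c", 2]], c: [] }, "a");
// → { a: 0, b: 1, c: 3 }
```

By the time we relax `b`'s edges, `dist[b]` is already final because every vertex that can reach `b` came earlier in the order, which is exactly the property the topological sort buys us.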

## Onto the next Daily Problem

Now that we’ve covered the basics of graph searching, be sure to study this and the previous article to compare and contrast why these implementations work and how they’re useful in different situations. Given your knowledge of both now, this next Daily Problem should give you a good idea of how to solve each part:

Your job is to arrange `n` ill-behaved children in a straight line, facing front. You are given a list of `m` statements of the form `i` hates `j`. If `i` hates `j`, then you do not want to put `i` somewhere behind `j`, because then `i` is capable of throwing something at `j`.

(a) Give an algorithm that orders the line (or says that it is not possible) in `O(m + n)` time.

(b) Suppose instead you want to arrange the children in rows such that if `i` hates `j`, then `i` must be in a lower numbered row than `j`. Give an efficient algorithm to find the minimum number of rows needed, if it is possible.

## More practice problems

To wrap up this chapter, here are the other homework problems to go along with these articles:

1. Problem 5-12.
2. Problem 5-13.
3. Problem 5-14.
4. Problem 5-19.
5. Problem 5-25.

Think you’ve got the answers? Let’s see how you do in the next article!