Implementing BFS and DFS

About how to implement two popular graph traversal algorithms. Further, understand difference between stack data structure and program stack used with recursion.

14s8j7

Hola ! Did the post on graph search bring your here ? Or you are here to check what we will be decoding today. Well, as the title goes : this post is about how do we implement BFS and DFS algorithms and the data structures we need to support this.

Before we begin : just a heads-up there ain’t no code here ! This blog will be about understanding how we could implement the algorithms. Once the idea is clear I am sure you will write the code in a jiffy in the programming language of your choice! Say what ? 😉

The Data Structures

In the previous post on graph search , we looked at what these algorithms are. Here we will learn about how we could implement them. This would require understanding of two simple data structures – queues and stacks. Queues and Stacks can be visualized as containers that provide storage and retrieval of objects in particular order. This order will decide the usage scenario. The process of adding and retrieving are called “queue” and “dequeue” respectively for a queue. They are referred to as “push” and “pop” for a stack.

Have you been in a queue at D.M.V or at the grocery billing counter. You must have noticed the first one in the queue is the first one to get serviced and move out of queue. Essentially queuing allows you to insert and retrieve in FIFO [first in first out] manner. With “stacks” this would be LIFO [last in first out]. Imagine a stack of plates at a salad counter.  You pick the top most plate (which was placed last into stack) and start at your turn. [Obvio ! :P]. So now let’s move on to actual agenda here – implementing BFS/DFS. Shall we?


Queues and BFS

If we want to implement BFS, we will need to inspect all the directly connected neighbors and then second level neighbors and so on.

Algorithm for BFS implementation

BFS algorithm can be implemented using “Queue” data structure to hold the to-be-visited-nodes at each level as we explore nodes. There have been discussions about if you can implement without a queue – short answer -not really, this means you would need to implement a queue action using stack or some other data structure  (which means you are still using queue :P).

One thing to be noted is that the nodes are marked visited as they are explored, to avoid getting into infinite loop with cyclic graphs or also prevent visiting a given node multiple times when it is reachable through more than one path (which would mean – it’s a ‘tree’ – every node has a unique reachable path starting at some node in a connected graph).

Pseudocode

Begin : Set all nodes to "not visited";
q = new Queue();
q.enqueue(starting node); [from where BFS begins]
while ( q ≠ empty ) do
 {
    x = q.dequeue();
   if ( x has not been visited )
   {
      visited[x] = true; // [ Mark node x as visited ]
      for ( every edge (x, y) /* we are using all edges ! */ ) 
      if ( y has not been visited ) 
      q.enqueue(y); // Use the edge (x,y) !!!
   }
 }

For more clarity (visit visualization aid here), the operations on queue while performing BFS are depicted in Fig[1] below with an example graph. Yellow arrow  at (2) indicates the direction of insertion into queue. Blue arrow  at (5) shows the direction of de-queuing. Marked in yellow boxes are nodes to be explored/operated on next.

Starting at node-E , C and D are first level neighbors , while A and B are second level neighbors.

bfs-queue

Fig 1. BFS – Queue operations on a graph.


Stacks and DFS

If we want to implement DFS, we will need to choose one neighbor and it’s neighbor and so on till we have reached end of iterating through all the neighbor levels.Check this link out – for better visualization (dfs visual).

DFS can simply be implemented using a stack data structure to store the “to-be-visited-nodes” (instead of queue as in BFS).

DFS algorithm has an iterative as well as recursive approach available for it’s implementation – that are both actively used.

Pseudo code for iterative approach:

Begin : Set all nodes to "not visited";
s = new Stack();    //use stack structure for DFS
push(initial node);   // push() operations stores node to be visited 
   while ( s ≠ empty ) do
   {
      x = s.pop();     // pop() removes value from STACK to explore
      if ( x has not been visited )
      {
         visited[x] = true;        // same as BFS - mark the visited nodes
         for ( every edge (x, y)) //all edges need to be explored like BFS
            if ( y has not been visited )   
	       s.push(y);       //any new node needs to be pushed on STACK
      }
   }

Recursive Implementation

Oops, what is recursion? Go here .

For this implementation we can replace the use of  “stack” data structure with program stack, by use of recursive function call (call to self). The pseudo-code for it is as below. The step with  dfs(G,n) is the recursive call – you call it repeatedly until the last node is reached.

Pseudo code for recursive approach

dfs(Graph G, node x)
    Mark x "visited"
    for (every edge (x,y) in G): 
       if (y has not been visited):
           dfs(G,y) //recursive call   
End procedure

Stack data structure versus program stack

Now, coming to the meme in the blog- do you stick to recursion that uses stack memory or re-write an iterative DFS that will use stack data structure. I bring this up for discussion here, as I think this would help in understanding relation between memory and the program we write. (No clue about memory – Stack, Heap, Data segment , Text etc ? – Please go here right away ! )

With recursive function we will be utilizing the system stack  for it’s operations.

  1. It’s important to note that depending on the depth of the graph we can run out of stack space  – cough *stack overflow* cough (welcome to embedded systems)
  2. Additionally, we may also incur extra cost pushing and popping the program counter -system stack related overhead.
  3. There’s also notion of cache friendly coding – temporal and spatial locality (loops may be better idea).

But, recursive approach results in clean and simple code.

But, I think it really needs to be decided on a case by case basis – you need to judge the depth of recursion, is cache optimization the need of the hour – just have the answer for “what are the constraints you have to work with”. 

Essentially, Think before you ink implement !

*Note : The animation links in this blog post are not created by me. They are available online for reference.

Well, why don’t you try implementing it now ? k-bye.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: