A guide to Dijkstra's Algorithm

If you have ever wondered about how Dijkstra's algorithm works or what the intuition behind it is then you might have something to learn here, hopefully. I shall try to explain my understanding of the Dijkstra's algorithm. If some errors have crept in, let me know.

Dijkstra's algorithm is a Single-Source-Shortest-Path algorithm, which means that it calculates shortest distance from one vertex to all the other vertices.

The node from where we want to find the shortest distance is known as the source node. For the rest of the tutorial, I'll always label the source node as S. And the distance to source node is naturally going to be zero.

So, if we had an array of distances called dist[] and source node is 2 then dist[] would be 
[INF, INF, 0, INF, INF]	
  0    1   2   3    4
// Consider INF as infinity since we don't know what the actual distance is. But we are very much sure that the distance to the source node is 0.

Let's start.

Have a look at the image below. We shall work with it in this tutorial.
image

Remember what I said about the source node? I said that we were SURE that the shortest distance to source node is going to be zero. Do you agree with me on that point? Of course, yes, because there was no path to travel.

What does the distance array for our graph look like?

Node 	 |  S   A   B   C   D 
Distance |  0  INF INF INF INF

Let's break the problem. Let's rip apart this graph.

We are at S currently. Let's see where we can go from S.
image

You are at node S. There are 2 edges that come out of S. One edge goes to A and other goes to B. Edge SA distance is 15 and edge SB distance if 10. You can clearly verify this in the figure.

I now claim that I am SURE that the shortest distance to B is 10. Why though? How many other ways are possible, through which you can reach to node B? Maybe there will be some edge from A to B. But since distance to A is already 15, any edge from A to B will only increase that distance and will never be able to become less than 10. This is the most important step. Read it again if you did not get it.

Now, that I am SURE of the shortest distance to reach B, I discover more vertices from B.

Now, our exploration graph looks something like this.
image
and the distance array has been updated.

Node 	 |  S   A   B   C   D 
Distance |  0  INF  10 INF INF

Let's see what the current scenario looks like.

Nodes for which we know the shortest distance: S and B.

What are we looking at now?

At all the nodes we can reach from the S and B.

Again, the same question. Which node's shortest distance can we be SURE of ?

I claim it's node D.
Look at the figure again.
image
The distance to D is 10 + 1 = 11, and no other path can have length smaller than this since their own distances are greater than 11.

Hence, we now include D in our set of vertices for which shortest distances is known. Also, we explore more edges going from node D.

Note that we are always picking the green vertex with the minimum distance.

So, now our exploration graph looks something like this.

image
Distance has been updated.

Node 	 |  S   A   B   C   D 
Distance |  0  INF  10 INF  11

Now, same question. Which node's shortest distance can we be SURE of? More specifically. Which green node's shortest distance can we be SURE of? Yes, right. It is C because the path S-B-D-C gives distance of 10 + 1 + 3 = 14 which is the lowest.

Hence, we include C into the set of nodes for which final distance have been finalized and we also explore it's neighbours.

The exploration graph now looks like this.
image
Distance has been updated again.

Node 	 |  S   A   B   C   D 
Distance |  0  INF  10  14  11

Again, for which node are we SURE of? There is just one node remaining and so it is the shortest. So, we include it in the set of nodes whose final distances are known and explore it's neighbours.

Now, the exploration graph is as follows:
image

Final distance array now looks like this:

Node 	 |  S   A   B   C   D 
Distance |  0  15  10  14  11

But now, since we have no green nodes, we stop. We have visited all the nodes.

Couple of key points here. How were we SURE of the shortest distance of a green node? That node had the shortest distance among all the other green nodes.

Now, summarizing Djikstra's algorithm:

  1. We keep a set of vertices for which final shortest distance is already known to us. Initially only the source vertex S belongs to this set.
  2. We do several iterations, during which we pick the green node with the minimum distance and add it to the set of vertices whose distances are finalized and also all nodes reaching from this node and not VISITED (i.e. NOT MARKED YELLOW), are made green.

And that's it. That's the working of Djikstra's algorithm.

Regarding the implementation, since we need the node with minimum distance, a min-heap is prefered. Implementations with sets( the ones based on BSTs) are also popular. You can find the implementations online with both the data structures.

Now, for a moment, let's go back to where we started.

image
At this point, we chose B. So, what we have done is finalized the distance to B and it can never be changed in coming iterations. Now, let's say that there was an edge from A to B with cost as -10 (Negative ten). The distance then is 15 + (-10) = 5, which is lower than 10.

But, we had said that it can never be less than 10. Then, how come this?

Here is the truth. Djikstra's algorithm doesn't work for negative edge weights.

Let's think about the graphs involving negative weights cycles. If there is a negative weight cycle like the C-A-B-C below, we can keep moving in cycles and each iteration of the cycle will decrease the distance, so negative weight cycle is a big NO for Djikstra.
image
It's intuitive to understand that Djikstra doesn't work for negative cycles but not very intuitive to understand why it doesn't work for negative edges with no negative cycles. I have already given an example above for which it fails. Here is another very good example for which Djikstra's algorithm gives wrong answer. Take A as the source node.

image
Remember that the distance once finalized for a node, it must never be changed again. NEVER.

Time to part :)

I hope you enjoyed it.

Edit:
An excerpt from an interview of Edsger Dijkstra, in 2001:
" What is the shortest way to travel from Rotterdam to Groningen, in general: from given city to given city. It is the algorithm for the shortest path, which I designed in about twenty minutes. One morning I was shopping in Amsterdam with my young fiancée, and tired, we sat down on the café terrace to drink a cup of coffee and I was just thinking about whether I could do this, and I then designed the algorithm for the shortest path. As I said, it was a twenty-minute invention. In fact, it was published in '59, three years later. The publication is still readable, it is, in fact, quite nice. One of the reasons that it is so nice was that I designed it without pencil and paper. I learned later that one of the advantages of designing without pencil and paper is that you are almost forced to avoid all avoidable complexities. Eventually, that algorithm became to my great amazement, one of the cornerstones of my fame."

Comments (29)