Dijkstra's Shortest path algorithm

 Published on 2018-05-03

Many different problems can be solved with graph theory, and the various algorithms that are part of that field. One of the classic problems of graph theory is finding the shortest path in a graph (for example, a path between two cities, where the graph contains cities as vertices, and roads as edges).

In this post, we will look at Dijkstra - a famous algorithm for finding a shortest path in a graph, which can be applied when the weights of the edges (for example, the lengths of the roads) are non-negative numbers (numbers greater than or equal to 0), which is almost always the case. The algorithm, starting from a specific vertex/city (called a "source"), can find the distances from that vertex to all other vertices in the graph.

Shortest path in a graph

If you have heard about the breadth-first search algorithm (BFS), you probably know that it can identify the shortest path between two vertices, if all edges in the graph have the same weight (i.e. length). Specifically, when all edges have the same weight, then the shortest path depends on the number of edges that need to be travelled when going from one vertex to another. For example, if two vertices in a maze are located immediately next to each other, then the distance between them is 1, because only one edge needs to be travelled (the connection between those vertices), etc.

But, when considering other problems, we often come across situations where the edges in the graph have different weights. For example, if we look at the road network of a particular country, and the vertices indicate the cities in the country, then an edge that describes the road from Long Beach to Los Angeles, for example, can be marked with a weight of 40, indicating that the distance between those cities is 40 kilometers. For some other cities and edges between them, the weight will be different (for example, 27). Note that when we discuss edges in a graph, we use the term "weight", but that weight can represent any number that is of interest to us when analyzing the graph - for example, road length.

Dijkstra's algorithm is used to calculate the shortest path in a graph between an initial vertex (source), and all other vertices in the graph. For example, if we are talking about road infrastructure, we can indicate that the source city is Los Angeles, and Dijkstra's algorithm will calculate the shortest path (distance) from that city to all other cities in the network (vertices in the graph). The only limitation is that all edges must have a weight that is a non-negative number (for example, -1912 is wrong). This is satisfied in almost all problems in real life, so Dijkstra's algorithm is widely used in many areas.

Note that there are other algorithms that can be used to solve similar problems. One of them (Bellman-Ford) works in a similar way to Dijkstra's algorithm, but it also calculates the correct distance when some edges have negative weights (but, it's slower). Another algorithm, called Floyd-Warshall, is used when we want to calculate the distance from all vertices to all other vertices (when we do not have a source). Both algorithms are simple to implement. However, note that Dijkstra is (generally) used much more often, and thus it is extremely important.

Dijkstra can be implemented in a similar way to BFS. The main difference between the two is that Dijkstra's algorithm uses a priority queue (priority_queue) rather than a simple queue. This is important, because we want to look at the vertices according to their total distance from the starting vertex (the source); that is, according to the sum of weights of all edges that need to be travelled to get to a vertex. With BFS, we used a simple queue, because all edges have the same weight. After we understand this difference, the algorithm is very simple to implement.

Before we see the source code of the algorithm, let's mention an important note that will help us understand it better. Specifically, since we use a priority queue, it's important to visit the elements of the queue according to their distance from the source, in ascending order (i.e., we should start with those closest to the source), similar to the BFS algorithm. Because it's important to know both the distance (for comparison) and the vertex, we can use a pair<int, int> to store both values. When C++ looks at two pairs, they are initially compared by the first value in the pair, so we will store the distance there - like pq.push({ distance, city }). Now, when viewing an element of a priority queue, we will have the distance in e.first, and the vertex in e.second (in the case of our example, that means the city we are considering at the moment). However, what's very important is that a priority_queue will favour those elements where the distance is largest (that's how priority_queue works by default), which we obviously don't want. Therefore, when implementing Dijkstra's algorithm in C++, we will add the value pq.push({ -distance, city }). Note how we used (-distance) instead of (distance). In other words, instead of adding distances like 3, 9, 17, we add -3, -9, -17, so when we ask for elements from the priority queue, we will get them in the following order -3, -9, -17, from largest to smallest (because, as we said, that's how priority_queue works). However, now we can multiply by -1 again (after extracting the element) to override the previous multiplication, and get the correct distance (-(-3))=3. A bit complicated to understand at first, but after you understand that part, the algorithm is actually pretty simple to implement.

#include <iostream>
#include <vector>
#include <queue>
#include <cstring>
using namespace std;

int N, E;
vector<pair<int, int> > G[101];

vector<int> dijkstra(int start) 
{
     
     vector<int> distance;
     distance.resize(101);
     for (int i=0; i<N; i++)
     {
          distance[i] = 1000000000; //some large number
     }
     
     priority_queue<pair<int, int> > pq;
     distance[start] = 0;
     pq.push({0, start});
     
     int visited[101];
     memset(visited, 0, sizeof(visited));
     
     while (!pq.empty()) 
     {
          pair<int, int> state = pq.top();
          pq.pop();
          
          int city = state.second, distance_to_city = -state.first;
          
          if (visited[city] == 0) 
          {
               visited[city] = 1;
               
               for (auto edge : G[city]) 
               {
                    int next = edge.first, distance_between_cities = edge.second;
                    
                    if (distance[next] > distance[city] + distance_between_cities) 
                    {
                         distance[next] = distance[city] + distance_between_cities;
                         pq.push({ -distance[next], next });
                    }
               }
          }
     }
     
     return distance;
}




int main() 
{
     
     cout << "Number of cities: " << endl;
     cin >> N;
     
     cout << "Number of edges: " << endl;
     cin >> E;
     
     cout << "Enter the edges (from, to, weight)." << endl;
     for (int i=0; i<E; i++)
     {
          int from, to, weight;
          cin >> from >> to >> weight;
          
          G[from].push_back({to, weight});
     }
     
     int start;
     cout << "Start city: " << endl;
     cin >> start;
     
     vector<int> distance = dijkstra(start);
     
     for (int i=0; i<N; i++)
     {
          cout << "min distance to " << i << " is: " << distance[i] << endl;
     }
     
     return 0;
}

Note the use of the extra array called visited, as well as the condition (visited[city] == 0), because those are extremely important for achieving the optimal time complexity. Specifically, during the execution of the algorithm, there may be several ways to get from one city to another - like when there is a path between two cities where two edges can be travelled (by "older roads"), and there is a shorter path (as a total number of kilometers) where three or more "highways" are travelled. So, the shortest distance might be through more edges (in this case, 3) compared to another path - which is longer because the road is older, passes around a mountain instead of a tunnel, or something else. Therefore, it is important to check the condition (visited[city] == 0), and once we discover the shortest distance to a city, we don't look at other possible paths that lead to that same vertex because they can't improve the result - in other words, because we use a priority queue, when we find the shortest path to a city, there is no need to look at other paths with a greater distance. Clearly, since all edges have non-negative weights (which was a condition to use Dijkstra's algorithm), we know that such a thing simply can't happen. The time complexity of this algorithm is O(E * log(V)), where E is the number of edges, and V is the number of vertices in the graph.

Let us also mention several other examples. First, notice that in the algorithm given above we use integer numbers, but the weights of the edges can be floating-point numbers as well - as long as they are non-negative. The only change required in the algorithm is to modify the data types - vector<double> distance, priority_queue<pair<double, int> > pq, etc. Similarly, note that in our example we talked about discovering the shortest path (in this case, in kilometers) from one source (source) to all other vertices. In other problems, you can search for the shortest time (for example, in minutes) to travel from an initial vertex to other vertices. Dijkstra's algorithm can clearly be used because we are (practically) discussing the same problem, only now the weights of the edges are storing something else (it can be distance in kilometers, minutes required to travel from one city to another, price that needs to be paid to travel through a highway, etc.).

The algorithm can be modified to store the exact vertices or edges that need to be travelled in the shortest path between the source and some other vertex (if, in addition to the distance recorded in distance[], we also want to know which vertices we have to go through). Specifically, the only thing that we need to change is to create another array (for example, let's call it parent[]), and then when we insert a new element into the priority queue pq.push({ -distance[next], next }), we add the statement parent[next] = city. So, for example, if we want to discover the shortest path from city 0 (source) to city Z, we need to see what is stored in parent[Z] (assume it's the value Y, indicating the city we have to travel through before visiting Z), then we see what is stored in parent[Y] (that's the city we need to travel through before visiting Y), etc. - and we repeat this until we find the value 0 in parent[] (because that was the source, i.e. the city where we started). As we moved through the array parent[], we actually discovered all the vertices that need to be visited on the path from source to Z: 0 -> ... -> X -> Y -> Z.