Prims algorithm real application

The minimal spanning tree algorithm

What is the cheapest set of edges connecting all nodes?

For a given connected, undirected graph, a spanning tree is a circular subgraph that connects all nodes. Obviously, there can be many spanning trees in a graph. If you add weights to the edges, then the total weight of a subgraph is the sum of the individual edge weights. The spanning tree with the lowest weight is called the minimum spanning tree (MST). More generally, every undirected graph has a minimal spanning forest (MSF), that is, a set of minimal spanning trees (for each connected component of a).

Kruskal's algorithm is a "greedy" algorithm (a method that makes the greediest / best choice locally in each step in the hope of finding the global solution), which efficiently calculates the minimum spanning tree of a connected, weighted undirected graph . If the graph is not connected, the algorithm will calculate a minimal spanning forest without further modifications.

The algorithm is initialized with a forest consisting of trees with individual nodes. The edges are sorted in a queue according to their weight. In each iteration, an edge is removed from the queue. If the endpoints of the edge belong to different trees, they will be united over the edge. This is repeated until all nodes belong to the same tree or there are no more edges in the queue.

The most efficient implementation of the algorithm works with a "Union-Find" data structure. This offers efficient set operations such as the union of two sets or checking whether something is contained in a set. When an edge is removed from the queue, this data structure allows checking whether the two endpoints of the edge belong to the same tree. If this is the case, the trees (which are represented as sets) are simply united.

The algorithm is initialized first:

  • All edges are sorted according to their weight and placed in a queue.
  • Each node of the graph is added to the "Union-Find" data structure. The data structure assigns an ID to each node (shown here by the color). This ID represents the tree to which this node belongs. During the execution of the algorithm, these IDs are represented by the colors of the nodes.

The algorithm then processes the queue with the edges. In each iteration, the edge at the top of the queue is removed. Then the two end points of the edge are compared. If their IDs are not the same (that is, if they belong to different trees), then the trees belonging to the endpoints are merged. The "Union-Find" data structure unites the two trees by assigning a new ID to all nodes in both trees. In this case the edge is added to the spanning tree. Otherwise the endpoints are in the same tree; if the edge were added, the tree would then contain a circle.

The loop described is repeated until there are no more edges left in the queue or all nodes are in the same tree (that is, their IDs are the same).

If two endpoints of an edge are checked during the algorithm, this edge is added to the spanning forest if the edge connects two different trees. The union of all these edges results in the resulting minimal spanning forest.

Create a graph and run through the algorithm

Test your knowledge on the research task