What are the uses of cluster computing

Cluster

In the following, we refer to a cluster as a network of networked computers that appear to the outside world as a single computer. The computers in a cluster are also called nodes. An alternative name for the sum of these nodes is server farm. Such an interconnection of computers usually pursues one of the following three goals: high availability (HA), high performance computing (HPC) or load balancing (LB), the boundaries between the last two variants being rather fluid.

Important terms in connection with clusters are "shared-nothing" and "shared-all". In a shared-nothing architecture, each node is autonomous, i.e. it has all the hardware components required for operation and its own operating system. This applies particularly to storage and working memory; neither is shared with other nodes. This characteristic is found particularly often in the area of ​​HPC clusters. The high scalability is an advantage here. This means that the performance of the entire environment can be increased at will by adding new nodes.

This is not the case with a shared-all cluster. This architecture is particularly common in storage networks (SAN) and, through competing access to shared storage, guarantees that all nodes can access the entire database. In this context, cluster file systems also come into play: Since inconsistencies can occur when several sources access a common database at the same time. To prevent this, the metadata such as directories, attributes and storage space allocations must be stored in a coordinated manner. For this purpose, a metadata server is usually used within a Fiber Channel or iSCSI network.

High availability cluster
The aim of high availability clusters (HA clusters) is to provide increased reliability. If an error occurs on a cluster node or if it fails completely, the services running there are migrated to another node. Most HA clusters have two nodes. This is mostly an active / passive construction, that is, the replacement node is only in standby as long as the first node works without problems. Such a takeover of services is also known as failover - in the event of an error, the application is restarted on another computer in the cluster. Takeover, on the other hand, means that the services are active on two or more servers at the same time. In this case one speaks of an active / active cluster in which the nodes are always active simultaneously.

Services must be specially programmed for their use on a cluster. This is described by the term "cluster aware". For the operation of an HA cluster, it is extremely important that both the hardware and the software of the computer network are free of single point of failure. A node uses a special heartbeat connection to obtain information on the status of the other node (s). Last but not least, the cluster should continue to function in the event of a disaster. This can be achieved through spatial separation. In extreme cases, the cluster nodes are placed several kilometers apart in different data centers. In turn, it should be noted that fast WAN lines are required for a failover / takeover, which, in addition to the multiple costs for the individual nodes, further increases the investments required for a cluster.

HPC and load balancing clusters
Another type of cluster are high-performance computing clusters, which are often also referred to as supercomputers. The aim of such a construction is to break down extremely demanding computing tasks into individual components, distribute them to the nodes and have them carried out there simultaneously. An extremely fast network is required for this purpose, so InfiniBand is often used here. Most of the supercomputers based on an HPC cluster are currently running on Linux. Such high-performance, clustered infrastructures are used by large search engine and cloud providers such as Google or Amazon. At Google, for example, the individual nodes consist of commercially available PCs that ensure high data throughput with the help of the Google File System (GFS).

Load balancing clusters, like HPC clusters, aim to distribute the load over several machines. Here, too, possible areas of application are environments with very high demands on computer performance. Such clusters are often used in web servers when it comes to processing queries from other computers from the Internet simultaneously at high speed - a single host can only answer a limited number of HTTP queries at once and thus represents a bottleneck. Possible Procedures in LB clusters are the upstream connection of a system, such as a load balancer or front-end server, which divides the requests, or the use of DNS with the round-robin method.

11/21/2014 / ln