Redundancy is the key to high availability. This also applies in the extreme case, where an entire data center has to be protected against power failures and disasters. A MetroCluster is essentially a local cluster that is stretched across two or more sites, with its storage mirrored between them.
Technical background
A MetroCluster can be designed in such a way that no single point of failure remains and that a single hardware failure never requires switching between the sites. The biggest advantage of a MetroCluster is the automatic switchover without any administrator intervention.
Scenarios of system failure
When using asynchronous replication, a person must decide whether and when to switch. This in turn requires a previously defined emergency plan. Automating this process, on the other hand, can guarantee continuous uptime for the applications.
Failure without "Split Brain"
At each location, the concept provides a storage layer, each designed as a highly available local two-node cluster. This cluster provides the disk space for the service nodes, which mirror their data between the two sites; all four service nodes belong to one 4-node cluster. A full MetroCluster thus offers numerous advantages.
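To make this layering concrete, the following Python sketch models the topology as plain data: two sites, each with a highly available two-node storage cluster and two service nodes, with all four service nodes forming one stretched cluster. The site and node names are invented for the illustration.

```python
# Illustrative model of the MetroCluster topology described above.
# Names ("site-a", "stor-a1", ...) are invented for the example.

from dataclasses import dataclass

@dataclass
class Site:
    name: str
    storage_nodes: list[str]   # local two-node storage cluster (HA pair)
    service_nodes: list[str]   # members of the stretched 4-node service cluster

sites = [
    Site("site-a", storage_nodes=["stor-a1", "stor-a2"],
         service_nodes=["svc-a1", "svc-a2"]),
    Site("site-b", storage_nodes=["stor-b1", "stor-b2"],
         service_nodes=["svc-b1", "svc-b2"]),
]

# Each site must be able to absorb a local storage-node failure on its own,
# while the service nodes of both sites together form one 4-node cluster
# that mirrors its data between the sites.
assert all(len(s.storage_nodes) == 2 for s in sites)
service_cluster = [n for s in sites for n in s.service_nodes]
assert len(service_cluster) == 4
print("service cluster members:", service_cluster)
```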
Conclusion
To build a MetroCluster, two conditions should be met:
In the US, where sites are often thousands of kilometers apart, there is in most cases no technical way to operate a MetroCluster, and the concept is therefore hardly known there. The idea of a MetroCluster is rooted in European conditions, where companies often have several branches and data centers only a few kilometers apart. With relatively low investments, many of these companies can raise the availability of their clusters to a very high level.
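The distance limit follows directly from signal latency: with synchronous mirroring, every write has to wait for at least one round trip over the inter-site link. The following back-of-the-envelope sketch assumes an ideal, direct fiber path (no switch or protocol overhead) and shows why a few kilometers are harmless while thousands of kilometers are not.

```python
# Rough estimate of the extra write latency added by synchronous mirroring.
# Assumes an ideal, direct fiber path; real links add switch, protocol and
# queuing delays on top of this lower bound.

SPEED_OF_LIGHT_IN_FIBER_KM_S = 200_000  # roughly 2/3 of c in vacuum

def mirror_round_trip_ms(distance_km: float) -> float:
    """Minimum round-trip time for one synchronous write acknowledgment."""
    one_way_s = distance_km / SPEED_OF_LIGHT_IN_FIBER_KM_S
    return 2 * one_way_s * 1000  # milliseconds

for km in (5, 50, 500, 3000):
    print(f"{km:>5} km: >= {mirror_round_trip_ms(km):.2f} ms per write")
```

At 50 km the mirroring penalty stays around half a millisecond per write; at transcontinental distances it reaches tens of milliseconds for every single write, which is why only asynchronous replication remains practical there.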
A cluster has many potential points of failure. The goal of a MetroCluster is to recover automatically from each of them and to avoid, or at least severely limit, the negative impact on applications.
In the following, seven failure scenarios and their consequences are presented using the example of a MetroCluster based on the ZFS file system.
1. Failure of a hard disk: If a hard disk fails, this has practically no consequences for operations. The administrator replaces the disk during operation, and the data of the defective disk is simply resynchronized onto the replacement (see the replacement sketch after this list).
2. Failure of important components in the disk shelves: The multipathing of the storage nodes ensures that all services remain online without interruption if a SAS cable, SAS HBA, or SAS expander fails. The administrator replaces the affected parts during operation.
3. Failure of an entire disk shelf: The disks of each RAIDZ-2 vdev are distributed across the JBODs so that the failure of a complete JBOD can be tolerated (see the layout sketch after this list). If a JBOD is reconnected after a failure, only the data changed in the meantime is resynchronized. All services remain online without interruption and without any significant drop in performance.
4. Failure of a storage node: If a complete storage node server fails, a second server at the same location takes over its tasks within a few seconds. Although the I/O stream is briefly interrupted, these dropouts are not passed on to the applications, because the mirror at the second location remains available at all times.
5. Failure of a switch, cable, or Fibre Channel HBA between the storage nodes and the service nodes above them: This scenario is also covered by the multipathing of the service nodes. A failover to the other data center is not necessary, and the performance of the applications is not significantly affected.
6. Failure of a service node: If a service node fails completely, the use of ZFS means only a short interruption of the I/O stream to the applications, lasting a few seconds. The switchover time depends on the number of services, such as NFS shares, CIFS shares, and iSCSI targets. It is, however, independent of the amount of data, because ZFS, unlike other file systems, never needs to perform a complete file system check.
For the application servers, this switchover is transparent; in the case of Fibre Channel, the application servers need an ALUA-capable multipathing driver from their operating system, which is standard in most cases today.
The cluster is configured so that services are always moved to the neighboring node at the same site first, so that a site failover only becomes necessary if a site fails completely (see the preference sketch after this list).
7. Failure of a complete site: In the worst case, a complete site goes down. Only in this case does the MetroCluster use the cross-data-center redundancy for a failover, and the second location takes over all services. All services therefore remain available to the application servers, albeit with only half of the service nodes, i.e. with reduced performance.
Since in this case the mirroring of reads and writes between the sites is also eliminated, latency improves, which can even lead to better performance, for example for databases. When the failed site comes back online, the entire data set never has to be transferred again; only the data changed in the meantime is resynchronized.
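The disk replacement from scenario 1 boils down to two pool operations on the ZFS side. The sketch below wraps them in Python purely for illustration; the pool name `tank` and the device paths are assumptions, not taken from the article.

```python
# Sketch of the hot disk replacement from scenario 1. Pool name and device
# paths are invented for the example; in practice an administrator would run
# the equivalent zpool commands directly.

import subprocess

POOL = "tank"

def replace_disk(old_dev: str, new_dev: str) -> None:
    """Replace a failed disk; ZFS then resilvers only this disk's data."""
    subprocess.run(["zpool", "replace", POOL, old_dev, new_dev], check=True)

def pool_status() -> str:
    """Show pool health and resilver progress."""
    return subprocess.run(["zpool", "status", POOL],
                          capture_output=True, text=True, check=True).stdout

if __name__ == "__main__":
    replace_disk("/dev/disk/by-id/ata-OLD_DISK", "/dev/disk/by-id/ata-NEW_DISK")
    print(pool_status())
```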
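Scenario 3 depends on how the RAIDZ-2 disks are spread across the shelves: as long as no single JBOD contributes more than two disks to any vdev, losing a whole JBOD never exceeds the two-disk redundancy that RAIDZ-2 provides. The following sketch checks that rule for a hypothetical layout; JBOD and disk names are invented.

```python
# Sketch: distribute the disks of each RAIDZ-2 vdev across JBODs so that a
# complete JBOD failure never removes more than 2 disks from any vdev
# (RAIDZ-2 tolerates exactly 2 missing disks per vdev).

from collections import Counter

# Hypothetical layout: 3 JBODs, each contributing 2 disks to each 6-disk vdev.
vdevs = {
    "raidz2-0": ["jbod1-d0", "jbod1-d1", "jbod2-d0",
                 "jbod2-d1", "jbod3-d0", "jbod3-d1"],
    "raidz2-1": ["jbod1-d2", "jbod1-d3", "jbod2-d2",
                 "jbod2-d3", "jbod3-d2", "jbod3-d3"],
}

def jbod_of(disk: str) -> str:
    return disk.split("-")[0]

def survives_jbod_loss(layout: dict[str, list[str]]) -> bool:
    """True if no vdev loses more than 2 disks when any single JBOD fails."""
    for name, disks in layout.items():
        per_jbod = Counter(jbod_of(d) for d in disks)
        worst = max(per_jbod.values())
        if worst > 2:
            print(f"{name}: a JBOD failure would remove {worst} disks")
            return False
    return True

print("layout tolerates a full JBOD failure:", survives_jbod_loss(vdevs))
```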
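The local-first policy from scenario 6 can be expressed as a simple ordering: prefer the partner node at the same site, and fall back to the remote site only if no local node is left. The sketch below is an illustrative implementation; node and site names are assumptions.

```python
# Sketch of the "local node first, remote site only as a last resort" policy.
# Node and site names are invented for the example.

NODES = {
    "svc-a1": "site-a", "svc-a2": "site-a",
    "svc-b1": "site-b", "svc-b2": "site-b",
}

def failover_target(failed_node: str, online_nodes: set[str]) -> str | None:
    """Pick the takeover node: same-site partner first, remote site second."""
    home_site = NODES[failed_node]
    candidates = sorted(
        (n for n in online_nodes if n != failed_node),
        key=lambda n: (NODES[n] != home_site, n),  # same-site nodes sort first
    )
    return candidates[0] if candidates else None

# Node svc-a1 fails while everything else is up: its local partner takes over.
print(failover_target("svc-a1", {"svc-a2", "svc-b1", "svc-b2"}))   # svc-a2
# The whole of site-a is down: a node at site-b takes over (site failover).
print(failover_target("svc-a1", {"svc-b1", "svc-b2"}))             # svc-b1
```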
In order to avoid undefined states, so-called "split brain" situations, a ZFS MetroCluster is typically implemented with additional safeguards that allow only one side to keep running the services.
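The concrete safeguards are not spelled out above; one common building block, assumed here for illustration rather than taken from the article, is a quorum decision with a tiebreaker at a third location, so that an isolated site stops its services instead of running them in parallel with the other site. A minimal sketch of that decision logic:

```python
# Minimal quorum sketch: a site may only keep running services if it can
# still see a majority of votes. The tiebreaker at a third location is an
# illustrative assumption, not part of the original article.

VOTES = {"site-a": 1, "site-b": 1, "tiebreaker": 1}

def may_run_services(reachable: set[str]) -> bool:
    """A site keeps its services only with more than half of all votes."""
    have = sum(VOTES[m] for m in reachable)
    return have > sum(VOTES.values()) / 2

# Inter-site link down, but site-a still reaches the tiebreaker: site-a wins.
print(may_run_services({"site-a", "tiebreaker"}))  # True
# site-b is completely isolated: it must stop, avoiding a split brain.
print(may_run_services({"site-b"}))                # False
```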
Today, highly available data centers are the backbone of numerous companies that invest large sums of money in their business. For all companies that already operate two sites within a radius of about 50 kilometers, or that can use resources in a data center run by a service provider, a MetroCluster is a suitable way to keep their systems accessible and operational under all circumstances.