February 03, 2011 10:00 AM

The Price of High Availability

SQL Server Pro
InstantDoc ID #129441

Over the past couple of months, I’ve been reviewing high-availability servers in the SQL Server Magazine and Windows IT Pro labs. Two of the servers I’ve recently looked at are the NEC Express5800/R320 and the Stratus ftServer 4500. Both depart from standard servers in several ways, but the most significant difference is that they’re designed primarily for high availability, and their architecture reflects this goal: all of the system components are duplexed. In other words, a single high-availability server is composed of two separate and distinct motherboards, and each motherboard has its own set of CPUs, RAM, power supply, and storage. The two sets of CPUs are kept in lockstep, and if there’s a hardware failure, the backup set of system components immediately takes over, and the server continues to provide uninterrupted service. Both of these systems can provide five nines (99.999 percent) of availability. These systems are more expensive than standard servers, but if you need a fault-tolerant server, they’re worth the extra cost.
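To put "five nines" in concrete terms, the following minimal sketch converts an availability percentage into the downtime it allows per year. The function name and the 365-day year are assumptions made for illustration; they don't come from either vendor.

```python
# Sketch: yearly downtime allowed at a given availability level.
# Assumes a 365-day year; the function name is illustrative only.
MINUTES_PER_YEAR = 365 * 24 * 60


def downtime_minutes_per_year(availability_percent: float) -> float:
    """Return the minutes of downtime per year permitted at this availability."""
    if not 0 <= availability_percent <= 100:
        raise ValueError("availability must be between 0 and 100 percent")
    unavailable_fraction = 1 - availability_percent / 100
    return MINUTES_PER_YEAR * unavailable_fraction


if __name__ == "__main__":
    # Compare two nines through five nines.
    for level in (99.0, 99.9, 99.99, 99.999):
        minutes = downtime_minutes_per_year(level)
        print(f"{level}% availability allows {minutes:.2f} minutes of downtime per year")
```

Under these assumptions, five nines works out to a little more than five minutes of downtime per year, whereas two nines (99 percent) allows more than three and a half days.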

Other High-Availability Options

Of course, specialized hardware isn’t the only route to high availability. Windows Failover Clustering is Microsoft’s primary high-availability solution and is designed to protect against unplanned downtime caused by server failure. However, it can also reduce the impact of planned downtime, because it lets you perform rolling upgrades, where you manually fail over to a backup node and upgrade the original server while the backup node handles the application workload. Although Windows Failover Clustering no longer requires specialized hardware, it does require multiple servers, which need to be configured with enough available capacity to handle the additional workload after a failover happens.
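The capacity requirement can be checked with some simple arithmetic. The sketch below is only an illustration, not a Windows Failover Clustering API: the node names, the capacity units, and the simplifying assumption that a failed node’s entire workload moves to a single surviving node are all hypothetical.

```python
from typing import Dict, Tuple


def can_absorb_failover(nodes: Dict[str, Tuple[float, float]]) -> Dict[str, bool]:
    """For each node, report whether one surviving node could take over its load.

    nodes maps a node name to (capacity, current_load), both in the same
    arbitrary unit (for example, percent of one server's CPU).
    Simplifying assumption: the failed node's whole workload lands on a
    single surviving node rather than being spread across several.
    """
    results = {}
    for failed_name, (_, failed_load) in nodes.items():
        # Look for any other node whose spare capacity covers the failed load.
        results[failed_name] = any(
            capacity - load >= failed_load
            for name, (capacity, load) in nodes.items()
            if name != failed_name
        )
    return results


if __name__ == "__main__":
    # Hypothetical two-node cluster: (capacity, current load).
    cluster = {"node_a": (100.0, 60.0), "node_b": (100.0, 30.0)}
    for name, ok in can_absorb_failover(cluster).items():
        print(f"If {name} fails, the surviving node can take its load: {ok}")
```

A result of False for any node suggests that the cluster is running too close to capacity for a failover to succeed without degrading the workload.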

SQL Server provides database mirroring and log shipping as ways to increase the availability of your applications. Like Windows Failover Clustering, these technologies are primarily designed to protect against unplanned downtime, but they do so at the database level rather than the server level. Database mirroring allows for automatic failover, but it’s up to you to create the logins and other server-level settings on the mirror that are required to handle a failover. Log shipping is primarily designed for site recovery and disaster recovery scenarios in which your data is transferred to a backup system at another site, and it doesn’t have an automatic failover option. Although I’ve presented these three solutions as separate options, there’s nothing stopping you from combining them.

Weighing the Cost

When you’re determining which of these types of high-availability solutions best fits your company’s needs, you need to weigh the price of the availability solution against the cost of downtime. Some applications can experience significant amounts of downtime with no cost other than some end-user inconvenience. For this type of application, regular backups might be a good enough precaution against unexpected failure.

However, mission-critical and ecommerce applications can have huge costs associated with downtime. For many companies, if their web application or its back-end database is down, a major portion of their income stops. For instance, a PayPal outage in 2009 was estimated to have resulted in the loss of $2,000 per second, or $7.2 million per hour. And this counted only the loss of direct revenue opportunities. For companies such as PayPal, the cost of downtime is extremely high, and avoiding it is certainly worth the price of implementing a five-nines availability solution.

Finding the Right Solution

Although not every company needs or can afford five nines of availability, high-availability solutions are within the reach of almost all organizations. Even small-to-midsized businesses (SMBs) can have the need for high availability. Although the direct costs of an outage might not be as high for a small business as they are for an enterprise company such as PayPal, the overall impact of a critical application outage on the business might be more significant. Hardware-based solutions, such as the NEC Express5800/R320 and the Stratus ftServer 4500, tend to run in the $30,000 to $40,000 range, but they can offer a load-and-go type of solution that has little complexity and can leverage the vendor’s support organization. Solutions such as Windows Failover Clustering and database mirroring can cost less, but the burden of bearing the extra complexity falls on the customer. In either case, although there’s a price for higher availability, the cost of the downtime you avoid can easily offset that price.
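As a rough way to weigh the price against the cost of downtime, the sketch below computes how much avoided downtime it takes to pay for a solution. It uses the $40,000 upper price and the PayPal loss estimate quoted in this article; the $500-per-hour small-business figure and the function names are hypothetical and are there only for contrast.

```python
SECONDS_PER_HOUR = 3600


def loss_per_hour(loss_per_second: float) -> float:
    """Convert a per-second revenue loss into a per-hour loss."""
    return loss_per_second * SECONDS_PER_HOUR


def break_even_hours(solution_price: float, hourly_loss: float) -> float:
    """Return the hours of avoided downtime needed to cover the solution price."""
    if hourly_loss <= 0:
        raise ValueError("hourly loss must be positive")
    return solution_price / hourly_loss


if __name__ == "__main__":
    price = 40_000.0  # upper end of the hardware price range quoted above

    # PayPal estimate from the article: $2,000 per second.
    paypal_hourly = loss_per_hour(2_000.0)
    paypal_seconds = break_even_hours(price, paypal_hourly) * SECONDS_PER_HOUR
    print(f"At ${paypal_hourly:,.0f} per hour, break-even is about {paypal_seconds:.0f} seconds")

    # Hypothetical small business losing $500 per hour of downtime.
    smb_hours = break_even_hours(price, 500.0)
    print(f"At $500 per hour, break-even is {smb_hours:.0f} hours")
```

At the PayPal loss rate, the hardware pays for itself after roughly 20 seconds of avoided downtime. At the hypothetical $500 per hour, it takes 80 hours, which is why a lower-cost option might be the better fit for a smaller shop.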

 

 





Comments
  • jasonaw
    1 year ago
    Mar 17, 2011

    Marathon Technologies (www.marathontechnologies.com) provides availability like that of the Stratus ftServer and NEC servers through software lockstepping of CPUs, memory, and disks. So you get true full fault tolerance with no downtime, as opposed to the Stratus Avance software and ftServer (proprietary hardware).

  • sqluptime
    1 year ago
    Feb 08, 2011

    This is right on, plus the higher license fees, witness servers and general "overprovisioning of servers" for performance add more costs to consider. For smaller companies looking at traditional failover clustering, lower price software HA products like Stratus Avance that run on industry standard HW are as big as it gets for an IT "no brainer" for SQL administrators.

  • wirama
    1 year ago
    Feb 07, 2011

    Judging from the PayPal case, I now realize that high availability is very important for mission-critical systems.

    Thanks for raising my awareness of how important it is to keep data available, given that a disaster might happen at any time.
