History of Exchange HA
Prior to Exchange 2007, high availability and disaster recovery features were fairly limited, and even up until Exchange 2010 they relied heavily on expensive technologies that were complex to implement. Exchange 2007 introduced Local Continuous Replication (LCR) and Cluster Continuous Replication (CCR), and Exchange 2007 SP1 added Standby Continuous Replication (SCR). LCR works much as the name suggests: a copy of the storage group is created on a second set of disks that are locally connected to the mailbox server. Because both copies depend on the same server, this left a single point of failure at the hardware level, which is why CCR was much more popular: it used Windows Failover Clustering to provide redundancy at both the hardware and storage level. SCR used the same log shipping technology as LCR and CCR to provide site resilience, making it possible to ship the log files to another Exchange 2007 mailbox server. Exchange 2010 then dropped LCR and combined CCR and SCR to create Database Availability Groups (DAGs).
Database Availability Groups
At the heart of Microsoft Exchange Server's high availability and site resilience framework is the Database Availability Group (DAG). Introduced in Exchange 2010, enhanced in 2013 and still used in Exchange 2016, a DAG is simply a group of up to 16 Mailbox servers, with each server hosting a set of databases. When a DAG member fails, any active mailbox databases on it fail over to another DAG member. The introduction of the DAG removed several single points of failure, because there is no longer a reliance on a single instance of a database: each database can have up to 16 copies, which can be distributed globally. This not only provides further resiliency but also reduces the need for technologies such as RAID or traditional backups. If a hard drive were to fail, other database copies would already be available to activate. DAGs have also become much simpler to deploy over time, and with the requirement for expensive, high-performing storage removed, high availability is now an affordable option in most installations.
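As a rough illustration of the failover behaviour described above, here is a minimal Python sketch. It is not how Exchange works internally; the function name `fail_over`, the server names and the "first surviving copy wins" rule are all assumptions made for the example, and the real selection logic in Exchange weighs more factors than this.

```python
def fail_over(active, copies, failed):
    """Return a new database-to-server map after the `failed` server goes down.

    active: dict mapping database name -> server currently hosting the active copy
    copies: dict mapping database name -> list of servers that hold a copy
    failed: name of the DAG member that has failed (hypothetical input)
    """
    new_active = dict(active)
    for db, server in active.items():
        if server != failed:
            continue  # this database's active copy is unaffected
        # Simplifying assumption: activate the first copy on a surviving server.
        candidates = [s for s in copies[db] if s != failed]
        new_active[db] = candidates[0] if candidates else None
    return new_active


# Hypothetical three-member DAG with two databases.
active = {"DB1": "MBX1", "DB2": "MBX2"}
copies = {"DB1": ["MBX1", "MBX2"], "DB2": ["MBX2", "MBX3"]}
after_failure = fail_over(active, copies, "MBX1")
```

A database whose only copy lived on the failed server ends up with `None` as its active server, which is the single-instance situation that DAGs are meant to avoid.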
New Additions to Exchange HA
Most of the high availability enhancements within Exchange have been centred around improving the capabilities of the DAG, including the introduction of lagged database copies. A lagged database copy is a copy of the database that isn't updated by replaying transaction logs as soon as they become available. Instead, the transaction logs are held for a defined period and then replayed. The primary reason for doing this is to keep a copy of the database at an earlier point in time when it was known to be in a good state, so it acts like an insurance policy against any form of corruption. If the active database were to become corrupt, you could use the lagged database copy to bring the database back to a point prior to the corruption.
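The hold-then-replay idea can be sketched in a few lines of Python. This is only a toy model under stated assumptions: the class `LaggedCopy`, its methods and the log file names are hypothetical, and it ignores everything a real lagged copy does beyond delaying replay.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta


@dataclass
class LaggedCopy:
    """Toy model of a lagged copy: logs are held for `replay_lag` before replay."""
    replay_lag: timedelta
    pending: list = field(default_factory=list)   # (generated_at, log_name) not yet replayed
    replayed: list = field(default_factory=list)  # log names already applied to the copy

    def receive(self, generated_at, log_name):
        # A shipped log is stored but not replayed straight away.
        self.pending.append((generated_at, log_name))

    def replay_due(self, now):
        # Replay only the logs that have been held for at least the lag period.
        cutoff = now - self.replay_lag
        due = [entry for entry in self.pending if entry[0] <= cutoff]
        self.pending = [entry for entry in self.pending if entry[0] > cutoff]
        names = [name for _, name in due]
        self.replayed.extend(names)
        return names


# Hypothetical example: a 7-day lag and two logs generated three days apart.
start = datetime(2016, 1, 1, 9, 0)
lagged = LaggedCopy(replay_lag=timedelta(days=7))
lagged.receive(start, "E00001.log")
lagged.receive(start + timedelta(days=3), "E00002.log")
just_replayed = lagged.replay_due(start + timedelta(days=7))
```

After seven days only the first log has been replayed; the second is still held, so the copy reflects the database as it was before that log was written.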
Replay Lag Manager
Replay Lag Manager was introduced in Exchange 2013, refined in 2016 and will be enabled by default in Exchange 2016 CU1. It enables Exchange to turn a lagged copy into a highly available copy when needed. Once Replay Lag Manager is enabled, it allows the held log files to be played down in the following scenarios: