Switchovers and Failovers: Exchange 2010 Help
Applies to: Exchange Server 2010 SP3, Exchange Server 2010 SP2
Topic Last Modified: 2013-11-01
Switchovers and failovers are the two forms of outages in Microsoft Exchange Server 2010. A switchover is a scheduled outage of a database or server that's
explicitly initiated by an administrator, typically in preparation for performing a maintenance operation. Switchovers involve an administrator moving the active
mailbox database copy to another server in the database availability group (DAG).
A failover refers to unexpected events that result in the unavailability of services, data, or both. A failover involves the system automatically recovering from the
failure by activating a passive mailbox database copy to make it the active mailbox database copy.
The high availability platform in Exchange 2010 is designed to handle both switchovers and failovers.
Looking for management tasks related to high availability and site resilience? See Managing High Availability and Site Resilience.
Switchovers
There are three types of switchovers in Exchange 2010:
Database switchovers
Server switchovers
Datacenter switchovers
Database Switchovers
A database switchover is the process by which an individual active database is switched over to another database copy (a passive copy), and that database copy
is made the new active database copy. Database switchovers can happen both within and across datacenters. A database switchover can be performed by
using the Exchange Management Console (EMC) or the Exchange Management Shell. Regardless of which interface is used, the switchover process is the same:
1. The administrator initiates a database switchover to move the current active mailbox database copy to another server. The switchover can be initiated by
using the Move-ActiveMailboxDatabase cmdlet or by using the Activate a Database Copy wizard.
2. The client used for the task makes an RPC call to the Microsoft Exchange Replication service on a DAG member.
3. If the DAG member doesn't hold the Primary Active Manager (PAM) role, the DAG member refers the task to the PAM.
4. The task makes an RPC call to the Microsoft Exchange Replication service on the PAM.
5. The PAM reads and updates the database location information that's stored in the cluster database for the DAG.
6. The PAM contacts the Microsoft Exchange Replication service on the DAG member whose passive copy is being activated as the new active mailbox
database copy.
7. The Microsoft Exchange Replication service on the target server queries the Microsoft Exchange Replication services on all other DAG members to
determine the best log source for the database copy.
8. The database is dismounted from the current server and the Microsoft Exchange Replication service on the target server copies the remaining logs to
the target server.
9. The Microsoft Exchange Replication service on the target server requests a database mount.
10. The Microsoft Exchange Information Store service on the target server replays the log files and mounts the database.
11. Any error codes are returned to the Microsoft Exchange Replication service on the target server.
12. The PAM updates the database copy state information in the cluster database for the DAG.
13. Any error codes are returned by the Microsoft Exchange Replication service on the target server to the Microsoft Exchange Replication service on the
PAM.
14. The Microsoft Exchange Replication service on the PAM returns any errors to the administrative interface where the task was called.
15. Remote PowerShell returns the results of the operation to the calling administrative interface.
For detailed steps about how to perform a database switchover, see Move the Active Mailbox Database.
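The referral to the PAM and the dismount, log copy, and mount sequence described in the steps above can be pictured with a small simulation. The following Python sketch is illustrative only: the `Dag` class, the `resolve_pam` and `switch_over` functions, and the server and database names are hypothetical and aren't part of any Exchange API. In a real DAG, this work is carried out by the Move-ActiveMailboxDatabase cmdlet and the Microsoft Exchange Replication service.

```python
from dataclasses import dataclass, field


@dataclass
class Dag:
    """Toy model of a DAG (hypothetical, not an Exchange API)."""
    members: list
    pam: str                                       # member that holds the PAM role
    active_on: dict = field(default_factory=dict)  # database -> server with the active copy


def resolve_pam(dag, contacted):
    """Steps 2-4: a member that doesn't hold the PAM role refers the task to the PAM."""
    if contacted not in dag.members:
        raise ValueError(f"{contacted} is not a DAG member")
    return dag.pam


def switch_over(dag, database, target, contacted):
    """Move the active copy of `database` to `target` and return a trace of the steps."""
    if target not in dag.members:
        raise ValueError(f"{target} is not a DAG member")
    source = dag.active_on[database]
    if source == target:
        raise ValueError(f"{database} is already active on {target}")

    trace = []
    pam = resolve_pam(dag, contacted)
    if pam != contacted:
        trace.append(f"{contacted} referred task to PAM {pam}")
    # Step 5: the PAM updates the recorded location of the active copy.
    dag.active_on[database] = target
    # Steps 8-10: dismount on the source, copy remaining logs, then mount on the target.
    trace.append(f"dismount {database} on {source}")
    trace.append(f"copy remaining logs to {target}")
    trace.append(f"mount {database} on {target}")
    return trace


dag = Dag(members=["MBX1", "MBX2", "MBX3"], pam="MBX3", active_on={"DB1": "MBX1"})
for step in switch_over(dag, "DB1", "MBX2", contacted="MBX1"):
    print(step)
```

The sketch leaves out error propagation (steps 11 through 14) and the best log source query (step 7). It's meant only to show the order of operations and the fact that the task always ends up at the PAM, whichever DAG member is contacted first.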
Server Switchovers
https://technet.microsoft.com/enus/library/dd298067(d=printer,v=exchg.141).aspx
A server switchover is the process by which all active databases on a DAG member are activated on one or more other DAG members. Like database
switchovers, a server switchover can occur both within a datacenter and across datacenters, and it can be initiated by using both the EMC and the Shell.
Regardless of which interface is used, the switchover process is the same:
1. The administrator initiates a server switchover to move all current active mailbox database copies to one or more other servers. The switchover can be
initiated by using the Move-ActiveMailboxDatabase cmdlet, or by using the Switchover Server UI.
2. The task performs the same steps described earlier in this topic for database switchovers (Steps 2 through 4) for each of the active databases on the
current server.
3. The PAM reads and updates the database location information that's stored in the cluster database for the DAG.
4. The PAM contacts the Microsoft Exchange Replication service on each DAG member that has a passive copy being activated.
5. The Microsoft Exchange Replication service on each target server queries the Microsoft Exchange Replication services on all other DAG members to
determine the best log source for the database copy.
6. The database is dismounted from the current server and the Microsoft Exchange Replication service on each target server copies the remaining logs.
7. The Microsoft Exchange Replication service on each target server requests a database mount.
8. The Microsoft Exchange Information Store service on each target server replays the log files and mounts the database.
9. Any error codes are returned to the Microsoft Exchange Replication service on the target server.
10. The PAM updates the database copy state information in the cluster database for the DAG.
11. Any error codes are returned by the Microsoft Exchange Replication service on the target server to the Microsoft Exchange Replication service on the
PAM.
12. The Microsoft Exchange Replication service on the PAM returns any errors to the administrative interface where the task was called.
13. Remote PowerShell returns the results of the operation to the calling administrative interface.
For detailed steps about how to perform a server switchover, see Perform a Server Switchover.
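A server switchover is effectively a database switchover repeated for every active database on the server. The Python sketch below shows that fan-out under a simplifying assumption: each database is sent to the first other server that holds a copy of it. That rule, the `plan_server_switchover` function, and the sample names are all hypothetical. This topic doesn't specify how targets are chosen during a server switchover.

```python
def plan_server_switchover(server, active_on, copies):
    """Return {database: target} for every database that is active on `server`.

    active_on: database -> server currently hosting the active copy
    copies:    database -> ordered list of servers that hold a copy
    Target choice (first other copy holder) is an assumption made for illustration.
    """
    plan = {}
    for database, host in sorted(active_on.items()):
        if host != server:
            continue  # not active on the server being drained
        candidates = [s for s in copies.get(database, []) if s != server]
        if not candidates:
            raise RuntimeError(f"{database} has no passive copy to activate")
        plan[database] = candidates[0]
    return plan


active_on = {"DB1": "MBX1", "DB2": "MBX1", "DB3": "MBX2"}
copies = {
    "DB1": ["MBX1", "MBX2"],
    "DB2": ["MBX1", "MBX3", "MBX2"],
    "DB3": ["MBX2", "MBX1"],
}
print(plan_server_switchover("MBX1", active_on, copies))
```

Because each database gets its own target, the active copies from one server can end up spread across several DAG members, which matches the "one or more other DAG members" wording above.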
Datacenter Switchovers
A datacenter or site failure is managed differently from the types of failures that can cause a server or database failover. In a high availability configuration,
automatic recovery is initiated by the system, and the failure typically leaves the messaging system in a fully functional state. By contrast, a datacenter failure is
considered to be a disaster recovery event, and as such, recovery must be manually performed and completed for the client service to be restored and for the
outage to end. The process you perform is called a datacenter switchover. As with many disaster recovery scenarios, prior planning and preparation for a
datacenter switchover can simplify your recovery process and reduce the duration of your outage.
For more information about datacenter switchovers, including detailed steps for performing a datacenter switchover, see Datacenter Switchovers.
For assistance with performing a datacenter switchover, see Guided Walkthrough: Exchange Server 2010 Datacenter Switchover for a Database Availability
Group.
Failovers
A failover is an automatic activation process that can occur at either the database or server level. Failovers occur in response to a failure that affects an individual
database (for example, an isolated storage loss) or an entire server (for example, a motherboard failure or a loss of power).
DAGs and mailbox database copies provide full redundancy and therefore rapid recovery of both the data and the services that provide access to the data. The
following table lists the expected recovery actions for a variety of failures. Some failures require the administrator to initiate the recovery, and other failures are
automatically handled by the system.
Table columns: Description; Automatic activation; Automatic repair action; State during repair: Active; State during repair: Passive; Repair actions; Comments
Possible short outage.
Automatic patching of bad page.
Possible automatic failover.
Manual switchover, automatic failover, or online repair.
Failed
If failover or switchover is performed, host server is updated.
There may be other soft database failure codes.
Doesn't include NTFS file system block failures.
ESE "semisoft" database failure: The drives storing the database are returning errors on some writes.
Short outage during automatic failover.
Short outage during automatic failover.
Automatic volume/disk rebuilt after possible drive replacement.
Dismounted if can't be recovered.
Automatic volume/disk rebuilt after possible drive replacement.
Dismounted if can't be recovered.
Failed
Failed
An ESE semisoft write error means some writes are successful.
Doesn't include an NTFS block failure.
An ESE semisoft read/write error means some reads/writes are successful.
If the database fails, automated recovery will occur before log data recovery processing starts.
Short outage during automatic failover.
None.
Dismounted if can't be recovered.
Failed
Short outage during automatic failover.
Volume completely rebuilt after possible drive replacement.
Dismounted if can't be recovered.
Failed
Short outage during automatic failover.
Short outage during automatic failover.
Drive reformatted or replaced, followed by complete volume rebuild.
Dismounted if can't be recovered.
Drive reformatted or replaced.
Dismounted if can't be recovered.
Failed
Not applicable.
Not applicable.
Automatic failover if other copy isn't in similar state.
None.
Dismounted.
Failed
Not applicable.
If automatic failover isn't blocked by the administrator, there will be a short outage.
None.
Dismounted.
Not applicable
Not applicable.
If automatic failover is prevented, there will be an outage until the database is mounted.
Administrator suspends the wrong database copy.
Depending on configuration and impacted copy, auto recovery may be prevented.
None.
Not applicable.
Suspended
Not applicable.
Administrator dismounts a database for storage, NTFS, or volume maintenance.
If automatic failover isn't blocked by the administrator, there will be a short outage.
None.
Dismounted.
Not applicable
Not applicable.
If automatic failover is blocked, there will be an outage until the administrator completes the task.
Administrator suspends a database copy for storage, NTFS, or volume maintenance.
Depending on configuration and impacted copy, auto recovery may be prevented.
None.
Not applicable.
Suspended
Not applicable.
Administrator dismounts a database for offline database maintenance.
Outage until repaired.
None.
Dismounted.
Suspended
Short outage during automatic failover.
None.
Dismounted.
Any
Repair hardware.
A passive database copy will be in the state that existed at the time when the system failed.
Server hardware maintenance.
Short outage during automatic failover unless blocked by an administrator.
None.
Dismounted.
Any
Complete actions.
A passive database copy will be in the state that existed at the time when the system was shut down.
Short outage during automatic failover unless blocked by an administrator.
None.
Dismounted.
Any
Complete actions.
A passive database copy will be in the state that existed at the time when the system was shut down.
Microsoft Exchange Information Store service is stopped or paused by an administrator.
None.
None.
Dismounted.
Any
A passive database copy will be in the state that existed at the time when the service was stopped.
Microsoft Exchange Information Store service fails; operating system is still running.
Short outage during automatic failover.
Service Control Manager restarts the Microsoft Exchange Information Store service.
Dismounted.
Any
A passive database copy will be in the state that existed when the Microsoft Exchange Information Store service failed.
Partial Microsoft Exchange Information Store service failure; some part of the Exchange store stops functioning, but it's not identified as completely failed.
Possible short outage during automatic failover.
None.
Mounted and partially functional.
Any, but may be only partially functional
Not applicable.
Short outage during automatic failover.
Restart computer.
Dismounted.
Any
Not applicable.
Outage until repaired.
None.
Dismounted.
Any
A passive database copy will be in the state that existed at the time when the system failed.
MAPI network communication failure: The server is no longer available on the MAPI network.
Short outage during automatic failover; must be lossless.
None.
Communication continues to be attempted.
Dismounted.
Any
Not applicable.
Replication network communication failure: The server can't receive heartbeats, log copies, or seed through the failed replication network.
Possible short copying or seeding outage while the workload is switched to other network.
None.
Communication continues to be attempted.
None.
Any
Resiliency impacted by failure.
Multiple network communication failure: The server can't receive heartbeats, log copies, or seed through multiple networks.
Short outage during automatic failover; must be lossless.
None.
Communication continues to be attempted.
Dismounted.
Any
At least one network is still functional.
Failure not detected; no action.
None.
Mounted, but possible performance issues.
Any
Network experiences higher than normal error rates.
None.
None.
Any.
Any
Hang isn't detected so no action is taken.
Short outage during automatic failover.
None.
Dismounted.
Any
Not applicable.
Short outage during automatic failover.
None.
Dismounted.
Any
Not applicable.
Short outage during automatic failover.
None.
Dismounted.
Any
Not applicable.
Complete power failure
Unrecovered failure of the processor chip, motherboard, or backplane
Operating system stop error
Operating system stops responding
Complete communication failure
Some functionality may be operational.
Drive containing the Exchange binaries is out of space.
Short outage during automatic failover.
None.
Dismounted.
Any
Not applicable.
Short outage during automatic failover; assume other copies don't have the same problem.
None.
Dismounted.
Failed
Continuous replication detects invalid log: Replay detects an inappropriate log during copy or replay.
Not applicable.
Discard log.
Not applicable.
Failed
Not applicable.
Database Failovers
A database failover occurs when a database copy that was active is no longer able to remain active. The following occurs as part of a database failover:
1. The database failure is detected by the Microsoft Exchange Information Store service.
2. The Microsoft Exchange Information Store service writes failure events to the crimson channel event log.
3. The Active Manager on the server that contains the failed database detects the failure events.
4. The Active Manager requests the database copy status from the other servers that hold a copy of the database.
5. The other servers return the requested database copy status to the requesting Active Manager.
6. The PAM initiates a move of the active database to another server in the DAG using a best copy selection algorithm.
7. The PAM updates the database mount location in the cluster database to refer to the selected server.
8. The PAM sends a request to the Active Manager on the selected server to become the database master.
9. The Active Manager on the selected server requests that the Microsoft Exchange Replication service attempt to copy the last logs from the previous
server and set the mountable flag for the database.
10. The Microsoft Exchange Replication service copies the logs from the server that previously had the active copy of the database.
11. The Active Manager reads the maximum log generation number from the cluster database.
12. The Microsoft Exchange Information Store service mounts the new active database copy.
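The selection in step 6 can be sketched as a ranking over the copy status that the other servers return in step 5. This topic doesn't spell out the best copy selection algorithm for database failovers, so the Python example below makes an explicit assumption: it ranks copies by log generation number first and search catalog health second, which are the two criteria this topic names for server failovers. The `CopyStatus` type, the `select_best_copy` function, and all sample values are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CopyStatus:
    """Status reported for one database copy (hypothetical shape)."""
    server: str
    last_log_generation: int   # highest log generation available on this copy
    catalog_healthy: bool      # search index catalog health
    reachable: bool = True


def select_best_copy(statuses, failed_server):
    """Pick the copy to activate, or None if no other reachable copy exists.

    Assumed ranking: most recent log generation wins; a healthy catalog
    breaks ties. The real algorithm may weigh additional factors.
    """
    candidates = [
        s for s in statuses
        if s.reachable and s.server != failed_server
    ]
    if not candidates:
        return None
    return max(candidates, key=lambda s: (s.last_log_generation, s.catalog_healthy))


statuses = [
    CopyStatus("MBX1", 120, True),    # failed active copy
    CopyStatus("MBX2", 118, True),
    CopyStatus("MBX3", 119, False),
    CopyStatus("MBX4", 119, True),
]
best = select_best_copy(statuses, failed_server="MBX1")
print(best.server if best else "no copy available")
```

Ranking on the log generation first reflects the goal of minimizing the logs that still have to be copied from the previous active server in steps 9 and 10.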
Server Failovers
A server failover occurs when the DAG member is no longer able to service the MAPI network, or when the Cluster service on a DAG member is no longer able
to contact the remaining DAG members. The following occurs as part of a server failover:
1. The Cluster service on the PAM sends a notification to the PAM for one of two conditions:
a. Node Down: The server is reachable but is unable to participate in DAG operations.
b. MAPI Network Down: The server can't be contacted over the MAPI network and therefore can't participate in DAG operations.
2. If the server is reachable, the PAM contacts the Active Manager on the affected server and requests that all databases be immediately dismounted.
3. For each affected database copy:
a. The PAM requests the database copy status from all servers in the DAG.
b. The PAM receives a response from all reachable and active DAG members.
c. The PAM tries to determine the best log source among all responding servers by querying the most recent log generation number from each of
the responders.
d. Each of the servers responds with the log generation number.
4. The PAM retrieves the current search index catalog status from the cluster database.
5. Based on the log generation number and catalog health of each database copy, the PAM selects the best copies to activate.
6. The PAM updates the mounted location of the database in the cluster database.
7. The PAM initiates database failover by communicating with the Active Manager on one or more other servers.
8. The Active Manager on the selected servers requests that the Microsoft Exchange Replication service attempt to copy the last logs from the previous