-> How will you restart SQL Server on a cluster without causing a failover?
A : Right-click the SQL Server resource and choose Take Offline, then Bring Online.
-> What will you do if you want to add a disk to the SQL Server cluster group?
A : After adding the disk to the group, choose the Add Dependency option in the Cluster Administrator tool,
or in the Failover Cluster administration tool from the 2008 version onward.
-> As a DBA, how will you design for an active/active cluster requirement (i.e., how will you manage
resources after a failover)?
A : Please read the MSDN article on this for a better understanding.
-> Steps for failover?
A : Please read the MSDN documentation on this for the full picture.
-> Difference between SQL Server 2005 and SQL Server 2008 cluster installation?
A : In SQL Server 2005 we have the option of installing SQL Server on the remaining nodes from the primary
node. In SQL Server 2008 we need to log in to each node separately to install the SQL Server cluster.
Q: If using virtual machines and clustering / failing over at that level (not SQL Server),
is there any reason that SQL Server Standard Edition won't work? Someone once
told us in a SQL class that Enterprise Edition was necessary for this.
Answer from Brent: Don't you just love those "someone once told us" things? You'll want to
get them to tell you why. Standard Edition works fine in virtual machines. It may not be
cost-effective once you start stacking multiple virtual machines on the same host, though,
because you have to pay for Standard Edition for every guest.
Q: Hi, with mirroring being deprecated and Always On AG only available with
Enterprise Edition what are our HA options going to be with Standard Edition in the
future? Any ideas if Always On synchronous will make it into Standard?
Answer from Jeremiah: You have a few HA choices with SQL Server 2012 Standard Edition
and beyond. Even though mirroring is deprecated, you could feasibly use mirroring in the
hope that something new will come out. Obviously, this isn't a viable option. The other HA
option is to use clustering. SQL Server Standard Edition supports 2 node clusters, so you
can always use it for HA.
Matt Velic has put together a guide on how to build a virtual lab with a Windows Server
2012 Cluster with Virtual Box and SQL Server 2012 Evaluation Edition complete with
eBook.
If folks want to test AlwaysOn, I did a video a while back on how to plan an
AvailabilityGroup lab, and our AlwaysOn page has a setup checklist PDF.
HOW TO MANAGE ALWAYSON AVAILABILITY GROUPS
Q: Have you experienced, or do you know of, a split brain scenario in AlwaysOn Availability
Groups, where the secondary node comes up to take over the primary role and transactions
become inconsistent? And how can it be avoided?
Answer from Brent: Ooo, there's several questions in here. First, there's the concept of split
brain clusters, when two different database servers both believe they're the master.
Windows Server Failover Clustering (WSFC) has a lot of plumbing built in to avoid that
scenario. When you design a cluster, you set up quorum voting so that the nodes work
together to elect a leader. In theory, you can't run into a split brain scenario automatically,
but you can most definitely run into it manually if you go behind the scenes and change
cluster settings. The simple answer here: education. Learn about how the quorum process
works, learn the right quorum settings for the number of servers you have, and prepare for
disaster ahead of time. Know how you'll need to react when a server (or an entire data
center) goes down. Plan and script those tasks, and then you can better avoid split brain
scenarios.
Q: Can you recommend any custom policies for monitoring AlwaysOn? Or do the
system policies provide thorough coverage? Thank you!
Answer from Brent: I was a pretty hard-core early adopter of AlwaysOn Availability Groups
because I had some clients who needed it right away. In that situation, you have to go to
production with the monitoring you have, not the monitoring you want. The built-in stuff just
wasn't anywhere near enough, so most of my early adopters ended up rolling their own.
StackOverflow's about to share some really fun stuff there, so I'd keep an eye
on Blog.ServerFault.com. You should also evaluate SQL Sentry 7.5's new AlwaysOn
monitoring. It's the only production monitoring I'm aware of, although I know all the other
developers are coming out with updates to their tools for monitoring too.
Q: Is it wise to have the primary replicas of some availability groups on one node and the
primaries of other groups on another node of the cluster? Or is it better to have
all primary groups on server 1 and all secondaries on server 2?
Answer from Brent: If you split the primaries onto two different nodes, then you can do
some load balancing.
and into the VMware infrastructure. More than anything else, this is a business decision:
just be sure you're happy with the decision of which team is managing your uptime.
Q: When using a virtualized active/passive 2008R2 cluster with underlying iSCSI
storage, can the nodes be on different hosts, or is FoE needed to have nodes on
different hosts?
Answer from Brent: Check out VMware's knowledge base article on Microsoft cluster
support. It lays out your options for iSCSI, FC, FCoE, and more, and separates them by
shared-disk clustering versus non-shared-disk (AlwaysOn Availability Groups).
Q: Any thoughts on implementing AlwaysOn in conjunction with a virtual SQL
environment using VMWare HA/ Site Recovery Manager (SRM)?
Answer from Kendra: With this level of complexity, when things get tricky it's incredi-hard to
sort out. You gotta have a rockstar team with great processes and communication skills to
handle problems as they arise, and you are going to hit problems.
Even if you have the rockstar team, you want to first ask if there's a simpler way to meet
your requirements with a less risky cocktail of technologies. If you rush into what you
describe, you'll find that your high availability solution becomes your primary cause of
downtime.
volumed/ formatted/ and configured identically, and then you can move tempdb files over to
it. You will need to restart SQL Server for the modified tempdb file paths to take
effect.
> Types of Clusters ?
In Windows we can configure two types of clusters:
1. NLB (Network Load Balancing) cluster, for balancing load between servers. This cluster does not provide
failover of an application to a standby server. It is usually preferred for edge servers such as web or proxy servers.
2. Server Cluster: This provides high availability by configuring an active-active or active-passive cluster. In
a 2 node active-passive cluster, one node is active and one node is on standby. When the active server
fails, the application will FAILOVER to the standby server automatically. When the original server comes back, we
need to FAILBACK the application.
> What is Quorum ?
The quorum is shared storage that must be provided to all servers. It keeps information about the
clustered applications and session state, and is used in a FAILOVER situation. It is very important: in a
cluster that relies on a shared quorum disk, if the quorum disk fails the entire cluster fails.
>Why Quorum is necessary ?
When network problems occur, they can interfere with communication between cluster nodes. A small set
of nodes might be able to communicate together across a functioning part of a network, but might not be
able to communicate with a different set of nodes in another part of the network. This can cause serious
issues. In this split situation, at least one of the sets of nodes must stop running as a cluster.
To prevent the issues that are caused by a split in the cluster, the cluster software requires that any set of
nodes running as a cluster must use a voting algorithm to determine whether, at a given time, that set has
quorum. Because a given cluster has a specific set of nodes and a specific quorum configuration, the
cluster will know how many votes constitutes a majority (that is, a quorum). If the number drops below
the majority, the cluster stops running. Nodes will still listen for the presence of other nodes, in case
another node appears again on the network, but the nodes will not begin to function as a cluster until the
quorum exists again.
For example, in a five node cluster that is using a node majority, consider what happens if nodes 1, 2,
and 3 can communicate with each other but not with nodes 4 and 5. Nodes 1, 2, and 3 constitute a
majority, and they continue running as a cluster. Nodes 4 and 5 are a minority and stop running as a
cluster, which prevents the problems of a split situation. If node 3 then also loses communication with
nodes 1 and 2, no set of nodes holds a majority, and all nodes stop running as a cluster. However, all
functioning nodes will continue to listen for communication, so that when the network begins working
again, the cluster can form and begin to run.
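The five-node example above can be checked with a few lines of Python. This is only a minimal sketch of the majority rule as described in the text, not the actual WSFC voting implementation; the function name `has_quorum` and the node numbering are illustrative assumptions.

```python
def has_quorum(reachable_votes: int, total_votes: int) -> bool:
    """Return True if a partition holds a strict majority (more than half) of the votes."""
    return 2 * reachable_votes > total_votes


# Five-node cluster using Node Majority, split into {1, 2, 3} and {4, 5}.
TOTAL_NODES = 5
partition_a = {1, 2, 3}
partition_b = {4, 5}

print(has_quorum(len(partition_a), TOTAL_NODES))  # True: 3 of 5 votes
print(has_quorum(len(partition_b), TOTAL_NODES))  # False: 2 of 5 votes

# Node 3 then loses contact with nodes 1 and 2 as well:
# no remaining group of nodes holds a majority.
print(has_quorum(len(partition_a - {3}), TOTAL_NODES))  # False: 2 of 5 votes
```

Note that an exact half is not a majority, which is why an even number of nodes normally needs an extra vote from a witness.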
> Different types of Quorum in Windows Server 2008 ?
1. Node Majority - used when the cluster has an odd number of nodes.
2. Node and Disk Majority - even number of nodes (but not a multi-site cluster).
3. Node and File Share Majority - even number of nodes in a multi-site cluster, or even number of nodes
with no shared storage.
4. No Majority: Disk Only - one node plus a specific disk in the cluster storage.
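The selection rule in the list above can be written as a small helper. This is a sketch under the assumption that only node count, multi-site layout, and shared storage drive the choice; the function and parameter names are hypothetical and are not part of any Windows tool.

```python
def recommend_quorum_mode(node_count: int,
                          multi_site: bool = False,
                          shared_storage: bool = True) -> str:
    """Map a cluster layout to the quorum mode suggested by the list above."""
    if node_count < 1:
        raise ValueError("a cluster needs at least one node")
    if node_count % 2 == 1:
        # Odd number of nodes: the nodes alone can always form a majority.
        return "Node Majority"
    if multi_site or not shared_storage:
        # Even number of nodes, but no disk that every node can reach.
        return "Node and File Share Majority"
    # Even number of nodes in a single site with shared storage.
    return "Node and Disk Majority"


print(recommend_quorum_mode(3))                        # Node Majority
print(recommend_quorum_mode(2))                        # Node and Disk Majority
print(recommend_quorum_mode(4, multi_site=True))       # Node and File Share Majority
print(recommend_quorum_mode(2, shared_storage=False))  # Node and File Share Majority
```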
> Different types of Quorum in Windows server 2003 ?
Standard Quorum : As mentioned above, a quorum is simply a configuration database for MSCS, and is
stored in the quorum log file. A standard quorum uses a quorum log file that is located on a disk hosted
on a shared storage interconnect that is accessible by all members of the cluster.
Standard quorums are available in Windows NT 4.0 Enterprise Edition, Windows 2000 Advanced Server,
Windows 2000 Datacenter Server, Windows Server 2003 Enterprise Edition and Windows Server 2003
Datacenter Edition.
Majority Node Set Quorums : A majority node set (MNS) quorum is a single quorum resource from a
server cluster perspective. However, the data is actually stored by default on the system disk of each
member of the cluster. The MNS resource takes care to ensure that the cluster configuration data stored
on the MNS is kept consistent across the different disks.
Majority node set quorums are available in Windows Server 2003 Enterprise Edition, and Windows
Server 2003 Datacenter Edition.
>Explain about each Quorum type ?
Node Majority: Each node that is available and in communication can vote. The cluster functions only with
a majority of the votes, that is, more than half.
Node and Disk Majority: Each node plus a designated disk in the cluster storage (the disk witness) can
vote, whenever they are available and in communication. The cluster functions only with a majority of the
votes, that is, more than half.
Node and File Share Majority: Each node plus a designated file share created by the administrator (the
file share witness) can vote, whenever they are available and in communication. The cluster functions
only with a majority of the votes, that is, more than half.
No Majority: Disk Only: The cluster has quorum if one node is available and in communication with a
specific disk in the cluster storage.
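The four definitions above differ only in who gets a vote. The following sketch models that arithmetic; it is an illustration of the rules as stated here, not the cluster service's real algorithm, and the enum and function names are assumptions made for the example.

```python
from enum import Enum


class QuorumMode(Enum):
    NODE_MAJORITY = "Node Majority"
    NODE_AND_DISK_MAJORITY = "Node and Disk Majority"
    NODE_AND_FILE_SHARE_MAJORITY = "Node and File Share Majority"
    DISK_ONLY = "No Majority: Disk Only"


def cluster_has_quorum(mode: QuorumMode, total_nodes: int,
                       nodes_up: int, witness_up: bool = False) -> bool:
    """Apply the voting rule of each quorum mode as described in the text."""
    if mode is QuorumMode.DISK_ONLY:
        # One node in communication with the designated disk is enough.
        return nodes_up >= 1 and witness_up
    if mode is QuorumMode.NODE_MAJORITY:
        total_votes, votes = total_nodes, nodes_up
    else:
        # Disk witness or file share witness adds one extra vote.
        total_votes = total_nodes + 1
        votes = nodes_up + (1 if witness_up else 0)
    return 2 * votes > total_votes  # strictly more than half


mode = QuorumMode.NODE_AND_DISK_MAJORITY
print(cluster_has_quorum(mode, total_nodes=2, nodes_up=1, witness_up=True))   # True: 2 of 3 votes
print(cluster_has_quorum(mode, total_nodes=2, nodes_up=2, witness_up=False))  # True: 2 of 3 votes
print(cluster_has_quorum(mode, total_nodes=2, nodes_up=1, witness_up=False))  # False: 1 of 3 votes
```

In the two-node case, losing the witness alone leaves the cluster running, but a second failure drops it below a majority.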
> How is the quorum information located on the system disk of each node kept in synch?
The server cluster infrastructure ensures that all changes are replicated and updated on all members in a
cluster.
> Can this method be used to replicate application data as well?
No, that is not possible in this version of clustering. Only Quorum information is replicated and maintained
in a synchronized state by the clustering infrastructure.
> Can I convert a standard cluster to an MNS cluster?
Yes. You can use Cluster Administrator to create a new Majority Node Set resource and then, on the
cluster properties sheet Quorum tab, change the quorum to that Majority Node Set resource.
> What is the difference between a geographically dispersed cluster and an MNS cluster?
A geographic cluster refers to a cluster that has nodes in multiple locations, while an MNS-based cluster
refers to the type of quorum resources in use. A geographic cluster can use either a shared disk or MNS
quorum resource, while an MNS-based cluster can be located in a single site, or span multiple sites.
> What is the maximum number of nodes in an MNS cluster?
Windows Server 2003 supports 8-node clusters for both Enterprise Edition and Datacenter Edition.
> Do I need special hardware to use an MNS cluster?
There is nothing inherent in the MNS architecture that requires any special hardware, other than what is
required for a standard cluster (for example, it must be on the Microsoft Cluster HCL). However, some
situations that use an MNS cluster may have unique requirements (such as geographic clusters), where
data must be replicated in real time between sites.
> Does a cluster aware application need to be rewritten to support MNS?
No, using an MNS quorum requires no change to the application. However, some cluster aware
applications expect a shared disk (for example SQL Server 2000), so while you do not need shared disks
for the quorum, you do need shared disks for the application.
> Does MNS get rid of the need for shared disks?
It depends on the application. For example, clustered SQL Server 2000 requires shared disk for data.
Remember, MNS only removes the need for a shared disk quorum.
> What does a failover cluster do in Windows Server 2008 ?
A failover cluster is a group of independent computers that work together to increase the availability of
applications and services. The clustered servers (called nodes) are connected by physical cables and by
software. If one of the cluster nodes fails, another node begins to provide service (a process known as
failover). Users experience a minimum of disruptions in service.
> What new functionality does failover clustering provide in Windows Server 2008 ?
New validation feature. With this feature, you can check that your system, storage, and network
configuration is suitable for a cluster.
Support for GUID partition table (GPT) disks in cluster storage. GPT disks can have partitions larger than
two terabytes and have built-in redundancy in the way partition information is stored, unlike master boot
record (MBR) disks.
> What happens to a running Cluster if the quorum disk fails in Windows Server 2003 Cluster ?
In Windows Server 2003, the Quorum disk resource is required for the Cluster
to function. If the Quorum disk suddenly becomes unavailable to a two-node
cluster, both nodes immediately fail and cannot restart the Cluster
service (clussvc).
In that light, the Quorum disk was a single point of failure in a Microsoft
Cluster implementation. However, it was usually a fairly quick workaround to
get the cluster back up and operational. There are generally two solutions
to that type of problem.
1. Determine why the Quorum disk failed and repair it.
2. Reprovision a new LUN, present it to the cluster, assign it a drive
letter and format it. Then start one node with the /FQ switch and, through
cluadmin, designate the new disk resource as the Quorum. Then stop and
restart the clussvc normally, and bring the second node online.
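> What happens to a running Cluster if the quorum disk fails in a Windows Server 2008 Cluster ?
The cluster continues to work, but failover will not happen if any other failure then occurs on the active
node (in a two-node cluster, the surviving node alone no longer holds a majority of the votes).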
resource.
When recreating or reconfiguring the mounted drive(s), follow these
guidelines:
Make sure that you create unique mounted drives so that they do
not conflict with existing local drives on any node in the cluster.
Do not create mounted drives between disks on the cluster storage
device (cluster disks) and local disks.
Do not create a mounted drive from a clustered disk to the cluster
disk that contains the quorum resource (the quorum disk). You can,
however, create a mounted drive from the quorum disk to a clustered
disk.
Mounted drives from one cluster disk to another must be in the same
cluster resource group, and must be dependent on the root disk.
Basic Troubleshooting Steps
When working with SQL Server failover clustering, remember that the
server cluster consists of a failover cluster instance that runs under
Microsoft Cluster Services (MSCS). The instance of SQL Server might
be hosted by Microsoft MSCS-based nodes that provide the Microsoft
Server Cluster.
If problems exist on the nodes that host the server cluster, those
problems may manifest themselves as issues with your failover cluster
instance. To investigate and resolve these issues, troubleshoot a SQL
Server failover cluster in the following order:
1. Hardware: Review Microsoft Windows system event logs.
2. Operating system: Review Windows system and application event
logs.
3. Network: Review Windows system and application event logs. Verify
the current configuration against the Knowledge Base article,
Recommended Private "Heartbeat" Configuration on a Cluster Server.
4. Security: Review Windows application and security event logs.
5. MSCS: Review Windows system, application event, and cluster logs.
6. SQL Server: Troubleshoot as normal after the hardware, operating
system, network, security, and MSCS foundations are verified to be
problem-free.
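The ordering above amounts to "verify each layer before moving up to the next". A minimal sketch of that idea follows; the check callables are hypothetical placeholders that a DBA would supply (for example, a manual review of the relevant event logs), not real SQL Server or Windows APIs.

```python
from typing import Callable, Dict, Optional

# Layers in the order recommended by the list above.
TROUBLESHOOTING_ORDER = [
    "Hardware",
    "Operating system",
    "Network",
    "Security",
    "MSCS",
    "SQL Server",
]


def first_unverified_layer(checks: Dict[str, Callable[[], bool]]) -> Optional[str]:
    """Return the first layer whose check fails or is missing, or None if all pass."""
    for layer in TROUBLESHOOTING_ORDER:
        check = checks.get(layer)
        # A layer without a check is treated as not yet verified.
        if check is None or not check():
            return layer
    return None


# Hypothetical results: everything is clean except the network layer.
checks = {layer: (lambda: True) for layer in TROUBLESHOOTING_ORDER}
checks["Network"] = lambda: False
print(first_unverified_layer(checks))  # Network
```

Stopping at the first failing layer keeps SQL Server troubleshooting from starting before the foundations beneath it are known to be healthy.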
Recovering from Failover Cluster Failure
Usually, failover cluster failure is the result of one of two causes:
Hardware failure in one node of a two-node cluster. This hardware
failure could be caused by a failure in the SCSI card or in the operating
system.
To recover from this failure, remove the failed node from the failover
cluster using the SQL Server Setup program, address the hardware
failure with the computer offline, bring the machine back up, and then
add the repaired node back to the failover cluster instance.
For more information, see How to: Create a New SQL Server Failover
Cluster (Setup) and How to: Recover from Failover Cluster Failure in
Scenario 1.
Operating system failure. In this case, the node is offline, but is not
irretrievably broken.
To recover from an operating system failure, recover the node and test
failover. If the SQL Server instance does not fail over properly, you
must use the SQL Server Setup program to remove SQL Server from
the failover cluster, make necessary repairs, bring the computer back
up, and then add the repaired node back to the failover cluster
instance.
Recovering from operating system failure this way can take time. If
the operating system failure can be recovered easily, avoid using this
technique.
For more information, see How to: Create a New SQL Server Failover
Cluster (Setup) and How to: Recover from Failover Cluster Failure in
Scenario 2.
Resolving Common Problems
Problem: The Network Name is offline and you cannot connect to SQL
Server using TCP/IP
Issue 1: DNS is failing with cluster resource set to require DNS.
Resolution 1: Correct the DNS problems.
Issue 2: A duplicate name is on the network.
Resolution 2: Use NBTSTAT to find the duplicate name and then
correct the issue.
Issue 3: SQL Server is not connecting using Named Pipes.
Resolution 3: To connect using Named Pipes, create an alias using the
SQL Server Configuration Manager to connect to the appropriate
computer. For example, if you have a cluster with two nodes (Node A
and Node B), and a failover cluster instance (Virtsql) with a default
instance, you can connect to the server that has the Network Name
resource offline using the following steps:
1. Determine on which node the group containing the instance of SQL
Server is running by using the Cluster Administrator. For this example,
it is Node A.
2. Start the SQL Server service on that computer using net start. For
more information about using net start, see Starting SQL Server
Manually.
3. Start SQL Server Configuration Manager on Node A.
5. In the Distributed Transaction Coordinator window, click the Logon tab, and set the logon account to NT
AUTHORITY\NetworkService.
6. Click Apply and OK to close the Distributed Transaction Coordinator window. Close the Computer
Management window. Close the Administrative