When the Windows Failover Cluster (WFC) is initially configured, a Cluster Name Object (CNO) is created. The CNO is
visible as a computer object in the Active Directory Users and Computers snap-in (dsa.msc). By default, the CNO is
created in the Computers container and granted specific permissions.
After the SQL Server FCI installation, you will also see a Virtual Computer Object (VCO) for the SQL Server Network Name.
*Note: After the CNO is created, any additional Network Name resource in the cluster is considered a Virtual Computer
Object. VCOs are simply computer objects on which the CNO has permission to change the properties or reset the
password.
Problem
But what if the CNO does not possess the required permissions to create computer objects in the Computers container?
It is in the above scenario where we commonly see the following errors during SQL Server FCI installation:
A user encountering the same issue while installing a pre-SQL Server 2012 version may see:
The cluster resource 'SQL Server (MSSQLSERVER)' could not be brought online. Error: The resource failed to come online
due to the failure of one or more provider resources. (Exception from HRESULT: 0x80071736)
System log:
Cluster network name resource 'SQL Network Name (VSQL2012)' failed to create its associated computer object in domain
'motox.com' during: Resource online.
The text for the associated error code is: A constraint violation occurred.
The common cause of the Network Name resource failure is insufficient permissions. More specifically, the permission
"Create Computer Objects" has not been granted to the Cluster Name Object (CNO).
http://technet.microsoft.com/en-us/library/cc731002(v=ws.10).aspx
When you create a failover cluster and configure clustered services or applications, the failover cluster wizards create
the necessary Active Directory computer accounts (also called computer objects) and give them specific permissions. The
wizards create a computer account for the cluster itself (this account is also called the cluster name object or CNO) and a
computer account for most types of clustered services and applications.
When the SQL Server Network Name is first brought online during the FCI installation process, the CNO identity is used to
create the VCO (as long as the VCO doesn't already exist). If the required permissions are not granted to the CNO, the
creation of the VCO will fail and so will your SQL Server FCI installation.
*Note: The Create Computer objects right only applies to Domain Functional Levels above Windows Server 2003. For
Windows Server 2003 the required privilege is Add Workstations to the Domain.
Resolution(s)
Option #1
4. Open the properties of the container and click the "Security" tab. Click "Add" and add the CNO. Make sure to select
the Computers option in the Object Types window:
6. Make sure "Read all properties" and "Create Computer objects" are checked. Click OK until you're back to the AD Users
and Computers window:
7. Retry your previously failed installation. Note that with SQL Server 2012 there will be a retry button.
Option # 2
We can also pre-stage the VCO, which is useful when the Domain Administrator does not allow the CNO
the Read All Properties and Create Computer Objects permissions:
1. Ensure that you are logged in as a user that has permissions to create computer objects in the domain.
4. Right click the OU/Container you want the VCO to reside in and click New -> Computer
5. Provide a name for the object (This will be your SQL Server Network Name) and click OK:
6. Right-click the VCO you just created and select Properties. Click the Security tab and then click Add:
7. Enter the CNO (make sure to select the Computers option in the Object Types window) and click OK.
8. Highlight the CNO, check the following permissions, and click OK.
Read
Allowed To Authenticate
Change Password
Receive As
Reset Password
Send As
Read MS-TS-GatewayAccess
*Note: You can replace step #8 by giving the CNO Full Control over the VCO
9. Install SQL Server and the Network Name resource should start without issue.
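The permission list in step 8 can be treated as a checklist when reviewing a pre-staged VCO. The sketch below is only an illustration: the function name and the idea of passing in the granted permissions as plain strings are assumptions, and it does not query Active Directory itself.

```python
# Permissions from step 8 that the CNO needs on the pre-staged VCO.
REQUIRED_VCO_PERMISSIONS = frozenset({
    "Read",
    "Allowed To Authenticate",
    "Change Password",
    "Receive As",
    "Reset Password",
    "Send As",
    "Read MS-TS-GatewayAccess",
})


def missing_vco_permissions(granted):
    """Return the step-8 permissions absent from `granted`, sorted by name.

    `granted` is any iterable of permission names copied by hand from the
    Security tab (hypothetical input; nothing is read from the directory).
    """
    granted = set(granted)
    # Per the note above, Full Control can replace the individual permissions.
    if "Full Control" in granted:
        return []
    return sorted(REQUIRED_VCO_PERMISSIONS - granted)
```

An empty result means every permission from step 8 (or Full Control) was recorded as granted.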
SYMPTOMS
When you try to bring a SQL Server cluster resource online for a virtual instance of Microsoft SQL Server 2000, of SQL
Server 2005, or of SQL Server 2008, you may notice the following behavior:
Date: 08/05/2004
Time: 1:11:19 AM
Source: ClusSvc
Category: Failover Mgr
Type: Error
Event ID: 1069
User: N/A
Computer: <Computer Name>
Description:
Cluster resource 'SQL Server (<SQL Server instance name>)' in Resource Group '<Cluster group name>'
failed.
Error message 2
An error message that is similar to the following is in the Cluster log file:
Error message 3
Error messages that are similar to the following are in the SQL Server error log file:
2003-11-30 17:00:37.27 (nine entries with this timestamp; the message text is missing)
CAUSE
The resource-specific registry keys that correspond to the SQL Server cluster resource that you are trying to
bring online are missing. This problem also occurs if the values that correspond to the resource-specific
registry keys are not correct.
RESOLUTION
Important: This section, method, or task contains steps that tell you how to modify the registry. However,
serious problems might occur if you modify the registry incorrectly. Therefore, make sure that you follow
these steps carefully. For added protection, back up the registry before you modify it. Then, you can restore
the registry if a problem occurs. For more information about how to back up and restore the registry, click
the following article number to view the article in the Microsoft Knowledge Base:
322756 How to back up and restore the registry in Windows
To resolve this problem, you must manually re-create the resource-specific registry keys that correspond to
the SQL Server cluster resource. To do this, follow these steps:
1. Click Start, click Run, type Regedit, and then click OK.
2. Locate the following registry key:
HKEY_LOCAL_MACHINE\Cluster\Resources\<GUID>\Parameters
3. Re-create the following values under the Parameters key.
For a default instance of SQL Server:
InstanceName
Value Name: InstanceName
Value Type: REG_SZ
Value Data: MSSQLSERVER
VirtualServerName
Value Name: VirtualServerName
Value Type: REG_SZ
Value Data: <Name of the virtual SQL server>
For a named instance of SQL Server:
InstanceName
VirtualServerName
Value Name: VirtualServerName
Value Type: REG_SZ
Value Data: <Name of the virtual SQL server>
4.
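The CAUSE section names two failure modes: the resource-specific values are missing, or their data is not correct. A minimal sketch of a check for the first mode follows. It assumes the values under the Parameters key have already been read into a dictionary (for example by hand, or with Python's winreg module on a cluster node); the function name is hypothetical, and the check cannot tell whether a non-empty value is actually the right one.

```python
# Value names that the RESOLUTION section re-creates under
# HKEY_LOCAL_MACHINE\Cluster\Resources\<GUID>\Parameters.
REQUIRED_VALUES = ("InstanceName", "VirtualServerName")


def check_parameters(values):
    """Return a list of problems found in a dict of value name -> REG_SZ data."""
    problems = []
    for name in REQUIRED_VALUES:
        data = values.get(name)
        if data is None:
            problems.append(f"{name} is missing")
        elif not isinstance(data, str) or not data.strip():
            problems.append(f"{name} is empty or not a string")
    return problems
```

For a default instance, `check_parameters({"InstanceName": "MSSQLSERVER"})` reports that VirtualServerName is missing, which matches the situation the article describes.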
Failover Cluster Troubleshooting
Updated: October 21, 2015
Applies To: SQL Server 2014, SQL Server 2016 Preview
This topic provides information about the following issues:
1. In the Failover Cluster snap-in, in the console tree, make sure Failover Cluster Management is selected and
then, under Management, click Validate a Configuration.
2. Follow the instructions in the wizard to specify the servers and the tests, and run the tests. The Summary page
appears after the tests run.
3. While still on the Summary page, click View Report to view the test results.
To view the results of the tests after you close the wizard, see %SystemRoot%\Cluster\Reports\Validation
Report date and time.html, where %SystemRoot% is the folder in which the operating system is installed (for
example, C:\Windows).
4. To view help topics that will help you interpret the results, click More about cluster validation tests.
To view help topics about cluster validation after you close the wizard, in the Failover Cluster snap-in, click Help, click Help
Topics, click the Contents tab, expand the contents for the failover cluster help, and click Validating a Failover Cluster
Configuration.
After the validation wizard has completed, the Summary Report will display the results. All tests must
pass with either a green check mark or in some cases a yellow triangle (warning). When looking for problem areas (red Xs
or yellow question marks), in the part of the report that summarizes the test results, click an individual test to review the
details. Any red X issues will need to be resolved prior to troubleshooting SQL Server issues.
Install Updates
Installing updates is an important part of avoiding problems with your system. Useful links:
Recommended hotfixes and updates for Windows Server 2012 R2-based failover clusters
Recommended hotfixes and updates for Windows Server 2012-based failover clusters
Recommended hotfixes and updates for Windows Server 2008 R2-based failover clusters
Recommended hotfixes and updates for Windows Server 2008-based failover clusters
Hardware failure in one node of a two-node cluster. This hardware failure could be caused by a failure in the SCSI
card or in the operating system.
To recover from this failure, remove the failed node from the failover cluster using the SQL Server Setup program,
address the hardware failure with the computer offline, bring the machine back up, and then add the repaired
node back to the failover cluster instance.
For more information, see Create a New SQL Server Failover Cluster (Setup) and Recover from Failover Cluster
Instance Failure.
Operating system failure. In this case, the node is offline, but is not irretrievably broken.
To recover from an operating system failure, recover the node and test failover. If the SQL Server instance does
not fail over properly, you must use the SQL Server Setup program to remove SQL Server from the failover
cluster, make necessary repairs, bring the computer back up, and then add the repaired node back to the failover
cluster instance.
Recovering from operating system failure this way can take time. If the operating system failure can be recovered
easily, avoid using this technique.
For more information, see Create a New SQL Server Failover Cluster (Setup) and How to: Recover from Failover
Cluster Failure in Scenario 2.
Issue 1: It is difficult to diagnose Setup issues when using the /qn switch from the command prompt, as the /qn switch
suppresses all Setup dialog boxes and error messages. If the /qn switch is specified, all Setup messages, including error
messages, are written to Setup log files. For more information about log files, see View and Read SQL Server Setup Log
Files.
Resolution 1: Use the /qb switch instead of the /qn switch. If you use the /qb switch, the basic UI in each step will be
displayed, including error messages.
Clear the Affect the Group check box on the Advanced tab of the Full Text Properties dialog box. However, if
SQL Server causes a failover, the full-text search service restarts.
1. Determine on which node the group containing the instance of SQL Server is running by using the Cluster
Administrator. For this example, it is Node A.
2. Start the SQL Server service on that computer using net start. For more information about using net start,
see Starting SQL Server Manually.
3. Start SQL Server Configuration Manager on Node A. View the pipe name on which the server is
listening. It should be similar to \\.\$$\VIRTSQL\pipe\sql\query.
4.
5. Create an alias SQLTEST1 to connect through Named Pipes to this pipe name. To do this, enter Node A as the
server name and edit the pipe name to be \\.\pipe\$$\VIRTSQL\sql\query.
6. Connect to this instance using the alias SQLTEST1 as the server name.
Problem: Cluster Setup Error: "The installer has insufficient privileges to access this directory:
<drive>\Microsoft SQL Server. The installation cannot continue. Log on as an administrator or contact
your system administrator"
Issue: This error is caused by a SCSI shared drive that is not partitioned properly.
Resolution: Re-create a single partition on the shared disk using the following steps:
1.
2.
3.
4. Create one partition on the shared disk, format the disk, and assign a drive letter to the disk.
5.
6.
1. In Control Panel, open Administrative Tools, and then open Computer Management.
2. In the left pane of Computer Management, expand Services and Applications, and then click Services.
3. In the right pane of Computer Management, right-click Distributed Transaction Coordinator, and
select Properties.
4. In the Distributed Transaction Coordinator window, click the General tab, and then click Stop to stop the
service.
5. In the Distributed Transaction Coordinator window, click the Logon tab, and set the logon account to NT
AUTHORITY\NetworkService.
6. Click Apply and OK to close the Distributed Transaction Coordinator window. Close the Computer
Management window. Close the Administrative Tools window.
Root Cause
This problem occurs because of a security feature named loopback check. By default, the loopback check is turned
on in Windows and the value of the DisableLoopbackCheck registry entry is set to 0 (zero).
http://support.microsoft.com/kb/957097/
With this feature turned on, Windows does not allow NTLM authentication when a server is accessed from the server itself by a name that
is neither its NetBIOS name nor its IP address.
When SQL Server Agent is started, the SQL Agent resource accesses SQL Server by the SQL virtual server name, so NTLM authentication
is not allowed. SQL Server Agent fails to start, and creation of the SQL Server Agent resource fails as well.
The SQL Server resource fails to come online because the IsAlive check uses NTLM authentication: the Cluster service startup account
resolves as NT AUTHORITY\ANONYMOUS LOGON when connecting to SQL Server for the IsAlive check, and the connection fails.
This issue does not occur if the startup account of SQL Server has permissions to read and write SPNs.
After the installation fails, you will see that the SQL Server resource was created but the SQL Agent resource was not.
Option 1
1. After the failure, create the SPNs manually using the SetSPN tool, or configure the SQL Server service to create SPNs dynamically for the SQL
Server instances (refer to KB 811889).
To add the SQL Server Agent resource type, execute the command below:
cluster restype "SQL Server Agent" /create /DLL:sqagtres.dll
Once it completes, the output reports that the resource type SQL Server Agent was created.
Make sure that the newly created SQL Server Agent resource has the VirtualServerName and InstanceName properties.
To add these properties, go to Failover Cluster Management ==> SQL Server Agent Resource ==> Properties ==> Properties,
check for the two parameters (VirtualServerName and InstanceName), and fill in the
details.
Option 2
1. Do a complete uninstall of the failed installation, or configure the SQL Server service to create SPNs dynamically for the SQL Server instances
(refer to KB 811889) and move to step 3.
Example:
SETSPN -A MSSQLSVC/VSName.XX.XX.EDU:1433
SETSPN -A MSSQLSVC/VSName.XX.XX.EDU
Note: Beginning with SQL Server 2008, the SPN format changed, and the new SPN format does not require a port number.
Refer: http://msdn.microsoft.com/en-us/library/ms191153.aspx
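The two SETSPN lines in the example differ only in the optional port suffix. As a rough illustration, the helper below builds both SPN strings for a virtual server name. The function name and parameters are assumptions made for this sketch; it only formats strings and does not register anything in Active Directory.

```python
def mssql_spns(fqdn, port=None):
    """Build the SPN strings shown in the example for a virtual server FQDN.

    Returns the port-less form first; the host:port form is added only
    when a port is supplied.
    """
    spns = [f"MSSQLSVC/{fqdn}"]
    if port is not None:
        spns.append(f"MSSQLSVC/{fqdn}:{port}")
    return spns


if __name__ == "__main__":
    # Placeholder host name taken from the example above.
    for spn in mssql_spns("VSName.XX.XX.EDU", 1433):
        print(spn)
```

Each returned string corresponds to the SPN argument of one SETSPN -A line in the example.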
Option 3 (Recommended)
a. Click Start, click Run, type regedit, and then click OK.
b. Locate the following registry path:
HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Lsa
c. Right-click Lsa, select New, and then click DWORD Value.
d. Type DisableLoopbackCheck, and then press ENTER.
e. Right-click DisableLoopbackCheck, and then click Modify.
f. In the Value data box, type 1, and then click OK.
3. Do a complete uninstall and re-run setup, or follow the steps from step 2 in Option 1.
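Steps a through f above create a single DWORD value, so the same change can be written down as registry-file text for review before anyone applies it. The sketch below only generates that text. The function name is hypothetical, and importing the result with Registry Editor is left to the administrator, with the usual registry backup taken first.

```python
LSA_KEY = r"HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Lsa"


def loopback_check_reg(disable=True):
    """Return .reg-file text that sets DisableLoopbackCheck under the Lsa key.

    disable=True writes 1 (loopback check off, as in step f);
    disable=False writes 0 (the default described under Root Cause).
    """
    value = 1 if disable else 0
    lines = [
        "Windows Registry Editor Version 5.00",
        "",
        f"[{LSA_KEY}]",
        f'"DisableLoopbackCheck"=dword:{value:08x}',
        "",
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    print(loopback_check_reg())
```

Calling `loopback_check_reg(disable=False)` yields the text that restores the default value of 0.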
Note:
1. The above error also occurs when installing a named instance of SQL Server while the SQL Server Browser service is stopped.
2. If you installed SQL Server 2012 (Denali) and then uninstalled it on the same cluster, you might encounter the above issue. Refer to the link below for
details.
http://mssqlwiki.com/2012/01/31/sql-server-resource-fails-to-come-online-is-alive-check-fails/
Installation of SQL Server 2005/2008/2012 Fails on Windows 2008 Cluster
Installation of SQL Server 2005/2008/2012 on a Windows 2008/2012 cluster might fail with the errors below.
Error 1:
The cluster resource SQL Server could not be brought online due to an error bringing the dependency resource SQL Network Name ( )
online. Refer to the Cluster Events in the Failover Cluster Manager for more information.
Click Retry to retry the failed action, or click Cancel to cancel this action and continue setup.
Error 2:
The cluster identity CLUSTERNAME$ can create computer objects. By default all computer objects are created in the Computers container;
consult the domain administrator if this location has been changed.
The quota for computer objects has not been reached.
If there is an existing computer object, verify the Cluster Identity CLUSTERNAME$ has Full Control permission to that computer object
using the Active Directory Users and Computers tool.
ERR [RES] Network Name <SQL Network Name (VSServerName)>: Computer account VSServerName couldn't be re-enabled. status 5
INFO [NM] Received request from client address ASPMB9000D05N1.
INFO [NM] Received request from client address ASPMB9000D05N1.
ERR [RHS] Online for resource SQL Network Name (VSServerName) failed.
INFO [RCM] HandleMonitorReply: ONLINERESOURCE for SQL Network Name (VSServerName), gen(2) result 5018.
INFO [RCM] TransitionToState(SQL Network Name (VSServerName)) OnlinePending>ProcessingFailure.
ERR [RCM] rcm::RcmResource::HandleFailure: (SQL Network Name (VSServerName))
ERR [RES] Network Name <SQL Network Name (VSServerName)>: Unable to create computer account VSServerName on DC \\ay.xz.dc, in default Computers container, status 5
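When the cluster log is long, it helps to pull out only the ERR entries and any trailing status code. The sketch below assumes each entry sits on one line in the shortened form quoted above (level, bracketed component, message); a real cluster log also carries timestamps and thread identifiers, so the pattern would need adjusting. The function name is hypothetical. Status 5 is the Windows "access denied" error code, which fits the permissions cause described in this section.

```python
import re

# Matches the shortened form used in the excerpt: LEVEL [COMPONENT] message
LINE_RE = re.compile(r"^(ERR|WARN|INFO)\s+\[(\w+)\]\s+(.*)$")
# Picks up a numeric status at the very end of a message, e.g. "status 5"
STATUS_RE = re.compile(r"status (\d+)\s*$")


def error_entries(lines):
    """Return (component, status or None, message) for every ERR line."""
    entries = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match is None or match.group(1) != "ERR":
            continue
        message = match.group(3)
        status = STATUS_RE.search(message)
        entries.append(
            (match.group(2), int(status.group(1)) if status else None, message)
        )
    return entries
```

Filtering the result for entries whose status is 5 narrows the log down to the lines that point at missing permissions.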
Cause
When you create a new clustered network name, a computer object (computer account) for that clustered service or application must be
created in the Active Directory domain.
This computer object is created by the computer object of the cluster itself. The computer object of the cluster is responsible for creating the
computer object for "SQL Virtual Server" in Active Directory. If the computer object of the cluster itself does not have the appropriate
permissions, it cannot create or update the computer object for "SQL Virtual Server", so the installation of SQL Server fails.
To resolve this issue, grant the "Create Computer Objects" permission to the computer object created for the cluster (the Cluster Name
Object, or CNO).
Issue:
Login failed for user ccccc\xxxxx. Reason: Server is in script upgrade mode. Only administrator can connect at this time.
Script level upgrade for database master failed because upgrade step sqlagent100_msdb_upgrade.sql encountered error 598, state 1,
severity 25. This is a serious error condition which might interfere with regular operation and the database will be taken offline. If the error
happened during upgrade of the master database, it will prevent the entire SQL Server instance from starting. Examine the previous Error
log entries for errors, take the appropriate corrective actions and re-start the database so that the script upgrade steps run to completion.
Resolution:
Start SQL Server from the Services console or the command prompt and wait until the master database is completely upgraded.
Once the script upgrade is complete, start SQL Server normally from Cluster Administrator or Failover Cluster Manager.
Cause: During script upgrade mode only an administrator can connect to SQL Server. When the SQL Server resource is brought online,
the IsAlive check therefore fails before the upgrade completes, and the SQL Server resource goes down. When SQL Server is started from Services or the
command prompt, the IsAlive check doesn't happen, so the upgrade completes. Once the upgrade is completed, SQL Server can be started normally.
SQL Server 2012 : Script level upgrade for database master failed because upgrade step msdb110_upgrade.sql encountered error
SQL Server 2008/2012 instance fails to start or hangs after service pack or Cumulative update installation.
Error
Script level upgrade for database master failed because upgrade step sqlagent100_msdb_upgrade.sql encountered error 574, state 0,
severity 16. This is a serious error condition which might interfere with regular operation and the database will be taken offline. If the error
happened during upgrade of the master database, it will prevent the entire SQL Server instance from starting. Examine the previous errorlog
entries for errors, take the appropriate corrective actions and re-start the database so that the script upgrade steps run to completion.
Script level upgrade for database master failed because upgrade step msdb110_upgrade.sql encountered error 15173, state 1, severity 16
Start SQL Server from the command prompt using trace flag 902 (-T902) to disable script execution.
Error:
Directory lookup for the file "P:\Data\temp_MS_AgentSigningCertificate_database.mdf" failed with the operating system error 2(The system
cannot find the file specified.).
CREATE DATABASE failed. Some file names listed could not be created. Check related errors.
spid7s Script level upgrade for database master failed because upgrade step sqlagent100_msdb_upgrade.sql encountered error 598, state
1, severity 25.
CREATE FILE encountered operating system error 3(The system cannot find the path specified.) while attempting to open or create the
physical file Q:\Data\temp_MS_AgentSigningCertificate_database_log.LDF.
This error is raised when the default database location is invalid. Edit the registry entry below so that the default database location is a valid directory.
3. Check whether there are orphaned users in the system databases and fix them.
Ex: Script level upgrade for database master failed because upgrade step msdb110_upgrade.sql encountered error 15173, state 1, severity
16
You can use the script below to identify the users who have permissions granted on ##MS_PolicyEventProcessingLogin##
Resolution
If none of the above resolves the issue, you can use trace flag -T3601, which causes the first 512 characters of each batch
being executed to be printed to the error log. Identify the failing batch and troubleshoot it.
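The "Script level upgrade ... failed" messages quoted in this section share a fixed shape, so the failing upgrade step and error number can be extracted from an error log automatically. The following is a minimal sketch under that assumption: the regular expression is written against the wording quoted above, the function name is hypothetical, and other SQL Server builds may word the message differently.

```python
import re

UPGRADE_RE = re.compile(
    r"Script level upgrade for database (?P<database>\S+) failed because "
    r"upgrade step (?P<step>\S+) encountered error (?P<error>\d+), "
    r"state (?P<state>\d+), severity (?P<severity>\d+)"
)


def parse_upgrade_failure(text):
    """Return the fields of the first upgrade-failure message in `text`, or None."""
    # Collapse line wraps and repeated spaces so wrapped log text still matches.
    normalized = " ".join(text.split())
    match = UPGRADE_RE.search(normalized)
    if match is None:
        return None
    return {
        "database": match.group("database").strip("'"),
        "step": match.group("step").strip("'"),
        "error": int(match.group("error")),
        "state": int(match.group("state")),
        "severity": int(match.group("severity")),
    }
```

The extracted error number (for example 598, 574, or 15173 in the messages above) tells you which of the listed resolutions to look at first.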
Error
2008-03-19 23:06:38 + [396] An idle CPU condition has not been defined OnIdle job schedules will have no effect
2008-03-19 23:06:38 + [408] SQL Server MSSQLSERVER is clustered AutoRestart has been disabled
2008-03-19 23:06:39 ! [298] SQLServer Error: 22022, CryptUnprotectData() returned error -2146893813, Key not valid for use in specified state. [SQLSTATE 42000]
2008-03-19 23:06:39 ! [442] ConnConnectAndSetCryptoForXpstar failed (0).
2008-03-19 23:06:40 ? [098] SQLServerAgent terminated (normally)
Error2
Resolution
Registry key: HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Microsoft SQL Server\MSSQL.X\SQLServerAgent
Value name: ServerHost
Value: np:Virtualservername
This forces SQL Server Agent to connect to SQL Server using Named Pipes, so
delegation is not used.
A hotfix is available for this issue, and it is included in cumulative update package 9 for SQL Server Service
Pack 2. http://support.microsoft.com/?id=956378
Note: Before applying the hotfix, you have to follow the steps in Resolution; otherwise the hotfix will fail. Revert the steps after applying
the fix.
A failure was detected for a previous installation, patch, or repair during configuration for features
[SQL_PowerShell_Engine_CNS,SQL_PowerShell_Tools_ANS,XXXX,XXXXX]. In order to apply this patch package (KB968369 XYZ), you must
resolve any issues with the previous operation that failed.
Look at the subkeys of the registry keys below and check whether the value 2 is set for any component.
Run a repair of the existing setup. The values for all the subkeys will change to 1 after a successful repair.
Resolution: