
http://sqldbatask.blogspot.com/2011/09/how-to-stop-log-shippingalert-which.html

How to stop the log shipping alert which is sending error message 14421?

How to stop the log shipping alert which is sending the below error message?

Error Message: Over the previous interval, there were 5 occurrences of the following event.
Logfile: Application
Event Type: Error
Event ID: 14421
Computer: XXXXXXXX
Source: MSSQL$SQL2008
Text: The log shipping secondary database Servername\SQL2008.Databasename has restore threshold of 45 minutes and is out of sync. No restore was performed for 9426 minutes. Restored latency is 8 minutes. Check agent log and logshipping monitor information.

Cause: The database involved in the log shipping setup on both the source and the destination server is no longer available, because somebody dropped it.

Resolution: The database name entry is still found in the log shipping tables below, which are referenced by the alert jobs LSAlert_XXXXXXXX\SQL2008 and LSAlert_YYYYYYY\SQL2008 on both the primary and secondary server.

Action Taken: After removing the record entry corresponding to the log shipping database from the log shipping tables below, alerts with error message 14421 won't come anymore.

Select * from msdb.dbo.log_shipping_monitor_primary
Select * from msdb.dbo.log_shipping_monitor_secondary

Delete from msdb.dbo.log_shipping_monitor_primary where primary_database = 'logshippingdatabasename'

Delete from msdb.dbo.log_shipping_monitor_secondary where secondary_database = 'logshippingdatabasename'

http://www.mssqltips.com/sqlservertip/2553/different-ways-to-monitor-log-shipping-for-sql-server-databases/

https://mssqlnuggets.wordpress.com/2014/01/27/mssql-error-32023-an-entry-for-primary-server-primary-database-does-not-exist-on-this-secondary-register-the-primary-first/

Hope all is well. Today I would like to post the resolution for an error that I ran into while modifying a log shipping configuration. Log shipping in my scenario is not being used for high availability; I am using the log shipped databases for SSRS reports. We did a failover a couple of weeks back from Server A to Server B using a third-party HA solution, and a couple of weeks later I wanted to change the retention on the secondary for the copied files. I tried to do this through the GUI on the primary server, which is now Server B, and when I applied the changes I got the error below:

SQL Server Management Studio could not save the configuration of ReportServer as a Secondary. An entry for primary server, primary database does not exist on this secondary. Register the primary first. (Microsoft SQL Server, Error: 32023)

I used the script below to fix the issue, but I still had to put @primary_server = Server A and not Server B; if I put Server B I would still receive the same error:
exec master.dbo.sp_change_log_shipping_secondary_primary
    @primary_server = 'Server A',
    @primary_database = 'Db Name',
    @file_retention_period = 'time in minutes'
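The @file_retention_period parameter is expressed in minutes. As a concrete illustration (hypothetical values, not from the original post), keeping copied files for three days would look like this:

exec master.dbo.sp_change_log_shipping_secondary_primary
    @primary_server = 'Server A',
    @primary_database = 'Db Name',
    @file_retention_period = 4320  -- 3 days x 24 hours x 60 minutes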

Description of error message 14420 and error message 14421 that occur when you use log shipping in SQL Server http://support.microsoft.com/kb/329133

Errors in Logshipping
http://social.msdn.microsoft.com/Forums/sqlserver/en-US/97159ada-d7fe-4334-9b17-184674a11535/errors-in-logshipping?forum=sqlgetstarted

Logshipping secondary server is out of sync and LSRestore job failing


http://blogs.technet.com/b/mdegre/archive/2009/08/11/logshipping-secondary-server-is-out-of-sync-and-lsrestore-job-failing.aspx
Logshipping secondary server is out of sync and transaction log restore job failing.

Problem
You can see that your log shipping is broken. In the SQL Error log, the message below is displayed:

Error: 14421, Severity: 16, State: 1.
The log shipping secondary database myDB.logshippingPrimary has restore threshold of 45 minutes and is out of sync. No restore was performed for 6258 minutes.

Description of error message 14420 and error message 14421 that occur when you use log shipping in SQL Server

http://support.microsoft.com/default.aspx?scid=329133

Cause
Inside the LSRestore job history, you can find two kinds of messages:

- The restore job is skipping logs on the secondary server:
Skipped log backup file. Secondary DB: 'logshippingSecondary', File: '\\myDB\logshipping\logshippingPrimary_20090808173803.trn'

- An older log backup is missing:
*** Error 4305: The file '\\myDB\logshipping\logshippingPrimary_20090808174201.trn' is too recent to apply to the secondary database 'logshippingSecondary'.
*** Error: The log in this backup set begins at LSN 18000000005000001, which is too recent to apply to the database. An earlier log backup that includes LSN 18000000004900001 can be restored.

Transaction log backups can only be restored if they are in sequence. If the LastLSN field and the FirstLSN field do not display the same number on consecutive transaction log backups, they are not restorable in that sequence. There may be several reasons for transaction log backups to be out of sequence. Some of the most common are redundant transaction log backup jobs on the primary server breaking the sequence, or the recovery model of the database being toggled between transaction log backups.
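To verify the chain before hunting for the stray backup, the first/last LSNs that msdb records for each log backup can be compared directly; a minimal sketch (run on the primary), where each first_lsn should equal the previous backup's last_lsn:

-- Log backups chain correctly only when each first_lsn equals
-- the last_lsn of the backup that precedes it.
SELECT s.backup_finish_date, s.first_lsn, s.last_lsn, y.physical_device_name
FROM msdb.dbo.backupset AS s
INNER JOIN msdb.dbo.backupmediafamily AS y ON s.media_set_id = y.media_set_id
WHERE s.database_name = 'databaseNamePrimaryServer'
  AND s.type = 'L'   -- log backups only
ORDER BY s.backup_finish_date;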

Resolution
First, check whether there are gaps in the restore process. You can run the query below to find out whether a redundant log backup was performed:

SELECT s.database_name, s.backup_finish_date, y.physical_device_name
FROM msdb..backupset AS s
INNER JOIN msdb..backupfile AS f ON f.backup_set_id = s.backup_set_id
INNER JOIN msdb..backupmediaset AS m ON s.media_set_id = m.media_set_id
INNER JOIN msdb..backupmediafamily AS y ON m.media_set_id = y.media_set_id
WHERE (s.database_name = 'databaseNamePrimaryServer')
ORDER BY s.backup_finish_date DESC;

If you see that another log backup was taken outside of the log shipping process, you just have to restore this backup on the secondary and run the LSRestore job.

http://support.unitrends.com/ikm/questions.php?questionid=423

http://cadarsh.blogspot.com/2011/03/violation-of-primary-key-constraint-in.html

Violation of PRIMARY KEY constraint in Log Shipping

I have a SQL Server 2005 log shipping setup with a primary/secondary configuration. I can confirm from the logs that log shipping is working without issue; however, reports generated from the monitor server show this message:

Violation of PRIMARY KEY constraint 'PK__#log_shipping_mo__3ABBDC91'. Cannot insert duplicate key in object 'dbo.#log_shipping_monitor'. The statement has been terminated.

There is nothing special about the configuration. Any ideas?

I think the problem you are seeing is related to some old information being present in the tables used to store the log shipping configuration. There are some scenarios where this can happen, and it causes the problem you reported in your first post. We are working on correcting this in a future release. As you can tell from the error, the problem is caused by an insert into a temp table violating a PK constraint. The PK for the temp table is server name and database name. This error is normally caused by old configuration being present in the tables log_shipping_monitor_primary and/or log_shipping_monitor_secondary. You can view the contents of these tables directly (in msdb) or use the supplied helper stored procedures (see the BOL topic titled "Log Shipping Tables and Stored Procedures"). The workaround is to remove the old rows from the log_shipping_monitor_primary and/or log_shipping_monitor_secondary tables. Can you determine whether you do indeed have stale data in the log shipping tables? If so, I can work with you on how to remove the old rows. The old configuration data is probably causing the incorrect alerts you are seeing.
Problem Description
===============
After role reversal in a log shipping setup, the stored procedure sp_help_log_shipping_monitor may return the following error, or we may find the error in the log shipping report in SQL Server Management Studio:

Msg 2627, Level 14, State 1, Procedure sp_help_log_shipping_monitor, Line 148
Violation of PRIMARY KEY constraint 'PK__#log_shipping_mo__15502E78'. Cannot insert duplicate key in object 'dbo.#log_shipping_monitor'. The statement has been terminated.

The stored procedure sp_help_log_shipping_monitor fetches data from the log_shipping_monitor_primary and log_shipping_monitor_secondary tables and puts it into a temp table called #log_shipping_monitor. This temp table has a primary key on the server name and the database name.

log_shipping_monitor_primary - Stores one monitor record per primary database in each log shipping configuration
log_shipping_monitor_secondary - Stores one monitor record per secondary database in a log shipping configuration
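Since the temp table's primary key is (server name, database name), the stale rows can be spotted by grouping on exactly that pair; a minimal sketch, assuming the standard msdb monitor tables:

-- Any pair appearing more than once here would violate the PK
-- of #log_shipping_monitor inside sp_help_log_shipping_monitor.
SELECT secondary_server, secondary_database, COUNT(*) AS row_count
FROM msdb.dbo.log_shipping_monitor_secondary
GROUP BY secondary_server, secondary_database
HAVING COUNT(*) > 1;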

The problem may happen under a few scenarios, such as:

1. A manual failover was done for the log shipped database. Since the roles are not switched on the monitor server, there would be two entries in the log_shipping_monitor_secondary table. This would cause the stored procedure sp_help_log_shipping_monitor to fail.

2. There was incorrect metadata cleanup of a previous log shipping setup for the same log shipped database.

Resolution
========
Note: Any incorrect update/deletion in the metadata tables may lead to inconsistencies in the log shipping setup.

Let us say:
Log shipping database name: TESTDB
Primary server name: Server\Primary
Secondary server name: Server\Secondary

Check the entries under:

1) select primary_server, primary_database from msdb.dbo.log_shipping_monitor_primary

Output:
primary_server    primary_database
----------------  ----------------
Server\Primary    TestDB

The primary server should have an entry corresponding to its primary database. It should show primary_server as Server\Primary and primary_database as TESTDB.

2) select secondary_server, secondary_database, primary_server, primary_database from msdb.dbo.log_shipping_monitor_secondary

Output:
secondary_server    secondary_database    primary_server    primary_database
------------------  --------------------  ----------------  ----------------
Server\Secondary    TestDB                Server\Primary    TestDB

The secondary server should have a correct entry corresponding to its primary server and primary database. It should show secondary_server as Server\Secondary, secondary_database as TESTDB, primary_server as Server\Primary, and primary_database as TESTDB.

Now if you find any mismatch in the above output, you need to update/delete the incorrect entries manually using a DELETE or UPDATE command, depending on the scenario. For example:

Scenario 1:

select secondary_server, secondary_database, primary_server, primary_database from msdb.dbo.log_shipping_monitor_secondary

Output:
secondary_server    secondary_database    primary_server       primary_database
------------------  --------------------  -------------------  ----------------
Server\Secondary    TestDB                Server\other_server  TestDB

The output here shows an incorrect server under the primary_server column. To resolve this, update msdb.dbo.log_shipping_monitor_secondary manually:

Update msdb.dbo.log_shipping_monitor_secondary
Set primary_server = 'Server\Primary'
where primary_server = 'Server\other_server'

Scenario 2:

select secondary_server, secondary_database, primary_server, primary_database from msdb.dbo.log_shipping_monitor_secondary

Output:
secondary_server    secondary_database    primary_server       primary_database
------------------  --------------------  -------------------  ----------------
Server\Secondary    TestDB                Server\Primary       TestDB
Server\Secondary    TestDB                Server\other_server  TestDB

In this scenario, the output shows uncleaned metadata which might be from a previous log shipping setup, so delete the entry from msdb.dbo.log_shipping_monitor_secondary:

Delete from msdb.dbo.log_shipping_monitor_secondary
where primary_server = 'Server\other_server'

Note: A design change was made in SQL Server 2008 to prevent this issue from occurring.

http://blog.sungardas.com/2013/06/database-mirroring-in-sql-server-2005-consider-logshipping/

http://swamy-sqlserver.blogspot.com/2011/06/how-logshipping-works.html

http://swamy-sqlserver.blogspot.com/2011/06/maintenance-plan-issue.html

Maintenance plan issue


I am also unable to create a new maintenance plan; I get the same error. Earlier, backups were going to one shared disk. Now a new disk has been added to the server, and my manager asked me to re-schedule backups to the new disk. Can anyone please help me with this?

TITLE: Microsoft SQL Server Management Studio
------------------------------
Creating an instance of the COM component with CLSID {E80FE1DB-D1AA-4D6B-BA7E-040D424A925C} from the IClassFactory failed due to the following error: c001f011. (Microsoft.SqlServer.ManagedDTS)

Resolution

Step 1: Execute the query below to obtain the maintenance plan name and ID:
SELECT NAME, ID FROM MSDB..SYSMAINTPLAN_PLANS

Step 2: Replace the ID obtained from Step 1 into the query below and delete the entry from the log table:
DELETE FROM SYSMAINTPLAN_LOG WHERE PLAN_ID = ' '

Step 3: Replace the ID obtained from Step 1 into the query below and delete the entry from the subplans table:
DELETE FROM SYSMAINTPLAN_SUBPLANS WHERE PLAN_ID = ' '

Step 4: Finally, delete the maintenance plan using the query below, where ID is obtained from Step 1:
DELETE FROM SYSMAINTPLAN_PLANS WHERE ID = ' '

Step 5: Check and delete the corresponding jobs from SSMS if they exist.
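Steps 1 through 4 can be combined into a single script; a minimal sketch, assuming SQL Server 2005 (in later versions sysmaintplan_plans is a view and the direct delete may fail) and a hypothetical plan name:

DECLARE @plan_id UNIQUEIDENTIFIER;

SELECT @plan_id = id
FROM msdb.dbo.sysmaintplan_plans
WHERE name = 'MyMaintenancePlan';   -- hypothetical plan name

DELETE FROM msdb.dbo.sysmaintplan_log      WHERE plan_id = @plan_id;
DELETE FROM msdb.dbo.sysmaintplan_subplans WHERE plan_id = @plan_id;
DELETE FROM msdb.dbo.sysmaintplan_plans    WHERE id = @plan_id;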

Log Shipping issues-FAQ


Introduction: I see people asking about the known errors of log shipping in many forums, and we repeatedly provide the same solutions. Hence I thought of consolidating all the known errors and their solutions for log shipping in this post as an FAQ. I have probably not picked up every known error; if I have missed anything you can very well post it in the FORUMS section.
-----------------------------------------------------------------------------
Question: Is it possible to log ship a database between SQL 2000 and SQL 2005?
Answer: No, that's not possible. In SQL 2005 the transaction log architecture changed compared to SQL 2000, and hence you won't be able to restore t-log backups from SQL 2000 to SQL 2005 or vice versa.
-----------------------------------------------------------------------------
Question: How to failover in SQL 2005 log shipping?
Answer: I can best ask you to check out the link Failover in SQL 2005 Log Shipping; Deepak has written this article clearly.
-----------------------------------------------------------------------------
Question: I'm getting the below error message in the restoration job on the secondary server. Why?
[Microsoft SQL-DMO (ODBC SQLState: 42000)] Error 4305: [Microsoft][ODBC SQL Server Driver][SQL Server]The log in this backup set begins at LSN 7000000026200001, which is too late to apply to the database. An earlier log backup that includes LSN 6000000015100001 can be restored. [Microsoft][ODBC SQL Server Driver][SQL Server]RESTORE LOG is terminating abnormally.
Answer: Was your SQL Server or Agent restarted yesterday on either the source or the destination? The error states that there is a mismatch in LSN: a particular transaction log was not applied on the destination server, and hence the subsequent transaction logs cannot be applied as a result. You can check the log shipping monitor / log shipping tables to see which transaction log was last applied to the secondary db. If the next consecutive transaction logs are available in the secondary server share folder, you can manually RESTORE the logs with the NORECOVERY option (see the sketch below this set of questions); once you have restored all the logs, the job will work fine automatically from the next cycle. In case you are not able to find the next transaction log in the secondary server shared folder, you need to reconfigure log shipping. Try the tasks below to re-establish log shipping again:
1. Disable all the log shipping jobs on the source and destination servers.
2. Take a full backup on the source and restore it on the secondary server using the With Standby option.
3. Enable all the jobs you disabled previously in step 1.
-----------------------------------------------------------------------------
Question: Is it possible to load balance in log shipping?
Answer: Yes, of course it's possible in log shipping. While configuring log shipping you have the option to choose standby or no-recovery mode; select the STANDBY option there to make the secondary database read-only. For SQL 2005 log shipping configuration check out the link 10 Steps to configure Log Shipping.
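For the manual catch-up described in the Error 4305 answer above, the restore loop looks roughly like this; a minimal sketch with hypothetical database and file names:

-- Apply each missing transaction log backup in sequence, keeping the
-- database non-recovered so that later logs can still be applied.
RESTORE LOG MyLogShippedDB
FROM DISK = N'\\SecondaryServer\LogShipShare\MyLogShippedDB_20110601120000.trn'  -- hypothetical path
WITH NORECOVERY;
-- Repeat for each subsequent .trn file, then let the restore job resume.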

-----------------------------------------------------------------------------
Question: Can I take a full backup of the log shipped database on the primary server?
Answer: In SQL Server 2000 you won't be able to take a full backup of a log shipped database, because this will break the LSN chain and directly affect log shipping. In SQL Server 2005, yes, it's possible: you can take a full backup of the log shipped database and this won't affect log shipping.
-----------------------------------------------------------------------------
Question: Can I shrink the log shipped database's log file?
Answer: Yes, of course you can shrink the log file, but you shouldn't use the WITH TRUNCATE_ONLY option. If you use this option, log shipping will obviously be disturbed.
-----------------------------------------------------------------------------
Question: Can I take a full backup of the log shipped database on the secondary server?
Answer: No chance. You won't be able to execute the BACKUP command against a log shipped database on the secondary server.
-----------------------------------------------------------------------------
Question: I've configured log shipping successfully in standby mode, but in the restoration job I'm getting the error below. What do I do to avoid this in the future?
Message
2006-07-31 09:40:54.33 *** Error: Could not apply log backup file 'C:\Program Files\Microsoft SQL Server\MSSQL.1\MSSQL\Backup\LogShip\TEST_20060731131501.trn' to secondary database 'TEST'.(Microsoft.SqlServer.Management.LogShipping) ***
2006-07-31 09:40:54.33 *** Error: Exclusive access could not be obtained because the database is in use. RESTORE LOG is terminating abnormally.(.Net SqlClient Data Provider) ***
Answer: To restore transaction logs to the secondary db, SQL Server needs exclusive access to the database. When you configure it in standby mode, users are able to access the database and run queries against the secondary db. Hence, if the scheduled restore job runs at that time, the db will have a lock and it won't allow SQL Server to restore the t-logs. To avoid this you need to check the "Disconnect users in the database when restoring backups" option in the log shipping configuration wizard (a configuration sketch follows at the end of this FAQ). Check the link 10 Steps to configure Log Shipping.
-----------------------------------------------------------------------------
Question: Can you tell me the prerequisites for configuring log shipping?
Answer: Check out the link Pre-requisites for Log Shipping.
-----------------------------------------------------------------------------
Question: Suddenly I'm getting the error below. How can I rectify this?
[Microsoft SQL-DMO (ODBC SQLState: 42000)] Error 4323: [Microsoft][ODBC SQL Server Driver][SQL Server]The database is marked suspect. Transaction logs cannot be restored. Use RESTORE DATABASE to recover the database. [Microsoft][ODBC SQL Server Driver][SQL Server]RESTORE LOG is terminating abnormally

Answer: We had the same issue some time ago; it was related to a new file being created in a filegroup on the source. I don't know if this applies to your case, but restoring a backup of this new file on the secondary server solved the problem.
-----------------------------------------------------------------------------
Question: Is it possible to log ship a database from SQL Server 2005 to SQL Server 2008 and vice versa?
Answer: Yes, you can log ship a database from SQL Server 2005 to SQL Server 2008; this will work. However, log shipping from SQL Server 2008 to SQL Server 2005 is not possible, because you won't be able to restore a SQL Server 2008 backup on SQL Server 2005 (downgrading the version).
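On the "exclusive access" question above, the same disconnect-users behaviour can also be set outside the wizard with the log shipping stored procedures; a minimal sketch run on the secondary server, with a hypothetical database name:

EXEC master.dbo.sp_change_log_shipping_secondary_database
    @secondary_database = N'TEST',   -- hypothetical database name
    @disconnect_users = 1;           -- drop user connections before each restore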

Thursday, June 2, 2011

How does server failover work?


What is a Cluster?
A server cluster is a collection of servers, called nodes, that communicate with each other to make a set of services highly available to clients. Server clusters are based on one of the two clustering technologies in the Microsoft Windows Server 2003 operating systems; the other clustering technology is Network Load Balancing. Server clusters are designed for applications that have long-running in-memory state or frequently updated data. Typical uses for server clusters include file servers, print servers, database servers, and messaging servers. This section provides technical background information about how the components within a server cluster work.

Server Cluster Architecture
---------------------------
The most basic type of cluster is a two-node cluster with a single quorum device. For a definition of a single quorum device, see "What Is a Server Cluster?". The following figure illustrates the basic elements of a server cluster, including nodes, resource groups, and the single quorum device, that is, the cluster storage.

Figure: Basic Elements of a Two-Node Cluster with Single Quorum Device

Applications and services are configured as resources on the cluster and are grouped into resource groups. Resources in a resource group work together and fail over together when failover is necessary. When you configure each resource group to include not only the elements needed for the application or service but also the associated network name and IP address, that collection of resources runs as if it were a separate server on the network. When a resource group is configured this way, clients can consistently access the application using the same network name, regardless of which node the application is running on. The preceding figure showed one resource group per node; however, each node can have multiple resource groups. Within each resource group, resources can have specific dependencies. Dependencies are relationships between resources that indicate which resources need to come online before another resource can come online. When dependencies are configured, the Cluster service can bring resources online or take them offline in the correct order during failover. The following figure shows two nodes with several resource groups in which some typical dependencies have been configured between resources. The figure shows that resource groups (not resources) are the unit of failover.

Figure: Resource Dependencies Configured Within Resource Groups

Cluster Service Component Diagrams and Descriptions
The Cluster service runs on each node of a server cluster and controls all aspects of server cluster operation. The Cluster service includes multiple software components that work together. These components perform monitoring, maintain consistency, and smoothly transfer resources from one node to another.

Diagrams and descriptions of the following components are grouped together because the components work so closely together:
- Database Manager (for the cluster configuration database)
- Node Manager (working with Membership Manager)
- Failover Manager
- Global Update Manager

Separate diagrams and descriptions are provided of the following components, which are used in specific situations or for specific types of applications:
- Checkpoint Manager
- Log Manager (quorum logging)
- Event Log Replication Manager
- Backup and Restore capabilities in Failover Manager

Diagrams of Database Manager, Node Manager, Failover Manager, Global Update Manager, and Resource Monitors
The following figure focuses on the information that is communicated between Database Manager, Node Manager, and Failover Manager. The figure also shows Global Update Manager, which supports the other three managers by coordinating updates on other nodes in the cluster. These four components work together to make sure that all nodes maintain a consistent view of the cluster (with each node of the cluster maintaining the same view of the state of the member nodes as the others) and that resource groups can be failed over smoothly when needed.

Figure: Basic Cluster Components: Database Manager, Node Manager, and Failover Manager

The following figure shows a Resource Monitor and resource dynamic-link library (DLL) working with Database Manager, Node Manager, and Failover Manager. Resource Monitors and resource DLLs support applications that are cluster-aware, that is, applications designed to work in a coordinated way with cluster components. The resource DLL for each such application is responsible for monitoring and controlling that application. For example, the resource DLL saves and retrieves application properties in the cluster database, brings the resource online and takes it offline, and checks the health of the resource. When failover is necessary, the resource DLL works with a Resource Monitor and Failover Manager to ensure that the failover happens smoothly.

Figure: Resource Monitor and Resource DLL with a Cluster-Aware Application

Descriptions of Database Manager, Node Manager, Failover Manager, Global Update Manager, and Resource Monitors
The following descriptions provide details about the components shown in the preceding diagrams.

Database Manager
Database Manager runs on each node and maintains a local copy of the cluster configuration database, which contains information about all of the physical and logical items in a cluster. These items include the cluster itself, cluster node membership, resource groups, resource types, and descriptions of specific resources, such as disks and IP addresses. Database Manager uses the Global Update Manager to replicate all changes to the other nodes in the cluster. In this way, consistent configuration information is maintained across the cluster, even if conditions are changing, such as when a node fails and the administrator changes the cluster configuration before that node returns to service.

Database Manager also provides an interface through which other Cluster service components, such as Failover Manager and Node Manager, can store changes in the cluster configuration database. The interface for making such changes is similar to the interface for making changes to the registry through the Windows application programming interface (API). The key difference is that changes received by Database Manager are replicated through Global Update Manager to all nodes in the cluster.

Database Manager functions used by other components
Some Database Manager functions are exposed through the Cluster API. The primary purpose for exposing Database Manager functions is to allow custom resource DLLs to save private properties to the cluster database when this is useful for a particular clustered application. (A private property for a resource is a property that applies to that resource type but not other resource types; for example, the SubnetMask property applies to an IP Address resource but not to other resource types.) Database Manager functions are also used to query the cluster database.

Node Manager
Node Manager runs on each node and maintains a local list of nodes, networks, and network interfaces in the cluster. Through regular communication between nodes, Node Manager ensures that all nodes in the cluster have the same list of functional nodes. Node Manager uses the information in the cluster configuration database to determine which nodes have been added to the cluster or evicted from the cluster. Each instance of Node Manager also monitors the other nodes to detect node failure. It does this by sending and receiving messages, called heartbeats, to each node on every available network. If one node detects a communication failure with another node, it broadcasts a message to the entire cluster, causing all nodes that receive the message to verify their list of functional nodes in the cluster. This is called a regroup event. Node Manager also contributes to the process of a node joining a cluster. At that time, on the node that is joining, Node Manager establishes authenticated communication (authenticated RPC bindings) between itself and the Node Manager component on each of the currently active nodes.

Note

A down node is different from a node that has been evicted from the cluster. When you evict a node from the cluster, it is removed from Node Manager's list of potential cluster nodes. A down node remains on the list of potential cluster nodes even while it is down; when the node and the network it requires are functioning again, the node joins the cluster. An evicted node, however, can become part of the cluster only after you use Cluster Administrator or Cluster.exe to add the node back to the cluster.

Membership Manager
Membership Manager (also called the Regroup Engine) causes a regroup event whenever another node's heartbeat is interrupted (indicating a possible node failure). During a node failure and regroup event, Membership Manager and Node Manager work together to ensure that all functioning nodes agree on which nodes are functioning and which are not.

Cluster Network Driver
Node Manager and other components make use of the Cluster Network Driver, which supports specific types of network communication needed in a cluster. The Cluster Network Driver runs in kernel mode and provides support for a variety of functions, especially heartbeats and fault-tolerant communication between nodes.

Failover Manager and Resource Monitors
Failover Manager manages resources and resource groups. For example, Failover Manager stops and starts resources, manages resource dependencies, and initiates failover of resource groups. To perform these actions, it receives resource and system state information from cluster components on the node and from Resource Monitors. Resource Monitors provide the execution environment for resource DLLs and support communication between resource DLLs and Failover Manager. Failover Manager determines which node in the cluster should own each resource group. If it is necessary to fail over a resource group, the instances of Failover Manager on each node in the cluster work together to reassign ownership of the resource group. Depending on how the resource group is configured, Failover Manager can restart a failing resource locally or can take the failing resource offline along with its dependent resources, and then initiate failover.

Global Update Manager
Global Update Manager makes sure that when changes are copied to each of the nodes, the following takes place:
- Changes are made atomically, that is, either all healthy nodes are updated, or none are updated.

- Changes are made in the order they occurred, regardless of the origin of the change. The process of making changes is coordinated between nodes, so that even if two different changes are made at the same time on different nodes, when the changes are replicated they are put in a particular order and made in that order on all nodes.

Global Update Manager is used by internal cluster components, such as Failover Manager, Node Manager, or Database Manager, to carry out the replication of changes to each node. Global updates are typically initiated as a result of a Cluster API call. When an update is initiated by a node, another node is designated to monitor the update and make sure that it happens on all nodes. If that node cannot make the update locally, it notifies the node that tried to initiate the update, and changes are not made

anywhere (unless the operation is attempted again). If the node that is designated to monitor the update can make the update locally, but then another node cannot be updated, the node that cannot be updated is removed from the list of functional nodes, and the change is made on the available nodes. If this happens, quorum logging is enabled at the same time, which ensures that the failed node receives all necessary configuration information when it is functioning again, even if the original set of nodes is down at that time.

Diagram and Description of Checkpoint Manager
Some applications store configuration information locally instead of, or in addition to, storing information in the cluster configuration database. Applications might store information locally in two ways: one way is to store configuration information in the registry on the local server; another way is to use cryptographic keys on the local server. If an application requires that locally stored information be available on failover, Checkpoint Manager provides support by maintaining a current copy of the local information on the quorum resource. The following figure shows the Checkpoint Manager process.

Figure: Checkpoint Manager

Checkpoint Manager handles application-specific configuration data that is stored in the registry on the local server somewhat differently from configuration data stored using cryptographic keys on the local server. The difference is as follows: for applications that store configuration data in the registry on the local server, Checkpoint Manager monitors the data while the application is online. When changes occur, Checkpoint Manager updates the quorum resource with the current configuration data.

For applications that use cryptographic keys on the local server, Checkpoint Manager copies the cryptographic container to the quorum resource only once, when you configure the checkpoint. If changes are made to the cryptographic container, the checkpoint must be removed and re-associated with the resource.

Before a resource configured to use checkpointing is brought online (for example, for failover), Checkpoint Manager brings the locally stored application data up to date from the quorum resource. This helps make sure that the Cluster service can recreate the appropriate application environment before bringing the application online on any node.

Note
When configuring a Generic Application resource or Generic Service resource, you specify the application-specific configuration data that Checkpoint Manager monitors and copies. When determining which configuration information must be marked for checkpointing, focus on the information that must be available when the application starts.

Checkpoint Manager also supports resources that have application-specific registry trees (not just individual keys) that exist on the cluster node where the resource comes online. Checkpoint Manager watches for changes made to these registry trees when the resource is online (not when it is offline). When the resource is online and Checkpoint Manager detects that changes have been made, it creates a copy of the registry tree on the owner node of the resource and then sends a message to the owner node of the quorum resource, telling it to copy the file to the quorum resource. Checkpoint Manager performs

this function in batches so that frequent changes to registry trees do not place too heavy a load on the Cluster service.

Diagram and Description of Log Manager (for Quorum Logging)
The following figure shows how Log Manager works with other components when quorum logging is enabled (when a node is down).

Figure: Log Manager and Other Components Supporting Quorum Logging

When a node is down, quorum logging is enabled, which means Log Manager receives configuration changes collected by other components (such as Database Manager) and logs the changes to the quorum resource. The configuration changes logged on the quorum resource are then available if the entire cluster goes down and must be formed again. On the first node coming online after the entire cluster goes down, Log Manager works with Database Manager to make sure that the local copy of the configuration database is updated with information from the quorum resource. The same is true in a cluster forming for the first time: on the first node, Log Manager works with Database Manager to make sure that the local copy of the configuration database is the same as the information from the quorum resource.

Diagram and Description of Event Log Replication Manager
Event Log Replication Manager, part of the Cluster service, works with the operating system's Event Log service to copy event log entries to all cluster nodes. These events are marked to show which node the event occurred on. The following figure shows how Event Log Replication Manager copies event log entries to other cluster nodes.

Figure: How Event Log Entries Are Copied from One Node to Another

The following interfaces and protocols are used together to queue, send, and receive events at the nodes:
- The Cluster API
- Local remote procedure calls (LRPC)
- Remote procedure calls (RPC)
- A private API in the Event Log service

Events that are logged on one node are queued, consolidated, and sent through Event Log Replication Manager, which broadcasts them to the other active nodes. If few events are logged over a period of time, each event might be broadcast individually, but if many are logged in a short period of time, they are batched together before broadcast. Events are labeled to show which node they occurred on. Each of the other nodes receives the events and records them in the local log. Replication of events is not guaranteed by Event Log Replication Manager: if a problem prevents an event from being copied, Event Log Replication Manager does not obtain notification of the problem and does not copy the event again.

Diagram and Description of Backup and Restore Capabilities in Failover Manager

What is .TUF file in Log Shipping


What is a .TUF file in Log Shipping? A TUF file is a Microsoft SQL Server transaction undo file. A .TUF file contains information regarding any modifications that were made as part of incomplete transactions at the time the backup was performed. A transaction undo (.TUF) file is required if a database is loaded in a read-only state; in this state, further transaction log backups may still be applied.

TUF File in Log Shipping
The transaction undo file contains modifications that were not committed on the source database but were in progress when the transaction log was backed up, in the case where, when the log was restored to another database, you left the database in a state that allows additional transaction log backups to be restored to it at some point in the future. When another transaction log is restored, SQL Server uses data from the undo file and the transaction log to continue restoring the incomplete transactions (assuming that they are completed in the next transaction log file). Following the restore, the undo file is rewritten with any transactions that, at that point, are incomplete. Hope it's not too geeky.

Question: In my environment there is an issue with the log shipping destination file path. I had to change the file path on the destination; after I changed it, the LS copy job is working fine, but the LS restore job is failing because it is trying to find the .tuf file on the old path, which does not exist on the destination.

I don't want to do a full restore for 30+ databases, so I'm trying to update the .tuf path in msdb on the destination server, but I couldn't find the path details in any of the log shipping system tables. I know the last restored file path details can be found in the dbo.log_shipping_monitor_secondary and dbo.log_shipping_secondary_databases tables, but updating these tables is not helping to resolve my issue.

Where are the .tuf file path details in msdb?
Answer: The tuf file path is none other than the backup_destination_directory column in log_shipping_secondary on the primary server, and it is updated automatically when you change the folder name in the LS setup page. But the TUF file must still be available in the old directory when the next restore happens.

SELECT backup_destination_directory FROM dbo.log_shipping_secondary

If you change the path of this directory, what SQL Server does is: when the next restore happens, it first tries to copy the TUF file from the old directory to the new directory, and only then goes ahead with the restore operation. If SQL Server cannot find the .tuf file in the old directory, or the old directory itself is lost, there is no other way than reconfiguring your LS setup from scratch.

What is the undo file? Why is it required?
The undo file is needed in the standby state because, while restoring the log backup, uncommitted transactions are recorded to the undo file and only committed transactions are written to disk, thereby allowing users to read the database. When you restore the next t-log backup, SQL Server fetches the uncommitted transactions from the undo file and checks against the new t-log backup whether each one has committed or not. If it has committed, the transaction is written to disk; otherwise it stays in the undo file until it is committed or rolled back.
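The same undo file is what you name yourself when doing the equivalent restore by hand; a minimal sketch with hypothetical names and paths, showing where the .tuf file enters the picture:

RESTORE LOG MyLogShippedDB
FROM DISK = N'\\PrimaryServer\LogShipShare\MyLogShippedDB_20130901120000.trn'  -- hypothetical path
WITH STANDBY = N'D:\SQL\MyLogShippedDB.tuf';  -- the undo file; database stays read-only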
http://pranahwam.blogspot.com/2013/09/what-is-tuf-file-in-log-shipping.html

http://troubleshootingsql.com/2009/12/30/troubleshooting-log-shipping-issues/

Troubleshooting Log Shipping Issues

Log Shipping is a feature in SQL Server by which you can ship transaction log backups to a different server and restore the backups onto a standby database for disaster recovery or reporting purposes. One of the major fundamental differences between SQL Server 2000 and SQL Server 2005 log shipping is that SQL Server 2005 uses linked servers to communicate between the primary and monitor and between the secondary and monitor. The log shipping jobs are executed via linked server queries to update information about the backup, copy and restore jobs in case you have a remote monitor server. So, if you find that your log shipping reports are defunct, two additional things that you can do apart from the basic log shipping troubleshooting steps are:

1. Check if remote connections are enabled on the primary, secondary and monitor. For SQL Server 2000, check connectivity between the instances.
2. Check if a linked server using the same authentication settings as your log shipping setup can be set up between the primary and monitor or secondary and monitor, depending on which part of your log shipping setup is broken. This is again applicable for SQL Server 2005.

Basic Log Shipping Troubleshooting Steps
1. Look into the SQL Server ERRORLOGs and Application/System Event Logs for any errors related to log shipping. Check the job history of the log shipping Backup/Copy/Restore jobs for any errors.
2. Check the Backup, Copy and Restore job history details for any errors.
3. If you are using SQL Server 2005, check what details are being displayed in the Log Shipping Report under Standard Reports in Management Studio.
4. If you want to take your troubleshooting to the next level, you can even look into the log shipping metadata by querying the log shipping tables on the primary/secondary/monitor (if configured).

Addendum: April 26th, 2011
The log shipping configuration information can be found using the following methods:
1. Standard Reports - Transaction Log Shipping Status (right-click the server name in Management Studio -> Reports -> Standard Reports -> Transaction Log Shipping Status)
2. Use the stored procedure which is called by the above report:

EXEC master..sp_help_log_shipping_monitor

Query to check log shipping job errors using the MSDB log shipping system tables


--List of Log Shipping jobs
SELECT * FROM msdb.dbo.sysjobs WHERE category_id = 6

SELECT * FROM [msdb].[dbo].[sysjobhistory]
WHERE [message] LIKE '%Operating system error%'
ORDER BY [run_date], [run_time]

SELECT * FROM [msdb].[dbo].[log_shipping_monitor_error_detail] where [message] like '%Operating system error%'

SELECT * FROM [msdb].[dbo].[restorehistory]

Known issues with Log Shipping

1. You might find that the last backed up/copied/restored files do not reflect correctly in the log shipping reports when you use a remote monitor server. In such a scenario, check if the issue documented in the blog post below applies in your case:
http://blogs.msdn.com/b/sqlserverfaq/archive/2009/03/27/transaction-log-shipping-status-report-for-monitor-server-will-not-pull-up-information-if-alias-is-used-for-monitor-server.aspx
The last copied and restored file will show up as null if the monitor instance is not on the same box as the secondary instance. The last backed up file will show up as null if the monitor instance is not on the same box as the primary instance and the SELECT @@SERVERNAME value was not used as the monitor server name while configuring the log shipping monitor.

2. If REMOTE ACCESS (sp_configure will show whether it is enabled or not) is not enabled, or the log shipping linked server (to the monitor server) is not working for the primary and secondary servers, then the last backup file/last copied file/last restored file information will not get populated when a remote monitor server instance is being used. The easiest way to identify this issue is to capture a Profiler trace (on the primary instance when the backup job is running, and on the secondary instance when the copy/restore job is running). The Profiler trace will report errors if an update operation pertaining to the log shipping monitor tables fails, provided all Errors and Warnings Profiler events are captured.

3. Another issue that you could run into while using log shipping is orphaned users, if you have database users on the primary database mapped to SQL authenticated logins. This happens because the SIDs of the SQL authenticated users on the primary and secondary instance would be different. I documented the workaround to this issue in the following blog post:
http://blogs.msdn.com/b/sqlserverfaq/archive/2009/04/13/orphaned-users-with-database-mirroring-and-log-shipping.aspx

4. When you are scripting out an existing log shipping configuration, ensure that you have Cumulative Update Package 9 for SQL Server 2005 Service Pack 2 applied for Management Studio. If that is already done, then use one of the options mentioned in the More Information section of the KB article below:
955693 FIX: In SQL Server 2005, the file information about the transaction log that was last copied and the file information about the transaction log that was last restored are missing
http://support.microsoft.com/default.aspx?scid=kb;EN-US;955693

5. If you have configured log shipping with STANDBY mode on SQL Server 2008 and the destination folder for the t-logs uses a remote server (on which the SQL Server/SQL Agent service account is not a local admin), then the restore job will fail every time with the following error:
2008-12-12 14:44:58.53 *** Error: During startup of warm standby database testdb (database ID 7), its standby file (<UNC path of the TUF file>) was inaccessible to the RESTORE statement. The operating system error was 5(Access is denied.).
TUF = Transaction Undo File, which is required for applying the next t-log backup. This issue is fixed in the cumulative update mentioned in the KB article below:
FIX: Error message when you use log shipping in SQL Server 2008: During startup of warm standby database <Database Name> (database ID <N>), its standby file (<File Name>) was inaccessible to the RESTORE statement
http://support.microsoft.com/kb/962008

6. Log shipping restore will fail if there is a snapshot or an active DBCC replica on the secondary database on which the restore is being done.
http://troubleshootingsql.com/2012/09/12/why-did-the-restore-fail-on-the-log-shipped-secondary-database/

Addition: September 12, 2012
Special cases
In case you need to speed up the transaction log restore for your log shipping secondary database in standby mode, follow the steps mentioned in this post. In case you need to move your secondary log shipped database files to a new physical location, you can use the steps mentioned in this post.

http://blogs.msdn.com/b/sqlserverfaq/archive/2011/01/07/case-study-troubleshooting-a-slow-log-shipping-restore-job.aspx

Case Study: Troubleshooting a Slow Log Shipping Restore job


Balmukund 7 Jan 2011 5:10 AM

Scenario
Consider a scenario where you have a log shipping setup with a STANDBY secondary database and things are working just fine. One fine day you notice that the secondary database is not in sync with the primary. The seasoned DBA that you are, you go ahead and look at the log shipping jobs and identify that the restore is taking a lot of time. The obvious question that comes to your mind is whether a lot of transactions have happened recently, causing the log backups to be much larger. So you check the folder and see that the .TRN file sizes remain pretty much the same. What next? I will cover some basic troubleshooting that you can do to identify why the restore process is so slow. To give you some perspective, let's say that earlier a restore of a 4 MB transaction log backup used to take less than a minute; now it takes approximately 20-25 minutes. Before I get into troubleshooting, make sure that you have ruled out these factors:

1. The log backup size (.TRN) is pretty much the same as it was before.
2. The disk is not a bottleneck on the secondary server.
3. The copy job is working just fine and there is no delay there. From the job history you clearly see that the restore is where the time is being spent.
4. The restore job is not failing and no errors are reported during this time (e.g. Out of Memory etc.).

Troubleshooting
The first thing to do to get more information on what the restore is doing is to enable these trace flags:

DBCC TRACEON (3004, 3605, -1)

3004 - Gives extended information on backup and restore
3605 - Prints trace information to the error log

You can read more about these trace flags here:
http://blogs.msdn.com/b/psssql/archive/2008/01/23/how-it-works-what-is-restore-backup-doing.aspx

Here is a sample output after I enabled these trace flags. Focus on the specific database which is the secondary database in your log shipping setup.

<Snippet from SQL Errorlog>
2010-12-29 16:11:19.10 spid64 RestoreLog: Database TESTDB
2010-12-29 16:11:19.10 spid64 X-locking database: TESTDB
2010-12-29 16:11:19.10 spid64 Opening backup set
2010-12-29 16:11:19.10 spid64 Restore: Configuration section loaded
2010-12-29 16:11:19.10 spid64 Restore: Backup set is open
2010-12-29 16:11:19.10 spid64 Restore: Planning begins
2010-12-29 16:11:19.12 spid64 Dismounting FullText catalogs
2010-12-29 16:11:19.12 spid64 Restore: Planning complete
2010-12-29 16:11:19.12 spid64 Restore: BeginRestore (offline) on TESTDB
2010-12-29 16:11:19.12 spid64 Restore: Undoing STANDBY for TESTDB
2010-12-29 16:11:23.46 spid64 SnipEndOfLog from LSN: (296258:29680:1)
2010-12-29 16:11:23.46 spid64 Zeroing D:\SQL\SQLLog\TESTDB.ldf from page 2492695 to 2492738 (0x4c122e000 to 0x4c1284000)
2010-12-29 16:11:23.46 spid64 Zeroing completed on D:\SQL\SQLLog\TESTDB.ldf
2010-12-29 16:11:23.46 spid64 Restore: Finished undoing STANDBY for TESTDB
2010-12-29 16:11:23.51 spid64 Restore: PreparingContainers
2010-12-29 16:11:23.51 spid64 Restore: Containers are ready
2010-12-29 16:11:23.51 spid64 Restore: Restoring backup set
2010-12-29 16:11:23.51 spid64 Restore: Transferring data to TESTDB
2010-12-29 16:11:23.51 spid64 Restore: Waiting for log zero on TESTDB
2010-12-29 16:11:23.51 spid64 Restore: LogZero complete
2010-12-29 16:11:24.24 spid64 FileHandleCache: 0 files opened. CacheSize: 10
2010-12-29 16:11:24.24 spid64 Restore: Data transfer complete on TESTDB
2010-12-29 16:11:24.24 spid64 Restore: Backup set restored
2010-12-29 16:11:24.24 spid64 Restore-Redo begins on database TESTDB
2010-12-29 16:11:25.69 spid64 Rollforward complete on database TESTDB
2010-12-29 16:11:25.69 spid64 Restore: Done with fixups
2010-12-29 16:11:25.74 spid64 Transitioning to STANDBY
2010-12-29 16:11:26.74 spid64 Starting up database 'TESTDB'.
2010-12-29 16:11:26.76 spid64 The database 'TESTDB' is marked RESTORING and is in a state that does not allow recovery to be run.
2010-12-29 16:11:27.63 spid64 FixupLogTail() zeroing S:\SQLServer\SQLLog\TESTDB.ldf from 0x4c1769400 to 0x4c176a000.
2010-12-29 16:11:27.63 spid64 Zeroing D:\SQL\SQLLog\TESTDB.ldf from page 2493365 to 2493425 (0x4c176a000 to 0x4c17e2000)
2010-12-29 16:11:27.65 spid64 Zeroing completed on D:\SQL\SQLLog\TESTDB.ldf
2010-12-29 16:24:30.55 spid64 Recovery is writing a checkpoint in database 'TESTDB' (5). This is an informational message only. No user action is required.
2010-12-29 16:24:35.43 spid64 Starting up database 'TESTDB'.
2010-12-29 16:24:39.10 spid64 CHECKDB for database 'TESTDB' finished without errors on 2010-12-21 23:31:25.493 (local time). This is an informational message only; no user action is required.
2010-12-29 16:24:39.10 spid64 Database is in STANDBY
2010-12-29 16:24:39.10 spid64 Restore: Writing history records
2010-12-29 16:24:39.10 Backup Log was restored. Database: TESTDB, creation date(time): 2008/01/26(09:32:02), first LSN: 296258:29680:1, last LSN: 298258:40394:1, number of dump devices: 1, device information: (FILE=1, TYPE=DISK: {'S:\SQL\SQLLogShip\TESTDB\TESTDB_20101229011500.trn'}). This is an informational message. No user action is required.
2010-12-29 16:24:39.12 spid64 Writing backup history records
2010-12-29 16:24:39.21 spid64 Restore: Done with MSDB maintenance
2010-12-29 16:24:39.21 spid64 RestoreLog: Finished

</Snippet>

From the above output we see that the restore took ~13 minutes. If you look closely at the output, the section highlighted in green (where the timestamps jump from 16:11:27 to 16:24:30) is where most of the time is spent. Now, when we talk about log restores, the number of VLFs plays a very important role. More about the effect of VLFs vs. restore time is given here:
http://blogs.msdn.com/b/psssql/archive/2009/05/21/how-a-log-file-structure-can-affect-database-recovery-time.aspx
The bottom line is that a large number of virtual log files (VLFs) can slow down transaction log restores. To find out if this is the case here, use the following command:
DBCC LOGINFO (TESTDB) WITH NO_INFOMSGS

The following information can be deciphered from the above output:

1. The number of rows returned in the above output is the number of VLFs.

2. The number of VLFs that had to be restored in this log backup can be calculated from the section highlighted in blue (above).
first LSN: 296258:29680:1, last LSN: 298258:40394:1

298258 - 296258 = 2000 VLFs

3. The Size of each VLF can be calculated based on the FileSize column.
FileSize (bytes)    Size
9175040             8.75 MB
9437184             9 MB
10092544            9.62 MB
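To count and size the VLFs in one pass, the DBCC LOGINFO output can be captured into a temp table; a minimal sketch for SQL Server 2005/2008 (the command returns an extra RecoveryUnitId column from SQL Server 2012 onward, which would need an extra column here):

CREATE TABLE #loginfo (
    FileId      int,
    FileSize    bigint,
    StartOffset bigint,
    FSeqNo      int,
    Status      int,
    Parity      tinyint,
    CreateLSN   numeric(25, 0)
);

INSERT INTO #loginfo
EXEC ('DBCC LOGINFO (TESTDB) WITH NO_INFOMSGS');

SELECT COUNT(*) AS vlf_count,                 -- each row is one VLF
       MIN(FileSize) / 1048576.0 AS min_vlf_mb,
       MAX(FileSize) / 1048576.0 AS max_vlf_mb
FROM #loginfo;

DROP TABLE #loginfo;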

Problem(s)
So based on the above there are two possibilities:

1. The number of VLFs is rather large, which we know will impact restore performance.
2. The size of each VLF is large, which is a cause for concern if STANDBY mode is in effect.

The second problem is aggravated if there are batch jobs or long-running transactions that span multiple backups (e.g. index rebuilds). In this case, the work of repeatedly rolling back the long-running transaction, writing the rollback work to the standby file (TUF file), and then undoing all the rollback work with the next log restore, just to start the process over again, can easily cause a log shipping secondary to fall behind.
While we are talking about the TUF file, I know many people out there are not clear on what this is used for. So here goes,

What is the Transaction Undo File (TUF)? This file contains information on any modifications that were made as part of incomplete transactions at the time the backup was performed (it is used to save the contents of the pages touched by those uncommitted transactions). A transaction undo file is required if a database is loaded in a read-only state (the STANDBY mode option in LS). In this state, further transaction log backups may be applied.

In standby mode (which we have for the secondary database), database recovery is done when the log is restored, and this mode also creates a file with the extension .TUF (the transaction undo file) on the destination server. That is why in this mode we are able to access the database (read-only access). Before the next t-log backup is applied, the saved changes in the undo file are reapplied to the database. Since this is STANDBY mode, for any large transactions the restore process also does the work of writing the rollback to the standby file (TUF), so we might be spending time initializing the whole virtual log.

Solution 1
You need to reduce the number of VLFs. You can do this by running DBCC SHRINKFILE to shrink the LDF file(s) to a small size, thereby reducing the number of VLFs. Note: you need to do this on the primary database. After the shrink is complete, verify that the VLF count has dropped by running DBCC LOGINFO again; a good range would be somewhere between 500 and 1000. Then resize the log file to the desired size using a single growth operation. You can do this by tweaking the Initial Size setting; also pay attention to the Autogrowth setting for the LDF file, since setting it to too small a value can lead to too many VLFs.
ALTER DATABASE DBNAME MODIFY FILE (NAME='ldf_logical_name', SIZE=<target>MB)
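The shrink step that precedes this resize might look like the sketch below, using the same hypothetical names (run against the primary database):

USE DBNAME;
-- Collapse the log to its minimum size first; this removes the excess VLFs.
DBCC SHRINKFILE ('ldf_logical_name', 1);   -- target size in MB
-- Re-check the VLF count before regrowing the file in one operation
-- with the ALTER DATABASE command above.
DBCC LOGINFO;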

Also remember that you still have to apply the pending log backups first, until you get to the one which contains the shrink operation. Once you reach it, you can measure the restore time to see whether the changes above had a positive impact.

Solution 2
For problem 2, where the size of the VLFs is causing havoc with STANDBY mode, you will have to truncate the transaction log. This means that log shipping has to be broken. You can truncate the t-log on the source database by setting the recovery model to SIMPLE (using the ALTER DATABASE command); on SQL 2005 or lower versions you can instead use the BACKUP LOG DBNAME WITH TRUNCATE_ONLY command (see the sketch below). Then modify the log file auto-grow setting to an appropriate value. Pay attention to the value you set here, such that it is not too high (else transactions will have to wait while the file is being grown) or too low (which creates too many VLFs). Take a full database backup immediately and use it to re-set up log shipping.
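The truncation sequence might look like this; a sketch with the same hypothetical names (remember this breaks the log backup chain, so the full backup and log shipping re-setup that follow are mandatory):

-- Switching to SIMPLE truncates the log at the next checkpoint; switch back afterwards.
ALTER DATABASE DBNAME SET RECOVERY SIMPLE;
ALTER DATABASE DBNAME SET RECOVERY FULL;

-- Alternative on SQL Server 2005 and earlier only (the option was removed in SQL Server 2008):
-- BACKUP LOG DBNAME WITH TRUNCATE_ONLY;

-- New base for re-configuring log shipping:
BACKUP DATABASE DBNAME TO DISK = N'D:\Backup\DBNAME_full.bak';  -- hypothetical path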

Tip: you can use the DBCC SQLPERF(LOGSPACE) command to find out what percentage of your log file is in use. I hope this information was helpful in demystifying log shipping restore troubleshooting. As always, stay tuned for more on this blog.

http://mysqlserverdba.blogspot.com/2011/05/frequently-raised-errors-in-log.html

http://sqlerrors.wordpress.com/2013/12/09/log-shipping-false-errors-messages-14420/

Log shipping false error messages 14420


Problem: We recently migrated our SQL Server 2008 R2 instance and SAN storage to new infrastructure. I have log shipping set up on a few databases, and to retain all the jobs I restored the msdb database on the new server. Post-migration I synced the primary and secondary servers and enabled all log shipping related jobs, and the entire process was successful. A few hours later, however, the primary server started generating alerts with error 14420: The log shipping primary database [ServerName].[DatabaseName] has backup threshold of 60 minutes and has not performed a backup log operation for 120 minutes.

Investigation: I started my investigation with sp_readerrorlog, and I found a lot of error messages as below.
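As an aside, sp_readerrorlog accepts optional filter arguments; since the procedure is undocumented, treat this usage as a convenience sketch rather than a guaranteed interface:

-- Search the current SQL Server error log (log 0, log type 1) for 14420 entries
EXEC sp_readerrorlog 0, 1, '14420';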

To confirm my log shipping status I checked the log backup, log copy, and log restore jobs, and all were running successfully; my last log backup was 15 minutes back and it had been successfully restored on the secondary server, as shown in the job history on the primary server and on the secondary server.

Now I was pretty sure there was no problem with log shipping itself, but why was the server generating alerts? I investigated further using the following query on the primary server:

Select * from msdb.dbo.log_shipping_monitor_primary

The last_backup_date column was not updating with the latest dates at all; it was still showing the date and time of the last backup taken just before we migrated to the new server. This helped my further investigation a lot, and somewhere I read a suggestion to simply check my server name:

Select ServerProperty('ServerName'), @@ServerName

The above query returned two different names: ServerProperty('ServerName') returned the actual server name, whereas @@ServerName returned the name we had assigned to the server during SQL Server installation (Setup sets the server name to the computer name during installation). Solution: when you run the following queries, all of them should return the same server name.
Select ServerProperty('ServerName')
Select @@ServerName
Select primary_server from msdb.dbo.log_shipping_monitor_primary

This happens if you install SQL Server and later change the server name. To solve my problem I had to drop the old server name and add the new one, as shown in the following queries:

EXEC sp_dropserver 'OLD_SERVER_NAME'
EXEC sp_addserver 'NEW_SERVER_NAME', 'local'

You need to restart the SQL Server service for the changes to take effect. No more error messages now.

http://calyansql.blogspot.com/2010/06/error-14420-14421-severity-16-state-1.html

Error: 14420, 14421, Severity: 16, State: 1 LogShipping


The log shipping primary database %s.%s has backup threshold of %d minutes and has not performed a backup log operation for %d minutes. Check agent log and log shipping monitor information.

Workaround

1. Message 14420 does not necessarily indicate a problem with log shipping. It mostly occurs when a monitor server is configured, and is generally raised when the difference between the time of the last t-log backup and the current time on the monitor server is greater than the value set for the backup threshold.
2. Ensure the transaction log backup happened on the primary server. If the t-log backup fails, the above error will also occur.
3. You may have set an incorrect value for the backup alert.
4. The date and time of the monitor server may be different from the date and time of the primary server.
5. The log shipping copy job runs on the primary server and may not update the entry in the log_shipping_primaries table in the msdb database on the monitor server.

Error 14421: The log shipping secondary database %s.%s has restore threshold of %d minutes and is out of sync. No restore was performed for %d minutes. Restored latency is %d minutes. Check agent log and log shipping monitor information.

This message does not necessarily indicate a problem with log shipping either.

1. It may occur because the restore job on the secondary server is failing.
2. The out-of-sync alert threshold may be set to a wrong value.
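When one of these alerts fires, it helps to ask the monitor directly how stale the last backup really is. A minimal sketch against the standard monitor table (run on the monitor or primary server):

SELECT primary_server,
       primary_database,
       backup_threshold,        -- minutes
       last_backup_date,
       DATEDIFF(MINUTE, last_backup_date, GETDATE()) AS minutes_since_last_backup
FROM msdb.dbo.log_shipping_monitor_primary;

If minutes_since_last_backup is small but alerts still fire, the monitor row itself is stale, which is exactly the server-name situation described in the previous post.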

http://ms-dba.blogspot.com/2010/06/copy-and-restore-job-errors-with-log.html

Copy and Restore Job Errors with Log Shipping

Hopefully this can help some other people out; I couldn't find much on this error when I got it, and after a bit of digging I was able to sort out the problem.

I set up log shipping for a database in SQL 2008, a very simple setup, just following the SSMS GUI. The backup job would run fine, but the copy and restore jobs on the secondary server kept failing. This was the output of the copy job:

Microsoft (R) SQL Server Log Shipping Agent
[Assembly Version = 10.0.0.0, File Version = 10.0.1600.22 ((SQL_PreRelease).0807091414 )]
Microsoft Corporation. All rights reserved.
2010-06-29 09:41:52.53 ----- START OF TRANSACTION LOG COPY -----
2010-06-29 09:41:52.80 *** Error: Could not retrieve copy settings '[removed]'.(Microsoft.SqlServer.Management.LogShipping) ***
2010-06-29 09:41:52.81 *** Error: The specified agent_id [removed] or agent_type 1 do not form a valid pair for log shipping monitoring processing.(.Net SqlClient Data Provider) ***
2010-06-29 09:41:52.82 *** Error: Could not log history/error message.(Microsoft.SqlServer.Management.LogShipping) ***
2010-06-29 09:41:52.82 *** Error: The specified agent_id [removed] or agent_type 1 do not form a valid pair for log shipping monitoring processing.(.Net SqlClient Data Provider) ***
2010-06-29 09:41:52.83 *** Error: Could not cleanup history.(Microsoft.SqlServer.Management.LogShipping) ***
2010-06-29 09:41:52.84 *** Error: The specified agent_id [removed] or agent_type 1 do not form a valid pair for log shipping monitoring processing.(.Net SqlClient Data Provider) ***
2010-06-29 09:41:52.84 ----- END OF TRANSACTION LOG COPY -----
Exit Status: 1 (Error)

As you can see, the same error message repeats. This is Error 32016: The specified agent_id %s or agent_type %d do not form a valid pair for log shipping monitoring processing.

I ended up running a trace on the primary server whilst running the job, and was able to see that the error was being thrown by this code:

if (sys.fn_MSvalidatelogshipagentid(@agent_id, @agent_type) = 0)
begin
    select @agent_idstring = cast(@agent_id as sysname)
    raiserror(32016, 16, 1, @agent_idstring, @agent_type)
    return 1
end

The function sys.fn_MSvalidatelogshipagentid returns either 1 or 0 using the following code:

return case
    when ((@agent_type = 0) and
          exists (select * from msdb.dbo.log_shipping_monitor_primary
                  where primary_id = @agent_id)) then 1
    when ((@agent_type in (1,2)) and
          exists (select * from msdb.dbo.log_shipping_monitor_secondary
                  where secondary_id = @agent_id)) then 1
    else 0
end

I knew the agent type was either 1 (copy) or 2 (restore), so I looked at the table msdb.dbo.log_shipping_monitor_secondary on the primary server, which was empty, hence the function returning 0. After a bit of banging my head on the desk as to why this function was running on the primary server at all (that table is only meant to be populated on the secondary server), I had a look at the job to see how it was calling the sqllogship.exe program:

"D:\Program Files\Microsoft SQL Server\100\Tools\Binn\sqllogship.exe" -Copy [ID_removed] -server SRV01

where SRV01 is the name of the primary server. So I changed the -server parameter over to the secondary server name (SRV02), re-ran the job, and it worked!

Reading the BOL entry for sqllogship.exe, it states for the -server parameter: "For -copy or -restore, instance_name must be the name of a secondary server in a log shipping configuration." Which explains why the jobs now ran OK: they were finally using the correct parameter value.

To summarize: if you are getting error 32016 (The specified agent_id %s or agent_type %d do not form a valid pair for log shipping monitoring processing), check the command of the copy and restore jobs and make sure the -server parameter is set to the secondary server name.

The weird thing is that log shipping was set up via SSMS; I just filled in the details and the jobs were created automatically, so SQL Server itself had put the SRV01 parameter there of its own accord. I can't see how something I did made that parameter be the wrong value. Maybe I did do something, but I did recreate the log shipping several times and double-checked what information I had put in. I'd be interested to hear if anyone else has had this issue where the primary server name ends up in the copy and restore jobs instead of the secondary server name, and how you set up your log shipping.
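If you hit error 32016 yourself, a quick sanity check (a sketch; run on each server) is to confirm which monitor table actually contains the agent_id shown in the failing job's command line:

-- On the primary/monitor: backup agents (agent_type 0) must appear here
SELECT primary_id, primary_server, primary_database
FROM msdb.dbo.log_shipping_monitor_primary;

-- On the secondary: copy/restore agents (agent_type 1 or 2) must appear here
SELECT secondary_id, secondary_server, secondary_database
FROM msdb.dbo.log_shipping_monitor_secondary;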

http://thakurvinay.wordpress.com/2011/11/27/day-27-log-shipping-errors/

Day 27: Log Shipping Errors


Log shipping is another tool for high availability. It generally involves a primary and a secondary server: a full backup of the primary database is restored on the secondary, and then t-log backups are copied to the secondary server and restored there.

Step-by-step and detailed information about log shipping, with related links, is blogged here.

Like replication or any other high availability tool, log shipping only works one way, from a lower version to a higher version: your primary server should be the LOWER version and your SECONDARY the higher one. You cannot take a backup from SQL Server 2005 and restore it to SQL Server 2000 (it has to be the other way round).

Generally everything goes well and there are no issues, as log shipping is very simple and easy to maintain.

Some errors we get in log shipping are as follows:

1. Description of error message 14420 and error message 14421 that occur when you use log shipping in SQL Server

For SQL Server 2000:

Error message 14420

Error: 14420, Severity: 16, State: 1
The log shipping primary database %s.%s has backup threshold of %d minutes and has not performed a backup log operation for %d minutes. Check agent log and log shipping monitor information.

Error message 14421

Error: 14421, Severity: 16, State: 1
The log shipping secondary database %s.%s has restore threshold of %d minutes and is out of sync. No restore was performed for %d minutes. Restored latency is %d minutes. Check agent log and log shipping monitor information.

The link below gives detailed information about these errors:

http://support.microsoft.com/kb/329133

2. When the backup and restore sequence between the primary and secondary servers is mismatched because of a missing t-log backup, you may get this error:
Msg 4305, Level 16, State 1, Line 3

The log in this backup set begins at LSN 22000000018800001, which is too recent to apply to the database. An earlier log backup that includes LSN 21000000002500001 can be restored.

Msg 3013, Level 16, State 1, Line 3

RESTORE LOG is terminating abnormally.

Generally, for such an issue you have to apply the t-log backup restores sequentially. If by any chance a t-log backup is missing (lost), you may have to start again with a FULL backup and then restore the t-log backups thereafter, as sketched below.
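A sketch of that re-seeding sequence, with hypothetical database and file names:

-- Re-seed from a fresh full backup, then apply log backups strictly in LSN order
RESTORE DATABASE LSDemo
FROM DISK = '\\backupshare\LSDemo_full.bak'
WITH NORECOVERY, REPLACE;
RESTORE LOG LSDemo FROM DISK = '\\backupshare\LSDemo_01.trn' WITH NORECOVERY;
RESTORE LOG LSDemo FROM DISK = '\\backupshare\LSDemo_02.trn' WITH NORECOVERY;
-- leave the database in NORECOVERY (or STANDBY) so log shipping can continue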

3. Sometimes, due to a permission issue, you may get an access denied error. The service account on the production server and the stand-by server must be the same account.

http://saveadba.blogspot.com/2011/12/sql-server-log-shipping-could-not-find.html

Log shipping 'Could not find a log backup file that could be applied to secondary database'
Today the log shipping that had been set up for a database started to fail. I found the following errors in the job history for the LSRestore job:

*** Error: The file 'E:\Program Files\Microsoft SQL Server\MSSQL.1\MSSQL\Backup\DBname_201112081234.trn' is too recent to apply to the secondary database 'DBname'.(Microsoft.SqlServer.Management.LogShipping) ***
*** Error: The log in this backup set begins at LSN 842977000024386700001, which is too recent to apply to the database. An earlier log backup that includes LSN 841347000010148300001 can be restored.
Searching for an older log backup file. Secondary Database: 'DBname'
*** Error: Could not find a log backup file that could be applied to secondary database 'Dbname'.(Microsoft.SqlServer.Management.LogShipping) ***

This means that the current log backup file being restored is too recent; a backup from a prior point in time needs to be applied first. So let's check which log file was last copied. The following query will give you the last log backup file that was copied:

SELECT * FROM [msdb].[dbo].[log_shipping_secondary]

Then check the last log backup file that was restored:

SELECT * FROM [msdb].[dbo].[log_shipping_secondary_databases]

Then check the transaction log backup file that was copied immediately after the last restored file. Sometimes one file is so big that the subsequent backup files get copied more quickly (sort by name and not by time). The file that is still being copied will show up in the correct order when you sort by name, but if you sort by date modified it might show up at the end of the list. On the next restore attempt, the restore will try the files that have already been copied successfully and skip the file that is still being copied, so wait for that big file to finish copying. It might also be the case that one of the log backup files never got copied because it was deleted on the primary server, or that it was deleted from the secondary server before it was restored. The second case is unlikely, so check the retention time on the backup job that runs on the primary server.
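The two checks can be combined into a single query on the secondary; a sketch that relies on the documented last_copied_file and last_restored_file columns of these msdb tables:

SELECT s.last_copied_file,
       s.last_copied_date,
       sd.last_restored_file,
       sd.last_restored_date
FROM msdb.dbo.log_shipping_secondary AS s
JOIN msdb.dbo.log_shipping_secondary_databases AS sd
    ON s.secondary_id = sd.secondary_id;

A gap between last_copied_file and last_restored_file pinpoints the backup the restore job is waiting for.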

If you see a file on the primary that has not yet been copied, copy it over manually and log shipping should start working again.

http://trilist.blogspot.com/2010/03/fix-log-shipping-failure-due-to-broken.html

FIX: Log shipping failed due to a broken/damaged transaction log backup.
Log shipping failed due to an incomplete/damaged transaction log backup. This issue also occurred when there was a lack of free space on the drive where the backups are placed (no logical explanation). For some reason the transaction log backup was saved with errors (damaged) and then transferred to the destination server; the result was that log shipping failed. I've prepared a 6-step guide on how to repair log shipping:

1. Create a full backup of the source database (mine was placed at '\\sharedstorage\XYZ\XYZ_backup.bak').
2. Disable the restore and copy jobs on the destination server.
3. Restore the database with:

restore database XYZ from disk ='\\sharedstorage\XYZ\XYZ_backup.bak' with replace, norecovery;

4. If the destination database should be in standby mode, go to step 5; else go to step 6.
5. Restore the next transaction log backup and set the database to standby (transaction log backups keep running on the source):

restore log XYZ from disk ='\\sharedstorage\XYZ\XYZ_.trn' with standby= 'F:\Microsoft SQL Server\MSSQL.1\MSSQL\Backup\UNDO_standby_file.BAK'

*Note: if the full backup takes too long (or is old) and we try to restore a transaction log backup that is not the correct one, an error will be generated. If the log backup is older than expected, the error will be that transaction log xxxx is too early; if it is newer, the error will be that the transaction log is too recent. A quick way to check where a backup fits is shown after the steps below.

6. Enable the restore and copy jobs on the destination server.
7. Done.
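Before re-enabling the jobs, you can check where a particular log backup fits in the chain by reading its header; the file path here is hypothetical:

-- FirstLSN and LastLSN in the output show whether this backup is too early
-- or too recent for the current state of the destination database
RESTORE HEADERONLY FROM DISK = '\\sharedstorage\XYZ\XYZ_log.trn';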

Additional links:
http://msdn.microsoft.com/en-us/library/ms175106.aspx
http://msdn.microsoft.com/en-us/library/ms188625.aspx
