Achieving High Availability with SQL Server using EMC SRDF

Prem Mehra, SQL Server Development, Microsoft
Art Ullman, CSC

Topics Covered

Share experiences gained from deploying SQL Server and a SAN for a highly available data warehouse. Emphasis on:

The intersection of SAN and SQL Server technologies, not large database implementation or data warehouse best practices

Project Overview
Best Practices in a SAN environment
Remote Site Fail-over using EMC SRDF and SQL Server Log Shipping

USDA GDW Project Overview


Project: Build a geo-spatial data warehouse for two sites with remote fail-over
Client: USDA
Storage: EMC SAN (46 terabytes)
Database: SQL Server 2000
Implementation: USDA / CSC
Consultants: EMC / CSC / Microsoft / ESRI
Geo-spatial Software: ESRI Data Management Software

Application Requirements

A large (46 TB total storage) geo-spatial data warehouse for 2 USDA sites: Salt Lake City & Fort Worth
Provide database fail-over and fail-back between the remote sites

Run data replication across the DS3 network between sites (45 Mbit/s)

Support read-only access at the fail-over sites on an ongoing basis

SAN Implementation

Understand your throughput, response time, and availability requirements, and potential bottlenecks and issues
Work with your storage vendor:

Get best practices
Get design advice on LUN size, sector alignment, etc.
Understand the available back-end monitoring tools

Do not try to over-optimize; keep the LUN, filegroup, and file design simple, if possible

SAN Implementation

Balance I/O across all HBAs when possible using load-balancing software (e.g., EMC's PowerPath)
Provides redundant data paths
Offers the most flexibility and is much easier to design than static mapping
Some vendors now offer implementations that use Microsoft's MPIO (multi-path I/O), which permits more flexibility in heterogeneous storage environments
Some configurations offer dynamic growth of existing LUNs for added flexibility (e.g., Veritas Volume Manager or SAN vendor utilities)

Managing growth

Working with SAN vendor engineers is highly recommended

SAN Implementation 3
Benchmarking the I/O System

Before implementing SQL Server, benchmark the SAN to shake out hardware and driver problems

Test a variety of I/O types and sizes: combinations of read/write & sequential/random
Include I/O sizes of at least 8K, 64K, 128K, and 256K
Ensure test files are significantly larger than the SAN cache (at least 2 to 4 times)
Test each I/O path individually & in combination to cover all paths
Ideally, throughput (MB/s) scales up linearly as paths are added
Save the benchmark data for comparison when SQL Server is being deployed

SAN Implementation 4
Benchmarking the I/O System

Share results with your vendor: is performance reasonable for the configuration?
SQLIO.exe is an internal Microsoft tool

Ongoing discussions about posting it as an unsupported tool at http://www.microsoft.com/sql/techinfo/administration/2000/scalability.asp

SAN Implementation
Database and TempDB Files

Among other factors, parallelism is also influenced by:


Number of CPUs on the host
Number of LUNs

For optimizing Create Database and Backup/Restore performance, consider:

More volumes than, or as many volumes as, the number of CPUs; these could be volumes created by dividing a dynamic disk, or separate LUNs
Internal file structures require synchronization, so consider the number of processors on the server
The number of data files should be >= the number of processors (see the sketch below)

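Below is a minimal sketch of the file-per-CPU idea for tempdb, assuming a 4-CPU host; the drive letters, file names, and sizes are hypothetical and stand in for volumes on separate LUNs:

    -- Hypothetical: bring tempdb up to one data file per processor on a 4-CPU host,
    -- with each added file on its own volume/LUN (E:, F:, G: are assumptions).
    ALTER DATABASE tempdb ADD FILE (NAME = tempdev2, FILENAME = 'E:\tempdb\tempdb2.ndf', SIZE = 4GB);
    ALTER DATABASE tempdb ADD FILE (NAME = tempdev3, FILENAME = 'F:\tempdb\tempdb3.ndf', SIZE = 4GB);
    ALTER DATABASE tempdb ADD FILE (NAME = tempdev4, FILENAME = 'G:\tempdb\tempdb4.ndf', SIZE = 4GB);

The same approach applies to user databases: spreading data files across as many volumes as there are CPUs lets Create Database, Backup, and Restore drive I/O against all paths in parallel.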


Remote Site Fail-over with SQL Server and EMC SRDF


USDA Geo-spatial Database

Data Requirements

23 terabytes of EMC SAN storage per site (46 TB total storage)
2 primary SQL Servers and 2 fail-over servers per site
15 TB of image data in SQL Server at the Salt Lake City site, with fail-over to Fort Worth
3 TB of vector data in SQL Server at the Fort Worth site, with fail-over to Salt Lake City
80 GB of daily updates that need to be processed and pushed to the fail-over site

Solution

Combination of SRDF and SQL Server Log Shipping
Initial synchronization using SRDF
Push updates using SQL Server log shipping
Use SRDF incremental update to fail back after a fail-over
Use SRDF to move log backups to the remote site

Hardware Infrastructure
Site Configuration (identical at each site)

Technical Overview EMC Devices

The EMC SAN is partitioned into Hyper-Volumes and Meta-Volumes (collections of Hyper-Volumes) through the BIN file configuration
All drives are either mirrored or RAID 7+1
Hypers and/or Metas are masked to hosts and are viewable as LUNs to the OS
EMC devices are identified by Sym ID
EMC devices are defined as R1, R2, Local, or BCV devices in the BIN file

Technical Overview Device Mapping


Windows Device Manager and SYMPD LIST Output

Technical Overview SRDF 1


SRDF provides track-to-track data mirroring between remote EMC SAN devices; BCVs are for local copies.

Track-to-track replication (independent of the host)


R1 device is the source
R2 device is the target

R2 is read/write disabled until the mirror is split

Technical Overview SRDF 2


Synchronous Mode
Semi-Synchronous: synchronous with some lag
Adaptive Copy Mode: asynchronous
Adaptive Copy A: asynchronous with a guaranteed write sequence, using buffered track copies

Note: only Adaptive Copy A requires additional storage space. All other SRDF replication modes simply keep a table of tracks that have changed.

Technical Overview SRDF 3


SRDF replicates by Sym Device (Hyper or Meta). SRDF Devices can be Grouped for synchronizing.

SQL Server databases are replicated by database, or by groupings of databases if TSIMSNAP2 is used.

(Diagram: on the primary host, the R1 devices for Database 1 form SRDF Group A and the R1 devices for Database 2 form SRDF Group B; each group mirrors to the corresponding R2 devices on the fail-over host.)

Process Overview

Initial synchronization using SRDF in Adaptive Copy mode (all database files)
Use TSIMSNAP(2) to split the SRDF group after synchronization is complete
Restore the fail-over databases using TSIMSNAP(2) after splitting the SRDF mirror
Use SQL Server log shipping to push all updates to the fail-over server (after the initial sync)
The fail-over database is up and running at all times, giving you confidence that the fail-over server is working

Planning

Install SQL Server and the system databases on the Primary and Fail-over servers (on local, non-replicated devices)
Create user databases on R1 devices (MDF, NDF, and LDF) on the Primary host (see the sketch below)
Don't share devices among databases if you need to keep databases independent for fail-over and fail-back (important)
Database volumes can be drive letters or mount points
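
As a concrete illustration of this layout, here is a minimal CREATE DATABASE sketch; the database name, drive letters, file names, and sizes are hypothetical, and the drives are assumed to be volumes carved from R1 devices:

    -- Hypothetical: data files (MDF/NDF) and log (LDF) on R1-backed volumes only;
    -- system databases stay on local, non-replicated devices.
    CREATE DATABASE ImageDW
    ON PRIMARY
        (NAME = ImageDW_Data1, FILENAME = 'I:\ImageDW\ImageDW_Data1.mdf', SIZE = 100GB),
        (NAME = ImageDW_Data2, FILENAME = 'J:\ImageDW\ImageDW_Data2.ndf', SIZE = 100GB)
    LOG ON
        (NAME = ImageDW_Log,   FILENAME = 'L:\ImageDW\ImageDW_Log.ldf',   SIZE = 20GB);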

Initial Step
Create Databases on R1 Devices
Load Data

Synchronize to Fail-over host


Create an SRDF group for the database on R1
Set the group to Adaptive Copy mode
Establish the SRDF mirror to R2

Synchronize to Fail-over host

Wait until the Adaptive Copy is synchronized
Use the TSIMSNAP command to split the SRDF group after device synchronization is complete; use TSIMSNAP2 for multiple databases
TSIMSNAP writes metadata about the databases to R1, which is used for recovering the databases on the R2 host


Attach Database on Fail-over Host


Verify SQL Server is installed and running on the Fail-over host
Mount the R2 volumes on the remote host
Run the TSIMSNAP RESTORE command on the Fail-over host; specify either standby (read-only) or norecovery mode

The database is now available for log shipping on the fail-over host. The SRDF mirror is now broken, but changed tracks are still tracked (for an incremental mirror and/or for fail-back).
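
A quick way to confirm the result from the SQL Server side on the fail-over host is to query the database state; the database name ImageDW is a hypothetical stand-in:

    -- Hypothetical check: a standby restore leaves the database readable
    -- (Status = ONLINE, IsInStandBy = 1); a norecovery restore leaves it
    -- in a loading/restoring state awaiting further log restores.
    SELECT DATABASEPROPERTYEX('ImageDW', 'Status')      AS DbStatus,
           DATABASEPROPERTYEX('ImageDW', 'IsInStandBy') AS IsInStandBy;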

Log Shipping at Primary Site


Put the log shipping volume on a separate R1 device (not the same as the database R1)
Create a log backup maintenance plan to back up logs to the log shipping volume, which is an R1 device (see the sketch below)
Set the R1 to Adaptive Copy mode
Establish the R1/R2 mirror; the logs automatically get copied to R2
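
The maintenance plan's backup step amounts to a routine BACKUP LOG to the log shipping volume; a minimal sketch, in which the database name and the S: drive (assumed to be the R1-backed log shipping volume) are hypothetical:

    -- Hypothetical scheduled job step on the primary server: back up the transaction
    -- log to the log shipping volume; SRDF then mirrors the backup files to R2.
    DECLARE @file nvarchar(260);
    SET @file = 'S:\LogShip\ImageDW_' +
                CONVERT(nvarchar(8), GETDATE(), 112) +
                REPLACE(CONVERT(nvarchar(8), GETDATE(), 108), ':', '') + '.trn';
    BACKUP LOG ImageDW TO DISK = @file WITH INIT;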

Log Shipping at Fail-over Site


BCV (mirror) of R2
Schedule a script that splits and mounts the BCV, then restores the logs to the SQL Server database(s) (see the sketch below)
Flush, un-mount, and re-establish the BCV mirror after the logs have been restored
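
After the script mounts the BCV, the restore step applies each shipped log in sequence; a minimal sketch, where the T: drive (the mounted BCV volume), the file name, and the undo file path are hypothetical:

    -- Hypothetical restore step on the fail-over host. WITH STANDBY keeps the database
    -- read-only and readable between log restores; repeat for each log file in sequence.
    RESTORE LOG ImageDW
        FROM DISK = 'T:\LogShip\ImageDW_20040115_0200.trn'
        WITH STANDBY = 'E:\MSSQL\Backup\ImageDW_undo.dat';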

Process Overview Summary


Initial synchronization using SRDF in Adaptive Copy mode
Use TSIMSNAP(2) to split the SRDF group after synchronization is complete
Use SQL Server log shipping to push updates to the fail-over server
The fail-over database is up and running at all times, giving you confidence that the fail-over server is working

Fail-over Process
Fail-over Type: Read-only
Required Action: No server action. Clients would need to point to the fail-over server.

Fail-over Type: Full Update
Required Action: SQL command: RESTORE DATABASE DBName WITH RECOVERY (see the sketch below)
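
For a full-update fail-over, any remaining shipped logs are applied and the database is then recovered so it comes online read/write; a sketch using a hypothetical database name and log file:

    -- Hypothetical full-update fail-over on the fail-over server:
    -- apply the last available log backup, then recover the database.
    RESTORE LOG ImageDW
        FROM DISK = 'T:\LogShip\ImageDW_last.trn'
        WITH NORECOVERY;
    RESTORE DATABASE ImageDW WITH RECOVERY;  -- now writable; point clients here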

Fail-back Process
From a Read-only Fail-over
Required Action: None required. Point clients back to the Primary.

From a Full Update Fail-over
Required Action:
1. Run the SYMRDF UPDATE command to copy from R2 to R1 in Adaptive Copy mode
2. Detach the database on R2 after the update is complete
3. Flush and un-mount the volumes on R2
4. Run SYMRDF FAILBACK to replicate the final changes back to R1 and write-enable R1
5. Mount the R1 volumes
6. Attach the database on the Primary host (steps 2 and 6 are sketched below)
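
Steps 2 and 6 are ordinary SQL Server 2000 detach/attach operations wrapped around the SYMRDF fail-back; a minimal sketch with a hypothetical database name and file paths:

    -- Step 2 (on the fail-over host): detach the database before un-mounting the R2 volumes.
    EXEC sp_detach_db 'ImageDW';

    -- Step 6 (on the primary host, after SYMRDF FAILBACK and mounting the R1 volumes):
    EXEC sp_attach_db 'ImageDW',
         'I:\ImageDW\ImageDW_Data1.mdf',
         'J:\ImageDW\ImageDW_Data2.ndf',
         'L:\ImageDW\ImageDW_Log.ldf';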

Closing Observations

So far, SQL Server 2000 has met the high-availability objectives
Network traffic across the WAN was minimized by shipping only SQL Server log copies once the initial synchronization was completed
The dual Nishan fiber-to-IP switches allowed data transfer at about 16 GB/hour, taking full advantage of the DS3 (at 45 Mbit/s, a DS3 carries roughly 20 GB/hour at line rate); this transfer rate easily met USDA's needs for initial synchronization, daily log shipping, and the fail-back process
The working read-only version of the fail-over database meant that the administrators always knew the status of their fail-over system
The USDA implementation did not require a large number of BCV volumes, as some other replication schemes require

Closing Observations

After the R1/R2 mirror has been split, SRDF continues to track updates to R1 (from normal processing) and to R2 (from the log restore process). SRDF is then able to ship only the modified tracks during fail-back or re-synchronization. This process is called an Incremental Establish or an Incremental Fail-back and is much more efficient than a Full Establish or Full Fail-back.
After fail-back, the R1 and R2 devices will be in sync and ready for log shipping startup with a minimal amount of effort.
Since all of the SRDF operations (initial synchronization, fail-back, and log shipping) run in adaptive copy mode, performance on the primary server is not impacted.

Software

SQL Server 2000 Enterprise Edition
Windows 2000 / Windows Server 2003
EMC SYM Command Line Interface
EMC Resource Pack

Call To Action

Understand your HA requirements
Work with your SAN vendor to architect and design for your SQL Server deployment

Plan your device & database allocation before requesting a BIN file
Decide whether to share devices among databases (use TSIMSNAP or TSIMSNAP2); the decision affects convenience, space, and flexibility of operations

Stress test subsystem prior to deploying SQL Server

For more information, please email SCDLITE@microsoft.com
You can download all presentations at www.microsoft.com/usa/southcentral/

SQL Server Summit Brought To You By:

2004 Microsoft Corporation. All rights reserved. This presentation is for informational purposes only. MICROSOFT MAKES NO WARRANTIES, EXPRESS OR IMPLIED, IN THIS SUMMARY.
