
1

Starwood Hotels


Storing More Data for Less

Oracle Partitioning for ILM

Dr Lilian Hobbs, Oracle ILM Product Manager

Arup Nanda, Director – Database Engineering, Starwood Hotels


Agenda

• Introduction to Oracle ILM
• What is Information Lifecycle Management
• 4 Steps for implementing ILM using Oracle Database
• Implementing ILM at Starwood Hotels
• How was Partitioning used
• Partitioning & Tiered Storage
• Moving Data
• Security & Compliance issues
• ILM Assistant for managing your ILM environment

3

Introduction to
Oracle Information
Lifecycle Management

4
What Drives ILM?

• Reduce cost to retain data
• Vast amounts of data are retained by enterprises for business
and regulatory reasons
• Need to optimize the cost of retaining data in the database
to avoid skyrocketing costs

[Figure: three data classes: Active Data, Less Active Data, Historical Data]

5
Database Implementation without ILM

[Figure: Data Lifecycle stages (Active, Less Active, Historical, Archive) shown above digital data storage made up of a single High Performance Storage Tier and a Tape Archive]

6
Match Lifecycle to Storage to Optimise Cost

[Figure: Data Lifecycle stages (Active, Less Active, Historical, Offline Archive) shown above digital data storage made up of a High Performance Storage Tier, a Low Cost Storage Tier, an Online Archive Storage Tier, and an Offline Archive]

7
ILM and Oracle

• From a database perspective, ILM is a set of policies and
techniques for managing data
• Managing data is Oracle’s core competency
• The Oracle platform can be used to implement ILM policies
and techniques for business data
• Some new technology, but mostly an application of existing data
management capabilities

8
Oracle is Ideal for Business ILM

• Oracle Database is the best place to implement business ILM
• Application Transparent ILM
• Oracle classifies business data transparently to the applications
• Fine Grained ILM
• Oracle manages the lifecycle of groups of business data down to the level of individual rows
• Low Cost ILM
• Oracle can use low cost storage to reduce the cost of retaining data
• Enforceable Compliance Policies
• Oracle has sophisticated techniques to define and enforce data policies

[Figure: data classes: Active, Less Active, Historical]

9
4 Steps to Business ILM
1. Define Data Classes
2. Create Storage Tiers for the Data Classes
3. Create Data Access and Migration Policies
4. Define and Enforce Compliance Policies

[Figure: the four steps arranged around the data classes (Active, Less Active, Historical) and the digital data storage tiers: High Performance Storage Tier, Low Cost Storage Tier, Online Archive Storage Tier, Offline Archive]

10
Step 1 to Business ILM

1. Define Data Classes

11
Define Classes of Data

[Figure: chart with axis labels Activity, Data Volume, Months and Years; regions labelled High Volume, Active, Low Volume and Less Active]

• First understand data as part of a business process
• How is it used?
• How long must it be kept?
• How does access vary over time?
• Choose classification based on this understanding
• Classification by Age is most common
• Others are possible
• Privacy
• Product ID
• Consider a Hybrid Classification
• Classify by business attribute AND age

12
Separate Data by Class

[Figure: the All Orders table split into partitions Q1 Orders, Q2 Orders, Q3 Orders, Q4 Orders and Previous Orders]

• The goal of ILM is to apply different policies to different classes of data
• To treat data classes differently, you must physically separate data by class
• Data classes must be mapped to data attributes, e.g. Order date
• Table Partitions enable you to separate data by data attribute
• Can manage each class (partition) as a unit
• Store, move, archive, search, query
• Partitions are transparent to the application
• Define Data Lifecycles using the ILM Assistant
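
The separation by order date described on this slide could look roughly like the following. This is a minimal sketch, not the presenters' schema: the table name, columns, partition names and the 2006 date boundaries are all assumptions chosen to mirror the diagram.

```sql
-- Hypothetical ORDERS table, range-partitioned by order date.
-- Names and boundaries are illustrative only.
CREATE TABLE orders (
    order_id    NUMBER       NOT NULL,
    customer_id NUMBER       NOT NULL,
    order_date  DATE         NOT NULL,
    amount      NUMBER(12,2)
)
PARTITION BY RANGE (order_date) (
    -- Everything before 2006 lands in the "Previous Orders" class
    PARTITION previous_orders VALUES LESS THAN (TO_DATE('2006-01-01', 'YYYY-MM-DD')),
    PARTITION q1_orders       VALUES LESS THAN (TO_DATE('2006-04-01', 'YYYY-MM-DD')),
    PARTITION q2_orders       VALUES LESS THAN (TO_DATE('2006-07-01', 'YYYY-MM-DD')),
    PARTITION q3_orders       VALUES LESS THAN (TO_DATE('2006-10-01', 'YYYY-MM-DD')),
    PARTITION q4_orders       VALUES LESS THAN (TO_DATE('2007-01-01', 'YYYY-MM-DD'))
);

-- The application queries the table as usual; the optimizer can
-- restrict this query to the Q2 partition from the date predicate.
SELECT COUNT(*)
FROM   orders
WHERE  order_date >= TO_DATE('2006-04-01', 'YYYY-MM-DD')
AND    order_date <  TO_DATE('2006-07-01', 'YYYY-MM-DD');
```

Because the application only names the table, the partition layout can change later without touching application SQL, which is the transparency the slide refers to.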

13
Step 2 to Business ILM

2. Create Storage Tiers for the Data Classes

[Figure: digital data storage tiers: High Performance Storage Tier, Low Cost Storage Tier, Online Archive Storage Tier, Offline Archive]

14
Create Physical Storage Tiers

[Figure: High Performance Storage Tier and Low Cost Storage Tier]

• Create separate storage areas for high performance and low cost storage
• High Performance Storage Tier uses
• High performance storage arrays
• Disks optimized for throughput
• Low Cost Storage Tier uses
• Modular arrays for reduced cost
• Large capacity commodity ATA disks

15
Online Archive Storage Tier

• The Online Archive Storage Tier is
• Very large
• Low Activity
• Read-only or Read-mostly
• Store in low cost storage tier
• Information is still online and always readable
• No delay when data is needed; it is always available
• Storage cost is almost the same as tape
• Leverage usage patterns to further reduce size & cost
• Defragment and Compress
• Rows, tables, files
• Declare read-only
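
A possible way to apply "defragment, compress and declare read-only" to a historical partition is sketched below. The table `orders`, partition `previous_orders` and tablespace `archive_ts` are hypothetical names, and the tablespace is assumed to already exist on low cost storage.

```sql
-- Rebuild the historical partition in the archive tablespace.
-- MOVE rewrites the segment (removing fragmentation) and COMPRESS
-- stores it with table compression; UPDATE INDEXES keeps the
-- indexes usable during the move.
ALTER TABLE orders
    MOVE PARTITION previous_orders
    TABLESPACE archive_ts
    COMPRESS
    UPDATE INDEXES;

-- Once no further changes are expected, freeze the tablespace.
-- The data stays online and queryable.
ALTER TABLESPACE archive_ts READ ONLY;
```

A read-only tablespace also needs to be backed up only once after the change, which ties in with the backup objective discussed later in the deck.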

16
Assign Classes to Storage Tiers

Assign data partitions to appropriate storage tiers

[Figure: partitions of the All Orders table mapped to storage tiers]
• Q1 Orders, Q2 Orders: Active Data, High Performance Storage Tier
• Q3 Orders, Q4 Orders: Less Active Data, Low Cost Storage Tier
• Previous Orders: Historical Data, Online Archive Storage Tier

17
Tiered Storage – Sample Costs

Storage Tier                 Single Tier   Multiple Tiers   Multiple Tiers w/Compression
High Performance (2550 GB)   $74,300.00
High Performance (50 GB)                   $1,450.00        $1,450.00
Low Cost (500 GB)                          $3,500.00        $3,500.00
Online Archive (2000 GB)                   $14,000.00       $5,600.00
Total                        $74,300.00    $18,950.00       $10,550.00

• Using appropriate storage tiers reduces the total cost of
ownership severalfold
• Compression reduces TCO even more
• Full application transparency
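
The totals in the sample table can be checked by hand. The worked sums below use only the dollar figures from the slide; the ratios are rounded, and reading the lower archive price as a compression ratio assumes that cost scales linearly with capacity.

```latex
\begin{align*}
\text{Multiple tiers:}   \quad & 1450 + 3500 + 14000 = 18950 \\
\text{With compression:} \quad & 1450 + 3500 + 5600 = 10550 \\
\text{Savings vs. single tier:} \quad & \frac{74300}{18950} \approx 3.9,
    \qquad \frac{74300}{10550} \approx 7.0 \\
\text{Implied archive compression:} \quad & \frac{14000}{5600} = 2.5
\end{align*}
```

The three tiers together hold the same 2550 GB as the single tier (50 + 500 + 2000), so the comparison is like for like.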

18
Step 3 to Business ILM

3. Create Data Access and Migration Policies

[Figure: data classes (Active, Less Active, Historical) above the digital data storage tiers: High Performance Storage Tier, Low Cost Storage Tier, Online Archive Storage Tier, Offline Archive]

19
Define Access Policies

[Figure: Authorized Users and Special Users reach the High Performance, Low Cost and Online Archive Storage Tiers through an Access Policy]

• Access Policies determine data visibility
• Only authorized data
• Only recent data
• Special users or operations access historical data
• Hiding historical data speeds up data scans and maintenance
• Implement Access Policies using Views and Virtual Private Database
• Transparent to the application
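
As one illustration of the view-based approach, the sketch below exposes only recent rows to ordinary users. The `orders` table, the view name, the `app_user` account and the 12-month cutoff are all assumptions, not details from the presentation.

```sql
-- Hypothetical view that hides historical data from regular users.
-- Only orders from roughly the last 12 months are visible.
CREATE OR REPLACE VIEW recent_orders AS
SELECT order_id, customer_id, order_date, amount
FROM   orders
WHERE  order_date >= ADD_MONTHS(TRUNC(SYSDATE, 'MM'), -12);

-- Regular users are granted the view only; special users who need
-- history are granted the base table instead.
GRANT SELECT ON recent_orders TO app_user;
```

When the underlying table is partitioned by `order_date`, a date filter like this one also lets the database skip the historical partitions, which is where the scan speed-up mentioned on the slide comes from.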

20
Migrate Data between Classes

[Figure: the Q2 Orders partition moving from the High Performance Storage Tier to the Low Cost Storage Tier]

• Periodically move data between storage tiers as access patterns change
• e.g. MOVE PARTITION holding Q2 Orders from high performance storage tier to low cost storage tier
• Move important data on demand
• UPDATE of partition key will cause row to move to a new partition
• e.g. product warranty expires
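
Both kinds of migration can be sketched in a few statements. The names below (`orders`, `q2_orders`, `product_registrations`, `high_perf_ts`, `low_cost_ts`) are hypothetical, and the tablespaces are assumed to exist on the matching storage tiers. Note that a row only changes partition on UPDATE if row movement is enabled on the table.

```sql
-- Periodic migration: relocate a whole partition to the low cost tier.
ALTER TABLE orders
    MOVE PARTITION q2_orders
    TABLESPACE low_cost_ts
    UPDATE INDEXES;

-- On-demand migration: a list-partitioned table where the partition
-- key records the warranty state.
CREATE TABLE product_registrations (
    registration_id NUMBER       NOT NULL,
    product_id      NUMBER       NOT NULL,
    warranty_status VARCHAR2(10) NOT NULL
)
PARTITION BY LIST (warranty_status) (
    PARTITION p_in_warranty VALUES ('ACTIVE')  TABLESPACE high_perf_ts,
    PARTITION p_expired     VALUES ('EXPIRED') TABLESPACE low_cost_ts
)
ENABLE ROW MOVEMENT;

-- When the warranty expires, updating the key relocates the row
-- from p_in_warranty to p_expired.
UPDATE product_registrations
SET    warranty_status = 'EXPIRED'
WHERE  registration_id = 1001;
```

Without ENABLE ROW MOVEMENT, an update that would change a row's partition is rejected with an error instead of moving the row.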

21
Step 4 to Business ILM
4. Define and Enforce Compliance Policies

[Figure: digital data storage tiers: High Performance Storage Tier, Low Cost Storage Tier, Online Archive Storage Tier, Offline Archive]

22
Elements of Compliance Policy

• Retention
• Ensure data is retained unmodified for a specific time period
• Immutability
• Prove to external party that data is complete and unmodified
• Privacy
• Protect personal and other sensitive data
• Auditing
• Track and report changes to important data
• Expiration
• Expunge stale data to limit liability

23

Implementing ILM at
Starwood Hotels

24
System Overview

• OLTP Database – 4.5 TB
• 3-node RAC on HP-UX on Itanium
• Storage – EMC DMX, CLARiiON and SATA
• ASM
• DW Database – 16 TB
• Single instance on HP-UX on PA-RISC
• Storage – EMC CLARiiON and SATA
• ASM
• Both OLTP and DW are used for marketing campaigns. OLTP has some historical data too.
• Retention – 18 months in OLTP, 3 years in DW and 7 years on tape

25
Objectives

• Cost
• Cheaper Storage
• Access Time
• Make Full Table Scans as fast as possible
• Backup
• Reduce the total data to be backed up
• Quick backup
• ETL
• Quick and more efficient
• Archive and Purge
• Quickly and efficiently, based on retention requirements

26
How We did Partitioning

• Range-partitioned on a date column, e.g. CREATE_DT
• Partitions named P<yy><mm>, e.g. P0406 for June 2004
• Each partition is created in a different tablespace
• However, to reduce the number of tablespaces, a single tablespace holds the partitions for the same date range across all related tables, e.g.
• RES_TS0406 contains the partition P0406 of RESERVATION_HEADER, the partition P0406 of RESERVATION_DETAILS, and so on for all reservation-related tables.
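
The layout described above might be declared as follows. This is a sketch under stated assumptions: the column lists are invented, the tablespaces RES_TS0406 and RES_TS0407 are assumed to exist already, and P0406 is taken to cover June 2004, as the archiving example later in the deck states.

```sql
-- Header table: one partition per month, one tablespace per month.
CREATE TABLE reservation_header (
    res_id    NUMBER NOT NULL,   -- hypothetical column
    create_dt DATE   NOT NULL
)
PARTITION BY RANGE (create_dt) (
    PARTITION p0406 VALUES LESS THAN (TO_DATE('2004-07-01', 'YYYY-MM-DD'))
        TABLESPACE res_ts0406,
    PARTITION p0407 VALUES LESS THAN (TO_DATE('2004-08-01', 'YYYY-MM-DD'))
        TABLESPACE res_ts0407
);

-- Detail table: same partition names and boundaries, and the
-- same-month partitions share the header's tablespace.
CREATE TABLE reservation_details (
    res_id    NUMBER NOT NULL,   -- hypothetical column
    line_no   NUMBER NOT NULL,   -- hypothetical column
    create_dt DATE   NOT NULL
)
PARTITION BY RANGE (create_dt) (
    PARTITION p0406 VALUES LESS THAN (TO_DATE('2004-07-01', 'YYYY-MM-DD'))
        TABLESPACE res_ts0406,
    PARTITION p0407 VALUES LESS THAN (TO_DATE('2004-08-01', 'YYYY-MM-DD'))
        TABLESPACE res_ts0407
);
```

Keeping all of a month's reservation data in one tablespace means that month can later be moved, made read-only, or transported as a single unit.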

27
Placing Partitions on Tiered Storage

• Plan to have one ASM diskgroup per tablespace; this may not always be possible.
• If not, then have one ASM diskgroup for a given date range, e.g. diskgroup DG0406 holds tablespaces RES_TS0406 and CNSMP_TS0406.
• The tablespaces go into the appropriate storage tier.

28
Moving Data

Three options
1. Moving partitions across tablespaces
• ALTER TABLE T1 MOVE PARTITION P1 TABLESPACE TS1;
2. Moving whole tablespaces across storage tiers using the datafile rename technique
• ALTER TABLESPACE TS1 READ ONLY;
• Copy the files to a lower storage tier
• After completion, ALTER TABLESPACE TS1 OFFLINE;
• ALTER DATABASE RENAME FILE …
• Finally, ALTER TABLESPACE TS1 ONLINE;
3. Moving data transparently in ASM
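
Option 2 is the least self-explanatory of the three, so here is one way the sequence could run end to end. The file paths `/tier1/ts1_01.dbf` and `/tier2/ts1_01.dbf` are hypothetical, and the copy itself happens at the operating-system level, outside SQL.

```sql
-- 1. Freeze the tablespace so the datafile stops changing;
--    queries against it continue to work during the copy.
ALTER TABLESPACE ts1 READ ONLY;

-- 2. (OS level, not SQL) copy /tier1/ts1_01.dbf to /tier2/ts1_01.dbf

-- 3. Brief outage: take the tablespace offline and repoint the
--    control file at the new copy.
ALTER TABLESPACE ts1 OFFLINE;
ALTER DATABASE RENAME FILE '/tier1/ts1_01.dbf' TO '/tier2/ts1_01.dbf';
ALTER TABLESPACE ts1 ONLINE;

-- 4. Confirm the new location. The tablespace is still read-only;
--    ALTER TABLESPACE ts1 READ WRITE would reopen it for changes.
SELECT file_name, status
FROM   dba_data_files
WHERE  tablespace_name = 'TS1';
```

The appeal of this ordering is that the data is unavailable only for the offline, rename and online steps, not for the duration of the copy.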

29
Data Movement Using ASM

[Figure: diskgroup DG0406 containing disk d12]

Disk c7t4d12 is EMC DMX (Tier 1 – Fastest)
Diskgroup DG0406 is built on d12
CREATE DISKGROUP DG0406 EXTERNAL REDUNDANCY -- redundancy level assumed
DISK '/dev/rdsk/c7t4d12' NAME d12;
Tablespace TS0406 is built on the diskgroup DG0406
CREATE TABLESPACE TS0406
DATAFILE '+DG0406/ts0406.dbf';
Partition P0406 of table RES_HDR is on the tablespace TS0406.

30
Data Movement Using ASM, contd.

[Figure: diskgroup DG0406 containing disks d12 and d22]

Disk c7t4d22 is EMC CLARiiON (Tier 2 – Slower)
Add it to the diskgroup DG0406
ALTER DISKGROUP DG0406 ADD DISK
'/dev/rdsk/c7t4d22' NAME d22;
ASM rebalances, so the data is spread over the two disks

31
Data Movement Using ASM, contd 2.

[Figure: diskgroup DG0406 with disk d12 removed and disk d22 remaining]

Finally, drop the disk c7t4d12 (Tier 1)
ALTER DISKGROUP DG0406 DROP DISK d12; -- takes the ASM disk name (d12), not the device path
Data is only on d22 (Tier 2)
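
The add and drop shown on the last two slides can also be issued together, so that ASM rebalances once instead of twice. The sketch below assumes the Tier 1 disk is known to ASM by the name d12, as in the diagram; the rebalance power of 4 is an arbitrary illustrative value.

```sql
-- Swap the Tier 1 disk for the Tier 2 disk in a single operation.
-- ASM migrates the extents to d22 while the tablespace stays online.
ALTER DISKGROUP DG0406
    DROP DISK d12
    ADD  DISK '/dev/rdsk/c7t4d22' NAME d22
    REBALANCE POWER 4;

-- Watch the rebalance; the old disk is released only when the
-- operation has finished and no row is returned for the diskgroup.
SELECT group_number, operation, state, power, est_minutes
FROM   v$asm_operation;
```

Nothing changes at the tablespace or partition level during this move, which is why the deck describes the ASM option as transparent.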

32
VPD Usage

• We support two different companies
• Starwood Hotels
• Starwood Vacation Ownership (SVO)
• Each shouldn’t see the other’s data
• Challenges:
• Deciding the predicate (the WHERE clause that the VPD policy appends to each query)
• Ended up adding a new column to major tables
• New indexes required
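
A VPD policy of the kind described here is usually a function that returns the predicate, registered with DBMS_RLS. The sketch below is illustrative only: the schema `RES`, the added column `company_cd`, the application context `app_ctx` and the policy and index names are assumptions, and the context is presumed to be set at login by some other mechanism.

```sql
-- Policy function: returns the WHERE clause that VPD appends.
CREATE OR REPLACE FUNCTION company_predicate (
    p_schema IN VARCHAR2,
    p_object IN VARCHAR2
) RETURN VARCHAR2
IS
BEGIN
    -- Each session sees only rows for its own company.
    RETURN 'company_cd = SYS_CONTEXT(''app_ctx'', ''company_cd'')';
END;
/

-- Attach the policy to one of the major tables.
BEGIN
    DBMS_RLS.ADD_POLICY(
        object_schema   => 'RES',
        object_name     => 'RESERVATION_HEADER',
        policy_name     => 'COMPANY_ISOLATION',
        function_schema => 'RES',
        policy_function => 'COMPANY_PREDICATE',
        statement_types => 'SELECT,INSERT,UPDATE,DELETE',
        update_check    => TRUE
    );
END;
/

-- Every query now filters on company_cd, hence the new index.
CREATE INDEX res_hdr_company_ix
    ON reservation_header (company_cd, create_dt) LOCAL;
```

This shows why the slide lists a new column and new indexes as costs: the predicate needs a column to filter on, and that filter is added to every statement against the table.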

33
Proving Data hasn’t Changed

• Once the data is made read-only, calculate its hash value using DBMS_CRYPTO.HASH
• This hash value is stored in a separate control database
• Every 6 months the hash value is recalculated and checked against the stored value
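
The slides do not say how the hash is computed over a whole partition, so the following is only one possible approach: a running SHA-1 digest chained over the rows in a fixed order. The columns `res_id` and `create_dt` are hypothetical, and a real check would include every column that must be proven unchanged.

```sql
SET SERVEROUTPUT ON

DECLARE
    l_digest RAW(20);   -- running SHA-1 digest (20 bytes)
BEGIN
    -- A fixed ORDER BY is essential: the digest depends on row order.
    FOR r IN (
        SELECT TO_CHAR(res_id) || '|' ||
               TO_CHAR(create_dt, 'YYYYMMDDHH24MISS') AS row_text
        FROM   res_hdr PARTITION (p0406)
        ORDER  BY res_id
    ) LOOP
        -- Fold each row into the digest of everything before it.
        l_digest := DBMS_CRYPTO.HASH(
                        UTL_RAW.CONCAT(l_digest,
                                       UTL_RAW.CAST_TO_RAW(r.row_text)),
                        DBMS_CRYPTO.HASH_SH1);
    END LOOP;

    -- This value would be stored in the control database and
    -- compared against a fresh run at each periodic check.
    DBMS_OUTPUT.PUT_LINE('P0406 digest: ' || RAWTOHEX(l_digest));
END;
/
```

Running the block requires EXECUTE privilege on DBMS_CRYPTO. Any change to a hashed column, or any inserted or deleted row, produces a different final digest.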

34
FGA usage

• Fine Grained Auditing (FGA)
• Captures SQL statements, even for SELECT
• We use these statements
• to analyze the user query patterns
• as an input to indexing decisions and better data modelling, e.g. using IOTs, clusters and so on.
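
Capturing SELECT statements with FGA takes one policy per table. In this sketch the schema `RES` and the policy name are assumptions; with no audit condition or column specified, every SELECT against the table is recorded.

```sql
-- Record the SQL text of every SELECT against RES_HDR.
BEGIN
    DBMS_FGA.ADD_POLICY(
        object_schema   => 'RES',
        object_name     => 'RES_HDR',
        policy_name     => 'RES_HDR_ACCESS',
        statement_types => 'SELECT',
        audit_trail     => DBMS_FGA.DB + DBMS_FGA.EXTENDED
    );
END;
/

-- Later, review what users actually ran to guide indexing and
-- data modelling decisions.
SELECT db_user, timestamp, sql_text
FROM   dba_fga_audit_trail
WHERE  policy_name = 'RES_HDR_ACCESS'
ORDER  BY timestamp;
```

The EXTENDED setting is what stores the SQL text and bind values in the audit trail, which is the part that makes the trail useful for query pattern analysis.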

35
Archiving

• Two-step approach:
• Convert the partition to a standalone table
• Transport the tablespace
• Example: Partition P0406 (June 2004) of table RES_HDR is ready for archiving
• CREATE TABLE RES_HDR_0406 AS SELECT * FROM RES_HDR WHERE 1=2; (an empty table with the same structure)
• ALTER TABLE RES_HDR EXCHANGE PARTITION P0406 WITH TABLE RES_HDR_0406 INCLUDING INDEXES;
• ALTER TABLE RES_HDR DROP PARTITION P0406;
• Transport the tablespace
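
The final "transport the tablespace" step is not spelled out on the slide. A rough outline follows, assuming the standalone table now sits alone in tablespace TS0406. The export itself runs at the operating-system level, so it appears here only as a comment.

```sql
-- 1. The tablespace must be read-only before it can be transported.
ALTER TABLESPACE ts0406 READ ONLY;

-- 2. Check that the tablespace is self-contained, i.e. nothing in
--    it depends on objects stored in other tablespaces.
BEGIN
    DBMS_TTS.TRANSPORT_SET_CHECK('TS0406', TRUE);
END;
/
SELECT * FROM transport_set_violations;   -- no rows means it is safe

-- 3. (OS level, not SQL) export the metadata with Data Pump using
--    TRANSPORT_TABLESPACES=TS0406, then copy the dump file and the
--    datafiles to the archive location.

-- 4. Only after the archive copy has been verified, remove the
--    tablespace from the source database.
DROP TABLESPACE ts0406 INCLUDING CONTENTS AND DATAFILES;
```

Step 4 is destructive, so in practice it would be gated on a successful test restore of the archived copy.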

36
Future Plans

• Making this a repeatable and simple process
• Making ILM a part of the data modelling initiative
• Making the developers, architects and system designers aware of the need to have ILM at the outset of the design
• Developing a set of best practice guidelines, e.g.
• Think partitioning when planning tables, indexes
• Think of ILM when doing partitioning, not just performance
• Avoid nullable columns
• … and so on …

37

ILM Assistant
For Managing your
ILM Environment

38
Lifecycle Calendar

39
Lifecycle Definition

40
Lifecycle Tables

41
Cost Summary

42
Lifecycle Events

43
Proving Data hasn’t Changed

44
Conclusion

• Oracle Database for ILM
• Implements policies in an application transparent fashion
• Stores maximum data for lowest cost
• Centralizes compliance enforcement

[Figure: Financial Data, Customer Data and Product Data classified as Active, Less Active and Historical, above the digital data storage tiers: High Performance Storage Tier, Low Cost Storage Tier, Online Archive Storage Tier, Offline Archive]

45
Further Information

OTN
For white papers, presentations, eSeminar

http://www.oracle.com/technology/deploy/ilm/index.html
