
Get the best out of Oracle Partitioning

Yasin Mohammed
Technology Consultant
yasin.mohammed@oracle.com

Nirmal Grewal
Technology Sales Representative
nirmal.grewal@oracle.com

Agenda

Partitioning in a nutshell
Getting optimal pruning
Partition exchange loading
Partitioning and unusable indexes
Efficient statistics management
Q&A

The Concept of Partitioning


Simple Yet Powerful

Large Table

Partition

Composite Partition

Difficult to Manage

Divide and Conquer

Better Performance

Easier to Manage

More flexibility to match business needs

Improve Performance

Transparent to applications

What is Oracle Partitioning?

It is
Powerful functionality to logically partition objects into smaller pieces
Only driven by business requirements
Partitioning for Performance, Manageability, and Availability

It is not
Just a way to physically divide, or clump, any large data set into smaller buckets
A prerequisite for supporting a specific hardware/software design
For example, hash distribution is mandatory for shared-nothing systems

Physical versus Logical Partitioning


Shared Nothing Architecture
Physical Partitioning
Fundamental system setup requirement
Node owns piece of DB

[Figure: several nodes, each owning its own piece of the DB]

Enables parallelism
Number of partitions is equivalent to min. parallelism

Always needs HASH distribution
Equally sized partitions per node required for proper load balancing

Physical versus Logical Partitioning


Shared Everything Architecture - Oracle
Logical Partitioning
Is not subject to any architectural constraints
SMP, MPP, Cluster, Grid: the architecture does not matter

Purely based on the business requirement
Availability, Manageability, Performance

[Figure: all nodes share a single DB]

Beneficial for every environment
Provides the most comprehensive functionality


Partition Pruning

Q: What was the total sales for the weekend of May 20 - 22 2008?

Select sum(sales_amount)
From SALES
Where sales_date between
to_date('05/20/2008','MM/DD/YYYY')
And
to_date('05/23/2008','MM/DD/YYYY');

Sales Table partitions: May 18th 2008, May 19th 2008, May 20th 2008, May 21st 2008, May 22nd 2008, May 23rd 2008, May 24th 2008

Only the 3 relevant partitions are accessed
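
To make the pruning example concrete, here is a minimal sketch of how such a daily range-partitioned table might be declared. The column list, the partition names, and the one-partition-per-day layout are assumptions for illustration and are not taken from the slides. Because BETWEEN is inclusive on both ends, the query in the sketch uses a half-open date range so that it stays within the three weekend partitions.

```sql
-- Hypothetical table layout: one range partition per day.
-- Column names and partition names are assumptions for illustration.
CREATE TABLE sales (
  sales_id     NUMBER,
  sales_date   DATE,
  sales_amount NUMBER(10,2)
)
PARTITION BY RANGE (sales_date) (
  PARTITION may_18_2008 VALUES LESS THAN (TO_DATE('05/19/2008','MM/DD/YYYY')),
  PARTITION may_19_2008 VALUES LESS THAN (TO_DATE('05/20/2008','MM/DD/YYYY')),
  PARTITION may_20_2008 VALUES LESS THAN (TO_DATE('05/21/2008','MM/DD/YYYY')),
  PARTITION may_21_2008 VALUES LESS THAN (TO_DATE('05/22/2008','MM/DD/YYYY')),
  PARTITION may_22_2008 VALUES LESS THAN (TO_DATE('05/23/2008','MM/DD/YYYY')),
  PARTITION may_23_2008 VALUES LESS THAN (TO_DATE('05/24/2008','MM/DD/YYYY')),
  PARTITION may_24_2008 VALUES LESS THAN (TO_DATE('05/25/2008','MM/DD/YYYY'))
);

-- Half-open range: rows dated May 20, 21 and 22 only.
-- The filter is on the bare partition key, so the optimizer can prune.
SELECT SUM(sales_amount)
FROM   sales
WHERE  sales_date >= TO_DATE('05/20/2008','MM/DD/YYYY')
AND    sales_date <  TO_DATE('05/23/2008','MM/DD/YYYY');
```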

Partition Pruning

Works for simple and complex SQL statements


Support for every data access

Transparent to any application


No extra coding required

Two flavors of pruning


Static pruning at compile time
Dynamic pruning at runtime

Complementary to Exadata Storage Server


Partitioning prunes logically through partition elimination
Exadata prunes physically through storage indexes
Further data reduction through filtering and projection

Static Partition Pruning

Relevant partitions are known at compile time
Look for actual values in PSTART/PSTOP columns in the plan

Optimizer has most accurate information for the SQL statement

SELECT sum(amount_sold) FROM sales
WHERE time_id
BETWEEN '01-MAR-2004' and '31-MAY-2004';

Partitions: 04-Jan 04-Feb 04-Mar 04-Apr 04-May 04-Jun
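
One way to check for static pruning is to explain the statement and read the Pstart and Pstop columns of the plan. The sketch below assumes a SALES table range-partitioned on time_id, as in the slide. It uses ANSI date literals so that it does not depend on the session's NLS date settings.

```sql
-- Explain the statement, then display the plan from the plan table.
EXPLAIN PLAN FOR
SELECT SUM(amount_sold)
FROM   sales
WHERE  time_id BETWEEN DATE '2004-03-01' AND DATE '2004-05-31';

-- With static pruning, Pstart and Pstop show concrete partition numbers.
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
```

If Pstart and Pstop show the first and last partition of the table instead of a narrow range, no pruning took place for that statement.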

Static Pruning

Sample plan


Dynamic Partition Pruning

Advanced pruning mechanism for complex queries
Recursive statement evaluates the relevant partitions at runtime
Look for the word KEY in PSTART/PSTOP columns in the plan

SELECT sum(amount_sold)
FROM sales s, times t
WHERE t.time_id = s.time_id
AND
t.calendar_month_desc IN
('MAR-2004', 'APR-2004',
'MAY-2004');

[Figure: the Time table drives pruning of the Sales table partitions 04-Jan, 04-Feb, 04-Mar, 04-Apr, 04-May, 04-Jun]
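
The same plan check applies to dynamic pruning. This sketch assumes SALES and TIMES tables shaped like the ones in the slide. Which pruning mechanism the optimizer picks (nested loop, subquery, or Bloom filter) depends on the data, the statistics, and the release, so the exact plan shape is not guaranteed.

```sql
-- Explain the join; the partitions to read are only known at run time.
EXPLAIN PLAN FOR
SELECT SUM(s.amount_sold)
FROM   sales s, times t
WHERE  t.time_id = s.time_id
AND    t.calendar_month_desc IN ('MAR-2004', 'APR-2004', 'MAY-2004');

-- Pstart/Pstop typically show KEY for nested-loop pruning,
-- KEY(SQ) for subquery pruning, or a :BF0000-style entry for
-- Bloom filter pruning, instead of fixed partition numbers.
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
```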

Dynamic Partition Pruning


Nested Loop
Sample plan

Sample explain plan output


Dynamic Partition Pruning


Subquery pruning
Sample plan

Dynamic Partition Pruning


Bloom filter pruning
Sample plan

Enhanced Pruning Capabilities


Oracle Database 11g Release 2
Extended modeling capabilities for better data placement and pruning
Support for virtual columns as primary and foreign key for Reference Partitioning

Enhanced optimizer support for Partitioning
AND pruning
Intelligent multi-branch execution plan with unusable index partitions


AND Pruning

All predicates on the partition key will be used for pruning
Dynamic and static predicates will now be used in combination
A.k.a. multi-predicate pruning

Example:
Star transformation with pruning predicate on both the FACT table and a dimension

FROM sales s, times t
WHERE s.time_id = t.time_id ..
AND t.fiscal_year in (2000,1999)
AND s.time_id
between TO_DATE('01-JAN-1999','DD-MON-YYYY')
and TO_DATE('01-JAN-2000','DD-MON-YYYY')

Dynamic pruning: the predicate on the dimension (t.fiscal_year)
Static pruning: the predicate on the partition key (s.time_id)


AND Pruning

Sample plan

Ensuring Partition Pruning

Don't use functions on partition key filter predicates
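
A minimal sketch of the difference, assuming a SALES table range-partitioned on a DATE column time_id. The month chosen is arbitrary. Wrapping the partition key in a function typically hides the key from the optimizer, while a range on the bare column lets it map the predicate to partitions.

```sql
-- Pruning-unfriendly: the partition key is wrapped in a function,
-- so the optimizer generally cannot map the filter to partitions.
SELECT SUM(amount_sold)
FROM   sales
WHERE  TO_CHAR(time_id, 'YYYY-MM') = '2004-03';

-- Pruning-friendly: same rows, expressed as a half-open range
-- on the bare partition key.
SELECT SUM(amount_sold)
FROM   sales
WHERE  time_id >= DATE '2004-03-01'
AND    time_id <  DATE '2004-04-01';
```

Comparing the Pstart and Pstop columns of the two plans shows whether the rewrite made a difference on a given system.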



Partition Exchange Loading

1. Create external table for flat files
2. Use CTAS command to create nonpartitioned table TMP_SALES
3. Create indexes
4. Alter table Sales exchange partition May_24_2008 with table tmp_sales
5. Collect stats

Sales table now has all the data

[Figure: the DBA loads the Tmp_sales table, which is then exchanged into the Sales table as partition May 24th 2008, next to the existing partitions May 18th 2008 to May 23rd 2008]
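
The five steps can be sketched as follows. This is an illustration under stated assumptions, not the presenters' script: the directory object LOAD_DIR, the file name, the column list, and the index name are hypothetical, and SALES is assumed to be range-partitioned on sales_date with an empty partition MAY_24_2008, the same column list as TMP_SALES, and a matching local index.

```sql
-- 1. External table over the flat file (directory and file are hypothetical).
CREATE TABLE sales_ext (
  sales_id     NUMBER,
  sales_date   DATE,
  sales_amount NUMBER(10,2)
)
ORGANIZATION EXTERNAL (
  TYPE ORACLE_LOADER
  DEFAULT DIRECTORY load_dir
  ACCESS PARAMETERS (
    RECORDS DELIMITED BY NEWLINE
    FIELDS TERMINATED BY ','
    (
      sales_id,
      sales_date   CHAR(10) DATE_FORMAT DATE MASK "MM/DD/YYYY",
      sales_amount
    )
  )
  LOCATION ('sales_20080524.csv')
)
REJECT LIMIT UNLIMITED;

-- 2. CTAS into a nonpartitioned staging table.
CREATE TABLE tmp_sales AS
SELECT * FROM sales_ext;

-- 3. Index the staging table to match the local index on SALES.
CREATE INDEX tmp_sales_date_idx ON tmp_sales (sales_date);

-- 4. Swap the staging table in as the new partition.
--    WITHOUT VALIDATION skips the check that every row belongs in the
--    partition; use it only when the load guarantees that.
ALTER TABLE sales
  EXCHANGE PARTITION may_24_2008 WITH TABLE tmp_sales
  INCLUDING INDEXES
  WITHOUT VALIDATION;

-- 5. Collect statistics on the partitioned table.
BEGIN
  DBMS_STATS.GATHER_TABLE_STATS(ownname => USER, tabname => 'SALES');
END;
/
```

The exchange itself is a data dictionary operation, so the loaded rows are not moved again when the partition is swapped in.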


Unusable Indexes

Unusable index partitions are commonly used in environments with fast load requirements
Save the time for index maintenance at data insertion
Unusable index segments do not consume any space (11.2)

Unusable indexes are ignored by the optimizer
SKIP_UNUSABLE_INDEXES = [TRUE | FALSE ]

Partitioned indexes can be used by the optimizer even if some partitions are unusable
Prior to 11.2, static pruning and only access of usable index partitions mandatory
With 11.2, intelligent rewrite of queries using UNION ALL
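
A typical fast-load sequence with an unusable index partition might look like the sketch below. The index name SALES_TIME_IDX and the partition name MAY_24_2008 are hypothetical. The sketch assumes a local index on SALES whose partitions carry the same names as the table partitions.

```sql
-- Mark one index partition unusable before the load, so inserts into
-- that partition skip index maintenance.
ALTER INDEX sales_time_idx MODIFY PARTITION may_24_2008 UNUSABLE;

-- Let the optimizer ignore unusable indexes instead of raising an error.
ALTER SESSION SET skip_unusable_indexes = TRUE;

-- ... load data into partition MAY_24_2008 here ...

-- Rebuild only the affected index partition after the load.
ALTER INDEX sales_time_idx REBUILD PARTITION may_24_2008;

-- Check which index partitions are USABLE or UNUSABLE.
SELECT partition_name, status
FROM   user_ind_partitions
WHERE  index_name = 'SALES_TIME_IDX'
ORDER  BY partition_position;
```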

Intelligent Multi-Branch Execution

Intelligent UNION ALL expansion in the presence of partially unusable indexes
Transparent internal rewrite
Usable index partitions will be used
Full partition access for unusable index partitions

Multi-Branch Execution

Sample plan


Statistics Gathering

You must gather optimizer statistics
Using dynamic sampling is not an adequate solution
Statistics on global and partition level recommended

Run all queries against empty tables to populate column usage
This helps identify which columns automatically get histograms created on them

Optimizer statistics should be gathered after the data has been loaded but before any indexes are created
Oracle will automatically gather statistics for indexes as they are being created

Efficient Statistics Management

Use AUTO_SAMPLE_SIZE
The only setting that enables new efficient statistics collection
Hash based algorithm, scanning the whole table
Speed of sampling, accuracy of compute

Enable incremental global statistics collection


Avoids scan of all partitions after changing single partitions
Prior to 11.1, scan of all partitions necessary for global stats
Managed on per table level
Static setting
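
As an illustration, AUTO_SAMPLE_SIZE can be requested explicitly when gathering statistics. The schema SH and table SALES are taken from the examples elsewhere in this deck. Whether AUTO_SAMPLE_SIZE is already the default depends on the preferences set on the system, so spelling it out is a defensive choice rather than a requirement.

```sql
-- Gather table statistics with the automatic sample size and let
-- Oracle decide the granularity (global and partition level).
BEGIN
  DBMS_STATS.GATHER_TABLE_STATS(
    ownname          => 'SH',
    tabname          => 'SALES',
    estimate_percent => DBMS_STATS.AUTO_SAMPLE_SIZE,
    granularity      => 'AUTO'
  );
END;
/
```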

Incremental Global Statistics

1. Partition level stats are gathered & synopsis created
2. Global stats generated by aggregating partition synopsis

[Figure: Sales Table with partitions May 18th 2008 to May 23rd 2008; the partition synopses are stored in the Sysaux Tablespace]

Incremental Global Statistics Cont'd

3. A new partition is added to the table & data is loaded
4. Gather partition statistics for new partition
5. Retrieve synopsis for each of the other partitions from Sysaux
6. Global stats generated by aggregating the original partition synopsis with the new one

[Figure: Sales Table with partitions May 18th 2008 to May 24th 2008; the synopses are read from the Sysaux Tablespace]

Steps necessary to gather accurate statistics

Turn on incremental feature for the table
EXEC DBMS_STATS.SET_TABLE_PREFS('SH','SALES','INCREMENTAL','TRUE');

After load gather table statistics using GATHER_TABLE_STATS
No need to specify parameters
EXEC DBMS_STATS.GATHER_TABLE_STATS('SH','SALES');

The command will collect statistics for partitions and update the global statistics based on the partition level statistics and synopsis
Possible to set incremental to true for all tables
Only works for already existing tables
EXEC DBMS_STATS.SET_GLOBAL_PREFS('INCREMENTAL','TRUE');
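
To confirm that the preference took effect and to see when each partition was last analyzed, a check along these lines may help. It assumes the SH.SALES table used in the commands above and read access to the ALL_TAB_STATISTICS dictionary view.

```sql
-- Returns TRUE if incremental statistics are enabled for SH.SALES.
SELECT DBMS_STATS.GET_PREFS('INCREMENTAL', 'SH', 'SALES') AS incremental
FROM   dual;

-- One row for the table and one per partition: after a load, only the
-- changed partitions and the table-level row should show a new
-- LAST_ANALYZED timestamp.
SELECT partition_name, object_type, global_stats, last_analyzed
FROM   all_tab_statistics
WHERE  owner = 'SH'
AND    table_name = 'SALES'
ORDER  BY partition_position;
```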

Partition Advisor - SQL Access Advisor

Summary

Partitioning in a nutshell
Getting optimal pruning
Partition exchange loading
Partitioning and unusable indexes
Efficient statistics management
Demo (Performance & availability)

Partitioning Demonstration

Data Partitioning provides significant Service Benefits


Date 16/03/2010

Scenario Partitioning for Reliability

Two interactive scenarios will demonstrate:
Query performance of partitioned vs. non-partitioned data.
Query resilience against unanticipated events affecting data availability.

Demo Data Overview : Sales Information


The tables below hold the same sales information, with different storage structures
Size: 5,513,058 sales entries

Table: SALES_p1
Partitioning Scheme used:
Initially partitioned into yearly, half-yearly, and quarterly periods
Further partitioned by country regions

Table: SALES_nop1
For comparison purposes, a similar non-partitioned table is created.

[Chart: vertical axis from 0 to 450,000; series labels not recoverable]
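
A scheme like the one described for SALES_p1 could be declared as a composite range-list partitioned table. The sketch below is a guess at the shape only. The table name, the column names, the region values, and the partition boundaries are all assumptions, since the slides do not show the actual DDL.

```sql
-- Hypothetical composite partitioning: range on time (yearly, then
-- half-yearly, then quarterly), list subpartitions on country region.
CREATE TABLE sales_p1_sketch (
  time_id        DATE,
  country_region VARCHAR2(20),
  amount_sold    NUMBER(10,2)
)
PARTITION BY RANGE (time_id)
SUBPARTITION BY LIST (country_region)
SUBPARTITION TEMPLATE (
  SUBPARTITION americas VALUES ('Americas'),
  SUBPARTITION europe   VALUES ('Europe'),
  SUBPARTITION asia     VALUES ('Asia'),
  SUBPARTITION other    VALUES (DEFAULT)
)
(
  PARTITION y1998   VALUES LESS THAN (TO_DATE('1999-01-01','YYYY-MM-DD')),
  PARTITION h1_1999 VALUES LESS THAN (TO_DATE('1999-07-01','YYYY-MM-DD')),
  PARTITION h2_1999 VALUES LESS THAN (TO_DATE('2000-01-01','YYYY-MM-DD')),
  PARTITION q1_2000 VALUES LESS THAN (TO_DATE('2000-04-01','YYYY-MM-DD')),
  PARTITION q2_2000 VALUES LESS THAN (TO_DATE('2000-07-01','YYYY-MM-DD'))
);
```

Mixing coarse partitions for old data with finer partitions for recent data keeps the partition count low while still allowing tight pruning on the periods that are queried most.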

Demo Data Overview : Customer Information


The tables below hold the same customer information, with different storage structures
Size: 832,500 customer entries

Table: CUSTOMER_p1
Partitioning Scheme used:
Partitioned by country regions

Table: CUSTOMER_nop1
For comparison purposes, a similar table is created as non-partitioned.

Demo Distribution of Customers across Countries

[Chart: COUNT(C.CUST_ID) per country, vertical axis from 0 to 300,000]
Countries: Brazil, Denmark, Poland, South Africa, China, United Kingdom, New Zealand, Saudi Arabia, United States of America, Germany, Spain, France, Australia, Canada, Singapore, Argentina, Italy, Japan, Turkey

Demonstration Infrastructure

Equipment
Amazon Cloud-based Virtual Machine Image (AMI)
2 Core, 1.7 GB Memory
Oracle Linux OS (OEL 5)

Software Configuration of Public Amazon AMI
Oracle 11g Enterprise Edition (v11.1.0.7)
One disk /u02 (dev8-2) dedicated to Oracle storage I/O for benchmark accuracy.
Oracle Sample Data (i.e. SH repository) installed and extended for demo purposes.

Scenario 1 Key Performance Benefits

Areas highlighted in red show the resulting overhead of accessing normal table structures.
Areas highlighted in green reveal the reduced overhead of accessing partitions instead of the whole table.
Note: The graph details were generated by sar directives from a VMware image installed on a notebook. The Demo will use an Amazon AMI.

Scenario 2 Key Availability Benefits

Database files that hold data involved in the queries below have been accidentally removed!

Non Partitioned Table
For a particular date range, i.e. 1999 - 2000, relevant non-partitioned tables are not reachable, resulting in an error.

Partitioned Table
As expected, for a particular date range, i.e. 1999 - 2000, data in a partitioned table is not reachable, resulting in an error.

Note: sales_nop2 data resides in the example tablespace, which in turn references datafile example01.dbf.

Scenario 2 Key Availability Benefits (Cont.)

related to database file: example01_01.dbf
related to missing file: example02_01.dbf
related to database file: example03_01.dbf

Non Partitioned Table
For a more current date range, i.e. 2000 - 2001, data is still not reachable, resulting in an error.

Partitioned Table
For more current information, i.e. 2000 - 2001, data is now reachable as a result of using partitions to better isolate against data disruptions.
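
The isolation shown in this scenario relies on each partition living in its own tablespace, and therefore in its own data file. A minimal sketch of such a layout follows. The table name and the tablespace names ts_1999, ts_2000 and ts_2001 are hypothetical, and the tablespaces must already exist.

```sql
-- Hypothetical layout: one tablespace (and data file) per yearly partition,
-- so losing one data file only affects queries that touch that partition.
CREATE TABLE sales_by_year_sketch (
  time_id     DATE,
  amount_sold NUMBER(10,2)
)
PARTITION BY RANGE (time_id) (
  PARTITION y1999 VALUES LESS THAN (TO_DATE('2000-01-01','YYYY-MM-DD')) TABLESPACE ts_1999,
  PARTITION y2000 VALUES LESS THAN (TO_DATE('2001-01-01','YYYY-MM-DD')) TABLESPACE ts_2000,
  PARTITION y2001 VALUES LESS THAN (TO_DATE('2002-01-01','YYYY-MM-DD')) TABLESPACE ts_2001
);

-- Show which tablespace backs each partition.
SELECT partition_name, tablespace_name
FROM   user_tab_partitions
WHERE  table_name = 'SALES_BY_YEAR_SKETCH'
ORDER  BY partition_position;
```

A query whose predicate prunes to partitions in intact tablespaces can still run, which is the behavior the partitioned side of the demo shows.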

Demo Summary
Performance improvement - Scenario 1
Up to 3 times faster than traditional methods of data retrieval.
Availability Enhancement - Scenario 2
Limits detrimental effects of data access failures.
In General
Improves system scalability and manageability.
Adds data-level of protection to any High Availability strategy.

Data Partitioning provides significant Service Benefits

Next Steps

Upcoming Webinars
More coming soon!!!

http://otn.oracle.com/database
Follow OracleDirect ANZ on Twitter at
http://www.twitter.com/OracleDirectANZ
Or our NEW!!! blog http://blogs.oracle.com/techtalk
