Best Partitioning

Get the best out of Oracle Partitioning
Yasin Mohammed
Technology Consultant
yasin.mohammed@oracle.com
Nirmal Grewal
Technology Sales Representative
nirmal.grewal@oracle.com
Agenda
Partitioning in a nutshell
Getting optimal pruning
Partition exchange loading
Partitioning and unusable indexes
Efficient statistics management
Q&A
The Concept of Partitioning

Simple Yet Powerful
Large Table
Partition
Composite Partition
Difficult to Manage
Divide and Conquer
Better Performance
Easier to Manage
More flexibility to match business needs
Improve Performance
Transparent to applications
What is Oracle Partitioning?
It is
Powerful functionality to logically partition objects into
smaller pieces
Only driven by business requirements
Partitioning for Performance, Manageability, and
Availability
It is not
Just a way to physically divide, or clump, any large
data set into smaller buckets
An enabling prerequisite to support a specific
hardware/software design
Hash is mandatory for shared nothing systems
Physical versus Logical Partitioning

Shared Nothing Architecture
Physical Partitioning
Fundamental system setup
requirement
Node owns piece of DB
Enables parallelism
Number of partitions is equivalent to the minimum parallelism
Always needs HASH distribution
Equally sized partitions per node required for proper load balancing
Physical versus Logical Partitioning

Shared Everything Architecture - Oracle
Logical Partitioning
Is not subject to any constraints
SMP, MPP, Cluster, Grid does not matter
Purely based on the business requirement
Availability, Manageability, Performance
Beneficial for every environment
Provides the most comprehensive functionality
Partition Pruning
Q: What was the total sales for the weekend of May 20 - 22 2008?
SELECT sum(sales_amount)
FROM sales
WHERE sales_date >= to_date('05/20/2008','MM/DD/YYYY')
AND sales_date < to_date('05/23/2008','MM/DD/YYYY');
Sales Table (one partition per day): May 18th 2008, May 19th 2008, May 20th 2008, May 21st 2008, May 22nd 2008, May 23rd 2008, May 24th 2008
Only the 3 relevant partitions (May 20th, 21st, and 22nd) are accessed
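A minimal sketch of a table layout that would match the slide above. Only SALES, sales_date, sales_amount, and the partition name May_24_2008 appear in the deck; the sale_id column, the data types, and the other partition names are assumptions made for illustration.

```sql
-- Hypothetical daily range-partitioned SALES table.
-- sale_id and all data types are assumed, not taken from the deck.
CREATE TABLE sales (
  sale_id       NUMBER,
  sales_date    DATE,           -- partition key
  sales_amount  NUMBER(10,2)
)
PARTITION BY RANGE (sales_date) (
  PARTITION may_18_2008 VALUES LESS THAN (TO_DATE('05/19/2008','MM/DD/YYYY')),
  PARTITION may_19_2008 VALUES LESS THAN (TO_DATE('05/20/2008','MM/DD/YYYY')),
  PARTITION may_20_2008 VALUES LESS THAN (TO_DATE('05/21/2008','MM/DD/YYYY')),
  PARTITION may_21_2008 VALUES LESS THAN (TO_DATE('05/22/2008','MM/DD/YYYY')),
  PARTITION may_22_2008 VALUES LESS THAN (TO_DATE('05/23/2008','MM/DD/YYYY')),
  PARTITION may_23_2008 VALUES LESS THAN (TO_DATE('05/24/2008','MM/DD/YYYY')),
  PARTITION may_24_2008 VALUES LESS THAN (TO_DATE('05/25/2008','MM/DD/YYYY'))
);
```

With this layout, each partition holds exactly one day, so a predicate covering May 20th through May 22nd maps to three partitions.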
Partition Pruning
Works for simple and complex SQL statements

Support for every data access
Transparent to any application

No extra coding required
Two flavors of pruning

Static pruning at compile time
Dynamic pruning at runtime
Complementary to Exadata Storage Server

Partitioning prunes logically through partition elimination
Exadata prunes physically through storage indexes
Further data reduction through filtering and projection
Static Partition Pruning
Relevant Partitions are known at compile time

Look for actual values in PSTART/PSTOP columns in the plan
Optimizer has most accurate information for the SQL statement
SELECT sum(amount_sold) FROM sales
WHERE time_id
BETWEEN TO_DATE('01-MAR-2004','DD-MON-YYYY')
and TO_DATE('31-MAY-2004','DD-MON-YYYY');
04-Jan 04-Feb 04-Mar 04-Apr 04-May 04-Jun
Static Pruning
Sample plan
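The sample plan images are not reproduced in this text. As a rough sketch, a plan for the static pruning query can be generated as follows, assuming a SALES table range-partitioned on time_id (as in the SH sample schema) and a session with access to the default plan table.

```sql
-- Generate the execution plan without running the query.
EXPLAIN PLAN FOR
SELECT SUM(amount_sold)
FROM   sales
WHERE  time_id BETWEEN TO_DATE('01-MAR-2004','DD-MON-YYYY')
                   AND TO_DATE('31-MAY-2004','DD-MON-YYYY');

-- Display the plan; the Pstart and Pstop columns hold the pruning information.
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
```

With static pruning, Pstart and Pstop should show concrete partition numbers. With dynamic pruning they show KEY instead.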
Dynamic Partition Pruning
Advanced pruning mechanism for complex queries
Recursive statement evaluates the relevant partitions at runtime
Look for the word KEY in PSTART/PSTOP columns in the plan
Sales (partitions): 04-Jan 04-Feb 04-Mar 04-Apr 04-May 04-Jun
Time (dimension table joined to Sales)
SELECT sum(amount_sold)
FROM sales s, times t
WHERE t.time_id = s.time_id
AND
t.calendar_month_desc IN
('MAR-2004', 'APR-2004',
'MAY-2004');

Nested Loop
Sample plan (sample explain plan output)

Subquery pruning
Sample plan

Bloom filter pruning
Sample plan
Enhanced Pruning Capabilities

Oracle Database 11g Release 2
Extended modeling capabilities for better data
placement and pruning
Support for virtual columns as primary and foreign key for
Reference Partitioning
Enhanced optimizer support for Partitioning

AND pruning
Intelligent multi-branch execution plan with unusable index
partitions
AND Pruning
All predicates on the partition key will be used for pruning

Dynamic and static predicates are now used in combination
A.k.a. multi-predicate pruning
Example:
Star transformation with pruning predicate on both the FACT
table and a dimension
FROM sales s, times t
WHERE s.time_id = t.time_id ..
AND t.fiscal_year in (2000,1999)  -- dynamic pruning
AND s.time_id
between TO_DATE('01-JAN-1999','DD-MON-YYYY')
and TO_DATE('01-JAN-2000','DD-MON-YYYY')  -- static pruning
AND Pruning
Sample plan
Ensuring Partition Pruning
Don't use functions on partition key filter predicates
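To illustrate the rule above, here is a hedged pair of queries against a SALES table assumed to be range-partitioned on time_id. The specific TO_CHAR expression is only an example of a function wrapped around the partition key.

```sql
-- Pruning-unfriendly: the function is applied to the partition key,
-- so the optimizer typically cannot map the predicate to partitions.
SELECT SUM(amount_sold)
FROM   sales
WHERE  TO_CHAR(time_id, 'YYYY-MM') = '2004-03';

-- Pruning-friendly: the bare partition key is compared against a range,
-- and any function is applied to the literal side only.
SELECT SUM(amount_sold)
FROM   sales
WHERE  time_id >= TO_DATE('01-MAR-2004','DD-MON-YYYY')
AND    time_id <  TO_DATE('01-APR-2004','DD-MON-YYYY');
```

Both queries ask for the same month, but only the second exposes the partition key directly to the optimizer.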
Partition Exchange loading

DBA
1. Create external table for flat files
2. Use CTAS command to create nonpartitioned table TMP_SALES
3. Create indexes
4. Alter table Sales exchange partition May_24_2008 with table tmp_sales
5. Collect stats
Sales table now has all the data
Sales Table (partitions): May 18th 2008, May 19th 2008, May 20th 2008, May 21st 2008, May 22nd 2008, May 23rd 2008, May 24th 2008
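The five steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the presenters' script: the directory object load_dir, the file name, the external table name sales_ext, the index name, and the column list are all hypothetical. It also assumes SALES has the same three columns and a matching LOCAL index on sales_date.

```sql
-- 1. External table over the flat file (names and layout are assumed).
CREATE TABLE sales_ext (
  sale_id       NUMBER,
  sales_date    DATE,
  sales_amount  NUMBER(10,2)
)
ORGANIZATION EXTERNAL (
  TYPE ORACLE_LOADER
  DEFAULT DIRECTORY load_dir
  ACCESS PARAMETERS (
    RECORDS DELIMITED BY NEWLINE
    FIELDS TERMINATED BY ','
    MISSING FIELD VALUES ARE NULL
    (
      sale_id,
      sales_date CHAR(10) DATE_FORMAT DATE MASK "MM/DD/YYYY",
      sales_amount
    )
  )
  LOCATION ('sales_may_24_2008.csv')
)
REJECT LIMIT UNLIMITED;

-- 2. CTAS into the non-partitioned staging table.
CREATE TABLE tmp_sales AS
SELECT sale_id, sales_date, sales_amount
FROM   sales_ext;

-- 3. Index the staging table to match the assumed LOCAL index on SALES.
CREATE INDEX tmp_sales_date_ix ON tmp_sales (sales_date);

-- 4. Swap the staging table with the target partition (a dictionary operation).
ALTER TABLE sales
  EXCHANGE PARTITION may_24_2008 WITH TABLE tmp_sales
  INCLUDING INDEXES;

-- 5. Collect statistics for the newly loaded partition.
EXEC DBMS_STATS.GATHER_TABLE_STATS(USER, 'SALES', partname => 'MAY_24_2008');
```

The exchange only swaps segment ownership in the data dictionary, so no rows are physically moved. The empty partition May_24_2008 must already exist on SALES before step 4.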
Unusable Indexes
Unusable index partitions are commonly used in environments with fast load requirements
Save the time for index maintenance at data insertion
Unusable index segments do not consume any space (11.2)
Unusable indexes are ignored by the optimizer

SKIP_UNUSABLE_INDEXES = [TRUE | FALSE ]
Partitioned indexes can be used by the optimizer even if some partitions are unusable
Prior to 11.2, this required static pruning that accessed only usable index partitions
With 11.2, intelligent rewrite of queries using UNION ALL
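A brief sketch of how an index partition is typically marked unusable around a load and rebuilt afterwards. The index name sales_date_ix is hypothetical, and the sketch assumes a LOCAL index on SALES whose partitions carry the same names as the table partitions.

```sql
-- Let queries ignore unusable indexes instead of raising an error.
ALTER SESSION SET skip_unusable_indexes = TRUE;

-- Skip index maintenance for the partition that is about to be loaded.
ALTER INDEX sales_date_ix MODIFY PARTITION may_24_2008 UNUSABLE;

-- (bulk load into partition MAY_24_2008 happens here)

-- Rebuild only the affected index partition after the load.
ALTER INDEX sales_date_ix REBUILD PARTITION may_24_2008;

-- Check which index partitions are currently usable.
SELECT partition_name, status
FROM   user_ind_partitions
WHERE  index_name = 'SALES_DATE_IX'
ORDER  BY partition_position;
```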
Intelligent Multi-Branch Execution
Intelligent UNION ALL expansion in the presence of

partially unusable indexes
Transparent internal rewrite
Usable index partitions will be used
Full partition access for unusable index partitions
Multi-Branch Execution
Sample plan
Statistics Gathering
You must gather optimizer statistics

Using dynamic sampling is not an adequate solution
Statistics on global and partition level recommended
Run all queries against empty tables to populate

column usage
This helps identify which columns automatically get
histograms created on them
Optimizer statistics should be gathered after the data

has been loaded but before any indexes are created
Oracle will automatically gather statistics for indexes as they
are being created
Efficient Statistics Management
Use AUTO_SAMPLE_SIZE
The only setting that enables new efficient statistics collection
Hash based algorithm, scanning the whole table
Speed of sampling, accuracy of compute
Enable incremental global statistics collection

Avoids scan of all partitions after changing single partitions
Prior to 11.1, scan of all partitions necessary for global stats
Managed on per table level
Static setting
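As a small illustration of the AUTO_SAMPLE_SIZE recommendation above, the call below spells the setting out explicitly. The schema and table names follow the SH.SALES example used elsewhere in this deck; the granularity value is an assumption chosen for the sketch.

```sql
BEGIN
  DBMS_STATS.GATHER_TABLE_STATS(
    ownname          => 'SH',
    tabname          => 'SALES',
    estimate_percent => DBMS_STATS.AUTO_SAMPLE_SIZE,  -- let Oracle pick the sample
    granularity      => 'AUTO');                      -- assumed: global and partition level
END;
/
```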
Incremental Global Statistics

1. Partition level stats are gathered & synopsis created
2. Global stats generated by aggregating partition synopsis
Sales Table (partitions): May 18th 2008, May 19th 2008, May 20th 2008, May 21st 2008, May 22nd 2008, May 23rd 2008
Synopses are stored in the Sysaux Tablespace
Incremental Global Statistics Contd

3. A new partition is added to the table & data is loaded
4. Gather partition statistics for new partition
5. Retrieve synopsis for each of the other partitions from Sysaux
6. Global stats generated by aggregating the original partition synopsis with the new one
Sales Table (partitions): May 18th 2008, May 19th 2008, May 20th 2008, May 21st 2008, May 22nd 2008, May 23rd 2008, May 24th 2008
Sysaux Tablespace
Steps necessary to gather accurate statistics

Turn on incremental feature for the table
EXEC DBMS_STATS.SET_TABLE_PREFS('SH','SALES','INCREMENTAL','TRUE');
After load gather table statistics using GATHER_TABLE_STATS

No need to specify parameters
EXEC DBMS_STATS.GATHER_TABLE_STATS('SH','SALES');
The command will collect statistics for partitions and update the global
statistics based on the partition level statistics and synopsis
Possible to set incremental to true for all tables
Only works for already existing tables
EXEC DBMS_STATS.SET_GLOBAL_PREFS('INCREMENTAL','TRUE');
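A short, hedged sketch for checking the result of the steps above. It reads the current INCREMENTAL preference for SH.SALES and lists when the global and partition-level statistics were last gathered. The choice of columns is an assumption about what is useful to inspect.

```sql
-- Current value of the INCREMENTAL preference for the table.
SELECT DBMS_STATS.GET_PREFS('INCREMENTAL', 'SH', 'SALES') AS incremental
FROM   dual;

-- Global row (partition_name is NULL) plus one row per partition.
SELECT partition_name, num_rows, last_analyzed
FROM   all_tab_statistics
WHERE  owner = 'SH'
AND    table_name = 'SALES'
ORDER  BY partition_position;
```

After a load into a single partition followed by GATHER_TABLE_STATS, only that partition and the global row would be expected to show a fresh last_analyzed value.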
Partition Advisor SQL Access Advisor
Summary
Demo (Performance & availability)
Partitioning Demonstration
Data Partitioning provides significant Service Benefits

Date 16/03/2010
Scenario Partitioning for Reliability
Two interactive scenarios will demonstrate:

Query performance between Partitioned vs NonPartitioned data.
Query resilience against unanticipated events

affecting data availability.
Demo Data Overview : Sales Information

Below tables hold same sales information, with different storage structures
Size: 5,513,058 sales entries
Table: SALES_p1
Partitioning Scheme used:
Initially Partitioned into yearly, half-yearly, and quarterly periods
Further Partitioned by country regions
Table: SALES_nop1
For comparison purposes, a similar non-partitioned table is created.
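The demo DDL is not included in the deck. The following is only a guess at what such a scheme could look like: range partitions of decreasing width (yearly, half-yearly, quarterly) with list subpartitions by region. The table name, columns, region values, and boundary dates are all hypothetical.

```sql
-- Hypothetical range-list composite layout; not the actual SALES_p1 definition.
CREATE TABLE sales_p1_sketch (
  time_id         DATE,
  country_region  VARCHAR2(30),
  amount_sold     NUMBER(10,2)
)
PARTITION BY RANGE (time_id)
SUBPARTITION BY LIST (country_region)
SUBPARTITION TEMPLATE (
  SUBPARTITION europe    VALUES ('Europe'),
  SUBPARTITION americas  VALUES ('Americas'),
  SUBPARTITION other     VALUES (DEFAULT)
)
(
  PARTITION sales_1998     VALUES LESS THAN (DATE '1999-01-01'),  -- yearly
  PARTITION sales_h1_1999  VALUES LESS THAN (DATE '1999-07-01'),  -- half-yearly
  PARTITION sales_h2_1999  VALUES LESS THAN (DATE '2000-01-01'),
  PARTITION sales_q1_2000  VALUES LESS THAN (DATE '2000-04-01'),  -- quarterly
  PARTITION sales_q2_2000  VALUES LESS THAN (DATE '2000-07-01')
);
```

Older data sits in wider partitions and recent data in narrower ones, which keeps the partition count manageable while still allowing fine-grained pruning on current periods.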
Demo Data Overview : Customer Information

Below tables hold same customer information, with different storage structures
Size: 832,500 customer entries
Table: CUSTOMER_p1
Partitioning Scheme used:
Partitioned by country regions
Table: CUSTOMER_nop1
For comparison purposes, a similar table is created as non-partitioned.
Demo Distribution of Customers across Countries
[Bar chart: COUNT(C.CUST_ID) per country, vertical axis from 0 to 300000. Countries shown: Brazil, Denmark, Poland, South Africa, China, United Kingdom, New Zealand, Saudi Arabia, United States of America, Germany, Spain, France, Australia, Canada, Singapore, Argentina, Italy, Japan, Turkey]
Demonstration Infrastructure
Equipment
Amazon Cloud-based Virtual Machine Image (AMI)

2 Core, 1.7 GB Memory
Oracle Linux OS (OEL 5)
Software Configuration of Public Amazon AMI
Oracle 11g Enterprise Edition (v11.1.0.7)

One disk /u02 (dev8-2) dedicated to Oracle storage I/O for
benchmark accuracy.
Oracle Sample Data (i.e. SH repository) installed and
extended for demo purposes.
Scenario 1 Key Performance Benefits
Areas highlighted in red show the resulting overhead of accessing normal table structures.
Areas highlighted in green reveal the overhead benefits of accessing partitions instead of the whole table.
Note: Above graph details were generated by sar directives from a VMWare image installed on a notebook. The Demo will use an Amazon AMI.
Scenario 2 Key Availability Benefits

Database files that hold data involved in below queries have been accidentally removed!
Non Partitioned Table: For a particular date range, i.e. 1999 - 2000, relevant non-partitioned tables are not reachable, resulting in an error.
Partitioned Table: As expected, for a particular date range, i.e. 1999 - 2000, data in a partitioned table is not reachable, resulting in an error.
Note: sales_nop2 data resides in example tablespace which in-turn references datafile example01.dbf.
Scenario 2 Key Availability Benefits (Cont.)
related to database file: example01_01.dbf
related to missing file: example02_01.dbf
related to database file: example03_01.dbf
Non Partitioned Table: For a more current date range, i.e. 2000 - 2001, data is still not reachable, resulting in an error.
Partitioned Table: For more current information, i.e. 2000 - 2001, data is now reachable as a result of using partitions to better isolate against data disruptions.
Demo Summary
Performance improvement - Scenario 1
Up to 3 times faster than traditional methods of data retrieval.
Availability Enhancement - Scenario 2
Limits detrimental effects of data access failures.
In General
Improves system scalability and manageability.
Adds a data-level layer of protection to any High Availability strategy.
Data Partitioning provides significant Service Benefits
Next Steps
Upcoming Webinars
More coming soon!!!
http://otn.oracle.com/database
Follow OracleDirect ANZ on Twitter at
http://www.twitter.com/OracleDirectANZ
Or our NEW!!! blog http://blogs.oracle.com/techtalk
