Académique Documents
Professionnel Documents
Culture Documents
PPI is used to improve performance for large tables when you submit queries that specify a range
constraint.
PPI allows you to reduce the number of rows to be processed by using partition elimination.
PPI will increase performance for incremental data loads, deletes, and data access
when working with large tables with range constraints
Lets take Order_Table, where we have both January and February dates in column Order_Date.
The Order_Table spread across the AMPs.Notice that January and February dates are mixed on every
AMP in what is a random order. This is because the Primary Index is Order_Number.
When we apply Range Query, that means it uses the keyword BETWEEN.
The BETWEEN keyword in Teradata means find everything in the range BETWEEN this date and this
other date. We had no indexes on Order_Date so it is obvious the PE will command the AMPs to do a Full
Table Scan. To avoid full table scan, we will Partition the table.
Each AMP always sorts its rows by the Row-ID in order to do a Binary Search on Primary Index queries.
Notice that the rows on an AMP dont change AMPs because the table is partitioned. Remember it is the
Primary Index alone that will determine which AMP gets a row. If the table is partitioned then the AMP will
sort its rows by the partition.
Now we are running our Range Query on our Partitioned Table,each AMP only reads from one partition.
The Parsing Engine will not to do full table scan. It instructs the AMPs to each read from their January
Partition. You Partition a Table when you CREATE the Table.
A Partitioned Table is designed to eliminate a Full Table Scan, especially on Range Queries.
Types of partitioning:
RANGE_N Partitioning
Case_N Partitioning
CREATE TABLE ORDER_TABLE
(
ORDER_NO INTEGER NOT NULL,
CUST_NO INTERGER,
ORDER_DATE DATE,
ORDER_TOTAL DECIMAL(10,2)
)
PRIMARY INDEX(ORDER_NO)
PARTITION BY CASE_N
(ORDER_TOTAL < 1000,
ORDER_TOTAL < 2000,
ORDER_TOTAL < 5000,
ORDER_TOTAL < 10000,
ORDER_TOTAL < 20000,
NO CASE, UNKNOWN);
The UNKNOWN Partition is for an Order_Total with a NULL value. The NO CASE Partition is for partitions
that did not meet the CASE criteria.
For example, if an Order_Total is greater than 20,000 it wouldnt fall into any of the partitions so it goes to
the NO CASE partition.
Multi-Level Partitioning:
You can have up to 15 levels of partitions within partitions.
CREATE TABLE ORDER_TABLE
(
ORDER_NO INTEGER NOT NULL,
CUST_NO INTERGER,
ORDER_DATE DATE,
ORDER_TOTAL DECIMAL(10,2)
)
PRIMARY INDEX(ORDER_NO)
PARTITION BY (RANGE_N
(ORDER_DATE BETWEEN
DATE '2012-01-01' AND DATE '2012-12-31'
EACH INTERVAL '1' DAY)
CASE_N (ORDER_TOTAL < 5000,
ORDER_TOTAL < 10000,
ORDER_TOTAL < 15000,
ORDER_TOTAL < 20000,
NO CASE, UNKNOWN));
'I ','J ','K ','L ','M ','N ','O ','P ','Q ','R ','S ','T ',
'U ','V ','W ','X ','Y ','Z ' AND 'ZZ',UNKNOWN));
Partitioning Rules:
Partitioning determines how an AMP will sort the row on its own.
A table cannot have an UPI as the Primary Index if the Partition table does not include PI.
They provide efficient searches by using partition elimination at the various levels or combination
of levels.
They provide multiple access paths to the data, and an MLPPI provides even more partition
elimination and more partitioning expression choices, (i.e., you can use last name or some other
value that is more readily available to query on.)
The Primary Index may be either a UPI or a NUPI; a NUPI allows local joins to other similar
entities
Row hash locks are used for SELECT with equality conditions on the PI columns.
They allow for range queries without having to use a secondary index.
May be created on Volatile tables; global temp tables, base tables, and non-compressed join
indexes.