Partitioning Sessions
You can improve session performance by creating multiple partitions of the pipeline so that a single session processes data in parallel. If the PowerCenter partitioning option is available, you can increase the number of partitions in a pipeline. Each additional partition allows the Integration Service to open another connection to the source and to process partitions of the source data concurrently.
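The idea of processing partitions of source data concurrently can be illustrated outside PowerCenter with a minimal Python sketch. It is not how the Integration Service is implemented; the names run_session and process_partition, and the doubling transformation, are hypothetical stand-ins chosen for the example.

```python
from concurrent.futures import ThreadPoolExecutor

def process_partition(rows):
    # Hypothetical stand-in for the per-partition transformation logic.
    return [row * 2 for row in rows]

def run_session(rows, num_partitions):
    # Split the source rows into contiguous slices, one per partition.
    size = max(1, -(-len(rows) // num_partitions))  # ceiling division
    slices = [rows[i:i + size] for i in range(0, len(rows), size)]
    # Process each slice on its own thread and collect results in order.
    with ThreadPoolExecutor(max_workers=num_partitions) as pool:
        results = list(pool.map(process_partition, slices))
    return [row for part in results for row in part]
```

Calling run_session([1, 2, 3, 4, 5], 2) splits the rows into two slices, processes them on separate threads, and returns the combined rows in their original order.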
[Figure: Session Partition (labels: THREAD 1, THREAD 2, WRITER)]
Partition Types
Round-robin Partitioning
Hash Partitioning
Key Range Partitioning
Pass-through Partitioning
Round-robin Partitioning
The Integration Service distributes data evenly among all partitions. Use round-robin partitioning when you need to distribute rows evenly and do not need to group data among partitions.
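Round-robin distribution can be sketched in a few lines of Python. This is an illustration of the general technique, not PowerCenter code; the function name round_robin_partition is made up for the example.

```python
def round_robin_partition(rows, num_partitions):
    # One output list per partition.
    partitions = [[] for _ in range(num_partitions)]
    for i, row in enumerate(rows):
        # Row i goes to partition i mod n, so partition sizes differ by at most one.
        partitions[i % num_partitions].append(row)
    return partitions
```

Rows are assigned by position alone, so rows with the same key value may land in different partitions. That is why this type suits cases where grouping is not required.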
Hash Partitioning
The Integration Service uses a hash function to group rows of data among partitions, based on a partition key. There are two types of hash partitioning:
Hash auto-keys. The Integration Service uses all grouped or sorted ports as a compound partition key. You can use hash auto-keys partitioning at or before Rank, Sorter, and unsorted Aggregator transformations to ensure that rows are grouped properly before they enter these transformations.
Hash user keys. You specify the ports that form the partition key.
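A minimal Python sketch of hashing on a compound partition key follows. The hash function PowerCenter actually uses is not stated here, so CRC32 is an assumed stand-in, and the names hash_partition and key_ports are hypothetical.

```python
import zlib

def hash_partition(rows, key_ports, num_partitions):
    # rows are dicts; key_ports lists the ports that form the compound key.
    partitions = [[] for _ in range(num_partitions)]
    for row in rows:
        key = tuple(row[port] for port in key_ports)
        # CRC32 is an assumed stand-in for the real hash function.
        h = zlib.crc32(repr(key).encode("utf-8"))
        # Equal keys produce equal hashes, so they share a partition.
        partitions[h % num_partitions].append(row)
    return partitions
```

Because the partition index depends only on the key values, every row with the same key reaches the same partition, which is the property that Rank, Sorter, and Aggregator transformations rely on.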
Key Range Partitioning
With this type of partitioning, you specify one or more ports to form a compound partition key for a source or target. The Integration Service then passes data to each partition depending on the ranges you specify for each port.
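Key range routing can be sketched as follows for a single port. The sketch assumes each range includes its start value and excludes its end value, and it collects rows that match no range in a separate list; both choices, and the name key_range_partition, are assumptions made for the example rather than documented PowerCenter behavior.

```python
def key_range_partition(rows, port, ranges):
    # ranges is a list of (start, end) pairs, one per partition.
    # Assumption: start is inclusive and end is exclusive.
    partitions = [[] for _ in ranges]
    unmatched = []
    for row in rows:
        for i, (start, end) in enumerate(ranges):
            if start <= row[port] < end:
                partitions[i].append(row)
                break
        else:
            # No range matched this row's key value.
            unmatched.append(row)
    return partitions, unmatched
```

For example, with ranges [(0, 100), (100, 200)] on a hypothetical id port, a row with id 100 goes to the second partition and a row with id 250 is reported as unmatched.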
Pass-through Partitioning
In this type of partitioning, the Integration Service passes all rows at one partition point to the next partition point without redistributing them.
To obtain the expected results and the best performance when partitioning a Sorter or Aggregator transformation, you must group and sort the data. To group data, ensure that rows with the same key value are routed to the same partition. The best way to ensure that data is grouped and distributed evenly among partitions is to add a hash auto-keys partition point.
Hash functions can also be used to locate records in a large file that have similar keys. For that purpose, one needs a hash function that maps similar keys to hash values that differ by at most m, where m is a small integer (say, 1 or 2). Such a hash function places similar records in the same bucket or in adjacent buckets.
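One simple function with this property, for integer keys, is integer division by a bucket width: two keys that differ by no more than the width get hash values that differ by at most 1. The sketch below is an illustration only; bucket_hash, group_similar, and the width parameter are hypothetical names.

```python
from collections import defaultdict

def bucket_hash(key, width):
    # Keys within `width` of each other map to values that differ by at most 1.
    return key // width

def group_similar(keys, width):
    # Collect keys by bucket so that similar keys sit together.
    buckets = defaultdict(list)
    for key in keys:
        buckets[bucket_hash(key, width)].append(key)
    return dict(buckets)
```

With a width of 10, the keys 101, 104, and 109 share bucket 10, while 98 falls in the adjacent bucket 9.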
Summary
This presentation covered:
Problem Definition
Informatica Partitions
Approach to the performance tuning challenge