
Informatica PowerCenter on Grid for Greater Performance and Scalability


ETL Design, Performance Tips

Informatica has developed a solution that leverages the power of grid computing for greater data
integration scalability and performance. The grid option delivers load balancing, dynamic
partitioning, parallel processing and high availability to ensure optimal scalability,
performance and reliability. In this article, let's discuss how to set up an Informatica workflow
to run on a grid.

What is PowerCenter On Grid


Performance Improvement Features
Pushdown Optimization
Pipeline Partitions
Dynamic Partitions
Concurrent Workflows
Grid Deployments
Workflow Load Balancing
When a PowerCenter domain contains multiple nodes, you can configure workflows and sessions to run on a
grid. When you run a workflow on a grid, the Integration Service runs a service process on each available
node of the grid to increase performance and scalability. When you run a session on a grid, the Integration
Service distributes session threads to multiple DTM processes on nodes in the grid to increase performance
and scalability.
Domain: A PowerCenter domain consists of one or more nodes in the grid environment. PowerCenter
services run on the nodes. A domain is the foundation for PowerCenter service administration.
Node: A node is a logical representation of a physical machine that runs a PowerCenter service.

Admin Console with Grid Configuration


The Informatica Admin Console below shows a two-node grid configuration. We can see the two nodes,
Node_1 and Node_2, and the grid Node_GRID created from them. The Integration Service
Int_service_GRID is running on the grid.
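
To make the relationship between domain, nodes and grid concrete, the two-node setup described above can be modeled in a few lines of Python. This is only an illustrative sketch: the class names, the methods and the domain name Domain_Example are hypothetical and are not part of any Informatica API. Only Node_1, Node_2 and Node_GRID come from the configuration above.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(frozen=True)
class Node:
    """Logical representation of a machine that runs PowerCenter services."""
    name: str


@dataclass
class Grid:
    """A named group of nodes that an Integration Service can be assigned to."""
    name: str
    nodes: List[Node] = field(default_factory=list)


@dataclass
class Domain:
    """Administrative unit that owns the nodes and the grids built from them."""
    name: str
    nodes: Dict[str, Node] = field(default_factory=dict)
    grids: Dict[str, Grid] = field(default_factory=dict)

    def add_node(self, name: str) -> Node:
        node = Node(name)
        self.nodes[name] = node
        return node

    def create_grid(self, name: str, node_names: List[str]) -> Grid:
        # A grid may only reference nodes that already belong to the domain.
        missing = [n for n in node_names if n not in self.nodes]
        if missing:
            raise ValueError(f"unknown nodes: {missing}")
        grid = Grid(name, [self.nodes[n] for n in node_names])
        self.grids[name] = grid
        return grid


domain = Domain("Domain_Example")  # hypothetical domain name
domain.add_node("Node_1")
domain.add_node("Node_2")
grid = domain.create_grid("Node_GRID", ["Node_1", "Node_2"])
print([node.name for node in grid.nodes])
```

The point of the model is the containment order: nodes belong to a domain first, and a grid is just a named selection of those nodes.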
Setting up Workflow on Grid
When you set up a workflow to run on a grid, the Integration Service distributes workflows across the
nodes in the grid. It also distributes the Session, Command, and predefined Event-Wait tasks within
workflows across the nodes in the grid.

You can set up the workflow to run on a grid as shown in the image below. To do so, assign the workflow
to an Integration Service that is configured to run on the grid.
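
The task distribution described in this section can be pictured with a small Python sketch. It rests on two assumptions that are not taken from the product: tasks are handed out in simple round-robin order (the real Load Balancer weighs resources and availability), and every task type other than Session, Command and Event-Wait stays on a single master node. The function name and the task names are hypothetical.

```python
from itertools import cycle

# Task types the article says are distributed across grid nodes.
DISTRIBUTABLE = {"Session", "Command", "Event-Wait"}


def assign_tasks(tasks, nodes, master):
    """Map each task name to a node name.

    tasks  -- list of (task_name, task_type) pairs
    nodes  -- node names in the grid
    master -- node assumed to run the non-distributable tasks
    """
    rotation = cycle(nodes)
    placement = {}
    for name, task_type in tasks:
        if task_type in DISTRIBUTABLE:
            placement[name] = next(rotation)  # round-robin is an assumption
        else:
            placement[name] = master
    return placement


tasks = [  # hypothetical tasks in one workflow
    ("s_Load_CUST_STG", "Session"),
    ("cmd_Archive_Files", "Command"),
    ("dec_Check_Status", "Decision"),
    ("s_Load_CUST_DIM", "Session"),
]
placement = assign_tasks(tasks, ["Node_1", "Node_2"], master="Node_1")
for task, node in placement.items():
    print(f"{task} -> {node}")
```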
Setting up Session on Grid
When you run a session on a grid, the Integration Service distributes session threads across nodes in a grid.
The Load Balancer distributes session threads to DTM processes running on different nodes. You might want
to configure a session to run on a grid when the workflow contains a session that takes a long time to run.

You can set up the session to run on a grid as shown in the image below.
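
As a rough analogy for how one long-running session is split into threads that work in parallel, the Python sketch below partitions rows by key and processes each partition in its own worker. It is a simplification under stated assumptions: a thread pool on one machine stands in for DTM processes on separate nodes, the row layout and the upper-casing transformation are invented for illustration, and none of the function names come from PowerCenter.

```python
from concurrent.futures import ThreadPoolExecutor


def partition_rows(rows, partition_count):
    """Key-based partitioning: rows with the same cust_id land together."""
    partitions = [[] for _ in range(partition_count)]
    for row in rows:
        partitions[row["cust_id"] % partition_count].append(row)
    return partitions


def transform_partition(rows):
    """Stand-in for the work a single partition thread would do."""
    return [{**row, "name": row["name"].upper()} for row in rows]


def run_session(rows, nodes):
    """Process one partition per node and return the results keyed by node."""
    partitions = partition_rows(rows, len(nodes))
    with ThreadPoolExecutor(max_workers=len(nodes)) as pool:
        # pool.map preserves input order, so results line up with nodes.
        results = list(pool.map(transform_partition, partitions))
    return dict(zip(nodes, results))


rows = [{"cust_id": i, "name": f"cust{i}"} for i in range(1, 7)]
output = run_session(rows, ["Node_1", "Node_2"])
for node, processed in output.items():
    print(node, [row["cust_id"] for row in processed])
```

Because each partition is independent, adding a node adds a worker, which is why a slow session is the typical candidate for running on a grid.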
Workflow Running on Grid
The Workflow Monitor screenshot below shows a workflow running on a grid. In the 'Task Progress Details'
window, you can see that two of the sessions in the workflow wf_Load_CUST_DIM run on Node_1 and the other
one runs on Node_2.
Key Features and Advantages of Grid
• Load Balancing: When data processing spikes, load balancing keeps operations smooth by
switching the data processing between nodes on the grid. The node is chosen dynamically based on
process size, CPU utilization, memory requirements and so on.
• High Availability: Grid complements the High Availability feature of PowerCenter by switching the
master node in case of a node failure. This keeps monitoring running and shortens the time needed for
recovery processes.
• Dynamic Partitioning: Dynamic Partitioning helps make the best use of the nodes currently available on
the grid. By adapting to available resources, it also helps increase the performance of the whole ETL
process.
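
The three features above can be tied together in a short Python sketch. The selection rule (lowest CPU utilization among nodes with enough free memory) and the partition policy (one partition per available node) are assumptions chosen for illustration, not the documented behavior of the PowerCenter Load Balancer. The NodeStatus class, its fields and the sample figures are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class NodeStatus:
    name: str
    cpu_utilization: float  # fraction between 0.0 and 1.0
    free_memory_mb: int
    available: bool = True


def pick_node(nodes, required_memory_mb):
    """Load balancing: choose the least busy node that can hold the task."""
    candidates = [
        n for n in nodes
        if n.available and n.free_memory_mb >= required_memory_mb
    ]
    if not candidates:
        return None  # no node can take the task right now
    return min(candidates, key=lambda n: n.cpu_utilization)


def dynamic_partition_count(nodes):
    """Dynamic partitioning: assume one partition per available node."""
    return max(1, sum(1 for n in nodes if n.available))


nodes = [  # sample figures, invented for this example
    NodeStatus("Node_1", cpu_utilization=0.85, free_memory_mb=4096),
    NodeStatus("Node_2", cpu_utilization=0.30, free_memory_mb=2048),
]

small_task_node = pick_node(nodes, required_memory_mb=1024)
large_task_node = pick_node(nodes, required_memory_mb=3000)
partitions_before = dynamic_partition_count(nodes)

# Simulate a node failure: work moves to the surviving node and the
# partition count adapts to what is still available.
nodes[1].available = False
failover_node = pick_node(nodes, required_memory_mb=1024)
partitions_after = dynamic_partition_count(nodes)

print(small_task_node.name, large_task_node.name, failover_node.name)
print(partitions_before, partitions_after)
```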

We hope you enjoyed this article. Please leave us a comment or feedback if you have any; we are happy to
hear from you.
