
2/20/2018 Informatica Dynamic Lookup Cache

Informatica Dynamic Lookup Cache


Written by Saurav Mitra | Last Updated: 29 May 2016

A lookup cache does not change its data once built. But what if the underlying table upon which the
lookup was done changes after the lookup cache is created? Is there a way to keep the cache
up-to-date even when the underlying table changes?

Why do we need Dynamic Lookup Cache?


Let's think about this scenario. You are loading your target table through a mapping. Inside the mapping
you have a Lookup and in the Lookup, you are actually looking up the same target table you are loading.

You may ask me, "So? What's the big deal? We all do it quite often...". And yes you are right.

There is no "big deal" because Informatica (generally) caches the lookup table at the very beginning of the
mapping, so whatever records get inserted into the target table through the mapping will have no effect
on the lookup cache.

The lookup will still hold the previously cached data, even if the underlying target table is changing.

But what if you want your Informatica Lookup cache to get updated as and when the data in the
underlying target table changes?

What if you want your lookup cache to always show the exact snapshot of the data in your target table at
that point in time? Clearly this requirement will not be fulfilled if you use a static cache. You will need
a dynamic cache to handle this.

But in which scenario will someone need to use a dynamic cache? To understand this, let's first understand
a static cache scenario.

Static Lookup Cache Scenario


Let's suppose you run a retail business and maintain all your customer information in a customer master
table (an RDBMS table). Every night, all the customers from your customer master table are loaded into a
Customer Dimension table in your data warehouse. Your source customer table is a transaction-system
table, probably in 3rd normal form, and does not store history. Meaning, if a customer changes his address,
the old address is overwritten with the new one.

But your data warehouse table stores the history (maybe in the form of SCD Type-II). There is a mapping
that loads your data warehouse table from the source table. Typically you do a Lookup on the target (static
cache) and check every incoming customer record to determine whether the customer already exists in the
target. If the customer does not exist in the target, you conclude the customer is new and INSERT the
record, whereas if the customer already exists, you may want to UPDATE the target record with this new
record (if the record has changed). This scenario - commonly known as the 'UPSERT' (update else insert)
scenario - is illustrated below.

https://dwbi.org/etl/informatica/138-dynamic-lookup-cache

A static Lookup Cache to determine if a source record is new or updatable

You don't need dynamic Lookup cache for the above type of scenario.
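To make the decision logic concrete, here is a minimal Python sketch of that static-cache UPSERT check. This is purely illustrative - Informatica performs this internally - and the dict-based cache, function names, and field names are my own assumptions:

```python
# Illustrative sketch only: a static lookup cache modeled as a Python dict.
# The cache is built ONCE from the target table and never changes during the run.

def build_static_cache(target_rows):
    """Snapshot the target table into a {customer_id: row} lookup cache."""
    return {row["customer_id"]: row for row in target_rows}

def upsert_flag(cache, source_row):
    """Return 'UPDATE' if the customer already exists in the target, else 'INSERT'."""
    return "UPDATE" if source_row["customer_id"] in cache else "INSERT"

target = [{"customer_id": 1, "name": "John", "address": "Old Street"}]
cache = build_static_cache(target)

print(upsert_flag(cache, {"customer_id": 1, "name": "John", "address": "New Street"}))  # UPDATE
print(upsert_flag(cache, {"customer_id": 2, "name": "Mary", "address": "Elm Street"}))  # INSERT
```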

Dynamic Lookup Cache Scenario


Notice in the previous example I mentioned that your source table is an RDBMS table. Generally speaking,
this ensures that your source table does not have any duplicate records.

But what if you had a flat file as the source, with many duplicate records in the same batch of data that
you are trying to load? (Even an RDBMS table may contain duplicate records.)

Would the scenario be the same if the batch of data I am loading contains duplicates?

Unfortunately, not. Let's understand why from the illustration below. As you can see, the new
customer "Linda" has been entered twice in the source system - most likely by mistake. The customer
"Linda" is not present in your target system and hence does not exist in the target-side lookup cache.

When you try to load the target table, Informatica processes row 3 and inserts it into the target, as
customer "Linda" does not exist there. Then Informatica processes row 4 and again inserts "Linda" into the
target, since the lookup's static cache cannot detect that the customer "Linda" has already been inserted.
This results in duplicate rows in the target.
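The duplicate-insert problem can be reproduced with a few lines of illustrative Python (the row values are hypothetical); note how the static cache never learns about the first insert:

```python
# Illustrative sketch only: a static cache snapshot taken before the load starts.
static_cache = {101: "John", 102: "Mary"}       # target snapshot; no "Linda" yet
target_inserts = []

source_rows = [(103, "Linda"), (103, "Linda")]  # the same new customer, twice

for cust_id, name in source_rows:
    # The cache is never updated mid-run, so the second "Linda" also looks new.
    if cust_id not in static_cache:
        target_inserts.append((cust_id, name))

print(target_inserts)  # [(103, 'Linda'), (103, 'Linda')] -- duplicates reach the target
```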

The problem arising from the above scenario can be resolved by using a dynamic lookup cache.

Here are some more examples where you may consider using a dynamic lookup:

Updating a master customer table with both new and updated customer information coming together
as shown above
Loading data into a slowly changing dimension table and a fact table at the same time. Remember,
you typically look up the dimension while loading the fact, so you load the dimension table before
loading the fact table. But using a dynamic lookup, you can load both simultaneously.

Loading data from a file with many duplicate records, and eliminating the duplicates in the target by
updating the duplicate row, i.e. keeping the most recent row or the initial row
Loading the same data from multiple sources using a single mapping. Just consider the previous
retail business example: if you have more than one shop and Linda has visited two of your shops
for the first time, customer record Linda will come twice during the same load.

How does a dynamic lookup cache work?


Once you have configured your lookup to use a dynamic cache (we will see below how to do that), whenever
the Integration Service reads a row from the source, it updates the lookup cache by performing one of the
following actions:

Inserts the row into the cache: If the incoming row is not in the cache, the Integration Service
inserts the row in the cache based on input ports or generated Sequence-ID. The Integration Service
flags the row as insert.
Updates the row in the cache: If the row exists in the cache, the Integration Service updates the
row in the cache based on the input ports. The Integration Service flags the row as update.
Makes no change to the cache: This happens when the row exists in the cache but the lookup is
configured to insert new rows only; or the row is not in the cache and the lookup is configured to
update existing rows only; or the row is in the cache but, based on the lookup condition, nothing
changes. The Integration Service flags the row as unchanged.

Notice that the Integration Service actually flags the rows based on the above three conditions.

And that's a great thing, because if you know the flag, you can reroute the row to implement
different logic.

Fortunately, as soon as you create a dynamic lookup, Informatica adds one extra port to the lookup. This
new port is called:

NewLookupRow

Using the value of this port, rows can be routed for insert, update, or no action. You just need to
use a Router or Filter transformation followed by an Update Strategy.

Oh, I forgot to tell you the actual values that you can expect in the NewLookupRow port:

0 = Integration Service does not update or insert the row in the cache.
1 = Integration Service inserts the row into the cache.
2 = Integration Service updates the row in the cache.

When the Integration Service reads a row, it changes the lookup cache depending on the results of the
lookup query and the Lookup transformation properties you define. It assigns the value 0, 1, or 2 to the
NewLookupRow port to indicate if it inserts or updates the row in the cache, or makes no change.
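The three actions and the NewLookupRow flag can be modeled with a short Python sketch. This is a simplified stand-in for the Integration Service's behaviour, and the function and field names are assumptions for illustration:

```python
# Illustrative sketch only: a dynamic lookup cache that flags each source row.
NO_CHANGE, INSERT, UPDATE = 0, 1, 2  # NewLookupRow values

def process_row(cache, row, insert_new=True, update_existing=True):
    """Update the cache for one source row and return its NewLookupRow flag."""
    key = row["customer_id"]
    if key not in cache:
        if not insert_new:
            return NO_CHANGE          # lookup configured to update existing rows only
        cache[key] = dict(row)
        return INSERT
    if not update_existing or cache[key] == row:
        return NO_CHANGE              # insert-only lookup, or nothing changed
    cache[key] = dict(row)
    return UPDATE

cache = {}
linda = {"customer_id": 103, "name": "Linda", "city": "Boston"}
print(process_row(cache, linda))        # 1 -> first "Linda" row is inserted into the cache
print(process_row(cache, dict(linda)))  # 0 -> the duplicate is detected; no change
print(process_row(cache, {"customer_id": 103, "name": "Linda", "city": "Austin"}))  # 2 -> updated
```

Because the cache is updated row by row, the duplicate "Linda" from the earlier flat-file scenario is flagged 0 instead of being inserted twice.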

Configuring a Dynamic Lookup - Mapping Example


OK, I have designed a mapping for you to show the dynamic lookup implementation, and I have included a
full screenshot of it. Since the screenshot is slightly big, I link it below; just click to expand the image.


If you check the mapping screenshot, you will see I have used a Router to reroute the INSERT group and the
UPDATE group. The Router screenshot is also given below. New records are routed to the INSERT group and
existing records to the UPDATE group.


Dynamic Lookup Sequence ID


While using a dynamic lookup cache, we must associate each lookup/output port with an input/output port
or a sequence ID. The Integration Service uses the data in the associated port to insert or update rows in
the lookup cache. The Designer associates the input/output ports with the lookup/output ports used in the
lookup condition.

When we select Sequence-ID in the Associated Port column, the Integration Service generates a sequence
ID for each row it inserts into the lookup cache.

When the Integration Service creates the dynamic lookup cache, it tracks the range of values in the cache
associated with any port using a sequence ID, and when inserting a new row of data into the cache, it
generates a key for the port by incrementing the greatest existing sequence ID value by one.

When the Integration Service reaches the maximum number for a generated sequence ID, it starts over at
one and increments each sequence ID by one until it reaches the smallest existing value minus one. If the
Integration Service runs out of unique sequence ID numbers, the session fails.
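Under the stated rules, the wrap-around behaviour can be sketched as follows. This is an illustrative model, not Informatica's code, and it uses an artificially small maximum of 10 so the wrap is visible (the real limit is far larger):

```python
# Illustrative sketch only: Sequence-ID generation with wrap-around.

def next_sequence_id(existing_ids, max_id=10):
    """Greatest existing ID + 1; on overflow, restart at 1 and probe upward."""
    if not existing_ids:
        return 1
    candidate = max(existing_ids) + 1
    if candidate <= max_id:
        return candidate
    # Wrap: count up from 1 until just below the smallest existing value.
    for candidate in range(1, min(existing_ids)):
        if candidate not in existing_ids:
            return candidate
    raise RuntimeError("no unique sequence IDs left -- the session fails")

print(next_sequence_id({3, 4, 5}))    # 6
print(next_sequence_id({9, 10}, 10))  # 1  (wrapped around past the maximum)
```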

Dynamic Lookup Ports


The lookup/output port output value depends on whether we choose to output old or new values when the
Integration Service updates a row:

Output old values on update: The Integration Service outputs the value that existed in the cache
before it updated the row.
Output new values on update: The Integration Service outputs the updated value that it writes in
the cache. The lookup/output port value matches the input/output port value.

Note: We can configure to output old or new values using the Output Old Value On Update
transformation property.

Handling NULL in dynamic LookUp


If the input value is NULL and we select the Ignore Null inputs for Update property for the associated input
port, the input value will not equal the lookup value or the value coming out of the input/output port.
When you select the Ignore Null property, the lookup cache and the target table might become
unsynchronized if you pass null values to the target, so you must verify that you do not pass null values to
the target.

When you update a dynamic lookup cache and target table, the source data might contain some null
values. The Integration Service can handle the null values in the following ways:

Insert null values: The Integration Service uses the null values from the source and updates the lookup
cache and target table using all values from the source.
Ignore Null inputs for Update property: The Integration Service ignores the null values in the
source and updates the lookup cache and target table using only the non-null values from the source.

If we know the source data contains null values, and we do not want the Integration Service to update the
lookup cache or target with null values, then we need to check the Ignore Null property for the
corresponding lookup/output port.


When we choose to ignore NULLs, we must verify that we output the same values to the target that the
Integration Service writes to the lookup cache. We can configure the mapping based on the value we want
the Integration Service to output from the lookup/output ports when it updates a row in the cache, so that
the lookup cache and the target table do not become unsynchronized:

New values: Connect only lookup/output ports from the Lookup transformation to the target.
Old values: Add an Expression transformation after the Lookup transformation and before the Filter
or Router transformation. Add output ports in the Expression transformation for each port in the
target table and create expressions to ensure that we do not output null input values to the target.
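The two NULL-handling modes can be illustrated with a small Python sketch (a simplified model, not Informatica's actual implementation; None stands in for NULL and the port names are hypothetical):

```python
# Illustrative sketch only: updating a cached row with or without ignoring NULLs.

def update_cached_row(cached, incoming, ignore_nulls):
    """Merge an incoming source row into the cached row, port by port."""
    for port, value in incoming.items():
        if value is None and ignore_nulls:
            continue              # Ignore Null: keep the existing cached value
        cached[port] = value      # otherwise the source value (even NULL) wins
    return cached

row = {"name": "Linda", "city": "Boston"}
print(update_cached_row(dict(row), {"name": "Linda", "city": None}, ignore_nulls=True))
# {'name': 'Linda', 'city': 'Boston'} -- the NULL is ignored; old city retained
print(update_cached_row(dict(row), {"name": "Linda", "city": None}, ignore_nulls=False))
# {'name': 'Linda', 'city': None}     -- the NULL is written through
```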

Some other details about Dynamic Lookup


When we run a session that uses a dynamic lookup cache, the Integration Service compares the values
in all lookup ports with the values in their associated input ports by default.

It compares the values to determine whether or not to update the row in the lookup cache. When a value in
an input port differs from the value in the lookup port, the Integration Service updates the row in the
cache.

But what if we don't want to compare all ports?

We can choose the ports we want the Integration Service to ignore when it compares ports. The Designer
only enables this property for lookup/output ports when the port is not used in the lookup condition. We
can improve performance by ignoring some ports during comparison. (Learn how to improve performance
of lookup transformation (/etl/informatica/158-tuning-informatica-lookup) here)

We might want to do this when the source data includes a column that indicates whether or not the row
contains data we need to update. Select the Ignore in Comparison property for all lookup ports except
the port that indicates whether or not to update the row in the cache and target table.

Note: We must configure the Lookup transformation to compare at least one port; otherwise, the
Integration Service fails the session when we ignore all ports.
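A minimal sketch of this comparison logic, assuming a dict-based row model (the port names are hypothetical); it also shows why ignoring every port must fail:

```python
# Illustrative sketch only: deciding whether a cached row changed,
# skipping ports flagged "Ignore in Comparison".

def row_changed(cached, incoming, ignore_ports=()):
    """Return True if any compared port differs between cache and source."""
    compared = [p for p in cached if p not in ignore_ports]
    if not compared:
        raise ValueError("at least one port must be compared -- session would fail")
    return any(cached[p] != incoming[p] for p in compared)

cached   = {"customer_id": 103, "city": "Boston", "load_ts": "2016-01-01"}
incoming = {"customer_id": 103, "city": "Boston", "load_ts": "2016-05-29"}

print(row_changed(cached, incoming))                            # True  (timestamp differs)
print(row_changed(cached, incoming, ignore_ports={"load_ts"}))  # False (difference ignored)
```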
