
Q. What is the difference between $ and $$ in a mapping or parameter file? In which cases are they generally used?

A. The $ prefix denotes session parameters and variables; the $$ prefix denotes mapping parameters and variables.
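As a rough illustration of the naming convention only, the following Python sketch classifies a name by its prefix. The function name `classify` and the sample names are hypothetical, chosen for this example; nothing here is an Informatica API.

```python
def classify(name: str) -> str:
    """Classify a parameter/variable name by its prefix (illustrative only)."""
    # Check "$$" first, because every "$$" name also starts with "$".
    if name.startswith("$$"):
        return "mapping"
    if name.startswith("$"):
        return "session"
    return "unknown"

# Hypothetical names, used only to exercise the function.
names = ["$$LoadDate", "$DBConnection_Src", "LoadDate"]
kinds = {n: classify(n) for n in names}
```

The order of the two checks matters: testing for a single `$` first would label every mapping parameter as a session parameter.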

Q. What are Target Types on the Server?

A. Target Types are File, Relational and ERP.

Q. How do you identify existing rows of data in the target table using a Lookup transformation?

A. There are two ways to look up the target table to verify whether a row exists:
1. Use a connected Lookup with a dynamic cache, then check the value of the NewLookupRow output port to decide whether the incoming record already exists in the table / cache.
2. Use an unconnected Lookup, call it from an Expression transformation, and check the value it returns (Null / Not Null) to decide whether the incoming record already exists in the table.
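The first approach can be sketched in Python as follows. This is a simplified model, not Informatica code: the cache is a plain dictionary, and the return codes 0, 1 and 2 (no change, insert, update) are assumed here to mirror the NewLookupRow values. The function and constant names are hypothetical.

```python
# Assumed NewLookupRow-style codes for this sketch.
NO_CHANGE, INSERT, UPDATE = 0, 1, 2

def dynamic_lookup(cache: dict, key, values: dict) -> int:
    """Look up `key` in the cache and keep the cache in sync with the row."""
    if key not in cache:
        cache[key] = dict(values)   # row is new: add it to the cache
        return INSERT
    if cache[key] != values:
        cache[key] = dict(values)   # row exists but has changed
        return UPDATE
    return NO_CHANGE                # row exists and is identical

cache = {}
first = dynamic_lookup(cache, 101, {"name": "Ann"})
second = dynamic_lookup(cache, 101, {"name": "Ann"})
third = dynamic_lookup(cache, 101, {"name": "Anne"})
```

A downstream step can then branch on the returned code, for example inserting when it is 1 and updating when it is 2.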

Q. What are Aggregator transformations?

A. The Aggregator transformation is much like the GROUP BY clause in traditional SQL.

It is a connected, active transformation that takes the incoming data from the mapping pipeline, groups it by the specified group-by ports, and calculates aggregate functions (AVG, SUM, COUNT, STDDEV, etc.) for each of those groups.

From a performance perspective, if your mapping has an Aggregator transformation, place any filters and sorters you need as early in the pipeline as possible.
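The grouping behaviour described above can be sketched in Python. This is only an analogy for what the Aggregator does; the function name `aggregate` and the sample rows are made up for the example.

```python
from collections import defaultdict

def aggregate(rows, group_by, value):
    """Group rows by one column and compute SUM, COUNT and AVG per group."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[group_by]].append(row[value])
    return {
        key: {"sum": sum(vals), "count": len(vals), "avg": sum(vals) / len(vals)}
        for key, vals in groups.items()
    }

# Hypothetical input rows.
rows = [
    {"dept": "A", "salary": 100},
    {"dept": "A", "salary": 300},
    {"dept": "B", "salary": 50},
]
result = aggregate(rows, "dept", "salary")
```

Note that this version has to hold every group in memory until the input ends, which is why filtering rows out before the aggregation step reduces the work.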

Q. What are the various types of Aggregation?

A. The various types of aggregation are SUM, AVG, COUNT, MAX, MIN, FIRST, LAST, MEDIAN, PERCENTILE, STDDEV, and VARIANCE.

Q. What are Dimensions and the various types of Dimension?

A. Slowly Changing Dimensions (SCD) are classified into 3 types.

1. SCD Type 1: contains only current data; a change overwrites the old value.

2. SCD Type 2: contains current data + complete historical data.

3. SCD Type 3: contains current data + limited history (typically the previous value, kept in a separate column).
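The three types can be contrasted with a small Python sketch. It assumes a dimension with a single tracked attribute (`city`); the column names `previous_city`, `start_date`, `end_date` and `current` are illustrative choices, not a prescribed schema.

```python
from datetime import date

def scd1(dim, key, new_city):
    """Type 1: overwrite in place, no history kept."""
    dim[key]["city"] = new_city

def scd2(history, key, new_city, today):
    """Type 2: close the current row and append a new current row."""
    for row in history:
        if row["key"] == key and row["current"]:
            row["current"] = False
            row["end_date"] = today
    history.append({"key": key, "city": new_city, "start_date": today,
                    "end_date": None, "current": True})

def scd3(dim, key, new_city):
    """Type 3: keep only the previous value in a separate column."""
    row = dim[key]
    row["previous_city"] = row["city"]
    row["city"] = new_city

dim1 = {1: {"city": "Pune"}}
scd1(dim1, 1, "Delhi")

hist = [{"key": 1, "city": "Pune", "start_date": date(2020, 1, 1),
         "end_date": None, "current": True}]
scd2(hist, 1, "Delhi", date(2024, 1, 1))

dim3 = {1: {"city": "Pune", "previous_city": None}}
scd3(dim3, 1, "Delhi")
```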

Q. What are the 2 modes of data movement in the Integration service?

A. The data movement mode depends on whether the Integration service (IS) should process single-byte or multi-byte character data. The mode selected can affect the enforcement of code page relationships and code page validation in the Informatica Client and Server.

a) Unicode - the IS allows 2 bytes for each character and uses an additional byte for each non-ASCII character (such as Japanese characters).

b) ASCII - the IS holds all data in a single byte.

The IS data movement mode can be changed in the Integration service configuration parameters. The change takes effect once you restart the Integration service.

Q. What is Code Page Compatibility?

A. Compatibility between code pages is required for accurate data movement when the Informatica Server runs in the Unicode data movement mode. If the code pages are identical, there will be no data loss. One code page can be a subset or superset of another. For accurate data movement, the target code page must be a superset of the source code page.

Superset - A code page is a superset of another code page when it contains all the characters encoded in the other code page and also contains additional characters not contained in the other code page.

Subset - A code page is a subset of another code page when all characters in the code page are encoded in the other code page.

Q. What is a Code Page used for?

A. A code page is used to identify characters that might be in different languages. If you are importing Japanese data into a mapping, you must select the Japanese code page of the source data.
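The superset rule can be illustrated with Python's built-in codecs, used here as a stand-in for code pages. The helper name `can_move_without_loss` is hypothetical; the check simply asks whether every character of the text can be encoded in the target encoding.

```python
def can_move_without_loss(text: str, target_encoding: str) -> bool:
    """Return True if every character in `text` exists in the target encoding."""
    try:
        text.encode(target_encoding)  # strict mode raises on unmappable characters
        return True
    except UnicodeEncodeError:
        return False

japanese = "日本語"
ascii_ok = can_move_without_loss("abc", "ascii")
japanese_to_ascii = can_move_without_loss(japanese, "ascii")
japanese_to_sjis = can_move_without_loss(japanese, "shift_jis")
japanese_to_utf8 = can_move_without_loss(japanese, "utf-8")
```

Japanese text fits a Japanese or Unicode target but not an ASCII one, which is the data loss the compatibility rule is meant to prevent.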

Q. What is the Router transformation?

A. It differs from the Filter transformation in that we can specify multiple conditions and route the data to multiple targets depending on the condition.

Q. What is the Load Manager?

A. While running a workflow, the PowerCenter Server uses the Load Manager process and the Data Transformation Manager (DTM) process to run the workflow and carry out workflow tasks. When the PowerCenter Server runs a workflow, the Load Manager performs the following tasks:

1. Locks the workflow and reads workflow properties.
2. Reads the parameter file and expands workflow variables.
3. Creates the workflow log file.
4. Runs workflow tasks.
5. Distributes sessions to worker servers.
6. Starts the DTM to run sessions.
7. Runs sessions from master servers.
8. Sends post-session email if the DTM terminates abnormally.

When the PowerCenter Server runs a session, the DTM performs the following tasks:
1. Fetches session and mapping metadata from the repository.
2. Creates and expands session variables.
3. Creates the session log file.
4. Validates session code pages if data code page validation is enabled. Checks query conversions if data code page validation is disabled.
5. Verifies connection object permissions.
6. Runs pre-session shell commands.
7. Runs pre-session stored procedures and SQL.
8. Creates and runs mapping, reader, writer, and transformation threads to extract, transform, and load data.
9. Runs post-session stored procedures and SQL.
10. Runs post-session shell commands.
11. Sends post-session email.

Q. What is the Data Transformation Manager?

A. After the Load Manager performs validations for the session, it creates the DTM process. The DTM process is the second process associated with the session run. The primary purpose of the DTM process is to create and manage threads that carry out the session tasks.

• The DTM allocates process memory for the session and divides it into buffers. This is also known as buffer memory. It creates the main thread, which is called the master thread. The master thread creates and manages all other threads.
• If we partition a session, the DTM creates a set of threads for each partition to allow concurrent processing. When the Integration service writes messages to the session log, it includes the thread type and thread ID.

Following are the types of threads that the DTM creates:

Master Thread - Main thread of the DTM process. Creates and manages all other threads.

Mapping Thread - One thread for each session. Fetches session and mapping information.

Pre and Post Session Thread - One thread each to perform pre-session and post-session operations.

Reader Thread - One thread for each partition for each source pipeline.

Writer Thread - One thread for each partition, if a target exists in the source pipeline, to write to the target.

Transformation Thread - One or more transformation threads for each partition.

Q. What are Sessions and Batches?

A. Session - A session is a set of instructions that tells the Integration service how and when to move data from sources to targets. After creating the session, we can use either the Server Manager or the command line program pmcmd to start or stop the session.

Batches - A batch provides a way to group sessions for either serial or parallel execution by the Integration service. There are two types of batches:
1. Sequential - Runs sessions one after the other.
2. Concurrent - Runs sessions at the same time.

Q. What is a source qualifier?

A. It represents all data queried from the source.

Q. Why do we use Lookup transformations?

A. Lookup transformations can access data from relational tables that are not sources in the mapping. With a Lookup transformation, we can accomplish the following tasks:

• Get a related value - for example, get the Employee Name from the Employee table based on the Employee ID.

• Perform a calculation.

• Update slowly changing dimension tables - we can use an unconnected Lookup transformation to determine whether the records already exist in the target.

Q. While importing a relational source definition from a database, what metadata of the source do you import?
A. Source name
Database location
Column names
Data types
Key constraints

Q. In how many ways can you update a relational source definition, and what are they?
A. Two ways:
1. Edit the definition
2. Reimport the definition

Q. Where should you place the flat file to import the flat file definition into the Designer?
A. Place it in a local folder.

Q. Which transformation do you need when using COBOL sources as source definitions?
A. The Normalizer transformation, which is used to normalize the data, since COBOL sources often consist of denormalized data.

Q. How can you create or import a flat file definition into the Warehouse Designer?
A. You can create a flat file definition in the Warehouse Designer. In the Warehouse Designer, create a new target and select flat file as the type. Save it, then enter the columns for the target by editing its properties. Once the target is created, save it. You can import it from the Mapping Designer.

Q. What is a mapplet?
A. A mapplet is a set of transformations whose logic can be reused. It should have a Mapplet Input transformation, which receives input values, and a Mapplet Output transformation, which passes the final modified data back to the mapping. When the mapplet is displayed within a mapping, only its input and output ports are shown, so the internal logic is hidden from the end user.

Q. What is a transformation?
A. It is a repository object that generates, modifies or passes data.

Q. What are the Designer tools for creating transformations?
A. Mapping Designer
Transformation Developer
Mapplet Designer

Q. What are connected and unconnected transformations?

A. Connected: A connected transformation participates in the mapping data flow. It can receive multiple inputs and provide multiple outputs.

Unconnected: An unconnected transformation does not participate in the mapping data flow. It can receive multiple inputs and provides a single output.

Q. In how many ways can you create ports?
A. Two ways:
1. Drag the port from another transformation.
2. Click the add button on the Ports tab.

Q. What are reusable transformations?
A. A transformation that can be reused is called a reusable transformation. Reusable transformations can be created using two methods:
1. Use the Transformation Developer.
2. Create a normal transformation and promote it to reusable.

Q. What are mapping parameters and mapping variables?
A. A mapping parameter represents a constant value that you can define before running a session. A mapping parameter retains the same value throughout the entire session.
When you use a mapping parameter, you declare and use the parameter in a mapping or mapplet, then define the value of the parameter in a parameter file for the session.
Unlike a mapping parameter, a mapping variable represents a value that can change throughout the session. The Integration service saves the value of a mapping variable to the repository at the end of the session run and uses that value the next time you run the session.
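The difference can be sketched in Python, with a dictionary standing in for the repository. This is an analogy only: `run_session`, the `region` parameter and the `$$LastId` variable name are hypothetical and chosen to show an incremental load.

```python
def run_session(repository: dict, load_region: str, rows: list) -> list:
    """Simulate one session run.

    `load_region` plays the role of a mapping parameter: fixed for the run.
    `$$LastId` plays the role of a mapping variable: it changes during the
    run and is saved back to the repository for the next run.
    """
    last_id = repository.get("$$LastId", 0)
    selected = [r for r in rows if r["region"] == load_region and r["id"] > last_id]
    if selected:
        repository["$$LastId"] = max(r["id"] for r in selected)
    return selected

repository = {}
rows = [{"id": 1, "region": "EU"}, {"id": 2, "region": "EU"}]
first_run = run_session(repository, "EU", rows)

rows.append({"id": 3, "region": "EU"})
second_run = run_session(repository, "EU", rows)
```

The second run picks up only the row added after the first run, because the saved variable value carries over between runs while the parameter stays the same.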

Q. Can you use the mapping parameters or variables created in one mapping in another mapping?
A. No.
We can use mapping parameters or variables in any transformation of the same mapping or mapplet in which they were created.

Q. How can you improve session performance in the Aggregator transformation?

A. 1. Use sorted input. Use a Sorter before the Aggregator.

2. Do not forget to check the option on the Aggregator that tells it the input is sorted on the same keys as the group-by ports. The key order is also very important.
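Why sorted input helps, and why the key order matters, can be seen with Python's `itertools.groupby`, which (like a sorted-input aggregation) only compares each row with the previous one. This is an analogy, not Informatica's implementation; the function name and rows are illustrative.

```python
from itertools import groupby
from operator import itemgetter

def sorted_aggregate(rows, key):
    """Stream over rows already sorted by `key`, emitting one total per group."""
    for k, group in groupby(rows, key=itemgetter(key)):
        # Only the current group is held at a time; a group is finished
        # as soon as the key changes.
        yield k, sum(r["amount"] for r in group)

sorted_rows = [
    {"dept": "A", "amount": 1},
    {"dept": "A", "amount": 2},
    {"dept": "B", "amount": 5},
]
unsorted_rows = [
    {"dept": "A", "amount": 1},
    {"dept": "B", "amount": 5},
    {"dept": "A", "amount": 2},
]

good = list(sorted_aggregate(sorted_rows, "dept"))
bad = list(sorted_aggregate(unsorted_rows, "dept"))
```

With unsorted input the same key is emitted twice with partial totals, which shows why the data must really be sorted on the group-by keys when the sorted-input option is enabled.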

Q. What is the aggregate cache in the Aggregator transformation?

A. The Aggregator stores data in the aggregate cache until it completes the aggregate calculations. When you run a session that uses an Aggregator transformation, the Integration service creates index and data caches in memory to process the transformation. If the Integration service requires more space, it stores overflow values in cache files.

Q. What is the difference between the Joiner transformation and the Source Qualifier transformation?
A. You can join heterogeneous data sources in a Joiner transformation, which cannot be done in a Source Qualifier transformation.
You need matching keys to join two relational sources in a Source Qualifier transformation, whereas you do not need matching keys to join two sources in a Joiner.
In a Source Qualifier, the two relational sources must come from the same data source. In a Joiner, you can also join relational sources that come from different data sources.

Q. In which conditions can we not use Joiner transformations?
A. You cannot use a Joiner transformation in the following situations (according to Informatica 7.1):
• Either input pipeline contains an Update Strategy transformation.
• You connect a Sequence Generator transformation directly before the Joiner transformation.

Q. What settings do you use to configure the Joiner transformation?
A. Master and detail source
Type of join
Condition of the join

Q. What are the join types in the Joiner transformation?
A. Normal (default) -- only matching rows from both master and detail
Master outer -- all detail rows and only matching rows from master
Detail outer -- all master rows and only matching rows from detail
Full outer -- all rows from both master and detail (matching or non-matching)
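The four join types can be sketched in Python. This is a simplified model that indexes the master rows first and then streams the detail rows; the function name `joiner` and the join-type strings are hypothetical. Unmatched rows simply lack the other side's columns here, whereas a real target would see NULLs.

```python
def joiner(master, detail, key, join_type="normal"):
    """Join detail rows against an index built from the master rows."""
    index = {}
    for m in master:
        index.setdefault(m[key], []).append(m)

    out, matched_keys = [], set()
    for d in detail:
        matches = index.get(d[key], [])
        if matches:
            matched_keys.add(d[key])
            out.extend({**m, **d} for m in matches)
        elif join_type in ("master_outer", "full_outer"):
            out.append(dict(d))          # keep unmatched detail row
    if join_type in ("detail_outer", "full_outer"):
        for k, rows in index.items():
            if k not in matched_keys:
                out.extend(dict(m) for m in rows)  # keep unmatched master rows
    return out

master = [{"id": 1, "m": "a"}, {"id": 2, "m": "b"}]
detail = [{"id": 2, "d": "x"}, {"id": 3, "d": "y"}]

normal = joiner(master, detail, "id", "normal")
master_outer = joiner(master, detail, "id", "master_outer")
detail_outer = joiner(master, detail, "id", "detail_outer")
full_outer = joiner(master, detail, "id", "full_outer")
```

Because only the master side is held in the index, choosing the smaller source as master keeps the in-memory structure small.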
Q. What are the joiner caches?
A. When a Joiner transformation occurs in a session, the
Integration service reads all the records from the master
source and builds index and data caches based on the
master rows.
After building the caches, the Joiner transformation reads
records from the detail source and performs joins.

Q. Why use the Lookup transformation?

A. To perform the following tasks:
• Get a related value. For example, your source table includes the employee ID, but you want to include the employee name in your target table to make your summary data easier to read.
• Perform a calculation. Many normalized tables include values used in a calculation, such as gross sales per invoice or sales tax, but not the calculated value (such as net sales).
• Update slowly changing dimension tables. You can use a Lookup transformation to determine whether records already exist in the target.

Q. What is meant by lookup caches?

A. The Integration service builds a cache in memory when it processes the first row of data in a cached Lookup transformation. It allocates memory for the cache based on the amount you configure in the transformation or session properties. The Integration service stores condition values in the index cache and output values in the data cache.

Q. What are the types of lookup caches?

A. Persistent cache: You can save the lookup cache files and reuse them the next time the Integration service processes a Lookup transformation configured to use the cache.

Recache from database: If the persistent cache is not synchronized with the lookup table, you can configure the Lookup transformation to rebuild the lookup cache.

Static cache: You can configure a static, or read-only, cache for any lookup table. By default, the Integration service creates a static cache. It caches the lookup table and looks up values in the cache for each row that comes into the transformation. When the lookup condition is true, the Integration service returns a value from the cache; it does not update the cache while it processes the Lookup transformation.

Dynamic cache: If you want to cache the target table and insert new rows into the cache and the target, you can configure a Lookup transformation to use a dynamic cache. The Integration service dynamically inserts data into the target table.

Shared cache: You can share the lookup cache between multiple transformations. You can share an unnamed cache between transformations in the same mapping.

Q: What do you know about Informatica and ETL?
A: Informatica is a very useful GUI-based ETL tool.

Q: FULL and DELTA files. Historical and ongoing load.
A: A FULL file contains complete data as of today, including history data; a DELTA file contains only the changes since the last extract.

Q: PowerCenter / PowerMart - which products have you worked with?
A: PowerCenter has global and local repositories, whereas PowerMart has only a local repository.

Q: Which tools have you used in PowerCenter and/or PowerMart?
A: Designer, Server Manager, and Repository Manager.

Q: What is a Mapping?
A: A mapping represents the data flow between source and target.

Q: What components must a Mapping contain?
A: Source definition, Transformation, Target definition and Connectors.

Q: What is a Transformation?
A: A transformation is a repository object that generates, modifies, or passes data. Each transformation performs a specific function. There are two types of transformations:
1. Active
Can change the number of rows that pass through it. E.g.: Aggregator, Filter, Joiner, Normalizer, Rank, Router, Source Qualifier, Update Strategy, ERP Source Qualifier, Advanced External Procedure.
2. Passive
Does not change the number of rows that pass through it. E.g.: Expression, External Procedure, Input, Lookup, Stored Procedure, Output, Sequence Generator, XML Source Qualifier.

Q: Which transformations can be overridden at the Server?
A: Source Qualifier and Lookup transformations.

Q: What are connected and unconnected transformations? Give examples.

Q: What are the options/types for running a Stored Procedure?
A:
• Normal: During a session, the stored procedure runs where the transformation exists in the mapping, on a row-by-row basis. This is useful for calling the stored procedure for each row of data that passes through the mapping, such as running a calculation against an input port. Connected stored procedures run only in normal mode.
• Pre-load of the Source: Before the session retrieves data from the source, the stored procedure runs. This is useful for verifying the existence of tables or performing joins of data in a temporary table.
• Post-load of the Source: After the session retrieves data from the source, the stored procedure runs. This is useful for removing temporary tables.
• Pre-load of the Target: Before the session sends data to the target, the stored procedure runs. This is useful for verifying target tables or disk space on the target system.
• Post-load of the Target: After the session sends data to the target, the stored procedure runs. This is useful for re-creating indexes on the database.

The transformation must contain at least one input and one output port.

Q: What kinds of sources and targets can be used in Informatica?
A:
• Sources may be flat files, relational databases or XML.
• Targets may be relational tables, XML or flat files.

Q: Transformations: What are the different transformations you have worked with?
A:
• Source Qualifier (XML, ERP, MQ)
• Joiner
• Expression
• Lookup
• Filter
• Router
• Sequence Generator
• Aggregator
• Update Strategy
• Stored Procedure
• External Procedure
• Advanced External Procedure
• Rank
• Normalizer

Q: What are active/passive transformations?
A: Passive transformations do not change the number of rows passing through them, whereas active transformations can change the number of rows passing through them.
Active: Filter, Aggregator, Rank, Joiner, Source Qualifier
Passive: Expression, Lookup, Stored Procedure, Sequence Generator

Q: What are connected/unconnected transformations?
A:
• Connected transformations are part of the mapping pipeline. Their input and output ports are connected to other transformations.
• Unconnected transformations are not part of the mapping pipeline. They are not linked in the mapping through any input or output ports. For example, in an unconnected Lookup you can pass multiple values to the transformation, but only one column of data is returned from it. Unconnected: Lookup, Stored Procedure.

Q: In target load ordering, what do you order - Targets or Source Qualifiers?
A: Source Qualifiers. If there are multiple targets in the mapping that are populated from multiple sources, we can use target load ordering.

Q: Have you used constraint-based load ordering? Where do you set this?
A: Constraint-based loading can be used when you have multiple targets in the mapping and the target tables have a PK-FK relationship in the database. It is set in the session properties: set the source option “Treat Rows as: INSERT” and check the “Constraint based load ordering” box on the Advanced tab.

Q: If you have a FULL file that you have to match and load into a corresponding table, how will you go about it? Will you use a Joiner transformation?
A: Use a Joiner and join the file and the Source Qualifier.

Q: If you have 2 files to join, which file will you use as the master file?
A: Use the file with fewer records as the master file.

Q: If a Sequence Generator (with an increment of 1) is connected to (say) 3 targets and each target uses the NEXTVAL port, what value will each target get?
A: Each target will get values in multiples of 3.

Q: Have you used the ABORT and DECODE functions?
A: ABORT can be used to abort / stop the session on an error condition. If the primary key column contains NULL and you need to stop the session from continuing, you may use the ABORT function in the default value for the port. It can be used with the IIF and DECODE functions to abort the session.

Q: Have you used SQL Override?
A: It is used to override the default SQL generated in the Source Qualifier / Lookup transformation.

Q: If you make a local transformation reusable by mistake, can you undo the reusable action?
A: No.

Q: What is the difference between the Filter and Router transformations?
A: A Filter can filter records based on ONE condition only, whereas a Router can be used to filter records on multiple conditions.
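The contrast can be sketched in Python. This is an analogy: `filter_rows` and `route_rows` are hypothetical helpers, and the "DEFAULT" group for rows that match no condition is an assumption of this sketch.

```python
def filter_rows(rows, condition):
    """Filter: one condition; rows that fail it are dropped."""
    return [r for r in rows if condition(r)]

def route_rows(rows, groups):
    """Router: test each row against every group condition in one pass."""
    routed = {name: [] for name in groups}
    routed["DEFAULT"] = []
    for r in rows:
        hit = False
        for name, condition in groups.items():
            if condition(r):
                routed[name].append(r)
                hit = True
        if not hit:
            routed["DEFAULT"].append(r)  # matched no group condition
    return routed

rows = [{"amount": 5}, {"amount": 50}, {"amount": 500}]
large_only = filter_rows(rows, lambda r: r["amount"] >= 100)
routed = route_rows(rows, {
    "SMALL": lambda r: r["amount"] < 10,
    "LARGE": lambda r: r["amount"] >= 100,
})
```

Getting the same result with filters alone would need one filter per condition, each reading the full input again.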

Q: Lookup transformations: cached/uncached
A: When the Lookup transformation is cached, the Integration service caches the data and index. This is done at the beginning of the session, before reading the first record from the source. If the Lookup is uncached, Informatica reads the data from the database for every record coming from the Source Qualifier.

Q: Connected/unconnected - if there is no match for the lookup, what is returned?
A: An unconnected Lookup returns NULL if no matching record is found in the Lookup transformation. A connected Lookup can return the default values specified for its output ports.

Q: What is a persistent cache?
A: When the Lookup is configured to use a persistent cache, the Integration service does not delete the cache files after completion of the session. In the next run, the Integration service uses the cache files from the previous session.

Q: What is the dynamic lookup strategy?
A: The Integration service compares the incoming data with the lookup cache; if no matching record is found in the cache, it modifies the cache files by inserting the record. You may use only equality (=) in the lookup condition.
If multiple matches are found in the lookup, Informatica fails the session. By default, the Integration service creates a static cache.

Q: Mapplets: What are the 2 transformations used only in mapplets?
A: Mapplet Input / Source Qualifier, Mapplet Output

Q: Have you used Shortcuts?
A: Shortcuts may be used to refer to another mapping. Informatica refers to the original mapping. If any changes are made to the mapping / mapplet, they are immediately reflected in the mapping where it is used.

Q: If you used a database when importing sources/targets that was dropped later on, will your mappings still be valid?
A: No.

Q: In an Expression transformation, how can you store a value from the previous row?
A: By creating a variable in the transformation.
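The idea can be sketched in Python: a variable keeps its value between rows, so reading it before overwriting it yields the previous row's value. The function name `with_previous` is hypothetical, and `None` stands in for whatever initial value the variable would have.

```python
def with_previous(values):
    """Pair each value with the value from the previous row."""
    previous = None          # state that survives from one row to the next
    out = []
    for current in values:
        out.append((current, previous))  # read the old value first
        previous = current               # then store the current one
    return out

pairs = with_previous([10, 20, 30])
```

The order of the two steps is the whole trick: if the variable were assigned before being read, every row would see its own value instead of the previous one.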

Q: How does Informatica do variable initialization? Number/String/Date
A: Number - 0, String - blank, Date - 1/1/1753

Q: Have you used the Informatica debugger?
A: The debugger is used to test the mapping during development. You can set breakpoints in the mapping and analyze the data.

Q: What do you know about the Integration service architecture? Load Manager, DTM, Reader, Writer, Transformer.
A:
• The Load Manager is the first process started when the session runs. It checks the validity of mappings and locks sessions and other objects.
• The DTM process is started once the Load Manager has completed its job. It starts a thread for each pipeline.
• The Reader scans data from the specified sources.
• The Writer manages the target/output data.
• The Transformer performs the tasks specified in the mapping.

Q: Have you used partitioning in sessions? (not available with PowerMart)
A: It is available in PowerCenter. It can be configured in the session properties.

Q: Have you used an external loader? What is the difference between normal and bulk loading?
A: An external loader performs a direct data load to the table/data files, bypasses the SQL layer and does not log the data. During a normal data load, data passes through the SQL layer and is logged to the archive log file, and as a result it is slower.

Q: Do you enable/disable decimal arithmetic in session properties?
A: Disabling decimal arithmetic improves session performance, but it converts numeric values to double, thus leading to reduced accuracy.
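The accuracy loss from converting a decimal to a double can be shown with Python's `decimal` module and `float`, which is a 64-bit double. The sample value is arbitrary and chosen only because it has more significant digits than a double can hold.

```python
from decimal import Decimal

exact = Decimal("1234567890123456789.25")  # 21 significant digits
as_double = float(exact)                   # converted to a 64-bit double
round_trip = Decimal(as_double)            # exact value the double really holds

lost_precision = round_trip != exact

# Values that a double can represent exactly survive the conversion.
small = Decimal("0.5")
small_survives = Decimal(float(small)) == small
```

A double carries roughly 15 to 17 significant decimal digits, so long keys or monetary totals are where the trade-off between speed and accuracy tends to show up.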

Q: When would you use multiple Update Strategy transformations in a mapping?
A: When you would like to insert and update the records in a Type 2 dimension table.

Q: When would you truncate the target before running the session?
A: When we want to load the entire data set, including history, in one shot. In that case the Update Strategy uses only DD_INSERT, not DD_UPDATE or DD_DELETE.

Q: How do you use the Stored Procedure transformation in a mapping?
A: Inside the mapping we can use a Stored Procedure transformation, pass input parameters and get back the output parameters. When handled through the session, it can be invoked in either pre-session or post-session scripts.

Q: What did you do in the stored procedure? Why did you use
stored proc instead of using expression?
A:

Q: When would you use SQ, Joiner and Lookup?
A:
• If we are using multiple source tables and they are related at the database, we can use a single SQ.
• If we need to look up values in a table or update slowly changing dimension tables, we can use a Lookup transformation.
• A Joiner is used to join heterogeneous sources, e.g. a flat file and relational tables.

Q: How do you create a batch load? What are the different types of batches?
A: A batch is created in the Server Manager. It contains multiple sessions. First create the sessions and then create a batch. Drag the sessions into the batch from the session list window.
Batches may be sequential or concurrent. A sequential batch runs the sessions one after the other. A concurrent batch runs the sessions in parallel, thus optimizing the server resources.

Q: How did you handle reject data? What file does Informatica create for bad data?
A: Informatica saves the rejected data in a .bad file. Informatica adds a row identifier for each rejected record, indicating whether the row was rejected by the Writer or the Target. Additionally, each column has an indicator specifying whether the data was rejected due to overflow, null, truncation, etc.

Q: How did you handle runtime errors? If the session stops abnormally, how were you managing the reload process?

Q: Have you used the pmcmd command? What can you do using this command?
A: pmcmd is a command line program. Using this command you can:
• Start sessions
• Stop sessions
• Recover sessions

Q: What are the two default repository user groups?
A: Administrators and Public

Q: What are the privileges of the Default Repository and Extended Repository user?
A:
• Default Repository Privileges
  o Use Designer
  o Browse Repository
  o Create Session and Batches
• Extended Repository Privileges
  o Session Operator
  o Administer Repository
  o Administer Server
  o Super User

Q: How many different locks are available for repository objects?
A: There are five kinds of locks available on repository objects:
• Read lock. Created when you open a repository object in a folder for which you do not have write permission. Also created when you open an object with an existing write lock.
• Write lock. Created when you create or edit a repository object in a folder for which you have write permission.
• Execute lock. Created when you start a session or batch, or when the Integration service starts a scheduled session or batch.
• Fetch lock. Created when the repository reads information about repository objects from the database.
• Save lock. Created when you save information to the repository.

Q: What is the Session Process?
A: The Load Manager process. It starts the session, creates the DTM process, and sends post-session email when the session completes.

Q: What is the DTM process?
A: The DTM process creates threads to initialize the session; read, write, and transform data; and handle pre-session and post-session operations.

Q: When the Integration service runs a session, what are the tasks handled?
A:
• Load Manager (LM):
  o LM locks the session and reads session properties.
  o LM reads the parameter file.
  o LM expands the server and session variables and parameters.
  o LM verifies permissions and privileges.
  o LM validates source and target code pages.
  o LM creates the session log file.
  o LM creates the DTM (Data Transformation Manager) process.
• Data Transformation Manager (DTM):
  o DTM process allocates DTM process memory.
  o DTM initializes the session and fetches the mapping.
  o DTM executes pre-session commands and procedures.
  o DTM creates reader, transformation, and writer threads for each source pipeline. If the pipeline is partitioned, it creates a set of threads for each partition.
  o DTM executes post-session commands and procedures.
  o DTM writes historical incremental aggregation and lookup data to disk, and it writes persisted sequence values and mapping variables to the repository.
  o Load Manager sends post-session email.

Q: What is a Code Page?
A: A code page contains the encoding to specify characters in a set of one or more languages.

Q: How do you handle performance on the server side?
A: The Informatica tool has no role to play here. The server administrator will take up the issue.

Q: What are the DTM (Data Transformation Manager) parameters?
A:
• DTM memory parameters - default buffer block size, data and index cache size
• Reader parameter - line sequential buffer length for flat files
• General parameters - commit interval (source and target); others - enabling lookup cache
• Event-based scheduling - indicator file to wait for

1. Explain your projects
– Architecture
– Dimension and Fact tables
– Sources and Targets
– Transformations used
– Frequency of populating data
– Database size

2. What is dimensional modeling?
Unlike the ER model, the dimensional model is very asymmetric, with one large central table, called the fact table, connected to multiple dimension tables. It is also called a star schema.

3. What are mapplets?
Mapplets are reusable objects that represent a collection of transformations.
The following cannot be included in mapplets:
COBOL source definitions
Joiner transformations
Normalizer transformations
Non-reusable Sequence Generator transformations
Pre- or post-session procedures
Target definitions
XML source definitions
IBM MQ source definitions
PowerMart 3.5 style Lookup functions

4. What are the transformations that use cache for performance?


Aggregator, Lookup, Joiner, and Rank.

5. What are active and passive transformations?


An active transformation changes the number of rows that pass
through the mapping.
1. Source Qualifier
2. Filter transformation
3. Router transformation
4. Ranker
5. Update strategy
6. Aggregator
7. Advanced External procedure
8. Normalizer
9. Joiner

Passive transformations do not change the number of rows that pass
through the mapping.
1. Expressions
2. Lookup
3. Stored procedure
4. External procedure
5. Sequence generator
6. XML Source qualifier

6. What is a lookup transformation?


Used to look up data in a relational table, view, or synonym. The
Integration Service queries the lookup table based on the lookup
ports in the transformation. It compares lookup transformation port
values to lookup table column values based on the lookup condition.
The result is passed to other transformations and the target.
Used to:
Get a related value
Perform a calculation
Update slowly changing dimension tables
Difference between connected and unconnected lookups. Which is
better?
Connected:
Receives input values directly from the pipeline.
Can use a dynamic or static cache.
Cache includes all lookup columns used in the mapping.
Can return multiple columns from the same row.
If there is no match, can return default values.
Default values can be specified.
Unconnected:
Receives input values from the result of a :LKP expression in another
transformation.
Only a static cache can be used.
Cache includes all lookup/output ports in the lookup condition and the
lookup or return port.
Can return only one column from each row.
If there is no match, it returns null.
Default values cannot be specified.

Explain various caches :


Static:
Caches the lookup table before executing the transformation. Rows
are not added dynamically.
Dynamic:
Caches rows as and when they pass through the transformation.
Unshared:
Within the mapping if the lookup table is used in more than one
transformation then the cache built for the first lookup can be used
for the others. It cannot be used across mappings.
Shared:
If the lookup table is used in more than one transformation/mapping
then the cache built for the first lookup can be used for the others.
It can be used across mappings.
Persistent:
If the cache generated for a lookup needs to be preserved for
subsequent use, a persistent cache is used. The index and data files
are not deleted. It is useful only if the lookup table remains
constant.

What are the uses of index and data caches?


The condition values are stored in the index cache, and the output
values of the lookup records are stored in the data cache.
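
The split between the two caches can be pictured with a toy model. This is
only a sketch under assumptions: the table, the column names, and the
build_cache/lookup helpers are hypothetical, and the real cache files are
not laid out this way.

```python
# Toy model of a cached lookup. The index cache holds the condition
# values; the data cache holds the output values for each cached row.
# All names here are hypothetical and chosen for illustration only.

lookup_table = [
    {"cust_id": 10, "cust_name": "Asha", "city": "Pune"},
    {"cust_id": 20, "cust_name": "Ravi", "city": "Delhi"},
]


def build_cache(rows, condition_cols, output_cols):
    index_cache = {}  # condition values -> slot in the data cache
    data_cache = []   # slot -> output values
    for row in rows:
        key = tuple(row[c] for c in condition_cols)
        index_cache[key] = len(data_cache)
        data_cache.append({c: row[c] for c in output_cols})
    return index_cache, data_cache


def lookup(index_cache, data_cache, key):
    # Search the index cache first, then fetch the output values.
    slot = index_cache.get(tuple(key))
    return None if slot is None else data_cache[slot]


index_cache, data_cache = build_cache(
    lookup_table, ["cust_id"], ["cust_name", "city"]
)
hit = lookup(index_cache, data_cache, [20])
miss = lookup(index_cache, data_cache, [99])
print(hit, miss)
```

A miss returns None here, which mirrors the null returned by an unconnected
lookup when there is no match.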
7. Explain the Aggregator transformation.
The Aggregator transformation allows you to perform aggregate
calculations, such as averages, sums, maximums, and minimums. Unlike
the Expression transformation, which permits calculations on a
row-by-row basis only, the Aggregator transformation can perform
calculations on groups of rows.
Performance issues?
The Integration Service performs calculations as it reads, and stores
the necessary group and row data in an aggregate cache.
To reduce cache use, enable sorted input and pass the input records to
the Aggregator already sorted on the group-by ports, in port order.
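
Why sorted input helps can be shown with a small sketch. It is not
PowerCenter code: the rows and function names are made up, and the two
functions only contrast "hold every group until the end" with "emit each
group as soon as its key changes".

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical input rows: (group-by value, amount).
rows = [("east", 10), ("west", 5), ("east", 7), ("west", 1)]


def aggregate_with_cache(rows):
    # Unsorted input: every group must stay in the cache until the
    # last row has been read.
    totals = {}
    for region, amount in rows:
        totals[region] = totals.get(region, 0) + amount
    return totals


def aggregate_sorted(sorted_rows):
    # Sorted input: a group is complete as soon as the key changes,
    # so only one group is held at a time.
    for region, group in groupby(sorted_rows, key=itemgetter(0)):
        yield region, sum(amount for _, amount in group)


sorted_rows = sorted(rows, key=itemgetter(0))
cached_result = aggregate_with_cache(rows)
sorted_result = list(aggregate_sorted(sorted_rows))
print(cached_result, sorted_result)
```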

Incremental aggregation?
In the session properties tab there is an option for performing
incremental aggregation. When the Integration Service performs
incremental aggregation, it passes new source data through the
mapping and uses the historical cache (index and data cache) data to
perform the new aggregation calculations incrementally.

What are the uses of index and data cache?


Group data is stored in the index files, and row data is stored in the
data files.

8. Explain update strategy?


The Update Strategy transformation flags source rows for insert,
update, delete, or reject at the targets.
What are the update strategy constants?
DD_INSERT = 0, DD_UPDATE = 1, DD_DELETE = 2, DD_REJECT = 3

If DD_UPDATE is defined in the update strategy and Treat Source Rows As
is set to INSERT in the session, what happens?
Hint: If anything other than DATA DRIVEN is selected in the session,
the update strategy in the mapping is ignored.
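
The hint can be restated as a small decision function. This is a sketch
under assumptions: the RowFlag enum only reuses the constant values listed
above, and effective_operation and the setting strings are hypothetical
names, not part of any Informatica API.

```python
from enum import IntEnum


class RowFlag(IntEnum):
    # Numeric values as listed for the update strategy constants.
    DD_INSERT = 0
    DD_UPDATE = 1
    DD_DELETE = 2
    DD_REJECT = 3


# Hypothetical names for the session-level "Treat source rows as" choices.
SESSION_OVERRIDES = {
    "INSERT": RowFlag.DD_INSERT,
    "UPDATE": RowFlag.DD_UPDATE,
    "DELETE": RowFlag.DD_DELETE,
}


def effective_operation(mapping_flag, treat_source_rows_as):
    # Only DATA DRIVEN lets the flag set in the mapping take effect;
    # any other session setting applies to every row.
    if treat_source_rows_as == "DATA DRIVEN":
        return mapping_flag
    return SESSION_OVERRIDES[treat_source_rows_as]


print(effective_operation(RowFlag.DD_UPDATE, "INSERT").name)
print(effective_operation(RowFlag.DD_UPDATE, "DATA DRIVEN").name)
```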

What are the three areas where the rows can be flagged for
particular treatment?
In the mapping, in the session's Treat Source Rows As setting, and in
the session's target options.

What is the use of Forward/Reject rows in Mapping?


9. Explain the Expression transformation.
The Expression transformation is used to calculate values in a single
row before writing to the target.
What are the default values for variables?
Hint: String = Null, Number = 0, Date = 1/1/1753

10. Difference between Router and filter transformation?


In the Filter transformation, records are tested against one
condition and the rows that fail it are discarded. In the Router,
multiple conditions can be defined, and the rows that meet none of
them can still be assigned to a port (the default group).
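
The difference can be sketched in a few lines. This is an illustration
only: filter_rows, route_rows, the group names, and the sample rows are
all hypothetical, and "DEFAULT" stands in for the Router's default group.

```python
def filter_rows(rows, condition):
    # Filter: one condition; rows that fail it are dropped.
    return [row for row in rows if condition(row)]


def route_rows(rows, groups):
    # Router: several named conditions. A row goes to every group whose
    # condition it meets; rows meeting none land in the default group.
    routed = {name: [] for name in groups}
    routed["DEFAULT"] = []
    for row in rows:
        matched = False
        for name, condition in groups.items():
            if condition(row):
                routed[name].append(row)
                matched = True
        if not matched:
            routed["DEFAULT"].append(row)
    return routed


rows = [{"amt": 5}, {"amt": 50}, {"amt": 500}]
kept = filter_rows(rows, lambda r: r["amt"] > 10)
routed = route_rows(
    rows,
    {"small": lambda r: r["amt"] < 10, "large": lambda r: r["amt"] >= 100},
)
print(kept)
print(routed)
```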

How many ways can you filter the records?


1. Source Qualifier
2. Filter transformation
3. Router transformation
4. Ranker
5. Update strategy
11. How do you call the Stored Procedure and External Procedure
transformations?
An External Procedure can be called from the pre-session and
post-session tab in the session property sheet.
Stored procedures are added in the Mapping Designer by one of three
methods:
1. Select the icon and add a Stored Procedure transformation.
2. Select Transformation – Import Stored Procedure.
3. Select Transformation – Create, and then select Stored Procedure.

12. Explain Joiner transformation and where it is used?


While a Source Qualifier transformation can join data originating
from a common source database, the Joiner transformation joins
two related heterogeneous sources residing in different locations or
file systems, for example:
Two relational tables existing in separate databases
Two flat files in different file systems
Two different ODBC sources
In one transformation, how many sources can be joined?
Two sources can be joined. To join more than two, add another
Joiner in the hierarchy.
What are join options?
Normal (Default)
Master Outer
Detail Outer
Full Outer
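
A minimal sketch of the four join options follows. It assumes the usual
PowerCenter definitions (Master Outer keeps all detail rows, Detail Outer
keeps all master rows); the joiner function, the join-type strings, and
the sample rows are hypothetical. Missing columns are simply left out of
the row here rather than filled with NULL.

```python
def joiner(master, detail, key, join_type="NORMAL"):
    out = []
    matched_master = set()
    for d in detail:
        hits = [i for i, m in enumerate(master) if m[key] == d[key]]
        for i in hits:
            matched_master.add(i)
            out.append({**master[i], **d})
        # Unmatched detail rows survive only when the detail side is kept.
        if not hits and join_type in ("MASTER_OUTER", "FULL_OUTER"):
            out.append(dict(d))
    # Unmatched master rows survive only when the master side is kept.
    if join_type in ("DETAIL_OUTER", "FULL_OUTER"):
        out.extend(
            dict(m) for i, m in enumerate(master) if i not in matched_master
        )
    return out


master = [{"id": 1, "dept": "HR"}, {"id": 2, "dept": "IT"}]
detail = [{"id": 2, "emp": "Ann"}, {"id": 3, "emp": "Bob"}]

normal = joiner(master, detail, "id", "NORMAL")
master_outer = joiner(master, detail, "id", "MASTER_OUTER")
detail_outer = joiner(master, detail, "id", "DETAIL_OUTER")
full_outer = joiner(master, detail, "id", "FULL_OUTER")
print(len(normal), len(master_outer), len(detail_outer), len(full_outer))
```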
13. Explain Normalizer transformation?
The Normalizer transformation normalizes records from COBOL and
relational sources, allowing you to organize the data according to
your own needs. A Normalizer transformation can appear anywhere
in a data flow when you normalize a relational source. Use a
Normalizer transformation instead of the Source Qualifier
transformation when you normalize a COBOL source. When you drag
a COBOL source into the Mapping Designer workspace, the
Normalizer transformation appears, creating input and output ports
for every column in the source.

14. What is Source qualifier transformation?


When you add a relational or flat file source definition to a mapping,
you need to connect it to a Source Qualifier transformation. The
Source Qualifier represents the records that the Integration Service
reads when it runs a session. It can be used to:
Join data originating from the same source database.
Filter records when the Integration Service reads the source data.
Specify an outer join rather than the default inner join.
Specify sorted ports.
Select only distinct values from the source.
Create a custom query to issue a special SELECT statement for the
Integration Service to read the source data.

15. What is Ranker transformation?


It selects the required number of records from the top or from the
bottom of a ranking.
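
The idea reduces to a top-N or bottom-N selection on one port. The sketch
below is an assumption-laden illustration: rank, the "sal" port, and the
sample rows are hypothetical, and grouping is left out for brevity.

```python
import heapq


def rank(rows, port, n, top=True):
    # Keep the n rows with the largest (top) or smallest (bottom)
    # value in the chosen port.
    pick = heapq.nlargest if top else heapq.nsmallest
    return pick(n, rows, key=lambda row: row[port])


rows = [
    {"ename": "A", "sal": 100},
    {"ename": "B", "sal": 300},
    {"ename": "C", "sal": 200},
]
top_two = rank(rows, "sal", 2, top=True)
bottom_one = rank(rows, "sal", 1, top=False)
print(top_two, bottom_one)
```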

16. What is target load option?


It defines the order in which the Integration Service loads data into
the targets.
This is used to avoid integrity constraint violations.

17. How do you identify the bottlenecks in Mappings?


Bottlenecks can occur in:
1. Targets
The most common performance bottleneck occurs when the Integration
Service writes to a target database. You can identify a target
bottleneck by configuring the session to write to a flat file target.
If the session performance increases significantly when you write to
a flat file, you have a target bottleneck.
Solutions:
Drop or disable indexes or constraints.
Perform a bulk load (ignores the database log).
Increase the commit interval (recovery is compromised).
Tune the database for RBS, dynamic extension, etc.

2. Sources
Add a Filter transformation after each Source Qualifier, with the
condition set so that no records pass through. If the session takes
the same time as before, there is a source bottleneck.
You can also identify a source problem with:
Read test session – copy the mapping, keep only the sources and
Source Qualifiers, remove all other transformations, and connect to
a file target. If the performance is the same, there is a source
bottleneck.
Database query – copy the read query directly from the log and
execute it against the source database with a query tool. If the time
it takes to execute the query and the time to fetch the first row are
significantly different, the query can be modified using optimizer
hints.
Solutions:
Optimize queries using hints.
Use indexes wherever possible.

3. Mapping
If both the source and the target are OK, the problem could be in the
mapping.
Add a Filter transformation before the target; if the time is the
same, there is a mapping bottleneck.
(OR) Look at the performance monitor in the session property sheet
and view the counters. High error-row counts and high rows-in-lookup-
cache counts indicate a mapping bottleneck.
Solutions:
Optimize Single Pass Reading:
Optimize Lookup transformation:
1. Caching the lookup table:
When caching is enabled, the Integration Service caches the lookup
table and queries the cache during the session. When this option is
not enabled, the server queries the lookup table on a row-by-row
basis.
Cache types: static, dynamic, shared, unshared, and persistent.
2. Optimizing the lookup condition:
When multiple conditions are placed, the conditions with an equality
sign should take precedence.
3. Indexing the lookup table:
For a cached lookup, the lookup table should be indexed on the ORDER
BY columns. The session log contains the ORDER BY statement.
For an uncached lookup, the server issues a SELECT statement for each
row passing into the Lookup transformation, so it is better to index
the lookup table on the columns in the condition.

Optimize Filter transformation:


You can improve efficiency by filtering early in the data flow.
Instead of using a Filter transformation halfway through the mapping
to remove a sizable amount of data, use a Source Qualifier filter to
remove those same rows at the source. If it is not possible to move
the filter into the Source Qualifier, move the Filter transformation
as close to the Source Qualifier as possible to remove unnecessary
data early in the data flow.
Optimize Aggregator transformation:
1. Group by simpler columns, preferably numeric columns.
2. Use sorted input. Sorted input decreases the use of aggregate
caches: the server assumes all input data is sorted and performs
aggregate calculations as it reads.
3. Use incremental aggregation in the session property sheet.
Optimize Sequence Generator transformation:
1. Try creating a reusable Sequence Generator transformation and use
it in multiple mappings.
2. The Number of Cached Values property determines the number of
values the Informatica server caches at one time.
Optimize Expression transformation:
1. Factoring out common logic
2. Minimize aggregate function calls.
3. Replace common sub-expressions with local variables.
4. Use operators instead of functions.

4. Sessions
If you do not have a source, target, or mapping bottleneck, you may
have a session bottleneck. You can identify a session bottleneck by
using the performance details. The Integration Service creates
performance details when you enable Collect Performance Data on the
General tab of the session properties.
Performance details display information about each Source Qualifier,
target definition, and individual transformation. All transformations
have some basic counters that indicate the number of input rows,
output rows, and error rows.
Any value other than zero in the readfromdisk and writetodisk
counters for Aggregator, Joiner, or Rank transformations indicates a
session bottleneck.
Low BufferInput_efficiency and BufferOutput_efficiency counters also
indicate a session bottleneck.
Small cache sizes, low buffer memory, and small commit intervals can
cause session bottlenecks.
5. System (Networks)

18. How to improve the Session performance?


1. Run concurrent sessions.
2. Partition the session (PowerCenter).
3. Tune parameters – DTM buffer pool, buffer block size, index cache
size, data cache size, commit interval, tracing level (Normal, Terse,
Verbose Init, Verbose Data).
The session has memory to hold 83 sources and targets. If there are
more, the DTM buffer memory can be increased.
The Integration Service uses the index and data caches for
Aggregator, Rank, Lookup, and Joiner transformations. The server
stores the transformed data from these transformations in the data
cache before returning it to the data flow. It stores group
information for those transformations in the index cache.
If the allocated data or index cache is not large enough to store the
data, the server stores the data in a temporary disk file as it
processes the session data. Each time the server pages to disk,
performance slows. This can be seen from the counters.
Since the data cache is generally larger than the index cache, it
should be allocated more memory than the index cache.
4. Remove the staging area.
5. Turn off session recovery.
6. Reduce error tracing.
19. What are tracing levels?
Normal (default)
Logs initialization and status information, errors encountered, and
rows skipped due to transformation errors. Summarizes session
results, but not at the row level.
Terse
Logs initialization information, error messages, and notification of
rejected data.
Verbose Init
In addition to normal tracing, logs additional initialization
information, the names of index and data files used, and detailed
transformation statistics.
Verbose Data
In addition to Verbose Init tracing, records row-level logs.

20. What are slowly changing dimensions?


Slowly changing dimensions are dimension tables that have slowly
increasing data as well as updates to existing data.
21. What are mapping parameters and variables?
A mapping parameter is a user-definable constant that takes a value
before a session runs. It can be used in Source Qualifier
expressions, Expression transformations, etc.
Steps:
Define the parameter in the Mapping Designer - Parameters &
Variables.
Use the parameter in the expressions.
Define the value for the parameter in the parameter file.

A mapping variable is defined similarly to a parameter, except that
the value of the variable is subject to change.
It picks up its value in the following order:
1. From the session parameter file
2. As stored in the repository object in the previous run
3. As defined in the initial values in the Designer
4. Default values
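
The lookup order above can be sketched as a first-match search. The
function name, the dictionaries standing in for the parameter file,
repository, and Designer initial values, and the sample variable are all
hypothetical.

```python
def resolve_mapping_variable(name, param_file, repository,
                             initial_values, datatype_default):
    # Walk the sources in precedence order and return the first hit.
    for source in (param_file, repository, initial_values):
        if name in source:
            return source[name]
    # Nothing defined anywhere: fall back to the datatype default.
    return datatype_default


param_file = {}
repository = {"$$LastRunId": 42}
initial_values = {"$$LastRunId": 1}

from_repo = resolve_mapping_variable(
    "$$LastRunId", param_file, repository, initial_values, 0
)
from_file = resolve_mapping_variable(
    "$$LastRunId", {"$$LastRunId": 7}, repository, initial_values, 0
)
from_default = resolve_mapping_variable("$$Other", {}, {}, {}, 0)
print(from_repo, from_file, from_default)
```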

Q. What are the output files that the Integration Service creates
while a session is running?
Integration Service log: The Integration Service (on UNIX) creates a
log for all status and error messages (default name: pm.server.log).
It also creates an error log for error messages. These files are
created in the Informatica home directory.
Session log file: The Integration Service creates a session log file
for each session. It writes information about the session into the
log, such as the initialization process, creation of SQL commands for
reader and writer threads, errors encountered, and the load summary.
The amount of detail in the session log file depends on the tracing
level that you set.
Session detail file: This file contains load statistics for each
target in the mapping. Session details include information such as
the table name and the number of rows written or rejected. You can
view this file by double-clicking the session in the monitor window.
Performance detail file: This file contains information known as
session performance details, which helps you identify where
performance can be improved. To generate this file, select the
performance detail option in the session property sheet.
Reject file: This file contains the rows of data that the writer does
not write to targets.
Control file: The Integration Service creates a control file and a
target file when you run a session that uses the external loader. The
control file contains information about the target flat file, such as
data format and loading instructions for the external loader.
Post-session email: Post-session email allows you to automatically
communicate information about a session run to designated recipients.
You can create two different messages: one if the session completes
successfully, the other if the session fails.
Indicator file: If you use a flat file as a target, you can configure
the Integration Service to create an indicator file. For each target
row, the indicator file contains a number to indicate whether the row
was marked for insert, update, delete, or reject.
Output file: If the session writes to a target file, the Integration
Service creates the target file based on the file properties entered
in the session property sheet.
Cache files: When the Integration Service creates a memory cache, it
also creates cache files.

The Integration Service creates index and data cache files for the
following transformations:
Aggregator transformation
Joiner transformation
Rank transformation
Lookup transformation

Q. What is the difference between the Joiner transformation and the
Source Qualifier transformation?
A. You can join heterogeneous data sources in a Joiner transformation,
which you cannot do in a Source Qualifier transformation.

Q. What is meant by lookup caches?


A. The Integration Service builds a cache in memory when it processes
the first row of data in a cached Lookup transformation. It allocates
memory for the cache based on the amount you configure in the
transformation or session properties. The Integration service stores
condition values in the index cache and output values in the data
cache.

Q. What is meant by parameters and variables in Informatica, and how
are they used?
A. Parameter: A mapping parameter represents a constant value that
you can define before running a session. A mapping parameter retains
the same value throughout the entire session.
Variable: A mapping variable represents a value that can change
through the session. The Integration Service saves the value of a
mapping variable to the repository at the end of each successful
session run and uses that value the next time you run the session.

Q. What is target load order?


A. You specify the target load order based on the Source Qualifiers in
a mapping. If you have multiple Source Qualifiers connected to
multiple targets, you can define the order in which the Integration
Service loads data into the targets.

1 bit = a 1 or 0 (b)
4 bits = 1 nybble (?)
8 bits = 1 byte (B)
1024 bytes = 1 Kilobyte (KB)
1024 Kilobytes = 1 Megabyte (MB)
1024 Megabytes = 1 Gigabyte (GB)
1024 Gigabytes = 1 Terabyte (TB)

We need answers:
1) Why do we use a Source Qualifier in a mapping? What is the basic need?
2) When we import files into a mapping, how do we ignore the header and footer of the
file?
3) When we use a lookup condition in a Lookup transformation, it gives one record, but
in the SQL transformation it gives all records matching the condition. Is there any
internal process that differs between the Lookup and SQL transformations?

Q) My source is
id
1
1
1
1
2
2
2
2
3
3
3
3
and my targets should look like
target1:

id
1
2
3

target2:

id
1
1
1
2
2
2
3
3
3

Ans)
Din and Ramesh are correct. You may have been confused by Jai's answer using an
Aggregator; an Aggregator is not required here. Here is the logic, check it out.

Source Qualifier -> Sorter -> Expression -> Router -> 2 targets

In the Sorter: sort on EDUCATION.

In the Expression: the input and output ports are ENO, ENAME, EDUCATION.

Take two variable ports and one output port, in this order:


v_FLAG : IIF(V_TEMP = EDUCATION, 1,0)
V_TEMP : EDUCATION
O_flag : v_flag
In the Router:
Create two groups:

unique values group : o_flag = 0


duplicate values group : o_flag=1

Map those groups to the corresponding targets.

Hi friends, correct me if anything is wrong in that code.
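
The same logic can be checked outside the tool with a short simulation.
This is a sketch, not Informatica code: it assumes the repeated column is
the id from the question (the answer above calls it EDUCATION), and the
function name is made up. The variables mirror V_TEMP and v_FLAG, and the
flag is computed before V_TEMP is overwritten, just as the port order
requires.

```python
def split_unique_and_duplicates(ids):
    unique, duplicates = [], []
    v_temp = None  # value seen on the previous row
    for value in sorted(ids):                  # Sorter
        v_flag = 1 if v_temp == value else 0   # v_FLAG, uses the old V_TEMP
        v_temp = value                         # V_TEMP, updated afterwards
        if v_flag == 0:                        # Router: unique values group
            unique.append(value)
        else:                                  # Router: duplicate values group
            duplicates.append(value)
    return unique, duplicates


source_ids = [1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3]
target1, target2 = split_unique_and_duplicates(source_ids)
print(target1)
print(target2)
```

The first occurrence of each id goes to target1 and the remaining three go
to target2, which is the layout asked for in the question.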

Q) Mapping variable :

Ans.
In the Designer, you can create mapping variables in a mapping or
mapplet. After you create a mapping variable, it appears in the
Expression Editor. You can then use it in any expression in the
mapping or mapplet. You can also use mapping variables in a source
qualifier filter, user-defined join, or extract override, and in the
Expression Editor of reusable transformations.

Unlike mapping parameters, mapping variables are values that can
change between sessions. The Integration Service saves the latest
value of a mapping variable to the repository at the end of each
successful session. During the next session run, it evaluates all
references to the mapping variable to the saved value. You can
override a saved value with the parameter file. You can also clear all
saved values for the session in the Workflow Manager.

You might use a mapping variable to perform an incremental read of
the source. For example, you have a source table containing
timestamped transactions and you want to evaluate the transactions
on a daily basis. Instead of manually entering a session override to
filter source data each time you run the session, you can create a
mapping variable, $$IncludeDateTime. In the source qualifier, create a
filter to read only rows whose transaction date equals
$$IncludeDateTime, such as:

TIMESTAMP = $$IncludeDateTime

In the mapping, use a variable function to set the variable value to
increment one day each time the session runs. If you set the initial
value of $$IncludeDateTime to 8/1/2004, the first time the Integration
Service runs the session, it reads only rows dated 8/1/2004. During the
session, the Integration Service sets $$IncludeDateTime to 8/2/2004. It
saves 8/2/2004 to the repository at the end of the session. The next
time it runs the session, it reads only rows from August 2, 2004.
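
The two runs described above can be traced with a small simulation. It is
only a sketch: run_session, the repository dictionary, and the sample rows
are hypothetical, and the one-day increment stands in for the variable
function used in the mapping.

```python
from datetime import date, timedelta


def run_session(rows, include_date):
    # Source qualifier filter: TIMESTAMP = $$IncludeDateTime
    selected = [row for row in rows if row["ts"] == include_date]
    # The variable is advanced by one day during the session.
    return selected, include_date + timedelta(days=1)


rows = [
    {"id": 1, "ts": date(2004, 8, 1)},
    {"id": 2, "ts": date(2004, 8, 2)},
    {"id": 3, "ts": date(2004, 8, 2)},
]

# Stand-in for the value saved in the repository between runs.
repository = {"$$IncludeDateTime": date(2004, 8, 1)}

first_run, repository["$$IncludeDateTime"] = run_session(
    rows, repository["$$IncludeDateTime"]
)
saved_after_first = repository["$$IncludeDateTime"]
second_run, repository["$$IncludeDateTime"] = run_session(
    rows, repository["$$IncludeDateTime"]
)
print([r["id"] for r in first_run], saved_after_first)
print([r["id"] for r in second_run])
```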

Difference between Informatica 7.x and 8.x

Ans.
1) The architecture of PowerCenter 8 has changed a lot; PC8 is service-oriented for
modularity, scalability and flexibility.
2) The Repository Service and Integration Service (as replacements for the Rep Server
and Informatica Server) can be run on different computers in a network (so-called
nodes), even redundantly.
3) Management is centralized: services can be started and stopped on nodes via a
central web interface.
4) Client tools access the repository via that centralized machine; resources are
distributed dynamically.
5) Running all services on one machine is still possible, of course.
6) It supports unstructured data, which includes spreadsheets, email, Microsoft Word
files, presentations and PDF documents. It provides high availability and seamless
failover, eliminating single points of failure.
7) It has added performance improvements. To bump up system performance,
Informatica has added "pushdown optimization", which moves data transformation
processing to the native relational database I/O engine whenever it is most appropriate.
8) Informatica has now added more tightly integrated data profiling, cleansing, and
matching capabilities.
9) Informatica has added a new web-based administrative console.
10) Ability to write a Custom transformation in C++ or Java.
11) The midstream SQL transformation was added in 8.1.1, not in 8.1.
12) Dynamic configuration of caches and partitioning.
13) The Java transformation is introduced.
14) User-defined functions.
15) The PowerCenter 8 release has "Append to Target file".

Q) Workflow
I have two workflows, wkf1 and wkf2. After wkf1 has executed we have to execute
wkf2. If wkf1 has not executed, wkf2 should not execute. We have to do this
automatically.

Ans.
In this approach we can't achieve all the requirements. You can start wkf2 at any time,
but the requirement is that wkf2 shouldn't run if wkf1 has not succeeded.

Solution: in this case you have to create a flat file at the end of wkf1, for example by
using touch xyz.txt, and in wkf2 use an Event-Wait task with a file-watch event on that
flat file. With this solution, if you start wkf2 it waits for the flat file to be created;
until then it won't run the rest of wkf2.
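
The file-watch idea can be sketched outside the tool as a simple polling
loop. This is an illustration under assumptions: wait_for_file is a
hypothetical helper, and it gives up after a timeout so that the example
terminates, which is a choice made for the sketch rather than a
description of the Event-Wait task.

```python
import os
import tempfile
import time


def wait_for_file(path, timeout_s, poll_s=0.05):
    # Poll until the flag file appears or the timeout expires.
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        if os.path.exists(path):
            return True
        time.sleep(poll_s)
    return os.path.exists(path)


with tempfile.TemporaryDirectory() as workdir:
    flag = os.path.join(workdir, "xyz.txt")
    # wkf1 has not finished yet: the downstream side keeps waiting.
    seen_before = wait_for_file(flag, timeout_s=0.2)
    # End of wkf1: the equivalent of "touch xyz.txt".
    open(flag, "w").close()
    # wkf2 now finds the file and can continue.
    seen_after = wait_for_file(flag, timeout_s=0.2)

print(seen_before, seen_after)
```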