3. What is ER Diagram?
A Star Schema contains a centralized Fact Table surrounded by different
Dimension Tables; because of this layout it is called a Star Schema.
A star schema contains only a single dimension table for each dimension.
When the dimension tables contain a small number of rows, we can choose a Star schema.
Dimension tables are de-normalized.
Fewer joins are needed.
Good for data marts with simple relationships (1:1 or 1:many)
Fewer foreign keys and hence shorter query execution time (faster)
Lower query complexity and easy to understand
Top down approach.
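As an illustration of the points above, here is a minimal sketch of a star schema using Python's built-in sqlite3 module. The table and column names (fact_sales, dim_product, dim_date) and the sample rows are assumptions made up for this example, not part of any real warehouse.

```python
import sqlite3

# Hypothetical star schema: one fact table and one table per dimension.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_product (
    product_key  INTEGER PRIMARY KEY,
    product_name TEXT,
    category     TEXT              -- kept inside the dimension (de-normalized)
);
CREATE TABLE dim_date (
    date_key  INTEGER PRIMARY KEY,
    full_date TEXT,
    year      INTEGER
);
CREATE TABLE fact_sales (
    product_key INTEGER REFERENCES dim_product(product_key),
    date_key    INTEGER REFERENCES dim_date(date_key),
    amount      REAL               -- the measure
);
INSERT INTO dim_product VALUES (1, 'Pen', 'Stationery'), (2, 'Lamp', 'Home');
INSERT INTO dim_date VALUES (20240101, '2024-01-01', 2024),
                            (20240102, '2024-01-02', 2024);
INSERT INTO fact_sales VALUES (1, 20240101, 10.0),
                              (1, 20240102, 5.0),
                              (2, 20240101, 40.0);
""")

# One join per dimension is enough, because each dimension is a single table.
star_rows = conn.execute("""
    SELECT p.category, d.year, SUM(f.amount)
    FROM fact_sales f
    JOIN dim_product p ON p.product_key = f.product_key
    JOIN dim_date d    ON d.date_key = f.date_key
    GROUP BY p.category, d.year
    ORDER BY p.category
""").fetchall()
print(star_rows)
```

The report query touches the fact table plus one table per dimension, which is where the "fewer joins" point comes from.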
A Snowflake Schema contains a centralized Fact Table surrounded by different Dimension
tables, where each dimension table can have its own sub-dimension tables; this is why it is called a Snowflake Schema.
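A minimal sketch of the same idea in Python with sqlite3, assuming a hypothetical product dimension whose category is moved out into a sub-dimension table (dim_category). All names and rows are illustrative only.

```python
import sqlite3

# Hypothetical snowflake schema: the product dimension points to a
# sub-dimension (dim_category) instead of storing the category name itself.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_category (
    category_key  INTEGER PRIMARY KEY,
    category_name TEXT
);
CREATE TABLE dim_product (
    product_key  INTEGER PRIMARY KEY,
    product_name TEXT,
    category_key INTEGER REFERENCES dim_category(category_key)
);
CREATE TABLE fact_sales (
    product_key INTEGER REFERENCES dim_product(product_key),
    amount      REAL
);
INSERT INTO dim_category VALUES (1, 'Stationery'), (2, 'Home');
INSERT INTO dim_product VALUES (1, 'Pen', 1), (2, 'Pencil', 1), (3, 'Lamp', 2);
INSERT INTO fact_sales VALUES (1, 10.0), (2, 5.0), (3, 40.0);
""")

# Reaching the category now takes two joins: fact -> product -> category.
snowflake_rows = conn.execute("""
    SELECT c.category_name, SUM(f.amount)
    FROM fact_sales f
    JOIN dim_product p  ON p.product_key = f.product_key
    JOIN dim_category c ON c.category_key = p.category_key
    GROUP BY c.category_name
    ORDER BY c.category_name
""").fetchall()
print(snowflake_rows)
```

Compared with a single de-normalized product dimension, the category name is stored once per category, at the cost of one extra join in the query.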
An aggregate table contains a summary of existing warehouse data, grouped to certain levels of
dimensions. Retrieving the required data from the actual table, which may have millions of records, takes
more time and also affects server performance. To avoid this, we can aggregate the table to the
required level and query the aggregate instead. These tables reduce the load on the database server, improve
query performance, and return results much faster.
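A small sketch of building and querying an aggregate table, again with Python's sqlite3 module. The table names (fact_sales, agg_sales_monthly), the monthly grouping level, and the data are assumptions chosen for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Detail-level fact table (in practice this could hold millions of rows).
CREATE TABLE fact_sales (sale_date TEXT, product TEXT, amount REAL);
INSERT INTO fact_sales VALUES
    ('2024-01-05', 'Pen',  10.0),
    ('2024-01-20', 'Pen',   5.0),
    ('2024-02-03', 'Pen',   7.0),
    ('2024-02-10', 'Lamp', 40.0);

-- Aggregate table: the same data summarized to month and product level.
CREATE TABLE agg_sales_monthly AS
SELECT substr(sale_date, 1, 7) AS month,
       product,
       SUM(amount) AS total_amount,
       COUNT(*)    AS sale_count
FROM fact_sales
GROUP BY substr(sale_date, 1, 7), product;
""")

# Reports read the small aggregate table instead of scanning the detail rows.
monthly_rows = conn.execute("""
    SELECT month, product, total_amount
    FROM agg_sales_monthly
    ORDER BY month, product
""").fetchall()
print(monthly_rows)
```

The aggregate has to be refreshed whenever the detail table is loaded; how and when that happens is outside this sketch.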
OLTP:
Online Transaction Processing
It maintains current data (typically a period of less than 1 year).
INSERT, UPDATE and DELETE operations are performed.
Data is volatile in OLTP.
Response time is fast.
Data is Normalized.
OLAP:
Online Analytical Processing
It maintains current and historical data.
Mainly SELECT operations are used, for reporting and analysis purposes.
Data is Non-Volatile.
Response time is slower.
Data is De-Normalized.
ETL (Extract, Transform and Load) is a process in data warehousing responsible for
pulling data out of the source systems and placing it into a data warehouse. ETL involves
the following tasks:
- extracting the data from source systems (SAP, ERP, other operational systems); data
from different source systems is converted into one consolidated data warehouse format
which is ready for transformation processing.
- transforming the data: applying any kind of simple or complex data validation (e.g., if the first 3 columns in a row
are empty then reject the row from processing)
- loading the data into a data warehouse, data repository, or other reporting applications
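The three tasks above can be sketched in Python as follows. This is only a toy outline: the source rows, the function names (extract, transform, load), and the target table dw_orders are hypothetical, and the validation rule is the one used as an example in the text (reject a row when its first 3 columns are empty).

```python
import sqlite3

def extract():
    # Stand-in for reading from source systems; these rows are made up.
    return [
        ("1001", "alice", "IN", "250.50"),
        ("", "", "", "99.00"),          # first three columns empty
        ("1002", "bob", "US", "75"),
    ]

def transform(rows):
    # Validate and convert rows into one consolidated format.
    clean, rejected = [], []
    for row in rows:
        if all(col.strip() == "" for col in row[:3]):
            rejected.append(row)        # reject the row from processing
            continue
        cust_id, name, country, amount = row
        clean.append((int(cust_id), name.title(), country, float(amount)))
    return clean, rejected

def load(conn, rows):
    # Write the transformed rows into the (hypothetical) warehouse table.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS dw_orders "
        "(customer_id INTEGER, customer_name TEXT, country TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO dw_orders VALUES (?, ?, ?, ?)", rows)
    conn.commit()

conn = sqlite3.connect(":memory:")
clean_rows, rejected_rows = transform(extract())
load(conn, clean_rows)
loaded = conn.execute(
    "SELECT customer_id, customer_name, amount FROM dw_orders ORDER BY customer_id"
).fetchall()
print(loaded, len(rejected_rows))
```

Real ETL tools add scheduling, logging, and error handling around these steps; the sketch only shows the order of the work.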
Cognos (Impromptu, PowerPlay), MS Excel, Tableau, QlikView, Oracle Express OLAP, MS Reporting
Services (SSRS), Informatica PowerAnalyzer
A fact table contains facts (measures). Every fact table is surrounded by a number of dimension
tables, based on our requirements.
Dimension table stores attributes, or dimensions, that describe the objects in a fact table.
When a table is used to check whether some data is present before loading other data,
or the same data, into another table, that table is called a LOOKUP table.
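A minimal sketch of a lookup check before loading, using sqlite3. The lookup table lkp_country, the target table tgt_customer, and the incoming rows are all hypothetical names and data for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE lkp_country (country_code TEXT PRIMARY KEY);  -- the lookup table
CREATE TABLE tgt_customer (name TEXT, country_code TEXT);  -- the load target
INSERT INTO lkp_country VALUES ('IN'), ('US');
""")

incoming = [("Alice", "IN"), ("Bob", "XX"), ("Carol", "US")]
skipped = []

for name, code in incoming:
    # Check the lookup table for the code before loading the row.
    found = conn.execute(
        "SELECT 1 FROM lkp_country WHERE country_code = ?", (code,)
    ).fetchone()
    if found is None:
        skipped.append((name, code))
        continue
    conn.execute("INSERT INTO tgt_customer VALUES (?, ?)", (name, code))

loaded_names = [
    row[0] for row in conn.execute("SELECT name FROM tgt_customer ORDER BY name")
]
print(loaded_names, skipped)
```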
The basic purpose of the scheduling tool in a DW application is to streamline the flow of data
from Source to Target at a specific time or based on some condition.
18. What are modeling tools available in the Market? Name some of them?
Market Analysis
Fraud Detection
Customer Retention
Production Control
Science Exploration
21. What is Normalization? First Normal Form, Second Normal Form, Third Normal Form?
23. What type of Indexing mechanism do we need to use for a typical Data warehouse?
Bitmap index
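To give a rough feel for why bitmap indexes suit low-cardinality warehouse columns, here is a toy sketch in plain Python. It is not how any particular database implements them; the column values and helper names (build_bitmap_index, rows_from_bitmap) are assumptions for illustration. Each distinct value gets one bitmap, with bit i set when row i holds that value, so combining filters becomes a bitwise AND.

```python
from collections import defaultdict

def build_bitmap_index(values):
    # One integer per distinct value; bit i is set if row i has that value.
    index = defaultdict(int)
    for row_id, value in enumerate(values):
        index[value] |= 1 << row_id
    return dict(index)

def rows_from_bitmap(bitmap):
    # Convert a bitmap back into the list of matching row numbers.
    return [i for i in range(bitmap.bit_length()) if (bitmap >> i) & 1]

# Two hypothetical low-cardinality columns of the same five-row table.
region = ["East", "West", "East", "North", "West"]
status = ["open", "open", "closed", "open", "closed"]

region_idx = build_bitmap_index(region)
status_idx = build_bitmap_index(status)

# WHERE region = 'West' AND status = 'open' becomes a single bitwise AND.
matches = rows_from_bitmap(region_idx["West"] & status_idx["open"])
print(matches)
```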
23. Which columns go to the fact table and which columns go to the dimension table? (My user needs to see
<data element> <data element> broken by <data element> <data element>)
24. What is the level of Granularity of a fact table? What does this signify? (With weekly-level summarization,
there is no need to have Invoice Number in the fact table anymore.)
25. How are the Dimension tables designed? De-Normalized, Wide, Short, Use Surrogate Keys, Contain
Additional date fields and flags.
26. What are slowly changing dimensions?
29. What is VLDB? (If a database is too large to back up within the available time frame, it is a VLDB.)