Data Warehouse
A data warehouse is a subject-oriented, integrated, non-volatile, and time-variant collection
of data in support of management decisions.
Basic goals of data warehousing
Sources
Data warehouses are populated with data from two primary sources:
Most frequently, they are populated with periodic migration of data from
operational systems.
The second source is made up of external databases, which are frequently purchased.
Examples of this data include lists of income and demographic
information.
Explanation of definition
Subject Oriented
Online Transaction Processing (OLTP) systems are usually intended to hold information
about small subsets of the organization, whereas a data warehouse database is subject
oriented, i.e., organized into subject areas.
Integrated
To create a useful subject area, the source data must be integrated. In other words, the
data must be modified to comply with common coding rules.
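As a minimal sketch of what complying with common coding rules can look like, the Python snippet below recodes one attribute that three source systems store differently. The source names (billing, crm, web), the gender field, and the target codes are all hypothetical and chosen only for illustration.

```python
# Hypothetical source systems that encode the same attribute differently.
SOURCE_GENDER_CODES = {
    "billing": {"M": "male", "F": "female"},
    "crm": {"1": "male", "2": "female"},
    "web": {"male": "male", "female": "female"},
}


def integrate(source, record):
    """Return a copy of record with gender recoded to one common standard."""
    mapping = SOURCE_GENDER_CODES[source]
    cleaned = dict(record)  # copy, so the source record is left untouched
    cleaned["gender"] = mapping[str(record["gender"]).strip()]
    cleaned["source"] = source  # keep track of where the row came from
    return cleaned


rows = [
    integrate("billing", {"customer_id": 1, "gender": "M"}),
    integrate("crm", {"customer_id": 2, "gender": 2}),
    integrate("web", {"customer_id": 3, "gender": "female"}),
]
```

After this step every row uses the same codes, so a query over the subject area does not need to know which system a row came from.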
Non-Volatile
Non-volatile is a 50-cent word meaning that the warehouse is read-only; users can't write
back. Unlike operational databases, warehouses primarily support reporting, not data
capture.
The warehouse is a historical record. Allowing users to write back to the warehouse
would be akin to George Orwell's concept of rewriting history.
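One way to picture the read-only rule is a database opened in read-only mode for reporting users. The sketch below uses Python's built-in sqlite3 module purely as a stand-in for a warehouse; the sales table and its single row are made up, and a real warehouse would more likely enforce this through user permissions.

```python
import os
import sqlite3
import tempfile

# Stand-in warehouse file in a temporary directory (illustrative only).
path = os.path.join(tempfile.mkdtemp(), "warehouse.db")

# The load process, not the end user, writes the data.
loader = sqlite3.connect(path)
loader.execute("CREATE TABLE sales (sale_date TEXT, amount REAL)")
loader.execute("INSERT INTO sales VALUES ('2024-01-31', 100.0)")
loader.commit()
loader.close()

# Reporting users get a read-only connection (SQLite URI option mode=ro).
reader = sqlite3.connect(f"file:{path}?mode=ro", uri=True)
total = reader.execute("SELECT SUM(amount) FROM sales").fetchone()[0]

# Any attempt to write back is rejected by the database.
try:
    reader.execute("UPDATE sales SET amount = 0")
    write_allowed = True
except sqlite3.OperationalError:
    write_allowed = False
reader.close()
```

Reads succeed while the update is refused, so the stored history stays as it was loaded.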
Time Variant
The historical records help in trend analysis, for which the time element is critical.
Data without a time element is of little use for analysis.
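To illustrate why the time element matters, the following sketch computes a simple trend from dated snapshots. The dates and values are invented, and period_changes is a hypothetical helper name; without the date attached to each value, the ordering and the changes could not be derived.

```python
from datetime import date

# Made-up monthly snapshots: each value carries the date it describes.
snapshots = [
    (date(2024, 3, 31), 135),
    (date(2024, 1, 31), 120),
    (date(2024, 2, 29), 150),
]


def period_changes(rows):
    """Return (date, change from the previous snapshot) pairs in date order."""
    ordered = sorted(rows)  # tuples sort by date first
    return [
        (curr_date, curr_value - prev_value)
        for (_, prev_value), (curr_date, curr_value) in zip(ordered, ordered[1:])
    ]


changes = period_changes(snapshots)
```

Here the rows arrive out of order, and only the stored date makes it possible to put them back in sequence before comparing periods.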
Business Intelligence differs from Transaction Processing
i.
ii.
iii.
Operational system schemas are designed for rapid data input. In a data
warehouse, on the other hand, users enter no data (remember: non-volatile).
Instead, the goal is to get data out as quickly as possible.
iv.
v.
vi.
Data Mart
A data mart has a definition similar to that of a data warehouse, but a data warehouse is a
broad data store that contains a number of subject areas. A data mart, on the other hand,
typically focuses on a narrower part of the business. It covers a single subject area
and/or type of analysis.
The first method is to capture data directly from OLTP systems into the marts that need
it.
The second method is to capture data from OLTP systems into a central data
warehouse and then to feed the data marts with data from the warehouse.
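The second method can be sketched as a filter over the central warehouse, with each mart receiving only the rows for its subject area. The row layout, the subject names, and the build_mart helper below are assumptions made for illustration, not part of any particular product.

```python
# Hypothetical warehouse rows spanning several subject areas.
warehouse = [
    {"subject": "sales", "region": "east", "amount": 100},
    {"subject": "sales", "region": "west", "amount": 250},
    {"subject": "inventory", "item": "widget", "on_hand": 40},
]


def build_mart(warehouse_rows, subject):
    """Copy the rows for one subject area into a new data mart."""
    return [dict(row) for row in warehouse_rows if row["subject"] == subject]


# Each mart is fed from the warehouse rather than from the OLTP systems.
sales_mart = build_mart(warehouse, "sales")
inventory_mart = build_mart(warehouse, "inventory")
```

Because every mart is derived from the same integrated warehouse data, the marts stay consistent with one another, which is harder to guarantee when each mart pulls from the OLTP systems on its own.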