
Chapter 1: Warehouse, What Is It, Who Is It, and Why?

Data Warehouse
A data warehouse is a subject-oriented, integrated, non-volatile and time-variant collection
of data in support of management decisions.
Basic goals of data warehousing

To provide a reliable, single, integrated source of key corporate information.
To give end users access to their data without relying on reports produced by
the IS department.
To allow analysts to analyze corporate data and even build predictive what-if
models from that data.

Sources
Data warehouses are populated with data from two primary sources:

Most frequently, they are populated with periodic migration of data from
operational systems.
The second source is made up of external, frequently purchased, databases.
Examples of this data would include lists of income and demographic
information.

Explanation of definition
Subject Oriented
Online Transaction Processing (OLTP) systems are usually intended to hold information
about small subsets of the organization. A data warehouse database, by contrast, is
subject oriented, i.e., organized into subject areas.
Integrated
To create a useful subject area, the source data must be integrated. In other words, the
data must be modified to comply with common coding rules.
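
As a minimal sketch of what "common coding rules" can mean in practice, the Python below
maps gender codes from two source systems onto one warehouse coding. The system names
("billing", "crm"), the field names and the code tables are all assumptions made up for
illustration; they do not come from any particular product.

```python
# Hypothetical code tables: each source system encodes gender differently.
SOURCE_CODE_MAPS = {
    "billing": {"M": "M", "F": "F"},
    "crm": {"1": "M", "2": "F"},
}


def integrate(record, source):
    """Return a copy of record with its gender code mapped to the common coding."""
    code_map = SOURCE_CODE_MAPS[source]
    cleaned = dict(record)
    # Unknown codes become "U" rather than being guessed.
    cleaned["gender"] = code_map.get(str(record["gender"]).strip(), "U")
    return cleaned


rows = [
    integrate({"customer_id": 1, "gender": "M"}, "billing"),
    integrate({"customer_id": 2, "gender": 2}, "crm"),
    integrate({"customer_id": 3, "gender": "x"}, "crm"),
]
print([r["gender"] for r in rows])
```

After integration every row uses the same coding, so rows from both systems can sit in one
subject area and be compared directly.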
Non-Volatile
Non-volatile is a 50-cent word meaning that the warehouse is read-only; users can't write
back. Unlike operational databases, warehouses primarily support reporting, not data
capture.
The warehouse is a historical record. Allowing users to write back to the warehouse
would be akin to George Orwell's concept of rewriting history.

Time Variant
The historical records support trend analysis, for which the time element is critical.
Data without a time element is of little use for this kind of analysis.
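
To illustrate why the time element matters, here is a small sketch of a trend calculation
over dated snapshots. The snapshot layout (date, region, sales total) and the figures are
hypothetical; the point is only that the calculation is impossible without the date column.

```python
from datetime import date

# Hypothetical monthly snapshots: (snapshot date, region, sales total).
snapshots = [
    (date(2023, 3, 31), "north", 90.0),
    (date(2023, 1, 31), "north", 100.0),
    (date(2023, 2, 28), "north", 120.0),
]


def month_over_month(rows):
    """Return (date, change) pairs for consecutive snapshots, ordered by date."""
    ordered = sorted(rows, key=lambda r: r[0])
    return [
        (curr[0], curr[2] - prev[2])
        for prev, curr in zip(ordered, ordered[1:])
    ]


changes = month_over_month(snapshots)
for day, delta in changes:
    print(day.isoformat(), delta)
```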
Business Intelligence differs from Transaction Processing
i. Operational systems are designed to work with small pieces of information.
Business intelligence, on the other hand, frequently works with huge blocks of
information.

ii. Operational data must frequently be updated in real time. In the business
intelligence arena, on the other hand, data almost never needs to be updated in
real time. This is because BI users work with high-level summary data.

iii. Operational system schemas are designed for rapid data input. In a data
warehouse, on the other hand, users enter no data (remember: non-volatile).
Instead, the goal is to get data out as quickly as possible.

iv. Operational systems need immediate response. BI users, while they
should not have to wait months for an answer, generally don't need such a blazing
response time. When they get a response from the system, they consider it for
a while and then take the next step.

v. Operational system usage patterns are relatively predictable. System designers
can tell how much work it will take to process a transaction and when most
transactions will occur. In a BI environment, on the other hand, usage patterns
are not nearly so stable. Who knows when a question might arise that requires
complex analysis? Thus, the system might be heavily used for about four
hours on Monday and then not touched again until Friday.

vi. The database designs of operational systems tend to be very complex. BI
applications, on the other hand, attempt to give end users access to their own
data, so their designs need to be simple enough for those users to navigate.
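
The contrast in points i to iii can be sketched with Python's built-in sqlite3 module. The
`orders` table and its rows are hypothetical, and both access styles run against the same
in-memory table only for brevity; in practice the summary query would run against the
warehouse, not the operational database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (order_id INTEGER PRIMARY KEY, region TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO orders (order_id, region, amount) VALUES (?, ?, ?)",
    [(1, "north", 10.0), (2, "south", 25.0), (3, "north", 15.0)],
)

# Operational style: touch one small piece of information and update it in place.
conn.execute("UPDATE orders SET amount = ? WHERE order_id = ?", (12.0, 1))
one_order = conn.execute(
    "SELECT amount FROM orders WHERE order_id = ?", (1,)
).fetchone()

# BI style: read many rows and return a high-level summary, writing nothing.
summary = conn.execute(
    "SELECT region, SUM(amount) FROM orders GROUP BY region ORDER BY region"
).fetchall()

print(one_order)
print(summary)
conn.close()
```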

Data Mart
A data mart is defined in much the same way as a data warehouse, but a data warehouse is
a broad data store that contains a number of subject areas. A data mart, on the other hand,
typically focuses on a narrower part of the business. It covers a single subject area
and/or type of analysis.

Two basic ways to create a data mart

The first is to capture data directly from OLTP systems into the marts that need
it.
The second method is to capture data from OLTP systems into a central data
warehouse and then to feed the data marts with data from the warehouse.
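
The second method can be sketched as a simple filter over warehouse rows. The row layout,
the `subject` field and the `build_mart` helper are hypothetical names used only for
illustration; a real feed would normally be a scheduled extract, not an in-memory copy.

```python
# Hypothetical warehouse rows covering several subject areas.
warehouse = [
    {"subject": "sales", "region": "north", "amount": 100.0},
    {"subject": "sales", "region": "south", "amount": 60.0},
    {"subject": "inventory", "item": "widget", "on_hand": 40},
]


def build_mart(rows, subject):
    """Copy only the rows for one subject area into a new list (the mart)."""
    return [dict(row) for row in rows if row["subject"] == subject]


sales_mart = build_mart(warehouse, "sales")
print(len(sales_mart), sum(r["amount"] for r in sales_mart))
```

Because the mart is fed from the warehouse, it inherits whatever integration was already
applied there, which is the main practical difference from loading each mart directly from
OLTP systems.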
