In this module, we will study our first topic in business intelligence, Data
Warehousing (DW), through the following activities:
1) Define Data Warehousing;
2) Become familiar with a generic DW framework;
3) Take a closer look at the DW framework through the specifics of a case study;
4) Gain exposure to Real-time Data Warehousing.
In this module, we will cover our first topic in BI, data warehousing, through the following activities. We will start with a definition of data warehouse and data warehousing. We will then become familiar with the generic DW framework described in the textbook. Next, I will introduce a case study, based on my experience in the credit card industry, of how a DW helped to launch a targeted marketing campaign. We will conclude this module with some exposure to real-time data warehousing through the completion of HW2.
Dr. Nuo Xu
1. Data Warehousing
Let us now study a definition of a data warehouse that has been widely used in both academia and industry.
Data Warehousing
The process of building and maintaining a data warehouse is known as Data
Warehousing.
On the far left side of the diagram are the data sources for data warehousing, which can be either internal or external to a company. The first four database icons represent some typical internal data sources within an organization. ERP stands for enterprise resource planning system and POS stands for point of sale; these are two examples of OLTP (online transaction processing) systems, which support and track the day-to-day activities of a business. The details depend on the type of business: an ERP system at an auto manufacturer would have an inventory subsystem that supports ordering and tracking thousands of parts from hundreds of different vendors, while a POS system at a retailer is responsible for recording each transaction and charging customers for the products or services they receive. For a social network company like Facebook, most business activities occur on the web, which becomes a major data source for data warehousing. Some examples of external data are credit reports compiled by credit bureaus and census data provided by the government.
Once data sources have been identified, the next activity in data warehousing is called ETL, which stands for extraction, transformation, and load. Extraction refers to reading relevant data out of one or more operational databases or external datasets; transformation refers to restructuring or merging the extracted data according to the needs of the analysis; and load refers to populating the data warehouse with the transformed data.
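The three ETL steps can be sketched in a few lines of Python. This is only a minimal illustration, not the method used in any particular product: the source rows, field names, and values are all made-up assumptions, and a plain list stands in for the data warehouse.

```python
# Hypothetical source rows; all names and values are illustrative assumptions.
billing_rows = [  # internal OLTP source (e.g. a card billing system)
    {"member": "A", "month": "201201", "balance": "1500"},
    {"member": "B", "month": "201201", "balance": "900"},
]
bureau_rows = [  # external source (e.g. a credit bureau file)
    {"member_id": "A", "period": "2012-01", "other_balance": "4000"},
    {"member_id": "B", "period": "2012-01", "other_balance": "2500"},
]


def extract(source):
    """Extraction: read the relevant rows out of a source."""
    return list(source)


def transform(billing, bureau):
    """Transformation: unify keys and types, then merge by member and month."""
    merged = {}
    for row in billing:
        key = (row["member"], row["month"])
        merged[key] = {"abc_balance": int(row["balance"]), "other_balance": 0}
    for row in bureau:
        # The external file uses a different key name and date format.
        key = (row["member_id"], row["period"].replace("-", ""))
        merged.setdefault(key, {"abc_balance": 0, "other_balance": 0})
        merged[key]["other_balance"] = int(row["other_balance"])
    return [
        {"member": member, "month": month, **balances}
        for (member, month), balances in sorted(merged.items())
    ]


def load(warehouse, rows):
    """Load: populate the warehouse with the transformed rows."""
    warehouse.extend(rows)


warehouse = []  # a plain list stands in for the data warehouse
load(warehouse, transform(extract(billing_rows), extract(bureau_rows)))
```

Note that most of the work sits in the transformation step, where the two sources are brought to a common key and data type before they are merged.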
The resulting collection of data, shown in the center of the diagram, is the data warehouse. If the data warehousing process is applied to all corporate-wide data sources, the result is referred to as an enterprise data warehouse; if only a subset of corporate-wide data that is of particular interest to a group of users is warehoused, it is referred to as a data mart. So a company might have one data mart for engineering, another for marketing, and so on. Metadata refers to a special type of data that describes what data are available in a data warehouse and where they come from.
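To make the relationship between an enterprise data warehouse, a data mart, and metadata more concrete, here is a small Python sketch. The table names, sources, and subject areas are hypothetical; the dictionary plays the role of metadata, recording what data are available and where they come from.

```python
# Illustrative metadata for an enterprise data warehouse.
# Every table name, source, and subject area here is an assumption.
dw_metadata = {
    "card_balances": {"source": "billing system", "subject": "marketing"},
    "bureau_trades": {"source": "credit bureau", "subject": "marketing"},
    "parts_inventory": {"source": "ERP", "subject": "engineering"},
}


def data_mart(metadata, subject):
    """Return the subset of tables of interest to one group of users."""
    return {
        table: info
        for table, info in metadata.items()
        if info["subject"] == subject
    }


marketing_mart = data_mart(dw_metadata, "marketing")
```

In this sketch the marketing data mart is simply the part of the enterprise warehouse whose subject area is marketing.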
The rightmost portion of the diagram shows some of the many applications and uses of a data warehouse, such as routine business reporting, data mining and text mining, dashboards (which summarize and present data in the most easily digestible way for knowledge workers), and many other customized applications.
Business Background
ABC, a credit card company, attempts to expand into other consumer
lending businesses by developing a low-interest personal loan product.
Using BI to launch a targeted marketing campaign
o Identify the most likely responders to a new personal loan product among
existing card members.
  - Highest outstanding balance.
  - Develop a response score to measure the likelihood of responding
    by taking multiple financial factors into account simultaneously?
The background of this case study is a credit card company that is attempting to expand into other consumer lending businesses by developing a low-interest personal loan product. The company is first trying to sell the product to its existing card member base. One approach to marketing a new product is to send mail to randomly chosen card members in the hope that they will be interested. A better approach is so-called targeted marketing, i.e., marketing only to those who are most likely to find that the product meets their needs. There are many ways of identifying likely responders to a loan product. A simple solution is to assume that people with higher outstanding balances are more in need of extra credit; this is sufficient in our case study to illustrate a DW framework. There are also more sophisticated solutions.
Business Objective
Solicit the card members with the highest total average monthly balance
(including the ABC card and all non-ABC trades) in the last 12 months through a mail
campaign to achieve the highest response rate for the mailing budget.
Non-ABC trade balances include those on other credit cards that a current ABC card member might have and on other types of loans, such as car loans and mortgages.
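The selection rule in this objective can be sketched as follows. This is a hedged illustration only: the function name `select_for_mailing`, the per-piece mailing cost, the budget, and all balances are hypothetical, and the real campaign may have applied further criteria.

```python
def select_for_mailing(avg_balances, cost_per_mail, budget):
    """Pick the card members with the highest total average monthly balance
    (ABC plus non-ABC) until the mailing budget is used up."""
    n_mail = int(budget // cost_per_mail)  # how many letters the budget covers
    ranked = sorted(
        avg_balances.items(),
        key=lambda item: item[1]["abc"] + item[1]["non_abc"],
        reverse=True,
    )
    return [member for member, _ in ranked[:n_mail]]


# Hypothetical 12-month average balances per card member.
avg_balances = {
    "A": {"abc": 1450, "non_abc": 4000},
    "B": {"abc": 900, "non_abc": 2500},
    "C": {"abc": 3000, "non_abc": 9000},
}

# Assumed budget of 2 units at 1 unit per letter: two members can be mailed.
mail_list = select_for_mailing(avg_balances, cost_per_mail=1, budget=2)
```

With these made-up numbers, the two members with the largest combined balances, C and then A, are selected.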
[Diagram labels: Credit Bureaus (CB); Enterprise Data Depository; Teradata Database; CB table (time series); ABC table; Unix Environment (SAS Enterprise, data aggregation, analytical modeling); Windows Environment (Data Analysts)]
This diagram illustrates the DW framework in support of this targeted marketing campaign.
Data Sources
Data sources include an internal operational database and an external data source. The card payment billing system supports and captures all transactional activities, such as purchases and payments, for each card member.
Cardmember ID   month    balance
A               201201   1500
A               201112   1400
B               201201   1500
B               201112   1400
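As a small worked example, the sample rows above can be aggregated into an average monthly balance per card member. This is only a sketch in Python; the function name `average_monthly_balance` is hypothetical, and the case study itself performed its aggregation with SAS.

```python
from collections import defaultdict

# Rows taken from the sample data above: (cardmember ID, month, balance).
rows = [
    ("A", "201201", 1500),
    ("A", "201112", 1400),
    ("B", "201201", 1500),
    ("B", "201112", 1400),
]


def average_monthly_balance(rows):
    """Average the monthly balances of each card member."""
    totals = defaultdict(lambda: [0, 0])  # member -> [sum of balances, months]
    for member, _month, balance in rows:
        totals[member][0] += balance
        totals[member][1] += 1
    return {member: total / count for member, (total, count) in totals.items()}


averages = average_monthly_balance(rows)
```

For the two months shown, both card members average (1500 + 1400) / 2 = 1450.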
Data Warehouse
In this particular example, the data warehouse includes both the Enterprise Data Depository for flat files and the Teradata database system.
Applications
User: Data Analysts
Architecture: Windows (Tier 1), Unix (Tier 2), Teradata (Tier 3)
Activities: Produce a list of current card members who have the highest
total monthly outstanding balance for the marketing team's mail campaign
In terms of applications in this DW framework, data analysts use a PC to access a Unix server, where the statistical analysis software SAS resides. SAS then accesses the Teradata database system so that the analysts can perform the corresponding data analysis. From the DW architecture perspective, Windows is referred to as Tier 1, Unix as Tier 2, and Teradata as Tier 3. In this example, the end product of the data analysis is a list of current card members who have the highest total monthly outstanding balance over the last 12 months, based on which the marketing team will launch a targeted mailing campaign for the newly developed loan product.
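The division of labor among the three tiers can be imitated in a short Python sketch. To be clear about the substitutions: an in-memory SQLite database stands in for Teradata (Tier 3), a plain Python function stands in for SAS on the Unix server (Tier 2), and the final call stands in for the analyst's PC (Tier 1). The table layout and balances are assumptions made for illustration.

```python
import sqlite3


def build_tier3():
    """Tier 3 stand-in: an in-memory SQLite database in place of Teradata."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE balances (member TEXT, month TEXT, balance REAL)")
    conn.executemany(
        "INSERT INTO balances VALUES (?, ?, ?)",
        [
            ("A", "201201", 1500), ("A", "201112", 1400),
            ("B", "201201", 800), ("B", "201112", 700),
        ],  # hypothetical balances
    )
    return conn


def top_members(conn, n):
    """Tier 2 stand-in: the aggregation that SAS would run against Tier 3."""
    cursor = conn.execute(
        "SELECT member, AVG(balance) AS avg_balance FROM balances "
        "GROUP BY member ORDER BY avg_balance DESC LIMIT ?",
        (n,),
    )
    return cursor.fetchall()


# Tier 1 stand-in: the analyst requests the list for the marketing team.
conn = build_tier3()
mailing_list = top_members(conn, n=1)
conn.close()
```

The point of the sketch is the separation of roles: the database stores the data, the middle tier does the aggregation, and the analyst only asks for the result.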