Information Technology
CS 2032 DATA WAREHOUSING AND DATA MINING
(Regulation 2008)
Time : Three hours    Maximum : 100 marks
Answer ALL questions
PART A (10 × 2 = 20 marks)
1. What is data transformation? Give an example.
2. With an example, explain what metadata is.
3. Classify OLAP tools.
4. What is an apex cuboid? Give an example.
5. State why data preprocessing is an important issue for data warehousing and data mining.
6. What do data mining functionalities include?
7. With an example, explain correlation analysis.
8. What is a support vector machine?
9. What is an outlier? Give an example.
10. List any two applications of data mining.
PART B (5 × 16 = 80 marks)
11. (a) What is a data warehouse? Diagrammatically illustrate and discuss the data warehousing architecture. (16)
Or
(b) List and discuss the steps involved in mapping the data warehouse to a multiprocessor architecture. (16)
12. (a) List and discuss the basic features that are provided by reporting and query tools used for business analysis. (16)
Or
(b) With relevant examples, discuss multidimensional online analytical processing and multi relational online analytical processing. (16)
13. (a) How are data mining systems classified? Discuss each classification with an example. (16)
Or
(b) How can a data mining system be integrated with a data warehouse? Discuss with an example. (16)
14. (a) Discuss the Apriori algorithm for discovering frequent item sets. Apply the Apriori algorithm to the following data set : (16)

Trans ID   Items Purchased
101        strawberry, litchi, oranges
102        strawberry, butter fruit
103        butter fruit, vanilla
104        strawberry, litchi, oranges
105        banana, oranges
106        banana
107        banana, butter fruit
11249
Or
(b) Develop an algorithm for classification using Bayesian classification. Illustrate the algorithm with a relevant example. (16)
15. (a) Consider five points {X1, X2, X3, X4, X5} with the following coordinates as a two dimensional sample for clustering :
X1 = (0, 2); X2 = (0, 0); X3 = (1.5, 0); X4 = (5, 0); X5 = (5, 2)
Illustrate the K-means partitioning algorithm (clustering algorithm) using the above data set. (16)
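One way to check a hand-worked answer to Question 15 (a) is a short script. The sketch below is only an illustration under stated assumptions: the question does not fix the number of clusters or the starting centroids, so K = 2 and the initial centroids X1 and X4 are choices made here, and the function name `kmeans` is hypothetical. Distances are Euclidean.

```python
import math

# The five sample points from Question 15 (a).
points = {
    "X1": (0.0, 2.0),
    "X2": (0.0, 0.0),
    "X3": (1.5, 0.0),
    "X4": (5.0, 0.0),
    "X5": (5.0, 2.0),
}


def kmeans(points, centroids, max_iter=100):
    """Plain K-means: alternate assignment and centroid update until stable."""
    centroids = list(centroids)
    assignment = None
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        new_assignment = {
            name: min(range(len(centroids)),
                      key=lambda j: math.dist(p, centroids[j]))
            for name, p in points.items()
        }
        if new_assignment == assignment:
            break  # no point changed cluster, so the partition is stable
        assignment = new_assignment
        # Update step: move each centroid to the mean of its members.
        for j in range(len(centroids)):
            members = [points[n] for n, c in assignment.items() if c == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = tuple(sum(v) / len(members)
                                     for v in zip(*members))
    return assignment, centroids


# Assumption: K = 2, starting from X1 and X4 as initial centroids.
assignment, centroids = kmeans(points, [points["X1"], points["X4"]])
print(assignment)  # X1, X2, X3 fall in cluster 0; X4, X5 in cluster 1
print(centroids)
```

With these starting centroids the partition settles after the first pass as {X1, X2, X3} and {X4, X5}. K-means is sensitive to initialisation: starting from X1 and X2 instead converges to {X1, X5} and {X2, X3, X4}, so a written answer should state its initial choice.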
Or
(b) (16)
For Question 14 (a): The set of items is {strawberry, litchi, apple, oranges, vanilla, banana, butter fruit}. Use 0.3 for the minimum support value.
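The Apriori pass over the seven transactions of Question 14 (a) can be checked with a small script. The following is a minimal sketch, not a reference solution: it assumes that the minimum support of 0.3 is a fraction of the total number of transactions (so an item set needs at least 3 of the 7 transactions), and the names `support` and `apriori` are hypothetical helpers introduced for illustration.

```python
from itertools import combinations

# Transactions from Question 14 (a), keyed by Trans ID.
transactions = {
    101: {"strawberry", "litchi", "oranges"},
    102: {"strawberry", "butter fruit"},
    103: {"butter fruit", "vanilla"},
    104: {"strawberry", "litchi", "oranges"},
    105: {"banana", "oranges"},
    106: {"banana"},
    107: {"banana", "butter fruit"},
}
items = ["strawberry", "litchi", "apple", "oranges",
         "vanilla", "banana", "butter fruit"]
MIN_SUPPORT = 0.3  # assumed to be a fraction of all transactions


def support(itemset):
    """Fraction of transactions that contain every item in itemset."""
    hits = sum(1 for t in transactions.values() if itemset <= t)
    return hits / len(transactions)


def apriori():
    """Return {item set: support} for every frequent item set."""
    frequent = {}
    # Level 1: frequent single items.
    current = [frozenset([i]) for i in items
               if support(frozenset([i])) >= MIN_SUPPORT]
    k = 1
    while current:
        for s in current:
            frequent[s] = support(s)
        k += 1
        # Join step: unions of frequent (k-1)-sets that have exactly k items.
        candidates = {a | b for a in current for b in current
                      if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must be frequent.
        prev = set(current)
        candidates = [c for c in candidates
                      if all(frozenset(sub) in prev
                             for sub in combinations(c, k - 1))]
        current = [c for c in candidates if support(c) >= MIN_SUPPORT]
    return frequent


frequent = apriori()
for itemset, s in sorted(frequent.items(), key=lambda kv: sorted(kv[0])):
    print(sorted(itemset), round(s, 3))
```

Under this reading of the threshold, only four single items are frequent (strawberry, oranges, banana and butter fruit, each in 3 of 7 transactions). Litchi appears in 2 of 7 transactions (about 0.286) and falls just below 0.3, and no pair reaches three transactions, so the algorithm stops after the second level.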