
DHANALAKSHMI COLLEGE OF ENGINEERING DEPT OF CSE

QUESTION BANK

Sub Code: CS2032 Sub Name: DATA WAREHOUSING & DATA MINING Year/Sem: IV / VII

UNIT-I DATA WAREHOUSING


Data warehousing components

1. Write the applications of data warehousing.
2. What is meant by data mart?
3. Define Data Warehouse
4. List the functionality of metadata.
5. What is meant by vendor solutions?
6. What are the advantages of data warehousing?
7. List the two different types of reporting tools.
8. Differentiate data warehousing from data mining.
10. Draw the overall architecture diagram for data warehousing.

(Nov/Dec 2010)

[2 Marks] [2 Marks] [2 Marks] [2 Marks] [2 Marks]

(Nov/Dec 2011, May/June 2010)

(May/June 2009, Nov/Dec 2012) [2 Marks]

(Apr/May 2011)

[2 Marks] [8 Marks] [16 Marks]

9. Explain the components of a data warehouse. (Nov/Dec 2010, Apr/May 2012) [16 Marks]

Building a data warehouse

1. Define Star Schema

(Apr/May 2012, Nov/Dec 2010) [2 Marks] (Apr/May 2012) [2 Marks] [2 Marks] [2 Marks] [2 Marks]

2. What are the technical issues to be considered when designing and implementing a data warehouse environment?
3. Define Legacy Data
4. What are the nine steps in the design of a data warehouse?
5. Draw a neat diagram for the distributed memory shared disk architecture.
6. a) Enumerate the building blocks of a data warehouse.
   b) Explain the importance of metadata in a data warehouse environment. (Apr/May 2010, Nov/Dec 2011, Apr/May 2012) [16 Marks] (Apr/May 2010, Apr/May 2013)

Mapping the data warehouse

1. What are the different types of parallelism?


(Apr/May 2010)

[2 Marks] [2 Marks]

2. Differentiate horizontal parallelism from vertical parallelism.
3. Draw a neat diagram for the distributed memory shared disk architecture.
4. Define Cube
5. State the steps involved in mapping the data warehouse to a multiprocessor architecture.
6. Explain the database architecture for parallel processing. (Apr/May 2011)

[2 Marks] [2 Marks] [16 Marks] [16 Marks]

DBMS schemas for decision support

1. Define Star Schema
2. List the vendor solutions.
3. Define Bitmapped Indexing

(Apr/May 2009, Nov/Dec 2011) [2 Marks] [2 Marks] [2 Marks] [2 Marks] [8 Marks] [16 Marks]

4. Why is data mining used in all organizations?
5. Explain the star join and star index with an example.
6. Explain the multidimensional data model with star schema.

Data extraction, cleanup, and transformation tools

1. List the examples of access tools.
2. What is data transformation?
3. What is meant by information delivery systems?
4. Explain the various methods of data cleaning.
5. Describe data extraction, cleanup and loading.

[2 Marks] [2 Marks] [2 Marks] (Apr/May 2013) [8 Marks] [16 Marks]

Metadata

1. List the components of metadata interchange standard frameworks.
2. Define Meta Centre
3. Explain the different types of the vendor solutions with example.
4. Explain the metadata interchange initiatives with an example.

[2 Marks] [2 Marks] [16 Marks] [16 Marks]

UNIT-II BUSINESS ANALYSIS


Reporting and query tools and applications

1. What are the different types of reporting tools?


2. Define Meta Layer
3. Define EIS
4. Define Object Oriented Architecture
5. State the types of frames in reporting tools.
6. Explain the basic features that are provided by reporting and query tools used for business analysis. (Nov/Dec 2011, Apr/May 2012)

[2 Marks] [2 Marks] [2 Marks] [2 Marks] [2 Marks] [16 Marks]

Cognos impromptu

1. Define Cognos Impromptu
2. Draw a neat sketch for three-tier client/server architecture.
3. What are the databases supported by Cognos Impromptu?
4. What are the functions of PowerPlay Administrator?
5. a) Explain the multidimensional data model.
   b) How can computation be performed efficiently on data cubes?
6. Describe Cognos Impromptu with an example.

[2 Marks] [2 Marks] [2 Marks] [2 Marks] [10 Marks] [6 Marks] (Nov/Dec 2012, Apr/May 2013) [16 Marks]

Online analytical processing

1. Distinguish between OLAP and OLTP.
2. Define MQE
3. Draw a neat diagram for web processing model.
4. List the five categories of decision support tools.
5. Classify OLAP tools.
6. Define Concept Hierarchy
7. List any five OLAP guidelines.

(Nov/Dec 2011) (Nov/Dec 2012) (Apr/May 2010) (Apr/May 2013)

[2 Marks] [2 Marks] [2 Marks] [2 Marks] [2 Marks] [2 Marks] [2 Marks] [2 Marks] [2 Marks]

8. Distinguish between multidimensional and multi-relational OLAP.
9. Define ROLAP
10. Explain the typical OLAP operations with an example.
11. Explain in detail metadata management with a neat diagram.
12. What are the different types of distributed standard reports?

(Apr/May 2011)

[8 Marks] [8 Marks] [8 Marks] [8 Marks]

13. Describe multidimensional online analytical processing and multi-relational online analytical processing.

Categories of tools

1. List the applications that the organization uses to build a query and reporting environment for the data warehouse.
2. List the components of Forte.
3. Describe OLAP tools and Internet.
4. Explain the following:
   a) Power Builder
   b) Forte

(Nov/Dec 2009) [2 Marks] [2 Marks] [8 Marks] [16 Marks]

UNIT-III DATA MINING

Data mining functionalities

1. Define Data Mining
2. List the data mining tools.

(Apr/May 2012, Nov/Dec 2010) (Apr/May 2013, Nov/Dec 2010) (Apr/May 2011, Nov/Dec 2011)

[2 Marks] [2 Marks] [2 Marks] [2 Marks] [2 Marks] [2 Marks] [8 Marks]

3. What are the various forms of data pre-processing?
4. What is concept hierarchy?
5. State why data pre-processing is an important issue for data warehousing and data mining.
6. List the functions of data mining.
7. a) Explain the various primitives for specifying a data mining task.
   b) Describe the various descriptive statistical measures for data mining. (Apr/May 2012, Nov/Dec 2010)
8. Explain the different types of data and functionalities. (Apr/May 2013, Nov/Dec 2011) [16 Marks] [8 Marks] (Apr/May 2012)

Types of data

1. List the steps for data mining.
2. Define Transactional Database
3. Define Temporal Database

(Apr/May 2009, Apr/May 2010)

[2 Marks] [2 Marks] [2 Marks] [2 Marks] [2 Marks] [2 Marks] [2 Marks] [2 Marks] [2 Marks] [16 Marks]

4. Define Time Series and Sequence Databases
5. Define Spatial Databases
6. Differentiate heterogeneous database from legacy database.
7. Define Text Mining
8. Define Multimedia Database
9. How does the data warehouse differ from a database?
10. Write the different issues of data mining. (Apr/May 2012, Nov/Dec 2010)
11. Explain the process of knowledge discovery with a neat diagram.

(Apr/May 2011, Nov/Dec 2012) [16 Marks]

Classification of data mining systems

1. Define Patterns
2. List the advanced database systems.
3. Define Summarization
4. What is the need for discretization in data mining?
5. What is meant by knowledge representation?

(Nov/Dec 2012) (Apr/May 2011) (Nov/Dec 2011)

[2 Marks] [2 Marks] [2 Marks] [2 Marks] [2 Marks] [16 Marks] [16 Marks] [16 Marks]

6. How can a data mining system be integrated with a data warehouse?
7. Explain transactional databases with an example.
8. Differentiate cluster analysis from outlier analysis.

Data pre-processing

1. Define Genetic algorithm
2. What are the various forms of data pre-processing?
3. Mention the various tasks of data pre-processing.
4. What is meant by pattern evaluation?

(Apr/May 2010) [2 Marks] [2 Marks] [2 Marks] [2 Marks] [8 Marks] [10 Marks] [6 Marks] (Apr/May 2011, Nov/Dec 2011) [16 Marks]

5. Describe the challenges to data mining regarding performance issues.
6. Describe the interestingness of patterns.
7. a) Explain data pre-processing.
   b) Explain data mining classifications with an example.

UNIT-IV ASSOCIATION RULE AND CLASSIFICATION


Mining frequent patterns

1. What is meant by market basket analysis?
2. What is the frequent item set property?
3. What is meant by frequent pattern mining?
4. What are the apriori properties used in the apriori algorithms?
5. Explain apriori algorithm for finding frequent item sets using candidate generation.
6. A database has four transactions. Let min sup=60% and min conf=80%.

   TID    DATE       ITEMS_BOUGHT
   T100   10/15/07   {K,A,B}
   T200   10/15/07   {D,A,C,E,B}
   T300   10/19/07   {C,A,B,E}
   T400   10/22/07   {B,A,D}

   Find all frequent item sets using apriori and FP growth, respectively. Compare the efficiency of the two mining processes.

[16 Marks] [16 Marks] (Apr/May 2012) [2 Marks] [2 Marks] [2 Marks] [2 Marks]
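
For practice, the apriori part of question 6 can be checked with a short script. The following is a minimal Python sketch of level-wise (apriori-style) frequent item set mining with a join step and a prune step. It assumes min sup is read as a fraction of the four transactions; the function name apriori and the variable names are illustrative only, and FP growth is not covered.

```python
from itertools import combinations

# Transactions from question 6 (TID -> items bought).
transactions = {
    "T100": {"K", "A", "B"},
    "T200": {"D", "A", "C", "E", "B"},
    "T300": {"C", "A", "B", "E"},
    "T400": {"B", "A", "D"},
}


def apriori(baskets, min_sup):
    """Return {item set: support count} for every frequent item set.

    min_sup is a fraction of the number of baskets (assumption).
    """
    baskets = list(baskets)
    min_count = min_sup * len(baskets)  # 60% of 4 transactions, so 3 are needed

    def support(candidate):
        # Number of baskets that contain every item of the candidate.
        return sum(1 for basket in baskets if candidate <= basket)

    # Level 1 candidates: every single item that occurs anywhere.
    current = [frozenset([item]) for item in sorted(set().union(*baskets))]
    frequent = {}
    k = 1
    while current:
        # Keep only the candidates that meet the minimum support.
        level = {c: support(c) for c in current}
        level = {c: n for c, n in level.items() if n >= min_count}
        frequent.update(level)
        k += 1
        # Join step: merge frequent (k-1)-item sets into k-item candidates.
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step (apriori property): every (k-1)-subset must be frequent.
        current = [
            c for c in candidates
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        ]
    return frequent


if __name__ == "__main__":
    for itemset, count in apriori(transactions.values(), 0.6).items():
        print(sorted(itemset), count)
```

Under these assumptions only {A}, {B} and {A,B} reach the 60% threshold, each with a support count of 4; all other items occur in at most two transactions.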

Associations and Correlations


1. What are the uses of multilevel association rules?

(Nov/Dec 2010) (Apr/May 2013)

[2 Marks] [2 Marks] [2 Marks] [2 Marks] [2 Marks] [2 Marks] [16 Marks] [16 Marks]

2. What is meant by correlation analysis?
3. Write the two measures of association rule.
4. How are association rules mined from large databases?
5. What are the uses of multilevel association rules?
6. What are the means to improve the performance of association rule mining algorithm?
7. Explain the approaches for mining multilevel association rules from the transactional databases with relevant example.
8. Describe the multi-dimensional association rule with a suitable example.

Classification

1. What are the issues of classification?

[2 Marks]

2. What are the different types of classification?
3. What is meant by pruning in decision tree induction?
4. Define Conditional Pattern Base
5. List the major strengths of the decision tree method. (Apr/May 2013) (Nov/Dec 2011)

[2 Marks] [2 Marks] [2 Marks]

(Apr/May 2011, Nov/Dec 2012) [2 Marks]

6. What are the assumptions needed for the naive Bayes classifier?
7. List the major strengths of decision tree induction.
8. What is tree pruning in decision tree induction?
9. Explain the method of decision tree classification. (Apr/May 2012, Nov/Dec 2010) [16 Marks]
10. a) Explain the algorithm for constructing a decision tree from training samples. [12 Marks]
    b) Explain Bayes concept. (Apr/May 2012, Nov/Dec 2011) [4 Marks]

(Apr/May 2013, Nov/Dec 2010, Nov/Dec 2011) [16 Marks]

11. Develop an algorithm for classification using Bayesian classification.

Prediction

1. Differentiate prediction from classification.
2. What is meant by support vector machine?
3. State the advantages of the decision tree approach over other approaches for performing classification.
4. How is attribute-oriented induction implemented?

[2 Marks] [16 Marks] [2 Marks] [2 Marks]

[2 Marks] [2 Marks] [2 Marks]

UNIT-V CLUSTERING AND APPLICATION AND TRENDS IN DATA MINING


Cluster analysis

1. What are the requirements of clustering?
3. What is text mining?
4. What is cluster analysis?

(Apr/May 2010, Nov 2012)

[2 Marks]

2. What are the applications of spatial databases? (Nov/Dec 2011, Apr/May 2012) [2 Marks] (Apr/May 2011) [2 Marks] [2 Marks] [2 Marks] (Nov/Dec 2010) [2 Marks] [2 Marks] [16 Marks]

5. What are the two data structures in cluster analysis?
6. Differentiate classification from clustering.
7. List the requirements of clustering in data mining.
8. Explain the different types of data in cluster analysis.

Clustering methods

1. What is the objective function of K-means algorithm?
2. What are the different methods in partitioning?
3. Mention the advantages of hierarchical clustering.
4. a) Explain how BIRCH performs clustering in large data sets.
   b) Compare and outline the major differences of the two scalable clustering algorithms BIRCH and CLARANS. (Apr/May 2012, Apr/May 2010, Nov/Dec 2011)
5. Explain the following:
   a) DBSCAN
   b) CURE
6. Write short notes on the following:
   a) Partitioning methods
   b) Outlier analysis
7. Describe K-means clustering with an example.
8. Describe hierarchical methods.
9. Explain constraint based cluster analysis with relevant example.

[16 Marks] [16 Marks] [16 Marks] [16 Marks] [16 Marks] [16 Marks] [2 Marks] [2 Marks] [2 Marks]

Outlier analysis

1. What is an outlier?

[2 Marks]

2. What are the different types of outlier analysis?
3. Define Density Based Local Outlier Detection
4. Define Distance Based Outlier Detection
5. Define Deviation Based Outlier Detection
6. Explain the different types of outlier analysis.

(Apr/May 2009)

[2 Marks] [2 Marks] [2 Marks] [2 Marks] [16 Marks]

Data mining applications

1. Define Spatial Database
2. List any two commercial data mining tools.
3. What is web usage mining?
4. What are the requirements of clustering?
5. What are the applications of spatial databases?
6. What is audio data mining?
7. List any two applications of data mining.
8. Write a short note on web mining taxonomy.
9. Explain the different activities of text mining. (Apr/May 2011, Nov/Dec 2010)
10. Explain the current trends in data mining.
11. Describe spatial databases and text databases.
12. Explain the methods of mining multimedia database.
13. Describe any four data mining applications.

(Apr/May 2010, Nov/Dec 2011) (Apr/May 2010) (Apr/May 2013) (Nov/Dec 2010)

[2 Marks] [2 Marks] [2 Marks] [2 Marks] [2 Marks] [2 Marks] [2 Marks] [6 Marks] [8 Marks] [16 Marks] [16 Marks] [8 Marks] [16 Marks]
