Dhaanish Ahmed College of Engineering

Padappai
Department of Information Technology
Sub Code & Name : CS1004 – Data Warehousing and Mining
UNIT 1
2 mark
1.Define warehousing?
2.Distinguish between data warehouse and data mart?
3.List out the components of data warehouse?
4.Define data cube? Give an example?
5.What is fact table and dimension table?
6.Compare OLAP and OLTP?
7.What are meta data?
8.What is the need for OLAP?
9.Define star schema, snowflake schema and fact constellation?
10.What is starnet query model?
11.Write down the applications of data warehousing.
12. When is data mart appropriate?
13. What is concept hierarchy? give an example.
14.What are the uses of statistics in data mining?
15.Name some advanced database systems?
16.Name some specific application oriented databases?
17.Define Relational Database?
18.Define Transactional Database?
19.Define Spatial database?
20.What is Temporal Database?
21.What is Time series database?
22.What is legacy database?
23.What is learning?
24.Why is machine learning done?
25.Give the Components of a learning system?
26.Give some factors for evaluating performance of learning system?
27.What are the steps in data mining process?
28.Define datamart?
29.List the merits of data modeling tool?
30.What is data warehouse performance issue?
31.What are the types of performance issue?
32.Why do you need data ware house life cycle process?
33.Merits of data ware house?
34.What are the steps in data ware house life cycle process?
35.What are the Characteristics of data ware house ?
36.List some of the data ware house tools?
37.What is end user data access tool?
38.Define molap?
39.Define Holap?
40.Define Rolap?
41.What is ad hoc query tool?
42.List few of the data mining applications?
43.Define Supervised learning scheme?
44. Define Unsupervised learning scheme?
45.What is the necessity of data mining?
46.Draw the flow chart of database Evolution?
47.Define OLTP?
48.Expand OLAP?
49. Expand OLTP?
50.What is data stream?
51.What is a data tomb?
52.What is data archaeology?
53.What is data dredging?
54.What is KDD?
55.What is data warehouse server?
56.Point out few Advanced databases?
57.What is an attribute?
58.Define Tuple?
59.What is SQL?
60.Define heterogeneous database?
61.Define WWW?
62.List the applications of WWW?
63.What is web log mining?
64.Define Outlier Analysis?
65.Define Evolution Analysis?
66.Define Technical meta data?
67. Define Business meta data?
68.What is a Distributive measure?
69. What is an Algebraic measure?
70. What is a Holistic measure?
71.What is Roll up data?
72.Define Drill down operation?
73.Define slice?
74. Define Dice?
75.What is pivot?
76. Define Drill within operation?
77. Define Drill across operation?
78. Define Drill through operation?
79.What is top down view?
80. What is Data source view?
81. What is Data ware house view?
82. What is Business query view?
83.Define Data Cube?
84.What is a cube operator?
85.What is No Materialization?
86. What is Full Materialization?
87. What is Partial Materialization?
88.What are the merits of bitmap indexing?
89. What are the merits of join indexing?
90.Define MDDB?
91.What is Query driven approach?
92.What is Update driven Approach?
93.What is a Dimension?
94.Define Fact?
95.Define Cuboid?
96.Define Aggregation?
97.What is a statistical database?
98.What is CRM?
99.What is the use of Load ?
100.Define Refresh?
4 mark
1.What is the difference between view and materialized view?
2. Explain the Difference between star and snowflake schema?
3. Mention the various tasks to be accomplished as part of data pre-processing.
4. Mention the advantages of Hierarchical clustering?
5.What is the difference between view and materialized view?
6. Explain the Difference between star and snowflake schema?
7. What is Data Warehouse Metadata?
8. What is Dimensionality Reduction?
9. What is Concept Description?
10.Differentiate between Supervised and Unsupervised learning schemes?
11.Discuss Join Indexing?
12. Discuss Bitmap Indexing?
13. Explain the steps in knowledge discovery?
14.Give short notes on database Evolution?
15. Give short notes on dataware house?
16.Explain data stream?
17.Describe KDD?
18.Explain Relational database?
19.Give the importance of ER model?
20.What are the functions of relational database?
21.How can a customer be analyzed by a data mining system?
22.Differentiate data ware house and data mart?
23. Differentiate data ware house vs Heterogeneous DBMS?
24. Differentiate data ware house vs Operational DBMS?
25.Discuss Star schema?
26.Discuss Snowflake Schema?
27.Discuss Fact Constellation?
28.Discuss a Starnet query model?
29.Describe top down view?
30. Describe Data source view?
31. Describe Data ware house view?
32. Describe Business query view?
33.Explain Enterprise ware house?
34.Explain Data mart?
36.Describe Virtual ware house?
37.Explain ROLAP?
38. Explain MOLAP?
39. Explain HOLAP?
40. Explain Specialized SQL server?
41.Discuss efficient processing of OLAP queries?
42.Compare OLAM and OLAP?
43.Explain the Curse of Dimensionality?
44.Define Full and Ice berg cube?
45. Define closed and shell cube?
46.Explain Knowledge mining from data?
47.Explain Knowledge Extraction?
48.Describe data analysis?
49.Explain Data archaeology?
50.Discuss data dredging?
51.Describe Database?
52.Explain Information Repository?
53.Explain Knowledge base?
54.Discuss Data mining Engine?
55.Describe Pattern Evaluation Module?
56.Explain User Interface?
57.Discuss data discrimination?
58.Explain Mining different kinds of knowledge in DB?
59.Discuss Interactive mining of Knowledge at multiple levels of abstraction?
60.Describe Incorporation of background knowledge?
61. Discuss data mining Query language?
62.Explain ad hoc Data mining?
63. Discuss Presentation and visualization of data mining results?
64. Describe Handling Incomplete data?
65. Explain Scalability of data mining algorithms?
66. Describe Efficiency of data mining?
67.Explain Parallel mining algorithm?
68. Explain Distributed mining algorithm?
69. Explain Incremental mining algorithm?
70.Discuss Spreadsheets?
71.Describe Dimension table?
72.Explain Fact table?
73.Give the cube definition statement?
74.Interpret the measures for data cube?
75.Discuss Schema Hierarchy?
8 mark
1.What is over fitting and what can you do to prevent it?
2. In classification trees, what are surrogate splits, and how are they used?
3. What is the objective function of the K-Means algorithm?
4.What are the differences between the three main types of data usage: information processing,
analytical processing and data mining?
5.Discuss the motivation behind OLAP mining.
6. Discuss the various types of metadata?
7. Categorize OLAP tools?
8.Explain various data mining issues?
9.Describe Indexing technique of OLAP with example?
10. Give short notes on database Evolution with a neat flow chart?
11.Give the importance of data mining?
12.Give the functions of OLAP?
13. What are the major components of data mining?
14.Give the importance of pattern evaluation model?
15.How decision making is performed in data warehouse?
16.Is data warehouse suited for OLAP, Explain in Brief?
17.Explain object relational database in detail?
18.What is raster format? Explain its use with an example?
19.Describe the role of DBMS in data mining?
20.Explain Information Delivery System?
21.Discuss Access Tools?
22.Explain Conceptual modeling of data ware house?
23.Explain Three categories of measures?
24.Explain Business Analysis Framework?
25.Discuss 4 views regarding the design of a data ware house?
25.Describe data ware house design process?
26.Explain 3 data ware house Models?
27.Describe Efficient Computation of Data cube?
28.Discuss data ware house Back end tools and utilities?
29.Explain Data ware house applications?
30.Describe the architecture of OLAM?
31.Discuss in detail about the Lattice of Cuboid?
32.How many cuboids are there in n dimensional data cube?
33. Describe Full cube?
34. Describe closed cube?
35. Describe Ice berg cube?
36. Describe shell cube?
37.Explain Significance Constraint?
38. Explain Probe Constraint?
39. Explain Gradient Constraint?
40.Discuss on Information System?
41.Describe Temporal Database?
42. Describe Time series Database?
43. Describe Sequence Database?
44. Describe Spatial Database?
45. Describe SpatioTemporal Database?
46. Describe Text Database?
47. Describe Multimedia Database?
48.Give short notes on Heterogeneous DBMS?
49. Give short notes on Legacy Database?
50.Explain Classification of Data mining Systems?
16 mark
1.Enumerate the building blocks of a data warehouse. Explain the
importance of metadata in a data warehouse environment. What are the
challenges in metadata management?
2. Distinguish between the entity-relationship modeling technique
and dimensional modeling. Why is the entity-relational modeling
technique not suitable for the data warehouse?
3. Create a star schema diagram that will enable FIT-WORLD GYM
INC. to analyze their revenue. The fact table will include – for every
instance of revenue taken – attribute(s) useful for analyzing
revenue. The star schema will include all dimensions that can be
useful for analyzing revenue. Formulate query: “Find the
percentage of revenue generated by members in the last year”.
How many cuboids are there in the complete data cube?
4.Briefly compare the following concepts. Explain your points with an example
(i) Snowflake schema, fact constellation, star net query model
5. Discuss the typical OLAP operations with an example.
6.Discuss how computations can be performed efficiently on data cubes.
(ii) Write short notes on data warehouse meta data.
7. Describe the multidimensional data model. How is it used in data warehousing?
8. Explain the architecture of data warehouse with a neat sketch?
9. Explain the operations performed on data warehouse with examples?
10. Distinguish between data mining and data warehousing?
11. Discuss various data mining issues with some examples?
12.Explain Data mining Functionalities?
13.Explain different types of data repositories on which mining can be performed?
14.What are the major components of data mining? Explain with a neat Flowchart?
15.Explain SQL in detail?
16.Discuss in detail OLAP server Architectures?
17.Explain data ware house Implementation?
18.Describe Selected Computation of Cuboids?
19.Explain Efficient methods for data cube computation?
20.Explain Optimization technique?
21.Discuss Multiway array aggregation for full cube computation?
22.Explain BUC Algorithm?
23.Discuss Star cubing?
24.Write an algorithm for Shell Fragment computation?
25.Discuss Constrained Gradient Analysis Data cube?
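A hedged illustration for the cuboid-counting questions in this unit: with n dimensions and no concept hierarchies, each dimension is either kept or aggregated away, which gives 2^n cuboids. When dimension i has L_i hierarchy levels (not counting the virtual top level "all"), the usual count is the product of (L_i + 1). The sketch below uses hypothetical dimension names and level counts chosen only for illustration; it is not tied to any schema in this question bank.

```python
from itertools import combinations
from math import prod


def enumerate_cuboids(dimensions):
    """Return every group-by subset (cuboid) of the given dimensions."""
    return [subset
            for k in range(len(dimensions) + 1)
            for subset in combinations(dimensions, k)]


def count_cuboids(levels_per_dimension):
    """Count cuboids when dimension i has L_i levels (excluding the level 'all')."""
    return prod(levels + 1 for levels in levels_per_dimension)


# Hypothetical dimensions, used only as an example.
dims = ["time", "member", "branch"]
cuboids = enumerate_cuboids(dims)

print(len(cuboids))              # 2**3 = 8 cuboids, from the apex () to the base cuboid
print(count_cuboids([1, 1, 1]))  # one level per dimension gives the same 2**3 = 8
print(count_cuboids([4, 3, 2]))  # hypothetical hierarchies: 5 * 4 * 3 = 60
```

The empty tuple in the enumeration is the apex cuboid (everything aggregated), and the full tuple is the base cuboid.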

UNIT 2
2 mark
1.What is the need of data preprocessing?
2.Define smoothing and Binning?
3.What is the need for discretization in data mining?
4.What is concept hierarchy? Give an example?
5.Define DMQL?
6.What are functional components of GUI in data mining?
7.Define task relevant data?
8.What is meant by concept description?
9.What is data generalization?
10.How to perform class comparison?
11. Define Data Mining?
12.What is the main goal of statistics?
13.What are the factors to be considered while selecting samples in statistics?
14.Define data cleaning?
15.Define Data integration?
16.Define Data Selection?
17. Define Data Transformation?
18. What is pattern evaluation?
19.What is Knowledge Presentation?
20.List the steps in preprocessing?
21.What is visualization?
22.Name some conventional visualization techniques?
23.Give the features included in modern visualization techniques?
24.Define Conventional visualization ?
25.Define Spatial visualization ?
26.Define Descriptive Data mining?
27.What is Predictive data mining?
28.What is data generalization?
29.Define attribute oriented induction?
30.What is Jack knife?
31.What is Bootstrap?
32.Give the views of Statistical approach?
33.What are the assumptions of Statistical approach?
34.What is the use of Probabilistic graphical model?
35.Give the Importance of Probabilistic graphical model?
36.Define Deterministic model?
37.Define System?
38.Define Model?
39.How to choose the best model?
40.Principles of Qualitative Formulation?
41.What is linear regression?
42.State the types of linear model?
43.What is the use of linear model?
44.What are the goals of time series analysis?
45.What is smoothing?
46.What is lag?
47.What do you mean by concept hierarchy?
48.Define inconsistency cleaning?
49.What is Column level cleaning?
50.Define Descriptive data summarization?
51.What is a missing value?
52.Define Normalization?
53.What is attribute subset selection?
54.Define Dimensionality reduction?
55. Define Numerosity reduction?
56.What is a Central Tendency?
57.Define mean?
58.Define Mode?
59.Define Median?
60.What is a mid range?
61.What is Dispersion of data?
62.Define IQR?
63.What is variance?
64.Define range?
65. List the data transformation operations?
66.Define Quartiles?
67.What is weighted arithmetic mean?
68.What is Unimode?
69. What is Bimode?
70. What is Trimode?
71.Define Multimode?
72.Give the empirical relation for unimodal frequency?
73.What is Dispersion?
74.Define Standard Deviation?
75.What is 5 number summary?
76.What is a boxplot?
77.Define first Quartile?
78.Define Third Quartile?
79.What are Whiskers?
80.Give the formula for standard deviation?
81. Give the formula for variance?
82.What is discrepancy detection?
83.Define Unique rule?
84.Define Consecutive rule?
85.Define Null rule?
86.What is a data scrubbing tool?
87.What is data auditing tool?
88.Define data migration tool?
89.What is an ETL?
90.Define Redundancy?
91.What is correlation analysis?
92.Define Correlation coefficient?
93.Define attribute construction?
94.Define Discrete wavelet Transform?
95.Define Sampling?
96.What is comparison?
97.What is Discrimination?
98.Define attribute removal?
99.What is data focusing?
100.What is attribute generalization control?
4 mark
1.Distinguish between concept description and OLAP?
2.What is quantitative rule?
3.What is attribute relevance analysis?
4.What do you mean by attribute oriented induction?
5.List out the methods for implementing class comparison?
6. Write a short note on regression?
7. Write a short note on correlation?
8.Discuss Parametric methods?
9.Explain Non Parametric methods in detail?
10.Explain Data Generalization?
11.Describe Concept Hierarchy generation?
12.Explain Data mining Primitives?
13.Explain attribute oriented induction?
14.Discuss on Descriptive data summarization?
15.Explain Histogram?
16.Discuss Quantile plot?
17.Describe Q-Q plot?
18.Explain Scatter plot?
19.Discuss Loess curve?
20.Describe Missing values?
21.Explain Noisy data?
22.Describe Binning?
23.Explain Regression?
24.Discuss Clustering?
25.Describe the Mean,median,mode,mid range?
26.Explain IQR ,variance, quartiles?
27.Discuss Discrepancy detection?
28.Explain data scrubbing tools?
29.Discuss data Auditing tools?
30.Explain data migration tools?
31.Discuss Entity identification problem?
32.Explain Correlation analysis?
33.Explain smoothing?
34.Describe Aggregation?
35.Discuss Generalization?
36.Explain Normalization?
37.Describe Attribute Construction?
38.Discuss Min max Normalization?
39.Explain z-score Normalization?
40.Discuss Normalization by decimal scaling?
41.Describe Data cube aggregation?
42.Explain attribute subset selection?
43.Discuss on Dimensionality reduction?
44.Explain Numerosity reduction?
45.Explain discretization?
46.Explain concept hierarchy generation?
47.Describe stepwise forward selection?
48. Describe stepwise backward Elimination?
49. Describe the combination of forward selection and backward elimination?
50.Discuss on decision tree induction?
51.Explain DFT?
52.Explain Hierarchical pyramid algorithm?
53.Describe Orthonormal?
54.Give short notes on PCA?
55.Discuss Log Linear Models?
56.Describe Equal Width histogram?
57. Describe Equal Frequency histogram?
58.What is V-Optimal?
59.Describe MaxDiff Histogram?
60.What are the 3 data clusters?
61. Describe Multidimensional histogram?
62.Define Centroid distance?
63.Describe Multidimensional index trees?
64.Explain SRSWOR?
65.Describe SRSWR?
66.Explain Cluster sample?
67.Discuss Stratified Sample?
68.List the merits of Sampling?
69.Discuss Top down Discretization?
70.Explain Splitting?
71. Discuss bottom up Discretization?
72.Discuss Merging?
73.Draw a flow chart for stepwise forward selection?
74.Draw a flow chart for stepwise backward Elimination?
75.Draw a flow chart for the combination of forward selection and backward elimination?
8 mark
1.Mention the various tasks to be accomplished as part of data pre-processing.
2. What is over fitting and what can you do to prevent it?
3.Explain the 5 steps in the Knowledge Discovery in Databases (KDD)
process.
4.Discuss in brief the characterization of data mining algorithms.
5.Discuss in brief important implementation issues in data mining.
6. List and discuss the various data mining primitives?
7. Distinguish between statistical inference and exploratory data analysis?
8. Write a short note on machine learning. What is supervised and unsupervised learning?
9. Write a short note on regression and correlation?
10. Discuss on Descriptive data summarization with examples?
11.Explain Graphic display of basic descriptive data summaries?
12.Explain data cleaning?
13.Describe data cleaning as a process?
14.Explain the measures of Central tendency?
15.Describe the measures of Dispersion of data?
16.Explain about data integration?
17. Describe Attribute Construction with example?
18.Discuss Min max Normalization with example?
19.Explain z-score Normalization with example?
20.Discuss Normalization by decimal scaling with example?
21.Discuss about data transformation?
22.Explain About data reduction?
23.Discuss the basic heuristic methods of attribute subset selection?
24.Explain Wavelet Transforms?
25.Explain Principal Component Analysis?
26. Explain Histogram with examples?
27.Discuss Quantile plot with examples?
28.Describe Q-Q plot with examples?
29.Explain Scatter plot with examples?
30.Discuss Loess curve with examples?
31.Describe Missing values with examples?
32.Explain Noisy data with examples?
33.Describe Binning with examples?
34.Explain Regression with examples?
35.Discuss Clustering with examples?
36.Apply binning method for data smoothing for 4,8,15,21,21,24,25,28,34?
37.Discuss that data integration is the detection and resolution of data value conflicts?
38.How can we find the good subset of original attributes?
39.Justify Wavelet Transforms can be applied to multidimensional data?
40.How can we reduce the data volume by choosing alternative smaller forms of data
representation?
41.How are the buckets determined and the attribute values partitioned in Histogram?
42.Explain Discretization by Intuitive partitioning?
43.Explain χ² (chi-square) merging?
44.Explain 3-4-5 rule with an example?
45.Describe the specification of a partial ordering of attributes explicitly at schema level by users
or experts?
46.Explain the specification of a portion of a hierarchy by explicit data grouping?
47.Discuss the specification of a set of attributes, but not of their partial
ordering?
48.Discuss issues to consider during data integration?
49.Data quality can be assessed in terms of accuracy, completeness and consistency. Propose
two other dimensions of data quality?
50.Suppose a group of 12 sales price records has been sorted as follows
5,10,11,13,15,35,50,55,72,92,204,215
Partition them into 3 bins by
i)Equidepth partition
ii)Equal width partitioning
iii)Clustering
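The partitioning in question 50 can be sketched as follows. This is a minimal illustration, assuming equal-depth bins hold the same number of values and equal-width bins split the range [min, max] into intervals of identical width; the function names are hypothetical. Partitioning by clustering (part iii) depends on the clustering method chosen and is not shown.

```python
def equal_depth_bins(sorted_values, n_bins):
    """Split sorted values into n_bins bins holding the same number of values."""
    size = len(sorted_values) // n_bins  # assumes the length divides evenly
    return [sorted_values[i * size:(i + 1) * size] for i in range(n_bins)]


def equal_width_bins(sorted_values, n_bins):
    """Split the range [min, max] into n_bins intervals of equal width."""
    lo, hi = sorted_values[0], sorted_values[-1]
    width = (hi - lo) / n_bins
    bins = [[] for _ in range(n_bins)]
    for v in sorted_values:
        # The maximum value falls on the upper edge, so clamp it into the last bin.
        index = min(int((v - lo) / width), n_bins - 1)
        bins[index].append(v)
    return bins


def smooth_by_bin_means(bins):
    """Replace every value in a bin by the bin mean (empty bins stay empty)."""
    return [[round(sum(b) / len(b), 2)] * len(b) if b else [] for b in bins]


prices = [5, 10, 11, 13, 15, 35, 50, 55, 72, 92, 204, 215]

depth = equal_depth_bins(prices, 3)
width = equal_width_bins(prices, 3)

print(depth)   # [[5, 10, 11, 13], [15, 35, 50, 55], [72, 92, 204, 215]]
print(width)   # [[5, 10, 11, 13, 15, 35, 50, 55, 72], [92], [204, 215]]
print(smooth_by_bin_means(depth))
```

With a range of 210 and three bins, each equal-width interval spans 70 units, which is why most of the values land in the first bin.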

16 mark
1.Explain the need and steps involved in data preprocessing?
2.List out the primitives for specifying a data mining task?
3.Describe how concept hierarchies are useful in data mining?
4.What are the various issues addressed during data integration?
5.Write in detail about attribute oriented induction with algorithm?
6.Describe the various descriptive statistical measures for data mining?
7.Explain various methods of data cleaning in detail?
8. Give an account on data mining Query language?
9. How is Attribute-Oriented Induction implemented? Explain in detail.
10. Write and explain the algorithm for mining frequent item sets without candidate generation.
Give a relevant example.
11. With relevant examples discuss the role of statistics in data mining?
12. Enumerate and discuss various statistical techniques and methods for
data analysis?
13. For class characterization, what are the main differences between a data cube based
implementation and a relational implementation such as attribute-oriented induction?
14.Explain Smoothing Techniques?
15.Explain Data Transformation in detail?
16.Explain Normalization in Detail?
17.Discuss Data Reduction in detail?
18.Describe Parametric and Non Parametric methods in detail?
19. Explain Data Generalization and Concept Hierarchy generation?
20.Describe the Alternative method for Data generalization and Concept Description?
21.Given the one-dimensional data set X={-5,0,23.0,17.6,9.23,1.11}, normalize the data set using
i)Min-Max Normalization[0,1]
ii) Min-Max Normalization[-1,1]
iii)Standard Deviation Normalization
22.Explain Designing the GUI based On DMQL?
23.A data set for analysis includes X={7,12,5,18,9,13,12,19,7,12,12,13,3,4,5,13,8,7,6} Find
Mean, median, mode and Standard Deviation for X?
24.Give the Graphical summarization of the data set X using boxplot representation. Find
Outliers in X?
X={7,12,5,18,9,13,12,19,7,12,12,13,3,4,5,13,8,7,6}
25.Explain Entropy based Discretization?
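For the data set X in questions 23 and 24, the descriptive measures can be checked with Python's standard `statistics` module. This is a sketch under stated assumptions: the quartiles use the module's default "exclusive" method, and outliers are taken as values more than 1.5 × IQR beyond the quartiles. Other quartile conventions found in textbooks may give slightly different fences.

```python
import statistics

X = [7, 12, 5, 18, 9, 13, 12, 19, 7, 12, 12, 13, 3, 4, 5, 13, 8, 7, 6]

mean = statistics.mean(X)          # 185 / 19
median = statistics.median(X)      # middle (10th) of the 19 sorted values: 9
mode = statistics.mode(X)          # most frequent value: 12
pop_std = statistics.pstdev(X)     # population standard deviation (divide by n)
sample_std = statistics.stdev(X)   # sample standard deviation (divide by n - 1)

# Quartiles for the boxplot; 'exclusive' is the library default.
q1, q2, q3 = statistics.quantiles(X, n=4)
iqr = q3 - q1

# Assumed outlier rule: more than 1.5 * IQR outside the quartiles.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = [x for x in X if x < lower_fence or x > upper_fence]

print(mean, median, mode, pop_std, sample_std)
print(min(X), q1, q2, q3, max(X))  # five-number summary
print(outliers)
```

Under these assumptions no value of X lies outside the fences, so the whiskers of the boxplot extend to the minimum and maximum of the data.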
UNIT 3
2 mark
1.What is market basket analysis?
2.Define frequent itemset?
3.Define Association rule?
4.Define FP-growth?
5.Write the use of conditional pattern base in FP-tree?
6.What is the use of Multi-level association rule?
7.List the techniques for improving the efficiency of Apriori?
8.Define Support and Confidence?
9.What is level cross filtering?
10.How to determine redundant association rule?
11.Give the General properties of Boolean networks?
12.What is support?
13.Define Confidence?
14.How are Association rule mined from large database?
15.List the merits of Dimensional modeling?
16.What comprises of a dimensional model?
17.What is Bottleneck Detection?
18.What is Back room metadata?
19.What is Front room metadata?
20.What is Active metadata?
21.What is meta data catalogue?
22.What is association mining?
23.Give examples for association mining?
24.Define Market Basket analysis?
25.Give the applications of association rule?
26. Define Minimum Confidence threshold ?
27.Define Minimum support Threshold?
28.What are the two step process?
29.Define Single level association rule?
30. Define Multi level association rule?
31. Define Single Dimensional association rule?
32.Define Multi Dimensional association rule?
33. Define Boolean association rule?
34.Define Quantitative association rule?
35.Define Frequent Itemset Mining?
36.Define Sequential pattern mining?
37. Define Structured pattern mining?
38.What is Apriori Principle?
39.Define Join step?
40.What is Prune step?
41.Why is Progressive Refinement used?
42.What is Superset coverage property?
43.Define two or multi step mining?
44.What is frequency?
45.Define Support count?
46.What is an itemset?
47.Define absolute support?
48.What is closed frequent itemset?
49.What is Maximal frequent itemset?
50.Define Antimonotone?
51.Define Conditional Database?
52.What is Horizontal data format?
53.State Item Merging?
54.What is Sub item pruning?
55. State Item skipping?
56.Define Uniform Support?
57.What is an Ancestor?
58.Define Intra Dimensional Association rule?
59.Define Inter Dimensional Association rule?
60.Define Hybrid Dimensional Association rule?
61.What are Categorical Attributes?
62. What are Nominal Attributes?
63. What are Quantitative Attributes?
64.What is Dynamic Quantitative Attributes?
65.Define Predicate set?
66.Define null transaction?
67.Define null invariant?
68.What is a knowledge type constraints?
69.Define data constraints?
70.State level constraints?
71.Define Dimension constraints?
72.What is an Interestingness constraints?
73.Define Rule constraints?
74.List the classifications of Rule constraints?
75.Define Antimonotonic?
76.What is monotonic?
77.Define succinct?
78.Define Convertible?
79. Define InConvertible?
80.Define ECLAT?
4 mark
1.In classification trees, what are surrogate splits, and how are they used?
2. What is the objective function of the K-Means algorithm?
3.List two interesting measures for association rules.
4. What are Iceberg queries?
5.Explain Market Basket analysis?
6.Describe the basic concepts of association rule?
7.Define Minimum Confidence threshold and Minimum support Threshold?
8.Discuss association rule mining can be viewed as two-step process?
9.Explain the levels of abstraction involved in the rule set?
10.Discuss the number of data dimensions involved in the rule?
11.Describe the types of values handled in the rule?
12.Explain the kinds of patterns to be mined?
13.Explain the various extensions to association mining?
14.Explain constraint-based association mining?
15. Discuss Apriori algorithm?
16.Describe the Apriori property as a two step process?
17.Explain the Association rule based on conditional probability ?
18.Explain Hash based itemset counting?
19.Describe Transaction reduction?
20.Explain Partitioning?
21.Discuss Sampling?
22.Describe Dynamic Itemset Counting?
23.What are the Bottle neck of Apriori algorithm?
24.What are the benefits of FP tree structure?
25.What are the major steps to mine FP tree?
26.What is the principle of Frequent pattern growth?
27.Why is Frequent pattern growth fast?
28.Discuss Ice berg Queries?
29.Explain Progressive Deepening?
30.Discuss Progressive Refinement of data mining quality?
31.Explain Uniform Support?
32. Describe Reduced Support?
33.Explain Level by Level Independent?
34.Describe Level cross filtering by k-itemset?
35.Discuss Level cross filtering by single item?
36. Discuss Controlled Level cross filtering by single item?
37. Compare closed frequent itemset with that of Maximal frequent itemset?
38.How is the Apriori property used in the algorithm?
39.How can we mine closed frequent itemset?
40.Compare Intra Dimensional Association rule and
Inter Dimensional Association rule?
41.Discuss Correlation analysis using lift?
42.Compare null transaction and null invariant?
43.How are meta rules useful?
44.Specify the rule constraint types?
45. How can we use rule constraint to prune the search space?
46.Explain Antimonotonic constraint?
47.Discuss monotonic constraint?
48.Explain succinct constraint?
49.Describe Convertible constraint?
50.Explain InConvertible constraint?
8 mark
1.Find all the association rules that involve only B, C.H (in either left
or right hand side of the rule). The minimum confidence is 70%?
2. Discuss the approaches for mining multi level association rules from the transactional
databases. Give relevant example.
3.Explain the classification of association rule Mining?
4. With an algorithm explain constraint-based association mining?
5. Discuss Apriori algorithm with suitable example?
6.Design the Apriori Algorithm?
7.Illustrate the Working of Apriori Algorithm ?
8.Generate Association rule from Frequent Itemsets?
9.Suppose the data contains the frequent itemset l={I1,I2,I5}. What are the association rules
that can be generated from l?
10. Given a Sample Transactional database X:
X: TID Items
T01 A,B,C,D
T02 A,C,D,F
T03 C,D,E,G,A
T04 A,D,F,B
T05 B,C,G
T06 D,F,G
T07 A,B,G
T08 C,D,F,G
Using Threshold values support =25% and Confidence = 60%.Find
All large item sets in database X?
11. Given a Sample Transactional database X:
X: TID Items
T01 A,B,C,D
T02 A,C,D,F
T03 C,D,E,G,A
T04 A,D,F,B
T05 B,C,G
T06 D,F,G
T07 A,B,G
T08 C,D,F,G
Using Threshold values support =25% and Confidence = 60%.Find
Strong association rules for database X?
12. Given a Sample Transactional database X:
X: TID Items
T01 A,B,C,D
T02 A,C,D,F
T03 C,D,E,G,A
T04 A,D,F,B
T05 B,C,G
T06 D,F,G
T07 A,B,G
T08 C,D,F,G
Using Threshold values support =25% and Confidence = 60%.
Analyze misleading association for the rule set obtained in b?
13.Explain the method Frequent pattern growth?
14.Describe the procedure for creating a conditional pattern base?
15. What are the approaches to mining multi level association rules?
16.Discuss the Strategies for mining multi level association rules using reduced support?
17.Explain multi level association rules: Redundancy Filtering?
18.Design a method that mines the complete set of frequent itemset without candidate
generation?
19.Discuss Constraint based Association mining?
20.Explain meta rules Guided mining of Association rule?
21.Explain how rule constraints can be used in mining process?
22.Prove that all nonempty subsets of a frequent itemset must also be frequent?
23.Prove that the support of any nonempty subset s' of itemset s must be at least as great as the
support of s?
24.Give an example to show that items in a strong association rule may actually be negatively
correlated?
25.Discuss effective methods that can be used to reduce the number of rules generated while still
preserving most of the interesting rule?
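The support, confidence and lift measures used in the questions on database X can be sketched as below. This is only an illustration: it assumes the first row reads A,B,C,D, picks the rule A → D arbitrarily as an example, and uses hypothetical helper names. A lift close to 1 suggests the two sides of a rule are nearly independent, which is one way a rule that passes the support and confidence thresholds can still be misleading.

```python
# Database X from the questions above, one set of items per transaction.
transactions = {
    "T01": {"A", "B", "C", "D"},
    "T02": {"A", "C", "D", "F"},
    "T03": {"C", "D", "E", "G", "A"},
    "T04": {"A", "D", "F", "B"},
    "T05": {"B", "C", "G"},
    "T06": {"D", "F", "G"},
    "T07": {"A", "B", "G"},
    "T08": {"C", "D", "F", "G"},
}


def support(itemset):
    """Fraction of transactions that contain every item in itemset."""
    hits = sum(1 for items in transactions.values() if itemset <= items)
    return hits / len(transactions)


def confidence(antecedent, consequent):
    """Estimate P(consequent | antecedent) from the transactions."""
    return support(antecedent | consequent) / support(antecedent)


def lift(antecedent, consequent):
    """Confidence divided by the support of the consequent."""
    return confidence(antecedent, consequent) / support(consequent)


rule_support = support({"A", "D"})          # 4 of 8 transactions
rule_confidence = confidence({"A"}, {"D"})  # 4 of the 5 transactions containing A
rule_lift = lift({"A"}, {"D"})

print(rule_support, rule_confidence, rule_lift)
```

Here A → D meets the 25% support and 60% confidence thresholds, yet its lift is only slightly above 1 because D already occurs in 6 of the 8 transactions.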
16 mark
1.Write the algorithm to discover frequent itemsets without candidate generation and explain it
with an example?
2.Discuss Apriori algorithm with suitable example and explain how its efficiency can be
improved?
3.Discuss mining of multi-level association rules from transactional databases?
4. Explain with an algorithm, how to mine single dimensional Boolean Association Rules from
transactional database. Give relevant example?
5. Describe the multi-dimensional association rule, giving a suitable example?
6. With an algorithm explain constraint-based association mining. Give relevant example
7.There are 9 transactions in this database |D|=9 and a minimum support count is taken as 2.Use
Apriori Algorithm for finding frequent itemsets in D?

TID ITEMS
T100 I1,I2,I5
T200 I2,I4
T300 I2,I3
T400 I1,I2,I4
T500 I1,I3
T600 I2,I3
T700 I1,I3
T800 I1,I2,I3,I5
T900 I1,I2,I3

8.There are 4 transactions in this database |D|=4 and a minimum support count is taken as 2.Use
Apriori Algorithm for finding frequent itemsets in D?
TID ITEMS
100 1,3,4
200 2,3,5
300 1,2,3,5
400 2,5

9.Given a Sample Transactional database X:


X: TID Items
T01 A,B,C,D
T02 A,C,D,F
T03 C,D,E,G,A
T04 A,D,F,B
T05 B,C,G
T06 D,F,G
T07 A,B,G
T08 C,D,F,G
Using Threshold values support =25% and Confidence = 60%.Find
a)All large item sets in database X?
b)Strong association rules for database X?
c)Analyze misleading association for the rule set obtained in b?
10.Find the frequent itemsets and then the association rules with at least 50% support and 75%
confidence, starting by finding the candidate item sets C1?
11.A Database has 5 Transactions. Let min_sup=60% and min_conf = 80%
TID Items Bought
T100 {M,O,N,K,E,Y}
T200 {D,O,N,K,E,Y}
T300 {M,A,K,E}
T400 {M,V,C,K,Y}
T500 {C,O,O,K,L,E}
Find all frequent item sets using Apriori method?
12.Explain finding frequent itemsets without candidate generation?
13. A Database has 5 Transactions. Find all frequent item sets using FP Growth. Assume
support threshold as 3
TID Items Bought
T100 {M,O,N,K,E,Y}
T200 {D,O,N,K,E,Y}
T300 {M,A,K,E}
T400 {M,V,C,K,Y}
T500 {C,O,O,K,L,E}
14.Consider an example with following set of transactions. Find the Frequent itemsets and then
the association rules with at least 30% support and 60% Confidence using Apriori?
TID Items Bought
10 B,M,T,Y
20 B,M
30 A,T,S,P
40 A,B,C,D
50 A,B
60 T,Y,E,M
70 A,B,M
80 B,C,D,T,P
90 D,T,S
100 A,B,M

15. Consider an example with following set of transactions.


TID Items Bought
1 Bag,Uniform,Crayons
2 Books, Bag, Uniform
3 Bag,Uniform,Pencil
4 Bag,Pencil,Books
5 Uniform,Crayons,Bag
6 Bag,Pencil,Books
7 Crayons,Uniform,Bag
8 Books,Crayons,Bag
9 Uniform,Crayons,Pencil
10 Pencil,Uniform,Books
Assume min_support is 3. Use Apriori Algorithm for finding frequent item sets?
16. A Database has 5 Transactions. Let min_sup=60% and min_conf = 85%
TID Items Bought
T100 {B,C,E,J}
T200 {B,C,J}
T300 {B,M,Y}
T400 {B,J,M}
T500 {C,J,M}
Find frequent item sets using FP Growth?
17. A database has 6 transactions. Let min_sup = 30% and min_conf = 60%.
TID    Items Bought
T100   {a,b,d}
T200   {a,c,e}
T300   {b,d}
T400   {a,b,e}
T500   {c,d,e}
T600   {a,b,e}
Find frequent item sets using FP Growth?
18. Explain constraint-based association mining with an example?
19. Elaborate on metarule-guided mining of association rules?
20. How can we use rule constraints to prune the search space? What kinds of rule constraints can be
'pushed' deep into the mining process?
UNIT 4
2 mark
1.Define Classification?
2.Define decision tree induction?
3.Define Bayes theorem?
4.What are the classification rules?
5.What is linear regression?
6. Why is naïve Bayesian classification called 'naïve'?
7.Define Bagging and Boosting?
8.What is classifier accuracy?
9.What is clustering?
10.What is density based clustering?
11.State the requirements of clustering?
12.Name the categories of Clustering?
13.Name 2 common approaches to tree pruning?
14.Where are decision trees mainly used?
15.What is decision tree pruning?
16.Explain ID3?
17.Name the 2 forms of data analysis?
18.Define Prediction?
19.Define Accuracy?
20.Define Robustness?
21.Explain speed?
22.Define Scalability?
23.Define Interpretability?
24.Define Attribute selection measure?
25.List the Attribute selection measures?
26.Define Information gain?
27.Define Gain ratio?
28.Define Tree Pruning?
29. Define Pre Pruning?
30. Define Post Pruning?
31.What is SLIQ?
32.Define Posteriori Probability?
33.Define Prior Probability?
34.Give an example for Posteriori Probability?
35.Give an example for Prior Probability?
36.Define Probability Estimation?
37. State the merits of Bayesian classifier?
38. State the demerits of Bayesian classifier?
39.Define Directed Acyclic graph?
40. Define Conditional Probability tables?
41.Define neural network?
42.Draw the structure of a neuron?
43.List the Disadvantages of neural network?
44.List the advantages of neural network?
45.What is an input unit?
46. What are neurodes?
47.What is an output unit?
48.Define Feed forward network?
49.Define network topology?
50.Define backpropagation?
51.Define Sensitivity analysis?
52.Define ARCS?
53.Define CAEP?
54.Define Associative classification?
55.What is K-nearest neighbor algorithm?
56.Define CBR?
57.What is Genetic algorithm?
58.Define Cross over?
59.Define Mutation?
60.What is Fitness?
61.Define Algorithm?
62.Write the overview of Fuzzy set approach?
63.Define Fuzzy logic?
64.Define classifier accuracy?
65.What is K-fold cross validation?
66. Define Bootstrapping?
67. What are the methods for increasing classifier accuracy?
68.What are the measures of classifier accuracy?
69.Define Bagging?
70.Define Boosting?
71.What is the measure of goodness of split?
72.What is learning by observation?
73. What is learning by examples?
74.Mention any 2 Qualities of good clustering?
75.State any 2 properties of clustering algorithm?
76.List few applications of Cluster analysis?
77.What is high intra-class similarity?
78.What is low inter- class similarity?
79.Define similarity/Dissimilarity metric?
80.Define Data matrix?
81.What is Object by variable structure?
82.Define Dissimilarity matrix?
83. What is Object by object structure?
84.What is interval scaled variables?
85.Determine Minkowski distance?
86.Estimate Manhattan distance?
87.Write the formula of Euclidean Distance?
88.What is mean absolute deviation?
89.Define standardized measure?
90.Define Binary variable?
91.What is Symmetric binary dissimilarity ?
92. What is Asymmetric binary dissimilarity ?
93.Define Jaccard coefficient?
94.Define Categorical variable?
95.Define Ratio scaled variable?
96.What are vector objects?
97.What is discordancy test?
98.What is SOM?
99.Define Smoothing factor?
100.Define Cardinality function?
4 mark
1.Compare agglomerative and divisive hierarchical clustering?
2.Define CLARANS?
3.What is outlier? Name the methods for detecting outliers?
4.Define BIRCH,ROCK and CURE?
5.Explain the use of Bayesian network?
6.Describe Linear regression?
7. Discuss multiple regression?
8. Discuss Model Construction?
9.Describe Model Usage?
10. List the applications of classification ?
11. List the applications of prediction?
12.Discuss data cleaning?
13.Explain Relevance analysis?
14.Describe data transformation?
15.Discuss data Reduction?
16.Explain Normalization?
17.Discuss Generalization?
18. Explain Prediction?
19.Describe Accuracy?
20.Describe Robustness?
21.Explain speed?
22.Discuss Scalability?
23. Describe Interpretability?
24.What is the reason for using Decision trees in classification?
25.List the features of Decision tree?
26. Explain Information gain?
27.Discuss Gain ratio?
28.Discuss Tree Pruning?
29. Describe Pre Pruning approach?
30. Describe Post Pruning approach?
31.Give a short note on SLIQ?
32.State Naïve Bayesian classifier?
33.State Bayesian Belief Network?
34.State the classification of probability?
35.Explain Posteriori Probability?
36.Explain Prior Probability?
37.State the merits and demerits of Bayesian classifier?
38. Discuss Directed Acyclic graph?
39. Discuss Conditional Probability tables?
40.Explain Training Bayesian Belief Network?
41.Explain Neural networks?
42.Discuss the Disadvantages and advantages of neural network?
43.Give the representation of Feed forward network?
44. Give the representation of backpropagation network?
45.Give the steps for extracting rules from networks?
46.Write the steps for Sensitivity analysis?
47.Discuss ARCS?
48.Discuss CAEP?
49.List the merits of K-nearest neighbor algorithm?
50.List the demerits of K-nearest neighbor algorithm?
51. List the merits of Case based Reasoning?
52. List the demerits of Case based Reasoning?
53. List the merits of Genetic Algorithm?
54. List the demerits of Genetic Algorithm?
55. List the merits of Rough set approach?
56. List the demerits of Rough set approach?
57. List the merits of Fuzzy set approach?
58. List the demerits of Fuzzy set approach?
59.Discuss classifier accuracy?
60.Write briefly on Increasing classifier accuracy?
61.Discuss the Qualities of good clustering?
62.Explain the measure of Quality of clustering?
63.Describe the desirable properties of clustering algorithm?
64.List the Cluster analysis applications?
65. Given the 5-dimensional numeric samples A = (1,0,2,5,3) and B = (2,1,0,3,-1), find the Euclidean
distance?
66. Given 2 objects represented by the tuples (22,1,42,10) and (20,0,36,8),
compute the Euclidean distance?
67. Given 2 objects represented by the tuples (22,1,42,10) and (20,0,36,8),
compute the Minkowski distance?
68. Given 2 objects represented by the tuples (22,1,42,10) and (20,0,36,8),
compute the Manhattan distance?
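The three distances asked for in questions 65 to 68 are all special cases of the Minkowski distance, which the short Python sketch below makes explicit. The question does not state the order of the Minkowski distance, so p = 3 is used here purely as an assumption; the function names are illustrative.

```python
def minkowski(a, b, p):
    """Minkowski distance of order p between two equal-length tuples."""
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1 / p)

def manhattan(a, b):
    return minkowski(a, b, 1)  # p = 1: sum of absolute differences

def euclidean(a, b):
    return minkowski(a, b, 2)  # p = 2: square root of the sum of squared differences

A = (1, 0, 2, 5, 3)
B = (2, 1, 0, 3, -1)
X = (22, 1, 42, 10)
Y = (20, 0, 36, 8)

print(euclidean(A, B))      # sqrt(26), about 5.099
print(euclidean(X, Y))      # sqrt(45), about 6.708
print(manhattan(X, Y))      # 2 + 1 + 6 + 2 = 11
print(minkowski(X, Y, 3))   # cube root of 233, about 6.153 (p = 3 is an assumed order)
```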
69.List the properties of Distance?
70.Discuss nominal or categorical variable?
71.Describe variables of Mixed types?
72.State 2 types of Hierarchical clustering?
73.State the methods of improving the quality of Hierarchical clustering?
74.List the merits and demerits of CLARANS?
75.Give the steps and methods of Detecting Outliers?
8 mark
1.Discuss the different types of clustering methods?
2.Describe the working of PAM algorithm?
3.Describe K-means clustering with an example?
4.Explain hierarchical method of clustering?
5.Explain the various methods for detecting outliers?
6. Explain Bayes theorem?
7. Explain the following clustering methods in detail:
(i) BIRCH
(ii) CURE
8.Explain Linear regression?
9. What are classification rules?
10.How is regression related to classification?
11.Explain Data classification as a 2 step process?
12. Explain Data Prediction as a 2 step process?
13.List the applications of classification and prediction?
14.Describe the preprocessing steps of classification and prediction?
15.Compare classification and prediction methods?
16.List the criteria for Evaluating classification and prediction methods?
17.How to extract classification rules from trees?
18.Explain the enhancements to basic decision tree induction?
19.Explain Scalability and decision tree induction?
20.Explain the working of Naïve Bayesian classifier?
21. Explain the working of Bayesian Belief Network?
22.Illustrate with an example Training Bayesian Belief Network?
23.Discuss Multilayer Feed forward neural network?
24.Explain backpropagation algorithm?
25.Explain Association rule clustering system ?
26.Explain Associative classification?
27.Describe classification by aggregating emerging patterns?
28.Explain K-nearest neighbor algorithm?
29.Describe Case based Reasoning?
30.Discuss Genetic Algorithm?
31.Explain Rough set approach?
32.Describe Fuzzy set approach?
33.Explain multiple regression?
34.Explain Non linear regression?
35.Discuss Generalized Linear models?
36.Describe Log linear models?
37.How to estimate classifier accuracy?
38. Given 2 objects represented by the tuples (22,1,42,10) and (20,0,36,8)
a)Compute the Euclidean Distance?
b) Compute the Minkowski distance?
c) Compute the Manhattan distance?
39.Discuss Classical Partitioning methods: k-means and k-Medoids?
40.Explain the working of CLARANS?
41.Discuss BIRCH?
42.Describe CURE?
43.Explain ROCK?
44.Discuss the hierarchical algorithm using Dynamic modeling?
45.Explain Density based clustering?
46. Explain OPTICS?
47. Explain GRID based clustering method?
48.Discuss on MODEL based clustering?
49.Describe deviation based Outlier detection?
50. Describe distance based Outlier detection?
16 mark
1.State Bayes theorem and discuss Bayesian classifiers work?
2.What is back propagation? How does it work?
3.Describe the various techniques for improving classifier accuracy?
4.BIRCH and CLARANS are two interesting clustering algorithms that perform effective
clustering in large data sets.
(i) Outline how BIRCH performs clustering in large data sets.
(ii) Compare and outline the major differences of the two scalable clustering algorithms : BIRCH
and CLARANS.
5. Decision tree induction is a popular classification method. Taking one typical decision tree
induction algorithm , briefly outline the method of
decision tree classification?
6. Explain the algorithm for constructing a decision tree from training samples?
7.Explain with an example the various steps in decision tree induction?
8.What are classification rules? How is regression related to classification?
9.Discuss in detail the issues regarding classification and prediction?
10.Explain the decision tree induction algorithm for Scalability?
11. Classify the given training samples using the ID3 algorithm. Apply the same to construct a decision
tree for the data given below?

SIZE    COLOR    SHAPE   CLASS
Small   Yellow   Round   A
Big     Yellow   Round   A
Big     Red      Round   A
Small   Red      Round   A
Small   Black    Round   B
Big     Black    Cube    B
Big     Yellow   Cube    B
Big     Black    Round   B
Small   Yellow   Cube    B
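ID3 chooses the attribute with the highest information gain at each node. As a rough aid for question 11, the Python sketch below computes the gain of each attribute at the root from the nine rows in the table; it covers only the root split, not the full recursive tree construction, and the names used are illustrative.

```python
from collections import Counter
from math import log2

ATTRS = ["SIZE", "COLOR", "SHAPE"]
ROWS = [
    ("Small", "Yellow", "Round", "A"),
    ("Big",   "Yellow", "Round", "A"),
    ("Big",   "Red",    "Round", "A"),
    ("Small", "Red",    "Round", "A"),
    ("Small", "Black",  "Round", "B"),
    ("Big",   "Black",  "Cube",  "B"),
    ("Big",   "Yellow", "Cube",  "B"),
    ("Big",   "Black",  "Round", "B"),
    ("Small", "Yellow", "Cube",  "B"),
]

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())

def info_gain(rows, idx):
    """Entropy of the whole set minus the weighted entropy after splitting on column idx."""
    labels = [r[-1] for r in rows]
    remainder = 0.0
    for value in {r[idx] for r in rows}:
        subset = [r[-1] for r in rows if r[idx] == value]
        remainder += len(subset) / len(rows) * entropy(subset)
    return entropy(labels) - remainder

gains = {name: info_gain(ROWS, i) for i, name in enumerate(ATTRS)}
best = max(gains, key=gains.get)
for name, gain in gains.items():
    print(f"{name}: {gain:.4f}")
print("root attribute:", best)
```

COLOR gives the largest gain because the Red and Black branches are already pure, so only the Yellow branch needs a further split.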

12. Build a decision tree classification model to classify bank loan applications by assigning
applications to one of 3 classes?
Owns Home   Married   Gender   Employed   Class
Yes         Yes       Male     Yes        B
No          No        Female   Yes        A
Yes         Yes       Female   Yes        C
Yes         No        Male     No         B
No          Yes       Female   Yes        C
No          No        Female   Yes        A
No          No        Male     No         B
Yes         No        Female   Yes        A
No          Yes       Female   Yes        C
Yes         Yes       Female   Yes        C

13.Classify a training sample using ID3 and construct a decision tree with your own example?
14. Given a training data set Y:
A    B   C   Class
15   1   A   C1
20   3   B   C2
25   2   A   C1
30   4   A   C1
35   2   B   C2
25   4   A   C1
15   2   B   C2
20   3   B   C2

a) Find the best threshold for attribute A?
b) Find the best threshold for attribute B?
c) Find a decision tree?
d) Derive decision rules from the decision tree. Samples = 8, C1 = 4 and C2 = 4.
15. Given a data set X with 3-dimensional categorical samples:
Attribute1   Attribute2   Class
T            1            C2
T            2            C1
F            1            C2
F            2            C2
Construct a decision tree for S = 4, C1 = 1 and C2 = 3?
16.Discuss in detail about the split algorithm based on GINI index?
17.Explain in detail classification by backpropagation?
18.Describe the 3 methods for association rule based classification?
19. Explain in detail other classification methods?
20. Discuss how prediction models continuous-valued functions?
21. The following table shows the midterm and final exam grades obtained by students in a
database course:
X (Midterm Exam)   Y (Final Exam)
72                 84
50                 63
81                 71
74                 78
94                 90
86                 75
59                 49
83                 79
65                 77
33                 52
88                 74
a) Plot the data. Do X and Y seem to have a relationship?
b) Use the method of least squares to find an equation for the prediction of a student's final exam
grade based on the student's midterm grade in the course?
c) Predict the final exam grade of a student who received an 86 on the midterm exam?
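Parts b) and c) of question 21 can be checked with a small Python sketch of simple linear regression by least squares, fitted to the 11 grade pairs exactly as printed in the table. The function name `least_squares` is illustrative, and plotting (part a) is left out.

```python
X = [72, 50, 81, 74, 94, 86, 59, 83, 65, 33, 88]   # midterm grades
Y = [84, 63, 71, 78, 90, 75, 49, 79, 77, 52, 74]   # final exam grades

def least_squares(xs, ys):
    """Return (slope, intercept) of the line y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # slope = sum((x - mean_x)(y - mean_y)) / sum((x - mean_x)^2)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

slope, intercept = least_squares(X, Y)
predicted = intercept + slope * 86   # part c): midterm grade of 86

print(f"y = {intercept:.3f} + {slope:.3f} x")
print(f"predicted final grade for x = 86: {predicted:.1f}")
```

The fitted line always passes through the point of means, which is a quick sanity check on a hand calculation.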
22.Discuss the various techniques for improving classifier accuracy?
23.Explain Partitioning methods in detail?
24. Suppose that the data mining task is to cluster the following 8 points into 3 clusters:
A1(2,10), A2(2,5), A3(8,4), B1(5,8), B2(7,5), B3(6,4), C1(1,2), C2(4,9). Use the k-means algorithm to
show
a) The 3 cluster centers after the first round of execution?
b) The final 3 clusters?
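Question 24 does not say which points seed the clusters or which distance to use. The Python sketch below assumes A1, B1 and C1 as the initial centers and Euclidean distance, and records the clusters and centers after every round so that both parts can be read off; the function names are illustrative.

```python
from math import dist

POINTS = {
    "A1": (2, 10), "A2": (2, 5), "A3": (8, 4),
    "B1": (5, 8),  "B2": (7, 5), "B3": (6, 4),
    "C1": (1, 2),  "C2": (4, 9),
}

def assign(points, centers):
    """Put each point into the cluster of its nearest center (Euclidean distance)."""
    clusters = [[] for _ in centers]
    for name, p in points.items():
        nearest = min(range(len(centers)), key=lambda i: dist(p, centers[i]))
        clusters[nearest].append(name)
    return clusters

def recompute(points, clusters):
    """New center of each cluster = mean of its members (assumes no cluster is empty)."""
    return [tuple(sum(points[n][d] for n in c) / len(c) for d in range(2))
            for c in clusters]

def kmeans(points, centers):
    """Run until the centers stop changing; return [(clusters, centers), ...] per round."""
    history = []
    while True:
        clusters = assign(points, centers)
        new_centers = recompute(points, clusters)
        history.append((clusters, new_centers))
        if new_centers == centers:
            return history
        centers = new_centers

# Assumed seeds: A1, B1 and C1.
history = kmeans(POINTS, [POINTS["A1"], POINTS["B1"], POINTS["C1"]])
print("centers after round 1:", history[0][1])
print("final clusters:", history[-1][0])
```

A different choice of seeds can lead to different intermediate centers, so the first-round answer depends on this assumption.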
25. Explain the various methods of detecting outliers?
UNIT 5
2 mark
1.What is text mining?
2.What is spatial data mining?
3.What are the applications of spatial data mining?
4.What is web mining?
5.Define Hub and Authorities?
6.What is HITS algorithm?
7.What is web usage mining?
8.List out some data mining tools?
9. What are the measures for text retrieval?
10.What is trend analysis?
11.Define object relational database system?
12. Define object oriented database system?
13.State the limitations of Multidimensional database analysis?
14.What are list valued attributes?
15. What are sequence valued attributes?
16.Define Spatial merge?
17. Define Spatial union?
18.Define Spatial overlapping?
19. Define Spatial intersection?
20.Define multimedia database?
21.Define Class?
22.Define an object?
23.What is an Object identifier?
24.What is composition Hierarchy?
25.Define object cube?
26.Define Plan?
27.What is Plan Mining?
28.Define geo statistics?
29.What is spatial statistics?
30.Mention any 2 spatial data mining applications?
31.What is spatial to non spatial dimension?
32.What is non spatial dimension?
33. What is spatial to spatial dimension?
34.Define Numerical measure?
35.State spatial measure?
36.Define Superset coverage property?
37.What is false positive test?
38.What is false negative test?
39.Define spatial classification?
40.Define mining raster database?
41.What is description based retrieval system?
42.What is Content based retrieval system?
43.Mention the 2 kinds of queries used in Content based retrieval system?
44.What are image sample based queries?
45.What are Image feature specification queries?
46.Define QBIC?
47.What is a Feature Descriptor?
48.What is a Layout descriptor?
49.Define Image Excavator?
50.Define URL?
51.Define MFC?
52.Define MFO?
53.Mention few dimensions of Multimedia data cube?
54.Define sequence database?
55.What is Time series database?
56.What are Long term movements?
57.Define Trend movements?
58.Define Cyclic movements?
59.Define Seasonal variations?
60.What are irregular movements?
61.Define random movements?
62. What is a free hand method?
63.What is Least square method?
64. Define Time series Forecasting?
65.Define ARIMA?
66.What are moving averages?
67.Define Whole matching?
68.What is subsequence matching?
69.Define scientific database?
70.What is medical diagnosis?
71.Define DFT?
72.Define DWT?
73.Define atomic matching?
74.What is window stitching?
75.Define Subsequence ordering?
76.What is a multidimensional index?
77.Define query language?
78.Define Episode?
79.Name the methods for Sequential pattern mining?
80.Define Periodicity Analysis?
81.What is full Periodicity?
82.Define partial Periodicity?
83.Define Serial episode?
84.Define parallel episode?
85.Define Regular expression?
86.List few applications of Time series Database?
87.Define Information retrieval?
88.What is a document database?
89.Define inverted index?
90.Define Signature file?
91.What is document table?
92.What is term table?
93.Define Hypertext analysis?
94.Define Link Analysis?
95.Define Anomaly Detection?
96.What is Document clustering?
97.Define Web search engine?
98.What is a web log?
99.Define Hyperlink?
100.Define Chasm?
4 marks
1.Discuss on spatial operators?
2.What is multidimensional analysis?
3. What is time series analysis?
4. Explain the generalization of object Identifiers
5. Explain the generalization of Class/Subclass Hierarchies?
6.Explain the generalization of Class composition Hierarchies?
7.Discuss the construction of object cubes?
8. Describe the mining of object cubes?
9.Give the multidimensional model for the plan base?
10. Mention any 4 spatial data mining applications?
11. Discuss Spatial data cube construction?
12.Explain spatial OLAP?
13.Discuss on mining raster database?
14. Analyze Multimedia data Mining?
15.Discuss the dimensions of Multimedia data cube?
16.Explain mining associations in multimedia data?
17. State the association between image content and non-image content features?
18. State the association among image content that are not related to spatial relationship?
19. State the association among image content related to spatial relationship?
20.Determine the components to categorize time series data?
21.Specify the estimates of trend analysis?
22.Give the similarity search in time series analysis?
23.Discuss data transformation ?
24.Explain Enhanced similarity search methods?
25.What are the steps for performing a similarity search?
26.Discuss multidimensional indexing?
27.Describe time sequence query language?
28.Discuss shape definition language?
29.List the applications of Sequential pattern mining?
30. List the applications of Time series Database?
31.Discuss the basic measures of text retrieval?
32.Explain keyword based retrieval?
33.Describe similarity based retrieval in text database?
34.Discuss Latent Semantic Indexing?
35.Describe Random sampling?
36.Explain Sliding window?
37. Describe Histograms?
38.Describe Partial materialization of a stream cube?
39.Discuss FP mining in data stream?
40.Explain VFDT?
41.Explain CVFDT?
42.Give short notes on Markov Chain?
43.Discuss Coherent Substructures?
44.Discuss Dense Substructures?
45.What is a Social Network?
46.Discuss Data reduction?
47. Describe Data Compression?
48.What is Pattern Discovery?
49.Discuss Microeconomic View?
50.Describe Inductive Databases?
51.What is Data Visualization?
52.Give short notes on Analysis of variance?
53. Discuss Data Mining Process Visualization?
54.What is Interactive Visual data mining?
55.Write briefly about Collaborative Filtering?
56.Discuss on Privacy protection and Information Security in Data Mining?
57. Give short notes on Distributive data mining?
58.How to mine Complex data types?
59.Write briefly about Domain specific Knowledge?
60. Give short notes on web usage mining?
61.Compare LSI and LPI?
62.Justify Web is a highly dynamic information source?
63.Describe on Document Selection?
64.Describe on Document Ranking?
65.What is web Content Mining?
66. What is web Structure Mining?
67.Discuss web log Mining?
68.Describe Abundance problem?
69.How to perform resource finding?
70.Define Hub and Authority?
71.List the applications of web Mining?
72.How to use hub pages to find Authoritative pages?
73.Discuss Color Histogram based signature?
74. Discuss Multi feature composed signature?
75.How to selectively pre compute some spatial measures in the spatial data cube?
8 marks
1.Write a short note on web mining taxonomy?
2.Explain the different activities of text mining?
3. Discuss and elaborate the current trends in data mining?
4.Explain the generalization of object Identifiers and Class/Subclass Hierarchies?
5.Describe the construction and mining of object cubes?
6.Explain the generalization based Mining of plan database by divide and conquer?
7. Discuss the spatial data mining applications?
8.Discuss Spatial data cube construction vs spatial OLAP?
9.Enumerate 3 types of dimensions in a spatial data cube?
10. Illustrate the computation of spatial measures in spatial data cube construction?
11.Give the star schema representation of BC weather pattern Analysis?
12. Explain Mining Spatial Association ?
13. Explain Co Location Pattern?
14.Discuss spatial clustering methods?
15. Describe Multimedia data Mining?
16.Discuss the approaches for similarity based retrieval in image database?
17.Difference between mining associations in multimedia database vs in transaction database?
18.Explain Mining Time series data?
19.Discuss Sequential pattern mining cases and parameters?
20. Discuss Latent Semantic Indexing?
21.Design Randomized Algorithm?
22.Discuss Multiresolution methods?
23.Distinguish stream OLAP and Stream data cubes?
24. Discuss FP mining in data stream?
25.Design Lossy counting Algorithm?
26.Discuss classification of Dynamic data stream?
27.Design Hoeffding Tree algorithm?
28.Compare VFDT vs CVFDT?
29.Explain Trend Analysis?
30. Describe Similarity search in Time series Analysis?
31.Discuss Sequential pattern mining algorithm Based on candidate generate and Test?
32.Explain SPADE?
33.Discuss Prefix span?
34.Describe mining closed sequential pattern?
35.Explain mining multidimensional, Multilevel sequential pattern?
36.Discuss Constraint based mining of sequential pattern?
37.Explain Periodicity Analysis for Time related sequence data?
38.Discuss mining sequential pattern in Biological data?
39.Describe Alignment of Biological Sequences?
40. Elaborate Markov Chain?
41.Design Forward Algorithm?
42. Design Viterbi Algorithm?
43.Design Baum-welch Algorithm?
44.Discuss Graph Mining?
45. Describe mining closed Frequent Substructures?
46. Discuss Constraint based mining of Substructures pattern?
47.Explain Generalization of Class Composition Hierarchies?
48.Discuss Text Retrieval Methods?
49.Explain How to choose a Data Mining System?
50.Describe Data mining for Intrusion Detection?
16 mark
1.Explain the mining of spatial databases?
2.Discuss the mining of text databases?
3.What are the salient features of time series data mining?
4.What is web mining? Discuss the various web mining techniques?
5.Discuss in detail the Application of Data mining for financial data analysis?
6.Discuss the application of data mining for biomedical and DNA data analysis and
telecommunication industry?
7.Discuss the social impacts of data mining Systems?
8. Why is outlier mining important? Briefly describe the different approaches behind statistical-based
outlier detection, distance-based outlier detection and deviation-based outlier detection?
9. What is multidimensional analysis? Discuss the same with an example?
10. What is time series analysis? Discuss the same with an example?
11.Describe BC weather pattern Analysis?
12.Explain Mining Spatial Association and Co Location Pattern?
13.Describe Multimedia data Mining?
14.Elaborate on Sequential pattern mining?
15. Is data mining merely a manager's business or everyone's business?
16. Is data mining a threat to privacy and data security?
17.Explain Mining data streams?
18.Discuss stream query processing?
19.Explain data reduction and transformation techniques?
20. Elaborate on Indexing Methods for similarity search?
21. Describe Mining Sequence Patterns in Transactional Databases?
22.Discuss Data mining in Telecommunication and retail Industry?
23.Explain few examples of Commercial data mining systems?
24.Discuss Statistical Data mining?
25.Explain the Trends in Data mining?
