
www.vujannat.ning.com

CS614 Data Warehousing


Final Term Examination – Spring 2005
Time Allowed: 150 Minutes

Total Marks: 81 Total Questions: 12

Question No. 1 Marks : 01

Pipeline parallelism focuses on which of the following?


o a. Increasing throughput of task execution
o b. Decreasing sub task execution time
o c. Overlapping the execution tasks
o d. Uniform task execution times
o e. All of the above

Question No. 2 Marks : 01

What type of operation can be performed on a package that has been built and
saved for further use?
o a. Edited
o b. Password protected
o c. Scheduled for execution
o d. Retrieved by version
o e. All of the above

Question No. 3 Marks : 01

If w is the window size and n is the size of data set, then the complexity of merging
phase in BSN method is…
o a. O (n)
o b. O(w)
o c. O(wn)
o d. O (w log n)
o e. O (n log n)

Question No. 4 Marks : 01

Which is the least appropriate join operation for Pipeline parallelism?


o a. Hash join
o b. Inner Join
o c. Outer Join
o d. Sort-merge join
o e. None of the above.

Question No. 5 Marks : 15

Differentiate between the following: (5 each)

• Classification and Estimation. Which is more flexible?


• Knowledge Discovery in Databases (KDD), Data Mining (DM) and Data
Warehousing (DWH).
• B-tree indexing vs. Hash based indexing or scan. Which one is better?

Question No. 6 Marks : 15

Briefly explain the following questions. (5 each)

• Define de-normalization. What are the four fundamental guidelines for
de-normalization?
• List and explain three fundamental advantages of bitmap indexing.
• List any five steps for extracting data using the SQL Server DTS wizard.

Question No. 7 Marks : 15

Give precise answers to the following questions. (5 each)

• Define a data warehouse. Why do we require a data warehouse? Give at least four
reasons.
• What is the difference between a data matrix and a similarity/dissimilarity matrix in
terms of rows and columns? Draw both of them. Which one is symmetric?
• List any four significant points about the architecture in the data warehouse
development lifecycle and briefly explain them.

Question No. 8 Marks : 15

How can the following quality metrics be evaluated using simple ratios? Give examples.
a. Free of error
b. Completeness
c. Consistency
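For reference, a minimal sketch of the simple-ratio form commonly used for such metrics: one minus the number of undesirable outcomes divided by the total number of outcomes. The function name `simple_ratio` and all record counts below are hypothetical, chosen only for illustration; the exact definitions expected in the course may differ.

```python
def simple_ratio(undesirable: int, total: int) -> float:
    """Return 1 - undesirable/total, a score between 0 (worst) and 1 (best)."""
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 <= undesirable <= total:
        raise ValueError("undesirable must lie between 0 and total")
    return 1 - undesirable / total


# Hypothetical counts for one table (assumptions, not real data).
free_of_error = simple_ratio(undesirable=50, total=1000)   # records in error
completeness = simple_ratio(undesirable=120, total=1000)   # records with missing values
consistency = simple_ratio(undesirable=30, total=600)      # failed consistency checks

print(f"free of error: {free_of_error:.2f}")
print(f"completeness:  {completeness:.2f}")
print(f"consistency:   {consistency:.2f}")
```

All three metrics share the same form; only the meaning of an "undesirable outcome" changes (an erroneous record, an incomplete record, or a violated consistency check).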

Question No. 9 Marks : 10

What sort of objective assessment metrics are used by companies? What are the
possible issues in formulating these metrics?

Question No. 10 Marks : 05

What are the "Five Signs of Trouble" that serve as a key indicator that the data
warehousing project is under threat of failure?

Question No. 11 Marks : 06

List and briefly explain the three fundamental factors that affect the amount of
history stored in a DWH.

Question No. 12 Marks : 01

Which of the following is a possible error due to data duplication?


o a. False frequency distributions.
o b. Incorrect aggregates due to double counting.
o c. Difficulty with catching fabricated identities.
o d. All of the above
CS614-DATA WAREHOUSE
Final Spring 2007

Q#1: Clustering is at higher level than classification.(2)


True
False

Q#2: Give two real-life examples of clustering i.e. clustering used for market
segmentation for telecom industry and clustering used for crop identification of
insurance fraud.(15)

Q#3: If w is the window size and n is the size of data set, then the complexity of
merging phase in BSN method is ___________.(2)

Q#4:
a) Comment on the statement that “Creating indexes requires careful
consideration so as to avoid performance degradation”. [5]
b) Write the following Bit vectors in compressed form using Run Length
Encoding with # being the separator symbol. [5+5]

I. 111001010110000101000001001010010000001
II. 110101010100001111011010010010001011111
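
As an illustration of the technique named in part b), here is a minimal run-length encoding sketch. It assumes one common convention: the lengths of consecutive runs of identical bits are written in order and joined with `#`, and the value of the first bit is recorded separately so the vector can be rebuilt. The course may expect a different convention, and the function names `rle_encode` and `rle_decode` are hypothetical.

```python
from itertools import groupby


def rle_encode(bits: str, sep: str = "#") -> str:
    """Join the lengths of consecutive runs of identical bits with sep."""
    return sep.join(str(len(list(run))) for _, run in groupby(bits))


def rle_decode(encoded: str, first_bit: str, sep: str = "#") -> str:
    """Rebuild the bit string from run lengths, alternating from first_bit."""
    bit = first_bit
    parts = []
    for count in encoded.split(sep):
        parts.append(bit * int(count))
        bit = "0" if bit == "1" else "1"  # runs alternate between 1 and 0
    return "".join(parts)


example = "1110010"            # short illustrative vector, not one from the question
encoded = rle_encode(example)  # runs: 111, 00, 1, 0
print(encoded)
print(rle_decode(encoded, first_bit=example[0]) == example)
```

Under this convention the encoding pays off only when runs are long; a vector that alternates on almost every bit produces an encoding longer than the original.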

Q#5:
a) Why do we require a data warehouse? Give at least three reasons. [6]
b) What is meant by the “House of Quality”? What type of risks can be dealt
with using this technique? [4]
c) What are the goals of horizontal splitting and what are the different methods of
horizontal splitting? [4+6]

Q#6:
In DW project, it is assumed that _________ environment is very similar to the
production environment.(2)

Q#7: Which is the least appropriate join operation for Pipeline parallelism?(2)

Q#8: What is meant by clickstream data? Can it be useful in a web DWH
environment? If yes, then how?(10)

Q#9: ROI stands for ___________.(2)
CS614- Data Warehousing
Midterm Fall2005

Give precise answers to the following questions. (5 each)

• What is the major transformation of Calculated and Derived Values?
• Why bother about data duplication? Explain using an example.
• Briefly explain the Summarization basic data transformation task.

Briefly explain the following questions. (5 each)

• Why is one-to-many transformation complex? Give an example.
• Define ETL. List three typical MIS/ERP systems that are found while doing ETL.
• What is the advantage of CDC in the context of "in-flight" transformation?

Write short notes on the following: (5 each)

• Ad hoc access
Ad hoc access means that the pattern in which the data will be accessed is not
known in advance. Typically, in a Decision Support System, the data access is ad hoc.
• ER modelling: Why has ER modelling been so successful?
• HOLAP: What is meant by HOLAP? Why is it used?

CS508 Modern Programming Languages


Mid Term Examination – Spring 2006
Time Allowed: 90 Minutes

Please read the following instructions carefully before
attempting any of the questions:
1. Attempt all questions.
2. Do not ask any questions about the contents of this
examination from anyone.
a. If you think that there is something wrong with any of the
questions, attempt it to the best of your understanding.
b. If you believe that some essential piece of information is
missing, make an appropriate assumption and use it to solve
the problem.

**WARNING: Please note that Virtual University takes serious note of
unfair means. Anyone found involved in cheating will get an `F` grade in
this course.

Question No. 1 Marks : 10

Write a brief description of the following terms. (4+6)

1. DOLAP.
2. Explain OLAP FASMI TEST.

Question No. 2 Marks : 10

Give precise answers to the following questions.

1. What are the benefits of CDC? (Any five)
2. Define and list fundamental benefits of star schema.

Question No. 3 Marks : 2

A good example of pivoting is changing the dimensions along the axis.

o True
o False

Question No. 4 Marks : 2

DASD stands for ____________

o Direct access storage device
o Direct application storage device
o Diverse access storage device

Question No. 5 Marks : 2

ETL & ELT are the same.

o True
o False

Question No. 6 Marks : 10

Differentiate between the following.

1. MOLAP and ROLAP.
2. DQM and TQM.

Question No. 7 Marks : 2

________ splitting places a group of columns in one table and the remaining columns in
another table.

o 1. Horizontal
o 2. Vertical
o 3. Both 1 and 2

Question No. 8 Marks : 10

Give answers to the following questions.

1. What is the BSN method? Discuss its three steps.
2. What is meant by HOLAP? Why is it used?

Question No. 9 Marks : 2

Analytical processing uses multi-level aggregates, instead of record level access.

o True
o False

CS614 Data Warehousing


Mid Term Examination – Special Semester 2005
Time Allowed: 90 Minutes


Please read the following instructions carefully before
attempting any of the questions:
1. Attempt all questions. Marks are written adjacent to each
question.
2. Do not ask any questions about the contents of this
examination from anyone.
a. If you think that there is something wrong with any of
the questions, attempt it to the best of your understanding.
b. If you believe that some essential piece of information is
missing, make an appropriate assumption and use it to
solve the problem.

**WARNING: Please note that Virtual University takes serious
note of unfair means. Anyone found involved in cheating will
get an `F` grade in this course.

Total Marks: 50 Total Questions: 8

Question No. 1 Marks : 15

Briefly explain the following questions. (5 each)

• Write a typical OLTP Query and explain why it is different from DWH Query.
• Briefly explain the Enrichment basic data transformation task.
• What is the major transformation of Decoding of Fields?

Question No. 2 Marks : 15

Give precise answer to the following questions. (5 each)

• Define ETL. List three typical Operating Systems that are found while doing
ETL?
• Briefly explain the selection basic Data Transformation task?
• What is meant by "grain" in the context of a data warehouse? Give at least three
examples.

Question No. 3 Marks : 01

Every dimensional model is composed of one --------- table

o Central
o Parallel
o Vertical
o Horizontal

Question No. 4 Marks : 01

In Online Data Extraction data is extracted directly from the ------ system itself.

o Host
o Destination
o Source
o Terminal

Question No. 5 Marks : 01

Triggers can be created in operational systems to keep track of recently
------------------ records.

o Deleted
o Updated
o Inserted

Question No. 6 Marks : 01

ETL stands for ---------------


o Extract, transform and load
o Extended transformation loading
o Enhanced logical transformation

Question No. 7 Marks : 01

There is a relationship between the grain and the dimensions

o True
o False

Question No. 8 Marks : 15

Write short notes on the following: (5 each)

• Define a data warehouse. Differentiate between a conventional RDBMS and a data
warehouse system.
• Online Data Extraction.
• How is CDC (Change Data Capture) ensured using triggers?
