Data Warehousing
ASSIGNMENT#2
1.
Answer:
An OLAP cube is a multidimensional array of data. OLAP is an
acronym for online analytical processing, a computer-based
technique of
analysing data to look for insights. The term cube here refers to
a multi-dimensional dataset, which is also sometimes called a
hypercube if the number of dimensions is greater than 3.
A cube can be considered a multi-dimensional generalization of
a two- or three-dimensional spreadsheet. For example, a
company might wish to summarize financial data by product,
by time-period, and by city to compare actual and budget
expenses. Product, time, city and scenario (actual and budget)
are the data's dimensions.
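The product, time, city and scenario example can be sketched in a few lines of Python. This is only an illustrative sketch: the records, the member names and the helper functions build_cube and roll_up are hypothetical and do not come from any OLAP product.

```python
from collections import defaultdict

# Hypothetical expense records: (product, period, city, scenario, amount).
RECORDS = [
    ("Widget", "Q1", "North", "actual", 120.0),
    ("Widget", "Q1", "North", "budget", 100.0),
    ("Widget", "Q2", "South", "actual", 80.0),
    ("Gadget", "Q1", "North", "actual", 50.0),
    ("Gadget", "Q1", "South", "budget", 70.0),
]

DIMENSIONS = ("product", "period", "city", "scenario")


def build_cube(records):
    """Sum amounts into cells keyed by one member per dimension."""
    cube = defaultdict(float)
    for *coords, amount in records:
        cube[tuple(coords)] += amount
    return dict(cube)


def roll_up(cube, keep):
    """Aggregate the cube down to the dimensions named in `keep`."""
    positions = [DIMENSIONS.index(name) for name in keep]
    result = defaultdict(float)
    for coords, amount in cube.items():
        result[tuple(coords[i] for i in positions)] += amount
    return dict(result)


cube = build_cube(RECORDS)
by_product_scenario = roll_up(cube, ("product", "scenario"))
for key, amount in sorted(by_product_scenario.items()):
    print(key, amount)
```

Rolling up to product and scenario puts actual and budget totals side by side for each product, which is the comparison the company in the example wants.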
Cube is shorthand for a multidimensional dataset. The name does
not imply exactly three dimensions, since the data can have an
arbitrary number of them.
Slicer is a term for a dimension which is held constant for all
cells so that multidimensional information can be shown in a
two dimensional physical space of a spreadsheet or pivot table.
Each cell of the cube holds a number that represents some
measure of the business, such as sales, profits, expenses,
budget and forecast.
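As a rough illustration of a slicer, the sketch below holds the scenario dimension constant and lays the remaining two dimensions out as a product by period table. The cell values and the function name slice_cube are made up for the example.

```python
# Hypothetical cube cells keyed by (product, period, scenario).
CUBE = {
    ("Widget", "Q1", "actual"): 120.0,
    ("Widget", "Q1", "budget"): 100.0,
    ("Widget", "Q2", "actual"): 80.0,
    ("Gadget", "Q1", "actual"): 50.0,
    ("Gadget", "Q2", "budget"): 70.0,
}


def slice_cube(cube, scenario):
    """Hold the scenario dimension constant; return a product x period table."""
    table = {}
    for (product, period, cell_scenario), value in cube.items():
        if cell_scenario == scenario:
            table.setdefault(product, {})[period] = value
    return table


actual = slice_cube(CUBE, "actual")
for product, row in actual.items():
    print(product, row)
```

Each remaining cell still holds a single measure value, so the result fits a two-dimensional spreadsheet or pivot table.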
OLAP data is typically stored in a star schema or snowflake
schema in a relational data warehouse or in a special-purpose
data management system. Measures are derived from the
records in the fact table and dimensions are derived from the
dimension tables.
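A minimal star schema can be tried out with Python's built-in sqlite3 module. The table and column names below (fact_sales, dim_product, dim_city) are assumptions chosen for illustration, not a prescribed layout. The measure is summed from the fact table, and the dimension labels come from the dimension tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE dim_city (city_id INTEGER PRIMARY KEY, name TEXT);
-- The fact table holds the measure plus one foreign key per dimension.
CREATE TABLE fact_sales (
    product_id INTEGER REFERENCES dim_product(product_id),
    city_id INTEGER REFERENCES dim_city(city_id),
    amount REAL
);
INSERT INTO dim_product VALUES (1, 'Widget'), (2, 'Gadget');
INSERT INTO dim_city VALUES (1, 'North'), (2, 'South');
INSERT INTO fact_sales VALUES
    (1, 1, 120.0), (1, 2, 80.0), (2, 1, 50.0), (1, 1, 30.0);
""")

# Join the fact table to its dimensions and aggregate the measure.
rows = conn.execute("""
SELECT p.name, c.name, SUM(f.amount)
FROM fact_sales AS f
JOIN dim_product AS p ON p.product_id = f.product_id
JOIN dim_city AS c ON c.city_id = f.city_id
GROUP BY p.name, c.name
ORDER BY p.name, c.name
""").fetchall()

for product, city, total in rows:
    print(product, city, total)
```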
2. What is MULTI-DIMENSIONAL Analysis?
Answer:
Multi-dimensional analysis is an informational analysis of data
that takes into account many different relationships, each of
which represents a dimension. For example, a retail analyst
may want to understand the relationships among sales by
region, by quarter, by demographic distribution (income,
education level, gender), and by product. Multi-dimensional
analysis will yield results for these complex relationships.
Multi-Dimensional Analysis is generally used in statistics,
econometrics and other related fields and the results of this
3. Briefly discuss the Design Approaches & Architecture of DWH.
Answer:
Design Approaches:
i. Bottom-Up Design:
In the bottom-up design approach, the data marts are created
first to provide reporting capability. A data mart addresses a
single business area such as sales or finance. These data
marts are then integrated to build a complete data warehouse.
The integration of data marts is implemented using data
warehouse bus architecture. In the bus architecture, a
dimension is shared between facts in two or more data marts.
These dimensions are called conformed dimensions. The
conformed dimensions are integrated across the data marts, and
then the data warehouse is built.
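The role of a conformed dimension can be sketched as follows. Two hypothetical data marts (sales and finance) keep separate fact rows but share the same product keys, so their results can be combined per product. All names and figures here are illustrative assumptions.

```python
# Conformed dimension shared by both marts (hypothetical product keys).
DIM_PRODUCT = {1: "Widget", 2: "Gadget"}

# Each data mart has its own facts but reuses the same product keys.
SALES_MART = [(1, 120.0), (2, 50.0), (1, 30.0)]  # (product_key, revenue)
FINANCE_MART = [(1, 90.0), (2, 40.0)]            # (product_key, cost)


def total_by_product(facts):
    """Sum the measure of one mart per product key."""
    totals = {}
    for key, amount in facts:
        totals[key] = totals.get(key, 0.0) + amount
    return totals


def combine_marts(sales, finance, dim):
    """Line up both marts on the shared dimension to build one report."""
    revenue = total_by_product(sales)
    cost = total_by_product(finance)
    return {
        dim[key]: {"revenue": revenue.get(key, 0.0), "cost": cost.get(key, 0.0)}
        for key in dim
    }


report = combine_marts(SALES_MART, FINANCE_MART, DIM_PRODUCT)
for product, measures in report.items():
    print(product, measures)
```

If each mart used its own product keys, the two totals could not be lined up without an extra mapping step, which is the problem the shared dimension avoids.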
Advantages of bottom-up design are:
This model contains consistent data marts and these data
marts can be delivered quickly.
As the data marts are created first, reports can be generated
quickly.
The data warehouse can be extended easily to accommodate
new business units. Doing so only requires creating new data
marts and integrating them with the existing data marts.
Disadvantages of bottom-up design are:
The positions of the data warehouse and the data marts are
reversed relative to the top-down design approach: the data
warehouse is assembled from the data marts rather than
feeding them.
ii. Top-Down Design:
In the top-down design approach, the data warehouse is built
first. The data marts are then created from the data warehouse.
Advantages of top-down design are:
Provides consistent dimensional views of data across data
marts, as all data marts are loaded from the data warehouse.
This approach is robust against business changes. Creating a
new data mart from the data warehouse is very easy.
Disadvantages of top-down design are:
This methodology is inflexible to changing departmental needs
during the implementation phase.
It represents a very large project and the cost of implementing
the project is significant.
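The top-down flow can be pictured with a small sketch in which every data mart is derived from one warehouse table. The rows, the field names and the helper build_data_mart are hypothetical and only show the direction of the data flow.

```python
# Hypothetical warehouse rows covering several business areas.
WAREHOUSE = [
    {"area": "sales", "product": "Widget", "period": "Q1", "amount": 120.0},
    {"area": "sales", "product": "Gadget", "period": "Q1", "amount": 50.0},
    {"area": "finance", "product": "Widget", "period": "Q1", "amount": 90.0},
]


def build_data_mart(warehouse, area):
    """Derive a data mart by selecting one business area from the warehouse."""
    return [
        {key: value for key, value in row.items() if key != "area"}
        for row in warehouse
        if row["area"] == area
    ]


sales_mart = build_data_mart(WAREHOUSE, "sales")
finance_mart = build_data_mart(WAREHOUSE, "finance")
print(len(sales_mart), len(finance_mart))
```

Because both marts are loaded from the same source rows, they share product and period values, which is where the consistent dimensional view comes from.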
3. Accessibility
The OLAP tool should be capable of applying its own
logical structure to access heterogeneous sources of data
and perform any conversions necessary to present a
coherent view to the user. The tool (and not the user)
should be concerned with where the physical data comes
from.
4. Consistent reporting performance
Performance of the OLAP tool should not suffer
significantly as the number of dimensions is increased.
5. Client/server architecture
The server component of OLAP tools should be sufficiently
intelligent that the various clients can be attached with
minimum effort. The server should be capable of mapping
and consolidating data between disparate databases.
6. Generic Dimensionality
Every data dimension should be equivalent in its structure
and operational capabilities.
7. Dynamic sparse matrix handling
The OLAP server's physical structure should have optimal
sparse matrix handling.
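One simple way to picture sparse handling is to store only the cells that actually contain data. The class below is a hypothetical sketch using a plain dictionary. A real OLAP server would use more elaborate physical structures.

```python
# Hypothetical dimension members: 100 products x 50 cities x 12 months.
PRODUCTS = [f"P{i}" for i in range(100)]
CITIES = [f"C{i}" for i in range(50)]
PERIODS = [f"M{m:02d}" for m in range(1, 13)]


class SparseCube:
    """Store only non-empty cells, keyed by their coordinates."""

    def __init__(self):
        self._cells = {}

    def add(self, coords, value):
        self._cells[coords] = self._cells.get(coords, 0.0) + value

    def get(self, coords):
        # Cells that were never written read as zero and cost no storage.
        return self._cells.get(coords, 0.0)

    def stored_cells(self):
        return len(self._cells)


cube = SparseCube()
cube.add(("P1", "C3", "M01"), 10.0)
cube.add(("P1", "C3", "M01"), 5.0)
cube.add(("P7", "C0", "M06"), 2.5)

dense_cells = len(PRODUCTS) * len(CITIES) * len(PERIODS)
print(cube.stored_cells(), "stored cells versus", dense_cells, "in a dense array")
```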
8. Multi-user support
OLAP tools must provide concurrent retrieval and update
access, integrity and security.
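The integrity part of this rule can be illustrated with a lock around a shared cell, so that concurrent updates from several users are not lost. This is a toy sketch with Python threads. The class SharedCell is hypothetical and says nothing about how a real OLAP server implements concurrency or security.

```python
import threading


class SharedCell:
    """A single cube cell guarded by a lock so concurrent updates are not lost."""

    def __init__(self):
        self._value = 0.0
        self._lock = threading.Lock()

    def add(self, amount):
        with self._lock:
            self._value += amount

    def read(self):
        with self._lock:
            return self._value


def worker(cell, updates):
    """Simulate one user applying a series of small updates."""
    for _ in range(updates):
        cell.add(1.0)


cell = SharedCell()
threads = [threading.Thread(target=worker, args=(cell, 1000)) for _ in range(4)]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()

total = cell.read()
print(total)
```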