Analysis

Creating a Hierarchy
Overview
A Hierarchy is a set of logically related attributes with a fixed cardinality. While browsing the data, a
hierarchy exposes the top-level attribute, which can be broken down into lower-level attributes. For
example, Year -> Semester -> Quarter -> Month is a hierarchy. While analyzing the data, you may need
to drill down from a higher level to a detail level, and exposing data as a hierarchy is one of
the best solutions for this.
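The drill-down described above can be sketched in a few lines of Python. This is only an illustration
of rolling data up to a chosen hierarchy level, not how SSAS stores or aggregates data. The sample rows
and the names ROWS, LEVELS and roll_up are assumptions made for this example.

```python
from collections import defaultdict

# Hypothetical rows: (year, semester, quarter, month, sales amount).
ROWS = [
    (2007, 1, 1, "January", 100.0),
    (2007, 1, 2, "April", 150.0),
    (2007, 2, 3, "July", 200.0),
    (2008, 1, 1, "January", 120.0),
]

# Hierarchy levels, from the top level down to the detail level.
LEVELS = ("Year", "Semester", "Quarter", "Month")


def roll_up(rows, level):
    """Sum sales for every member path from Year down to the given level."""
    depth = LEVELS.index(level) + 1
    totals = defaultdict(float)
    for row in rows:
        path = row[:depth]        # e.g. (2007,) or (2007, 1)
        totals[path] += row[-1]   # the last field is the sales amount
    return dict(totals)


by_year = roll_up(ROWS, "Year")          # top level only
by_semester = roll_up(ROWS, "Semester")  # drilled down one level
```

Each drill-down step simply extends the member path by one level, which is why the levels need a
well-defined parent-to-child order.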
Explanation
Creating a hierarchy is as easy as dragging and dropping attributes in the hierarchy pane of the
dimension editor. We want to create a hierarchy in the Sales Territory dimension. Open Sales Territory
dimension in the dimension editor, drag and drop attributes in the hierarchy pane, click on each of them
and rename them to something appropriate. After completing this, your hierarchy should look similar to
the below screenshot.
You will find a warning icon on the hierarchy pane, which says that attribute relationships are missing
between these attributes. Country has a one-to-many relationship with Region, and Group has a
one-to-many relationship with Country. But these relationships need to be defined explicitly in the
dimension. Click on the Attribute Relationships tab, right-click the Region attribute and select New
Attribute Relationship. Set the values as shown in the below screenshot to correct the relationships
between these attributes.
After you have applied the above changes, your attribute relationship tab should look like the below
screenshot.
If you have observed carefully, relationships are of two types: Rigid and Flexible. This has an
effect on the processing of the cube. Rigid means that you do not expect the relationship to change and
Flexible means that relationship values can change. In our dataset, Group is a logical way to categorize
countries and it can change, while regions within a country have limited or no change. So the
relationship type between Country and Group should be flexible, and the relationship type between Region
(the sales territory key) and Country should be rigid. Double-click the arrow joining the key attribute
and Country, and change the relationship type as shown below.
Check out the Hierarchy pane, and you should find that the warning icon is no longer visible. You can
change the name of the hierarchy to something appropriate. In the interest of beginners who might get
confused with the distinction between attributes and hierarchy, we will keep the name as Hierarchy.
Edit the Date dimension, and create a Year -> Semester -> Quarter -> Month hierarchy in the Date
dimension.
Creating a Cube using the Cube Wizard
Overview
A Cube acts as an OLAP database to the subscribers who need to query data from an OLAP data store.
A Cube is the main object of an SSAS solution, where the majority of fine tuning, calculations,
aggregation design, storage design, relationship definitions and a lot of other configurations are
developed. We will create a cube using our dimension and fact tables.
Explanation
Right-click the Cube folder and select New Cube, and it will invoke the Cube Wizard. In the first
screen you need to select one of the methods of creating a Cube. We already have our dimensions
ready, and schema is already designed to contain dimension and fact tables. So we will select the option
of Use existing tables.
In the next screen, we need to select the tables which will be used to create measure groups. We already
have a DSV which has fact tables in the schema. So we will use this as shown in the below screenshot.
In the next screen, we need to select the measures that we want to create from the fact tables we just
selected in the previous screen. For now, select all the fields as shown below and move to the next
screen.
In this screen you need to select any existing dimensions. We have created three dimensions and we
will include all of these dimensions as shown below.
In the next screen, we can select if we want to create any additional new dimensions from the tables
available in the DSV. We do not want to create any more dimensions, so unselect any selected tables as
shown below and move to the next screen.
Finally you need to name your cube, which is the last step of the wizard before your cube is created.
Name it something appropriate like Sales Cube as shown below.
Your cube should now have been created, and if your cube editor is open you should find different tabs
to configure and design various features and aspects of the cube. If you look carefully at the below
screenshot, you will find the FactInternetSales and FactResellerSales measure groups. You will also find
the Sales Territory and Product dimensions, but the Date dimension is missing. Both fact tables have
multiple fields referencing the DateKey from the Date dimension. BIDS intelligently creates three
dimensions from the Date dimension and names each one after the fact table field that references the
Date dimension. So you will find three instances of the Date dimension: the Ship Date, Due Date and
Order Date dimensions. These are known as role-playing dimensions.
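The idea behind role-playing dimensions can be sketched as one lookup table reached through several
foreign keys. The example below is a minimal illustration in Python. The column names OrderDateKey,
ShipDateKey and DueDateKey, the sample values, and the helper resolve_roles are assumptions made for
this sketch and are not taken from the cube definition.

```python
# A single, hypothetical Date dimension keyed by DateKey.
DIM_DATE = {
    20080101: {"Year": 2008, "Month": "January"},
    20080108: {"Year": 2008, "Month": "January"},
    20080201: {"Year": 2008, "Month": "February"},
}

# One hypothetical fact row with three fields that reference DateKey.
FACT_ROW = {
    "OrderDateKey": 20080101,
    "ShipDateKey": 20080108,
    "DueDateKey": 20080201,
    "SalesAmount": 250.0,
}

# Each role maps a display name to the fact field that plays that role.
ROLES = {
    "Order Date": "OrderDateKey",
    "Ship Date": "ShipDateKey",
    "Due Date": "DueDateKey",
}


def resolve_roles(fact_row, dim_date, roles):
    """Look up the same Date dimension once per role-playing foreign key."""
    return {role: dim_date[fact_row[field]] for role, field in roles.items()}


resolved = resolve_roles(FACT_ROW, DIM_DATE, ROLES)
```

Only one Date table exists in this sketch; the three "dimensions" differ only in which fact field is
used to reach it.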
Processing and Deploying a Cube
Overview
Once the cube design and development is complete, the next step is to deploy the cube. When the cube
is deployed, a database for the solution is created in the SSAS instance, if not already present. Each of
the dimensions and measure group definitions are read, and data is calculated and stored as per the
design and configuration of these objects. Once the cube is successfully deployed, client applications
can connect to the cube and browse the cube data. We will deploy the cube we have developed and test
connecting to the cube. We might also face errors during deployment, and we will attempt debugging
and resolving these errors.
Debugging Deployment Errors
Overview
In a development environment, you will typically come across errors during deployment and
processing of the cube. Debugging errors is an essential part of the cube development life cycle. We
will configure the deployment properties, and we should face some errors during the deployment. We
will then analyze and resolve these errors.
Explanation
Right-click the solution and select Properties; this brings up a pop-up window. Select the
deployment tab and it will bring up the deployment properties. Specify the SSAS server name and the
database name that was created for your solution in the SSAS instance. Since SSAS is installed on my
local / development machine, I have chosen localhost as the server and Sales as the database name.
We will keep the rest of the options as default for now.
Right-click the solution and select Deploy; this will start deploying the solution. If you have not
specified an appropriate account in the impersonation information, your deployment might fail because
the account might not have sufficient privileges.
If you have followed all the previous steps as explained, you should face errors as shown below. From
the error message you can make out that cube processing failed due to the Date dimension.
Right-click the Cube Dim Date dimension and select Process, and you would find the following
error.
If you recall, we defined a hierarchy in the Date dimension, Year -> Semester -> Quarter ->
Month, and the attribute relationship expected at each level is one-to-many. If you browse the data, you
will find that the same set of semester values exists in every year, so a semester value on its own does
not identify a single semester. By default, the key column of the Semester attribute is Semester itself,
which is not unique, so processing finds duplicate Semester keys when the related Quarter values are
processed. We need to make each attribute unique by changing its key columns.
Edit the Date dimension in the dimension editor, select the Semester attribute and edit the Key
Columns property. This should bring up a pop-up window as shown below. To make the Semester
attribute unique, we need to make the key column a composite key of Year + Semester.
So select the key columns as shown below.
When you select multiple columns as the key columns, the Name Column property becomes blank, and
it is a mandatory property. So select this property and set it again to Semester, as we want to display
semesters when this attribute is browsed.
This should solve the error we were facing on the date dimension. Duplicate keys are one of the most
common errors during dimension processing and we just learned how to resolve this issue.
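The duplicate key problem and the composite key fix can be reproduced outside SSAS with a small
check. The sketch below is in Python. The sample rows and the helper ambiguous_members are
assumptions made for illustration, and the check only mimics the idea of a key that must roll up to
exactly one parent, not the actual SSAS processing logic.

```python
from collections import defaultdict

# Hypothetical Date rows: the same semester numbers repeat in every year.
DATE_ROWS = [
    {"Year": 2007, "Semester": 1, "Quarter": 1},
    {"Year": 2007, "Semester": 1, "Quarter": 2},
    {"Year": 2007, "Semester": 2, "Quarter": 3},
    {"Year": 2008, "Semester": 1, "Quarter": 1},
    {"Year": 2008, "Semester": 2, "Quarter": 4},
]


def ambiguous_members(rows, key_columns, parent_column):
    """Return key values that would have to roll up to more than one parent."""
    parents = defaultdict(set)
    for row in rows:
        key = tuple(row[column] for column in key_columns)
        parents[key].add(row[parent_column])
    return {key: sorted(found) for key, found in parents.items() if len(found) > 1}


# Semester alone as the key: each semester appears under two different years.
single_key = ambiguous_members(DATE_ROWS, ["Semester"], "Year")

# Year + Semester as a composite key: every key now belongs to one year.
composite_key = ambiguous_members(DATE_ROWS, ["Year", "Semester"], "Year")
```

With Semester alone, both semester values are ambiguous; with the composite key the result is empty,
which corresponds to the processing error going away.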
Processing Dimensions and Cube
Overview
SSAS provides various cube processing methods and options to configure error logging as well as
impact on processing when errors are encountered. We will briefly look at these options, understand
what processing of the cube means, deploy our cube and try to access data from the cube.
Explanation
Right-click the dimension or cube and select Process; this should bring up a screen with processing
options similar to the below screenshot. Various processing options are visible in the
dropdown. Unprocess removes all the aggregations created by the processing of the object.
Process Full performs the same operation, but then creates all the aggregations again. More
information about these options can be found in SQL Server Books Online (BOL) on MSDN.
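The difference between the two options described above can be pictured with a toy model in Python.
This is a deliberately simplified sketch: the class ToyCubeObject and its methods are hypothetical and
do not correspond to any SSAS API, and real processing involves far more than summing rows.

```python
class ToyCubeObject:
    """Hypothetical stand-in for a processable object such as a measure group."""

    def __init__(self, source_rows):
        self.source_rows = source_rows  # (key, amount) pairs from the source
        self.aggregations = None        # None means the object is unprocessed

    def is_processed(self):
        return self.aggregations is not None

    def unprocess(self):
        """Drop everything that an earlier processing run produced."""
        self.aggregations = None

    def process_full(self):
        """Drop existing results, then rebuild them from the source rows."""
        self.unprocess()
        totals = {}
        for key, amount in self.source_rows:
            totals[key] = totals.get(key, 0.0) + amount
        self.aggregations = totals
```

In this model, calling unprocess leaves the object empty until it is processed again, while
process_full always ends with freshly rebuilt results.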
In the "Change Settings" and "Impact Analysis" options you will find more error configuration and
other options related to processing.
Deploy the cube and the cube should be deployed successfully. Go to the Browser pane after successful
deployment, and try to connect to the cube and browse data by dragging and dropping dimension
attributes and measures on the browsing area. Below is an example.
Calculated Measures and Named Sets
Overview
Fields from fact tables get converted into measures in measure groups in a cube. When measure groups
are created in a cube, one measure group is created per fact table. Often in production systems,
developing calculated measures is a regular requirement. Multi-Dimensional Expressions (MDX) is the
query language for a cube and is to a cube what T-SQL is to SQL Server. Often queries that are
frequently used are required to be in some ready format in a cube, so that the users do not need to
develop them over and over again. One of the solutions for this is named sets, which can be perceived
as a query already defined in the cube, similar to views in SQL Server. We will develop a calculated
measure and a few named sets in this section.
Developing a Calculated Measure
Overview
Measures created directly from the fields of a fact table are called base measures. But often we require
measures based on custom requirements, so we apply some logic and/or formula to these base measures
and create calculated measures. We will add two measures from two measure groups and create a
calculated measure.
Explanation
Open the cube designer, and click on the Calculations tab. Click on New Calculated Measure from
the toolbar, and key in the values as shown in the below screenshot.
We have named this new calculated measure TotalSales. The "Parent hierarchy" specifies which
parent hierarchy the measure will be part of, and in this case it will be Measures. It's a built-in
hierarchy and all measures normally fall under this.
In the Expression, we can specify any MDX expression. Here we are adding Internet Sales Amount
from FactInternetSales and Reseller Sales Amount from FactResellerSales measure groups. You do
not need to type the values; you can just drag and drop them from the panes on the left-hand side of
the window.
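The logic of this calculated measure can be sketched in Python as a cell-by-cell sum of two base
measures. The category names and amounts below are hypothetical, and the function total_sales is an
assumption made for illustration. Treating a missing value as zero in the sum is also an assumption of
this sketch.

```python
# Hypothetical base measure values, keyed by product category.
INTERNET_SALES = {"Bikes": 1000.0, "Clothing": 50.0}
RESELLER_SALES = {"Bikes": 4000.0, "Accessories": 75.0}


def total_sales(internet, reseller):
    """Add the two base measures for every category present in either one."""
    categories = sorted(set(internet) | set(reseller))
    # Assumption: a category missing from one measure contributes zero.
    return {c: internet.get(c, 0.0) + reseller.get(c, 0.0) for c in categories}


TOTAL_SALES = total_sales(INTERNET_SALES, RESELLER_SALES)
```

Because the sum is defined once, every report that uses it returns the same combined figure without
users rebuilding the formula.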
In the additional properties you can set additional options for this measure. Save your solution. In the
next section we will create named sets and then deploy all of these at the same time.
Developing Named Sets
Overview
Named sets return a dataset based on defined logic. They are primarily useful to create datasets that are
often requested from the cube. Named sets are of two types: Static and Dynamic. The difference
between these two is that static named sets are calculated when they are requested the first time in a
session, while dynamic named sets are calculated each time a query references them. In this section we
will look at how to create dynamic named sets. Note that dynamic named sets were not introduced
until SQL Server 2008.
Explanation
Open the cube designer, and click on the Calculations tab. Click on New Named Set from the toolbar
and key in the values as shown in the below screenshots.
Here we are creating two named sets, Internet Sales Top 25 and Reseller Sales Top 25. These named
sets return the top 25 products based on Internet Sales and Reseller Sales respectively. In the formula,
the MDX function TopCount returns the top 25 records from the dataset.
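What TopCount does can be approximated in Python as "sort descending by a measure, then keep the
first N". The sketch below uses a count of 2 instead of 25 to keep the data small. The product names,
the amounts and the helper top_count are hypothetical and only illustrate the idea.

```python
def top_count(members, count, measure):
    """Return the `count` members with the highest measure value, highest first."""
    ranked = sorted(members, key=measure, reverse=True)
    return ranked[:count]


# Hypothetical Internet Sales amounts per product.
INTERNET_SALES = {
    "Road Bike": 900.0,
    "Helmet": 120.0,
    "Mountain Bike": 750.0,
    "Gloves": 40.0,
}

# Iterating a dict yields its keys, so the members are the product names.
top_two = top_count(INTERNET_SALES, 2, INTERNET_SALES.get)
```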
In the Type selection, we can select whether we want the named set to be static or dynamic. We have
selected Dynamic as we want to create a dynamic named set.
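The practical effect of the Type selection can be sketched as "evaluate once and reuse" versus
"re-evaluate for every query". The Python below is an illustration only. The sales figures, the year
filter standing in for a query's context, and the names top_products and run_query are all assumptions
made for this example.

```python
# Hypothetical sales keyed by (product, year).
SALES = {
    ("Bike A", 2007): 900.0, ("Bike A", 2008): 100.0,
    ("Bike B", 2007): 200.0, ("Bike B", 2008): 700.0,
    ("Helmet", 2007): 300.0, ("Helmet", 2008): 300.0,
}


def top_products(sales, count, year=None):
    """Rank products by sales, optionally restricted to one year."""
    totals = {}
    for (product, sale_year), amount in sales.items():
        if year is None or sale_year == year:
            totals[product] = totals.get(product, 0.0) + amount
    return sorted(totals, key=totals.get, reverse=True)[:count]


# Static behaviour: evaluated once, without any query context, then reused.
STATIC_TOP_2 = top_products(SALES, 2)


def run_query(year, dynamic):
    """Return the set as a query filtered to `year` would see it."""
    if dynamic:
        return top_products(SALES, 2, year)  # re-evaluated in the query's context
    return STATIC_TOP_2                      # same members regardless of filter
```

For a query filtered to 2008, the dynamic version returns the products that actually led in 2008, while
the static version still returns the overall leaders.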
In the Display folder selection, we can specify where the named sets will appear. By default named
sets appear in the last dimension that is used in the formula. Here we have used an attribute hierarchy
from Product dimension, so the named sets should appear in the same dimension under Named Sets
directory.
Save and deploy the solution, and then re-connect to the cube in the Browser pane. You should be
able to see the calculated measure and named sets as shown in the below screenshot.
Browsing a Cube Using Excel
Overview
Once the cube is deployed and ready to host queries from the data store, client applications can start
querying the cube. One of the most user friendly client tools for business users to query a cube is
Microsoft Excel. It has a built-in interface and components to support GUI based connection, querying
and formatting of data sourced from a cube. Business users can use the familiar interface of Excel and
create ad-hoc pivot table reports by querying the cube without any detailed knowledge about querying
a multi-dimensional data source. We will connect to the cube we just created using Excel and develop a
very simple report using the cube data.
Using Excel and Creating a Pivot Table Report
Overview
We will first create a connection to the cube we have developed in the previous exercises. After
connecting to the cube, we will use the calculated measure and a named set to create a very basic pivot
table report. For the purpose of demonstration, Excel 2010 is used and is installed on the development
machine, but you can also use Excel 2007 to connect to the cube.
Explanation
Open Microsoft Excel and select the Data tab from the menu ribbon. Click on From Other Sources
and select From Analysis Services option as shown in the below screenshot.
In the next step specify the SSAS server name and logon credentials. If you have everything on the
local machine, you can also use localhost as the server name.
If you were able to successfully connect to the specified SSAS instance with the logon credentials
specified, in the next step you should be able to select the SSAS Sales database and find the Sales
Cube. Select the Sales Cube and proceed to the next step.
In the next step, specify the name of the connection file to save. This file will be saved as an .ODC file
and you can reuse this connection file when you want to use the same connection in other workbooks.
After saving the file, you will be prompted with the option to select the kind of report you want to
create. We will go with the default option and select PivotTable Report.
After selecting PivotTable Report, a designer will open with options to select dimension, attributes
and measures to populate your pivot table. Select the values as shown in the below screenshot. Our
intention is to display the hierarchy we created in the Sales Territory dimension on the columns axis,
Internet Sales Top 25 named set on the rows axis, and the Total Sales calculated measure in the values
area.
After making the above selections, your report should look like the below screenshot. Using the
features available from the Options tab, you can format this report and give it a more professional
look. You can try drilling down the hierarchy, which shows why hierarchies need to be developed in the
dimensions. Users who frequently want to see sales of the top products can pick any of the named sets
that we defined earlier. Instead of having users define formulas for adding internet sales and reseller
sales, users can just select Total Sales.
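The layout of such a pivot table report can be sketched in Python as a nested mapping of row member to
column member to summed value. The cell data and the helper pivot are hypothetical; the sketch only
illustrates how rows, columns and values relate, not how Excel queries the cube.

```python
from collections import defaultdict

# Hypothetical cells: (product on rows, territory group on columns, total sales).
CELLS = [
    ("Bike A", "Europe", 300.0),
    ("Bike A", "North America", 700.0),
    ("Bike B", "Europe", 400.0),
    ("Bike B", "Europe", 100.0),
]


def pivot(cells):
    """Sum the value for every (row member, column member) combination."""
    table = defaultdict(lambda: defaultdict(float))
    for row_member, column_member, value in cells:
        table[row_member][column_member] += value
    # Convert to plain dicts so the result is easy to print or compare.
    return {row: dict(columns) for row, columns in table.items()}


report = pivot(CELLS)
```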
SQL Server Analysis Services Glossary

Following is a list of common terms when working with SQL Server Analysis Services.
Cube - Cube is a multi dimensional data structure composed of dimensions and measure groups. The
intersection of dimension and measure groups contained in a cube returns the dataset.
Calculated Measure - Each field in a measure group is known as a base measure. Measures created
using MDX expressions with/without base measures are known as calculated measures.
Data Source View - It's an insulation layer that inherits the basic schema from the data source with the
flexibility to manipulate the schema in this layer without modifying the actual schema in the data
source.
Dimension - Dimension is an OLAP structure that is basically used to contain attributes related to an
entity to categorize data on the row / column axis. A dimension almost never contains measurable
numeric data, and if it does, that data is used as an attribute. Typical examples of dimensions are
Geography, Organization, Employee, Time etc.
Fact - Fact, known as a Measure Group in a cube, is an OLAP structure that is basically used to contain
measurable numeric data for one or more entities. In cube parlance these entities are known as
Dimensions. A dimension need not necessarily be associated directly with a fact, but a fact is always
associated directly with at least one dimension. Typical examples of facts are Sales, Performance, Tax
etc.
Hierarchy - Hierarchy is a collection of nested attributes associated in a parent-child fashion with a
defined cardinality. A dimension is formed of attributes, and a hierarchy contained in a dimension is
formed of one or more attributes from the same dimension.
KPI - Key Performance Indicators are logical structures defined using MDX expressions. Each KPI
has a goal, status, value, trend, and indicator associated with it. Value is derived based on the definition
of the KPI, and all the rest of these values vary based on this derived value. KPIs are the primary
elements that make up a scorecard in a dashboard.
MDX - Multi Dimensional Expressions is considered the query language of multi dimensional data
structures. This can be considered the SQL of OLAP databases, with the major difference that MDX
is mostly used for reading data only.
Named Set - Named Set is a pre-defined MDX query defined in the script of the cube. It can be
thought of as analogous to Views in a SQL Server database. Named sets can be dynamic or static, and
this nature defines the time when the query gets evaluated.
OLAP - Online Analytical Processing is a term used to represent analytical data sources and analysis
systems. The fundamental perception and expectation associated with the term OLAP is that it would
contain multi dimensional data and the environment hosting the same.
Snowflake Schema - Snowflake schema is an OLAP schema, where one or more normalized
dimension tables are associated with a fact table. For example, Product Category -> Product Sub
Category -> Product can be three normalized dimension tables, and the Product table can be associated
with a fact table like Sales. This is a very common example of a snowflake schema.
Star Schema - Star schema is an OLAP schema, where all dimension tables are directly associated
with fact tables, and no normalized dimension tables are considered in the schema. For example, Time,
Product, Geography dimension tables would be directly associated with a fact table like Sales. This is a
very common example of star schema.
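The difference between the two schemas can be sketched in Python as the number of lookups needed to
reach a product's category. All table contents, key values and function names below are hypothetical
and serve only to illustrate the two definitions above.

```python
# Snowflake: three normalized dimension tables chained by keys.
SNOWFLAKE_PRODUCT = {1: {"Name": "Road Bike", "SubCategoryKey": 10}}
SNOWFLAKE_SUBCATEGORY = {10: {"Name": "Road Bikes", "CategoryKey": 100}}
SNOWFLAKE_CATEGORY = {100: {"Name": "Bikes"}}

# Star: one denormalized Product dimension table.
STAR_PRODUCT = {
    1: {"Name": "Road Bike", "SubCategory": "Road Bikes", "Category": "Bikes"},
}

# A fact row references only the Product key in both designs.
FACT_SALES = [{"ProductKey": 1, "SalesAmount": 500.0}]


def category_via_snowflake(product_key):
    """Follow Product -> Sub Category -> Category through three tables."""
    sub_key = SNOWFLAKE_PRODUCT[product_key]["SubCategoryKey"]
    category_key = SNOWFLAKE_SUBCATEGORY[sub_key]["CategoryKey"]
    return SNOWFLAKE_CATEGORY[category_key]["Name"]


def category_via_star(product_key):
    """Read the category directly from the single dimension table."""
    return STAR_PRODUCT[product_key]["Category"]
```

Both functions return the same category for a given fact row; the schemas differ in how many tables
must be joined to get there.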
