The benefits of decision support systems include more informed decision-making, timely problem
solving and improved efficiency for dealing with problems with rapidly changing variables.
A DSS can be used by operations management and planning levels in an organization to compile
information and data and synthesize it into actionable intelligence. This allows the end user to make
more informed decisions at a quicker pace.
The DSS is an information application that produces comprehensive information. This is different
from an operations application, which would be used to collect the data in the first place. A DSS is
primarily used by mid- to upper-level management, and it is key for understanding large amounts of
data.
For example, a DSS could be used to project a company’s revenue over the upcoming six months
based on new assumptions about product sales. Because of the large number of variables that surround
the projected revenue figures, this is not a straightforward calculation that can be done by hand. A
DSS can integrate multiple variables and generate an outcome and alternate outcomes, all based on
the company’s past product sales data and current variables.
The primary purpose of using a DSS is to present information to the customer in a way that is easy to
understand. A DSS is beneficial because it can be programmed to generate many types of
reports, all based on user specifications. A DSS can generate information and output it graphically,
such as a bar chart that represents projected revenue, or as a written report.
As technology continues to advance, data analysis is no longer limited to large bulky mainframes.
Since a DSS is essentially an application, it can be loaded on most computer systems, including
laptops. Certain DSS applications are also available through mobile devices. The flexibility of the
DSS is extremely beneficial for customers who travel frequently. This gives them the opportunity to
be well-informed at all times, which in turn provides them with the ability to make the best decisions
for their company and customers at any time.
Attributes of a DSS
Ease of use
Ease of development
Extendibility
Characteristics of a DSS
Support for managers at various managerial levels, ranging from top executive to
line managers.
Support for individuals and groups. Less structured problems often require the
involvement of several individuals from different departments and organizational levels.
Benefits of DSS
Increases the organization's control, competitiveness and capability to make forward-looking
decisions.
Components of a DSS
Model Management System: It stores and accesses models that managers use to
make decisions. Such models are used for designing a manufacturing facility, analyzing the financial
health of an organization, forecasting demand for a product or service, etc.
Support Tools: Support tools such as online help, pull-down menus, user interfaces, graphical analysis and
error-correction mechanisms facilitate the user's interaction with the system.
Classification of DSS
There are several ways to classify DSS. Holsapple and Whinston classify DSS as follows:
Text Oriented DSS: It contains textually represented information that could have a
bearing on decisions. It allows documents to be electronically created, revised and viewed as needed.
Database Oriented DSS: Database plays a major role here; it contains organized
and highly structured data.
Rules Oriented DSS: Procedures are adopted in rules-oriented DSS. An expert system
is an example.
Compound DSS: It is built by using two or more of the five structures explained
above.
Types of DSS
Data Analysis System: It performs comparative analysis and makes use of a formula or
an algorithm, for example cash flow analysis, inventory analysis, etc.
Information Analysis System: In this system data is analyzed and an information
report is generated. For example, sales analysis, accounts receivable systems, market analysis, etc.
Model Based System: Simulation models or optimization models used for decision-
making are used infrequently and create general guidelines for operations or management.
A communications-driven group DSS supports decision-making among geographically dispersed
teams using web-based tools. Before looking at those tools, it is worth understanding what exactly
such a system is.
There are a number of tools and technologies that can be incorporated in a GDSS (Group
Decision Support System), in order to promote better decision making. These include:
A group decision support system fosters collaboration and team decision-making in four
different situations:
In this situation, all decision makers are available at the same time and at the same place. The
information is displayed either on a computer projection system or on the individual computers of
participants.
Offers video conferencing facilities where participants can see and hear each
other in real time
Offers support for meeting or interactions via two-way video
Offers additional facilities, such as screen sharing, chat, audio and whiteboards
In this situation, a GDSS fosters communication for those who work at the same place but have
different shift timings. It offers numerous facilities, including:
Document sharing
It’s important to understand how GDSS work in different time and different place situations.
It is a situation where participants are geographically distant and also operate in different
time zones. The GDSS fosters communication, collaboration and team decision making through:
Conferencing
Bulletin board
Voice mail
The major concern of investors/users at the time of deciding whether to develop a decision
support system or not must be:
The selection of the best technology or system in a given decision-making
situation
Therefore, managers must ask themselves the following questions in order to attain more
clarity:
What will be the alternative for web conferencing when participants are at
different locations and in different time zones?
How frequently will resources be shared, and how and to what extent will participants
access information?
Contingency Theory
A communications-driven GDSS addresses problems associated with group collaboration,
communication and decision making, when participants are geographically dispersed and
operate in different time zones.
This means the effectiveness of a GDSS directly depends upon its design, user-interface,
DSS architecture, integrated support tools and technical skills possessed by participants
who use DSS.
Although managers may know that the set of tools they have chosen for a GDSS is good,
those tools may not perform equally well in all circumstances. There is no one best way of
making decisions or supporting group collaboration. A tool or process may work well in
some situations and fail terribly in others.
In such a scenario, the managers must resort to a contingency approach that focuses on
three main points:
Task Type: The deciding factors include idea generation, creativity, planning,
choosing alternatives and action. For example, computer mediated communication is a
good fit for idea generation activities, and video and audio conferencing is a good choice
when decision-making is a function of human intellect.
Group Size: the bigger the group, the greater the differences in technical
abilities, likes and interests, preferences and judgments. Small groups may not require
extensive support or communication tools, while large groups require more sophisticated
and automated tools.
Virtual Organizations
A virtual organization is an association of physically and/or professionally detached
individuals working together on a project or to achieve a mission. It doesn’t have any
physical existence but the technology (internet technology, more precisely) makes it look
real.
Communications-driven group decision support systems are best suited for virtual
organizations that require a lot of technological support to foster communication and
collaboration and get the work done.
The technologies that make such collaboration possible (and make the organization "look real") include:
Personal computers
Wireless technologies
Collaborative technologies
Web conferencing
Groupware
World Wide Web
Ease of Installation and Use: A support tool must be easy to install and
use. An ideal tool is the one that requires minimal or no formal training for its users. The
decision makers may consult DSS experts to integrate group support tools that are easy to
use.
It’s important to select the right communication and support tools to promote good decision
making by a team that is physically dispersed. Moreover, a GDSS must be carefully aligned
to the structure of an organization, in order to get the best results.
Groupware Technologies
Groupware is a class of computer programs that enables individuals to collaborate on
projects with a common goal from geographically dispersed locations through shared
Internet interfaces as a means to communicate within the group.
Groupware may also include remote access storage systems to archive frequently used
data files. These can be altered, accessed and retrieved by workgroup members.
The first commercial groupware products emerged in the early 1990s, when international giants
such as IBM and Boeing began using electronic meeting systems for their internal projects.
Lotus Notes then appeared as a major product in this category, further enhancing
remote group collaboration.
Meeting and decision support systems capturing the common understanding
of participants
Shared applications
The extensive use of groupware on the Internet helped contribute to the development of
Web 2.0, which uses instant messaging, Web conferencing, group calendars, document
sharing, etc.
Expert Systems
What are Expert Systems?
Expert systems are computer applications developed to solve complex problems in a
particular domain, at the level of extraordinary human intelligence and expertise.
High performance
Understandable
Reliable
Highly responsive
Advising
Demonstrating
Deriving a solution
Diagnosing
Explaining
Interpreting input
Predicting results
Knowledge Base
Inference Engine
User Interface
Knowledge Base
It contains domain-specific and high-quality knowledge. Knowledge is required to exhibit
intelligence. The success of any ES majorly depends upon the collection of highly accurate and
precise knowledge.
What is Knowledge?
Data is a collection of facts. Information is data organized into facts about the task
domain. Data, information, and past experience combined together are termed knowledge.
Knowledge representation
It is the method used to organize and formalize the knowledge in the knowledge base. It is in the
form of IF-THEN-ELSE rules.
Knowledge Acquisition
The success of any expert system majorly depends on the quality, completeness, and accuracy of the
information stored in the knowledge base.
The knowledge base is formed by readings from various experts, scholars, and the Knowledge
Engineers. The knowledge engineer is a person with the qualities of empathy, quick learning, and
case analyzing skills.
He acquires information from the subject expert by recording, interviewing, and observing him at work,
etc. He then categorizes and organizes the information in a meaningful way, in the form of IF-
THEN-ELSE rules, to be used by the inference engine. The knowledge engineer also monitors the
development of the ES.
Inference Engine
Use of efficient procedures and rules by the inference engine is essential in deducing a correct,
flawless solution.
In case of knowledge-based ES, the Inference Engine acquires and manipulates the knowledge from
the knowledge base to arrive at a particular solution.
Applies rules repeatedly to the facts, which are obtained from earlier rule
application.
Resolves rules conflict when multiple rules are applicable to a particular case.
Forward Chaining
Backward Chaining
Forward Chaining
It is a strategy of an expert system to answer the question, “What can happen next?”
Here, the Inference Engine follows the chain of conditions and derivations and finally deduces the
outcome. It considers all the facts and rules, and sorts them before arriving at a solution.
This strategy is used when working towards a conclusion, result, or effect. For example, prediction of
share market status as an effect of changes in interest rates.
Backward Chaining
With this strategy, an expert system finds out the answer to the question, “Why this happened?”
On the basis of what has already happened, the Inference Engine tries to find out which conditions
could have happened in the past for this result. This strategy is followed for finding out cause or
reason. For example, diagnosis of blood cancer in humans.
User Interface
The user interface provides interaction between the user of the ES and the ES itself. It generally uses
natural language processing so that it can be used by a user who is well-versed in the task domain. The user of
the ES need not necessarily be an expert in Artificial Intelligence.
It explains how the ES has arrived at a particular recommendation. The explanation may appear in
the following forms −
The user interface makes it easy to trace the credibility of the deductions.
Its technology should be adaptable to user’s requirements; not the other way round.
No technology can offer an easy and complete solution. Large systems are costly and require significant
development time and computer resources. ESs have their limitations, which include −
Limitations of the technology
There are several levels of ES technologies available. Expert systems technologies include −
o Workstations, minicomputers, mainframes.
o Large databases.
Tools− They reduce the effort and cost involved in developing an expert system to a
large extent.
Shells− A shell is nothing but an expert system without a knowledge base. A shell
provides the developers with knowledge acquisition, an inference engine, a user interface, and an
explanation facility. For example, a few shells are given below −
o Java Expert System Shell (JESS), which provides a fully developed Java API for
creating an expert system.
Know and establish the degree of integration with the other systems and databases.
Realize how the concepts can represent the domain knowledge best.
Develop the Prototype
The knowledge engineer uses sample cases to test the prototype for any deficiencies
in performance.
Test and ensure the interaction of the ES with all elements of its environment,
including end users, databases, and other information systems.
Maintain the ES
Cater for new interfaces with other information systems, as those systems evolve.
Less Production Cost− Production cost is reasonable. This makes them affordable.
Speed− They offer great speed. They reduce the amount of work an individual puts
in.
Steady response− They work steadily without getting emotional, tense or fatigued.
SQL is a language to operate databases; it includes database creation, deletion, fetching rows,
modifying rows, etc. SQL is an ANSI (American National Standards Institute) standard language,
but there are many different versions of the SQL language.
What is SQL?
SQL is Structured Query Language, which is a computer language for storing, manipulating and
retrieving data stored in a relational database.
SQL is the standard language for relational database systems. All the Relational Database
Management Systems (RDBMS) like MySQL, MS Access, Oracle, Sybase, Informix, Postgres and
SQL Server use SQL as their standard database language.
Why SQL?
Allows users to define the data in a database and manipulate that data.
Allows SQL to be embedded within other languages using SQL modules, libraries and pre-
compilers.
SQL Process
When you execute an SQL command on any RDBMS, the system determines the best way to
carry out your request, and the SQL engine figures out how to interpret the task.
Query Dispatcher
Optimization Engines
Classic Query Engine
A classic query engine handles all the non-SQL queries, but a SQL query engine won’t handle
logical files.
Features of SQL
High Performance.
High Availability.
Management Ease.
Lowest Total Cost of Ownership.
System Databases
The Master database contains information about the SQL Server configuration. Without the master
database, the server cannot be started. It also stores metadata about all other
objects (databases, stored procedures, tables, views, etc.) created in the SQL Server instance.
If the master database gets corrupted and is not recoverable from the backup, then a user has to again
rebuild the master database. Therefore, it is always recommended to maintain a current backup of the
master database. As everything crucial to SQL server is stored in the master database, it cannot be
deleted as it is the heart of SQL SERVER.
The model database serves as a template for every newly created database. It is used by SQL Server
as the template when creating a new database. When we create a new database, the
data present in the model database is copied to the new database to create its default objects, which include
tables, stored procedures, etc. The model database is not needed only for the creation of new
databases: whenever SQL Server starts, tempdb is created using the model database as a template.
By default the model database does not contain any data.
(iii) Msdb
The msdb database is used mainly by SQL Server Management Studio and SQL Server Agent to
store system activities such as SQL Server jobs, mail, Service Broker, maintenance plans, user and system
database backup history, replication information and log shipping. We need to take a backup of this
database for the proper functioning of the SQL Server Agent service.
(iv) TempDB
From the name of the database itself, we can identify the purpose of this database. It can be accessed
by all the users in the SQL Server Instance.
The tempdb is a temporary location for storing temporary tables (global and local) and temporary
stored procedures that hold intermediate results during sorting or query processing, and cursors.
If many temporary objects are created and consume tempdb storage, SQL Server performance will be
affected, so it is recommended to place tempdb on a drive with a sufficient amount of free space.
This database is created by the SQL Server instance when the SQL Server service starts. It is
created using the model database. We cannot take a backup of the tempdb database.
MySQL
MySQL is an open source SQL database, which is developed by a Swedish company – MySQL AB.
MySQL is pronounced as “my ess-que-ell,” in contrast with SQL, pronounced “sequel.”
MySQL supports many different platforms including Microsoft Windows, the major Linux
distributions, UNIX, and Mac OS X.
MySQL has free and paid versions, depending on its usage (non-commercial/commercial) and
features. MySQL comes with a very fast, multi-threaded, multi-user and robust SQL database server.
History
First internal release on 23rd May 1995.
Windows version was released on 8th January 1998 for Windows 95 and NT.
Version 3.23: beta from June 2000, production release January 2001.
Version 4.0: beta from August 2002, production release March 2003 (unions).
Version 4.01: beta from August 2003, Jyoti adopts MySQL for database tracking.
Version 4.1: beta from June 2004, production release October 2004.
Version 5.0: beta from March 2005, production release October 2005.
Features
High Performance.
High Availability.
Management Ease.
MS SQL Server
MS SQL Server is a Relational Database Management System developed by Microsoft Inc. Its
primary query languages are −
T-SQL
ANSI SQL
History
1989 – Microsoft, Sybase, and Aston-Tate release SQL Server 1.0 for OS/2.
1990 – SQL Server 1.1 is released with support for Windows 3.0 clients.
2001 – Microsoft releases XML for SQL Server Web Release 1 (download).
2002 – Microsoft releases SQLXML 2.0 (renamed from XML for SQL Server).
Features
High Performance
High Availability
Database mirroring
Database snapshots
CLR integration
Service Broker
DDL triggers
Ranking functions
XML integration
TRY…CATCH
Database Mail
ORACLE
It is a very large multi-user based database management system. Oracle is a relational database
management system developed by ‘Oracle Corporation’.
Oracle works to efficiently manage its resources (a database of information) among the multiple
clients requesting and sending data across the network.
It is an excellent database server choice for client/server computing. Oracle supports all major
operating systems for both clients and servers, including MSDOS, NetWare, UnixWare, OS/2 and
most UNIX flavors.
History
Oracle began in 1977 and, as of 2009, had completed 32 years in the industry.
1977 – Larry Ellison, Bob Miner and Ed Oates founded Software Development
Laboratories to undertake development work.
1979 – Version 2.0 of Oracle was released and it became the first commercial relational
database and the first SQL database. The company changed its name to Relational Software Inc. (RSI).
1983 – Oracle released version 3.0, rewritten in C language and ran on multiple
platforms.
1984 – Oracle version 4.0 was released. It contained features like concurrency
control – multi-version read consistency, etc.
1985 – Oracle version 5.0 was released. It supported client/server computing.
2007 – Oracle released Oracle11g. The new version focused on better partitioning,
easy migration, etc.
Features
Concurrency
Read Consistency
Locking Mechanisms
Quiesce Database
Portability
Self-managing database
SQL*Plus
ASM
Scheduler
Resource Manager
Data Warehousing
Materialized views
Bitmap indexes
Table compression
Parallel Execution
Analytic SQL
Data mining
Partitioning
MS ACCESS
This is one of the most popular Microsoft products. Microsoft Access is an entry-level database
management software. MS Access database is not only inexpensive but also a powerful database for
small-scale projects.
MS Access uses the Jet database engine, which utilizes a specific SQL language dialect (sometimes
referred to as Jet SQL).
MS Access comes with the professional edition of the MS Office package. MS Access has an
easy-to-use, intuitive graphical interface.
1992 – Access version 1.0 was released.
1993 – Access 1.1 was released to improve compatibility, with the inclusion of the Access Basic
programming language.
2007 – Access 2007 introduced a new database format, ACCDB, which supports
complex data types such as multi-valued and attachment fields.
Features
Users can create tables, queries, forms and reports and connect them together with
macros.
Option of importing and exporting the data to many formats including Excel,
Outlook, ASCII, dBase, Paradox, FoxPro, SQL Server, Oracle, ODBC, etc.
There is also the Jet Database format (MDB or ACCDB in Access 2007), which can
contain the application and data in one file. This makes it very convenient to distribute the entire
application to another user, who can run it in disconnected environments.
Microsoft Access offers parameterized queries. These queries and Access tables can
be referenced from other programs like VB6 and .NET through DAO or ADO.
The desktop editions of Microsoft SQL Server can be used with Access as an
alternative to the Jet Database Engine.
The SQL CREATE DATABASE statement is used to create a new SQL database.
Syntax
CREATE DATABASE DatabaseName;
Example
If you want to create a new database <testDB>, then the CREATE DATABASE statement would be
as shown below −
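CREATE DATABASE testDB;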
Make sure you have the admin privilege before creating any database. Once a database is created,
you can check it in the list of databases as follows −
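On MySQL, for example, the listing shown below can be produced with the following command
(other database systems expose this information through their own catalog views):
SHOW DATABASES;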
+--------------------+
| Database           |
+--------------------+
| information_schema |
| AMROOD             |
| TUTORIALSPOINT     |
| mysql              |
| orig               |
| test               |
| testDB             |
+--------------------+
CREATING TABLES
Creating a basic table involves naming the table and defining its columns and each column’s data
type.
Syntax
CREATE TABLE table_name( column1 datatype, column2 datatype, column3 datatype, …..
columnN datatype, PRIMARY KEY( one or more columns ));
CREATE TABLE is the keyword telling the database system what you want to do. In this case, you
want to create a new table. The unique name or identifier for the table follows the CREATE TABLE
statement.
Then in brackets comes the list defining each column in the table and what sort of data type it is. The
syntax becomes clearer with the following example.
A copy of an existing table can be created using a combination of the CREATE TABLE statement
and the SELECT statement. You can check the complete details at Create Table Using another Table.
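As a brief sketch (supported in this form by MySQL and Oracle, with variations in other systems), a copy
of the CUSTOMERS table used below could be made like this:
CREATE TABLE CUSTOMERS_BACKUP AS SELECT * FROM CUSTOMERS;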
Example
The following code block is an example which creates a CUSTOMERS table with ID as the
primary key; the NOT NULL constraints indicate that these fields cannot be NULL when
records are created in this table −
SQL> CREATE TABLE CUSTOMERS( ID INT NOT NULL, NAME VARCHAR (20) NOT
NULL, AGE INT NOT NULL, ADDRESS CHAR (25) , SALARY DECIMAL (18, 2),
PRIMARY KEY (ID));
You can verify if your table has been created successfully by looking at the message displayed by the
SQL server, otherwise you can use the DESC command as follows −
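For instance, after creating the table above you could run the command below; it lists each column of
CUSTOMERS together with its data type and whether it allows NULLs:
SQL> DESC CUSTOMERS;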
Now, you have CUSTOMERS table available in your database which you can use to store the
required information related to customers.
Constraints
Constraints are the rules enforced on the data columns of a table. These are used to limit the type of
data that can go into a table. This ensures the accuracy and reliability of the data in the database.
Constraints could be either on a column level or a table level. The column level constraints are
applied only to one column, whereas the table level constraints are applied to the whole table.
Following are some of the most commonly used constraints available in SQL. These constraints
have already been discussed in the SQL – RDBMS Concepts chapter, but it is worth revising them at this
point.
NOT NULL Constraint− Ensures that a column cannot have a NULL value.
FOREIGN Key− References a row/record in another database table, enforcing a link between the two tables.
CHECK Constraint− The CHECK constraint ensures that all the values in a column
satisfy certain conditions.
INDEX− Used to create and retrieve data from the database very quickly.
Constraints can be specified when a table is created with the CREATE TABLE statement or you can
use the ALTER TABLE statement to create constraints even after the table is created.
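As an illustrative sketch (the ORDERS table and the constraint name are hypothetical; CUSTOMERS refers to
the table created earlier), the first statement declares column-level NOT NULL and CHECK constraints at
creation time, and the second adds a FOREIGN KEY constraint afterwards with ALTER TABLE:
CREATE TABLE ORDERS(
    ORDER_ID INT NOT NULL,
    CUST_ID INT NOT NULL,
    QUANTITY INT CHECK (QUANTITY > 0),   -- column-level CHECK constraint
    PRIMARY KEY (ORDER_ID)
);
ALTER TABLE ORDERS
    ADD CONSTRAINT FK_ORDERS_CUST FOREIGN KEY (CUST_ID) REFERENCES CUSTOMERS (ID);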
Dropping Constraints
Any constraint that you have defined can be dropped using the ALTER TABLE command with the
DROP CONSTRAINT option.
For example, to drop the primary key constraint in the EMPLOYEES table, you can use the
following command.
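A sketch of that command is shown below; it assumes the primary key constraint was explicitly named
EMPLOYEES_PK when it was created (the actual name depends on how the table was defined):
ALTER TABLE EMPLOYEES DROP CONSTRAINT EMPLOYEES_PK;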
Some implementations may provide shortcuts for dropping certain constraints. For example, to drop
the primary key constraint for a table in Oracle, you can use the following command.
ALTER TABLE EMPLOYEES DROP PRIMARY KEY;
Integrity Constraints
Integrity constraints are used to ensure accuracy and consistency of the data in a relational database.
Data integrity is handled in a relational database through the concept of referential integrity.
There are many types of integrity constraints that play a role in Referential Integrity (RI). These
constraints include Primary Key, Foreign Key, Unique Constraints and other constraints which are
mentioned above.
DML resembles simple English language and enhances efficient user interaction with the system.
The functional capability of DML is organized in manipulation commands like SELECT, UPDATE,
INSERT INTO and DELETE FROM, as described below:
SELECT: This command is used to retrieve rows from a table. The syntax is
SELECT [column name(s)] from [table name] where [conditions]. SELECT is the most widely used
DML command in SQL.
INSERT: This command adds one or more records to a database table. The insert
command syntax is INSERT INTO [table name] [column(s)] VALUES [value(s)].
DELETE: This command removes one or more records from a table according to
specified conditions. Delete command syntax is DELETE FROM [table name] where [condition].
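As a short sketch using the CUSTOMERS table created earlier (the values inserted are purely illustrative),
the three commands might look like this:
INSERT INTO CUSTOMERS (ID, NAME, AGE, ADDRESS, SALARY) VALUES (1, 'Ramesh', 32, 'Ahmedabad', 2000.00);
SELECT ID, NAME, SALARY FROM CUSTOMERS WHERE AGE > 25;
DELETE FROM CUSTOMERS WHERE ID = 1;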
The following points summarize the major differences between OLTP and OLAP system design.
Source of data: OLTP systems are the original source of the data; OLAP data comes from
the various OLTP databases (consolidated data).
Purpose of data: OLTP systems control and run fundamental business tasks; OLAP systems help
with planning, problem solving, and decision support.
Inserts and updates: In OLTP, short and fast inserts and updates are initiated by end users; in
OLAP, periodic long-running batch jobs refresh the data.
Backup and recovery: OLTP data is backed up religiously, since operational data is critical to run
the business and data loss is likely to entail significant monetary loss and legal liability; instead of
regular backups, some OLAP environments may consider simply reloading the OLTP data as a
recovery method.
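To make the contrast concrete, here is a small sketch (the ORDERS table and its columns are hypothetical).
An OLTP workload consists of many short statements like the first one, while an OLAP query typically
scans history and aggregates it, like the second:
INSERT INTO ORDERS (ORDER_ID, CUSTOMER_ID, ORDER_DATE, AMOUNT) VALUES (1001, 42, '2018-04-20', 250.00);
SELECT CUSTOMER_ID, SUM(AMOUNT) AS total_spent
FROM ORDERS
WHERE ORDER_DATE >= '2017-01-01'
GROUP BY CUSTOMER_ID;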
Data Marts
A data mart is a repository of data that is designed to serve a particular community of knowledge
workers.
The difference between a data warehouse and a data mart can be confusing because the two terms
are sometimes used incorrectly as synonyms. A data warehouse is a central repository for all an
organization’s data. The goal of a data mart, however, is to meet the particular demands of a specific
group of users within the organization, such as human resource management (HRM). Generally, an
organization’s data marts are subsets of the organization’s data warehouse.
Because data marts are optimized to look at data in a unique way, the design process tends to start
with an analysis of user needs. In contrast, a data warehouse’s design process tends to start with an
analysis of what data already exists and how it can be collected and managed in such a way that it
can be used later on. A data warehouse tends to be a strategic but somewhat unfinished concept; a
data mart tends to be tactical and aimed at meeting an immediate need.
Today, data virtualization software can be used to create virtual data marts, pulling data from
disparate sources and combining it with other data as necessary to meet the needs of specific
business users. A virtual data mart provides knowledge workers with access to the data they need
while preventing data silos and giving the organization’s data management team a level of control
over the organization’s data throughout its lifecycle.
Query-driven Approach
Update-driven Approach
Query-Driven Approach
This is the traditional approach to integrate heterogeneous databases. This approach was used to
build wrappers and integrators on top of multiple heterogeneous databases. These integrators are also
known as mediators.
When a query is issued at the client side, a metadata dictionary translates the query
into an appropriate form for the individual heterogeneous sites involved.
Now these queries are mapped and sent to the local query processor.
The results from heterogeneous sites are integrated into a global answer set.
Disadvantages
This approach is also very expensive for queries that require aggregations.
Update-Driven Approach
This is an alternative to the traditional approach. Today’s data warehouse systems follow update-
driven approach rather than the traditional approach discussed earlier. In update-driven approach, the
information from multiple heterogeneous sources is integrated in advance and stored in a
warehouse. This information is available for direct querying and analysis.
Advantages
This approach has the following advantages −
Query processing does not require an interface to process data at local sources.
Note − Data cleaning and data transformation are important steps in improving the quality of data
and data mining results.
Since a data warehouse can gather information quickly and efficiently, it can
enhance business productivity.
A data warehouse also helps in bringing down the costs by tracking trends, patterns
over a long period in a consistent and reliable manner.
To design an effective and efficient data warehouse, we need to understand and analyze the business
needs and construct a business analysis framework. Each person has different views regarding the
design of a data warehouse. These views are as follows −
The top-down view− This view allows the selection of relevant information needed
for a data warehouse.
The data source view− This view presents the information being captured, stored,
and managed by the operational system.
The data warehouse view− This view includes the fact tables and dimension tables.
It represents the information stored inside the data warehouse.
The business query view− It is the view of the data from the viewpoint of the end-
user.
Bottom Tier− The bottom tier of the architecture is the data warehouse database
server. It is the relational database system. We use the back end tools and utilities to feed data into
the bottom tier. These back end tools and utilities perform the Extract, Clean, Load, and refresh
functions.
Middle Tier− In the middle tier, we have the OLAP Server that can be implemented
in either of the following ways.
Top-Tier− This tier is the front-end client layer. This layer holds the query tools and
reporting tools, analysis tools and data mining tools.
Virtual Warehouse
Data mart
Enterprise Warehouse
Virtual Warehouse
The view over an operational data warehouse is known as a virtual warehouse. It is easy to build a
virtual warehouse. Building a virtual warehouse requires excess capacity on operational database
servers.
Data Mart
Data mart contains a subset of organization-wide data. This subset of data is valuable to specific
groups of an organization.
In other words, we can claim that data marts contain data specific to a particular group. For example,
the marketing data mart may contain data related to items, customers, and sales. Data marts are
confined to subjects.
The implementation cycle of a data mart is measured in short periods of time, i.e., in
weeks rather than months or years.
The life cycle of a data mart may become complex in the long run if its planning and design
are not organization-wide.
Enterprise Warehouse
An enterprise warehouse collects all the information and the subjects spanning an
entire organization
The data is integrated from operational systems and external information providers.
This information can vary from a few gigabytes to hundreds of gigabytes, terabytes
or beyond.
Load Manager
This component performs the operations required to extract and load data.
The size and complexity of the load manager varies between specific solutions, from one data
warehouse to another.
Performs simple transformations into a structure similar to the one in the data
warehouse.
1. Expectations are communicated to the users
IT is often unwilling or afraid to tell the users what they will be getting and when. Users should
be told about the following:
Function – the data that will be accessible and what pre-defined queries and reports
are available; the level of detail of the data, as well as how the data is integrated and aggregated
Historical data
The expectations of accuracy for both the cleanliness of the data and an
understanding of what the data means
Timeliness – when the data will be available and how frequently the data is
loaded/updated/refreshed
Have the users involved all the way through the project
The last level is by far the most successful approach, while the first almost always results in
failure.
The best sponsor is from the business side, not from IT. Most importantly, the sponsor should be
in serious need of the data warehouse’s capabilities to solve a specific problem or gain some
advantage for his or her department.
Without the right skills dedicated to the team, the project will fail. The emphasis is on “dedicated
to the team.”
5. The schedule is realistic
The most common cause of failure is an unrealistic schedule, usually imposed without the input
or the concurrence of the project manager or team members. Most often, the imposed schedules
have no rationale for specific dates, but are only means to “hold the project manager to a
schedule.” A realistic schedule will include all the required tasks to implement the project along
with their durations, assigned resources and task dependencies.
The first decisions to be made are the categories of tools: Extract/Transform/Load, data cleansing,
OLAP, ROLAP, data modeling, administration, and so on. The tools must match the requirements
of the organization, the users, and the project. The tools should work together without the need to
build interfaces or write special code.
In spite of what the vendors tell you, users must be trained and the training should be geared to
the level of user and the way they plan to use the data warehouse. All users must learn about the
data, and power users should have additional in-depth training on the data structures.
Data Cleaning− In this step, noise and inconsistent data are removed.
Data Selection− In this step, data relevant to the analysis task are retrieved from the
database.
Data Transformation− In this step, data is transformed or consolidated into forms
appropriate for mining by performing summary or aggregation operations.
Data Mining− In this step, intelligent methods are applied in order to extract data
patterns.
1. Identify the goal of the KDD process from the customer’s perspective.
4. Cleanse and preprocess data by deciding strategies to handle missing fields and alter
the data as per the requirements.
5. Simplify the data sets by removing unwanted variables. Then, analyze useful
features that can be used to represent the data, depending on the goal or task.
6. Match KDD goals with data mining methods to suggest hidden patterns.
7. Choose data mining algorithms to discover hidden patterns. This process includes
deciding which models and parameters might be appropriate for the overall KDD process.
9. Interpret essential knowledge from the mined patterns.
10. Use the knowledge and incorporate it into another system for further action.
Exploration– In this step the data is cleaned and converted into another form. The
nature of the data is also determined
Pattern Identification– The next step is to choose the pattern which will make the
best prediction
Deployment– The identified patterns are used to get the desired outcome.
It is fast, which makes it easy for users to analyze huge amounts of data
in less time
One of the most important tasks in Data Mining is to select the correct data mining technique. The Data
Mining technique has to be chosen based on the type of business and the type of problem your
business faces. A generalized approach has to be used to improve the accuracy and cost effectiveness
of using data mining techniques. There are basically seven main Data Mining techniques, which are
discussed in this article. There are also a lot of other Data Mining techniques, but these seven are
the ones considered most frequently used by business people.
Statistical techniques
Clustering
Visualization
Decision tree
Neural network
Association rule
Classification
1. Statistical Techniques
Statistics is a branch of mathematics which relates to the collection and
description of data. The statistical technique is not considered a data mining technique by many
analysts. But it still helps to discover patterns and build predictive models. For this reason, data
analysts should possess some knowledge about the different statistical techniques. In today's world
people have to deal with large amounts of data and derive important patterns from it. Statistics can
help to a great extent to answer questions about their data, such as:
What is the high level summary that can give you a detailed view of what is there in
the database ?
Statistics not only answers these questions, it also helps in summarizing the data and counting it. It also
helps in providing information about the data with ease. Through statistical reports people can make
smart decisions. There are different forms of statistics, but the most important and useful technique is
the collection and counting of data. Common ways to summarize and describe data include the
following (a short SQL sketch of a few of these follows the list):
Histogram
Mean
Median
Mode
Variance
Max
Min
Linear Regression
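Several of these summaries can be computed directly in SQL. The sketch below assumes a hypothetical
SALES table with an AMOUNT column; count, minimum, maximum and mean are shown (most database
systems also provide built-in functions such as VARIANCE or VAR_SAMP for the variance, while median
and mode usually require vendor-specific functions):
SELECT COUNT(*)    AS row_count,
       MIN(AMOUNT) AS min_amount,
       MAX(AMOUNT) AS max_amount,
       AVG(AMOUNT) AS mean_amount
FROM SALES;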
2. Clustering Technique
Clustering is one among the oldest techniques used in Data Mining. Clustering analysis is the
process of identifying data that are similar to each other. This will help to understand the differences
and similarities between the data. This is sometimes called segmentation and helps the users to
understand what is going on within the database. For example, an insurance company can group its
customers based on their income, age, nature of policy and type of claims.
Partitioning Methods
The most popular clustering algorithm is Nearest Neighbour. The nearest neighbour technique is very
similar to clustering. It is a prediction technique: in order to predict an estimated value
in one record, it looks for records with similar estimated values in a historical database and uses the
prediction value from the record that is nearest to the unclassified record. This technique simply
states that objects which are closer to each other will have similar prediction values. Through this
method you can predict the values of the nearest objects very easily. Nearest Neighbour is the
easiest technique to use because it works the way people think. It also works very
well in terms of automation. It performs complex ROI calculations with ease. The level of
accuracy in this technique is as good as the other Data Mining techniques.
In business Nearest Neighbour technique is most often used in the process of Text Retrieval. They
are used to find the documents that share the important characteristics with that main document that
have been marked as interesting.
3. Visualization
Visualization is the most useful technique for discovering data patterns. This technique is
used at the beginning of the Data Mining process. Much research is under way these days to
produce interesting projections of databases, an approach called projection pursuit. There are a lot of data
mining techniques which will produce useful patterns for good data. But visualization is a technique
which converts poor data into good data, letting different kinds of Data Mining methods be used in
discovering hidden patterns.
4. Decision Tree
A decision tree is a predictive model and, as the name itself implies, it looks like a tree. In this
technique, each branch of the tree is viewed as a classification question and the leaves of the tree
are considered as partitions of the dataset related to that particular classification. This technique can
be used for exploration analysis, data pre-processing and prediction work.
Decision tree can be considered as a segmentation of the original dataset where segmentation is done
for a particular reason. Each data that comes under a segment has some similarities in their
information being predicted. Decision trees provides results that can be easily understood by the
user.
Decision tree technique is mostly used by statisticians to find out which database is more related to
the problem of the business. Decision tree technique can be used for Prediction and Data pre-
processing.
The first and foremost step in this technique is growing the tree. The basis of growing the tree
is finding the best possible question to be asked at each branch of the tree. The decision
tree stops growing under any one of the below circumstances
CART which stands for Classification and Regression Trees is a data exploration and prediction
algorithm which picks the questions in a more complex way. It tries them all and then selects one
best question which is used to split the data into two or more segments. After deciding on the
segments it again asks questions on each of the new segment individually.
Another popular decision tree technology is CHAID (Chi-Square Automatic Interaction Detector). It
is similar to CART but it differs in one way. CART helps in choosing the best questions whereas
CHAID helps in choosing the splits.
5. Neural Network
Neural networks are another important technique used by people these days. This technique is most
often used in the early stages of the data mining technology. Artificial neural networks grew
out of the artificial intelligence community.
Neural networks are very easy to use as they are automated to a particular extent and because of this
the user is not expected to have much knowledge about the work or database. But to make the neural
network work efficiently you need to know
There are two main parts of this technique – the node and the link:
The node– which loosely corresponds to the neuron in the human brain
The link– which loosely corresponds to the connections between the neurons in the
human brain
A neural network is a collection of interconnected neurons, which could form a single layer or
multiple layers. The formation of neurons and their interconnections is called the architecture of the
network. There are a wide variety of neural network models and each model has its own advantages
and disadvantages. Every neural network model has different architectures and these architectures
use different learning procedures.
Neural networks are a very strong predictive modelling technique. But they are not very easy to understand,
even by experts. They create very complex models which are impossible to understand fully. Thus, to make
the neural network technique understandable, companies are finding new solutions. Two solutions
have already been suggested:
The first solution is that the neural network is packaged up into a complete solution which
lets it be used for a single application
Neural network has been used in various kinds of applications. This has been used in the business to
detect frauds taking place in the business.
6. Association Rule Technique
This technique helps to find the association between two or more items. It helps to discover the
relations between different variables in databases. It discovers hidden patterns in the data sets
by identifying the variables, and the combinations of variables, that appear together with the highest frequencies.
This technique is most often used in retail industry to find patterns in sales. This will help increase
the conversion rate and thus increases profit.
7. Classification
Classification is the most commonly used data mining technique; it
uses a set of pre-classified samples to create a model which can classify a large set of data.
This technique helps in deriving important information about data and metadata (data about data).
This technique is closely related to the cluster analysis technique and it uses decision trees or neural
network systems. There are two main processes involved in this technique:
Classification– In this process the data is used to measure the precision of the
classification rules
Bayesian Classification
Neural Networks
Market basket analysis only uses transactions with more than one item, as no associations can be
made with single purchases. Item association does not necessarily suggest a cause and effect, but
simply a measure of co-occurrence. It does not mean that since energy drinks and video games are
frequently bought together, one is the cause for the purchase of the other, but it can be construed
from the information that this purchase is most probably made by (or for) a gamer. Such rules or
hypotheses must be tested and should not be taken as truth unless item sales say otherwise.
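As a rough sketch of how such co-occurrence can be counted, the query below assumes a hypothetical
TRANSACTION_ITEMS table with TRANSACTION_ID and ITEM columns; it counts how often each pair of items
appears in the same transaction:
SELECT a.ITEM AS item_1,
       b.ITEM AS item_2,
       COUNT(*) AS times_bought_together
FROM TRANSACTION_ITEMS a
JOIN TRANSACTION_ITEMS b
  ON a.TRANSACTION_ID = b.TRANSACTION_ID
 AND a.ITEM < b.ITEM            -- avoids self-pairs and duplicate orderings
GROUP BY a.ITEM, b.ITEM
ORDER BY times_bought_together DESC;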
Predictive MBA is used to classify cliques of item purchases, events and services
that largely occur in sequence.
Differential MBA removes a high volume of insignificant results and can lead to
very in-depth results. It compares information between different stores, demographics, seasons of the
year, days of the week and other factors.
MBA is commonly used by online retailers to make purchase suggestions to consumers. For
example, when a person buys a particular model of smartphone, the retailer may suggest other
products such as phone cases, screen protectors, memory cards or other accessories for that
particular phone. This is due to the frequency with which other consumers bought these items in the
same transaction as the phone.
MBA is also used in physical retail locations. Due to the increasing sophistication of point of sale
systems coupled with big data analytics, stores are using purchase data and MBA to help improve
store layouts so that consumers can more easily find items that are frequently purchased together.
Data mining is widely used in diverse areas. There are a number of commercial data mining systems
available today, and yet there are many challenges in this field. In this tutorial, we will discuss the
applications and the trends of data mining.
Here is the list of areas where data mining is widely used −
Retail Industry
Telecommunication Industry
Intrusion Detection
Design and construction of data warehouses for multidimensional data analysis and
data mining.
Retail Industry
Data mining has great application in the retail industry because it collects large amounts of data
on sales, customer purchasing history, goods transportation, consumption and services. It is natural
that the quantity of data collected will continue to expand rapidly because of the increasing ease,
availability and popularity of the web.
Data mining in retail industry helps in identifying customer buying patterns and trends that lead to
improved quality of customer service and good customer retention and satisfaction. Here is the list of
examples of data mining in the retail industry −
Design and Construction of data warehouses based on the benefits of data mining.
Customer Retention.
Product recommendation and cross-referencing of items.
Telecommunication Industry
Today the telecommunication industry is one of the fastest-growing industries, providing various
services such as fax, pager, cellular phone, internet messenger, images, e-mail, web data
transmission, etc. Due to the development of new computer and communication technologies, the
telecommunication industry is rapidly expanding. This is the reason why data mining has become very
important in helping to understand the business.
Other Scientific Applications
The applications discussed above tend to handle relatively small and homogeneous data sets for
which statistical techniques are appropriate. Huge amounts of data have been collected from
scientific domains such as geosciences, astronomy, etc. Large data sets are also being
generated because of fast numerical simulations in various fields such as climate and ecosystem
modeling, chemical engineering, fluid dynamics, etc. Following are the applications of data mining
in the field of scientific applications −
Graph-based mining.
Intrusion Detection
Intrusion refers to any kind of action that threatens the integrity, confidentiality, or availability of
network resources. In this world of connectivity, security has become a major issue. The
increased usage of the internet and the availability of tools and tricks for intruding into and attacking
networks have prompted intrusion detection to become a critical component of network administration.
Here is the list of areas in which data mining technology may be applied for intrusion detection −
There are many data mining system products and domain specific data mining applications. The new
data mining systems and applications are being added to the previous systems. Also, efforts are
being made to standardize data mining languages.
Data Types− The data mining system may handle formatted text, record-based data,
and relational data. The data could also be in ASCII text, relational database data or data warehouse
data. Therefore, we should check what exact format the data mining system can handle.
System Issues− We must consider the compatibility of a data mining system with
different operating systems. One data mining system may run on only one operating system or on
several. There are also data mining systems that provide web-based user interfaces and allow XML
data as input.
Data Sources− Data sources refer to the data formats in which the data mining system
will operate. Some data mining systems may work only on ASCII text files, while others work on multiple
relational sources. The data mining system should also support ODBC connections or OLE DB for ODBC
connections.
Data Mining functions and methodologies− There are some data mining systems
that provide only one data mining function, such as classification, while some provide multiple data
mining functions such as concept description, discovery-driven OLAP analysis, association mining,
linkage analysis, statistical analysis, classification, prediction, clustering, outlier analysis, similarity
search, etc.
Coupling data mining with databases or data warehouse systems− Data mining
systems need to be coupled with a database or a data warehouse system. The coupled components
are integrated into a uniform information processing environment. Here are the types of coupling
listed below −
o No coupling
o Loose Coupling
o Tight Coupling
o Data Visualization
Data Mining query language and graphical user interface− An easy-to-use graphical
user interface is important to promote user-guided, interactive data mining. Unlike relational
database systems, data mining systems do not share an underlying data mining query language.
Data mining concepts are still evolving and here are the latest trends that we get to see in this field −
Application Exploration.
Integration of data mining with database systems, data warehouse systems and web
database systems.
Web mining.
Types of knowledge
Knowledge management is an activity practiced by enterprises all over the world. In the process of
knowledge management, these enterprises comprehensively gather information using many methods
and tools.
Then, gathered information is organized, stored, shared, and analyzed using defined techniques.
The analysis of such information will be based on resources, documents, people and their skills.
Properly analyzed information will then be stored as ‘knowledge’ of the enterprise. This knowledge
is later used for activities such as organizational decision making and training new staff members.
There have been many approaches to knowledge management from the early days. Most early
approaches involved manual storing and analysis of information. With the introduction of
computers, most organizational knowledge and management processes have been automated.
Therefore, information storing, retrieval and sharing have become convenient. Nowadays, most
enterprises have their own knowledge management framework in place.
The framework defines the knowledge gathering points, gathering techniques, tools used, data
storing tools and techniques and analyzing mechanism.
1. A Priori
A priori and a posteriori are two of the original terms in epistemology (the study of knowledge). A
priori literally means “from before” or “from earlier.” This is because a priori knowledge depends
upon what a person can derive from the world without needing to experience it. This is better known
as reasoning. Of course, a degree of experience is necessary upon which a priori knowledge can
take shape.
Let’s look at an example. If you were in a closed room with no windows and someone asked you
what the weather was like, you would not be able to answer them with any degree of truth. If you
did, then you certainly would not be in possession of a priori knowledge. It would simply be
impossible to use reasoning to produce a knowledgeable answer.
On the other hand, if there were a chalkboard in the room and someone wrote the equation 4 + 6
= ? on the board, then you could find the answer without physically finding four objects and adding
six more objects to them and then counting them. You would know the answer is 10 without needing
a real world experience to understand it. In fact, mathematical equations are one of the most popular
examples of a priori knowledge.
2. A Posteriori
Naturally, then, a posteriori literally means “from what comes later” or “from what comes after.”
This is a reference to experience and using a different kind of reasoning (inductive) to gain
knowledge. This kind of knowledge is gained by first having an experience (and the important idea
in philosophy is that it is acquired through the five senses) and then using logic and reflection to
derive understanding from it. In philosophy, this term is sometimes used interchangeably with
empirical knowledge, which is knowledge based on observation.
It is believed that a priori knowledge is more reliable than a posteriori knowledge. This might seem
counter-intuitive, since in the former case someone can just sit inside a room and reason their way to
knowledge, while in the latter case someone is having real experiences in the world. But the problem
lies in this very fact: everyone's experiences are subjective and open to interpretation. A
mathematical equation, on the other hand, is law.
3. Explicit Knowledge
Now we are entering the realm of explicit and tacit knowledge. As you have noticed by now, types of
knowledge tend to come in pairs and are often antitheses of each other. Explicit knowledge is similar
to a priori knowledge in that it is more formal or perhaps more reliable. Explicit knowledge is
knowledge that is recorded and communicated through mediums. It is our libraries and databases.
The specifics of what is contained are less important than how it is contained. Anything from the
sciences to the arts can have elements that can be expressed as explicit knowledge.
The defining feature of explicit knowledge is that it can be easily and quickly transmitted from one
individual to another, or to another ten-thousand or ten-billion. It also tends to be organized
systematically. For example, a history textbook on the founding of America would take a
chronological approach as this would allow knowledge to build upon itself through a progressive
system; in this case, time.
4. Tacit Knowledge
I should note that tacit knowledge is a relatively new theory introduced only as recently as the 1950s.
Whereas explicit knowledge is very easy to communicate and transfer from one individual to
another, tacit knowledge is precisely the opposite. It is extremely difficult, if not impossible, to
communicate tacit knowledge through any medium.
For example, the textbook on the founding of America can teach facts (or things we believe to be
facts), but an expert musician cannot truly communicate their knowledge; in other words, they
cannot simply tell someone how to play the instrument and expect that person to immediately
possess that skill. That knowledge must be acquired to a degree that goes far, far beyond
theory. In this sense, tacit knowledge would most closely resemble a posteriori knowledge, as it can
only be achieved through experience.
The biggest difficulty with tacit knowledge is knowing when it is useful and figuring out how to make it
usable. Tacit knowledge can only be communicated through consistent and extensive relationships or
contact (such as taking lessons from a professional musician). But even in these cases there will not be
a true transfer of knowledge. Usually two forms of knowledge are born, as each person must fill in
certain blanks (such as skill, short-cuts, rhythms, etc.).
5. Propositional Knowledge
Our last pair of knowledge theories are propositional and non-propositional knowledge, both of
which share similarities with some of the other theories already discussed. Propositional knowledge
has perhaps the oddest definition yet: it is commonly held to be knowledge that can literally be
expressed in propositions, that is, in declarative sentences (to use its other name) or indicative
propositions.
Propositional knowledge is not so different from a priori and explicit knowledge. The key attribute
is knowing that something is true. Again, mathematical equations could be an example of
propositional knowledge, because it is knowledge of something, as opposed to knowledge of how to
do something.
The best example is one that contrasts propositional knowledge with our next form of knowledge,
non-propositional or procedural knowledge. Let’s use a textbook/manual/instructional pamphlet that
has information on how to program a computer as our example. Propositional knowledge is simply
knowing something or having knowledge of something. So if you read and/or memorized the
textbook or manual, then you would know the steps on how to program a computer. You could even
repeat these steps to someone else in the form of declarative sentences or indicative propositions.
However, you may have memorized every word yet have no idea how to actually program a
computer. That is where non-propositional or procedural knowledge comes in.
6. Non-Propositional (Procedural) Knowledge
Non-propositional knowledge (better known as procedural knowledge; "non-propositional" is used
here because it is a more obvious antithesis to "propositional") is knowledge that can be used; it can
be applied to something, such as a problem. Procedural knowledge differs from propositional
knowledge in that it is acquired "by doing"; propositional knowledge is acquired through more
conservative forms of learning.
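One way (a toy illustration of my own, not part of the original discussion) to make the contrast concrete in code: the list of declarative sentences is what you could recite after memorising a manual, while the function is the "doing" itself.

```python
# Propositional knowledge: declarative statements *about* the task, which you
# could repeat to someone else without being able to perform it.
steps_you_could_recite = [
    "Open a file for writing.",
    "Write a line of text to it.",
    "Close the file.",
]

# Procedural knowledge: actually being able to carry the task out.
def write_greeting(path):
    with open(path, "w") as f:         # open a file for writing
        f.write("Hello, world!\n")     # write a line of text to it
                                       # leaving the 'with' block closes the file

write_greeting("greeting.txt")
```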
One of the defining characteristics of procedural knowledge is that it can be claimed in a court of
law: companies that develop their own procedures or methods can protect them as intellectual
property, which can then be sold, licensed, or leased.
Procedural knowledge has many advantages. Obviously, hands-on experience is extremely valuable;
literally so, as it can be used to obtain employment. We are seeing this today as experience
(procedural) is eclipsing education (propositional). Sure, education is great, but experience is what
defines what a person is capable of accomplishing. So someone who “knows” how to write code is
not nearly as valuable as someone who “writes” or “has written” code. However, some people
believe that this is a double-edged sword, as the degree of experience required to become proficient
confines a person to a relatively narrow specialty.
But nobody can deny the intrinsic and real value of experience. This is often more accurate than
propositional knowledge because it is more akin to the scientific method; hypotheses are tested,
observation is used, and progress results.
Intranet
Internal expertise
Definition of KMS
Purpose of KMS
Improved performance
Competitive advantage
Innovation
Sharing of knowledge
Integration
o Driving strategy
Start with the business problem and the business value to be delivered first.
Identify what kind of strategy to pursue to deliver this value and address the KM
problem.
Think about the system required from a people and process point of view.
Finally, think about what kind of technical infrastructure is required to support the
people and processes.
Implement system and processes with appropriate change management and iterative
staged release.
Knowledge Management Technologies also support knowledge management systems and benefit
from the knowledge management infrastructure, especially the information technology
infrastructure. KM technologies constitute a key component of KM systems.
Technologies that support KM include artificial intelligence (AI) technologies including those used
for knowledge acquisition and case-based reasoning systems, electronic discussion groups,
computer-based simulations, databases, decision support systems, enterprise resource planning
systems, expert systems, management information systems, expertise locator systems,
videoconferencing, and information repositories including best practices databases and lessons
learned systems. KM technologies also include the emergent Web 2.0 technologies, such as wikis
and blogs (Becerra-Fernandez and Sabherwal, 2010).
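To illustrate just one of the technologies named above: a case-based reasoning system, at its core, retrieves the stored case most similar to a new problem and reuses its solution. The sketch below assumes invented helpdesk cases and uses a simple Jaccard similarity over symptom sets; it shows only the retrieval idea, not a full CBR cycle.

```python
# Minimal case-based reasoning sketch: find the most similar past case
# (the cases and symptoms are invented for illustration).
past_cases = [
    {"symptoms": {"no_power", "battery_old"},      "fix": "replace battery"},
    {"symptoms": {"overheating", "fan_noise"},     "fix": "clean cooling fan"},
    {"symptoms": {"no_power", "charger_damaged"},  "fix": "replace charger"},
]

def most_similar_case(new_symptoms):
    # Jaccard similarity between symptom sets as a crude similarity measure.
    def score(case):
        s = case["symptoms"]
        return len(s & new_symptoms) / len(s | new_symptoms)
    return max(past_cases, key=score)

print(most_similar_case({"no_power", "charger_damaged", "hot_to_touch"})["fix"])
# -> replace charger
```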
Knowledge management mechanisms and technologies work together and affect each other, shaping
how information technology influences knowledge management in practice.
There are four main knowledge management processes, and each process comprises two sub-
processes (a simple lookup-table sketch in code follows the list):
Knowledge discovery
o Combination
o Socialization
Knowledge capture
o Externalization
o Internalization
Knowledge sharing
o Socialization
o Exchange
Knowledge application
o Direction
o Routines
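For reference, the same taxonomy can be captured as a small lookup table, for example to tag KM activities in a tracking tool. This is only an illustrative data structure, not part of any standard.

```python
# The four KM processes and their sub-processes from the list above.
km_processes = {
    "knowledge discovery":   ["combination", "socialization"],
    "knowledge capture":     ["externalization", "internalization"],
    "knowledge sharing":     ["socialization", "exchange"],
    "knowledge application": ["direction", "routines"],
}

def parent_processes(sub_process):
    """Return every KM process that includes the given sub-process."""
    return [p for p, subs in km_processes.items() if sub_process in subs]

print(parent_processes("socialization"))
# -> ['knowledge discovery', 'knowledge sharing']
```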
Organizations are closely watching emerging technology trends to discover the next great
competitive advantage in the use of information. One trend is easy to identify: more information.
Data volumes are growing across the board, with organizations seeking to tap new sources generated
by social media and online customer behavior. This trend is spurring tremendous interest in better
access and analysis of the variety of information available in unstructured or semi-structured content
sources.
From a macro perspective, it’s easy to identify the biggest long-term trend in business intelligence:
providing nontechnical users with the tools and capabilities to access, analyze, and share data on
their own. However, the road to this destination has not been easy. With IT driving application
development and deployment, standard approaches to extending enterprise BI and data analysis
capabilities have been difficult and slow. Getting the requirements right for the data, reports,
visualization, and drill-down analysis capabilities is difficult and never fully satisfactory. By the time
requirements have been gathered and turned into application features, users will have identified
different requirements.
2. Unified Access and Analysis of All Types of Information Improves User Productivity
As the implementation of BI and analytics tools spreads to more users within organizations, a
question inevitably arises: What about all the information in text and document formats, which
accounts for the vast majority of what users encounter? Difficulty in finding information, whether
structured or unstructured, is a productivity cost to organizations. If one of the measures of BI’s
value is improved productivity, then BI should help users access and analyze unstructured as well as
structured information.
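A toy sketch of what "unified" access can mean in practice: a single keyword search that scans structured records and free-text notes in one pass. The records, notes, and field names are invented for illustration.

```python
# Illustrative unified search over structured rows and unstructured text.
structured_orders = [
    {"order_id": 1001, "product": "espresso machine", "status": "returned"},
    {"order_id": 1002, "product": "coffee grinder",   "status": "shipped"},
]
unstructured_notes = [
    "Customer emailed: the espresso machine arrived with a cracked water tank.",
    "Support call: grinder works fine, customer happy.",
]

def unified_search(term):
    term = term.lower()
    hits = [r for r in structured_orders if term in r["product"].lower()]
    hits += [doc for doc in unstructured_notes if term in doc.lower()]
    return hits

for hit in unified_search("espresso"):
    print(hit)   # returns one order row and one support note
```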
Customer data intelligence has long been a major driver behind growth in the implementation of
sophisticated analytics for prediction and pattern recognition as well as advanced data warehousing.
In the brick-and-mortar days, organizations wanted to slice, dice, and mine transaction data and
interpret it against demographic information. Advanced organizations sought to mine the data to
uncover buying patterns and product affinities. As e-commerce and call centers proliferated,
organizations needed to expand customer analysis to include interaction information recorded in all
channels, bringing more terabytes into their data warehouses.
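The slice-and-dice of transaction data against demographic information described above can be illustrated with a small, hypothetical example using pandas: join transactions to customer demographics, then aggregate spend by segment.

```python
# Toy slice-and-dice: total spend by age group (all data invented).
import pandas as pd

transactions = pd.DataFrame({
    "customer_id": [1, 1, 2, 3, 3],
    "amount":      [20.0, 35.0, 15.0, 50.0, 5.0],
})
demographics = pd.DataFrame({
    "customer_id": [1, 2, 3],
    "age_group":   ["18-34", "35-54", "18-34"],
})

# Join transaction facts to demographic attributes, then aggregate.
merged = transactions.merge(demographics, on="customer_id")
print(merged.groupby("age_group")["amount"].sum())
```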
Now, with Twitter, Facebook, and other sites, we have hit the social media age: customers are using
social networks to influence others and express their shopping interests and experiences.
Organizations are hungry to capture and analyze activity by current and potential customers in social
networks and comment fields across the Internet marketplace.
4. Text Analytics Enables Organizations to Interpret Social Media Sentiment Trends and
Commentary
Rising interest in social media analysis is putting the spotlight on text analytics, which is the critical
technology for understanding “sentiment” in social media, as well as customer reviews and other
content sources. Like data mining, the text mining and analytics category stretches to include a range
of techniques and software, such as natural language processing, relationship extraction,
visualization, and predictive analysis.
Text analytics falls within the realm of interpretation rather than exact science, which makes it a nice
complement to BI and structured data analytics. Sentiment analysis, for example, employs statistical
and linguistic text analysis methods to understand positive and negative comments. While this
analysis can provide an early sense of the reception of a new product or service, the interpretation
cannot replace the more exacting analysis of the numbers done with BI or structured analytics tools.
Sentiment analysis, however, can help organizations become more proactive in taking steps to
address negative reactions to products and services before they lead to the poor sales that BI and data
warehouse users detect later in the reporting and analysis of sales transaction figures.
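At its simplest, sentiment scoring can be sketched as a lexicon lookup that counts positive and negative words; real text analytics uses far richer statistical and linguistic models, and the word lists below are invented purely for illustration.

```python
# Toy lexicon-based sentiment scorer (illustrative word lists only).
POSITIVE = {"great", "love", "excellent", "happy"}
NEGATIVE = {"terrible", "hate", "broken", "disappointed"}

def sentiment_score(text):
    words = [w.strip(".,!?") for w in text.lower().split()]
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)

for comment in ["Love the new phone, excellent battery",
                "Screen arrived broken, very disappointed"]:
    print(comment, "->", sentiment_score(comment))   # -> 2 and -2
```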
When limited to a reactive posture, organizations face delays and confusion in how to respond to
events, which can lead to increased costs and missed opportunities. Reactive organizations lack a
well-orchestrated plan and can only respond to events on a case-by-case basis. With speed and
complexity rising in many industries, a reactive posture isn’t good enough. Organizations need
business intelligence and analytics applications and services that will help them shift from a reactive
to a proactive and predictive posture. Traditional BI systems are not enough for organizations to
make this shift.
Decision management is the term industry experts and vendors use to describe the integration of
analytics with business rules and process management systems to achieve a predictive and proactive
posture in a real-time world. Decision management requires several technologies. Business rules, or
conditional statements for guiding decision processes, are common in application code and logic; the
challenge is to implement business rules systems that can guide decisions across applications and
processes, not just within one system. Business process management systems help organizations
optimize processes that cross applications and use analytics as part of the continuous improvement
of those processes.
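A minimal sketch of business rules kept as data rather than buried in application code: each rule is a (condition, action) pair that any application sharing the rule set could evaluate. The rules, fields, and thresholds below are invented for illustration.

```python
# Illustrative rule set evaluated in priority order; the first match wins.
RULES = [
    (lambda order: order["amount"] > 10_000,          "route to manual review"),
    (lambda order: order["customer_risk"] == "high",  "require extra verification"),
    (lambda order: True,                              "auto-approve"),  # default rule
]

def decide(order):
    for condition, action in RULES:
        if condition(order):
            return action

print(decide({"amount": 12_500, "customer_risk": "low"}))   # route to manual review
print(decide({"amount": 300,    "customer_risk": "low"}))   # auto-approve
```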
Along with business rules and business process management, a third technology important to
decision management is complex (or business) event processing. Events are happening everywhere;
they are recorded or “sensed” from online behavior, RFID tags, manufacturing systems, surveillance,
financial services trading, and so on. Integrated with analytics and data visualization, event
processing systems can enable organizations to pick out meaningful events from a stream or “cloud”
of noise that is not important.
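A toy sketch of the event-processing idea: pull one meaningful pattern (here, repeated failed logins for the same account within a short window) out of a stream in which most events are noise. The event shape, window, and threshold are invented for illustration.

```python
# Illustrative event-stream filter: alert on bursts of failed logins.
from collections import defaultdict, deque

WINDOW_SECONDS = 60
THRESHOLD = 3
recent = defaultdict(deque)        # account -> timestamps of recent failures

def process(event):
    if event["type"] != "login_failed":
        return None                # most events are irrelevant to this rule
    q = recent[event["account"]]
    q.append(event["ts"])
    while q and event["ts"] - q[0] > WINDOW_SECONDS:
        q.popleft()                # discard events outside the time window
    if len(q) >= THRESHOLD:
        return f"ALERT: repeated login failures for {event['account']}"
    return None

stream = [
    {"type": "login_failed", "account": "alice", "ts": 0},
    {"type": "page_view",    "account": "bob",   "ts": 5},
    {"type": "login_failed", "account": "alice", "ts": 20},
    {"type": "login_failed", "account": "alice", "ts": 40},
]
for event in stream:
    alert = process(event)
    if alert:
        print(alert)               # fires on the third failure inside 60 seconds
```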
Organizations can use decision management technologies to automate decisions where speed and
complexity overwhelm human-centered decision processes, and where there are competitive
advantages to having decisions executed in real time and driven by predictive models. Decision
management is an emerging technology area currently focused on specialized systems, but as
demand for greater execution speed and efficiency grows, more organizations will evaluate its
potential for mainstream requirements.
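A small sketch of what automated, model-driven decisioning can look like: a stand-in scoring function plays the role of a predictive model, and a simple policy acts on its output in real time, routing only the uncertain middle band to a person. All field names and cut-offs are invented.

```python
# Illustrative decision automation: predictive score + policy thresholds.
def churn_risk(customer):
    # Stand-in for a real predictive model's probability output.
    score = 0.2
    if customer["support_tickets_last_month"] > 3:
        score += 0.4
    if customer["months_since_last_order"] > 6:
        score += 0.3
    return min(score, 1.0)

def decide(customer):
    risk = churn_risk(customer)
    if risk >= 0.7:
        return "trigger retention offer automatically"
    if risk >= 0.4:
        return "queue for account manager review"
    return "no action"

print(decide({"support_tickets_last_month": 5, "months_since_last_order": 8}))
# -> trigger retention offer automatically
```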