Vous êtes sur la page 1sur 6

OLAP and Data Mining

In large data warehouse environments, many different types of analysis can occur.
In addition to SQL queries, you may also apply more advanced analytical
operations to your data. Two major types of such analysis are OLAP (On-Line
Analytic Processing) and data mining. Rather than having a separate OLAP or data
mining engine, Oracle has integrated OLAP and data mining capabilities directly
into the database server. Oracle OLAP and Oracle Data Mining are options to the
Oracle9i Database. This chapter provides a brief introduction to these technologies,
and more detail can be found in these products' respective documentation.
The following topics provide an introduction to Oracle's OLAP and data mining
capabilities:
OLAP
Data Mining
See Also:
Oracle9i OLAP User's Guide for further information regarding OLAP
and Oracle Data Mining documentation for further information regarding
data mining

OLAP
Oracle9i OLAP adds the query performance and calculation capability previously
found only in multidimensional databases to Oracle's relational platform. In
addition, it provides a Java OLAP API that is appropriate for the development of
internet-ready analytical applications. Unlike other combinations of OLAP and
RDBMS technology, Oracle9i OLAP is not a multidimensional database using
bridges to move data from the relational data store to a multidimensional data
store. Instead, it is truly an OLAP-enabled relational database. As a result,
Oracle9i provides the benefits of a multidimensional database along with the
scalability, accessibility, security, manageability, and high availability of the
Oracle9i database. The Java OLAP API, which is specifically designed for internetbased analytical applications, offers productive data access.
See Also:
Oracle9i OLAP User's Guide for further information regarding OLAP

Benefits of OLAP and RDBMS Integration


Basing an OLAP system directly on the Oracle server offers the following benefits:
Scalability
Availability
Manageability
Backup and Recovery
Security
Scalability
Oracle9i OLAP is highly scalable. In today's environment, there is tremendous
growth along three dimensions of analytic applications: number of users, size of
data, complexity of analyses. There are more users of analytical applications, and
they need access to more data to perform more sophisticated analysis and target
marketing. For example, a telephone company might want a customer dimension to
include detail such as all telephone numbers as part of an application that is used to
analyze customer turnover. This would require support for multi-million row
dimension tables and very large volumes of fact data. Oracle9 i can handle very
large data sets using parallel execution and partitioning, as well as offering support
for advanced hardware and clustering.
Availability
Oracle9i includes many features that support high availability. One of the most
significant is partitioning, which allows management of precise subsets of tables
and indexes, so that management operations affect only small pieces of these data
structures. By partitioning tables and indexes, data management processing time is
reduced, thus minimizing the time data is unavailable. Another feature supporting
high availability is transportable tablespaces. With transportable tablespaces, large
data sets, including tables and indexes, can be added with almost no processing to
other databases. This enables extremely rapid data loading and updates.
Manageability
Oracle enables you to precisely control resource utilization. The Database
Resource Manager, for example, provides a mechanism for allocating the resources
of a data warehouse among different sets of end-users. Consider an environment
where the marketing department and the sales department share an OLAP system.
Using the Database Resource Manager, you could specify that the marketing

department receive at least 60 percent of the CPU resources of the machines, while
the sales department receive 40 percent of the CPU resources. You can also further
specify limits on the total number of active sessions, and the degree of parallelism
of individual queries for each department.
Another resource management facility is the progress monitor, which gives end
users and administrators the status of long-running operations. Oracle9 i maintains
statistics describing the percent-complete of these operations. Oracle Enterprise
Manager enables you to view a bar-graph display of these operations showing what
percent complete they are. Moreover, any other tool or any database administrator
can also retrieve progress information directly from the Oracle data server, using
system views.
Backup and Recovery
Oracle provides a server-managed infrastructure for backup, restore, and recovery
tasks that enables simpler, safer operations at terabyte scale. Some of the highlights
are:
Details related to backup, restore, and recovery operations are maintained by
the server in a recovery catalog and automatically used as part of these
operations. This reduces administrative burden and minimizes the possibility
of human errors.
Backup and recovery operations are fully integrated with partitioning.
Individual partitions, when placed in their own tablespaces, can be backed
up and restored independently of the other partitions of a table.
Oracle includes support for incremental backup and recovery using
Recovery Manager, enabling operations to be completed efficiently within
times proportional to the amount of changes, rather than the overall size of
the database.
The backup and recovery technology is highly scalable, and provides tight
interfaces to industry-leading media management subsystems. This provides
for efficient operations that can scale up to handle very large volumes of
data. Open Platforms for more hardware options & enterprise-level
platforms.
See Also:
Oracle9i Recovery Manager User's Guide for further details

Security

Just as the demands of real-world transaction processing required Oracle to


develop robust features for scalability, manageability and backup and recovery,
they lead Oracle to create industry-leading security features. The security features
in Oracle have reached the highest levels of U.S. government certification for
database trustworthiness. Oracle's fine grained access control feature, enables celllevel security for OLAP users. Fine grained access control works with minimal
burden on query processing, and it enables efficient centralized security
management.

Data Mining
Oracle enables data mining inside the database for performance and scalability.
Some of the capabilities are:
An API that provides programmatic control and application integration
Analytical capabilities with OLAP and statistical functions in the database
Multiple algorithms: Nave Bayes, decision trees, clustering, and association
rules
Real-time and batch scoring modes
Multiple prediction types
Association insights
See Also:
Oracle Data Mining documentation for more information

Enabling Data Mining Applications


Oracle9i Data Mining provides a Java API to exploit the data mining functionality
that is embedded within the Oracle9 i database.
By delivering complete programmatic control of the database in data mining,
Oracle Data Mining (ODM) delivers powerful, scalable modeling and real-time
scoring. This enables e-businesses to incorporate predictions and classifications in
all processes and decision points throughout the business cycle.
ODM is designed to meet the challenges of vast amounts of data, delivering
accurate insights completely integrated into e-business applications. This

integrated intelligence enables the automation and decision speed that e-businesses
require in order to compete today.
Predictions and Insights
Oracle Data Mining uses data mining algorithms to sift through the large volumes
of data generated by e-businesses to produce, evaluate, and deploy predictive
models. It also enriches mission critical applications in CRM, manufacturing
control, inventory management, customer service and support, Web portals,
wireless devices and other fields with context-specific recommendations and
predictive monitoring of critical processes. ODM delivers real-time answers to
questions such as:
Which N items is person A most likely to buy or like?
What is the likelihood that this product will be returned for repair?
Mining Within the Database Architecture
Oracle Data Mining performs all the phases of data mining within the database. In
each data mining phase, this architecture results in significant improvements
including performance, automation, and integration.
Data Preparation
Data preparation can create new tables or views of existing data. Both options
perform faster than moving data to an external data mining utility and offer the
programmer the option of snap-shots or real-time updates.
Oracle Data Mining provides utilities for complex, data mining-specific tasks.
Binning improves model build time and model performance, so ODM provides a
utility for user-defined binning. ODM accepts data in either single record format or
in transactional format and performs mining on transactional formats. Single
record format is most common in applications, so ODM provides a utility for
transforming single record format.
Associated analysis for preparatory data exploration and model evaluation is
extended by Oracle's statistical functions and OLAP capabilities. Because these
also operate within the database, they can all be incorporated into a seamless
application that shares database objects. This allows for more functional and faster
applications.
Model Building

Oracle Data Mining provides four algorithms: Nave Bayes, Decision Tree,
Clustering, and Association Rules. These algorithms address a broad spectrum of
business problems, ranging from predicting the future likelihood of a customer
purchasing a given product, to understand which products are likely be purchased
together in a single trip to the grocery store. All model building takes place inside
the database. Once again, the data does not need to move outside the database in
order to build the model, and therefore the entire data-mining process is
accelerated.
Model Evaluation
Models are stored in the database and directly accessible for evaluation, reporting,
and further analysis by a wide variety of tools and application functions. ODM
provides APIs for calculating traditional confusion matrixes and lift charts. It stores
the models, the underlying data, and these analysis results together in the database
to allow further analysis, reporting and application specific model management.
Scoring
Oracle Data Mining provides both batch and real-time scoring. In batch mode,
ODM takes a table as input. It scores every record, and returns a scored table as a
result. In real-time mode, parameters for a single record are passed in and the
scores are returned in a Java object.
In both modes, ODM can deliver a variety of scores. It can return a rating or
probability of a specific outcome. Alternatively it can return a predicted outcome
and the probability of that outcome occurring. Some examples follow.
How likely is this event to end in outcome A?
Which outcome is most likely to result from this event?
What is the probability of each possible outcome for this event?
Java API
The Oracle Data Mining API lets you build analytical models and deliver real-time
predictions in any application that supports Java. The API is based on the emerging
JSR-073 standard.

Vous aimerez peut-être aussi