
1 Copyright 2012, Oracle and/or its affiliates. All rights reserved.



Discovery in Action: The Transformative Power of Oracle Endeca Information Discovery
Adam Ferrari
VP Development

Richard Tomlinson
Director, Product Management

Agenda

Introduction to Endeca Information Discovery


Demos
Cooking Show

Dialogues Contain Critical, Untapped Insights
Customers: "Why should I pay a checking fee if other banks don't charge one?"
Workforce: "She saved the project with strong leadership & building trust with the customer."
3rd Parties: "Competitive pricing is 15% lower than yours and they offer discounted shipping."
The Public: "RT @finwiz. Checkout video http://bit.ly/wLe6Y2 Sweet!"

New Insights Drive New Opportunities

Customers:
  Improve Customer Loyalty and Satisfaction
  Build Better Products and Services

Workforce:
  Improve Productivity
  Retain Best Employees
  Attract A-Players

3rd Parties:
  Improve Operational Efficiency
  Create and Maintain Better Partnerships

The Public:
  Understand Unsolicited Feedback
  Improve Brand Perception and Sentiment

Impact:
  Revenue & margin
  Operational efficiency
  Better partnerships
  Product positioning

The Challenges of Unstructured Data
MOSTLY TEXT, AND DIVERSE SCHEMAS (E.G. XML)
DATA CAN BE DIRTY OR OF UNCERTAIN VALUE
DATA IS GROWING IN VOLUME AND DIVERSITY

20% STRUCTURED: Business Intelligence and Data Warehouses
80% UNSTRUCTURED: Text in Enterprise Applications; Enterprise Content Systems, File Systems, Email; Websites; Social Media; Big Data

Oracle Endeca Information Discovery
Rapid, intuitive exploration and analysis of data from any combination of structured and unstructured sources

Unstructured Analytics

Benefits
Unprecedented Information Visibility
Leverage Existing BI Investments
Self-Service Data Discovery
Reduced IT Costs, Better Business Decisions

Unique Features
Contextual Search, Navigation, Analytics
Dynamic Data and Metadata
Content Acquisition and Text Enrichment
In-Memory Performance

Extend Business Analytics with Unstructured Data
Introducing Oracle Endeca Information Discovery

Oracle Business Intelligence
  Best platform for integrated ROLAP and MOLAP
  BI Server + Essbase
  Common Enterprise Information Model
  Structured Data
  Sources: OLTP & ODS Systems; Enterprise Applications (Oracle, SAP, Others); Data Warehouse & Data Marts

Oracle Endeca Information Discovery
  Best platform for Unstructured Analytics
  Endeca Server
  Hybrid Search/Analytical Database, Flexible Data Model
  Unstructured Data
  Sources: Websites; Content Systems, Files, Email; Social Media; Big Data

Oracle Endeca Information Discovery
Platform Technology Overview
Studio: Intuitive Exploration and Analysis
  Studio Web Application
    Contextual Search, Navigation, Analytics
    Qualitative and Outlier Visualizations
    Create and Share Apps
    Easy Drag-and-Drop Applications

Endeca Server: Hybrid Search/Analytical Database, In-Memory Architecture
  Endeca Server Core Database
    Dynamic Data and Metadata
    In-Memory, Multi-Threaded Performance
    Enterprise Scale, Security

Integration Suite: Data Integration and Enrichment, Structured and Unstructured
  Integration Suite ETL
    Integrates Structured and Unstructured
    Text Enrichment and Sentiment Analysis

Oracle Endeca Information Discovery
Understand the Complete Picture with Context from Any Source
How do we avoid costly product recalls?

Data Warehouse / Business Intelligence
  Product Sales
    Metric: Sale Price
    Dimensions: Customer, Product, Dealer, Date
  Warranty Claims
    Metric: Claim Count, Labor Cost, Part Cost
    Dimensions: Customer, Product, Part, Dealer, Date

  Warranty Claims
    ClaimID  ProdID  PartID  Date  CustID  Dealer   PartCost  LaborCost
    12324    506     234     12/3  1233    Dealer1  $300      $200
    12325    507     235     12/4  1545    Dealer2  $450      $900

  Sales Transactions
    ProdID  Wk  CustID  Date  Dealer   Price
    506     25  1233    10/3  Dealer1  $35,000
    507     26  1545    09/4  Dealer2  $22,000

External Content
  Government Agencies
    Safety Administration: Claim from Competitor X Model ABC
    "After driving this car for only 3 months, I started having.."

Websites
  Industry Forums
    ".. focus on passenger vehicle crashes, and are used to investigate injury mechanisms to identify potential improvements in vehicle design."

Product Quality Application
  Customer Verbatim
    "..customer heard a rattling sound toward left front driver side. Had issues with steering column locking"

Social Media
  Consumer Comments and Sentiment
    "Love my new car but having difficulty controlling steering on sharp corners.."

COOKING SHOW

Oracle Endeca Information Discovery
Understand the Complete Picture with Context from Any Source
How do we better combat international terrorism?

Data Warehouse / Business Intelligence
  Incidents
    Metric: # incidents, # wounded, etc.
    Dimensions: Event Type, Weapon, Location, Date

  Terrorist Incidents
    IncidentID  EventType     Weapon   Date  Region       Country  GeoCoord            NumWounded
    12324       IED           Bomb     12/3  Africa       Somalia  -1.234, 10.1234     5
    12325       Armed Attack  Rockets  12/4  Middle East  Iran     -20.1234, 54.1234   12

External Content
  3rd Party Agencies
    "Several civilians wounded in assault by al-Shabaab al-Islamiya in Mogadishu, Somalia"

Websites
  News Media
    "..Somalia: 1 dead after bomb explodes near vehicle. Try http://t.co/ve54JafA."

Content Management System
  Incident Long Text / Notes
    "in the Israac neighborhood of Gaalkacyo, Somalia, assailants detonated an improvised explosive device (IED) near a vehicle occupied"

Social Media
  Citizen Comments and Sentiment
    "Reports of IED detonation in Somalia, captors holding an American hostage moved him 3 times in less than 24 hours.."

Part 1: Initial Data Load

Starting with raw incident data, our first goal is to create an application like this, which enables search and navigation of terrorism incidents.

Part 1: Initial Data Load Details

1. Start with raw CSV data

2. Data loaded using Integration Suite

3. Resulting key/value representation in the Endeca Server
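
The three steps on this slide can be pictured with a minimal sketch. It is plain Python standing in for the Integration Suite load, not the product's actual API: the sample rows borrow values from the Terrorist Incidents table earlier in the deck, and `RAW_CSV` and `load_records` are hypothetical names chosen for illustration.

```python
import csv
import io

# Sample rows modeled on the deck's "Terrorist Incidents" table (GeoCoord omitted).
RAW_CSV = """IncidentID,EventType,Weapon,Date,Region,Country,NumWounded
12324,IED,Bomb,12/3,Africa,Somalia,5
12325,Armed Attack,Rockets,12/4,Middle East,Iran,12
"""


def load_records(text):
    """Turn each CSV row into a list of (attribute, value) pairs.

    Empty cells are skipped, so a record only carries the attributes
    it actually has values for.
    """
    reader = csv.DictReader(io.StringIO(text))
    return [[(key, value) for key, value in row.items() if value] for row in reader]


records = load_records(RAW_CSV)
for record in records:
    print(record)
```

Each row becomes a flat collection of attribute/value pairs rather than a row bound to a fixed table schema, which is the shape the later enrichment steps build on.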

Part 2: Text Enrichment

Next, we enrich the incident data by mining the large text description field associated with the incident. This uncovers themes, entities (people, companies, places), and the associated sentiment for each.
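
As a rough illustration of what enrichment adds to a record, the sketch below derives entity and sentiment attributes from a description field. It is a toy, assuming tiny hand-written word lists; the real text enrichment engine is far more sophisticated, and `NEGATIVE`, `POSITIVE`, `KNOWN_PLACES`, `enrich`, and the `Description` attribute name are all hypothetical.

```python
import re

# Hypothetical, deliberately tiny lexicons for illustration only.
NEGATIVE = {"wounded", "dead", "detonated", "assault"}
POSITIVE = {"rescued", "safe", "released"}
KNOWN_PLACES = {"Somalia", "Mogadishu", "Gaalkacyo"}


def enrich(record, text_field="Description"):
    """Return a copy of the record with derived Entity and Sentiment pairs appended."""
    text = next(value for key, value in record if key == text_field)
    words = re.findall(r"[A-Za-z']+", text)

    enriched = list(record)
    # One Entity pair per recognized place: the attribute becomes multi-valued.
    for place in sorted(KNOWN_PLACES.intersection(words)):
        enriched.append(("Entity", place))

    lowered = [word.lower() for word in words]
    score = sum(w in POSITIVE for w in lowered) - sum(w in NEGATIVE for w in lowered)
    label = "Positive" if score > 0 else "Negative" if score < 0 else "Neutral"
    enriched.append(("Sentiment", label))
    return enriched


incident = [
    ("IncidentID", "12324"),
    ("Description", "Assailants detonated an improvised explosive device "
                    "near a vehicle in Gaalkacyo, Somalia"),
]
enriched = enrich(incident)
print(enriched)
```

The original attributes are untouched; enrichment only appends new attribute/value pairs to the same record.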

Part 3: Adding Related Social Media Data

Incident key/value data, loaded in Step 1 and expanded in Step 2

Now combined with Facebook and Twitter social media posts

Part 3: Adding Related Social Media Data

The combined data allows us to create analyses that relate information from various sources using shared attributes, such as those derived from text enrichment like Themes, Entities and Sentiment.
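
One way to picture "relating sources through shared attributes" is to index every record, whatever its origin, by a common enrichment attribute. This is a sketch under assumptions: the records are hand-written samples (the Facebook post is invented for contrast), and `group_by_shared` is a hypothetical helper, not part of the product.

```python
from collections import defaultdict

# Records from different sources; only the Entity attribute is shared.
incidents = [
    {"Source": "Incident", "IncidentID": "12324",
     "Entity": ["Somalia", "Gaalkacyo"], "Sentiment": "Negative"},
]
posts = [
    {"Source": "Twitter", "Text": "Reports of IED detonation in Somalia",
     "Entity": ["Somalia"], "Sentiment": "Negative"},
    # Hypothetical post, included so that not everything matches.
    {"Source": "Facebook", "Text": "Thinking of friends in Iran",
     "Entity": ["Iran"], "Sentiment": "Neutral"},
]


def group_by_shared(records, attribute):
    """Index records by each value of a (possibly multi-valued) attribute."""
    index = defaultdict(list)
    for record in records:
        for value in record.get(attribute, []):
            index[value].append(record)
    return index


by_entity = group_by_shared(incidents + posts, "Entity")
somalia_sources = sorted({record["Source"] for record in by_entity["Somalia"]})
print(somalia_sources)  # ['Incident', 'Twitter']
```

No join key had to be designed up front: the incident and the tweet line up only because enrichment gave both an `Entity` value of "Somalia".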

Q&A

APPENDIX: TECHNOLOGY OVERVIEW

Oracle Endeca Server
Flexible Data Model

Key-Value Store
  No segmentation into tables
  No overarching schema

Simple concepts:
  Attributes: like columns, except may be sparse, multi-valued, hierarchical
  Records: each record is a collection of attribute/value pairs

Accommodates:
  Idiosyncratic structure: each record is self-describing, has its own possibly unique schema
  Multi-valued fields
  Large fields of unstructured text

Source data:
  Structured data: Transaction table
    TxnID  ProductID  Category       Amount
    12324  506        Mountain Bike  499
    12325  507        Road Bike      1399
  Semi-structured data, e.g. XML
  Unstructured data + Text enrichment

Example records:

  TxnID = 12324
  ProductID = 506
  Category = Mountain Bike
  Amount = $499.99
  Suspension = Fox 32 F-Series
  FrameType = Aluminium
  Saddle = Bontrager SSR
  Mountain Accessories = Fork and shock sag meter
  Mountain Accessories = Water Bottle
  Review = A great bike for off road. Smooth ride over the bumps
  ReviewSentiment = Positive
  ReviewTerm = Great
  ReviewTerm = Off Road
  ReviewTerm = Smooth
  ReviewTerm = Bumps

  TxnID = 12325
  ProductID = 507
  Category = Road Bike
  Amount = $1399.49
  Weight = 20lb.
  FrameType = Composite
  Saddle = Bontrager Race
  Review = Disappointing for the price. The frame feels heavier than I expected.
  ReviewSentiment = Negative
  ReviewTerm = Disappointing
  ReviewTerm = Price
  ReviewTerm = Heavy
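
The data model on this slide can be mimicked in a few lines. This is a sketch of the idea only, not how Endeca Server stores data: records are modeled as lists of attribute/value pairs using a subset of the slide's two bike records, and `values` and `schema` are hypothetical helper names.

```python
# Each record is just a list of (attribute, value) pairs; there is no shared table.
record_1 = [
    ("TxnID", "12324"), ("ProductID", "506"), ("Category", "Mountain Bike"),
    ("Mountain Accessories", "Fork and shock sag meter"),
    ("Mountain Accessories", "Water Bottle"),   # same attribute, second value
    ("ReviewSentiment", "Positive"),
]
record_2 = [
    ("TxnID", "12325"), ("ProductID", "507"), ("Category", "Road Bike"),
    ("Weight", "20lb."),                        # attribute only this record has
    ("ReviewSentiment", "Negative"),
]


def values(record, attribute):
    """All values a record holds for an attribute (empty list if it has none)."""
    return [value for key, value in record if key == attribute]


def schema(record):
    """The set of attributes this particular record carries: its own schema."""
    return sorted({key for key, _ in record})


print(values(record_1, "Mountain Accessories"))  # two values: multi-valued
print(values(record_2, "Mountain Accessories"))  # no values: sparse
print(schema(record_1))
print(schema(record_2))
```

The two records overlap on some attributes and differ on others, which is what "each record is self-describing" means in practice.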

Endeca's Unique User Experience
Interactive Data Exploration and Analysis

Deep Search + Contextual Navigation + Visual Analysis

Deep Search
  Search across all data
  Dynamic typeahead
  Automatic spell correction
  Unlocks unstructured data

Contextual Navigation
  Data-Driven. Freely browse data without predefined paths or writing queries
  Interactive. Shows only valid next steps
  Easy to Use. Familiar online experience

Visual Analysis
  Charts, crosstabs, key metrics
  Geospatial visualization
  Tag clouds
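
"Shows only valid next steps" describes what is generally known as faceted (guided) navigation: after each selection, only the refinements that still match at least one record are offered. The sketch below shows that general technique under assumptions. It is not Endeca's implementation, the third record is invented, and `refine` and `valid_next_steps` are hypothetical names.

```python
from collections import Counter

RECORDS = [
    {"Category": "Mountain Bike", "FrameType": "Aluminium", "ReviewSentiment": "Positive"},
    {"Category": "Road Bike", "FrameType": "Composite", "ReviewSentiment": "Negative"},
    # Hypothetical third record so the counts are less trivial.
    {"Category": "Road Bike", "FrameType": "Aluminium", "ReviewSentiment": "Positive"},
]


def refine(records, selections):
    """Keep only records that match every selected attribute value."""
    return [r for r in records
            if all(r.get(attr) == value for attr, value in selections.items())]


def valid_next_steps(records, selections):
    """Count the refinements still available after the current selections.

    Only attribute values present in the remaining records are returned,
    so a user can never navigate to an empty result.
    """
    counts = Counter()
    for record in refine(records, selections):
        for attr, value in record.items():
            if attr not in selections:
                counts[(attr, value)] += 1
    return counts


road_steps = valid_next_steps(RECORDS, {"Category": "Road Bike"})
mountain_steps = valid_next_steps(RECORDS, {"Category": "Mountain Bike"})
print(road_steps)
print(mountain_steps)
```

After selecting Mountain Bike, the Composite frame option disappears because no remaining record has it, while Road Bike still offers both frame types.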

Discovery Application Lifecycle
Building applications in days, not months
1. Diverse and changing information (Structured, Semi-Structured, Unstructured) integrated and enriched via ETL
2. Automatically unified in Oracle Endeca Server; no predefined model required
3. Drag-and-drop application composition in Studio
4. Interactive search, navigation and visualization for exploration and analysis

Iterate
