Enabling ad-hoc Analytic Text Apps with Hadoop
Making Hadoop accessible to business professionals
Gather Explore
Enterprises recognize the potential of leveraging the broader web, as well as internal data, for business intelligence coverage
Business Questions
• Name names: who is doing what, who isn't doing what
• Overlay each voting record with demographic and voting records over time
• Buzz - what are people talking about?
• Visualize content relationships
BBC Digital Democracy Project
Achieving Increased Government Transparency
Knowledge of Interest:
• Members of Parliament (MPs)
• Bills, Debates, Voting Districts
Scenario
• Gather unstructured data from anywhere between 200 and 2000 data sources, every 15 minutes
• Perform preprocessing (search, transform, index) over each source
• Publish harvested content for distributed content services and downstream mashups
Knowledge of Interest:
• People, places, events
Web Content To Gather:
• ~118 3rd Party Financial News Services and Blogs, including: BBC, CNN, Yahoo News, Financial Times, NY Times, The Big Picture, Fox News, PR Newswire, Market Watch, World Press, Forbes, Google News, Wall Street Journal, MSNBC, The Sun, ZDNet
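The gather / preprocess / publish scenario can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the M2 implementation: the names `Document`, `preprocess`, `harvest_cycle`, and `HARVEST_INTERVAL_SECONDS` are hypothetical, sources are modelled as plain callables that return text, and "index" is reduced to a term-frequency count. In a Hadoop deployment, each source would more plausibly be handled by its own task.

```python
import re
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List

# Hypothetical constant: the slides state a 15-minute harvest interval.
HARVEST_INTERVAL_SECONDS = 15 * 60


@dataclass
class Document:
    source: str  # name of the feed or site the text came from
    text: str    # raw, unstructured content


def preprocess(doc: Document, keywords: Iterable[str]) -> Dict[str, object]:
    """Search, transform, and index one document (illustrative only)."""
    # Transform: collapse whitespace and lowercase the text.
    clean = re.sub(r"\s+", " ", doc.text).strip().lower()
    # Search: record which keywords of interest occur in the text.
    hits = sorted(k.lower() for k in keywords if k.lower() in clean)
    # Index: count how often each token appears.
    index: Dict[str, int] = {}
    for token in re.findall(r"[a-z0-9']+", clean):
        index[token] = index.get(token, 0) + 1
    return {"source": doc.source, "text": clean, "hits": hits, "index": index}


def harvest_cycle(
    sources: Dict[str, Callable[[], str]], keywords: Iterable[str]
) -> List[Dict[str, object]]:
    """Run one gather -> preprocess -> publish pass over all sources."""
    keywords = list(keywords)
    published = []
    for name, fetch in sources.items():
        try:
            raw = fetch()  # Gather: pull unstructured text from the source.
        except Exception:
            continue  # A failing source is skipped until the next cycle.
        published.append(preprocess(Document(name, raw), keywords))
    return published  # Publish: hand records to downstream services.


if __name__ == "__main__":
    demo_sources = {"demo-feed": lambda: "MPs debate   the new bill"}
    for record in harvest_cycle(demo_sources, ["bill", "patent"]):
        print(record["source"], record["hits"])
```

A scheduler would call `harvest_cycle` once per `HARVEST_INTERVAL_SECONDS`; that loop is left out to keep the sketch short.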
What is it?
An insight engine for enabling ad-hoc business insights for
business users - at web scale
What's different?
• Unlocking insights embedded in unstructured data
• Analyzing data previously unavailable to analyze
M2 -> Demo
Business Questions
• How much is a target company worth?
• What are the high-value areas of their
portfolio?
• Explore cited patent topics, litigated patents
Project: Improve IP Portfolio Analysis for Mergers & Acquisitions
“...please collect all US Patent filings… then let’s do…”
Knowledge of Interest:
• Patents ranked by citation, e.g. how often a patent was referenced determines its value
• Corporate genealogies: IP ownership roll-up
• Augment analysis with items affecting IP value, inventor affiliation, citation rank by time
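Ranking patents by citation and rolling the counts up to an owner can be illustrated with a short Python sketch. It rests on assumptions that are not in the slides: citations arrive as `(citing_patent, cited_patent)` pairs, `owner_of` is a hypothetical mapping from a patent to its ultimate corporate parent, and the function names are invented for this example. Raw citation count stands in for "value" here; a real analysis would weigh other factors too.

```python
from collections import Counter
from typing import Dict, Iterable, List, Tuple


def rank_by_citation(
    citations: Iterable[Tuple[str, str]]
) -> List[Tuple[str, int]]:
    """Rank cited patents by how often they are referenced.

    `citations` holds (citing_patent, cited_patent) pairs. The result is
    sorted by descending citation count, with ties broken by patent id.
    """
    counts = Counter(cited for _citing, cited in citations)
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


def roll_up_by_owner(
    ranked: Iterable[Tuple[str, int]], owner_of: Dict[str, str]
) -> Dict[str, int]:
    """Sum citation counts per owner (hypothetical genealogy mapping)."""
    totals: Counter = Counter()
    for patent, count in ranked:
        # Patents with no known owner are grouped under "unknown".
        totals[owner_of.get(patent, "unknown")] += count
    return dict(totals)


if __name__ == "__main__":
    pairs = [("P3", "P1"), ("P4", "P1"), ("P4", "P2")]
    ranked = rank_by_citation(pairs)
    print(ranked)
    print(roll_up_by_owner(ranked, {"P1": "ParentCo", "P2": "ParentCo"}))
```

Keeping the ranking and the roll-up as separate steps lets the same citation counts be regrouped when a corporate genealogy changes, for example after an acquisition.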
M2 Architecture Characteristics:
• Extensible via UDFs
• REST API for customer choice of analytic
service/engine
• REST API for choice of visualization packages
• Export content as feeds, XML, etc.
• ...more to come
Conclusions
In God we trust
INTERESTED?
www-01.ibm.com/software/ebusiness/jstart/about.html