
Datacap V9.0.1

Cognitive Capture

© 2016 IBM Corporation


Why Cognitive Capture
• The next generation of capture will be cognitive
→ Traditional systems often need human intervention to review and classify
unknown or varying document formats, which is time-consuming and costly.
We’ve raised the capture bar by applying advanced imaging, natural language
processing and machine learning technologies to help identify and process
documents with unknown, highly variable and complex formats.

• Transactional Capture: Making capture part of the business process
→ The volume of documents, both paper and electronic, continues to grow.
Transactional capture supports workers’ requirements to directly submit
incoming documents for processing on demand as part of the business
process or case management.

• Enhanced Recognition
→ Overall enhancements to lower shop costs: better performance, more capabilities
and more automation



Capture is entering a new cognitive era

Cognitive systems can ingest unstructured data in all its forms. And importantly, they:

• Understand it, through sensing and interaction
• Reason about it, crossing structured and unstructured data to generate
hypotheses, considered arguments and recommendations
• Learn; in fact, they never stop learning

Watson is the most advanced example.

What if we applied these ideas to capture?


Where does cognitive capture fit?

Traditional capture systems require programming: someone defines the form so
the system knows what the document is and where to find important information.

This works very well for documents on forms you control, where the format is
predictable.

However, documents with highly variable formats (like correspondence or
contracts) and complex PDF files containing multiple documents (such as a
mortgage package) typically need human processing, which is expensive and slow.

A cognitive capture system addresses these valuable, highly variable and
complex documents that until now have been hard to automate.

What is Datacap Insight Edition?

A powerful solution that combines cognitive computing & advanced capture

• Examines & analyzes document structure
• Distills structured information from unstructured text
• Parses text and detects meaning with annotators & dictionaries
• Understands the context in which the text is analyzed

Insight Edition is an add-on to Datacap that includes:

• New Datacap actions, functions and rulesets to accelerate cognitive applications
• Integrated Watson BigInsights runtime (SystemT)
• Integrated IBM Content Classification (with new decision tables)
• Enterprise Rulerunner

What is in the blocks?

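[Figure: sample document divided into blocks, each labeled with the technique applied to it]
• PERSONAL INFORMATION: label-value pairs extraction
• DOCTOR’S FINDINGS: search for entity, taxonomy, category, sentiment
• THERAPY INFORMATION: combination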



It understands language semantics, patterns...

[Figure: sample document with detected patterns highlighted. Labels shown: PATTERN: US ADDRESS (twice), PATTERN: NUMBER, PATTERN: US DATE, PATTERN: EU DATE, and SEARCH LEFT (twice).]
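
To make the idea of pattern detection concrete, here is a minimal Python sketch that tags numbers and two date styles in free text. It is an illustration only and not how the Datacap annotators are implemented: the regular expressions, the `PATTERNS` table and the `find_patterns` helper are all assumptions made for this example, and it assumes US dates are written MM/DD/YYYY and EU dates DD.MM.YYYY. Address detection is left out because it needs far more than a single regular expression.

```python
import re

# Illustrative patterns only; these are NOT the annotators shipped with Datacap.
# Assumed conventions: US date = MM/DD/YYYY, EU date = DD.MM.YYYY.
PATTERNS = {
    "US_DATE": re.compile(r"\b(0?[1-9]|1[0-2])/(0?[1-9]|[12]\d|3[01])/(\d{4})\b"),
    "EU_DATE": re.compile(r"\b(0?[1-9]|[12]\d|3[01])\.(0?[1-9]|1[0-2])\.(\d{4})\b"),
    "NUMBER": re.compile(r"\b\d+(?:,\d{3})*(?:\.\d+)?\b"),
}


def find_patterns(text):
    """Return (label, matched_text) pairs in reading order.

    Patterns are tried in dictionary order, so dates are claimed before
    NUMBER gets a chance to match the digits inside them.
    """
    claimed = []  # (start, end) spans that already carry a label
    hits = []
    for label, regex in PATTERNS.items():
        for m in regex.finditer(text):
            overlaps = any(m.start() < end and start < m.end() for start, end in claimed)
            if overlaps:
                continue
            claimed.append(m.span())
            hits.append((m.start(), label, m.group()))
    return [(label, value) for _, label, value in sorted(hits)]


sample = "Invoice 4471 dated 01/11/2016, due 25.02.2016."
print(find_patterns(sample))
```

A date such as 01/11/2016 is ambiguous between the two conventions; a real system would need surrounding context (for example the sender's country) to decide, which is the kind of contextual reasoning the slide alludes to.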



Correspondence

Cognitive Template (elements grouped on the slide into HEADER, BODY and FOOTER), with sample text for each element:

• SENDER'S ADDRESS (logo, address with phones and emails): Sender’s Co. Ltd, 260 Bloor Street East, Toronto, ON M4W 3B3
• DATE: January 11th, 2016
• RECIPIENT'S ADDRESS: Ms. Maggie Jones, Angel Cosmetics Inc., 110 East 25th Street, New York, NY, 10021, USA
• RECIPIENT'S REFERENCE (IF ANY): Your ref: 123
• SENDER'S REFERENCE (IF ANY): Our ref: abc
• SALUTATION: Dear Ms. Jones,
• SUBJECT (With reference to): Forthcoming Exhibition
• BODY OF LETTER: First paragraph... Second paragraph... Third paragraph...
• CLOSING: Yours sincerely, / Regards, / Yours truly
• SIGNATURE (HAND-WRITTEN): JOHN SMITH
• NAME (TYPED), TITLE: JOHN SMITH, President
• COPY TO: cc: BOB WHITTER
• ENCLOSURE: Enc: catalogue, Invoice
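
As a rough illustration of how a few of the template elements could be recognized from cue phrases, the Python sketch below labels lines of a letter. The rule table `ELEMENT_RULES` and the helper `label_letter` are hypothetical and written for this example; they are simple line-level regular expressions, not the correspondence annotators supplied with Insight Edition, and they cover only the elements that have an obvious textual cue.

```python
import re

# Hypothetical line-level rules keyed by the template's element names.
ELEMENT_RULES = [
    ("RECIPIENT'S REFERENCE", re.compile(r"^Your ref:\s*(.+)$", re.I)),
    ("SENDER'S REFERENCE", re.compile(r"^Our ref:\s*(.+)$", re.I)),
    ("SALUTATION", re.compile(r"^(Dear\s.+?),?$")),
    ("CLOSING", re.compile(r"^(Yours sincerely|Yours truly|Regards),?$", re.I)),
    ("COPY TO", re.compile(r"^cc:\s*(.+)$", re.I)),
    ("ENCLOSURE", re.compile(r"^Enc:\s*(.+)$", re.I)),
]

SAMPLE_LETTER = """Your ref: 123
Our ref: abc

Dear Ms. Jones,

Forthcoming Exhibition

First paragraph...

Yours sincerely,

JOHN SMITH, President
cc: BOB WHITTER
Enc: catalogue, Invoice
"""


def label_letter(text):
    """Map element names to the first matching value found in the letter."""
    found = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        for name, rule in ELEMENT_RULES:
            m = rule.match(line)
            if m and name not in found:
                found[name] = m.group(1).strip()
    return found


for element, value in label_letter(SAMPLE_LETTER).items():
    print(f"{element}: {value}")
```

Elements without a cue phrase, such as the subject line or the sender's address block, depend on position and context, which is where a template-driven, context-aware approach earns its keep over rules like these.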



Index-Value Pair Matching

BORROWER

Thomas Crandell
ACCOUNT NUMBER
SB-45561254-4
PROPERTY
5 Heritage Drive, Clearwater, OK 65525
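
The example above pairs each index (label) with the value printed next to it. A minimal Python sketch of that idea follows, assuming the labels are known in advance and the value sits either on the same line as the label or on the next non-empty line. `match_index_values` is a hypothetical helper written for illustration, not a Datacap function.

```python
def match_index_values(text, labels):
    """Pair each known label with the text that follows it.

    The value is taken from the remainder of the label's line if there is
    one, otherwise from the next non-empty line.
    """
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    pairs = {}
    for i, line in enumerate(lines):
        for label in labels:
            if not line.upper().startswith(label.upper()):
                continue
            rest = line[len(label):].strip(" :")
            if rest:
                pairs[label] = rest
            elif i + 1 < len(lines):
                pairs[label] = lines[i + 1]
    return pairs


document = """BORROWER

Thomas Crandell
ACCOUNT NUMBER
SB-45561254-4
PROPERTY
5 Heritage Drive, Clearwater, OK 65525
"""

print(match_index_values(document, ["BORROWER", "ACCOUNT NUMBER", "PROPERTY"]))
```

This sketch ignores layout; on a scanned page the value may sit to the right of or below the label, so a real implementation would work with word positions rather than line order.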



What is BigInsights / SystemT

• Used to find entities in text
• Understands the context of the data
• State-of-the-art AQL language for expressing NLP algorithms
• Complete separation of specification from execution
• Expressivity and performance advantages over grammar-based systems



Annotators

• Annotators are the rules that find the content
• Several are already written and ready to use
• New ones can be written using Annotation Query Language (AQL)
• Existing ones can be enhanced



Annotators

• Once written, they are compiled into TAM files for use in Datacap (\rrs\aql)
• They are then called by the ExtractText action
• Finally, FindExtractedText populates the field values
• Need annotators? Contact us.
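
The two-step flow described above (run the annotators first, then copy what they found into fields) can be modeled in a few lines of Python. This is a conceptual sketch only: `Page`, `extract_text` and `find_extracted_text` are hypothetical stand-ins named after the steps on the slide, not the Datacap actions or their signatures, and the "annotators" are plain regular expressions rather than compiled AQL modules.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Page:
    text: str
    annotations: dict = field(default_factory=dict)  # annotator name -> list of matches
    fields: dict = field(default_factory=dict)       # field name -> value


# Stand-in annotators: plain regexes, NOT compiled AQL/TAM modules.
ANNOTATORS = {
    "Date": re.compile(
        r"\b(?:January|February|March|April|May|June|July|August|"
        r"September|October|November|December) \d{1,2}(?:st|nd|rd|th)?, \d{4}\b"
    ),
    "Salutation": re.compile(r"Dear [^,\n]+"),
}


def extract_text(page, annotators):
    """Step 1 (stand-in for ExtractText): run every annotator and cache its matches."""
    for name, regex in annotators.items():
        page.annotations[name] = [m.group() for m in regex.finditer(page.text)]


def find_extracted_text(page, annotator_name, field_name):
    """Step 2 (stand-in for FindExtractedText): copy the first cached match into a field."""
    matches = page.annotations.get(annotator_name, [])
    if matches:
        page.fields[field_name] = matches[0]
    return bool(matches)


page = Page("Toronto, January 11th, 2016.\nDear Ms. Jones,\nFirst paragraph...")
extract_text(page, ANNOTATORS)
find_extracted_text(page, "Date", "LetterDate")
find_extracted_text(page, "Salutation", "Salutation")
print(page.fields)
```

Keeping extraction and field population separate means the text is analyzed once and the results can be mapped to any number of fields afterwards.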



Ideal use cases for current release

• Correspondence: many annotators already built
• Documents requiring data extraction, for example:
→ Free-form text/paragraphs
→ A list of values to be extracted regardless of location
→ Classification of text sections, e.g. the legal property description
paragraph in mortgage documents
→ Near future: extraction of sentiment for classification or workflow
• Potential document types (not limited to):
→ Legal documents
→ Financial documents
→ Mortgage documents



What Datacap Insight Edition isn’t

• Current annotators are mostly focused on correspondence (see list)
• Typically not for structured forms; use traditional methods for those

Future
• Bluemix will support: English, French, German, Italian, Portuguese, Russian
and Spanish
• Bluemix actions are not built out of the box (OOTB) ... yet
→ Will support entity, keywords, sentiment
• Bluemix taxonomy will support English only



There are now TWO paths for customers building on the Datacap
foundation
1. Start with Datacap.
2. Add on Enterprise Edition OR Insight Edition depending on customer requirements
using Authorized, Occasional, Mobile or Network Scan Device users
3. Special bid Enterprise Edition to Insight Edition migrations

• IBM Datacap Enterprise Edition: customers with requirements for high-volume,
largely form-based capture (tax forms, enrollments, etc.)
• IBM Datacap Insight Edition: customers with requirements for dynamic & variable
documents such as loan & mortgage packages, claims, correspondence, and contracts

Both editions build on IBM Datacap.

Questions?
