
Kofax Transformation Modules Advanced Track & What's New in KTM

Technical Track

Bo Jin, Sr. Solution Architect

Fatih Karaoglu, Sr. Solution Architect

Agenda
Technology Enhancements
Clustering Utility
Productivity Enhancements Design Time
Benchmarking: Separation, Classification, Extraction
Project Merge Tool
Productivity Enhancements Users
Localisation
Thin Client Enhancements
Q&A

Clustering Utility
Technology Enhancements

New Utility for Clustering Unknown Documents
What it does
Requirements
Step-by-step
Importing into KTM

What does the Kofax Clustering Utility do?


When configuring KTM content classification, the customer needs to provide samples for each class.
What KTM requires: [image]
What customers usually provide: [image]

What does the Kofax Clustering Utility do?


Pre-sorts a document set into clusters of similar documents
User labels some of these clusters
Utility learns from the labeling and pre-sorts again
Several iterations of labeling and pre-sorting
Export of the sorted documents as a learn set for KTM content classification

What does the Kofax Clustering Utility do?


New KTM project
Customer uses the Utility to provide KPSG or a partner with sorted documents
KPSG or a partner uses the Utility to sort documents from the customer
Understanding which are the biggest subsets of documents in a customer's monthly mailroom volume
Enhancing a KTM project
Customer adds new classes to the project and needs samples for classification

Requirements
Kofax Clustering Utility works with XDocuments
XDocuments must be created with the KTM OCR Server tool
KTM (5.5 SP2) must be installed to use the Clustering Utility
Using the KTM OCR Server reduces the KTM base volume count
Eval licenses are supported
Hardware requirements are the same as for KC/KTM
Files to be clustered should be local for performance
Write access to the file location is needed

Step by Step KTM OCR Server


Configuring the KTM OCR Server:
Select the path to the unsorted images
Enable Save XDoc files and Save text files
Under OCR Settings, select the proper language
Leave the rest at default
Running the KTM OCR Server:
Simply press the Start button

Step by Step Kofax Clustering Utility


1. Import
Point the Import directory to the directory that holds the unsorted documents
For each document, an .xdc file and a .txt file must exist
Select Start Discovery. This takes a while, about 0.5 sec per document
Converts the XDocs into an internal format
Identifies the initial clusters
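The requirement that every document has both an .xdc and a .txt file can be checked before starting discovery. The following is a minimal sketch, not part of the Kofax tooling: it assumes the images are TIFF files and that the OCR Server output shares the image's base name, and the function name is hypothetical.

```python
from pathlib import Path

# Assumption: the unsorted images are TIFF files.
IMAGE_SUFFIXES = {".tif", ".tiff"}


def find_incomplete_documents(import_dir):
    """Return {image name: [missing extensions]} for images lacking .xdc or .txt."""
    missing = {}
    for image in sorted(Path(import_dir).iterdir()):
        if image.suffix.lower() not in IMAGE_SUFFIXES:
            continue
        # Assumption: the companion files share the image's base name.
        absent = [ext for ext in (".xdc", ".txt")
                  if not image.with_suffix(ext).exists()]
        if absent:
            missing[image.name] = absent
    return missing
```

An empty result means every image in the directory has both companion files.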

2. Discovery
Label the initial 3 clusters
You see the most representative document of each cluster
Provide a name for each cluster; it will be used as the class name in KTM

You can stop discovery when 80-90% of the documents are discovered, or continue until all documents are discovered
At 80-90% the most common document types are usually known; the remaining documents are likely in very small clusters
Click Review to continue to the next step

3. Review
Sort by categories (labels)
Examine the categories for consistency
Confirm some documents if you want to cluster again

4. Export
Select any directory for export
Subdirectories are created for each category/label
.txt files (plus tif/xdoc files for reference) are exported; only the .txt files are used later to train KTM content classification
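Before importing into KTM it can help to see how many training samples each category received. This is a minimal sketch under stated assumptions: `export_dir` is whichever directory directly contains the category subdirectories, the .txt files sit directly inside each of them, and the function name is hypothetical.

```python
from pathlib import Path


def count_samples(export_dir):
    """Map each category subdirectory to its number of exported .txt samples."""
    counts = {}
    for sub in sorted(Path(export_dir).iterdir()):
        if sub.is_dir():
            # Only .txt files count; tif/xdoc files are exported for reference.
            counts[sub.name] = sum(1 for _ in sub.glob("*.txt"))
    return counts
```

Categories with very few samples are candidates for another round of labeling before export.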


Importing into KTM


In Project Builder, point the Content Classifier settings of the New Project dialog to the exported directory
Select the Discovered documents subdirectory
A class is created in Project Builder for each category
Training documents are imported
Select Train in the Project Builder main menu
Verify in the Classification Benchmark (Result Matrix)
Setting this up manually and finding/organizing the proper training documents takes hours or days. With the Kofax Clustering Utility, this example took 20 minutes.

Benchmarking
Productivity Enhancements Design Time

KTM 5.5 Benchmarking


Separation Benchmarking
Classification Benchmarking
Extraction Benchmarking

KTM 5.5 Separation Benchmarking


Separation Benchmark
Document Separation Test
Golden Files: Extraction Benchmarking
Golden Batch: Separation Benchmark
How can a Golden Batch be created?
Kofax Capture (before Export Connector)
KTM Project Builder

Separation Benchmark: Quality?
Correct Documents
Rejected Documents
Incorrect Documents: incorrectly classified, additional splits, missed splits
False positive: incorrect, but confident. Document Review...? The worst of all three categories
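The difference between additional and missed splits can be illustrated by comparing document start positions. This is a simplified sketch, not the KTM implementation: it assumes each batch is described only by the page indices at which a new document starts, and the function name is hypothetical.

```python
def compare_splits(golden_starts, result_starts):
    """Compare document start pages of a separation result with the Golden Batch."""
    golden = set(golden_starts)
    result = set(result_starts)
    return {
        # Splits made by the project that the Golden Batch does not contain.
        "additional_splits": sorted(result - golden),
        # Splits in the Golden Batch that the project did not make.
        "missed_splits": sorted(golden - result),
    }


report = compare_splits(golden_starts=[0, 3, 7], result_starts=[0, 3, 5])
print(report)  # {'additional_splits': [5], 'missed_splits': [7]}
```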


KTM 5.5 Classification Benchmarking


Classification Benchmark


KTM 5.5 Extraction Benchmarking


Extraction Benchmark

EV = Extracted Value
GFV = Golden File Value (perfect file)
EV = GFV: Super
EV = GFV: Work
EV ≠ GFV: Work
EV ≠ GFV: False positives
Project quality
Project design
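The comparison of extracted values against golden file values can be sketched as a small classifier. The slide text does not say what separates the two outcomes in each pair, so this sketch assumes it is whether the field was marked valid (confident) at extraction time; the `valid` flag and the function name are assumptions, not taken from the source.

```python
def benchmark_bucket(extracted_value, golden_value, valid):
    """Assign one field result to Super, Work or False positive."""
    correct = extracted_value == golden_value
    if correct and valid:
        return "Super"           # right value, no user action needed
    if not correct and valid:
        return "False positive"  # wrong value that would pass unnoticed
    return "Work"                # field is flagged, a user has to look at it


results = [
    benchmark_bucket("1234", "1234", valid=True),
    benchmark_bucket("1234", "1234", valid=False),
    benchmark_bucket("1284", "1234", valid=False),
    benchmark_bucket("1284", "1234", valid=True),
]
```

Under this assumption, both Work outcomes cost user time, while only the false positive delivers wrong data downstream.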

Extraction Benchmark - Comparison
Extraction Benchmark - Enhancements
Selection List
Sorting: By Column Content, By Status
Open in Document Viewer
Re-arrange columns

Project Merge Tool


Productivity Enhancements Design Time

KTM 5.5 Project Merge Tool


Multiple Users, One Project
Project Master
Copy the Project Master for each additional user
Copy 1
Copy 2
Merge Copy 1
Source and Destination projects
Select Classes
Summary
Save changes to destination project (Project Master)
Merge Copy 2
Source and Destination projects
Select Classes
Summary
Save changes to destination project (Project Master)
Project Master after merging

Elements that can be merged...
Classes
Validation Forms
Fields
Locators
Validation Rules
Script
Localization
Elements
Summary
Save changes
The merged project

Localisation
Productivity Enhancements Users

KTM 5.5 Localisation


KTM Languages

English

German

Additional KTM Languages
#    Language Pack    Language ID
1    Brazilian        pt-BR
2    Chinese          zh-CN
3    Czech            cs
4    French           fr
5    Italian          it
6    Japanese         ja
7    Polish           pl
8    Russian          ru
9    Spanish          es
10   Swedish          sv-SE

Graphic User Interface: Project Builder and runtime modules
Component based messages: KTM Server
Documentation (runtime modules and Userguide.pdf)
Runtime modules:
1. Document Review
2. Correction
3. Validation
4. Verification

KTM 5.5 Localisation


Project Settings - Localization



.NET concept
Primary language: English (en)
Secondary language: English (United Kingdom) (en-GB), English (United States) (en-US)

Fall back principle
Localise
Translation for primary + secondary language?
Yes: use translation value for display name
No: translation for primary language?
Yes: use translation value for display name
No: use default value for display name
End
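The fall back principle can be sketched as a lookup that tries the most specific language ID first. This is a minimal illustration, not KTM code: the dictionary of translations and the function name are hypothetical, and language IDs are assumed to follow the primary-secondary pattern used above (for example pt-BR).

```python
def display_name(translations, language_id, default_value):
    """Resolve a display name using the fall back order described above."""
    # 1. Translation for primary + secondary language, e.g. "pt-BR".
    if language_id in translations:
        return translations[language_id]
    # 2. Translation for the primary language only, e.g. "pt".
    primary = language_id.split("-")[0]
    if primary in translations:
        return translations[primary]
    # 3. No translation found: use the default value.
    return default_value


translations = {"de": "Rechnungsnummer", "pt-BR": "Número da fatura"}
print(display_name(translations, "de-CH", "Invoice Number"))  # Rechnungsnummer
print(display_name(translations, "fr", "Invoice Number"))     # Invoice Number
```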

KTM GUI Language, Server and Active Language
The Project.ActiveLanguage overrides the Region and Language settings
Summary
KTM Graphic User Interface language
KTM Server language
Project language (Project.ActiveLanguage)

What can be localised?
KTM Element (Yes/No, Note)
Fields
Table Columns
Formatting Methods
Validation Methods
Validation Form: Tab captions, Field label, Simple label, Button captions, DB button captions, Group captions
Script Resources
Notes: Component messages used; Regular Expression only; Component messages used

Fields
Tables
Project Script Resources

Project.Resources.GetString("Error_Example")

KTM project folder structure
Default language in *.fpr file
Additional languages
Document Review
Localised languages
Default language

Localisation.xml
External editor
Language ID
Example: Field
Default value
Localised translation
XML Update
Project design language

Thin Client Enhancements


Productivity Enhancements Users

KTM TC 5.5 Improvements


New and Improved Functionality Inside KTM TC 5.5
Validation Form Layouts
Annotations
Additional Batch Editing Operations
User Settings
Advanced Login Capabilities
Combo-boxes With Descriptions
Combo-boxes Inside Tables
Other Small Things

Support Validation Form Layouts
Different font types and sizes
Mini-viewers
Custom buttons
Location of fields
Anchoring
Layout localization

Support Annotations
Display annotations created by KTM modules
Create new annotations inside Thin Client
Edit annotations
Delete annotations
Move annotations
Hide/Display annotations

Additional Batch Editing Operations
Delete pages
Move, merge, delete documents
Move, merge, delete, split, create folders

Preserve User Settings
User name at login screen
Batch Open dialog box: size, columns, sorting settings
Panels: size, expanded states
Zoom settings: fit width, fit height, custom zoom
Annotation settings: hide/display annotations

Advanced Login Capabilities
Domain login for linked users
Single sign-on support for Active Directory users

Combo-boxes Inside Tables, Items With Descriptions
Display descriptions, values or both
Support empty strings consistently for all combo-boxes
Paging control for over 100 items
Type-ahead filtering capabilities
New script events to initialize scripted combo-boxes

Other Small Things
Batch loading performance improvements (project caching)
PDF support
Reject/Unreject documents
Support for scripting on the server
Thin Client Server can be installed on top of a previous version
User changes in config files are propagated to a new version

Q&A

For further information, please contact:
Bo Jin, Sr. Solution Architect
Phone: +41 41 799 82 30
Email: Bo.Jin@Kofax.com
Fatih Karaoglu, Sr. Solution Architect
Phone: +41 41 799 82 36
Email: Fatih.Karaoglu@Kofax.com