Chapter 9 Negnevitsky

Knowledge Engineering and Data Mining
Jiajuan Lin John Larimer Lopa Nath Nick Remish Sean Ruck Tyler Purdom
What Is Knowledge Engineering?
John Larimer
Knowledge Engineering
It is the process of building intelligent systems:
1. Problem assessment
2. Data and knowledge acquisition
3. Development of a prototype system
4. Development of a complete system
5. Evaluation and revision of the system
6. Integration and maintenance of the system
1. Problem Assessment
Determine the problem's characteristics
Identify the main participants in the project
Specify the project's objectives
Determine the resources needed for building the system
Types of problems: diagnosis, selection, prediction, classification, clustering, optimization, control
  o E.g. diagnosis -> domain knowledge, explanation facilities
2. Data and Knowledge Acquisition
Collect and analyse data and knowledge
Make key concepts of the system design more explicit
Intelligent System
3. Development of a Prototype System
Choose a tool for building an intelligent system
Transform data and represent knowledge
Design and implement a prototype
Test the prototype
4. Development of a Complete System
Prepare a detailed design for a full-scale system
Collect additional data and knowledge
Develop the user interface
Implement the complete system
5. Evaluation and Revision of the System
Evaluate the system against the performance criteria
Revise the system as necessary
6. Integration and Maintenance of the System
Make arrangements for technology transfer
Establish an effective maintenance program
Case Study 1: Diagnostic Expert System
I want to develop an Intelligent system that can help me to fix malfunctions of my Mac computer. Will an expert system work for this problem?
Phone Call Rule (Firebaugh, 1988) - "Any problem that can be solved by your in-house expert in a 10-30 minute phone call can be developed in an expert system."
Troubleshooting manuals
Series of visual inspections
Rule structure with domain knowledge
Taken from N. Ch9 Pg 309 & 310
Case Study 2: Classification Expert System
Nick Remish
Case Study 2: Classification E.S.
Problem: identify different classes of sailboat (a typical classification problem)
  o Handled well by both expert systems and neural networks
Collect information
  o In this case, the sail plan can help identify the class of boat.
Issues with the expert system approach
  o What if the information is incomplete or inexact? (e.g. rough weather obscuring the sails)
Manage incrementally acquired evidence with certainty factors.
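The slide does not say how incrementally acquired evidence is combined. A minimal sketch, assuming the standard MYCIN-style rule for combining two certainty factors that support or oppose the same hypothesis (the function name and the example values are illustrative, not taken from the case study):

```python
def combine_cf(cf1: float, cf2: float) -> float:
    """Combine two certainty factors in [-1, 1] for the same hypothesis.

    Assumes the MYCIN-style combination rule; undefined when one factor
    is exactly 1 and the other exactly -1.
    """
    if cf1 >= 0 and cf2 >= 0:
        # Both pieces of evidence support the hypothesis.
        return cf1 + cf2 * (1 - cf1)
    if cf1 < 0 and cf2 < 0:
        # Both pieces of evidence oppose the hypothesis.
        return cf1 + cf2 * (1 + cf1)
    # Conflicting evidence.
    return (cf1 + cf2) / (1 - min(abs(cf1), abs(cf2)))


# Hypothetical evidence for one boat class, observed one piece at a time.
belief = 0.0
for evidence_cf in (0.6, 0.5):
    belief = combine_cf(belief, evidence_cf)
print(round(belief, 3))
```

Each new observation only needs the current belief and its own certainty factor, which is what makes the approach suitable for evidence that arrives piece by piece.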
9.3 Will a fuzzy expert system work for my problem?
A fuzzy solution?
  o Useful when you cannot define a set of exact rules.
  o Well suited to inherently imprecise properties and to modeling human decision making.
Sometimes parameters are imprecise (e.g. a doctor dealing with a patient)
  o Mainly used in engineering, but applicable in any sector that relies on human experience or is too complex or uncertain for exact rules (e.g. finance)
Case Study 3: Decision-Support Fuzzy System
Decision-Support Fuzzy System
Problem: assessing mortgage applications
  o Use a decision-support fuzzy system
Steps:
  o Represent the concept in fuzzy terms
  o Implement the concept in a prototype
  o Test and optimize
Represent the concept in fuzzy terms:
Triangular and trapezoidal fuzzy membership functions are used to represent knowledge.
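As a rough illustration of the two membership function shapes, here is a minimal sketch. The parameter names a, b, c, d (the corner points) and the numeric ranges are assumptions for illustration; the actual fuzzy sets of the mortgage model are not reproduced here. The sketch assumes a < b <= c < d.

```python
def triangular(x: float, a: float, b: float, c: float) -> float:
    """Degree of membership for a triangle with feet at a and c and peak at b."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)   # rising edge
    return (c - x) / (c - b)       # falling edge


def trapezoidal(x: float, a: float, b: float, c: float, d: float) -> float:
    """Degree of membership for a trapezoid with feet at a and d and plateau b..c."""
    if x <= a or x >= d:
        return 0.0
    if x < b:
        return (x - a) / (b - a)   # rising edge
    if x <= c:
        return 1.0                 # plateau
    return (d - x) / (d - c)       # falling edge


# Hypothetical fuzzy set on a 0..10 scale.
print(triangular(2.5, 0, 5, 10), trapezoidal(5, 0, 2, 4, 6))
```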
Obtain the fuzzy rules:
Based on Von Altrock's fuzzy model, applied to mortgages.
Evaluate and analyse performance:
Despite having 100+ rules, decision-support fuzzy systems can be developed, tested and implemented relatively quickly.
9.4 Will a neural network work for my problem?
Sean Ruck
Neural Network Overview
Very powerful, general-purpose tools
Successfully applied to prediction, classification, and clustering problems
Popular because of their versatility
Case Study 4: Character Recognition Neural Networks
Suppose you want to copy a document onto your computer without retyping the whole thing.
  o How? Optical character recognition (OCR)
  o The ability of a computer to translate character images into a text file using software
  o Capture the character images by scanning the document
Scanning converts the document into a bit map
Choosing The Neural Network Architecture
The architecture and size of the neural network depend on the complexity of the problem
  o Handwritten character recognition is far more complex than recognition of computer-printed characters
A 3-layer network will suffice for printed digit recognition
Determining an optimal number of hidden neurons
More neurons lead to a more accurate network, but it takes longer to train
Too many neurons may prevent the network from generalising, so that it works only on the training examples
  o Overfitting
How to prevent overfitting
  o Choose the smallest number of neurons that gives good results and generalisation
cont'd.
Train the network with various numbers of hidden neurons
  o Performance is rated by the sum of squared errors
Each training run that reaches a sufficiently small sum of squared errors gives a candidate number of hidden neurons
Test Examples?
The test set should be entirely independent of the training examples
  o Only test the networks from training runs that passed the previous check
Test examples should also contain "noise"
  o Distortion of the input
Networks that keep a reasonable recognition error even with noise have enough hidden neurons
  o Use the lowest such number for practical purposes
Improving Performance
A neural network is only as good as the examples used to train it
Improve the network by training it with noisy examples
Case Study 5: Prediction Neural Networks
Neural networks are useful for prediction, such as predicting the market value of a house
Using a neural network creates a black box around how the results were reached
  o In this kind of problem the result matters more than how it was obtained
For prediction, training examples are critically important
  o We need a wide array of examples to cover all possible inputs
Determining The Size Of A Training Set
Can be estimated with "Widrow's rule of thumb": N = n_w / e
  o N is the number of training examples, n_w is the number of synaptic weights in the network, and e is the network error permitted
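The rule of thumb above is simple arithmetic. A minimal sketch (the function name is illustrative, and rounding up to a whole number of examples is an assumption):

```python
import math


def training_set_size(n_weights: int, permitted_error: float) -> int:
    """Estimate the number of training examples as N = n_w / e, rounded up."""
    if not 0 < permitted_error <= 1:
        raise ValueError("permitted_error must be in (0, 1]")
    return math.ceil(n_weights / permitted_error)


# Hypothetical network with 10 synaptic weights and a permitted error of 0.5.
print(training_set_size(10, 0.5))
```

In practice the permitted error is much smaller than in this toy call, so the required number of examples grows quickly: a permitted error of 0.1 asks for ten times as many examples as there are weights.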
Dealing With The Data
Neural networks work best with inputs in the 0 to 1 range, but when determining the value of a house our inputs are not all in that range
  o Number of bedrooms, square footage, etc.
So we need to "massage" the data into this range
  o massaged value = (actual value - minimum value) / (maximum value - minimum value)
  o Good for discrete features with up to a dozen possible values
cont'd
We can also use 1 of N coding
  o Each possible value becomes its own input, with a value of 0 or 1
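The massaging formula, its inverse (needed later to turn network outputs back into actual values), and 1 of N coding can be sketched as follows. The function names and the example categories are illustrative assumptions, not taken from the case study.

```python
def massage(value: float, minimum: float, maximum: float) -> float:
    """Scale an actual value into the 0..1 range."""
    return (value - minimum) / (maximum - minimum)


def unmassage(scaled: float, minimum: float, maximum: float) -> float:
    """Reverse the scaling to recover the actual value."""
    return scaled * (maximum - minimum) + minimum


def one_of_n(value, categories):
    """1 of N coding: one 0/1 input per possible value."""
    return [1 if value == category else 0 for category in categories]


# Hypothetical inputs: 3 bedrooms on a 0..4 scale, and a categorical feature.
print(massage(3, 0, 4))
print(unmassage(0.75, 0, 4))
print(one_of_n("brick", ["brick", "timber", "stone"]))
```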
Dealing With The Results
To validate the results, we test the network with never-before-seen examples, as before
Our network works with values between 0 and 1, so we need to convert back to actual values
  o We can reverse the "massaging" we did before
To test the importance of certain inputs, we can measure the network's sensitivity to them: "sensitivity analysis"
  o Set each input, one at a time, to its minimum and maximum values and measure the results
Case Study 6: Classification Neural Networks With Competitive Learning
Using a neural network, we can discover significant features of input patterns and separate the data into different classes
Using competitive learning, a single-layer neural network can perform clustering
  o Combining similar data into groups or clusters
  o Uses 1 input neuron for each input and 1 competitive neuron for each cluster
When Is The Learning Process Completed?
In a competitive neural network, there is no obvious way to know whether the network is done learning
  o We do not know the desired output, so we cannot use the sum of squared errors
Use the Euclidean distance criterion instead
  o When there has been no noticeable change in the weights of the competitive neurons, the network can be considered to have converged
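The stopping criterion can be sketched as a loop that compares the weight vectors before and after each pass over the data. This is a simplified illustration: the learning rate `alpha`, the tolerance `tol`, and initialising the weights from the first samples are all assumptions, not details from the case study.

```python
import math


def train_competitive(samples, n_clusters, alpha=0.1, tol=1e-4, max_epochs=1000):
    """Winner-takes-all learning that stops when the weights stop moving."""
    # Assumption: start each competitive neuron at one of the first samples.
    weights = [list(sample) for sample in samples[:n_clusters]]
    for epoch in range(1, max_epochs + 1):
        previous = [w[:] for w in weights]
        for x in samples:
            # The winner is the neuron whose weight vector is closest to x.
            winner = min(range(n_clusters), key=lambda j: math.dist(weights[j], x))
            # Only the winner's weights move towards the input.
            weights[winner] = [w + alpha * (xi - w) for w, xi in zip(weights[winner], x)]
        # Largest Euclidean shift of any weight vector during this epoch.
        shift = max(math.dist(p, w) for p, w in zip(previous, weights))
        if shift < tol:
            return weights, epoch
    return weights, max_epochs


# Hypothetical two-dimensional data with two obvious groups.
data = [(0.0, 0.0), (1.0, 1.0), (0.1, 0.0), (0.9, 1.0), (0.0, 0.1), (1.0, 0.9)]
final_weights, epochs_used = train_competitive(data, n_clusters=2)
print(epochs_used, final_weights)
```

Each weight vector ends up near the centre of one group, and training stops as soon as a full pass over the data moves no weight vector by more than `tol`.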
How Can We Associate Neurons to Specific Classes or Clusters?
Competitive neural networks enable us to identify clusters in input data, but they do nothing to label the clusters
  o We can connect a competitive neuron with a cluster/class by analyzing its weights
We can identify which cluster is which by feeding the network test data corresponding to one particular cluster
  o The output neuron that wins most often is labeled as that class
9.5 Will genetic algorithms work for my problem?
Lopa Nath
Genetic Algorithm Review
Most applicable to optimization problems
  o Optimization is the process of finding a better solution to a problem
  o There is more than one solution, and the solutions are not of equal quality
Generates a population of competing candidate solutions
Causes candidates to evolve through a process of natural selection
  o Poor solutions die out while better solutions survive and reproduce
Repeating the process breeds an optimal solution
Case Study 7: The Traveling Salesman Problem
I want to develop an intelligent system that can produce an optimal itinerary. I am going to travel by car and I want to visit all major cities in Western and Central Europe and then return home. Will a genetic algorithm work for this problem?
Known as the traveling salesman problem (TSP)
  o Given a finite number of cities and the cost of travel (or the distance) between each pair of cities, we need to find the cheapest way (or shortest route) of visiting each city exactly once and returning to the starting point.
  o The TSP arises naturally in numerous transportation and logistics applications.
  o Examples: arranging routes, scheduling the drilling of holes in a circuit board (time efficient - shortest distance)
Although we cannot be completely sure that the selected route is the best one, after several runs we can be sure that the route obtained is a good one.
How does a genetic algorithm solve the TSP?
Representation
Each route is a chromosome in which the order of the integers represents the order in which the cities will be visited.
Genetic Operators in the TSP
Genetic operators are used to create new routes
Crossover Operator
  o The classical form cannot be applied directly, because a simple exchange of parts between parents would produce routes containing duplicates and omissions.
  o Clearly, classical crossover with a single crossover point does not work.
How the Crossover Operator Works
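The original figure for this slide is not available. As an assumption about what it showed, the sketch below uses an order-preserving crossover: a section between two crossover points is copied from one parent, and the remaining cities are taken from the other parent in their original order, skipping any city already in the copied section. The parent routes and cut points are illustrative.

```python
import random


def crossover(parent_a, parent_b, i, j):
    """Build one child: parent_b's section [i:j] plus parent_a's remaining cities."""
    section = parent_b[i:j]
    in_section = set(section)
    # Cities from parent_a, in their original order, that are not in the section.
    rest = [city for city in parent_a if city not in in_section]
    # Put the section back at positions i..j so no city is duplicated or omitted.
    return rest[:i] + section + rest[i:]


def random_crossover(parent_a, parent_b, rng=random):
    """Pick two distinct crossover points at random and apply the operator."""
    i, j = sorted(rng.sample(range(len(parent_a) + 1), 2))
    return crossover(parent_a, parent_b, i, j)


# Hypothetical nine-city routes.
a = [1, 6, 5, 3, 2, 8, 4, 9, 7]
b = [3, 7, 6, 1, 9, 4, 8, 2, 5]
child = crossover(a, b, 3, 7)
print(child)  # [6, 5, 3, 1, 9, 4, 8, 2, 7]
```

Every city still appears exactly once in the child, which is the property that single-point classical crossover fails to preserve.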
Genetic Operators in the TSP Continued
Mutation Operator
  o Reciprocal exchange: simply swaps two randomly selected cities in the chromosome
  o Inversion: selects two random points along the chromosome string and reverses the order of the cities between these points
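Both mutation operators keep the route a valid permutation of the cities. A minimal sketch, with illustrative function names and an optional `rng` argument (an assumption) so that runs can be made repeatable:

```python
import random


def reciprocal_exchange(route, rng=random):
    """Swap two randomly selected cities; returns a new route."""
    child = list(route)
    i, j = rng.sample(range(len(child)), 2)
    child[i], child[j] = child[j], child[i]
    return child


def inversion(route, rng=random):
    """Reverse the cities between two randomly selected points; returns a new route."""
    child = list(route)
    i, j = sorted(rng.sample(range(len(child) + 1), 2))
    child[i:j] = reversed(child[i:j])
    return child


# Hypothetical nine-city route.
route = [1, 2, 3, 4, 5, 6, 7, 8, 9]
rng = random.Random(1)
swapped = reciprocal_exchange(route, rng)
inverted = inversion(route, rng)
print(swapped, inverted)
```

If the two inversion points happen to be adjacent, the reversed section holds a single city and the route is returned unchanged.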
Fitness Function in the TSP
Evaluate the total length of the route
  o The fitness of each individual chromosome is determined as the reciprocal of the route length
The shorter the route, the fitter the chromosome
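The fitness function can be sketched as below. Straight-line (Euclidean) distance between city coordinates is an assumption made for illustration; any travel cost between pairs of cities could be used instead. The coordinates are hypothetical.

```python
import math


def route_length(route, coords):
    """Total length of the closed tour, including the return to the start."""
    n = len(route)
    return sum(
        math.dist(coords[route[k]], coords[route[(k + 1) % n]])
        for k in range(n)
    )


def fitness(route, coords):
    """Fitness is the reciprocal of the route length."""
    return 1.0 / route_length(route, coords)


# Hypothetical cities on the corners of a unit square.
coords = {1: (0, 0), 2: (0, 1), 3: (1, 1), 4: (1, 0)}
print(fitness([1, 2, 3, 4], coords))  # 0.25
print(fitness([1, 3, 2, 4], coords))
```

The tour around the square has length 4, so its fitness is 0.25; the tour that crosses the diagonals is longer and therefore less fit.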
9.6 Will a hybrid intelligent system work for my problem?
Hybrid Intelligent Systems
Solving complex real-world problems requires the application of complex intelligent systems that combine the advantages of expert systems, fuzzy logic, neural networks, and evolutionary computation. Such systems can integrate human-like expertise in a specific domain with abilities to learn and adapt to a rapidly changing environment.
Case Study 8: Neuro-fuzzy decision-support systems
I want to develop an intelligent system for diagnosing myocardial perfusion from cardiac images. I have a set of cardiac images as well as the clinical notes and physician's interpretation. Will a hybrid system work for this problem?
The neuro-fuzzy system in this example has a heterogeneous structure - the neural network and fuzzy system work as independent components but cooperate in solving the problem.
  o Analysis of two SPECT images must be done
  o One stress image is taken 10-15 minutes after injection with a radioactive tracer
  o One rest image is taken 2-5 hours after the injection
  o Brighter patches on the image correspond to well-perfused areas, while darker patches may indicate the presence of ischemia.
Visual inspection is highly subjective, so an intelligent system can help a cardiologist diagnose.
One binary feature assigns an overall diagnosis: normal or abnormal
Back-Propagation Neural Network to Classify the SPECT Images into Normal and Abnormal

Each image is divided into 22 regions, so we need 44 input neurons. Since SPECT images are to be classified as either normal or abnormal, we should use two output neurons. Good generalization in this study can be obtained with 5 to 7 neurons in the hidden layer.
Testing the Neural Network
Testing the network, we find that its performance is rather poor
  o 25% of normal cases are misclassified as abnormal
  o Over 35% of abnormal cases are misclassified as normal
  o Indicates that the training set may lack some important examples
This can still be improved
Neural Network Output
Two outputs
  o First - possibility that the SPECT image belongs to class normal
  o Second - possibility that the SPECT image belongs to class abnormal
Examples:
  o NORMAL OUTPUT HIGH AND ABNORMAL OUTPUT LOW: first (normal) output is 0.92 and second (abnormal) is 0.16 - image classified as normal, risk for heart attack is low
  o NORMAL OUTPUT LOW AND ABNORMAL OUTPUT HIGH: first (normal) output is 0.17 and second (abnormal) is 0.51 - image classified as abnormal, risk for heart attack is high
  o BOTH OUTPUTS ARE CLOSE: first (normal) output is 0.51 and second (abnormal) is 0.49 - we cannot confidently classify the image.
Adding Fuzzy Logic for Decision-Making in Medical Diagnosis
Fuzzy logic provides us with a means of modeling how the cardiologist assesses the risk of a heart attack.
Need to determine input and output variables, define fuzzy sets, and construct fuzzy rules.
  o Two inputs (NN output 1 and NN output 2) and one output (the risk of a heart attack). The inputs lie in [0, 1] and the output varies between 0 and 100 percent.
  o Fuzzy sets are shown in Negnevitsky, pages 342 and 343 - Figure 9.33, Figure 9.34, and Figure 9.35
  o Fuzzy rules are in Negnevitsky, page 343 - Figure 9.36
Examples:
1. If (NN_output1 is Low) and (NN_output2 is Low) then (Risk is Moderate)
2. If (NN_output1 is Low) and (NN_output2 is Medium) then (Risk is High)
3. If (NN_output1 is Low) and (NN_output2 is High) then (Risk is Very_High)
4. If (NN_output1 is Medium) and (NN_output2 is Low) then (Risk is Low)
More Certainty
A risk between 30 and 50 percent cannot be classified as either normal or abnormal - it is uncertain.
Apply the following heuristics, known to experienced cardiologists, to all corresponding regions (22 in each image):
1. If perfusion inside region i at stress is higher than perfusion inside the same region at rest, then the risk of a heart attack should be decreased.
2. If perfusion inside region i at stress is not higher than perfusion inside the same region at rest, then the risk of a heart attack should be increased.
Implementing These Heuristics in the Diagnostic System
Step 1
Present the neuro-fuzzy system with the cardiac case.
Step 2
If the system's output is less than 30, classify the presented case as normal and then stop. If the output is greater than 50, classify the case as abnormal and stop. Otherwise go to Step 3.
Step 3
For region 1, subtract perfusion at rest from perfusion at stress. If the result is positive, decrease the current risk by multiplying its value by 0.99. Otherwise, increase the risk by multiplying its value by 1.01. Repeat this procedure for all 22 regions, then go to Step 4.
Step 4
If the new risk value is less than 30, classify the case as normal; if the risk is greater than 50, classify the case as abnormal; otherwise, classify the case as uncertain.
When we now apply the test set to the neuro-fuzzy system, we find that the accuracy of diagnosis has dramatically improved - the overall diagnostic error does not exceed 5 percent, while only 3 percent of abnormal cases are misclassified as normal. Although we have not improved the system's performance on normal cases (over 30 percent of normal cases are misclassified as abnormal), and up to 20 percent of the total number of cases are classified as uncertain, the neuro-fuzzy system can actually achieve even better results in classifying SPECT images than a cardiologist can.
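Steps 1 to 4 can be sketched in a few lines. The sketch assumes the neuro-fuzzy risk (0 to 100) is already available, and that perfusion is given as one number per region for the stress and rest images; the function name and the data layout are illustrative assumptions.

```python
def diagnose(risk, stress, rest):
    """Classify a case from its risk and per-region perfusion values.

    risk: output of the neuro-fuzzy system, in percent (assumed given).
    stress, rest: perfusion per region, in matching order (assumed layout).
    """
    def label(value):
        if value < 30:
            return "normal"
        if value > 50:
            return "abnormal"
        return "uncertain"

    # Step 2: clear cases are classified immediately.
    first = label(risk)
    if first != "uncertain":
        return first, risk

    # Step 3: adjust the risk region by region.
    for at_stress, at_rest in zip(stress, rest):
        if at_stress - at_rest > 0:
            risk *= 0.99   # better perfusion at stress: decrease the risk
        else:
            risk *= 1.01   # otherwise: increase the risk

    # Step 4: classify again with the adjusted risk.
    return label(risk), risk


# Hypothetical case: risk 49, perfusion at stress lower in all 22 regions.
print(diagnose(49, [0.0] * 22, [1.0] * 22)[0])  # abnormal
```

Because each region changes the risk by only one percent, a case leaves the uncertain band only when most regions point in the same direction.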
Homogeneous Structure of Neuro-Fuzzy Systems
A typical example of a neuro-fuzzy system with a homogeneous structure is an Adaptive Neuro-Fuzzy Inference System (ANFIS).
  o It cannot be divided into two independent and distinct parts.
  o An ANFIS is a multilayer neural network that performs fuzzy inferencing.
Case Study 9: Time series prediction
  o Page 346 of Negnevitsky
Data Mining and Knowledge Discovery
Tyler Purdom
Data Mining
Definition:
  o The extraction of knowledge from data
  o The exploration and analysis of large quantities of data to discover patterns.
Ultimate goal is to discover knowledge
Amount of data doubles every year
Important to have fast algorithms to process data
Data Warehouses
Definition:
  o Large databases that store historical data.
  o Contain millions and in some cases billions of data records.
The data stored is time dependent and integrated
Used to help support decision making
Query tools are used to discover relationships in the data.
Query Tools vs. Data Mining
Query tools are assumption-based
  o User must ask the right questions to get a result
  o User must make assumptions
  o Can select a specific variable that affects the outcome
Data mining tools determine the most significant factors
  o No assumptions are necessary
  o Discover patterns automatically
The representation of data in data warehouses helps facilitate the data mining process
Data Mining Practice
Data mining is a new and evolving field
Very popular in the banking, finance, marketing, and telecommunications industries
Data mining uses:
  o Determine trends in markets
  o Detect fraud
  o Target people most likely to buy a product or use a service
Data Mining Tools
People used to use query tools and statistics to solve data mining problems
  o These techniques are not very efficient for large amounts of data
  o Can only correlate a few variables at a time
Now, tools are based on intelligent technologies:
  o Neural networks, neuro-fuzzy systems, and decision trees
Decision trees are currently the most popular tool used for data mining
Decision Tree
A map of the reasoning process
These trees do not allow for the use of noisy or incomplete data
Uses a tree structure to describe the data set
Very effective in solving classification problems
Popular because they help you visualize the problem
Nodes are split by predictors
  o In the book example, homeownership was used to split the tree
Decision Tree Example
Gini Coefficient
A measure of how well the predictor separates the classes contained in the parent node
Introduced by Corrado Gini, an Italian economist
  o He used it to measure the inequality in Italy's income distribution
Calculating the Gini Coefficient
Top curve represents the real economy
Bottom line represents equal distribution of wealth
Coefficient:
  o (shaded area) / (area below bottom line)
Gini Split Example
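The worked example from the original slide is not available. As a stand-in, the sketch below scores a candidate split with Gini impurity, the measure commonly used in CART-style decision trees. This is a related but different calculation from the area-based coefficient described above, and the class labels and split are hypothetical.

```python
from collections import Counter


def gini_impurity(labels):
    """1 minus the sum of squared class proportions; 0 means a pure node."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def split_impurity(left, right):
    """Size-weighted average impurity of the two child nodes."""
    n = len(left) + len(right)
    return (len(left) * gini_impurity(left) + len(right) * gini_impurity(right)) / n


# Hypothetical parent node with two classes, split in two different ways.
parent = ["yes", "yes", "no", "no"]
print(gini_impurity(parent))                          # 0.5
print(split_impurity(["yes", "yes"], ["no", "no"]))   # 0.0
print(split_impurity(["yes", "no"], ["yes", "no"]))   # 0.5
```

A predictor that separates the classes perfectly drives the split impurity to 0, while one that leaves both children as mixed as the parent gives no improvement.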
Summary
Jiajuan Lin
Summary - Knowledge engineering
What is knowledge engineering?
  o Problem assessment
  o Data & knowledge acquisition
  o Prototype
  o Complete system
  o Evaluation & revision
  o Integration & maintenance of the system
Summary - Assess the Problem
Assess the problem
  o Problem type: diagnosis, selection, prediction, classification, clustering, optimization, control
  o Availability of data: precise data? complete set of inputs?
  o Form and content of the solution: final result only? reasoning behind the answer?
  o Availability of expertise: extra information provided? difficulty in presenting the problem-solving strategy?
Summary - Data & Knowledge Acquisition
Questions about the data
  o Range? Continuous? Discrete? Precise? Noise tolerance? Numerical? Symbolic?
Data mining
  o Analyze data, find patterns and rules, extract knowledge from large quantities of data
  o Decision trees: easy to follow, visualize the solution, produce clear sets of rules
Summary - Prototype
Shows understanding of
  o the problem
  o the problem-solving strategy
  o the tool selected
Test
  o Throw it away if needed
  o Forcing the wrong tool wastes more time later in development
  o The prototype is there to reveal any inappropriate or wrong decisions
Summary - Complete System, Evaluation, Revision, Integration & Maintenance
Complete system development
  o plan, schedule, budget
Evaluation
  o no clear right/wrong
  o user satisfaction = measurement
Revision
  o Modify as limitations and weaknesses are discovered
Maintenance
  o Knowledge evolves over time
  o Keep modifying and updating to maintain efficiency and accuracy
