
15ECE 331

Classifier Evaluation Measures


• Training set
• Validation set
• Test set
--------------------------------------------------------------------
• Confusion matrix:
  • true positives (TP): These are cases in which we predicted yes (they
  have the disease), and they do have the disease.
  • true negatives (TN): We predicted no, and they don't have the
  disease.
  • false positives (FP): We predicted yes, but they don't actually have
  the disease. (Also known as a "Type I error.")
  • false negatives (FN): We predicted no, but they actually do have the
  disease. (Also known as a "Type II error.")
• Accuracy: Overall, how often is the classifier correct?
  • (TP+TN)/total
• Misclassification Rate: Overall, how often is it wrong?
  • (FP+FN)/total
  • equivalent to 1 minus Accuracy
  • also known as "Error Rate"
• True Positive Rate: When it's actually yes, how often does it predict yes?
  • TP/actual yes
  • also known as "Sensitivity" or "Recall"
• False Positive Rate: When it's actually no, how often does it predict yes?
  • FP/actual no
• True Negative Rate: When it's actually no, how often does it predict no?
  • TN/actual no
  • equivalent to 1 minus False Positive Rate
  • also known as "Specificity"
• Precision: When it predicts yes, how often is it correct?
  • TP/predicted yes
• Exercise: Receiver Operating Characteristic (ROC) curve
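The measures listed above can be sketched in Python as follows. The function name `confusion_metrics` and the example counts are assumptions made purely for illustration; they do not come from the slides.

```python
def confusion_metrics(tp, tn, fp, fn):
    """Compute the evaluation measures from the four confusion-matrix counts.

    Note: this sketch does not guard against empty groups (division by zero).
    """
    total = tp + tn + fp + fn
    actual_yes = tp + fn      # cases that truly have the disease
    actual_no = tn + fp       # cases that truly do not have the disease
    predicted_yes = tp + fp   # cases the classifier labelled "yes"
    return {
        "accuracy": (tp + tn) / total,
        "error_rate": (fp + fn) / total,          # 1 - accuracy
        "true_positive_rate": tp / actual_yes,    # sensitivity / recall
        "false_positive_rate": fp / actual_no,
        "true_negative_rate": tn / actual_no,     # specificity
        "precision": tp / predicted_yes,
    }


# Hypothetical counts, chosen only to exercise the formulas.
metrics = confusion_metrics(tp=100, tn=50, fp=10, fn=5)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With these made-up counts, accuracy and error rate sum to 1, as do the true negative rate and the false positive rate, matching the equivalences stated in the list.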
Statistical Decision making
• Parametric / Nonparametric
• Supervised / Unsupervised
Parametric decision making:
• Refers to the situation in which we know, or are willing to assume, the
probability distribution function or density function for each class.
• Before this function can be used, its parameters have to be estimated.
• Examples:
• Poisson distribution
• Normal distribution
Poisson distribution
• The Poisson distribution can be used to calculate the probabilities of
various numbers of "successes" based on the mean number of
successes.
• In order to apply the Poisson distribution, the various events must be
independent.
• Example:
• Suppose you knew that the mean number of calls to a fire station on
a weekday is 8. What is the probability that on a given weekday there
would be 11 calls?
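The fire-station question can be worked through with the standard Poisson probability formula, P(k) = mean^k · e^(−mean) / k!, which gives the probability of exactly k events. The sketch below is one way to compute it; the helper name `poisson_pmf` is an assumption for illustration.

```python
import math


def poisson_pmf(k, mean):
    """Probability of exactly k events for a Poisson distribution with the given mean."""
    return mean ** k * math.exp(-mean) / math.factorial(k)


# Mean of 8 calls per weekday, asking for exactly 11 calls (values from the example).
p_11_calls = poisson_pmf(11, 8)
print(f"P(11 calls) = {p_11_calls:.4f}")  # roughly 0.072
```

So on about 7% of weekdays the station would be expected to receive exactly 11 calls.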
• Another example: the number of photons emitted from an X-ray source during a
given time interval
Normal distribution

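If the feature is normally distributed for each class, the class densities are
described by a mean and a standard deviation for each class.
• These parameters have to be estimated for each class.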
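Parameter estimation for a normally distributed feature might look like the sketch below. It uses the sample mean and sample standard deviation as estimates, which is one common choice and not necessarily the method intended in the lecture. The class labels and measurements are invented for illustration.

```python
from statistics import NormalDist, mean, stdev

# Hypothetical feature measurements for two classes (made-up data).
samples = {
    "class_A": [4.0, 5.0, 6.0],
    "class_B": [7.0, 9.0, 11.0],
}

# Estimate the mean and sample standard deviation of each class,
# then build the corresponding normal density.
class_models = {
    label: NormalDist(mu=mean(values), sigma=stdev(values))
    for label, values in samples.items()
}

# Evaluate each class density at a new feature value.
x = 6.5
for label, model in class_models.items():
    print(f"{label}: mean={model.mean}, std={model.stdev}, density at x={model.pdf(x):.4f}")
```

Once the parameters are estimated, `model.pdf(x)` plays the role of the class-conditional density for that class.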
Classification task:

To estimate the probabilities that a pattern belongs to various classes, based
on a set of features.
Example:
To estimate the probabilities that a patient has various diseases, given some
symptoms or lab tests.
• From past experience, the probabilities of occurrence of these symptoms and
test results are known.
• The probabilities of occurrence of these diseases in the population from
which the patient came are also known.
• This information can be processed mathematically to reach a decision.
Bayes theorem – Bayesian decision making
• Bayesian decision making refers to choosing the most likely class, given the
feature value.
• The feature value is denoted by x
• The class of interest is C
• P(x): probability distribution of feature x in the entire population
• P(C): prior probability that a random sample is a member of class C
• P(x|C): conditional probability of obtaining feature value x, given that the
sample is from class C
• P(C|x): the quantity to estimate, i.e. the probability that a sample belongs
to class C, given that it has feature value x

• P(C|x) = (P(C) P(x|C))/ P(x)
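Choosing the most likely class with this formula can be sketched as below. The sketch assumes the classes are mutually exclusive and cover every case, so that P(x) can be obtained by summing P(C) P(x|C) over all classes. The class names and probabilities are hypothetical, not taken from the slides.

```python
def posteriors(priors, likelihoods):
    """Apply Bayes' theorem to every class for one observed feature value x.

    priors[c] is P(c); likelihoods[c] is P(x|c) for the observed x.
    Assumes the classes are mutually exclusive and exhaustive.
    """
    evidence = sum(priors[c] * likelihoods[c] for c in priors)  # P(x)
    return {c: priors[c] * likelihoods[c] / evidence for c in priors}


# Hypothetical priors and likelihoods for one observed symptom.
priors = {"disease_A": 0.2, "disease_B": 0.3, "healthy": 0.5}
likelihoods = {"disease_A": 0.6, "disease_B": 0.2, "healthy": 0.04}

post = posteriors(priors, likelihoods)
decision = max(post, key=post.get)  # Bayesian decision: the most probable class
print(post)
print("Decision:", decision)
```

Because P(x) is the same for every class, the decision depends only on the product P(C) P(x|C); dividing by P(x) just turns the scores into probabilities that sum to 1.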


Scenario:
What is the probability that a person has a cold, given that he or she
has a fever?
• Classes: has a cold (C) / does not have a cold
• Feature: fever (x)
• Prior probability of a person having a cold: P(C) = 0.01
• Probability of having a fever, given that the person has a cold:
P(x|C) = 0.4
• Probability of fever in the entire population: P(x) = 0.02
Probability that a person has a cold, given that he or she has a fever:
P(C|x) = ?
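Substituting the three values given in the scenario into Bayes' theorem gives the following worked calculation:

```latex
P(C \mid x) = \frac{P(C)\,P(x \mid C)}{P(x)} = \frac{0.01 \times 0.4}{0.02} = 0.2
```

Under these figures, a person with a fever has a 20% chance of having a cold, compared with the 1% prior probability.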
