
Generalized Additive Neural Network Modeling
André de Waal
Centre for Business Mathematics
and Informatics
North-West University
South Africa
Copyright © 2007, SAS Institute Inc. All rights reserved.

Agenda

▪ Motivation
▪ Models
▪ Timeline
▪ Interactive construction algorithm
▪ Automated construction algorithm
▪ Genetic algorithm
▪ Support vector machines
▪ Success story


Motivation

▪ Approximately 85% of all companies use logistic regression as the main modeling tool in the development of their scorecards
• LC Thomas
• Credit and Behavioral Scoring Workshop
• Credit Scoring and Credit Control IX
• Edinburgh, Scotland, 2005

Models

▪ Linear models
$E(y_i) = \beta_0 + \beta_1 x_{1i} + \beta_2 x_{2i} + \cdots + \beta_k x_{ki}$
▪ Generalized linear models
$g_0^{-1}(E(y_i)) = \beta_0 + \beta_1 x_{1i} + \beta_2 x_{2i} + \cdots + \beta_k x_{ki}$
• The link function $g_0^{-1}$ constrains the range of the predicted values
• Logit link: [0, 1] (used for probabilities)
• Tanh link: [-1, 1]
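A minimal sketch of how the link constrains predictions: the linear predictor can be any real number, and applying the inverse of the link maps it into the stated range. The function names, coefficients, and inputs below are illustrative assumptions, not taken from the slides.

```python
import math

def linear_predictor(beta0, betas, xs):
    """beta0 + beta1*x1 + ... + betak*xk for one observation."""
    return beta0 + sum(b * x for b, x in zip(betas, xs))

def inverse_logit(eta):
    """Inverse of the logit link: maps any real eta to a probability."""
    return 1.0 / (1.0 + math.exp(-eta))

def inverse_tanh_link(eta):
    """Inverse of the tanh-type link: maps any real eta into [-1, 1]."""
    return math.tanh(eta)

# Hypothetical coefficients and inputs, chosen only for illustration.
eta = linear_predictor(-1.0, [0.5, 2.0], [3.0, -4.0])  # -1 + 1.5 - 8 = -7.5
p = inverse_logit(eta)       # stays between 0 and 1
t = inverse_tanh_link(eta)   # stays between -1 and 1
print(eta, p, t)
```

However extreme the linear predictor becomes, the predicted value never leaves the range set by the link.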


Models

▪ Generalized additive models
$g_0^{-1}(E(y_i)) = \beta_0 + f_1(x_{1i}) + f_2(x_{2i}) + \cdots + f_k(x_{ki})$
▪ Multilayer perceptrons
$g_0^{-1}(E(y_i)) = w_0 + w_1 \tanh\left(w_{01} + \sum_{j=1}^{k} w_{j1} x_{ji}\right) + \cdots + w_h \tanh\left(w_{0h} + \sum_{j=1}^{k} w_{jh} x_{ji}\right)$
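The multilayer perceptron formula above can be sketched as a short forward pass for one observation. This is only an illustration of the formula: the function name, argument layout, and weight values are assumptions.

```python
import math

def mlp_linear_predictor(x, w0, w_out, w_bias, w_in):
    """Right-hand side of the MLP formula for one observation.

    x      : list of k inputs
    w0     : output bias
    w_out  : list of h output weights (w_1 .. w_h)
    w_bias : list of h hidden biases (w_01 .. w_0h)
    w_in   : k-by-h nested list, w_in[j][m] connects input j to neuron m
    """
    total = w0
    for m in range(len(w_out)):
        activation = w_bias[m] + sum(w_in[j][m] * x[j] for j in range(len(x)))
        total += w_out[m] * math.tanh(activation)
    return total

# Hypothetical weights: k = 2 inputs, h = 2 hidden neurons.
eta = mlp_linear_predictor(
    x=[1.0, 2.0],
    w0=0.3,
    w_out=[0.8, -0.4],
    w_bias=[0.1, 0.0],
    w_in=[[0.5, 1.0], [-0.2, -0.5]],
)
print(eta)
```

Because every hidden neuron sees every input, the contribution of a single input cannot be read off directly, which is the contrast with the additive form $f_1(x_{1i}) + \cdots + f_k(x_{ki})$.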


Timeline

[Timeline figure, 2004 to 2007. Labels: NN course; W Potts; W Sarle]



Interactive construction algorithm

1. Construct a GANN with 1 neuron and a skip layer for each input.
2. Fit a generalized linear model.
3. Initialize the remaining parameters as random values from a normal distribution.
4. Fit the full GANN model.
5. Examine each partial residual plot.
6. Prune and add neurons. Repeat.
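The steps above can be sketched in part as follows. The sketch assumes an identity link, takes each univariate function to be a skip-layer (linear) term plus one tanh neuron, and assumes the GLM slopes from step 2 are used as the starting skip-layer weights. All names are hypothetical, and the fitting in step 4 is left out.

```python
import math
import random

def univariate_fn(x, skip, w, a, b):
    """f_j(x) = skip*x + w*tanh(a + b*x): skip-layer term plus one neuron."""
    return skip * x + w * math.tanh(a + b * x)

def gann_predict(row, bias, params):
    """Additive prediction: bias plus one univariate function per input."""
    return bias + sum(univariate_fn(x, *p) for x, p in zip(row, params))

def init_params(glm_slopes, rng, scale=0.1):
    """Steps 2-3 (assumed reading): keep the GLM slope as the skip weight
    and draw the neuron's remaining parameters from a normal distribution."""
    return [
        (slope, rng.gauss(0.0, scale), rng.gauss(0.0, scale), rng.gauss(0.0, scale))
        for slope in glm_slopes
    ]

def partial_residuals(rows, ys, bias, params, j):
    """Step 5: points (x_j, f_j(x_j) + residual) to plot for input j."""
    points = []
    for row, y in zip(rows, ys):
        residual = y - gann_predict(row, bias, params)
        points.append((row[j], univariate_fn(row[j], *params[j]) + residual))
    return points

# Toy data and hypothetical GLM estimates, for illustration only.
rows = [[0.0, 1.0], [1.0, 0.5], [2.0, -1.0]]
ys = [0.4, 1.1, 1.9]
params = init_params([1.0, -0.5], random.Random(0))
points = partial_residuals(rows, ys, bias=0.2, params=params, j=0)
print(points)
```

Plotting these points against the fitted $f_j$ is what guides step 6: a roughly linear pattern suggests the neuron can be pruned, while remaining curvature suggests adding one.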


Interactive construction algorithm

▪ non-trivial process
▪ intelligent behavior


Interactive construction algorithm

▪ Knowledge discovery in databases (KDD) is the non-trivial process of identifying valid, novel, potentially useful, and ultimately understandable patterns in data (Fayyad, Piatetsky-Shapiro & Smyth, 1996).
▪ Artificial intelligence (AI) may be defined as the branch of computer science that is concerned with the automation of intelligent behavior (Luger, 2002).


Timeline
[Timeline figure, 2004 to 2007. Labels: KDD-Cup 2004; NN course; Alan Russell; W Potts; Cary; W Sarle; Version 1.0; SAS Award; SASA 2003]


Automated construction algorithm

▪ Best-first search
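A minimal sketch of best-first search over candidate models, assuming each model is encoded as a string with one architecture digit per input and that a lower score (for example SBC) is better. The neighbor rule (change one digit), the number of codes, and the toy score are illustrative assumptions and not AutoGANN's actual search.

```python
import heapq

def neighbors(model, n_codes):
    """Yield every model that differs from `model` in exactly one digit."""
    for pos, digit in enumerate(model):
        for code in range(n_codes):
            if str(code) != digit:
                yield model[:pos] + str(code) + model[pos + 1:]

def best_first_search(start, score, n_codes, max_evals):
    """Repeatedly expand the best-scoring model that has not been expanded yet."""
    scores = {start: score(start)}
    frontier = [(scores[start], start)]
    while frontier and len(scores) < max_evals:
        _, model = heapq.heappop(frontier)
        for candidate in neighbors(model, n_codes):
            if len(scores) >= max_evals:
                break
            if candidate not in scores:
                scores[candidate] = score(candidate)
                heapq.heappush(frontier, (scores[candidate], candidate))
    best = min(scores, key=scores.get)
    return best, scores[best]

# Hypothetical score: distance from a made-up "true" architecture.
TARGET = "102"

def toy_score(model):
    return sum(abs(int(a) - int(b)) for a, b in zip(model, TARGET))

best, best_score = best_first_search("000", toy_score, n_codes=4, max_evals=30)
print(best, best_score)
```

In this toy run the search evaluates at most 30 of the 64 possible models, since it follows the most promising model at each step instead of enumerating the whole space.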



Automated construction algorithm

▪ Architecture notation
[Figure: diagrams labeled 0 1 2 3 4 5 6 7 8 9]

Automated construction algorithm

▪ Architecture transformation
[Figure: diagrams labeled 0 1 2 3 4 5]


Automated construction algorithm

▪ AutoGANN modeling node in SAS® Enterprise Miner™


Automated construction algorithm

▪ AutoGANN user interface



Timeline
[Timeline figure, 2004 to 2007. Labels: KDD-Cup 2004; SAS Forum 2006; CSCCIX 2005; NN course; Alan Russell; Version 5.2; W Potts; Cary; W Sarle; Version 1.0; SAS Award; Windows NT; SASA 2003; Ph.D.]


Automated construction algorithm

▪ Improved best-first search (multiple changes)



Automated construction algorithm

▪ Intelligent start (bootstrap)
[Figure: diagrams labeled 0, 1 and 3, each marked "?"]


Automated construction algorithm

▪ Ozone example



Automated construction algorithm

▪ Inputs


Automated construction algorithm

▪ Solution Path



Automated construction algorithm

▪ Model Space Statistics (10 077 696)


Automated construction algorithm

▪ Solution Path (KDD-Cup 2004)



Automated construction algorithm

▪ Model Space Statistics ($6.4 \times 10^{56}$)
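The two model-space sizes quoted on these slides (10 077 696 for the ozone example and about 6.4 × 10^56 for KDD-Cup 2004) are consistent with a space in which each input independently takes one of six architecture codes. The six codes and the input counts used below (9 and 73) are inferred from the arithmetic and are not stated on the slides, so treat them as assumptions.

```python
def model_space_size(n_codes, n_inputs):
    """Number of distinct models when every input picks one of n_codes architectures."""
    return n_codes ** n_inputs

ozone_size = model_space_size(6, 9)    # assumed: 9 inputs, 6 codes each
kdd_size = model_space_size(6, 73)     # assumed: 73 inputs, 6 codes each

print(ozone_size)          # 10077696
print(f"{kdd_size:.1e}")   # 6.4e+56
```

A space of this size cannot be enumerated, which is why a guided search is needed instead of fitting every candidate model.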



Automated construction algorithm

▪ AutoGANN node functionality:
• Generalized additive models
− Estimate a specific model
− Search for best model
− Generate partial residual plots
− Perform model averaging
• Linear and logistic regression models
• Variable selection

Automated construction algorithm

▪ Transform Variables + Regression ≡ GAM?
▪ [diagram] ≡ [diagram]?



Automated construction algorithm

▪ GANN scorecard
• Replace with
• Consider univariate functions as transformations
− Keep logistic regression


Automated construction algorithm

▪ Benefits of the presented approach
• Not a black box, unlike general artificial neural networks
• Gives insight into relationships between input variables and target
• Potential to reduce scorecard building time
• Can be used to evaluate current models’ performance


Genetic algorithm

1. Weak encoding scheme
• 102003313
2. Mutation
• 111111111 → 111111511
3. Unit-crossover
• 101111111, 111111511 → 101111511
4. Fitness function
• SBC
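The operators listed above can be sketched on the digit-string encoding as follows. The sketch assumes one digit per input, reads unit-crossover as building the child unit by unit from either parent, and uses the usual definition of the Schwarz Bayesian criterion, $-2 \ln L + p \ln n$. The function names and the mask argument are hypothetical.

```python
import math

def mutate(model, pos, new_code):
    """Replace the architecture digit at position pos with new_code."""
    return model[:pos] + str(new_code) + model[pos + 1:]

def unit_crossover(parent_a, parent_b, take_from_b):
    """Build a child unit by unit: digit i comes from parent_b when
    take_from_b[i] is True, otherwise from parent_a."""
    return "".join(
        b if flag else a
        for a, b, flag in zip(parent_a, parent_b, take_from_b)
    )

def sbc(log_likelihood, n_params, n_obs):
    """Schwarz Bayesian criterion: lower values indicate a fitter model."""
    return -2.0 * log_likelihood + n_params * math.log(n_obs)

# Reproduce the examples from the slide.
mutant = mutate("111111111", 6, 5)
mask = [False] * 6 + [True] + [False] * 2   # take only the seventh unit from parent B
child = unit_crossover("101111111", "111111511", mask)
print(mutant, child)   # 111111511 101111511
```

In a genetic algorithm the mask would normally be drawn at random; it is passed in explicitly here so that the slide's example can be reproduced exactly.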

Support vector machines

                              SVM   GANN
Non-linear classifier         Yes   Yes
Maximal separation            Yes   No
Indifferent to kernel choice  No    Yes
Interpretable results         No    Yes



Success story

▪ South African bank
• Windows NT
• Ran AutoGANN for 1 hour
• Improved on a current model
• Request to purchase AutoGANN within 24 hours


Timeline
[Timeline figure, 2004 to 2007. Labels: KDD-Cup 2004; SAS Forum 2006; CSCCIX 2005; NN course; Alan Russell; Version 5.2; W Potts; Cary; M2007; W Sarle; Version 1.0; ICDM’07; SAS Award; Windows NT; SASA 2003; Ph.D.]



Questions

▪ Can an AI search engine and optimization algorithm compete in outright speed with classical approaches? Probably not!
▪ Can an AI search engine and optimization algorithm compete in accuracy with classical approaches? Definitely!
▪ Can an AI search engine and optimization algorithm compete in total modeling time with classical approaches? Definitely!