
Constanta Maritime University Annals

Year XI, Vol.14

PREDICTION ANALYSIS OF BANKRUPTCY RISK USING BAYESIAN NETWORKS


1 CRACIUN MIHAELA-DACIANA, 2 BUCERZAN DOMINIC, 3 RATIU CRINA

1,2 Aurel Vlaicu University of Arad, 3 Daramec srl Arad, Romania

ABSTRACT
Bayesian probability is widely misunderstood by the general public, as well as by some economists. On the other
hand, bankruptcy risk can be estimated through static and dynamic analysis of the financial balance, which outlines the
past performance of the enterprise. A global evaluation of the enterprise's future is of interest to the
management of the enterprise and especially to its business partners: banks, clients and capital investors. Therefore, in this
paper we model the Anghel Prediction Model for bankruptcy risk using Bayesian probability. For this purpose, we
use Bayesian Networks (BN) and the AgenaRisk Tool. The result of this modelling is a solution for bankruptcy risk
prediction using BN.
Keywords: Bayesian probability, Bayesian Network (BN), bankruptcy risk prediction, AgenaRisk Tool, Anghel
Prediction Model

1. INTRODUCTION

A Bayesian Network (BN) is a way of describing
the relationships between causes and effects, and is made
up of nodes and arcs. The collection of nodes and arcs is
referred to as the graph or topology of the BN. In
addition, in a BN each node has an associated probability
table, called the Node Probability Table (NPT). The
nodes represent variables. The arcs in a BN represent
causal or influential relationships between variables. The
key feature of BN is that they enable us to model and
reason about uncertainty. The NPT for any node gives
the conditional probability of each possible outcome
given each combination of outcomes for its parent nodes.
Usually, there are several ways of determining the
probabilities in any of the tables. One is to derive them
from statistical data. Alternatively, if no such statistical
data is available we may have to rely on subjective
probabilities entered by experts. A key feature
of BN is that we are able to accommodate both
subjective probabilities and probabilities based on
objective data, as specified in [1].
Having entered the probabilities we can now use
Bayesian probability to do various types of analysis.
Bayesian probability is all about revising probabilities in
the light of actual observations of events.
When we enter evidence and use it to update the
probabilities in this way, we call it propagation. In
theory we can enter any number of observations
anywhere in the BN and use propagation to update the
marginal probabilities of all the unobserved variables.
This can yield some exceptionally powerful analyses that
are simply not possible using other types of reasoning
and classical statistical analysis methods, as shown in
[5].
BN offer the following benefits, as discussed in [2]:
- Explicitly model causal factors: This key benefit is in
stark contrast to classical statistics, whereby
prediction models are normally developed by purely
data-driven approaches.
- Reason from effect to cause and vice versa: A BN
will update the probability distributions for every
unknown variable whenever an observation is
entered into any node. So entering an observation in
an effect node will result in back propagation, i.e.
revised probability distributions for the cause
nodes, and vice versa. Such backward reasoning of
uncertainty is not possible in other approaches.
- Overturn previous beliefs in the light of new
evidence: The notion of explaining away evidence is
one example of this.
- Make predictions with incomplete data: There is no
need to enter observations about all the inputs, as
is expected in most traditional modelling techniques.
The model produces revised probability
distributions for all the unknown variables when any
new observations (as few or as many as you have)
are entered. If no observation is entered then the
model simply assumes the prior distribution.
- Combine diverse types of evidence, including both
subjective beliefs and objective data: A BN is
agnostic about the type of data in any variable and
about the way the NPTs are defined.
- Arrive at decisions based on visible, auditable
reasoning: Unlike black-box modelling techniques
(including classical regression models and neural
networks) there are no hidden variables and the
inference mechanism is based on a long-established
theorem (Bayes).
This range of benefits, together with the explicit
quantification of uncertainty and the ability to communicate
arguments easily and effectively, makes BN a powerful
solution for all types of risk assessment.
The first working applications of BN (during the
period 1988-1995) tended to focus on classical
diagnostic problems, primarily in medicine and fault
diagnosis. The Intelligence Group at Aalborg University
produced the MUNIN system. Companies such as
Microsoft and Hewlett-Packard used early BN
for fault diagnosis, and in particular printer fault
diagnosis. There have also been numerous uses of BN in
military applications, for example the TRACS system for
predicting the reliability of land vehicles. Another
high-stakes application domain where BN have been used
extensively by commercial organizations is fault
prediction, as discussed in [4].
Because of historical limitations, even Bayesian
statisticians have shunned BN for problems that involve
continuous variables and complex stochastic models.
Instead they have used tools like the WinBUGS software
package, which are based on intensive sampling
algorithms collectively known as Markov Chain Monte
Carlo (MCMC) methods. Fortunately, there have been
some recent breakthroughs in algorithms for hybrid BN.
Building on the work of Kozlov and Koller, Neil and
colleagues have developed and implemented a dynamic
discretisation algorithm which works efficiently for a
large class of continuous distributions.
Users of the AgenaRisk Tool, which implements this
algorithm, can simply define continuous nodes by their
range and distribution. Without any of the complexities
associated with the MCMC approach, they can achieve
results of matching or greater accuracy for many classes
of models, especially for models that include discrete
variables, as specified in [6]. On a wider scale, there is
considerable research into how to model extremely large
problems involving hundreds of data points, with many
variables, over long periods of time, or involving
complex sequences of variables and data. A number of
extensions to BN beyond the classical inference
algorithms are being used for this purpose, including:
Relational BN, Statistical parameter learning, Sensitivity
analysis, Safety and reliability modelling, Operational
risk in finance, Recommendation engines and
information retrieval.
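
The propagation described in this section, in particular reasoning from an effect back to its cause, can be illustrated with a minimal sketch. The two-node network below (one cause, one effect) and all of its probabilities are hypothetical values chosen only for illustration; they do not come from the paper or from AgenaRisk.

```python
# Hypothetical two-node network: cause -> effect.
# All probabilities are illustrative assumptions, not values from the paper.
prior_cause = {True: 0.1, False: 0.9}   # P(cause)
npt_effect = {True: 0.8, False: 0.2}    # NPT: P(effect = True | cause)


def marginal_effect() -> float:
    """Prior (no evidence) probability that the effect is observed."""
    return sum(prior_cause[c] * npt_effect[c] for c in prior_cause)


def posterior_cause(effect_observed: bool) -> dict:
    """Propagate an observation on the effect node back to the cause node
    using Bayes' theorem."""
    joint = {}
    for cause, p_cause in prior_cause.items():
        p_effect = npt_effect[cause] if effect_observed else 1.0 - npt_effect[cause]
        joint[cause] = p_cause * p_effect
    evidence = sum(joint.values())       # P(effect = observed value)
    return {cause: p / evidence for cause, p in joint.items()}
```

With these assumed numbers, observing the effect raises the probability of the cause from its prior of 0.1, while entering no observation leaves the prior unchanged, which is the behaviour the benefits list describes.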

2. THE I. ANGHEL MODEL IN
BANKRUPTCY RISK PREDICTION

Anghel has developed a model based on
discriminant analysis, starting from a sample of 276
enterprises, grouped into non-bankrupt (60%) and
bankrupt (40%), and belonging to 12
industries of the national economy. The analysis covered
the period 1994-1998 and initially used
20 economic-financial indices.
After the selection stage, four financial rates were
established for the development of the score
function:
- X1 = earnings after taxes / incomes;
- X2 = cash flow / total assets;
- X3 = liabilities / total assets;
- X4 = liabilities / sales * 360.
All the above rates have been aggregated in the
following score function: A = 5.667 + 6.3718 * X1 +
5.3932 * X2 - 5.1427 * X3 - 0.0105 * X4, as given
in [3].
Depending on the value of this
function, enterprises are placed in one of the following
three situations:
- When A < 0, bankruptcy/failure situation;
- When 0 ≤ A ≤ 2.05, uncertainty situation demanding
prudence;
- When A > 2.05, a good financial situation.
The analysis of the previously presented models has
revealed a certain ability to detect bankruptcy in
time.
The subject of bankruptcy risk prediction has been
treated with interest over the years. The use of BN for
bankruptcy prediction (BP) was studied by Lili Sun and
Prakash P. Shenoy in [7].

3. THE ANGHEL PREDICTION MODEL (PM)
FOR BANKRUPTCY RISK (BR) EXPLAINED
USING A BAYESIAN NETWORK

Accepting Bayes' Theorem and the accuracy of
the AgenaRisk software, it is possible to explain the
Anghel PM for BR without exposing the mathematical
details. The vehicle for doing this is a visual model
called a Bayesian Network (BN), as shown in Figure 1.

Figure 1 - BN showing causal structure


In this structure we have four types of nodes:
sample, probability, result and assumption nodes. The
sample nodes represent the probability that Xi is faulty (i
= 1, 2, 3, 4). The probability nodes represent the number
of Xi faults in a given number of trials, where the number
of trials is 20, 10, 15 and 25, respectively. The
result nodes are the following: Mediate Node1, Mediate
Node2 and A Z score. The Hypothesis node is the
assumption node. As mentioned in the previous section,
the Anghel PM for BR is based on the score function A
= 5.667 + 6.3718*X1 + 5.3932*X2 - 5.1427*X3 - 0.0105*X4.
In this case we are handling nodes with multiple parents.
In the initial model we built, all four sample
nodes were parents of the result node. In that case the
calculation was very slow and difficult. We therefore
introduced the two Mediate Nodes, which reduce both the
number of parent nodes and the calculation time.
Next we explain how we built the nodes.
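
As a small illustration, the score function quoted above and the three situations defined for it in Section 2 can be sketched in Python. The function names and the sample ratio values used in the usage note are hypothetical; only the coefficients and the thresholds 0 and 2.05 are taken from the text.

```python
def anghel_score(x1: float, x2: float, x3: float, x4: float) -> float:
    """Score A = 5.667 + 6.3718*X1 + 5.3932*X2 - 5.1427*X3 - 0.0105*X4."""
    return 5.667 + 6.3718 * x1 + 5.3932 * x2 - 5.1427 * x3 - 0.0105 * x4


def classify(a: float) -> str:
    """Map a score to one of the three situations described in Section 2."""
    if a < 0:
        return "bankruptcy/failure situation"
    if a <= 2.05:
        return "uncertainty situation demanding prudence"
    return "good financial situation"
```

For example, with the made-up ratios X1 = 0.05, X2 = 0.10, X3 = 0.60 and X4 = 120, `classify(anghel_score(0.05, 0.10, 0.60, 120))` falls in the good financial situation band. Note that the Hypothesis node built later in this section uses a single cut at 2.05 and therefore has only two states.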
The sample nodes are simulation nodes of
continuous interval type. The lower bound is 0.0 and the
upper bound is 1.0. The NPT is a Uniform expression
with lower bound 0 and upper bound 1. The graph type
associated with these nodes is Histogram.
The probability nodes are simulation nodes of
integer interval type. The lower bound is 1 and the upper
bound is 9. The NPT is a Binomial expression with 20,
10, 15 and 25 trials, respectively, and the probability of
success given by the parent node p_Xi_faulty. The graph
type associated with these nodes is Histogram.
The result nodes are simulation nodes, too. They
fall into two categories: the Mediate Nodes and the A
Z score node. The Mediate Nodes are of
continuous interval type with values between -10 and 50. The
NPT is an arithmetic expression: 6.3718*p_X1_faulty
+ 5.3932*p_X2_faulty. The graph type associated with
these nodes is Histogram. The A Z score node is of
continuous interval type with values between -20 and 100.
The NPT is the arithmetic expression MN1+MN2. The
graph type associated with this node is Histogram.
The assumption node Hypothesis is a simulation
node of Boolean type. The state options are
customised, with the Positive Outcome "Good financial
situation" and the Negative Outcome "Bankruptcy /
failure situation". The NPT is a comparison expression:
if(zscore<2.05,"Bankruptcy / failure situation", "Good
financial situation"). The graph type associated with this
node is Histogram.
The statistics attached to the main risk graph are
shown in Figure 2. In this case no observation has been entered.
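
The chain of nodes described above, with no observation entered, can be approximated by plain forward sampling. The sketch below is only an illustration under stated assumptions: the text gives the expression for one Mediate Node, so the expression used here for the second one (including where the constant 5.667 enters) is assumed, and AgenaRisk itself uses dynamic discretisation rather than Monte Carlo sampling. The function name and its parameters are hypothetical, and the sketch does not attempt to reproduce the values in Figure 2.

```python
import random


def simulate_prior(n_samples: int = 100_000, seed: int = 0) -> float:
    """Estimate P(Hypothesis = 'Bankruptcy / failure situation') when no
    observation is entered, by sampling the Uniform(0, 1) sample nodes."""
    rng = random.Random(seed)
    bankrupt = 0
    for _ in range(n_samples):
        # Sample nodes p_Xi_faulty, each with a Uniform(0, 1) NPT.
        p1, p2, p3, p4 = (rng.random() for _ in range(4))
        mn1 = 6.3718 * p1 + 5.3932 * p2           # expression given in the text
        mn2 = 5.667 - 5.1427 * p3 - 0.0105 * p4   # ASSUMED form of the second Mediate Node
        zscore = mn1 + mn2                        # A Z score node: MN1 + MN2
        if zscore < 2.05:                         # Hypothesis node comparison
            bankrupt += 1
    return bankrupt / n_samples
```

The Binomial probability nodes are left out because, without observations, they do not change the distribution of their parents.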

Figure 2 - Complete Hypothesis Testing model


When observations are available, we can attach a
certain number of trials to each probability node.
Let us consider the following observation: for X1
the trials decrease from 20 to 2, for X2 from 10 to 1,
for X3 from 15 to 1 and for X4 from 25 to 2. The result is
shown in Figure 3.

Figure 3 - Risk graph of Hypothesis after evidence has been entered


The risk map for this model has the following risk
table attached, as shown in Figure 4.

Figure 4 - Risk table of Hypothesis generated after evidence has been entered
4. THE ANGHEL PREDICTION MODEL (PM)
FOR BANKRUPTCY RISK (BR) USING
HYPOTHESIS TESTING WITH EXPERT JUDGMENT
The structure of the risk map described in the
previous section will be changed. We add a new node
at the top of the risk map, named Prior Type. This node is
of labelled type, with the label values Uniform and Beta. The
NPT is a comparison expression with value 0. The graph
type associated with this node is Histogram (see Figure 5).

Figure 5 - Hypothesis Testing with Expert Judgment


In this case only the NPT of the sample nodes is
modified. The NPT Editing Mode changes to
Partitioned Expression. We specify the distributions as
follows: for nodes X1 and X4 the uniform distribution is
given by the function Uniform(0,1) and the beta
distribution is given by the function Beta(1,9,0.0,1.0), so
we obtain a chance of failure of 1 in 10. For nodes X2
and X3 the uniform distribution is given by the function
Uniform(0,1) and the beta distribution is given by the
function Beta(2,8,0.0,1.0), so we obtain a chance of
failure of 1 in 5.
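
The chances of failure quoted for the two beta priors can be checked from the mean of a Beta distribution, alpha / (alpha + beta). The sketch below assumes that the first two arguments of the Beta expression are the shape parameters and the last two are the bounds 0 and 1; this is a reading of the expression syntax, not something the text states. The helper names are hypothetical.

```python
from fractions import Fraction


def beta_mean(alpha: int, beta: int) -> Fraction:
    """Mean of a Beta(alpha, beta) distribution on [0, 1]."""
    return Fraction(alpha, alpha + beta)


def beta_variance(alpha: int, beta: int) -> Fraction:
    """Variance of a Beta(alpha, beta) distribution on [0, 1]."""
    total = alpha + beta
    return Fraction(alpha * beta, total * total * (total + 1))


# Priors named in the text; Uniform(0, 1) is the same as Beta(1, 1).
prior_means = {
    "Uniform(0,1)": beta_mean(1, 1),   # 1/2
    "Beta(1,9)": beta_mean(1, 9),      # 1/10, used for X1 and X4
    "Beta(2,8)": beta_mean(2, 8),      # 1/5, used for X2 and X3
}
```

Under this reading, the expert-judgment priors centre the failure probability at 0.1 and 0.2, whereas the uniform prior centres it at 0.5.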

Figure 6 - Complete Hypothesis Testing model with Expert Judgment


Next, we define two scenarios, one of type Uniform
and the second of type Beta. The two scenarios are
linked to the risk map by entering the observation that
the Prior Type is Uniform in the scenario
named Uniform, and Beta in the scenario
named Beta (see Figure 7).

Figure 7 - Results of hypothesis test with two different prior assumptions


Comparing the last two figures, we observe the
difference in the results.
Combining Data and Prior Assumptions
We change the probability nodes so that the number of
trials is reduced to one fifth: from 20 to 4, from 10 to 2, from
15 to 3 and from 25 to 5.
For both scenarios, Uniform and Beta, we enter
the following values: the probability nodes X1
and X4 receive the value 1, and
the nodes X2 and X3 receive the value 0.
The results are shown in Figure 8.
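
How the sparse data combine with each prior can be sketched in closed form, since a Beta prior with a Binomial likelihood gives a Beta posterior: Beta(alpha + faults, beta + trials - faults). The sketch assumes that the reduced trial counts 4, 2, 3 and 5 belong to X1, X2, X3 and X4 in the order listed, and that Uniform(0,1) is treated as Beta(1,1). AgenaRisk computes its results by dynamic discretisation, so the values in Figure 8 need not match these exact fractions. All names in the code are hypothetical.

```python
from fractions import Fraction

TRIALS = {"X1": 4, "X2": 2, "X3": 3, "X4": 5}   # reduced trial counts (assumed order)
FAULTS = {"X1": 1, "X2": 0, "X3": 0, "X4": 1}   # values entered in the probability nodes

# (alpha, beta) shape parameters of the prior on each p_Xi_faulty.
PRIORS = {
    "Uniform": {name: (1, 1) for name in TRIALS},
    "Beta": {"X1": (1, 9), "X2": (2, 8), "X3": (2, 8), "X4": (1, 9)},
}


def posterior(prior: tuple, trials: int, faults: int) -> tuple:
    """Conjugate update: Beta prior + Binomial data -> Beta posterior."""
    alpha, beta = prior
    return alpha + faults, beta + trials - faults


def posterior_means(scenario: str) -> dict:
    """Posterior mean of each p_Xi_faulty under the named prior scenario."""
    means = {}
    for name, prior in PRIORS[scenario].items():
        alpha, beta = posterior(prior, TRIALS[name], FAULTS[name])
        means[name] = Fraction(alpha, alpha + beta)
    return means
```

With so few trials the prior dominates: for X1, for example, the posterior mean is 1/3 under the Uniform scenario and 1/7 under the Beta scenario.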

Figure 8 - Results of hypothesis test after entering sparse sample data


Figure 9 - Risk table of Hypothesis Testing with Expert Judgment after entering sparse sample data

Changing the Simulation Settings
We delete the Uniform scenario and remove
the value 1 entered for Beta from the probability nodes
X1 and X4. After calculation we obtain the mean 0.40 for X1
and 0.50 for X4, and the variance 0.46 for X1 and
0.62 for X4.
Next, we modify the properties of the defined
model. The maximum number of iterations defined in the
simulation settings is 25. We work with 5 iterations.
Running the calculation again, we obtain different values
for both probability nodes.

5. CONCLUSIONS

We have shown that, using BN and the AgenaRisk
Tool, it is possible to show all of the implications and
results of a complex Bayesian argument without
requiring any understanding of the underlying
mathematical theory. Economists can use the resulting analysis
to predict bankruptcy risk using Bayesian
probability.

6. REFERENCES

[1] JENSEN FINN V., GRAVEN-NIELSEN THOMAS, Bayesian Networks and Decision Graphs, Springer, 2002
[2] POURRET OLIVIER, NAIM PATRICK,
MARCOT BRUCE, Bayesian Networks - A Practical
Guide to Applications, John Wiley & Sons Ltd, 2008
[3] ANGHEL ION, Falimentul - radiografie şi
predicţie, Ed. Economică, Bucureşti, 2002
[4] NEAPOLITAN RICHARD E., Learning Bayesian
Networks, Prentice Hall Series in Artificial Intelligence
[5] HECKERMAN DAVID, A Tutorial on Learning
with Bayesian Networks, March 1995
[6] Agena 2007, Press Release,
http://www.agenarisk.com/agenarisk/case_13.shtml
[7] SUN LILI, SHENOY PRAKASH P., Using
Bayesian Networks for Bankruptcy Prediction: Some
Methodological Issues, European Journal of Operational
Research, 2007

