8 (1998) 191-210
Received 1 March 1997; received in revised form 1 July 1998; accepted 13 July 1998
Abstract
The amount of financial information in today's sophisticated large data bases is substantial
and makes comparisons of company performance, especially over time, difficult or at
least very time consuming. The aim of this paper is to investigate whether neural networks
in the form of self-organizing maps can be used to manage the complexity in large data bases.
We structure and analyze accounting numbers in a large data base over several time periods.
By using self-organizing maps, we overcome the problems associated with finding the appropriate underlying distribution and the functional form of the underlying data in the structuring
task that are often encountered, for example, when using cluster analysis. The method chosen
also offers a way of visualizing the results. The data base in this study consists of annual
reports of more than 120 pulp and paper companies worldwide with data from a five-year
time period. © 1998 Elsevier Science Ltd. All rights reserved.
Keywords: Complexity; Adaptive systems; Self-organizing maps; Financial benchmarking; Financial performance; Strategic management
* Corresponding author. E-mail: bback@abo.fi
1 E-mail: kaisa.sere@abo.fi
2 E-mail: hannu.vanharanta@lut.fi
0959-8022/98/$19.00 © 1998 Elsevier Science Ltd. All rights reserved.
PII: S0959-8022(98)00009-5
1. Introduction
Today's data bases hold a substantial amount of information about companies.
The trick is to find patterns in the data that reveal important information about companies for different stakeholders, i.e., stockholders, creditors, auditors, financial analysts, and management. Finding patterns in financial performance can, for example,
be helpful in identifying internal problems, in the evaluation of firms by investors, and in
benchmarking.
In this paper, we focus on analysis of financial performance for benchmarking
purposes. Benchmarking is an important company-internal process, in which the
functions and performance of one company are compared with those of other companies. Financial competitive benchmarking uses financial information, most often in
the form of ratios, to perform these comparisons. Financial competitive benchmarking is utilized, among other things, as a communication tool in strategic management, for example, in situations where company management must gain approval,
from internal and external interest groups alike, for new functional objectives for
the company.
Vanharanta (1995) has built a hyperknowledge-based system for financial benchmarking. The system contains a data base with financial data on more than 160 pulp
and paper companies worldwide. This data base is used as a basis for the present
study, too. The amount of financial information in this system is, however, so large
and the structure of it is so complex that it makes comparisons between companies
difficult, or at least very time consuming.
Multivariate statistical methods, especially cluster analysis, have been used as a
tool for analyzing company performance, although mostly in research contexts
(Ketchen & Shook, 1996). However, many problems have been reported concerning
these methods. The two most important problems are the assumption of normality
in the underlying distributions and difficulties in finding an appropriate functional
form for the distributions (Trigueiros, 1995; Fernandez-Castro & Smith, 1994).
Moreover, results of analyses are difficult to visualize when there are several explanatory variables (Vermeulen et al., 1994).
Many researchers have addressed these problems: Trigueiros (1995) reports on
several studies that have shown the existence of positive or negative skewness in
the ratios and on different remedies to overcome these difficulties. He also explains
the existence of symmetrical and negatively skewed ratios and offers guidelines for
achieving higher precision when using ratios in a statistical context.
Fernandez-Castro & Smith (1994) used a non-parametric model of corporate performance to overcome the need for specification of statistical distribution or functional form. Vermeulen et al. (1994) presented a way to visualize the results with
interfirm comparison when the explanatory variable was explained by more than one
firm characteristic. Successful use of visual information depends substantially on its
acceptance by the user. Meyer (1997) states that visualized information makes the
transfer of information easier and thus a bottleneck in human information processing
is avoided.
Ketchen & Shook (1996) evaluate the past use of cluster analysis in strategic
management research. One concern has been the extensive reliance on researcher
judgment that is inherent in cluster analysis. As another concern they list that the
applications lack an underlying theoretical rationale and that clustering dimensions
seem to be selected haphazardly. There has also been concern with the standardization of variables and problems with multicollinearity among variables.
Self-organizing maps, which are a form of artificial neural networks, are a promising new paradigm in information processing. One of the main features of neural
networks is their ability to learn from examples and adapt their behavior to new
situations. The theory of self-organizing maps facilitates a reduction and cluster
analysis of high dimensional feature spaces into two-dimensional arrays of representative weight vectors (Kohonen, 1997). The method does not need any specification
of an underlying distribution or of the functional form of the financial indicators.
Furthermore, one can visualize the results in a comprehensive way.
Neural networks have previously been suggested by Trigueiros (1995) for use with
computerized accounting reports data bases, and by Chen et al. (1995) to define
cluster structures in large data bases. Martin-del-Brio & Serrano-Cinca (1995) used
self-organizing maps for analyzing the financial state of Spanish companies.
In a previous study (Back et al., 1998) we investigated the potential of self-organizing maps to structure the financial data of 76 companies in our data base and
presented an approximated position of one company's financial performance compared to that of other companies. The study was explorative and limited to Nordic
and North American companies. However, the results were very promising, and that
study served as a basis for this paper.
We use the self-organizing maps to structure the financial information on more
than 120 companies, including now also Central-European companies, in our data
base into clusters based on the underlying weight vectors. Each cluster is then named
according to the financial characteristics of the cluster. We analyse the financial
performance of the companies for the year 1985 and take a closer look at these clusters
over the years 1985-89. Even though we focus only on specific companies, any
individual company or group of companies can be the focus of interest.
The rest of the paper is organized as follows: Section 2 describes the methodology
we have used, the network structure, the data base, the list of companies in the study
and the criteria for and the choice of financial ratios. Section 3 presents the construction of the self-organizing maps and Section 4 presents an analysis of the maps. The
conclusions of our study are presented in Section 5.
2. Methodology
2.1. Benchmarking
Competitive benchmarking is a company-internal process in which the activities
of a given company are measured against the best practices of other, best-in-class
companies (Geber, 1990). In the process of competitive benchmarking, internal functions are analyzed and measured using financial (i.e. quantitative) and/or non-financial (i.e. qualitative) yardsticks. Functions measured from one company are compared
with similar functions measured from leading competitors, or they are compared with
the best practices in other industries. The differences between compared functions
Since companies in the data base do not have predefined labels describing their
financial status, a network intended for structuring their data can have no pre-desired
outputs, i.e., the clustering is not known a priori. For this reason, we utilize an
unsupervised learning method. A Kohonen network (Kohonen, 1997), being the most
common network model based on unsupervised learning, is used in this study.
A Kohonen network usually consists of two layers of neurons: an input layer and
an output layer. The input layer neurons present an input pattern to each of the
output neurons. The neurons in the output layer are usually arranged in a grid, and
are influenced by their neighbors in this grid. The goal is to automatically cluster
the input patterns in such a way that similar patterns are represented by the same
output neuron, or by one of its neighbors. Every output neuron has an associated
weight vector. The neighborhood structure of the output layer will cause neighboring
neurons in it (the output layer) to have similar weight vectors. These vectors should
represent some subclass of the input patterns, thus forming a map of the input space,
a self-organizing map (SOM).
The network topology can be described by the number of output neurons present
in the network and by the way in which the output neurons are interconnected, i.e.
by describing which neurons in the output array are mutual neighbors. Usually, neurons on the output layer are arranged in either a rectangular or a hexagonal grid, see
Fig. 1. In a rectangular grid each neuron is connected to four neighbors, except for
the ones at the edge of the grid. In all the networks we use, the output neurons are
arranged in a hexagonal lattice structure. This means that every neuron is connected
to exactly six neighbors, except for the ones at the edge of the grid. This choice
was made following the guidelines of Kohonen (1997).
As mentioned above, a Kohonen network is trained using unsupervised learning.
During the training process the network has no knowledge of the desired outputs.
The training process is characterized by a competition between the output neurons.
The input patterns are presented to the network one by one, in random order. The
output neurons compete for each and every pattern. The output neuron with a weight
vector that is closest to the input vector is called the winner. For expressing the
distance (i.e. similarity) between two vectors, we use the Euclidean distance between
the two vectors. The weight vector of the winner is adjusted in the direction of the
input vector, and so are the weight vectors of the surrounding neurons in the output
array. The size of adjustment in the weight vectors of the neighboring neurons is
dependent on the distance of that neuron from the winner in the output array.
Fig. 1. Network topologies.
We use two learning parameters: the learning rate and the neighborhood width
parameter. The learning rate influences the size of the weight vector adjustments
after each training step, whereas the neighborhood width parameter determines to
what extent the surrounding neurons, the neighbors, are affected by the winner. An
additional parameter is the training length, which measures the processing time, i.e.
the number of iterations through the training data.
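The competitive training step described above can be sketched as follows. This is only a minimal illustration and not the program package used in the study: the Gaussian neighborhood function, the use of plain (row, column) coordinates for the distance in the output array, and all function names are assumptions made for the example.

```python
import math
import random


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def find_winner(weights, x):
    """Return the grid position of the neuron whose weight vector is closest to x."""
    return min(weights, key=lambda pos: euclidean(weights[pos], x))


def train_step(weights, x, learning_rate, width):
    """Move the winner and its grid neighbors toward the input vector x.

    weights maps a grid position (row, col) to a weight vector (list of floats).
    The Gaussian neighborhood is an assumption made for this sketch.
    """
    winner = find_winner(weights, x)
    for pos, w in weights.items():
        grid_dist = euclidean(pos, winner)  # distance in the output array
        influence = math.exp(-(grid_dist ** 2) / (2.0 * width ** 2))
        for i in range(len(w)):
            w[i] += learning_rate * influence * (x[i] - w[i])
    return winner


def make_map(rows, cols, dim, seed=0):
    """Create a map with randomly initialized weight vectors."""
    rng = random.Random(seed)
    return {(r, c): [rng.random() for _ in range(dim)]
            for r in range(rows) for c in range(cols)}
```

In this sketch the learning rate scales every adjustment, while the width controls how quickly the adjustment fades with distance from the winner in the output array.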
Our criterion for map quality was the average quantization error,
which is the average of the Euclidean distances between each input vector and its best
matching reference vector in the SOM.
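As an illustration of this criterion, the average quantization error can be computed as in the sketch below. The function name and the representation of vectors as plain lists are assumptions made for the example.

```python
import math


def average_quantization_error(inputs, weight_vectors):
    """Mean Euclidean distance from each input vector to its best matching weight vector."""
    total = 0.0
    for x in inputs:
        # The best matching reference vector is the one at the smallest distance.
        total += min(math.dist(x, w) for w in weight_vectors)
    return total / len(inputs)
```

A lower value means that the reference vectors lie closer, on average, to the observations they represent.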
The clusters are formed by identifying neurons on the output layer that are close
to each other using the weight vectors as a starting point. A tool called the U-matrix
(Kohonen, 1997) can be used to visualize the distances between neighboring neurons.
Hence, a set of neurons forms a cluster if they are sufficiently close to each other.
Moreover, it is the analyst who determines the clusters based on these distances;
they are not determined a priori.
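The idea behind the U-matrix can be sketched as follows. This is a simplified variant that stores one value per neuron (the mean distance to its lattice neighbors) rather than one value per neighbor pair, and it assumes a hexagonal lattice stored in an offset layout in which odd rows are shifted half a cell to the right. The layout and the function names are assumptions of the example, not details taken from the study.

```python
import math


def hex_neighbors(row, col, rows, cols):
    """Grid positions adjacent to (row, col) in a hexagonal lattice.

    Assumes an offset layout in which odd rows are shifted half a cell to the right.
    Interior neurons have six neighbors; neurons at the edge have fewer.
    """
    shift = 0 if row % 2 == 0 else 1
    candidates = [
        (row, col - 1), (row, col + 1),
        (row - 1, col - 1 + shift), (row - 1, col + shift),
        (row + 1, col - 1 + shift), (row + 1, col + shift),
    ]
    return [(r, c) for r, c in candidates if 0 <= r < rows and 0 <= c < cols]


def u_matrix(weights, rows, cols):
    """For each neuron, the mean Euclidean distance to its neighbors' weight vectors."""
    result = {}
    for (row, col), w in weights.items():
        nbrs = hex_neighbors(row, col, rows, cols)
        if not nbrs:  # a 1 x 1 map has no neighbors
            result[(row, col)] = 0.0
            continue
        result[(row, col)] = sum(math.dist(w, weights[n]) for n in nbrs) / len(nbrs)
    return result
```

Regions of low values correspond to groups of similar neurons, and ridges of high values suggest borders between candidate clusters; the analyst still decides where the cluster boundaries are drawn.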
The methodology used when applying the self-organizing map is as follows:
1. Choose the data material. It is often advisable to preprocess the input data so that
the learning task of the network becomes easier (Kohonen, 1997).
2. Choose the network topology, learning rate, and the neighborhood width.
3. Construct the network. The construction process takes place by showing the input
data to the network iteratively, using the same input vector many times (the so-called
training length). The process ends when the average quantization error is small
enough.
4. Choose the best map for further analysis. Identify the clusters using the U-matrix
and interpret the clusters (give labels to them) using the weight vectors. From
the weight vectors we can read per input variable per neuron the value of the
variable associated with each neuron.
2.3. Data base and selection of companies
The Green Gold Financial Reports data base (Salonen & Vanharanta, 1990a, b,
1991) is used as the experimental financial knowledge base for the neural network
tests. It consists of income statements, balance sheets and cash flow statements of 160 companies in the
international pulp and paper industry, all corrected for differences in accounting
principles. The data base also contains specific financial
ratios, calculated using information from the corrected reports, as well as general
company information concerning products and production volumes. There are 47
different key ratios for each company. The companies are all based in one of three
regions: North America, Northern Europe or Central Europe. The financial data
covers a period of five years from 1985 to 1989. The companies are listed in Table
1 (with some companies omitted that did not have enough data available).
For our experiment we used some 120 companies from the data base. We have
also included the country averages of Finland, Norway and Sweden as three
additional companies.
Table 1
The companies
Country    No.   Company
Sweden     3     AVERAGE
           4     AB Statens Skogsindustrier
           5     Graningeverkens AB
           6     Korsnas AB
           7     Mo och Domsjo AB
           8     Munkedals AB
           9     Munksjo AB
           10    Norrlands Skogsagarens Cellulosa AB
           11    Norrsundets Bruks AB
           12    Obbola Linerboard AB
           13    Rottneros Bruk AB
           14    Svenska Cellulosa AB (SCA)
           15    Stora Kopparbergs Bergslags AB
           16    Sodra Skogsagarna AB
Finland    1     AVERAGE
           17    A. Ahlstrom Oy
           18    Enso-Gutzeit Oy
           20    Kemi Oy
           21    Kymmene Oy
           22    Oy Kyro Ab
           23    Metsa-Serla Oy
           25    Rauma-Repola Oy
           26    Sunila Oy
           27    Oy Tampella Ab
           28    Oy Veitsiluoto Ab
           29    Yhtyneet Paperitehtaat Oy
Norway     2     AVERAGE
           31    Norske Skogindustrier A.S.
           34    A/S Union
USA        37    Boise Cascade Corporation
           39    Champion International Corporation
           40    Chesapeake Corporation
           41    Consolidated Papers Inc.
           42    Dennison Manufacturing Company
           43    The Dexter Corporation
           44    Federal Paper Board Company
           45    Gaylord Container Corporation
           46    Georgia-Pacific Corporation
           47    P.H. Glatfelter Company
           48    Great Northern Nekoosa Corporation
           49    International Paper Company
           50    James River Corporation
           51    Kimberly-Clark Corporation
           52    Longview Fibre Company
           53    Louisiana-Pacific Corporation
           54    Mead Corporation
Canada, Austria, Germany (company names not available):
           55-63, 65-69, 71-73, 75-89, 91-93, 113-115, 94-100, 123
Italy      101   Saffa
           124   Cartiera
Holland    102   Berghuizer
           103   Buchmann
           104   Crown van Gelder
           105   Gelderse
           106   N.V. Papierfabriek
           107   Koninklijke
           108   Parenco
Portugal   109   Portucel-Empresa
           125   Group Caima
           126   Celbi
           127   Soporcel
UK         110   Associated
           111   BPB Industries
           112   David S. Smith
           135   Bowater Industries
           136   James Cropper
France     116   Arjomari
           117   Aussedat
           118   Beghin-Say
           119   Kaysersberg
           120   La Cellulose Du.
           121   La Rochette
           122   Sibylle
Spain      128   Empresa Nacion.
           129   La Papelera
Swiss      130   Attisholz
           131   Biber Holding
           132   Industrieholding
           133   Holstoff
           134   Papierfab. Perlen
2.4. Choice of ratios
The population consists of 47 financial ratios, organized in the benchmarking system
into six groups under the headings:
1. Profitability
2. Indebtedness
3. Capital structure
4. Liquidity
5. Working capital
6. Cash flow ratios
In selecting the variables for the study we used the cognitive approach (Ketchen &
Shook, 1996), i.e., we let experts choose the most important variables for the performance analysis task. We justify this choice by agreeing with Ketchen & Shook
(1996), who cite Meyer et al. (1993) and state that in strategy research the variables
should be chosen in a way that fosters rich description of a sample's characteristics.
We ended up with nine variables. As experts for the task we used ten financial
analysts from a large Finnish bank (Vanharanta et al., 1995). The financial analysts
had all been in the field for at least 5 years.
In principle we could have chosen all 47 financial ratios as input variables. We
would not have had problems with multicollinearity, but we would probably have
run into problems because of the limited number of observations. Too many input
variables in combination with too few observations could have affected the neural
network's ability to learn.
The following nine ratios were selected.
-
The numbers in parentheses indicate the appropriate ratio group number shown
earlier.
We note that there are four profitability measures, one indebtedness measure, one
capital structure measure, one liquidity measure, no working capital measures and
two cash flow measures. It seems reasonable that the emphasis is on profitability in
a benchmarking situation.
3. Constructing the maps
In this section we give a description of the construction process followed in
developing the self-organizing maps. The actual construction work was performed
using The Self-Organizing Map Program Package version 3.1 prepared by the SOM
Programming Team of the Helsinki University of Technology (Kohonen, 1997).
We started by standardizing the ratios in the data base using histogram equalization
(Klimasauskas, 1991) in order to ease the SOM's learning process and to improve
its performance. Histogram equalization is a way of mapping rare figures to a small
part of the target range and spreading out frequent figures so that it becomes easier
for the neural network to discriminate among frequent figures.
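One common way to realize histogram equalization is to replace each value by its empirical cumulative frequency, as in the sketch below. This rank-based form and the function name are assumptions made for illustration; the exact procedure applied in the study may differ.

```python
import bisect


def histogram_equalize(values):
    """Map each value to its empirical cumulative frequency in (0, 1].

    Tied values share the same output. The rank-based form is an assumption
    of this sketch.
    """
    ordered = sorted(values)
    n = len(ordered)
    # bisect_right returns the number of observations less than or equal to v.
    return [bisect.bisect_right(ordered, v) / n for v in values]
```

For example, the ratio values 1, 2, 2 and 100 are mapped to 0.25, 0.75, 0.75 and 1.0, so the rare extreme figure no longer dominates the range and the frequent figures are spread over most of it.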
All the maps were constructed in two phases. The purpose of the first phase was
to order the randomly initialized weight vectors of the maps to approximately correct values. During the second phase the maps were fine-tuned, i.e. the final ordering
of the reference vectors took place.
The construction of maps is very fast due to the small amount of data available
(see next section). Such maps enable comparisons between the financial situations
of companies to be made. This approach does include the presumption that the input
space for each year contains an adequately comprehensive description of the whole
possible input space, i.e. all the realistically possible combinations of financial ratios.
We constructed maps separately for each of the years 1985, 1986, 1987, 1988,
and 1989. The network topology chosen was hexagonal with 15*10 neurons in each
map. This is the same network structure as in our previous study (Back et al., 1998).
The parameters of the best maps with respect to the average quantization error are
given in Table 2.
Table 2
Network parameters
Year   Phase   Training length   Learning rate   Neighborhood width   Quantization error
1985   1       1000              0.05            10                   0.247267
       2       95,000            0.02            3
1986   1       1000              0.08            10                   0.261194
       2       115,000           0.02            3
1987   1       1000              0.07            10                   0.274494
       2       95,000            0.03            3
1988   1       1000              0.06            11                   0.257365
       2       120,000           0.02            3
1989   1       1000              0.06            12                   0.253538
       2       100,000           0.03            3
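The two-phase construction with parameters of the kind listed in Table 2 can be sketched as follows. This is an illustrative reimplementation under stated assumptions, not the program package used in the study: the linear decay of the learning rate toward zero and of the neighborhood width toward one, the Gaussian neighborhood, the sampling of input vectors with replacement, and the function names are all assumptions of the example.

```python
import math
import random


def run_phase(weights, data, steps, learning_rate, width, rng):
    """Train for `steps` presentations with linearly decaying parameters.

    The decay schedule and the Gaussian neighborhood are assumptions of this sketch.
    """
    for t in range(steps):
        frac = 1.0 - t / steps
        lr = learning_rate * frac             # decays toward 0
        sigma = 1.0 + (width - 1.0) * frac    # decays toward 1
        x = rng.choice(data)                  # random presentation order
        winner = min(weights, key=lambda p: math.dist(weights[p], x))
        for pos, w in weights.items():
            h = math.exp(-math.dist(pos, winner) ** 2 / (2.0 * sigma ** 2))
            for i in range(len(w)):
                w[i] += lr * h * (x[i] - w[i])


def train_two_phase(data, rows, cols, phase1, phase2, seed=0):
    """Order a randomly initialized map (phase 1), then fine-tune it (phase 2).

    phase1 and phase2 are (steps, learning_rate, width) tuples.
    """
    rng = random.Random(seed)
    dim = len(data[0])
    weights = {(r, c): [rng.random() for _ in range(dim)]
               for r in range(rows) for c in range(cols)}
    for steps, lr, width in (phase1, phase2):
        run_phase(weights, data, steps, lr, width, rng)
    return weights
```

Under these assumptions, the 1985 row of Table 2 would correspond to a call such as `train_two_phase(data, 10, 15, (1000, 0.05, 10), (95000, 0.02, 3))`, where `data` holds the preprocessed ratio vectors; which of the two map dimensions is treated as rows is also an assumption here.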
Fig. 2.
liabilities. They have a low solidity, a weak liquidity and a bad profitability based
on the ratios chosen for this study.
In 1986 company 25 joins group B and stays within this group
for the remaining years of the study. Company 29 stays outside group B until
1989, the last year of the study, when it joins the group.
5. Conclusions
The objective of this study was to investigate the potential of self-organizing maps
to help manage the complexity in a large data base by structuring the vast
amount of financial data available on companies. Our work bench consisted of a
hyperknowledge-based system for financial benchmarking. The data base contained
financial data on more than 120 pulp and paper companies worldwide. Using nine
different ratios as variables (four measuring profitability, one indebtedness, one
capital structure, one liquidity, and two cash flow), we constructed separate maps
for each of the years 1985, 1986, 1987, 1988, and 1989.
We anticipate that neural networks can be used in future for benchmarking purposes to help executives find company characteristics that will lead to sustainable
excellence of a company, in other words to help answer the question: Which are the
characteristics that lead a company towards long-lasting good performance? Some
company characteristics seem to produce and maintain good overall company performance, sustainable profitability, increasing productivity and continuous growth.
In this investigation we showed how to analyse Finnish pulp and paper companies
over time on a worldwide scale. Our study shows that self-organizing maps can be
useful for structuring large financial data bases in a meaningful way. We think that
this approach, although historically based, is a valuable starting point for benchmarking purposes. From the maps presented, managers can pick out different
companies for further and deeper analyses.
Acknowledgements
We would like to thank Mikko Irjala for carrying out the practical work with training
the networks and Marko Gronroos for helping to get the paper into the final format.
We also want to thank the anonymous referees for their constructive comments. The
work reported here was carried out within the AnNet-project. The authors wish to
thank the Foundation for Economic Education and the Academy of Finland for providing financial support for this project.
References
Back, B., Irjala, M., Sere, K., & Vanharanta, H. (1998). Competitive financial benchmarking using self-organizing maps. In M. Vasarhelyi & A. Kogan (Eds.), Artificial Intelligence in Accounting and Auditing (pp. 69-81). Princeton: Marcus Wiener Publishers.
Chen, S. K., Mangiameli, P., & West, D. (1995). The comparative ability of self-organizing neural networks to define cluster structure. Omega, International Journal of Management Science, 23, 271-279.
Fernandez-Castro, A., & Smith, P. (1994). Towards a general non-parametric model of corporate performance. Omega, International Journal of Management Science, 22, 237-249.
Geber, B. (1990). Benchmarking: Measuring yourself against the best. Training, 27, 36-44.
Ketchen, D., & Shook, C. (1996). The application of cluster analysis in strategic management research: An analysis and critique. Strategic Management Journal, 17, 441-458.
Klimasauskas, C. C. (1991). Applying neural networks, Part IV: Improving performance. PC/AI Magazine, 5, 4.
Kohonen, T. (1997). Self-Organizing Maps. Berlin: Springer-Verlag.
Martin-del-Brio, B., & Serrano-Cinca, C. (1995). Self organizing neural networks: The financial state of Spanish companies. In A. Refenes (Ed.), Neural Networks in the Capital Markets. New York: John Wiley and Sons.
Meyer, J. A. (1997). The acceptance of visual information in management. Information and Management, 32, 275-287.
Meyer, A. D., Tsui, A. S., & Hinings, C. R. (1993). Configurational approaches to organizational analysis. Academy of Management Journal, 36, 1175-1195.
Salonen, H., & Vanharanta, H. (1990a). Financial analysis world pulp and paper companies 1985-1989, Nordic Countries. Green Gold Financial Reports, 1, Ekono Oy, Espoo, Finland.
Salonen, H., & Vanharanta, H. (1990b). Financial analysis world pulp and paper companies 1985-1989, North America. Green Gold Financial Reports, 2, Ekono Oy, Espoo, Finland.
Salonen, H., & Vanharanta, H. (1991). Financial analysis world pulp and paper companies 1985-1989, Europe. Green Gold Financial Reports, 3, Ekono Oy, Espoo, Finland.
Trigueiros, D. (1995). Accounting identities and the distribution of ratios. British Accounting Review, 27, 109-126.
Vanharanta, H. (1995). Hyperknowledge and continuous strategy in executive support systems. Acta Academiae Aboensis, 55, Turku, Finland.
Vanharanta, H., Kakola, T., & Back, B. (1995). Validity and utility of a hyperknowledge-based financial benchmarking system. In Proceedings of the Twenty-Eighth Annual Hawaii International Conference on System Sciences, 3, 221-230. IEEE Computer Society Press.
Vermeulen, E. M., Spronk, J., & Van Der Wijst, D. (1994). Visualising interfirm comparison. Omega, International Journal of Management Science, 22, 237-249.
Appendix A
MAPS FOR THE YEARS 1985-1989
Appendix B
WEIGHT MAPS FOR YEAR 1985