Lev Ertuna
All content following this page was uploaded by Lev Ertuna on 26 May 2016.
Abstract
Time series forecasting is a powerful computational tool that predicts future outcomes of a system based on how the system
behaved previously; it has a great number of applications in many areas of science. A wide range of events, of both natural
and human-generated origin, can be predicted using neural networks and machine learning methodology. Predictions based
only on previous responses of the system are especially appealing for solving problems where the input variables of the
system cannot be clearly defined. But forecasting with neural networks has its own challenges: in particular, it is difficult
to choose a neural network architecture for a specific problem, since an oversimplified model may fail to learn, while an
excessively complicated model may lead to overfitting and data memorization. In this paper the application of time series
prediction to stock market forecasting is examined, and a comparative study of different neural network structures and
different learning methods is performed in order to obtain a better understanding of how the quality of predictions changes
with various approaches to solving a given problem.
1. Introduction
An artificial neural network is a computational model inspired by biological nervous systems. Neural networks are widely
used for estimating and approximating unknown complicated functions and systems that may depend on a large number
of input variables. Artificial neural networks are commonly modeled as layers of neurons exchanging information with
each other. Connections between neurons carry numeric weights that are adjusted during the learning process. [1]
Neural networks obtain intelligent behavior by learning from provided data or from interactions with the environment.
Because they can learn complex non-linear mappings, neural networks are widely used for solving advanced problems,
such as pattern recognition and classification. The process of learning the data is called training, and while there are several
approaches to neural network training, this study employs supervised learning (training) methods. [2]
In supervised training, pairs of input-output data must be provided to the network, and the network will try to learn the
mapping implied by that data. The robustness of the neural network often depends on the training algorithm used to learn
the data. Two training methods will be compared in this paper: backpropagation and resilient backpropagation. [3]
Over the last decade neural networks have proven to be one of the most powerful tools in modelling and
forecasting. Recently neural networks have expanded their forecasting applications to many areas, such as: urban traffic
state predictions [4], disease predictions [5] [6], earthquake magnitude forecasting [7], river flow forecasting [8] [9] [10],
air quality and pollution forecasting [11] [12], solar power forecasting [13], weather forecasting [14].
Forecasting with neural networks is also widely used in analyzing complex financial systems and market based relations:
credit risk evaluations [15] [16], gas prices and production level predictions [17] [18] [19], forecasting of vehicle sales [20],
forecasting demand on consumable parts in production [21], predicting tourism demand [22], forecasting airline data [23],
forecasting stock index prices [24] [25] [26] [27] [28] and currency exchange rates [29] [30]. Since neural networks have
proved reliable for chaotic time series forecasting, financial firms worldwide employ them to solve difficult prediction
problems, and it is anticipated that neural networks will eventually outperform even the best traders and investors [31].
2. Time Series Forecasting
A time series can be represented as a sequence of data which depends on time t: {y(t₀), y(t₁), …, y(tₖ), …}. Normally a
single element of this sequence, y(t), can be described as a function of some independent variables and time:
y(t) = f(a, b, c, …, n, t).
In order to represent such time series as a neural network model, it is necessary to define input variables (independent
variables and time) and output variables (one or multiple elements of a given sequence). Then the network can be trained
on some historical records of previous input variables, resulting in certain outputs of the system; and it can be used to predict
future outputs of the system, if future values of input variables are known.
To use neural networks and machine learning techniques for such a time series model, the input variables should be
clearly defined. But what happens when the input to the system is not so straightforward?
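When only past responses of the system are available, the standard workaround is to treat a sliding window of previous values as the input vector and the next value as the desired output. A minimal sketch of that construction, assuming a window size w (the class and method names are illustrative, not from the paper):

```java
// Build supervised (input, ideal) pairs from a raw series using a
// sliding window: the previous w values predict the next value.
public final class SlidingWindow {
    /** Returns {inputs, ideals}: inputs[i] = y[i..i+w-1], ideals[i] = {y[i+w]}. */
    public static double[][][] toPairs(double[] series, int w) {
        int n = series.length - w;              // number of usable pairs
        double[][] inputs = new double[n][w];
        double[][] ideals = new double[n][1];
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < w; j++) {
                inputs[i][j] = series[i + j];   // window of past values
            }
            ideals[i][0] = series[i + w];       // value to be predicted
        }
        return new double[][][] { inputs, ideals };
    }
}
```

With w = 1, 5 or 30 this yields exactly the one-, five- and thirty-input data sets examined later in the paper.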
6. Data Preparation
In this study we used stock market historical prices for Apple Inc. (AAPL) provided by Yahoo Finance. The historical
records of daily stock prices were taken for a period between January 1, 2000 and January 1, 2016. For this analysis only
stock open prices were used. The same analysis can be performed with other stocks and other specific prices; the observed
behavior of the neural networks will not change dramatically, but the network structure that yields the best performance
might be different.
In order to feed the data into the neural network and perform the training, the data must be normalized [3] [39] to some
specific range. The most commonly used ranges are 0.0-1.0, 0.1-0.9 and 0.2-0.8 [10]. The normalization range was the same
for all networks in this paper, although normalization also has some effect on a network's performance [10] and deserves
a separate discussion. For this study the 0.1-0.9 range was used for data normalization.
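The paper does not spell out its normalization formula; the usual min-max mapping into a target range [lo, hi], such as the [0.1, 0.9] used here, can be sketched as follows (the inverse mapping is what turns network outputs back into prices):

```java
// Min-max normalization of a series into a target range [lo, hi],
// e.g. [0.1, 0.9] as used for the networks in this study.
public final class MinMaxNormalizer {
    public static double[] normalize(double[] data, double lo, double hi) {
        double min = Double.POSITIVE_INFINITY, max = Double.NEGATIVE_INFINITY;
        for (double v : data) {                 // find the data range
            if (v < min) min = v;
            if (v > max) max = v;
        }
        double[] out = new double[data.length];
        for (int i = 0; i < data.length; i++) { // map [min, max] -> [lo, hi]
            out[i] = lo + (data[i] - min) * (hi - lo) / (max - min);
        }
        return out;
    }

    /** Inverse mapping, needed to read predictions back as prices. */
    public static double denormalize(double v, double min, double max,
                                     double lo, double hi) {
        return min + (v - lo) * (max - min) / (hi - lo);
    }
}
```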
The data was also split into two parts: the training data set, on which the neural network learned, and the testing data
set, which was not exposed to the network during training; after training finished, the network was tested on it, and the error
on this testing set was the most important criterion for evaluating the network's performance. Different sizes of testing sets
were used: 1%, 10% and 20% of the provided historical data.
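The paper does not state how the split was ordered; for time series the natural choice is a chronological split that holds out the most recent fraction, since shuffling would leak future values into training. A sketch under that assumption:

```java
// Chronological train/test split for a time series: the most recent
// fraction (e.g. 0.01, 0.10 or 0.20) is held out for testing.
// Assumption: the split is chronological, which the paper does not state,
// but shuffling a time series would leak future data into training.
public final class SeriesSplit {
    public static double[][] split(double[] series, double testFraction) {
        int testSize = (int) Math.round(series.length * testFraction);
        int trainSize = series.length - testSize;
        double[] train = java.util.Arrays.copyOfRange(series, 0, trainSize);
        double[] test = java.util.Arrays.copyOfRange(series, trainSize, series.length);
        return new double[][] { train, test };
    }
}
```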
7. Experiment
During this study three major models of neural networks for stock price time series forecasting were analyzed. The
models had one, five and thirty inputs, respectively. The structure of the hidden layers was varied for these networks: the
examination covered structures with one hidden layer; two hidden layers with the same number of neurons in each; three
diamond-shaped hidden layers (expanding from the input layer towards the middle, and shrinking from the middle towards
the output layer); and ten hidden layers with the same number of neurons in each (referred to as deep neural networks).
The complexity of each structure was varied by changing the number of neurons in the hidden layers.
The initial assumption when dealing with neural networks was as follows: a more complex structure, with more inputs
and more hidden layers, should produce better results. But structures that are too complicated might lead to memorization
of the data, also known as the overfitting problem: the network achieves perfect results on the training data set but performs
poorly on data sets not exposed to it during training. The figure below illustrates this phenomenon:
It was analyzed whether this overfitting problem occurs for the attempted network structures. Plots of network
complexity versus testing set error are provided for testing sets of different sizes, except for the deep neural networks (since
only 3 deep networks were tested).
A neural network's performance also depends on the training strategy used to learn the data. To make training
conditions equal, the networks were trained with the backpropagation and resilient backpropagation methods separately;
the learning rate was 0.005, the learning momentum was 0.001, and the networks were trained for 1000 epochs. These two
training methods were compared in terms of stability, predictability of behavior, and performance.
The experiments conducted during this study were performed using the Java programming language and the Encog
machine learning framework for Java [39].
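The two training methods differ in how a weight is updated from its gradient. A simplified single-weight sketch of both rules follows; these are the textbook forms, not Encog's internals, and the RPROP acceleration/back-off factors (1.2 and 0.5) and step bounds are the usual defaults, not values taken from the paper:

```java
// Single-weight update rules for plain backpropagation (gradient
// descent with momentum) and resilient backpropagation (RPROP),
// which uses only the sign of the gradient and an adaptive step size.
public final class UpdateRules {
    /** Backpropagation: delta = -lr * grad + momentum * prevDelta.
     *  Returns {newWeight, delta}. */
    public static double[] backprop(double w, double grad, double prevDelta,
                                    double lr, double momentum) {
        double delta = -lr * grad + momentum * prevDelta;
        return new double[] { w + delta, delta };
    }

    /** RPROP: grow the step while the gradient keeps its sign, shrink it
     *  after a sign change; move opposite the gradient sign.
     *  Returns {newWeight, newStep}. */
    public static double[] rprop(double w, double grad, double prevGrad,
                                 double step) {
        if (grad * prevGrad > 0) {
            step = Math.min(step * 1.2, 50.0);   // same sign: accelerate
        } else if (grad * prevGrad < 0) {
            step = Math.max(step * 0.5, 1e-6);   // sign flip: back off
        }
        return new double[] { w - Math.signum(grad) * step, step };
    }
}
```

Because RPROP's step size does not depend on the gradient's magnitude, it is far less sensitive to the choice of learning rate, which is consistent with the stability this study observes for resilient backpropagation.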
7.1 One input, one output
The premise of this model is that, based on only one previous response of the system, a neural network can
predict its future outcomes. This approach does not seem reliable, and it cannot be related to how real traders or
investors analyze the stock market, but it demonstrates the capabilities of machine learning computational tools.
Best achieved error with corresponding network structure and training method

Testing Set | Best Error | Network Structure | Training Method
1% Set      | 1.367%     | 1-5-1             | Backpropagation
10% Set     | 3.061%     | 1-5-1             | Backpropagation
20% Set     | 2.776%     | 1-50-1            | Backpropagation
The general one-input, one-output approach proved to work, although it did not seem likely to produce
good results. The backpropagation method was very effective with the smallest network structure (1-5-1),
demonstrating errors of 1.367% on the 1% testing data set and 3.061% on the 10% testing data set. But for more
complicated networks, backpropagation was incapable of providing stable performance on the testing data sets. It
was also observed that backpropagation is incapable of training deep neural networks, resulting in errors above the
45% level.
On the other hand, resilient backpropagation showed very stable results on the testing data sets, with no
signs of overfitting whatsoever. It did not demonstrate the best performance in terms of errors, but this learning
strategy reached errors below 10% with most networks, while backpropagation mostly showed results above the
20% error level. Networks trained with resilient backpropagation behaved exactly as expected: a more complicated
structure gave better performance in terms of errors on the testing data sets. Deep neural networks did not
demonstrate any performance improvement compared to other network structures trained with the resilient
backpropagation method, and only the network with 45 neurons in its hidden layers managed to achieve errors
below the 10% level.
7.2 Five inputs, one output
This is a more logical approach to modelling with neural networks: the last 5 observations are taken into consideration
when trying to predict future system responses. It is somewhat closer to how human analysts work.
Best achieved error with corresponding network structure and training method

Testing Set | Best Error | Network Structure | Training Method
1% Set      | 1.216%     | 5-20-100-20-1     | Resilient backpropagation
10% Set     | 4.908%     | 5-20-100-20-1     | Resilient backpropagation
20% Set     | 3.424%     | 5-20-100-20-1     | Resilient backpropagation
When the input to the neural network became more complicated, the backpropagation training method could
no longer deliver valuable performance. Only neural networks with one hidden layer, trained with backpropagation,
managed to achieve errors around the 10% level with somewhat consistent behavior. All the other structures showed
chaotic results under backpropagation training. As in the previous approach, backpropagation proved incapable of
training deep neural networks, resulting in errors above the 45% level.
Very stable behavior was once again observed with the resilient backpropagation training method; most of
the structures trained with this method showed errors below the 10% level. Some evidence of the overfitting problem
was observed, but it occurred over a very small error range and might not necessarily signify overfitting. In general,
networks trained with resilient backpropagation showed better performance with more complicated network
structures. The best behavior on all testing data sets (1%, 10% and 20%) was demonstrated by the 5-20-100-20-1
neural network trained with the resilient backpropagation method. As previously observed, deep neural networks
did not produce any performance improvement compared to other network structures trained with resilient
backpropagation; only the network with 45 neurons in its hidden layers managed to achieve errors below the 10%
level.
7.3 Thirty inputs, one output
A more advanced model, which uses the last 30 data samples to predict the future outcome of the system, was developed
and analyzed. It was expected to demonstrate the best performance, since it had the greatest theoretical computational
power.
Best achieved error with corresponding network structure and training method

Testing Set | Best Error | Network Structure | Training Method
1% Set      | 0.806%     | 30-50-50-1        | Resilient backpropagation
10% Set     | 1.928%     | 30-50-50-1        | Resilient backpropagation
20% Set     | 4.752%     | 30-50-50-1        | Resilient backpropagation
When the input to the neural network becomes even more complex, the backpropagation training method
fails completely. With an input consisting of 30 samples of the system's previous responses, backpropagation could
hardly achieve errors below the 30% level. It showed no valuable results and behaved very chaotically. Once again,
backpropagation demonstrated its incapability of training deep neural networks, resulting in errors above the 45%
level.
Resilient backpropagation demonstrated the best performance with the 30-50-50-1 neural network, showing
a 0.806% error on the 1% testing data set, 1.928% on the 10% testing data set, and 4.752% on the 20% testing data
set. Similar to previous observations, it demonstrated very stable behavior, with most of the structures trained with
this method showing errors below the 10% level. Deep neural networks performed better than in the previous
experiments: the networks with 10 and with 45 neurons in their hidden layers managed to achieve errors below the
10% level.
8. Best Training Set Performance
The best performance, in terms of errors, on the 1% and 10% testing sets was achieved by the 30-50-50-1 neural network
trained with resilient backpropagation, and on the 20% testing set by the 1-50-1 neural network trained with
backpropagation. The figures below demonstrate what this performance means in terms of forecasting power: the actual
system response for the testing sets is compared with the predicted system response.
9. Conclusion
During this study the application of time series prediction to stock market forecasting was examined, and a comparative
study of different neural network structures and different learning methods was performed. The best network topologies
for solving the time series forecasting problem (stock market price prediction) were determined. It was demonstrated that
the resilient backpropagation training method worked equally well with most network structures, showing more predictable
error behavior, and it was in general more reliable than backpropagation. It was also observed that the complexity of the
network's input was not directly related to its performance: networks with more input data tended to produce lower errors
on the testing sets, although networks with only one input proved to be very effective as well, a striking demonstration of
neural networks' computational power. Overfitting occurred to some extent with most of the networks, but it produced a
very slight increase in errors and can be ignored for most purposes. Among the network structures, a special case, deep
neural networks, was analyzed and shown to be not very effective: it did not yield the best prediction quality. It still
produced acceptable error performance (below the 10% level), but simpler network topologies outperformed deep neural
networks.
While performing this study, many networks demonstrated errors below 10% and some even below 5%, which shows
that neural networks are a powerful tool for time series forecasting, capable of learning chaotic time series with no
underlying mathematical model. Thus, they can be successfully applied to the stock market and to many other fields where
such prediction tools are required.
References
[1] S. Russell and P. Norvig, Artificial Intelligence a Modern Approach, New Jersey: Pearson Education, 2010.
[4] H. Peng and K.-L. Du, "Urban Traffic State Detection Based on Support". Patent US 9,037,519 B2, 19 May 2015.
[5] M. N. R. Deepthi Gurram, "A Decision Support System for Predicting Heart Disease Using Multilayer Perceptron
and Factor Analysis," International Review on Computers and Software, vol. 10, no. 8, August 2015.
[6] M. O. G. Nayeem, M. N. Wan and M. K. Hasan, "Prediction of Disease Level Using Multilayer Perceptron of
Artificial Neural Network for Patient Monitoring," 2015.
[7] J. Mahmoudi, M. A. Arjomand, M. Rezaei and M. Mohammadi, "Predicting the Earthquake Magnitude Using the
Multilayer Perceptron Neural Network with Two Hidden Layers," Civil Engineering Journal, January 2016.
[8] A. Atiya, S. M. El-Shoura, S. I. Shaheen and M. S. El-Sherif, "A comparison between neural-network forecasting
techniques-case study: river flow forecasting," IEEE Transactions on Neural Networks, April 1999.
[9] M. Shafaei and O. Kisi, "Predicting river daily flow using wavelet-artificial neural networks based on regression
analyses in comparison with artificial neural networks and support vector machine models," Neural Computing and
Applications, April 2016.
[10] A. Singh, R. Panda and N. Pramanik, "Appropriate data normalization range for daily river flow forecasting using
an artificial neural network," January 2009.
[11] H. Abderrahim, M. R. Chellali and A. Hamou, "Forecasting PM10 in Algiers: efficacy of multilayer perceptron
networks," Environmental Science and Pollution Research, September 2015.
[12] L. Hrust, Z. B. Klaic, J. Križan, O. Antonić and P. Hercog, "Neural network forecasting of air pollutants hourly
concentrations using optimised temporal averages of meteorological variables and pollutant concentrations,"
Atmospheric Environment, November 2009.
[13] C. Poolla, A. Ishihara, S. Rosenberg, R. Martin, A. Fong, S. Ray and C. Basu, "Neural network forecasting of solar
power for NASA Ames sustainability base," 2015.
[14] S. Tasdemir and A. Cinar, "Application of artificial neural network forecasting of daily maximum temperature in
Konya".
[15] C.-L. Huang, M.-C. Chen and C.-J. Wang, "Credit scoring with a data mining approach based on support vector
machines," Expert Systems with Applications, November 2007.
[16] H. A. Abdou, S. T. Alam and J. Mulkeen, "Would credit scoring work for Islamic finance? A neural network
approach," International Journal of Islamic and Middle Eastern Finance and Management, April 2014.
[17] I. S. Agbon and J. C. Araque, "Predicting Oil and Gas Spot Prices Using Chaos Time Series Analysis and Fuzzy
Neural Network Model," 2003.
[18] S. M. Al-Fattah and R. Startzman, "Predicting Natural Gas Production Using Artificial Neural Network," in SPE
Hydrocarbon Economics and Evaluation Symposium, Dallas, 2001.
[19] S. Hosseinipoor, Forecasting Natural Gas Prices in the United States Using Artificial Neural Networks, 2016.
[20] X. X. Zhang and D. T. Zhang, "A Neural Network Forecasting Model of Beijing Motor Vehicles Sold Based on Set
Pare Analysis," 2011.
[21] Y.-T. Jou, H.-M. Wee, H.-C. Chen, Y.-H. Hsieh and L. Wang, "A neural network forecasting model for consumable
parts in semiconductor manufacturing," Journal of Manufacturing Technology Management, March 2009.
[22] S. C. Kon and L. W. Turner, "Neural network forecasting of tourism demand," Tourism Economics, September
2005.
[23] L. R. Weatherford, T. W. Gentry and B. Wilamowski, "Neural network forecasting for airlines: A comparative
analysis," Journal of Revenue & Pricing Management, January 2003.
[24] Y.-H. Wang, "Nonlinear neural network forecasting model for stock index option price: hybrid GJR-GARCH
approach.," Expert Systems with Applications, January 2009.
[25] T.-S. Lee and C.-C. Chiu, "Neural network forecasting of an opening cash price index," International Journal of
Systems Science, February 2002.
[26] J.-S. Chen and P.-C. Wu, "Neural network forecasting of TAIMEX index futures," 2000.
[27] M. Dixon, D. Klabjan and J. H. Bang, "Classification-based Financial Markets Prediction using Deep Neural
Networks," March 2016.
[28] B. W. Wanjawa and L. Muchemi, "ANN Model to Predict Stock Prices at Stock Exchange Markets," August 2014.
[29] G. Zhang and M. Y. Hu, "Neural network forecasting of the British Pound/US Dollar exchange rate," Omega,
August 1998.
[30] A. D. Aydin and S. C. Cavdar, "Comparison of Prediction Performances of Artificial Neural Network (ANN) and
Vector Autoregressive (VAR) Models by Using the Macroeconomic Variables of Gold Prices, Borsa Istanbul
(BIST) 100 Index and US Dollar-Turkish Lira (USD/TRY) Exchange Rates," Procedia Economics and Finance,
December 2015.
[31] R. R. Trippi and E. Turban, "Neural Networks in Finance and Investing: Using AI to Improve Real World
Performance," 1993.
[32] S. Zhang, H.-X. Liu, T. Gao and S.-D. Du, "Determining the input dimension of a neural network for nonlinear time
series prediction," Chinese Physics, June 2003.
[33] Q. Li and D. Zheng, "Determining topology architecture for chaotic time series neural network," February 1999.
[35] E. Baum and D. Haussler, "What Size Net Gives Valid Generalization?," in Advances in Neural Information
Processing Systems, Denver, 1988.
[36] S. Aras and I. D. Kocakoç, "A new model selection strategy in time series forecasting with artificial neural
networks: IHTS," Neurocomputing, October 2015.
[37] S. F. Abdullah, A. F. N. A. Rahman, Z. A. Abas and W. H. B. M. Saad, "Multilayer Perceptron Neural Network In
Classifying Gender Using Fingerprint Global Level Features," Indian Journal of Science and Technology, February
2016.
[38] U. Smyczyńska, J. Smyczynska and R. Tadeusiewicz, "Influence of neural network structure and data-set size on its
performance in the prediction of height of growth hormone-treated patients," Bio-Algorithms and Med-Systems,
January 2016.
[39] J. T. Heaton, "Encog: Library of Interchangeable Machine Learning Models for Java and C#," 2015.