Article history: Received 15 April 2014; Received in revised form 15 July 2014; Accepted 16 July 2014; Available online 27 July 2014.
Keywords: Cooling load; Heating load; Energy performance; Energy-efficient building; Artificial intelligence; Data mining.

Abstract

The energy performance of buildings was estimated using various data mining techniques, including support vector regression (SVR), artificial neural network (ANN), classification and regression tree, chi-squared automatic interaction detector, general linear regression, and an ensemble inference model. The prediction models were constructed using 768 experimental datasets from the literature with 8 input parameters and 2 output parameters (cooling load (CL) and heating load (HL)). Comparison results showed that the ensemble approach (SVR + ANN) and SVR were the best models for predicting CL and HL, respectively, with mean absolute percentage errors below 4%. Compared with previous works, the ensemble model and the SVR model further obtained at least 39.0% to 65.9% lower root mean square errors for CL and HL prediction, respectively. This study confirms the efficiency, effectiveness, and accuracy of the proposed approach when predicting CL and HL in the building design stage. The analytical results support the feasibility of using the proposed techniques to facilitate early designs of energy-conserving buildings.

http://dx.doi.org/10.1016/j.enbuild.2014.07.036
0378-7788 © 2014 Elsevier B.V. All rights reserved.
438 J.-S. Chou, D.-K. Bui / Energy and Buildings 82 (2014) 437–446
building HL and CL is critical for effective energy conservation strategies.

In practice, identifying the parameters that substantially affect building energy consumption can help optimize a building design. The influential parameters, e.g., relative compactness [19], climate [20], surface area, wall area, and roof area [21], and orientation [22], can be grouped into two main categories: the physical properties of a building and meteorological conditions. These factors make the relationship between EPB and its influential parameters very complicated. Consequently, accurately predicting the HL and CL of a building is a challenging task.

The artificial intelligence (AI) inference model has recently proven to be a viable alternative approach for predicting EPB [23]. AI is employed to develop models that simulate human inference processes. Thus, AI can infer new facts from previously acquired information and can adaptively change in response to changes in historical data. Tsanas and Xifara [10] stated that AI not only obtains solutions very quickly but also assists building designers in analyzing the influence of input parameters. Many studies have explored the use of AI models for various prediction tasks in the context of EPB [9,24–30]. However, most works have reported unsatisfactory error rates, and most have considered only a few of the factors that affect building energy use.

Therefore, the objective of this research is to compare the performance of various AI techniques, including support vector regression (SVR), artificial neural network (ANN), classification and regression tree (CART), chi-squared automatic interaction detector (CHAID), and general linear regression (GLR). The best performing models were then combined into ensemble models. A k-fold cross-validation algorithm was used for validation. A synthesized performance index and hypothesis testing were used to compare performance measures between the proposed models and those in previous works. The contribution to the body of knowledge is the development of an AI technique that predicts building CL and HL with improved accuracy and can facilitate early building design for energy conservation.

The remainder of this paper is organized as follows. The following section introduces the study context by reviewing the related literature, including studies of EPB and predictive techniques. Section 3 then describes the research methodology and evaluation methods. Section 4 describes the building information and experimental data obtained in this study. Section 5 presents the modeling processes, discusses the prediction results, and compares model performance. Concluding remarks and research contributions are given in the final section.

… usage. Their predictions of the annual energy consumption of 247 houses in Canada had an overall correlation coefficient of 0.909 with the actual annual energy consumption. In Kwok et al., an ANN model was employed to predict the energy use of a Hong Kong office building; the best root-mean-squared percentage error (RMSPE) was 11.409% [31]. In Hou et al., an ANN based on a data-fusion technique was used to forecast air-conditioning load and achieved a small relative error (below 4%) [27]. Although these studies show that an ANN can achieve a moderate fit in predicting HL and CL, model performance is generally unsatisfactory.

In recent years, SVR, a variant of the support vector machine (SVM), has been widely used in forecasting and regression [42]. Dong et al. evaluated the use of SVR to predict energy consumption in tropical regions, applying it to make hourly forecasts of building cooling load; their study showed that SVR has superior prediction accuracy compared with conventional back-propagation neural networks [28]. Likewise, Li et al. used SVR to predict the hourly cooling load of a building and showed that SVR outperformed ANNs [29,43]. In addition, Jain et al. used SVR to forecast the energy consumption of multi-family residential buildings: a sensor-based forecasting model using SVR was applied to an empirical data set for a multi-family residential building in New York City [32].

Many other AI techniques have been proposed for improving energy consumption prediction accuracy in the energy field. In Yu et al., CART provided accurate predictions of building energy demand with low errors [9]. Li et al. hybridized a genetic algorithm with an adaptive network-based fuzzy inference system to enhance accuracy in forecasting building energy consumption [33]. Tsanas and Xifara used the random forest (RF) technique to estimate CL and HL in residential buildings [10]. Other machine learning techniques proposed for identifying and predicting system behavior include the neural network system [44], the general regression neural network [24], regression models [34,45], and hybrid systems [33].

The above studies agree that AI models perform satisfactorily in predicting building energy performance. Nevertheless, most works have used individual forecasting models rather than investigating the power of ensemble models. Moreover, the prediction performance of the aforementioned AI techniques needs further study. Therefore, the objective of this study was to fill this gap by using these models individually and in combination (ensemble models) to predict CL and HL via cross-fold validation and multiple performance measures.

3. Methodology
(Fig. 2. Schematic of an ANN model: input variables 1, 2, …, u feed the network, which produces the output y.)
The ANN imitates the working principles of the human brain to perform learning and prediction tasks. The ANN is based on biological learning and has a structure similar to that of the human nervous system. A neural network uses inter-connected neurons as processing elements, which have characteristics analogous to inputs, synaptic strengths, activation outputs, and biases. The inter-connections between neurons carry the weights of the network [47]. The neurons in the network can be classified as input, hidden, and output neurons. Fig. 2 illustrates an ANN model.

The most widely used and effective neural network model is the back-propagation neural network (BPNN). The activation of each neuron in a hidden or output layer is computed as in Eq. (5). Notably, during the learning process, the BPNN stores non-linear information between influencing factors and their associated strengths. Connection weights are adjusted during the training process to deliver predicted values close to target values. Therefore, BPNNs generally have an efficient training process.

netk = Σj wkj Oj  and  yk = f(netk) = 1 / (1 + e^(−netk))    (5)

where netk is the activation of the kth neuron, j indexes the neurons in the preceding layer, and wkj is the connection weight between neurons j and k.

In the Gini index used by the classification and regression tree (CART) technique, i and j are target field categories, π(j) is the prior probability value for category j, Nj(t) is the number of records in category j of node t, and Nj is the number of records of category j in the root node. Notably, when the Gini index is used to measure the improvement after a split during tree growth, only records in node t and the root node with valid values for the split-predictor are used to compute Nj(t) and Nj, respectively.

3.4. Chi-squared automatic interaction detector

The chi-squared automatic interaction detector (CHAID) is a decision tree classification technique developed by Kass [50]. It tests for independence by using the chi-squared test to assess whether splitting a node significantly improves purity. The predictor with the strongest association (smallest p-value) with the response variable at each node is used as the split variable. When the tested predictor shows no statistically significant improvement, no split is performed and the algorithm stops.

This study, however, proposes the use of exhaustive CHAID to classify the target field, which addresses the limitations of the CHAID technique [51]. Specifically, CHAID may not optimize the split for a predictor variable because it stops merging categories as soon as it finds that all remaining categories differ significantly. Exhaustive CHAID avoids this problem by continuously merging predictor categories until only two super-categories remain. After identifying the predictor in this series of merges, it finds the set of categories that has the strongest association with the target variable and computes an adjusted p-value for that association. Thus, exhaustive CHAID finds the best split for each predictor and then chooses which predictor to split on by comparing their adjusted p-values [49].

3.5. General linear regression

General linear regression (GLR), a more flexible version of linear regression (LR), allows data points to have an arbitrary distribution pattern. It constructs the relationship between X (predictor variables) and Y (response variable) by using a link function chosen according to that distribution. The (X − Y) relational model is therefore defined as:

g(E(y)) = X × β + O,  y ~ F    (8)

where g(·) is the selected link function, O is the offset variable, F is the distribution model of y, X is the predictor, y is the response variable, and β is the regression coefficient.

The GLR uses the Newton–Raphson method to obtain a continuous estimate such that (X × β + O) approaches g(E(y)). The final proximal equation is formulated as an (X − Y) relational expression. Although the additional parameters in GLR increase model instability, GLR has a wider application range and obtains a more realistic relationship model compared with LR.

3.6. Ensemble model

An ensemble model is a numerical prediction method for generating a representative sample of the possible future states of a dynamical system. This approach ranks the set of models described above based on their performance and then combines the best performing models into an ensemble model. It can be expressed mathematically as g: R^d → R with a d-dimensional predictor variable X and a one-dimensional response Y. Each procedure uses a specified algorithm to yield one estimated function g(·). The ensemble estimate is a linear combination of these functions, where ci comprises the linear combination coefficients, which are simply based on average values of the different weights (Fig. 3).

(Fig. 3. Ensemble modeling flowchart: input data are fed to the ANN, SVM, CART, CHAID, and GLR models; the best performing models are combined; and performance evaluation produces the output data.)

The artificial intelligence models used in this study to predict CL and HL include ANN, SVM, CART, CHAID, GLR, and the ensemble approach. Notably, the ensemble approach is expected to be superior in prediction performance to conventional models [52]. The generalization of predictive models can be enhanced by ensemble averaging. However, the effectiveness of the models needs further examination, as described in the following sections. The case study demonstrates that the modeling process can be constructed easily, objectively, and satisfactorily in terms of both utilization and accuracy.

3.7. Evaluation criteria

Multiple evaluation criteria are utilized to compare the performance of the prediction models. First, the root mean squared error (RMSE) is computed by squaring the error of each prediction against the actual value, averaging the squared errors, and taking the square root of the result. The RMSE is thus the average distance of a data point from the fitted line, measured along a vertical line. This measure efficiently identifies undesirably large differences. The RMSE is stated using the following equation:

RMSE = √[ (1/n) × Σ(i=1..n) (pi − yi)² ]    (10)

where pi is the predicted value, yi is the actual value, and n is the sample size.

In contrast to RMSE, the mean absolute error (MAE) measures how close forecasts are to the eventual outcomes. It computes the average magnitude of the errors between predicted and actual values, regardless of whether they are over- or under-predictions. The mean absolute error is given by:

MAE = (1/n) × Σ(i=1..n) |pi − yi|    (11)
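To make the formulas above concrete, the short sketch below evaluates Eq. (5) for a single neuron, Eqs. (10) and (11) for error measurement, and an equal-weight ensemble average of the kind described in Section 3.6. All numbers are made up for illustration; they are not the paper's data.

```python
import math

def neuron_activation(weights, inputs):
    """Eq. (5): netk = sum_j wkj * Oj, then yk = 1 / (1 + exp(-netk))."""
    net_k = sum(w * o for w, o in zip(weights, inputs))
    return 1.0 / (1.0 + math.exp(-net_k))

def rmse(predicted, actual):
    """Eq. (10): square root of the mean squared prediction error."""
    n = len(predicted)
    return math.sqrt(sum((p - y) ** 2 for p, y in zip(predicted, actual)) / n)

def mae(predicted, actual):
    """Eq. (11): mean absolute prediction error, ignoring sign."""
    n = len(predicted)
    return sum(abs(p - y) for p, y in zip(predicted, actual)) / n

def ensemble_average(*model_predictions):
    """Equal-weight linear combination of several models' outputs."""
    return [sum(ps) / len(ps) for ps in zip(*model_predictions)]

# Hypothetical load values (kW), for illustration only.
y_actual = [20.0, 25.0, 30.0]
y_svr    = [19.0, 26.0, 29.0]   # stand-in "SVR" predictions
y_ann    = [21.0, 24.0, 32.0]   # stand-in "ANN" predictions

y_ens = ensemble_average(y_svr, y_ann)        # -> [20.0, 25.0, 30.5]
print(rmse(y_ens, y_actual), mae(y_ens, y_actual))
print(neuron_activation([0.5, -0.25], [1.0, 2.0]))  # sigmoid(0.0) = 0.5
```

Averaging the two stand-in models here yields a lower RMSE than either model alone, which mirrors the motivation for the ensemble approach in Section 3.6.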
This work investigated twelve building types simulated in Ecotect [10]. Each type consists of various elementary blocks (see Fig. 4). All buildings have the same volume (771.75 m³) and use the same materials for each block, but have different surface areas and dimensions. The simulation data assumed that activity in the buildings was sedentary. The experiments investigated eight input parameters and two output parameters of residential buildings, i.e., relative compactness (RC), surface area, wall area, roof area, overall …
Table 1
Data description.
… to ensure that all data instances were applied in both the training and testing phases (see Fig. 6). Algorithm accuracy is then expressed as the average accuracy of the ten models over the ten validation rounds.

The prediction results were obtained using the Numeric predictor node (i.e., the ANN, CART, CHAID, GLR, and SVR models) in Clementine (now IBM SPSS Modeler) [49]. The Numeric predictor node estimates and compares models for continuous numeric range outcomes using several different methods. The node allows the choice of which algorithms to use and allows experimentation with multiple combinations of options in a single modeling path. Model comparisons can be based on correlation, relative error, or the number of variables used. The parameters of the prediction models (Table 2) were set to their defaults in Clementine to establish a baseline for validation. For example, the ANN's parameters were set as: sigmoid transfer function, hidden layers = 1, number of neurons in the hidden layer = 3, momentum = 0.9, initial learning rate = 0.3, high learning rate = 0.1, low learning rate = 0.01, learning rate decay = 30, and persistence = 200.

Each of the 2 outputs in the experiment was run 10 times with 10-fold cross-validation, and the analytical results were averaged. In each fold, the training data were used to create a prediction model, which was then evaluated using the testing data. The steps for each fold are similar and are briefly listed below (Fig. 7):

• Step 1: Input phase: add data to the source node based on the cross-validation algorithm.
• Step 2: Cross training and testing model: use the Numeric predictor node to train on the data and use the created nugget node to evaluate the model.
• Step 3: Combine the best models via an ensemble node.
• Step 4: Output phase: assess the analytical results through table and analysis nodes.

5.2. Analytical discussions

Table 3 shows the performance results for the prediction models, including the ANN, SVR, CART, CHAID, GLR, and ensemble models. The accuracy measures were used to evaluate the predictive AI techniques. Table 3 lists the summary of cross-fold modeling performance for cooling load and heating load during the test period. For example, SVR (SI = 0.00) had the best results based on the SI values for heating load.

In particular, the ensemble model (i.e., SVR + ANN) had an even better overall performance in predicting CL, and SVR was the best model for predicting HL (Table 3). The aggregate performance of the ensemble models was evaluated by combining the top two, top three, top four, and top five single models. All ensemble models obtained excellent correlations between actual values and predicted output (higher than 0.97; see Table 3). In the CL phase, the best ensemble model was derived from the two best single models (SVR + ANN). Notably, the best ensemble model was superior to the best of the individual predictive models. However, SVR had the best performance in predicting HL.

The energy industry clearly recognizes the importance of predicting the energy performance of buildings. Several models have been proposed for comparing and improving the accuracy of predicting CL and HL (Table 4). For example, Tsanas and Xifara proposed a random forest (RF) method to estimate the CL and HL in residential buildings [10]. Their models performed well: for predicting CL, they obtained RMSE = 2.567 (kW), MAE = 1.42 (kW), and MAPE = 4.62 (%). Their comparisons showed that the prediction performance of the RF model was superior to that of iteratively reweighted least squares (IRLS) models [56].

Table 4 describes the analytical results for the proposed best single and ensemble models compared with the results reported in other …
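The 10-fold procedure in Steps 1–4 can be sketched generically in plain Python. This is a standard re-implementation of k-fold splitting and score averaging [55], not the Clementine modeling stream itself; the toy "model" at the end is a made-up constant-mean predictor used only to show the calling convention.

```python
def k_fold_indices(n_samples, k=10):
    """Split sample indices into k contiguous folds so that every
    instance appears exactly once in a test set (Fig. 6's idea)."""
    folds = []
    base, extra = divmod(n_samples, k)
    start = 0
    for i in range(k):
        size = base + (1 if i < extra else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds

def cross_validate(train_fn, score_fn, X, y, k=10):
    """For each fold: train on the other k-1 folds, score on the
    held-out fold, then average the k scores."""
    folds = k_fold_indices(len(X), k)
    scores = []
    for test_idx in folds:
        test_set = set(test_idx)
        train_idx = [i for i in range(len(X)) if i not in test_set]
        model = train_fn([X[i] for i in train_idx], [y[i] for i in train_idx])
        scores.append(score_fn(model, [X[i] for i in test_idx],
                               [y[i] for i in test_idx]))
    return sum(scores) / len(scores)   # average over the k rounds

# Toy usage: a constant-mean "model" on made-up data (illustration only).
X = list(range(20))
y = [5.0] * 20
avg_score = cross_validate(lambda Xt, yt: sum(yt) / len(yt),
                           lambda m, Xt, yt: -sum(abs(m - yi) for yi in yt) / len(yt),
                           X, y, k=10)
print(avg_score)  # zero error on constant targets
```

With the paper's 768 instances and k = 10, each held-out fold contains 76 or 77 instances, and every instance is used for both training and testing across the ten rounds, exactly as the text describes.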
Table 3
Average 10-fold cross-validation results for five single models and ensemble models.

                                  Cooling load                                    Heating load
Model                             RMSE (kW)  MAE (kW)  MAPE (%)  R      SI        RMSE (kW)  MAE (kW)  MAPE (%)  R      SI
ANN                               1.678      1.158     4.403     0.984  0.53 (2)  0.610      0.432     2.362     0.998  0.34 (2)
SVR                               1.647      0.890     2.985     0.985  0.14 (1)  0.346      0.236     1.132     0.999  0.00 (1)
CART                              1.841      1.157     4.020     0.981  0.76 (3)  0.800      0.437     2.104     0.996  0.51 (3)
CHAID                             1.859      1.174     4.104     0.981  0.82 (5)  0.909      0.469     2.407     0.995  0.64 (4)
GLR                               1.740      1.292     4.966     0.983  0.79 (4)  1.039      0.787     4.591     0.995  1.00 (5)
Combined 5 single models          1.614      1.030     3.539     0.986  0.23      0.539      0.345     1.610     0.998  0.20
Combined 4 best single models     1.624      1.021     3.525     0.985  0.24      0.526      0.329     1.593     0.999  0.19
Combined 3 best single models     1.610      1.000     3.474     0.986  0.20      0.488      0.315     1.581     0.999  0.15
Combined 2 best single models     1.566      0.973     3.455     0.986  0.11      0.428      0.300     1.557     0.999  0.11

Note: An underlined value denotes the best overall prediction performance (SI) for the cooling and heating loads, respectively; (.) denotes the performance ranking among single models.
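As a rough numeric check on how Table 5's improvement rates can be derived from the averages in Table 3, consider the cooling-load RMSE of the best ensemble (SVR + ANN, 1.566 kW) versus the best single model (SVR, 1.647 kW). One plausible formula is the relative reduction below; this is an assumption for illustration, since the paper states only that improvements are calculated from average performance measures.

```python
def improvement_pct(best, second_best):
    """Relative improvement of the best model over a competitor (%)."""
    return (second_best - best) / second_best * 100.0

# RMSE values taken from Table 3 (cooling load):
# best ensemble (SVR + ANN) = 1.566 kW, best single model (SVR) = 1.647 kW.
print(round(improvement_pct(1.566, 1.647), 1))  # about 4.9 (% lower RMSE)
```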
Table 5
Hypothesis testing results for improvement rates in the best models.

Phase | Second best model and models reported in other primary work | Best model | Improved by the best model (%)

Note: The improvement and hypothesis testing are calculated using average performance measures.
** Significant at the 1% level. *** Significant at the 0.1% level.
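The significance markers in Table 5 come from hypothesis tests on model errors. The excerpt does not spell out the exact form of the test, so the paired t-statistic over fold-level errors below is only an assumed sketch of how such a comparison is commonly computed (the corresponding p-value would then be read from a t-distribution with n − 1 degrees of freedom).

```python
import math
import statistics

def paired_t_statistic(errors_a, errors_b):
    """t = mean(d) / (stdev(d) / sqrt(n)) for paired differences
    d_i = errors_a[i] - errors_b[i] over n matched folds."""
    d = [a - b for a, b in zip(errors_a, errors_b)]
    n = len(d)
    return statistics.mean(d) / (statistics.stdev(d) / math.sqrt(n))

# Hypothetical per-fold errors for two models (illustration only).
t = paired_t_statistic([2.0, 3.0, 4.0], [1.0, 1.0, 1.0])
print(t)  # 2 * sqrt(3), about 3.46
```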
[17] M. Ning, M. Zaheeruddin, Neuro-optimal operation of a variable air volume HVAC&R system, Applied Thermal Engineering 30 (5) (2010) 385–399.
[18] B.C. Ahn, J.W. Mitchell, Optimal control development for chilled water plants using a quadratic representation, Energy and Buildings 33 (4) (2001) 371–378.
[19] W. Pessenlehner, A. Mahdavi, Building morphology, transparence, and energy performance, in: Eighth International IBPSA Conference, Eindhoven, The Netherlands, 2003, pp. 1025–1030.
[20] K.K.W. Wan, D.H.W. Li, D. Liu, J.C. Lam, Future trends of building heating and cooling loads and energy consumption in different climates, Building and Environment 46 (1) (2011) 223–234.
[21] S. Schiavon, K.H. Lee, F. Bauman, T. Webster, Influence of raised floor on zone design cooling load in commercial buildings, Energy and Buildings 42 (8) (2010) 1182–1191.
[22] J. Parasonis, A. Keizikas, A. Endriukaitytė, D. Kalibatienė, Architectural solutions to increase the energy efficiency of buildings, Journal of Civil Engineering and Management 18 (1) (2012) 71–80.
[23] N. Fumo, A review on the basics of building energy estimation, Renewable and Sustainable Energy Reviews 31 (2014) 53–60.
[24] A.E. Ben-Nakhi, M.A. Mahmoud, Cooling load prediction for buildings using general regression neural networks, Energy Conversion and Management 45 (13–14) (2004) 2127–2141.
[25] B.B. Ekici, U.T. Aksoy, Prediction of building energy consumption by using artificial neural networks, Advances in Engineering Software 40 (5) (2009) 356–362.
[26] T. Olofsson, S. Andersson, R. Östin, A method for predicting the annual building heating demand based on limited performance data, Energy and Buildings 28 (1) (1998) 101–108.
[27] Z. Hou, Z. Lian, Y. Yao, X. Yuan, Cooling-load prediction by the combination of rough set theory and an artificial neural-network based on data-fusion technique, Applied Energy 83 (9) (2006) 1033–1046.
[28] B. Dong, C. Cao, S.E. Lee, Applying support vector machines to predict building energy consumption in tropical region, Energy and Buildings 37 (5) (2005) 545–553.
[29] Q. Li, Q. Meng, J. Cai, H. Yoshino, A. Mochida, Applying support vector machine to predict hourly cooling load in the building, Applied Energy 86 (10) (2009) 2249–2256.
[30] A.C. Menezes, A. Cripps, R.A. Buswell, J. Wright, D. Bouchlaghem, Estimating the energy consumption and power demand of small power equipment in office buildings, Energy and Buildings 75 (2014) 199–209.
[31] S.S.K. Kwok, E.W.M. Lee, A study of the importance of occupancy to building cooling load in prediction by intelligent approach, Energy Conversion and Management 52 (7) (2011) 2555–2564.
[32] R.K. Jain, K.M. Smith, P.J. Culligan, J.E. Taylor, Forecasting energy consumption of multi-family residential buildings using support vector regression: investigating the impact of temporal and spatial monitoring granularity on performance accuracy, Applied Energy 123 (2014) 168–178.
[33] K. Li, H. Su, J. Chu, Forecasting building energy consumption using neural networks and hybrid neuro-fuzzy system: a comparative study, Energy and Buildings 43 (10) (2011) 2893–2899.
[34] T. Catalina, V. Iordache, B. Caracaleanu, Multiple regression model for fast prediction of the heating energy demand, Energy and Buildings 57 (2013) 302–312.
[35] S.A. Kalogirou, M. Bojic, Artificial neural networks for the prediction of the energy consumption of a passive solar building, Energy 25 (5) (2000) 479–491.
[36] H.-T. Pao, Comparing linear and nonlinear forecasts for Taiwan's electricity consumption, Energy 31 (12) (2006) 2129–2141.
[37] A.H. Neto, F.A.S. Fiorelli, Comparison between detailed model simulation and artificial neural network for forecasting building energy consumption, Energy and Buildings 40 (12) (2008) 2169–2176.
[38] A. Yezioro, B. Dong, F. Leite, An applied artificial intelligence approach towards assessing building performance simulation tools, Energy and Buildings 40 (4) (2008) 612–620.
[39] J. Zhang, F. Haghighat, Development of artificial neural network based heat convection algorithm for thermal simulation of large rectangular cross-sectional area earth-to-air heat exchangers, Energy and Buildings 42 (4) (2010) 435–440.
[40] A. Rabl, A. Rialhe, Energy signature models for commercial buildings: test with measured data and interpretation, Energy and Buildings 19 (2) (1992) 143–154.
[41] M. Aydinalp, V. Ismet Ugursal, A.S. Fung, Modeling of the appliance, lighting, and space-cooling energy consumptions in the residential sector using neural networks, Applied Energy 71 (2) (2002) 87–110.
[42] V. Cherkassky, Y. Ma, Practical selection of SVM parameters and noise estimation for SVM regression, Neural Networks 17 (1) (2004) 113–126.
[43] Q. Li, Q. Meng, J. Cai, H. Yoshino, A. Mochida, Predicting hourly cooling load in the building: a comparison of support vector machine and different artificial neural networks, Energy Conversion and Management 50 (1) (2009) 90–96.
[44] N. Kashiwagi, T. Tobi, Heating and cooling load prediction using a neural network system, in: Proceedings of 1993 International Joint Conference on Neural Networks (IJCNN '93—Nagoya), vol. 1, 1993, pp. 939–942.
[45] I. Korolija, Y. Zhang, L. Marjanovic-Halburd, V.I. Hanby, Regression models for predicting UK office building energy consumption from heating and cooling demands, Energy and Buildings 59 (2013) 214–227.
[46] V.N. Vapnik, The Nature of Statistical Learning Theory, Springer-Verlag, New York, NY, 1995.
[47] T.N. Singh, S. Sinha, V.K. Singh, Prediction of thermal conductivity of rock through physico-mechanical properties, Building and Environment 42 (1) (2007) 146–155.
[48] L. Breiman, J. Friedman, R. Olshen, C. Stone, Classification and Regression Trees, Chapman & Hall/CRC, New York, NY, 1984.
[49] SPSS, Clementine 12.0 Algorithm Guide, Integral Solutions Limited, Chicago, IL, 2007.
[50] G.V. Kass, An exploratory technique for investigating large quantities of categorical data, Journal of the Royal Statistical Society, Series C: Applied Statistics 29 (2) (1980) 119–127.
[51] D. Biggs, B. De Ville, E. Suen, A method of choosing multiway partitions for classification and decision trees, Journal of Applied Statistics 18 (1) (1991) 49–62.
[52] P.J.L. Adeodato, A.L. Arnaud, G.C. Vasconcelos, R.C.L.V. Cunha, D.S.M.P. Monteiro, MLP ensembles improve long term prediction accuracy over single networks, International Journal of Forecasting 27 (3) (2011) 661–671.
[53] A. Mahdavi, B. Gurtekin, Shapes, numbers, and perception: aspects and dimensions of the design performance space, in: Proceedings of the 6th International Conference: Design and Decision Support Systems in Architecture, Ellecom, The Netherlands, 2002, ISBN 90-6814-141-4, pp. 291–300.
[54] C.M. Bishop, Pattern Recognition and Machine Learning (Information Science and Statistics), Springer-Verlag New York Inc., New York, NY, 2006.
[55] R. Kohavi, A study of cross-validation and bootstrap for accuracy estimation and model selection, in: Proceedings of the 14th International Joint Conference on Artificial Intelligence, vol. 2, Morgan Kaufmann Publishers Inc., Montreal, QC, 1995, pp. 1137–1143.
[56] D.B. Rubin, Iteratively reweighted least squares, Encyclopedia of Statistical Sciences 4 (1983) 272–275.