
University of Pennsylvania

ScholarlyCommons
Real-Time and Embedded Systems Lab (mLAB)
School of Engineering and Applied Science

7-24-2014

Evaluation of DR-Advisor on the ASHRAE Great Energy Predictor Shootout Challenge
Madhur Behl
University of Pennsylvania, mbehl@seas.upenn.edu

Rahul Mangharam
University of Pennsylvania, rahulm@seas.upenn.edu


Recommended Citation
Madhur Behl and Rahul Mangharam, "Evaluation of DR-Advisor on the ASHRAE Great Energy Predictor Shootout Challenge", July 2014.

This paper is posted at ScholarlyCommons. http://repository.upenn.edu/mlab_papers/75


Evaluation of DR-Advisor on the ASHRAE Great Energy Predictor
Shootout Challenge
Abstract
This paper describes the evaluation of DR-Advisor algorithms on "The Great Energy Predictor Shootout - The First Building Data Analysis and Prediction Competition" held in 1993-94 by ASHRAE.

Disciplines
Computer Engineering | Electrical and Computer Engineering

This technical report is available at ScholarlyCommons: http://repository.upenn.edu/mlab_papers/75


Evaluation of DR-Advisor on the ASHRAE Great
Energy Predictor Shootout Challenge

Madhur Behl
Dept. of Electrical and Systems Engineering
University of Pennsylvania
mbehl@seas.upenn.edu

1. About the Competition


This paper describes the methods and results in making predictions for test problem A of The Great Energy Predictor Shootout - The First Building Data Analysis and Prediction Competition, held in 1993-94 by ASHRAE.
Two distinct data sets (training and testing) were provided. Contestants were given the independent variables for both sets, along with the corresponding values of the dependent variables (e.g., energy usage) for the training set. The accuracy of the predictions of the dependent variables, made from the independent variables of the testing set, was the basis for judging the competition.
The organizers used the following metric, the coefficient of variation (CV), to assess the accuracy of each entry on the testing set:
    CV = \frac{\sqrt{\frac{1}{n}\sum_{t=1}^{n}\left(\hat{y}_t - y_t\right)^{2}}}{\bar{y}}

where y_t is the actual value of the dependent variable at time step t, \hat{y}_t is the corresponding predicted value, \bar{y} is the mean of the actual values over the testing set, and n is the number of testing samples.
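As a concrete illustration, the following is a minimal sketch of how this metric can be computed, assuming NumPy arrays of predicted and actual hourly values (the variable names are illustrative and not taken from any competition code):

```python
import numpy as np

def coefficient_of_variation(y_pred, y_true):
    """Root-mean-square prediction error normalized by the mean of the actual values."""
    y_pred = np.asarray(y_pred, dtype=float)
    y_true = np.asarray(y_true, dtype=float)
    rmse = np.sqrt(np.mean((y_pred - y_true) ** 2))
    return rmse / np.mean(y_true)

# Example: CV expressed as a percentage, as reported in the competition results.
# cv_wbe = 100 * coefficient_of_variation(wbe_predictions, wbe_ground_truth)
```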
The idea was to look for simpler models that may not have a strong physical basis, yet perform well at prediction. The competition attracted 150 entrants, who attempted to predict the unseen power loads from weather and solar radiation data using a variety of approaches. The winner of the competition was an entry from David MacKay [1]. MacKay's algorithm was based on Bayesian modeling using neural networks, with an Automatic Relevance Determination (ARD) prior to help select the relevant variables from the large number of possible inputs. Although this algorithm won the competition by some margin, a large fraction of the other highly ranked algorithms were also based on some form of neural network.

2. Data Description
The training data was a time record of hourly chilled water, hot water, and whole building electricity usage for a four-month period in an institutional building.


[Figure 1: Plot of the training data for the ASHRAE Great Energy Predictor Shootout Challenge: outside temperature (°F), humidity ratio, incident solar irradiation (W/m²), and wind speed (mph), each plotted against time steps.]

Weather data and a time stamp were also included. The hourly usage values of these three energy forms were to be predicted for the following two months; the testing set consisted of the two months immediately following the four-month training period. The training data contained approximately 3000 hourly samples taken during September-December 1989. The following information was provided for each time step (a possible tabular representation is sketched after the list):
1. Outside temperature (°F)
2. Wind speed (mph)
3. Humidity ratio (water/dry air)
4. Solar flux (W/m²)
5. Hour of day
6. Whole building electricity, WBE (kWh/hr)
7. Whole building chilled water, CHW (millions of Btu/hr)
8. Whole building hot water, HW (millions of Btu/hr)
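For reference, here is a minimal sketch of how these hourly records might be represented for analysis; the file format and column names are illustrative assumptions and not the original competition format:

```python
import pandas as pd

# Assumed column layout for the hourly records (names are illustrative).
COLUMNS = ["Timestamp", "Temperature", "WindSpeed", "HumidityRatio",
           "SolarFlux", "HourOfDay", "WBE", "CHW", "HW"]

def load_hourly_data(path: str) -> pd.DataFrame:
    """Read hourly records from a CSV file into a DataFrame, parsing the time stamps."""
    return pd.read_csv(path, names=COLUMNS, header=0, parse_dates=["Timestamp"])
```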

The corresponding date is also provided. The training data features (the independent variables) are shown in Figure 1, and one of the dependent (response) variables, WBE, is plotted in Figure 2.

[Figure 2: Plot of the whole building electricity (WBE) training data (kWh) against time steps. Special days and weekends can be easily identified in the data; annotations mark Veterans Day, the extended weekend due to Thanksgiving, and the Christmas week.]

In addition to the variables that are provided, the date of each sample point allows us to define proxy or context variables of our own. We define three such proxy variables:

1. Day of Week: This is a number which takes values from 1 to 7 based on the day of the week. The intuition behind introducing this variable is that any repeated building-use pattern for a specific day will be captured by it.
2. IsWeekend: This is a Boolean variable which takes the value TRUE for Saturdays and Sundays and FALSE otherwise.
3. IsHoliday: Sometimes the building might follow a modified schedule on a weekday due to a holiday or a special day. By referring to the calendar for the training and testing period, we identify the days in the training and testing data which are special days or holidays.

The motivation for introducing the proxy variables can be understood from Figure 2. One can note how weekday consumption patterns appear similar, except on certain holidays when the power consumption is irregular or seems to follow a weekend schedule (e.g., Thanksgiving). Likewise, the building has an irregular power consumption profile during the entire Christmas week, probably because of a vacation schedule. A sketch of how these proxy variables can be derived from the time stamps is given below.
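The following is a minimal sketch of such a derivation, assuming the hourly records are held in a pandas DataFrame like the one sketched earlier; the list of special days shown here is an illustrative assumption, not the list used for the competition:

```python
import pandas as pd

# Hypothetical list of special days in the study period (illustrative only).
HOLIDAYS = pd.to_datetime(["1989-11-23", "1989-12-25"])  # e.g., Thanksgiving, Christmas

def add_proxy_variables(df: pd.DataFrame, timestamp_col: str = "Timestamp") -> pd.DataFrame:
    """Append DayOfWeek, IsWeekend, and IsHoliday columns derived from the time stamp."""
    df = df.copy()
    ts = pd.to_datetime(df[timestamp_col])
    df["DayOfWeek"] = ts.dt.dayofweek + 1        # 1 = Monday, ..., 7 = Sunday
    df["IsWeekend"] = df["DayOfWeek"] >= 6       # TRUE for Saturdays and Sundays
    df["IsHoliday"] = ts.dt.normalize().isin(HOLIDAYS)
    return df
```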

3. Results

We run the random forest method on the training data to learn three different models, one each for predicting the whole building electricity (WBE), the chilled water consumption (CHW), and the hot water consumption (HW). The test data consists of 1282 samples of weather information and dates from the first two months of 1990. Our objective is to predict WBE, CHW, and HW for these two months.
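A rough sketch of this setup is shown below. It is not the exact DR-Advisor implementation; scikit-learn's RandomForestRegressor is assumed, and the feature and target names refer to the illustrative columns and proxy variables introduced earlier:

```python
from sklearn.ensemble import RandomForestRegressor

FEATURES = ["Temperature", "WindSpeed", "HumidityRatio", "SolarFlux",
            "HourOfDay", "DayOfWeek", "IsWeekend", "IsHoliday"]
TARGETS = ["WBE", "CHW", "HW"]

def train_and_predict(train_df, test_df, n_trees=300):
    """Fit one random forest per target on the training data and predict the testing period."""
    predictions = {}
    for target in TARGETS:
        model = RandomForestRegressor(n_estimators=n_trees, random_state=0)
        model.fit(train_df[FEATURES], train_df[target])
        predictions[target] = model.predict(test_df[FEATURES])
    return predictions
```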

[Figure 3: Resubstitution error (RMSE, in kWh/hr) on the training data as a function of the number of trees in the random forest.]

Table 1: Shootout Competition Results (CV values in %)

ASHRAE Team ID    WBE CV    CHW CV    HW CV    Average CV
9                 10.36     13.02     15.24    12.87
Random Forests    11.72     14.88     28.13    18.24
6                 11.78     12.97     30.63    18.46
3                 12.79     12.78     30.98    18.85
2                 11.89     13.69     31.65    19.08
7                 13.81     13.63     30.57    19.34

[Figure 4: Comparison between predicted and ground truth WBE values (kWh) for the testing data, plotted against time steps.]

The comparison between the predicted WBE values and the ground truth is shown in Figure 4; the corresponding coefficient of variation is 11.72%. The resubstitution error (on the training data) is plotted for different numbers of trees used by the random forest in Figure 3. In the actual competition, the winners were selected based on the accuracy of all predictions as measured by the coefficient of variation statistic CV: the smaller the value of CV, the better the prediction accuracy. ASHRAE released the results of the competition for the top 19 entries it received. In Table 1, we list the performance of the top five winners of the competition and compare our results with them.

It can be seen from Table 1 that the random forest method ranks second in terms of both the WBE CV and the overall average CV. The other algorithms in the table are (9) MacKay's Bayesian non-linear modeling, (6) Ohlsson's feedforward multi-layer perceptron, and (2) Feuston's neural network with pre- and post-processing. Algorithm (9) (MacKay) generates the best results overall, beating many other neural network implementations in the competition, which suggests that some particular feature of this implementation, e.g., input preprocessing, network architecture, or training approach, is what matters.

However, the results we obtain clearly demonstrate that random forests can achieve predictive performance comparable with that of the ASHRAE Shootout winners.
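For completeness, here is a sketch of how the per-target and average CV values in Table 1 could be reproduced from the predictions, assuming the illustrative helpers defined in the earlier sketches and that ground-truth values for the testing period are available in the test DataFrame:

```python
# End-to-end evaluation using the illustrative helpers sketched above.
train_df = add_proxy_variables(load_hourly_data("train.csv"))   # hypothetical file names
test_df = add_proxy_variables(load_hourly_data("test.csv"))

predictions = train_and_predict(train_df, test_df, n_trees=300)

cv = {t: 100 * coefficient_of_variation(predictions[t], test_df[t]) for t in TARGETS}
cv["Average"] = sum(cv[t] for t in TARGETS) / len(TARGETS)
print(cv)  # per-target and average CV, in percent, comparable to Table 1
```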

References

[1] David J. C. MacKay et al. Bayesian nonlinear modeling for the prediction competition. ASHRAE Transactions, 100(2):1053-1062, 1994.
