ASSIGNMENT
NE-3002 Energy in Buildings
to 1, the better the model. However, this is not the final step. The residual plot of the predictions against the
test data also needs to be checked (Fig. 2); it shows how well the R2 value represents the fit of the model.
Next, a prediction of the Boole Library electricity consumption in 2013 can be produced using the linear
regression model.
3. Results and Discussion
a. Correlation Coefficient
The Boole Library Electric Consumptions 2012 dataset consists of five variables: Opening Hours, Foot Fall,
Heating Degree Days, Day Light, and Electricity. The first step in analysing this dataset is to examine the
relationship between each pair of variables using the correlation coefficient. Figure 1 shows the correlation
coefficient between each pair of variables. A value of 1 indicates a strong relationship in the same
direction: when variable 1 goes up, variable 2 goes up too, and vice versa. A value of -1 indicates a strong
relationship in the opposite direction: variable 1 goes down when variable 2 goes up, and vice versa. A value
of 0 means there is no linear relationship between the variables: a change in variable 1 has no bearing on
variable 2.
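
The three cases above can be illustrated with a minimal Python sketch of the Pearson correlation coefficient. The function name and all values below are made up for illustration; they are not the Boole Library data and this is not necessarily how Figure 1 was produced.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

# Made-up illustrative values, not the Boole Library dataset.
opening_hours = [8, 10, 12, 14, 16]
rises_with_hours = [20, 25, 30, 35, 40]   # moves in the same direction
falls_with_hours = [40, 35, 30, 25, 20]   # moves in the opposite direction
unrelated = [3, 1, 4, 1, 3]               # no linear relationship

r_same = pearson(opening_hours, rises_with_hours)      # close to 1
r_opposite = pearson(opening_hours, falls_with_hours)  # close to -1
r_none = pearson(opening_hours, unrelated)             # close to 0
```

Real data rarely gives exactly 1, -1 or 0; values in between indicate a weaker linear relationship.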
in one day. It is strongly related to heating systems. However, its correlation with the library's electricity
consumption is low because the Boole Library uses gas, not electricity, for its heating. There may still be
an indirect relationship, which would explain why the correlation coefficient is small rather than zero.
Daylight has a negative but low correlation with electricity usage. Daylight does not appear to have a
direct impact on electricity usage in the library, since the majority of the lamps are on all day during opening
hours. However, there might be an indirect negative correlation between daylight and the library's electricity
consumption.
b. Linear Regression
The Boole Library Electric Consumptions 2012 dataset is used as the input to develop the linear regression
model of the library's electricity consumption. Table 1 shows the estimated weight (coefficient) of each
variable in the linear regression equation.
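
As a minimal sketch of how such weights can be estimated by ordinary least squares, the dependency-free Python below solves the normal equations (X^T X) w = X^T y. The function names, the two-variable example and its values are assumptions for illustration; they are not the data or the code used in this assignment, where a numerical library would normally handle this step.

```python
def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix [a | b], copied so the inputs are left untouched.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def fit_ols(rows, targets):
    """Return [intercept, w1, w2, ...] minimising the sum of squared errors."""
    design = [[1.0] + list(row) for row in rows]  # leading 1 gives the intercept
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * t for d, t in zip(design, targets)) for i in range(k)]
    return solve_linear_system(xtx, xty)

# Made-up rows of (opening hours, foot fall in thousands); not the real data.
rows = [(8, 1.0), (10, 2.5), (12, 2.0), (14, 4.0), (16, 3.5), (9, 3.0)]
# Targets generated from a known rule so the fit can be checked:
# electricity = 1000 + 200 * hours + 300 * foot_fall
targets = [1000 + 200 * h + 300 * f for h, f in rows]

intercept, w_hours, w_footfall = fit_ols(rows, targets)
```

Because the made-up targets follow the rule exactly, the fit should recover an intercept near 1000 and weights near 200 and 300; with real measurements the weights are only estimates.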
while the other 6% cannot be explained by the linear regression model. However, when we use the
test dataset to predict electricity consumption, the R2 we obtain is:
R2test is lower than R2training because R2training is calculated with the model that was developed from
the training dataset, whereas the test dataset played no part in building the model. Nevertheless, the
R2test value is high enough: it means 89% of the variance of the estimated output can be explained by the
linear regression model.
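
For reference, the coefficient of determination can be computed as R2 = 1 - SS_res / SS_tot, where SS_res is the sum of squared prediction errors and SS_tot is the sum of squared deviations of the actual values from their mean. The sketch below assumes this standard definition; the function name and the numbers are made up and are not the test results reported here.

```python
def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1 - ss_res / ss_tot

# Made-up held-out values in kWh, for illustration only.
actual_test = [3000, 4000, 5000, 6000]
predicted_test = [3200, 3900, 5100, 5800]

r2_test = r_squared(actual_test, predicted_test)

# Predicting the mean for every point explains none of the variance.
r2_mean_only = r_squared(actual_test, [4500] * 4)
```

Applying the same function to the training rows and to the held-out rows gives the two values compared above; the held-out value is usually the lower of the two.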
A high R2 value does not mean the model is absolutely accurate. There may be factors that a linear
regression cannot capture (for example, a sinusoidal pattern), which could lead to a wrong interpretation of
R2. Therefore we need to compare the predicted and actual values of the electricity consumption for the
test dataset (Fig. 2).
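
The sinusoidal case can be illustrated with a small sketch: a straight line fitted to synthetic data made of a linear trend plus a sine wave still scores a high R2, yet its residuals come in long runs of the same sign. The data, the function names and the sign-change count used as a crude pattern check are all assumptions for illustration, not part of the original analysis.

```python
import math

def fit_line(x, y):
    """Least-squares straight line; returns (intercept, slope)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1 - ss_res / ss_tot

# Synthetic data (an assumption for illustration): linear trend plus a sine wave.
days = list(range(60))
actual = [100 + 5 * d + 20 * math.sin(2 * math.pi * d / 20) for d in days]

intercept, slope = fit_line(days, actual)
predicted = [intercept + slope * d for d in days]
residuals = [a - p for a, p in zip(actual, predicted)]

r2 = r_squared(actual, predicted)

# Count how often consecutive residuals change sign. Random scatter would
# change sign often; long runs of one sign point to structure the line missed.
sign_changes = sum(1 for r0, r1 in zip(residuals, residuals[1:]) if r0 * r1 < 0)
```

Here R2 is high while the residuals change sign only a handful of times across 60 points, which is the kind of pattern a plot of predicted against actual values makes visible.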
[Figure 3: line chart of electricity (kWh), y-axis 0 to 8000, x-axis 01-Jan to 27-Oct; series: 2013 Electricity (estimate) and 2012 Electricity]
Figure 3. Boole Library Electricity Consumptions 2013 (estimated)
and Boole Library Electricity Consumptions 2012
The electricity consumption in 2013 is estimated using the linear regression model developed in the
previous step. The input is the Boole Library Electric Consumptions 2013 dataset, which consists of
Opening Hours, Foot Fall, Heating Degree Days, and Day Light. Figure 3 shows the estimated electricity
consumption in 2013 alongside the electricity consumption in 2012. The graph shows that electricity
consumption repeatedly drops to a low point and rises again. These low points occur at weekends, when the
library's opening hours are shorter. However, consumption never reaches 0, which means the library always
consumes some electricity. This is consistent with the constant term of the regression model, which relates
to the minimum electricity consumption of the library. The graph also shows a significant decrease in
electricity usage during May-August, which corresponds to the holiday period of the academic year. It also
shows different minimum electricity usage during February-May and May-November. This difference might be
caused by changes in the library's opening-hours policy between the first and second semesters, or by
differences in the peak times at which students come to the library.
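
To make the role of the constant term concrete, the sketch below applies a fitted linear model to one day's inputs. The intercept and weights are hypothetical placeholders, not the coefficients from Table 1, and the input names are assumptions chosen to mirror the four input variables.

```python
def predict(intercept, weights, inputs):
    """Linear model: intercept plus the weighted sum of the input variables."""
    return intercept + sum(weights[name] * inputs[name] for name in weights)

# Hypothetical coefficients for illustration only (not the values in Table 1).
intercept = 1500.0
weights = {
    "opening_hours": 150.0,
    "foot_fall": 0.5,
    "heating_degree_days": -10.0,
    "daylight_hours": 5.0,
}

# Made-up inputs for a busy weekday and for a day with every input at zero.
weekday = {"opening_hours": 14, "foot_fall": 4000,
           "heating_degree_days": 8, "daylight_hours": 10}
all_zero = {name: 0 for name in weights}

weekday_kwh = predict(intercept, weights, weekday)
baseline_kwh = predict(intercept, weights, all_zero)
```

With every input at zero the prediction equals the intercept, which is why the estimated curve has a floor above 0 kWh.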
4. Conclusion
Boole Library electricity consumption is related to several factors, such as opening hours and foot fall. This
conclusion results from the analysis of the Boole Library Electric Consumption 2012 dataset. The methods
used in this exercise are the correlation coefficient and linear regression. The correlation coefficients
also showed that Heating Degree Days and Daylight might not have a large impact on electricity
consumption.
The linear regression model is developed from the Boole Library Electric Consumption 2012 dataset using
Ordinary Least Squares in Python. It calculates a weighting coefficient for each variable to estimate the
electricity consumption. The resulting model has fairly high accuracy: the internal coefficient of determination
(R2training) is 94%, and the coefficient of determination on the test dataset (R2test) is 89%. The plot of estimated
against actual electricity consumption also showed relatively small residuals. However, the weighting
coefficients for HDD and Daylight seem implausible. A high HDD value means more heating is needed, so energy
consumption should rise, but the HDD coefficient is negative, which suggests the opposite. Similarly, a high
daylight value means less need for additional lighting, but the daylight coefficient is positive, which implies
that daylight and electricity consumption move in the same direction. Based on this reasoning and the correlation
coefficients, we conclude that HDD and daylight are not related to electricity consumption. These two variables
could be introducing bias into the linear regression model. If they are removed from the inputs, the model
might be improved.
The electricity consumption can be estimated using the developed linear regression model. The graph of
Boole Library electricity consumption shows patterns and variations in the library's electricity usage.