Vous êtes sur la page 1sur 10

Energy 61 (2013) 636e645

Contents lists available at ScienceDirect

Energy
journal homepage: www.elsevier.com/locate/energy

An articial neural network ensemble model for estimating global


solar radiation from Meteosat satellite images
Alvaro Linares-Rodriguez, Jos Antonio Ruiz-Arias, David Pozo-Vazquez,
Joaquin Tovar-Pescador*
Department of Physics, University of Jaen, Jaen, Spain

a r t i c l e i n f o a b s t r a c t

Article history: An optimized articial neural network ensemble model is built to estimate daily global solar radiation
Received 21 May 2013 over large areas. The model uses clear-sky estimates and satellite images as input variables. Unlike most
Received in revised form studies using satellite imagery based on visible channels, our model also exploits all information within
2 September 2013
infrared channels of the Meteosat 9 satellite. A genetic algorithm is used to optimize selection of model
Accepted 5 September 2013
Available online 1 October 2013
inputs, for which twelve are selected e eleven 3-km Meteosat 9 channels and one clear-sky term. The
model is validated in Andalusia (Spain) from January 2008 through December 2008. Measured data from
83 stations across the region are used, 65 for training and 18 independent ones for testing the model. At
Keywords:
Global solar radiation
the latter stations, the ensemble model yields an overall root mean square error of 6.74% and correlation
Articial neural network coefcient of 99%; the generated estimates are relatively accurate and errors spatially uniform. The
Ensemble model model yields reliable results even on cloudy days, improving on current models based on satellite
Meteosat images imagery.
Satellite-derived irradiances 2013 Elsevier Ltd. All rights reserved.
Infrared channels

1. Introduction commonly available in a study area. Numerous relationships have


been discovered between radiation at the earth surface and several
Despite an abundance of studies for estimating solar radiation variables (e.g., sunshine duration, temperature, precipitation, hu-
over large areas, a major challenge remains to more accurately midity, cloud cover, elevation, and latitude). These relationships
generate estimates at high spatial and temporal resolution. A well- can be grouped into physical, empirical and semi-empirical ap-
distributed network of ground solar radiation stations is usually proaches, with ngstrms equation (a relationship with sunshine
needed to ensure an accurate and reliable interpolation method. duration) perhaps the most commonly used in recent decades.
However, this is not always possible in many regions, since solar These methods produce reliable results, but they depend on
radiation is still rarely sampled compared to other environmental availability of other meteorological variables and, to attain spatial
variables. estimates over a region, there must be some method of further
Traditionally, the issue of spatial estimation of solar radiation interpolation. Therefore, the accuracy of radiation maps continues
has been addressed by three types of scheme: to depend on the density of meteorological measurements.
First, when there are many available solar radiation values, The third type of scheme has been the use of global meteoro-
widely dispersed across a study area, simple interpolation methods logical or climatological datasets derived from satellite imagery,
(nearest-neighbour, inverse distance-weighted, kriging, and numerical models or reanalysis products, which cover the entire
others) may give good spatial estimates (e.g., [1e3]). globe or extensive areas. They may provide continuous time series
Second, in the case of a coarser ground station network, radia- of data at pixels, with high spatial and temporal resolutions. Among
tion may be estimated using other meteorological variables more these methods, satellite imagery has been more frequently
exploited in recent decades, taking advantage of the high spatio-
temporal resolution of geostationary satellites. This makes satellite
* Corresponding author. Department of Physics, Building A3, Campus Lagunillas, approaches the best option for constructing accurate maps, espe-
University of Jaen, Jaen, Spain. Tel.: 34 953212428; fax: 34 953212838. cially over large areas.
E-mail addresses: alinares@ujaen.es (A. Linares-Rodriguez), jararias@ujaen.es
Satellite-derived irradiances can be calculated from each image
(J.A. Ruiz-Arias), dpozo@ujaen.es (D. Pozo-Vazquez), jtovar@ujaen.es (J. Tovar-
Pescador). every 15 min, with as much as w1-km spatial resolution in some

0360-5442/$ e see front matter 2013 Elsevier Ltd. All rights reserved.
http://dx.doi.org/10.1016/j.energy.2013.09.008
A. Linares-Rodriguez et al. / Energy 61 (2013) 636e645 637

Nomenclature G mean measured daily global solar irradiation (MJ/m2)


Gcs global clear sky solar irradiation (MJ/m2)
4 hyperbolic tangent activation function at the hidden GD daily global solar irradiation (MJ/m2)
layer
j linear activation function at the output layer Acronyms
w1(j,i) weight between jth input and ith hidden layer neuron ANN articial neural network
w2(i) weight between ith hidden layer neuron and the SEVIRI Spinning Enhanced Visible and Infrared Imager
output neuron MSG Meteosat second generation satellites
b1(i) bias term applied to the ith hidden layer neuron ESRA European Solar Radiation Atlas
b2 bias term applied to the output layer neuron AIC Akaike information criterion
N number of hidden neurons MBE mean bias error
M number of input neurons RMSE root mean square error
!
x input vector rRMSE relative root mean square error
xj jth element of the input vector
yi response of the ith hidden neuron Subscript
G measured daily global solar irradiation (MJ/m2) i neuron number of the hidden layer
b
G predicted daily global solar irradiation (MJ/m2) j neuron number of the input layer

regions. In general, satellite count-to-irradiance models can be for photovoltaic systems. All these can be classied as empirical
grouped into physical and empirical models, but hybrid models are approaches, within the second type of scheme explained above.
the most extensively used. These models take advantage of both the In recent years, there have also been some works in global solar
former two types. That is, they have the robustness of a physical radiation estimation using ANNs and satellite data (e.g., for
approach (usually based on radiative transfer models that explicitly monthly [30], daily [22], and hourly [31] estimates). Usually, in
describe scattering and absorption processes operating in the these types of methods, satellite visible channels are used for
earth-atmosphere system), but simplify certain equations with model inputs. Only a few studies have used other satellite channels.
empirical regressions. There have been extensive reviews of these Perez et al. [14] suggested a method to improve the performance of
models [4e6]. Among the most used, we refer to Heliosat method satellite-to-irradiance models using satellite infrared sensors. Lu
[7e9] and Perez model [10e14] and their variations, some of which et al. [22] used instead all channels of a Multifunctional Transport
combine both model types. Although they retain some degree of Satellite of the Japan Meteorological Agency for daily global solar
tting to observations, their physical underpinnings guarantee estimation, in an effort to fully exploit information contained in
suitability for generalization. both visible and infrared channels. They achieved estimates with
It has been shown that these satellite-based approaches RMSE around 20%.
generate more accurate solar radiation estimates than classical In our study, Meteosat all-channel observations and a clear sky
interpolation methods, at locations outside the vicinity of a ground model were used as ANN inputs. The extra information from
station [10,15,16]. Examples of solar radiation maps built with infrared channels signicantly improves the derivation of solar
satellite data are found in Refs. [17,18]. radiation and yields more accurate estimates, outperforming other
In a previous work [19], we proposed a novel method within the physical or semi-empirical approaches.
third type of scheme. It estimates daily solar radiation at unob- The remainder of the paper is organized as follows. The Region
served locations, using reanalysis data from ECMWF (European and data section describes the study area and data, with special
Centre for Medium-Range Weather Forecasts) [20] as an alternative reference to Meteosat images. Detailed description of the method is
to satellite imagery. Despite the use of coarse spatial-resolution ERA provided in the following section, including neural network ar-
(ECMWF Reanalysis)-Interim data (0.7, about 70 km in the study chitecture and input selection. The Results and discussion section
region [20]), the method produced valid estimates of daily radia- deals with assessment of various ANN models. Finally, conclusions
tion, even better on a monthly basis. are given.
The aim of this work is to design a new method for spatial esti-
mation of daily solar irradiation via articial neural networks and 2. Region and data
satellite-derived irradiances, toward improving results in the current
literature on daily global solar radiation (e.g., RMSE (root mean square 2.1. Region of study and ground measurement dataset
error) between 9% and 16% on a daily basis, as cited in Refs. [8,21,22]).
Two main improvements are achieved herein: 1) An optimiza- The study region is Andalusia, which covers about 87,000 km2 in
tion procedure of an ANN (articial neural network) ensemble the southern Iberian Peninsula. The experimental dataset is the
model, and 2) use of additional information contained in Meteosat same used in a previous work [19]. Here, we consider data from the
channels other than the two visible ones, which are the most used year 2008. We used 83 solar radiation ground stations across the
in these methods. region. The stations are owned and maintained by the Andalusian
In recent years, ANN models have received increased scientic regional government, and are part of the Agricultural and Envi-
attention as a method for estimating solar radiation across large ronmental monitoring network. Two different pyranometers are
areas. These models for deriving solar radiation through ANNs use a used in this network: CMP3 (Kipp & Zonen), and SP1110 from
variety of meteorological variables as inputs. It has been shown Campbell Scientic (manufactured by Skye Instruments). Overall,
that ANN models outperform traditional regression approaches an instrumental error about 5% can be assumed [1,3].
[23e26], and hence have been used in numerous papers (see a brief Daily solar radiation values were carefully ltered to eliminate
review in Ref. [19]). We refer to Kalogirous reviews of ANN appli- spurious data, and inaccurate or inconsistent measurements [32].
cations for renewable energy systems [27,28], and Mellit et al. [29] Several quality control procedures were performed at this stage.
638 A. Linares-Rodriguez et al. / Energy 61 (2013) 636e645

Particularly, a quality assessment based upon physical limits was effective radiances Cal Offset Cal Slope  Pixel Count;
applied: an upper limit of 0.8 for the atmospheric clearness,
calculated as the daily measured solar radiation to the daily po-
(1)
tential extraterrestrial, and a lower limit of 0.01 MJ/m2 [33]. We only considered the rst eleven Meteosat-9 channels, with
As a rst step, 65 stations were selected for the training dataset, 3-km nadir spatial resolution (approximately 4.5 km at study area
and the remaining 18 were used as an independent testing dataset. latitudes). For these channels, a complete image, that is, the full
Fig. 1 shows the distribution of the training and validation stations disk of the earth, nominally consists of 3712  3712 pixels. The
within the study region. Next, 133 suspicious data were eliminated potential information contained in each channel may be summa-
from the training dataset. Ultimately, 23,592 daily global solar ra- rized as follows [38]:
diation values were used as the training dataset, and 6570 as the
testing dataset, corresponding to daily horizontal global solar - Channels 1e2 (VIS0.6 and VIS0.8): These are the visible chan-
irradiation observations from 2008/01/01 to 2008/12/31 at 65 nels, essential for cloud detection, cloud tracking, scene identi-
training stations (23,592 65 stations  365 days  133 suspicious cation, aerosol, and land surface and vegetation monitoring.
data) and 18 testing stations (6570 18 stations  365 days). - Channel 3 (NIR1.6): It can discriminate between snow and cloud,
ice and water clouds, and provides aerosol information.
2.2. Meteosat dataset - Channel 4 (IR3.9): Primarily for low cloud and fog detection.
Also supports measurement of land and sea surface temperature
2.2.1. MSG (Meteosat second generation) satellite series at night, and increases low-level wind coverage from cloud
The MSG satellite series, with its SEVIRI (Spinning Enhanced tracking. For MSG, the spectral band has been broadened to
Visible and Infrared Imager), scans the complete disk of the earth longer wavelengths for improving signal-to-noise ratio.
four times per hour and has 12 spectral channels. These are three - Channels 5e6 (WV6.2 and WV7.3): Channel for observing water
solar channels (0.6, 0.8 and 1.6 mm), eight infrared channels (3.9, 6.2, vapour and winds. Enhanced to two channels peaking at
7.3, 8.7, 9.7, 10.8, 12.0 and 13.4 mm), and one high-resolution broad- different levels in the troposphere. Also supports height allo-
band visible channel (0.3e0.7 mm). The nadir spatial resolution of cation of semitransparent clouds.
SEVIRI is 1  1 km2 for the high-resolution channel, and 3  3 km2 - Channel 7 (IR8.7): Provides quantitative information on thin
for all other channels [34,35]. Raw data are internally managed via a cirrus clouds and supports discrimination between ice and
radiometric process, in four main steps [34]: a) linearization, b) water clouds.
conversion into radiances, c) calibration, and d) linear scaling. The - Channel 8 (IR9.7): Ozone radiances may be used as an input to
last step ensures that the necessary dynamic range falls into the numerical weather prediction. Temporal evolution of the total
available interval [0, 1023], through two linear coefcients: Cal_- ozone eld can also be monitored.
Slope and Cal_Offset. Both are constants provided for each image - Channels 9e10 (IR10.8 and IR12.0): Well-known, split-window
by the satellite sensors. Finally, effective radiances for each channel channels. Essential for measuring sea and land surface and
are fully dened by the relationship [35e37]: cloud-top temperatures.

Fig. 1. Study region. Highlighted region shows DEM (digital model of terrain) used.
A. Linares-Rodriguez et al. / Energy 61 (2013) 636e645 639

- Channel 11 (IR13.4): The CO2 (carbon dioxide) absorption


channel. In cloud-free areas, it may contribute temperature in-
formation from the lower troposphere that can be used for
estimating static instability.

2.2.2. Effective radiance dataset


Meteosat satellite products are available online via the EOP
(Earth Observation Portal) from the website of the EUMETSAT
(European Organisation for the Exploitation of Meteorological
Satellites) [39]. The EUMETSAT Data Centre provides a long-term
archive of data and generated products from EUMETSAT, which
can be ordered online. Before ordering products, users have to be
registered in the Earth Observation Portal (EOP) and signed up for
the Data Centre service.
Among the collections of satellite product types from the
various EUMETSAT-operated satellites, the EUMETSAT archive
supplies images and derived meteorological products from geo-
stationary Meteosat satellites since 1981, which are positioned over
Europe as well as the Indian Ocean.
Meteosat 9 High Rate SEVIRI Level 1.5 images from 2008/01/01
to 2008/12/31 were collected in HDF5 format, and all data mining
were performed within the Matlab environment. We used the
nctoolbox library [40], a Matlab toolbox that provides read-only
access to common data model datasets and allows to access
many le formats and services using the same interface.
Each satellite image was trimmed to the study region. Then,
pixels containing ground station locations were selected and
effective radiances calculated via Eq. (1), for each pixel every 15 min Fig. 2. Multilayer perceptron, feed-forward neural network.
in 2008. Some missing values (a small percentage, <1%) were lled
by Meteosat-8 images, or by interpolating previous/next values.
Daily effective radiances were obtained by summing all values each (see Nomenclature for symbol denitions).
day. In order to improve the generalization capabilities of the ANN
Finally, radiances at each station location were arranged in models, there are two widely used training methods known as
matrix form, for use as ANN candidate inputs: 11 channels  23,592 early stopping and Bayesian regularization [41]. The former method
radiance values for training, and 11  6570 values for testing. requires a validation set besides the calibration and test sets. The
latter method, which is used here, minimizes the over-tting
problem by taking into account the goodness-of-t as well as the
3. Methodology network architecture.
The Matlab Neural Network Toolbox [42] was used for the
3.1. ANN design implementation of the ANNs. For the training of the neural net-
works, the algorithm trainbr was used. The algorithm is a super-
An ANN is characterized by its architecture, training or learning vised iterative training method that updates weights and bias
algorithm, and its activation function. The ANN models constructed values according to LevenbergeMarquardt optimization [19]. It
here are of the MLP (multilayer perceptron) type. Architecture of minimizes a linear combination of squared errors and weights, via
the ANN models used (a visual scheme is shown in Fig. 2) consists gradient descent with a momentum algorithm suitable to avoid
of the following: An input layer with twelve inputs (M 12, the falling to a local minimum [43]. It then uses Bayesian regularization
eleven Meteosat channels plus a clear-sky term); one hidden to determine the correct combination of errors and weights,
layer with twenty-ve neurons (N 25); and an activation resulting in a network that generalizes satisfactorily [44e46].
function 4, dened by the hyperbolic tangent function Moreover, the training function trainbr also minimizes the num-
4 tanh(x) (exp(2x)  1)/(exp(2x) 1), x being the corre- ber of effective network parameters (weights and biases) used by
sponding input. According to this function, inputs are standardized the network.
to the range [e1,1]. For the output layer, a linear activation function The number of hidden neurons in an ANN is a function of
j was used in the implementation. problem complexity, number of input and output parameters, and
The approximating function G, ^ representing daily global solar
number of available training cases. A trial-and-error process de-
radiation, was dened as termines the number of hidden neurons. It is common in envi-
! ronmental modelling to seek a more parsimonious model, with the
X
N
b !
G x j yi ,w2 i b2 (2) smallest number of inputs and parameters possible, to avoid over-
i1 parameterization problems. After trying a number of different
congurations, we found that the best architecture contained
where twenty-ve hidden neurons.
0 1 Daily global solar radiation was selected as the only model
! X
M output. Training and testing of the network continued until there
yi x 4@ xj ,w1 j; i b1 iA (3) was no improvement in output (i.e., minimum square error be-
j1 tween ANN output values and measured data) or after a
640 A. Linares-Rodriguez et al. / Energy 61 (2013) 636e645

predetermined number of epochs, set to 500. In this case, minimum and clear-sky term, and the second with all possible inputs except
error was achieved before reaching the maximum epoch. channel 7, which is the term with lowest relative importance.
Table 1 shows all relative importances of terms, computed as the
3.2. Selection of inputs sum of relative evidence weights of all possible combinations in
which the term appears. Despite similar AIC scores of the two best
One of the more critical steps in ANN modelling is selection of an input combinations, the glmulti algorithm gives a very different
appropriate set of input variables [47e49]. If relevant inputs are relative weight to each candidate. Ranked relative evidence
omitted, the model is unable to capture the desired inputeoutput weights of the models are computed as exp(DAIC/2), where DAIC is
relationships. On the other hand, too many inputs can produce the difference of AIC between a model and the best model. These
over-parameterization problems, as cited above. Usually, a set of weights are normalized so that their sum is one; they can be
candidate input variables is found via a priori knowledge of the interpreted as probabilities of each model being the best in the set
system being modelled. [55]. The relative weight computed for the rst combination (all
In this work, the candidate inputs were a clear-sky solar radia- possible inputs) is 90.3%, against 9.70% for the other combination
tion estimate and the eleven 3-km Meteosat channels. These sat- (all inputs except channel 7). Hence, the best combination was one
ellite observations provide useful information of scattering, built with all possible candidate inputs, i.e., with no input
absorption and reection processes affecting solar radiation pass- neglected.
ing through the atmosphere.

3.2.1. Clear-sky solar radiation model 3.3. Model optimization procedures


We used the clear-sky model of the ESRA (European Solar Ra-
diation Atlas) [50], which is based on the turbidity coefcient of 3.3.1. Data division
Linke. In particular, this parameter accounts for climatic aerosols. A common practice in ANN modelling to guarantee its gener-
Daily clear-sky global solar radiation plays an important role as an alization capability (i.e., a model should yield reliable estimates
input variable because it incorporates seasonal dependence, at- outside the range of trained data) is to separate input data into
mospheric optical mass and consequently altitude [2,51]. We used appropriate calibration and validation sets. The ANN is trained only
representative averages of monthly Linke turbidity factors for the with the rst dataset and the last is used to validate performance of
study region. Calculations were similar to the previous work [19]. It a trained ANN. We use this splitting method for the training dataset
has been shown [52] that ESRA clear sky model yields accurate (23,592 input/output patterns), according to the proportions 75%
estimates, with an overall RMSE of about 5% on daily basis. and 25% for calibration and validation sets, respectively.
Another independent set was selected outside the training
3.2.2. Optimized input combination dataset (6570 input/output patterns from independent ground
Each of the twelve ANN inputs seems a priori to contain relevant stations dataset) to verify that the network correctly learned the
information regarding solar radiation. In cases of more input vari- relationship between inputs and outputs, so could therefore be
ables, a selection method to rule out any of them should be con- used to generate reliable global solar radiation estimates at un-
ducted, as the Bayesian method called automatic relevance trained locations.
determination [47,53,54]. It has been shown that irrelevant vari-
ables can deteriorate the model performance [54]. 3.3.2. Monte Carlo simulations
In our study, we conducted a simpler method to determine the The calibration and validation subsets of the training dataset
best combination of inputs. We executed a multivariate variable were sampled randomly. ANN weights and biases were also
selection procedure. Although the expected inputeoutput rela- initialized randomly. To attain robustness in the model and prevent
tionship should show nonlinearity, this procedure may give some randomness problems, we followed a procedure similar to Monte
idea of the existence of negligible inputs. We used the glmulti R Carlo simulations and repeated the overall ANN model develop-
package for selection of the best input combination [55]. Whether a ment 1000 times, leaving parameter initialization and data division
candidate input is rejected by glmulti algorithm would mean that as random. We selected the rst ve ANNs that achieved optimum
variable input contains superuous or redundant information. results as ANN ensemble model inputs (see next subsection).
From the list of the explanatory variables, i.e., the clear-sky solar This procedure represents a computationally expensive effort
radiation term and eleven Meteosat channels, the function glmulti but, once the resulting ANN has been successfully trained and
builds all possible unique combinations involving these variables. It selected, solar radiation estimates are obtained immediately and
explores the candidate set with a GA (genetic algorithm), which can training is no longer needed.
readily nd optimum combinations without tting all possible There are two major assumptions behind this procedure: (1) By
models, and ranks candidate combinations according to an infor- varying the initial parameters, the training algorithm is enabled to
mation criterion. This is the AIC (Akaike information criterion) in
the present work [56]. Table 1
By exploring only a subset of all possible models, randomly but Relative importance of terms.
with a bias towards better models thanks to selection, it makes
Channel 7 0.9029585
computation much faster. Genetic algorithms are very efcient at Channel 4 0.9999977
exploring highly discrete state spaces, and have been used suc- Channel 2 0.9999999
cessfully in related optimization problems. The GA procedure starts Channel 1 1.0000000
from a random population of models and evolves good local solu- Channel 3 1.0000000
Channel 5 1.0000000
tions by mimicking the process of natural selection using mecha-
Channel 6 1.0000000
nisms such as higher rate of replication of the more effective Channel 8 1.0000000
variable subsets (chromosomes), mutation to generate variants and Channel 9 1.0000000
crossover to improve combinations [57,58]. Channel 10 1.0000000
After algorithm convergence, there are two combinations with Channel 11 1.0000000
Gcs 1.0000000
lowest AIC scores. The rst is built with all the Meteosat channels
A. Linares-Rodriguez et al. / Energy 61 (2013) 636e645 641

Table 2 According to Table 2, of the ve best models, the ANN-E model


Error values of the ANN models. yields the most accurate daily global solar radiation estimates. MBE
ANN- Training period Independent period and RMSE values are better than those reported in the literature for
model
MBE RMSE (MJ/m2) R MBE RMSE (MJ/m2) R
a daily basis (see Introduction section), and are near the assumed
(MJ/m2) (MJ/m2) experimental/instrumental error of 5% [3,60]. The low MBE values
suggest the ANN-E model to be unbiased, although the model
ANN- 4.2E005 2.2603 (12.16%) 0.96 0.0684 2.5796 (13.84%) 0.95
2008a slightly overestimates solar radiation relative to the independent
ANN-1 0.0066 1.2535 (6.73%) 0.99 0.0296 1.2646 (6.83%) 0.99 dataset (negative values).
ANN-2 0.0241 1.2619 (6.78%) 0.99 0.0232 1.2573 (6.79%) 0.99 The new ANN-E model outperforms the previous ANN-2008
ANN-3 0.0014 1.2527 (6.73%) 0.99 0.0339 1.2688 (6.85%) 0.99
model [19], which was used in the present work as a benchmark.
ANN-4 8.00E004 1.2530 (6.73%) 0.99 0.0491 1.2709 (6.87%) 0.99
ANN-5 0.028 1.2503 (6.72%) 0.99 0.0386 1.2633 (6.82%) 0.99 In that work, we built an ANN model with eight inputs:
ANN-E 0.0055 1.2364 (6.64%) 0.99 L0.0349 1.2473 (6.74%) 0.99 geographical latitude and longitude of the ground stations, day of
Bold values correspond to ensemble model, which is the selected model.
the year, clear-sky daily radiation, and four meteorological variables
a
ANN reference model used in a previous work [18], built with ERA-Interim from ERA-Interim dataset [20]: total cloud cover, skin temperature,
reanalysis data from European Centre for Medium-Range Weather Forecasts. total column water vapour and total column ozone. The model was
trained and validated with measured data from the same experi-
search other parts of the parameter space, increasing overall mental dataset of this study, and for year 2008 yielded estimates
chances of nding a global error minimum; (2) since measured data with an overall RMSE of 13.84% for the testing stations, as depicted
are affected by noise and an ANN is too sensitive to calibration data in Table 2.
quality, a cleaner choice of training dataset should better reect real The improvement of ANN-E against that ANN-2008 model was
relationships between inputs and output, thereby obtaining more expected, since the most relevant input variables to ANN-2008
reliable results. were derived from ERA-Interim analysis, with original spatial res-
olution 70 km. Spatial resolution of the Meteosat imagery is about
3.3.3. ANN ensemble model 4.5 km in the study region. Fig. 3 shows a scatter plot between
It has been shown (e.g. Ref. [59]) that the robustness and reli- measured and predicted solar radiation values at the independent
ability of an ANN can be often signicantly improved by appro- set of stations.
priately combining several ANN models into an ANN ensemble. The The favourable results from the ANN-E model validate the initial
construction of such an ensemble requires two main steps. The rst premise of a real relationship between surface solar radiation and
is to create individual ensemble members, and the second is to nd satellite radiances. Once the ANN is trained, the model can predict
the appropriate combination of outputs from those members for solar radiation values for any location in the study region, and for
producing the unique ensemble output. any day within the training period. Estimates outside these ranges
Here, we compare results of the ve best models selected in the are not reliable. The model can be used, for example, for lling
previous stage with outputs of an ANN-E model (ANN ensemble) missing daily solar radiation values at ground stations, and for
built with those ve individual ANN models. Ensemble output was predicting solar radiation at unobserved locations. The latter rep-
constructed by averaging the ve individual ANN outputs. resents a spatial interpolation method and approach for generating
accurate solar radiation maps. The ANN model even works with
4. Results and discussion missing values from a given day, a great advantage compared to

Results were validated using the following statistical scores:


MBE (mean bias error), RMSE, rRMSE (relative root mean square
error) and correlation coefcient (R), between predicted and
measured values of daily global radiation for training and testing
datasets:

n  
1 X b
MBE Gd  G d (4)
n
d1

v
u n  2
uX
RMSE t Gd  G b =n (5)
d1

rRMSE RMSE*100=G (6)

r
X 2 X  2
R bs  G =
G Gs  G ; (7)

where n is the number of input/output patterns.

4.1. Estimation of daily global solar radiation

As explained in the previous section, the ANN training proce-


dure was repeated 1000 times and the best ve models selected. Fig. 3. Scatter plot of ANN-E predicted solar radiation values against measured data at
Statistical scores of the results are shown in Table 2. testing stations.
642 A. Linares-Rodriguez et al. / Energy 61 (2013) 636e645

Fig. 4. Comparison between measured and predicted radiation values, grouped by training stations.

other interpolation techniques such as kriging, which always re- RoeSevilla) had slightly worse error estimates than other stations.
quires real values. This fact demonstrates the generalization capability of the model,
To certify the spatial interpolation capability of the ANN-E and allows use of the ANN-E model to generate reliable estimates at
model, we checked the accuracy of estimates at each station. any unobserved point within the study region, thereby generating
Figs. 4 and 5 show statistical scores of radiation estimates at each of accurate daily solar radiation maps.
the ground training and independent stations, respectively. Nearly
all the stations have relative RMSE values between 4% and 9%. A 4.2. Dependence on sky conditions
boxplot (Fig. 6) shows overall scores. Within each box, the central
mark is the median of all the error values, and box edges denote the Fig. 7 shows error values of estimates grouped in quarters of a
25th and 75th percentiles. Whiskers extend to the most extreme year. In Andalusia, summer is usually sunny and dry, and winter is
data points not considered outliers, and outliers are plotted cloudy and rainy. Hence, from April through September there are
individually. frequent clear-sky days, and the ANN-E model yields more accurate
These results conrm that the ANN-E model gave reliable results estimates than those from rainy months.
over the entire study region, with accurate estimates and spatially To better evaluate the model for cloudy days, we calculated a
uniform errors. Only three training stations of the 65 (6th station: clear-sky index KC. It is dened as the ratio of measured global
La MojoneraeAlmera; 41st: HuesaeJan, and 51st: Esteponae irradiance GD to that received under clear-sky conditions, GCS,
Mlaga) and one independent station of the 18 (17th: Puebla del which is estimated with the ESRA clear-sky solar radiation model

Fig. 5. Comparison between measured and predicted radiation values, grouped by testing stations.
A. Linares-Rodriguez et al. / Energy 61 (2013) 636e645 643

sky conditions are superior to ones reported in the literature even


under all sky conditions (e.g., [8,19,22,61e63]).

5. Summary and conclusions

A new approach for estimating daily global solar radiation


through satellite-derived irradiances was formulated, via an opti-
mized ANN ensemble model. The approach takes advantage of the
high temporal and spatial resolution of Meteosat satellite images.
Unlike most methods for deriving solar radiation from satellite
imagery that use only visible channels, our model exploits all the
information contained in the infrared channels of Meteosat 9.
The ANN model was constructed as an ensemble of ve opti-
mized, MLP feed-forward neural networks. The ANN inputs were
the eleven Meteosat-9 channels, with 3-km nadir spatial resolution
(approximately 4.5 km at study region latitudes), and an ESRA
clear-sky term.
The ve ANN models were calibrated with 75% of solar radiation
observations at 65 training stations across the entire Andalusian
region during 2008. As an independent validation dataset, we used
10% of observations in these calibration sets and a complete year of
observations at an independent set of 18 stations across the region.
The resulting ensemble ANN model produced accurate estimates of
daily solar radiation at training and testing stations, which were
Fig. 6. Boxplot of overall errors. virtually unbiased and had RMSEs between 4% and 9%. Overall
RMSE at independent stations was 6.74%. The model outperformed
previous solar radiation estimates on a daily basis, for both cloudy
(Section 3.2.1.) [8]. Small KC values mean that incoming radiation is days and clear-sky conditions.
much less than that expected under clear-sky conditions, i.e., they Homogeneity of RMSE and MBE values at nearly all the ground
correspond to overcast conditions. High KC values (near 1 or even stations conrms that the ANN-E successfully learned the complex
greater) relate to clear-sky conditions. Fig. 8 shows a histogram of relationship between input variables and solar radiation, and it can
KC values for the training dataset. A nearly identical histogram was be used to predict GD at any location within the study region.
obtained for testing stations. The histogram reveals the high per- Therefore, it can be used to generate accurate solar radiation maps
centage of clear-sky days in Andalusia over a year. region-wide, at the same spatial resolution as Meteosat images
Results are listed in Table 3. Solar radiation estimates are sorted (4.5 km at the latitudes of the study region), that could play a
into three sets: KC < 0.5, 0.5  KC  0.8 and KC > 0.8, corresponding strategic role in the location of solar energy applications in the
to overcast, intermediate and clear sky conditions, respectively. study region.
As expected, ANN-E generated the poorest solar radiation esti- Future work will focus on improving the temporal resolution of
mates in overcast conditions e rRMSE of 21.2%, against 5.13% for estimates, taking advantage of the high temporal sampling of sat-
clear-sky days. Nevertheless, these errors for the least favourable ellite imagery (15 min). The aim is to continue improving methods

Fig. 7. Error values of estimates grouped in quarters of a year, for a) training and b) testing stations.
644 A. Linares-Rodriguez et al. / Energy 61 (2013) 636e645

Fig. 8. Histogram of KC values.

Table 3
Error values of ANN-E model, grouped by clear-sky index values.

Training stations Testing stations

KC < 0.5 0.5  KC  0.8 KC > 0.8 KC < 0.5 0.5  KC  0.8 KC > 0.8

Num. of values 2789 4993 15,810 782 1349 4439


MBE (MJ/m2) 0.39 0.36 0.19 0.35 0.36 0.12
RMSE (MJ/m2) 1.14 1.55 11.07 1.33 1.53 1.12
rRMSE 17.69% 11.07% 5.12% 21.2% 10.96% 5.13%

to obtain more accurate solar radiation estimates every 15 min, by [9] Zarzalejo LF, Polo J, Martn L, Ramrez L, Espinar B. A new statistical approach
for deriving global solar radiation from satellite images. Sol Energy
using all information contained in Meteosat channels.
2009;83(4):480e4.
[10] Perez R, Seals R, Zelenka A. Comparing satellite remote sensing and ground
network measurements for the production of site/time specic irradiance
Acknowledgements data. Sol Energy 1997;60(2):89e96.
[11] Ineichen P, Perez R. Derivation of cloud index from geostationary satellites
and application to the production of solar irradiance and daylight illuminance
This work was supported by the Spanish Ministry of Science and
data. Theor Appl Climatol 1999;64(1e2):119e30.
Technology (Project CGL2011-30377-C02-01). The data were kindly [12] Perez R, Ineichen P, Moore K, Kmiecik M, Chain C, George R, et al. A new
provided by the regional ofce of Agriculture and Fishing of operational model for satellite-derived irradiances: description and valida-
Andalusia. tion. Sol Energy 2002;73(5):307e17.
[13] Cebecauer T, Suri M, Perez R. High performance MSG satellite model for
operational solar energy applications. In: Proc ASES annual conference,
Phoenix, AZ 2010.
References [14] Perez R, Kivalov S, Zelenka A, Schlemmer J, Hemker Jr K. Improving the per-
formance of satellite-to-irradiance models using the satellites infrared sen-
[1] Alsamamra H, Ruiz-Arias JA, Pozo-Vzquez D, Tovar-Pescador J. A comparative sors. In: Proc ASES annual conference, Phoenix, Arizona 2010.
study of ordinary and residual kriging techniques for mapping global solar [15] Zelenka A, Czeplak G, dAgostino V, Josefson W, Maxwell E, Perez R. Tech-
radiation over southern Spain. Agric For Meteorol 2009;149(8):1343e57. niques for supplementing solar radiation network data. Technical report.
[2] Ruiz-Arias JA, Tovar-Pescador J, Pozo-Vazquez D, Alsamamra H. A comparative Krahbuhlstrasse, 58, CH-8044 Zurich, Switzerland: International Energy
analysis of DEM-based models to estimate the solar radiation in mountainous Agency, # IEA-SHCP-9D-1, Swiss Meteorological Institute; 1992.
terrain. Int J Geogr Inf Sci 2009;23(8):1049e76. [16] Zelenka A, Perez R, Seals R, Renn D. Effective accuracy of satellite-derived
[3] Ruiz-Arias JA, Pozo-Vzquez D, Santos-Alamillos FJ, Lara-Fanego V, Tovar- hourly irradiances. Theor Appl Climatol 1999;62(3):199e207.
Pescador J. A topographic geostatistical approach for mapping monthly mean [17] Janjai S, Laksanaboonsong J, Nunez M, Thongsathitya A. Development of a
values of daily global solar radiation: a case study in southern Spain. Agric For method for generating operational solar radiation maps from satellite data for
Meteorol 2011;151(12):1812e22. a tropical environment. Sol Energy 2005;78(6):739e51.
[4] Noia M, Ratto CF, Festa R. Solar irradiance estimation from geostationary [18] Janjai S, Pankaew P, Laksanaboonsong J, Kitichantaropas P. Estimation of solar
satellite data: I. Statistical models. Sol Energy 1993;51(6):449e56. radiation over Cambodia from long-term satellite data. Renew Energy
[5] Noia M, Ratto CF, Festa R. Solar irradiance estimation from geostationary 2011;36(4):1214e20.
satellite data: II. Physical models. Sol Energy 1993;51(6):457e65. [19] Linares-Rodrguez A, Ruiz-Arias JA, Pozo-Vzquez D, Tovar-Pescador J. Gen-
[6] Pinker RT, Frouin R, Li Z. A review of satellite methods to derive surface eration of synthetic daily global solar radiation data based on ERA-Interim
shortwave irradiance. Remote Sens Environ 1995;51(1):108e24. reanalysis and articial neural networks. Energy 2011;36(8):5356e65.
[7] Cano D, Monget JM, Aubuisson M, Guillard H, Regas N, Wald L. A method for [20] Simmons A, Uppala S, Dee D, Kobayashi S. ERA-Interim: new ECMWF rean-
the determination of global solar radiation from meteorological satellite data. alysis products from 1989 onwards. ECMWF Newsletter No 11; 2007.
Sol Energy 1986;37:31e9. [21] Benghanem M, Mellit A, Alamri SN. ANN-based modelling and estimation of
[8] Rigollier C. The method Heliosat-2 for deriving shortwave solar radiation from daily global solar radiation data: a case study. Energy Convers Manag
satellite images. Sol Energy 2004;77(2):159e69. 2009;50(7):1644e55.
A. Linares-Rodriguez et al. / Energy 61 (2013) 636e645 645

[22] Lu N, Qin J, Yang K, Sun J. A simple and efcient algorithm to estimate daily global [46] Dorvlo ASS, Jervase JA, Al-Lawati A. Solar radiation estimation using articial
solar radiation from geostationary satellite data. Energy 2011;36(5):3179e88. neural networks. Appl Energy 2002;71:307e19.
[23] Reddy K. Solar resource estimation using articial neural networks and [47] Lopez G, Batlles FJ, Tovar-Pescador J. Selection of input parameters to model
comparison with other correlation models. Energy Convers Manag direct solar irradiance by using articial neural networks. Energy 2005;30(9):
2003;44(15):2519e30. 1675e84.
[24] Tymvios F, Jacovides C, Michaelides S, Scouteli C. Comparative study of Ang- [48] Behrang MA, Assareh E, Ghanbarzadeh A, Noghrehabadi AR. The potential of
strms and articial neural networks methodologies in estimating global different articial neural network (ANN) techniques in daily global solar ra-
solar radiation. Sol Energy 2005;78(6):752e62. diation modeling based on meteorological data. Sol Energy 2010;84(8):1468e
[25] Elminir H, Azzam Y, Younes F. Prediction of hourly and daily diffuse fraction 80.
using neural network, as compared to linear regression models. Energy [49] Koca A, Oztop HF, Varol Y, Koca GO. Estimation of solar radiation using arti-
2007;32(8):1513e23. cial neural networks with different input parameters for Mediterranean
[26] Jiang Y. Computation of monthly mean daily global solar radiation in China region of Anatolia in Turkey. Expert Syst Appl 2011;38(7):8756e62.
using articial neural networks and comparison with other empirical models. [50] Rigollier C, Bauer O, Wald L. On the clear sky model of the ESRA European
Energy 2009;34(9):1276e83. solar radiation atlas with respect to the Heliosat method. Sol Energy
[27] Kalogirou SA. Applications of articial neural networks for energy systems. 2000;68(1):33e48.
Appl Energy 2000;67:17e35. [51] Batlles F, Bosch J, Tovar-Pescador J, Martinez Durban M, Ortega R, Miralles I.
[28] Kalogirou SA. Articial neural networks in renewable energy systems appli- Determination of atmospheric parameters to estimate global radiation in
cations: a review. Renew Sustain Energy Rev 2001;5:373e401. areas of complex topography: generation of global irradiation map. Energy
[29] Mellit A, Kalogirou S. Articial intelligence techniques for photovoltaic ap- Convers Manag 2008;49(2):336e45.
plications: a review. Prog Energy Combust Sci 2008;34(5):574e632. [52] Ineichen P. Comparison of eight clear sky broadband models against 16 in-
[30] Senkal O. Modeling of solar radiation using remote sensing and articial dependent data banks. Sol Energy 2006;80(4):468e78.
neural network in Turkey. Energy 2010;35(12):4795e801. [53] Bosch J, Lopez G, Batlles F. Daily solar irradiation estimation over a moun-
[31] Zarzalejo L. Articial intelligence techniques applied to hourly global irradiance tainous area using articial neural networks. Renew Energy 2008;33(7):
estimation from satellite-derived cloud index. Energy 2005;30(9):1685e97. 1622e8.
[32] Polo J, Zarzalejo L, Ramirez L, Espinar B. Iterative ltering of ground data for [54] Yacef R, Benghanem M, Mellit A. Prediction of daily global solar irradiation
qualifying statistical models for solar irradiance estimation from satellite data. data using Bayesian neural network: a comparative study. Renew Energy
Sol Energy 2006;80(3):240e7. 2012;48(0):146e54.
[33] Iqbal M. An introduction to solar radiation. Toronto; New York: Academic [55] Calcagno V, de Mazancourt C. glmulti: an R package for easy automated model
Press; 1983. selection with (generalized) linear models. J Stat Softw 2010;34(12).
[34] MSG level 1.5 image data format description. EUM/MSG/ICD/105; 2005 (3). [56] Akaike H. A new look at the statistical model identication. Autom Control
[35] Effective radiance and brightness temperature relation for Meteosat-8 and 9. IEEE Trans 1974;19(6):716e23.
EUM/OPS-MSG/TEN/08/0024http://www.eumetsat.int; 2008. [57] Goldberg DE. Genetic algorithms in search, optimization and machine
[36] A planned change to the MSG level 1.5 image product radiance denition, learning. Addison-Wesley Longman Publishing Co., Inc.; 1989.
v1A. EUM/OPS-MSG/TEN/06/0519; 2007. [58] Trevino V, Falciani F. GALGO: an R package for multivariate variable selection
[37] Radiometric calibration of MSG SEVIRI level 1.5 image data in equivalent using genetic algorithms. Bioinformatics 2006;22(9):1154e6.
spectral blackbody radiance. EUM/OPS-MSG/TEN/03/0064; 2007. [59] Sharkey AJC. Combining articial neural nets: ensemble and modular multi-
[38] Schmetz J, Pili P, Tjemkes S, Just D, Kerkmann J, Rota S, et al. An introduction to net systems. New York: Springer-Verlag; 1999.
Meteosat second generation (MSG). Bull Am Meteorol Soc 2002;83(7):977e92. [60] Estvez J, Gaviln P, Girldez JV. Guidelines on validation procedures for
[39] https://eoportal.eumetsat.int. [accessed 12.07.13]. meteorological data from automatic weather stations. J Hydrol 2011;402(1e
[40] http://code.google.com/p/nctoolbox/. [accessed 12.07.13]. 2):144e54.
[41] Bishop CM. Neural networks for pattern recognition. Oxford University Press, [61] Mefti A, Adane A, Bouroubi MY. Satellite approach based on cloud cover
Inc.; 1995. classication: estimation of hourly global solar radiation from Meteosat im-
[42] The MathWorks, Inc., 3 Apple Hill Drive, Natick, MA 01760-02098, USA. ages. Energy Convers Manag 2008;49(4):652e9.
[43] Hagan MT, Demuth HB, Beale MH. Neural network design. Boston, MA: PWS [62] Roerink GJ, Bojanowski JS, de Wit AJW, Eerens H, Supit I, Leo O, et al. Evalu-
Publishing; 1996. ation of MSG-derived global radiation estimates for application in a regional
[44] MacKay DJC. Bayesian interpolation. Neural Comput 1992;4(3):415e47. crop model. Agric For Meteorol 2012;160:36e47.
[45] Foresee FD, Hagan MT. Gauss-Newton approximation to Bayesian regulari- [63] Bosch JL, Batlles FJ, Zarzalejo LF, Lpez G. Solar resources estimation
zation. In: Proceedings of the 1997 international joint conference on neural combining digital terrain models and satellite images techniques. Renew
networks 1997. p. 1930e5. Energy 2010;35(12):2853e61.

Vous aimerez peut-être aussi