
GUIDELINES: ICIPE INNOVATIVE SEED GRANTS FINAL REPORT

1. Project Title
Towards the development of a multipurpose classifier modeling tool based on Random Forest for
successful intervention of icipe's R&D activities (short title: Random Forest innovations project)
2. icipe Project Number
B5125A-997
3. Reporting Period
March 2014 to February 2015.
4. Project Coordinator and Project Scientists
Dr. Tobias Landmann; Dr. Henry Tonnang, Richard Kyalo (all icipe)
5. Collaborating Institutions and Staff including NARS and other Partners
For the land use mapping part of this project there are no official external partners.
6. Project Description
This project aims to develop the Random Forest (RF) machine learning classification algorithm as
a tool to improve icipe's research and working agenda. Once established in a programming
environment, the RF tool will help sort existing point data on insect (pest)
occurrence into functional categories to support IPM interventions, and will also classify reflectance
data from satellite imagery to produce better land use and land cover maps.
The project addresses the following four main aspects and research questions:

Can we build institutional memory in icipe by integrating and storing R&D datasets to
assess our results and impacts, data distribution patterns for IPM activities and integrative
land use mapping?
Can RF, once established and ready to use in icipe, be used to effectively bundle and better
document icipe's research activities?
Produce improved (semi-automatic) land use change maps and market the RF-based mapping
tool as a service of icipe.
Establish a foundation for developing multipurpose software for knowledge-based planning
and management of icipe's current and future activities.

7. Major Research Findings

One of the specific objectives of the project was to produce improved (semi-automatic) agricultural
land use maps and, similarly, to improve modeling and predictions in IPM. The results were encouraging
and are summed up as follows:

Using the RF algorithm we developed, crops could be better distinguished from one another,
and croplands could be better separated from other vegetation and land cover classes.
We used multi-temporal 30-meter Landsat imagery for the Machakos site as a test case. The
enhanced capability of RF to also map inconspicuous land cover classes within highly
fragmented landscapes gave a better result than the traditional classifiers embedded in the most
widely available and used software packages.
The Icipe Classifier+ we developed will allow icipe staff and users from other institutions to
visualize imagery, perform the RF classification and view the mapping results. Below is a
screenshot of the user-friendly interface, which was developed in Java.

Figure 1: Random forest tool interface

Because not all the satellite data acquisitions are available yet, the RF classification tool was
also tested on aircraft-based hyperspectral data over Mwingi. The aim was to explore the
possibility of improving the mapping of flowers in the landscape, since flowers are very
challenging to detect with traditional, less sophisticated classification tools. The results
(see Figure 3) show that flowers can be mapped more accurately using the RF tool
developed here. Against reference data from field observations, the new tool attained an
overall accuracy of 99%, whereas the conventional maximum likelihood classifier attained 95%.
Using Random Forest, IPM interventions can be targeted more precisely by classifying the
available IPM packages and determining which package best fits given conditions.
The IPM module has already been tested with cotton data, which gave a low accuracy.
This was a result of the small size of the available training data: machine learning
algorithms generally need a large training dataset to generate good results, and
the cotton data had only 250 entries, which is not enough for Random Forest.
The cotton dataset is currently being expanded, and the module will be re-tested for improved
accuracy once the expansion is complete. Figure 2 shows the interface of the Random Forest
tool for IPM.

Figure 2: Random forest tool for IPM

[Figure 3 panels: Mwingi hyperspectral image; Random forest classification, overall accuracy 99%; Maximum likelihood (traditional classifier) classification, overall accuracy 95%. Legend classes: yellow flowers, white flowers, green trees, bare soil, grassland, acacia plants.]

Figure 3: Comparing various classification outputs
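
The overall accuracy figures quoted for the two classifiers are the share of reference samples whose mapped class matches the field observation. The sketch below shows one way such a figure could be computed. The function names and the eight labels are hypothetical and chosen for illustration only; they are not the Mwingi reference data or the code of the Icipe Classifier+.

```python
from collections import Counter


def confusion_matrix(reference, predicted):
    """Count how often each (reference class, predicted class) pair occurs."""
    if len(reference) != len(predicted):
        raise ValueError("reference and predicted must have the same length")
    return Counter(zip(reference, predicted))


def overall_accuracy(matrix):
    """Share of samples on the diagonal of the confusion matrix."""
    total = sum(matrix.values())
    if total == 0:
        raise ValueError("no samples to evaluate")
    correct = sum(n for (ref, pred), n in matrix.items() if ref == pred)
    return correct / total


# Hypothetical labels, for illustration only (not the Mwingi data).
reference = ["yellow", "yellow", "white", "white", "grass", "soil", "acacia", "grass"]
predicted = ["yellow", "white", "white", "white", "grass", "soil", "acacia", "grass"]

matrix = confusion_matrix(reference, predicted)
accuracy = overall_accuracy(matrix)
print(f"Overall accuracy: {accuracy:.1%}")
```

Keeping the full confusion matrix rather than only the final percentage also shows which classes are confused with one another, which matters for classes that are hard to separate, such as flowers.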

8. Assessment of Research Findings

Concerning the land use classification part of this project, encouraging results were attained
(see the results above). However, because not all the data are available yet, the classification could
only be tested on bi-temporal Landsat (spaceborne) imagery and one airborne hyperspectral image. As
such, no multi-temporal mapping results are currently available. The project started late
because the HR recruitment process for the R-programmer
position took longer than expected. Because of this delay we were not able to collect
accurate reference data for the first cropping season, which ran from November 2013 to
February 2014. Other challenges were excessive cloud cover during the last cropping period,
and thus the non-availability of satellite imagery, and the non-linear patterns across the
landscape (complexity in the form of small and fragmented fields, often hidden under trees).
The latter remains the key drawback to be addressed for the further success of this project.
We conducted several field campaigns between November 2014 and February 2015 in
Machakos and Bomet to score MLND severity and incidence. These campaigns also included identifying
the land use classes surrounding the maize farms, which could give a clue to
possible alternative hosts for the disease and help link the disease to the cropping system.
The Random Forest application for IPM classification was tested with the cotton data, and the results
confirmed that a machine learning tool needs more training data
to be accurate. However, splitting the data into training and testing sets is
important: when the same data are used for both training and testing, the model will in many
cases fit the training data almost perfectly (as was found here) when the data are
considered in totality.
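
The effect of testing on the training data can be illustrated with a minimal sketch. It does not use the project's cotton data or the Icipe Classifier+ code. The 250 synthetic samples, the 30% label noise, the 70/30 split and the stand-in 1-nearest-neighbour classifier are all assumptions made for illustration; the classifier was chosen because, like a fully grown decision tree, it memorises its training samples.

```python
import random


def nearest_neighbor_predict(train, point):
    """Return the label of the training sample closest to `point` (1-NN)."""
    def squared_distance(sample):
        features, _ = sample
        return sum((a - b) ** 2 for a, b in zip(features, point))

    _, label = min(train, key=squared_distance)
    return label


def accuracy(train, samples):
    """Share of `samples` whose 1-NN prediction matches their label."""
    hits = sum(nearest_neighbor_predict(train, f) == y for f, y in samples)
    return hits / len(samples)


# Synthetic data, for illustration only: 250 samples with two features each.
rng = random.Random(42)
data = []
for _ in range(250):
    features = (rng.random(), rng.random())
    label = int(features[0] + features[1] > 1.0)
    if rng.random() < 0.3:  # assumed 30% label noise
        label = 1 - label
    data.append((features, label))

# Hold out 30% of the samples; the classifier never sees them during training.
rng.shuffle(data)
split = len(data) * 7 // 10
train, test = data[:split], data[split:]

train_accuracy = accuracy(train, train)  # evaluated on the data it memorised
test_accuracy = accuracy(train, test)    # evaluated on unseen data
print(f"Accuracy on training data: {train_accuracy:.1%}")
print(f"Accuracy on held-out data: {test_accuracy:.1%}")
```

Accuracy on the training samples is perfect by construction, because each sample is its own nearest neighbour. Only the held-out figure says anything about how the model would perform on new data.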
The tool is also helpful in the ongoing research on maize lethal necrosis disease (MLND)
management, where we have used it to identify the important indices among the many indices we
have generated. We are currently using it to develop time-series-based cropping season
maps for the study area, which will enable us to link MLND severity with spatial
and temporal cropping systems. This reflects the importance of the Random Forest tool
to icipe's research activities at large.
A detailed user manual was written and integrated into the Icipe Classifier+
tool to guide its users.

Figure 4: Random forest tool user manual

The research findings are encouraging. We believe that the newly developed Icipe
Classifier+ tool will help icipe improve the classification of vast and complex data
sets from all the 4 H-Programs. Implementing the RF tool in all research areas
within icipe would lead to more effective and better ecological predictions, such as habitat
suitability, invasive species, and pest and disease distribution in IPM. Moreover, the icipe
Vision 2020 can be supported by the tool developed in this project, in that geo-spatial
mapping of climate change effects, as well as risk mapping for concrete interventions in
disease control, can be greatly improved.
9. Knowledge Transfer

Once the Icipe Classifier+ is fully functional, it can be promoted as a key innovation from
icipe and marketed as such. icipe can then position itself as an incubator for new innovations in
the region. We specifically allocated funds to marketing and training to facilitate
this. Moreover, we will provide outreach products such as improved land cover maps to the
remote sensing and IPM communities. Specifically, we will train remote sensing experts at the
Regional Centre for Mapping (RCMRD) in Kasarani, Nairobi, in the RF mapping methodology
and market it to them. We will also facilitate a training workshop through the GEOS (Group of
Earth Observation Systems) land cover expert group for Africa, of which the EO unit is a
board member.
We had a great opportunity to showcase the application of the tool at a workshop held in
Luxembourg, where we presented a poster on flower mapping with hyperspectral data using
Random Forest. In addition, we strongly believe that the tool will help the PhD and
Masters students undertaking their research under the ARPPIS and DRIP programmes.
10. Training

Once the Icipe Classifier+ tool is fully set up, the icipe Earth Observation unit will organize a
training session to show icipe staff how to use it for classification in both land
use change mapping and IPM. The training is scheduled for January 2016, together
with the closing workshop of the MLND GIZ small grants project.
11. Lessons Learned

Remote-sensing based cover mapping has a long history of use in natural resource
management for a wide range of applications. In order to be effectively employed, remote-sensing
based cover maps must be accurate and meet the spatial scale inherent to the
phenomena of interest. Moreover, operationalizing extraction of reference data to substitute
manual field data collection will be a great milestone for the project.
Through this project it became clear that, despite the availability of a large body of
field-collected datasets ready for analysis, applying advanced and
sophisticated classification methods remains a challenge. This suggests that the
Random Forest tool lays a foundation for developing and integrating an
advanced, independent data analysis tool for all icipe research projects, with
potential for upscaling to every area of icipe where data modelling is needed. There is also a great need
to integrate radar data into crop mapping, since cloud cover is a major problem and radar data
are independent of cloud cover. This should lead to more accurate classification and
fuller use of the Random Forest classification capabilities.
12. Future Research Needs

Future research should assess this Random Forest-based supervised remote-sensing
classification approach in other study areas and ecosystems, and assess the role of
varied imagery and geographic data sources and of variable selection procedures in mapping
accuracy.
The significance of this work lies not only in generating accurately classified maps, but also
in detecting the seasonal dynamics of LULC practices in a complex landscape.
Furthermore, the reproducibility of the developed methodology will help extend the research
to different time periods and newer sensors when investigating the patterns and dynamics
of land use and land cover.
13. Summary

Random Forest has been established as very efficient for precise land cover mapping across
complex and heterogeneous landscapes and for ecological predictions. It is comparatively
insensitive to noise, making it suitable for application in complex and dynamic land cover
environments. As Random Forest does not require normally distributed training
data, it is appropriate for areas where the distributions of species in ecological
communities follow non-linear patterns across the landscape and where complex terrain
affects data normality, which is the case in Africa as a result of excessive cloud cover and
heterogeneous landscapes.
Other benefits of Random Forest include its relative insensitivity to outliers, which is an added
advantage compared to other methods in land use mapping and IPM modelling.
Furthermore, the Random Forest classifier runs efficiently on large datasets, making it
suitable for regional-scale mapping.

RF's flexibility and non-parametric approach, with its built-in unbiased accuracy control and
ability to impute missing values, make it particularly useful in ecological prediction. Thus the
tool will be of great use in icipe for enhanced classification accuracy and will help to
determine precisely the best IPM package for interventions against fruit flies, thrips and
leafmining flies, and to model the spatial distribution of pests and diseases for various
projects, including the new Citrus project among others.
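
The built-in accuracy control mentioned above is commonly understood as Random Forest's out-of-bag (OOB) estimate: each tree is trained on a bootstrap sample, and the samples left out of that draw serve as test data for that tree. The sketch below illustrates only this sampling step, not the tree training. The sample count of 250, the tree count of 100 and the helper name `bootstrap_split` are illustrative assumptions and are not taken from the Icipe Classifier+ code.

```python
import random


def bootstrap_split(n_samples, rng):
    """Draw n_samples row indices with replacement; return (in_bag, out_of_bag)."""
    in_bag = [rng.randrange(n_samples) for _ in range(n_samples)]
    drawn = set(in_bag)
    out_of_bag = [i for i in range(n_samples) if i not in drawn]
    return in_bag, out_of_bag


rng = random.Random(0)
n_samples = 250  # assumed dataset size, for illustration only
n_trees = 100    # assumed number of trees in the forest

oob_fractions = []
for _ in range(n_trees):
    in_bag, out_of_bag = bootstrap_split(n_samples, rng)
    oob_fractions.append(len(out_of_bag) / n_samples)

mean_oob_fraction = sum(oob_fractions) / n_trees
# Probability that a given row is never drawn in n_samples draws.
expected_fraction = (1 - 1 / n_samples) ** n_samples

print(f"Mean out-of-bag share per tree: {mean_oob_fraction:.3f}")
print(f"Theoretical share:              {expected_fraction:.3f}")
```

Roughly a third of the rows are left out of each tree's bootstrap sample, so every row can be scored by the trees that never saw it without setting aside a separate test set.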

14. Publications, Papers and Reports

Between 14 and 16 April 2015 we presented a poster at the 9th EARSeL spectroscopy
conference, held in Luxembourg. The title was "Random forest classification for flower
mapping using hyperspectral data". Judging from the feedback at the conference, Random Forest stands
out as a powerful non-parametric machine learning method for use with hyperspectral data, as
in the case of flower mapping. In addition, a publication on the same topic is in the final writing
stage.
