2009 Second International Conference on Computer and Electrical Engineering
Predicting NDUM Student’s Academic Performance Using Data Mining Techniques

Muslihah Wook, Yuhanim Hani Yahaya
Department of Computer Science, Faculty of Science and Defence Technology
National Defence University of Malaysia, 57000 Kuala Lumpur, Malaysia
muslihah@upnm.edu.my, yuhanim@upnm.edu.my

Norshahriah Wahab, Mohd Rizal Mohd Isa
shahriah@upnm.edu.my, rizal@upnm.edu.my

Nor Fatimah Awang, Hoo Yann Seong
norfatimah@upnm.edu.my, yannseong@upnm.edu.my

978-0-7695-3925-6/09 $26.00 © 2009 IEEE. DOI 10.1109/ICCEE.2009.168

Abstract - The ability to predict students’ academic performance is very important in educational institutions. Recently, some researchers have proposed data mining techniques for higher education. In this paper, we compare two data mining techniques, an Artificial Neural Network (ANN) and a combination of clustering and decision tree classification, for predicting and classifying students’ academic performance. The data set used in this research is the student data of the Computer Science Department, Faculty of Science and Defence Technology, National Defence University of Malaysia (NDUM).

Keywords - data mining, clustering, decision tree, artificial neural network.

I. INTRODUCTION
Data mining techniques have been applied in many domains, such as banking, fraud detection and telecommunications [1]. Recently, data mining methodologies have been used to enhance and evaluate higher education tasks. Some researchers have proposed methods and architectures using data mining for higher education [2],[3],[4],[5]. The aim of this research is to identify the attributes that influence the performance of undergraduate students after their first-year degree examinations. The data set used in this research is the student data of the Computer Science Department, Faculty of Science and Defence Technology, National Defence University of Malaysia (NDUM). We will compare two data mining techniques: an Artificial Neural Network (ANN) and a combination of clustering and decision tree classification.
The ANN technique is chosen for this research based on the study in [6], which compared three models: ANN, decision tree and linear regression. Students’ demographic profiles and the CGPA for the first of the undergraduate studies were used as the predictor variables for the students’ academic performance. The comparison showed that the ANN was able to predict students’ academic performance at UiTM, Shah Alam accurately.
One of the main goals in applying data clustering methods is to group students into clusters with dissimilar behavior: students in the same cluster have the most similar behavior, and students in different clusters have the most different behavior [7]. The decision tree classification technique was chosen as suggested in [8]: classification is the modeling function of choice, since it can be used to find the relationship between a specific variable, the target variable, and other variables. By combining these two techniques, we will apply a two-phase data mining method in
such a way that the result of clustering is the input to the decision tree classification.
This paper is organized as follows: Section 2 briefly describes the problem statement of this research, Section 3 describes the background, Section 4 details the methodology, and finally the conclusions and further research are outlined.

II. PROBLEM STATEMENT
Undergraduate students’ performance is a long-standing issue in higher education and has been the subject of a great deal of research over the past 75 years [9]. At the end of each semester, students’ results are analyzed in order to evaluate their academic performance. At NDUM, the Academic Affair is responsible for managing the examinations and the students’ results. It has been observed that most students’ performance is not encouraging: only a small number of students obtain a high Grade Point Average (GPA). The analyzed results show that students are apparently weak in certain groups of courses, which contributes to a poor GPA. As a result, the tendency of students to churn or quit the university is high. This situation could give a bad image to NDUM in particular and to the Defence Ministry in general.

III. BACKGROUND
A. Data Mining
Gartner Group defines data mining as “the process of discovering meaningful new correlations, patterns, and trends by sifting through large amounts of data stored in repositories and by using pattern recognition technologies as well as statistical and mathematical techniques.” Data mining does not intend to replace traditional statistics. Rather, data mining is an extension of statistics, and statistics is an integral component of data mining [10],[11]. Data mining is a combination of machine learning, statistical analysis, modeling techniques and database technology. Thus, data mining is able to find patterns and subtle relationships in data and to infer rules that allow the prediction of future results. Meanwhile, according to [12], “data mining is the process of automatically extracting useful information and relationships from immense quantities of data. In its purest form, data mining doesn't involve looking for specific information. Rather than starting from a question or a hypothesis, data mining simply finds patterns that are already present in the data.” The author of [13] states that “these patterns are then built into data mining models and used to predict individual behavior with high accuracy. For example, data mining may give an institution the information necessary to take action before a student drops out, or to efficiently allocate resources with an accurate estimate of how many students will take a particular course.”

B. Data Mining in Higher Education
Universities are institutions that hold large amounts of data, such as yearly student enrolment, academic performance and alumni records. Usually, past data go unused because institutions do not realize what hidden value lies behind the data, how to use the data, or why the data matter for future use. Therefore, these institutions require a significant amount of knowledge mined from their past and current data sets using special methods and processes [14]. Since data mining was introduced, its techniques have been increasingly applied in many areas, such as business, telecommunications and banking, as well as education.
In the educational area, data mining has been defined as “the process of converting raw data from educational systems to useful information that can be used to inform design decisions and answer research questions” [15]. According to [16], data mining is an analytic approach that “capitalizes on the advances of technology and the extreme richness of data in higher education for improving research and decision making through uncovering hidden trends and patterns that lend them to predicative modeling using a combination of explicit knowledge base, sophisticated analytical skills and academic domain knowledge”.

C. Students’ Academic Performance
The understanding, prediction and prevention of academic failure among students have long been debated in every higher education institution. The study in [17] attempted to classify students into three groups: the 'low-risk' students, with a high probability of succeeding; the 'medium-risk' students, who may succeed thanks to the measures taken by the university; and the 'high-risk' students, with a high probability of failing (or dropping out). As a consequence, educators are able to give more attention to the ‘high-risk’ students, for example through extra classes, tests and tutorials. At the same time, this process facilitates drawing up students’ profiles based on their academic performance and failure risk.
Another study was conducted in [2]. The authors developed a model that allows decision makers to better predict which students are less likely to perform well in a specific course, or less likely to be successful in it.
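
The three-way grouping reported in [17] can be pictured with a small sketch. It assumes that some model has already produced a probability of success for each student. The function name `risk_group`, the cut-off values 0.7 and 0.4, and the sample probabilities are hypothetical, chosen only for illustration, and this is not necessarily the procedure used in [17].

```python
def risk_group(p_success, low_cut=0.7, high_cut=0.4):
    """Map a predicted probability of success to one of three risk groups.

    The cut-offs are illustrative assumptions, not values taken from [17].
    """
    if not 0.0 <= p_success <= 1.0:
        raise ValueError("p_success must lie in [0, 1]")
    if p_success >= low_cut:
        return "low-risk"
    if p_success >= high_cut:
        return "medium-risk"
    return "high-risk"


# Hypothetical predicted probabilities for three students.
predictions = {"student_a": 0.85, "student_b": 0.55, "student_c": 0.20}
groups = {name: risk_group(p) for name, p in predictions.items()}
```

Students mapped to 'high-risk' would be the first candidates for the extra classes and tutorials mentioned above.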

IV. METHODOLOGY
In the data mining literature, various "general frameworks" have been proposed to serve as blueprints for how to organize the process of gathering data, analyzing data, disseminating results, implementing results, and monitoring improvements. One such model, CRISP-DM (Cross-Industry Standard Process for Data Mining), was proposed in the mid-1990s by a European consortium of companies to serve as a non-proprietary standard process model for data mining. The CRISP-DM methodology consists mainly of six steps: understanding the higher education objective, collecting the educational data, preparing the data, building the models, evaluating the model using one of the evaluation methods, and finally deployment, which uses the model for future prediction of student performance. Figure 1 shows the research framework for this study.

Figure 1. Research Framework

A. Project Understanding
The initial step is understanding the project domain, mainly with regard to students’ academic performance. This area of study is very complex and requires continuous attention. Exam failure among NDUM students must be investigated, predicted and prevented in order to produce high-quality graduates from this university. Our main objective is to choose the best technique to serve as a model for predicting students’ performance based on their academic results. The model should be able to classify students into groups of successful and unsuccessful students. Therefore, the knowledge that can be extracted from this process is the patterns of previously successful and unsuccessful students. By identifying these students, we are able to decide which types of students are more successful than others and provide academic help for those who are less likely to be successful.

B. Data Collection
The data used for this research is the student data of the Computer Science Department, Faculty of Science and Defence Technology, National Defence University of Malaysia (NDUM). This research will focus on 85 students from the Semester I 2008/2009 intake. We use primary data in order to complement the secondary data of the students. The primary data are the relevant features of each student, which must be collected using a questionnaire. The following is a partial list of the groups of features (fields) selected for this study:
• Demographics: age, gender, religion, race, secondary school, home town etc.
• Education background: mode of entry (SPM/STPM/Matriculation), previous qualification results, MUET score, computer skill, name and number of courses taken, total credits taken, major, number of course repetitions etc.
• Personality: motivation of study, reading level, learning environment and style, interest etc.
The secondary data are the details of students’ previous results, such as CPA, CGPA and Grade Points by course type, obtained from the Academic Affair, NDUM.

C. Data Preparation
During data collection, the relevant data are gathered and the quality of the data must be verified. Usually, the assembled data contain missing or incomplete attributes, noise (errors, or outlier values that deviate from the expected) and inconsistencies. Therefore, the collected data must be cleaned and transformed before they can be used in a data mining system, since data mining needs clean data in order to produce quality results. Data cleaning involves several processes, such as filling in missing values, smoothing noisy data, identifying or removing outliers, and resolving inconsistencies. The cleaned data are then transformed into a table that is suitable for the data mining model. The cleaned data will be divided into two parts: training (learning) data (60%) and validation data (the rest). The training data are used to develop the model, while the validation data are used to verify the chosen model.

D. Modeling
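
Modeling starts from the prepared table of Section C. As a rough illustration of that preparation, the sketch below fills missing numeric values with the column mean and then makes the 60/40 training/validation split. The field names, the choice of mean imputation, the fixed random seed and the sample records are assumptions made for illustration; the paper does not specify them.

```python
import random


def fill_missing_with_mean(rows, field):
    """Return copies of the rows with None in `field` replaced by the observed mean."""
    observed = [r[field] for r in rows if r[field] is not None]
    mean = sum(observed) / len(observed)
    return [{**r, field: mean if r[field] is None else r[field]} for r in rows]


def split_train_validation(rows, train_fraction=0.6, seed=0):
    """Shuffle a copy of the rows, then cut it into training and validation parts."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Hypothetical records; real ones would come from the questionnaire
# and from the Academic Affair data described in Section B.
students = [
    {"id": 1, "cgpa": 3.0},
    {"id": 2, "cgpa": None},
    {"id": 3, "cgpa": 2.0},
    {"id": 4, "cgpa": 4.0},
    {"id": 5, "cgpa": None},
]

cleaned = fill_missing_with_mean(students, "cgpa")
train, validation = split_train_validation(cleaned)
```

With the 85 students of this study, the same split would give 51 training records and 34 validation records.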
As mentioned earlier, we propose two techniques that are best suited to reaching our main objective,
namely a neural network and a combination of clustering and decision tree techniques. The results obtained from each technique will be compared, and the best technique will be chosen as the model for this research. The two techniques are described as follows:

i. Artificial Neural Network
Neural networks offer a mathematical model that attempts to mimic the human brain [25]. Knowledge is represented as a layered set of interconnected nodes. The input to individual neural network nodes must be numeric and fall in the closed interval from 0 to 1 [25]. Each student attribute must therefore be normalized; for example, age is divided by 100, while the student’s gender and race are encoded as binary inputs. Neural network technologies such as feed-forward networks, illustrated in Figure 2 (often referred to as back-propagation nets), have demonstrated promising capability for prediction [22, 23, 24]. To predict a student’s academic performance, the student’s data, such as demographics, educational background and personality, must be considered and transformed into the required range from 0 to 1. The input data from the input layer are evaluated with the sigmoid function and the resulting values are passed to the hidden layer; finally, the output layer produces the predicted value of the student’s performance, either a successful or an unsuccessful profile.

Figure 2. Feed-Forward Neural Network [25]

ii. Clustering and Decision Tree
Unsupervised clustering can be described as the process of organizing objects in a database into clusters/groups such that objects within the same cluster have a high degree of similarity, while objects belonging to different clusters have a high degree of dissimilarity [19]. For the clustering process we utilized the FarthestFirst method, based on the K-means algorithm. We specified the parameter k, the number of clusters to be sought. For this study, k was 2, corresponding to the two groups of students whose profiles we were interested in building, successful and unsuccessful: those who passed all exams and those who failed one or more exams. Then k points were chosen at random as cluster centers. All instances were assigned to their closest cluster center according to the ordinary Euclidean distance metric. Next, the centroid of the instances in each cluster was calculated, and these centroids were taken to be the new center values for their respective clusters. Finally, the whole process was repeated with the new cluster centers. Iteration continued until the same points were assigned to each cluster in consecutive rounds, at which stage the cluster centers had stabilized and would remain the same [20].
Unfortunately, the cluster model has one drawback: there are no explicit rules that define each cluster. The model obtained by clustering is thus difficult to implement, and there is no clear understanding of how the model assigns cluster IDs or centroid values [21]. Therefore, we propose to employ a decision tree, which may give a simpler model of the classes. A decision tree is a tree-shaped structure that represents sets of decisions. These decisions generate rules for the classification of a dataset. Trees develop arbitrary accuracy and use validation data sets to avoid spurious detail [21]. They are easy to understand and modify. Moreover, the tree representation gives more explicit, easy-to-understand rules for each cluster of student performance. The classes in the decision tree are the cluster IDs obtained in the first step of the method. The decision tree represents the knowledge in the form of IF-THEN rules. One rule can be created for each path from the root to a leaf. The leaf node holds the class prediction [21].

E. Evaluation
Before proceeding to final deployment of the model, it is important to evaluate it. This step is very significant, since the model’s ability to predict students’ academic performance must be proven. Then, a decision on the use of the data mining results should be reached. Moreover, there are major challenges in cultivating institutional best practices for using this model. Therefore, the researchers must maintain and update the model together with the associated student data, since students’ data change every semester and year.

F. Deployment
As the final stage in CRISP-DM, new data sets will be applied to the model selected in the model building stage to generate predictions or estimates of the expected outcome. Hence, the deployment of the neural network or the combined clustering and decision tree model focuses on making information and insights available reliably to the educational
institution. Reporting the students’ predictions will bring many benefits to students as well as to the institution. For example, if a high number of students have already failed in the current semester, the institution should take the necessary action to prevent those students from failing in the next semester, such as holding intensive classes or giving the students extra work and exercises.

V. CONCLUSION
Predicting students’ academic performance is of great concern to higher education. Data mining can now be used in a higher educational system to predict students’ academic performance. This research attempts to use data mining techniques to predict and classify students’ academic performance at NDUM. Two techniques will be compared: an Artificial Neural Network (ANN) and a combination of clustering and decision tree classification. The technique that gives the more accurate prediction and classification will be chosen as the model for this research. Using the proposed model, the patterns that influence or affect students’ academic performance will be identified.

REFERENCES
[1] Han, J., Kamber, M. (2001) “Data Mining: Concepts and Techniques”. Morgan Kaufmann Publishers.
[2] Delavari, N., Beikzadeh, M.R. (2004) “A New Model for Using Data Mining in Higher Educational System”, 5th International Conference on Information Technology based Higher Education and Training: ITEHT ’04, Istanbul, Turkey, 31st May-2nd Jun 2004.
[3] Varapron, P. et al. (2003) “Using Rough Set theory for Automatic Data Analysis”. 29th Congress on Science and Technology of Thailand.
[4] Mierle, K., Laven, K., Roweis, S., Wilson, G. (2005) “Mining Student CVS Repositories for Performance Indicators”.
[5] Delavari, N., Beikzadeh, M.R., Amnuaisuk, S. (2005) “Application of Enhanced Analysis Model for Data Mining Processes in Higher Educational System”, 6th Annual International Conference: ITEHT, July 7-9, 2005, Juan Dolio, Dominican Republic.
[6] Ibrahim, Z. and Rusli, D. (2007) “Predicting Students’ Academic Performance: Comparing Artificial Neural Network, Decision Tree and Linear Regression”, 21st Annual SAS Forum.
[7] Bresfelean, V.P., Bresfelean, M., Ghisoiu, N. (2006) “Continuing education in a future EU member, analysis and correlations using clustering techniques”, Proceedings of EDU'06 International Conference, Tenerife, Spain, pp. 195-200.
[8] Delavari, N. (2005) “Application of Enhanced Analysis Model for Data Mining Processes in Higher Educational System”, IEEE.
[9] Reason, R.D. (2003), “Student Variables That Predict Retention: Recent Research and New Development”, NASPA Journal, pp. 172-191.
[10] Luan, J. (2003) “Developing learner concentric learning outcome typologies using clustering and decision trees of data mining”, Presentation at 43rd AIR Forum, Tampa, FL.
[11] Zhao, C., & Luan, J. (2006). “Data mining: Going beyond traditional statistics”, In J. Luan, & C. M. Zhao, (Eds), Chapter 1 of Data mining in action: Case studies of enrollment management, New Directions for Institutional Research, No. 131. San Francisco: Jossey-Bass.
[12] Rubenking, N. (2001) “Hidden Messages”, PC Magazine.
[13] Luan, J. (2004) “Data Mining Applications in Higher Education”, SPSS Exec. Report. http://www.spss.com/home_page/wp2.htm
[14] Bresfelean V.P. (2009) “Data Mining Applications in Higher Education and Academic Intelligence, Theory and Novel Applications of Machine Learning”, Book edited by: Meng Joo Er and Yi Zhou, ISBN 978-3-902613-55-4, pg. 376, I-Tech, Vienna, Austria.
[15] Heiner, C., Baker, R., Yacef, K. (2006), Preface. In: Workshop on Educational Data Mining at the 8th International Conference on Intelligent Tutoring Systems (ITS 2006), Jhongli, Taiwan.
[16] Luan, J. (2002) “Data mining: Predictive modeling & clustering essentials”, Presentation at the 44th AIR Forum, Toronto, Canada.
[17] Vandamme J.P., Meskens N., Superby J.F. (2007) “Predicting Academic Performance by Data Mining Methods”, Education Economics, Volume 15, Issue 4, pp. 405-419.
[18] Kalles D., Pierrakeas C. (2004) “Analyzing student performance in distance learning with genetic algorithms and decision trees”, Hellenic Open University, Patras, Greece.
[19] San, O.M., Huynh, V.N., Nakamori, Y. (2004) “An Alternative Extension of The K-Means Algorithm for Clustering Categorical Data”, Int. J. Appl. Math. Comput. Sci., Vol. 14, No. 2, pp. 241-247.
[20] Bresfelean, V.P., Bresfelean, M., Ghisoiu, N. (2008), “Determining Student’s Academic Failure Profile Founded on Data Mining Methods”, Proceedings of the ITI 2008 30th Int. Conf. on Information Technology Interfaces, June 23-26, Cavtat, Croatia, pp. 317-322.
[21] Borzemski, L., (2006) “The Use of Data Mining to Predict Web Performance”, Cybernetics and Systems: An International Journal, 37: pp. 587-608.
[22] Lapedes, A. and Farber, R., (1988), "How neural nets work," Evolution, Learning, and Cognition, pages 331-345, World Scientific, Singapore.
[23] Moody, J., (1989), "Fast learning in multi-resolution hierarchies," Advances in Neural Information Processing Systems, volume 1, pages 29-39, Denver, Morgan Kaufmann, San Mateo.
[24] Werbos, PJ., (1990), "Backpropagation through time: What it does and how to do it," Proceedings of the IEEE, volume 78, pp. 1550-1560.
[25] R.J. Roiger and M.W. Geatz, (2003), “Data Mining: A Tutorial-based Primer”. U.S.: Addison-Wesley, pp. 246-250.