
International Journal on Recent and Innovation Trends in Computing and Communication

Volume: 4 Issue: 3

ISSN: 2321-8169
400 - 403


A Survey of Effective Techniques for Subjective Test Assessment

Ankita Patil

Prof. Achamma Thomas

Department of Computer Science Engineering

G.H.Raisoni College of Engineering
Nagpur, India.
Email- ankitapatil999@gmail.com

Department of Master of Computer Application

G.H. Raisoni College of Engineering
Nagpur, India.
Email- achamma.thomas@raisoni.net

Abstract: Subjective tests are rarely used in online examinations. Objective tests are already available online, but subjective tests, which are considered the best way to assess understanding and knowledge, are still needed. This paper presents a survey of effective techniques for subjective test assessment. Here the answers are unstructured data that have to be evaluated. The evaluation is based on the semantic similarity between the model answer and the user answer. Different techniques are compared and a new approach is proposed for evaluating subjective text answers.
Keywords: Subjective test assessment; Online examinations; Semantic Similarity; Evaluation.



Although assessment is a tough job, computerizing it can help. Examinations are normally of two types: objective, such as multiple choice questions (MCQs), and subjective, such as descriptive answers. Most online examinations held today, such as bank exams, GRE, GMAT and AIEEE, are MCQ based: respondents are asked to select the best possible answer (or answers) from a list of choices. When guessing, there is usually a 25 percent chance of getting a four-choice question correct. Finding the right answer from multiple choices can be automated using multiple choice question answering systems. The multiple choice format is most frequently used in educational testing, in market research, and in elections, when a person chooses between multiple candidates, parties, or policies. However, multiple choice tests have several disadvantages. The types of knowledge they can assess are limited: they are best adapted for testing well-defined or lower-order skills, whereas problem-solving and higher-order reasoning skills are better assessed through short-answer and essay tests.
Another disadvantage of multiple choice tests is the examinee's interpretation of the item. Failing to interpret information as the test maker intended can result in an "incorrect" response, even if the student's response is potentially valid. In addition, even if students have some knowledge of a question, they receive no credit for it if they select the wrong answer. Conversely, a student who is incapable of answering a particular question can simply select a random answer and still have a chance of receiving a mark for it. It is common practice for students with no time left to give all remaining questions random answers in the hope of getting at least some of them right. One scoring method counters this by reducing the score by the number of wrong answers divided by the average number of possible answers for all questions.
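The guessing penalty described above can be worked through numerically. The following is only a sketch: the function name `penalized_score` and the assumption of one mark per correct answer are illustrative and do not come from the paper.

```python
def penalized_score(num_correct: int, num_wrong: int, avg_choices: float) -> float:
    """Score with the guessing penalty described in the text.

    Assumes one mark per correct answer (an assumption of this sketch).
    The penalty is the number of wrong answers divided by the average
    number of answer choices per question.
    """
    if avg_choices <= 0:
        raise ValueError("avg_choices must be positive")
    return num_correct - num_wrong / avg_choices


# A student answers 40 four-choice questions: 28 right, 12 wrong.
print(penalized_score(28, 12, 4))  # 28 - 12/4 = 25.0
```

Under this rule a student who guesses on every remaining question gains less, on average, than the raw count of lucky hits would suggest.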

For many teachers, evaluating and scoring answers is a difficult task. Marks are awarded based on the observation, understanding and explanation shown in the answer and on the essential terminology set by the teacher. During major assessments, teachers are overloaded with a large number of answer sheets. As a result, assessment becomes difficult and causes stress, strain and mental [...]
With recent advances in technology, there have been many innovations in natural language processing and information extraction that make the automated scoring of specific categories of free-text questions achievable. The advantages of computerized test scoring include savings in time and cost and reduced inconsistency in marking. Compared with the objective type, the descriptive pattern is more reliable. Students write their own answers, which lets them put across their thoughts, respond to the questions in their own way and produce their own assumptions. This can increase the capabilities and talents of the student. Computerized evaluation of such subjective text may be difficult, but there are many algorithms that can be used to evaluate these answers.


The proposed approach is divided into two parts: the first is keyword extraction and the second is the semantic similarity of words. This paper surveys the different techniques for keyword extraction and semantic similarity and identifies the best-suited methods for the proposed work.
A. Keyword Extraction of words
Keyword extraction is an information retrieval task that automatically identifies the terms that best describe a given document. These terms can be key phrases, key terms or just keywords. In this approach, keywords are extracted. Keywords are easy to define as they are widely used within the information [...]

IJRITCC | March 2016, Available @ http://www.ijritcc.org

Example: Clustering is the process of grouping the data into classes or clusters.
The keywords of the above example can be clustering, process, grouping, data, classes and clusters. Keyword extraction also improves the quality of the document representation: the words that occur in the document are analyzed to find the most representative ones. The techniques below survey keyword extraction from large text documents.
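The clustering example can be reproduced with a very simple baseline. The sketch below is not one of the surveyed techniques: it merely drops stop words, and both the stop-word list and the function name `candidate_keywords` are assumptions made for illustration.

```python
import re

# Small illustrative stop-word list (an assumption; real systems use larger lists).
STOP_WORDS = {"is", "the", "of", "into", "or", "a", "an", "and", "to", "in"}


def candidate_keywords(text: str) -> list[str]:
    """Return lowercase non-stop-words in order of first appearance."""
    seen = set()
    keywords = []
    for token in re.findall(r"[a-z]+", text.lower()):
        if token not in STOP_WORDS and token not in seen:
            seen.add(token)
            keywords.append(token)
    return keywords


sentence = "Clustering is the process of grouping the data into classes or clusters."
print(candidate_keywords(sentence))
# ['clustering', 'process', 'grouping', 'data', 'classes', 'clusters']
```

A baseline like this treats every content word as equally important, which is the limitation that weighting and sequence-labeling methods try to address.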
a. Term Frequency-Inverse Document Frequency (TF-IDF)
TF-IDF is a weighting factor used in information retrieval and text mining. It evaluates how important a word is within a large text corpus. Term Frequency (TF) is the number of times the word appears in the document, and Inverse Document Frequency (IDF) is a weight that measures the importance of the term across the text documents. The weight is generally obtained by multiplying TF by IDF, which filters out common terms. It can be calculated as
$$\mathrm{tfidf}(t, d, D) = \mathrm{tf}(t, d) \times \mathrm{idf}(t, D) \qquad (1)$$
where t is the term, d the document and D the set of documents (corpus).
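Equation (1) can be sketched in a few lines. The paper does not say which IDF variant is meant, so the code below assumes the common form idf(t, D) = log(N / df), with N the number of documents and df the number of documents containing t. The toy corpus is likewise invented for illustration.

```python
import math
from collections import Counter


def tf(term: str, doc: list[str]) -> int:
    """Raw count of `term` in a tokenized document."""
    return Counter(doc)[term]


def idf(term: str, corpus: list[list[str]]) -> float:
    """log(N / df); returns 0.0 if the term occurs in no document."""
    df = sum(1 for doc in corpus if term in doc)
    if df == 0:
        return 0.0
    return math.log(len(corpus) / df)


def tfidf(term: str, doc: list[str], corpus: list[list[str]]) -> float:
    """Equation (1): tf(t, d) * idf(t, D)."""
    return tf(term, doc) * idf(term, corpus)


corpus = [
    "clustering groups the data into clusters".split(),
    "classification assigns the data to classes".split(),
    "the exam has descriptive questions".split(),
]
doc = corpus[0]
scores = {t: tfidf(t, doc, corpus) for t in set(doc)}
for term, score in sorted(scores.items(), key=lambda kv: (-kv[1], kv[0])):
    print(f"{term:12s} {score:.3f}")
```

In this toy corpus "the" occurs in every document and so receives a weight of zero, which is the filtering of common terms that the text describes.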
Menaka S and Radha N [1] classify text using keyword extraction. The keywords are extracted using TF-IDF and WordNet: the TF-IDF algorithm is used to select the words, and WordNet, a lexical database of English, is used to find the similarity among them. In their work, the words with the highest similarity are selected as keywords. Sungjick Lee and Han-joon Kim [2] proposed a conventional TF-IDF model for keyword extraction, which involves cross-domain filtering and table term frequency (TTF). Ari Aulia Hakim, Alva Erwin, Kho I Eng, Maulahikmah Galinium and Wahyu Muliady [3] use the TF-IDF algorithm to create a classifier for online articles. Stephen Robertson [4] explains the theoretical concepts behind IDF.
b. Conditional Random Fields (CRF)
CRF is a probabilistic framework for segmenting and labeling structured data. The basic idea is to define a conditional probability distribution over label sequences given an observation sequence. Jasmeen Kaur and Vishal Gupta [5] present the CRF model as a suitable and efficient model for keyword extraction. Feng Yu, Hong-wei Xuan and De-quan Zheng [6] use the CRF model to extract key phrases and an SVM model to build a classification model. Their experimental results show that the method performs better than other machine learning approaches. Chengzhi Zhang, Huilin Wang, Yao Liu, Dan Wu, Yi Liao and Bo Wang [7] proposed and implemented a CRF model to extract keywords.
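To make the idea of labeling a word sequence concrete, the sketch below finds the best label sequence in a tiny linear-chain model using Viterbi search, which is the decoding step of a CRF. It is not a trained CRF and not the method of the cited papers. The labels (KEY, OTHER), the single stop-word feature and all weights are hand-set, hypothetical values, whereas a real CRF learns its weights from annotated data.

```python
LABELS = ["KEY", "OTHER"]

# Hypothetical hand-set weights: a small bonus for keeping the same label.
TRANSITION = {
    ("KEY", "KEY"): 0.5, ("KEY", "OTHER"): 0.0,
    ("OTHER", "KEY"): 0.0, ("OTHER", "OTHER"): 0.5,
}
STOP_WORDS = {"is", "the", "of", "into", "or"}


def emission(word: str, label: str) -> float:
    """Illustrative feature score: stop words favour OTHER, the rest favour KEY."""
    is_stop = word in STOP_WORDS
    if label == "KEY":
        return -1.0 if is_stop else 1.0
    return 1.0 if is_stop else -1.0


def viterbi(words: list[str]) -> list[str]:
    """Return the highest-scoring label sequence for `words`."""
    if not words:
        return []
    # best[i][y]: best score of any labeling of words[:i + 1] that ends in y
    best = [{y: emission(words[0], y) for y in LABELS}]
    back = []
    for word in words[1:]:
        scores, pointers = {}, {}
        for y in LABELS:
            prev = max(LABELS, key=lambda p: best[-1][p] + TRANSITION[(p, y)])
            scores[y] = best[-1][prev] + TRANSITION[(prev, y)] + emission(word, y)
            pointers[y] = prev
        best.append(scores)
        back.append(pointers)
    # Trace the best path backwards from the best final label.
    label = max(LABELS, key=lambda y: best[-1][y])
    path = [label]
    for pointers in reversed(back):
        label = pointers[label]
        path.append(label)
    return path[::-1]


words = "clustering is the process of grouping".split()
labels = viterbi(words)
print([w for w, l in zip(words, labels) if l == "KEY"])
# ['clustering', 'process', 'grouping']
```

The transition scores are what distinguish sequence labeling from classifying each word independently: the label chosen for one word influences the score of its neighbours.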
c. Query Focused keyword extraction
Keywords are correlated with the query sentences, and query-related features are computed to obtain the important words. The query is calculated by words w1 and w2 of length k words. The method performs query processing and sentence pruning, computes the query-related features and then extracts the keywords. Liang Ma, Tingting He, Fang Li, Zhuomin Gui and Jinguang Chen [8] proposed a strategy for query-focused multi-document summarization that extracts keywords. Massih R. Amini and Nicolas Usunier [9] present an approach that expands the keywords with their respective cluster terms; each sentence is characterized by features and its similarity to the current sentence is compared. Claudio Carpineto and Giovanni Romano [10] discuss automatic keyword expansion for information retrieval.
B. Semantic Similarity of words
Semantic similarity is a measure defined over a set of documents or terms, where the distance between two terms is based on their meaning. Two terms are semantically similar if their meanings are close, or if the concepts or objects they represent share common attributes.
Example 1: Google and Microsoft.
In the above example, Google and Microsoft are similar because both are software companies. Semantic similarity is often confused with semantic relatedness, which covers any relation between two terms.
Example 2: Apple and selling.
These two terms are not semantically similar, because Apple is a fruit and selling is an activity; they are only related, in that an apple can be sold. Semantic similarity is, however, harder to model than semantic relatedness [11]. To evaluate the semantic similarity between two terms, different techniques that work on the extracted words are discussed below.
a. Latent Semantic Analysis (LSA)
LSA is based on singular value decomposition, a mathematical matrix decomposition technique closely related to factor analysis, which is applicable to text corpora. LSA produces word-word, word-passage and passage-passage relations. It represents any set of words, such as a sentence, paragraph or essay, whether taken from the original corpus or new, as a point in a very high dimensional semantic space. LSA uses a term-document matrix that describes the occurrence of terms in documents.
Example: X is the matrix where element (i, j) describes the occurrence of term i in document j.
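The matrix X can be built directly from tokenized documents. The sketch below covers only this first step; the singular value decomposition that LSA then applies to X is omitted. The toy documents and the function name `term_document_matrix` are assumptions made for illustration.

```python
def term_document_matrix(docs: list[str]) -> tuple[list[str], list[list[int]]]:
    """Return (terms, X) where X[i][j] counts term i in document j."""
    tokenized = [doc.lower().split() for doc in docs]
    terms = sorted({token for doc in tokenized for token in doc})
    matrix = [[doc.count(term) for doc in tokenized] for term in terms]
    return terms, matrix


docs = ["data clustering groups data", "clusters of data", "exam answers"]
terms, X = term_document_matrix(docs)
for term, row in zip(terms, X):
    print(f"{term:12s} {row}")
```

Each row is a term and each column a document, so the row for "data" reads [2, 1, 0] for the three toy documents.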

Fig: Example of LSA[23]

Shaymaa E. Sorour, Kazumasa Goda and Tsunenori Mine [12] identify the hidden meaning of textual information from the occurrence and co-occurrence of text using LSA. Ashwini Deshmukh and Gayatri Hegde [13] present an information retrieval technique called latent semantic indexing. Shuchu Xiong and Yihui Luo [14] apply LSA to multi-document summarization, evaluating sentence subsets on their ability to reproduce terms. Zongli Jiang and Changdong Lu [15] work on the semantic relations between words and documents using LSA. The category attributes of words are shown by the search engine using the latent semantic method, while the useless ones are [...]

For the comparison of the model answer and the user answer, a semantic similarity algorithm is used: LSA, which represents word-word and word-passage relations in matrix form.

b. Ontology Method
An ontology is a data model: a working model of entities and their interactions. It is a set of concepts, objects, relations and other entities, together with the relationships among them. An ontology can be defined in the form:
O = [C, P, RC, RP, A, I]
where C is the set of concepts, P is the set of properties, RC and RP are the relations between concepts and properties, A is the set of axioms and I is the set of instances. V. Senthil Kumaran and A. Sankar [16] proposed an automated system to assess short answers using ontology mapping. The mapping is done between two different ontologies, O1 and O2, to find the similarity between two different concepts. S. Bloehdorn, P. Cimiano, A. Hotho and S. Staab [17] discuss an ontology framework for text mining, which describes different architecture of [...] Mohammad Yasin [18] presented the evaluation of unstructured text using ontology, in which answers are collected and compared with the model answer. Chin Pang Cheng, Gloria T. Lau, Jiayi Pan and Kincho H. Law [19] proposed domain-specific mapping using ontology; to compare similarity, two vector-based similarity algorithms are used to compute the results.
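The six-part definition O = [C, P, RC, RP, A, I] maps naturally onto a small container type. The sketch below is illustrative only: the field names are assumptions, and the `concept_overlap` function uses plain Jaccard overlap of concept names as a crude stand-in for ontology mapping, not the mapping methods of the cited works.

```python
from dataclasses import dataclass, field


@dataclass
class Ontology:
    """O = [C, P, RC, RP, A, I] as a plain container (field names are hypothetical)."""
    concepts: set = field(default_factory=set)            # C
    properties: set = field(default_factory=set)          # P
    concept_relations: set = field(default_factory=set)   # RC
    property_relations: set = field(default_factory=set)  # RP
    axioms: set = field(default_factory=set)              # A
    instances: dict = field(default_factory=dict)         # I: instance -> concept


def concept_overlap(o1: Ontology, o2: Ontology) -> float:
    """Jaccard overlap of concept names between two ontologies O1 and O2."""
    union = o1.concepts | o2.concepts
    if not union:
        return 0.0
    return len(o1.concepts & o2.concepts) / len(union)


model = Ontology(concepts={"clustering", "data", "cluster"})
student = Ontology(concepts={"clustering", "data", "class"})
print(concept_overlap(model, student))  # 2 shared of 4 distinct concepts = 0.5
```

A real mapping would also compare properties and relations, and would match concepts by meaning instead of by exact name.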
c. Context based method
The context based method is used for calculating similarities between words using a large corpus. Yue Wang, Hongsong Li, Haixun Wang and Kenny Q. Zhu [20] present a context based web search method. Their framework classifies web queries into different patterns and the concepts the queries contain, and then produces answers to these queries using a knowledge base. Vesile Evrim and Dennis McLeod [21] proposed a context based approach that finds relevant information on the web. The information provided in the documents is measured semantically, and an information retrieval method examines the context and returns the relevant results. Eneko Agirre, Enrique Alfonseca, Keith Hall, Jana Kravalova, Marius Pasca and Aitor Soroa [22] describe context based approaches, from different aspects, for calculating the similarity of words.
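One common way to compute context based word similarity is to compare the words that occur around each target word. The sketch below assumes a simple co-occurrence window and cosine similarity. The window size, the toy corpus and the function names are illustrative choices and are not taken from the cited papers.

```python
import math
from collections import Counter, defaultdict


def context_vectors(sentences: list[str], window: int = 2) -> dict:
    """Map each word to a Counter of the words seen within `window` positions."""
    vectors = defaultdict(Counter)
    for sentence in sentences:
        tokens = sentence.lower().split()
        for i, word in enumerate(tokens):
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    vectors[word][tokens[j]] += 1
    return vectors


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


corpus = [
    "students write the exam today",
    "students write the test today",
    "teachers grade the paper slowly",
]
vectors = context_vectors(corpus)
print(round(cosine(vectors["exam"], vectors["test"]), 3))
print(round(cosine(vectors["exam"], vectors["paper"]), 3))
```

Here "exam" and "test" appear in identical contexts and score higher than "exam" and "paper", which share only one context word.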



The subjective assessment of text is divided into two parts. The first part is the extraction of keywords from the model answer and the student answer, and the second part is the comparison of the two answers. The inputs are the answers, which are preprocessed and passed to the keyword extraction algorithm. The CRF model is best suited for keyword extraction; it selects the keywords by labeling the word sequence. For the comparison part, the semantic similarity of words is used: the keywords of the model answer and the student answer are compared to measure their semantic similarity. To evaluate the correctness of the answer, a semantic similarity algorithm is used.

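Fig: The flowchart of proposed work. Stages: keyword extraction algorithm; extracted keywords; comparison of model answer and student answer using semantic similarity algorithm; output (score).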
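The two-stage flow, keyword extraction followed by comparison and a score, can be sketched end to end. The stages below are deliberately simple stand-ins: stop-word filtering replaces the CRF extractor and Jaccard overlap replaces the semantic similarity algorithm. The `max_marks` parameter and all function names are assumptions of this sketch.

```python
import re

STOP_WORDS = {"is", "the", "of", "into", "or", "a", "an", "and", "to", "in"}


def extract_keywords(answer: str) -> set[str]:
    """Stand-in extractor: lowercase words minus a small stop-word list."""
    return {w for w in re.findall(r"[a-z]+", answer.lower()) if w not in STOP_WORDS}


def keyword_similarity(model_kw: set[str], student_kw: set[str]) -> float:
    """Stand-in similarity: Jaccard overlap of the two keyword sets."""
    union = model_kw | student_kw
    return len(model_kw & student_kw) / len(union) if union else 0.0


def assess(model_answer: str, student_answer: str, max_marks: float = 10.0) -> float:
    """Extract keywords from both answers, compare them, and scale to a score."""
    model_kw = extract_keywords(model_answer)
    student_kw = extract_keywords(student_answer)
    return round(keyword_similarity(model_kw, student_kw) * max_marks, 2)


model = "Clustering is the process of grouping the data into classes or clusters."
student = "Clustering is grouping of data into clusters."
print(assess(model, student))  # 4 shared of 6 keywords -> 6.67
```

Keeping extraction and comparison as separate functions means either stage can be swapped for a stronger method without changing the scoring step.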



Subjective assessment is used to evaluate a student's understanding of concepts through descriptive answers. The assessment can be automated, and several methods are available for assessing such online subjective text. Through online subjective assessment, the student's perspective can be assessed thoroughly. In this paper, we have discussed different techniques for keyword extraction, namely TF-IDF, the CRF model and the query focused method, and for semantic similarity, namely LSA, the ontology method and the context based method. By comparing these techniques, we conclude that the CRF model and LSA are effective methods for the extraction of keywords and the assessment of subjective text, respectively.

[1] Menaka S and Radha N, Text Classification using Keyword Extraction Technique, International Journal of Advanced Research in Computer Science and Software Engineering, Volume 3, Issue 12, December 2013.
[2] Sungjick Lee and Han-joon Kim, News Keyword Extraction for Topic Tracking, Fourth International Conference on Networked Computing and Advanced Information Management.
[3] Ari Aulia Hakim, Alva Erwin, Kho I Eng, Maulahikmah Galinium and Wahyu Muliady, [...] Classification for News Article in Bahasa Indonesia based on Term Frequency Inverse Document Frequency (TF-IDF) Approach, 6th International Conference on Information Technology and Electrical Engineering (ICITEE), Yogyakarta, Indonesia, 2014.
[4] Stephen Robertson, Understanding inverse document frequency: on theoretical arguments for IDF, Journal of Documentation, Vol. 60, No. 5, pp. 503-520, 2004.

[5] Jasmeen Kaur and Vishal Gupta, Effective Approaches For Extraction Of Keywords, IJCSI International Journal of Computer Science Issues, Vol. 7, Issue 6, November 2010.
[6] Feng Yu, Hong-wei Xuan and De-quan Zheng, Key-Phrase Extraction Based on a Combination of CRF Model with Document Structure, Eighth International Conference on Computational Intelligence and Security, 2012.
[7] Chengzhi Zhang, Huilin Wang, Yao Liu, Dan Wu, Yi Liao and Bo Wang, Automatic Keyword Extraction from Documents Using Conditional Random Fields, Journal of Computational Information Systems, March 2008.
[8] Liang Ma, Tingting He, Fang Li, Zhuomin Gui and Jinguang Chen, Query-focused Multi-document Summarization Using Keyword Extraction, International Conference on Computer Science and Software Engineering, 2008.
[9] Massih R. Amini and Nicolas Usunier, A Contextual Query Expansion Approach by Term Clustering for Robust Text Summarization, Proceedings of DUC, 2007.
[10] Claudio Carpineto and Giovanni Romano, Fondazione Ugo Bordoni, A Survey Of Automatic Query Expansion In Information Retrieval, ACM Computing Surveys, Vol. 44, No. 1, Article 1, January 2012.
[11] Peipei Li, Haixun Wang, Kenny Q. Zhu, Zhongyuan Wang, Xuegang Hu and Xindong Wu, A Large Probabilistic Semantic Network based Approach to Compute Term Similarity, IEEE Transactions on Knowledge and Data Engineering, 2015.
[12] Shaymaa E. Sorour, Kazumasa Goda and Tsunenori Mine, Correlation of Topic Model and Student Grades Using Comment Data Mining, International Conference on Learning Analytics and Knowledge 2011.
[13] Ashwini Deshmukh and Gayatri Hegde, A Literature Survey on Latent Semantic Indexing, International Journal of Engineering Inventions, Volume 1, Issue 4, pp. 01-05, September.
[14] Shuchu Xiong and Yihui Luo, A New Approach for Multi-Document Summarization based on Latent Semantic Analysis, Seventh International Symposium on Computational Intelligence and Design, 2014.
[15] Zongli Jiang and Changdong Lu, A Latent Semantic Analysis Based Method of Getting the Category Attribute of Words, International Conference on Electronic Computer Technology.
[16] V Senthil kumaran and A Sankar, Towards an automated system for short-answer assessment using ontology mapping, International Arab Journal of e-Technology, Vol. 4, No. 1, January 2015.
[17] S. Bloehdorn, P. Cimiano, A. Hotho and S. Staab, An Ontology-based Framework for Text Mining, LDV Forum, kde.cs.uni-kassel.de, 2005.
[18] Mohammad Yasin, Automated Score Evaluation of Unstructured Text using Ontology, International Journal of Computer Applications, Vol. 39, No. 18, February 2012.
[19] Chin Pang Cheng, Gloria T. Lau, Jiayi Pan and Kincho H. Law, Domain-Specific Ontology Mapping by Corpus-Based Semantic Similarity, Proceedings NSF CMMI Engineering Research and Innovation Conference, Knoxville, Tennessee.
[20] Yue Wang, Hongsong Li, Haixun Wang and Kenny Q. Zhu, Concept-Based Web Search, Springer 31st International Conference ER 2012 Proceeding, Vol. 7532, pp. 449-462, Florence, Italy, October 2012.
[21] Vesile Evrim and Dennis McLeod, Context-based information analysis for the Web Environment, International Journal Knowledge and Information Systems, March 2013.
[22] Eneko Agirre, Enrique Alfonseca, Keith Hall, Jana Kravalova, Marius Pasca and Aitor Soroa, A Study on Similarity and [...] Approaches, ACL conference on Human Language Technologies, pp. 19-27, 2009.
