1 Research Scholar, Department of CSA, SCSVMV University, Kanchipuram, T.N.
2 Associate Professor, Department of CSA, SCSVMV University, Kanchipuram, T.N.
Abstract
The travel expenses of a concern can rise to alarming levels if they are not monitored and regulated. In the present system, a large
difference in travel expenses was noticed among employees at the same level. Hence an attempt was made to regulate the travel
expenses incurred by employees at different levels who go on official travel.
Keywords: Data Mining, Decision Tree, Knowledge Discovery, Rule based Classification
1. INTRODUCTION
This research work is aimed at streamlining the travel expenses of employees who go on official travel. For the purpose
of the study, the travel expenses are categorized into
Travel request-wise details,
City-wise details, and
Employee level-wise details.
OBJECTIVES OF THE STUDY: The objective of this work is to regulate the travel expenses incurred by employees and
to categorize them as
Approved: expenses that fall within a permissible range and can be approved
Refer: expenses that should be approved by a senior
Query: expenses for which a query is made to the employee seeking an explanation
2. REVIEW OF LITERATURE
Teklu Urgessa, Wookjae Maeng and Joong Seek Lee [1] experimented with five implementations of three data mining
classification techniques to extract important insights from tourism data. The dataset contained 12030
instances and 56 attributes before selection and preprocessing. The implementations selected for comparison were
Decision Tree, C4.5 (J48 in Weka)
Random Forest
SMO
Projective Adaptive Resonance Theory (PART)
Multilayer Perceptron (MLP)
The authors found that the Random Forest algorithm outperformed the rest (76%) when all attributes were used.
Yan-Ying Chen, An-Jung Cheng, and Winston H. Hsu [2] focused on a personalized recommendation framework to
provide a context-aware recommendation system. The experiments were conducted on 19 major cities in the world and
established that using people attributes and travel group types has the potential to improve personalized
travel recommendation, especially at locations where people have diverse choices of next stops.
The authors adopted a Bayesian learning model because of its effectiveness in recommendation systems
and, most importantly, because it can be applied to real-time mobile recommendation services.
Pairaya Juwattanasamran, Sarawut Supattranuwong and Sukree Sinthupinyo [3] attempted to find a
traveller's interests from search behaviour when the traveller searches for a tourism destination. Questionnaires
were used as a tool to collect data.
The RapidMiner program, with a training set of 2,000 transactions, found 78 rules with minimum support 0.0230,
maximum support 0.490, minimum confidence 0.211 and maximum confidence 1.000. The authors claimed that their
paper demonstrated that applying data mining in the tourism sector can increase the opportunity for tourism firms
to operate competitively and respond to travellers' demands effectively.
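The support and confidence figures quoted for such rules follow the standard association-rule definitions: support is the fraction of transactions containing an item set, and confidence of a rule X -> Y is support(X and Y together) divided by support(X). The following is a minimal sketch of those two measures only; the toy transactions and item names are hypothetical and are not taken from the cited study.

```python
from typing import FrozenSet, Iterable, List


def support(itemset: Iterable[str], transactions: List[FrozenSet[str]]) -> float:
    """Fraction of transactions that contain every item in `itemset`."""
    items = frozenset(itemset)
    if not transactions:
        return 0.0
    hits = sum(1 for t in transactions if items <= t)
    return hits / len(transactions)


def confidence(antecedent: Iterable[str], consequent: Iterable[str],
               transactions: List[FrozenSet[str]]) -> float:
    """support(antecedent and consequent together) / support(antecedent)."""
    base = support(antecedent, transactions)
    if base == 0.0:
        return 0.0
    both = support(frozenset(antecedent) | frozenset(consequent), transactions)
    return both / base


# Hypothetical search sessions, for illustration only.
transactions = [
    frozenset({"beach", "hotel"}),
    frozenset({"beach", "hotel", "flight"}),
    frozenset({"beach", "flight"}),
    frozenset({"museum", "hotel"}),
]

rule_support = support({"beach", "hotel"}, transactions)
rule_confidence = confidence({"beach"}, {"hotel"}, transactions)
```

On this toy data the rule beach -> hotel appears in two of four sessions, and two of the three sessions containing "beach" also contain "hotel".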
Nitesh V Chawla [4] studied issues concerning decision trees and imbalanced data sets. The author described a data
set as imbalanced if the classes are not equally represented. A popular way to deal with imbalanced data sets, according
to the author, is to either over-sample the minority class or under-sample the majority class.
The author presented two versions of over-sampling, one replicating each minority class example and the other
creating new synthetic examples (SMOTE, Synthetic Minority Over-sampling TEchnique). The author observed that
SMOTE was on average better than the under-sampling and replication-based over-sampling techniques.
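The idea of creating synthetic minority examples can be illustrated with a simplified sketch: pick a minority example, pick one of its nearest minority neighbours, and place a new point at a random position on the line segment between them. This is an illustrative simplification rather than the exact procedure evaluated in [4]; the function name, the parameters (n_new, k, seed) and the sample points are assumptions made for this example.

```python
import random
from typing import List, Sequence

Point = Sequence[float]


def squared_distance(a: Point, b: Point) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b))


def smote(minority: List[Point], n_new: int, k: int = 2, seed: int = 0) -> List[List[float]]:
    """Create n_new synthetic points by interpolating between a minority
    example and one of its k nearest minority neighbours."""
    if len(minority) < 2:
        raise ValueError("need at least two minority examples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Indices of the k nearest other minority examples.
        neighbours = sorted(
            (j for j in range(len(minority)) if j != i),
            key=lambda j: squared_distance(base, minority[j]),
        )[:k]
        other = minority[rng.choice(neighbours)]
        gap = rng.random()  # position along the segment, in [0, 1)
        synthetic.append([b + gap * (o - b) for b, o in zip(base, other)])
    return synthetic


# Hypothetical minority-class feature vectors.
minority = [(1.0, 1.0), (2.0, 1.5), (1.5, 3.0)]
synthetic = smote(minority, n_new=4)
```

Because every synthetic point lies between two existing minority examples, the new points stay inside the region already occupied by the minority class instead of duplicating existing rows.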
Jiao Yabing [5] proposed an improved Apriori algorithm. The author also describes the difference between the
traditional Apriori algorithm and the improved Apriori algorithm. The optimised algorithm prunes the frequent item sets Lk-1 before the candidate set Ck is
constructed. The author claimed that this decreases the number of possible combinations, reduces the number of candidate
item sets in Ck, and reduces the number of times the process is repeated. For large databases, this algorithm can save
time and increase the efficiency of data mining.
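For context, the candidate generation step of the traditional Apriori algorithm can be sketched as follows: pairs of frequent (k-1)-item sets are joined, and a candidate is kept only if all of its (k-1)-subsets are frequent. This sketch shows the traditional baseline only, not the improved pruning proposed in [5]; the function name and the example item sets are hypothetical.

```python
from itertools import combinations
from typing import FrozenSet, Set


def generate_candidates(frequent_prev: Set[FrozenSet[str]], k: int) -> Set[FrozenSet[str]]:
    """Build C_k from L_(k-1): join pairs of frequent (k-1)-item sets, then
    drop any candidate that has an infrequent (k-1)-subset."""
    candidates = set()
    for a in frequent_prev:
        for b in frequent_prev:
            union = a | b
            if len(union) != k:
                continue
            if all(frozenset(s) in frequent_prev for s in combinations(union, k - 1)):
                candidates.add(union)
    return candidates


# Hypothetical frequent 2-item sets.
L2 = {frozenset(p) for p in [("A", "B"), ("A", "C"), ("B", "C"), ("B", "D")]}
C3 = generate_candidates(L2, 3)
```

Here {A, B, D} and {B, C, D} are rejected because {A, D} and {C, D} are not frequent, so only {A, B, C} remains as a candidate. Any reduction of Lk-1 before this step shrinks the number of pairs that have to be joined.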
4. METHODOLOGY
4.1 MODEL FOR ANALYSIS: The J48 algorithm was used to build a Decision Tree model for analysis because of its
ability to represent the outcome as a tree-like structure. A tree-like structure in the result helps to identify the
knowledge in a simple and effective manner. Training and test data are used for this analysis.
Pseudocode: The general algorithm for building decision trees is:
1. Check for the base cases.
2. For each attribute a, find the normalized information gain (gain ratio) from splitting on a.
3. Let a_best be the attribute with the highest normalized information gain.
4. Create a decision node that splits on a_best.
5. Recurse on the sublists obtained by splitting on a_best, and add those nodes as children of the decision node.
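The five steps above can be sketched in Python as follows. This is a minimal illustration for categorical attributes only; it omits the numeric splits, missing-value handling and pruning that J48 performs, and it is not the Weka implementation. The attribute names (level, over_limit), the rows and the helper names are hypothetical; only the class labels Approved, Refer and Query come from this study.

```python
import math
from collections import Counter
from typing import Dict, List, Optional, Union

Row = Dict[str, str]
Tree = Union[str, dict]


def entropy(labels: List[str]) -> float:
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def gain_ratio(rows: List[Row], attr: str, target: str) -> float:
    """Information gain from splitting on `attr`, normalized by split information."""
    total = len(rows)
    groups: Dict[str, List[str]] = {}
    for r in rows:
        groups.setdefault(r[attr], []).append(r[target])
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    gain = entropy([r[target] for r in rows]) - remainder
    split_info = -sum(len(g) / total * math.log2(len(g) / total) for g in groups.values())
    return gain / split_info if split_info > 0 else 0.0


def build_tree(rows: List[Row], attrs: List[str], target: str) -> Tree:
    labels = [r[target] for r in rows]
    majority = Counter(labels).most_common(1)[0][0]
    # Step 1: base cases (pure node, or no attributes left to split on).
    if len(set(labels)) == 1 or not attrs:
        return majority
    # Steps 2-3: pick the attribute with the highest gain ratio.
    best = max(attrs, key=lambda a: gain_ratio(rows, a, target))
    if gain_ratio(rows, best, target) <= 0:
        return majority
    # Step 4: create a decision node that splits on the best attribute.
    node = {"attribute": best, "children": {}}
    # Step 5: recurse on each subset and attach the results as children.
    remaining = [a for a in attrs if a != best]
    for value in sorted({r[best] for r in rows}):
        subset = [r for r in rows if r[best] == value]
        node["children"][value] = build_tree(subset, remaining, target)
    return node


def classify(tree: Tree, row: Row) -> Optional[str]:
    """Follow the branches for `row`; returns None for an unseen attribute value."""
    while isinstance(tree, dict):
        tree = tree["children"].get(row[tree["attribute"]])
        if tree is None:
            return None
    return tree


# Hypothetical training rows, for illustration only.
rows = [
    {"level": "junior", "over_limit": "no", "status": "Approved"},
    {"level": "senior", "over_limit": "no", "status": "Approved"},
    {"level": "junior", "over_limit": "slightly", "status": "Refer"},
    {"level": "senior", "over_limit": "slightly", "status": "Refer"},
    {"level": "junior", "over_limit": "heavily", "status": "Query"},
    {"level": "senior", "over_limit": "heavily", "status": "Query"},
]
tree = build_tree(rows, ["level", "over_limit"], "status")
prediction = classify(tree, {"level": "junior", "over_limit": "slightly"})
```

In this toy data the over_limit attribute separates the three classes completely, so it is chosen at the root and each branch becomes a leaf.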
IPASJ International Journal of Computer Science (IIJCS)
Web Site: http://www.ipasj.org/IIJCS/IIJCS.htm
A Publisher for Research Motivation ........ Email:editoriijcs@ipasj.org
Volume 5, Issue 7, July 2017 ISSN 2321-5992
4.2 ANALYSIS OF DATA: The Weka data mining tool is used for constructing the decision tree because of its
effectiveness in representing the tree structure for the J48 algorithm. The constructed tree is shown below.
5. CONCLUSION
The Decision Tree model helped to identify the range of expenses that can be approved, expenses that should be
referred and expenses which are to be queried. It also helped to predict the class label based on the supplied training
data.
References
[1] A. Teklu Urgessa, Wookjae Maeng and Joong Seek Lee (2017) Application of Data Mining Techniques for Tourism Knowledge Discovery, International Journal of Computer, Electrical, Automation, Control and Information Engineering, Vol. 11, No. 1.
[2] Yan-Ying Chen, An-Jung Cheng, and Winston H. Hsu (2013) Travel Recommendation by Mining People Attributes and Travel Group Types From Community-Contributed Photos, IEEE Transactions on Multimedia, Vol. 15, No. 6.
[3] Pairaya Juwattanasamran, Sarawut Supattranuwong and Sukree Sinthupinyo (2013) Applying Data Mining to Analyze Travel Pattern in Searching Travel Destination Choices, The International Journal of Engineering and Science, Vol. 2, Issue 4, Pages 38-44.
[4] Nitesh V. Chawla (2003) C4.5 and Imbalanced Data sets: Investigating the effect of sampling method, probabilistic estimate, and decision tree structure, ICML, Washington DC.
[5] Jiao Yabing (2013) Research of an Improved Apriori Algorithm in Data Mining Association Rules, International Journal of Computer and Communication Engineering, Vol. 2, No. 1.
AUTHORS
Ilaiyaraja C completed his MCA degree at Bharathidasan University in 2012. He is an expert in the
area of developing software solutions. He is currently employed as a Lead Software Engineer providing
software-based support to clients. His research interest is Data Mining.