


A Novel Approach to Credit Card Fraud Detection Model
V. Dheepa, Dr. R. Dhanapal, Remigious D
Research and Development Centre, Bharathiar University, Coimbatore, Tamilnadu, India.

Abstract: Along with the great increase in credit card transactions, credit card fraud has become increasingly rampant in recent years. Fraud is now one of the major causes of great financial losses, affecting not only merchants but also individual clients. In this paper, clustering and outlier detection techniques are used to find fraudulent activities. In the first phase, clustering is used to partition the data. In the second phase, two different outlier detection algorithms are applied to the partitions separately to find the outliers. Finally, the outliers are combined and the fraudulent cases are identified.

Index Terms: Clustering, Credit Card Fraud Detection, Outlier Detection.


V. Dheepa: Lecturer, Department of M.Sc. Information Technology, Velammal Engineering College, Chennai.
Dr. R. Dhanapal: Professor, Department of Computer Applications, Easwari Engineering College, Chennai.
Remigious D: SAP Consultant, Wipro Technologies. Client: Apple Computers, Cupertino, California, U.S.

The use of credit cards is prevalent in modern-day society, and the credit card has become the most popular mode of payment. Detecting credit card fraud is difficult with normal procedures, so the development of credit card fraud detection models has recently become significant in both the academic and business communities.

Detecting fraud means identifying suspicious, potentially fraudulent cases. In this paper, clustering and outlier mining are used to detect the fraudulent cases. Clustering groups similar data objects into clusters. Outliers are defined as data that appear to be inconsistent with the rest of the data. Mining of outliers is an important research field in fraud detection, and outlier detection has recently found application in many domains. In this model, outlier detection algorithms are employed to find fraudulent transactions.

From the point of view of preventing credit card fraud, much research has been carried out with special emphasis on data mining. Kim and Kim identified the skewed distribution of the data and the mix of legitimate and fraudulent transactions as the two main reasons for the complexity of credit card fraud detection [1]. Sam and Karl suggest a credit card fraud detection model that uses Bayesian networks and neural network techniques to learn models of fraudulent credit card transactions [2]. L. Mukhanov presents a credit card fraud detection model using Bayesian belief networks [3]. Some clustering-based outlier detection techniques have also been proposed for fraud detection [4], and grid-based outlier detection has been proposed as well [5]. There is a lot of research on outlier detection, and many outlier detection algorithms, such as those based on statistics [6] and on distance [7, 8], have found good application.

In this paper, outlier detection is used to set up a detection model that mines fraudulent transactions as outliers.

3 FRAUD DETECTION MODEL
Fig. 1. Outline of the Fraud Detection Model
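As a rough illustration of the flow outlined in Fig. 1 (preprocess, partition by clustering, detect outliers in each region, combine and rank), the following Python sketch works on a single numeric attribute. It is not the authors' implementation. The function names, the one-dimensional K-means, the size threshold `dense_min_size` used to call a cluster dense, the z-score rule standing in for GESD, and the ranking by standard deviations from the cluster mean are all assumptions made for illustration. The Q-test critical values are the commonly tabulated 95% values for 3 to 10 samples.

```python
from statistics import mean, pstdev

# Commonly tabulated Dixon Q critical values at 95% confidence, n = 3..10.
Q_CRIT_95 = {3: 0.970, 4: 0.829, 5: 0.710, 6: 0.625,
             7: 0.568, 8: 0.526, 9: 0.493, 10: 0.466}


def preprocess(raw_amounts):
    """Step 1 (assumed trivial here): convert raw attribute values to numbers."""
    return [float(a) for a in raw_amounts]


def kmeans_1d(values, k, iterations=20):
    """Step 2: partition values into k clusters by nearest cluster mean."""
    ordered = sorted(values)
    n = len(ordered)
    # Deterministic start: spread the initial means evenly over the sorted data.
    means = [ordered[(2 * i + 1) * n // (2 * k)] for i in range(k)]
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda i: abs(v - means[i]))
            clusters[nearest].append(v)
        means = [mean(c) if c else means[i] for i, c in enumerate(clusters)]
    return [c for c in clusters if c]


def zscore_outliers(cluster, threshold=3.0):
    """Dense regions: simple z-score rule, a stand-in for GESD (assumption)."""
    mu, sigma = mean(cluster), pstdev(cluster)
    if sigma == 0:
        return []
    return [v for v in cluster if abs(v - mu) / sigma > threshold]


def q_test_outliers(cluster):
    """Sparse regions: Q-test on the smallest and largest value (3 <= n <= 10)."""
    ordered = sorted(cluster)
    n = len(ordered)
    spread = ordered[-1] - ordered[0]
    if n not in Q_CRIT_95 or spread == 0:
        return []
    flagged = []
    if (ordered[1] - ordered[0]) / spread > Q_CRIT_95[n]:
        flagged.append(ordered[0])
    if (ordered[-1] - ordered[-2]) / spread > Q_CRIT_95[n]:
        flagged.append(ordered[-1])
    return flagged


def detect_fraud(raw_amounts, k=2, dense_min_size=11):
    """Steps 1-4: preprocess, partition, detect per region, combine and rank."""
    values = preprocess(raw_amounts)
    suspects = []
    for cluster in kmeans_1d(values, k):
        dense = len(cluster) >= dense_min_size
        found = zscore_outliers(cluster) if dense else q_test_outliers(cluster)
        mu, sigma = mean(cluster), pstdev(cluster)
        # Severity (assumption): distance from the cluster mean in std. deviations.
        suspects += [(abs(v - mu) / sigma, v) for v in found]
    return [v for _, v in sorted(suspects, reverse=True)]


# Hypothetical amounts: a dense group near 20-30 and a sparse group near 1000.
raw = [str(v) for v in range(20, 31)] * 2 + ["60", "1000", "1010", "1020", "1030", "1200"]
print(detect_fraud(raw))  # [60.0, 1200.0]
```

Keeping the two detectors as separate functions mirrors the idea of running a different test in dense and sparse regions. A full GESD test would replace `zscore_outliers` and needs t-distribution critical values, which the standard library does not provide.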

An outline of the fraud detection model is shown in Fig. 1. The first step is data preprocessing: the model converts every attribute of the transaction sample to a numerical attribute. The second and third steps apply the detection methods, and the final step concludes the fraudulent cases.

4.1 Clustering
Clustering groups the data into similar clusters, which simplifies retrieval of the data [9]. Cluster analysis is a technique for breaking data down into related components in such a way that patterns and order become visible [10]. Clustering techniques are known as “unsupervised learning” because there is no class to be predicted. The main goal of clustering is to find common patterns or to group similar cases in the data. In this paper, an efficient cluster-based partitioning algorithm is used. It divides the data into a specified number of partitions, which separate the dense regions from the sparse regions. The K-means algorithm is applied to cluster the data and identify the sparse and dense regions. K-means clustering is a method of cluster analysis which aims to partition n observations into k clusters in which each observation belongs to the cluster with the nearest mean. The cluster mean of $K_i = \{t_{i1}, t_{i2}, \ldots, t_{im}\}$ is defined as

\[ m_i = \frac{1}{m} \sum_{j=1}^{m} t_{ij} \]

The partitioned data region is shown in Fig. 2.

Fig. 2. Partitioned Data Region using clustering

4.2 Outlier Detection
Outliers are defined as data which deviate from other observations so as to arouse suspicion that they were generated by a different mechanism. In this study, two outlier detection methods are employed, one for each kind of region.

For dense regions, the generalized extreme studentized deviate (GESD) method is used. This method identifies outliers, which are data samples that vary significantly from the majority of the data. The degree to which each outlier deviates from the remainder of the data indicates the severity of the abnormal activity denoted by that outlier. The method is based on the mean and standard deviation of the observed data. It uses two parameters: an upper bound, $N_u$, on the number of potential outliers, and the probability, $\alpha$, of incorrectly declaring one or more outliers when no outliers exist. The upper bound satisfies $N_u \le \frac{1}{2}(N - 1)$, where $N$ is the number of samples. This method works well for dense regions.

For sparse regions, the Q-test outlier detection method is used. First, the data are arranged in ascending order. Then the experimental Q-value is calculated: the difference between the suspect value and its nearest neighbor, divided by the range of the values. The Q-value is compared with the critical Q-value ($Q_{crit}$), which is determined by the chosen confidence level. If $Q > Q_{crit}$, the suspect value can be considered an outlier. This method performs well with small sample sizes.

All the outliers detected in the regions are combined. They are ranked by severity and concluded to be fraudulent cases.

5 DISCUSSIONS
In this detection model, the unsupervised approach is used, which makes it possible to find previously unknown frauds. Models based on the supervised approach must have labeled data for both normal data and anomalies, and they can only detect frauds of a type which has previously occurred. In contrast, unsupervised methods do not make use of labeled records; they detect changes in behavior or unusual transactions. Unsupervised learning is a feasible method for learning large and more complex models.

In this process, clustering and outlier detection methods are used to find the outliers. Applying the algorithms to the partitions separately reduces the number of nearest neighbor searches and reachability distance computations. The model mines fraudulent transactions as outliers.


This paper presents a credit card fraud detection model based on clustering and outlier mining to determine whether a credit card transaction is fraudulent. A clustering algorithm is used for partitioning, and outlier detection algorithms are used to find the outliers. The outliers are ranked and the fraudulent cases are concluded. Applying the algorithms to the partitions separately reduces the number of nearest neighbor searches and reachability distance computations. The work can be extended further by using a density-based outlier detection algorithm.

© 2010 Journal of Computing Press, NY, USA, ISSN 2151-9617

REFERENCES
[1] M.J. Kim and T.S. Kim, “A Neural Classifier with Fraud Density Map for Effective Credit Card Fraud Detection,” Proc. International Conference on Intelligent Data Engineering and Automated Learning, Lecture Notes in Computer Science, Springer Verlag, no. 2412, pp. 378-383, 2002.
[2] Sam, Karl, Bram, and Bernard, “Credit Card Fraud Detection using Bayesian and Neural Networks,” Proc. NF2002, Havana, Cuba, 16-19 January 2002.
[3] L. Mukhanov, “Using Bayesian Belief Networks for Credit Card Fraud Detection,” Proc. IASTED International Conference on Artificial Intelligence and Applications, Innsbruck, Austria, Feb. 2008.
[4] E. Eskin, A. Arnold, M. Prerau, L. Portnoy, and S. Stolfo, “A Geometric Framework for Unsupervised Anomaly Detection: Detecting Intrusions in Unlabeled Data,” Data Mining for Security Applications, 2002.
[5] “Grid-ODF: Detecting Outliers Effectively and Efficiently in Large Multi-dimensional Databases,” Lecture Notes in Computer Science, Springer, Berlin, Heidelberg, 2005.
[6] V.J. Hodge and J. Austin, “A Survey of Outlier Detection Methodologies,” Artif. Intell. Rev., 2004.
[7] F. Angiulli, S. Basta, and C. Pizzuti, “Distance-Based Detection and Prediction of Outliers,” IEEE Transactions on Knowledge and Data Engineering, February 2006.
[8] J.W. Han and M. Kamber, Data Mining: Concepts and Techniques. Beijing: Higher Education Press and Morgan Kaufmann Publishers, 2007.
[9] V. Dheepa and R. Dhanapal, “Analysis of Credit Card Fraud Detection Methods,” International Journal of Recent Trends in Engineering, vol. 2, no. 3, Nov. 2009.
[10] Binu Thomas and Raju, “A Novel Fuzzy Clustering Method for Outlier Detection in Data Mining,” International Journal of Recent Trends in Engineering, vol. 1, no. 2, May 2009.

V. Dheepa is pursuing research leading to a Ph.D. in Computer Science at Bharathiar University, Coimbatore. She obtained her Master's degree in Computer Science from Madurai Kamaraj University, Madurai, and a Master of Philosophy in Computer Science from Alagappa University, Karaikudi. She is working as a lecturer in the Department of M.Sc. Information Technology, Velammal Engineering College, Chennai, Tamilnadu, India. She is a life member of the ISTE Chapter. Her publications are one International Journal and five National

Dr. R. Dhanapal obtained his Ph.D. in Computer Science from Bharathidasan University, India. He is currently Professor in the Department of Computer Applications, Easwari Engineering College, affiliated to Anna University of Technology Chennai, Tamil Nadu, India. He has 25 years of teaching, research, and administrative experience. Besides being a professor, he is also a prolific writer, having authored twenty-one books on various topics in Computer Science. His books have been prescribed as textbooks in Bharathidasan University and autonomous colleges affiliated to Bharathidasan University, Tiruchirapalli. He has served as Chairman of the Board of Studies in Computer Science of Bharathidasan University and as a member of the Board of Studies in Computer Science of several universities and autonomous colleges. He is a member of the standing committee on Artificial Intelligence and Expert Systems of IASTED, Canada, and a Senior Member of the International Association of Computer Science and Information Technology (IACSIT), Singapore. He has visited the USA, Japan, Malaysia, and Singapore to present papers at international conferences and to demonstrate the software developed by him. He is the recipient of the prestigious ‘Life-time Achievement’ and ‘Excellence’ Awards of the Govt. of India. He is serving as Principal Investigator of UGC-sponsored innovative, major, and minor research projects worth about 1.6 crore. He is a recognized supervisor for research programmes in Computer Science leading to Ph.D. and MS by research in several universities, including Anna University of Technology Chennai, Bharathiar University, and Manonmaniam Sundaranar University. He has 47 papers to his credit in international and national journals.

Remigious D obtained his M.Sc. in Computer Science from Bharathidasan University, India. He is currently working for Wipro Technologies at Apple Computers as a SAP Consultant. He has worked as a Software Consultant in many firms, including BICS, Intelligroup, and Wipro, and has 10 years of experience in software technologies. His main area of interest is Artificial Intelligence.