
Proceedings of the 2008 IEEE International Conference on Information and Automation
June 20-23, 2008, Zhangjiajie, China

Acquisition and Visualization of Sensitive Security Audit Events
Baoyun Wang, Yingjie Yang
Institute of Electronic Technology
Information Engineering University
Zhengzhou, Henan Province, 450004, China
wangbaoyun303@163.com, yangyj2006@vip.163.com
Abstract- Audit data analysis plays a critical role in the field of information security. Acquiring sensitive security audit events (SSAE) and visualizing their correlations is an important and difficult task of audit data analysis. In this paper, we propose an approach to acquire SSAE and present their correlations in the form of graphs. Firstly, we use DWT (discrete wavelet transformation) to obtain sensitive security audit event objects, and then use DBSCAN (a clustering algorithm of KDD) and a database query technique to obtain the SSAE related to the sensitive objects. Secondly, a security audit event visualization model based on the theory of Colored Petri-nets is presented to visualize the correlations of SSAE, and the acquisition process of causal relationships among audit events is given. Lastly, we carry out an experiment, which shows that the proposed approach makes browsing and analysing audit data more convenient for security auditors.

I. INTRODUCTION
Security audit is an important part of information security management, and event logs play an important role in it. However, when a security auditor faces a large amount of logs in the form of records, it is impractical to browse and analyze them all. Acquiring sensitive security audit events (SSAE) and visualizing their correlations is therefore important for security audit.
There are several difficulties in achieving this goal. Firstly, audit events are huge in number and of many different kinds, and some of them contain little useful information. Secondly, audit events generally have a property named urgent level, whose value can be warning, mistake, information and so on, but this property often cannot reflect potential threats, so we cannot acquire SSAE by that event property alone. Thirdly, to visualize audit event correlations, it is better to have an event visualization model theory, but there are no related works currently.
To overcome the above problems, in section three we start from sensitive event object selection, and then acquire SSAE. In section four, we give a security audit event visualization model. Section five gives the acquisition process of event causal relationships. Section six gives an experiment example, and we conclude in section seven.
II. RELATED WORKS
Currently, there are some log analysis tools and techniques [1]. Their event correlation methods are mainly used for network fault management, and some of them have been applied to security management and intrusion detection. But they mainly focus on log correlation, and few of them consider the acquisition of SSAE.

978-1-4244-2184-8/08/$25.00 ©2008 IEEE.

At the same time, some log visualization systems [2,3,4,5,6] have existed for some time. They are applications of information visualization techniques in the log monitoring area. They visualize user actions to some extent, but their visualization is limited to network data, and some of them only visualize statistics of user actions. They cannot handle a huge amount of log messages, because they have no filter mechanism and cannot find SSAE. One of the most serious problems in related works is that the visualization cannot present event relationships in terms of time, space and logic, and therefore they cannot support deep analysis by the security auditor.
III. ACQUISITION OF SENSITIVE SECURITY AUDIT EVENTS
In this section, we give the acquisition process starting with file operation events. Audit events are huge in number and involve lots of files, but the importance of different files differs, so we first find the sensitive object files, and then find the SSAE. The main process is shown in figure 1.

Figure 1. Acquisition of sensitive security audit events

In figure 1, the words on the dashed lines present the methods we use. The process is as follows:
1) Use the DWT technique to select sensitive object files in the object set; we get sensitive object files A, B, C.
2) Use the DBSCAN algorithm to cluster events by their occurring time; the events in the same cluster are strongly correlated in terms of time. The events whose objects are sensitive object files will fall into different clusters. We get the sensitive security audit event sets Cluster(A), Cluster(B, C) in the aspect of time.
3) Query the events database for the users that had accessed sensitive files A, B, C and for their actions; we get the sensitive security audit event sets S(A), S(B), S(C) in the aspect of logic.
4) Unite the events obtained from the second step and the third step; we get the sensitive security audit events S.
We discuss the DWT and DBSCAN techniques in detail.
A. Acquisition of sensitive object files based on DWT
Object files that we will choose are the ones whose change may be abnormal. The changes of files can be described as a discrete signal, and an abnormal file update is often relevant to the noise part of that signal. As DWT is very popular in the field of signal processing, we use DWT to accomplish this file selection task. DWT has also been used in anomaly detection [7] [8].
In terms of system security, the files updated on a single host are usually user files, corresponding to user applications, so what we are interested in are the files in the network system. In a network environment, the system files in the same path are identified as the same file. Some definitions:
1) $S_i(n)$: the number of hosts that update file i on day n.
2) $cA_i$: $S_i$ can be decomposed into a low frequency signal and a high frequency signal; $cA_i$ is the first, reflecting the long term update trend.
3) $cD_i$: the high frequency signal of $S_i$, reflecting the day-to-day variation from the long term trend.
4) $cA_i(j-1)$: the signal value of $cA_i$ on day j-1.
5) $R_i(j)$: the residual signal value, $R_i(j) = S_i(j) - cA_i(j-1)$.
The wavelet that we use is the Haar wavelet, which is the simplest wavelet. The process is as follows:
For file i:
1) $S_i = cA_i + cD_i$
2) $R_i(j) = S_i(j) - cA_i(j-1)$
3) If $R_i(j)$ exceeds the preset threshold, which is a parameter based on the statistical distribution of historical residual values, then the actual number of hosts that have updated file i on day j is significantly larger or smaller than the prediction $cA_i(j-1)$ based on the long term trend. Therefore, file i is selected.
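The selection rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the paper does not give the decomposition level, the boundary handling, or whether the threshold applies to the absolute residual. The sketch assumes a one-level Haar decomposition (the approximation on a pair of adjacent days is their mean), copies a trailing unpaired day unchanged, and compares the absolute residual with the threshold. The names haar_approximation and sensitive_days are hypothetical.

```python
def haar_approximation(signal):
    """One-level Haar approximation cA, reconstructed to the input length.

    Assumption: each pair of adjacent days is replaced by its mean,
    and a trailing unpaired day is copied unchanged.
    """
    approx = []
    for k in range(0, len(signal) - 1, 2):
        mean = (signal[k] + signal[k + 1]) / 2.0
        approx.extend([mean, mean])
    if len(signal) % 2 == 1:
        approx.append(float(signal[-1]))
    return approx


def sensitive_days(signal, threshold):
    """Return the 0-based days j where |S(j) - cA(j-1)| exceeds the threshold."""
    approx = haar_approximation(signal)
    flagged = []
    for j in range(1, len(signal)):
        residual = signal[j] - approx[j - 1]  # R(j) = S(j) - cA(j-1)
        if abs(residual) > threshold:         # assumption: absolute value
            flagged.append(j)
    return flagged


# Hypothetical daily counts of hosts that updated one file.
updates = [2, 2, 3, 3, 2, 2, 20, 2, 3, 3]
print(sensitive_days(updates, threshold=10))
```

A file would be selected as a sensitive object when the returned list is non-empty for its update signal.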
B. Acquisition of Time correlated SSAE based on DBSCAN
DBSCAN (density-based spatial clustering of applications with noise) is a clustering algorithm of KDD (Knowledge Discovery and Data Mining). It clusters data by spatial distance and density, and suits clusters of any shape in planar space. Our application is based on time distance, so our method is to replace the space dimension with the time dimension. Some definitions [9] follow, as in figure 2.

Figure 2. DBSCAN algorithm
Reachability-distance (p, q1) = d (p, q1)
Reachability-distance (p, q2) = d (p, q2)

1) For any point p, the area whose center is p and whose radius is $\varepsilon$ is named the adjacent area of p, marked as $P = p(\varepsilon)$.
2) For point p and radius $\varepsilon$, if there exist points $q_1, q_2, \ldots, q_m \in p(\varepsilon)$, where m is the minimum number that must be met, then p is a kernel.
3) If $q \in p(\varepsilon)$ and p is a kernel, then q is directly density reachable from p.
4) Points p and q are density reachable if and only if there exist $p_1, p_2, \ldots, p_n$ such that p and $p_1$ are directly density reachable, $p_i$ and $p_{i+1}$ ($1 \le i \le n-1$) are directly density reachable, and $p_n$ and q are directly density reachable.
The main process is as follows:
1) Preprocess the security audit event data set; every event can be marked as (ID, T), where T is the event time. We call the preprocessed data set TE (time events). The set of time events that are directly related to sensitive objects is named M, where M is a subset of TE.
2) Fix the time radius r and the minimum number minpts.
3) For any object p in M, search the entire data set TE; if p is a kernel object, then find all the objects that are density reachable from p based on r and minpts, create a new cluster and attach a cluster tag.
4) If p is not a kernel object, consider it as a noise point temporarily.
5) Process the next object until all the objects are finished.
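The five steps above can be sketched as a one-dimensional, seed-driven variant of DBSCAN. This is a minimal sketch under assumptions, not the authors' code: events are a dictionary from event ID to a numeric timestamp, the neighbourhood of an event includes the event itself (the usual DBSCAN convention, which the paper does not state), and the names region_query and cluster_from_seeds are hypothetical.

```python
def region_query(events, center_time, radius):
    """IDs of all events in TE whose time lies within radius of center_time."""
    return [eid for eid, t in events.items() if abs(t - center_time) <= radius]


def cluster_from_seeds(events, seed_ids, radius, min_pts):
    """Grow time clusters starting only from the sensitive events M (seed_ids).

    Returns (labels, noise): labels maps event ID to a cluster tag,
    noise holds seeds that are not kernel objects and joined no cluster.
    """
    labels = {}
    noise = set()
    next_label = 0
    for seed in seed_ids:
        if seed in labels:
            continue  # already absorbed by an earlier cluster
        neighbors = region_query(events, events[seed], radius)
        if len(neighbors) < min_pts:
            noise.add(seed)  # step 4: temporarily a noise point
            continue
        label = next_label
        next_label += 1
        queue = list(neighbors)
        while queue:
            eid = queue.pop()
            if eid in labels:
                continue
            labels[eid] = label
            noise.discard(eid)
            reach = region_query(events, events[eid], radius)
            if len(reach) >= min_pts:  # eid is itself a kernel: keep expanding
                queue.extend(reach)
    return labels, noise


# Hypothetical data: event ID -> time in seconds; events 2, 4, 6 touch sensitive files.
te = {1: 0, 2: 5, 3: 10, 4: 100, 5: 103, 6: 500}
labels, noise = cluster_from_seeds(te, seed_ids=[2, 4, 6], radius=10, min_pts=2)
print(labels, noise)
```

Restricting the outer loop to M keeps the search focused on sensitive objects, while the expansion still runs over the whole set TE so that non-sensitive events close in time are pulled into the cluster.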
We omit the explanation of the database query process because it is easy to understand. So far, we have obtained the SSAE with the above methods. In order to visualize their correlations, we give a security audit event visualization model in the next section.
IV. SECURITY AUDIT EVENT VISUALIZATION MODEL
A security audit event usually has some required properties, such as subject, action and object. Our security audit event visualization model integrates these properties and some other relation sets. The model is named SAVM. It is an extension of the Colored Petri-net and a variation of HCPN [10].
SAVM = (C, P, T, F, G, E, O, $M_0$, V), where:
1) C (color set): $C = C_u \cup C_p$
C is a non-empty finite set of colors, where $C_u$ and $C_p$ correspond to event subjects, which are users and processes.
2) P (place set): $P = File \cup DB \cup Device \cup Networkresource \cup Certificate$
P is a finite set of event objects; it mainly contains files, databases, devices, network resources and certificates.
3) T (transition set): $T = T_{ob} \cup T_{sv} \cup T_{at}$
$T_{ob}$ = {read, write, delete, execute, replace, change, add, logon}
$T_{sv}$ = {http, ftp, telnet, pop3...}
$T_{at}$ = {scan, prob, privilege, dos, overflow, cheat, trojanhorse, backdoor}
T is a finite set of event actions. $T_{ob}$ is the set of single step actions related to files, databases, devices etc. $T_{sv}$ is the set of actions related to network services. $T_{at}$ is the set of actions related to attacks.
4) F (flow set): $F = F_1 \cup F_2 \cup F_3$, $F_1 \subseteq (P \times T)$, $F_2 \subseteq (T \times P)$, $F_3 \subseteq (T \times T)$
F is a set of relations. $F_1$ and $F_2$ are relations between objects and actions, and $F_3$ is the relation between actions and actions.
5) G (guard function set): $G = \{g : F_1 \rightarrow C_{MS}\}$
$C_{MS}$ represents the multi-set of color set C. G is a set of guard functions associated with $F_1$, describing the conditions to be met before an action can be conducted by a subject.
6) E (effect function set): $E = \{e : F_2 \rightarrow C_{MS}\}$
E is a set of functions associated with $F_2$, describing the change of relationships that results from the action.
7) O (pre-order function set): $O = \{o : F_3 \rightarrow T\}$
O is a set of functions associated with $F_3$; here T denotes time distance. Pre-order functions describe the time sequence between two actions.
8) $M_0$ (initial marking distribution)
$M_0$ is the subject-object ownership, describing the beginning event before correlating events.
9) V (mapping rules)
V is a set of mapping rules. It aims at mapping audit events to graphs, as follows:
Place set P maps to circle nodes;
Transition set T maps to rectangle nodes;
Color set C dyes circle nodes and rectangle nodes;
Guard function set G, effect function set E and pre-order function set O map to arcs;
Flow set F maps to the direction of arcs.
In SAVM, if a subject accesses an object or performs an action, then the circle node (object) or rectangle node (action) is dyed in the corresponding color (subject). There are three kinds of arcs in SAVM: object to action, action to object, and action to action (they are derived from F, G, E, O).
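The mapping rules V and the three arc kinds can be illustrated with a small Python sketch. It is only an illustration under assumptions: the function names node_shape and arc_kind and the sample place and transition names are hypothetical, and the sketch covers only the structural part of the model (P, T and F), not colors or the guard, effect and pre-order functions.

```python
def node_shape(node, places, transitions):
    """Mapping rule V: places become circle nodes, transitions rectangle nodes."""
    if node in places:
        return "circle"
    if node in transitions:
        return "rectangle"
    raise ValueError(f"{node!r} is neither a place nor a transition")


def arc_kind(source, target, places, transitions):
    """Classify an arc as F1 (object to action), F2 (action to object)
    or F3 (action to action); place-to-place arcs are not part of F."""
    if source in places and target in transitions:
        return "F1"
    if source in transitions and target in places:
        return "F2"
    if source in transitions and target in transitions:
        return "F3"
    raise ValueError(f"no flow relation allows {source!r} -> {target!r}")


# Hypothetical objects and actions for a tiny scenario.
places = {"password", "system_file"}
transitions = {"logon", "write"}

print(node_shape("password", places, transitions))             # circle
print(arc_kind("password", "logon", places, transitions))      # F1
print(arc_kind("write", "system_file", places, transitions))   # F2
print(arc_kind("logon", "write", places, transitions))         # F3
```

A drawing routine could then attach the guard function to F1 arcs, the effect function to F2 arcs and the pre-order function to F3 arcs, and color each node by the subject involved.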
An example of visualization of audit events with SAVM is shown in figure 3. It describes the process of a local to root attack (L2R) [11]. Circle nodes represent resources (event objects), and rectangle nodes represent actions conducted by a user (subject). In figure 3 there is only one color, denoting that all the actions were conducted by the same user.

Figure 3. local to root attack described in SAVM

V. ACQUISITION OF CAUSAL RELATIONSHIP


In the above model SAVM, G, E and O reflect the causal relationships of audit events. We acquire these sets from two aspects.
The first we call static rules. Static rules contain access control rules and attack condition rules. Access control rules can be described as Subject(certificate)->Actions(object). Some examples of access control rules can be seen in table 1. Privileges are described by role, and actions are what the role could conduct.
TABLE I
STATIC RULES OF ACCESS CONTROL

Subject(Certificate)    Actions(objects)
Root(key)               Read(A,B), write(A,B)
User(passwords)         Read(A,B), write(B)
Anonymous( )            Read(A,B)

Attack condition rules state what prerequisite an attack should have when it occurs and what consequence it will give. They can be described as the following sequence [11]: prerequisite -> actions -> consequence. Some examples of attack condition rules can be seen in table 2.
TABLE II
STATIC RULES OF ATTACK CONDITION

Action            Prerequisite                          Consequence
IPSweep           Access(SrcIP)                         IsAlive(DestIP)
PortScan          Access(SrcIP) and IsAlive(DestIP)     OpenPort(DestIP, Port)
Chaos query       ExistService(DestIP, DNS)             DNSVersion(DestIP, Version)
TSIG overflow     DNSVersion(DestIP, ISC BIND 8.2.2)    Access(DestIP)
FTP               IsAlive(DestIP)                       FTPVersion(DestIP, Version)
Wuftp Overflow    DNSVersion(DestIP, wuftp2.6.2)        Access(DestIP)
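One way to see how such rules link events causally is to chain them: an action becomes possible once all of its prerequisites hold, and its consequences may then satisfy the prerequisites of later actions. The sketch below is an illustration only, assuming that predicates are compared as plain strings (no variable binding) and that each action fires at most once; the name chain_attack_steps is hypothetical and the three rules are taken from table 2.

```python
# action -> (set of prerequisites, set of consequences), taken from table 2
RULES = {
    "IPSweep": ({"Access(SrcIP)"}, {"IsAlive(DestIP)"}),
    "PortScan": ({"Access(SrcIP)", "IsAlive(DestIP)"}, {"OpenPort(DestIP, Port)"}),
    "FTP": ({"IsAlive(DestIP)"}, {"FTPVersion(DestIP, Version)"}),
}


def chain_attack_steps(rules, initial_facts):
    """Fire every action whose prerequisites are satisfied until nothing changes.

    Returns the actions in firing order and the final set of facts.
    """
    facts = set(initial_facts)
    fired = []
    progress = True
    while progress:
        progress = False
        for action, (prerequisites, consequences) in rules.items():
            if action not in fired and prerequisites <= facts:
                facts |= consequences
                fired.append(action)
                progress = True
    return fired, facts


fired, facts = chain_attack_steps(RULES, {"Access(SrcIP)"})
print(fired)
```

Each fired action that relies on a consequence of an earlier one corresponds to a candidate action-to-action arc in the visualization model.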

The second we call dynamic rules. We use a sequence mining algorithm to get the dynamic causal relationships. When frequent sequences are found, we consider that the events in a frequent sequence pattern have a causal relationship. The relevant definition for sequence mining is support. For a sequence pattern P of length l, the support is computed as $Support(P) = N_p / N_l$, where $N_p$ is the number of occurrences of sequence pattern P in the events set, and $N_l$ is the number of sequences of length l in the events set.
We use a sliding window algorithm. For simplicity, an audit event can be pre-treated as e = (subject, action, object). The events set is $E = \{e_1, e_2, \ldots, e_n\}$. We assume that the maximum length of the sliding window is MaxLen and the minimum support is MinSup. The main process is as follows.
1) First, set the width of the sliding window to 1 (len=1), and then look up the sequences of length 1 whose support exceeds MinSup in E.
2) Enlarge the sliding window (len=len+1) and make the left boundary of the sliding window overlap $e_1$, getting the sub-sequence $Sub_1 = \{e_1, e_2, \ldots, e_{len}\}$.
3) Slide the window until its right boundary overlaps $e_n$; after every sliding step, we get a sub-sequence of length len. When the sliding process is finished, we have n-len+1 sub-sequences of length len, $Sub_i = \{e_i, e_{i+1}, \ldots, e_{i+len-1}\}$ ($i = 1, 2, \ldots, n-len+1$). Compute the support of these n-len+1 sub-sequences and keep the sequences that meet MinSup.
4) Repeat the second and the third step until len=MaxLen.
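The sliding window procedure can be sketched as follows. This is a minimal sketch under assumptions rather than the authors' implementation: $N_l$ is taken to be the number of windows of length l (that is, n-l+1), a pattern is kept when its support is at least MinSup, and events may be any hashable values (the shell commands below stand in for (subject, action, object) triples). The name frequent_sequences is hypothetical.

```python
from collections import Counter


def frequent_sequences(events, max_len, min_sup):
    """Return {pattern: support} for all contiguous patterns up to max_len
    whose support N_p / N_l is at least min_sup."""
    result = {}
    n = len(events)
    for length in range(1, max_len + 1):
        window_count = n - length + 1  # N_l: number of windows of this length
        if window_count <= 0:
            break
        # Slide the window one event at a time and count each sub-sequence.
        counts = Counter(tuple(events[i:i + length]) for i in range(window_count))
        for pattern, count in counts.items():
            support = count / window_count
            if support >= min_sup:
                result[pattern] = support
    return result


# Hypothetical command history of one user.
history = ["su", "tcsh", "ls", "su", "tcsh", "ls", "df"]
for pattern, support in frequent_sequences(history, max_len=3, min_sup=0.3).items():
    print("->".join(pattern), round(support, 2))
```

With this history, the pattern su->tcsh->ls occurs in 2 of the 5 windows of length 3, so its support is 0.4.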
Some examples of dynamic rules can be seen in table 3. The sequence patterns show the users' command habits.

TABLE III
DYNAMIC RULES ACQUIRED BY SEQUENCE MINING

Sequence pattern              MinSup
su->tcsh->ls                  30%
ls->ls/etc                    40%
ls->mail->su->tcsh->ls->df    15%
ls->cat/etc/passwd            20%

From the above static and dynamic rules, we get the causal relationships among audit events. We give an experiment in section 6.
VI. EXPERIMENTS
In order to see clearly the spatial relationship of audit events, after all the sensitive security audit events are acquired, we partition them into different sets based on the statistics of the event target IP address. The visualization process is as in figure 4.

Figure 4. Visualization process

Our experiment environment is a network in which security monitoring software has been installed. The topology is as in figure 5.

Figure 5. Experiment environment topology

As in figure 5, in this network environment every user machine has the monitoring software installed and only one machine can connect to the Internet. There is an FTP server, and its software is Serv-U. So the audit data sources are the Network Monitoring Server log, Firewall log, FTP server log and operating system (Windows Server 2000) log.
We use the acquisition and visualization methods proposed in this paper to process the audit data of a month. Every sensitive object file was acquired by DWT. For example, the file C:\WINNT\system32\wbem\Logs is acquired as in figure 6.

Figure 6. DWT Process Result

In the experiment, there are 30 terminal machines, and the threshold is 10. From the figure, we can see that on days 9 and 23 the file changes are sensitive.
We acquire about 20 sensitive object files; from these sensitive objects and DBSCAN we get the SSAE. The DBSCAN graph is in figure 7.

Figure 7. DBSCAN Process Result

In figure 7, different colors represent different clusters. The bigger cross is an outlier, the smaller cross represents an event containing sensitive objects which starts a cluster, and the circle point is an event added to the cluster. Two typical clusters are marked by ellipses.
We get the causal relationships in table 4 and table 5, which contain static rules and dynamic rules.
TABLE IV
STATIC RULES USED IN EXPERIMENT

Access Control Rules
administrator(key, passwords)--->set policy for operator(domain, functions);
administrator(key, passwords)--->change policy for operator(domain, functions);
administrator(key, passwords)--->log backup;
administrator(key, passwords)--->shutdown server;
operator(key, passwords)--->set policy for each function(device, file, network, logon);
operator(key, passwords)--->change policy for each function(device, file, network, logon);
operator(key, passwords)--->uninstall agent;
auditor(key, passwords)--->query and analysis logs of each role(administrator, operator, auditor);
user(key, passwords)--->logon system;
user(disk writeable)--->write files on disk;
user(mobile device writeable)--->write files on mobile device;
user(network address available)--->access network;
user(email service available)--->send email;
user(ftp service available)--->transport files;
Attack Condition Rules
Access(SrcIP)->IPSweep->IsAlive(DestIP);
IsAlive(DestIP)->PortScan->OpenPort(DestIP,Port);
IsAlive(DestIP)->FTP->FTPVersion(DestIP,Version);
FTPVersion(DestIP,Version)->FTP(user)->Promote(DestIP);
Promote(DestIP)->Administrator(DestIP)->Anyaction;


TABLE V
DYNAMIC RULES ACQUIRED IN EXPERIMENT

User      MinSup    Sequence pattern
Wu        10%       (network_http, google.cn)->(network_load, xxx.pdf)
Liangz    15%       (app, msdev.exe)->(app, softtice.exe)->(app, vmware.exe)
pj        15%       (log_in, firewall)->(file_read, firewalllog)->(policy_mod, xxx.policy)
yongw     10%       (log_in, managecenter)->(policy_check, xxx.policy)
kk        12%       (app, word.exe)->(file_rename, xx.word)->(file_copy, xxx.word)->(file_new, xxx.word)
jingyu    15%       (network_http, business.sohu.com)->(app, ztzq.exe)->(app, ztjy.exe)->(network_http, guba.com)
yuhan     10%       (network_http, 163.com)->(network_email, xxx.txt)
baby      8%        (network_http, baidu.com)->(network_load, xxx.mp3)->(app, TTplayer.exe)
sweet     18%       (app, ppt.exe)->(device_add, mobiledeviceA)->(file_new, xxx.ppt)->(device_del, molbiledeviceA)

Using the above results, we visualized two event scenarios. The graphs are as in figure 8 and figure 9.
Figure 8 displays an information leak process. In the figure, the S axis represents space (IP address) and the T axis represents time. Objects P3 and P5 were selected by DWT; other objects and events were obtained by the DBSCAN algorithm and event query based on sensitive objects. The meaning of figure 8:
An inner user leaks security information with three identities (three colors). Firstly, he logged on to the control center on IP1 in the role of administrator and changed the security policy on mobile devices on his machine, allowing writing to mobile devices. When the policy was sent, it immediately caused system file P3 to be updated. The user added device P5, copied file P6, logged on to the Internet on IP3, and transmitted the file using email. As a result, the file was owned by another user. (P1, P4 and P7 represent the password or credential for the user to log on to the system.)

Figure 8. An information leak process

Figure 9 describes an attack and intrusion process.
Explanation of some symbols:
P0: System file C:\WINNT\system32\config\SAM, which is relevant to user accounts and passwords.
P1: a file which contains a Trojan horse program that can change system files;
P3: C\Program files;
P4: C:\WINNT\Debug\oakley.log.sav
P5: C:\WINNT\system32\wbem\Logs
P0, P2, P3, P4 and P5 were selected by DWT (because of the transfer of the horse program through the FTP server, these files were changed on lots of hosts), and other events were acquired by the DBSCAN algorithm and database query.
An attacker scanned IPs and ports in the network system using some tools, and he found that port 21 on IP2 was open and that the software was Serv-U. The attacker acquired the FTP administrator password (the dark frame means a lot of attempts in the log). The privilege of the attacker was promoted because of a software hole; at the same time, P0 was changed. Then the attacker entered the computer system in the role of administrator, uploaded file P1 and executed it, and system files P2, P3, P4 and P5 were changed immediately.
We must point out that the action of Privilege Promote was not recorded in the log files; we add it because it clearly demonstrates the entire intrusion process. Of course, in order to record this action, we need tools that monitor system call sequences. But the task of adding an appropriate action for a scenario that is not very clear is also a challenging problem, and it is one of our next works.

Figure 9. An Attack and Intrusion process

VII. CONCLUSIONS AND FUTURE WORK
In this paper, we describe an approach to acquire and visualize SSAE. Our approach has the following features:
1) It effectively acquires sensitive object files using DWT and effectively acquires SSAE using DBSCAN and a database query technique.
2) The audit event visualization model gives a framework to visualize audit events, which makes it convenient for auditors to browse or analyse audit events and to see clearly what events have happened.
3) It acquires causal relationships among audit events in terms of static rules and dynamic rules; the dynamic rules are acquired by a sequence mining algorithm.
Our approach can be improved in the following two areas:
1) To acquire SSAE we start from and focus on sensitive files and other objects; we plan to extend our method to events which do not involve file objects.
2) For dynamic rules in SAVM, we use a sequence mining algorithm to acquire them, but this method requires that the occurring frequency of audit events meets the support. If a sensitive event occurred only once, we cannot acquire the rules. In future work we can apply Correlation Rules (a method in KDD) to solve this problem.



ACKNOWLEDGMENT
Financial support from the China 863 project is highly appreciated. The helpful comments from reviewers are also gratefully acknowledged.
REFERENCES
[1] Risto Vaarandi. SEC - a Lightweight Event Correlation Tool. Proceedings of the 2002 IEEE Workshop on IP Operations and Management, pp. 111-115, 2002.
[2] Takada T, Koike H. Tudumi: information visualization system for monitoring and auditing computer logs. Proceedings of the Sixth International Conference on Information Visualization (IV'02), 2002.
[3] S.G. Eick and P.J. Lucas. Displaying trace files. Software Practice and Experience, Vol.26, No.4, pp. 399-409, 1996.
[4] Becker, R.A., Eick, S.G. and Wilks, A.R. Visualizing Network Data. IEEE Trans. Visualization and Computer Graphics, Vol.1, No.1, pp. 16-28, 1995.
[5] Girardin L, Brodbeck D. A visual approach for monitoring logs. Proceedings of the Twelfth Systems Administration Conference (LISA'98), pp. 299-308, 1998.
[6] Zhi Guo. Research on Visualization in Intrusion Detection. Tsinghua University, 2004.
[7] J. Zhang, F. Tsui, M. M. Wagner, and W. R. Hogan. Detection of Outbreaks from Time Series Data Using Wavelet Transform. In AMIA Fall Symp., pp. 748-752. Omni Press CD, October 2003.
[8] Yinglian Xie. A Spatiotemporal Event Correlation Approach to Computer Security. School of Computer Science, Carnegie Mellon University, 2005.
[9] Jun Qian, Chao Xu, and Meilin Shi. The Implementation of Alert Aggregation and Dataset Testing. Journal of Computer Research and Development, 43(4), pp. 627-632, 2006.
[10] Dong Yu, Deborah Frincke. A Novel Framework for Alert Correlation and Understanding. International Conference on Applied Cryptography and Network Security (ACNS), Springer-Verlag, pp. 452-466, 2004.
[11] Jingmin Zhou. Modeling Network Intrusion Detection Alerts for Correlation. ACM Transactions on Information and System Security, 10(1), pp. 1-31, 2007.
