Botnets are networks of compromised computers controlled through a common command and control (C&C) channel. Recognized as
one of the most serious security threats to the current Internet infrastructure, botnets are often hidden in existing applications, e.g. IRC,
HTTP, or Peer-to-Peer, which makes botnet detection a challenging problem. Previous attempts at detecting botnets either
examine traffic content for IRC commands on selected network links or set up honeypots. In this paper, we propose a new
approach for detecting and characterizing botnets on a large-scale WiFi ISP network. We first classify the network traffic into
different applications by using payload signatures and a novel clustering algorithm, and then analyze the specific IRC application
community based on the temporal-frequent characteristics of flows, which leads to the differentiation of malicious IRC channels created by
bots from normal IRC traffic generated by human beings. We evaluate our approach with over 160 million flows collected over five
consecutive days on a large-scale network, and the results show that the proposed approach successfully detects the botnet flows
with a high detection rate and an acceptably low false alarm rate.
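As a rough illustration of the kind of flow characteristic the approach relies on, the Python sketch below computes a 256-entry byte-frequency vector for a flow payload and then measures how much those frequencies vary across a group of flows. It is a minimal sketch under stated assumptions, not the detection algorithm of this paper: the function names `byte_frequencies` and `mean_std_over_flows` are hypothetical, a single time window per flow is assumed, and averaging the per-character standard deviations is only one plausible way to summarize a group of flows.

```python
from statistics import mean, pstdev


def byte_frequencies(payload: bytes) -> list[float]:
    """Return a 256-entry vector holding the relative frequency of each byte value."""
    counts = [0] * 256
    for value in payload:
        counts[value] += 1
    total = len(payload)
    if total == 0:
        # An empty payload carries no byte-frequency information.
        return [0.0] * 256
    return [count / total for count in counts]


def mean_std_over_flows(payloads: list[bytes]) -> float:
    """Average, over the 256 byte values, of the population standard
    deviation of that byte's frequency across the given flows.

    Assumption: one payload per flow and one time window per payload.
    """
    vectors = [byte_frequencies(p) for p in payloads]
    per_byte_std = [pstdev([v[j] for v in vectors]) for j in range(256)]
    return mean(per_byte_std)


if __name__ == "__main__":
    # Made-up payloads, for illustration only.
    varied = [b"hello everyone", b"anyone up for a game?", b"brb"]
    repetitive = [b".ddos.start victim_ip", b".ddos.start victim_ip"]
    print(mean_std_over_flows(varied))
    print(mean_std_over_flows(repetitive))
```

Flows whose payloads are near-identical yield a value close to zero, while flows with varied content yield a larger value.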
One of the biggest threats to the current Internet infrastructure is botnets, which usually comprise large pools of compromised computers under the control of a botmaster. Botnets can be centralized, distributed or peer-to-peer (P2P) according to different command and control (C&C) models and different communication protocols (e.g. HTTP, IRC or P2P). The attacks conducted by botnets are very diverse, ranging from Distributed Denial-of-Service (DDoS) attacks to e-mail spamming, keylogging, click fraud, and new malware spreading. In Figure 1, we illustrate a typical life-cycle of a botnet and its attacking behaviours.

[Figure 1 diagram: botmaster, vulnerable host, botnet, DNS server, IRC server, and victim server, linked by the steps 1. exploit, 2. bot download, 3. DNS query, 4. join, 5. pass authen., 6. pass, 7. command, 8. DDoS]
Fig. 1. Typical life-cycle of an IRC based botnet and its attacking behaviours

The botmaster usually finds a new bot by exploiting its vulnerabilities remotely. Once infected, the bot will download and install the binary code by itself. After that, each bot on the botnet will attempt to find the IRC server address by DNS query, which is illustrated in Step 3 of Figure 1. Next is the communication step between the bots and the IRC server. In the IRC based communication mechanism, a bot first sends a PASS message to the IRC server to start a session and then the server [...] server. Upon the completion of these authentications, the command and control channels among the botmaster, the bots, and the IRC server will be established. To start a DDoS attack, the botmaster only needs to send a simple command like ".ddos.start victim_ip"; all bots receive this command and start to attack the victim server. This is shown in Step 8 of Figure 1. More information about the botmaster command library can be found in [1].

Detecting botnet traffic is a very challenging problem. This is because: (1) botnets use existing application protocols, so their traffic volume is small and closely resembles normal traffic behaviour; (2) classifying traffic applications becomes more challenging due to traffic content encryption and the unreliable destination port labelling method. Previous attempts at detecting botnets are mainly based on honeypots [2,3,4,5,6], passive anomaly analysis [7,8,9] and traffic application classification [10,11,12]. Setting up and installing honeypots on the Internet is very helpful for capturing malware and understanding the basic behaviours of botnets. Passive anomaly analysis for detecting botnets in network traffic is usually independent of the traffic content and has the potential to find different types of botnets (e.g. HTTP based, IRC based or P2P based botnets). Botnet detection based on traffic application classification focuses on classifying traffic into IRC traffic and non-IRC traffic, and thus it can only detect IRC based botnets, which is its biggest limitation when compared with anomaly based botnet detection.

In this paper, we focus on traffic classification based botnet detection. Instead of labeling and filtering traffic into non-IRC and IRC, we propose a generic approach to classify traffic into different application communities (e.g. P2P, Chat, Web, etc.). Then, based on each specific application community, we investigate and apply the temporal-frequent characteristics of network flows to differentiate the malicious botnet behaviours from the normal application traffic. The major contributions of this paper include: (1) a novel
false positive rate might be generated when deploying the
the $j$th ASCII character on the payload over a time window $t_i$ ($j = 1, 2, \ldots, 256$ and $i = 0, 1$). Given a set of $N$ data objects $F = \{F_i \mid i = 1, 2, \ldots, N\}$, where $F_i = \langle f_1^{t_i}, f_2^{t_i}, \ldots, f_{256}^{t_i} \rangle$, the detection approach is described in Algorithm I.

In practice, labeling the clusters is always a challenging problem when applying unsupervised algorithms for intrusion detection. By observing the normal IRC traffic over a long period on a large-scale WiFi ISP network and the IRC botnet traffic collected on a honeypot, we derive a new metric, the standard deviation $\sigma_m$ for each cluster $m$, to differentiate [...]

As an example, Figures 2 and 3 illustrate the average byte frequency over the normal IRC flows and IRC botnet flows, respectively. The average standard deviation of byte frequency over the 256 ASCII characters for normal IRC traffic is 0.002 and its maximum is 0.05, while the average standard deviation of byte frequency over the 256 ASCII characters for IRC botnet traffic is 0.0009 and its maximum is 0.01, which is much smaller than that of normal IRC traffic. This observation confirms that the normal human