Network-Based Detection of IoT Botnet Attacks Using Deep Autoencoders

NETWORK-BASED DETECTION OF IOT BOTNET ATTACKS USING DEEP AUTOENCODERS
By Aiswarya Ashok Kumar
S7 IT
Roll No: 5
Guide – Ms. Chinchu Krishna
1
OVERVIEW
■ Scope and objective
■ Introduction
■ Literature Review – Botnets and Related Work
■ Methodologies – Autoencoders and Main stages
■ Empirical Evaluation
■ Discussion and Results
■ Conclusion
■ References
2
SCOPE AND OBJECTIVE
■ A network-based anomaly detection method that uses deep autoencoders to detect anomalous network traffic emanating from compromised IoT devices.
3
SCOPE AND OBJECTIVE
■ Instantaneous detection can promote network security, as it expedites the alerting and disconnection of compromised IoT devices from the network.
4
INTRODUCTION
■ The number of IoT devices deployed is increasing worldwide
■ The traffic volume of IoT-based DDoS attacks has reached unprecedented levels
■ Timely detection of IoT botnet attacks has become crucial
5
WHAT ARE BOTNETS?
■ Logical connection of Internet-connected devices
■ E.g. computers, smartphones, IoT devices
■ Security is breached and control ceded to a third party
6
BOTNET INFECTIONS
■ Bot herders
• Deploy botnets through a Trojan horse virus
■ Users
• Infect their own systems
■ Botnet
• Access and modify personal information
• Attack other computers and commit crimes
7
BOTNET FEATURES
■ Self-propagating: seek-and-infect missions
■ Search the web for vulnerable Internet-connected devices lacking
 OS updates
 Antivirus software
■ Botnet design continues to evolve

8
BOTNET FEATURES
■ Difficult to detect: users are unaware
■ Use only a small amount of computing power
■ Botnets take time to grow and can lie dormant
9
BOTNET ARCHITECTURE
10
BOTNETS - OPERATIONAL STEPS
■ Propagation
■ Infection
■ C&C Communication
■ Execution of Attacks
11
BOTNETS - OPERATIONAL STEPS
12
RELATED WORK – BOTNET DETECTION METHODS
Detection methods can be categorized by:
■ The specific operational step to be detected
■ The detection approach
13
RELATED WORK
14
RELATED WORK
■ Previous studies focus on early operational steps

■ Botnet attacks mutate on a daily basis
■ Attacks become increasingly sophisticated
■ These mutations will eventually bypass existing
methods of early detection
15
RELATED WORK
■ Detection Approaches
 Host-based
 Network-based
16
RELATED WORK
■ Host-based: less realistic for IoT
 Requires installing detectors on IoT products
 Limited access to some IoT devices
 Constrained computation and power
 A single non-distributed solution is preferred
17
RELATED WORK
■ Network-based: uses deep learning to perform anomaly detection
 Capture behavioral snapshots of benign IoT traffic
 Train a deep autoencoder, one for each device
 The autoencoder compresses the snapshots
 Failure to reconstruct a snapshot indicates an anomaly
18
RELATED WORK
■ Benefits of using a network-based approach:
 Heterogeneity tolerance
 Open world
 Efficiency
 Hardly any false alarms
19
METHODOLOGY
■ Deep autoencoders
• For each device
• Train on benign traffic
■ New data
• Autoencoder applied to new data of an IoT device
■ Anomalies
• Detects anomalies
• Device compromised
• Possibly infected
20
AUTOENCODERS
■ Unsupervised artificial neural network (ANN)
■ Learns to compress and encode data
■ Learns to reconstruct the data from the encoded representation, as close as possible to the original input
21
AUTOENCODERS
22
AUTOENCODERS
■ Reduces data dimensions by learning to ignore noise in the data
■ Components:
 Encoder
 Bottleneck
 Decoder
 Reconstruction loss
■ Backpropagation is used to minimize the reconstruction loss
23
AUTOENCODERS
24
AUTOENCODERS FOR ANOMALY DETECTION
■ Requires correlated input data
■ Encoding depends on correlated features
■ Train an autoencoder on a data set
 Pass an image from that data set: reconstruction error is low
 Pass a random image or anomaly: reconstruction error is high
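The low-versus-high reconstruction error idea can be sketched without any training. The example below assumes a hand-set linear bottleneck standing in for a trained autoencoder: the two input features are correlated (the second is twice the first), so a single code value along that direction is enough to rebuild in-distribution points, while points that break the correlation are rebuilt badly. The names `W`, `encode`, `decode`, and `reconstruction_error` are illustrative and do not come from the slides.

```python
import math

# Unit vector along the direction on which the "training" data lies (y = 2x).
# Assumption: this direction is set by hand; a trained linear autoencoder
# with a 1-unit bottleneck would be expected to learn something similar.
W = (1 / math.sqrt(5), 2 / math.sqrt(5))


def encode(x):
    """Compress a 2-feature input into a single code value (the bottleneck)."""
    return W[0] * x[0] + W[1] * x[1]


def decode(z):
    """Expand the code value back into the 2-feature input space."""
    return (z * W[0], z * W[1])


def reconstruction_error(x):
    """Mean squared error between the input and its reconstruction."""
    x_hat = decode(encode(x))
    return sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)


benign = (3.0, 6.0)    # follows the correlation seen in training
anomaly = (6.0, -3.0)  # breaks the correlation

print("benign error :", reconstruction_error(benign))
print("anomaly error:", reconstruction_error(anomaly))
```

The benign point is reconstructed almost exactly, while the anomalous point collapses to the origin and yields a large error, which is the signal an anomaly detector thresholds on.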
25
DETECTION
26
DETECTION
27
DETECTION
28
PROPOSED DETECTION METHOD
■ Concentrate on large enterprises with a large number of IoT devices connecting to their networks via Wi-Fi
■ Devices may be self-deployed (e.g. smart smoke detectors) or dynamically introduced (e.g. BYO wearables)
29
■ Four main stages:

 Data Collection
 Feature Extraction
 Training an Anomaly Detector
 Continuous Monitoring
30
■ Data Collection:
 Capture raw traffic data
 pcap format
 Using port mirroring on the switch
 Collected immediately following installation
31
■ Feature Extraction:
 Whenever a packet arrives, take a behavioral snapshot of the hosts and protocols that communicated it
 Obtain the packet’s context by extracting 115 traffic statistics over several temporal windows
32
 Summarize traffic that has:
• Originated from the same IP
• Originated from both the same source MAC and the same IP address
• Been sent between the same source and destination IPs (the channel)
• Been sent between the same source and destination TCP/UDP sockets
33
 Extract the same set of 23 features from five time windows covering the most recent 100 ms, 500 ms, 1.5 s, 10 s, and 1 min (23 × 5 = 115 features in total)
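As a rough illustration of per-window statistics, the sketch below summarizes the packets sent by one source IP over the five windows listed above. It assumes plain sliding windows and only three statistics per window (packet count, mean size, size variance). The slides do not list the actual 23 features, so this is not the real feature set, and the `Packet` tuple, `window_stats`, and `snapshot` are hypothetical names.

```python
from collections import namedtuple
from statistics import mean, pvariance

# Hypothetical minimal packet record: arrival time (s), source IP, size (bytes).
Packet = namedtuple("Packet", ["timestamp", "src_ip", "size"])

# The five time windows from the slide, in seconds.
WINDOWS_SEC = (0.1, 0.5, 1.5, 10.0, 60.0)


def window_stats(packets, now, src_ip, window):
    """Count, mean size, and size variance for one source IP in one window."""
    sizes = [p.size for p in packets
             if p.src_ip == src_ip and now - window < p.timestamp <= now]
    if not sizes:
        return (0, 0.0, 0.0)
    return (len(sizes), mean(sizes), pvariance(sizes))


def snapshot(packets, now, src_ip):
    """Concatenate the per-window statistics into one feature vector."""
    features = []
    for window in WINDOWS_SEC:
        features.extend(window_stats(packets, now, src_ip, window))
    return features


packets = [
    Packet(0.5, "10.0.0.5", 100),
    Packet(59.0, "10.0.0.5", 200),
    Packet(59.95, "10.0.0.5", 300),
    Packet(59.97, "10.0.0.9", 999),  # different source, ignored below
    Packet(60.0, "10.0.0.5", 500),
]
vector = snapshot(packets, now=60.0, src_ip="10.0.0.5")
print(len(vector), vector)
```

With 3 statistics over 5 windows this toy vector has 15 entries. The method described in the slides uses 23 statistics per window, giving the 115-dimensional input.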
34
■ Training an anomaly detector:

 Autoencoder trained to reconstruct its inputs after some
compression
 Compression ensures network learns meaningful
concepts
 The autoencoder fails at reconstructing abnormal observations
 A significant reconstruction error indicates an anomaly
35
 Optimize parameters and hyperparameters to maximize the TPR (true positive rate) and minimize the FPR (false positive rate)
 Training and optimization use two separate datasets, each containing only benign data
 Training dataset DStrn, with 2 input parameters:
• Learning rate, η
• Number of epochs (complete passes through the entire DStrn)
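For reference, the two rates being optimized can be computed from labeled decisions as in the short sketch below. The labels and predictions are made-up illustrative values, not results from the study, and `tpr_fpr` is a hypothetical helper name.

```python
def tpr_fpr(y_true, y_pred):
    """Return (TPR, FPR); labels are 1 for malicious and 0 for benign."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # attacks caught
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # attacks missed
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false alarms
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # benign passed
    tpr = tp / (tp + fn) if tp + fn else 0.0
    fpr = fp / (fp + tn) if fp + tn else 0.0
    return tpr, fpr


# Illustrative data only: 4 malicious and 5 benign observations.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 1, 0]
print(tpr_fpr(y_true, y_pred))
```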

36
 Optimization dataset DSopt
• Optimize η and the number of epochs iteratively until the MSE between the model’s input and output stops decreasing
 DSopt is also used to optimize a threshold (tr) that discriminates between benign and malicious observations
 Anomaly threshold: $tr^* = \overline{MSE}_{DS_{opt}} + s(MSE_{DS_{opt}})$, i.e. the mean plus the standard deviation of the MSE over DSopt
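A minimal sketch of the threshold computation, assuming the per-instance MSE values on DSopt are already available as a list. The numbers are invented for illustration, the use of the sample (n − 1) standard deviation is an assumption, and `anomaly_threshold` and `is_anomalous` are hypothetical names.

```python
from statistics import mean, stdev


def anomaly_threshold(mse_opt):
    """tr* = mean of the MSE values plus their sample standard deviation."""
    return mean(mse_opt) + stdev(mse_opt)


def is_anomalous(mse, threshold):
    """Mark one observation as anomalous if its MSE exceeds the threshold."""
    return mse > threshold


# Illustrative reconstruction errors of benign instances in DSopt.
mse_opt = [0.010, 0.012, 0.008, 0.011, 0.009]
tr_star = anomaly_threshold(mse_opt)

print("threshold:", tr_star)
print("MSE 0.011 anomalous?", is_anomalous(0.011, tr_star))
print("MSE 0.050 anomalous?", is_anomalous(0.050, tr_star))
```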
37
 Deciding on a single instance gives a high TPR but also a high FPR
 To reduce the FPR, base the abnormality decision on a sequence of instances
 Determine the minimal window size ws* as the shortest sequence of instances that yields a 0% FPR on DSopt
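One way to search for ws* is sketched below, assuming the per-instance decisions on the benign DSopt are given as 0/1 flags (1 means the instance exceeded the threshold) and that a window counts as a false alarm when more than half of its flags are 1. The flags are invented, and `majority_anomalous` and `minimal_window_size` are hypothetical names.

```python
def majority_anomalous(flags):
    """True if more than half of the flags in the window are anomalous."""
    return sum(flags) > len(flags) / 2


def minimal_window_size(flags):
    """Smallest window size for which no window over benign data raises an alarm.

    Returns None if no window size up to len(flags) achieves a 0% FPR.
    """
    n = len(flags)
    for ws in range(1, n + 1):
        windows = (flags[i:i + ws] for i in range(n - ws + 1))
        if not any(majority_anomalous(w) for w in windows):
            return ws
    return None


# Illustrative benign stream with a few instances wrongly flagged.
benign_flags = [0, 0, 1, 0, 0, 1, 1, 0, 0, 0]
ws_star = minimal_window_size(benign_flags)
print("ws* =", ws_star)
```

In this toy stream, windows of length 1 to 3 still contain a majority of flagged instances somewhere, so the search settles on a longer window.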
38
■ Continuous monitoring:
 Apply optimized model to vectors extracted from
continuously observed packets to mark as benign or
anomalous
 Majority vote on a sequence (length of ws*) of marked
instances – detect if entire stream is benign or anomalous
 Issue alert on detection of anomalous stream
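The monitoring loop can be sketched as a small stateful class that keeps the last ws* decisions and votes on them. It assumes the reconstruction error of each incoming instance is computed elsewhere and passed in. The threshold, window size, and error values are invented, and `StreamMonitor` is a hypothetical name.

```python
from collections import deque


class StreamMonitor:
    """Majority vote over the most recent `window_size` marked instances."""

    def __init__(self, threshold, window_size):
        self.threshold = threshold
        # deque with maxlen drops the oldest decision automatically.
        self.window = deque(maxlen=window_size)

    def observe(self, mse):
        """Record one instance; return True if the stream is judged anomalous."""
        self.window.append(mse > self.threshold)
        if len(self.window) < self.window.maxlen:
            return False  # not enough instances yet to vote
        return sum(self.window) > len(self.window) / 2


monitor = StreamMonitor(threshold=1.0, window_size=3)
alerts = [monitor.observe(mse) for mse in [0.2, 1.5, 0.3, 2.0, 2.5]]
print(alerts)
```

A single high-error instance (1.5) does not trigger an alert on its own. The alert is raised only once most of the window is above the threshold.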
39
EMPIRICAL EVALUATION
■ Lab setup: replicate organizational data flow
■ Collect traffic data from 9 IoT devices
■ Botnets used: Mirai and BASHLITE
40
■ Botnets deployed
 BASHLITE: set up a C&C server
 Mirai: C&C server + scanner + loader
41
■ Attacks executed
 BASHLITE attacks
1) Scan: Scanning the network for vulnerable devices
2) Junk: Sending spam data
3) UDP: UDP flooding
4) TCP: TCP flooding
5) COMBO: Sending spam data and opening a
connection to a specified IP address and port
42
 Mirai Attacks
1) Scan: Automatic scanning for vulnerable devices
2) Ack: Ack flooding
3) Syn: Syn flooding
4) UDP: UDP flooding
5) UDP plain: UDP flooding with fewer options, optimized for higher packets per second (PPS)
43
RESULTS AND DISCUSSION
■ Each of the nine sets of benign data (one per IoT device) is divided into 3 datasets:
 DStrn: for training the autoencoder
 DSopt: for optimizing parameters
 Benign part of DStst: for estimating the FPR
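The three-way split can be sketched as follows. The slides do not state the proportions, so equal thirds in chronological order are an assumption here, and `split_benign` is a hypothetical name.

```python
def split_benign(records):
    """Split chronologically ordered benign records into (DStrn, DSopt, DStst).

    Assumption: three roughly equal parts; any remainder goes to the test part.
    """
    third = len(records) // 3
    ds_trn = records[:third]
    ds_opt = records[third:2 * third]
    ds_tst_benign = records[2 * third:]
    return ds_trn, ds_opt, ds_tst_benign


# Ten placeholder records standing in for one device's benign feature vectors.
ds_trn, ds_opt, ds_tst_benign = split_benign(list(range(10)))
print(ds_trn, ds_opt, ds_tst_benign)
```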
44
■ Incorporate traffic from the entire life cycle of the devices
■ Training and optimization use Keras
■ Each autoencoder's input layer dimension equals the number of features in the dataset (115)
45
46
■ The same benign data is used to train three other algorithms: Local Outlier Factor (LOF), One-Class SVM, and Isolation Forest
■ Optimize their parameters and hyperparameters
■ Execute attacks
47
48
CONCLUSION
■ Autoencoders for most IoT devices in the test set obtained a zero FPR
■ Difficulty in capturing normal traffic behavior varies among IoT devices and may be correlated with
 The device’s capabilities
 The network communications it normally produces
■ A solid predictability score can be leveraged by large organizations
49
REFERENCES
■ Yair Meidan, Michael Bohadana, Yael Mathov, Yisroel Mirsky, Dominik Breitenbacher, Asaf Shabtai, and Yuval Elovici. N-BaIoT: Network-based Detection of IoT Botnet Attacks Using Deep Autoencoders. IEEE Pervasive Computing, Vol. 13, No. 9, July-September 2018
■ Yisroel Mirsky, Tomer Doitshman, Yuval Elovici, and Asaf Shabtai (Ben-Gurion University of the Negev). Kitsune: An Ensemble of Autoencoders for Online Network Intrusion Detection. arXiv:1802.09089v2 [cs.CR], 27 May 2018
50
REFERENCES
■ E. Bertino and N. Islam. Botnets and Internet of Things Security. Computer, 2017
■ https://www.pandasecurity.com/mediacenter/security/what-is-a-botnet/ (accessed 4 October 2019)
■ https://towardsdatascience.com/auto-encoder-what-is-it-and-what-is-it-used-for-part-1-3e5c6f017726 (accessed 6 October 2019)
51
THANK YOU
■ ANY QUESTIONS?
52
