
Wireless Pers Commun (2017) 97:1911–1928

DOI 10.1007/s11277-017-4652-y

WDTF: A Technique for Wireless Device Type Fingerprinting

Asish Kumar Dalai1 • Sanjay Kumar Jena1

Published online: 28 July 2017


© Springer Science+Business Media, LLC 2017

Abstract In this work, a technique for wireless device type fingerprinting is introduced.
The technique utilizes the information revealed by the homogeneity of devices of the same
make and the heterogeneity of devices of different makes. The diversity among devices of
different makes is due to different hardware compositions and variations in their
management capabilities. We apply statistical techniques to network traffic to create
unique, reproducible device signatures. We demonstrate the efficacy of our technique on
network traffic captured in different scenarios. We have used a total of 300 device types
representing a wide range of device classes. In the experiment, we have used more than
1.5 GB of filtered traffic for analysis and performance evaluation. We measure the
performance of the technique by the accuracy of device type detection. The results obtained
are promising, with a higher detection rate than its counterparts.

Keywords Device fingerprinting · Wireless security · Intrusion detection

1 Introduction

Wireless devices are becoming ubiquitous in day-to-day life. The ease of use and low-cost
solutions provided by wireless networks make them a preferred choice for office and home
networks. However, this comes at the price of security, as wireless networks allow
attackers to circumvent some security measures available to wired networks. A wireless
network has no physical boundaries; any device within range can access the network. A
malicious device may intercept network traffic, crack the security key, disrupt normal
behavior, or leak private data, which makes the network vulnerable. In a wireless network,
devices from numerous vendors are connected. In general, the network administrator cannot
restrict any device type or vendor from joining the network. The presence of a malicious
device may put the entire network at risk. Detecting device types that are obsolete,
vulnerable, or not authorized to use the network is therefore of prime importance for
security. This can be done by using a device type identification system.

Asish Kumar Dalai: dalai.asish@gmail.com
Sanjay Kumar Jena: skjenanitrkl@gmail.com
1 National Institute of Technology, Rourkela, India

The use of a cryptography-based approach [1] for device authentication in wireless
networks may create unnecessary overhead. The conventional security protocols are also
not safe, as attackers expose their flaws from time to time [2]. Insider attacks and node
forgery are the most common attacks in a wireless network. Attackers may modify their
identity to gain access to the network. A malicious device can lure victims to connect to
it, thereby exposing the network to the outside [3]. Therefore, to enforce security and
avoid such issues, we need to identify the devices that are part of the network. Device
type fingerprinting is a novel approach that extracts unique features to generate
device-specific signatures and uses them to identify device types. In this paper, we
present a technique named WDTF: a system capable of identifying the types of devices that
wish to be part of the network and of taking countermeasures against unauthorized device
types.
The remainder of the paper is organized as follows. Section 2 describes the related work
in wireless device fingerprinting. An overview of the proposed technique is presented in
Sect. 3. Performance evaluation is discussed in Sect. 4. Finally, concluding remarks and
future work are given in Sect. 5.

2 Related Works

In this section, we discuss some of the existing approaches to fingerprinting in a
wireless network. Depending on how traffic is captured and analyzed, a fingerprinting
technique can be either active or passive. A passive fingerprinting technique [4–9]
creates a fingerprint by analyzing previously captured traffic traces. An active
fingerprinting technique [10, 11] explicitly transmits crafted packets to the target and
then analyzes its response to generate the fingerprint.
Franklin et al. [4] devised a method for fingerprinting wireless device drivers. The
method is based on the bin frequency of the inter-arrival time between probe request
frames. The bin size and the actual mean of each bin are taken as the features for
generating the device signature. Bin width tuning is essential, as the accuracy of the
method varies with the bin size. The method has been evaluated using data traces collected
under different scenarios. It gives an accuracy of 96, 84 and 77% for test sets 1, 2 and
3, respectively. The method has certain limitations: it cannot distinguish different
versions of the same driver, and it fails to work if a hardware abstraction layer is
present. As mentioned by the authors, configuring the probe request rate, inducing noise,
modifying the driver code, MAC spoofing, and driver patching can prevent the method from
successfully fingerprinting the wireless device driver, whereas the proposed method is
immune to such attacks. Unlike this approach, the proposed method has a consistent
accuracy of 95% and has been evaluated on a larger number of devices.
Desmond et al. [5] proposed a passive approach for fingerprinting unique devices over a
wireless network. They used timing analysis of probe request frames to extract the pattern
unique to a device. The combination of machine, wireless NIC device driver, and operating
system is taken as a unique tuple that represents a device. Device clock skew, channel
scanning interval, and the implementation of the timing mechanism are considered as the
features that uniquely represent the machine, the device driver, and the operating system,
respectively. A modified maximum variance clustering method is used for fingerprinting the
devices. Further, to verify whether two samples of inter-burst latencies are from the same
distribution, they employed a statistical hypothesis test, the Mann–Whitney U-test. The
experiments were conducted in both a controlled environment and a public hotspot,
achieving an accuracy of 70 to 80%. They claim that their technique can be used for
identity resolution in WLANs, such as spoof detection, network forensics, and
reconnaissance. However, the technique has certain limitations: it requires around one
hour to collect enough data to generate the fingerprint for a device. In contrast, the
proposed method takes much less time, and the exchange of a single frame is sufficient to
detect the device type. Also, interference in the network, shadowing, packet loss, and
congestion may reduce the efficiency of their technique, whereas these conditions have no
effect on the proposed method.
Gao et al. [7] developed a technique for access point fingerprinting. They used an
emulator (iperf) to transmit normal traffic through the access point. The inter-arrival
time (IAT) of the resulting traffic is taken as the feature vector. Multilevel
decomposition of the IAT signal is performed using a wavelet transform (Haar wavelet) to
extract a unique pattern that represents the device signature. The experiment was
conducted using six different APs and 100,000 packets for each AP. As the amount of
traffic used to fingerprint the device increases, the accuracy of the technique approaches
100%. Their method has both offensive and defensive applications: a network administrator
can use it to detect a rogue access point, and attackers may use it to fingerprint an AP
in order to launch firmware-specific attacks. The accuracy of the technique may decrease
as the number of APs used in the experiment increases. We have evaluated our proposed
method while varying the number of devices from 30 to 300 and did not find any consequent
loss in accuracy. Additionally, they conducted their experiments in a testbed with
emulated traffic, as opposed to the real network traffic used in our approach.
To detect unauthorized access points in a wireless network, Jana and Kasera [8] proposed
a method based on device clock skew. The Time Synchronization Function (TSF) timestamps
of probe response frames and beacon frames are used to estimate the clock skew. Two
different methods, a linear programming approach and least-squares fitting, are used for
estimating the clock skew of the AP. The separation of clock offsets is used to
differentiate between the authorized AP and a fake AP. For the experiment, they used data
collected during the ACM SIGCOMM 2004 conference and wireless traffic traces obtained from
residential Wi-Fi. Although the method is suitable for detecting fake APs, the periodic
clock synchronization among the nodes makes it difficult to identify individual nodes in
an ad hoc wireless network. In contrast, the proposed method can detect individual nodes
with an accuracy of 95%, and clock synchronization, which is a common phenomenon in a
wireless network, has no effect on it.
Neumann et al. [9] analyzed the performance of five different network parameters
(transmission rate, frame size, medium access time, transmission time, and frame
inter-arrival time) by taking each individually as the feature for wireless device
fingerprinting. The frequency distribution of each parameter, calculated using a
histogram, together with its weight, is considered as the device signature. The authors
point out that signal processing methods such as n-dimensional histograms, correlation
functions, or frequency analysis using Fourier or wavelet transforms may be used to
generate a more stable device signature. They performed similarity and identification
tests to evaluate these parameters using data traces from the SIGCOMM 2008 conference as
well as their own testbed traffic measurements. They found that, of these five parameters,
transmission time and frame inter-arrival time perform better than the others, and that
under difficult testing conditions inter-arrival time performs better than transmission
time. They therefore concluded that inter-arrival time is the most promising network
parameter for device fingerprinting. This method may be applicable to detecting fake
devices, localization, and tracking. Forging the signature by mimicking a genuine device,
attacking the learning stage of the algorithm, and denial-of-service attacks on the
fingerprinting/monitoring device are cases that can hamper the functioning of this method.
Finally, the authors suggested combining multiple parameters for device fingerprinting as
a future research direction. As opposed to this method, we use only those features that
are immune to signature forgery, so such attacks have no effect on the proposed method.
The active fingerprinting methodology usually injects packets into the network and
analyzes the responses of the devices to generate a fingerprint. Bratus et al. [10]
proposed an active approach for fingerprinting 802.11 wireless devices. The key idea is to
observe the response of the target device to a series of crafted packets. To generate and
inject non-standard and malformed packets, they developed a tool named BAFFLE. To
fingerprint a wireless device, a series of tests is performed, and their outcomes are
combined in a decision tree structure. Two different processes are used for scanning and
monitoring: the scanning platform (LORCON injection framework) prepares and sends a series
of stimulus frames, and the monitoring platform (libpcap-based, in RFMON mode) sniffs the
responses and converts the frame captures to a format suitable for the tests in the
decision tree. The method is applicable to fingerprinting both access points and client
stations. The results show that the method distinguishes among the devices, but its
performance is uncertain because the experiment was conducted with a very limited number
of devices. As opposed to their approach, which uses only ten devices, our method uses a
set of 300 devices in the experiment, with a detection accuracy of 95%.
Corbett et al. [11] also observed unique traffic patterns due to differences in the rate
switching function. They applied spectral analysis and used frequency-domain features to
generate a unique profile for each device. Using these spectral profiles, the method can
classify the vendor as well as the NICs manufactured by the same vendor. According to
them, 90% of devices perform rate switching. Therefore, rate switching can be considered a
viable attribute of a wireless NIC for distinguishing between NICs. The experiment was
conducted in a hotspot environment using 6 NICs of 3 different makes. The transmission
rate and timestamps of the UDP and TCP traffic sent by the client to the AP were used to
analyze the rate switching of the NICs. Even though the method can distinguish among the
NICs using the spectral profile, the efficacy of the system may decrease as the number of
devices increases. Therefore, we conducted our experiment by adding 30 devices at a time,
from 30 to 300, and found that increasing the number of devices brings no loss in
detection accuracy.
Some methods incorporate both active and passive analysis for device fingerprinting.
Radhakrishnan et al. [12] proposed "GTID", a fingerprinting approach for identifying a
device and its type. To generate the device signature, they took the frequency of
inter-arrival time values falling in 300 equally spaced time bins. A multi-layer
feed-forward artificial neural network with 50 hidden layers was trained using this
signature. A total of 37 different devices were used to evaluate the performance of the
model on both an isolated testbed and a live campus network. The efficiency of the model
was also tested against different attack scenarios. The method can be applied to
authentication, network management, and access control systems. While this method used
only 37 devices, we conducted our experiment with 300 devices. This method used both real
and isolated testbed traffic, whereas the proposed method uses only real network traffic
and gives better accuracy when deployed in real network scenarios.
Xu et al. [13] presented a tutorial on wireless device fingerprinting. They discuss the
variety of features that can be used to generate a device fingerprint. They classify the
features by the protocol layer they are generated from, the traffic traces
(active/passive) from which they are extracted, and their granularity of detection. They
also analyze some of the existing fingerprinting algorithms and conclude that unsupervised
approaches are more practical and have real-life implementations. However, unsupervised
methods have higher computational complexity and cannot pin down the attacking device,
unlike the proposed white-list based approach. Further, they discuss some of the open
issues and possible research directions in wireless device fingerprinting. While their
work is an extensive survey of device fingerprinting, ours presents a novel approach for
fingerprinting wireless device types.
Some recent approaches demonstrate the use of fingerprinting to incorporate security in
the Internet of Things (IoT). Arseni et al. [14] carried out a comparative analysis of
five different IoT platforms (HomeKit, IoTivity, AllJoyn, Sen.se, and Xively) with respect
to security. They also pointed out the possible vulnerabilities present in these
platforms. From the comparison, they observed that the Xively platform has the best
security assurance among them. Further, their article can be taken as a guide for
selecting the most suitable platform according to the needs of the users. Miettinen et al.
[15] presented a technique named IOT SENTINEL to automatically identify device types in
IoT networks and enforce security. An IoT network may contain a variety of IP-connected
devices, and the presence of a vulnerable device can put the entire network at risk. To
overcome this issue, they proposed a technique to secure the network such that a
vulnerable device can coexist without hampering other devices in the network. Their
approach is based on device type identification followed by isolation of the vulnerable
device by controlling its traffic flow. Their method lacks attack analysis, and the
features it uses for device fingerprinting are susceptible to forgery, which is not the
case for the proposed method. The unavailability of software updates is also a significant
issue for their method.
Table 1 summarizes some of the existing fingerprinting methodologies.

3 Overview of the Model

In this section, we introduce the major components of the proposed model used to
fingerprint device types. It has five major components: data pre-processing, feature
extraction, signature generation, similarity measure, and device verification.

3.1 Data Pre-Processing

The method analyzes wireless network traffic sent by client devices for device type
fingerprinting. It deals with the probe request frames that devices send while scanning
for available networks. As network traffic from the devices is collected, the
pre-processing phase filters the probe request frames and extracts the data-link layer
header from each frame. As probe requests are broadcast in nature, it is easy for the
monitoring device to capture these packets for analysis. The probe request frame contains
a total of 53 different fields. The details of the probe request frame format are given in
Fig. 1.

Table 1 Description of the related fingerprinting techniques

| Approach | Detection type | Features | Active/passive | Technique used | Limitation |
|---|---|---|---|---|---|
| Franklin et al. [4] | Device driver | Time delta | Passive | Data binning. Supervised Bayesian classifier | Unsuccessful in distinguishing between different versions of the same driver |
| Corbett et al. [6] | Vendor type | Rate switching | Active | Spectral analysis. DFT | Limited to vendor identification |
| Bratus et al. [10] | Device type, rogue AP | Response to crafted frames | Active | Decision list learning | Small number of devices. Active non-standard malformed frames may induce overhead |
| Desmond et al. [5] | Device | Time delta | Passive | Max variance clustering. Mann–Whitney U-test | Needs much time. Shadowing, interference, channel fading effects, packet losses and delays |
| Corbett et al. [11] | Vendor type | Rate switching | Passive | PSD. Welch method | Very small number of devices |
| Gao et al. [7] | Access point | Packet IAT | Passive | Data binning, wavelet transformation | Isolated testbed traffic. Limited to AP. Small number of devices |
| Jana and Kasera [8] | Fake AP | Clock skew | Passive | Linear programming. Least-squares fit | The same method is unable to identify individual nodes |
| Neumann et al. [9] | Device | Transmission time and IAT | Passive | Histogram and cosine similarity | Unable to defend against a DDoS attack during fingerprinting |
| Radhakrishnan et al. [12] | Device and its type | Timestamps | Both | Data binning, ANN | Have not specified the minimum number of frames required for device identification |
Fig. 1 Probe request frame format of IEEE 802.11b [figure: MAC header with Frame Control
(2), Duration (2), DA (6), SA (6), BSSID (6), Sequence Control (2), Frame Body (variable),
and FCS (4); frame body elements: Timestamp, Beacon Interval, Capability Info, SSID,
Supported Rates, FH Parameter Set, DS Parameter Set, CF Parameter Set, IBSS Parameter Set,
Country Info, Power Constraint, FH Hopping Parameter, FH Pattern Table, Quiet, IBSS DFS,
Channel Switch Announcement, ERP Information, Extended Supported Rates, Robust Security
Network]

The captured wireless traffic is available in .pcap file format. We used tshark, a packet
capturing and analysis tool, to extract the value of each field present in the probe
request packets. The command used to extract the fields and their corresponding values
from a .pcap file and convert them to .csv is as follows:

    tshark -r filename.pcap -T fields -e fieldname -E header=y -E separator=, -E quote=d -E occurrence=f > newfilename.csv

Further, it is found that the traces contain some outliers; for instance, the probe
request frames for some device types are very few. The presence of these instances may
reduce the performance of the proposed method. Therefore, these instances are categorized
as outliers and are removed using the extreme value analysis outlier removal algorithm
given below.

Data: probe-request traffic traces
Result: traffic traces after removing the outliers
Initialize all instances (S1, S2, ..., Sn)
for each instance Si in the trace set do
    Find the U unique devices; count the number of instances C(U).
    for each device in U do
        Find the M unique instances; count C(M).
        for each unique instance Sj in M do
            if C(M) < C(U)/M then
                remove Sj
            end
        end
    end
end
Algorithm 1: Outlier removal algorithm
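
Algorithm 1 leaves some room for interpretation, so the following Python sketch shows only one possible reading of it. It assumes that each frame is represented as a (device type, field pattern) pair, that C(U) is the number of frames seen for a device type, that M is the number of distinct field patterns for that device type, and that a pattern is an outlier when it occurs less often than the average C(U)/M. The function name `remove_outliers` and the data layout are illustrative and do not come from the paper.

```python
from collections import Counter, defaultdict


def remove_outliers(frames):
    """Drop rare frame patterns per device type.

    frames: list of (device_type, pattern) tuples, where pattern is hashable
    (for example a tuple of field values). This layout is an assumption.
    """
    # Group the observed patterns by device type.
    by_device = defaultdict(list)
    for device, pattern in frames:
        by_device[device].append(pattern)

    keep = set()
    for device, patterns in by_device.items():
        counts = Counter(patterns)
        total = len(patterns)    # C(U): frames seen for this device type
        unique = len(counts)     # M: distinct patterns for this device type
        mean_count = total / unique
        for pattern, count in counts.items():
            # Assumed rule: keep patterns at least as frequent as the average.
            if count >= mean_count:
                keep.add((device, pattern))

    # Preserve the original order of the surviving frames.
    return [frame for frame in frames if frame in keep]
```

Under this reading, a device type with a single pattern is never pruned, because its only pattern always meets the average.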

After removing the outliers, the datasets are ready for further processing. The classi-
fication of device types can be done by taking the value of the fields as features.

3.2 Feature Extraction

The objective here is to choose a set of unique, tamper-proof, non-reproducible features
that can be used for device type fingerprinting. The conventional Correlation-based
Feature Selection (CFS) measure evaluates subsets of features on the basis of the
following hypothesis: "Good feature subsets contain features highly correlated with the
classification, yet uncorrelated with each other". The following equation gives the merit
of a feature subset S consisting of k features:
$$\mathrm{Merit}_{S_k} = \frac{k\, r_{cf}}{\sqrt{k + k(k-1)\, r_{ff}}} \tag{1}$$

where $r_{cf}$ is the average value of all feature-classification correlations, and
$r_{ff}$ is the average value of all feature-feature correlations. The CFS criterion is
defined as follows:

$$\mathrm{CFS} = \max_{S_k} \left[ \frac{r_{cf_1} + r_{cf_2} + \cdots + r_{cf_k}}{\sqrt{k + 2\,(r_{f_1 f_2} + \cdots + r_{f_i f_j} + \cdots + r_{f_k f_1})}} \right] \tag{2}$$
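
As a rough illustration of Eq. (1), the sketch below computes the merit of a feature subset from precomputed correlations. The function name `cfs_merit` and its inputs are assumptions made for this example; the paper does not state how the correlations themselves are estimated, and the sketch assumes they are non-negative (for example absolute values), so that the square root is defined.

```python
import math


def cfs_merit(r_cf, r_ff):
    """Merit of a feature subset following Eq. (1).

    r_cf: feature-classification correlations, one per feature.
    r_ff: feature-feature correlations, one per feature pair
          (empty when the subset holds a single feature).
    """
    k = len(r_cf)
    mean_cf = sum(r_cf) / k
    mean_ff = sum(r_ff) / len(r_ff) if r_ff else 0.0
    return k * mean_cf / math.sqrt(k + k * (k - 1) * mean_ff)
```

For two features with class correlations 0.8 and 0.6, the merit is about 0.99 when the features are uncorrelated with each other and drops to 0.7 when they are perfectly correlated, which matches the hypothesis that redundant features lower the merit.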

The $r_{cf_i}$ and $r_{f_i f_j}$ variables are referred to as correlations. The
correlation-based feature selection method yields a set of features. However, we cannot
rely on this feature subset, as some of the features are prone to tampering. Attackers can
forge these features to modify their identity. The integrity and reliability of the
features have to be considered before using them in classification. For instance, the
first three octets of the physical (MAC) address of a Wi-Fi device, known as the
Organizationally Unique Identifier (OUI), are unique and sufficient to classify the device
types. But as the MAC address can be changed, using it as a feature for classification may
give drastically poor performance, as shown in Fig. 2.
Therefore, we have used the correlation and integrity based scoring approach for feature
selection. The flow chart for the feature selection framework using correlation and integrity
is given in Fig. 3.
Fig. 2 ROC curves (GAR vs FAR) for WDTF on normal traffic traces and traffic traces under
MAC spoofing attack [plot: x-axis False Acceptance Rate, y-axis Genuine Acceptance Rate;
curves: Non-Attack, Attack]

Fig. 3 Framework for feature selection using correlation and integrity

The multi-factor analysis is applied to nominal feature values and classes. For each
feature, the correlation and the integrity (i-value) of each feature-value pair of this
feature with respect to the positive and negative classes are calculated. If the
correlation of a feature-value pair with the positive class is greater than the threshold
value 0.5, the feature-value pair is more closely related to the positive class than to
the negative class, and vice versa. For calculating the i-value, certain assertions have
been made, as given in Table 2. The i-value is calculated for all 53 fields present in the
IEEE 802.11 probe request frame.

Table 2 The calculation of the i-value

| Sl. # | Criteria | Example |
|---|---|---|
| 1 | Changes with successive frames | Serial number |
| 2 | Can be modified by the attacker using tools | Source address |
| 3 | Changes with time, location and external interference | Signal strength |
| 4 | Changes with the SSID the device is probing for | Frame length |
| 5 | Remains the same throughout the capture and cannot be forged | Management capabilities |
There are situations in which the correlation value is very close to one but the i-value
is close to zero. Such a scenario can occur when the feature changes with time or is prone
to tampering. After the correlation and integrity information of each feature-value pair
with the device class is obtained in the multi-factor calculation stage (represented by
the correlation and i-values, respectively), both values are combined to calculate the
score metric. Finally, after a score is obtained for each feature, a ranked list is
generated according to these scores, and different stopping criteria can then be adopted
to generate a subset of features. Afterward, the signature for each device class can be
generated based on the selected features.
Using the correlation and integrity based feature selection method, we obtained a total of
19 fields out of the 53 that can be considered for generating the device signature. The
feature subset includes:

[wlan_mgt.tag.number, wlan_mgt.tag.length, wlan_mgt.ssid, wlan_mgt.supported_rates,
wlan_mgt.extended_supported_rates, wlan_mgt.ht.capabilities, wlan_mgt.ht.ampduparam,
wlan_mgt.ht.mcsset, wlan_mgt.htex.capabilities, wlan_mgt.txbf, wlan_mgt.asel,
wlan_mgt.tag.oui, wlan_mgt.tag.vendor.oui.type, wlan_mgt.vs.pren.type,
wlan_mgt.vs.ht.capabilities, wlan_mgt.vs.ht.ampduparam, wlan_mgt.vs.ht.mcsset,
wlan_mgt.vs.htex.capabilities, wlan_mgt.vs.txbf, wlan_mgt.vs.asel]

3.3 Signature Generation

The signature generation process uses statistical analysis to reveal the patterns unique
to the device type. We take the mean, standard deviation, and energy of the feature subset
as the device signature, using the following formulas.

$$\text{Mean}\quad \mu = \frac{1}{n}\sum_{i=1}^{n} x_i \tag{3}$$

$$\text{Standard Deviation}\quad \sigma = \sqrt{\frac{1}{n}\sum_{i=1}^{n} (x_i - \mu)^2} \tag{4}$$

$$\text{Energy}\quad E = \frac{1}{n}\sum_{i=1}^{n} |x_i|^2 \tag{5}$$

The vector $[\mu, \sigma, E]$ is the signature vector that represents a device type. These
signatures are used to generate the profile for each device type. This produces the master
signature: [Deviceid, Signaturevector].
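
Eqs. (3) to (5) can be sketched in a few lines of Python. The sketch assumes that the selected field values have already been converted to a non-empty list of numbers; the paper does not describe that conversion, and the function name `signature` is illustrative.

```python
import math


def signature(values):
    """Return the (mean, standard deviation, energy) signature of Eqs. (3)-(5).

    values: non-empty list of numeric feature values (assumed already numeric).
    """
    n = len(values)
    mean = sum(values) / n
    # Population standard deviation (divides by n, as in Eq. 4).
    std = math.sqrt(sum((x - mean) ** 2 for x in values) / n)
    energy = sum(abs(x) ** 2 for x in values) / n
    return (mean, std, energy)
```

For the values [2, 4, 4, 4, 5, 5, 7, 9] the mean is 5, the standard deviation is 2, and the energy is 29, which is consistent with the identity E = σ² + μ².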

3.4 Similarity Measure

To measure the similarity among the device types, we have to choose among various distance
and similarity measures. The selection has to be made on the basis of their performance.
Therefore, we used a set of measures to evaluate the similarity among the device types.
The following measures have been used in this approach.

Canberra Distance:

$$d(x, y) = \sum_{i=1}^{n} \frac{|x_i - y_i|}{|x_i| + |y_i|} \tag{6}$$

City Block/Taxicab Geometry/Manhattan Distance:

$$d(x, y) = \sum_{i=1}^{n} |x_i - y_i| \tag{7}$$

Correlation Coefficient:

$$r(x, y) = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2 \, \sum_{i=1}^{n} (y_i - \bar{y})^2}} \tag{8}$$

Cosine Similarity:

$$c(x, y) = \frac{\sum_{i=1}^{n} x_i y_i}{\sqrt{\sum_{i=1}^{n} x_i^2}\, \sqrt{\sum_{i=1}^{n} y_i^2}} \tag{9}$$

Normalized Euclidean Distance:

$$d(x, y) = \frac{1}{2}\, \frac{\mathrm{norm}\big((x - \bar{x}) - (y - \bar{y})\big)^2}{\mathrm{norm}(x - \bar{x})^2 + \mathrm{norm}(y - \bar{y})^2}, \quad \text{where } \mathrm{norm}(x) = \sqrt{\sum_{i=1}^{n} |x_i|^2} \tag{10}$$

Generalized Jaccard Distance:

$$d_j(x, y) = 1 - j(x, y), \quad \text{where } j(x, y) = \frac{\sum_{i=1}^{n} \min(x_i, y_i)}{\sum_{i=1}^{n} \max(x_i, y_i)} \tag{11}$$

The performance metrics are evaluated for each measure to find the most suitable one. From
the results given in Sect. 4.3, it is found that the normalized Euclidean distance gives
better performance than the rest. The normalized Euclidean distance yields a closeness
value between 0 and 1, where 0 denotes a perfect match and 1 denotes a complete mismatch.
Each device from the test set (essentially the unknown signatures) is compared with the
master signatures using the closeness value.
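
The measures above can be sketched in plain Python as follows. This is an illustrative implementation, not the authors' code: the function names are assumptions, Canberra terms with a zero denominator are skipped (a common convention), the generalized Jaccard distance assumes non-negative inputs with a non-zero sum, and the normalized Euclidean distance returns 0 when both centered vectors are zero.

```python
import math


def _centered(v):
    """Subtract the mean of v from every component."""
    mean = sum(v) / len(v)
    return [a - mean for a in v]


def canberra(x, y):
    # Terms where |x_i| + |y_i| == 0 are skipped (assumed convention).
    return sum(abs(a - b) / (abs(a) + abs(b))
               for a, b in zip(x, y) if abs(a) + abs(b) > 0)


def manhattan(x, y):
    return sum(abs(a - b) for a, b in zip(x, y))


def cosine_similarity(x, y):
    # Undefined for an all-zero vector; callers are assumed to avoid that.
    dot = sum(a * b for a, b in zip(x, y))
    return dot / (math.sqrt(sum(a * a for a in x)) *
                  math.sqrt(sum(b * b for b in y)))


def jaccard_distance(x, y):
    # Generalized Jaccard distance; assumes non-negative components.
    return 1.0 - (sum(min(a, b) for a, b in zip(x, y)) /
                  sum(max(a, b) for a, b in zip(x, y)))


def normalized_euclidean(x, y):
    """Eq. (10): 0 for a perfect match, 1 for a complete mismatch."""
    cx, cy = _centered(x), _centered(y)
    num = sum((a - b) ** 2 for a, b in zip(cx, cy))
    den = sum(a * a for a in cx) + sum(b * b for b in cy)
    return 0.5 * num / den if den else 0.0
```

For example, `normalized_euclidean([1, 2, 3], [1, 2, 3])` gives 0, while the reversed vector `[3, 2, 1]` gives 1, the two ends of the closeness range described above.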

3.5 Verification of the Device

– In the enrollment phase, for a given set of known devices $D_K = \{d_1, d_2, d_3, \ldots, d_n\}$,
where n is the total number of device types, extract the feature tuple $F_i$ for each
device type $d_i \in D_K$.
– Similarly, during verification, for all unknown devices $D_U = \{d_1, d_2, d_3, \ldots, d_m\}$,
where m is the total number of device types, extract the feature tuple $F_j$ for each
device type $d_j \in D_U$.
– For each unknown device type $d_x$ that claims to be $d_i$ ($d_x$ having the same OUI as
$d_i$), measure the closeness of $d_x$ to $d_i$.
– Confirm the device type to be $d_i$ if it has a matching score higher than the optimal
threshold; otherwise mark $d_x$ as an unregistered device type.
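
The verification steps above can be sketched as follows. Several details are assumptions made for illustration only: the master signatures are stored in a dictionary keyed by OUI, the matching score is taken as one minus the normalized Euclidean distance (so that a higher score means a closer match), and the threshold is supplied by the caller. The names `verify` and `master` are hypothetical.

```python
def normalized_euclidean(x, y):
    """Normalized Euclidean distance of Eq. (10), in the range 0 to 1."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cx = [a - mx for a in x]
    cy = [b - my for b in y]
    num = sum((a - b) ** 2 for a, b in zip(cx, cy))
    den = sum(a * a for a in cx) + sum(b * b for b in cy)
    return 0.5 * num / den if den else 0.0


def verify(claimed_oui, observed_signature, master, threshold):
    """Return True if the observed signature matches the enrolled device type.

    master: dict mapping an OUI string to its enrolled signature vector
            (an assumed storage layout).
    threshold: matching-score threshold chosen by the caller.
    """
    enrolled = master.get(claimed_oui)
    if enrolled is None:
        # No enrolled device type with this OUI: treat as unregistered.
        return False
    # Assumed matching score: 1 - distance, so higher means more similar.
    score = 1.0 - normalized_euclidean(observed_signature, enrolled)
    return score > threshold
```

A caller would enroll one signature per device type, then call `verify` for each newly observed device and treat a False result as an unregistered device type.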

4 Performance Evaluation

4.1 Datasets Used

We evaluated this methodology using the traffic traces from the Sapienza/probe-requests
dataset (v. 2013-09-10) [16] from CRAWDAD (A Community Resource for Archiving Wireless
Data At Dartmouth) and from our experimental setup in the ISRL (Information Security
Research Laboratory) at NIT Rourkela. The Sapienza/probe-requests dataset contains probe
request frames collected passively from different wireless devices using commodity
hardware on a university campus as well as at city-wide, national, and international
events in Italy. The release contains anonymized traces in .pcap format.
The Sapienza/probe-requests dataset contains eight trace sets. The data was collected
using tcpdump with a filter that selects only the probe requests. Each trace set contains
only probe requests from heterogeneous devices. The pcap files from the dataset are
analyzed using tshark to retrieve the values of all fields from the probe request frames.
The experimental setup dataset consists of probe request frames captured passively with an
ALFA Network adapter (AWUS036NH long-range adapter) for nine days on real network traffic
during peak hours.

4.2 Performance Measures Used

Genuine Acceptance Rate (GAR), False Acceptance Rate (FAR), False Rejection Rate (FRR),
the Receiver Operating Characteristic (ROC) curve, the Area Under the ROC Curve (AUC),
Accuracy, and Equal Error Rate (EER) are used as the performance measures to evaluate the
efficiency of the proposed model.
– Receiver Operating Characteristic (ROC) The ROC curve depicts the dependence of GAR
(Genuine Acceptance Rate) on FAR (False Acceptance Rate) as the value of the threshold
changes. The curve is plotted using linear, logarithmic or semi-logarithmic scales.
The genuine acceptance rate is the measure of correctly identifying an authorized device
and is statistically represented as:
$$\mathrm{GAR} = \frac{TP}{TP + FN} \tag{12}$$
The false acceptance rate is the measure of identifying an unauthorized device as
authorized and is statistically represented as:
$$\mathrm{FAR} = \frac{FP}{FP + TN} \tag{13}$$
– False Rejection Rate (FRR) FRR is the measure of incorrectly rejecting authorized
devices and is statistically represented as:
$$\mathrm{FRR} = \frac{FN}{TP + FN} \tag{14}$$
where TP, TN, FP and FN refer to True Positive, True Negative, False Positive and False
Negative, respectively.
– Area Under Curve (AUC) AUC is the percentage of the area covered under the ROC curve.
The larger the area, the higher the accuracy of the system. In the ideal case, for a
system with 100% accuracy, GAR = 1 (that is, FRR = 0) at FAR = 0, giving AUC = 100%.
– Accuracy Accuracy measures the overall performance of the system and is defined as:
$$\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \tag{15}$$

– Equal Error Rate (EER) The EER is the error rate at the point on the ROC curve where
the FAR equals the FRR. A lower EER value indicates better performance.
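The measures above can be computed directly from confusion counts and from lists of matching scores. The sketch below is illustrative only: the function names, the accept rule (score >= threshold), the trapezoidal AUC and the nearest-sample EER approximation are assumptions, not the evaluation code used in the paper.

```python
def confusion_rates(tp, tn, fp, fn):
    """Return GAR, FAR, FRR and accuracy following Eqs. (12)-(15)."""
    gar = tp / (tp + fn)
    far = fp / (fp + tn)
    frr = fn / (tp + fn)
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    return gar, far, frr, accuracy


def roc_points(genuine_scores, impostor_scores):
    """Sweep the threshold over all observed scores; accept when score >= threshold."""
    points = []
    for t in sorted(set(genuine_scores) | set(impostor_scores)):
        gar = sum(s >= t for s in genuine_scores) / len(genuine_scores)
        far = sum(s >= t for s in impostor_scores) / len(impostor_scores)
        points.append((far, gar))
    return points


def area_under_curve(points):
    """Trapezoidal area under the (FAR, GAR) points, anchored at (0, 0) and (1, 1)."""
    curve = sorted(set(points) | {(0.0, 0.0), (1.0, 1.0)})
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(curve, curve[1:]))


def equal_error_rate(points):
    """Approximate the EER at the sampled point where FAR and FRR = 1 - GAR are closest."""
    far, gar = min(points, key=lambda p: abs(p[0] - (1 - p[1])))
    return (far + (1 - gar)) / 2


# Made-up scores: genuine and impostor devices are perfectly separated.
pts = roc_points([0.9, 0.8], [0.1, 0.2])
print(confusion_rates(tp=90, tn=80, fp=20, fn=10))
print(area_under_curve(pts), equal_error_rate(pts))
```

Since FRR = 1 - GAR by Eqs. (12) and (14), the EER can be read off the same (FAR, GAR) points used for the ROC curve.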


4.3 Results and Discussion

The similarity score between the observed signature and the stored signature is used to
identify the device type, so the similarity measure has to be selected empirically. In
this work, we have used a set of six similarity measures: Canberra, city block,
correlation, cosine, normalized Euclidean and Jaccard. To select the most suitable one,
we have evaluated the performance of the model using each of these measures. The ROC
curves (GAR vs FAR) for the similarity measures are shown in Fig. 4. Further, the area
under the curve and the equal error rate are calculated for each similarity measure and
presented in Figs. 5 and 6, respectively. From these results, it is found that the
normalized Euclidean distance is the best choice for measuring the closeness among the
devices.

Fig. 4 ROC curves (GAR vs FAR) using several similarity measurement techniques

Fig. 5 Area under the curve using several similarity measurement techniques (y-axis:
Area Under the Curve, 0.9 to 0.99; x-axis: Canberra, Cityblock, Correlation, Cosine,
Norm Euclidean, Jaccard)

Fig. 6 Equal error rate using several similarity measurement techniques (y-axis: Equal
Error Rate, 0 to 0.16; x-axis: Canberra, Cityblock, Correlation, Cosine, Norm Euclidean,
Jaccard)
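To make the comparison of distance measures concrete, the sketch below implements four of them in plain Python using their common textbook forms. These forms, and in particular the per-feature variance scaling used for the normalized Euclidean distance, are assumptions; the definitions the authors used are given earlier in the paper and may differ.

```python
import math


def cityblock(u, v):
    # Sum of absolute coordinate differences (Manhattan distance).
    return sum(abs(a - b) for a, b in zip(u, v))


def canberra(u, v):
    # Each term is scaled by the magnitudes involved; 0/0 terms count as 0.
    total = 0.0
    for a, b in zip(u, v):
        denom = abs(a) + abs(b)
        if denom:
            total += abs(a - b) / denom
    return total


def cosine_distance(u, v):
    # One minus the cosine of the angle between the two vectors.
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def normalized_euclidean(u, v, variances):
    # Euclidean distance with each squared difference divided by the
    # variance of that feature (assumed to be estimated from training data).
    return math.sqrt(sum((a - b) ** 2 / var
                         for a, b, var in zip(u, v, variances)))


# Toy signatures, for illustration only.
u, v = [1.0, 2.0, 3.0], [2.0, 4.0, 6.0]
print(cityblock(u, v), canberra(u, v), cosine_distance(u, v))
print(normalized_euclidean([0.0, 0.0], [2.0, 3.0], variances=[4.0, 9.0]))
```

Scaling by the variance keeps a field with a large numeric range from dominating the distance, which is one plausible reason a normalized measure would suit signatures built from heterogeneous frame fields.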
Further, we consider the removal of outliers from the data. An outlier is a data point
that is far outside the norm for a variable or population. The presence of outliers can
lead to inflated error rates and substantial distortions of parameter and statistic
estimates when using either parametric or nonparametric tests [17]. The outliers in the
data are removed using an extreme value analysis outlier removal algorithm, and the
performance analysis is repeated to show the importance of outlier removal. Figure 7
shows the comparison of the ROC curves for data with and without outliers. From the
result, it is observed that the outlier removal technique significantly improves the
performance of the proposed method.
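The text does not spell out the outlier removal rule. As one common form of extreme value analysis, the sketch below drops values outside the interquartile-range fences; the choice of this rule, the multiplier k = 1.5 and the function name are assumptions made for illustration.

```python
import statistics


def remove_outliers_iqr(values, k=1.5):
    """Keep only values within [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if low <= v <= high]


# Made-up observations of a single field; 100 is far outside the norm.
samples = [10, 11, 12, 11, 10, 12, 11, 100]
print(remove_outliers_iqr(samples))
```

A quantile-based rule is used here because the extreme value itself does not shift the quartiles much, whereas it would inflate a mean and standard deviation.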
The existing approaches for wireless device fingerprinting have been experimentally
evaluated on publicly available datasets and isolated testbeds, but the number of devices
used in those experiments is limited, and the performance of these methods degrades as
the number of devices increases. Therefore, we evaluate the proposed model while
increasing the number of devices that participate in the device fingerprinting process.
To do so, we first evaluated the method with 30 devices and then increased the number of
devices in steps up to 300, calculating the performance at each step. Figure 8 shows the
variation in the area under the curve with respect to the number of devices. From the
result, it is observed that increasing the number of devices has no major impact on the
performance of the proposed method.
We have also compared the proposed method with the existing approaches for device type
fingerprinting. As given in Table 3, the proposed method achieves a better accuracy rate
than its counterparts while using a larger number of devices.

Fig. 7 Performance analysis of WDTF for identification of wireless device types. a ROC
curve for Sapienza/probe-requests dataset. EER = 12.32%. b ROC curve for experimental
setup. EER = 11.11% (axes: False Acceptance Rate and Genuine Acceptance Rate, each from 0
to 1; curves: With Outlier, Without Outlier)

Fig. 8 AUC versus number of devices for WDTF of data instance with and without outliers
(x-axis: Number of Devices, 30 to 300 in steps of 30; y-axis: Area Under the Curve, 0.8
to 1; curves: With Outlier, Without Outlier)

Table 3 Comparison of the proposed method with existing approaches

Approach                    Detection type         Devices used   Traffic type       Accuracy (%)
Franklin et al. [4]         Device driver          17             Real               77–96
Corbett et al. [6]          Vendor type            6              Real               90–100
Bratus et al. [10]          Device type, rogue AP  10             Real               Unknown
Corbett et al. [11]         Vendor type            6              Real               75–100
Radhakrishnan et al. [12]   Device and its type    37             Testbed and Real   83–95
Proposed approach WDTF      Device type            300            Real               95

5 Conclusion

In this work, we have explored the use of probe request frame field values measured in
various traffic types to identify wireless device types. We developed a methodology
that benefits from the homogeneity in the management capabilities of device types in a
wireless local area network. We evaluated this methodology using the traffic traces from
the Sapienza/probe-requests dataset (v.2013-09-10) by CRAWDAD and our experimental
setup. We showed that our method has a fair detection rate and is evaluated on a larger
number of devices than the existing fingerprinting techniques. We also discussed and
quantified the performance of our work on real-world network traffic captured in
different scenarios. Our results show that the use of probe request frame fields is an
efficient and robust method for identifying wireless device types. As part of future
work, we want to integrate this work into a wireless intrusion detection system to defend
against masquerading attacks. Our solution addresses the problem of identifying wireless
device types; in future work, we will devise a method that can uniquely identify
individual devices in the wireless local area network.

References
1. Sklavos, N., & Koufopavlou, O. (2003). Mobile communications world: security implementations
aspects-a state of the art. CSJM Journal, Institute of Mathematics and Computer Science, 11(2), 32.
2. Hsieh, W.-C., Chiu, Y.-H., Lo, C.-C. (2005). An interference-based prevention mechanism against WEP
attack for 802.11b network. In Network Control and Engineering for QoS, Security and Mobility, III,
pp. 127–138. Springer.
3. Mavridis, I.P., Androulakis, A.-I.E., Halkias, A.B., Mylonas, Ph. (2011). Real-life paradigms of wireless
network security attacks. In 15th Panhellenic Conference on Informatics (PCI), 2011, pp. 112–116.
IEEE.
4. Franklin, J., McCoy, D., Tabriz, P., Neagoe, V., Randwyk, J.V., Sicker, D. (2006). Passive data link
layer 802.11 wireless device driver fingerprinting. In Usenix Security, vol. 6.
5. Desmond, L.C.C., Yuan, C.C., Pheng, T.C., Lee, R.S. (2008). Identifying unique devices through
wireless fingerprinting. In Proceedings of the first ACM Conference on Wireless Network Security,
pp. 46–55. ACM.

6. Corbett, C.L., Beyah, R., Copeland, J., et al. (2006). Using active scanning to identify wireless NICs. In
Information Assurance Workshop, 2006 IEEE, pp. 239–246. IEEE.
7. Gao, K., Corbett, C., Beyah, R. (2010). A passive approach to wireless device fingerprinting. In
International Conference on Dependable Systems and Networks (DSN), 2010 IEEE/IFIP, pp. 383–392.
IEEE.
8. Jana, S., & Kasera, K. (2010). On fast and accurate detection of unauthorized wireless access points
using clock skews. IEEE Transactions on Mobile Computing, 9(3), 449–462.
9. Neumann, C., Heen, O., Onno, S. (2012). An empirical study of passive 802.11 device fingerprinting. In
32nd International Conference on Distributed Computing Systems Workshops (ICDCSW), 2012,
pp. 593–602. IEEE.
10. Bratus, S., Cornelius, C., Kotz, D., Peebles, D. (2008). Active behavioral fingerprinting of wireless
devices. In Proceedings of the first ACM Conference on Wireless Network Security, pp. 56–61. ACM.
11. Corbett, C. L., Beyah, R. A., & Copeland, J. A. (2008). Passive classification of wireless NICs during
active scanning. International Journal of Information Security, 7(5), 335–348.
12. Radhakrishnan, S.V., Beyah, R., et al. (2014). GTID: A technique for physical device and device type
fingerprinting. IEEE Transactions on Dependable and Secure Computing.
13. Qiang, X., Zheng, R., Saad, W., & Han, Z. (2016). Device fingerprinting in wireless networks: Chal-
lenges and opportunities. IEEE Communications Surveys & Tutorials, 18(1), 94–104.
14. Arseni, S.-C., Halunga, S., Fratu, O., Vulpe, A., Suciu, G. (2015). Analysis of the security solutions
implemented in current internet of things platforms. In Grid, Cloud & High Performance Computing in
Science (ROLCG), 2015 Conference, pp. 1–4. IEEE.
15. Miettinen, M., Marchal, S., Hafeez, I., Asokan, N., Sadeghi, A.-R., Tarkoma, S. (2016). IoT Sentinel:
Automated device-type identification for security enforcement in IoT. arXiv preprint arXiv:1611.04880.
16. Barbera, M. V., Epasto, A., Mei, A., Kosta, S., Perta, V. C., & Stefa, J. (2013). CRAWDAD dataset
sapienza/probe-requests (v. 2013-09-10). http://crawdad.org/sapienza/probe-requests/20130910.
17. Osborne, J. W., & Overbay, A. (2004). The power of outliers (and why researchers should always check
for them). Practical Assessment, Research & Evaluation, 9(6), 1–12.

Asish Kumar Dalai received his B.Tech. in Computer Science and
Engineering from Gandhi Institute of Engineering and Technology,
Odisha, India in 2005 and M.Tech. from the Department of Computer
Science and Engineering, National Institute of Technology Rourkela in
2013, where he is currently working toward his Ph.D. on intrusion
detection in wireless networks. His research interests include web
security, intrusion detection systems and network security.


Sanjay Kumar Jena received his M.Tech. in computer science and
engineering from the Indian Institute of Technology Kharagpur in
1982 and Ph.D. from the Indian Institute of Technology Bombay in
1990. He is a full-time professor in the Department of Computer Science
and Engineering, National Institute of Technology Rourkela. He
is a senior member of IEEE and ACM and a life member of IE(I),
ISTE and CSI. His research interests include data engineering, information
security, parallel computing and privacy preserving techniques.
