
Computer Security Symposium 2011

19-21 October 2011

How to Setup Online Phishing Experiments:


Lessons from Previous Studies

Yunsang Oh Takashi Obi

Interdisciplinary Graduate School of Science and Engineering, Tokyo Institute of Technology


4259-G2-2 Nagatsuta, Midori-ku, Yokohama 226-8502, JAPAN
oh.y.ab@m.titech.ac.jp, obi@ip.titech.ac.jp

Abstract Phishing threatens to topple society’s stability because it erodes trust in the underlying infrastructure. Based on this insight, researchers are attempting to quantify how people fall for deceit. However, in-lab studies face ecological and external validity issues, so researchers have turned to deceit-based field studies of users that are conducted without prior consent. Unfortunately, such studies can expose researchers to ethical risks, since field studies usually mimic real phishing. Here, we survey how researchers managed these risks in previous studies, and then propose a recommended experiment design method for ethical and valid phishing studies.

1 Introduction

Phishing is a way of attempting to acquire sensitive information such as usernames, passwords and credit card details by masquerading as a trustworthy entity in an electronic communication. This attack is typically carried out by e-mail spoofing or instant messaging, and it directs users to enter details at a fake website whose look and feel are almost identical to the legitimate one. A growing number of phishing researchers therefore share the insight that this deceit-based attack threatens to topple the information society’s stability because it erodes trust in the underlying infrastructure.

Researchers agree that phishing must be fought, but to do so effectively, they must fully understand the threat; that starts by quantifying how and when people fall for deceit. A number of studies have been done previously to gain insights into users’ behavioral response to phishing. However, these and other recent phishing-related laboratory studies are not readily generalizable to a larger population, and their authors were able to find few, if any, correlations between demographics, personal characteristics, and behaviors relevant to the studies. While researchers in other areas of computer science are able to rely on controlled, in-lab studies of solicited users, this approach is not as favored by those studying phishing.

For this reason, phishing researchers have recently favored deceit-based field studies of users that are conducted without their knowledge or prior consent. In general, these field studies mimic real phishing attacks to make the experiments effective. However, this burdens the experiment, since individual participants can feel that the experiment is unethical and that they did not consent to it. Even worse, researchers can be exposed to legal risks.

In order to overcome the ethical and legal setbacks for valid phishing studies, we need to investigate how previous works managed these issues. We found some meaningful discussions about how to design phishing experiments while reducing ethical and legal risks. Some researchers have already conducted field experiments while clarifying how they managed the risks. In this context, this paper contributes in the following areas. Firstly, we survey what previous field studies mention about designing a field experiment (see section 3). Secondly, we present our study experiences designing a mimic of a real phishing attack (see section 4). Lastly, we propose best
practices for experiment setup considering the surveyed issues. For this, we focus more on the situation in Asian countries such as Japan and South Korea, since all of the previous works were conducted and discussed in the context of the United States only (see section 5).

Note that, although this paper presents best practices to help researchers reduce ethical or legal troubles, this information must not be taken as legal advice.

2 Phishing Study Methods

Thus far, a variety of tools and techniques have advanced the fight against phishing. However, phishing detection remains an arms race. All defending tools and techniques stand to benefit from knowledge of users’ behavioral response to phishing. For that, researchers have used a variety of methods - “Survey”, “Lab Studies”, and “Field Studies” - in user studies designed to gain insights into the issues.

Survey has widely been conducted to understand users’ mental models and decision processes when they face phishing. However, surveys tend to underestimate damages; many victims are unaware that an attack occurred or are unwilling to disclose that they fell for it.

Lab Study is generally conducted through role playing in order to test users’ susceptibility to phishing attacks and evaluate the effectiveness of anti-phishing toolbars and training materials. Generally, participants play a fictitious role and use personal information associated with that role. Lab studies are very helpful in understanding user behavior in a given situation. However, this study method has tradeoffs and faces validity challenges: most of these studies are challenged with ecological (whether the methods, materials, and settings are similar to real life) and external (whether the results are generalizable) validity issues. It overestimates attack awareness because of expectancy bias - the mere knowledge of the study’s existence biases its likely outcome.

Field Study can be used to evaluate participants’ susceptibility to phishing with higher confidence than “lab studies”, but not to evaluate the effectiveness of training or anti-phishing tools. Despite ethical and legal issues, this method is the most effective for understanding users’ precise behavioral responses to phishing attempts and for finding a previously unidentified threat model.

3 Previous Works Survey

In section 2, we introduced three phishing experimental methods. In practice, surveys and lab studies can be set up without ethical and legal issues. However, in order to understand the various aspects of users’ susceptibility to phishing attacks precisely, field studies are necessary to defeat phishing. In this section, we introduce how previous studies managed real field phishing experiments. Referring to previous works helps in proposing best practices for designing field experiments.

3.1 Participants’ Responses

Jagatic et al. [1] conducted a field study to identify a new phishing threat model in which attackers exploit publicly available personal information on social networks. With a real phishing experiment, this work succeeded in identifying the threat model named “Social Phishing”. Specifically, this work was the first to collect the voices of experiment participants after they understood that they had participated in the experiment without their consent. Participants’ opinions were collected through the research blog.

Despite this meaningful achievement, the work failed to resolve all ethical issues. The researchers conducted a real phishing attack by sending an email to some students at a university, requesting them to visit a URL embedded in the message. The email was delivered from the researchers to the recipients under the name of a sender taken from social networks. There were not many complaints about the experiment, but some participants complained that it was unethical, inappropriate, illegal, unprofessional, fraudulent, self-serving, and/or useless. They also called for the researchers to be fired, prosecuted, expelled, or reprimanded.

3.2 Attack Timing

In the above work [1], the attack timing may be one of the reasons for the harsh reactions of some participants. The researchers guessed that the participants’ reaction was correlated with the attack timing. The experiment was carried out near the end of the semester. This may have intensified the stress felt by some students. Ferguson [2] also pointed out the importance of attack timing. He conducted a real phishing experiment targeting students at a university. The phishing message included an embedded link to the phishing site and content about problems with the student’s grade report. This experiment was conducted three weeks before the end of the semester. Emails regarding exams and grades can get students’ attention as the semester draws to a close.

3.3 Experiments for Organizations

Ferguson’s work [2] was conducted to evaluate the effectiveness of security education on campus. In practice, organizations such as universities or companies can set up such a real phishing experiment to investigate the effectiveness of security education for students or employees. In addition, maintaining a robust network against phishing threats is an attractive goal for network administrators. Therefore, mimicking a real phishing attack can be an appropriate approach to measure the knowledge and reaction of an organization’s members to phishing attacks. The result can be referred to when creating an internal security policy or planning further education. For example, the experiment by Kumaraguru et al. [3] was conducted in a Portuguese company, because the company was primarily interested in studying the vulnerability of its employees to phishing emails.

3.4 Legal Risks

To the best of our knowledge, Soghoian’s work [4] was the first contribution to propose best practices for designing field phishing experiments with consideration of legal issues. In Table 1, we summarize Soghoian’s best practices that may help researchers avoid running afoul of the law.

Table 1: Guideline for Phishing Experiment Design (Proposed in [4])

Avoid California
Researchers who are located within the state of California should probably not engage in phishing field experiments. Researchers should also avoid experiments targeting people located in the state. The law permits the Attorney General to bring a civil action against phishing email senders.

Terms of Service and User Accounts
Researchers should avoid the automated scraping of sites that require users to first create and log in to an account. Usually, users must agree to the terms of service in order to create the account.

Application Programming Interfaces
If researchers require the automated collection of data, they should use the site’s public Application Programming Interface (API).

Moderation
When scraping a website, researchers should limit the number of requests in order to keep an upper limit on the bandwidth used.

Discretion
Companies’ reputations must be protected. Their names should be hidden through obfuscation or vagueness.

Use Caches
It is better to scrape website contents cached by a third party than to scrape the site directly.

Commerce
Researchers must avoid any possible personal profit from the phishing experiment. Remove advertising banners from the research website or blog.

4 Our Study Experiences

Last year (2010), we studied a threat model about how a government’s Web services, such as e-Government, can be hijacked by phishing attackers [5]. Our goal was to identify some characteristics of government services and to investigate how phishers can effectively exploit those characteristics. In order to clarify this previously unknown threat model, we needed to assess the new attack. So we adopted the approach of performing experiments that mimic real phishing attacks, thereby measuring the actual success rates by making sure that subjects cannot distinguish our study from reality.

In practice, it is not simple to resolve all potential legal and ethical issues in mimicking a real attack. While it is unlikely that suits would be brought, the legal risks definitely existed. Firstly, researchers can be accused by subjects of sending phishing e-mails. In South Korea, the “Act on Promotion of Information and Communication Network Utilization and Information Protection” prohibits harvesting personal data by deceiving victims or enticing them to provide personal data. In the case of Japan, the law does not clearly prohibit the phishing attack itself, but strictly restricts spam messages, which can be exploited for phishing. Taking sensitive personal data such as passwords or medical information is also strictly controlled. Copyright law can also be applied to prohibit the duplication of a website. Secondly, even if no one is harmed or can claim to have been adversely affected, the law still permits a third party to accuse researchers over phishing experiments. This varies depending on location and country. As mentioned in Table 1, even within the United States, Californian law, unlike that of other states, permits the Attorney General to bring a civil action against the sender of a phishing email. Therefore, it is not recommended for researchers located in California to conduct real phishing experiments.

Reducing the potential ethical and legal troubles can also be achieved technically. Taking these various aspects into consideration, we implemented the experiment as follows.

• Minimize the time between when the experiment subjects become aware of the phishing and our contact. If our contact is delayed, subjects may complain to the institutions that handle cyber crimes. So we linked PHP scripts to monitor subjects’ behaviors such as button clicks, which let us identify the right time to contact each subject promptly.

• Make the phishing website invisible to everyone except the experiment subjects. This helps avoid complaints and troubles from third parties. For that, we embedded an invisible random number in each phishing email, and our phishing website opened only when the embedded random number was authenticated.

• Delete the phishing site immediately when the experiment is over. Remaining phishing websites can raise legal troubles whenever they are found by cybercrime defenders.

• Avoid harvesting sensitive personal information by phishing. In the phishing site, we requested only quite simple personal information, not sensitive information such as passwords or credit card numbers.

These technical implementations cannot fully guarantee a legally safe experiment, but can be a good complement to minimize the risks.
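The random-number gating and behavior monitoring described in Section 4 can be sketched as follows. This is a minimal illustration in Python under stated assumptions, not the PHP implementation used in the study: the class StudyGate, its method names, and the participant label P001 are all hypothetical. It assumes one random token is issued per participant, the study page is opened only for a known token, and only an action name with a timestamp is logged, never the contents of any form field.

```python
import secrets
import time


class StudyGate:
    """Hypothetical helper: one random token per participant gates the study page."""

    def __init__(self):
        self._tokens = {}  # token -> pseudonymous participant label
        self.events = []   # (timestamp, label, action); no form contents are kept

    def issue(self, label: str) -> str:
        """Create the random value to embed in one participant's message."""
        token = secrets.token_urlsafe(16)
        self._tokens[token] = label
        return token

    def admit(self, token: str) -> bool:
        """Open the page only when the embedded random value is known."""
        return token in self._tokens

    def record(self, token: str, action: str) -> bool:
        """Log a behavior such as a button click so contact can follow quickly."""
        label = self._tokens.get(token)
        if label is None:
            return False
        self.events.append((time.time(), label, action))
        return True

    def close(self) -> None:
        """End of study: revoke all tokens so the page can no longer be opened."""
        self._tokens.clear()
```

Keeping the token table in memory and clearing it in close() is only one way to model the "delete the site when the experiment is over" step; a real deployment would also have to remove the hosted pages and stored records.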
5 Our Best Practices

In the previous sections, we summarized previous field phishing studies worth referring to, along with our own work. In this section, we introduce the experimental flow and the ethical considerations at each stage as best practices.

Decide phisher’s strategies. In our case, we formed a hypothesis about which characteristics of the target service affect users’ trust decisions, and investigated how users are deceived by strategies exploiting those characteristics.

Implement phishing system. Set up the infrastructure for an experiment that mimics a real phishing attack - a Web server for the phishing site and a database to record user behaviors. The infrastructure should be designed carefully so that no private information of experiment participants remains on it.

Secure the implemented system. Secure the experimental system and database. Some participants who notice the experiment may retaliate against the experimental system by modifying or removing collected data. This is a significant loss to researchers.

Design phishing website and email. Design a phishing website and a bogus e-mail including faked contents created according to the phisher’s strategies. This can be achieved by two methods. Firstly, a fake website (not an existing website) can be created. This method is used when researchers need to identify phishing threats that do not yet exist. Secondly, an existing website can be hijacked. This method can be used when researchers need to identify vulnerabilities of existing Web services. In this case, researchers require the cooperation of the company or organization operating the website. In addition, building the phishing website must not violate copyright law.

Decide the e-mail contents. The e-mail message can include some baits (faked messages) that lead participants to authenticate the contents. The email can be linked to the phishing site by embedding a URL hyperlink to the site.

Decide the e-mail sender. The sender’s name should be decided carefully to avoid causing problems for people or organizations whose names and email addresses are spoofed as senders. Obfuscation or vagueness should be used if required.

Select participants. Usually, a phishing message targets either general users or a specific Internet user. In the former case, a phishing message can be reused for all participants. In the latter case, a phishing message may include the participant’s information, such as private data or an identifier, to bind the message to her. For that, researchers need to collect private information about the target users. In this case, illegal actions such as hacking must not be used to collect personal information.

In addition, randomness in selecting participants is important. Establishing a high degree of randomness involves selecting participants while avoiding the “high-beam” effect (see footnote 1). Participants with a priori knowledge of the bogus e-mail would either deliberately handle the embedded link or totally ignore the message. Either way, the data would be skewed.

Decide experiment timing. Participants’ responses differ depending on when they receive the phishing message. For example, students will react sensitively when they receive a phishing message about an exam just before the exam.

Send the phishing message and monitor user behaviors. Java or PHP scripts can be used to monitor user behaviors such as clicking links or to modify the phishing website dynamically for various scenarios.

Footnote 1: Refers to the situation where drivers on the side of the highway spot a law enforcement officer at the side of the road looking to catch speeders. These drivers flash their high beam headlights to warn oncoming drivers that a law enforcement officer is targeting speeders, encouraging them to slow down.
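The random selection described under "Select participants" can be illustrated with a short Python sketch. It is an assumption-laden example rather than the procedure of any surveyed study: the function name select_participants and the exclude parameter are hypothetical, and excluding people known to have prior knowledge of the bogus e-mail is only one possible way to limit the “high-beam” effect.

```python
import random


def select_participants(population, sample_size, exclude=(), seed=None):
    """Draw a uniform random sample without replacement.

    `exclude` lists people assumed to already know about the study;
    they are removed before sampling so their responses cannot skew the data.
    """
    excluded = set(exclude)
    eligible = [person for person in population if person not in excluded]
    rng = random.Random(seed)  # a fixed seed makes the draw reproducible
    return rng.sample(eligible, sample_size)
```

Recording the seed lets the selection be reproduced later when the study is reported, while leaving it unset gives a fresh draw each time.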
Follow-up. Researchers must obtain consent from participants and optionally interview them. An interview may be required to explain the research and to conduct surveys. Through the interview, user training can also be conducted to help participants avoid phishing threats. If some participants are upset by the experiment, an interview is strongly recommended to explain the research purpose and apologize.

Delete the phishing site. Researchers should permanently remove all remaining phishing sites when the experiment is over in order to avoid unexpected damage to stakeholders. This must be done without fail when the experimental phishing website is a hijack of an existing website.

6 Conclusion

Phishing, an automated type of fraud, is a relatively recent phenomenon. It has become a social problem of quite catastrophic dimensions. This deceit-based attack threatens the success of web services because it erodes trust in the underlying infrastructure. It is in this spirit that we must identify possible security breaches caused by human factors and then propose defenders’ strategies that mitigate phishing threats.

As a result of these insights, an increasing number of researchers and practitioners are attempting to quantify risks and degrees of vulnerability in order to understand where to focus protective measures. Usually, three methods are used to quantify the problem of phishing - “Survey”, “Lab Study” and “Field Study”.

“Field study” has recently been favored by many researchers, because it yields the most precise user behavior models. But it has ethical and legal setbacks to overcome because it mimics real phishing. In order to reduce the burden on researchers, we need to establish legal policy and advice to protect them. Unfortunately, no country yet has such provisions for phishing researchers.

We presented best practices for field phishing studies built by referring to previous phishing experiments and our study experience. Although our contribution cannot serve as legal advice, we aim to start the related discussion to fight against phishing threats.

References

[1] T.N. Jagatic, N.A. Johnson, M. Jakobsson, and F. Menczer, “Social phishing,” Commun. ACM, vol.50, no.10, pp.94–100, 2007.

[2] A.J. Ferguson, “Fostering e-mail security awareness: The West Point carronade,” EDUCAUSE Quarterly, pp.54–57, 2005.

[3] P. Kumaraguru, S. Sheng, A. Acquisti, L.F. Cranor, and J. Hong, “Lessons from a real world evaluation of anti-phishing training,” Proceedings of the Third Anti-Phishing Working Group eCrime Researchers Summit, eCrime Researchers Summit ’08, 2008.

[4] C. Soghoian, “Legal risks for phishing researchers,” Proceedings of the Third Anti-Phishing Working Group eCrime Researchers Summit, eCrime Researchers Summit ’08, 2008.

[5] Y. Oh, T. Obi, J.S. Lee, H. Suzuki, and N. Ohyama, “Empirical analysis of internet identity misuse: case study of South Korean real name system,” Proceedings of the 6th ACM Workshop on Digital Identity Management (DIM ’10), New York, NY, USA, pp.27–34, ACM, 2010.
