Chapter-1
Introduction
Malware
Malware, short for malicious software (sometimes referred to as pestware), is software designed to harm or secretly access a computer system without the owner's informed consent. The expression is a general term used by computer professionals to mean a variety of forms of hostile, intrusive, or annoying software or program code.
Preliminary results from Symantec published in 2008 suggested that "the release
rate of malicious code and other unwanted programs may be exceeding that of legitimate
software applications."[6] According to F-Secure, "As much malware was produced in
2007 as in the previous 20 years altogether." Malware's most common pathway from
criminals to users is through the Internet: primarily by e-mail and the World Wide Web.
The prevalence of malware as a vehicle for organized Internet crime, along with
the general inability of traditional anti-malware protection platforms (products) to protect
against the continuous stream of unique and newly produced malware, has seen the
adoption of a new mindset for businesses operating on the Internet: the acknowledgment
that some sizable percentage of Internet customers will always be infected for some
reason or another, and that they need to continue doing business with infected customers.
The result is a greater emphasis on back-office systems designed to spot fraudulent
activities associated with advanced malware operating on customers' computers.
Malware is not the same as defective software, that is, software that has a
legitimate purpose but contains harmful bugs. Sometimes, malware is disguised as
genuine software, and may come from an official site.
Purposes
Many early infectious programs, including the first Internet Worm and a number of MS-
DOS viruses, were written as experiments or pranks. They were generally intended to be
harmless or merely annoying, rather than to cause serious damage to computer systems.
In some cases, the perpetrator did not realize how much harm his or her creations would
do. Young programmers learning about viruses and their techniques wrote them simply
for practice, or to see how far they could spread. As late as 1999, widespread viruses such
as the Melissa virus and the David virus appear to have been written chiefly as pranks.
The first mobile phone virus, Cabir, appeared in 2004.
Hostile intent related to vandalism can be found in programs designed to cause harm or
data loss. Many DOS viruses, and the Windows ExploreZip worm, were designed to
destroy files on a hard disk, or to corrupt the file system by writing invalid data to them.
Network-borne worms such as the 2001 Code Red worm or the Ramen worm fall into the
same category. Worms designed to vandalize web pages may seem like the online equivalent of graffiti tagging, with the author's alias or affinity group appearing everywhere the worm goes.
Since the rise of widespread broadband Internet access, malicious software has increasingly been designed for profit, for example forced advertising. For instance, since 2003, the
majority of widespread viruses and worms have been designed to take control of users'
computers for black-market exploitation. Infected "zombie computers" are used to send
email spam, to host contraband data such as child pornography, or to engage in
distributed denial-of-service attacks as a form of extortion.
Chapter-2
Infectious malware: viruses and worms
The best-known types of malware, viruses and worms, are known for the manner
in which they spread, rather than any other particular behavior. The term computer virus
is used for a program that has infected some executable software and, when run, causes
the virus to spread to other executables. Viruses may also contain a payload that performs
other actions, often malicious. On the other hand, a worm is a program that actively
transmits itself over a network to infect other computers. It too may carry a payload.
These definitions lead to the observation that a virus requires user intervention to
spread, whereas a worm spreads itself automatically. Using this distinction, infections
transmitted by email or Microsoft Word documents, which rely on the recipient opening a
file or email to infect the system, would be classified as viruses rather than worms.
Some writers in the trade and popular press misunderstand this distinction and use
the terms interchangeably.
With the rise of the Microsoft Windows platform in the 1990s, and the flexible
macros of its applications, it became possible to write infectious code in the macro
language of Microsoft Word and similar programs. These macro viruses infect documents
and templates rather than applications (executables), but rely on the fact that macros in a
Word document are a form of executable code.
Today, worms are most commonly written for the Windows OS, although a few
like Mare-D and the Lion worm are also written for Linux and Unix systems. Worms
today work in the same basic way as 1988's Internet Worm: they scan the network and
leverage vulnerable computers to replicate. Because they need no human intervention,
worms can spread with incredible speed. The SQL Slammer infected thousands of
computers in a few minutes.
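The speed of such automatic propagation can be illustrated with a toy simulation of a random-scanning worm. All parameters below (address space size, vulnerable population, scan rate) are illustrative assumptions, not measurements of any real worm:

```python
import random

def simulate_worm(address_space=10_000, vulnerable=1_000,
                  scans_per_tick=5, ticks=60, seed=1):
    """Toy simulation of a random-scanning worm.

    Each infected host probes `scans_per_tick` random addresses per tick;
    probing a vulnerable, not-yet-infected address infects it.
    """
    rng = random.Random(seed)
    vuln = set(rng.sample(range(address_space), vulnerable))
    infected = {next(iter(vuln))}          # patient zero
    history = [len(infected)]
    for _ in range(ticks):
        for host in list(infected):
            for _ in range(scans_per_tick):
                target = rng.randrange(address_space)
                if target in vuln:
                    infected.add(target)   # re-infection is a no-op
        history.append(len(infected))
    return history
```

Because every infected host scans in parallel, the infected count grows roughly exponentially until the vulnerable population is exhausted, which is why Slammer-class worms saturate in minutes.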
I. Trojan horses
For a malicious program to accomplish its goals, it must be able to run without
being shut down, or deleted by the user or administrator of the computer system on which
it is running. Concealment can also help get the malware installed in the first place. When
a malicious program is disguised as something innocuous or desirable, users may be
tempted to install it without knowing what it does. This is the technique of the Trojan
horse or Trojan.
In broad terms, a Trojan horse is any program that invites the user to run it,
concealing a harmful or malicious payload. The payload may take effect immediately and
can lead to many undesirable effects, such as deleting the user's files or further installing
malicious or undesirable software. Trojan horses known as droppers are used to start off a
worm outbreak, by injecting the worm into users' local networks.
One of the most common ways that spyware is distributed is as a Trojan horse,
bundled with a piece of desirable software that the user downloads from the Internet.
When the user installs the software, the spyware is installed alongside. Spyware authors
who attempt to act in a legal fashion may include an end-user license agreement that
states the behavior of the spyware in loose terms, which the users are unlikely to read or
understand.
II. Rootkits
Some malicious programs contain routines to defend against removal, not merely
to hide them, but to repel attempts to remove them. An early example of this behavior is
recorded in the Jargon File tale of a pair of programs infesting a Xerox CP-V time sharing
system:
Each ghost-job would detect the fact that the other had been killed, and would
start a new copy of the recently slain program within a few milliseconds. The only way to
kill both ghosts was to kill them simultaneously (very difficult) or to deliberately crash
the system.
Similar techniques are used by some modern malware, wherein the malware starts
a number of processes that monitor and restore one another as needed. In the event a user
running Microsoft Windows is infected with such malware, if they wish to manually stop
it, they could use Task Manager's 'processes' tab to find the main process (the one that
spawned the "resurrector process(es)"), and use the 'end process tree' function, which
would kill not only the main process, but the "resurrector(s)" as well, since they were
started by the main process. Some malware programs use other techniques, such as
naming the infected file similarly to a legitimate or trustworthy file (e.g., expl0rer.exe vs.
explorer.exe).
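The expl0rer.exe trick above can be countered mechanically: a process name that is very close to, but not identical to, a trusted system process name is suspicious. A minimal sketch, where the TRUSTED list and the distance threshold are illustrative assumptions:

```python
def edit_distance(a, b):
    """Classic Levenshtein distance via dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                    # deletion
                           cur[j - 1] + 1,                 # insertion
                           prev[j - 1] + (ca != cb)))      # substitution
        prev = cur
    return prev[-1]

# Illustrative whitelist; a real tool would use the full set of OS binaries.
TRUSTED = {"explorer.exe", "svchost.exe", "lsass.exe"}

def flag_lookalikes(process_names, max_distance=2):
    """Flag names that are close to, but not equal to, a trusted name."""
    suspicious = []
    for name in process_names:
        for trusted in TRUSTED:
            d = edit_distance(name.lower(), trusted)
            if 0 < d <= max_distance:
                suspicious.append((name, trusted))
    return suspicious
```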
III. Backdoors
The idea has often been suggested that computer manufacturers preinstall
backdoors on their systems to provide technical support for customers, but this has never
been reliably verified. Crackers typically use backdoors to secure remote access to a
computer, while attempting to remain hidden from casual inspection. To install backdoors
crackers may use Trojan horses, worms, or other methods.
During the 1980s and 1990s, it was usually taken for granted that malicious
programs were created as a form of vandalism or prank. More recently, the greater share
of malware programs has been written with a profit motive (financial or otherwise) in
mind. This can be taken as the malware authors' choice to monetize their control over
infected systems: to turn that control into a source of revenue.
Spyware programs are sometimes installed as Trojan horses of one sort or another.
They differ in that their creators present themselves openly as businesses, for instance by
selling advertising space on the pop-ups created by the malware. Most such programs
present the user with an end-user license agreement that purportedly protects the creator
from prosecution under computer contaminant laws. However, spyware EULAs have not
yet been upheld in court.
Another way that financially motivated malware creators can profit from their
infections is to directly use the infected computers to do work for the creator. The
infected computers are used as proxies to send out spam messages. A computer left in this
state is often known as a zombie computer. The advantage to spammers of using infected
computers is they provide anonymity, protecting the spammer from prosecution.
Spammers have also used infected PCs to target anti-spam organizations with distributed
denial-of-service attacks.
Stolen credentials and payment details enable credit card fraud and other theft. Similarly, malware may copy the CD key or password for online games, allowing the creator to steal accounts or virtual items.
Another way of stealing money from the infected PC owner is to take control of a
dial-up modem and dial an expensive toll call. Dialer (or porn dialer) software dials up a
premium-rate telephone number such as a U.S. "900 number" and leaves the line open,
charging the toll to the infected user.
Chapter-3
Data-stealing malware
Data-stealing malware is difficult for antivirus software to detect for two main reasons:
- The final payload attributes are hard to identify because of the combination(s) of malware components involved.
- The malware uses multiple levels of file encryption.
Well-known examples of data-stealing malware include:
- Bancos, an info stealer that waits for the user to access banking websites, then spoofs pages of the bank website to steal sensitive information.
- Gator, spyware that covertly monitors web-surfing habits, uploads the data to a server for analysis, then serves targeted pop-up ads.
- LegMir, spyware that steals personal information such as account names and passwords related to online games.
- Qhost, a Trojan that modifies the Hosts file to point to a different DNS server when banking sites are accessed, then opens a spoofed login page to steal login credentials for those financial institutions.
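A Qhost-style modification of the Hosts file can often be spotted by simple inspection. The sketch below flags entries that pin a watched (e.g. banking) domain to a non-local address; the parsing rules are a simplification of the real hosts-file format:

```python
def suspicious_hosts_entries(hosts_text, watched_domains):
    """Return (ip, hostname) pairs mapping watched domains to non-local IPs.

    Legitimate hosts files rarely pin banking domains to an address, so
    any such entry is a red flag for a Qhost-style Trojan.
    """
    flagged = []
    for line in hosts_text.splitlines():
        line = line.split("#", 1)[0].strip()   # drop comments
        if not line:
            continue
        parts = line.split()
        ip, names = parts[0], parts[1:]
        for name in names:
            if name.lower() in watched_domains and ip not in ("127.0.0.1", "::1"):
                flagged.append((ip, name))
    return flagged
```

The domain mybank.example.com in the test below is, of course, a hypothetical placeholder.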
Albert Gonzalez (not to be confused with the former U.S. Attorney General Alberto Gonzales) is
accused of masterminding a ring to use malware to steal and sell more than 170 million
credit card numbers in 2006 and 2007—the largest computer fraud in history. Among the
firms targeted were BJ's Wholesale Club, TJX, DSW Shoe, OfficeMax, Barnes & Noble,
Boston Market, Sports Authority and Forever 21.
A Trojan horse program stole more than 1.6 million records belonging to several hundred
thousand people from Monster Worldwide Inc’s job search service. The data was used by
cybercriminals to craft phishing emails targeted at Monster.com users to plant additional
malware on users’ PCs.
Customers of Hannaford Bros. Co, a supermarket chain based in Maine, were victims of a
data security breach involving the potential compromise of 4.2 million debit and credit
cards. The company was hit by several class-action lawsuits.
Chapter-4
There is a group of software tools (the Alexa toolbar, the Google toolbar, the Eclipse data usage collector, etc.) that send data to a central server about which pages have been visited or which features of the software have been used. Unlike "classic" malware, however, these tools document activities and only send data with the user's approval. The user may opt in to share the data in exchange for additional features and services, or (in the case of Eclipse) as a form of voluntary support for the project. Some security tools report such loggers as malware while others do not, so the status of this group is debatable. Some tools, like PDF Creator, sit closer to the boundary than others because opting out has been made more complex than it could be (during installation, the user needs to uncheck two check boxes rather than one). However, even PDF Creator is only sometimes described as malware and is still the subject of discussion.
Vulnerability to malware
In this context, as throughout, it should be borne in mind that the "system" under attack may be of various types, e.g. a single computer and operating system, a network, or an application. Several factors make a system more vulnerable to malware:
- Homogeneity: e.g. when all computers in a network run the same OS, exploiting one allows the exploitation of them all.
- Weight of numbers: simply because the vast majority of existing malware is written to attack Windows systems, Windows systems are, ipso facto, more vulnerable to succumbing to malware (regardless of the security strengths or weaknesses of Windows itself).
- Defects: malware leveraging defects in the OS design.
- Unconfirmed code: code from a floppy disk, CD-ROM, or USB device may be executed without the user's agreement.
- Over-privileged users: some systems allow all users to modify their internal structures.
- Over-privileged code: some systems allow code executed by a user to access all rights of that user.
Originally, PCs had to be booted from floppy disks, and until recently it was
common for this to be the default boot device. This meant that a corrupt floppy disk could
subvert the computer during booting, and the same applies to CDs. Although that is now
less common, it is still possible to forget that one has changed the default, and rare that a
BIOS makes one confirm a boot from removable media.
As malware exploits have increased, security priorities shifted for the release of Microsoft Windows Vista. As a result, many existing applications that require excess privilege (over-privileged code) may have compatibility problems with Vista. However, Vista's User Account Control feature attempts to remedy applications not designed for under-privileged users, acting as a crutch to resolve the privileged access problem inherent in legacy applications.
Malware, running as over-privileged code, can use this privilege to subvert the
system. Almost all currently popular operating systems, and also many scripting applications, grant code too many privileges, usually in the sense that when a user
executes code, the system allows that code all rights of that user. This makes users
vulnerable to malware in the form of e-mail attachments, which may or may not be
disguised.
Given this state of affairs, users are warned to open only attachments they trust, and to be wary of code received from untrusted sources. It is also common for operating
systems to be designed so that device drivers need escalated privileges, while they are
supplied by more and more hardware manufacturers.
As malware attacks become more frequent, attention has begun to shift from virus and spyware protection to malware protection, and programs have been developed specifically to combat malware. Anti-malware programs can operate in two ways:
1. They can provide real-time protection against the installation of malware on a computer. This type of protection works in the same way as antivirus protection, in that the anti-malware software scans all incoming network data for malware and blocks any threats it comes across.
2. Anti-malware software programs can be used solely for detection and removal of malware that has already been installed on a computer. This type of malware protection is normally much easier to use and more popular. This type of anti-malware software scans the contents of the Windows registry, operating system files, and installed programs on a computer and provides a list of any threats found, allowing the user to choose which files to delete or keep, or to compare this list to a list of known malware components, removing files that match.
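The second mode of operation reduces, at its core, to comparing file contents against a list of known malware components. A minimal hash-based sketch, where the signature set and the in-memory file mapping are stand-ins for a real signature feed and a real walk of the registry, OS files, and installed programs:

```python
import hashlib

def sha256_of(data: bytes) -> str:
    """Hex SHA-256 digest of a file's contents."""
    return hashlib.sha256(data).hexdigest()

def scan(files, known_bad_digests):
    """Return the names of files whose contents match a known-bad digest.

    `files` maps name -> bytes; `known_bad_digests` is a set of hex
    SHA-256 strings standing in for a real signature database.
    """
    return [name for name, data in sorted(files.items())
            if sha256_of(data) in known_bad_digests]
```

Real scanners additionally use fuzzy signatures and heuristics, since an exact-hash match is defeated by even a one-byte change to the malware.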
In a cryptoviral extortion attack, the virus encrypts the victim's data with a randomly generated symmetric key (SK) and initialization vector (IV), and then encrypts the SK and IV with the virus writer's public key. In theory the victim must negotiate with the virus writer to get the IV and SK back in order to decrypt the ciphertext (assuming there are no backups). Analysis of the virus reveals the public key, but not the IV and SK needed for decryption, nor the private key needed to recover the IV and SK. This result was the first to show that
computational complexity theory can be used to devise malware that is robust against
reverse-engineering.
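The structure of that hybrid attack can be sketched in a few lines. Everything here is deliberately a toy: the XOR keystream is not a real cipher, the tiny RSA modulus (p=61, q=53) is trivially factorable, and the IV is omitted for brevity. The point is only the shape of the scheme: data encrypted under a random session key SK, and SK locked under the attacker's public key.

```python
import random

# Toy RSA key pair (p=61, q=53): for illustration only, trivially breakable.
N, E, D = 3233, 17, 2753

def xor_stream(data: bytes, key: bytes) -> bytes:
    """Symmetric-cipher stand-in: XOR with a repeating key (NOT secure)."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

def extort(plaintext: bytes, seed=42):
    """Hybrid step: encrypt data with a random SK, lock SK with public key."""
    rng = random.Random(seed)
    sk = bytes(rng.randrange(256) for _ in range(8))   # random session key
    ciphertext = xor_stream(plaintext, sk)
    locked_sk = [pow(b, E, N) for b in sk]             # only D can undo this
    return ciphertext, locked_sk

def ransom_decrypt(ciphertext: bytes, locked_sk):
    """What only the attacker (holder of private exponent D) can do."""
    sk = bytes(pow(c, D, N) for c in locked_sk)
    return xor_stream(ciphertext, sk)
```

Analyzing the virus body reveals N and E but not D, so the defender cannot recover SK from locked_sk, exactly as the paragraph above describes.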
Behavioral malware detection has been a particularly lively research area lately.
Most approaches to behavioral detection are based on analysis of system call
dependencies. The executed binary is traced using strace or more precise taint analysis to
compute data-flow dependencies among system calls. The result is a directed graph G = (V, E) in which nodes are system calls and edges represent dependencies. For example, one approach infers an automaton from the dependency graphs and shows how such an automaton can be used for the detection and classification of malware.
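A minimal sketch of the dependency-graph construction described above. The trace format and the SELF_PROPAGATION signature are illustrative assumptions, not the output format of strace or any particular tool:

```python
def dependency_graph(trace):
    """Build edges (a -> b) when syscall b consumes a value produced by a.

    `trace` is a list of (call, inputs, outputs) tuples, e.g. derived from
    post-processing an strace log; taint analysis would compute the same
    dependencies more precisely.
    """
    producers = {}            # value -> syscall that produced it
    edges = set()
    for call, inputs, outputs in trace:
        for v in inputs:
            if v in producers:
                edges.add((producers[v], call))
        for v in outputs:
            producers[v] = call
    return edges

# A hypothetical behavioral signature: read a file, then send its contents.
SELF_PROPAGATION = {("open", "read"), ("read", "send")}

def matches(edges, signature):
    """True when every dependency in the signature appears in the graph."""
    return signature <= edges
```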
There are basically two broad categories of techniques that are used for analyzing
malware: code analysis and behavior analysis. In most cases, a combination of both these
techniques is used. We will consider code analysis first.
Code analysis is one of the primary techniques used for examining malware. The
best way of understanding the way a program works is, of course, to study the source
code of the program. However, the source code for most malware is not available.
Malicious software is more often distributed in the form of binaries, and binary code can
still be examined using debuggers and disassemblers. However, the use of these tools is
often beyond the ability of all but a small minority because of the specialized knowledge
required and the very steep learning curve needed to acquire it. Given sufficient time, any
binary, however large or complicated, can be reversed completely by using code analysis
techniques.
On the other hand, behavior analysis is more concerned with the behavioral
aspects of the malicious software. Like a beast kept under observation in a zoo, a binary
can be kept in a tightly controlled lab environment and have its behavior scrutinized.
Things like changes it makes to the environment (file system, registry, network, etc.), its
communication with the rest of the network, its communication with remote devices, and
so on are closely observed and information is collected. The collected data is analyzed
and the complete picture is reconstructed from these different bits of information.
The best thing about behavior analysis is that it is within the scope of an average
administrator or even a power user. The learning curve is very small and existing
knowledge can be leveraged to make the learning process faster. This makes it ideal for
teaching newbies the art of malware reverse engineering. These reasons are consistent
with our stated goals, focused on the typical administrator, and therefore this paper is
mostly concerned with behavior analysis.
Though reverse engineering using behavior analysis does not lead to the complete
reversing of a binary, it is sufficient for most users' needs. For instance, it is not sufficient
for an antivirus researcher but for most other users, behavior analysis can fulfill all their
needs.
As stated before, our goal is to provide a set of behavior analysis techniques for
reverse engineering malware. Also, the learning curve should be small so that it is within
the scope of most people.
Using these methods, people should be able to analyze an unknown binary and
determine whether it is malicious or not. Those who require more in-depth knowledge
should be able to reverse engineer the binary, understand and document its workings
completely.
This paper makes a few assumptions for the sake of convenience and clarity; chief among them is that the binary under analysis is a Win32 executable.
IV. Tools
Since the goal of this paper is to propose a generic set of techniques, the tools
mentioned in this paper are just "proposed" tools and are available as references at the
end of this document. Any other tool that has the same or similar functionality can be
used in place of the proposed ones.
Methodology
The framework proposed is broadly divided into six stages, beginning with the setup of an isolated analysis lab:
- At least two machines should be used. One machine hosts the malicious binary (the victim machine) and the other is for baselining and sniffing the network traffic (the sniffer machine). They should be networked in such a way that each of them is able to sniff the other's network traffic.
- The two networked lab machines should be isolated from the rest of the network.
- Fresh copies of operating systems should be installed on each of the two machines. It is preferable to have a WinNT-kernel-family OS on one machine and a *nix-based OS on the other. Since we are assuming a Win32 binary, the WinNT machine acts as the "victim host" and the *nix machine is used as the "sniffer machine".
- Tools should be transferred to the relevant machines.
- The binary that is to be examined should be transferred to the relevant machine. Since we are assuming a Win32 binary, it is transferred to the Win32 machine in this case.
- It is highly preferable not to install any other application on the "victim host" apart from the tools required for analysis.
This is the most basic setup for a malware analysis lab. Apart from this and
depending on the situation, more modifications can be carried out. For instance, if the
malicious binary tries to communicate with a remote server xyz.com, a DNS server has to
be setup in one of the lab machines and a DNS entry for xyz.com has to be created. An
excellent paper that discusses the creation of a malware analysis lab is "An Environment
for Controlled Worm Replication and Analysis".
Baselining the environment is the next major step. "Baselining" means taking a snapshot of the current environment. This is the most vital stage in our analysis. If baselining is not done properly, it seriously affects the information gathering stage, which in turn seriously affects our understanding of the binary. If baselining is done well, the information generated during the next stage becomes very accurate and the rest of the stages become easy to execute.
A. Network traffic
Sniffing software that is installed on our "sniffer machine" is used for this
purpose. Any sniffing software running in verbose mode is sufficient for our purposes.
However, to make our task easier, it is preferable to use a protocol analyzer like Ethereal.
B. External view
Some of the elements that are to be baselined in the Victim Machine are:
File system: The file system on the victim host has to be baselined. There
are many programs that can create a snapshot of the file system and after a
few changes occur, they can point out the modifications. Some of the
programs we can use are Winalysis and Installrite.
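A file-system baseline in the spirit of Winalysis or InstallRite can be approximated with a snapshot-and-diff sketch like the following. Hashing the whole tree is a simplification; real tools also track timestamps, permissions, and the registry:

```python
import hashlib
import os

def snapshot(root):
    """Map each file under `root` (relative path) to a SHA-256 digest."""
    state = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            with open(path, "rb") as f:
                digest = hashlib.sha256(f.read()).hexdigest()
            state[os.path.relpath(path, root)] = digest
    return state

def diff(baseline, current):
    """Classify changes relative to the baseline snapshot."""
    added    = sorted(set(current) - set(baseline))
    removed  = sorted(set(baseline) - set(current))
    modified = sorted(p for p in baseline
                      if p in current and baseline[p] != current[p])
    return {"added": added, "removed": removed, "modified": modified}
```

Taking one snapshot before executing the binary and one after, and diffing the two, yields exactly the "new files" information the later stages depend on.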
The next element that has to be baselined is the network traffic. Even when there
is no application running on either of the test machines, there will still be some network
traffic. This traffic has to be recorded and the "normal traffic" in our test network has to
be defined. This is because when deviations occur in the "normal traffic" pattern, we can
assume it to be generated by the binary and perform further testing on it. Although we
have created a snapshot of the open ports in the victim machine, it is always better to
create one more snapshot from an external machine. A port scanner running on our "sniffer machine" can achieve this task for us.
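The external open-port snapshot can be taken with a simple TCP connect() scan, sketched below. A full scanner would also cover UDP and perform service fingerprinting:

```python
import socket

def tcp_scan(host, ports, timeout=0.3):
    """TCP connect() scan: return the subset of `ports` that accept."""
    open_ports = []
    for port in ports:
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
            s.settimeout(timeout)
            # connect_ex returns 0 on success instead of raising.
            if s.connect_ex((host, port)) == 0:
                open_ports.append(port)
    return open_ports
```

Running this before and after executing the binary, and comparing the two lists, reveals any listening ports the malware has opened.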
Now that the preparations are over, we can go ahead with our task. This is the
only stage where we have an actual interaction with the binary. A lot of raw information
about the binary is collected during this stage which is analyzed in the next stage.
Therefore, it is very important to carefully record all the information generated in this
stage. The steps in the information collection stage are:
A. Static analysis
Human-readable strings are extracted from the binary and these strings are
recorded. A program like Binary Text Scan can be used for this purpose. These strings
reveal a lot of information about the function of the binary.
Resources that are embedded in the binary are extracted and recorded. A program
like Resource Hacker can be used for this purpose. The resources that can be discovered
through this process include GUI elements, scripts, HTML, graphics, icons, and more.
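String extraction of the kind performed by a tool like Binary Text Scan can be approximated in a few lines. The sketch below pulls printable-ASCII runs only; real tools also handle the UTF-16 strings common in Win32 binaries:

```python
import re

def extract_strings(data: bytes, min_len=4):
    """Pull printable-ASCII runs of at least `min_len` bytes from a binary."""
    pattern = rb"[\x20-\x7e]{%d,}" % min_len
    return [m.decode("ascii") for m in re.findall(pattern, data)]
```

Applied to a malicious binary, the output often exposes URLs, command names, registry paths, and other clues to the binary's function.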
B. Dynamic analysis
After taking a snapshot of all the changes the binary performs in the system, the
binary process is terminated. Now, the differences between the new snapshot and the
baseline snapshot are determined. The dynamic analysis step is very similar to the
baselining the environment stage. Therefore, the tools are reused for this stage. Winalysis
and InstallRite can be used for this purpose. Apart from these tools, Filemon and Regmon
from Sysinternals can be used for monitoring the file system and the registry dynamically.
These tools are used for observing the changes to the file system and the registry.
This information is recorded and forms the input for the next stage of our analysis.
The information generated here can be new files, registry entries, open ports, etc.
During the static analysis stage, we collect as much information about the binary
as possible, without executing it. This involves many techniques and tools. Static analysis
reveals the scripts, HTML, GUI, passwords, commands, control channels, and so on.
Simple things like the file name, size, version string (right-click>properties>version in
Win32), are recorded. During the dynamic analysis stage, by contrast, we actually execute the binary and observe its
interaction with the environment. All monitoring tools including the sniffing software are
activated. Different experiments are done to test the response of the running malware
process to our probes. Attempts to communicate with other machines are recorded.
Basically a new snapshot of the environment is created like in the baselining the
environment stage.
Sometimes, the static analysis step has to be repeated once more after doing a
dynamic analysis.
IV. Information analysis
This is the stage where we can finally reverse engineer the binary based on all the
information collected during the previous stages. Each part of the information is analyzed
over and over and the "jigsaw puzzle" is completed. Then the big picture automatically
begins to appear and the reverse engineering process is finished. However, before this is
achieved, we may have to repeat the previous stages (See figure) several times.
The goals of the individual or organization evaluating the binary determine the
type of analysis and because the goals differ, no standard methodology is provided for
this stage. Looking for deviations from the stated security policy of an organization based
on the information can be the determining factor in some cases.
Documenting the results of the malware analysis and reverse engineering exercise
is essential. One of the main advantages is that the knowledge incorporated into the
documentation can be leveraged for later analysis exercises. The documentation needs
differ from individual to individual and organization to organization. The method
preferred by the concerned party can be used here.
Literature review
Based on the literature research done, this study has its roots in two previous studies (Jones et al., 1993; Schmidt and Arnett, 2005). Both of these had the goal of examining relatively new malware as it emerged on the computing landscape. In this study, both
malware and the current antimalware measures will be compared and analyzed.
Thereafter in-depth studies of several security models will be undertaken to investigate
and evaluate organizational network security.
Current network security defenses are mostly reactive, static methods, used to collect, analyze, and extract evidence after attacks. This approach includes virus detection, vulnerability evaluation, firewalls, and more. These methods rely upon collecting and analyzing virus specimens or intrusion signatures with traditional techniques such as statistical analysis, characteristics analysis, neural networks, and data mining. However, these approaches result in a slow reaction time to new threats. This is largely due to their lack of self-learning and self-adapting abilities: they can only prevent known network intrusions, and can do very little or nothing about unknown intrusions.
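One simple statistical signal that generalizes somewhat beyond known signatures is byte entropy: multiply-encrypted or packed payloads, as noted in Chapter 3, have nearly uniform byte distributions. A sketch, with an illustrative (not established) threshold:

```python
import math
from collections import Counter

def shannon_entropy(data: bytes) -> float:
    """Shannon entropy in bits per byte (0.0 .. 8.0)."""
    if not data:
        return 0.0
    counts = Counter(data)
    n = len(data)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

def looks_packed(data: bytes, threshold=7.2) -> bool:
    """Heuristic: near-uniform bytes suggest packing or encryption.

    The 7.2 bits/byte threshold is an illustrative assumption; plain text
    and ordinary code typically score well below it.
    """
    return shannon_entropy(data) > threshold
```

Like all anomaly heuristics, this flags unknown threats at the cost of false positives (compressed archives and media files are also high-entropy), which is the trade-off the paragraph above describes.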
In the real network environment, the intrusion threat has risen, as has the number of attack classes. As a result, HLIs are constantly investing in new information
technology and in business information systems security in particular. These security
expenditures have constantly increased over the last few years. The expenditure is mostly
on widely used security tools such as firewalls, antivirus, Virtual Private Networks
(VPNs), encrypted channels, and more. Although the tools are effective to a certain
extent, there are objective shortcomings related to all existing security tools and
mechanisms. They solve just the technical side of the security problem and fail to address
most of the nontechnical side.
In this study we propose a conceptual framework (Fig. 3) which focuses on nontechnical measures, basically dealing with the social layer. The problem has been compounded by the trends toward mobility, increasing Web traffic, and the rising popularity of e-commerce, search, and social networking applications (e.g. Facebook, MySpace and Twitter), which have all significantly impacted HLI security. Security of networked systems requires both technical and administrative foundations. Technical foundations, like those based on cryptographic measures and access control models, are well understood. However, the administrative foundation, which is based on several nontechnical layers added on top of the technical ones, has taken a back seat. Clearly, for malware to be effectively managed there must be a marriage between the technical and nontechnical layers.
Due to the scarcity of Network Security models specifically addressing HLIs with
an emphasis on nontechnical measures, the study will be based on four security models,
which will help in providing some guidance in the development of the proposed network
security framework (Fig. 3). This framework can be applied not only to HLIs but also any
other type of organization.
CONCLUSION
Overall this study examines malware as it affects HLIs and thereafter conducts a
survey and comparative analysis of several security models. From the preliminary study,
we believe that the malware threat poses a significant and increasing problem for HLIs.
Different types of attackers usually attempt different attacks depending on their position,
privileges and knowledge. Although there is a wide range of technological security
measures, there are still very few solutions which largely focus on nontechnical
measures.
REFERENCES