
ACCESS CONTROL SYSTEMS AND METHODOLOGY

Security Event Analysis through Correlation


Anton Chuvakin, Ph.D., GCIA, GCIH

The security spending survey by Information Security (http://www.infosecuritymag.com/2003/may/coverstory.pdf) and recent research by Forrester indicate that deployment rates of many security technologies will soar in the next three years. According to some estimates, security budgets (and thus technology purchases) will double by 2006.

INTRODUCTION TO SECURITY DATA ANALYSIS

Almost every Internet-connected organization now has a firewall as part of its network infrastructure; most Windows networks have an anti-virus solution. Intrusion detection systems (IDSs) are slowly but surely gaining wider acceptance, and intrusion prevention is starting to show more promise despite the obvious hurdles. New types of application security products, such as Web application firewalls, are starting to be deployed by security-conscious organizations. This buying trend is further enhanced by the growing popularity of so-called appliance security systems, which are very easy to install and manage. Appliances combine software and hardware in one package and usually have much lower installation and maintenance costs, thus facilitating their adoption.

All the above devices, whether aimed at prevention or detection of attacks, usually generate huge volumes of audit data. Firewalls, routers, switches, and other devices recording network connection information are especially guilty of producing vast oceans of data. Other problems induced by this log deluge turn its analysis into a pursuit few dare to undertake. Many diverse data formats and representations, some binary,1 obscure, and undocumented, are used for those log files and audit trails. Also, a percentage of the events generated by network IDSs and intrusion prevention systems (IPSs) are false alarms that do not map to real threats, or map to threats that have no chance of causing loss. To further confuse the issue, different devices might report the same occurrence on the network in different ways, with no apparent way of establishing how the reports relate. For example, a UNIX log file might contain an FTP connection message. The same session will also be recorded by the firewall as a connection allowed to TCP port 21. A network IDS might also generate an alert warning that FTP with no password has occurred. All three messages refer to the same event, and a human analyst will recognize them as such.

ANTON CHUVAKIN, Ph.D., GCIA, GCIH, is a Senior Security Analyst at a major security company. His areas of infosec expertise include intrusion detection, UNIX security, forensics, honeypots, etc. In his spare time he maintains his security portal at http://www.info-secure.org.



However, programming a system to do that is much more challenging, especially for a broad spectrum of messages. Thus, there is a definite need for a consistent analysis framework to identify various network threats, prioritize them, and learn their impact on the target organization. This must be done as fast as possible (preferably in real-time) for attack identification, and also over the long term for threat trending and risk analysis.

To understand the meaning of the piling logs, the data in them can be categorized in several ways. It should be noted that before the data can be intelligently categorized, it should be normalized to a common schema. The normalization process involves extracting the parts of the log records that serve a common purpose and assigning them to specific fields in the common schema. For example, both firewall and network IDS log records will usually contain the source and destination IP addresses. If you see both firewall and IDS logs referring to the same source and destination at about the same time, they are likely related. (A minimal sketch of this normalization step appears after the list below.) Log categorization helps make the similarity between different log records stand out. For example, the log data generated across many security devices, hosts, and applications might be related to:

- Device performance data
- Network traffic
- Known attacks
- Known network/system problems
- Anomalous/suspicious network/host activity
- Access control decisions
- Software failures
- Hardware errors
- System changes
- Evidence of malicious agents
- Site-specific AUP2 violations
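To make the normalization idea concrete, the following Python sketch maps a firewall record and a network IDS alert onto one common event schema and then checks whether two normalized events are likely related because they share endpoints and occur close in time. The field names, schema, and time window are illustrative assumptions, not the schema of any particular SIM product.

```python
from datetime import datetime, timedelta

# Hypothetical common schema: every normalized event carries the same core fields.
COMMON_FIELDS = ("timestamp", "src_ip", "dst_ip", "dst_port", "device_type", "category")

def normalize_firewall(record: dict) -> dict:
    """Map a (hypothetical) firewall log record onto the common schema."""
    return {
        "timestamp": datetime.fromisoformat(record["time"]),
        "src_ip": record["source"],
        "dst_ip": record["destination"],
        "dst_port": record["dport"],
        "device_type": "firewall",
        "category": "access control decision",
    }

def normalize_ids(alert: dict) -> dict:
    """Map a (hypothetical) network IDS alert onto the common schema."""
    return {
        "timestamp": datetime.fromisoformat(alert["detect_time"]),
        "src_ip": alert["attacker"],
        "dst_ip": alert["victim"],
        "dst_port": alert["service_port"],
        "device_type": "ids",
        "category": "known attack",
    }

def likely_related(a: dict, b: dict, window: timedelta = timedelta(minutes=2)) -> bool:
    """Two normalized events are likely related if they share endpoints and are close in time."""
    return (
        a["src_ip"] == b["src_ip"]
        and a["dst_ip"] == b["dst_ip"]
        and abs(a["timestamp"] - b["timestamp"]) <= window
    )

# Example: a firewall "allow" record and an IDS alert describing the same FTP session.
fw = normalize_firewall({"time": "2004-05-01T10:00:03", "source": "10.1.1.5",
                         "destination": "192.168.0.9", "dport": 21})
ids = normalize_ids({"detect_time": "2004-05-01T10:00:05", "attacker": "10.1.1.5",
                     "victim": "192.168.0.9", "service_port": 21})
print(likely_related(fw, ids))  # True: same endpoints within the time window
```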

Each of the above types of events presents unique analysis challenges. For example, some are produced in much higher numbers (network access control and worm events), while others are often not what they seem at first (such as network IDS false positives). Moreover, sometimes the threat can only be identified and rated by cross-device and cross-category analysis of the above events.

Many questions arise upon seeing the above data. How do you turn that flood of data into useful and actionable information? How do you find what is really relevant for the organization at the moment and for the near future? How do you tell normal log records, produced in the course of business, from the anomalous and malicious ones, produced by attackers or misbehaving software? Correlation performed by SIM (Security Information Management) software is believed to be the solution to those challenges. Correlation is defined in the dictionary as establishing or finding relationships between entities. However, a good security-specific definition is lacking. In security, event correlation can be defined as improving the threat identification and assessment process by looking not only at individual events, but also at sets of events bound by some common parameter (that is, related events).

TYPES OF CORRELATION

Security-specific correlation can be loosely categorized into rule-based and statistical (or algorithmic). Rule-based correlation needs some preexisting knowledge of the attack (the rule) and is able to describe what it actually detected in precise terms ("Successful Shopping Cart Web Application Attack"). Such attack knowledge is used to relate events and analyze them together in a broader context. On the other hand, statistical correlation does not employ any preexisting knowledge of the bad activity (at least not as a primary detection vehicle), but instead relies on knowledge of normal activities, accumulated over time. Ongoing events are then rated by the built-in algorithm and are additionally compared to the accumulated activity patterns. This distinction is somewhat similar to signature versus anomaly IDS and makes the SIM solution a kind of meta-IDS, operating on higher-level data (not packets, but log records). Both correlation methods combined can help to sift through the large volume of diverse data and identify high-severity threats.
Rule-Based Correlation

Rule-based correlation uses some preexisting knowledge of an attack (a rule), which is essentially a scenario that an attack must follow to be detected. Such a scenario might be encoded in the form of "if this, then that; therefore, some action is needed." Rule-based correlation deals with states, conditions, timeouts, transitions, and actions. Let us define these important terms.

A state is a stage that the correlation rule remains in while waiting for further events. A state might contain various conditions, such as matching incoming events by source IP address, protocol, port, event type, producing security device type, username, and other components of the event. It should be noted that although such data components vary with the device, the SIM solution normalizes them using the cross-device event schema without incurring information loss. A timeout defines how long the rule will stay in a certain state. If the correlation engine has to maintain many rules in a waiting state in memory, this resource might be exhausted; thus, rule timeouts play an important role in correlation performance. A transition occurs when the rule switches from one state to another; for a complicated rule, many transitions are possible. An action is what happens when all the rule conditions are met. Various actions can result from rules, such as user notification, alarm escalation, configuration changes, or automatic opening of an incident investigation case.

The correlation is usually performed by the correlation engine, which is able to track various states and switch from state to state, depending on conditions and incoming events. It does all of the above for multiple rules at the same time. The correlation engine gets a real-time event feed from the alarm-generating security devices and applies the relevant correlation rules as needed.
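To make the state/condition/timeout/transition/action vocabulary concrete, here is a minimal sketch of how a two-state correlation rule might be represented in Python. The class name, fields, and predicates are illustrative assumptions, not the design of any specific correlation engine.

```python
import time

class CorrelationRule:
    """Illustrative two-state correlation rule: an entry condition moves the rule into a
    waiting state; a completion condition observed before the timeout fires the action."""

    def __init__(self, entry_condition, completion_condition, timeout_seconds, action):
        self.entry_condition = entry_condition            # predicate over a normalized event
        self.completion_condition = completion_condition  # predicate over a normalized event
        self.timeout = timeout_seconds                    # max lifetime of the waiting state
        self.action = action                              # invoked when all conditions are met
        self.waiting_since = None                         # None = initial state

    def feed(self, event, now=None):
        now = now if now is not None else time.time()
        # Timeout handling: drop the waiting state if it has lived too long.
        if self.waiting_since is not None and now - self.waiting_since > self.timeout:
            self.waiting_since = None                     # transition: waiting -> initial
        if self.waiting_since is None:
            if self.entry_condition(event):
                self.waiting_since = now                  # transition: initial -> waiting
        elif self.completion_condition(event):
            self.waiting_since = None                     # transition: waiting -> done
            self.action(event)                            # e.g., notify, escalate, open a case
```

In practice an engine would hold one such object per rule and per tracked entity (for example, per source address), feeding every incoming normalized event to each of them.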

The correlation engine also leverages other types of available data (such as vulnerability, open port, or asset business value information) for a higher level of correlation.

Correlation rules can be applied to incoming events as they arrive in real-time or to historical events stored in the database. In the latter case, the rules are used as a form of data mining or analytics, which allows for uncovering hidden threats such as slow port scans or low-level Trojan or exploitation activity. Such rules can be run periodically for incident identification, or in the course of an investigation of suspicious activity, to seek out prior occurrences of similar (and thus possibly related) activity. Unlike the real-time rules, which become useless if prone to false alarms (just as signature-based IDSs sometimes are), database rules can tolerate a certain level of false alarms for the purpose of drastically reducing false negatives. This is because real-time rules usually feed the alarm notification system, while database rule correlation is launched by the analyst during a security incident investigation. As long as rule-based analytics uncover hidden threats that are impossible to discover otherwise, an analyst might be able to tolerate a level of false alarms not acceptable for real-time correlation.
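As an illustration of such a database-style rule run over stored, normalized events, the sketch below flags sources that have probed many distinct targets spread out over a long period, a crude slow-scan indicator. The field names, thresholds, and duration are assumptions chosen for illustration and would be tuned to the environment.

```python
from collections import defaultdict
from datetime import timedelta

def find_slow_scanners(events, min_ports=25, min_duration=timedelta(days=1)):
    """Flag sources that probed many distinct (host, port) targets over a long period,
    the kind of low-and-slow activity that real-time thresholds tend to miss.
    'events' is an iterable of normalized records: timestamp, src_ip, dst_ip, dst_port."""
    targets = defaultdict(set)          # src_ip -> set of (dst_ip, dst_port) probed
    first_seen, last_seen = {}, {}

    for ev in events:
        src = ev["src_ip"]
        targets[src].add((ev["dst_ip"], ev["dst_port"]))
        first_seen[src] = min(first_seen.get(src, ev["timestamp"]), ev["timestamp"])
        last_seen[src] = max(last_seen.get(src, ev["timestamp"]), ev["timestamp"])

    return [
        src for src, probed in targets.items()
        if len(probed) >= min_ports and last_seen[src] - first_seen[src] >= min_duration
    ]
```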
Statistical Correlation

Statistical correlation uses special numeric algorithms to calculate threat levels incurred by the security-relevant events on various IT assets. Such correlation looks for deviations from normal event levels and other routine activities. Risk levels can be computed from the incoming events and then tracked in real-time or historically so that deviations are apparent. The algorithmic correlation can leverage the event categorization in order to compute the threat levels specific to various attack types, such as a threat of denial-of-service, a threat of viruses, etc., and track them over time.
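One way to picture such per-category threat tracking is an exponentially weighted moving average of event counts per category, updated every reporting interval. The following sketch is an illustration of the idea only, not any product's actual algorithm; the smoothing factor and the spike ratio are assumptions.

```python
class CategoryThreatTracker:
    """Track a smoothed threat level per event category (e.g., 'denial of service',
    'virus') so that deviations from the routine level become visible over time."""

    def __init__(self, smoothing=0.1):
        self.smoothing = smoothing   # weight given to the newest observation
        self.levels = {}             # category -> smoothed events-per-interval level

    def update(self, category, count_this_interval):
        previous = self.levels.get(category, float(count_this_interval))
        # Exponentially weighted moving average: the new level leans mostly on history.
        level = self.smoothing * count_this_interval + (1 - self.smoothing) * previous
        self.levels[category] = level
        # A ratio well above 1 means this interval's count is a spike for the category.
        return count_this_interval / level if level else 0.0
```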


Detecting threats using statistical correlation does not require any preexisting knowledge of the attack to be detected. Statistical methods can, however, be used to detect threats based on predefined activity thresholds. Such thresholds can be configured based on experience gained from monitoring the environment. For example, if the normal level of a specific reconnaissance activity is exceeded for a prolonged period of time, an alarm might be generated by the system. Correlation can also use various parameters of enterprise assets to skew the statistical algorithm toward higher-accuracy detection. Some of them are defined by system users (such as the affected asset's value to the organization), while others are automatically computed from other available event context data (such as vulnerability scanning results or a measure of normal user activity on the asset). That allows one to define a broader context for transpiring security events and thus helps one understand how they contribute to the organization's risk posture.

If rule-based correlation is more helpful during threat identification, then algorithmic correlation is conducive to impact assessment. When higher threat levels are detected by the algorithms, one can assume that there is a higher chance of catastrophic system compromise or failure. Various statistical algorithms can be used to trend such threat levels over long periods of time to gain awareness of normal network and host activities. The accumulated threat data is then used to compare current patterns of activity with the baseline. This allows the system to make accurate (and possibly automated) decisions about event flows and their possible impact.
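A minimal sketch of such a threshold check follows: today's event count for an asset is compared against a baseline accumulated during normal operation, and the deviation is scaled by the asset's value to the organization. The numbers, the scoring formula, and the alarm threshold are illustrative assumptions.

```python
from statistics import mean, stdev

def threat_score(todays_count, history, asset_value=1.0):
    """Rate today's event count for one asset/category against its historical baseline.
    The score grows with the deviation from normal and is scaled by the asset's value."""
    baseline = mean(history)
    spread = stdev(history) or 1.0           # avoid division by zero on flat baselines
    deviation = (todays_count - baseline) / spread
    return max(deviation, 0.0) * asset_value

# Usage sketch: reconnaissance events against a high-value server.
history = [4, 6, 5, 7, 5, 6, 4]              # daily counts observed during baselining
score = threat_score(todays_count=42, history=history, asset_value=3.0)
if score > 5.0:                              # site-specific threshold, tuned over time
    print("statistical correlation alarm, score =", round(score, 1))
```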
Challenges with Correlation

Both of the above types of correlation have inherent challenges, which can fortunately be mitigated by combining both methods to create coherent correlation coverage, leading to quality threat identification and ranking.

First, can we assume that the attacker will follow a scenario that can be caught by the rule-based correlation system? Unlike a network IDS, which needs a specific signature with detailed knowledge of the attack, a correlation rule might cover a broad range of malicious activities, especially if intelligent security event categorization is utilized. This can be done without going into the specifics of a particular IDS signature. For example, rules can be written to look for certain activities that usually accompany a system compromise, such as backdoor communication or the download of hacker tools. These are activities the attacker finds hard to avoid if he intends to use the compromised machine for his own purposes. Extensive research using deception networks (also called honeynets) allows one to learn more and more about attackers' patterns of behavior and to encode them as correlation rules, available out of the box.

Second, can multiple rules cause the number of false positives to increase instead of decrease? Indeed, deploying many rules without any regard for the environment might generate false alarms. However, it is much easier to understand and tune SIM correlation rules than intricate binary matching patterns. The latter require an in-depth understanding of the attack's network packets, memory corruption issues, and the specifics of the exploitation techniques. Tuning a correlation rule, on the other hand, involves changing the timeouts and adding or removing conditions. Overall, in the case of correlation rules, one can also define response actions with higher confidence, because one can bind the rules to a specific asset or group of assets.

Third, rule-based correlation is relatively computationally intensive. However, using highly optimized correlation engines and intelligently applying filters to limit the flow of events allows one to gain maximum advantage from rule-based correlation. Additionally, many rules can be combined so that the correlation engine does not have to keep many similar events in memory. It also makes sense to apply more specific correlation rules to the large number of ordinary assets, where a flood of false positives might endanger security, and to apply wider, more generic rules to critical assets, where an occasional false alarm is better than missing a single important alert. In this way, all the suspicious activities directed against the small group of critical assets will be detected.

Fourth, statistical correlation might not pick up anomalous activity if it is performed at low enough levels, essentially merging with the normal. Hiding attack patterns under volumes and volumes of similar normal activity might deceive the statistical correlation system. Similarly, a single occurrence of an attack might not impact the statistical profile enough to be noticed. However, carefully baselining the environment and then using statistical methods to track deviations from that baseline might allow one to detect some of the low-volume threats. Also, rule-based correlation compensates for those rare events and enables their detection, even if algorithmic correlation misses them.
MAXIMIZING THE BENEFITS OF CORRELATION

Correlation enables system users to take audit data analysis to the next level. Rule-based and statistical correlation allows the user to:

- Dramatically decrease the response times for routine attacks and incidents, using the centralized and correlated evidence storage
- Completely automate the response to certain threats that can be detected reliably by correlation rules
- Identify malicious and suspicious activities on the network even without any preexisting knowledge of what to look for
- Increase awareness of the network via baselining and trending and effectively take back your network
- Fuse data from various information sources to gain a cross-device business risk view of the organization
- Use statistical correlation to learn the threats and then deploy new rules for site-specific and newly discovered violations

Overall, combining rules and algorithms provides the best value for managing an organization's IT security risks.

CORRELATION RULE EXAMPLES

Probes Followed by an Attack

This rule watches for the general attack pattern consisting of reconnaissance activity followed by an exploit attempt. Attackers often use activities such as port scanning or application querying to scope out the environment, find targets for exploitation, and get an initial picture of system vulnerabilities. After performing the initial information gathering, the attacker returns with exploit code or automated attack tools to obtain actual system penetration. The correlation enriches the information reported by the IDS and serves to validate the attack and suppress false alarms. By watching for exploit attempts that follow reconnaissance activity from the same source IP address against the same destination machine, the SIM solution can increase both the confidence and the accuracy of reporting. After the reconnaissance event is detected by the system, the rule activates and waits for the actual exploit to be reported. If it arrives within a specified interval, the correlated event is generated. The notification functionality can then be used to relay the event to security administrators by e-mail, pager, or cell phone, or to invoke appropriate actions.
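A minimal Python sketch of this rule follows. The event categories, field names, interval, and notification stub are illustrative assumptions rather than any particular product's rule syntax.

```python
def correlate_probe_then_attack(events, interval=3600, notify=print):
    """Fire a correlated event when an exploit attempt follows reconnaissance from the
    same source against the same destination within 'interval' seconds.
    'events' must be ordered by timestamp and carry category, src_ip, dst_ip, timestamp."""
    recon_seen = {}  # (src_ip, dst_ip) -> time of the last reconnaissance event

    for ev in events:
        key = (ev["src_ip"], ev["dst_ip"])
        if ev["category"] == "reconnaissance":
            recon_seen[key] = ev["timestamp"]          # rule activates, waits for the exploit
        elif ev["category"] == "exploit" and key in recon_seen:
            if ev["timestamp"] - recon_seen[key] <= interval:
                # Correlated event: relay to administrators (e-mail, pager, etc.) from here.
                notify(f"probe-then-attack from {ev['src_ip']} against {ev['dst_ip']}")
            del recon_seen[key]
```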
Login Guessing

This rule watches for multiple failed authentication attempts against network and host services, followed by a successful log-in. While some intrusion detection systems can alert on failed log-in attempts, the correlation system can analyze such activity across all authenticated services, both networked (such as Telnet, SSH, FTP, and Windows access) and local (such as UNIX and Windows console log-ins). This rule is designed to track successful completion of such an attack; its triggering indicates that an attacker has managed to log in to one of your servers.

It is well known that system users often choose passwords that can be guessed in just a few tries. Intelligent automated guessing tools, available to hackers, allow them to cut the guessing time to a minimum. The tools use various tricks, such as trying to derive a password from a user's log-in name, last name, and so on. If those simple guessing attempts fail, hackers might resort to brute-forcing the password, trying all possible combinations of characters (such as letters and numbers). After a non-root (non-administrator) user password is successfully obtained, the attacker will likely attempt to escalate privileges on the machine.

The rule activates after the first failed attempt is detected. The event counter is then incremented until the threshold level is reached; at that point, the rule engine expects a successful log-in message. If such a message is received, the correlated event is sent. It is highly suggested to tune the count and the interval for the environment: up to three failed attempts within several minutes are usually associated with users trying to remember a forgotten password, while higher counts within a shorter period of time are more suspicious and may indicate a malicious attempt or a script-based attack.
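The counting logic just described might look like the following Python sketch; the threshold of five failures, the 300-second interval, and the field names are illustrative assumptions to be tuned for the environment.

```python
from collections import deque

def detect_login_guessing(events, threshold=5, interval=300, notify=print):
    """Flag accounts where 'threshold' failed log-ins within 'interval' seconds are
    followed by a successful log-in. Events must be time-ordered and carry
    timestamp, username, and outcome ('failure' or 'success')."""
    failures = {}  # username -> deque of recent failure timestamps

    for ev in events:
        user = ev["username"]
        recent = failures.setdefault(user, deque())
        # Keep only failures that fall within the sliding interval.
        while recent and ev["timestamp"] - recent[0] > interval:
            recent.popleft()

        if ev["outcome"] == "failure":
            recent.append(ev["timestamp"])
        elif ev["outcome"] == "success" and len(recent) >= threshold:
            notify(f"possible guessed password for account '{user}'")
            recent.clear()
```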
CONCLUSION

SIM products leveraging advanced correlation techniques and intelligent alert categorization are becoming indispensable as enterprises deploy more and more security point solutions, appliances, and devices. Those solutions alone address only small parts of a company's security requirements and need to be integrated under the umbrella of a Security Information Management solution, which will enable users to combat modern-day technology threats such as hackers, hybrid worms, and even internal abuse.
Notes
1. Binary = here, not containing human-readable text, but binary data.
2. AUP = Acceptable Use Policy.
