CUP: Comprehensive User-Space Protection For C/C++

CUP: Comprehensive User-Space Protection for C/C++
Nathan Burow Derrick McKee Scott A. Carr Mathias Payer

Purdue University Purdue University Purdue University Purdue University
arXiv:1704.05004v1 [cs.CR] 17 Apr 2017
Abstract hensively protects user-space is necessary to find and fix

these bugs.
Memory corruption vulnerabilities in C/C++ applica- To correctly address memory safety in user-space,
tions enable attackers to execute code, change data, and there are four main requirements. Precision addresses
leak information. Current memory sanitizers do not pro- spatial safety by requiring that exact bounds are main-
vide comprehensive coverage of a programs data. In tained for all allocations. Object Awareness prevents
particular, existing tools focus primarily on heap alloca- temporal errors by tracking whether the pointed-to object
tions with limited support for stack allocations and glob- is currently allocated or not. These two requirements are
als. Additionally, existing tools focus on the main exe- sufficient to enforce memory safety. Adding Comprehen-
cutable with limited support for system libraries. Further, sive Coverage expands this protection to all of user space
they suffer from both false positives and false negatives. by requiring that all data on the stack, heap and globals
We present Comprehensive User-Space Protection for be protected. Comprehensive Coverage implies that all
C/C++, CUP, an LLVM sanitizer that provides com- code must be instrumented with the sanitizer, including
plete spatial and probabilistic temporal memory safety system libraries like libc. A sanitizer that meets these
for C/C++ programs on 64-bit architectures (with a pro- three requirements is powerful enough to find all mem-
totype implementation for x86 64). CUP uses a hybrid ory corruption vulnerabilities in user-space programs. To
metadata scheme that supports all program data includ- be useful, such a sanitizer must also be usable in prac-
ing globals, heap, or stack and maintains the ABI. Com- tice. Requiring Exactness no false positives and min-
pared to existing approaches with the NIST Juliet test imal false negatives ensures that bugs reported by a
suite, CUP reduces false negatives by 10x (0.1%) com- sanitizer are real, and that all spatial and most temporal
pared to the state of the art LLVM sanitizers, and pro- violations are found. We discuss these challenges in 3.
duces no false positives. CUP instruments all user-space The research community has come up with many ap-
code, including libc and other system libraries, removing proaches that attempt to address memory safety. Ini-
them from the trusted code base. tial efforts to address spatial safety relied on fat point-
ers that store bounds information inline with point-
1 Introduction ers [26, 14], which unfortunately breaks the Application
Binary Interface (ABI). SoftBound [24] moves metadata
Despite extensive research into memory safety tech- to a disjoint table, maintaining the ABI, but adding over-
niques [38], exploits of memory corruptions remain com- head to lookup the metadata associated with tables. Low-
mon [40, 18, 36]. These attacks rely on the fact that Fat Pointers [9] reduces overhead by partitioning and
C/C++ require the programmer to manually enforce spa- aligning memory allocations, allowing alignment based
tial safety (bounds checks) and temporal safety (lifetime bounds checks. This however rounds object sizes, and
checks). As the continuing stream of memory corruption alters the memory layout of the program. AddressSani-
Common Vulnerabilities and Exposures (CVEs) shows, tizer [35] (ASan) provides probabilistic spatial safety by
these programmer added checks are often inadequate. relying on poison zones between objects, but is vulnera-
Many of these bugs are in network facing code such as ble to long strides that skip these zones. There are two
browsers [28, 29] and servers [30, 31], allowing attackers main approaches to temporal safety. Probabilistic ap-
to illicitly gain arbitrary execution on remote systems. proaches [2, 32] change the memory allocator to reduce
Consequently, a memory safety sanitizer that compre- the frequency with which memory is reallocated. De-
1
terministic schemes [25, 16] either maintain per pointer The first sanitizer to fully protect user-space, includ-
metadata that knows whether an object is still allocated, ing libc
or invalidate all pointers when the object is deallocated.
Comprehensive Coverage is largely an open problem, A new static analysis for determining what stack
with prior work mostly ignoring the stack and neglect- variables require active protection, and present a lo-
ing support for system libraries like libc. Low-Fat Point- cal protection scheme for non-escaping stack vari-
ers [9] is the only work to protect the stack providing ables
only spatial safety. No existing work provides spatial and
temporal safety comprehensively for all user-space data Evaluation of a CUP prototype that, using our hy-
(stack, heap, globals) and code (program code, libc, li- brid metadata model, results in (i) no false positives
braries). Doing so requires a large amount of additional and 0.1% false negatives on the NIST Juliet C/C++
metadata to protect the extra allocations 2.2, which ex- test suite and (ii) reasonably low overhead (in line
isting metadata schemes are unable to handle. Further, with other sanitizers).
the performance of existing tools does not allow them
to scale to handle the additional memory surface of the
stack and libc.
2 Background and Challenges
libc is a particularly critical part of user-space to pro-
Our requirements for Precision and Object Awareness
tect. It is prone to memory errors, notably the mem* and
are designed to enforce spatial and temporal memory
str* family of functions (memcpy, strcpy, . . . ). Mem-
safety, which we define here and then use to introduce
ory errors in libc are not limited to these functions, how-
the notion of a capability ID.
ever as shown by, e.g., GHOST [18] and a stack over-
flow in getaddrinfo [36]. Consequently, the entire libc
needs to be protected, not just certain interfaces. Spatial Vulnerabilities also known as bounds-safety
Exactness shows how well a memory safety solu- violations are over- or under-flows of an object.
tion protects against vulnerabilities in practice. The Over/under-flows occur when a pointer is increment-
U.S. National Institute of Standards and Technology ed/decremented beyond the bounds of the object that it
(NIST) maintains the Juliet test suite. Juliet consists of is currently associated with. Even if the out-of-bounds
thousands of examples of bugs, grouped by class from pointer still points to a valid object, it does not have the
the Common Weakness Enumeration (CWE). Juliet re- capability for the referenced object, and the operation re-
veals that existing, open source memory safety solu- sults in a spatial memory safety violation. However, this
tions [35, 24, 25] have both false positives and a non- violation is only triggered on a dereference of an out-of-
trivial number of false negatives 5.2. bounds pointer. The C standard specifically allows out-
CUP satisfies all four requirements for a powerful, us- of-bounds pointers to exist.
able memory sanitizer. We introduce a new hybrid meta-
data scheme which is capable of storing and using per Temporal Vulnerabilities also known as lifetime-
object metadata for the stack, libc, heaps, and globals. safety violations occur when the object that a pointers
Our metadata is precise and does not require altering the capability refers to is no longer allocated and that pointer
programs memory layout. Additionally, we introduce a is dereferenced. For stack objects, this is because the
new way to check bounds that leverages hardware to in- stack frame of the object is no longer valid (the function
crease our checks performance. Hybrid metadata allows it was created in returned); for heap objects, this happens
us to meet the Precision and Object Awareness require- as a result of a free. These errors do not necessarily cause
ments. CUP presents a novel use of escape analysis to re- segmentation faults (accesses to unmapped memory), be-
duce the amount of stack allocations without loss of pro- cause the memory may have been reallocated to a new
tection. This reduction allows scaling our mechanism to object. Similarly, we cannot simply track what mem-
include all user-space data, satisfying the Comprehensive ory is currently allocated, because the object at a partic-
Coverage requirement. Further, CUP successfully han- ular address can change, which still results in a temporal
dles all system libraries, including libc, the first memory safety violation. Temporal bugs are at the heart of many
sanitizer to do so. Our evaluation on Juliet 5.2 shows recent exploits, e.g., for Google Chrome or Mozilla Fire-
that we have no false positives and 0.1%false negatives, fox as shown in the pwn2own contests [39]
considerably advancing the state of the art for Exactness. Violating either type of memory safety can be formu-
We present the following contributions: lated as a capability violation. In our terminology, an
A new hybrid metadata scheme capable of tracking object is a discrete memory area, created by an alloca-
any runtime information about object allocations, tion regardless of location (stack, heap, data, bss under
and show how it can be applied to memory safety. the Linux ELF format). A capability identifies a specific
2
Memory Type Allocations memory allocations in SPEC CPU2006 (see Table 1).
global 0.0006% This measurement includes allocations made in libc.
heap 0.07% Further, the latest data from van der Veen, et. al. [40, 43]
stack 99.9% show that stack-based vulnerabilities are responsible for
an average of over 15% of memory related CVEs annu-
Table 1: Allocation distribution in SPEC CPU2006. ally since tracking began in November 2002. By com-
parison, heap-based vulnerabilities account for an aver-
object, along with information about its bounds and allo- age of 25% of memory related CVEs over the same time
cation status. Pointers retain a capability ID that identi- period. Given the stacks exploitability and prevalence,
fies the capability of the object that was most recently which stresses memory safety designs, protecting it is a
assigned either directly from the allocation or indi- key design challenge for memory safety solutions.
rectly by aliasing another pointer [12]. Capabilities form
a contract, upon dereference: (i) the pointer must be in 3 Design
bounds, and (ii) the referenced object must still be allo-
cated. Violating the terms of this contract leads to spatial CUP provides precise, complete spatial memory safety
or temporal memory safety errors respectively. and stochastic temporal memory safety by protecting
all program data, including libc (and any other library).
2.1 Vulnerable Objects Safety is enforced, for all program data, by dynami-
cally maintaining information about the size and alloca-
Comprehensive Coverage requires that we protect all tion status of all objects that are vulnerable to memory
memory objects. However, some objects are inherently safety errors. This information is recorded through our
safe, so it is sufficient to protect all vulnerable objects. novel hybrid metadata scheme 3.1. A compiler-based
In particular, an object is vulnerable if accesses to it are instrumentation pass is used to add code that records and
calculated dynamically, and not by fixed offsets. This checks metadata at runtime 3.2. We provide a detailed
happens with pointers, and array accesses (which are argument for why our instrumentation guarantees mem-
just syntactic sugar for pointer arithmetic). Dynamic ad- ory safety in 3.3.
dress calculation does not happen for variables on the A powerful usable memory sanitizer must comply to
stack which are not arrays. Such local variables are ac- the following requirements:
cessed by fixed offsets, calculated at compile time, from
the stack frame, and can only be used maliciously after I Precision. The solution must enforce exact object
an initial memory corruption. In particular, accessing a bounds, ideally without changing memory layout
pointer on the stack does not need to be protected, only (i.e., spatial safety).
the pointers dereference, which dynamically looks up
memory. II Object Aware. The solution must remember the al-
location state of any object accessed through pointer
(i.e., temporal safety).
2.2 Comprehensive Coverage Challenges
III Comprehensive Coverage. The solution must fully
Understanding the scope of the challenge presented by protect a programs user-space memory including
Comprehensive Coverage is critical to understanding the stack, heap, and globals, requiring instrumen-
CUPs design. To illustrate this challenge, we show tation and analysis of all code, including system li-
how intensively programs use different logical regions braries such as libc (i.e., completeness).
of memory. While the operating system presents appli-
cations with a contiguous virtual memory address space, IV Exactness. The solution must have no false posi-
that address space is partitioned into three logical groups tives, and any false negatives must be the result of
for data: global, heap, and stack spaces. Global memory implementation limitations, not design limitations
is allocated at application load-time, and is available for (i.e., usefulness).
the entire lifetime of the application, (i.e. global mem-
ory is never explicitly deallocated). Heap memory is re- These requirements drive the design of CUP. Fully
quested by the application through the new operator or complying with the Precision and Object Awareness re-
a call to malloc, and is deallocated via the delete op- quirements relies on creating metadata for all allocated
erator or a free call. Stack memory space is implicitly objects. While it is possible [1, 9] to do alignment based
allocated with function calls, and is again implicitly deal- spatial checks without metadata, these schemes loose
located with a return call. precision, alter memory layout, and cannot support Ob-
Stack allocations account for almost all (99.9%) of ject Awareness. Object Awareness for temporal checks
3
1 struct pointer fields { capability ID space only needs to support the maximum
int32 id ; number of simultaneously allocated objects. This allows
3 int32 o f fs e t ; CUP to comprehensively cover globals, the heap, and
}
5
stack for all allocations in long running applications.
union e n r i c h e d p t r { Our metadata scheme that draws inspiration from both
7 struct pointer fields capability ; fat-pointers and disjoint metadata (and is thus a hy-
void p t r ;
9 }
brid of the two) for 64-bit architectures. We conceptu-
ally reinterpret the pointer as a structure with two fields,
Listing 1: Enriched Pointer as illustrated by Listing 1. The first field contains the
pointers capability ID. The second field stores the off-
set into the object. This does not change the size of the
requires metadata to either lookup whether the object is pointer, thus maintaining the ABI. Further, when point-
still valid [25] or to find all pointers associated with an ers are assigned, the capability ID automatically transfers
object and mark them invalid upon deallocations [16]. to the assigned pointer without further instrumentation.
Consequently, CUP is a metadata based sanitizer. Hybrid metadata rewrites pointers to include the ca-
CUP provides Comprehensive Coverage, and in par- pability ID of their underlying object and current off-
ticular protects globals, the heap, and stack by instru- set, creating enriched pointers. The size of the offset
menting all code, including libc. Our hybrid metadata field limits the size of supported object allocations. The
scheme scales to handle the required number of alloca- tradeoffs of the field sizes and our implementation deci-
tions 2.2, and our bounds check leverages the x86 64 sions are discussed in 4.1. While we use it for memory
architecture 4.2.2 to perform the required volume of safety, this design allows access to arbitrary metadata,
checks quickly enough to be usable. Additionally, our and could be applied for, e.g., type safety, or any prop-
compiler pass is robust enough to handle libc 4.3, mak- erty that requires runtime information about object allo-
ing CUP the first memory sanitizer to do so. cations.
Exactness is achieved, in part, failing closed, making Capability IDs in our hybrid-metadata scheme are in-
a missing check equivalent to a failed check. We modify dexes into a metadata table. Each entry in this table is a
the initial pointer returned by object allocation 3.2, and tuple of the base and end addresses for the memory ob-
our modification marks it illegal for dereference. This ject, required for spatial safety checks. Each object that
modification propagates through aliasing and all other is currently allocated has an entry in the table, leading
operations naturally. Consequently, we must check all to a memory overhead of 16 bytes per allocated object.
uses of the pointer for the program to execute correctly. Note that we do not require per-pointer metadata due to
Such an approach results in optimal precision at the cost our hybrid scheme. To reduce the number of required
of higher engineering burden (as shown in 4 and 5), IDs to the number of concurrently active objects, we al-
but dramatically reduces false negatives. False posi- low capability IDs to be reused. Allowing ID reuse thus
tives are prevented by maintaining accurate metadata, allows us to protect long running programs, as our limit
and having it propagate automatically. is on concurrent pointers, not total allocations supported.
The security impact of ID reuse is evaluated in 6.
The metadata table provides strong probabilistic Ob-
3.1 Hybrid Metadata Scheme ject Awareness. For a temporal safety violation to go un-
To provide Precision, Object Awareness, and Compre- detected, two conditions must hold. First, the capability
hensive Coverage, we introduce a new hybrid metadata ID must have been reused. Second, the accessed mem-
scheme that lets us embed a capability ID in a pointer ory must be within the bounds of the new object. Cur-
without changing its bit width. This capability ID ties a rent heap grooming techniques [11, 37] already require
pointer to the capability metadata for its underlying ob- a large number of allocations to manipulate heap state.
ject. Precision is provided by the metadata containing Adding the requirement that the same capability ID also
exact bounds for every object, and by not rearranging the be used makes temporal violations harder. 6 contains
memory layout. Object Awareness results from having a other suggestions to further increase the difficulty.
unique metadata entry for each capability ID. We are aware of two memory safety concerns for hy-
Providing Comprehensive Coverage requires assign- brid metadta: (i) arithmetic overflows from the offset to
ing a capability ID to all vulnerable objects in order to the capability ID, and (ii) protecting the metadata table.
associate their pointers with the objects capability meta- The first concern is addressed by operating on the two
data. However, the capability ID space is fundamentally fields of the pointer separately. By treating them like
limited by the width of pointers. To address this limit, separate variables while maintaining them as one en-
we allow capability IDs to be reused. Consequently, our tity in memory we prevent under/over flows from the
4
offset field modifying the capability ID. The second con- plicitly deallocated when their dominating function re-
cern is not relevant for CUP if all memory accesses turns. Deallocations are instrumented to mark associated
are checked, then the metadata table cannot be modified metadata invalid and to reclaim the capability ID.
through a memory violation. Pointer derefences are found by traversing the use-def
chain of identified pointers. Dereferences are analyzed
intra-procedurally, so we include pointers from function
3.2 Compiler Modifications arguments (including variadic arguments) and pointers
Our compiler adds instrumentation to create entries in returned by called functions in the set of allocations for
the metadata table and perform runtime checks. In par- this analysis. We instrument dereferences with a bounds
ticular, we first analyze all allocations (Comprehensive check. Note that the bounds check implicitly checks that
Coverage), and filter them to the ones we must protect the pointers capability ID identifies the correct object.
to provide Precision and Object Awareness. This section See 3.3 for a discussion of the safety guarantees.
also shows how CUP fails closed, and checks all required CUP also inserts instrumentation to handle int to
pointers, both of which contribute to its Exactness. pointer casts. These are commonly inserted by LLVM
The analysis phase identifies when objects are allo- during optimization, and have matching pointer to int
cated or deallocated, and when pointers are dereferenced casts in the same function. In this case, and any others
through an intra-procedural analysis. All pointers that where we can identify a matching pointer to int cast, we
are passed inter-procedurally are instrumented using our restore the original capability ID to the pointer. If we are
metadata scheme, including all heap allocations. unable to find a matching pointer to int cast, we default
We manually annotate libc 4.3 to mark heap alloca- to capability ID zero, which is all of user-space.
tions. For global variables we protect arrays as entries
are referenced indirectly (i.e., with pointer arithmetic). 3.3 Memory Safety Guarantees
Similarly for stack allocations, we only protect arrays, as
well as any address taken variable. We leverage existing We discuss how CUP guarantees spatial memory safety
LLVM analysis to find address taken variables. and probabilistically provides temporal safety. We as-
Our analysis further divides protected stack alloca- sume that all code is instrumented and capability IDs are
tions into (i) escaping and (ii) non-escaping allocations. protected against arithmetic overflow (as proposed).
An allocation does not escape if the following holds: (i) For code that we instrument, we keep a capability ID
it does not have any aliases, (ii) it is not assigned to the (and thus metadata) for every memory object that can be
location referenced by a pointer passed in as a function accessed via a pointer. This subset is sufficient to enforce
argument, (iii) it is not assigned to a global variable, (iv) spatial memory safety. Objects that are not accessed via
it is not passed to a sub-function (our analysis is intra- pointers are guaranteed to be safe by the compiler (if you
procedural excluding inlining), and (v) is not returned are reading an int, it will always emit instructions to
from the function. For those that escape, we use our read the correct 4 bytes from memory).
usual metadata scheme so that the bounds information Pointers can be used to read or write arbitrary mem-
can be looked up in other functions. For those that do ory. Further, the address that they reference is often de-
not escape, we use an alternate instrumentation scheme. termined dynamically. Thus, pointers require dynamic
The optimized instrumentation for non-escaping stack checks at runtime for memory safety guarantees. As de-
variables scheme creates local variables with base and fined in 2, memory objects define capabilities for point-
bounds information. Since these allocations are only ers. These capabilities include the size and validity of the
used within the body of the function, we use local vari- object. We only create capabilities when objects are al-
ables for checks instead of looking up the bounds in the located at runtime. Objects can change size due to, e.g.,
metadata table. This reduces pressure on our capability realloc() calls, in which case we update our metadata
IDs, helping us to achieve Comprehensive Coverage. appropriately by changing base and end to the new val-
All other allocation sites requiring metadata are instru- ues 4.2.2. Thus, we always have correct metadata for
mented to assign the object the next capability ID and to every object that has been created since the start of exe-
create metadata (recording its precise base and end ad- cution. The metadata for objects that have not been cre-
dresses) returning an enriched pointer. We create meta- ated yet is invalid by default.
data at allocation because it is the only time that we are Pointers can receive values in five ways. First, point-
guaranteed to know the size of the object. ers can be directly assigned from the memory allocation,
Identifying deallocations for objects is straightfor- e.g., through a call to malloc(). We have instrumented
ward. Global objects are never deallocated over the life- all allocations to return instrumented pointers. Second,
time of the program. Heap objects are explicitly deallo- they can receive the address of an existing object, via the
cated by, e.g., free() or delete. Stack objects are im- & operator. We treat this as a special case of object allo-
5
cation and instrument it. Third, pointers can be assigned 1 uint 32 n ex t en tr y = 1;
to the value of another pointer. As all existing point-
ers have been instrumented under the first two scenarios, 3 / / T h i s i s done i n l i n e , f u n c t i o n s a r e
illustrative
this case is covered as well. Fourth, pointers can be as-
signed the result of pointer arithmetic. This is handled 5 v o i d o n a l l o c a t i o n ( s i z e t b as e , s i z e t end ) {
naturally, with our separate loads preventing overflows s i z e t o f f s e t = t a b l e [ n e x t e n t r y ] . base ;
7 t a b l e [ n e x t e n t r y ] . base = base ;
into the capability ID. The fifth scenario is a cast from an t a b l e [ n e x t e n t r y ] . end = end ;
int to a pointer. This is exceedingly rare in well written 9 u i n t 3 2 r e t = 0 x80000000 & n e x t e n t r y ;
user-space code. However, the compiler frequently in- ne xt e nt ry = ne xt e nt r y + o f f s e t + 1;
serts these operations in optimized code. As a result, we 11 r e t u r n ( v o i d ) ( r e t << 3 2 ) ;
}
have to allow these operations. We assume that all ints 13 void o n d e a l l o c a t i o n ( i n t id ) {
being cast to pointer were previously a pointers, and thus t a b l e [ id ] . base = n e x t e n t r y id 1;
instrumented. 15 t a b l e [ i d ] . end = 0 ;
So far we have established that all pointers are en- ne xt e nt ry = id ;
17 }
riched with capabilities that accurately reflect the state
of the underlying memory object. Memory safety vio- Listing 2: Free List
lations occur when pointers are dereferenced [38]. We
instrument every dereference to check the pointers capa-
bility and ensure that the dereference is valid. Because
each pointer has a capability and each capability is up- (iv) how to divide the 64 bits in a pointer between the
to-date this ensures full memory safety. capability ID and offset in our enriched pointers 3.1.
A programs is memory safe before any pointer deref- Our metadata table is maintained as a global pointer to
erence happens. We have shown that each type of pointer a mmapd region of memory. Similarly, the next entry in
dereference is protected. Consequently, every pointer that table is a global variable known as next entry.
dereference is valid. Thus, our scheme preserves mem- By mmaping our metadata table, we allow the kernel
ory safety for the entire program. to lazily allocate pages, limiting our effective memory
overhead. Further, our ID reuse scheme reduces frag-
4 Implementation mentation of our metadata since it will always reuse a ca-
pability ID before allocating a new one. This also helps
We implemented CUP on top of LLVM version 4.0.0- improve the locality of our metadata lookups, reducing
rc1. Our compiler pass is 2,500 LoC (lines of code), cache pressure. Alternative reuse schemes with better
the runtime is another 300 LoC for 2,800 LoC total. temporal security are discussed in 6
The line count excludes modifications to our libc, which To implement our capability ID reuse scheme, we up-
required only light annotations 4.3. Our pass runs after date next entry using our free list. The first entry in our
all optimizations, so that our instrumentation does not metadata table is reserved 4.2.2, so next entry is ini-
prevent compiler optimizations. This also reduces the tially one. The free list is encoded in the base fields of
total amount of memory locations that must be protected, each free entry in the table. These are all initialized to
reducing capability ID pressure. zero. When an entry is freed, the base field is set to the
Here we discuss the technical details of how we im- offset to the next available table entry. When we add a
plemented CUP in accordance with our design 3. We metadata entry, next entry is incremented, and the off-
first discuss how our hybrid metadata scheme is imple- set is added. When an object is deallocated, we have
mented. Next we present how we find the sets of alloca- to update the base field for its corresponding capability
tions and dereferences required by our design. With the ID (ID) to maintain the free list correctly. This requires
metadata implementation in mind, we then show how we calculating the offset to the next free entry. C code illus-
instrument allocations and dereferences. With these de- trating these operations is in Listing 2.
tails established, we discuss the modifications required The final implementation decision for our metadata
to libc for it to work with CUP. scheme is how to divide the 64 bits of the pointer be-
tween the capability ID and offset. We use the high
order 32 bits to store our enriched flag and capability
4.1 Metadata Implementation
ID Figure 1. This leaves the low order 32 bits for the
Our metadata scheme consists of four elements: (i) a ta- offset. Limiting the offset to 32 bits does limit individual
ble of information, (ii) a bookkeeping entry for the next object size to 4GB under our current design (with up to
entry to use in that table, (iii) a free list (encoded in the 231 such allocations). However, hardware naturally sup-
table) that enables us to reuse entries in the table, and ports 32-bit manipulations, improving the performance
6
1 s t r u c t Metadata {
Offset s i z e t base ;
z }| { 3 s i z e t end ;
0x |80000001
{z } 00000000 }
Capability ID 5
s t r u c t Metadata t a b l e ;
7
Figure 1: A potential pointer value after enrichment. / / The f o l l o w i n g co d e i s p u r e l y i l l u s t r a t i v e .

9 / / T h i s i s a l l done i n l i n e i n LLVM IR i n
Note that the high order bit is 1 to indicate that it is en- / / our implementation .
riched. 11
s t a t i c i n l i n e s i z e t c h e c k b o u n d s ( s i z e t b as e ,
s i z e t end , s i z e t ch eck ) {
of our implementation. Further, having a 31-bit capabil- 13 s i z e t v a l i d = ( ch eck b a s e ) | ( end (
ity ID space is crucial for protecting the entire applica- ch eck + s i z e ) ) ;
v a l i d = v a l i d & 0 x8000000000000 00 0 ;
tion 2.2. The enriched bit helps make our engineering 15 return valid ;
effort easier - with it we can safely check an unenriched }
pointer dereferences 4.2.2, allowing us to be be conser- 17
vative and over-approximate in the set of pointer deref- s t a t i c i n l i n e v o i d ch eck ( v o i d p t r , u i n t 3 2

size ){
erences that we check 4.2.2. The most common use 19 s i z e t tmp = ( s i z e t ) p t r ;
case for this is pointers returned from the kernel in, e.g., s i z e t mask = p t r >> 6 3 ;
malloc(). Being able to handle non-enriched pointers 21 u i n t 3 2 i d = ( tmp >> 3 2 ) & 0 x 7 f f f f f f f ;
i d = i d & mask ;
allows us to intervene only when the pointer is returned
23 s i z e t base = t a b l e [ id ] . base ;
to the user from libc 4.3, and keep dynamic allocation s i z e t end = t a b l e [ i d ] . end ;
outside the trusted computing base. 25 s i z e t ch eck = b a s e + ( u i n t 3 2 ) p t r ;
With a minimal allocation size of 8 bytes, a 31-bit r e t u r n ( v o i d ) ( c h e c k b o u n d s ( b as e , end ,
ch eck ) & ch eck ) ;
ID allows for at least 16 GB of allocated memory. In 27 }
practice, much more memory can be allocated as ob-
jects are usually larger than 8 bytes. When fully allo- 29 v o i d s e t ( i n t x , i n t v a l ) {
cated, our metadata table uses 2GB * sizeof(struct ( ch eck ( x , 4 ) ) = v a l ;
31 }
Metadata), see Listing 3. Note that CUP only allocates
pages for IDs that are actually used. 33 / / ex am p le o f d e r e f e r e n c i n g an e s c a p i n g and a
local stack array
i n t main ( v o i d ) {
4.2 Compiler Pass 35 i nt escapes [ 5 ] ;
e s c a p e s = o n a l l o c a t i o n ( e s c a p e s , e s c a p e s +5
Our LLVM compiler pass operates in two phases: (i) 4) ;
analysis, and (ii) instrumentation. As per our design, the 37 s e t ( es capes [ 2 ] , 10) ;
analysis phase first determines a set of code points that
39 int local [5];
require us to add code to perform our runtime checks, size t local base = local ;
and the instrumentation phases adds these checks. These 41 s i z e t l o c a l e n d = l o c a l +5 4 ;
checks have been optimized to let the hardware detect ( l o c a l & check bounds ( l o c a l b a s e ,
l o c a l e n d , l o c a l +2 4 ) ) = 1 0 ;
bounds violations rather than doing comparisons in soft- 43
ware. Listing 3 has a running example for stack objects. on deallocation ( escapes ) ;
45 }
4.2.1 Analysis Implementation Listing 3: Instrumentation Example

The first task of our pass is to find the set of object alloca-
tions that we must protect to guarantee spatial safety 3.
Heap-based allocations via malloc are found by our in-
strumented musl libc as detailed in 4.3. Stack-based al- offsets from the frame pointer. Arrays are trivially found
locations are found by examining alloca instructions in by checking the type of alloca instruction as LLVMs
the LLVM Intermediate Representation (IR). These are type system for their IR includes arrays. LLVMs IR has
used to allocate all stack local variables. However, as no notion of the & operator. However, clang (the C/C++
detailed in 2.1 we only target allocations which can be front end) inserts markers llvm.lifetime.start
indirectly accessed via, e.g., pointers. In practice, this which we use to identify stack allocations that have their
means that we need to protect arrays and address-taken address taken.
variables on the stack, all others are accessed via fixed CUP also protects global variables. As shown in 2.1,
7
we only need to protect global arrays (logic is identi- based allocations.
cal to protecting stack arrays). Global arrays present a
challenge for our instrumentation scheme. CUP relies Allocation Instrumentation CUP requires us to
on changing pointer values. Unfortunately in C/C++ it rewrite the pointer for every allocation that we iden-
is illegal to assign to a global array once it has been al- tify as potentially unsafe. Our rewritten or enriched
located. This means that we cannot change the pointers pointer contains the assigned capability ID, has the en-
value. To address this challenge, we create a new global riched bit set, and the lower 32 bits (which encode the
pointer to the first element in the array, and instrument distance from the base pointer) are set to 0. All uses
that pointer. We then replace all uses of the global ar- of the original pointer are then replaced with the new,
ray with our new pointer that can be manipulated as de- instrumented pointer. At allocation time, we use the
scribed above. The new pointer must be initialized at next entry global variable as the capability ID, and then
runtime, once the address of the global array it replace update next entry as described in 4.1. See the escapes
has been done this. To do this, we add a new global con- variable in main() in Listing 3.
structor that initializes our globals. The constructor is Note that this creates a pointer which cannot be deref-
given priority such that it runs before any code that relies erenced on x86 64, which requires that the high order 16
on our globals. bits all be 1 (kernel-space) or 0 (user-space). As a result,
Once the analysis pass has identified the set of allo- any dereference without a check will cause a hardware
cations that need to be protected, it must find all deref- fault. Consequently, for any program that runs correctly
erences of pointers to those objects. For stack and heap we can guarantee that all pointers to protected allocations
variables, it uses an intra-procedural analysis to do this. are checked on dereference. This is in contrast to other
In particular, the analysis pass iterates over the set of pro- schemes [24] that fail open, i.e., silently continue, when
tected stack allocations in the function, pointers passed in a check is missed, sacrificing precision.
as arguments (including variadic arguments), and point- When an object is deallocated, we insert code to up-
ers returned by called functions. For each such pointer, date the free list in our metadata table as per 4.1
it traces through the use-def chain looking for deref- and Listing 2. Further, we mark the end address 0 to
erences. In LLVM IR, this corresponds to load and invalidate the entry.
store instructions for read and write derefences respec-
tively. Once these are found, they are added either to the Dereference Instrumentation To dereference a
list of local checks (if the originating allocation is non- pointer, two things need to be done. First, we need
escaping), or the list of checks using our metadata that to recreate the unenriched pointer. Then, using the
need to be inserted. metadata from the enriched pointers capability ID,
Global variables are handled slightly differently. we need to make sure that the unenriched pointer is in
LLVM maintains their use-def chains at the Module bounds.
level, corresponding to a source file. Therefore we ap- To create the unenriched pointer, we first retrieve the
ply the same analysis used for stack / heap variables, high order bit (which indicates whether the pointer is en-
but it operates over the entire module instead of intra- riched or not). We create a 64-bit mask with the value
procedurally. of this bit. We then extract the capability ID, and AND
it with this mask. If the pointer was not enriched, this
4.2.2 Instrumentation yields an ID of 0 and otherwise preserves the capability
ID. We then lookup the base pointer for the capability
Our compiler is required to add two types of instrumen- ID, and add the offset to it. See check in Listing 3.
tation to the code: (i) allocation, and (ii) dereference. In the case where the pointer was not enriched, we
Allocation instrumentation is responsible for assigning lookup the reserved entry 0 in our metadata table. This
a capability ID to the resulting pointer, creating meta- entry has base and end values that reflect all of user-
data for it, and returning the enriched pointer. A sub- space (0 to 0xffffffffffff). Thus, performing our unen-
category of allocation instrumentation is handling deal- richment on an unenriched pointer has no effect, and our
locations where metadata must be invalidated and the spatial check below simply sandboxes it in user-space.
free list updated. Dereference instrumentation is respon- Our spatial check performs the requisite lower and up-
sible for performing our bounds check, and returning a per bounds check. Note, however, that on the upper
pointer that can be dereferenced. While our runtime li- bound we have to adjust for the number of bytes being
brary provides functions for the functionality described read or written. This adjustment adds significantly to
in this section (for use in manual annotations), all of our our improved precision against other mechanisms 5.2.
compiler added checks are done inline. Listing 2 and To illustrate its importance consider the following. An
Listing 3 contain examples for these operations for stack int pointer is being dereferenced, meaning the last byte
8
used is the pointer base + 4 bytes - 1, while for a char headers before each allocated segment. Consequently,
pointer, the last byte used is the pointer base + 1 byte - 1. the allocators view of memory is different than that of
Equation 1 shows our bounds check formula, the size of the program.
the pointer dereferenced is element size. To compensate for this difference, we manually instru-
mented musls allocator. We ensure that pointers are in-
strumented with the programmer specified length, not the
base ptr + element size 1 < base + length (1) rounded length. Further, we manually unenrich pointers
as necessary to allow musl to read the header blocks that
Hardware Enforcement The check in Equation 1 proceed heap data segments. Without our intervention,
could naively be implemented with comparison instruc- these would appear to be out of bounds.
tions and a jump resulting in additional overhead. We An interesting corner case is realloc(). By design
propose a novel way to leverage hardware to perform it changes the end address. Additionally, it can change
the check for us. We observe that the difference be- the base address if it was forced to move the allocation
tween the adjusted pointer and the base address should to find sufficient room. We manually intervene in both
always be greater than or equal to 0. Similarly, the cases to keep our metadata table up-to-date.
end address minus the adjusted pointer should always be
greater than or equal to 0. Consequently, the high order 4.3.2 System Calls
bit in the differences should always be 0 (x86 64 with
twos complement arithmetic). We OR these two differ- System calls are made through a dedicated API contain-
ences together, mask off the low order 63 bits, and then ing inline assembly in musl. We initially instrumented
OR the result into the unenriched pointer. If it passed this API to ensure that no enriched pointers are passed
the check, this changes nothing. If it failed the check, to the kernel. Unfortunately, this is insufficient as structs
it results in a invalid pointer, causing a segfault when containing pointers are passed to the kernel (FILE structs
dereferenced. Listing 3 shows this optimized check in in particular). Consequently, we required more context
check bounds(). than the narrow system call API provided. This forced
us to find the actual system call sites and add additional
instrumentation to ensure that all pointers passed to the
4.3 LIBC kernel are unenriched.
Libc is the foundation of nearly every C program and
therefore linked with nearly every executable. Unfor- 4.3.3 Assembly Functions
tunately, many of its most popular functions such as
musl implements many of the mem*, str* functions in
the str* and mem* functions are highly prone to mem-
assembly for x86. As a result, clang is unaware of
ory safety errors. They make assumptions about pro-
these functions as they are directly assembled and linked
gram state (e.g., null terminated strings, buffer sizes)
in. We therefore manually instrumented these functions.
and rely on them without checking that they hold. Prior
This required minimal intervention to call the relevant
work [24, 35, 9] assumes that libc is part of the trusted
function in our runtime library, while preserving the reg-
code base (TCB).
ister state and return value of the functions we annotated.
In contrast, CUP removes libc from the TCB by instru-
menting libc with our compiler. The majority of the in-
strumentation is automatic, with few exceptions such as 4.3.4 Memory Safety in Musl
the memory allocator, system calls, and functions imple-
musl itself is not memory safe. As an example,
mented in assembly code. The most mature libc imple-
strlen() reads 4 bytes at a time as an optimization.
mentation that we are aware of that compiles with Clang-
This means that it can overread by as many as 3 bytes.
4.0.0-rc1 is musl [20]. Our instrumented musl libc is part
To prevent this, we modify strlen() to read byte by
of CUP.
byte. Our experiments show that this does not effect per-
formance for strings of length less than O(105 ). Other
4.3.1 Dynamic Memory Allocator str* functions overread as well when looking for the
NULL terminator.
The dynamic memory allocator is responsible for re-
questing memory for the process from the kernel, and
returning it. To do so efficiently, most allocators in- 5 Evaluation
cluding musl modify user requests. In particular, musl
rounds up the number of bytes requested. Further, musl We evaluate CUP along two axes. First, we show that its
maintains metadata inline on the heap in the form of performance is competitive with existing sanitizers. Fur-
9
483.xalancbmk
462.libquantum
4435.gobmk
482.sphinx3
464.h264ref
450.soplex
456.hmmer
444.namd
473.astar
401.bzip2
Geomean
458.sjeng
433.milc
470.lbm
429.mcf
musl-baseline 382 213 386 278 371 369 367 327 166 431 278 183 201 403 297
ASan - 261 - 596 630 264 - 464 369 444 377 261 243 - 369
SoftBound+CETS 2004 - - - 1372 - - - - 1543 - - 409 - 1148
CUP 1131 503 1255 1268 1192 391 1196 955 - 638 905 - 370 1185 839
Table 2: Performance Results. indicates C++ benchmark
CUP 158% run, CUP is faster than SoftBound+CETS. Further, com-

ASan 38% pared to these existing tools, we offer stronger, more
SoftBound+CETS 53% precise coverage 5.2. As we show in 5.2, CUP has
10x the coverage of SoftBound+CETS, for less over-
Table 3: Percent Overhead over Baseline head. Low-Fat Pointers [9] reports 16% to 62% over-
head depending on their optimization level. They achieve
CUP 83% this by using clever alignment tricks to avoid metadata
SoftBound+CETS 134% look ups. This has two drawbacks: (i) they round allo-
cations up to the nearest power of two, losing precision
Table 4: Percent Overhead over ASan
for bounds checks, and (ii) their design cannot support
temporal checks which require metadata.
ther, our performance is markedly better than other tools
that provide both Precision and Object Awareness, given
5.2 Juliet Suite
that CUP provides Comprehensive Coverage. Second,
we demonstrate our Exactness, showing that we have no NIST maintains the Juliet test suite, a large collection
false positives and two orders of magnitude fewer false of C source code that demonstrates common program-
negatives than existing open source tools on the NIST ming practices that lead to memory vulnerabilities, orga-
Juliet suite. Furthermore, the false negatives that we do nized by Common Weakness Enumeration (CWE) num-
register are architecturally dependent, and do not actually bers [15]. Every example comes with two versions: one
represent a failure to fulfill memory safety requirements. that exhibits the bug and one that is patched. We ex-
All experiments were run on Ubuntu 14.04 with a tracted the subset of tests for heap and stack buffers1.
3.40GHz Intel Core i7-6700 CPU and 16GB of memory. We compiled all sources with our pass, as well as with
SoftBound+CETS2 and with AddressSanitizer [35]. Ev-
ery patched test should execute normally. If a memory
5.1 Performance
protection mechanism prematurely kills a patched test,
Performance is an important requirement for any us- we call it a false positive. Conversely, every buggy test
able sanitizer. We evaluate the performance of CUP should be stopped by the memory safety mechanism. All
with musl. For comparison, we also measure the per- three memory protection mechanisms kill the process in
formance overhead of similar sanitizers, using glibc as case of a memory violation. If the process is not killed,
the baseline. AddressSanitizer is the closest open source we report a false negative.
related work that is compatible with LLVM-4.0.0-rc1. Table 5 summarizes the results. Out of 4,038 tests that
SoftBound+CETS is open source, but relies on LLVM- should not fail, we incur no false positives. ASan and
3.4 and can only run a small subset of SPEC CPU2006 SoftBound+CETS show a 0% and 0.3% respectively. We
benchmarks. Its performance results are reported here, produce a 0.1% false negative rate, while ASan produces
but are not directly comparable. Low-Fat Pointers is not an 8% false negative rate, and SoftBound+CETS has a
yet open source. 25% false negative rate.
Table 3 summarizes the performance results for CUP. The false positives that SoftBound+CETS registers
We have 158% overhead vs baseline, compared to 38% comes from variations in how the alloca() function
for AddressSanitizer. Comparing CUP directly with Ad- 1 The tests that match the following regular expression:
dressSanitizer Table 4 shows that we have 83% overhead CWE(121|122)+((?!socket).)*(\.c)$
relative to AddressSanitizer. On benchmarks where both 2 git commit 9a9c09f04e16f2d1ef3a906fd138a7b89df44996
10
False Neg. False Pos. 1 i n t main ( ) {
CUP 4 (0.1%) 0 (0%) i n t d ata = rand ( ) ;
SoftBound+CETS 1032 (25%) 12 (0.3%) 3 char b u f f e r = ( char ) a l l o c (10 , 1) ;
memset ( ( v o i d ) b u f f e r , A , 1 0 ) ;
AddressSanitizer 315 (8%) 0 (0%) 5 b u f f e r [ d a t a ] = B ;
Total Tests 4038 }
Listing 4: Example of code ASan fails to protect

Table 5: Juliet Suite Results
call is handled. alloca() is compiler dependent [17]. 6 Discussion

The failing examples use alloca which is wrapped
around a static function. SoftBound+CETS uses clang We discuss several design aspects of CUP, how we han-
3.4 as the underlying compiler, and CUP uses clang 4.0. dle specific corner cases, potential for optimization, and
Consequently, SoftBound+CETS handles the examples further extensions.
differently, and sees the memory from alloca as invalid,
while CUP does not. Reducing instrumentation. Prior work on reducing
Our false negatives are architecturally dependent. The the amount of required runtime checks has focused on
examples we fail to catch attempt to allocate memory type systems. CCured [26] infers three types of point-
for a structure containing exactly two ints. However, ers: safe, sequential, and wild. Safe pointers are stati-
erroneously, the examples use the size of a pointer to cally proven to stay in bounds. Sequential pointers are
the two int structure when allocating memory. (e.g. only incremented (or decremented) e.g., iterating over
malloc(sizeof(TwoIntStruct*))). On architec- an array in a loop. All remaining pointers are classified
tures which do not define pointers as twice the size of as wild. Leveraging CCured-style type systems to fur-
ints (including 32-bit x86), such a mistake would lead ther optimize memory safety solutions is left as an open
to a memory violation. With the x86 64 architecture, problem.
though, the size of a pointer and the size of the two int
structure are the same. Thus, while semantically incor-
rect, no true memory violation occurs. No false positives, Optimization through LTO. CUP does not depend on
combined with no true false negatives shows that CUP Link Time Optimization (LTO). However, LTO makes
fulfills the Exactness requirement. it possible to inline functions across source files, and
ASan and SoftBound+CETS higher false negative rate generally flattens code. Inlining would increase the ef-
results from their incomplete coverage. In particular, fectiveness of our stack optimization and further reduce
many of the Juliet examples involve calling libc functions the amount of instrumented stack variables, reducing the
to copy strings and buffers (e.g. strcpy and memcpy). number of IDs that a program consumes. Reducing the
Neither ASan nor SoftBound+CETS are able to pro- IDs a program uses, reduces the overhead for ID man-
tect against unsafe libc calls. Our instrumentation of agement and resources used by CUP.
libc 4.3 allows us to properly detect memory violations
in these calls. Further, our adjustment for read / write Arithmetic overflow. Our hybrid metadata scheme
size 4.2.2 allows us to catch additional memory safety stores the capability ID as part of the pointer. Pointer
violations. arithmetic can potentially modify the ID, allowing an
Listing 4 provides source based on a Juliet example 3 adversary to chose the metadata the pointer is checked
that CUP properly handles, and which ASan handles in- against. To prevent this attack vector, the upper and
correctly. The code allocates 10 bytes on the stack, sets lower 32 bits of the pointer are loaded separately. The
the bytes to ASCII A, and then sets a random charac- compiler enforces that any arithmetic operations only
ter to ASCII B. ASan protects the stack by surrounding happen on the lower 32 bits, which contain the pointers
stack objects with poison zones. However, depending on offset. This protects our capability ID from manipulation
the value of data, line 5 could be outside the allocated by an adversary.
objects and outside the poison region a classic buffer
overwrite. CUPs analysis detects that buffer leaves Stronger temporal protection. As discussed in 3.1,
main through the call to memset, and inserts the appro- it is possible (albeit difficult) for an attacker to perform
priate runtime checks. If data is greater than 10, those a temporal attack on software instrumented with CUP.
runtime checks fail. If CUP were deployed as an active defense, the difficulty
of a successful temporal attack could be increased by uti-
3 CWE121 Stack Based Buffer Overflow CWE129 rand 22 bad() lizing a randomized memory allocator like DieHard [2]
11
or DieHarder [32]. Such allocators randomize heap al- scheme as future work. Note that this limitation is shared
locations, making heap grooming [11, 37] much more with other sanitizers.
difficult. Beyond getting the addresses to line up, the
increased number of required allocations makes match-
ing the capability IDs even harder. This renders a suc- 7 Related Work
cessful use-after-free attack highly unlikely, with mini-
mal additional overhead. Additionally, a lock and key Precision is required to enforce spatial memory safety
scheme [25] can be added to our metadata. Alternatively, (bounds checks). There is a class of memory safety
our metadata can be extended to include a dangnull [16]- solutions that only approximately enforce this prop-
style approach that records how many references are still erty [1, 35, 9]. These solutions make use of techniques
pointing to an object and either explicitly invalidating such as poisoned zones detecting spatial violations
those references or waiting until the last reference has within limits, or rounding allocation size which causes
been overwritten before reusing IDs. the executed program to differ from the programmers in-
tent and results in challenges when trying to handle intra-
array and intra-struct checks. By changing the memory
Uninstrumented code. CUP supports linking with layout and not enforcing exact bounds, these solutions
uninstrumented libraries. Enriched pointers are the same are not faithful to the programmers intent. [24] is the
size as regular pointers, maintaining the ABI. Derefer- existing solution which best satisfies this requirement,
encing enriched pointers in uninstrumented code results, though it has other limitations.
by design, in a segmentation fault. A segmentation fault Object-based memory checking [5, 7, 10] keeps track
handler can, on demand, dereference individual point- of metadata on a per-object basis. Since the meta-data is
ers. As the memory allocator is instrumented, memory associated on a per object level, every pointer to the ob-
allocations in uninstrumented code will return enriched ject shares the same metadata. Object layout in memory
pointers. Stack allocations on the other hand will not be is generally left unchanged, which increases ABI com-
protected in uninstrumented code. While this option al- patibility. However, pointer casts and pointers to subfield
lows compatibility, it clearly results in high performance struct members are unhandled [23]. SAFECode [8] is an
overhead. Note that, thanks to support for recompiling example efficient object-based memory checking.
all user-space code, this situation is limited to legacy Recently, Intel has started to add memory safety exten-
code. sions, called Intel MPX, to their processors, starting with
the Skylake architecture [13]. These extensions add ad-
Uninstrumented globals. We currently cannot handle ditional registers to store bounds information at runtime.
the corner case of external global arrays defined in unin- While effective at detecting spatial memory violations,
strumented code. Our pass assumes such arrays have MPX is incapable of detecting temporal violations [33],
been instrumented, and so adds an extern pointer to and typecasts to integers are not protected. In addition,
them 4.2.1. If the global was defined in uninstrumented current implementations of MPX incur a large memory
code, this assumption does not hold. Consequently, our penalty of up to 4 times normal usage [19]. Other ISA
extern pointer does not exist, and the code will not link. extensions include Watchdog [22], WatchdogLite [21],
Note that such a situation is unlikely, as we support full HardBound [6], and Chuang et al. [4]. As a software
user-space instrumentation. only solution, CUP does not require extra hardware or
ISA extensions.
Assembly code. CUP automatically instruments all

code written in high level languages; our analysis pass Object Awareness is required to prevent temporal
runs on LLVM IR. Our analysis does not (and cannot) in- memory safety violations (lifetime errors). This prop-
strument inline assembly and assembly files due to miss- erty requires remembering for every pointer whether the
ing type information. We rely on the programmer to ei- object to which it is assigned is still allocated. Address-
ther instrument the assembly code accordingly or to fall Sanitizer [35] and Low-Fat Pointers [9] make no attempt
back on supporting uninstrumented code as mentioned to do this (and their metadata does not support this prop-
above. erty), while SoftBound+CETS [25] enforces this prop-
erty. AddressSanitizer and Low-Fat Pointers do not
maintain metadata either per object or per pointer. Ob-
Data races. CUP does not protect against inter-thread ject Awareness requires either per objecct or per pointer
races of updates to metadata (e.g., one thread frees an metadata. Consequently, they fundamentally cannot en-
object while the other thread is dereferencing a pointer). force temporal safety because their (current) metadata
We leave the design of a low-overhead metadata locking cannot be object aware.
12
Temporal only detectors include DangNull [16] and the largest source of pointers, remained largely unpro-
Undangle [3] from Microsoft. DangNull automatically tected. Finally, we produce zero false negatives and zero
nullifies all pointers to an object when it is freed. Un- authentic false positives in the NIST Juliet Vulnerability
dangle uses an early detection technique to identify un- example suite, which represents a significant advance-
safe pointers when they are created, instead of being ment over existing memory safety mechanisms.
used. CUP only provides a probabilistic temporal de-
fense, however, DangNull and Undangle lack any spatial References
protection.
[1] A KRITIDIS , P., C OSTA , M., C ASTRO , M., AND H AND , S.
Baggy bounds checking: An efficient and backwards-compatible
Comprehensive Coverage is required to fully protect defense against out-of-bounds errors. In Proceedings of the
18th Conference on USENIX Security Symposium (Berkeley, CA,
the program. As shown in 2.2 stack objects are the USA, 2009), SSYM09, USENIX Association, pp. 5166.
overwhelming majority of allocations, and to this day a [2] B ERGER , E. D., AND Z ORN , B. G. Diehard: Probabilistic mem-
significant portion of memory safety Common Vulnera- ory safety for unsafe languages. SIGPLAN Not. 41, 6 (June 2006),
bilities and Exposures (CVE) are stack related. Our eval- 158168.
uation of SoftBound+CETS 5.2 shows that it has poor [3] C ABALLERO , J., G RIECO , G., M ARRON , M., AND N APPA , A.
coverage missing many stack vulnerabilities. Address- Undangle: early detection of dangling pointers in use-after-free
and double-free vulnerabilities. In Proceedings of the 2012 In-
Sanitizer and Low-Fat Pointers do better. AddressSani- ternational Symposium on Software Testing and Analysis (2012),
tizer protects the stack through the use of poisoned zones, ACM, pp. 133143.
and, as illustrated in 5.2 cannot handle all invalid stack [4] C HUANG , W., N ARAYANASAMY, S., AND C ALDER , B. Accel-
memory accesses. Additionally, neither of them supports erating meta data checks for software correctness and security.
compiling libc leaving the window open for vulnera- Journal of Instruction-Level Parallelism 9 (2007), 126.
bilities such as GHOST [18]. Tripwires [27, 34, 41, 44] [5] C RISWELL , J., L ENHARTH , A., D HURJATI , D., AND A DVE ,
are a way to detect some spatial and temporal memory V. Secure virtual architecture: A safe execution environment for
commodity operating systems. In ACM SIGOPS Operating Sys-
errors [35, 42]. Tripwires place a region of invalid mem- tems Review (2007), vol. 41, ACM, pp. 351366.
ory around objects to avoid small stride overflows and
[6] D EVIETTI , J., B LUNDELL , C., M ARTIN , M. M., AND
underflows. Temporal violations are caught by register- Z DANCEWIC , S. Hardbound: architectural support for spatial
ing memory freed as invalid, until reclaimed. Tripwires, safety of the c programming language. In ACM SIGARCH Com-
however, miss long stride memory errors, and thus can- puter Architecture News (2008), vol. 36, ACM, pp. 103114.
not be said to be completely secure. [7] D HURJATI , D., AND A DVE , V. Backwards-compatible array
The state-of-the-art C/C++ pointer-based memory bounds checking for c with very low overhead. In Proceedings of
the 28th International Conference on Software Engineering (New
safety scheme is SoftBound+CETS [23]. Other pointer- York, NY, USA, 2006), ICSE 06, ACM, pp. 162171.
based schemes include CCured [26] and Cyclone [14]. [8] D HURJATI , D., AND A DVE , V. Backwards-compatible array
CCured uses a fat pointer to store metadata, as well as bounds checking for c with very low overhead. In Proceed-
programmer annotations for indicating safe casts. Un- ings of the 28th international conference on Software engineering
fortunately, fat pointers break the ABI, and programmer (2006), ACM, pp. 162171.
annotations can significantly increase developer time. [9] D UCK , YAP, AND C AVALLARO. Stack bounds protection with
low fat pointers. In NDSS Symposium 2017 (2017), NDSS 2017.
Even with annotations, CCured fails to handle structure
changes. Cyclone also uses a fat pointer scheme, but [10] E IGLER , F. C. Mudflap: Pointer use checking for c/c+. In GCC
Developers Summit (2003), Citeseer, p. 57.
does not guarantee full memory safety. We refer to [23]
[11] E VANS , C. https://googleprojectzero.blogspot.com/2015/06/what-is-go
for a survey of other related work on memory safety and
a discussion on different pointer metadata schemes. [12] H ICKS , M. What is memory safety?, 2014.
[13] Intel software development emulator.
https://software.intel.com/en-us/articles/intel-software-developme
2015.
8 Conclusion
[14] J IM , T., M ORRISETT, J. G., G ROSSMAN , D., H ICKS , M. W.,
C HENEY, J., AND WANG , Y. Cyclone: A safe dialect of c.
We present CUP, a C/C++ memory safety mechanism In Proceedings of the General Track of the Annual Conference
that provides full user-space protection, including libc, on USENIX Annual Technical Conference (Berkeley, CA, USA,
and strong probabilistic temporal protection. It is the 2002), ATEC 02, USENIX Association, pp. 275288.
first such mechanism that satisfies all requirements for [15] National institute of standards and technology juliet c/c++ test
a complete memory safety solution, while incurring only suite. https://samate.nist.gov/SARD/testsuite.php,
2013.
modest performance overhead compared with the state-
[16] L EE , B., S ONG , C., JANG , Y., WANG , T., K IM , T., L U , L.,
of-the-art. CUP is exact, object aware, comprehensive AND L EE , W. Preventing use-after-free with dangling pointers
in its coverage, and precise. We fully protect all user- nullification. In Network and Distributed Systems Symposium
space memory, including the stack, which, despite being (2015).
13
[36] S ERNA
[17] MAN 7. ORG. http://man7.org/linux/man-pages/man3/alloca.3.html . , F. J., AND S TADMEYER , K.
https://security.googleblog.com/2016/02/cve-2015-7547-glibc-getadd
[18] M ITRE. https://cve.mitre.org/cgi-bin/cvename.cgi?name=cve-2015-0235.
[19] Addresssanitizerintelmemoryprotectionextensions. [37] S OTIROV, A. Heap feng shui in javascript. Black Hat Europe
(2007).
https://github.com/google/sanitizers/wiki/AddressSanitizerIntelMemoryProtectionExtensions ,
2016. [38] S ZEKERES , L., PAYER , M., W EI , T., AND S ONG , D. Sok: Eter-
[20] musl libc. http://www.musl-libc.org/, 2016. nal war in memory. In Proceedings of the 2013 IEEE Symposium
on Security and Privacy (Washington, DC, USA, 2013), SP 13,
[21] N AGARAKATTE , S., M ARTIN , M. M., AND Z DANCEWIC , IEEE Computer Society, pp. 4862.
S. Watchdoglite: Hardware-accelerated compiler-based pointer
checking. In Proceedings of Annual IEEE/ACM International [39] T REND M ICRO. http://blog.trendmicro.com/pwn2own-day-1-recap/.
Symposium on Code Generation and Optimization (2014), ACM, [40] VAN DER V EEN , V., DUTT S HARMA , N., C AVALLARO , L., AND
p. 175. B OS , H. Memory errors: The past, the present, and the future.
[22] N AGARAKATTE , S., M ARTIN , M. M. K., AND Z DANCEWIC , S. In Proceedings of the 15th International Conference on Research
Watchdog: Hardware for safe and secure manual memory man- in Attacks, Intrusions, and Defenses (Berlin, Heidelberg, 2012),
agement and full memory safety. In Proceedings of the 39th An- RAID12, Springer-Verlag, pp. 86106.
nual International Symposium on Computer Architecture (Wash- [41] V ENKATARAMANI , G., ROEMER , B., S OLIHIN , Y., AND
ington, DC, USA, 2012), ISCA 12, IEEE Computer Society, P RVULOVIC , M. Memtracker: Efficient and programmable sup-
pp. 189200. port for memory access monitoring and debugging. In High Per-
[23] N AGARAKATTE , S., M ARTIN , M. M. K., AND Z DANCEWIC , S. formance Computer Architecture, 2007. HPCA 2007. IEEE 13th
Everything You Want to Know About Pointer-Based Checking. International Symposium on (Feb 2007), pp. 273284.
In 1st Summit on Advances in Programming Languages (SNAPL [42] V ENKATARAMANI , G., ROEMER , B., S OLIHIN , Y., AND
2015) (Dagstuhl, Germany, 2015), T. Ball, R. Bodik, S. Krish- P RVULOVIC , M. Memtracker: Efficient and programmable sup-
namurthi, B. S. Lerner, and G. Morrisett, Eds., vol. 32 of Leib- port for memory access monitoring and debugging. In Proceed-
niz International Proceedings in Informatics (LIPIcs), Schloss ings of the 2007 IEEE 13th International Symposium on High
DagstuhlLeibniz-Zentrum fuer Informatik, pp. 190208. Performance Computer Architecture (Washington, DC, USA,
[24] N AGARAKATTE , S., Z HAO , J., M ARTIN , M. M., AND 2007), HPCA 07, IEEE Computer Society, pp. 273284.
Z DANCEWIC , S. Softbound: Highly compatible and complete [43] VVDVEEN . https://github.com/vvdveen/memory-errors/.
spatial memory safety for c. In Proceedings of the 30th ACM
[44] Y ONG , S. H., AND H ORWITZ , S. Protecting c programs from
SIGPLAN Conference on Programming Language Design and
attacks via invalid pointer dereferences. In Proceedings of the
Implementation (New York, NY, USA, 2009), PLDI 09, ACM,
9th European Software Engineering Conference Held Jointly with
pp. 245258.
11th ACM SIGSOFT International Symposium on Foundations of
[25] N AGARAKATTE , S., Z HAO , J., M ARTIN , M. M., AND Software Engineering (New York, NY, USA, 2003), ESEC/FSE-
Z DANCEWIC , S. Cets: Compiler enforced temporal safety for 11, ACM, pp. 307316.
c. In Proceedings of the 2010 International Symposium on Mem-
ory Management (New York, NY, USA, 2010), ISMM 10, ACM,
pp. 3140.
[26] N ECULA , G. C., C ONDIT, J., H ARREN , M., M C P EAK , S., AND
W EIMER , W. Ccured: Type-safe retrofitting of legacy software.
ACM Trans. Program. Lang. Syst. 27, 3 (May 2005), 477526.
[27] N ETHERCOTE , N., AND S EWARD , J. Valgrind: a framework for
heavyweight dynamic binary instrumentation. In ACM Sigplan
notices (2007), vol. 42, ACM, pp. 89100.
[28] NIST. https://web.nvd.nist.gov/view/vuln/detail?vulnId=CVE-2016-5270.
[32] N OVARK , G., AND B ERGER , E. D. Dieharder: Securing the
heap. In Proceedings of the 5th USENIX Conference on Offensive
Technologies (Berkeley, CA, USA, 2011), WOOT11, USENIX
Association, pp. 1212.
[33] O TTERSTAD , C. W. A brief evaluation of intel
R mpx. In Sys-
tems Conference (SysCon), 2015 9th Annual IEEE International
(2015), IEEE, pp. 17.
[34] Q IN , F., L U , S., AND Z HOU , Y. Safemem: exploiting ecc-
memory for detecting memory leaks and memory corruption dur-
ing production runs. In High-Performance Computer Architec-
ture, 2005. HPCA-11. 11th International Symposium on (Feb
2005), pp. 291302.
[35] S EREBRYANY, K., B RUENING , D., P OTAPENKO , A., AND
V YUKOV, D. Addresssanitizer: A fast address sanity checker.
In Presented as part of the 2012 USENIX Annual Technical Con-
ference (USENIX ATC 12) (2012), pp. 309318.
14

CUP: Comprehensive User-Space Protection For C/C++

Transféré par

Informations du document

Titre original

Copyright

Formats disponibles

Partager ce document

Partager ou intégrer le document

Options de partage

Avez-vous trouvé ce document utile ?

Ce contenu est-il inapproprié ?

Droits d'auteur :

Formats disponibles

CUP: Comprehensive User-Space Protection For C/C++

Transféré par

Droits d'auteur :

Formats disponibles

CUP: Comprehensive User-Space Protection for C/C++

Nathan Burow Derrick McKee Scott A. Carr Mathias Payer

Abstract hensively protects user-space is necessary to find and fix

Figure 1: A potential pointer value after enrichment. / / The f o l l o w i n g co d e i s p u r e l y i l l u s t r a t i v e .

vative and over-approximate in the set of pointer deref- s t a t i c i n l i n e v o i d ch eck ( v o i d p t r , u i n t 3 2

4.2.1 Analysis Implementation Listing 3: Instrumentation Example

Table 2: Performance Results. indicates C++ benchmark

CUP 158% run, CUP is faster than SoftBound+CETS. Further, com-

Listing 4: Example of code ASan fails to protect

call is handled. alloca() is compiler dependent [17]. 6 Discussion

Assembly code. CUP automatically instruments all

Vous aimerez peut-être aussi