Storage duration specifiers

Every object has a storage class, which may be automatic, static, or allocated. Variables declared within
a block by default have automatic storage, as do those explicitly declared with the auto[2] or register
storage class specifiers. The auto and register specifiers may only be used within functions and function
argument declarations; as such, the auto specifier is always redundant. Objects declared outside of all
blocks and those explicitly declared with the static storage class specifier have static storage duration.

Objects with automatic storage are local to the block in which they were declared and are discarded
when the block is exited. Additionally, objects declared with the register storage class may be given
higher priority by the compiler for access to registers; although they may not actually be stored in
registers, objects with this storage class may not be used with the address-of (&) unary operator.
Objects with static storage persist upon exit from the block in which they were declared. In this way,
the same object can be accessed by a function across multiple calls. Objects with allocated storage
duration are created and destroyed explicitly with malloc, free, and related functions.
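
The three storage durations described above can be contrasted in a minimal sketch. The function names (next_call_number, automatic_example, make_int) are hypothetical and exist only for illustration.

```c
#include <stdlib.h>

/* Hypothetical helper: counts how many times it has been called.
   `calls` has static storage duration, so it keeps its value between calls. */
int next_call_number(void)
{
    static int calls = 0;   /* initialized once, not on every call */
    calls++;
    return calls;
}

/* Hypothetical helper: `local` has automatic storage duration and is
   created and initialized again on every call. */
int automatic_example(void)
{
    int local = 0;
    local++;
    return local;           /* always 1 */
}

/* Hypothetical helper: returns allocated storage that outlives the call.
   The caller must release it with free(). Returns NULL on failure. */
int *make_int(int value)
{
    int *p = malloc(sizeof *p);
    if (p != NULL)
        *p = value;
    return p;
}
```

Successive calls to next_call_number yield 1, 2, 3 and so on, whereas automatic_example returns 1 every time.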

The extern storage class specifier indicates that the storage for an object has been defined elsewhere.
When used inside a block, it indicates that the storage has been defined by a declaration outside of that
block. When used outside of all blocks, it indicates that the storage has been defined outside of the file.
The extern storage class specifier is redundant when used on a function declaration. It indicates that the
declared function has been defined outside of the file.

Type qualifiers

Objects can be qualified to indicate special properties of the data they contain. The const type qualifier
indicates that the value of an object should not change once it has been initialized. Attempting to
modify an object qualified with const yields undefined behavior, so some C implementations store
them in read-only segments of memory. The volatile type qualifier indicates that the value of an object
may be changed externally without any action by the program (see volatile variable); it may be
completely ignored by the compiler.

Pointers

In declarations the asterisk modifier (*) specifies a pointer type. For example, where the specifier int
would refer to the integer type, the specifier int * refers to the type "pointer to integer". Pointer values
associate two pieces of information: a memory address and a data type. The following line of code
declares a pointer-to-integer variable called ptr:

int *ptr;

Referencing

When a non-static pointer is declared without an initializer, its value is indeterminate. Such a
pointer must be assigned a valid address before it is used. In the following
example, ptr is set so that it points to the data associated with the variable a:

int *ptr;
int a;

ptr = &a;

In order to accomplish this, the "address-of" operator (unary &) is used. It produces the memory
location of the data object that follows.

Dereferencing

The pointed-to data can be accessed through a pointer value. In the following example, the integer
variable b is set to the value of integer variable a, which is 10:

int *p;
int a, b;

a = 10;
p = &a;
b = *p;

In order to accomplish that task, the dereference operator (unary *) is used. It returns the data to which
its operand—which must be of pointer type—points. Thus, the expression *p denotes the same value as
a.
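
Dereferencing also works on the left side of an assignment, which writes to the pointed-to object. The following is a small sketch of both directions; store_through and deref_demo are hypothetical names used only for illustration.

```c
/* Hypothetical helper: stores `value` into the int that `target` points to. */
void store_through(int *target, int value)
{
    *target = value;   /* dereference on the left side: writes to the pointed-to object */
}

/* Reads a through a pointer, then modifies a through the same pointer,
   and returns the sum of the old and new values. */
int deref_demo(void)
{
    int a = 10;
    int *p = &a;       /* p now holds the address of a */
    int b = *p;        /* b == 10 */
    store_through(p, 25);
    return b + a;      /* a was changed through p, so this is 10 + 25 */
}
```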

Arrays

Array definition

Arrays are used in C to represent structures of consecutive elements of the same type. The definition of
a (fixed-size) array has the following syntax:

int array[100];

which defines an array named array to hold 100 values of the primitive type int. If declared within a
function, the array dimension may also be a non-constant expression, in which case memory for the
specified number of elements will be allocated. In most contexts in later use, a mention of the variable
array is converted to a pointer to the first item in the array. The sizeof operator is an exception: sizeof
array yields the size of the entire array (that is, 100 times the size of an int). Another exception is the &
(address-of) operator, which yields a pointer to the entire array (e.g. int (*ptr_to_array)[100] =
&array;).
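
The conversion rule and its two exceptions can be checked in a short sketch. The function name array_conversion_checks is hypothetical.

```c
#include <stddef.h>

/* Returns 1 if every check about array-to-pointer conversion holds, 0 otherwise. */
int array_conversion_checks(void)
{
    int array[100];
    int *first = array;                  /* converted to a pointer to the first element */
    int (*ptr_to_array)[100] = &array;   /* & yields a pointer to the whole array */

    size_t count = sizeof array / sizeof array[0];  /* sizeof sees the whole array */

    return count == 100
        && first == &array[0]
        && sizeof *ptr_to_array == sizeof array
        && (void *)ptr_to_array == (void *)first;   /* same address, different type */
}
```

The idiom sizeof array / sizeof array[0] only works where array is still an array, not after it has been converted to a pointer (for example, inside a function that received it as a parameter).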

Accessing elements

The primary facility for accessing the values of the elements of an array is the array subscript operator.
To access the i-indexed element of array, the syntax would be array[i], which refers to the value stored
in that array element.

Array subscript numbering begins at 0. The largest allowed array subscript is therefore equal to the
number of elements in the array minus 1. To illustrate this, consider an array a declared as having 10
elements; the first element would be a[0] and the last element would be a[9]. C provides no facility for
automatic bounds checking for array usage. Though logically the last subscript in an array of 10
elements would be 9, subscripts 10, 11, and so forth could accidentally be specified, with undefined
results.

Due to array↔pointer interchangeability, the addresses of each of the array elements can be expressed
in equivalent pointer arithmetic. The following table illustrates both methods for the existing array:

Array subscripts vs. pointer arithmetic

Element index         1         2             3             n
Array subscript       array[0]  array[1]      array[2]      array[n-1]
Dereferenced pointer  *array    *(array + 1)  *(array + 2)  *(array + n-1)

Similarly, since the expression a[i] is semantically equivalent to *(a+i), which in turn is equivalent to
*(i+a), the expression can also be written as i[a] (although this form is rarely used).
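
As a quick illustration of this equivalence, the sketch below reads every element in all three spellings and checks that they agree. The function name sum_three_ways is hypothetical.

```c
/* Sums the elements of a, reading each one in three equivalent ways. */
int sum_three_ways(const int *a, int n)
{
    int total = 0;
    for (int i = 0; i < n; i++) {
        int by_subscript = a[i];
        int by_pointer   = *(a + i);
        int reversed     = i[a];      /* legal, but rarely used */
        if (by_subscript != by_pointer || by_pointer != reversed)
            return -1;                /* not expected to happen */
        total += by_subscript;
    }
    return total;
}
```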

Dynamic arrays

A constant value is required for the dimension in a declaration of a static array. A desired feature is the
ability to set the length of an array dynamically at run-time instead:

int n = ...;
int a[n];
a[3] = 10;

This behavior can be simulated with the help of the C standard library. The malloc function provides a
simple method for allocating memory. It takes one parameter: the amount of memory to allocate in
bytes. Upon successful allocation, malloc returns a generic (void *) pointer value, pointing to the
beginning of the allocated space. The pointer value returned is converted to an appropriate type
implicitly by assignment. If the allocation could not be completed, malloc returns a null pointer. The
following segment is therefore similar in function to the above desired declaration:

#include <stdlib.h> /* declares malloc */

int *a;
a = malloc(n * sizeof(int));
a[3] = 10;

The result is a "pointer to int" variable (a) that points to the first of n contiguous int objects; due to
array↔pointer equivalence this can be used in place of an actual array name, as shown in the last line.
The advantage in using this dynamic allocation is that the amount of memory that is allocated to it can
be limited to what is actually needed at run time, and this can be changed as needed (using the standard
library function realloc).

When the dynamically-allocated memory is no longer needed, it should be released back to the run-
time system. This is done with a call to the free function. It takes a single parameter: a pointer to
previously allocated memory. This is the value that was returned by a previous call to malloc. It is
considered good practice to then set the pointer variable to NULL so that further attempts to access the
memory to which it points will fail. If this is not done, the variable becomes a dangling pointer, and
such errors in the code (or manipulations by an attacker) might be very hard to detect and lead to
obscure and potentially dangerous malfunction caused by memory corruption.

free(a);
a = NULL;
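
The allocation, resizing, and release steps described above might be combined as in the following sketch, which also checks for a null return before using the memory. The helper names make_filled and resize are hypothetical, and resize assumes new_n is greater than zero.

```c
#include <stdlib.h>

/* Hypothetical helper: allocates n ints, all set to `fill`.
   Returns NULL if n is 0 or the allocation fails. */
int *make_filled(size_t n, int fill)
{
    if (n == 0)
        return NULL;
    int *a = malloc(n * sizeof *a);
    if (a == NULL)
        return NULL;
    for (size_t i = 0; i < n; i++)
        a[i] = fill;
    return a;
}

/* Hypothetical helper: resizes *a to new_n ints (new_n assumed > 0).
   On failure the original block is left untouched and 0 is returned;
   on success *a is updated and 1 is returned. */
int resize(int **a, size_t new_n)
{
    int *tmp = realloc(*a, new_n * sizeof **a);
    if (tmp == NULL)
        return 0;
    *a = tmp;
    return 1;
}
```

Assigning the result of realloc to a temporary first avoids losing the only pointer to the original block when the call fails.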

The C99 standard also supports variable-length arrays (VLAs) within block scope. Storage for such an
array is allocated when its declaration is reached, using an integer value computed at run time, and is
released at the end of the block.

float read_and_process(int sz)
{
    float vals[sz]; // VLA, size determined at runtime

    for (int i = 0; i < sz; i++)
        vals[i] = read_value();
    return process(vals, sz);
}

Multidimensional arrays

In addition, C supports arrays of multiple dimensions, which are stored in row-major order.
Technically, C multidimensional arrays are just one-dimensional arrays whose elements are arrays. The
syntax for declaring multidimensional arrays is as follows:

int array2d[ROWS][COLUMNS];

(where ROWS and COLUMNS are constants); this defines a two-dimensional array. Reading the
subscripts from left to right, array2d is an array of length ROWS, each element of which is an array of
COLUMNS ints.

To access an integer element in this multidimensional array, one would use

array2d[4][3]

Again, reading from left to right, this accesses the 5th row, 4th element in that row (array2d[4] is an
array, which we are then subscripting with the [3] to access the fourth integer).

Higher-dimensional arrays can be declared in a similar manner.

A multidimensional array should not be confused with an array of references to arrays (also known as
Iliffe vectors or sometimes array of arrays). The former is always rectangular (all subarrays must be the
same size), and occupies a contiguous region of memory. The latter is a one-dimensional array of
pointers, each of which may point to the first element of a subarray in a different place in memory, and
the sub-arrays do not have to be the same size. The latter can be created by multiple use of malloc.
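
The difference between the two layouts can be sketched as follows. The names make_triangle, free_triangle, and rows_are_contiguous are hypothetical, and make_triangle assumes rows is greater than zero.

```c
#include <stdlib.h>

/* Hypothetical helper: builds a "triangular" array of arrays in which
   row r holds r + 1 ints, all zero. Returns NULL on allocation failure. */
int **make_triangle(size_t rows)
{
    int **t = malloc(rows * sizeof *t);      /* the one-dimensional array of pointers */
    if (t == NULL)
        return NULL;
    for (size_t r = 0; r < rows; r++) {
        t[r] = calloc(r + 1, sizeof **t);    /* each row is allocated separately */
        if (t[r] == NULL) {
            while (r > 0)                    /* undo the rows allocated so far */
                free(t[--r]);
            free(t);
            return NULL;
        }
    }
    return t;
}

/* Releases everything allocated by make_triangle. */
void free_triangle(int **t, size_t rows)
{
    if (t == NULL)
        return;
    for (size_t r = 0; r < rows; r++)
        free(t[r]);
    free(t);
}

/* Returns 1 if a true two-dimensional array is contiguous: row 1 starts
   exactly one element past the last element of row 0. */
int rows_are_contiguous(void)
{
    int grid[3][4];
    return &grid[0][3] + 1 == &grid[1][0];
}
```

Both kinds of array are indexed with the same t[r][c] syntax, but only the true multidimensional array is guaranteed to be one contiguous block.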

Strings

In C, string literals (constants) are surrounded by double quotes ("), e.g. "Hello world!", and are
compiled to an array of the specified char values with an additional null terminating character
(0-valued) to mark the end of the string.

String literals may not contain embedded newlines; this proscription somewhat simplifies parsing of
the language. To include a newline in a string, the backslash escape \n may be used, as below.

There are several standard library functions for operating with string data (not necessarily constant)
organized as array of char using this null-terminated format; see below.

C's string-literal syntax has been very influential, and has made its way into many other languages,
such as C++, Perl, Python, PHP, Java, JavaScript, C#, and Ruby. Nowadays, almost all new languages adopt
or build upon C-style string syntax. Languages that lack this syntax tend to precede C.

Backslash escapes

If you wish to include a double quote inside the string, that can be done by escaping it with a backslash
(\), for example, "This string contains \"double quotes\".". To insert a literal backslash, one must double
it, e.g. "A backslash looks like this: \\".

Backslashes may be used to enter control characters, etc., into a string:


Escape Meaning
\\ Literal backslash
\" Double quote
\' Single quote
\n Newline (line feed)
\r Carriage return
\b Backspace
\t Horizontal tab
\f Form feed
\a Alert (bell)
\v Vertical tab
\? Question mark (used to escape trigraphs)
\nnn Character with octal value nnn
\xhh Character with hexadecimal value hh

The use of other backslash escapes is not defined by the C standard, although compiler vendors often
provide additional escape codes as language extensions.
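
As a small check of the table above, each escape sequence contributes exactly one character to the compiled string. The function names in this sketch are hypothetical.

```c
#include <string.h>

/* Each escape sequence denotes a single character in the compiled string. */
size_t escaped_length(void)
{
    const char *s = "a\tb\n\\\"";   /* a, tab, b, newline, backslash, double quote */
    return strlen(s);                /* 6 */
}

/* Octal and hexadecimal escapes name a character by its code value. */
int same_character(void)
{
    return '\101' == '\x41';         /* both are code 65, which is 'A' in ASCII */
}
```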

String literal concatenation

Adjacent string literals are concatenated at compile time; this allows long strings to be split over
multiple lines, and also allows string literals resulting from C preprocessor defines and macros to be
appended to strings at compile time:

printf(__FILE__ ": %d: Hello "
       "world\n", __LINE__);

will expand to

printf("helloworld.c" ": %d: Hello "
       "world\n", 10);

which is syntactically equivalent to

printf("helloworld.c: %d: Hello world\n", 10);
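
The same rule can be verified without printf. In this sketch the macro GREETING and the function concatenation_matches are hypothetical names chosen for illustration.

```c
#include <string.h>

#define GREETING "Hello"   /* hypothetical macro, for illustration only */

/* Returns 1 if adjacent literals (one of them from a macro) were joined
   into a single string at compile time. */
int concatenation_matches(void)
{
    const char *joined = GREETING ", "
                         "world";
    return strcmp(joined, "Hello, world") == 0;
}
```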

Character constants

Individual character constants are represented by single-quotes, e.g. 'A', and have type int (in C++
char). The difference is that "A" represents a pointer to the first element of a null-terminated array,
whereas 'A' directly represents the code value (65 if ASCII is used). The same backslash-escapes are
supported as for strings, except that (of course) " can validly be used as a character without being
escaped, whereas ' must now be escaped. A character constant cannot be empty (i.e. '' is invalid syntax),
although a string may be (it still has the null terminating character). Multi-character constants (e.g. 'xy')
are valid, although rarely useful — they let one store several characters in an integer (e.g. 4 ASCII
characters can fit in a 32-bit integer, 8 in a 64-bit one). Since the order in which the characters are
packed into one int is not specified, portable use of multi-character constants is difficult.
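
The type and size claims above can be checked with sizeof, as in this sketch (valid for C; the first result differs in C++). The function names are hypothetical.

```c
#include <stddef.h>

/* In C (unlike C++), a character constant has type int. */
int char_constant_is_int(void)
{
    return sizeof 'A' == sizeof(int);
}

/* A string literal is an array that includes the terminating null character. */
size_t one_letter_string_size(void)
{
    return sizeof "A";     /* 2: 'A' plus the terminator */
}

/* An empty string literal still occupies one byte for the terminator. */
size_t empty_string_size(void)
{
    return sizeof "";      /* 1 */
}
```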

Wide character strings

Since type char is usually 1 byte (8 bits) wide, a single char value typically can represent at most 256
distinct character codes, not nearly enough for all the characters in use worldwide. To provide better
support for international characters, the first C standard (C89) introduced wide characters (encoded in
type wchar_t) and wide character strings, which are written as L"Hello world!".

Wide characters are most commonly either 2 bytes (using a 2-byte encoding such as UTF-16) or 4
bytes (usually UTF-32), but Standard C does not specify the width for wchar_t, leaving the choice to
the implementor. Microsoft Windows generally uses UTF-16, thus the above string would be 26 bytes
long for a Microsoft compiler; the Unix world prefers UTF-32, thus compilers such as GCC would
generate a 52-byte string. A 2-byte wide wchar_t suffers the same limitation as char, in that certain
characters (those outside the BMP) cannot be represented in a single wchar_t but must be represented
using surrogate pairs.

The original C standard specified only minimal functions for operating with wide character strings; in
1995 the standard was modified to include much more extensive support, comparable to that for char
strings. The relevant functions are mostly named after their char equivalents, with the addition of a "w"
or the replacement of "str" with "wcs"; they are specified in <wchar.h>, with <wctype.h> containing
wide-character classification and mapping functions.
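
A brief sketch of the wide-character counterparts follows: wcslen is the <wchar.h> equivalent of strlen, and the storage size of the literal depends on the implementation's choice of wchar_t. The function names wide_length and wide_storage_bytes are hypothetical.

```c
#include <wchar.h>

/* Number of wide characters in the literal, not counting the terminator. */
size_t wide_length(void)
{
    return wcslen(L"Hello world!");          /* 12 */
}

/* Storage for the literal: 13 elements (12 characters plus the terminator),
   each sizeof(wchar_t) bytes wide. The result is implementation-dependent. */
size_t wide_storage_bytes(void)
{
    return sizeof L"Hello world!";
}
```

With a 2-byte wchar_t the second function returns 26, and with a 4-byte wchar_t it returns 52.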

Variable width strings

A common alternative to wchar_t is to use a variable-width encoding, whereby a logical character may
extend over multiple positions of the string. Variable-width strings may be encoded into literals
verbatim, at the risk of confusing the compiler, or using numerical backslash escapes (e.g. "\xc3\xa9"
for "é" in UTF-8). The UTF-8 encoding was specifically designed (under Plan 9) for compatibility with
the standard library string functions; supporting features of the encoding include a lack of embedded
nulls, no valid interpretations for subsequences, and trivial resynchronisation. Encodings lacking these
features are likely to prove incompatible with the standard library functions; encoding-aware string
functions are often used in such cases.
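
The distinction between bytes and logical characters can be made concrete with a sketch. The helper utf8_code_points is hypothetical and assumes its argument is valid UTF-8; it is an example of the kind of encoding-aware function mentioned above, not a standard library routine.

```c
#include <string.h>

/* strlen counts bytes (char elements), not logical characters. */
size_t e_acute_bytes(void)
{
    return strlen("\xc3\xa9");        /* 2 bytes encode the single character "é" */
}

/* Hypothetical helper: counts UTF-8 code points by skipping continuation
   bytes, which always have the bit pattern 10xxxxxx. Assumes s is valid UTF-8. */
size_t utf8_code_points(const char *s)
{
    size_t n = 0;
    for (; *s != '\0'; s++)
        if (((unsigned char)*s & 0xC0) != 0x80)
            n++;
    return n;
}
```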

Library functions

Strings, both constant and variable, may be manipulated without using the standard library. However,
the library contains many useful functions for working with null-terminated strings. It is the
programmer's responsibility to ensure that enough storage has been allocated to hold the resulting
strings.

The most commonly used string functions are:

* strcat(dest, source) - appends the string source to the end of string dest
* strchr(s, c) - finds the first instance of character c in string s and returns a pointer to it or a null
pointer if c is not found
* strcmp(a, b) - compares strings a and b (lexicographical ordering); returns negative if a is less than
b, 0 if equal, positive if greater.
* strcpy(dest, source) - copies the string source onto the string dest
* strlen(st) - returns the length of string st
* strncat(dest, source, n) - appends a maximum of n characters from the string source to the end of
string dest and null terminates the string at the end of input or at index n+1 when the max length is
reached
* strncmp(a, b, n) - compares a maximum of n characters from strings a and b (lexicographical ordering);
returns negative if a is less than b, 0 if equal, positive if greater
* strrchr(s, c) - finds the last instance of character c in string s and returns a pointer to it or a null
pointer if c is not found

Other standard string functions include:

* strcoll(s1, s2) - compares two strings according to a locale-specific collating sequence
* strcspn(s1, s2) - returns the index of the first character in s1 that matches any character in s2
* strerror(errno) - returns a string with an error message corresponding to the code in errno
* strncpy(dest, source, n) - copies n characters from the string source onto the string dest,
substituting null bytes once past the end of source; does not null terminate if max length is reached
* strpbrk(s1, s2) - returns a pointer to the first character in s1 that matches any character in s2 or a
null pointer if not found
* strspn(s1, s2) - returns the index of the first character in s1 that matches no character in s2
* strstr(st, subst) - returns a pointer to the first occurrence of the string subst in st or a null pointer if
no such substring exists
* strtok(s1, s2) - returns a pointer to a token within s1 delimited by the characters in s2
* strxfrm(s1, s2, n) - transforms s2 onto s1, such that s1 used with strcmp gives the same results as
s2 used with strcoll
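
A few of these functions are combined in the sketch below, which also shows one way to meet the programmer's responsibility for buffer size: compute the required length with strlen before calling strcpy and strcat. The helpers join_path and base_name are hypothetical.

```c
#include <string.h>

/* Hypothetical helper: writes "<dir>/<name>" into out (capacity out_size).
   Returns 1 on success, 0 if the result would not fit. */
int join_path(char *out, size_t out_size, const char *dir, const char *name)
{
    size_t needed = strlen(dir) + 1 + strlen(name) + 1;  /* '/' and terminator */
    if (needed > out_size)
        return 0;              /* the caller's buffer is too small */
    strcpy(out, dir);
    strcat(out, "/");
    strcat(out, name);
    return 1;
}

/* Hypothetical helper: returns the part of path after the last '/',
   or the whole path if it contains no '/'. */
const char *base_name(const char *path)
{
    const char *slash = strrchr(path, '/');
    return slash != NULL ? slash + 1 : path;
}
```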

There is a similar set of functions for handling wide character strings.


Structures and unions

Structures

Structures in C are defined as data containers consisting of a sequence of named members of various
types. They are similar to records in other programming languages. The members of a structure are
stored in consecutive locations in memory, although the compiler is allowed to insert padding between
or after members (but not before the first member) for efficiency. The size of a structure is equal to the
sum of the sizes of its members, plus the size of the padding.
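
The layout rules above can be observed with sizeof and the offsetof macro from <stddef.h>. The structure sample and the function names are hypothetical, and the actual amount of padding is implementation-defined.

```c
#include <stddef.h>

struct sample {       /* hypothetical structure, for illustration only */
    char  tag;
    int   count;
    char  flag;
};

/* The first member always sits at offset 0: no padding may precede it. */
int first_member_at_zero(void)
{
    return offsetof(struct sample, tag) == 0;
}

/* The structure is at least as large as the sum of its members; any
   difference is padding, whose amount is implementation-defined. */
size_t padding_bytes(void)
{
    size_t members = sizeof(char) + sizeof(int) + sizeof(char);
    return sizeof(struct sample) - members;
}
```

On many platforms with a 4-byte int, padding_bytes returns 6, but no particular value is guaranteed.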

Unions

Unions in C are related to structures and are defined as objects that may hold (at different times)
objects of different types and sizes. They are analogous to variant records in other programming
languages. Unlike structures, the components of a union all refer to the same location in memory. In
this way, a union can be used at various times to hold different types of objects, without the need to
create a separate object for each new type. The size of a union is at least the size of its largest
member; the compiler may add trailing padding for alignment.
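
A minimal sketch of the shared-storage behaviour follows. The union number and the function names are hypothetical.

```c
union number {        /* hypothetical union, for illustration only */
    int    i;
    double d;
};

/* All members start at the same address as the union itself. */
int members_share_address(void)
{
    union number n;
    return (void *)&n.i == (void *)&n && (void *)&n.d == (void *)&n;
}

/* Storing into one member replaces whatever the union held before. */
double store_then_read(void)
{
    union number n;
    n.i = 42;         /* the union currently holds an int */
    n.d = 1.5;        /* now it holds a double; the int value is gone */
    return n.d;
}
```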

Declaration

Structures are declared with the struct keyword and unions are declared with the union keyword. The
specifier keyword is followed by an optional identifier name, which is used to identify the form of the
structure or union. The identifier is followed by the declaration of the structure or union's body: a list of
member declarations, contained within curly braces, with each declaration terminated by a semicolon.
Finally, the declaration concludes with an optional list of identifier names, which are declared as
instances of the structure or union.

For example, the following statement declares a structure named s that contains three members; it will
also declare an instance of the structure known as t:

struct s
{
    int x;
    float y;
    char *z;
} t;

And the following statement will declare a similar union named u and an instance of it named n:

union u
{
    int x;
    float y;
    char *z;
} n;

Once a structure or union body has been declared and given a name, it can be considered a new data
type using the specifier struct or union, as appropriate, and the name. For example, the following
statement, given the above structure declaration, declares a new instance of the structure s named r:

struct s r;

It is also common to use the typedef specifier to eliminate the need for the struct or union keyword in
later references to the structure. The first identifier after the body of the structure is taken as the new
name for the structure type. For example, the following statement will declare a new type known as
s_type that will contain some structure:

typedef struct {…} s_type;


Future statements can then use the specifier s_type (instead of the expanded struct … specifier) to refer
to the structure.

Accessing members

Members are accessed using the name of the instance of a structure or union, a period (.), and the name
of the member. For example, given the declaration of t from above, the member known as y (of type
float) can be accessed using the following syntax:

t.y

Structures are commonly accessed through pointers. Consider the following example that defines a
pointer to t, known as ptr_to_t:

struct s *ptr_to_t = &t;

Member y of t can then be accessed by dereferencing ptr_to_t and using the result as the left operand:

(*ptr_to_t).y

Which is identical to the simpler t.y above as long as ptr_to_t points to t. Because this operation is
common, C provides an abbreviated syntax for accessing a member directly from a pointer. With this
syntax, the name of the instance is replaced with the name of the pointer and the period is replaced with
the character sequence ->. Thus, the following method of accessing y is identical to the previous two:

ptr_to_t->y

Members of unions are accessed in the same way.
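
The three access forms can be compared side by side in a short sketch. The structure s is repeated from above so the example is self-contained; double_y and access_forms_agree are hypothetical names.

```c
struct s {
    int x;
    float y;
    char *z;
};

/* Hypothetical helper: doubles the y member of the structure p points to. */
void double_y(struct s *p)
{
    p->y = p->y * 2;        /* same as (*p).y = (*p).y * 2 */
}

/* Returns 1 if all three access forms refer to the same member. */
int access_forms_agree(void)
{
    struct s t = { 1, 2.5f, 0 };
    struct s *ptr_to_t = &t;
    double_y(ptr_to_t);
    return t.y == 5.0f && (*ptr_to_t).y == 5.0f && ptr_to_t->y == 5.0f;
}
```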


Initialization

A structure can be initialized in its declarations using an initializer list, similar to arrays. If a structure is
not initialized, the values of its members are undefined until assigned. The components of the initializer
list must agree, in type and number, with the components of the structure itself.

The following statement will initialize a new instance of the structure s from above known as pi:

struct s pi = { 3, 3.1415, "Pi" };

Designated initializers allow members to be initialized by name. The following initialization is
equivalent to the previous one.

struct s pi = { .x = 3, .y = 3.1415, .z = "Pi" };

Members may be initialized in any order, and those that are not explicitly mentioned are set to zero.
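
Both rules can be checked in a sketch that assumes a C99 or later compiler. The structure s is repeated so the example is self-contained; the function names are hypothetical.

```c
#include <string.h>

struct s {
    int x;
    float y;
    char *z;
};

/* Returns 1 if positional and designated initializers produce equal members. */
int initializers_agree(void)
{
    struct s a = { 3, 3.1415f, "Pi" };
    struct s b = { .z = "Pi", .x = 3, .y = 3.1415f };   /* any order */
    return a.x == b.x && a.y == b.y && strcmp(a.z, b.z) == 0;
}

/* Members not mentioned in the initializer are set to zero (or NULL). */
int omitted_members_are_zero(void)
{
    struct s c = { .y = 1.0f };
    return c.x == 0 && c.z == NULL;
}
```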

Any one member of a union may be initialized using designated initializers.

union u value = { .y = 3.1415 };


In C89, a union could only be initialized with a value of the type of its first member. That is, the union
u from above could only be initialized with a value of type int.

union u value = { 3 };

Assignment

Assigning values to individual members of structures and unions is syntactically identical to assigning
values to any other object. The only difference is that the lvalue of the assignment is the name of the
member, as accessed by the syntax mentioned above.

A structure can also be assigned as a unit to another structure of the same type. Structures (and pointers
to structures) may also be used as function parameter and return types.

For example, the following statement assigns the value of 74 (the ASCII code point for the letter 'J') to
the member named x in the structure t, from above:

t.x = 74;

And the same assignment, using ptr_to_t in place of t, would look like:

ptr_to_t->x = 74;

Assignment with members of unions is identical, except that each new assignment changes the current
type of the union, and the previous type and value are lost.

Other operations

According to the C standard, the only legal operations that can be performed on a structure are copying
it, assigning to it as a unit (or initializing it), taking its address with the address-of (&) unary operator,
and accessing its members. Unions have the same restrictions. One of the operations implicitly
forbidden is comparison: structures and unions cannot be compared using C's standard comparison
facilities (==, >, <, etc.).
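
Because == cannot be applied to whole structures, equality is usually written as a member-by-member function. The following sketch shows that approach alongside the operations that are permitted; struct point, point_equal, and point_shift are hypothetical.

```c
struct point {        /* hypothetical structure, for illustration only */
    int x;
    int y;
};

/* Structures cannot be compared with ==, so compare member by member. */
int point_equal(struct point a, struct point b)
{
    return a.x == b.x && a.y == b.y;
}

/* Structures can be passed and returned by value and assigned as a unit. */
struct point point_shift(struct point p, int dx, int dy)
{
    struct point q = p;   /* whole-structure copy */
    q.x += dx;
    q.y += dy;
    return q;
}
```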

Bit fields

C also provides a special type of structure member known as a bit field, which is an integer with an
explicitly specified number of bits. A bit field is declared as a structure member of type int, signed int,
unsigned int, or _Bool, following the member name by a colon (:) and the number of bits it should
occupy. The total number of bits in a single bit field must not exceed the total number of bits in its
declared type.

As a special exception to the usual C syntax rules, it is implementation-defined whether a bit field
declared as type int, without specifying signed or unsigned, is signed or unsigned. Thus, it is
recommended to explicitly specify signed or unsigned on all bit-field members for portability.

Unnamed entries, consisting of a type followed by just a colon and a number of bits, are also allowed;
these indicate padding.

The members of bit fields do not have addresses, and as such cannot be used with the address-of (&)
unary operator. The sizeof operator may not be applied to bit fields.

The following declaration declares a new structure type known as f and an instance of it known as g.
Comments provide a description of each of the members:

struct f
{
    unsigned int flag : 1; /* a bit flag: can either be on (1) or off (0) */
    signed int num : 4;    /* a signed 4-bit field; range -7...7 or -8...7 */
    unsigned int : 3;      /* 3 bits of padding to round out 8 bits */
} g;
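
The value ranges noted in the comments can be exercised with a small sketch. The structure f is repeated (without the instance g) so the example is self-contained; store_flag and num_round_trip are hypothetical names.

```c
struct f {
    unsigned int flag : 1;
    signed int   num  : 4;
    unsigned int      : 3;   /* unnamed padding bits */
};

/* Unsigned bit fields wrap modulo 2^width: a 1-bit field holds only 0 or 1,
   so only the lowest bit of v is kept. */
unsigned int store_flag(unsigned int v)
{
    struct f g = { 0 };
    g.flag = v;
    return g.flag;
}

/* A signed 4-bit field can hold at least -7 through 7.
   value is assumed to be within that range. */
int num_round_trip(int value)
{
    struct f g = { 0 };
    g.num = value;
    return g.num;
}
```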

Incomplete types

The body of a struct or union declaration, or a typedef thereof, may be omitted, yielding an incomplete
type. Such a type may not be instantiated (its size is not known), nor may its members be accessed
(they, too, are unknown); however, the derived pointer type may be used (but not dereferenced).

Incomplete types are used to implement recursive structures; the body of the type declaration may be
deferred to later in the translation unit:

typedef struct Bert Bert;
typedef struct Wilma Wilma;

struct Bert
{
    Wilma *wilma;
};

struct Wilma
{
    Bert *bert;
};

Incomplete types are also used for data hiding; the incomplete type is declared in a header file, and the
body only within the relevant source file.
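
The data-hiding pattern might look like the following sketch, shown in a single file for brevity. In practice the first part would live in a header and the second in one source file. The type counter and its functions are hypothetical.

```c
#include <stdlib.h>

/* --- what a header would expose: an incomplete type plus functions --- */
typedef struct counter counter;          /* hypothetical opaque type */
counter *counter_new(void);
void     counter_increment(counter *c);
int      counter_value(const counter *c);
void     counter_free(counter *c);

/* --- what only the implementation file would contain --- */
struct counter {
    int value;
};

/* Allocates a counter starting at 0; returns NULL on failure. */
counter *counter_new(void)
{
    counter *c = malloc(sizeof *c);
    if (c != NULL)
        c->value = 0;
    return c;
}

void counter_increment(counter *c) { c->value++; }
int  counter_value(const counter *c) { return c->value; }
void counter_free(counter *c) { free(c); }
```

Code that sees only the first part can hold and pass counter pointers but cannot read or modify value directly.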

Operators

Main article: Operators in C and C++

Control structures

C is a free-form language.
