with a backslash (\), for example, "This string contains \"double quotes\".". To insert a
literal backslash, one must double it, e.g. "A backslash looks like this: \\".
\r Carriage return
\b Backspace
\t Horizontal tab
\f Form feed
\a Alert (bell)
\v Vertical tab
\? Question mark (used to escape trigraphs)
\nnn Character with octal value nnn
\xhh Character with hexadecimal value hh
The use of other backslash escapes is not defined by the C standard, although compiler
vendors often provide additional escape codes as language extensions.
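As a minimal sketch of the escapes listed above, the following function returns a string built from several of them. The name escape_demo is hypothetical and used only for this illustration.

```c
#include <stdio.h>
#include <string.h>

/* Returns a six-character string built only from escape sequences and
   the letter q. escape_demo is a hypothetical name for this sketch. */
const char *escape_demo(void)
{
    /* \"   embeds a double quote
       \\   embeds one literal backslash
       \t   embeds a horizontal tab
       \x41 embeds the character whose hexadecimal value is 41 */
    return "\"q\"\\\t\x41";
}
```

Passing the result to puts would write a quoted q, a backslash, a tab, and the character with code 0x41 (the letter A on ASCII systems).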
String literal concatenation
Adjacent string literals are concatenated at compile time; this allows long strings to be
split over multiple lines, and also allows string literals produced by C preprocessor
macros to be joined with other string literals at compile time.
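A minimal sketch of compile-time concatenation follows. The macro GREETING and the function concat_demo are hypothetical names chosen for this illustration; they do not come from the original text. Adjacent literals, including one supplied by a macro, are joined into a single string:

```c
#include <string.h>

/* GREETING is a hypothetical macro introduced for this sketch. */
#define GREETING "Hello "

/* After macro expansion the compiler sees three adjacent literals,
   "Hello " "wor" "ld\n", and joins them into "Hello world\n". */
const char *concat_demo(void)
{
    return GREETING "wor"
           "ld\n";
}
```

The result is indistinguishable from a function that returns the single literal "Hello world\n"; no run-time copying takes place.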
Individual character constants are enclosed in single quotes, e.g. 'A', and have type int
(in C++, char). The difference from a string literal is that "A" denotes a null-terminated
array of two characters (which decays to a pointer to its first element in most
expressions), whereas 'A' directly represents the code value (65 if ASCII is
used). The same backslash escapes are supported as for strings, except that "
can validly be used as a character without being escaped, whereas ' must now be escaped.
A character constant cannot be empty (i.e. '' is invalid syntax), although a string may be
(it still has the null terminating character). Multi-character constants (e.g. 'xy') are valid,
although rarely useful; they let one store several characters in an integer (e.g. 4 ASCII
characters can fit in a 32-bit integer, 8 in a 64-bit one). Since the order in which the
characters are packed into an int is implementation-defined, portable use of multi-character
constants is difficult.
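The distinction between character constants and string literals can be checked with sizeof. The following is a small sketch, valid for C only (in C++ the first function would return 1); the function names are hypothetical.

```c
#include <stddef.h>

/* In C, a character constant has type int, so this equals sizeof(int). */
size_t char_constant_size(void)
{
    return sizeof 'A';
}

/* A string literal is an array that includes the null terminator:
   "A" occupies two bytes, the letter and the terminator. */
size_t string_literal_size(void)
{
    return sizeof "A";
}

/* An empty string literal still holds the terminator. */
size_t empty_string_size(void)
{
    return sizeof "";
}

/* Inside a character constant the single quote must be escaped,
   while the double quote needs no escape. */
int quotes_differ(void)
{
    return '\'' != '"';
}
```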
Wide character strings
Since type char is 1 byte wide (usually 8 bits), a single char value typically can represent
at most 256 distinct character codes, not nearly enough for all the characters in use
worldwide. To provide better support for international characters, the first C standard (C89)
introduced wide characters (encoded in type wchar_t) and wide character strings, which
are written as L"Hello world!".
Wide characters are most commonly either 2 bytes (using a 2-byte encoding such as
UTF-16) or 4 bytes (usually UTF-32), but Standard C does not specify the width of
wchar_t, leaving the choice to the implementor. Microsoft Windows generally uses
UTF-16, thus the above string would be 26 bytes long for a Microsoft compiler (13 two-byte
units, counting the terminating null); the Unix world prefers UTF-32, thus compilers such
as GCC would generate a 52-byte string. A 2-byte wchar_t suffers the same limitation as
char, in that certain characters (those outside the Basic Multilingual Plane, or BMP)
cannot be represented in a single wchar_t and must instead be represented
using surrogate pairs.
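The byte counts quoted above follow from the number of wide characters in the literal, which can be checked without assuming any particular width for wchar_t. This is a sketch with hypothetical function names:

```c
#include <stddef.h>
#include <wchar.h>

/* Total storage of the wide literal in bytes; the value depends on the
   implementation's choice of sizeof(wchar_t). */
size_t wide_literal_bytes(void)
{
    return sizeof L"Hello world!";
}

/* Number of wchar_t elements: 12 characters plus the terminating null. */
size_t wide_literal_units(void)
{
    return sizeof L"Hello world!" / sizeof(wchar_t);
}
```

With a 2-byte wchar_t the first function yields 26, and with a 4-byte wchar_t it yields 52; the second yields 13 in both cases.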
The original C standard specified only minimal functions for operating with wide
character strings; in 1995 the standard was modified to include much more extensive
support, comparable to that for char strings. The relevant functions are mostly named
after their char equivalents, with the addition of a "w" or the replacement of "str" with
"wcs"; they are specified in <wchar.h>, with <wctype.h> containing wide-character
classification and mapping functions.
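To illustrate the naming pattern, the sketch below uses wcslen from <wchar.h> (the counterpart of strlen) and towupper from <wctype.h> (the counterpart of toupper). The function wide_upper is a hypothetical helper, not part of the standard library.

```c
#include <stddef.h>
#include <wchar.h>
#include <wctype.h>

/* Converts a wide string to upper case in place and returns its length
   in wide characters. wide_upper is a hypothetical helper name. */
size_t wide_upper(wchar_t *s)
{
    size_t n = wcslen(s);               /* wide counterpart of strlen */
    for (size_t i = 0; i < n; i++)
        s[i] = (wchar_t)towupper((wint_t)s[i]);  /* counterpart of toupper */
    return n;
}
```

A caller would first place a modifiable wide string in an array, for example with wcscpy(buf, L"hello"), and then pass that array to wide_upper.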
Variable width strings
Strings, both constant and variable, may be manipulated without using the standard
library. However, the library contains many useful functions for working with null-
terminated strings. It is the programmer's responsibility to ensure that enough storage has
been allocated to hold the resulting strings.
* strcat(dest, source) - appends the string source to the end of string dest
* strchr(s, c) - finds the first instance of character c in string s and returns a pointer to it
or a null pointer if c is not found
* strcmp(a, b) - compares strings a and b (lexicographical ordering); returns negative if
a is less than b, 0 if equal, positive if greater.
* strcpy(dest, source) - copies the string source onto the string dest
* strlen(st) - returns the length of string st, not counting the terminating null character
* strncat(dest, source, n) - appends at most n characters from the string source
to the end of string dest and then always appends a terminating null character, so dest
needs room for up to n+1 additional characters
* strncmp(a, b, n) - compares at most n characters from strings a and b (lexicographical
ordering); returns negative if a is less than b, 0 if equal, positive if greater
* strrchr(s, c) - finds the last instance of character c in string s and returns a pointer to
it or a null pointer if c is not found
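The functions listed above can be combined as in the following sketch, which checks the available storage before copying, since the library functions do not do so themselves. The helper join_words and its parameters are hypothetical names chosen for this example.

```c
#include <stddef.h>
#include <string.h>

/* Writes "<first> <second>" into out, which has room for out_size bytes.
   Returns 0 on success, or -1 if the result (including the terminating
   null) would not fit. join_words is a hypothetical helper. */
int join_words(char *out, size_t out_size,
               const char *first, const char *second)
{
    /* Space needed: both words, one separating space, one terminator. */
    if (strlen(first) + 1 + strlen(second) + 1 > out_size)
        return -1;

    strcpy(out, first);    /* copy the first word, including its terminator */
    strcat(out, " ");      /* append the separator */
    strcat(out, second);   /* append the second word */
    return 0;
}
```

After a successful call, strchr and strrchr can be used on the result to locate the first and last occurrence of a character, and strcmp or strncmp to compare it with another string.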
Unions in C are related to structures and are defined as objects that may hold (at different
times) objects of different types and sizes. They are analogous to variant records in other
programming languages. Unlike structures, the components of a union all refer to the
same location in memory. In this way, a union can be used at various times to hold
different types of objects, without the need to create a separate object for each new type.
The size of a union is at least the size of its largest member; the compiler may add
padding to satisfy alignment requirements.
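The shared storage can be observed directly. The sketch below uses a hypothetical union named number; every member starts at the same address, and writing one member overwrites the storage used by the others.

```c
#include <stddef.h>

/* A hypothetical union used only for this illustration. */
union number {
    int   i;
    float f;
    char *p;
};

/* Returns 1 if every member starts at the address of the union itself. */
int members_share_address(void)
{
    union number n;
    return (void *)&n.i == (void *)&n
        && (void *)&n.f == (void *)&n
        && (void *)&n.p == (void *)&n;
}

/* Stores a float, then an int; only the most recently stored member
   holds a meaningful value afterwards. */
int last_write_wins(void)
{
    union number n;
    n.f = 1.5f;
    n.i = 42;      /* reuses the storage previously holding f */
    return n.i;
}
```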
Declaration
Structures are declared with the struct keyword and unions are declared with the union
keyword. The specifier keyword is followed by an optional identifier name, which is used
to identify the form of the structure or union. The identifier is followed by the declaration
of the structure or union's body: a list of member declarations, contained within curly
braces, with each declaration terminated by a semicolon. Finally, the declaration
concludes with an optional list of identifier names, which are declared as instances of the
structure or union.
For example, the following statement declares a structure named s that contains three
members; it will also declare an instance of the structure known as t:
struct s
{
    int x;
    float y;
    char *z;
} t;
And the following statement will declare a similar union named u and an instance of it
named n:
union u
{
    int x;
    float y;
    char *z;
} n;
Once a structure or union body has been declared and given a name, it can be considered
a new data type using the specifier struct or union, as appropriate, and the name. For
example, the following statement, given the above structure declaration, declares a new
instance of the structure s named r:
struct s r;
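Put together, the declarations above might be used as in the following sketch. The function declare_second_instance and the initializer values are hypothetical and serve only to show that t and r share the type struct s.

```c
/* Declares the structure type struct s and, at the same time,
   a file-scope instance named t. */
struct s
{
    int x;
    float y;
    char *z;
} t;

/* Declares a second instance r using the tag, initializes it, and
   copies it into t; the assignment is allowed because both objects
   have type struct s. Hypothetical function for illustration. */
int declare_second_instance(void)
{
    struct s r = { 1, 2.0f, "three" };
    t = r;
    return t.x;
}
```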
It is also common to use the typedef specifier to eliminate the need for the struct or union
keyword in later references to the structure. The first identifier after the body of the
structure is taken as the new name for the structure type. For example, the following
statement will declare a new type known as s_type that will contain some structure:
typedef struct {...} s_type;
Future statements can then use the specifier s_type (instead of the expanded struct
specifier) to refer to the structure.
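A concrete sketch of this idiom follows. The members x and y and the function make_s are assumptions added for illustration, since the text above leaves the structure body unspecified.

```c
/* An unnamed structure given the type name s_type through typedef.
   The members are hypothetical. */
typedef struct {
    int x;
    float y;
} s_type;

/* s_type is used like any built-in type name; no struct keyword is
   needed. make_s is a hypothetical helper. */
s_type make_s(int x, float y)
{
    s_type v = { x, y };
    return v;
}
```

Because the structure has no tag here, s_type is the only way to refer to the type; adding a tag as well (typedef struct s { ... } s_type;) would allow both spellings.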
Accessing members