
Data structures

In computer science, a data structure is a particular way of storing and organizing data in
a computer so that it can be used efficiently when required. Put another way, a
data structure is a specialized format for organizing and storing data. General data
structure types include the array, the file, the record, the table, the tree, and so on. Any
data structure is designed to organize data to suit a specific purpose so that it can be
accessed and worked with in appropriate ways. In computer programming, a data
structure may be selected or designed to store data for the purpose of working on it with
various algorithms. Different kinds of data structures are suited to different kinds of
applications, and some are highly specialized to specific tasks. For example, B-trees are
particularly well-suited for implementation of databases, while compiler implementations
usually use hash tables to look up identifiers. Data structures are used in almost every
program or software system. Specific data structures are essential ingredients of many
efficient algorithms, and make possible the management of huge amounts of data, such as
large databases and internet indexing services. Some formal design methods and
programming languages emphasize data structures, rather than algorithms, as the key
organizing factor in software design.

Basic principles

Data structures are generally based on the ability of a computer to fetch and store data at
any place in its memory, specified by an address: a bit string that can itself be
stored in memory and manipulated by the program. Thus the record and array data
structures are based on computing the addresses of data items with arithmetic
operations, while the linked data structures are based on storing addresses of data
items within the structure itself. Many data structures use both principles,
sometimes combined in non-trivial ways. The implementation of a data structure
usually requires writing a set of procedures that create and manipulate instances of
that structure. The efficiency of a data structure cannot be analyzed separately from
those operations.
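
The two principles can be contrasted in a short C sketch. The names struct node, array_get and list_get are hypothetical and chosen only for this illustration: the first function reaches an element by address arithmetic, the second by following addresses stored inside the structure.

```c
#include <stddef.h>

/* Hypothetical linked record used only for this illustration. */
struct node {
    int value;
    struct node *next;   /* stored address of the next item, or NULL */
};

/* Array access: the address of element i is computed arithmetically
   from the base address; a[i] is defined as *(a + i). */
int array_get(const int *a, size_t i)
{
    return *(a + i);
}

/* Linked access: follow the stored addresses i times from the head.
   Returns 0 if the list has fewer than i + 1 items (a choice made
   for this sketch, not a general convention). */
int list_get(const struct node *head, size_t i)
{
    while (head != NULL && i > 0) {
        head = head->next;
        i--;
    }
    return head != NULL ? head->value : 0;
}
```

The array lookup takes the same time for any index, whereas the linked lookup takes time proportional to the index, which is one reason efficiency has to be judged together with the operations.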

Classification of data structures

1. Primitive and non-primitive: Primitive data structures are the basic data structures
that machine instructions operate on directly. Examples: integer, character.
Non-primitive data structures are derived from the primitive data
structures. Examples: structure, union, array.
2. Homogeneous and heterogeneous: In a homogeneous data structure all the
elements are of the same type. Example: array. In a heterogeneous data structure the
elements may be of different types. Example: structure.
3. Static and dynamic: In some data structures memory is allocated
at compile time; these are known as static data structures.
If memory is allocated at run time, they are known as
dynamic data structures. In C, functions such as malloc and calloc are used for
run-time memory allocation.
4. Linear and non-linear: A linear data structure maintains a linear
relationship between its elements. Example: array. A non-linear data structure does
not maintain a linear relationship between its elements. Example: tree.
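
Several of these categories can be seen side by side in C. The following is a minimal sketch; struct student, fixed_scores, sum_fixed_scores and make_scores are names invented for the example.

```c
#include <stdlib.h>

/* Heterogeneous, non-primitive: members of different types. */
struct student {
    int  id;
    char grade;
};

/* Homogeneous and static: the size is fixed at compile time. */
static int fixed_scores[4] = {1, 2, 3, 4};

/* Adds up the statically allocated array. */
int sum_fixed_scores(void)
{
    int sum = 0;
    for (size_t i = 0; i < sizeof fixed_scores / sizeof fixed_scores[0]; i++)
        sum += fixed_scores[i];
    return sum;
}

/* Homogeneous and dynamic: the size n is chosen at run time.
   calloc zero-initialises the block; returns NULL on failure.
   The caller is responsible for calling free() on the result. */
int *make_scores(size_t n)
{
    return calloc(n, sizeof(int));
}
```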

In computer science, a tree is a widely used data structure that
emulates a hierarchical tree structure with a set of linked nodes. Mathematically,
it is not a tree, but an arborescence: an acyclic connected graph where each node
has zero or more child nodes and at most one parent node. Furthermore, the
children of each node have a specific order. A node is a structure which may
contain a value, a condition, or represent a separate data structure (which could be
a tree of its own). Each node in a tree has zero or more child nodes, which are
below it in the tree (by convention, trees grow down, not up as they do in nature).
A node that has a child is called the child's parent node. A node has at most one
parent. Nodes that do not have any children are called leaf nodes. They are also
referred to as terminal nodes.
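
As an illustration, a node with an ordered list of children might be declared as below. This is only a sketch: struct tree_node, MAX_CHILDREN and count_leaves are hypothetical names, and the fixed child limit is an assumption made to keep the example short.

```c
#include <stddef.h>

#define MAX_CHILDREN 4   /* arbitrary limit chosen for this sketch */

struct tree_node {
    int value;
    size_t child_count;                        /* 0 for a leaf node */
    struct tree_node *children[MAX_CHILDREN];  /* ordered, left to right */
};

/* Counts the leaf (terminal) nodes in the subtree rooted at n. */
size_t count_leaves(const struct tree_node *n)
{
    size_t leaves = 0;

    if (n == NULL)
        return 0;
    if (n->child_count == 0)
        return 1;
    for (size_t i = 0; i < n->child_count; i++)
        leaves += count_leaves(n->children[i]);
    return leaves;
}
```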

A subtree of a tree T is a tree consisting of a node in T and all of its
descendants in T. The subtree corresponding to the root node is the entire tree; the
subtree corresponding to any other node is called a proper subtree.

Tree representations

There are many different ways to represent trees; common
representations represent the nodes as records allocated on the heap with pointers
to their children, their parents, or both, or as items in an array, with relationships
between them determined by their positions in the array (e.g., a binary heap).
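
For the array representation, the usual 0-based layout of a binary heap derives every relationship from the index alone. The helper names below are assumptions made for this sketch.

```c
#include <stddef.h>

/* 0-based array layout of a complete binary tree, as used by a binary heap. */
size_t heap_parent(size_t i) { return (i - 1) / 2; }  /* requires i > 0 */
size_t heap_left(size_t i)   { return 2 * i + 1; }
size_t heap_right(size_t i)  { return 2 * i + 2; }

/* Returns 1 if a[0..n-1] satisfies the max-heap property
   (every parent is >= each of its children), 0 otherwise. */
int is_max_heap(const int *a, size_t n)
{
    for (size_t i = 1; i < n; i++) {
        if (a[heap_parent(i)] < a[i])
            return 0;
    }
    return 1;
}
```

No pointers are stored at all here; the parent and children of an item are found purely by arithmetic on its position.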

Trees and graphs

The tree data structure can be generalized to represent directed
graphs by allowing cycles to form in the parent-child relationship. Instead of
parents and children, we speak of the sources and targets of the edges of the
directed graph. However, this is not a common implementation strategy.

Relationship with trees in graph theory

In graph theory, a tree is a connected acyclic graph; unless stated
otherwise, trees and graphs are undirected. There is no one-to-one correspondence
between such trees and trees as data structures. We can take an arbitrary undirected
tree, arbitrarily pick one of its vertices as the root, make all its edges directed by
making them point away from the root node (producing an arborescence), and
assign an order to all the nodes. The result corresponds to a tree data structure.
Picking a different root or different ordering produces a different one.
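
The rooting step can be sketched in C as follows. The edge-list input, the parent-array output and the names struct edge, root_tree and MAX_VERTICES are all assumptions made for this illustration; the order of the children here simply follows the order of the edge list.

```c
#include <stddef.h>

#define MAX_VERTICES 16   /* arbitrary limit chosen for this sketch */

struct edge { int u, v; };   /* one undirected edge between vertices u and v */

/* Orients an undirected tree with n vertices (numbered 0..n-1) and
   n - 1 edges away from the chosen root. On return parent[v] is the
   parent of v, and parent[root] is -1. Assumes the edges really do
   form a tree and that n <= MAX_VERTICES. */
void root_tree(size_t n, const struct edge *edges, int root, int *parent)
{
    int stack[MAX_VERTICES];
    int visited[MAX_VERTICES] = {0};
    size_t top = 0;

    if (n == 0 || n > MAX_VERTICES)
        return;
    for (size_t i = 0; i < n; i++)
        parent[i] = -1;

    visited[root] = 1;
    stack[top++] = root;
    while (top > 0) {
        int u = stack[--top];
        for (size_t e = 0; e + 1 < n; e++) {   /* a tree has n - 1 edges */
            int v = -1;
            if (edges[e].u == u)
                v = edges[e].v;
            else if (edges[e].v == u)
                v = edges[e].u;
            if (v >= 0 && !visited[v]) {
                visited[v] = 1;
                parent[v] = u;     /* the edge now points away from the root */
                stack[top++] = v;
            }
        }
    }
}
```

Calling root_tree on the same edge list with two different roots yields two different parent arrays, which is the sense in which one undirected tree corresponds to several tree data structures.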

Pointers

In computer science, a pointer is a programming language data type whose value
refers directly to (or "points to") another value stored elsewhere in the computer
memory using its address. For high-level programming languages, pointers
effectively take the place of general purpose registers in low level languages such
as assembly language or machine code but, in contrast, occupy part of the
available memory. A pointer references a location in memory, and obtaining the
value at the location a pointer refers to is known as dereferencing the pointer. A
pointer is a simple, less abstracted implementation of the more abstracted
reference data type. Several languages support some type of pointer, although
some are more restricted than others. Pointers to data significantly improve
performance for repetitive operations such as traversing strings, lookup tables,
control tables and tree structures. In particular, it is often much cheaper in time
and space to copy and dereference pointers than it is to copy and access the data
to which the pointers point.
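
A small C sketch of both points follows. The 4096-byte payload and the names struct big_record, swap_record_ptrs and string_length are assumptions chosen for illustration.

```c
#include <stddef.h>

/* A deliberately large record; the payload size is arbitrary. */
struct big_record {
    int  key;
    char payload[4096];
};

/* Exchanges two pointers. Only two addresses are copied;
   the records themselves, payloads included, do not move. */
void swap_record_ptrs(struct big_record **a, struct big_record **b)
{
    struct big_record *tmp = *a;
    *a = *b;
    *b = tmp;
}

/* Traverses a string with a pointer until the terminating '\0'
   and returns the number of characters before it. */
size_t string_length(const char *s)
{
    const char *p = s;
    while (*p != '\0')
        p++;
    return (size_t)(p - s);
}
```

Sorting an array of such pointers, for example, rearranges addresses only, which is typically far cheaper than moving the records they refer to.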

In object-oriented programming, pointers to functions are used for binding
methods, often using what are called virtual method tables. While "pointer" has
been used to refer to references in general, it more properly applies to data
structures whose interface explicitly allows the pointer to be manipulated
(arithmetically via pointer arithmetic) as a memory address, as opposed to a
magic cookie or capability where this is not possible. Because pointers allow both
protected and unprotected access to memory addresses, there are risks associated
with using them, particularly in the latter case.
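
The idea of a virtual method table can be imitated by hand in C with a table of function pointers. This is a sketch of the general technique, not of how any particular compiler lays out its tables; every name in it is hypothetical.

```c
struct shape;   /* forward declaration */

/* The "virtual method table": one function pointer per method. */
struct shape_vtable {
    int (*area)(const struct shape *self);
};

struct shape {
    const struct shape_vtable *vtable;   /* selects the implementation */
    int width;
    int height;
};

static int rect_area(const struct shape *s)     { return s->width * s->height; }
static int triangle_area(const struct shape *s) { return s->width * s->height / 2; }

static const struct shape_vtable rect_vtable     = { rect_area };
static const struct shape_vtable triangle_vtable = { triangle_area };

struct shape make_rect(int w, int h)
{
    struct shape s = { &rect_vtable, w, h };
    return s;
}

struct shape make_triangle(int w, int h)
{
    struct shape s = { &triangle_vtable, w, h };
    return s;
}

/* Dynamic dispatch: the function that runs is chosen through the
   pointer stored in the object, not fixed at the call site. */
int shape_area(const struct shape *s)
{
    return s->vtable->area(s);
}
```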

Types of Pointers

1. C pointers

The basic syntax to define a pointer is

int *money;

This declares money as a pointer to an integer. Since the contents of memory are
not guaranteed to be of any specific value in C, care must be taken to ensure that
the address that money points to is valid. This is why it is sometimes suggested to
initialize the pointer to NULL (however initialising pointers unnecessarily can
mask compiler analyses and hide bugs).

int *money = NULL;

Dereferencing a null pointer is undefined behavior in C; on most hosted systems
execution stops, usually with a segmentation fault. Once a pointer has been declared, the
next logical step is for it to point at something:

int a = 5;
int *money = NULL;

money = &a;

This sets the value of money to the address of a. For example, if a is stored
at memory location of 0x8130 then the value of money will be 0x8130 after the
assignment. To dereference the pointer, an asterisk is used again:

*money = 8;

This means take the contents of money (which is 0x8130), "locate" that address in
memory and set its value to 8. If a is later accessed again, its new value will be 8.

This example may be clearer if memory is examined directly. Assume that a is
located at address 0x8130 in memory and money at 0x8134; also assume this is a
32-bit machine such that an int is 32 bits wide. The following is what would be in
memory after this code snippet is executed:

int a = 5;
int *money = NULL;

Address Contents

0x8130 0x00000005
0x8134 0x00000000

(The NULL pointer shown here is 0x00000000.) Assigning the address of a to money

money = &a;

yields the following memory values:

Address Contents

0x8130 0x00000005

0x8134 0x00008130

Then, when money is dereferenced by coding

*money = 8;

the computer will take the contents of money (which is 0x8130),
'locate' that address, and assign 8 to that location, yielding the following memory:

Address Contents

0x8130 0x00000008

0x8134 0x00008130

Clearly, accessing a will yield the value 8 because the previous
instruction modified the contents of a by way of the pointer money.

2. Null pointer

Since a null-valued pointer does not refer to a meaningful object, an attempt to
dereference a null pointer usually causes a run-time error. If this error is left
unhandled, the program terminates immediately. In the case of C on a general
computer, execution halts with a segmentation fault because the literal address of
NULL is never allocated to a running program (with a C program in an embedded
system, various things may occur). In Java, access to a null reference triggers a
NullPointerException, which can be caught by error handling code, but the
preferred practice is to ensure that such exceptions never occur. In safe languages
a possibly-null pointer can be replaced with a tagged union which enforces
explicit handling of the exceptional case; in fact, a possibly-null pointer can be
seen as a tagged pointer with a computed tag. Other languages, such as Objective-
C, allow messages to be sent to a nil address (the value of a pointer that does not
point to a valid object) without causing the program to be interrupted; the
message will simply be ignored, and the return value (if any) is nil or 0,
depending on the type.
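
The tagged-union idea can be approximated in C, with the caveat that C itself does not enforce that the tag is checked; languages with real tagged unions do. The sketch below uses hypothetical names (struct option_int, find_first_negative, option_or) and stores the value directly rather than a pointer.

```c
#include <stddef.h>

/* The tag says whether a value is present. */
enum option_tag { OPTION_NONE, OPTION_SOME };

struct option_int {
    enum option_tag tag;
    int value;            /* meaningful only when tag == OPTION_SOME */
};

/* Returns the first negative element of a[0..n-1], or "none". */
struct option_int find_first_negative(const int *a, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (a[i] < 0)
            return (struct option_int){ OPTION_SOME, a[i] };
    }
    return (struct option_int){ OPTION_NONE, 0 };
}

/* Forces the caller to say what happens in the "none" case. */
int option_or(struct option_int o, int fallback)
{
    return o.tag == OPTION_SOME ? o.value : fallback;
}
```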

In C and C++ programming, two null pointers are guaranteed to compare equal;
ANSI C guarantees that any null pointer will compare equal to the integer
constant 0; furthermore the macro NULL is defined as a null pointer
constant, that is, the value 0 (either as an integer type or converted to a pointer to
void), so a null pointer will compare equal to NULL.

A null pointer should not be confused with an uninitialized pointer: a null pointer
is guaranteed to compare unequal to any valid pointer, whereas depending on the
language and implementation an uninitialized pointer might have either an
indeterminate (random or meaningless) value or might be initialised to an initial
constant (possibly but not necessarily NULL).

In most C programming environments malloc returns a null pointer if it is unable
to allocate the memory region requested, which notifies the caller that there is
insufficient memory available. However, some implementations of malloc allow
malloc(0) to return a null pointer and instead indicate failure by both
returning a null pointer and setting errno to an appropriate value.
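
In practice this means the result of malloc should be checked before it is used. A minimal sketch, with copy_string as a hypothetical helper name:

```c
#include <stdlib.h>
#include <string.h>

/* Returns a heap-allocated copy of s, or NULL if the allocation
   failed. The caller must free() the result when done with it. */
char *copy_string(const char *s)
{
    size_t n = strlen(s) + 1;      /* +1 for the terminating '\0' */
    char *copy = malloc(n);

    if (copy == NULL)              /* allocation failed: report it */
        return NULL;
    memcpy(copy, s, n);
    return copy;
}
```

Because n is always at least 1 here, the malloc(0) ambiguity described above does not arise in this function.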

Computer systems based on a tagged architecture are able to distinguish in
hardware between a NULL dereference and a legitimate attempt to access a word
or structure at address zero.

3. Wild pointers

Wild pointers are pointers that have not been initialized (that is, a wild pointer
does not have any address assigned to it) and may make a program crash or
behave oddly. In the Pascal or C programming languages, pointers that are not
specifically initialized may point to unpredictable addresses in memory.

The following example code shows a wild pointer:

#include <stdlib.h>

int func(void)
{
    char *p1 = malloc(sizeof(char)); /* address of some (uninitialized) place on the heap */
    char *p2;                        /* wild (uninitialized) pointer */
    *p1 = 'a'; /* This is OK, assuming malloc() has not returned NULL. */
    *p2 = 'b'; /* This invokes undefined behavior. */
    free(p1);
    return 0;
}

Here, p2 may point to anywhere in memory, so performing the
assignment *p2 = 'b' will corrupt an unknown area of memory that may contain
sensitive data.
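
One way to avoid the problem is sketched below: every pointer is initialized, and each is given a valid target before it is dereferenced. The function name func_safe and the local variable storage are assumptions added for this illustration.

```c
#include <stdlib.h>

/* Returns 0 on success, -1 if the allocation failed. */
int func_safe(void)
{
    char storage = '\0';              /* a valid object for p2 to point at */
    char *p1 = malloc(sizeof *p1);
    char *p2 = NULL;                  /* initialized, so no longer wild */
    int ok;

    if (p1 == NULL)
        return -1;

    p2 = &storage;                    /* p2 now has a valid target */
    *p1 = 'a';
    *p2 = 'b';                        /* defined behavior: writes to storage */

    ok = (*p1 == 'a' && storage == 'b');
    free(p1);
    return ok ? 0 : -1;
}
```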

4. The void pointer, or void*

The void pointer, or void*, is supported in ANSI C and C++ as a
generic pointer type. A pointer to void can store the address of an object of any data type and,
in C, is implicitly converted to any other pointer type on assignment, but it must
be explicitly cast if dereferenced inline.

int x = 4;
void * q = &x;
int* p = q; /* void* implicitly converted to int*: valid C, but not C++ */
int i = *p;
int j = *(int*)q; /* when dereferencing inline, there is no implicit conversion */

C++ does not allow the implicit conversion of void* to other
pointer types, not even in assignments. This was a design decision to avoid
careless and even unintended casts, though most compilers only output warnings,
not errors, when encountering other ill-formed casts.

int x = 4;
void* q = &x;
// int* p = q; This fails in C++: there is no implicit conversion from void*
int* a = (int*)q; // C-style cast
int* b = static_cast<int*>(q); // C++ cast

In C++, there is no void& (reference to void) to complement void*
(pointer to void), because references behave like aliases to the variables they refer
to, and there can never be a variable whose type is void.
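
A typical use of void* in C is a routine that works on objects of any type by treating them as raw bytes. The following is a minimal sketch; generic_swap is a hypothetical name, and the two objects are assumed not to overlap.

```c
#include <stddef.h>

/* Swaps two objects of the given size, whatever their type.
   Assumes a and b each point to at least size bytes and do not overlap. */
void generic_swap(void *a, void *b, size_t size)
{
    unsigned char *pa = a;   /* implicit conversion from void*: valid C */
    unsigned char *pb = b;

    for (size_t i = 0; i < size; i++) {
        unsigned char tmp = pa[i];
        pa[i] = pb[i];
        pb[i] = tmp;
    }
}
```

The same function can swap two ints or two doubles; in C++ the two conversions from void* would each need an explicit cast.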
