
Module II

1. Data Types
A data type is a class of data objects with a set of operations for creating and manipulating them.
Examples of elementary data types: integer, real, character, Boolean, enumeration, pointer.
1.1 Specification of a data type
Attributes
Values
Operations
1.1.1 Attributes
Attributes distinguish data objects of a given type. They are invariant during the lifetime of the object.
Approaches:
stored in a descriptor and used during the program execution
used only to determine the storage representation, not used explicitly during execution
1.1.2 Values
The data type determines the values that a data object of that type may have
Specification: usually an ordered set, typically with a least and a greatest value
1.1.3 Operations
Operations define the possible manipulations of data objects of that type.
Primitive - specified as part of the language definition
Programmer-defined (as subprograms, or class methods)
An operation is defined by:
Domain - set of possible input arguments
Range - set of possible results
Action - how the result is produced
Operation signature:
Specifies the domain and the range
the number, order and data types of the arguments in the domain,
the number, order and data type of the resulting range



Mathematical notation for the specification:
op name: arg type x arg type x ... x arg type --> result type
1.2 Implementation of a data type
Storage representation
Implementation of operations
1.2.1 Storage Representation
Storage representation is influenced by the hardware.
Described in terms of the size of the memory blocks required and the layout of attributes and
data values within the block
1.2.2 Implementation of Operations
Hardware operation: direct implementation. E.g. integer addition
Subprogram/function, e.g. square root operation
In-line code. Instead of using a subprogram, the code is copied into the program at the
point where the subprogram would have been invoked.
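The three implementation strategies above can be sketched in C as follows. This is only an illustration: the function names (add_ints, isqrt, square, sum_of_squares) are hypothetical, and whether the compiler really emits a single hardware instruction or copies the inline body is up to the compiler.

```c
/* Hardware operation: integer addition normally maps directly
   onto a machine add instruction. */
int add_ints(int a, int b) { return a + b; }

/* Subprogram: an integer square root written as an ordinary function
   (hypothetical helper; returns the largest r with r * r <= n). */
unsigned isqrt(unsigned n)
{
    unsigned r = 0;
    /* (r + 1) <= n / (r + 1) is the overflow-safe form of (r + 1)^2 <= n */
    while ((r + 1) <= n / (r + 1))
        r++;
    return r;
}

/* In-line code: `static inline` asks the compiler to copy the body
   into each call site instead of generating a call (a hint only). */
static inline int square(int x) { return x * x; }

int sum_of_squares(int a, int b)
{
    return add_ints(square(a), square(b));
}
```

For instance, sum_of_squares(3, 4) evaluates to 25 and isqrt(17) to 4.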
2. Declarations
Information about the name and type of data objects needed during program execution.
Declarations can be implicit or explicit.
Explicit - programmer-defined
Implicit - system-defined
2.1 Declarations of operations
They are prototypes of the functions or subroutines that are programmer-defined.
Examples:
declaration: float Sub(int, float)
signature: Sub: int x float --> float
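As a small sketch of how the declaration relates to the signature, the C fragment below repeats the prototype and adds a definition. Only the prototype comes from the example; the body (a subtraction) is an assumption made purely for illustration.

```c
/* Declaration (prototype): the domain is int x float, the range is float. */
float Sub(int a, float b);

/* Hypothetical definition: the action of the operation is assumed here,
   since the example only gives the declaration. */
float Sub(int a, float b)
{
    return (float)a - b;
}
```

A call such as Sub(5, 2.0f) is accepted because its arguments match the domain; a static type checker uses the prototype to verify this.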
2.2 Purpose of declaration
Choice of storage representation
Storage management
Polymorphic operations
Static type checking



2.3 Type Checking versus Type Conversion
Type checking: checking that each operation executed by a program receives the proper number
of arguments of the proper data types. Type checking can be static or dynamic. Static type
checking is done at compile time. Dynamic type checking is done at run time. Strong typing
means all type errors can be detected statically.
Type inference: data types are determined implicitly from context; used if the interpretation is unambiguous.
2.3.1 Type Conversion and Coercion
Coercion: Implicit type conversion, performed by the system.
Explicit conversion: routines to change from one data type to another.
Example of explicit conversion
Pascal: the function round - converts a real type into integer
C - cast, e.g. (int)X for float X converts the value of X to type integer
There are two opposite approaches to coercion:
a. No coercions (or almost none): any type mismatch is considered an error, as in Pascal and Ada
b. Coercions are the rule. An error is reported only if no conversion is possible.
Coercion: advantages and disadvantages
Advantages: frees the programmer from some low-level concerns, such as adding real numbers and
integers.
Disadvantages: may hide serious programming errors.
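The difference between coercion and explicit conversion, and the way a coercion can hide an error, can be sketched in C. The function names below are hypothetical and chosen only for this illustration.

```c
/* Coercion: the int operand is implicitly converted to double
   before the addition is performed. */
double add_mixed(int i, double d) { return i + d; }

/* Explicit conversion: the cast truncates toward zero. */
int to_int(float x) { return (int)x; }

/* A coercion hiding an error: the division is carried out on integers,
   and only the already truncated result is coerced to double. */
double average_buggy(int sum, int count) { return sum / count; }

/* The explicit cast forces a floating-point division instead. */
double average_fixed(int sum, int count) { return (double)sum / count; }
```

With these definitions, average_buggy(7, 2) yields 3.0 while average_fixed(7, 2) yields 3.5; the compiler accepts both without complaint, which is the disadvantage noted above.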
3. Assignments and Initialization
3.1 Assignments
Assignment - the basic operation for changing the binding of a value to a data object.
The assignment operation can be defined using the concepts L-value and R-value
L-value: Location for an object.
R-value: Contents of that location.
Value, by itself, generally means R-value






Example
A = A + B ;
Pick up contents of location A: R-value of A
Add contents of location B: R-value of B
Store result into address A: L-value of A
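The three steps can be made explicit in C by passing the location of A as a pointer. This is a sketch; add_into and its parameter names are hypothetical.

```c
/* Performs A = A + B, where a_loc is the location (L-value) of A
   and b is the contents (R-value) of B. */
void add_into(int *a_loc, int b)
{
    int a_val = *a_loc;    /* pick up the contents of location A: R-value of A */
    int sum = a_val + b;   /* add the contents of location B: R-value of B */
    *a_loc = sum;          /* store the result at address A: L-value of A */
}
```

After `int A = 2, B = 3; add_into(&A, B);` the object A holds 5 and B is unchanged.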
3.2 Initialization
Uninitialized data object - a data object that has been created but has no value assigned,
i.e. only the allocation of a block of storage has been performed. Initialization can be
implicit or explicit.
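In C, for example, both kinds of initialization occur. The sketch below uses hypothetical names (counter, limit, sum_first); it relies on the C rule that objects with static storage duration are zero-initialized, whereas automatic (local) variables are not.

```c
int counter;       /* implicit initialization: static storage starts at 0 */
int limit = 10;    /* explicit initialization */

int sum_first(int n)
{
    /* A local variable is only allocated, not initialized, unless an
       initializer is written; without "= 0" its value is indeterminate. */
    int total = 0;
    for (int i = 1; i <= n; i++)
        total += i;
    return total;
}
```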

4. Structured data types
A data structure is a data object that contains other data objects as its elements or
components.
4.1 Specifications
i. Number of components
Fixed size - Arrays
Variable size - stacks, lists. Pointers are used to link components.
ii. Type of each component
Homogeneous - all components are of the same type
Heterogeneous - components are of different types
iii. Selection mechanism to identify components - index, pointer
Two-step process:
Referencing the structure
Selection of a particular component
iv. Maximum number of components


v. Organization of the components:
1. simple linear sequence
2. multidimensional structures:
a. separate types (Fortran)
b. vector of vectors (C++)
Operations on data structures
vi. Component selection operations
Sequential
Random
vii. Insertion/deletion of components
viii. Whole-data structure operations
Creation/destruction of data structures
4.2 Implementation of data structure types
Storage representation
Includes:
a. storage for the components
b. optional descriptor - to contain some or all of the attributes
Sequential representation: the data structure is stored in a single
contiguous block of storage, that includes both descriptor and components.
Used for fixed-size structures, homogeneous structures (arrays, character
strings)
Linked representation: the data structure is stored in several
noncontiguous blocks of storage, linked together through pointers. Used
for variable-size structures (trees, lists)


Stacks, queues and lists can be represented in either way. Linked
representation is more flexible and allows truly variable size; however, it
has to be simulated in software.
Implementation of operations on data structures
Component selection in sequential representation: Base address plus offset
calculation. Add component size to current location to move to next component.
Component selection in linked representation: Move from address location to address
location following the chain of pointers.
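Both selection mechanisms can be sketched in C. The names select_sequential, select_linked and struct node are hypothetical; the explicit byte arithmetic in the first function only spells out what the indexing expression base[index] already does.

```c
#include <stddef.h>

/* Sequential representation: address = base address + index * component size. */
int select_sequential(const int *base, size_t index)
{
    const char *addr = (const char *)base + index * sizeof(int);
    return *(const int *)addr;   /* equivalent to base[index] */
}

/* Linked representation: each component stores a pointer to the next one. */
struct node {
    int value;
    struct node *next;
};

/* Follows `index` links from the head of the chain. Sets *ok to 0 and
   returns 0 if the chain ends before the requested component. */
int select_linked(const struct node *head, size_t index, int *ok)
{
    while (head != NULL && index > 0) {
        head = head->next;
        index--;
    }
    *ok = (head != NULL);
    return head != NULL ? head->value : 0;
}
```

Sequential selection costs one address computation regardless of the index, whereas linked selection takes a number of steps proportional to the index.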
Storage management
Access paths to a structured data object - to ensure access to the object for its processing.
Created using a name or a pointer.
Two central problems:
Garbage - the data object is bound but the access path is destroyed.
The memory cannot be unbound.
Dangling references - the data object is destroyed, but the access path still exists.
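In C, where storage is managed by hand with malloc and free, both problems arise easily. The following is a sketch of one common way to avoid them; the helper names replace_buffer and release are hypothetical.

```c
#include <stdlib.h>

/* Garbage arises when the only pointer to a block is overwritten while the
   block is still allocated. Freeing the old block first avoids this. */
int *replace_buffer(int *old, size_t n)
{
    free(old);                        /* without this call, the old block becomes garbage */
    return malloc(n * sizeof(int));   /* may return NULL if allocation fails */
}

/* A dangling reference arises when the block is freed but the pointer keeps
   the old address. Resetting the pointer removes the access path as well. */
void release(int **p)
{
    free(*p);
    *p = NULL;
}
```

Setting the pointer to NULL only helps for this one access path; any other copy of the address made earlier would still dangle.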
5. Declarations and type checking for data structures
What is to be checked:
Existence of a selected component
Type of a selected component
5.1 Vectors and arrays
A vector - one dimensional array


A matrix - two dimensional array
Multidimensional arrays
A slice - a substructure in an array that is also an array, e.g. a column in a matrix.
Implementation of array operations:
a. Access - can be implemented efficiently if the length of the components of the array is
known at compilation time. The address of each selected element can be computed using
an arithmetic expression.
b. Whole array operations, e.g. copying an array - may require much memory.
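The arithmetic expression used for access can be sketched for a two-dimensional array stored row by row in one contiguous block (the layout C uses; other languages, such as Fortran, store arrays column by column). The function names are hypothetical.

```c
#include <stddef.h>

/* Byte offset of element (i, j) from the base address, for a matrix with
   `cols` columns whose elements are `elem_size` bytes long. */
size_t byte_offset(size_t cols, size_t elem_size, size_t i, size_t j)
{
    return (i * cols + j) * elem_size;
}

/* Selects element (i, j) of an int matrix stored row by row in a flat block. */
int select_element(const int *base, size_t cols, size_t i, size_t j)
{
    return base[i * cols + j];
}
```

When cols and the element size are known at compilation time, the multiplications involve constants and the compiler can generate very short code for each access.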
Associative arrays
Instead of using an integer index, elements are selected by a key value, that is a part of the
element. Usually the elements are sorted by the key and binary search is performed to find an
element in the array.
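A minimal sketch of such a lookup in C, assuming string keys and a table already sorted by key. The standard library function bsearch performs the binary search; struct entry, compare_entries and lookup are hypothetical names.

```c
#include <stdlib.h>
#include <string.h>

/* The key is stored as part of the element. */
struct entry {
    const char *key;
    int value;
};

/* Orders two entries by their keys. */
static int compare_entries(const void *a, const void *b)
{
    const struct entry *ea = a;
    const struct entry *eb = b;
    return strcmp(ea->key, eb->key);
}

/* Returns the entry with the given key, or NULL if there is none.
   The table must be sorted by key in strcmp order. */
const struct entry *lookup(const struct entry *table, size_t n, const char *key)
{
    struct entry probe = { key, 0 };
    return bsearch(&probe, table, n, sizeof table[0], compare_entries);
}
```

Many implementations of associative arrays use hashing instead of a sorted table; the sorted variant is shown here because it is the one described above.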
5.2 Records
A record is a data structure composed of a fixed number of components, possibly of different types.
The components may be heterogeneous, and they are named with symbolic names.
Specification of attributes of a record:
Number of components
Data type of each component
Selector used to name each component.
Implementation:
Storage: single sequential block of memory where the components are stored
sequentially.


Selection: provided the type of each component is known, the location can be computed
at translation time.
Note on efficiency of storage representation:
For some data types storage must begin on specific memory boundaries (required by the
hardware organization). For example, integers must be allocated at word boundaries (e.g.
addresses that are multiples of 4). When the structure of a record is designed, this fact has to be
taken into consideration. Otherwise the actual memory needed might be more than the sum of the
lengths of the components in the record. Here is an example:

struct employee {
    char Division;    /* occupies 1 byte */
    int IdNumber;     /* must start on a word boundary */
};

The first member occupies one byte only. The next three bytes remain unused, so that the
second member is allocated at a word boundary.
Careless design may result in doubling the memory requirements.
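The effect can be inspected with sizeof and offsetof. In the sketch below, struct employee is the record from the example, while struct careless and struct compact are hypothetical records holding the same three members in different orders. Exact sizes depend on the platform and compiler, so the helper functions return them rather than assuming particular values.

```c
#include <stddef.h>

struct employee {
    char Division;
    int IdNumber;
};

/* Same members, different order (hypothetical records). */
struct careless { char a; int x; char b; };   /* padding after a and after b */
struct compact  { int x; char a; char b; };   /* padding only at the end */

/* Bytes of padding inserted before IdNumber. */
size_t employee_gap(void)
{
    return offsetof(struct employee, IdNumber) - sizeof(char);
}

/* Bytes saved by ordering the members from largest to smallest. */
size_t bytes_saved(void)
{
    return sizeof(struct careless) - sizeof(struct compact);
}
```

On a typical platform with 4-byte integers aligned to 4-byte boundaries, struct employee occupies 8 bytes, struct careless 12 and struct compact 8, so employee_gap() returns 3 and bytes_saved() returns 4.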
5.3 Other structured data objects
Records and arrays with structured components: a record may have a component that is an
array, an array may be built out of components that are records.
Lists and sets: lists are usually considered to represent an ordered sequence of elements,
sets - to represent unordered collection of elements.





Executable data objects
In most languages, programs and data objects are separate structures (Ada, C, C++).
Other languages however do not distinguish between programs and data - e.g. PROLOG. Data
structures are considered to be a special type of program statements and all are treated in the
same way.