From Modern Database Management

11th Ed Hoffer, Ramesh, Topi


Conceptual, Logical, and Physical Data Model

Conceptual Data Model (CDM)
- Includes high-level data constructs
- Uses non-technical names, so that executives and managers at all levels can understand the data basis of the Architectural Description
- Represents data from the viewpoint of the organization, independent of any technology

Logical Data Model (LDM)
- Includes entities (tables), attributes (columns/fields), and relationships (keys)
- Uses business names for entities and attributes
- Is independent of the technology platform; describes the data in terms of the data management technology that will be used

Physical Data Model (PDM)
- Includes tables, columns, keys, data types, validation rules, database triggers, and access constraints
- Uses more defined and less generic names for tables and columns, such as abbreviated column names, limited by the database management system (DBMS) and any company-defined standards
- Requires knowledge of the specific DBMS that will be used to implement the database

Physical Database Design
- Definition: translates the logical description of data into technical specifications for storing and retrieving data

- Goal: create a design for storing data that will provide data processing efficiency and ensure database integrity, security, and recoverability

- Output: technical specifications for use in the implementation phase of information systems construction

Key Decisions in Physical DB Design
1. Data type for each attribute (storage format) - must maximize data integrity and minimize storage space

2. Grouping of attributes - grouping of attributes in a logical data model is not always the optimal grouping in a physical design

3. Choice of file organization - arranging similarly structured records in secondary memory so that records can be stored, retrieved, and updated rapidly

4. Selection of indexes and overall database architecture for efficiency of data retrieval

5. Proper handling of queries by the DBMS so that file organization and indexes will be optimized

Key Decisions in Physical DB Design:

1. Choosing Data Types

Choose a data type (storage format) for each attribute that will:

- Represent all possible values
  Alphanumeric as needed, e.g. a driver's license number

- Improve data integrity
  Default value
  Range control
  Null-value control
  Referential integrity

- Support all data manipulations
  Numeric for numerical calculations (width should be enough)
  Date for date computations
  Character for parsing text

- Minimize storage space (width should be just enough)
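
The four integrity controls listed above can be sketched with Python's built-in sqlite3 module. This is only an illustration under assumptions: the employee and department tables, their columns, and the 0 to 60 hours range are hypothetical and do not come from the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled

# Hypothetical tables, used only to illustrate the integrity controls.
conn.executescript("""
CREATE TABLE department (
    dept_id INTEGER PRIMARY KEY
);
CREATE TABLE employee (
    emp_id     INTEGER PRIMARY KEY,
    license_no TEXT NOT NULL,                          -- alphanumeric; null-value control
    status     TEXT NOT NULL DEFAULT 'active',         -- default value
    hours      INTEGER CHECK (hours BETWEEN 0 AND 60), -- range control
    dept_id    INTEGER REFERENCES department(dept_id)  -- referential integrity
);
INSERT INTO department VALUES (10);
""")

INSERT_SQL = ("INSERT INTO employee (emp_id, license_no, hours, dept_id) "
              "VALUES (?, ?, ?, ?)")

def try_insert(row):
    """Return 'ok' if the row is accepted, 'rejected' if a constraint fails."""
    try:
        conn.execute(INSERT_SQL, row)
        return "ok"
    except sqlite3.IntegrityError:
        return "rejected"

results = [
    try_insert((1, "N01-12-345678", 40, 10)),  # valid row
    try_insert((2, "N02-34-567890", 99, 10)),  # hours outside the allowed range
    try_insert((3, None, 40, 10)),             # null in a NOT NULL column
    try_insert((4, "N04-56-789012", 40, 99)),  # department 99 does not exist
]
default_status = conn.execute(
    "SELECT status FROM employee WHERE emp_id = 1").fetchone()[0]
print(results, default_status)
```

Only the first row is accepted, and its status column is filled in by the default value.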
Key Decisions in Physical DB Design:

2. Grouping of attributes - grouping of attributes in a logical data model is not always the optimal grouping in a physical design

Denormalization
Normalization Definitions
Normalization involves decomposing relations to produce smaller, well-structured relations

A formal process for deciding which attributes should be grouped together in a relation so that all anomalies are removed

More specifically, if a relation is normalized (well-formed), rows can be inserted, deleted, or modified without creating anomalies
Normalization Goals

Minimize data redundancy, thereby conserving space and avoiding anomalies

Make it easier to maintain data

Disadvantages of Denormalization

- Redundant copies of the same data are often not updated in a synchronized way

- Extra programming is required to ensure that all copies of exactly the same business data are updated together

- More storage space is needed for raw data and for DB overhead (e.g. indexes)
Denormalization

- Mechanism used to improve the efficiency of data processing

- Gives quick access to stored data

- Motivation for denormalization: normalization often creates many tables, and joining tables slows DB processing
Opportunities for Denormalization

1. Two entities with a one-to-one relationship - even if one of the entities is an optional participant
2. A many-to-many relationship (associative entity) with non-key attributes
3. Reference data - exists on the one side of a one-to-many relationship, and this entity participates in no other database relationship
Opportunities for Denormalization

1. Two entities with a one-to-one relationship - even if one of the entities is an optional participant

Normalized relations:
student(studentid, campusaddress)
application(applicationid, applicationdate, qualification, studentid)

Denormalized relation:
student(studentid, campusaddress, applicationdate, qualification)
Opportunities for Denormalization

2. A many-to-many relationship (associative entity) with non-key attributes

Normalized relations:
vendor(vendorid, address, contact name)
pricequote(vendorid, itemid, price)
item(itemid, description)

Denormalized relations:
vendor(vendorid, address, contact name)
itemquote(vendorid, itemid, price, description)
Opportunities for Denormalization

3. Reference data - exists on the one side of a one-to-many relationship, and this entity participates in no other database relationship

Advantageous when there are few instances of the entity on the many side for each entity on the one side

Normalized relations:

ITEM
itemid   itemdesc         instrid
A1       laptop           1
A2       tablet           1
A3       computer table   2
A4       cabinet          2

STORAGE
instrid  wherestore      containertype
1        ortigas depot   van
2        pasig depot     truck

Denormalized relation:

ITEM
itemid   itemdesc         wherestore      containertype
A1       laptop           ortigas depot   van
A2       tablet           ortigas depot   van
A3       computer table   pasig depot     truck
A4       cabinet          pasig depot     truck
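
The reference-data case above can be sketched with Python's built-in sqlite3 module. The table and column names follow the slide tables (with the container-type column spelled `containertype`); the table name `item_denorm` and the value 'lorry' are assumptions made only for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Normalized design: storage details are kept once, in STORAGE.
CREATE TABLE storage (
    instrid       INTEGER PRIMARY KEY,
    wherestore    TEXT,
    containertype TEXT
);
CREATE TABLE item (
    itemid   TEXT PRIMARY KEY,
    itemdesc TEXT,
    instrid  INTEGER REFERENCES storage(instrid)
);
INSERT INTO storage VALUES (1, 'ortigas depot', 'van'),
                           (2, 'pasig depot', 'truck');
INSERT INTO item VALUES ('A1', 'laptop', 1), ('A2', 'tablet', 1),
                        ('A3', 'computer table', 2), ('A4', 'cabinet', 2);

-- Denormalized design: the reference data is copied into every item row.
CREATE TABLE item_denorm AS
SELECT i.itemid, i.itemdesc, s.wherestore, s.containertype
FROM item AS i
JOIN storage AS s ON s.instrid = i.instrid;
""")

rows = conn.execute("SELECT * FROM item_denorm ORDER BY itemid").fetchall()
for row in rows:
    print(row)

# Reads no longer need a join, but one business change now touches
# every redundant copy instead of a single STORAGE row.
cur = conn.execute(
    "UPDATE item_denorm SET containertype = 'lorry' "
    "WHERE wherestore = 'pasig depot'")
copies_touched = cur.rowcount
print(copies_touched)
```

The update at the end shows the trade-off named under the disadvantages of denormalization: the same change has to be applied to each redundant copy.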
Other Forms of Denormalization

- Partitioning - creation of more tables (horizontal, vertical, or record partitioning)

- Data replication
Other Forms of Denormalization

Horizontal Partitioning

- Places different rows into separate tables based on a common value

Three (3) forms of Horizontal Partitioning

Range - Each partition is defined by a range of values (lower and upper key value limits)
Example - Partition by range of dates

Hash - Data are evenly spread across partitions independent of any partition key value
Example - for a table of 1M records to be divided into 5 partitions, each partition will consist of about 200k records

List - Partitions are defined based on a predefined list of values
Example - Table partitioned based on region. All records of regions I, II, III will be grouped, while IV, V, VII will be grouped, etc.
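
The three forms of horizontal partitioning can be sketched as three functions that decide which partition a row belongs to. This is a minimal illustration, not how any particular DBMS implements partitioning: the partition names, the date boundaries, and the use of a simple modulo as the hash function are all assumptions.

```python
from collections import Counter
from datetime import date

def range_partition(order_date: date) -> str:
    """Range: each partition covers a lower and upper limit of the key."""
    if order_date < date(2023, 1, 1):
        return "p_before_2023"
    if order_date < date(2024, 1, 1):
        return "p_2023"
    return "p_2024_onward"

def hash_partition(record_id: int, n_partitions: int = 5) -> int:
    """Hash: a hash of the key (here a plain modulo) spreads rows evenly."""
    return record_id % n_partitions

# List: each partition is defined by an explicit list of values.
REGION_LISTS = {
    "p_group_1": {"I", "II", "III"},
    "p_group_2": {"IV", "V", "VII"},
}

def list_partition(region: str) -> str:
    for partition, regions in REGION_LISTS.items():
        if region in regions:
            return partition
    return "p_other"

# 1M sequential record ids spread over 5 hash partitions.
counts = Counter(hash_partition(i) for i in range(1_000_000))
print(range_partition(date(2023, 6, 1)))  # p_2023
print(list_partition("II"))               # p_group_1
print(sorted(counts.values()))            # five partitions of 200000 each
```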
Other Forms of Denormalization

Data Replication - the same data are stored in multiple places for improved data access speed.
Other Forms of Denormalization

Advantages of Partitioning
1. Efficiency
2. Local optimization
3. Security
4. Recovery and uptime
5. Load balancing

Disadvantages of Partitioning
1. Inconsistent access speed
2. Complexity
3. Extra space and update time
Other Forms of Denormalization

Vertical Partitioning - distributes the columns of a logical relation into separate tables

Record Partitioning - a combination of horizontal and vertical partitioning; common for a database whose files are distributed across multiple computers.
Key Decisions in Physical DB Design:

3. Choice of file organization - arranging similarly structured records in secondary memory so that records can be stored, retrieved, and updated rapidly
Key Decisions in Physical DB Design:

3. File Organizations

Sequential

Indexed

Hashed
Key Decisions in Physical DB Design:

3. File Organizations

Sequential - Data are stored in sequence
Not used in databases except for backup purposes

File Organizations - Sequential

Data Access - Scan the file from the beginning until the record is found
Data Insertion, Update - Requires rewriting the file
Data Deletion - Records are marked for deletion; requires reorganizing the file
Key Decisions in Physical DB Design:

3. File Organizations

Indexed - An index is created and used to locate records
Extensively used with relational DBMSs

File Organizations - Indexed

Data Access - Random retrieval is moderately fast; records are searched by index key
Data Insertion - Easy, requires maintenance of indexes
Data Update - Easy, requires maintenance of indexes
Data Deletion - Easy, requires maintenance of indexes
Key Decisions in Physical DB Design:

3. File Organizations

Hashed - Uses a hashing algorithm to create a record address

File Organizations - Hashed

Data Access - Random retrieval is very fast
Data Insertion - Very easy
Data Update - Very easy
Data Deletion - Very easy

File Organizations - In terms of storage space

Sequential - No wasted space
Indexed - No wasted space, but requires extra space for the index
Hashed - Extra space may be needed to allow for addition and deletion of records after the initial set of records is loaded
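
The access paths of the three file organizations can be compared with a small in-memory sketch. It only models the lookup logic, not disk storage; the record keys, the bucket count of 8, and the modulo hash function are assumptions chosen for illustration.

```python
from bisect import bisect_left

# The data file: records in arrival order, as (key, payload) pairs.
records = [(17, "rec-17"), (4, "rec-4"), (42, "rec-42"),
           (8, "rec-8"), (23, "rec-23"), (15, "rec-15")]

# Sequential: records stored in key sequence, found by scanning from the start.
sequential_file = sorted(records)

def sequential_find(key):
    probes = 0
    for k, payload in sequential_file:
        probes += 1
        if k == key:
            return payload, probes
    return None, probes

# Indexed: a separate sorted index maps each key to a position in the data file.
index = sorted((k, pos) for pos, (k, _) in enumerate(records))
index_keys = [k for k, _ in index]

def indexed_find(key):
    i = bisect_left(index_keys, key)  # binary search on the index
    if i < len(index_keys) and index_keys[i] == key:
        return records[index[i][1]][1]
    return None

# Hashed: a hashing algorithm turns the key into a bucket address.
N_BUCKETS = 8  # spare buckets leave room for later insertions
buckets = [[] for _ in range(N_BUCKETS)]
for k, payload in records:
    buckets[k % N_BUCKETS].append((k, payload))

def hashed_find(key):
    for k, payload in buckets[key % N_BUCKETS]:
        if k == key:
            return payload
    return None

print(sequential_find(23))  # ('rec-23', 5): five records read before the match
print(indexed_find(23))     # rec-23
print(hashed_find(23))      # rec-23
```

The extra `index` list and the partly empty `buckets` correspond to the storage costs noted above for the indexed and hashed organizations.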
Key Decisions in Physical DB Design:
4. Selection of indexes and overall database architecture for efficiency of data retrieval

Selecting an index is one of the most important decisions in DB design
Key Decisions in Physical DB Design:

When to Use Indexes

o Indexes should be used generously for data retrieval purposes, e.g. data warehouse applications

o Indexes should be used judiciously for databases with heavy transaction processing requirements, since every insert or update must also maintain the indexes

Indexes make data retrieval and transactions faster if they are properly implemented.

Over-indexing may degrade performance, especially when inserting or updating records.
Key Decisions in Physical DB Design:
Types of Indexes:

Clustered

Non-Clustered

Clustered indexes sort and store the data rows in the table based on their key values.

Only one clustered index is allowed per table, as the data rows can be sorted in only one order.

A non-clustered index contains the non-clustered index key values, and each key value entry has a pointer to the data row that contains the key value.

The data in the index is stored in order based on the key value, while the data rows of the underlying table are not sorted by that key.

You can create multiple non-clustered indexes on a table.

https://technet.microsoft.com/en-us/library/ms190457(v=sql.110).aspx
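
The difference between the two index types can be pictured with a small in-memory model. This is a conceptual sketch only, not how SQL Server stores pages: the sample rows and the list-position "pointer" are assumptions for illustration.

```python
from bisect import bisect_left

# Hypothetical rows of (id, last_name), in arrival order.
rows = [(3, "Cruz"), (1, "Santos"), (2, "Reyes")]

# Clustered index on id: the data rows themselves are stored in key order,
# which is why a table can have only one.
clustered = sorted(rows)

# Non-clustered index on last_name: a separate sorted structure whose entries
# hold the key value plus a pointer (here, a list position) to the data row.
nonclustered = sorted((name, pos) for pos, (_id, name) in enumerate(clustered))
nonclustered_keys = [name for name, _ in nonclustered]

def find_by_last_name(name):
    i = bisect_left(nonclustered_keys, name)  # search the index, not the table
    if i < len(nonclustered_keys) and nonclustered_keys[i] == name:
        pointer = nonclustered[i][1]
        return clustered[pointer]             # follow the pointer to the row
    return None

print(clustered)                   # [(1, 'Santos'), (2, 'Reyes'), (3, 'Cruz')]
print(nonclustered)                # [('Cruz', 2), ('Reyes', 1), ('Santos', 0)]
print(find_by_last_name("Reyes"))  # (2, 'Reyes')
```

The index entries are in last-name order while the data rows stay in id order; more non-clustered structures could be added the same way.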
Key Decisions in Physical DB Design:
Index Use Tips
1. Indexes are most useful in large tables

2. Specify a unique index for the primary key of each table

3. Indexes are most useful for columns that frequently appear in WHERE clauses

4. Use an index for attributes referenced in ORDER BY

5. Use an index if there is significant variety in the values of the attribute

6. Consider creating surrogate keys for index fields with long values

7. Check your DBMS for the limit, if any, on the number of indexes allowable per table

8. Be careful when indexing attributes that have null values, as in many DBMSs rows with null values cannot be referenced in the index
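
Tip 3 can be checked on a small scale with Python's built-in sqlite3 module, which reports the access path it chooses through EXPLAIN QUERY PLAN. The orders table, its columns, and the index name are hypothetical; the exact wording of the plan differs between SQLite versions, so the sketch only prints it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table; in SQLite an INTEGER PRIMARY KEY is indexed automatically.
conn.execute("""
    CREATE TABLE orders (
        order_id    INTEGER PRIMARY KEY,
        customer_id INTEGER,
        amount      REAL
    )""")
conn.executemany(
    "INSERT INTO orders (customer_id, amount) VALUES (?, ?)",
    [(i % 100, i * 1.5) for i in range(1000)])

QUERY = "SELECT * FROM orders WHERE customer_id = ?"

def plan(sql, params=()):
    """Return the detail text of each EXPLAIN QUERY PLAN row, joined together."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params)
    return " | ".join(row[-1] for row in rows)

plan_before = plan(QUERY, (42,))  # no index on customer_id: full table scan
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
plan_after = plan(QUERY, (42,))   # the WHERE column is now indexed

print("before:", plan_before)
print("after: ", plan_after)
```

Before the index exists the plan is a scan of the whole table; afterwards it is a search that uses `idx_orders_customer`.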
Key Decisions in Physical DB Design:

5. Proper handling of queries by the DBMS so that file organization and indexes will be optimized
Key Decisions in Physical DB Design:
DBMS

Parallel Query Processing - Use of multiple processors in DB servers

Query Optimization - The optimizer determines which indexes to use for access, how to perform join operations, etc.
Key Decisions in Physical DB Design:
Guidelines for Better Query Design

- Write simple queries
- Retrieve only the data you need
- Use compatible data types for fields and literals in queries
- Break complex queries into multiple simple parts
- Don't nest one query inside another query
- Create temporary tables for groups of queries
- Understand how indexes are used in query processing
- In general, queries with equality criteria are more efficient
- Don't have the DBMS sort without an index
- If possible, avoid using self-joins
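
Two of the guidelines above, retrieving only the data you need and not sorting without an index, can be observed with Python's built-in sqlite3 module. The customer table, its columns, and the index name are hypothetical, and the plan text is SQLite-specific, so this is a sketch rather than a statement about other DBMSs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table with one wide column that the query does not need.
conn.execute("""
    CREATE TABLE customer (
        customer_id INTEGER PRIMARY KEY,
        last_name   TEXT,
        notes       TEXT
    )""")
conn.executemany(
    "INSERT INTO customer (last_name, notes) VALUES (?, ?)",
    [(f"name{i:04d}", "x" * 50) for i in range(500)])

# Retrieve only the column that is needed, sorted by that column.
QUERY = "SELECT last_name FROM customer ORDER BY last_name"

def plan(sql):
    """Return the detail text of each EXPLAIN QUERY PLAN row."""
    return [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]

plan_without_index = plan(QUERY)  # the DBMS must build a temporary sort structure
conn.execute("CREATE INDEX idx_customer_last_name ON customer (last_name)")
plan_with_index = plan(QUERY)     # rows are read from the index, already in order

print(plan_without_index)
print(plan_with_index)
```

Without the index, SQLite reports a temporary B-tree for the ORDER BY; with it, the sort step disappears because the index already holds the names in order.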
Thank you!
