Vous êtes sur la page 1sur 40

DATABASE OBJECT

By
Team Teaching
Agenda

■ Schema
■ Database Objects
■ Alias
■ Table
■ Data Organizations
■ Range-clustered Table (RCT)
■ Multidimensional Clustering (MDC) Tables
Schemas

■ A schema is a collection of named objects;


– it provides a way to group those objects logically.
– A schema is also a name qualifier;
– it provides a way to use the same natural name for several
objects, and to prevent ambiguous references to those
objects.
– For example, the schema names 'INTERNAL' and
'EXTERNAL' make it easy to distinguish two different SALES
tables (INTERNAL.SALES, EXTERNAL.SALES).
■ Schemas also enable multiple applications to store data in a
single database without encountering namespace collisions.
Schemas, Cont’d…

■ A schema can contain tables, views, nicknames, triggers,


functions, packages, and other objects. A schema is itself a
database object.
■ It is explicitly created using the CREATE SCHEMA statement,
with the current user or a specified authorization ID recorded
as the schema owner.
■ It can also be implicitly created when another object is
created, if the user has IMPLICIT_SCHEMA authority.
Schemas, Cont’d…

■ There are some restrictions and recommendations that you must be aware of
when naming schemas.
– User-defined types (UDTs) cannot have schema names longer than the
schema length listed in SQL and XML limits.
– The following schema names are reserved words and must not be used:
SYSCAT, SYSFUN, SYSIBM, SYSSTAT, SYSPROC.
– To avoid potential problems upgrading databases in the future, do not use
schema names that begin with SYS. The database manager will not allow
you to create modules, procedures, triggers, user-defined types or user-
defined functions using a schema name beginning with SYS.
– It is recommended that you not use SESSION as a schema name.
Declared temporary tables must be qualified by SESSION. It is therefore
possible to have an application declare a temporary table with a name
identical to that of a persistent table, in which case the application logic
can become overly complicated. Avoid the use of the schema SESSION,
except when dealing with declared temporary tables.
Database objects

■ Physical database design consists of defining database objects and


their relationships.
■ You can create the following database objects in a DBMS:
– Aliases
– Tables
– Constraints
– Indexes
– Triggers
– Sequences
– Views
Aliases

■ An alias is an alternative name for an object such as a


module, table or another alias.
– It can be used to reference an object wherever that
object can be referenced directly.
■ An alias cannot be used in all contexts;
■ for example, it cannot be used in the check condition of a
check constraint. An alias cannot reference a declared
temporary table but it can reference a created temporary
table.
Aliases, Cont’d…

■ To create an alias using the command line, enter:


– CREATE ALIAS alias_name FOR table_name
■ The following SQL statement creates an alias WORKERS for the EMPLOYEE
table:
– CREATE ALIAS WORKERS FOR EMPLOYEE
■ The alias is replaced at statement compilation time by the table or view name.
■ If the alias or alias chain cannot be resolved to a table or view name, an error
results.
■ For example, if WORKERS is an alias for EMPLOYEE, then at compilation time:
– SELECT * FROM WORKERS
■ Becomes in effect
– SELECT * FROM EMPLOYEE
Tables

■ Tables are logical structures maintained by the database


manager.
■ Tables are made up of columns and rows.
■ At the intersection of every column and row is a specific data
item called a value.
■ A column is a set of values of the same type or one of its
subtypes.
■ A row is a sequence of values arranged so that the nth value
is a value of the nth column of the table.
Types of tables

■ Base tables
– These types of tables hold persistent data. There are different
kinds of base tables, including
■ Regular tables
– Regular tables with indexes are the "general purpose" table
choice.
■ Multidimensional clustering (MDC) tables
■ Insert time clustering (ITC) tables
■ Range-clustered tables (RCT)
■ Partitioned tables
■ Temporal tables
■ Temporary tables
■ Materialized query tables
Data types
Table partitioning and data
organization schemes

■ Table partitioning is a data organization scheme in which


table data is divided across multiple data partitions according
to values in one or more partitioning columns of the table.
■ Data from a given table is partitioned into multiple storage
objects, which can be in different table spaces.
Data organization schemes

■ With the introduction of table partitioning, a DB2® database offers a


three-level data organization scheme.
■ There are three clauses of the CREATE TABLE statement that include an
algorithm to indicate how the data is to be organized.
■ The following three clauses demonstrate the levels of data organization
that can be used together in any combination:
– DISTRIBUTE BY to spread data evenly across database partitions
(to enable intraquery parallelism and to balance the load across
each database partition) (database partitioning)
– PARTITION BY to group rows with similar values of a single
dimension in the same data partition (table partitioning)
– ORGANIZE BY to group rows with similar values on multiple
dimensions in the same table extent (multidimensional clustering)
or to group rows according to the time of the insert operation
(insert time clustering table).
Data organization schemes, Cont’d…

Demonstrating the table partitioning


organization scheme where a table
representing monthly sales data is
partitioned into multiple data
partitions. The table also spans two
table spaces (ts1 and ts2).
Data organization schemes, Cont’d…

Demonstrating the complementary


organization schemes of database
partitioning and table partitioning. A
table representing monthly sales
data is partitioned into multiple data
partitions, spanning two table
spaces (ts1 and ts2) that are
distributed across multiple database
partitions (dbpart1, dbpart2,
dbpart3) of a database partition
group (dbgroup1).
Data organization schemes, Cont’d…

A representation of the database


partitioning, table partitioning and
multidimensional organization
schemes where data from table
SALES is not only distributed across
multiple database partitions,
partitioned across table spaces ts1
and ts2, but also groups rows with
similar values on both the date and
region dimensions.
A range-clustered table (RCT)

■ A range-clustered table (RCT) has a table layout scheme in which each


record in the table has a predetermined record ID (RID).
– The RID is an internal identifier that is used to locate a record in
the table.
■ An algorithm is used to associate a record key value with the location of
a specific table row.
■ This approach provides exceptionally fast access to specific table rows.
■ The algorithm does not use hashing, because hashing does not preserve
key-value order.
■ Preserving this order eliminates the need to reorganize the table data
over time.
A range-clustered table (RCT), Cont’d…

■ Each record key value in the table must be:


– Unique
– Not null
– An integer (SMALLINT, INTEGER, or BIGINT)
– Monotonically increasing
– Within a predetermined set of ranges based on each
column in the key. (If necessary, use the ALLOW
OVERFLOW option on the CREATE TABLE statement to
allow rows with key values that are outside of the defined
range of values.)
A range-clustered table (RCT), Cont’d…

■ In addition to direct access to specific table rows, there are


other advantages to using range-clustered tables.
– Less maintenance is required. A secondary structure,
such as a B+ tree index, which would need to be
updated after every insert, update, or delete operation,
does not exist.
– Less logging is required for RCTs, when compared to
similarly-sized regular tables with B+ tree indexes.
– Less buffer pool memory is required. There is no
additional memory required to store a secondary
structure, such as a B+ tree index
Guidelines for using
range-clustered tables

■ Because the process of creating a range-clustered table pre-


allocates the required disk space, that space must be available.
■ When defining the range of key values, the minimum value is
optional; if it is not specified, the default is 1. A negative
minimum value must be specified explicitly. For example:
– ORGANIZE BY KEY SEQUENCE (f1 STARTING FROM -100
ENDING AT -10)
■ You cannot create a regular index on the same key values that
are used to define the range-clustered table.
■ ALTER TABLE statement options that affect the physical structure
of the table are not allowed.
Scenarios: Range-clustered tables

■ Range-clustered tables can have single-column or multiple-


column keys, and can allow or disallow rows with key values
that are outside of the defined range of values.
■ This section contains scenarios that illustrate how such tables
can be created.
– Scenario 1: Creating a range-clustered table (overflow
allowed)
– Scenario 2: Creating a range-clustered table (overflow
not allowed)
Scenario 1:
Creating a range-clustered table
(overflow allowed)

■ The following example shows a range-clustered table that can be


used to retrieve information about a specific student.
■ Each student record contains the following information:
– School ID
– Program ID
– Student number
– Student ID
– Student first name
– Student last name
– Student grade point average (GPA)
Scenario 1:
Creating a range-clustered table
(overflow allowed)
CREATE TABLE students (
school_id INT NOT NULL,
program_id INT NOT NULL,
student_num INT NOT NULL,
student_id INT NOT NULL,
first_name CHAR(30),
last_name CHAR(30),
gpa FLOAT
)
ORGANIZE BY KEY SEQUENCE
(student_id STARTING FROM 1 ENDING AT 1000000)
ALLOW OVERFLOW;
Scenario 2:
Creating a range-clustered table
(overflow not allowed)

■ In the following example, a school board administers 200


schools, each having 20 classrooms with a capacity of 35
students per classroom.
■ This school board can accommodate a maximum of 140,000
students.
Multidimensional clustering
(MDC) tables
CREATE TABLE students (
school_id INT NOT NULL,
class_id INT NOT NULL,
student_num INT NOT NULL,
student_id INT NOT NULL,
first_name CHAR(30),
last_name CHAR(30),
gpa FLOAT
)
ORGANIZE BY KEY SEQUENCE
(school_id STARTING FROM 1 ENDING AT 200,
class_id STARTING FROM 1 ENDING AT 20,
student_num STARTING FROM 1 ENDING AT 35)
DISALLOW OVERFLOW;
Comparison of regular and MDC tables

• Regular tables
• have indexes that are
record-based.
• Any clustering of the
indexes is restricted to a
single dimension
• A regular table with a
clustering index. has two
record-based indexes defined
on it:
• A clustering index on
"Region“
MDC Block Index

■ MDC introduces indexes that are block-based. "Block indexes"


point to blocks or groups of records instead of to individual
records.
■ By physically organizing data in an MDC table into blocks
according to clustering values, and then accessing these
blocks using block indexes, MDC is able not only to address
all of the drawbacks of clustering indexes, but to provide
significant additional performance benefits
MDC Block Index
• The MDC table shown is
physically organized such that
records having the same
"Region" and "Year" values are
grouped together into separate
blocks, or extents.
• An extent is a set of contiguous
pages on disk, so these groups
of records are clustered on
physically contiguous data pages
How row indexes differ from
block indexes
CREATE MDC

CREATE TABLE "SATRIO"."EMPMDC" (


"EMPNO" INTEGER,
"DEPT" INTEGER,
"DIV" INTEGER
)
ORGANIZE BY ROW USING DIMENSIONS (
("DEPT"),
("DIV"))
DATA CAPTURE NONE
COMPRESS NO;
#Sesi2

■ Constraints
■ Indexes
■ Triggers
■ Sequences
CONSTRAINT
■ DB2 provides three types of constraints:
– Unique constraints, which are used to
ensure that values in a column are
unique.
– Referential integrity constraints, which
are used to define relationships
between tables and ensure that these
relationships remain valid.
– Table check constraints, which are used
verify that column data does not violate
rules defined for the column.
INDEXES

■ create an index:
– To ensure uniqueness of values in a
column or columns.
– To improve performance of queries
against the table. The DB2 optimizer
will use indexes to improve
performance when performing
queries, or to present results of a
query in the order of the index.
TRIGGERS

■ There are two types of triggers:


1. BEFORE triggers
– They are executed before any SQL
operation.
2. AFTER triggers
– They are executed after any SQL
operation.
SEQUENCE

■ Sequences are
similar to
identity columns
in that they both
generate unique
values.

CREATE SEQUENCE
ORDER_SEQ START
WITH 500
INCREMENT BY 1
MAXVALUE 1000
CYCLE CACHE 24
VIEW
■ In SQL, a VIEW is a virtual relation based on the result-
set of a SELECT statement.

■ A view contains rows and columns, just like a real table.


The fields in a view are fields from one or more real
tables in the database. In some cases, we can modify a
view and present the data as if the data were coming
from a single table.

■ Syntax:
CREATE VIEW view_name AS
SELECT column_name(s)
FROM table_name
WHERE condition
DATABASE OBJECT#2

■ View
■ Partitioned Table
■ History Table + Time Travelling Query
■ Temporary Table
■ Materialized Query Table (MQT)

Vous aimerez peut-être aussi