SQL Exam

Nov 14, 2014

5 SQL: The Relational Language

5.1 Introduction
5.2 Tabular Variables in SQL
    5.2.1 Creation of Tables
5.3 Referential Integrity in SQL
5.4 Basic Data Types
    5.4.1 String Domains
    5.4.2 Numeric Domains
    5.4.3 Special Domains
    5.4.4 Basic Domains Supported by ORACLE
5.5 SELECT Phrases
5.6 The WHERE Option
5.7 Union, Intersection, and Difference in SQL
5.8 Table Product in SQL
5.9 Join in SQL
5.10 Sets and subqueries
5.11 Parametrized subqueries
5.12 Subqueries and division
5.13 Relational Completeness of SQL
5.14 Scalar Functions of SQL
    5.14.1 Numerical Functions
    5.14.2 String Functions
    5.14.3 Date functions
5.15 Aggregate Functions in SQL
5.16 Sorting Results
5.17 The Group-by Option
    5.17.1 The decode and case Functions
    5.17.2 The rollup and cube Extensions of group by
5.18 Analytical Capabilities of SQL Plus
    5.18.1 Ranking Functions
    5.18.2 Top-n Queries
    5.18.3 Windowing functions in SQL Plus
5.19 Statistics in SQL
    5.19.1 Variance and Correlation
    5.19.2 Linear Regression
5.20 Graphs and SQL in SQL Plus
5.21 Updates
5.22 Access Rights
5.23 Views in SQL
5.24 Accessing metadata in SQL Plus
5.25 Exercises
5.26 Bibliographical Comments

5.1 Introduction

SQL is an acronym for Structured Query Language and is the name of the most

important tool for defining and manipulating relational databases. The development of SQL began in the mid-1970s at the IBM San Jose Research Laboratory.

The success of an experimental IBM database system (known as System R) that

incorporated SQL compelled a number of software manufacturers to join IBM

in developing relational database systems that incorporated SQL. In 1982, the

American National Standards Institute (ANSI) initiated the development of a

standard for a query language for relational database systems and opted for SQL

as its prototype. The resulting ANSI standard, issued in 1986, was adopted as

an International Standard by the International Organization for Standardization

(ISO) in 1987.

In the late 1980s, embedded SQL was standardized by ANSI, and work on

expanding SQL continues. A much extended version of the original standard,

known as SQL92, was adopted by ISO/IEC at the end of 1992. To reflect current trends in the database field towards object-relational technology, a new

standard ISO/IEC 9075-1, known as SQL99, was published in July 1999. As

we shall see, SQL99 is a superset of SQL92. New features incorporated by this

standard include object-relational extensions (user-defined data types, reference

types, collections, large object support, table hierarchies), active database features (triggers), stored procedures and functions, on-line analytic processing

extensions, etc. More recently, in 2003, a new standard was issued. This new

edition of the standard includes a new chapter that deals with the interaction

between SQL and XML (which we discuss in Chapter 10), corrections to SQL99,

and several new features.

Our presentation concentrates initially on common SQL features, applicable

to a wide range of SQL implementations.

An SQL query need not specify how a problem is to be solved nor how data should be
accessed by the computing system; instead, the query states what data are sought.

This leaves the user free to focus on the logic of the query. Because the

DBMS makes use of its internal knowledge, in most cases, the DBMS generates

retrieval procedures that are faster than equivalent retrieval procedures built

directly by the user.

The SQL language consists of three components: the data definition language (DDL), the data manipulation language (DML), and the data control

language (DCL). The first component allows the user to define the structure of

the tables of the database. The second contains retrieval and update directives.

The last component allows the database administrator to define the access rights

to the database for various categories of users.

SQL syntax is format-free: tabs, carriage returns, and spaces can be included

anywhere a space occurs in the definition of an SQL construct. Also, case is

insignificant in table names, reserved words and keywords. However, case is

significant in character string literals.
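
The following sketch illustrates the two remarks above. It assumes the table INSTRUCTORS of the college database (created later in this chapter) and a hypothetical rank value 'Professor'; the first two statements differ only in layout and letter case and should be treated identically, while the third compares against a different string literal.

```sql
-- Compact form.
select name, roomno from INSTRUCTORS where rank = 'Professor';

-- Same query: line breaks, extra spaces, and letter case of
-- keywords, table names, and column names do not matter.
SELECT Name,
       RoomNo
FROM   instructors
WHERE  RANK = 'Professor';

-- Not the same query: case is significant inside the string literal.
select name, roomno from INSTRUCTORS where rank = 'professor';
```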

5.2 Tabular Variables in SQL

In the relational model the content of a table
is a relation, that is, it is a set of tuples. To conform to the reality of databases

we need to define the content of a table as a sequence of tuples. Thus, a table

may contain several copies of the same tuple. If a table is allowed to contain

duplicates, then even if we know all components of a tuple, we may be unable

to identify the corresponding row in the table uniquely. As a consequence, not

every table has a key.

In this section we present a topic that we refer to informally as table creation. In reality, we create an object similar to a variable in a programming

language that we call a tabular variable. The values of a tabular variable are

tables and these values change in time. Tabular variables are created using the

construction create table.

Example 5.2.1 To create a tabular variable called PATRONS having the heading

name addr city zip telno date of birth

we write:

create table PATRONS (name varchar(35) not null,
                      addr varchar(50),
                      city varchar(25),
                      zip char(9),
                      telno char(12),
                      date_of_birth date);

The initial value of the tabular variable PATRONS is a
table whose content is the empty set of tuples:

                       PATRONS
name   addr   city   zip   telno   date of birth

After inserting a first row, the next value of the tabular variable PATRONS

is the table:

                                PATRONS
name          addr         city    zip    telno         date of birth
Ann Richards  56 Green Ln  Natick  02170  508-561-0987  02/15/78

A second insertion yields a new table as the value for the tabular variable:

                                  PATRONS
name          addr           city        zip    telno         date of birth
Ann Richards  56 Green Ln    Natick      02170  508-561-0987  02/15/78
Ron Scott     50 Cider Hill  Framingham  01160  608-663-0211  11/4/80

If the first patron moves to a new address, the first row is modified and the

tabular variable assumes a third value:

                                  PATRONS
name          addr           city        zip    telno         date of birth
Ann Richards  77 Lake St.    Milton      02186  617-364-0606  02/15/78
Ron Scott     50 Cider Hill  Framingham  01160  608-663-0211  11/4/80
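
The three successive values of PATRONS shown above could be obtained with the insert and update constructions discussed later in this chapter. The sketch below is only an illustration: it assumes that the two-digit years stand for 1978 and 1980, and it identifies the row to modify by the patron's name, since PATRONS was created without a key.

```sql
-- First insertion: PATRONS now has one row.
insert into PATRONS (name, addr, city, zip, telno, date_of_birth)
values ('Ann Richards', '56 Green Ln', 'Natick', '02170',
        '508-561-0987', date '1978-02-15');

-- Second insertion: PATRONS now has two rows.
insert into PATRONS (name, addr, city, zip, telno, date_of_birth)
values ('Ron Scott', '50 Cider Hill', 'Framingham', '01160',
        '608-663-0211', date '1980-11-04');

-- The first patron moves: the row is modified in place.
update PATRONS
set addr = '77 Lake St.', city = 'Milton', zip = '02186',
    telno = '617-364-0606'
where name = 'Ann Richards';
```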

The values that the tabular variable PATRONS may assume are the actual

tables that have the name and the heading specified at the creation of the

tabular variable. In addition, we can specify several types of constraints that

any value of the tabular variable must satisfy.

Before it is possible to create tabular variables and form queries, it is necessary to create an empty database in which to work. In practice, this is generally

done at the level of the operating system, usually with a command that is provided by the vendor of the DBMS.

To start, we assume that we have created an empty database. In this section

we begin to discuss a part of the data definition component of SQL, namely, the

creation of tabular variables, or informally, the creation of database tables.

5.2.1 Table Creation

The SQL directive for adding tables to a database is create table.

At a minimum, as we saw in Example 5.2.1, creating a tabular variable

in SQL requires that we specify its name and its attributes along with their

domains. The syntax for this is:

create table table name

[(attr def {,attr def })],

where the attribute definition attr def has the syntax:

attribute name domain

In a slightly more general form (that ignores certain details related to the
physical design of databases), the create table directive has the
following syntax:

create table [schema.]table name
    [(⟨attr def | table constraint | table ref clause⟩
      {, ⟨attr def | table constraint | table ref clause⟩})],

where the attribute definition attr def has the syntax:

attribute name domain [default expr] [column ref clause] {column constraint}

As a result of the execution of this directive, an initial amount of space

is reserved in secondary memory to accommodate future values of the tabular

variable, and the metadata are modified to reflect the addition of the new tabular

variable. Specialized SQL constructions, discussed later (insert, delete, and

update) can be used to modify the value of this variable.

Creation of tabular variables permits placing restrictions, called constraints,
on the contents of any value that the tabular variable may assume. The constraints that follow have a global character (which means that they apply to

the contents of a table in its entirety) and apply to any value that the tabular

variable may assume.

Definition 5.2.2 A primary key constraint has the form

[constraint constraint name] primary key(list of attributes)

when the primary key consists of the attributes of the list.

Alternate keys of tables can be specified using unique constraints. The syntax

of this type of constraint is:

[constraint constraint name] unique(list of attributes)

This indicates that no two rows of a table that is a value of the tabular variable

may have the same values for the attributes specified in the list.

A tuple constraint c_C, where the condition C is a Boolean
combination of conditions involving only components of tuples and constants, is
denoted by:

[constraint constraint name] check(C)

When a constraint involves more than one attribute it is considered a table

constraint; otherwise, it is a column constraint. Referential integrity can be

imposed by using the column constraint references in the definition of an

attribute. To prevent certain components of tuples from assuming a null value

we can impose the column constraint not null.
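
As an illustration of the constraint forms introduced above, the sketch below declares a hypothetical table OFFICES (not part of the college database; all names are assumptions made for this example). It combines the column constraints not null and check with the table constraints primary key, unique, and a check that involves two attributes.

```sql
create table OFFICES(
    roomno   integer not null,               -- column constraint
    building varchar(20) not null,           -- column constraint
    telno    varchar(4),
    capacity integer check(capacity > 0),    -- column constraint
    occupied integer,
    -- table constraints: each involves the table as a whole
    constraint pk_offices primary key(roomno, building),
    constraint uq_offices_tel unique(telno),
    constraint ck_offices_occ check(occupied <= capacity)
);
```

Naming the constraints, as done here for the last three, is optional; a name makes it easier to refer to the constraint later.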

Example 5.2.3 To create the tabular variable INSTRUCTORS of the college

database we use the following create table directive:

create table INSTRUCTORS(empno varchar(11) not null,
                         name varchar(35),
                         rank varchar(25),
                         roomno integer,
                         telno varchar(4), primary key(empno));

Here the domain of the attribute empno consists of strings of length at most
11. In addition, we have the column constraint not null, which means that

null cannot be used as a value of the attribute empno. The domains of the

other attributes have similar, obvious definitions that are discussed below. Note

that in the definition of INSTRUCTORS we impose a table constraint, namely

primary key(empno).

Similarly, the tabular variables STUDENTS and COURSES are created by:

create table STUDENTS(stno varchar2(10) not null,

name varchar2(35) not null,

addr varchar2(35),

city varchar2(20),

state varchar2(2),

zip varchar2(10), primary key(stno));

create table COURSES(cno varchar2(5) not null,

cname varchar2(30),

cr smallint, primary key(cno));

A script that creates all tabular variables of the college database is contained

in Appendix A.

Example 5.2.4 To express that the primary key of the table GRADES consists

of the attributes stno cno sem year we can say that this table satisfies the primary

key constraint:

constraint pkg primary key (stno, cno, sem, year)

Example 5.2.5 For the table EMPHIST, introduced in Example 3.3.5 we could

introduce the tuple conditions:

constraint pos_sal check(salary > 0)

and

constraint suf_sal check(position != 'Programmer' or salary > 65000).

They express that the salary must be a positive number and that

somebody who is a programmer must be paid more than 65000 dollars, respectively.

Thus, the creation of the table EMPHIST can be achieved by:

create table EMPHIST(empno integer not null references PERSINFO(empno),

position varchar2(30),

dept varchar2(20),

appt_date date,

term_date date,

salary float,

check(position != 'Programmer' or salary > 65000),

constraint pos_sal check(salary > 0));

A script that creates the tables PERSINFO, EMPHIST, and REPORTING is contained in Appendix C.

Example 5.2.6 In the directives enclosed below we state that stno is both a

foreign key for ADVISING and, also, its primary key. In addition, empno is a

foreign key for this table (being the primary key for the table INSTRUCTORS).

create table ADVISING(stno varchar2(10) not null

references STUDENTS(stno),

empno varchar2(11)

references INSTRUCTORS(empno),

primary key(stno));

create table GRADES(stno varchar2(10)

not null references STUDENTS(stno),

empno varchar2(11)

not null references INSTRUCTORS(empno),

cno varchar2(5)

not null references COURSES(cno),

sem varchar2(6) not null,

year smallint not null,

grade integer,

primary key(stno,cno,sem,year),

check (grade <= 100));

The definition of the tabular variable GRADES specifies referential integrity constraints for each of the attributes stno, empno, cno. In addition, this designates
the set of attributes stno, cno, sem, year as the primary key of GRADES and, also,
imposes the constraint grade <= 100.

To remove the tabular variable T we use the construct

drop table T

Rows can be inserted in a table individually, as we show below, or as they

are produced by a select phrase (as we shall see later). To insert a row in a

table T whose heading is A1 ... An we write in SQL a directive of the form:

insert into T (A1, ..., An)
values (a1, ..., an);

For example, to insert the row

(1011,Edwards P. David,10 Red Rd.,Newton,MA,02159)

into the table STUDENTS we write:

insert into STUDENTS(stno,name,addr,city,state,zip)
values ('1011','Edwards P. David','10 Red Rd.','Newton','MA','02159');

It is possible to insert tuples in the database starting from text files by using

a special utility of ORACLE known as the SQL*Loader. Details are provided

in Appendix D.

To delete a row specified by a certain condition we can use the construct

delete. For example, to remove the row of the table STUDENTS that corresponds to the student having student number 1011 we write:

delete from STUDENTS

where stno = '1011';

If you wish to examine the headings of the tables you created you can issue,

for example, the SQL Plus directive

describe INSTRUCTORS;

Name                       Null?    Type
-------------------------- -------- ------------
EMPNO                      NOT NULL VARCHAR2(11)
NAME                                VARCHAR2(35)
RANK                                VARCHAR2(25)
ROOMNO                              NUMBER(38)
TELNO                               VARCHAR2(4)

The directive alter table is used for modifying the structure of an existing

table. Columns may be added or dropped, the names of the columns or their

data types can be modified, etc. A simplified syntax of this directive is:

alter table table name modification specification

In turn, the modification specification depends on the particular change we need

to impose on the table. Examples of such modification specifications include

add column name column type,

drop column name,

modify column name column type,

rename column name to new column name,

as well as many other choices.

Example 5.2.7 To add a new year column to the table ADVISING we use the

directive:

alter table advising add year varchar2(4);

The entries of the new column year will have initially null values.

Column types can be modified using the modify option. For instance, to

increase the maximum length of the values of stno to 12 characters we write:

alter table advising modify stno varchar(12);

The next directive renames the column stno to studentno:

alter table advising rename column stno to studentno;

Finally, the column year can be dropped by writing:

alter table advising drop column year;

5.3 Referential Integrity in SQL

We saw that referential integrity can be imposed in SQL using the column

constraint references. An alternative method is to impose the table constraint

foreign key. Its syntax is:

foreign key (attribute name {, attribute name})
    references table name (attribute name {, attribute name})
    [on delete cascade]

The foreign key construction contains the option on delete cascade. The
role of this option is to define the behavior of the tables when deletions occur
in the table where the primary key occurs. Namely, when a row is removed
from the table containing the primary key and the clause on delete cascade is
specified, then all rows from the table that contains the corresponding foreign
key that match the removed row are also removed.

Example 5.3.1 Suppose that the tabular variable CITIES is created by:

create table CITIES (city varchar(40),

state char(2),

primary key (city,state));

A second tabular variable, STORES, records the stores that a retailer has in

the covered territory, and is created by

create table STORES (storeno integer not null,
                     addr varchar(40) not null,
                     city varchar(40),
                     state char(2),
                     tel char(12),
                     primary key(storeno),
                     foreign key(city,state) references CITIES(city,state)
                         on delete cascade);

The tables are populated by the following insertions:

insert into CITIES(city, state) values('Boston','MA');
insert into CITIES(city, state) values('Springfield','MA');
insert into CITIES(city, state) values('Providence','RI');
insert into CITIES(city, state) values('Hartford','CT');
insert into CITIES(city, state) values('Bayonne','NJ');

insert into STORES(storeno, addr, city, state, tel)
    values(1,'125 Harvard St.','Boston','MA','617-287-0991');
insert into STORES(storeno, addr, city, state, tel)
    values(2,'50 Storrow Drive','Boston','MA','617-566-7629');
insert into STORES(storeno, addr, city, state, tel)
    values(3,'85 Manton Av.','Providence','RI','401-453-1234');
insert into STORES(storeno, addr, city, state, tel)
    values(4,'40 West Street','Hartford','CT','860-232-4484');
insert into STORES(storeno, addr, city, state, tel)
    values(5,'5 Finley Av.','Bayonne','NJ','908-221-0094');
insert into STORES(storeno, addr, city, state, tel)
    values(6,'10 Linton Plaza','Hartford','CT','860-660-2220');
insert into STORES(storeno, addr, city, state, tel)
    values(7,'30 Stilson Rd.','Providence','RI','401-861-5249');

After these insertions, the contents of the tables CITIES and STORES are:

CITY           ST
-------------- --
Boston         MA
Springfield    MA
Providence     RI
Hartford       CT
Bayonne        NJ

and

STORENO ADDR             CITY       ST TEL
------- ---------------- ---------- -- ------------
      1 125 Harvard St.  Boston     MA 617-287-0991
      2 50 Storrow Drive Boston     MA 617-566-7629
      3 85 Manton Av.    Providence RI 401-453-1234
      4 40 West Street   Hartford   CT 860-232-4484
      5 5 Finley Av.     Bayonne    NJ 908-221-0094
      6 10 Linton Plaza  Hartford   CT 860-660-2220
      7 30 Stilson Rd.   Providence RI 401-861-5249

Since the referential integrity was imposed between the tabular variables

CITIES and STORES we need to insert the tuples of CITIES before we can insert

the tuples of STORES. Otherwise, the cities mentioned in the values of STORES

cannot reference a city in a value of CITIES and the insertion in STORES will

be rejected.
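
The ordering requirement can be seen in the following sketch. The store number 8, its address, its telephone number, and the city Worcester are hypothetical values chosen for illustration; the exact error message produced by the rejected insertion depends on the DBMS.

```sql
-- Expected to be rejected: ('Worcester','MA') does not occur in CITIES,
-- so the foreign key of STORES has nothing to reference.
insert into STORES(storeno, addr, city, state, tel)
    values(8, '12 Main St.', 'Worcester', 'MA', '508-555-0100');

-- Inserting the city first makes the same insertion acceptable.
insert into CITIES(city, state) values('Worcester', 'MA');
insert into STORES(storeno, addr, city, state, tel)
    values(8, '12 Main St.', 'Worcester', 'MA', '508-555-0100');
```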

The presence of on delete cascade means that if a row is removed from
the table CITIES, then the rows of STORES corresponding to that city are also removed. For
example, if the company closes its business in Hartford and we execute

delete from CITIES where
    city = 'Hartford' and state = 'CT';

then the rows of STORES that describe the Hartford stores (stores 4 and 6) are
deleted automatically.

Removal of the tabular variables is also constrained by the referential integrity. It would be impossible to remove the tabular variable CITIES before we
remove the table STORES because STORES references CITIES. Thus, the correct order of removal is

drop table STORES;

drop table CITIES;

If the clause on delete cascade is absent, then the deletion of a row from

CITIES is impossible unless we delete first the rows of STORES that correspond

to the city that is removed from CITIES.
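
A minimal sketch of this two-step deletion, assuming that STORES had been created without the cascading clause, is:

```sql
-- Step 1: remove the referencing rows from STORES.
delete from STORES
where city = 'Hartford' and state = 'CT';

-- Step 2: no row of STORES references Hartford any longer,
-- so the row of CITIES can now be removed.
delete from CITIES
where city = 'Hartford' and state = 'CT';
```

Executing the two directives in the opposite order would cause the deletion from CITIES to be rejected.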

5.4 Basic Data Types

SQL makes use of a collection of domains that, in general, varies from one

implementation to another. Not all domains of the standard exist in every

implementation, and not all domains of implementations exist in the standard.

Basic domains supported by virtually all implementations of SQL can be

classified as string domains, numerical domains, and special domains.

5.4.1 String Domains

String domains consist of strings of
characters. In this category, we have char(n), which represents the set of strings

of characters (from a given basic set of characters) that have fixed length n. Similarly, varchar(n) represents the set of variable-length strings whose maximal

length is n for n > 0.

5.4.2 Numeric Domains

The SQL standard prescribes two kinds of numeric domains: exact numeric data
types (numeric, decimal, integer, and smallint) and approximate numeric
data types (float, double precision, and real). Their respective syntax is:

numeric [(p[, s])]

decimal [(p[, s])]

integer

smallint

float [(p)]

double precision

real

Here, p stands for precision and s stands for scale (both of which are nonnegative integers). The precision parameter refers to the total number of digits,

while the scale indicates the number of digits to the right of the decimal point.

The difference between numeric and decimal is that in the latter case, p is
understood to be the minimum precision that the implementation must provide
(the actual precision may be larger), while in the former case, p is
the exact total number of digits.

The domains smallint and integer have a number of digits dependent on

the implementation; however, the precision of integer is required to be equal

to or larger than the precision of smallint.

The float domain includes approximate representations of real numbers having precision at least p. Also, real and double precision have implementation-dependent precision, where the precision of double precision is never smaller
than that of real.
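
To make the syntax above concrete, here is a small sketch that uses each of the standard numeric domains once. The table MEASUREMENTS and its attributes are hypothetical and serve only as an illustration; the actual ranges and precisions obtained depend on the implementation.

```sql
create table MEASUREMENTS(
    id       integer not null,    -- exact, implementation-dependent range
    qty      smallint,            -- exact, precision no larger than integer
    price    numeric(7,2),        -- 7 digits in total, 2 after the point
    discount decimal(5,2),        -- declared precision 5, scale 2
    weight   float(24),           -- approximate, precision at least 24
    ratio    real,                -- approximate
    total    double precision,    -- approximate, at least as precise as real
    primary key(id)
);
```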

5.4.3 Special Domains

Specific DBMSs have their own domains. For instance, ORACLE has the long
domain that contains strings of characters of variable length that may be as
long as 65,535 characters.

To allow us to begin working with actual examples as quickly as possible, we

introduce some basic domains for ORACLE. Other databases are quite similar,

and the reader can obtain the relevant details by consulting product-specific

manuals.

5.4.4 Basic Domains Supported by ORACLE

In ORACLE, char[(n)] represents variable strings of characters of length
n, where 1 ≤ n ≤ 32767; the default value of n is 1. The domain character is the same as char. The characters and their order are determined
by the system during the installation of the DBMS.

The domain varchar(n) requires n to be specified and also represents

variable-length strings of characters. It is the intention of ORACLE to

separate char(n) from varchar(n) in future releases: char(n) will represent fixed-length strings while varchar(n) will represent variable-length

strings.

The varchar2 data type stores variable-length character strings and is

currently synonymous with the varchar data type. However, in a future

version of Oracle, varchar might store variable-length character strings

compared with different comparison semantics. Currently there are two

types of comparison semantics for strings in Oracle: blank-padded comparison semantics and non-padded comparison semantics.

When blank-padded comparison semantics is used, if the two values have

different lengths, Oracle first adds blanks to the end of the shorter one

so their lengths are equal. Oracle then compares the values character

by character up to the first character that differs. The value with the

greater character in the first differing position is considered greater. If

two values have no differing characters, then they are considered equal.

This rule means that two values are equal if they differ only in the number

of trailing blanks. Oracle uses blank-padded comparison semantics only
when both values in the comparison are either expressions of data type
char, text literals, or values returned by the user function.

In the case of non-padded comparison semantics two values are compared

character by character up to the first character that differs. The value with

the greater character in that position is considered greater. If two values

of different length are identical up to the end of the shorter one, the longer

value is considered greater. If two values of equal length have no differing

characters, then the values are considered equal. Oracle uses non-padded

comparison semantics when one or both values in the comparison have the

data type varchar or varchar2.

In either of the two comparison semantics we have 'ab' > 'aa' and
'ab' > 'a  '. However, in the blank-padded comparison semantics we
have 'a ' = 'a', while in the non-padded semantics we have 'a ' > 'a'.
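
The two semantics can be tried out with queries such as the ones sketched below. The sketch assumes ORACLE's one-row table dual and the cast function; under the rules described above, each query would be expected to return one row, but the outcome should be verified on the installation at hand.

```sql
-- Both operands are text literals, so blank-padded semantics applies:
-- 'a ' and 'a' differ only in trailing blanks.
select 'equal' as result from dual
where 'a ' = 'a';

-- At least one operand has type varchar2, so non-padded semantics applies:
-- the longer value is considered greater.
select 'greater' as result from dual
where cast('a ' as varchar2(2)) > cast('a' as varchar2(1));
```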

The domain date represents dates in the format dd-mmm-yy.

The domain long (also denoted by long varchar) represents variable-length strings of characters with no more than 65,535 characters. At most

one attribute may have this domain in any table.

The number domain in ORACLE can be used in several forms as specified

by the following syntax:

number [(p[, s])],

where p is the precision and s is the scale.

The maximum precision of number is 38. The scale can vary between
-84 and 127. If the scale is negative, the number is rounded to the
specified number of places to the left of the decimal point.

The following cases may occur when we insert a value in a column whose

domain is number:

Data          Domain        Stored as
------------  ------------  ------------------------
1,234,567.89  number        1234567.89
1,234,567.89  number(9)     1234568
1,234,567.89  number(9,2)   1234567.89
1,234,567.89  number(9,1)   1234567.9
1,234,567.89  number(6)     error: exceeds precision
1,234,567.89  number(10,1)  1234567.9
1,234,567.89  number(7,-2)  1234600
1,234,567.89  number(7,2)   error: exceeds precision

If s > p, then s specifies the maximum number of valid digits after the
decimal point. For instance, number(4,5) requires a zero as the first digit after
the decimal point and rounds the digits after the fifth decimal digit. The
number 0.012358 is stored as 0.01236.

Numbers may also be entered in exponential form, that is, including

an exponent preceded by E. For example, 1234567 can be represented as

1.234567E+6, that is, as 1.234567 × 10^6.

Floating point domains are supported as float, float(*), and float(b),

where b is the binary precision, that is, the number of significant binary

digits. The domains float and float(*) are equivalent, and they consist

of floating point numbers that can be represented by 126 binary digits (or,

equivalently, by about 36 decimal digits).

To provide compatibility with other systems, ORACLE supports such

domains as decimal, integer, smallint, real, and double precision.

However, their internal representation is defined by the format of the

number domain.
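
The behavior of the number domain for various precisions and scales can be explored with a sketch like the following. The table NUMDEMO and its column names are hypothetical; the stored values can be compared with the cases listed earlier in this section.

```sql
create table NUMDEMO(
    a number,          -- no precision or scale specified
    b number(9),       -- 9 digits, scale 0
    c number(9,1),     -- 9 digits, 1 after the decimal point
    d number(7,-2)     -- rounded to hundreds
);

-- The same value is stored differently in each column.
insert into NUMDEMO(a, b, c, d)
values (1234567.89, 1234567.89, 1234567.89, 1234567.89);

select a, b, c, d from NUMDEMO;
```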

5.5 SELECT Phrases

Queries must be written based on the names and headings of the tabular variables and not on the tables that represent their values at any given moment.

This is similar to writing programs. A program should work for all legal inputs

and not just the ones on which it was tested. In both cases, it is important to

focus on the abstract structure and not on specific examples. The way we write

SQL constructs must be directed only by the logic of the query and not by the

content of a particular database instance. Just because the query generated the

right answer for a particular instance of the database does not mean that it is

correct.

The main retrieval construction is the select phrase. Consider a query that

we solved previously using relational algebra. Recall that in Example 4.1.25 we

found the names of all instructors who have taught any student who lives in

Brookline. The solution involved using product, selection, and projection:

T1 := (STUDENTS × GRADES × INSTRUCTORS)
T2 := T1 where STUDENTS.stno = GRADES.stno and
      GRADES.empno = INSTRUCTORS.empno and
      STUDENTS.city = 'Brookline'
ANS := T2 [INSTRUCTORS.name].

In SQL the same problem can be resolved using a single select phrase as in:

select INSTRUCTORS.name from STUDENTS, GRADES, INSTRUCTORS

where STUDENTS.stno = GRADES.stno and

GRADES.empno = INSTRUCTORS.empno and

STUDENTS.city = 'Brookline';

We can conceptualize the execution of this typical select using the operations of relational algebra as follows:

1. The execution begins by performing the product of the tables listed after
   the reserved word from. In our case, this involves computing the product
   STUDENTS × GRADES × INSTRUCTORS.

2. The selection specified after the reserved word where is executed next,
   if the where part is present (we shall see that this may or may not be
   present in a select). In our case, this amounts to retaining that part of
   the table product that satisfies the condition:
   STUDENTS.stno = GRADES.stno and GRADES.empno = INSTRUCTORS.empno
   and STUDENTS.city = 'Brookline'

3. Finally, the resulting table is projected on the attributes
   listed between select and from, that is, in our case, on the attribute
   INSTRUCTORS.name.

We use a string constant (also known as a literal) in the above select, namely
'Brookline'. A string constant must begin and end with a single quote.

SQL is not case-sensitive. This means that you may or may not use capital

letters in any place in an SQL construction (except for string comparisons)

without any effect on the value returned by the query.

As we mentioned above, the where part of a select (also known as the where

clause) is optional. This allows us to compute table projections in SQL as we

show next.

Example 5.5.1 We can list the names of the instructors and the room numbers of their offices by projecting the table INSTRUCTORS on name roomno.

In SQL this can be done by writing

select name, roomno from INSTRUCTORS;

The select construct used above requires the table name for the table involved in the retrieval and the list of attributes that we need to extract.

In general, if we need to compute the projection of a table T on a set of

attributes A1 . . . An of the heading of T , we use the construct:

select A1 , . . . , An from T ;

Example 5.5.2 To find out the states where the students originate we project

the table STUDENTS on the attribute state. This is done by

select state from STUDENTS;

ST
--
MA
MA
MA
MA
NH
MA
MA
MA
RI

The value MA is repeated 7 times because there are seven students who live

in Massachusetts.

Duplicate values can be eliminated from a query by using the option distinct

as in

select distinct state from STUDENTS;

ST
--
MA
NH
RI
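The effect of distinct can be checked with a short script. This is a sketch that assumes Python's sqlite3 module in place of the dialect used in the text; the student numbers and states are those of the nine students of the running example.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("create table STUDENTS(stno text, state text)")
con.executemany("insert into STUDENTS values (?, ?)", [
    ("1011", "MA"), ("2415", "MA"), ("2661", "MA"), ("2890", "MA"),
    ("3442", "NH"), ("3566", "MA"), ("4022", "MA"), ("5544", "MA"),
    ("5571", "RI")])

# Plain projection: one output row per student, duplicates retained.
all_states = [r[0] for r in con.execute("select state from STUDENTS")]
# Projection with distinct: each value appears once.
distinct_states = sorted(
    r[0] for r in con.execute("select distinct state from STUDENTS"))

print(all_states.count("MA"), distinct_states)
```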

5.6

The WHERE Option

The where clause allows us to extract tuples that satisfy certain conditions; in

other words, using the where clause we can perform selections.

Example 5.6.1 To find students who live in Boston we write:

select stno, name, addr, city, state, zip

from STUDENTS

where city = 'Boston';

STNO NAME           ADDR            CITY      ST ZIP
---- -------------- --------------- --------- -- -----
2890 McLane Sandy   30 Cass Rd.     Boston    MA 02122
4022 Prior Lorraine 8 Beacon St.    Boston    MA 02125
5544 Rawlings Jerry 15 Pleasant Dr. Boston    MA 02115

If we want to extract all columns of a table instance, we can use the wildcard character, *, instead of listing all columns. Thus, we can write the equivalent select:

select * from STUDENTS

where city = 'Boston';

Starting from simple conditions (which we called atomic conditions in Chapter 4) we can write queries involving more complicated conditions built by using

and, or, and not.

Example 5.6.2 In Example 4.1.14 we retrieved the students who live in Boston

or Brookline. In SQL this can be done by:

select * from STUDENTS

where city = 'Boston' or city = 'Brookline';

STNO NAME           ADDR            CITY      ST ZIP
---- -------------- --------------- --------- -- -----
2661 Mixon Leatha   100 School St.  Brookline MA 02146
2890 McLane Sandy   30 Cass Rd.     Boston    MA 02122
3566 Pierce Richard 70 Park St.     Brookline MA 02146
4022 Prior Lorraine 8 Beacon St.    Boston    MA 02125
5544 Rawlings Jerry 15 Pleasant Dr. Boston    MA 02115

Example 5.6.3 To retrieve the grade records obtained in cs110 during the

Spring of 2000 we can write in SQL:

select * from GRADES

where cno = 'cs110' and sem = 'SPRING'
and year = 2000;

STNO EMPNO CNO   SEM    YEAR GRADE
---- ----- ----- ------ ---- -----
1011 023   cs110 SPRING 2000 75
4022 023   cs110 SPRING 2000 60

Example 5.6.4 In the select phrase:

select stno, empno from GRADES

where cno = 'cs110';

the selection specified by the where clause is followed by a projection on the attributes stno, empno that are listed after the word select. The result is:

STNO EMPNO
---- -----
1011 019
2661 019
3566 019
5544 019
1011 023
4022 023

Certain patterns can be specified using the symbol % to replace 0 or more characters, and the underscore to replace exactly one character. As mentioned earlier, SQL is generally not case-sensitive; however, comparisons involving strings are case-sensitive. Thus, 'Jerry' and 'JERRY' are distinct strings, and 'Jerry' ≠ 'JERRY'. The comparison with a pattern is realized using the operator like.

Example 5.6.5 If we need to find the names and the addresses of students

whose name includes Jerry, we can use the following select construct:

select name, addr from STUDENTS

where name like '%Jerry%';

NAME           ADDR
-------------- ---------------
Rawlings Jerry 15 Pleasant Dr.
Lewis Jerry    1 Main Rd.

Example 5.6.6 Suppose the computer science course numbers were carefully

assigned so that all fundamental programming courses have a 1 as their second

digit. Then the following select construct lists all fundamental programming

courses.

select * from COURSES
where cno like 'cs_1%';

CNO   CNAME                     CR CAP
----- ------------------------- -- ---
cs110 Introduction to Computing 4  120
cs210 Computer Programming      4  100
cs310 Data Structures           3  60
cs410 Software Engineering      3  40
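The two wildcards can be tried out on a few strings. The sketch below assumes Python's sqlite3 module; the single-column table T is a hypothetical helper. Note one dialect difference: in SQLite the like operator ignores case for ASCII letters by default, so it does not reproduce the case-sensitive behavior described above.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("create table T(s text)")  # hypothetical one-column helper table
con.executemany("insert into T values (?)", [
    ("Rawlings Jerry",), ("Lewis Jerry",), ("Prior Lorraine",),
    ("cs110",), ("cs240",), ("cs310",)])

def matching(pattern):
    """Return the strings of T that match the given like pattern."""
    rows = con.execute("select s from T where s like ?", (pattern,))
    return sorted(r[0] for r in rows)

contains_jerry = matching("%Jerry%")   # % stands for 0 or more characters
second_digit_1 = matching("cs_1%")     # _ stands for exactly one character
# SQLite default only: an upper-case pattern still matches (ASCII case folding).
upper_pattern = matching("%JERRY%")

print(contains_jerry, second_digit_1, upper_pattern)
```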

Using the reserved word between, we can ensure that certain values are

limited to prescribed intervals (including the endpoints of these intervals).

Example 5.6.7 To find the students who obtained some grade between 65 and 85 in 2003, we apply the following query:

select distinct stno from GRADES
where year = 2003 and
grade between 65 and 85;

STNO
----
1011
2661
5571

The same result is obtained without between by writing:

select distinct stno from GRADES
where year = 2003 and
grade >= 65 and
grade <= 85;
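A quick way to confirm that between includes both endpoints is to run the two formulations side by side. This is a sketch assuming Python's sqlite3 module; the grade records are hypothetical and were chosen to sit on and just outside the interval.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("create table GRADES(stno text, year integer, grade integer)")
# Hypothetical records: grades on the endpoints, just outside, and in another year.
con.executemany("insert into GRADES values (?, ?, ?)", [
    ("a", 2003, 65), ("b", 2003, 85), ("c", 2003, 64),
    ("d", 2003, 86), ("e", 2002, 70)])

with_between = sorted(r[0] for r in con.execute("""
    select distinct stno from GRADES
    where year = 2003 and grade between 65 and 85"""))
with_comparisons = sorted(r[0] for r in con.execute("""
    select distinct stno from GRADES
    where year = 2003 and grade >= 65 and grade <= 85"""))
not_between = sorted(r[0] for r in con.execute("""
    select distinct stno from GRADES
    where year = 2003 and grade not between 65 and 85"""))

print(with_between, with_comparisons, not_between)
```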

Example 5.6.8 A select construct, similar to the one used in Example 5.6.7,

can be used to retrieve the students who have some grade that does not satisfy

the previous condition, that is, the students who have some grade not between

65 and 85:

select distinct stno from GRADES where year = 2003

and grade not between 65 and 85;

STNO
----
1011
2415
3442
3566
4022
5571

We can test whether the value of an attribute A belongs to a given set of values by using a condition of the form:

A in (v1, ..., vn)

This condition is satisfied by those tuples t such that t[A] has one of the values v1, ..., vn.

Example 5.6.9 Let us find the names of students who live in Boston or Brookline, a query that we already discussed in Example 5.6.2. Using the previous

condition we write:

select name from STUDENTS

where city in ('Boston','Brookline');

NAME
--------------
Mixon Leatha
McLane Sandy
Pierce Richard
Prior Lorraine
Rawlings Jerry

On the other hand, we can test the negation of a condition using not. To list the names of students who live outside those two cities, we write:

select name from STUDENTS
where not(city in ('Boston','Brookline'));

or, equivalently,

select name from STUDENTS
where city not in ('Boston','Brookline');

String constants can be included in the list that follows select to improve the presentation of the results.

Example 5.6.10 To insert the string 'Student name: ' in front of a student name we write:

select 'Student name: ', name from STUDENTS;

'STUDENTNAME:'  NAME
--------------- -----------------
Student name:   Edwards P. David
Student name:   Grogan A. Mary
Student name:   Mixon Leatha
Student name:   McLane Sandy
Student name:   Novak Roland
Student name:   Pierce Richard
Student name:   Prior Lorraine
Student name:   Rawlings Jerry
Student name:   Lewis Jerry

Strings can be concatenated using the operator ||.

Example 5.6.11 In the next select phrase we concatenate the string 'Student ' with a student's name, then with the string ' lives in ' and the student's state:

select 'Student ' || name || ' lives in ' || state
from STUDENTS;

'STUDENT'||NAME||'LIVESIN'||STATE
------------------------------------
Student Edwards P. David lives in MA

Student Grogan A. Mary lives in MA

Student Mixon Leatha lives in MA

Student McLane Sandy lives in MA

Student Novak Roland lives in NH

Student Pierce Richard lives in MA

Student Prior Lorraine lives in MA

Student Rawlings Jerry lives in MA

Student Lewis Jerry lives in RI

Example 5.6.12 The query shown in Example 5.6.11 can be executed in Microsoft SQL Server by

select 'Student ' + name + ' lives in ' + state
from STUDENTS;

5.7

Union, Intersection, and Difference in SQL

Recall that union, intersection, and difference may occur only between tables that have identical headings. To execute these operations in SQL, we need to use compound select phrases. Compound selects are constructed from simple select phrases using the reserved words union, intersect, and minus (minus is the ORACLE spelling; the SQL standard and several other dialects use except instead). As we shall see, SQL treats union, intersection, and difference as operations between sets of tuples, and therefore, it removes duplicate values from the results of the queries.

Example 5.7.1 To determine the student numbers of students who took cs210

we write:

select stno from GRADES

where cno = 'cs210';

STNO
----
1011
2661
3566
5571
4022

Similarly, the students who took cs240 are retrieved by:

select stno from GRADES
where cno = 'cs240';

STNO
----
3566
5571
2415
5544
1011
4022

To find the students who took both cs210 and cs240 we use intersect to link the two previous select phrases into a compound select:

select stno from grades where cno = 'cs210'
intersect
select stno from grades where cno = 'cs240';

This gives:

STNO
----
1011
3566
4022
5571

The union of the two sets is computed by the following compound select:

select stno from grades where cno = 'cs210'
union
select stno from grades where cno = 'cs240';

STNO
----
1011
2415
2661
3566
4022
5544
5571

If we wish to retain all values in the result, then we need to use union all

to link the select phrases as in:

select stno from grades where cno = 'cs210'
union all
select stno from grades where cno = 'cs240';

The result now contains all values retrieved by the individual selects:

STNO
----
1011
2661
3566
5571
4022
3566
5571
2415
5544
1011
4022

To compute a difference we use minus. For example, to find the students who took cs210 but did not take cs240 we write:

select stno from grades where cno = 'cs210'
minus
select stno from grades where cno = 'cs240';

STNO
----
2661

The reverse difference allows us to find students who took cs240 but did not

take cs210:

select stno from grades where cno = 'cs240'
minus
select stno from grades where cno = 'cs210';

Now we obtain:

STNO
----
2415
5544
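The compound selects of this section can be replayed on the student numbers listed in Example 5.7.1. This is a sketch assuming Python's sqlite3 module; SQLite spells the difference operator except rather than minus, and the helper function compound is a name introduced here for illustration.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("create table GRADES(stno text, cno text)")
cs210 = ["1011", "2661", "3566", "5571", "4022"]
cs240 = ["3566", "5571", "2415", "5544", "1011", "4022"]
con.executemany("insert into GRADES values (?, 'cs210')", [(s,) for s in cs210])
con.executemany("insert into GRADES values (?, 'cs240')", [(s,) for s in cs240])

def compound(operator):
    """Link the cs210 and cs240 selects with the given set operator."""
    sql = ("select stno from GRADES where cno = 'cs210' " + operator +
           " select stno from GRADES where cno = 'cs240'")
    return sorted(r[0] for r in con.execute(sql))

took_both = compound("intersect")
took_either = compound("union")        # duplicates removed
all_rows = compound("union all")       # duplicates retained
only_cs210 = compound("except")        # SQLite's spelling of minus

print(took_both, took_either, len(all_rows), only_cs210)
```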

5.8

Table Product in SQL

A select phrase that lists several distinct table names after the reserved word

from computes the product of these tables.

Example 5.8.1 To examine all possible pairs of students/instructors we could

write the following select:

select STUDENTS.name, INSTRUCTORS.name

from STUDENTS, INSTRUCTORS;

Since our database is in a state that contains nine students and five instructors, this will result in 45 rows retrieved:

NAME             NAME
---------------- ----------------
Edwards P. David Evans Robert
Grogan A. Mary   Evans Robert
Mixon Leatha     Evans Robert
.
.
.
Pierce Richard   Will Samuel
Prior Lorraine   Will Samuel
Rawlings Jerry   Will Samuel
Lewis Jerry      Will Samuel

Observe that the tables are not linked by any where condition; as expected

in the definition of the product, all combinations of rows are considered. After computing the product, a projection eliminates all attributes except STUDENTS.name and INSTRUCTORS.name.

Also, note that we use qualified attributes as required by the definition of

table product (see Definition 4.1.7).

The result produced by the query shown in Example 5.8.1 does not differentiate between the attributes STUDENTS.name and INSTRUCTORS.name and

this may confuse the user. Therefore, it is preferable to rename the columns of

the result using the option as:

select STUDENTS.name as stname, INSTRUCTORS.name as instname

from STUDENTS, INSTRUCTORS;

STNAME           INSTNAME
---------------- ----------------
Edwards P. David Evans Robert
Grogan A. Mary   Evans Robert
Mixon Leatha     Evans Robert
.
.
.
Pierce Richard   Will Samuel
Prior Lorraine   Will Samuel
Rawlings Jerry   Will Samuel
Lewis Jerry      Will Samuel

SQL allows for computations of products of several copies of the same table

through the creation of aliases; the solution proceeds using the logic discussed

in Example 4.1.18. To create an alias S of a table named T we write the name

of the alias after the name of the table in the list of tables, making sure that at

least one space (and no comma) exists between the name of the table and its

alias. For example, in the select phrase of Example 5.8.2 we create the alias I

by writing

INSTRUCTORS I

Example 5.8.2 Let us solve the query shown in Example 4.1.18: finding all

pairs of instructors' names for instructors who share the same office. This can

be done by writing:

select I.name as firstname, INSTRUCTORS.name as secname

from INSTRUCTORS I, INSTRUCTORS

where I.roomno = INSTRUCTORS.roomno and

I.empno < INSTRUCTORS.empno;

FIRSTNAME      SECNAME
-------------- --------------
Exxon George   Will Samuel

Here we create the alias I of INSTRUCTORS, compute the product between this alias and INSTRUCTORS and retain those pairs that share the same room and consist of distinct individuals.
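The role of the condition on empno becomes visible when the query is run with and without it. This is a sketch assuming Python's sqlite3 module, loaded with the employee numbers, names, and room numbers of the five instructors of the running example.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("create table INSTRUCTORS(empno text, name text, roomno integer)")
con.executemany("insert into INSTRUCTORS values (?, ?, ?)", [
    ("019", "Evans Robert", 82), ("023", "Exxon George", 90),
    ("056", "Sawyer Kathy", 91), ("126", "Davis William", 72),
    ("234", "Will Samuel", 90)])

# With the ordering condition: each pair of office mates appears once.
shared = con.execute("""
    select I.name as firstname, INSTRUCTORS.name as secname
    from INSTRUCTORS I, INSTRUCTORS
    where I.roomno = INSTRUCTORS.roomno and
          I.empno < INSTRUCTORS.empno""").fetchall()

# Without it: every instructor is paired with himself or herself,
# and each genuine pair appears in both orders.
unfiltered = con.execute("""
    select I.name, INSTRUCTORS.name
    from INSTRUCTORS I, INSTRUCTORS
    where I.roomno = INSTRUCTORS.roomno""").fetchall()

print(shared, len(unfiltered))
```

Using < rather than != is a design choice: it removes the trivial pairs and also keeps only one of the two orderings of each genuine pair.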

Example 5.8.3 Suppose that we need to find all triples of student names for

students who live in the same city and state. Now we need to operate with three

distinct copies of the table STUDENTS. This is accomplished by:

select S1.name as name1, S2.name as name2,
       S3.name as name3
from STUDENTS S1, STUDENTS S2,
     STUDENTS S3
where S1.state = S2.state and
      S2.state = S3.state and
      S1.city = S2.city and
      S2.city = S3.city and
      S1.stno < S2.stno and
      S2.stno < S3.stno;

NAME1            NAME2            NAME3
---------------- ---------------- ----------------
McLane Sandy     Prior Lorraine   Rawlings Jerry

5.9

Join in SQL

Earlier versions of SQL (at the level of SQL 1) dealt with the join operation

indirectly, using operations like product, selection and projection, which are

already available in SQL. The blueprint of this treatment of the join operation

was outlined in Section 4.2.

Example 5.9.1 The query considered in Example 4.2.2, in which we seek to find the names of instructors who have taught any four-credit course, is solved in SQL by writing:

select distinct INSTRUCTORS.name

from COURSES, GRADES, INSTRUCTORS

where COURSES.cr = 4

and COURSES.cno = GRADES.cno

and GRADES.empno = INSTRUCTORS.empno;

Note the correspondence between the relational algebra solution and its implementation in SQL. The first step that consists of computing the product

T1 = COURSES × GRADES × INSTRUCTORS

corresponds to the list of tables that follows the word from. Then, the selection

specified by

T2 = (T1 where COURSES.cr = 4 and

COURSES.cno = GRADES.cno and

GRADES.empno = INSTRUCTORS.empno)

is executed using the condition of the where clause.

Finally, the projection

T3 (name) = T2 [INSTRUCTORS.name]

corresponds to the list that follows select. In this case, this list consists of one

attribute, INSTRUCTORS.name.

We give one more example that shows a typical query that uses a join.


Example 5.9.2 To list all pairs of student names and course names such that

the student takes the course, the relational algebra solution would require that

we join the tables STUDENTS, GRADES, and COURSES. In SQL we write:

select distinct STUDENTS.name, COURSES.cname

from STUDENTS, GRADES, COURSES

where STUDENTS.stno = GRADES.stno and

GRADES.cno = COURSES.cno;

NAME             CNAME
---------------- -------------------------
Edwards P. David Computer Architecture
Edwards P. David Computer Programming
Edwards P. David Introduction to Computing
Grogan A. Mary   Computer Architecture
.
.
.
Prior Lorraine   Data Structures
Prior Lorraine   Introduction to Computing
Rawlings Jerry   Computer Architecture
Rawlings Jerry   Introduction to Computing

SQL dialects that conform to the SQL-2 standard (e.g., SQLPlus of Oracle

9i and 10g, and Microsoft SQL Server) allow the use of the constructions inner join and on. For example, the query discussed in Example 5.9.1 has the

alternate solution:

select distinct INSTRUCTORS.name

from INSTRUCTORS, COURSES INNER JOIN GRADES

on COURSES.cno = GRADES.cno

where INSTRUCTORS.empno = GRADES.empno

and COURSES.cr = 4;

This query should be viewed as computing the natural join of COURSES and GRADES based on the equality of the attributes they share (as specified by the on clause). Then, the join of INSTRUCTORS with the result of the previous join is computed using the simulation by product and selection method.

In SQL Plus, queries involving natural joins among tables whose attributes are identically named can be further simplified by applying the using clause, which lists the attributes involved in the joining.

Example 5.9.3 To retrieve the names of instructors who taught cs110 we can execute in SQL Plus the query:

select distinct INSTRUCTORS.name
from INSTRUCTORS inner join GRADES
using(empno)
where GRADES.cno = 'cs110';

The inner join can be used for joins that involve more than two tables.

Example 5.9.4 A solution of the query discussed in Example 5.9.1 that makes use of the inner join operation is:

select distinct INSTRUCTORS.name

from

INSTRUCTORS inner join GRADES

using(empno)

inner join COURSES

using(cno)

where COURSES.cr = 4;

The join condition can be specified either explicitly, using the clause on, or implicitly, employing the clause using.

Example 5.9.5 To find the pairs of names of students and instructors such that

the student takes a course with the instructor who is also his or her advisor, we

can write either:

select distinct STUDENTS.name as sname, INSTRUCTORS.name as iname

from GRADES inner join ADVISING

on GRADES.stno = ADVISING.stno and

GRADES.empno = ADVISING.empno

inner join STUDENTS

on ADVISING.stno = STUDENTS.stno

inner join INSTRUCTORS

on ADVISING.empno = INSTRUCTORS.empno

or, equivalently,

select distinct STUDENTS.name as sname, INSTRUCTORS.name as iname

from GRADES inner join ADVISING

using(stno,empno)

inner join STUDENTS

using(stno)

inner join INSTRUCTORS

using(empno)

The product of tables can also be computed using the cross join operation.

Example 5.9.6 The query that we wrote in Example 5.8.1 that generates all

possible pairs of students/instructors can be also written as:

select STUDENTS.name, INSTRUCTORS.name

from STUDENTS cross join INSTRUCTORS;

which is equivalent to

select STUDENTS.name, INSTRUCTORS.name

from STUDENTS, INSTRUCTORS;


We saw that when joining two tables not all tuples are joinable; tuples that

belong to one table and are not joinable with any tuple of the other table leave no

trace in the join, a situation that is often inconvenient. As we saw in Section 4.3,

the outer join operation and its variants, the left outer join and the right outer

join can rectify this situation.

Let us assume that the tabular variables STUDENTS and INSTRUCTORS

contain the tuples shown in Figure 5.1.

The tabular variable ADVISING has the same content as the one shown in

Figure 3.1.

Example 5.9.7 Oracle's own syntax for left outer join is to designate the component that may be null by (+), as in

select STUDENTS.name, ADVISING.empno from STUDENTS, ADVISING
where STUDENTS.stno = ADVISING.stno(+);

This is equivalent to using the operator left outer join as specified by SQL2:

select STUDENTS.name, ADVISING.empno

from STUDENTS left outer join ADVISING

on STUDENTS.stno = ADVISING.stno

Either phrase will return:

NAME             EMPNO
---------------- -----
Edwards P. David 019
Grogan A. Mary   019
Mixon Leatha     023
McLane Sandy     023
Novak Roland     056
Pierce Richard   126
Prior Lorraine   234
Rawlings Jerry   023
Lewis Jerry      234
Davis Richard
Chu Martin
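The difference between the inner join and the left outer join can be seen on three of these students. This is a sketch assuming Python's sqlite3 module and the SQL2 syntax only (the (+) notation is specific to Oracle); the null component shows up in Python as None.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    create table STUDENTS(stno text, name text);
    create table ADVISING(stno text, empno text);
    insert into STUDENTS values ('1011', 'Edwards P. David');
    insert into STUDENTS values ('6410', 'Davis Richard');
    insert into STUDENTS values ('7209', 'Chu Martin');
    insert into ADVISING values ('1011', '019');
""")

# Inner join: students without an ADVISING tuple leave no trace.
inner = con.execute("""
    select STUDENTS.name, ADVISING.empno
    from STUDENTS inner join ADVISING
         on STUDENTS.stno = ADVISING.stno""").fetchall()

# Left outer join: every student is kept; the missing empno is null.
left = con.execute("""
    select STUDENTS.name, ADVISING.empno
    from STUDENTS left outer join ADVISING
         on STUDENTS.stno = ADVISING.stno
    order by STUDENTS.stno""").fetchall()

print(inner)
print(left)
```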

The computation of the right outer join is similar. We can use either Oracle's syntax as in

select ADVISING.stno, INSTRUCTORS.name from ADVISING, INSTRUCTORS
where ADVISING.empno(+) = INSTRUCTORS.empno;

or the right outer join operator of SQL2:

select ADVISING.stno, INSTRUCTORS.name
from ADVISING right outer join INSTRUCTORS
on ADVISING.empno = INSTRUCTORS.empno;

Figure 5.1: The tabular variables STUDENTS and INSTRUCTORS

STUDENTS
stno name             addr             city       state zip
---- ---------------- ---------------- ---------- ----- -----
1011 Edwards P. David 10 Red Rd.       Newton     MA    02159
2415 Grogan A. Mary   8 Walnut St.     Malden     MA    02148
2661 Mixon Leatha     100 School St.   Brookline  MA    02146
2890 McLane Sandy     30 Cass Rd.      Boston     MA    02122
3442 Novak Roland     42 Beacon St.    Nashua     NH    03060
3566 Pierce Richard   70 Park St.      Brookline  MA    02146
4022 Prior Lorraine   8 Beacon St.     Boston     MA    02125
5544 Rawlings Jerry   15 Pleasant Dr.  Boston     MA    02115
5571 Lewis Jerry      1 Main Rd        Providence RI    02904
6410 Davis Richard    45 Algonquin Rd. Natick     MA    01760
7209 Chu Martin       90 Rye Dr.       Ayer       MA    01290

INSTRUCTORS
empno name             rank         roomno telno
----- ---------------- ------------ ------ -----
019   Evans Robert     Professor    82     7122
023   Exxon George     Professor    90     9101
056   Sawyer Kathy     Assoc. Prof. 91     5110
126   Davis William    Assoc. Prof. 72     5411
234   Will Samuel      Assist.Prof. 90     7024
323   Campbell Kenneth Professor    102    7077

The result of the right outer join is:

STNO NAME
---- ----------------
1011 Evans Robert
2415 Evans Robert
2661 Exxon George
2890 Exxon George
5544 Exxon George
3442 Sawyer Kathy
3566 Davis William
4022 Will Samuel
5571 Will Samuel
     Campbell Kenneth

Finally, the outer join itself can be computed using the operator full outer join:

select STUDENTS.name as sname, INSTRUCTORS.name as iname
from students full outer join advising
using(stno)
full outer join instructors
using(empno);

SNAME            INAME
---------------- ----------------
Grogan A. Mary   Evans Robert
Edwards P. David Evans Robert
Rawlings Jerry   Exxon George
McLane Sandy     Exxon George
Mixon Leatha     Exxon George
Novak Roland     Sawyer Kathy
Pierce Richard   Davis William
Lewis Jerry      Will Samuel
Prior Lorraine   Will Samuel
Chu Martin
Davis Richard
                 Campbell Kenneth

5.10

Subqueries are select phrases that return sets rather than tables. Their main

use is in conditions that involve sets. As we shall see, they are useful in implementing difference and division

in SQL. Syntactically, a subquery is written by placing a select phrase

between a pair of parentheses. For example,

(select empno from INSTRUCTORS where rank = 'Professor')

is a subquery that computes the set of employee numbers of full professors. To find the student numbers of students who take a course with a full professor, we

need to select those GRADES tuples whose empno belongs to this set. This can

be accomplished by writing:

select distinct stno from GRADES where

empno in (select empno from INSTRUCTORS

where rank = 'Professor');

STNO
----
1011
2415
2661
3566
4022
5544
5571

We refer to the first select as the calling select, or the main select or the outer

select; the select of the subquery is the inner select.

As we saw in the introductory example, membership can be tested using in.

Here is another example.

Example 5.10.1 Let us find the names of students who took cs310. We determine the student numbers of those students using a subquery. Then, in the

main select, we retrieve those students whose student number is in this set.

This can be accomplished using the query:

select name from STUDENTS where

stno in (select stno from GRADES

where cno = 'cs310');

NAME
--------------
Mixon Leatha
Prior Lorraine

We can also test whether a tuple belongs to the set computed by a subquery using a condition of the form

(x1, ..., xn) in (select A1, ..., An from ...)

This type of test is included in SQL99, but it is not implemented in many SQL dialects. However, it is available in ORACLE and DB2.

Example 5.10.2 Let us find the pairs of names of students and instructors such that the student took some course with the instructor but no four-credit course. This is computed by the following query:

select STUDENTS.name as sname,
INSTRUCTORS.name as iname

from STUDENTS, INSTRUCTORS where

(STUDENTS.stno, INSTRUCTORS.empno) in

(select stno, empno from grades

minus

select stno, empno from grades

where cno in (select cno

from courses

where cr=4));

SNAME              INAME
------------------ -------------
Edwards P. David   Sawyer Kathy
Grogan A. Mary     Evans Robert
Mixon Leatha       Will Samuel
Novak Roland       Will Samuel
Prior Lorraine     Sawyer Kathy
Prior Lorraine     Will Samuel
Rawlings Jerry     Sawyer Kathy
Lewis Jerry        Will Samuel

If oper is one of the operators =, !=, <, >, <= or >=, then we can use

conditions of the form

v oper any (select ...)

or

v oper all (select ...)

in comparisons that involve some elements of the set computed by the subquery (select ...) or all elements of the same set, respectively. Here != stands for inequality.

Example 5.10.3 To find the names of the courses taken by the student whose

student number is 1011, we can use the following query:

select cname from COURSES where

cno = any (select cno from GRADES where stno = '1011');

The construct = any is synonymous with in, and the same query could be

written as:

select cname from COURSES

where cno in (select cno from GRADES where stno = '1011');

Also, instead of = any we could use = some, and so, we have a third way of writing the same query:

select cname from COURSES where
cno = some (select cno from GRADES where stno = '1011');

CNAME
-------------------------
Introduction to Computing
Computer Programming
Computer Architecture

Example 5.10.4 Let us find the students who obtained the highest grade in

cs110. Although there are methods that we explain later that yield much simpler

solutions for this type of query, for the moment we want to illustrate the oper all

condition. We operate on two copies of GRADES. The copy used in the inner

select is intended for computing the grades obtained in cs110:

select stno from GRADES where cno = 'cs110'
and grade >= all(select grade from GRADES
where cno = 'cs110');

STNO
----
5544

Example 5.10.5 Let us find the students who obtained a grade higher than any

grade given by a certain instructor, say Prof. Will. Using the all... subquery

we can write:

select stno from GRADES

where grade >= all(select grade from GRADES

where empno in (select empno from INSTRUCTORS

where name like 'Will%'));

If we alter this query and replace the instructor with Prof. Davis, who teaches no courses, then the set computed by the subquery of

select stno from GRADES
where grade >= all(select grade from GRADES
where empno in (select empno from INSTRUCTORS
where name like 'Davis%'));

is empty. Therefore, every grade satisfies the inequality, and we obtain all student numbers for students who took any course!
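The behavior of oper all and oper any over an empty set can be mimicked in plain Python, whose built-in all and any follow the same convention. This is only a sketch of the semantics, not of how a database evaluates the query; the grade values are hypothetical.

```python
# Hypothetical data: grades given by two instructors, one of whom gave none.
grades_given_by = {"Will": [60, 75], "Davis": []}
# Hypothetical (stno, grade) records of the GRADES table.
all_grades = [("1011", 75), ("2661", 80), ("4022", 60)]

def at_least_all(instructor):
    """Student numbers with some grade >= all grades given by the instructor."""
    inner = grades_given_by[instructor]
    return sorted(stno for stno, grade in all_grades
                  if all(grade >= g for g in inner))

def at_least_any(instructor):
    """Student numbers with some grade >= any grade given by the instructor."""
    inner = grades_given_by[instructor]
    return sorted(stno for stno, grade in all_grades
                  if any(grade >= g for g in inner))

will_all = at_least_all("Will")     # must reach the highest grade, 75
davis_all = at_least_all("Davis")   # empty set: the condition holds vacuously
davis_any = at_least_any("Davis")   # empty set: the condition never holds

print(will_all, davis_all, davis_any)
```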

5.11

Parametrized subqueries

Subqueries may use values that are provided as parameters by the calling select. A typical situation is described in the following example.

Example 5.11.1 Suppose that we need to retrieve the course numbers of courses

taken by the student whose student number is STUDENTS.stno. Ignore (for the

moment) the origin of this piece of data. Then, the retrieval is done by the

select construct:

select cno from GRADES

where stno = STUDENTS.stno;

Next, we transform this select into a subquery. The student number STUDENTS.stno is provided by the outer select of the following construct:

select name from STUDENTS where 'cs310' in

(select cno from GRADES

where stno = STUDENTS.stno);

Observe that this provides an alternate solution to the query discussed in Example 5.10.1. Namely, we use a subquery to compute the courses taken by each

student. Then, we test if cs310 is one of these courses. We use the qualified attribute STUDENTS.stno inside the subquery to differentiate between this input

parameter and the attribute stno of the table GRADES.

Sets of tuples produced by subqueries can be tested for emptiness using the

exists condition. Namely, the condition

exists (select ... from ...)

is true if the set returned by the subquery is not empty; similarly,

not exists (select ... from ...)

is true if the set returned by the subquery is empty.

Example 5.11.2 Let us give yet another solution to the query we solved in

Example 5.10.1. This time, to find the names of students who took cs310 we

determine the student numbers of those students for whom their set of grades

in cs310 is not empty. This can be done as follows:

select name from STUDENTS where

exists (select * from GRADES where

stno = STUDENTS.stno and

cno = 'cs310');

NAME
--------------
Mixon Leatha
Prior Lorraine
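The three formulations of this query (membership with in, the parametrized subquery, and exists) can be run side by side. This is a sketch assuming Python's sqlite3 module; the GRADES rows form a small illustrative instance and are not the complete table used in the text.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    create table STUDENTS(stno text, name text);
    create table GRADES(stno text, cno text);
    insert into STUDENTS values ('1011', 'Edwards P. David');
    insert into STUDENTS values ('2661', 'Mixon Leatha');
    insert into STUDENTS values ('4022', 'Prior Lorraine');
    -- illustrative grade records only
    insert into GRADES values ('1011', 'cs110');
    insert into GRADES values ('2661', 'cs310');
    insert into GRADES values ('4022', 'cs110');
    insert into GRADES values ('4022', 'cs310');
""")

queries = {
    "in": """select name from STUDENTS where
             stno in (select stno from GRADES where cno = 'cs310')""",
    "parametrized": """select name from STUDENTS where 'cs310' in
             (select cno from GRADES where stno = STUDENTS.stno)""",
    "exists": """select name from STUDENTS where
             exists (select * from GRADES where
                     stno = STUDENTS.stno and cno = 'cs310')""",
}
results = {label: sorted(r[0] for r in con.execute(sql))
           for label, sql in queries.items()}

print(results)
```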

Example 5.11.3 To find instructors who never taught cs110, we search for instructors for whom there is no GRADES record involving cs110 and these instructors. This can be done by

select name from INSTRUCTORS where
not exists(select * from GRADES where
empno = INSTRUCTORS.empno and
cno = 'cs110');

NAME
-------------
Sawyer Kathy
Davis William
Will Samuel

If both the main query and the subquery deal with the same table and the

subquery requires input parameters from the outer query, then we use an alias

of the table in the outer query.

Example 5.11.4 Let us find the student numbers of students whose advisor

is advising at least one other student. The information is contained in the

ADVISING table, and the following select construct uses both ADVISING (in

the subquery) and its alias A in the main query:

select distinct stno from ADVISING A

where exists (select * from ADVISING where

empno = A.empno and stno != A.stno);

STNO
----
1011
2415
2661
2890
4022
5544
5571

Subqueries can be used in the list that follows from in exactly the same

manner that tables are used. This is shown in the next example:

Example 5.11.5 To find the pairs of names of students and instructors such

that the student took some course with the instructor we could write:

select STUDENTS.name as sname, INSTRUCTORS.name as iname

from STUDENTS, INSTRUCTORS,

(select stno, empno from GRADES) PN

where STUDENTS.stno = PN.stno and

INSTRUCTORS.empno = PN.empno;


The difference of the tables T and S can be computed by looking for each

tuple of T for which there is no matching tuple in S. This can be done by:

select * from T where

not exists (select * from S where

A1 = T.A1 and ... and An = T.An)

Example 5.11.6 Courses offered by the continuing education program but not

by the regular program can be found by writing:

select * from CED_COURSES where

not exists (select * from COURSES where

cno = CED_COURSES.cno)

which takes advantage of the fact that cno is a key for both COURSES and

CED_COURSES.

5.12

SQL does not have a division operation. However, as we saw in Examples 4.1.27

and 4.2.3, we can perform division using product, projection, and difference. Of

course, we could apply the prescription offered by relational algebra. This type

of solution is discussed in the next example.

Example 5.12.1 Suppose that we need to determine the courses taught by every full professor. The solution envisioned here is

select cno from grades
minus
select GI.cno from (select grades.cno,
                           instructors.empno
                    from grades, instructors
                    where rank = 'Professor') GI
where (GI.cno, GI.empno) not in (select cno, empno from grades);

The subquery

select grades.cno, instructors.empno
from grades, instructors
where rank = 'Professor'

computes all pairs of courses and instructor numbers using the product of the

tables GRADES and INSTRUCTORS. Then, the query

select GI.cno from (select grades.cno,

instructors.empno

from grades, instructors

where rank = 'Professor') GI

where (GI.cno,GI.empno) not in (select cno, empno from grades)

extracts the courses that are part of the pairs of the previous table that do not

appear in the GRADES table, that is, the courses for which there exists a full

professor who did not teach these courses. These are the courses that we need

to exclude from the answer. Thus, the query presented at the beginning of this

example yields the solution of the problem:

CNO
-----
cs110

However, this solution cannot be used in SQL dialects that do not have all the facilities of SQL Plus. Therefore, we need to examine

an alternate way of solving this problem that is almost universally usable. To

understand the technique used we examine the solution of the query formulated

in the next example.

Example 5.12.2 Again, suppose that we need to determine the courses taught

by every full professor. Let us formulate the same query in a way that is

easier to translate into SQL. Namely, we find the courses for which there are no

full professors who have not taught these courses. The reader should realize

immediately that this is simply a new formulation of the same problem. We

show the solution in steps, moving gradually from plain English to SQL:

Phase I:

select cno from GRADES G where

not exists (instructors who are full professors and

have not taught the course G.cno)

Phase II:

select cno from GRADES G where

not exists (select * from INSTRUCTORS

where rank = 'Professor' and

these instructors have not taught

the course G.cno)

Phase III:

select cno from GRADES G where

not exists (select * from INSTRUCTORS

where rank = 'Professor' and

not exists (select * from GRADES

where empno = INSTRUCTORS.empno

and cno = G.cno));

In Phase I we determine in SQL the course numbers for which no full professor exists who has not taught these courses.

In Phase II we concentrate on preventing the existence of full professors who

are not teaching these courses. Note that Phase II still contains an untranslated

part.

Finally, in Phase III, we translate the part "have not taught these courses" using not exists for the second time.
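The doubly nested not exists can be checked against a direct computation of the division. This is a sketch assuming Python's sqlite3 module; the instructors and grade records are a toy instance, and the select uses distinct so that each course number is listed once.

```python
import sqlite3

# Toy instance (illustrative only): two full professors, one associate professor.
instructors = [("019", "Professor"), ("023", "Professor"),
               ("056", "Assoc. Prof.")]                      # (empno, rank)
grades = [("cs110", "019"), ("cs110", "023"),
          ("cs210", "019"), ("cs310", "056")]                # (cno, empno)

con = sqlite3.connect(":memory:")
con.execute("create table INSTRUCTORS(empno text, rank text)")
con.execute("create table GRADES(cno text, empno text)")
con.executemany("insert into INSTRUCTORS values (?, ?)", instructors)
con.executemany("insert into GRADES values (?, ?)", grades)

# Phase III: courses for which no full professor exists who has not taught them.
by_sql = sorted(r[0] for r in con.execute("""
    select distinct cno from GRADES G where
    not exists (select * from INSTRUCTORS
                where rank = 'Professor' and
                not exists (select * from GRADES
                            where empno = INSTRUCTORS.empno
                              and cno = G.cno))"""))

# Direct check: a course qualifies if its set of teachers
# contains every full professor.
professors = {empno for empno, rank in instructors if rank == "Professor"}
taught_by = {}
for cno, empno in grades:
    taught_by.setdefault(cno, set()).add(empno)
by_hand = sorted(cno for cno, who in taught_by.items() if professors <= who)

print(by_sql, by_hand)
```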

Example 5.12.3 Another query that requires division in relational algebra is:

Find names of instructors who have taught every 100-level course, that is,


every course whose first digit of the course number is 1. The formulation that

is better suited to SQL implementation is: Find names of instructors for whom

there is no 100 level course that they have not taught. This is solved by the

following select construct:

select name from INSTRUCTORS where

not exists (select * from COURSES

where cno like 'cs1__' and

not exists (select * from GRADES where

empno = INSTRUCTORS.empno

and cno = COURSES.cno));

The answer that results from our usual database instance is:

NAME
------------
Evans Robert
Exxon George

5.13

Between Chapter 4 and the current chapter, we have shown that SQL is capable

of performing all operations of relational algebra. This fact is known as the

relational completeness of SQL. As we shall see in subsequent chapters, the

capabilities of SQL go well beyond the standard definition of relational algebra.

5.14

We begin by discussing built-in functions in SQL that may act on individual values (scalar
functions), functions that act on sets of values (aggregate functions), and, also,

analytic functions that can be used for various statistical computations. Then,

we continue with the group by option of select, and we discuss several on-line

analytic processing functions of SQL.

Scalar functions are built-in functions of SQL that work on individual values.

They are highly dependent on the particular implementation of SQL, and we

limit our discussions to functions implemented by ORACLE's SQL Plus. There

are several types of scalar functions, depending on the types of their arguments.

5.14.1

Numerical Functions

Among the numerical functions, abs, sin, cos, power, sqrt, etc. have quite obvious

definitions. For example, sqrt computes the square root of its argument, while

power(x, y) computes $x^y$.

Consider the table POINTS whose rows represent labelled points in the plane:

create table POINTS(ptid varchar2(10), x integer, y integer,

primary key(ptid));

insert into points(ptid, x, y) values ('a',0,0);
insert into points(ptid, x, y) values ('b',0,1);
insert into points(ptid, x, y) values ('c',0,2);
insert into points(ptid, x, y) values ('d',1,0);
insert into points(ptid, x, y) values ('e',1,1);
insert into points(ptid, x, y) values ('f',1,2);
insert into points(ptid, x, y) values ('g',2,0);
insert into points(ptid, x, y) values ('h',2,1);
insert into points(ptid, x, y) values ('i',2,2);
insert into points(ptid, x, y) values ('j',3,0);
insert into points(ptid, x, y) values ('k',3,1);
insert into points(ptid, x, y) values ('l',3,2);

The distances from the point a to all points of the table are computed by:

select p.ptid,
       sqrt(power(a.x - p.x,2)+power(a.y - p.y,2)) as dist
from points a, points p
where a.ptid = 'a';

This returns:

PTID             DIST
---------- ----------
a                   0
b                   1
c                   2
d                   1
e          1.41421356
f          2.23606798
g                   2
h          2.23606798
i          2.82842712
j                   3
k          3.16227766
l          3.60555128

To compute the distance between a point a with coordinates $(x_a, y_a)$ and a point
p with coordinates $(x_p, y_p)$, we use the formula $d(a, p) = \sqrt{(x_a - x_p)^2 + (y_a - y_p)^2}$.
The formula appears in the target list of the select and is written with the numerical functions sqrt and power.
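The distance query can be reproduced outside ORACLE. The following sketch uses Python's sqlite3 module and is an assumption-laden illustration: sqrt and power are registered explicitly from Python's math module because not every SQLite build provides them, and only four of the twelve points are loaded.

```python
import math
import sqlite3

conn = sqlite3.connect(":memory:")
# Register the two numerical functions used by the query (assumption:
# the SQLite build at hand may not include built-in math functions).
conn.create_function("sqrt", 1, math.sqrt)
conn.create_function("power", 2, math.pow)

conn.execute("create table POINTS(ptid text primary key, x integer, y integer)")
conn.executemany("insert into POINTS values (?, ?, ?)",
                 [("a", 0, 0), ("e", 1, 1), ("j", 3, 0), ("l", 3, 2)])

# Same self-join as in the text: a is fixed, p ranges over all points.
dist = dict(conn.execute("""
    select p.ptid,
           sqrt(power(a.x - p.x, 2) + power(a.y - p.y, 2)) as dist
    from POINTS a, POINTS p
    where a.ptid = 'a'
"""))
print(dist)
```

The dictionary maps each point identifier to its distance from a, for example the square root of 13 for the point l.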

In Oracle we can perform computations unrelated to any table by using a

fictitious tabular variable that is named DUAL.

For instance, to compute the sines of the angles of 30, 45, and 60 degrees we write:

select sin(30*3.14159265359/180) as sin30,

sin(45*3.14159265359/180) as sin45,

sin(60*3.14159265359/180) as sin60

from dual;

We need to convert the angles to radians before sin is applied. This will return:

     SIN30      SIN45      SIN60
---------- ---------- ----------
        .5 .707106781 .866025404

Microsoft SQL Server has a simpler way of performing this type of computation in that it does not require the fictitious table.

Example 5.14.3 In SQL server we can simply write:

select sin(30*3.14159265359/180) as sin30,

sin(45*3.14159265359/180) as sin45,

sin(60*3.14159265359/180) as sin60;

5.14.2

String Functions

String functions can be used to transform strings, extract parts of strings, concatenate strings, etc.

The functions upper and lower convert strings to upper-case and lower-case characters, respectively.

Example 5.14.4 To print names of students in capital characters and course

titles in small letters we can write:

select distinct upper(STUDENTS.name) as STNAME,

lower(COURSES.cname) as course

from STUDENTS, GRADES, COURSES

where STUDENTS.stno = GRADES.stno and

GRADES.cno = COURSES.cno;

STNAME             COURSE
------------------ -------------------------
EDWARDS P. DAVID   computer architecture
EDWARDS P. DAVID   computer programming
EDWARDS P. DAVID   introduction to computing
GROGAN A. MARY     computer architecture
.
.
.
PRIOR LORRAINE     data structures
PRIOR LORRAINE     introduction to computing
RAWLINGS JERRY     computer architecture
RAWLINGS JERRY     introduction to computing

These functions are particularly useful for performing string comparisons when
ignoring case. Thus,

upper('stephany') like 'STE%'

is true.

Example 5.14.5 The string function replace substitutes every occurrence of

its second argument in the value(s) specified by its first argument, by its third

argument. In the select written below the string 'Computer' is replaced by the
string 'Comp.':

select replace(cname,'Computer','Comp.') from COURSES;

REPLACE(CNAME,'COMPUTER','COMP.')
----------------------------------
Introduction to Computing
Comp. Programming
Comp. Architecture
Data Structures
Higher Level Languages
Software Engineering
Graphics

Example 5.14.6 The function concat computes the concatenation of two strings

that form its arguments. Its effect is identical to the concatenation operator ||

that we discussed in Example 5.6.11. The phrase below prints the state and zip

code of each student as a single string:

select name, addr, concat(state,zip) as state_zip from STUDENTS;

This returns:

NAME             ADDR            STATE_ZIP
---------------- --------------- ---------
Edwards P. David 10 Red Rd.      MA02159
Grogan A. Mary   Walnut St.      MA02148
Mixon Leatha     100 School St.  MA02146
McLane Sandy     30 Cass Rd.     MA02122
Novak Roland     42 Beacon St.   NH03060
Pierce Richard   70 Park St.     MA02146
Prior Lorraine   8 Beacon St.    MA02125
Rawlings Jerry   15 Pleasant Dr. MA02115
Lewis Jerry      1 Main Rd       RI02904

Substrings of a string can be extracted using the function
substr. To call this function we need to use the following syntax:

substr(string, integer [,integer ])

A typical call such as substr(s, n, m) will return the substring of length m
of the string s that starts with the nth character of s. If m is omitted, as
in substr(s, n), then the function returns all characters of s starting from the
nth character to the end of s. If n is negative, then the characters are counted
backwards from the end of s.

The select phrase

select substr('Oracle',2,3) from dual;

will return:

SUB
---
rac

Similarly,

select substr('Oracle',2) from dual;

yields:

SUBST
-----
racle

which is the string that begins with the second character of 'Oracle' and ends
with the last character of this string.

Since the second argument of the function call in

select substr('Oracle',-4,3) from dual;

is negative, the starting position of the substring is the 4th character counted
from the end (that is, the character a) and thus, the query returns:

SUB
---
acl
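The three calls above can be checked quickly in SQLite, whose substr function follows the same conventions for these arguments (1-based positions, negative start counted from the end). This is a sketch using Python's sqlite3 module rather than SQL Plus, so no DUAL table is needed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# One row with the three substrings discussed in the text.
row = conn.execute("""
    select substr('Oracle', 2, 3),
           substr('Oracle', 2),
           substr('Oracle', -4, 3)
""").fetchone()
print(row)
```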

The functions lpad and rpad can be used to enhance presentation of results

of queries. The syntax of lpad is:

lpad(s, integer [, string])

The effect is to pad s to the left with spaces to bring the total length of the

string to the length specified by the second argument of the function. If the

third argument is present, then this string is repeated to the left to fill up the

padded string.

The function rpad has a similar syntax; however, the padding is done at the

right of s.
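To make the padding rule concrete, here is a small Python emulation of lpad and rpad. It is a sketch, not ORACLE's implementation: it assumes a non-empty padding string, and it truncates the input when the requested length is shorter than the string, which is the behavior we assume for the ORACLE functions.

```python
def lpad(s, length, pad=" "):
    """Pad s on the left with repetitions of pad up to the given length."""
    s = str(s)
    if length <= len(s):
        return s[:length]          # assumed behavior: truncate on the right
    fill = length - len(s)
    return (pad * fill)[:fill] + s


def rpad(s, length, pad=" "):
    """Pad s on the right with repetitions of pad up to the given length."""
    s = str(s)
    if length <= len(s):
        return s[:length]
    fill = length - len(s)
    return s + (pad * fill)[:fill]


print(lpad(70000, 7, "$"))   # a five-digit salary padded to seven characters
print(rpad("ab", 5, "xy"))
```

When the padding string has more than one character, it is repeated and cut off so that the result has exactly the requested length.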

Example 5.14.8 To print a list of all employees and their salaries (using the

tabular variables EMPHIST and PERSINFO) we can use the query:

select name, lpad(salary,7,'$') as ann_sal
from persinfo, emphist
where persinfo.empno = emphist.empno;

NAME                                ANN_SAL
----------------------------------- -------
Natalia Martins                     $150000
Laura Schwartz                      $120000
John Soriano                        $120000
Kendall MacRae                      $100000
Rachel Anderson                     $$70000
Richard Laughlin                    $$70000
Danielle Craig                      $$90000
Abby Walsh                          $$75000
Bailey Burns                        $$70000

5.14.3

Date Functions

SQL Plus contains a class of functions that apply to the DATE type: extract,
months_between, etc.

Example 5.14.9 The function extract computes a part of a date value. Its

first argument gives the desired date part; the second argument is the date

value. For instance, to obtain the year part of the appt_date attribute of the

table EMPHIST we write:

select empno, extract(year from appt_date) as start_y

from emphist;

This returns:

     EMPNO    START_Y
---------- ----------
      1000       1999
      1005       1999
      1010       2000
      1015       1999
      1020       1999
      1025       2000
      1030       2000
      1035       2000
      1040       2000

Similarly, the month part is obtained by:

select empno, extract(month from appt_date) as start_m
from emphist;

     EMPNO    START_M
---------- ----------
      1000         10
      1005         10
      1010          1
      1015         10
      1020         11
      1025          3
      1030          1
      1035          2
      1040          3

To determine how long each employee has served,
we can use the function months_between. This will compute the number of
months between the current date (designated by the system-provided constant
SYSDATE) and the date of hire:

select empno, months_between(SYSDATE,appt_date)

as month_served

from emphist

     EMPNO MONTH_SERVED
---------- ------------
      1000   35.8877397
      1005    35.532901
      1010   32.8877397
      1015   35.1135461
      1020   34.8877397
      1025   30.5974171
      1030   32.5974171
      1035   31.2748365
      1040   30.8877397

Example 5.14.11 Suppose that a bonus is to be paid to the employees. The

bonus is computed by paying 10% of the current weekly salary (salary/52)

(determined by a null value of the termination date), multiplied by the number

of months employed. This is computed by

select empno, 0.1 * months_between(SYSDATE,appt_date) * salary/52 as bonus

from emphist

where term_date is null;

     EMPNO      BONUS
---------- ----------
      1000 10430.7253
      1005 8262.69438
      1010 7652.27254
      1015 6804.93348
      1020 4733.05642
      1025 4155.51299
      1030 5688.95627
      1035 4550.04006
      1040 4194.59488

5.15

Aggregate functions are those functions that operate on sets of values. Typical

examples include: sum, avg, max, min, and count.

The first four functions operate on columns of tables and ignore null values.

The function count returns the number of elements of the set that is its argument.

Example 5.15.1 The following select construct determines the largest grade

obtained by the student whose student number is 1011. The function max is

applied to the set of grades of the student whose number is 1011 and returns

the largest value in this set:

select max(grade) as highgr from GRADES

where stno = 1011;

HIGHGR
------
    90

For instance, sum(A) returns the sum of all values of the selected nonnull

A-components of the tuples. Similarly, avg(A) returns the average value of the

same sequence. The expressions max(A) and min(A) yield the largest and the

smallest values in the set of A-components of the tuples selected by a query,

respectively.

The functions sum and avg apply to attributes whose domains are numerical

(such as integer or float); max and min apply to every kind of attribute.

If we wish to discard duplicate values from the sequences of values before

applying these functions, we need to use the word distinct. For instance,

sum(distinct A) considers only the distinct nonnull values that occur in the

sequence of components.

Example 5.15.2 We mentioned that the built-in functions max and min apply

to string domains as well as to numerical domains. We use this feature of these

functions to determine the first and the last student in alphabetical order:

select min(name) as first, max(name) as last
from STUDENTS;

FIRST            LAST
---------------- --------------
Edwards P. David Rawlings Jerry

Next, we show a select construct where the same functions are applied to

a numerical domain:

select min(grade) as lowgr,

max(grade) as highgr from GRADES

where stno = 1011;

LOWGR HIGHGR
----- ------
   40     90

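The query

select avg(grade) as avggr from GRADES
where stno = 1011;

returns the average of all grades of this student, where repeated grades are counted each time they occur:

AVGGR
-----
73.75

If we discard the duplicate grades by writing

select avg(distinct grade) as avggr from GRADES
where stno = 1011;

then the average grade is lower, indicating a preponderance of the higher grades
for this student:

AVGGR
-----
68.33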
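The effect of distinct and the treatment of null values can be verified on the four grades of student 1011 (40, 75, 90, 90). The sketch below uses Python's sqlite3 module; the extra row with a null grade and the course number cs999 are hypothetical and serve only to show that null values are ignored.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table GRADES(stno integer, cno text, grade integer)")
conn.executemany("insert into GRADES values (?, ?, ?)", [
    (1011, "cs110", 40), (1011, "cs110", 75),
    (1011, "cs210", 90), (1011, "cs240", 90),
    (1011, "cs999", None),      # hypothetical row with a null grade
])

plain, dedup, total = conn.execute("""
    select avg(grade), avg(distinct grade), sum(grade)
    from GRADES where stno = 1011
""").fetchone()
print(plain, round(dedup, 2), total)
```

The plain average is 295/4, while the average over distinct values is (40 + 75 + 90)/3; the null grade contributes to neither.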

Aggregate functions may also be used in subqueries, as shown in the next example.

Example 5.15.3 To retrieve the students who obtained a grade higher than

the average grade in cs110 we write:

select stno from GRADES
where cno = 'cs110'
and grade > all(select avg(grade) from grades
where cno = 'cs110');

STNO
----
2661
3566
5544

count(A) can be used to determine the number of non-null entries under

the attribute A;

count(distinct A) computes the number of distinct non-null values that

occur under A;

count(*) determines how many rows exist in a table.

Note that count(distinct *) cannot be used in SQL.
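The difference between the three forms of count shows up as soon as a column contains duplicates and null values. The following is a minimal sketch with Python's sqlite3 module; the table T and its contents are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table T(A integer)")
# Hypothetical column with one duplicate value and one null.
conn.executemany("insert into T values (?)", [(1,), (1,), (2,), (None,)])

non_null, distinct_non_null, all_rows = conn.execute(
    "select count(A), count(distinct A), count(*) from T"
).fetchone()
print(non_null, distinct_non_null, all_rows)
```

Here count(A) skips the null entry, count(distinct A) additionally collapses the repeated value, and count(*) counts every row.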

Example 5.15.4 Here are several examples of the use of the count function.

To find how many students took cs110 in the fall semester of 2003, we write:

select count(cno) from GRADES
where cno = 'cs110' and
sem = 'Fall' and
year = 2003;

Since no records exist for any grades given during that semester in cs110, we

obtain the answer:

COUNT(CNO)
----------
         0

Observe that this table has a system-supplied column name COUNT(cno). This

happens because we did not provide a name using as.

Let us determine how many students have ever registered for any course. We

have to retrieve this result from GRADES, and we must use distinct to avoid

counting the same student several times (if the student took several courses):

select count(distinct stno) as nost

from GRADES;

NOST
----
   8

Finally, let us determine the names of instructors who are teaching more

than one subject. For every instructor, we determine in a subquery the number

of courses taught. Then, we retain those instructors who taught more than one

course:

select name from INSTRUCTORS where

1 < any (select count(distinct cno) from GRADES

where empno = INSTRUCTORS.empno);

NAME
------------
Evans Robert
Will Samuel

5.16

Sorting Results

Data obtained from a select construct may be sorted on one or several columns

using the order by clause. This clause also gives the user the possibility of

opting for an ascending or descending sorting order on each of the columns. By

default, the ascending order is chosen.

Example 5.16.1 Suppose that we need to sort the GRADES tuples on the

student number. For each student, we sort the grades in descending order. This

can be done with the query:

select * from GRADES

order by stno, grade desc;

      STNO EMPNO       CNO   SEM          YEAR      GRADE
---------- ----------- ----- ------ ---------- ----------
      1011 019         cs210 FALL         2003         90
      1011 056         cs240 SPRING       2004         90
      1011 023         cs110 SPRING       2003         75
      1011 019         cs110 FALL         2002         40
      2415 019         cs240 SPRING       2003        100
      2661 234         cs310 SPRING       2004        100
      2661 019         cs110 FALL         2002         80
      2661 019         cs210 FALL         2003         70
      3442 234         cs410 SPRING       2003         60
      3566 019         cs240 SPRING       2003        100
      3566 019         cs110 FALL         2002         95
      3566 019         cs210 FALL         2003         90
      4022 056         cs240 SPRING       2004         80
      4022 234         cs310 SPRING       2004         75
      4022 019         cs210 SPRING       2004         70
      4022 023         cs110 SPRING       2003         60
      5544 019         cs110 FALL         2002        100
      5544 056         cs240 SPRING       2004         70
      5571 019         cs210 SPRING       2004         85
      5571 234         cs410 SPRING       2003         80
      5571 019         cs240 SPRING       2003         50

Instead of using the names of the columns one could use their ordinal positions

in the select phrase.

Example 5.16.2 An equivalent form of the query from Example 5.16.1 is

select stno, empno, cno, sem, year, grade

from GRADES

order by 1, 6 desc;
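That sorting by column names and sorting by ordinal positions give the same result can be checked on a few rows. The sketch below uses Python's sqlite3 module with a hypothetical three-column table G, so the grade column is in position 3 rather than 6.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table G(stno integer, cno text, grade integer)")
conn.executemany("insert into G values (?, ?, ?)", [
    (1011, "cs110", 40), (1011, "cs210", 90),
    (2415, "cs240", 100), (1011, "cs110", 75),
])

# Ascending on the student number, descending on the grade.
by_name = conn.execute(
    "select stno, cno, grade from G order by stno, grade desc").fetchall()
by_position = conn.execute(
    "select stno, cno, grade from G order by 1, 3 desc").fetchall()
print(by_name)
```

Note that desc applies only to the column it follows; the student number is still sorted in the default ascending order.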

Example 5.16.3 To sort the grades based on the second digit of the course

number, and, then on the first digit of the course number (which are the fourth

and the third characters of course numbers) we write:

select * from grades

order by substr(cno,4,1), substr(cno,3,1)

      STNO EMPNO       CNO   SEM          YEAR      GRADE
---------- ----------- ----- ------ ---------- ----------
      1011 019         cs110 FALL         2002         40
      2661 019         cs110 FALL         2002         80
      3566 019         cs110 FALL         2002         95
      5544 019         cs110 FALL         2002        100
      1011 023         cs110 SPRING       2003         75
      4022 023         cs110 SPRING       2003         60
      1011 019         cs210 FALL         2003         90
      3566 019         cs210 FALL         2003         90
      4022 019         cs210 SPRING       2004         70
      5571 019         cs210 SPRING       2004         85
      2661 019         cs210 FALL         2003         70
      2661 234         cs310 SPRING       2004        100
      4022 234         cs310 SPRING       2004         75
      3442 234         cs410 SPRING       2003         60
      5571 234         cs410 SPRING       2003         80
      3566 019         cs240 SPRING       2003        100
      4022 056         cs240 SPRING       2004         80
      5571 019         cs240 SPRING       2003         50
      5544 056         cs240 SPRING       2004         70
      1011 056         cs240 SPRING       2004         90
      2415 019         cs240 SPRING       2003        100

stno  empno  cno    sem     year  grade
1011  019    cs110  FALL    2002     40
2661  019    cs110  FALL    2002     80
3566  019    cs110  FALL    2002     95
5544  019    cs110  FALL    2002    100
1011  023    cs110  SPRING  2003     75
4022  023    cs110  SPRING  2003     60
1011  019    cs210  FALL    2003     90
3566  019    cs210  FALL    2003     90
4022  019    cs210  SPRING  2004     70
5571  019    cs210  SPRING  2004     85
2661  019    cs210  FALL    2003     70
3566  019    cs240  SPRING  2003    100
5571  019    cs240  SPRING  2003     50
1011  056    cs240  SPRING  2004     90
4022  056    cs240  SPRING  2004     80
5544  056    cs240  SPRING  2004     70
2415  019    cs240  SPRING  2003    100
2661  234    cs310  SPRING  2004    100
4022  234    cs310  SPRING  2004     75
3442  234    cs410  SPRING  2003     60
5571  234    cs410  SPRING  2003     80

5.17

The group by clause serves to group together tuples of tables based on the

common value of an attribute or of a group of attributes. Suppose, for instance,

that we wish to partition the table GRADES into groups based on the course

number. This can be done by using a construct like

select ... from GRADES group by cno

Conceptually, we operate on the table shown in Figure 5.2. The reader should

imagine that the table has been divided into five groups, each corresponding

to one course. In the previous select, we left open the target list following

select. Once a table has been partitioned into groups (using group by), the

select construct that we use must return one or more atomic pieces of data for

every group. The term atomic, in this context, refers to simple pieces of data

(numbers, strings, etc.). By contrast, a set of values is not an atomic piece of

data. For instance, the number of students enrolled in each course can be listed

by:

select cno, count(stno) as totenr from GRADES

group by cno

CNO       TOTENR
----- ----------
cs110          6
cs210          5
cs240          6
cs310          2
cs410          2

stno  empno  cno    sem     year  grade
1011  019    cs110  FALL    2002     40
2661  019    cs110  FALL    2002     80
3566  019    cs110  FALL    2002     95
5544  019    cs110  FALL    2002    100
1011  023    cs110  SPRING  2003     75
4022  023    cs110  SPRING  2003     60
1011  019    cs210  FALL    2003     90
3566  019    cs210  FALL    2003     90
2661  019    cs210  FALL    2003     70
5571  019    cs210  SPRING  2004     85
4022  019    cs210  SPRING  2004     70
3566  019    cs240  SPRING  2003    100
5571  019    cs240  SPRING  2003     50
2415  019    cs240  SPRING  2003    100
5544  056    cs240  SPRING  2004     70
1011  056    cs240  SPRING  2004     90
4022  056    cs240  SPRING  2004     80
2661  234    cs310  SPRING  2004    100
4022  234    cs310  SPRING  2004     75
3442  234    cs410  SPRING  2003     60
5571  234    cs410  SPRING  2003     80

On the other hand, we cannot write:

select cno, stno from GRADES
group by cno

because more than one student is enrolled in a course, and therefore the entries

of the result under the attribute stno would be sets of values rather than simple

values. SQL enforces the atomicity of the data generated by a select with

group by by demanding that any component of the target list of such a select

must be either one of the grouping attributes or a built-in function.

Example 5.17.1 Grouping can be done on more than one attribute. Suppose

that now we are interested not in the total enrollment but, rather, in the enrollment numbers for each offering of the courses, that is, in the numbers during

every semester of every year. This can be done using the select construction:

select cno, sem, year, count(stno) as enrol

from GRADES

group by cno, year, sem

order by cno, sem, year;

Then, the query generates the answer:

CNO   SEM          YEAR      ENROL
----- ------ ---------- ----------
cs110 FALL         2002          4
cs110 SPRING       2003          2
cs210 FALL         2003          3
cs210 SPRING       2004          2
cs240 SPRING       2003          3
cs240 SPRING       2004          3
cs310 SPRING       2004          2
cs410 SPRING       2003          2

Example 5.17.2 The next select construct determines the average grade and

the number of courses taken by every student and sorts the results in ascending

order on the student number:

select stno, avg(grade) as average,

count(cno) as ncourses

from GRADES

group by stno

order by stno;

      STNO    AVERAGE   NCOURSES
---------- ---------- ----------
      1011      73.75          4
      2415        100          1
      2661      83.33          3
      3442         60          1
      3566         95          3
      4022      71.25          4
      5544         85          2
      5571      71.66          3

Grouping can be applied in combination with selection. In such cases, selection is applied first and the resulting rows are grouped.

Example 5.17.3 The select construct that follows determines the average

grade in cs110 during successive offerings of this course:

select sem, year, avg(grade) from GRADES

where cno = 'cs110'

group by sem, year

order by year, sem

SEM          YEAR AVG(GRADE)
------ ---------- ----------
FALL         2002      78.75
SPRING       2003       67.5

Groups can be retained or discarded based on a condition that is specified using
the clause having. The condition that follows having must be formulated to
include only data that have an atomic character for every group.

Example 5.17.4 Let us determine the average grade obtained in courses that

are taken by more than two students. After grouping the tuples of GRADES on

cno, we retain the groups that include more than two students by applying the

clause having count(grade) > 2:

select cno, avg(grade) from GRADES

group by cno

having count(grade) > 2

order by cno;

CNO   AVG(GRADE)
----- ----------
cs110         75
cs210         81
cs240      81.66
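The grouping and the having filter of this example can be replayed with Python's sqlite3 module. The sketch loads only the cno and grade columns of the GRADES instance used throughout the chapter; everything else about the setup (in-memory database, column types) is an assumption made for the illustration.

```python
import sqlite3

# (cno, grade) pairs of the 21 GRADES tuples used in the chapter.
grades = {
    "cs110": [40, 80, 95, 100, 75, 60],
    "cs210": [90, 90, 70, 85, 70],
    "cs240": [100, 50, 100, 70, 90, 80],
    "cs310": [100, 75],
    "cs410": [60, 80],
}
conn = sqlite3.connect(":memory:")
conn.execute("create table GRADES(cno text, grade integer)")
conn.executemany("insert into GRADES values (?, ?)",
                 [(cno, g) for cno, gs in grades.items() for g in gs])

# where would filter rows before grouping; having filters whole groups.
result = conn.execute("""
    select cno, count(grade), avg(grade) from GRADES
    group by cno
    having count(grade) > 2
    order by cno
""").fetchall()
print(result)
```

The groups for cs310 and cs410 contain only two grades each and are therefore discarded by having.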

Grouping offers an alternative way of implementing division. To divide
the tabular variable T, whose heading is $A_1 \cdots A_m B_1 \cdots B_n$, by the tabular
variable S, whose heading is $B_1 \cdots B_n$, we compute the number k of distinct
rows in S. Then, we seek to retrieve those m-tuples $(a_1, \ldots, a_m)$ that occur in
T and are associated in that tabular variable with at least k distinct tuples of S.

Example 5.17.5 Recall that in Example 5.12.3 we solved the query "Find
names of instructors who have taught every 100-level course, that is, every
course whose first digit of the course number is 1" by implementing division in
SQL.

Here we determine the number of 100-level courses and then we seek the employee numbers that are associated with all these courses in the GRADES table:

select name
from INSTRUCTORS,
     (select empno from GRADES
      where cno like 'cs1__'
      group by empno
      having count(distinct cno) =
             all(select count(distinct cno)
                 from COURSES
                 where cno like 'cs1__')) E
where INSTRUCTORS.empno = E.empno;

As expected, this will return the same result as the query discussed in Example 5.12.3.
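The counting approach to division can be tried on a tiny instance with Python's sqlite3 module. This is a sketch under stated assumptions: the rows, the instructor names, and the course cs120 are hypothetical, and since SQLite does not support the all quantifier, the comparison uses a scalar subquery instead, which is equivalent here because the subquery returns exactly one row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
create table INSTRUCTORS(empno text primary key, name text);
create table COURSES(cno text primary key);
create table GRADES(empno text, cno text);
insert into INSTRUCTORS values ('019', 'Instructor A'), ('023', 'Instructor B');
insert into COURSES values ('cs110'), ('cs120'), ('cs210');
insert into GRADES values ('019', 'cs110'), ('019', 'cs120'),
                          ('019', 'cs210'), ('023', 'cs110');
""")

# Keep the instructors whose number of distinct 100-level courses taught
# equals the total number of 100-level courses.
names = conn.execute("""
    select name
    from INSTRUCTORS,
         (select empno from GRADES
          where cno like 'cs1__'
          group by empno
          having count(distinct cno) =
                 (select count(distinct cno) from COURSES
                  where cno like 'cs1__')) E
    where INSTRUCTORS.empno = E.empno
""").fetchall()
print(names)
```

Instructor B taught only one of the two 100-level courses and is therefore not retrieved.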


5.17.1

The function decode is typically used with four arguments and has the syntax:

decode(value, search_value, result, default_value)

The value returned by this function is:

$$\mathrm{decode}(x, s, r, d) = \begin{cases} r & \text{if } x = s \\ d & \text{otherwise.} \end{cases}$$

Example 5.17.6 A course is defined as introductory if its first digit is one.

Using the decode function we can print a list of students and the courses they

took followed by an indication of their status using the query:

select stno,cno,

decode(substr(cno,3,1),'1','Introductory course','Advanced course')

from grades;

Note that the first digit of the course number is the third character of the cno

value; this digit is extracted by the function substr previously discussed. The

query yields the following result:

      STNO CNO   DECODE(SUBSTR(CNO,3
---------- ----- -------------------
      1011 cs110 Introductory course
      1011 cs110 Introductory course
      1011 cs210 Advanced course
      1011 cs240 Advanced course
      2415 cs240 Advanced course
      2661 cs110 Introductory course
      2661 cs210 Advanced course
      2661 cs310 Advanced course
      3442 cs410 Advanced course
      3566 cs110 Introductory course
      3566 cs210 Advanced course
      3566 cs240 Advanced course
      4022 cs110 Introductory course
      4022 cs210 Advanced course
      4022 cs240 Advanced course
      4022 cs310 Advanced course
      5544 cs110 Introductory course
      5544 cs240 Advanced course
      5571 cs210 Advanced course
      5571 cs240 Advanced course
      5571 cs410 Advanced course

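More generally, decode can be used with several pairs of search values and results:

decode(value, search_value, result, [search_value, result,] default_value)

This variant of decode is defined by:

$$\mathrm{decode}(x, s_1, r_1, \ldots, s_n, r_n, d) = \begin{cases} r_i & \text{if } x = s_i \text{ for some } 1 \le i \le n \\ d & \text{otherwise.} \end{cases}$$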
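The definition of the general form can be mirrored by a short Python function. This is only an emulation written for illustration, not ORACLE code; it assumes that the first matching search value wins and that a missing default yields a null value (None in Python).

```python
def decode(value, *args):
    """Emulate decode(x, s1, r1, ..., sn, rn [, d])."""
    pairs, default = args, None
    if len(args) % 2 == 1:            # an odd count means a default is present
        pairs, default = args[:-1], args[-1]
    # Walk through the (search value, result) pairs in order.
    for search, result in zip(pairs[0::2], pairs[1::2]):
        if value == search:
            return result
    return default


for cno in ("cs110", "cs240", "cs510"):
    # The first digit of the course number is its third character.
    print(cno, decode(cno[2], "1", "First year course",
                              "2", "Second year course",
                              "Special course"))
```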


Example 5.17.7 The following variant of the previous query will print First

year course, Second year course, etc., depending on the first digit of the course

number:

select stno,cno,

decode(substr(cno,3,1),'1','First year course',
                       '2','Second year course',
                       '3','Third year course',
                       '4','Fourth year course',
                       'Special course')

from grades;

      STNO CNO   DECODE(SUBSTR(CNO,
---------- ----- ------------------
      1011 cs110 First year course
      1011 cs110 First year course
      1011 cs210 Second year course
      1011 cs240 Second year course
      2415 cs240 Second year course
      2661 cs110 First year course
      2661 cs210 Second year course
      2661 cs310 Third year course
      3442 cs410 Fourth year course
      3566 cs110 First year course
      3566 cs210 Second year course
      3566 cs240 Second year course
      4022 cs110 First year course
      4022 cs210 Second year course
      4022 cs240 Second year course
      4022 cs310 Third year course
      5544 cs110 First year course
      5544 cs240 Second year course
      5571 cs210 Second year course
      5571 cs240 Second year course
      5571 cs410 Fourth year course

The case construct can be used in two formats; either as:

case value
  when search_value then result
  [when search_value then result]
  else default_value
end

or as:

case when condition then result
  [when condition then result]
  else default_value
end

In the first case the function returns the result that corresponds to the search

value that matches the first argument; in the second case, case returns the

result that corresponds to the first condition that is satisfied.
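Both formats of case are part of standard SQL, so they can be tried in SQLite as well. The sketch below uses Python's sqlite3 module on four hypothetical (cno, grade) rows; the passing thresholds of 60 and 70 are assumptions made only for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table GRADES(cno text, grade integer)")
conn.executemany("insert into GRADES values (?, ?)",
                 [("cs110", 40), ("cs110", 60), ("cs310", 75), ("cs410", 60)])

report = conn.execute("""
    select cno, grade,
           -- first format: compare one value with several search values
           case substr(cno, 3, 1)
             when '1' then 'First year course'
             when '3' then 'Third year course'
             else 'Other course'
           end,
           -- second format: the first satisfied condition determines the result
           case when substr(cno, 3, 1) in ('1', '2') and grade >= 60 then 'Passed'
                when substr(cno, 3, 1) in ('3', '4') and grade >= 70 then 'Passed'
                else 'Failed'
           end
    from GRADES
    order by cno, grade
""").fetchall()
for row in report:
    print(row)
```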

Example 5.17.8 Using case we can give an alternate solution to the query

solved in Example 5.17.7:

select stno,cno,

case substr(cno,3,1)

when '1' then 'First year course'
when '2' then 'Second year course'
when '3' then 'Third year course'
when '4' then 'Fourth year course'
else 'Special course'

end

from grades;

Example 5.17.9 Suppose that the minimal passing grade is 60 for the first

and second year courses and 70 for the third and fourth year courses. We wish

to print a report that prints Passed or Failed depending on the grade and

level of the course. This can be done with the following query:

select stno,cno, grade,

case when (substr(cno,3,1) in ('1','2') and grade >= 60) or
          (substr(cno,3,1) in ('3','4') and grade >= 70)
     then 'Passed'
     else 'Failed'
end
from grades;

      STNO CNO        GRADE CASEWH
---------- ----- ---------- ------
      1011 cs110         40 Failed
      2661 cs110         80 Passed
      3566 cs110         95 Passed
      5544 cs110        100 Passed
      1011 cs110         75 Passed
      4022 cs110         60 Passed
      3566 cs240        100 Passed
      5571 cs240         50 Failed
      2415 cs240        100 Passed
      3442 cs410         60 Failed
      5571 cs410         80 Passed
      1011 cs210         90 Passed
      2661 cs210         70 Passed
      3566 cs210         90 Passed
      5571 cs210         85 Passed
      4022 cs210         70 Passed
      5544 cs240         70 Passed
      1011 cs240         90 Passed
      4022 cs240         80 Passed
      2661 cs310        100 Passed
      4022 cs310         75 Passed

5.17.2

For analyzing complex data, we often wish to partition data into blocks and then

calculate subtotals for these blocks. For example, we may wish to analyze sales

data by geographical region, so we want to calculate values for New England, the

Midwest, the South, etc. Such analyses are facilitated by ORACLE's rollup

extension of group by.

Example 5.17.10 Suppose that we need to print a report summarizing the

number of grades given in every course by every instructor. We wish to print

subtotals for every course and then a general total for all courses. This can be

done in SQL using three subqueries (two of them containing a group by clause) as

follows:

select cno,empno,count(grade)

from grades

group by cno,empno

union

select cno,'',count(grade)
from grades
group by cno
union
select '','',count(grade)
from grades;

CNO   EMPNO       COUNT(GRADE)
----- ----------- ------------
cs110 019                    4
cs110 023                    2
cs110                        6
cs210 019                    5
cs210                        5
cs240 019                    3
cs240 056                    3
cs240                        6
cs310 234                    2
cs310                        2
cs410 234                    2
cs410                        2
                            21

It is clear that the execution of this query entails three scans of the table

GRADES followed by the computation of the unions. The result is sorted because

of the use of the union operation.

In SQL Plus we can replace the cumbersome query used in Example 5.17.10

by:

select cno,empno,count(grade)

from grades

group by rollup(cno,empno);

which produces exactly the same result. Note that after the number of grades

for the first two groups are reported in the first two detail rows a blank is printed

for the empno of the third row; this is the rollup way of indicating that this

row contains the subtotal number of grades for the course cs110. A new detail

row follows for cs210 and, since this course is taught only by the employee 019,

the next row contains a subtotal for this course, etc. Finally, the last row, with

blank for the first two columns is the total number of grades for all courses.

We conclude that the rollup extension of group by generates subtotals in

increasing order of aggregation until all expressions in the group by clause are

rolled up.
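On systems without the rollup extension, the same subtotals can be obtained with the union technique of Example 5.17.10. The following sketch does this with Python's sqlite3 module; it is an emulation under stated assumptions: the table G holds only the cno and empno columns of the chapter's GRADES instance, count(*) replaces count(grade), and subtotal rows are marked with null values (None in Python).

```python
import sqlite3

# Number of grades given per (course, instructor) in the chapter's instance.
per_pair = [("cs110", "019", 4), ("cs110", "023", 2), ("cs210", "019", 5),
            ("cs240", "019", 3), ("cs240", "056", 3),
            ("cs310", "234", 2), ("cs410", "234", 2)]
conn = sqlite3.connect(":memory:")
conn.execute("create table G(cno text, empno text)")
for cno, empno, n in per_pair:
    conn.executemany("insert into G values (?, ?)", [(cno, empno)] * n)

# Detail rows, then one subtotal per course, then the general total.
rollup = {(cno, empno): n for cno, empno, n in conn.execute("""
    select cno, empno, count(*) from G group by cno, empno
    union all
    select cno, null, count(*) from G group by cno
    union all
    select null, null, count(*) from G
""")}
print(rollup[("cs110", None)], rollup[(None, None)])
```

The dictionary contains 7 detail rows, 5 course subtotals, and 1 general total, which matches the 13 rows of the report shown above.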

Example 5.17.11 The next example uses three grouping attributes cno, empno,

stno:

select cno,empno,stno,count(grade)

from grades

group by rollup(cno,empno,stno)

CNO   EMPNO             STNO COUNT(GRADE)
----- ----------- ---------- ------------
cs110 019               1011            1
cs110 019               2661            1
cs110 019               3566            1
cs110 019               5544            1
cs110 019                               4
cs110 023               1011            1
cs110 023               4022            1
cs110 023                               2
cs110                                   6
cs210 019               1011            1
cs210 019               2661            1
cs210 019               3566            1
cs210 019               4022            1
cs210 019               5571            1
cs210 019                               5
cs210                                   5
cs240 019               2415            1
cs240 019               3566            1
cs240 019               5571            1
cs240 019                               3
cs240 056               1011            1
cs240 056               4022            1
cs240 056               5544            1
cs240 056                               3
cs240                                   6
cs310 234               2661            1
cs310 234               4022            1
cs310 234                               2
cs310                                   2
cs410 234               3442            1
cs410 234               5571            1
cs410 234                               2
cs410                                   2
                                       21

The order in which attributes are rolled up influences the result of the query

as the next example shows:

Example 5.17.12 Suppose that we invert the grouping attributes cno and

empno as in

select empno,cno, count(grade)

from grades

group by rollup(empno,cno);

EMPNO       CNO   COUNT(GRADE)
----------- ----- ------------
019         cs110            4
019         cs210            5
019         cs240            3
019                         12
023         cs110            2
023                          2
056         cs240            3
056                          3
234         cs310            2
234         cs410            2
234                          4
                            21

Note that this time the subtotals are computed for every employee, and then,

for all employees.

Partial rollups, that is, rollups that involve only a subset of the grouping

attributes, are always possible as shown in the next example.

Example 5.17.13 Suppose that we need to count the number of times a student takes a course and the number of course offerings a student took. This can

be achieved by:

select stno,cno,count(grade)
from grades
group by stno,rollup(cno);

      STNO CNO   COUNT(GRADE)
---------- ----- ------------
      1011 cs110            2
      1011 cs210            1
      1011 cs240            1
      1011                  4
      2415 cs240            1
      2415                  1
      2661 cs110            1
      2661 cs210            1
      2661 cs310            1
      2661                  3
      3442 cs410            1
      3442                  1
      3566 cs110            1
      3566 cs210            1
      3566 cs240            1
      3566                  3
      4022 cs110            1
      4022 cs210            1
      4022 cs240            1
      4022 cs310            1
      4022                  4
      5544 cs110            1
      5544 cs240            1
      5544                  2
      5571 cs210            1
      5571 cs240            1
      5571 cs410            1
      5571                  3

This shows that the student whose number is 1011 took four course offerings

and repeated cs110. Note that for a partial rollup no general total is produced.

The rollup extension is especially useful when there exists a natural order

on the attributes of a table, as is in the next example.

Example 5.17.14 Suppose that we have the table SALES that contains records

of sales in a chain of department stores that is present in several regions of the

country: the North East (NE), South East (SE), and Midwest (MW).

REGION     ST CITY               STORENO   SALESVOL
---------- -- --------------- ---------- ----------
NE         NY New York City           55       1000
NE         NY New York City           67        800
NE         NY Syracuse                90        600
NE         MA Worcester               41       1000
NE         MA Boston                  83        750
SE         FL Miami                   62        450
SE         FL Miami                   74        900
SE         GA Atlanta                 60        500
SE         GA Atlanta                 52       1100
SE         GA Augusta                 95        300
MW         OH Athens                  48        590
MW         KS Topeka                  33        860
MW         KS Lawrence                72        300
MW         KS Lawrence                09        700
MW         KS Wichita                 38        900

region, st, city, and storeno. To study the total sales in each state we can use

the following rollup query:

select region, state, sum(salesvol)

from sales

group by rollup(region,state);

REGION     ST SUM(SALESVOL)
---------- -- -------------
MW         KS          2760
MW         OH           590
MW                     3350
NE         MA          1750
NE         NY          2400
NE                     4150
SE         FL          1350
SE         GA          1900
SE                     3250
                      10750

sales in each city. This is accomplished by:

select region, state, city, sum(salesvol)

from sales

group by rollup(region, state, city)

REGION     ST CITY            SUM(SALESVOL)
---------- -- --------------- -------------
MW         KS Lawrence                 1000
MW         KS Topeka                    860
MW         KS Wichita                   900
MW         KS                          2760
MW         OH Athens                    590
MW         OH                           590
MW                                     3350
NE         MA Boston                    750
NE         MA Worcester                1000
NE         MA                          1750
NE         NY New York City            1800
NE         NY Syracuse                  600
NE         NY                          2400
NE                                     4150
SE         FL Miami                    1350
SE         FL                          1350
SE         GA Atlanta                  1600
SE         GA Augusta                   300
SE         GA                          1900
SE                                     3250

Another useful extension of group by is cube. The rollup extension summarizes at increasing levels of aggregation from left to right; in contrast, cube

summarizes at all possible levels of aggregation.
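The difference between the two extensions can be sketched in Python. This is a minimal illustration under stated assumptions, not a description of ORACLE's implementation: the function name cube_counts and the sample rows are hypothetical, and rows are counted instead of non-null grades.

```python
from collections import Counter
from itertools import combinations

def cube_counts(rows, attrs):
    """Count rows for every subset of attrs, as group by cube(attrs) would.

    An attribute that is aggregated away is shown as None.
    """
    n = len(attrs)
    # Every subset of attribute positions is one level of aggregation.
    subsets = [set(c) for k in range(n + 1) for c in combinations(range(n), k)]
    result = Counter()
    for row in rows:
        for kept in subsets:
            key = tuple(row[a] if i in kept else None
                        for i, a in enumerate(attrs))
            result[key] += 1
    return dict(result)

# Hypothetical sample data, not the GRADES table of the text.
rows = [
    {"empno": "019", "cno": "cs110"},
    {"empno": "019", "cno": "cs210"},
    {"empno": "023", "cno": "cs110"},
]

cube = cube_counts(rows, ["cno", "empno"])
```

For two attributes the cube contains the detail rows, the totals for each value of the first attribute, the totals for each value of the second attribute, and the grand total; a rollup would omit one of the two families of totals.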

Example 5.17.15 A full aggregation can be achieved by using cube as in:

select cno,empno,count(grade)

from grades

group by cube(cno,empno);

CNO   EMPNO       COUNT(GRADE)
----- ----------- ------------
cs110 019                    4
cs110 023                    2
cs110                        6
cs210 019                    5
cs210                        5
cs240 019                    3
cs240 056                    3
cs240                        6
cs310 234                    2
cs310                        2
cs410 234                    2
cs410                        2
      019                   12
      023                    2
      056                    3
      234                    4
                            21


result. For example, the query:

select empno,cno,count(grade)

from grades

group by cube(empno,cno);

will result in

EMPNO       CNO   COUNT(GRADE)
----------- ----- ------------
019         cs110            4
019         cs210            5
019         cs240            3
019                         12
023         cs110            2
023                          2
056         cs240            3
056                          3
234         cs310            2
234         cs410            2
234                          4
            cs110            6
            cs210            5
            cs240            6
            cs310            2
            cs410            2
                            21

The totals computed by either of these cubes are shown in Figure 5.4.

Partial cube aggregations include group by clauses of the form

group by A1, . . . , Ak, cube(B1, . . . , Bℓ)

and compute total values of an aggregate function for all groups that can be obtained for values of A1, . . . , Ak and all combinations of values of B1, . . . , Bℓ.

Example 5.17.16 The partial cube aggregation:

select cno,empno,stno,count(grade) from grades

group by cno,cube(empno,stno)

CNO   EMPNO       STNO       COUNT(GRADE)
----- ----------- ---------- ------------
cs110 019         1011                  1
cs110 019         2661                  1
cs110 019         3566                  1
cs110 019         5544                  1
cs110 019                               4
cs110 023         1011                  1
cs110 023         4022                  1
cs110 023                               2
cs110             1011                  2
cs110             2661                  1
cs110             3566                  1
cs110             4022                  1
cs110             5544                  1
cs110                                   6
cs210 019         1011                  1
cs210 019         2661                  1
cs210 019         3566                  1
cs210 019         4022                  1
cs210 019         5571                  1
cs210 019                               5
cs210             1011                  1
cs210             2661                  1
cs210             3566                  1
cs210             4022                  1
cs210             5571                  1
cs210                                   5
cs240 019         2415                  1
cs240 019         3566                  1
cs240 019         5571                  1
cs240 019                               3
cs240 056         1011                  1
cs240 056         4022                  1
cs240 056         5544                  1
cs240 056                               3
cs240             1011                  1
cs240             2415                  1
cs240             3566                  1
cs240             4022                  1
cs240             5544                  1
cs240             5571                  1
cs240                                   6
cs310 234         2661                  1
cs310 234         4022                  1
cs310 234                               2
cs310             2661                  1
cs310             4022                  1
cs310                                   2
cs410 234         3442                  1
cs410 234         5571                  1
cs410 234                               2
cs410             3442                  1
cs410             5571                  1
cs410                                   2

53 rows selected.

[Figure 5.4: Totals computed by the cube over cno and empno: totals for each course, totals for each employee, and the grand total.]


that serve to summarize other rows and, therefore, contain null components.

Namely, grouping(A) returns 1 for those A-components of rows that contain null values and 0 otherwise.

Example 5.17.17 Consider again the cube query discussed in Example 5.17.15.

The summarization query supplemented by the use of the function grouping:

select cno, empno, count(grade) as nogr,

grouping(cno) as c, grouping(empno) as e

from grades

group by cube(cno,empno)

CNO   EMPNO             NOGR          C          E
----- ----------- ---------- ---------- ----------
cs110 019                  4          0          0
cs110 023                  2          0          0
cs110                      6          0          1
cs210 019                  5          0          0
cs210                      5          0          1
cs240 019                  3          0          0
cs240 056                  3          0          0
cs240                      6          0          1
cs310 234                  2          0          0
cs310                      2          0          1
cs410 234                  2          0          0
cs410                      2          0          1
      019                 12          1          0
      023                  2          1          0
      056                  3          1          0
      234                  4          1          0
                          21          1          1

17 rows selected.

In turn, we can use the grouping values and the having clause to retain only

certain summary rows as in

select cno, empno, count(grade) as nogr,

grouping(cno) as c, grouping(empno) as e

from grades

group by cube(cno,empno)

having grouping(cno) = 1 or grouping(empno) = 1

CNO   EMPNO             NOGR          C          E
----- ----------- ---------- ---------- ----------
cs110                      6          0          1
cs210                      5          0          1
cs240                      6          0          1
cs310                      2          0          1
cs410                      2          0          1
      019                 12          1          0
      023                  2          1          0
      056                  3          1          0
      234                  4          1          0
                          21          1          1

5.18

to produce quite refined reports. These features reduce the need to use external

reporting tools and simplify statistical data analysis.

Analytical functions compute a value for each row of a query. These values

are, in turn, based on a set of rows that is computed for each row and may be

considered to appear in a sliding window. This set of rows is known as a window

and it is specified by the analytical clause, which is the parenthesized expression

that follows the reserved word over.

An example of the use of an analytic function (which we discuss in detail

in Example 5.18.2) is the following query that computes a list of students, the

number of courses they took, and their grade point average.

select STUDENTS.name, GA.noc as no_of_c, GA.gpa as gpa,

rank() over (partition by GA.noc

order by GA.gpa desc) as rank

from (select stno,

count(distinct cno) as noc,

avg(grade) as gpa

from GRADES

group by stno) GA, STUDENTS

where STUDENTS.stno = GA.stno

The function rank that we use in this query computes for each row a numerical rank starting from the content of the window.

The analytical clause used in the previous example indicates that the rows retrieved by the query are partitioned based on the value of the number of courses (noc) and then, in each group, the rows are ordered according to the values of the gpa attribute.

In general, the computation of the analytical clause is done after the computation of the from, where, group by, and having clauses.

Analytic functions are classified as shown in the table below:

FUNCTION CLASS          USAGE
----------------------  ------------------------------------------------
Ranking Functions       Calculating ranks, percentiles and n-tiles
Windowing Functions     Cumulative and moving averages
Reporting Functions     Calculating shares
Lag/Lead Functions      Finding a value in a row located a specified
                        number of rows from the current row
Statistical Functions   Linear regression and other statistics

1. computation of products, selections, grouping, and having clauses;

2. application of analytic functions to the resulting sets of rows;

3. processing of the final order by clauses.

The results of the first phase can be partitioned. Partitions are created after

the groups defined by the group by clauses and, thus, may use any aggregate

functions such as sum, count, etc.

For each row in a partition, a sliding window of data may be defined. The

window determines a sequence of rows that is used to perform calculations for

the current row. Window sizes can be specified as numbers of rows or can be

determined by intervals in a domain. Either end of a window or both ends can

move, depending on the definition of the window.

Each computation involving an analytic function is based on a current row.

This row serves as reference for the ends of the window.

5.18.1 Ranking Functions

SQL Plus contains the ranking functions rank() and dense_rank() that can be used to rank tuples in an order determined by certain attributes or expressions. Both functions generate ranks in either ascending or descending order,

but dense_rank() does not leave gaps in rank numbers when a tie occurs. The

default order is, as usual, ascending order.
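The difference between the two ranking functions can be made concrete with a small Python sketch. The function names mirror the SQL functions, but the implementation and the sample list of grades are assumptions made for this illustration.

```python
def rank(values, descending=False):
    """Rank with gaps: tied values share a rank and the next rank skips ahead."""
    ordered = sorted(values, reverse=descending)
    first_pos = {}
    for pos, v in enumerate(ordered, start=1):
        # The rank of a value is the position of its first occurrence,
        # that is, 1 plus the number of values strictly ahead of it.
        first_pos.setdefault(v, pos)
    return [(v, first_pos[v]) for v in ordered]

def dense_rank(values, descending=False):
    """Rank without gaps: consecutive distinct values get consecutive ranks."""
    ordered = sorted(values, reverse=descending)
    distinct = sorted(set(values), reverse=descending)
    position = {v: i for i, v in enumerate(distinct, start=1)}
    return [(v, position[v]) for v in ordered]

# Hypothetical sample: four tied top grades followed by two lower ones.
grades = [100, 100, 100, 100, 95, 90]
with_gaps = rank(grades, descending=True)
without_gaps = dense_rank(grades, descending=True)
```

In this sample the grade 95 receives rank 5 under rank() and rank 2 under dense_rank().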

Example 5.18.1 To rank the grade records based on the grade obtained in

any course we may write:

select stno, grade,

rank() over (order by grade)

from grades;

STNO            GRADE RANK
---------- ---------- ----
1011               40    1
5571               50    2
4022               60    3
3442               60    3
2661               70    5
5544               70    5
4022               70    5
1011               75    8
4022               75    8
2661               80   10
5571               80   10
4022               80   10
5571               85   13
1011               90   14
1011               90   14
3566               90   14
3566               95   17
5544              100   18
2415              100   18
2661              100   18
3566              100   18

where the highest ranking is attributed to the grade record that involves the

lowest grade. To reverse the ranking we write:

select stno, grade,

rank() over (order by grade desc)

from grades;

which yields:

STNO            GRADE RANK
---------- ---------- ----
5544              100    1
3566              100    1
2415              100    1
2661              100    1
3566               95    5
1011               90    6
1011               90    6
3566               90    6
5571               85    9
2661               80   10
5571               80   10
4022               80   10
1011               75   13
4022               75   13
2661               70   15
5544               70   15
4022               70   15
4022               60   18
3442               60   18
5571               50   20
1011               40   21

Note that the first four grade records are tied for the first place; therefore, the record that follows the tied records has rank 5. With dense_rank(), all four tied records will have rank 1 and the record that follows will have rank 2. This can be achieved by writing:

select stno, grade,

dense_rank() over (order by grade desc) as den_rank

from grades;

STNO            GRADE DEN_RANK
---------- ---------- --------
5544              100        1
3566              100        1
2415              100        1
2661              100        1
3566               95        2
1011               90        3
1011               90        3
3566               90        3
5571               85        4
2661               80        5
5571               80        5
4022               80        5
1011               75        6
4022               75        6
2661               70        7
5544               70        7
4022               70        7
4022               60        8
3442               60        8
5571               50        9
1011               40       10

Example 5.18.2 To rank the students in order of the number of courses they

have taken we could write:

select STUDENTS.name, GA.noc as no_of_courses,

dense_rank() over (order by GA.noc desc) as den_rank

from (select stno, count(distinct cno) as noc

from grades

group by stno) GA, STUDENTS

where STUDENTS.stno = GA.stno;

NAME                 NO_OF_COURSES DEN_RANK
-------------------- ------------- --------
Prior Lorraine                   4        1
Edwards P. David                 3        2
Mixon Leatha                     3        2
Pierce Richard                   3        2
Lewis Jerry                      3        2
Rawlings Jerry                   2        3
Grogan A. Mary                   1        4
Novak Roland                     1        4

If we wish to rank the students based on the number of courses and, then,

at an equal number of courses, to rank them in the order of the grade point

average, we could write the following query:

select STUDENTS.name, GA.noc as no_of_c, GA.gpa as gpa,

rank() over (partition by GA.noc

order by GA.gpa desc) as rank

from (select stno,

count(distinct cno) as noc,

avg(grade) as gpa

from GRADES

group by stno) GA, STUDENTS

where STUDENTS.stno = GA.stno

The partition by option establishes groups of equal GA.noc value, and then it

ranks the record in each such group using the order by clause. The result of

this query is:

NAME                    NO_OF_C        GPA       RANK
-------------------- ---------- ---------- ----------
Grogan A. Mary                1        100          1
Novak Roland                  1         60          2
Rawlings Jerry                2         85          1
Pierce Richard                3         95          1
Mixon Leatha                  3      83.33          2
Edwards P. David              3      73.75          3
Lewis Jerry                   3      71.66          4
Prior Lorraine                4      71.25          1

8 rows selected.

In general, the expression in the partition by clause divides the set of rows

that results from the query in groups and the rank() function operates within

these groups; in other words, rank() is reset when the defining expression of the

group changes. The order by clause attached to the rank specifies the ranking

criterion and the order of the rows in each group.
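The way rank() restarts in every partition can be sketched in Python. The function rank_within_partitions is a hypothetical helper written for this illustration; the sample tuples are taken from the first five rows of the result shown for Example 5.18.2.

```python
from itertools import groupby

def rank_within_partitions(rows, partition_key, order_key):
    """Emulate rank() over (partition by ... order by ... desc)."""
    out = []
    # groupby needs its input sorted on the partitioning key.
    rows = sorted(rows, key=partition_key)
    for _, group in groupby(rows, key=partition_key):
        group = sorted(group, key=order_key, reverse=True)
        for row in group:
            # The rank restarts at 1 in every partition; ties share a rank.
            r = 1 + sum(1 for other in group if order_key(other) > order_key(row))
            out.append((row, r))
    return out

# (name, number of courses, gpa)
students = [
    ("Grogan A. Mary", 1, 100),
    ("Novak Roland", 1, 60),
    ("Rawlings Jerry", 2, 85),
    ("Pierce Richard", 3, 95),
    ("Mixon Leatha", 3, 83.33),
]

ranked = rank_within_partitions(students, lambda s: s[1], lambda s: s[2])
ranks_by_name = {row[0]: r for row, r in ranked}
```

Each group of students with the same number of courses gets its own ranking that starts at 1.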

5.18.2 Top-n Queries

Top-n queries ask for the n largest or smallest values of a column. Such queries are solved in ORACLE using the pseudo-attribute ROWNUM which assigns a value starting with 1 to each of the rows returned by a subquery. Thus, a top-n query in SQL Plus requires the following elements:

1. a subquery containing the order by clause that ensures that the rows

retrieved by the subquery are placed in the proper order;

2. the main query that includes the ROWNUM pseudo-attribute and may include

a where clause to specify the number of returned rows.

Example 5.18.3 To retrieve the top three students in the order of their grade

point averages we write:

select ROWNUM as rank, name, avgg from

(select STUDENTS.stno, STUDENTS.name, avg(grade) as avgg

from STUDENTS, GRADES

where STUDENTS.stno = GRADES.stno

group by STUDENTS.stno, STUDENTS.name

order by avg(grade) desc)

where ROWNUM <= 3

RANK   NAME            AVGG
------ --------------- -------
     1 Grogan A. Mary      100
     2 Pierce Richard       95
     3 Rawlings Jerry       85

in the subquery. This can be achieved by either replacing desc with asc, or by

omitting desc altogether (since the default is asc). Thus, the phrase:

select ROWNUM as rank, name, avgg from

(select STUDENTS.stno, STUDENTS.name, avg(grade) as avgg

from STUDENTS, GRADES

where STUDENTS.stno = GRADES.stno

group by STUDENTS.stno, STUDENTS.name

order by avg(grade))

where ROWNUM <= 3;

will yield:

RANK NAME            AVGG
---- --------------- -----
   1 Novak Roland       60
   2 Prior Lorraine  71.25
   3 Lewis Jerry     71.67

Example 5.18.4 Ties between rows may eliminate rows that we would expect to see in results of our queries. The next query

select STUDENTS.stno, STUDENTS.name,

count(distinct cno) as noc

from STUDENTS, GRADES

where STUDENTS.stno = GRADES.stno

group by STUDENTS.stno, STUDENTS.name

order by count(distinct cno) desc;

STNO       NAME                 NOC
---------- -------------------- ---
4022       Prior Lorraine         4
1011       Edwards P. David       3
2661       Mixon Leatha           3
3566       Pierce Richard         3
5571       Lewis Jerry            3
5544       Rawlings Jerry         2
2415       Grogan A. Mary         1
3442       Novak Roland           1

To retrieve the first four students among the ones who took the largest

number of courses we write:

select ROWNUM as rank, name, noc

from (select STUDENTS.stno, STUDENTS.name,

count(distinct cno) as noc

from STUDENTS, GRADES

where STUDENTS.stno = GRADES.stno

group by STUDENTS.stno, STUDENTS.name

order by count(distinct cno) desc)

where ROWNUM <= 4

      RANK NAME                 NOC
---------- -------------------- ---
         1 Prior Lorraine         4
         2 Edwards P. David       3
         3 Mixon Leatha           3
         4 Pierce Richard         3
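The loss of a tied row can be reproduced with a short Python sketch. Both functions are hypothetical helpers written for this illustration; top_n imitates the ROWNUM cut, while top_n_with_ties shows one possible alternative that keeps every row tied with the n-th one. The data are the (name, noc) pairs listed in this example.

```python
def top_n(rows, key, n):
    """ROWNUM-style cut: sort in descending order, then keep the first n rows."""
    return sorted(rows, key=key, reverse=True)[:n]

def top_n_with_ties(rows, key, n):
    """Keep the first n rows and every further row tied with the n-th (n >= 1)."""
    ordered = sorted(rows, key=key, reverse=True)
    if len(ordered) <= n:
        return ordered
    cutoff = key(ordered[n - 1])
    return [r for r in ordered if key(r) >= cutoff]

students = [
    ("Prior Lorraine", 4), ("Edwards P. David", 3), ("Mixon Leatha", 3),
    ("Pierce Richard", 3), ("Lewis Jerry", 3), ("Rawlings Jerry", 2),
    ("Grogan A. Mary", 1), ("Novak Roland", 1),
]

cut = top_n(students, lambda s: s[1], 4)
kept = top_n_with_ties(students, lambda s: s[1], 4)
```

The plain cut returns four rows and drops Lewis Jerry, although that student took as many courses as the three students ranked second to fourth; the variant that respects ties returns five rows.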

A more complicated example involves using two subquery rankings.

Example 5.18.5 Suppose that we need to find, as above, the top three students; in addition, we need to find for each of these students their ranking from

the point of view of the number of courses they took. This can be done using

the query:

select ROWNUM as gr_rank, name, c_rank from
  (select name, avgg, ROWNUM as c_rank from
    (select name, avg(grade) as avgg, count(distinct cno) as nc
     from STUDENTS S, GRADES G
     where S.stno = G.stno
     group by S.stno,S.name
     order by nc desc)
   order by avgg desc)
where ROWNUM <= 3

GR_RANK NAME           C_RANK
------- -------------- ------
      1 Grogan A. Mary      7
      2 Pierce Richard      4
      3 Rawlings Jerry      6

5.18.3

Windowing functions are used in SQL Plus to compute cumulative, moving, and

other aggregate functions applied to a set of tuples called a window. The size

and shape of the window is always defined relative to a row in a block; this

reference row is called the current row.

Aggregate functions that can be used include sum, avg, min, max, statistical functions (discussed in Section 5.19), as well as two special functions, first_value and last_value, that return the first and last values in a window.

Example 5.18.6 To compute the evolution of the grade average for each student as he or she advances towards graduation, we can write a query that returns

the cumulative average for each student for the sequence of semesters when the

student is active:

select stno, year, sem,

avg(grade) over (partition by stno

order by year, sem desc

rows unbounded preceding) as ag

from grades

order by stno, year, sem desc;

STNO             YEAR SEM            AG
---------- ---------- ------ ----------
1011             2002 FALL           40
1011             2003 SPRING      57.50
1011             2003 FALL        68.33
1011             2004 SPRING      73.75
2415             2003 SPRING        100
2661             2002 FALL           80
2661             2003 FALL           75
2661             2004 SPRING      83.33
3442             2003 SPRING         60
3566             2002 FALL           95
3566             2003 SPRING       97.5
3566             2003 FALL           95
4022             2003 SPRING         60
4022             2004 SPRING         65
4022             2004 SPRING         70
4022             2004 SPRING      71.25
5544             2002 FALL          100
5544             2004 SPRING         85
5571             2003 SPRING         50
5571             2003 SPRING         65
5571             2004 SPRING      71.67

The words unbounded preceding mean that the window over which we compute

the grade average extends to all rows that involve the same student and precede

the current row.
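The growing window can be imitated for a single partition with a few lines of Python. The function cumulative_average is a hypothetical helper; the four grades are an assumption inferred from the AG values listed for student 1011 (40, 57.50, 68.33, 73.75), placed in window order.

```python
def cumulative_average(values):
    """Average over the window 'rows unbounded preceding' for each current row."""
    out, total = [], 0
    for count, v in enumerate(values, start=1):
        # The window for the current row holds the first `count` values.
        total += v
        out.append(total / count)
    return out

# Grades of one student, in the order of the window.
grades_1011 = [40, 75, 90, 90]
ag = cumulative_average(grades_1011)
```

Each entry of ag is the average of all grades of the partition up to and including the current row, which is what the window frame of the query requests.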

The syntax of the windowing functions is:

aggregate function (value expression | *)
   over ([partition by value expression {, value expression}]
         order by value expression [collate clause]
                  [asc | desc] [nulls first | nulls last]
               {, value expression [collate clause]
                  [asc | desc] [nulls first | nulls last]}
         [rows | range]
         [[unbounded preceding | value expression preceding] |
          between [unbounded preceding | value expression preceding]
          and ⟨current row | value expression following⟩])

5.19 Statistics in SQL

discuss in this section. These functions are incorporated in the newest SQL standard, SQL2003.

5.19.1

Population and sample variance can be computed using the functions var_pop and var_samp, respectively. Both functions take an attribute as argument and apply to the remaining non-null values. If the sequence of values of an attribute A is (x_1, . . . , x_n), then the population variance is:

\[
\mathrm{var\_pop}(A) = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n}
 = \frac{n \sum_{i=1}^{n} x_i^2 - \left(\sum_{i=1}^{n} x_i\right)^2}{n^2},
\]

and the sample variance is:

\[
\mathrm{var\_samp}(A) = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n-1}
 = \frac{n \sum_{i=1}^{n} x_i^2 - \left(\sum_{i=1}^{n} x_i\right)^2}{n(n-1)},
\]

where \(\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}\). As shown in statistics, the sample variance is an unbiased estimator of the theoretical variance.
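The two closed forms can be checked numerically. The Python sketch below is an illustration only: the function names mirror the SQL functions, the sample consists of four grades (40, 75, 90, 90) chosen for this illustration, and the results are compared with the pvariance and variance functions of Python's standard statistics module.

```python
from statistics import pvariance, variance

def var_pop(xs):
    """Population variance: (n * sum(x^2) - (sum x)^2) / n^2."""
    n = len(xs)
    return (n * sum(x * x for x in xs) - sum(xs) ** 2) / n ** 2

def var_samp(xs):
    """Sample variance: (n * sum(x^2) - (sum x)^2) / (n * (n - 1))."""
    n = len(xs)
    if n < 2:
        return None  # a single value has no sample variance
    return (n * sum(x * x for x in xs) - sum(xs) ** 2) / (n * (n - 1))

grades = [40, 75, 90, 90]
vp = var_pop(grades)
vs = var_samp(grades)

# Cross-check against the standard library.
agrees = abs(vp - pvariance(grades)) < 1e-9 and abs(vs - variance(grades)) < 1e-9
```

For this sample the population variance is 417.1875 and the sample variance is 556.25; the two differ by the factor n/(n-1) = 4/3.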

Example 5.19.1 To determine the population variance for the grade population of each student we group the records of GRADES on the student number

and then compute the population variance for each group. This is done by the

following select phrase:

select stno, var_pop(grade)

from GRADES

group by stno;

which returns:

STNO       VAR_POP(GRADE)
---------- --------------
1011               417.18
2415                    0
2661               155.55
3442                    0
3566                16.66
4022                54.68
5544                  225
5571               238.88

select stno, var_samp(grade)

from GRADES

group by stno;

which yields:

STNO       VAR_SAMP(GRADE)
---------- ---------------
1011                556.25
2415
2661                233.33
3442
3566                    25
4022                 72.91
5544                   450
5571                358.33

8 rows selected.

To compute the population variance grade over the entire GRADES table we

write:


select var_pop(grade)

from GRADES;

which gives:

VAR_POP(GRADE)
--------------
    275.283447

A similar select

select var_samp(grade)

from GRADES;

VAR_SAMP(GRADE)
---------------
     289.047619

If the set of values of the sample contains one value, then the function var_samp returns a null value. This is the case in the query:

select var_samp(grade)
from GRADES
where stno = 1011 and cno = 'cs110'
and year = 1999;

which yields:

VAR_SAMP(GRADE)

---------------

population contains a single value; otherwise, variance returns the sample

variance. For instance, the query

select variance(grade)
from GRADES
where stno = 1011 and cno = 'cs110'
and year = 1999;

returns:

VARIANCE(GRADE)
---------------
              0

The population standard deviation and the sample standard deviation, which are the square roots of the population and the sample variance, respectively, can be computed using the functions stddev_pop and stddev_samp, respectively.

Example 5.19.2 To compute the population standard deviation of the set of values of the grade for each student we write:

select stno, stddev_pop(grade)

from GRADES

group by stno;

STNO       STDDEV_POP(GRADE)
---------- -----------------
1011                   20.42
2415                       0
2661                   12.47
3442                       0
3566                    4.08
4022                    7.39
5544                      15
5571                   15.45

8 rows selected.

select stno, stddev_samp(grade)

from GRADES

group by stno;

which generates:

STNO       STDDEV_SAMP(GRADE)
---------- ------------------
1011                    23.58
2415
2661                    15.27
3442
3566                        5
4022                     8.53
5544                    21.21
5571                    18.92

8 rows selected.

The population and the sample covariances between the values that appear under the attributes T.A and S.B are computed using the functions covar_pop and covar_samp, respectively, as in the following select phrases:

select covar_pop(T.A,S.B) from T,S where T.C = S.D;

select covar_samp(T.A,S.B) from T,S where T.C = S.D;

Example 5.19.3 The table SSTUDY, whose creation was described in Appendix B, records the number of hours slept during three successive nights by a group of students. To determine the population covariance between the average number of hours slept and the grade point average of the students we write:

select covar_pop(g.avggrade, s.avghours)

from (select stno, avg(grade) as avggrade

from GRADES

group by stno) g,

(select stno, avg(no_hours) as avghours

from SSTUDY

group by stno) s

where g.stno = s.stno;

COVAR_POP(G.AVGGRADE,S.AVGHOURS)
--------------------------------
                      11.2673611

select covar_samp(g.avggrade, s.avghours)

from (select stno, avg(grade) as avggrade

from GRADES

group by stno) g,

(select stno, avg(no_hours) as avghours

from SSTUDY

group by stno) s

where g.stno = s.stno;

COVAR_SAMP(G.AVGGRADE,S.AVGHOURS)
---------------------------------
                       12.8769841

Example 5.19.4 The correlation coefficient between the grade point average

and the average number of hours slept is computed by:

select corr(g.avggrade, s.avghours)

from (select stno, avg(grade) as avggrade

from GRADES

group by stno) g,

(select stno, avg(no_hours) as avghours

from SSTUDY

group by stno) s

where g.stno = s.stno;

CORR(G.AVGGRADE,S.AVGHOURS)
---------------------------
                 .961293724

5.19.2 Linear Regression

link that exists between input and output data of an experiment starting from

a sequence of inputs and the corresponding observations of the outputs. If we

attempt to find this link as a linear function, then we apply linear regression.

Suppose that the input data is x_1, . . . , x_n and the corresponding output sequence is y_1, . . . , y_n and we seek to determine the linear function f(x) = ax + b such that the values y_i are as close as possible to a x_i + b for 1 ≤ i ≤ n. This is achieved by minimizing the total square error given by:

\[
E = \sum_{i=1}^{n} (y_i - a x_i - b)^2 .
\]

It is possible to show that the minimum of E is achieved when:

\[
a = \frac{n \sum x_i y_i - \sum x_i \sum y_i}{n \sum x_i^2 - \left(\sum x_i\right)^2},
\qquad
b = \frac{\sum y_i \sum x_i^2 - \sum x_i \sum x_i y_i}{n \sum x_i^2 - \left(\sum x_i\right)^2}.
\]

Thus, we obtain the regression line y = ax + b, where a is the slope and b is the intercept. These numbers are computed by the functions regr_slope and regr_intercept, respectively. Both take two arguments: the expression that gives the y-sequence, followed by the expression that gives the x-sequence. The quality of the regression line obtained can be evaluated using the goodness of fit regr_r2, which takes the same arguments as the functions mentioned above.
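The closed-form expressions for the slope and the intercept translate directly into code. The Python sketch below is an illustration under stated assumptions: the function name linear_regression and the sample points, which lie exactly on the line y = 2x + 1, are hypothetical and are not related to the GRADES or SSTUDY tables.

```python
def linear_regression(xs, ys):
    """Return (a, b) of the least-squares line y = a*x + b."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    # Common denominator n * sum(x^2) - (sum x)^2; it is zero when all
    # x values coincide, in which case no regression line exists.
    denom = n * sxx - sx ** 2
    a = (n * sxy - sx * sy) / denom
    b = (sy * sxx - sx * sxy) / denom
    return a, b

xs = [1, 2, 3, 4]
ys = [3, 5, 7, 9]          # exactly y = 2x + 1
a, b = linear_regression(xs, ys)
```

Because the sample points are collinear, the computed slope and intercept recover the line exactly; with noisy data the same formulas give the line that minimizes the total square error.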

Example 5.19.5 To compute the regression parameters for the sequences of

average grades and the sequence of average hours of nightly sleep for all students

we write:

select regr_count(g.avggrade, s.avghours) as rc,

regr_avgx(g.avggrade, s.avghours) as avgx,

regr_avgy(g.avggrade, s.avghours) as avgy,

regr_slope(g.avggrade, s.avghours) as slope,

regr_intercept(g.avggrade, s.avghours) as interc,

regr_r2(g.avggrade, s.avghours) as gof

from (select stno, avg(grade) as avggrade

from GRADES

group by stno) g,

(select stno, avg(no_hours) as avghours

from SSTUDY

group by stno) s

where g.stno = s.stno;

        RC       AVGX       AVGY      SLOPE     INTERC        GOF
---------- ---------- ---------- ---------- ---------- ----------
         8 7.08333333         80 12.7755906 -10.493766 .924085625

[Figure 5.5: The graph G = (V, E) of Example 5.20.1.]

5.20

Graphs represent binary relations on sets, in the sense of the following definition. A graph is defined as a pair of sets G = (V, E), where V is the set of vertices of G and E ⊆ V × V is the set of edges of G. Clearly, E is a binary relation on V.

If (u, v) ∈ E, we say that u is the origin of the edge (u, v) and v is the destination of the same edge. A graph can be drawn by representing the vertices by points and the edges by arrows. Namely, if (u, v) is an edge, we draw an arrow that begins at u and ends at v.

Example 5.20.1 Consider the graph G = (V, E), where V = {0, 1, 2, 3, 4, 5, 6}

and E = {(0, 1), (0, 3), (1, 2), (2, 5), (2, 6), (3, 4), (3, 6), (4, 5), (5, 6)}. This graph

is drawn in Figure 5.5.

Graphs can be represented by tables that have the heading origin destination. Each edge (u, v) corresponds to a pair in the table. Clearly, for any graph

the corresponding table contains the same information as the graph.

Example 5.20.2 The graph of Example 5.20.1 is represented by the table:

GRAPH
origin  destination
------  -----------
0       1
0       3
1       2
2       5
2       6
3       4
3       6
4       5
5       6

A path in the graph G = (V, E) that joins v_0 to v_n is a sequence of vertices (v_0, v_1, . . . , v_n) such that (v_i, v_{i+1}) is an edge in G for 0 ≤ i ≤ n − 1. We refer to v_0 as the origin of the path and to v_n as the destination of the path. The number n is the length of the path. A path that begins and ends in the same vertex is a cycle or a loop. If a graph has no cycles, then we say that the graph is acyclic. Note that the graph defined in Example 5.20.1 is acyclic.

We write (u, v) ∈ E⁺ if there exists a path of length at least 1 that has u as its origin and v as its destination. The relation E⁺ is the transitive closure of the relation E.

Example 5.20.3 The transitive closure of the relation E defined by the graph

of Example 5.20.1 consists of the following pairs:

(0, 1), (0, 2), (0, 3), (0, 4), (0, 5), (0, 6),

(1, 2), (1, 5), (1, 6), (2, 5), (2, 6), (3, 4),

(3, 5), (3, 6), (4, 5), (4, 6), (5, 6)

as can be easily seen by inspecting Figure 5.5.
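The closure can also be computed mechanically. The following Python sketch is a naive fixed-point iteration written for this illustration (the function name transitive_closure is an assumption, and this is not the algorithm of Chapter 8); it is applied to the edge set E of Example 5.20.1.

```python
def transitive_closure(edges):
    """Return the set of pairs (u, v) joined by a path of length at least 1."""
    closure = set(edges)
    changed = True
    while changed:
        changed = False
        for (u, v) in list(closure):
            for (w, x) in list(closure):
                # Chain a path u -> v with a path v -> x.
                if v == w and (u, x) not in closure:
                    closure.add((u, x))
                    changed = True
    return closure

E = {(0, 1), (0, 3), (1, 2), (2, 5), (2, 6),
     (3, 4), (3, 6), (4, 5), (5, 6)}
E_plus = transitive_closure(E)
```

Every edge of E belongs to E_plus, since an edge is a path of length 1; in particular the pair (5, 6) is in the closure. The loop stops when a full pass adds no new pair, so the sketch also terminates on graphs that contain cycles.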

Of course, the transitive closure E⁺ of a relation E ⊆ V × V is itself a relation on the set V and, therefore, it can also be represented as a table. Namely, the tabular representation of E⁺ is:

GRAPHPLUS
origin  destination
------  -----------
0       1
0       2
0       3
0       4
0       5
0       6
1       2
1       5
1       6
2       5
2       6
3       4
3       5
3       6
4       5
4       6
5       6

through the operations of the relational algebra (see [Maier, 1983], for example). However, SQL Plus allows us to compute the table GRAPHPLUS when the underlying graph G is acyclic.

This is accomplished using the clause connect by of select. This clause

establishes links between tuples of a table and can be used to retrieve the vertices

of a graph that can be accessed by paths that start from a certain vertex. Its

syntax is defined by:

[start with condition ] connect by condition

For example, the chaining condition of the nodes in a path of a graph is

described by connect by origin = prior destination. Thus, to obtain the

set of vertices that are accessible from the vertex 4 in the graph shown in

Figure 5.5 we write:

select distinct destination from graph

start with origin = 4

connect by origin = prior destination;

DESTINATION
-----------
          5
          6

The connect by clause cannot be applied to graphs that contain cycles. If this is the case, ORACLE detects the existence of a loop and returns an error message.

Example 5.20.4 Suppose that we add an edge to the graph shown in Figure 5.5 that creates a loop, for example, the edge (5, 0), which creates the loop

(0, 3, 4, 5, 0). This can be done by

insert into graph(origin, destination)

values(5,0);

select distinct destination from graph

start with origin = 0

connect by origin = prior destination;

ORA-01436: CONNECT BY loop in user data

In Chapter 8 we discuss an algorithm that can be used to compute the transitive closure for arbitrary graphs (with or without loops).

If a data set has a hierarchical structure, then it can be described by a rooted

tree, that is, by a special acyclic graph G = (V, E) that has a distinguished

vertex v0 called root such that for every other vertex v of the graph there is

a unique path that joins v0 to v. It is not difficult to show that for any two

distinct vertices u, v of a rooted tree there exists at most one path that joins u

to v. If such a path exists then we say that v is a descendant of u.

Example 5.20.5 The option connect by of SQL Plus can be used to find the

descendants of a vertex in a rooted tree. Consider, for example the rooted tree

shown in Figure 5.6. The table that represents this tree is created by the SQL

script included in Appendix F and has the form:

TREE
origin  destination
------  -----------
0       1
0       2
0       3
1       4
1       5
2       6
2       7
2       8
3       9
3       10
7       11
7       12

[Figure 5.6: The rooted tree of Example 5.20.5.]

To retrieve all the descendants of a vertex (in this case, of vertex 2) we write:

select distinct destination as DESCENDANTS from tree

start with origin = 2

connect by origin = prior destination;

This returns:

DESCENDANTS
-----------
          6
          7
          8
         11
         12

On the other hand, to retrieve the ancestors of a vertex, that is all vertices

that occur between the root of the tree and a vertex we write:

select distinct origin as ANCESTORS from tree

start with destination = 12

connect by destination = prior origin;

ANCESTORS
----------
         0
         2
         7

The reserved word prior can be used on either side of the equality sign. For

example, the last query of Example 5.20.5 can be written as:

select distinct origin ANCESTORS from tree

start with destination = 12

connect by prior origin = destination;

The pseudo-attribute LEVEL can be used to indicate the length of the path

that begins at the starting vertex of the query and ends with the vertex currently

retrieved.

Example 5.20.6 The following query adds the pseudo-attribute LEVEL to the

query of Example 5.20.5 that retrieves the descendants of the vertex 2:

select distinct level, destination as DESCENDANTS from tree

start with origin = 2

connect by origin = prior destination

     LEVEL DESCENDANTS
---------- -----------
         1           6
         1           7
         1           8
         2          11
         2          12

Observe that the immediate descendants are at level 1 and the next level of

descendants at level 2.
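The traversal performed by connect by, together with the value of LEVEL, can be imitated by a breadth-first search. The Python sketch below is an illustration only: the function name descendants_with_level is hypothetical, the edge list is the TREE table of Example 5.20.5, and the sketch assumes an acyclic graph (on a graph with a loop it would not terminate).

```python
from collections import deque

# The TREE table as (origin, destination) pairs.
TREE = [(0, 1), (0, 2), (0, 3), (1, 4), (1, 5), (2, 6), (2, 7), (2, 8),
        (3, 9), (3, 10), (7, 11), (7, 12)]

def descendants_with_level(edges, start):
    """Return (level, vertex) pairs for all descendants of start."""
    children = {}
    for origin, destination in edges:
        children.setdefault(origin, []).append(destination)
    out = []
    # Immediate descendants are at level 1, as in the query above.
    queue = deque((child, 1) for child in children.get(start, []))
    while queue:
        vertex, level = queue.popleft()
        out.append((level, vertex))
        queue.extend((c, level + 1) for c in children.get(vertex, []))
    return out

result = descendants_with_level(TREE, 2)
```

Here result lists the vertices 6, 7, and 8 at level 1 and the vertices 11 and 12 at level 2.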

If we retrieve the ancestor of a node as in

select distinct level, origin as ANCESTORS from tree

start with destination = 12

connect by destination = prior origin;

the values of LEVEL reflect the distance (in number of edges) between the vertex and its various ancestors:

     LEVEL  ANCESTORS
---------- ----------
         1          7
         2          2
         3          0

Example 5.20.7 Combining the string function lpad and the pseudo-attribute

LEVEL allows us to display the entire tree using indentations. The query:

select level,lpad('*',2 * level -1)||destination as vertex from tree

start with origin = 0

connect by prior destination = origin;

     LEVEL VERTEX
---------- ---------
         1 *1
         2   *4
         2   *5
         1 *2
         2   *6
         2   *7
         3     *11
         3     *12
         2   *8
         1 *3
         2   *9
         2   *10

An alternate way of obtaining a description of a tree that shows the paths that can be used to reach vertices uses the pseudo-attribute CONNECT_BY_ISLEAF and the function SYS_CONNECT_BY_PATH. CONNECT_BY_ISLEAF returns 1 if the current vertex (in our case, the destination of the edge) is a leaf and 0 otherwise. For every edge of the path that joins the starting vertex to the current node, SYS_CONNECT_BY_PATH computes a string specified by its first argument; entries for successive edges are separated by the string specified by its second argument.

Example 5.20.8 The query:

select level,destination,

CONNECT_BY_ISLEAF "IsLeaf?",

SYS_CONNECT_BY_PATH('('||origin||','||destination||')','+') "Path"

from tree

start with origin = 0

connect by prior destination = origin

order by level

will return:

LEVEL DE IsLeaf? Path
----- -- ------- -------------------
    1  1       0 +(0,1)
    1  2       0 +(0,2)
    1  3       0 +(0,3)
    2  4       1 +(0,1)+(1,4)
    2  7       0 +(0,2)+(2,7)
    2  6       1 +(0,2)+(2,6)
    2 10       1 +(0,3)+(3,10)
    2  9       1 +(0,3)+(3,9)
    2  8       1 +(0,2)+(2,8)
    2  5       1 +(0,1)+(1,5)
    3 11       1 +(0,2)+(2,7)+(7,11)
    3 12       1 +(0,2)+(2,7)+(7,12)

5.21 Updates in SQL

There are three constructs in SQL that allow us to update the tables of a

relational database: update, insert, and delete.

The update construct modifies components of tuples. It applies to all tuples

of the specified table unless limited by a where clause.

Example 5.21.1 Recall the table EMPHIST introduced in Example 3.3.5. The script ced.sql, available in Appendix C, creates and populates the tables discussed in that example.

To give all current employees a 10% raise, we apply the following update

phrase:

update EMPHIST

set salary = 1.1* salary

where term_date is null;

The syntax of update is:

update table_name [corr_name]
set column = <expression | null> {, column = <expression | null>}
[where condition]
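The effect of the where clause on an update can be checked on a small scale. The following sketch runs the update of Example 5.21.1 in SQLite against a simplified EMPHIST table; the three columns and the sample rows are hypothetical (the actual table is defined in Example 3.3.5), and only the update statement itself is taken from the text.

```python
import sqlite3


def give_raise():
    """Apply a 10% raise to current employees; return (rows changed, table contents)."""
    con = sqlite3.connect(":memory:")
    # Hypothetical, simplified EMPHIST: only the columns the update touches.
    con.execute("create table EMPHIST(empno text, salary real, term_date text)")
    con.executemany(
        "insert into EMPHIST values (?, ?, ?)",
        [("019", 50000.0, None),           # still employed
         ("023", 40000.0, "2003-06-30"),   # employment terminated
         ("056", 60000.0, None)],          # still employed
    )
    cur = con.execute(
        "update EMPHIST set salary = 1.1 * salary where term_date is null"
    )
    changed = cur.rowcount
    rows = con.execute(
        "select empno, round(salary, 2), term_date from EMPHIST order by empno"
    ).fetchall()
    con.close()
    return changed, rows


print(give_raise())
```

Only the two rows with a null term_date are modified; without the where clause all three salaries would change.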

The insert construct adds new rows to a table. It inserts a single row (whose components must be specified by the user) or a set of rows that originate from a retrieval involving other tables.

The syntax of a single-tuple insert is:

insert into table_name [(column{, column})]
<values (expr {, expr}) | subselect>

The values of expressions listed in the list of values must belong to the domains

of the attributes specified in the list of columns in order for the insertion to take

effect.

Example 5.21.2 To insert two rows containing registration records for student 2890 for the fall semester of 2004 into GRADES, we execute two insert statements:

insert into GRADES
values ('2890','023','cs110','Fall',2004,null);

insert into GRADES
values ('2890','056','cs240','Fall',2004,null);

The syntax of an insert that adds a set of rows obtained through a retrieval is:

insert into table_name [(column{, column})]
select phrase

Here the select phrase must return tuples of values consistent with the domains of the attributes specified by the list of columns [(column{, column})].

Example 5.21.3 Suppose that we intend to have a separate table indicating

the assignments of instructors. After creating such a table (called ASSIGN and

equipped with the attributes empno, cno, sem, and year) by writing:

create table ASSIGN(empno varchar2(11) not null,

cno varchar2(5) not null,

sem varchar2(6) not null,

year smallint);

we can load this table using data from the existing table GRADES using the

construct:

insert into ASSIGN(empno, cno, sem, year)

select distinct empno, cno, sem, year

from GRADES;

After this insertion, ASSIGN contains:

empno  cno    sem     year
019    cs110  Fall    2001
019    cs210  Fall    2002
019    cs210  Spring  2003
019    cs240  Spring  2002
023    cs110  Spring  2002
056    cs240  Spring  2003
234    cs310  Spring  2003
234    cs410  Spring  2002

If the components of the tuple to be inserted into a table violate the declaration of the table (e.g., a null value for a not null attribute, or a character

string for a numerical attribute), the DBMS should reject the insertion.
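Both behaviors, loading a table from a retrieval and rejecting a tuple that violates the table declaration, can be tried in a few lines. The sketch below uses SQLite from Python with a handful of made-up GRADES rows; the function name load_assign and the sample data are assumptions made for illustration. Only the not null violation is exercised, because SQLite, unlike ORACLE, does not enforce column types strictly and would accept a character string in a numerical column.

```python
import sqlite3


def load_assign():
    """Fill ASSIGN from GRADES, then try to insert a row that violates not null."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "create table GRADES(stno text, empno text, cno text,"
        " sem text, year integer, grade integer)"
    )
    # Hypothetical sample rows; two of them share the same teaching assignment.
    con.executemany(
        "insert into GRADES values (?, ?, ?, ?, ?, ?)",
        [("1011", "019", "cs110", "Fall", 2001, 40),
         ("2661", "019", "cs110", "Fall", 2001, 80),
         ("3566", "019", "cs240", "Spring", 2002, 100)],
    )
    con.execute(
        "create table ASSIGN(empno varchar(11) not null,"
        " cno varchar(5) not null, sem varchar(6) not null, year smallint)"
    )
    # Multi-row insert: distinct removes the duplicated assignment.
    con.execute(
        "insert into ASSIGN(empno, cno, sem, year)"
        " select distinct empno, cno, sem, year from GRADES"
    )
    try:
        con.execute("insert into ASSIGN values (null, 'cs310', 'Fall', 2003)")
        rejected = False
    except sqlite3.IntegrityError:
        # The null value for the not null attribute empno is refused.
        rejected = True
    rows = con.execute("select * from ASSIGN order by cno").fetchall()
    con.close()
    return rejected, rows


print(load_assign())
```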

Likewise, the delete construct deletes rows of tables.

Example 5.21.4 To delete the rows of the table ASSIGN that correspond to courses taught by the instructor whose employee number is 234, we write:

delete from ASSIGN where empno = '234';

The directive:

delete from GRADES where grade is null;

removes from GRADES all rows whose grade component is null.

The where clause of delete is optional; if this clause is not used, then all

rows are deleted. The tabular variable still exists.


Example 5.21.5 The following delete eliminates all rows of the table ASSIGN:

delete from ASSIGN;

The syntax of delete is:

delete from table_name [where condition]
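The difference between a delete with a where clause and one without it, and the fact that the table itself survives, can be verified with the following sketch. It uses SQLite from Python and three of the ASSIGN rows shown earlier; the function name delete_demo is ours, and the check against sqlite_master is the SQLite way (not the ORACLE way) of asking whether a table still exists.

```python
import sqlite3


def delete_demo():
    """Return (rows left after a restricted delete, rows left after a full delete,
    whether the table still exists)."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "create table ASSIGN(empno varchar(11) not null,"
        " cno varchar(5) not null, sem varchar(6) not null, year smallint)"
    )
    con.executemany(
        "insert into ASSIGN values (?, ?, ?, ?)",
        [("019", "cs110", "Fall", 2001),
         ("234", "cs310", "Spring", 2003),
         ("234", "cs410", "Spring", 2002)],
    )
    count = "select count(*) from ASSIGN"

    # Restricted delete: only the rows of instructor 234 are removed.
    con.execute("delete from ASSIGN where empno = '234'")
    after_where = con.execute(count).fetchone()[0]

    # Unrestricted delete: every remaining row is removed.
    con.execute("delete from ASSIGN")
    after_all = con.execute(count).fetchone()[0]

    # The (now empty) table is still listed in the SQLite catalog.
    still_exists = con.execute(
        "select count(*) from sqlite_master"
        " where type = 'table' and name = 'ASSIGN'"
    ).fetchone()[0] == 1
    con.close()
    return after_where, after_all, still_exists


print(delete_demo())
```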

5.22 Access Rights

The grant operation assigns access rights to users. To delegate access rights to

other users, a user must own these rights. The set of access rights includes

select, insert, update, and delete and refers to the right of executing each

of these operations on a table. Further, update can be restricted to specific

columns.

All these access rights are granted to the creator of a table automatically.

The creator, in turn, may grant access rights to other users or to all users

(designated in SQL as public). The SQL standard envisions a mechanism that

can limit the excessive proliferation of access rights. Namely, a user may receive

the select right with or without the right to grant this right to others by his

own action.

Example 5.22.1 Suppose that the user alex owns the table COURSES and intends to grant the select right on this table to the user whose name is peter. The user alex can accomplish this by

grant select on COURSES to peter

Now, peter has the right to query the table COURSES but he may not propagate

this right to the user ellie. In order for this to happen, alex would have to

use the directive:

grant select on COURSES to peter

with grant option

Example 5.22.2 If peter owns the table STUDENTS, then he may delegate

the right to query the table and the right to update the columns addr, city and

zip to ellie using the directive:

grant select, update(addr, city, zip) on

STUDENTS to ellie

The syntax of grant is:

grant {priv {,priv} | all [privileges]}
on [table] tablename{, tablename}
to <username{, username} | public>
[with grant option]

where priv is:

<select | insert | delete | update[(attribute{, attribute})]>

Privileges can be revoked using the revoke construct, which is a feature of standard SQL. For instance, if peter wishes to revoke ellie's privileges to update the table STUDENTS, he may write:

revoke update(addr,city,zip) on
STUDENTS from ellie

The syntax of revoke is:

revoke {priv {,priv} | all [privileges]}
on [table] tablename{, tablename}
from <username{, username} | public>

5.23 Views in SQL

Views are virtual tabular variables. This means that in SQL a view is referenced

for retrieval purposes in exactly the same way a tabular variable is referenced.

The only difference is that a view does not have a physical existence. It exists

only as a definition in the database catalog. We refer to real tabular variables

(that is, the tabular variables that have a physical existence in the database) as

base tabular variables.

Views are supported in both SQLPlus and in Transact SQL but not in the

current version (4.1) of MySQL.

To illustrate the notion of view, let us consider the following example.

Example 5.23.1 Suppose that we write:

create view STC as

select STUDENTS.name, GRADES.cno

from STUDENTS, GRADES

where STUDENTS.stno = GRADES.stno;

The select construct contained by this create view retrieves all pairs of

student names and course numbers such that the student whose name is s has

registered for the course c.

When this directive is executed by SQL, no data retrieval takes place. The

database system simply stores this definition in its catalog. The definition of the

view STC becomes a persistent object, that is, an object that exists after our

interaction with the DBMS has ceased. From a conceptual point of view, the

user treats STC exactly like any other tabular variable. Suppose, for instance

that we wish to retrieve the names of students who took cs110. In this case it

is sufficient to write the query:

select name from STC
where cno = 'cs110';

In reality, SQL combines the select phrase of the view definition with the query just shown and executes the modified query:

select STUDENTS.name
from STUDENTS, GRADES
where STUDENTS.stno = GRADES.stno
and GRADES.cno = 'cs110';

The previous example shows that views in SQL play a role similar to the role

played by macros in programming languages.

Views are important for data security. A user who needs access only to the list of student names and the courses they are taking needs to be aware

only of the existence of STC. If the user is authorized to use only select constructs, then the user can ignore whether STC is a table or a view. Confidential

data (such as grades obtained in specific courses) can be completely protected

in this manner. Also, the queries that this limited-access user may write are

simpler and easier to understand. No space is wasted with the view STC, and

the view remains current always, reflecting the contents of the tabular variables

STUDENTS and GRADES.
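The claim that a view stores no data and always reflects the current contents of its base tables can be observed directly. The sketch below defines STC in SQLite from Python, queries it, adds a row to GRADES, and queries it again. The two students and their names are placeholders, the tables are cut down to the columns the view needs, and the column aliases (as name, as cno) are added only to fix the view's column names in SQLite.

```python
import sqlite3


def view_stays_current():
    """Query the view STC before and after a change to the base table GRADES."""
    con = sqlite3.connect(":memory:")
    # Simplified base tables with placeholder data.
    con.execute("create table STUDENTS(stno text, name text)")
    con.execute("create table GRADES(stno text, cno text, grade integer)")
    con.executemany(
        "insert into STUDENTS values (?, ?)",
        [("1011", "Student A"), ("3566", "Student B")],
    )
    con.execute("insert into GRADES values ('1011', 'cs110', 40)")

    # Only the definition is stored; no rows are copied.
    con.execute(
        """
        create view STC as
        select STUDENTS.name as name, GRADES.cno as cno
        from STUDENTS, GRADES
        where STUDENTS.stno = GRADES.stno
        """
    )
    query = "select name from STC where cno = 'cs110' order by name"
    before = [row[0] for row in con.execute(query)]

    # A change to a base table is visible through the view immediately.
    con.execute("insert into GRADES values ('3566', 'cs110', 95)")
    after = [row[0] for row in con.execute(query)]
    con.close()
    return before, after


print(view_stays_current())
```

The grade column never appears in the result of a query on STC, which is the data-protection aspect discussed above.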

SQL treats views exactly as it treats the tabular variables as far as retrieval

is concerned. We can also delegate the select privilege to a view in exactly

the same way as we did for a tabular variable. For instance, if the user george

created the view STC, then he can give the select right to vanda by writing:

grant select on STC to vanda;

Example 5.23.2 The view SNA that contains the student number and the

names of students can be created by:

create view SNA as

select stno, name from STUDENTS

The purpose of this view is to ensure the privacy of students. Any user who has

access only to this view can retrieve the student number and name of a student,

but not the address of the student.

There is a fundamental difference between the views introduced in Examples 5.23.1 and 5.23.2, and this refers to the ways in which these two views

behave with respect to updates.

Suppose that the user wishes to insert the pair (7799, Jane Jones) in the view SNA. The user may ignore entirely the fact that SNA is not a base tabular variable. Moreover, the effect of this insertion on the base tabular variable is unequivocally determined: the system inserts in the tabular variable STUDENTS the tuple (7799, Jane Jones, null, null, null, null). On the other hand, we cannot insert a tuple in a meaningful way in the view STC introduced in Example 5.23.1. Indeed, if we attempt to insert a pair (s, c) in STC, then we have to define the effect of this insertion on the base tabular variables. This is clearly impossible: we do not know what the student number is, what the identification of the instructor is, etc. SQL forbids users to update views based on more than one table (as STC is). Even if such updates would have an unambiguous effect on the base tabular variables, this rule rejects any such update. Only some views based on exactly one tabular variable can be updated. It is the responsibility of the database administrator to grant to the user the right to update a view only if that view can be updated.

If a view can be updated, then its behavior is somewhat different from the

base tabular variable on which the view is built. An update made to a view

may cause one or several tuples to vanish from the view, whenever we retrieve

the tuples of the view.

Example 5.23.3 Consider the view UPPERGR defined by:

create view UPPERGR as
select * from GRADES where grade > 75;

If we wish to examine the tuples that satisfy the definition of the view we use the construction:

select * from UPPERGR;

STNO       EMPNO       CNO   SEM          YEAR      GRADE
---------- ----------- ----- ------ ---------- ----------
2661       019         cs110 FALL         1999         80
3566       019         cs110 FALL         1999         95
5544       019         cs110 FALL         1999        100
3566       019         cs240 SPRING       2000        100
2415       019         cs240 SPRING       2000        100
5571       234         cs410 SPRING       2000         80
1011       019         cs210 FALL         2000         90
3566       019         cs210 FALL         2000         90
5571       019         cs210 SPRING       2001         85
1011       056         cs240 SPRING       2001         90
4022       056         cs240 SPRING       2001         80
2661       234         cs310 SPRING       2001        100

The update:

update UPPERGR
set grade = 70
where stno = '2661' and empno = '019' and cno = 'cs110'
and sem = 'FALL' and year = 1999;

makes the first row disappear, since it no longer satisfies the definition of the view. Indeed, if we use again the same query on UPPERGR, we obtain:

STNO       EMPNO       CNO   SEM          YEAR      GRADE
---------- ----------- ----- ------ ---------- ----------
3566       019         cs110 FALL         1999         95
5544       019         cs110 FALL         1999        100
3566       019         cs240 SPRING       2000        100
2415       019         cs240 SPRING       2000        100
5571       234         cs410 SPRING       2000         80
1011       019         cs210 FALL         2000         90
3566       019         cs210 FALL         2000         90
5571       019         cs210 SPRING       2001         85
1011       056         cs240 SPRING       2001         90
4022       056         cs240 SPRING       2001         80
2661       234         cs310 SPRING       2001        100

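Note that the row cannot be brought back through the view. The update:

update UPPERGR
set grade = 80
where stno = '2661' and empno = '019' and cno = 'cs110'
and sem = 'FALL' and year = 1999;

affects no rows, because the row in question no longer belongs to UPPERGR.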

The standard syntax of create view allows us to use the clause with check

option. When this clause is used, every insertion and update done through the

view is verified to make sure that a tuple inserted through the view actually

appears in the view and an update of a row in the view does not cause the row

to vanish from the view.

The syntax of create view is:

create view view as

subselect

[with check option]

A view V can be dropped from a database by using the construct

drop view V;

If we drop a tabular variable from the database, then all views based on that

table are automatically dropped; if we drop a view, then all other views that

use the view that we drop are also dropped.

Views are useful instruments in implementing generalizations. Suppose, that

we began the construction of the college database from the existing tabular

variables UNDERGRADUATES and GRADUATES that modelled sets of entities

having the same name, where

heading (UNDERGRADUATES ) = stno name addr city state zip major

heading (GRADUATES ) = stno name addr city state zip qualdate

Then, the tabular variable STUDENTS could have been obtained as a view

built from the previous two base tabular variables by

create view STUDENTS as
select stno, name, addr, city, state, zip
from UNDERGRADUATES
union
select stno, name, addr, city, state, zip
from GRADUATES;

5.24

The catalog of ORACLE is a very large tabular variable that can be accessed through several views defined on this table.

In ORACLE a list of the tables owned by the current user is contained in the view user_catalog, also accessible through its synonym cat. Sample contents of this view are shown in Figure 5.24.

TABLE_NAME    TABLE_TYPE
STUDENTS      TABLE
INSTRUCTORS   TABLE
COURSES       TABLE
GRADES        TABLE
ADVISING      TABLE

Figure 5.24: user_catalog

Information that describes space allocation and statistical properties can be found in the view named USER_TABLES, also named TABS. A description of the attributes of tabular variables and of their domains can be found in the view USER_TAB_COLUMNS, also accessible as COLS. For example, the query:

select table_name,column_name,data_type from COLS;

returns:

TABLE_NAME    COLUMN_NAME  DATA_TYPE
ADVISING      STNO         CHAR
ADVISING      EMPNO        CHAR
COURSES       CNO          CHAR
COURSES       CNAME        CHAR
COURSES       CR           NUMBER
GRADES        STNO         CHAR
GRADES        EMPNO        CHAR
GRADES        CNO          CHAR
GRADES        SEM          CHAR
GRADES        YEAR         NUMBER
GRADES        GRADE        NUMBER
INSTRUCTORS   EMPNO        CHAR
INSTRUCTORS   NAME         CHAR
INSTRUCTORS   RANK         CHAR
INSTRUCTORS   ROOMNO       NUMBER
INSTRUCTORS   TELNO        CHAR
STUDENTS      STNO         CHAR
STUDENTS      NAME         CHAR
STUDENTS      ADDR         CHAR
STUDENTS      CITY         CHAR
STUDENTS      STATE        CHAR
STUDENTS      ZIP          CHAR

A more complete list of objects that belong to the current user can be found in the view USER_OBJECTS, which lists all objects created by the user, including those mentioned in USER_CATALOG, as well as other useful information (such as the date of creation, the last time when the object was affected by a data definition statement, the status of the object, etc.).

The definition of views can be accessed through the USER_VIEWS catalog view.

Example 5.24.1 The meta-view (view about views) USER_VIEWS has the structure described below:

Name                            Null?    Type
------------------------------- -------- --------------
VIEW_NAME                       NOT NULL VARCHAR2(30)
TEXT_LENGTH                              NUMBER
TEXT                                     LONG
TYPE_TEXT_LENGTH                         NUMBER
TYPE_TEXT                                VARCHAR2(4000)
OID_TEXT_LENGTH                          NUMBER
OID_TEXT                                 VARCHAR2(4000)
VIEW_TYPE_OWNER                          VARCHAR2(30)
VIEW_TYPE                                VARCHAR2(30)
SUPERVIEW_NAME                           VARCHAR2(30)

The attributes that follow TEXT are important for object views discussed in Chapter 7.

To extract the definition of the view UPPERGR defined above we write:

select text from user_views where view_name = 'UPPERGR';

TEXT
------------------------------------------------------------
select "STNO","EMPNO","CNO","SEM","YEAR","GRADE" from GRADES
where grade > 75
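Other systems expose their catalog in a similar spirit, although under different names. As an illustration only, the sketch below retrieves the stored definition of UPPERGR and the column names of GRADES in SQLite, where the table sqlite_master and the directive pragma table_info play roles loosely comparable to USER_VIEWS and USER_TAB_COLUMNS. These are SQLite facilities, not ORACLE ones, and the function name catalog_demo is our own.

```python
import sqlite3


def catalog_demo():
    """Return (stored text of the view UPPERGR, column names of GRADES)."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "create table GRADES(stno text, empno text, cno text,"
        " sem text, year integer, grade integer)"
    )
    con.execute("create view UPPERGR as select * from GRADES where grade > 75")

    # sqlite_master keeps the text of each create statement.
    view_text = con.execute(
        "select sql from sqlite_master where type = 'view' and name = 'UPPERGR'"
    ).fetchone()[0]

    # Each row of table_info is (cid, name, type, notnull, dflt_value, pk).
    columns = [row[1] for row in con.execute("pragma table_info(GRADES)")]
    con.close()
    return view_text, columns


text, columns = catalog_demo()
print(text)
print(columns)
```

Unlike ORACLE, which expands select * into an explicit column list when it stores the view, SQLite keeps the text of the statement as it was written.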

5.25 Exercises

(a) Find all students who live in Malden or Newton.

(b) Find all students whose name starts with F;

(c) Find all students whose name contains the letter f;

2. A select phrase equivalent to the union-computing select

select stno from grades where cno = 'cs210'
union
select stno from grades where cno = 'cs240';

is

select stno from grades
where cno = 'cs210' or cno = 'cs240';

(b) Can you transform the intersection-computing select:

select stno from grades where cno = 'cs210'
intersect
select stno from grades where cno = 'cs240';

in a similar manner?

Solve in SQL the following queries that refer to the college database:

3. Find cities where students live for all students who do not live in Boston,

Massachusetts.

4. Find all pairs of student names and course names for grades obtained

during Fall of 2001.

5. Find the names of students who took some four-credit courses.

6. Find the names of students who took a course with an instructor who is

also their advisor.

7. Find the names of students who took cs210 or had Prof. Smith as their

advisor.

8. Find all pairs of names of students who live in the same city.

9. Find all triples of instructors' names for instructors who taught the same

course.

10. Find instructors who taught students who are advised by another instructor who shares the same room.

11. Find course numbers of courses taken by students who live in Boston and

which are taught by an associate professor.

12. Find the names of instructors who teach courses attended by students who

took a course with an instructor who is an assistant professor.

13. Find the telephone numbers of instructors who teach a course taken by

any student who lives in Boston.

14. Find all pairs of names of students and instructors such that the student

never took a course with the instructor.

15. Find the names of students who took no four-credit courses.

16. Find the names of students who took only four-credit courses.

17. Find the names of students who took every four-credit course.

18. Find the names of all students for whom no other student lives in the same

city.

19. Find names of students who took every course taken by Richard Pierce.

20. Find the names of instructors who teach no courses.

21. Find course numbers of courses that have never been taught.

22. Find courses that are taught by every assistant professor.

23. Find the names of students whose advisor did not teach them any course.


24. Find the names of students who have failed all their courses (failing is

defined as a grade less than 60).

25. Find the names of students who do not have an advisor.

26. Find the names of instructors who taught every semester when a student

from Rhode Island was enrolled.

27. Find course names of courses taken by every student advised by Prof.

Evans.

28. Find names of students who took every course taught by an instructor

who is advising at least two students.

29. Find names of instructors who teach every student they advise.

30. Find names of students who are taking every course taught by their advisor.

31. Find course numbers of courses taken by every student who lives in Rhode

Island.

32. Find the student numbers of students who took at least two courses.

33. Find the course names of courses in which at least three students were

enrolled.

34. Find the names of instructors who advise at least two students.

35. List all students by name, along with their grade averages.

36. Find student numbers of students for whom the difference between the

highest and the lowest grade is less than 20.

37. Print a report that contains for each course (cno), the number of students

who took the course, the highest, the lowest, and the average grade in the

course.

38. Find the average grade of students who took cs110 at any time. Then,

find students whose grades in cs110 were above the average.

39. Identify those queries that require division among the queries 3 to 34 and

solve those queries using the group by option of SQL.

40. Create views on the college database as specified:

(a) A view that contains the names of the instructors, the courses (cnos)

that they teach, and the average grade in these courses.

(b) A view that shows the names and offices of the instructors.

(c) A view that contains the courses (cnos) , the number of students who

took the courses, the average grade in these courses, and the highest

grade.

(d) A view that contains the names of instructors and the names of the

students that they advise.

(e) A view that shows the data about the students in Massachusetts.

41. Print the contents of the views created in Exercise 40.

42. Determine which of the views created in Exercise 40 can be updated.

43. Using the views created in Exercise 40(a) and 40(c) create a view that

lists the instructors and the total number of students they teach.

44. Solve the following queries:

(a) list names of instructors and the number of courses they taught;

(b) list instructors in the order of the number of courses they taught;


(c) list the top three instructors in the order of the number of courses

they taught.

45. Let GRAPH be the table introduced in Example 5.20.3. The degree of a

vertex is the number of edges incident to that vertex.

(a) write an SQL query that yields a list of vertices of a graph arranged

in the decreasing order of their degrees;

(b) list the top 5 vertices of a graph in increasing order of their degrees.

46. For each instructor, list the sequence of the numbers of courses that the

instructor taught during each of the semesters that he or she was active.

47. List the top three instructors in the order of the number of students that

they advise.

5.26 Bibliographical Comments

defined in [International Organization for Standardization, 1992]. Extensive

presentations of SQL3 can be found in [Melton and Simon, 1993; Melton and

Simon, 2002] and [Fortier, 1999]. Also, useful references are [Line and Kline,

2000] and [J. Kauffman, 2001].
