Vous êtes sur la page 1sur 44

Advanced SQL Programming

Mark Holm Centerfield Technology

Goals
Introduce some useful advanced SQL programming techniques Show you how to let the database do more work to reduce programming effort Go over some basic techniques and tips to improve performance

Notes
V4R3 and higher syntax used in examples Examples show only a small subset of what can be done!

Agenda
Joining files - techniques, dos and donts Query within a query - Subqueries Stacking data - Unions Simplifying data with Views Referential Integrity and constraints Performance, performance, performance

Joining files
Joins are used to relate data from different tables Data can be retrieved with one open file rather than many Concept is identical to join logical files without an associated permanent object (except if the join is done with an SQL view)

Join types
Inner Join
Used to find related data

Left Outer (or simply Outer) Join


Used to find related data and orphaned rows

Exception Join
Used to only find orphaned rows

Cross Join
Join all rows to all rows
6

Sample tables
FirstName John Cindy Sally
Department table

LastName Doe Smith Anderson

Dept 397 450 250

Dept 397 550 250

Area Development Marketing Sales

Employee table

Inner Join
Method #1 - Using the WHERE Clause SELECT LastName, Division FROM Employee, Department WHERE Employee.Dept = Department.Dept Method #2 - Using the JOIN Clause SELECT LastName, Division FROM Employee INNER JOIN Department ON Employee.Dept = Department.Dept
NOTE: This method is useful if you need to influence the order of the tables are joined in for performance reasons. Only works on releases prior to V4R4.

Results
Return list of employees that are in a valid department. Employee Smith is not returned because she is not in a department listed in the Department table

Result table

LastName Doe Anderson

Area Development Sales

Left Outer Join

Must use Join Syntax SELECT LastName, Area FROM Employee LEFT OUTER JOIN Department ON Employee.Dept = Department.Dept

10

Results
Return list of employees even if they are not in a valid department Employee Smith has a NULL Area because it could not be associated with a valid Dept

Result table

LastName Doe Smith Anderson

Area Development Sales


11

Exception Join

Must use Join Syntax SELECT LastName, Area FROM Employee EXCEPTION JOIN Department ON Employee.Dept = Department.Dept

12

Results
Return list of employees only if they are NOT in a valid department Employee Smith is only one without a valid department

Result table

LastName Smith

Area -

13

WARNING!
The order tables are listed in the FROM clause is important For OUTER and EXCEPTION joins, the database must join the tables in that order. The result may be horrible performancemore on this topic later

14

Observations
Joins provide one way to bury application logic in the database Each join type has a purpose and can be used to not only get the data you want but identify incomplete information With some exceptions, if joined properly performance should be at least as good as an application
15

Subqueries
Subqueries are a powerful way to select only the data you need without separate statements. Example: List employees making a higher than average salary

16

Subquery Example
SELECT FNAME, LNAME FROM EMPLOYEE WHERE SALARY > (SELECT AVG(SALARY) FROM EMPLOYEE)

SELECT FNAME, LNAME FROM EMPLOYEE WHERE SALARY > (SELECT AVG(SALARY) FROM EMPLOYEE WHERE LNAME = JONES)

17

Subqueries - types
Correlated
Inner select refers to part of the outer (parent) select (multiple evaluations)

Non-Correlated
Inner select does not relate to outer query (one evaluation)

18

Subquery Tips 1
Subquery optimization (2nd statement will be faster)
SELECT name FROM employee WHERE salary > ALL (SELECT salary FROM salscale) SELECT name FROM employee WHERE salary > (SELECT max(salary) FROM salscale)

19

Subquery Tips 2
Subquery optimization (2nd statement will be faster)
SELECT name FROM employee WHERE salary IN (SELECT salary FROM salscale) SELECT name FROM employee WHERE EXISTS (SELECT salary FROM salscale WHERE employee.salid = salscale.salid)

20

UNIONs
Unions provide a way to append multiple row sets files in one statement Example: Process all of the orders from January and February
SELECT * FROM JanOrders WHERE SKU = 199976 UNION SELECT * FROM FebOrders WHERE SKU = 199976

21

Unions
Each SELECT statement that is UNIONed together must have the same number of result columns and have compatible types Two forms of syntax
UNION ALL -- allow duplicate records UNION -- return only distinct rows

22

Views
Views provide a convenient way to permanently put SQL logic Create once and use many times Also make the database more understandable to users Can put simple business rules into views to ensure consistency
23

Views
Example: Make it easy for the human resources department to run a report that shows new employees.
CREATE VIEW HR/NEWBIES (EMPLOYEE_NAME, DEPARTMENT, HIRE_DATE) AS SELECT concat(concat(strip(last_name),','),strip(first_name)), department, hire_date FROM WHERE HR/EMPLOYEE (year(current date)-year(hire_date)) < 2

24

Performance
SQL performance is harder to predict and tune than native I/O. SQL provides a powerful way to manipulate data but you have little control over HOW it does it. Query optimizer takes responsibility for doing it right.
25

Performance - diagnosis
Getting information about how the optimizer processed a query is crucial Can be done via one or all of the following:
STRDBG: debug messages in job log STRDBMON: optimizer info put in file QAQQINI: can be used to force messages CHGQRYA: messages put out when time limit set to 0
26

Performance tips
Create indexes
Over columns that significantly limit data in WHERE clause Over columns that join tables together Over columns used in ORDER BY and GROUP BY clauses

27

Performance tips
Create Encoded Vector Indexes (EVIs)
Most useful in heavy query environments with a lot of data (e.g. large data warehouses) Helps queries that process between 20-60% of a tables data Create over columns with a modest number of distinct values and those with data skew EVIs bridge the gap between traditional indexes and table scans
28

Performance tips
Encourage optimizer to use indexes
Use keyed columns in WHERE clause if possible Use ANDed conditions as much as possible OPTIMIZE FOR n ROWS Dont do things that eliminate index use
Data conversion (binary-key = 1.5) LIKE clause w/leading wildcard (NAME LIKE %JOE)
29

Performance tips
Keep statements simple
Complex statements are much more difficult to optimize Provide more opportunity for the optimizer to choose a sub-optimal plan of attack

30

Performance tips
Enable DB2 to use parallelism
Query processed by many tasks (CPU parallelism) or by getting data from many disks at once (I/O parallelism) CPU parallelism requires IBMs SMP feature and a machine with multiple processors Enabled via the QQRYDEGREE system value, CHGQRYA, or the QAQQINI file
31

Other useful features


CASE clause - conditional calculations ALIAS - access to multi-member files Primary/Foreign keys - referential integrity Constraints

32

CASE
Conditional calculations with CASE
SELECT Warehouse, Description, CASE RegionCode WHEN 'E' THEN 'East Region' WHEN 'S' THEN 'South Region' WHEN 'M' THEN 'Midwest Region' WHEN 'W' THEN 'West Region' END FROM Locations
33

CASE
Avoiding calculation errors (e.g. division by 0)
SELECT Warehouse, Description, CASE NumInStock WHEN 0 THEN NULL ELSE CaseUnits/NumInStock END FROM Inventory

34

ALIAS names
The CREATE ALIAS statement creates an alias on a table, view, or member of a database file.
CREATE ALIAS alias-name FOR table member

Example: Create an alias over the second member of a multi-member physical file
CREATE ALIAS February FOR MonthSales February

35

Referential Integrity
Keeps two or more files in synch with each other Ensures that children rows have parents Can also be used to automatically delete children when parents are deleted

36

Referential Integrity Rules


A row inserted into a child table must have a parent row (typically in another table). Parent rules
A parent row can not be deleted if there are dependent children (Restrict rule) OR All children are also deleted (Cascade rule) OR All childrens foreign keys are changed (Set Null and Set Default rules)
37

Primary key must be unique

Primary Key

Parent table Child table

Foreign Key

38

Referential Integrity syntax


ALTER TABLE Hr/Employee ADD CONSTRAINT EmpPK PRIMARY KEY (EmployeeId) ALTER TABLE Hr/Department ADD CONSTRAINT EmpFK FOREIGN KEY (EmployeeId) REFERENCES Hr/Employee (EmployeeId) ON DELETE CASCADE ON UPDATE RESTRICT
39

Check Constraints
Rules which limit the allowable values in one or more columns:
CREATE TABLE Employee (FirstName CHAR(20), LastName CHAR(30), Salary CHECK (Salary>0 AND Salary<200000))
40

Check Constraints
Effectively does data checking at the database level. Data checking done with display files or application logic can now be done at the database level. Ensures that it is always done and closes back doors like DFU, ODBC, 3-rd party utilities.
41

Other resources
Database Design and Programming for DB2/400 - book by Paul Conte SQL for Smarties - book by Joe Celko SQL Tutorial - www.as400network.com AS/400 DB2 web site at http://www.as400.ibm.com/db2/db2main.htm Publications at http://publib.boulder.ibm.com/pubs/html/as400/ Our web site at http://www.centerfieldtechnology.com

42

Summary
SQL is a powerful way to access and process data Used effectively, it can reduce the time it takes to build applications Once tuned, it can perform very close (and sometimes better) than HLLs alone

43

Good Luck and Happy SQLing

Vous aimerez peut-être aussi