Académique Documents
Professionnel Documents
Culture Documents
Course Agenda
Introduction
System Architecture
Installation and Database
Clusters
Configuration
Creating and Managing
Databases
User Tools - CUI and GUI
Security
SQL Primer
Backup, Recovery and PITR
Routine Maintenance Tasks
Data Dictionary
Module 1
Introduction
Objectives
In this module we will cover:
PostgreSQL
Features of PostgreSQL
General Database Limits
Common Database Object Names
PostgreSQL Community
Community mailing lists
www.Postgresql.org/community/lists
Support Forums
http://forums.enterprisedb.com/forums/list.page
Commercial SLAs
www.enterprisedb.com/products-services-training/subscriptions
PostgreSQL Lineage
Reliable:
ACID Compliant
Supports Transactions
Supports Save points
Uses Write Ahead Logging (WAL)
Scalable:
Uses Multi-version Concurrency Control
Supports Table Partitioning
Supports Tablespaces
Copyright EnterpriseDB Corporation, 2015. All rights reserved.
Streaming Replication
Replication Slots
Supports Hot-Backup, pg_basebackup
Point-in-Time Recovery
Advanced:
Supports Triggers and Functions
Supports Custom Procedural Languages PL/pgSQL, PL/Perl, PL/TCL,
PL/PHP, PL/Java
Upgrade using pg_upgrade
Unlogged Tables
Materialized Views
Copyright EnterpriseDB Corporation, 2015. All rights reserved.
10
Rollback or Undo
Dynamic allocation of disk space is used for the storage and processing of records within
tables.
The files that represent the table grow as the table grows.
It also grows with transactions that are performed against it.
The data is stored within the file that represents the table.
When deletes and updates are performed on a table, the file that represents the object
will contain the previous data
This space gets reused, but need to force recovery of used space. (Using Vaccum)
11
Value
Unlimited
32 TB
1.6 TB
1 GB
Unlimited
Unlimited
12
Industry Term
PostgreSQL Term
Table or Index
Relation
Row
Tuple
Column
Attribute
Data Block
Page
13
Summary
In this module we covered:
PostgreSQL
Features of PostgreSQL
General Database Limits
Common Database Object Names
14
Module 2
System Architecture
15
Module Objectives
Architectural Summary
Statement Processing
Utility Processes
Connection Request-Response
Write Buffering
Page Layout
16
Architectural Summary
PostgreSQL uses processes, not threads
Postmaster process acts as supervisor
17
Shared Buffers
BGWRITER
STATS
COLLECTOR
CHECKPOINTER
ARCHIVER
AUTO -VACUUM
LOG WRITER
WAL Writer
18
Process Array
WAL Buffers
Data
Data
Files
Files
Error Log
Files
18
WAL
WAL
Segments
Segments
Archived
Archived
WAL
WAL
Utility Processes
Background writer
Writes updated data blocks to disk
WAL writer
Flushes write-ahead log to disk
Checkpointer process
Automatically performs a checkpoint based on config parameters
Autovacuum launcher
Starts Autovacuum workers as needed
Autovacuum workers
Recover free space for reuse
19
Stats collector
Collects usage statistics by relation and block
Archiver
Archives write-ahead log files
20
Postmaster as Listener
Postmaster is the master process
called postgres
Client request a
connection
Postmaster
Shared Memory
21
Copyright EnterpriseDB Corporation, 2015. All rights reserved.
21
postgres
postgres
Stable Database
22
postgres
postgres
CHECKPOINT
Stable Database
23
postgres
postgres
postgres
BGWRITER
Stable Database
24
postgres
postgres
postgres
postgres
WAL
Buffer
Transaction
Log
Stable Database
25
postgres
ARCHIVE
COMMAND
26
postgres
WAL
Buffer
Transaction
Log
Stable Database
After Commit
Committed updates written from shared memory to WAL log file (on disk).
After Checkpoint
Modified data pages are written from shared memory to the data files.
27
Statement Processing
Check Syntax
Call Traffic Cop
Identify Query Type
Command Processor if
needed
Break Query in Tokens
Optimize
Execute
Parse
28
Data
Global
Base
pg_tblsc
pg_xlog
Cluster wide
database
objects
Contains
Databases
Symbolic link to
tablespaces
Write ahead
logs
pg_log
Error logs
29
Status
Directories
Configuration
Files
postgresql.conf,
pg_clog, pg_multiexact,
pg_hba.conf,
pg_snapshots, pg_stat,
pg_ident.conf
pg_subtrans,pg_notify,
pg_serial, pg_replslot,
pg_logical, pg_dynshmem
Postmaster
Info Files
include
lib
doc
/opt/PostgresPlus/9.4AS
share
scripts
stackbuilderplus
pg_env.sh,
uninstall-* files
30
Copyright EnterpriseDB Corporation, 2015. All rights reserved.
30
31
32
Copyright EnterpriseDB Corporation, 2015. All rights reserved.
32
14297
14307
Database OID
Base
14300
14405
14498
14312
Data
pg_tblsc
16650
14300
/storage
Tablespace OID
14301
/pg_tab
14307
16651
Copyright EnterpriseDB Corporation, 2015. All rights reserved.
33
16700
16701
Lab Exercise
1. Write a command to display a list of all PostgreSQL related processes on
your operating system
2. Write a SQL statement to find the data file name and location for
customers table
34
Module Summary
Architectural Summary
Shared Memory
Inter-processes Communication
Statement Processing
Utility Processes
Disk Read Buffering Disk
Write Buffering
Background Writer Cleaning Scan
35
Module 3
Installation
36
Module Objectives
OS User and Permissions
Installation Options
Installation of PostgreSQL
StackBuilder
Setting Environmental Variables
37
All PostgreSQL processes and data files must be owned by a user in the
OS.
OS user is un-related to database user accounts.
For security reasons, the OS user must not be root or an administrative account.
SELinux Permissions
SELinux must be set to permissive mode on systems with SELinux.
38
Supported Platforms
PostgreSQL is available for Windows, Linux and Solaris systems
Windows:
Windows Server 2008 R1 (32 and 64 bit)
Windows Server 2008 R2 (64 bit)
Windows 2012 (64-bit)
Linux platforms:
39
Supported Locales
PostgreSQL has been tested and certified for the following locales:
40
41
License Agreement
User Authentication
Installation Directory
Components
Data and WAL Directory
Configuration Mode
Database superuser password and Locale
42
StackBuilder
Simplifies the process of downloading and installing modules
StackBuilder requires Internet access
Run stackbuilder using the StackBuilder menu option from the Postgresql
9.4 menu
43
44
Module Summary
OS User and Permissions
Installation Options
Installation of PostgreSQL
StackBuilder
Setting Environmental Variables
45
Module 4
Database Clusters
2015 EnterpriseDB Corporation. All rights reserved.
46
Module Objectives
Database Clusters
Creating a Database Cluster
47
Database Clusters
A Database Cluster is a collection of databases managed by a single
server instance.
48
49
Example - initdb
50
51
52
53
Syntax
pg_ctl stop
54
Syntax:
pg_controldata [DATADIR]
55
Module Summary
Database Clusters
Creating a Database Cluster
56
Lab Exercise - 1
1. A new website is to be developed for an online music store.
Create a new cluster edbdata with ownership of postgres user.
Start your edbdata cluster
Stop your edbdata cluster with fast mode.
Reload your cluster with pg_ctl utility and using pg_reload_conf()
function.
57
Module 5
Configuration
58
Objectives
In this module we will cover:
59
60
61
62
Connection Settings
The following parameters in the postgresql.conf file control the connection
settings:
listen_addresses (string)
Specifies the TCP/IP address(es) on which the server is to listen for connections
port (integer)
The TCP port the server listens on; 5432 by default
max_connections (integer)
Determines the maximum number of concurrent connections to the database server
superuser_reserved_connections (integer)
Determines the number of connection slots that are reserved for connections by
Advanced Server superusers
63
Copyright EnterpriseDB Corporation, 2015. All rights reserved.
63
Memory Settings
The following parameters in postgresql.conf control the memory settings:
shared_buffers (integer)
Sets the amount of memory the database server uses for shared memory buffers
temp_buffers (integer)
Sets the maximum number of temporary buffers used by each database session
autovacuum_work_mem
Specifies the maximum amount of memory to be used by each autovacuum worker process
64
65
66
67
68
Autovacuum Settings
autovacuum (default on) - Controls whether the autovacuum launcher
runs, and starts worker processes to vacuum and analyze tables.
69
Lab Exercise
1. Open psql and write a statement to change work_mem to 10MB. This
change must persist across server restarts
2. Open psql and write a statement to change work_mem to 20MB for the
current session
3. Open psql and write a statement to change work_mem to 1 MB for the
postgres user
4. Write a query to list all parameters requiring a server restart
5. Take a backup of the postgresql.conf file
70
Lab Exercise
6. Open the configuration file for your Postgres database cluster and make
the following changes
7. Restart the server and verify the changes made in previous step
71
Summary
In this module we covered:
72
Module 6
PSQL
73
Objectives
In this module we will cover:
Client Interface psql
Connecting to a Database
psql Meta Commands
Executing SQL Commands
Advance Features
74
PSQL
psql is the name of PostgreSQL PSQL's executable.
It enables you to type in queries interactively, issue them to PostgreSQL,
and see the query results.
psql [option...] [dbname [username]]
75
Connecting to a Database
In order to connect to a database you need to know the name of your
target database, the host name and port number of the server and what
user name you want to connect as.
Command line options, namely -d, -h, -p, and -U respectively.
Environment variables PGDATABASE, PGHOST,
PGPORT and PGUSER to appropriate values.
76
Conventions
psql has it's own set of commands, all of which start with a backslash (\).
These are in no way related to SQL commands, which operate in the
server. psql commands only affect psql.
Some commands accept a pattern. This pattern is a modified regex. Key
points:
* and ? are wildcards
Double-quotes are used to specify an exact name, ignoring all special characters and
preserving case
77
On Startup...
psql will execute commands from $HOME/.psqlrc, unless option -X is
specified.
78
Entering Commands
psql uses the command line editing capabilities that are available in the
native OS. Generally, this means
Up and Down arrows cycle through command history
on UNIX, there is tab completion for various things, such as SQL commands and to a more
limited degree, table and field names
disabled with -n
79
Controlling Output
There are numerous ways to control output.
The most important are:
-o FILENAME or \o FILENAME will send query output
(excluding STDERR) to FILENAME (which may be a pipe)
\g FILENAME executes the query buffer,
sending output to FILENAME (may be a pipe)
-q runs quietly.
80
81
Information Commands
\d(i, s, t, v, S)[+] [pattern]
List information about indexes, sequences, tables, views or system objects
Any combination of letters may be used in any order, e.g.: \dvs
+ displays comments
\d[+] [pattern]
For each relation describe/display the relation structure details
+ displays any comments associated with the columns of the table,
and if the table has an OID column
Without a pattern, \d[+] is equivalent to \dtvs[+]
82
server
If + is appended to the command name, database descriptions are also displayed
\dn+ [pattern]
Lists schemas (namespaces)
+ adds permissions and description to output
\df[+] [pattern]
Lists functions
+ adds owner, language, source code and description to output
83
\q or ^d
Quits the edb-psql program
\cd [ directory ]
Change current working directory
Tip: To print your current working directory, use
\! pwd
\! [ command ]
Executes the specified command
If no command is specified, escapes to a separate Unix shell (CMD.EXE in
Windows)
84
Help
\?
Shows help information about edb-psql commands
\h [command]
Shows information about SQL commands
If a command isn't specified, lists all SQL commands
psql --help
Lists command line options for edb-psql
85
Lab Exercise
1. Open an psql terminal and connect to the edb database
2. Write a command to display the list of:
Databases
Users
Tablespaces
Tables
4. Execute the above SQL script using psql and store the results in a file
86
Summary
In this module we covered:
Client Interface edb-psql
Connecting to a Database
psql Meta Commands
Executing SQL Commands
87
Module 7
88
Objectives
In this module we will cover:
Object Hierarchy
Creating Databases
Creating Schemas
Schema Search Path
Creating Database Objects
89
Object Hierarchy
Database
Cluster
Users/Groups
(Roles)
Database
Tablespaces
Catalogs
Schema
Extensions
View
Sequence
Functions
Table
90
Event
Triggers
Database
A running PostgreSQL server can manage multiple databases.
A database is a named collection of SQL objects. It is a collection of
schemas and the schemas contain the tables, functions, etc.
Databases are created with CREATE DATABASE command.
Databases are destroyed with DROP DATABASE command .
To determine the set of existing databases:
SQL: SELECT datname FROM pg_database;
PSQL META COMMAND: \l (backslash lowercase L)
91
Creating Databases
There is a program that you can execute from the shell to create new
databases, createdb.
createdb dbname
92
93
Accessing a Database
The psql tool allows you to interactively enter, edit, and execute SQL
commands.
94
Users
Database users are completely separate from operating system user.
Database users are global across a database cluster.
This pre-defined superuser have all the privileges with GRANT OPTION.
95
96
97
Privileges
Cluster level
Granted to a user during CREATE or later using ALTER USER
These privileges are granted by superuser
Object Level
Granted to user using GRANT command
These privileges allow a user to perform particular action on an database object, such
as table, view, sequence etc.
Can be granted by Owner, superuser or someone who have been given permission to
grant priviledge (WITH GRANT OPTION)
98
GRANT Statement
GRANT can be used for granting object level privileges to database users,
groups or roles.
Syntax:
Type \h GRANT in psql terminal to view the entire syntax and available privileges that
can be granted on different objects.
99
100
REVOKE Statement
REVOKE can be used for revoking object level privileges to database users,
groups or roles.
101
102
What is a Schema
SCHEMA
Tables
Views
Sequences
Functions
Owns
USER
Domains
103
Benefits of Schemas
A database can contains one or more named schemas.
By Default, all database contain public schema.
There are several reasons why one might want to use schemas:
To allow many users to use one database without interfering with each other.
To organize database objects into logical groups to make them more manageable.
Third-party applications can be put into separate schemas so they cannot collide with
the names of other objects.
104
Creating Schemas
105
106
107
108
Database Objects
Database Schemas can contain different types of objects including:
Tables
Sequences
Views
Synonyms
Domains
Packages
Functions
Procedures
109
Object Ownership
Database
Cluster
Users/Groups
(Roles)
Database
Tablespaces
Catalogs
Schema
Extensions
Sequence
Functions
Table
View
110
Event
Triggers
Summary
In this module we covered:
Object Hierarchy
Creating Databases
Creating Schemas
Schema Search Path
111
Lab Exercise
1. Write a SQL query to view name and size of all the databases in your
data cluster
112
Lab Exercise
6. Open psql and connect to mgrdb using mgr1 user and find the result of
the following query
select * from mgrtab;
The above statement should run successfully with 0 rows returned
7. In the lab exercise of a previous module you have added user Irena. Set
the proper search_path for user Irena so that the edbstore schema
objects can be accessed without use of fully qualified table names
8. Connect to the edbstore database using the Irena user and verify if you
can access orders table without a fully qualified table name
113
Module 8
Security
114
Objectives
In this module we will cover:
Authentication and Authorization
Levels of security
Implementing Host Based Access Control
115
116
Levels of Security
Server and
Application
Database
User/Password
Connect Privilege
Schema
Permissions
Table Level
Privileges
Grant/Revoke
Object
Check Client IP
pg_hba.conf
117
118
pg_hba.conf Example
# TYPE
DATABASE
USER
ADDRESS
METHOD
host
all
all
127.0.0.1/32
md5
all
all
::1/128
md5
replication
postgres
127.0.0.1/32
md5
#host
replication
postgres
::1/128
md5
119
Typically other database users should not have any OS level privilege.
Always revoke CONNECT privilege on a database from public to block
normal user access to the database.
Use GRANT to grant CONNECT privilege to the users who are allowed to
connect to a database.
120
Authentication Problems
FATAL:
FATAL:
FATAL:
FATAL:
121
Object Ownership
Cluster
Users/Groups
Databases
Tablespaces
Schemas
Tables
Views
Sequences
Synonyms
122
Copyright EnterpriseDB Corporation, 2015. All rights reserved.
122
Domains
Functions
Lab Exercise
1. User Irena is trying to connect to the edbstore database. She emailed
you the error message she is getting while trying to connect:
FATAL: no pg_hba.conf entry for host "192.168.10.23", user
irena", database edbstore
123
Lab Exercise
2. You have decided to log all the connections on your data cluster.
Configure your postgresql.conf settings so that all the successful as well
as unsuccessful connections are logged in the error log file
3. Connect to edbstore database
4. Verify if the connections are logged or not
124
Summary
In this module we covered:
Authentication & Authorization
Levels of security
Implementing Host Based Access Control
125
Module 9
SQL Primer
126
Module Objectives
Data Types
Introduction to various SQL
functionalities:
Structured Query Language (SQL)
Quoting
Tables and Constraints
SQL Functions
Concatenation
Manipulating Data using
Nested Queries
INSERT,UPDATE and DELETE
Joins
Creating Other Database Objects Materialized Views
Sequences, Views and Domains
Updatable Views
Indexes
127
Data Types
Common Data Types:
Numeric Types
Character Types
NUMERIC
CHAR
Date/Time Types
TIMESTAMP
Other Types
Postgres Plus
Advanced
Server
BYTEA
CLOB
BOOL
DATE
INTEGER
MONEY
VARCHAR2
VARCHAR
TIME
XML
JSON
SERIAL
TEXT
BLOB
INTERVAL
128
JSONB
NUMBER
XMLTYPE
Data Definition
Language
Data Manipulation
Language
Data Control
Language
Transaction Control
Language
GRANT
REVOKE
CREATE
ALTER
DROP
TRUNCATE
INSERT
UPDATE
DELETE
SELECT
129
COMMIT
ROLLBACK
SAVEPOINT
SET
TRANSACTION
Tables
A table is a named collection of rows
Each table row has same set of columns
130
Constraints
Constraints are used to enforce data integrity.
131
Types of Constraints
PostgreSQL supports different types of constraints:
NOT NULL
CHECK
UNIQUE
PRIMARY KEY
FOREIGN KEY
132
133
134
ALTER TABLE
ALTER TABLE is used to modify the structure of a table
Can be used to:
Add a new column
Modify an existing column
Add a constraint
Drop a column
Change schema
Rename a column, constraint or table
Enable and disable triggers and rules on a table
Change ownership
Enable or disable logging for a table
Copyright EnterpriseDB Corporation, 2015. All rights reserved.
135
DROP TABLE
The DROP TABLE statement is used to drop a table
All data and structure is deleted
136
Inserting Data
The INSERT statement is used to insert one or multiple rows in a table
Syntax:
INSERT INTO table_name [ ( column_name [, ...] ) ]
{ DEFAULT VALUES | VALUES ( { expression | DEFAULT } [, ...] ) [,
...] | query }
137
Example - INSERT
138
Update Data
The UPDATE statement is used to modify zero or more rows in a table
The number of rows modified depends on the WHERE condition
Syntax:
UPDATE [ ONLY ] table_name
SET column_name = { expression | DEFAULT }
[ WHERE condition]
139
Example - UPDATE
140
Deleting Data
The DELETE statement is used to delete one or more rows of a table
Syntax:
DELETE FROM [ ONLY ] table_name
[ WHERE condition ]
141
Example - DELETE
142
Views
A View is a Virtual Table and can be used to hide complex queries
Can also be used to represent a selected view of data
143
Example - VIEW
View Example with updatable and non-updateable columns:
144
Materialized Views
Materialized Views
145
146
Sequences
A sequence is used to automatically generate integer values that follow a
pattern.
147
Domains
A domain is a data type with optional constraints
Domains can be used to create a data type which allows a selected list of
values
Table: emp
Column: cityname
Data Type: city
Domain: city
Table: shop
Column: shoplocation
Data Type: city
Table: clients
Column: res_city
Data Type: city
148
Example : Domain
149
Quoting
Single quotes and dollar quotes are used to specify non-numeric values
Example:'hello world'
'2011-07-04 13:36:24'
'{1,4,5}'
$$A string "with" various 'quotes' in.$$
$foo$A string with $$ quotes in $foo$
Double quotes are used for names of database objects which either clash with
keywords, contain mixed case letters, or contain characters other than a-z, 0-9
or underscore
Example:select * from "select"
create table "HelloWorld" ...
select * from "Hi everyone and everything"
150
String Functions
Format Functions
Date and Time Functions
Aggregate Functions
Example
SELECT lower(name)
FROM departments;
SELECT *
FROM departments
WHERE lower(name) = 'development';
151
Format Functions
Function
Return Type
Description
Example
to_char(timestamp, text)
text
to_char(current_timestamp, 'HH12:MI:SS')
to_char(interval, text)
text
to_char(int, text)
text
to_char(125, '999')
text
to_char(125.8::real, '999D9')
to_char(numeric, text)
text
to_char(-125.8, '999D99S')
to_date(text, text)
date
to_number(text, text)
numeric
to_number('12,454.8-', '99G999D9S')
to_timestamp(text, text)
to_timestamp(double precision)
to_timestamp(1284352323)
152
Concatenating Strings
Use two vertical bar symbols ( || ) to concatenate strings together
Example
SELECT 'Department ' || department_id || ' is: ' || name
FROM departments;
153
154
Inner Joins
Inner joins are the most common
Only rows that have corresponding rows in the joined table are returned
Example:
SELECT ename, dname
FROM emp, dept
WHERE emp.deptno = dept.deptno;
155
156
Outer Joins
Returns all rows even if there is no corresponding row in the joined table.
157
Indexes
Indexes are a common way to enhance performance
PostgreSQL supports several index types:
B-tree (default)
Hash only used when the WHERE clause contains a simple comparison using the
= operator (discouraged because they are not crash safe)
Index on Expressions use when quick retrieval speed is needed on an often used
expression. Inserts and updates will be slower.
Partial Index Indexes only rows that satisfy the WHERE clause (the WHERE clause
need not include the indexed column). A query must include the same WHERE clause
to use the partial index.
158
Example Index
Syntax:
CREATE [ UNIQUE ] INDEX [ CONCURRENTLY ] [ name ] ON table_name [
USING method ] ( { column_name | ( expression ) } [ COLLATE
collation ] [ opclass ] [ ASC | DESC ] [ NULLS { FIRST | LAST }
] [, ...] )[ WITH ( storage_parameter = value [, ... ] ) ]
[ TABLESPACE tablespace_name ]
[ WHERE predicate ]
Example:
159
Module Summary
Data Types
Introduction to various SQL
functionalities:
Structured Query Language (SQL)
Quoting
Tables and Constraints
SQL Functions
Concatenation
Manipulating Data using
Nested Queries
INSERT,UPDATE and DELETE
Joins
Creating Other Database Objects Materialized Views
Sequences, Views and Domains
Updatable Views
Indexes
160
Lab Exercise - 1
Test your knowledge:
1. Initiate a psql session
2. psql commands access the database
True/False
True/False
True/False
5. There are coding errors in the following statement. Can you identify them?
SELECT empno, ename, sal * 12 ANNUAL SALARY FROM emp;
161
Lab Exercise - 2
1. Write a statement for the following:
- The HR department needs a report of all employees. Write a query to display the
name, department number, and department name for all employees.
- Create a report to display employee names and employee numbers along with their
managers name and managers number. Label the columns Employee, Emp#,
Manager, and Mgr#, respectively.
- Create a report for the HR department that displays employee names, department
numbers, and all the employees who work in the same department as a given
employee. Give each column an appropriate label.
162
Lab Exercise - 3
1. Write a query that displays the employee number and name of all employees
who work in a department with any employee whose name contains an E.
(Use a subquery.)
2. Update and delete data in the emp table.
- Change the name of employee 7566 to Drexler.
- Change the salary to $1,000 for all employees who have a salary less than
$900.
- Verify your changes to the table.
- Delete MILLER from the emp table.
163
Lab Exercise - 4
1. Create the EMP2 table based on the structure of the EMP table. Include only the empno,
ename, sal, and deptno columns. Name the columns in your new table ID,
FIRST_NAME, SALARY , and DEPTID, respectively.
2. The staff in the HR department wants to hide some of the data in the EMP table. They
want a view called EMPVU based on the employee numbers, employee names, and
department numbers from the EMP table. They want the heading for the employee name
to be EMPLOYEE.
3. Confirm that the view works. Display the contents of the EMPVU view.
4. Using your EMPVU view, write a query for the SALES department to display all employee
names and department numbers.
164
Lab Exercise - 5
1. You need a sequence that can be used with the primary key column of the
dept table. The sequence should start at 60 and have a maximum value of
200. Have your sequence increment by 10. Name the sequence
dept_id_seq.
2. To test your sequence, write a script to insert two rows in the dept table.
3. Create an index on the deptno column of the dept table.
4. Create and test a partial index.
165
Module 9
Tablespaces
166
Objectives
In this module we will cover
167
Data Files:
Can belong to only one tablespace
Are used to store database objects
Cannot be shared by multiple tables (one or more per table)
168
Advantages of Tablespaces
Control the disk layout for a Database Cluster.
Heavily used indexes can be placed on fast media using tablespaces.
Fast
Media
Slow
Media
169
170
Creating Tablespaces
Each user-defined tablespace has a symbolic link inside the
PGDATA/pg_tblspc directory.
This symbolic link is named after the tablespace's OID.
Tablespace directory contains a subdirectory named after the PPAS
version used to built the database cluster.
Each database have a separate subdiretory.
Tablespaces can be created using CREATE TABLESPACE command.
Tablespace directory must be created and appropriate permissions must
be set prior to running CREATE TABLESPACE command.
CREATE TABLESPACE tablespace_name [ OWNER user_name ] LOCATION
'directory'
171
172
Altering Tablespaces
ALTER TABLESPACE is used to change the definition of a tablespace.
ALTER TABLESPACE can be used to rename a tablespace, change
ownership, move objects to other tablespaces and set custom values for a
configuration parameter.
Only owner or superuser can alter a tablespace.
173
Dropping Tablespace
DROP TABLESPACE removes a tablespace from the system
Only owner or superuser can drop a tablespace
174
Lab Exercise
1. Write a statement or meta command in psql to view the list of all
tablespaces
2. You need to create a new tablespace labtab for the labapp application.
Start with creation of new directory /home/postgres/labtab and then
create the required tablespace
3. Create a table labtest in the edb database with tablespace labtab
4. Verify the location of the newly created table labtest
5. Move the labtest table to the pg_default tablespace
6. Remove labtab tablespace and the associated directory
175
Summary
In this module we covered:
176
Module 10
Routine Maintenance Tasks
177
Module Objectives
Database Maintenance
Vacuum Freeze
Maintenance Tools
Optimizer Statistics
Vacuumdb
Data Fragmentation
Autovacuuming
Routine Vacuuming
Vacuuming Commands
Routine Reindexing
Preventing Transaction ID
Wraparound Failures
CLUSTER
178
Database Maintenance
Data files become fragmented as data is modified and
deleted
179
Maintenance Tools
PostgreSQL maintenance thresholds can be configured in postgresql.conf
Manual scripts can be written watch stat tables like
pg_stat_user_tables
Maintenance commands:
ANALYZE
VACUUM
CLUSTER
180
Optimizer Statistics
Optimizer statistics play a vital role in query planning
Not updated in real time
Collect information for relations including size, row counts, average row
size and row sampling
Stored permanently in catalog tables
The maintenance command ANALYZE updates the statistics
181
182
183
Routine Vacuuming
Obsoleted rows can be removed or reused using vacuuming
Helps in shrinking data file size when required
184
Vacuuming Commands
When executed, the VACUUM command:
- Can recover or reuse disk space occupied by obsolete rows
- Updates data statistics
- Updates the visibility map, which speeds up index-only scans
- Protects against loss of very old data due to transaction ID wraparound
185
186
VACUUM Syntax
187
Example - Vacuuming
188
189
A cluster that runs for a long time (more than 4 billion transactions) would
suffer transaction ID wraparound
This causes a catastrophic data loss
To avoid this, every table in the database must be vacuumed at least once
for every two billion transactions
190
Vacuumdb Utility
The VACUUM command has a command-line executable wrapper called
vacuumdb
191
Autovacuuming
Highly recommended feature of Postgres
It automates the execution of VACUUM, FREEZE and ANALYZE commands
192
Autovacuuming Parameters
193
Routine Reindexing
Indexes are used for faster data access
UPDATE and DELETE on a table modify underlying index entries
Indexes are stored on data pages and become fragmented over time
REINDEX rebuilds an index using the data stored in the index's table
Time required depends on:
Number of indexes
Size of indexes
Load on server when running command
194
When to Reindex
There are several reasons to use REINDEX:
An index has become "bloated", meaning it contains many empty or nearlyempty pages
You have altered a storage parameter (such as fillfactor) for an index
An index built with the CONCURRENTLY option failed, leaving an "invalid"
index
Syntax:
REINDEX { INDEX | TABLE | DATABASE | SYSTEM } name [
FORCE ]
195
CLUSTER
Sort data physically according to the specified index
One time operations and changes are not clustered
An ACCESS EXCLUSIVE lock is acquired
When some data is accessed frequently and can be grouped using an
index, CLUSTER is helpful
CLUSTER lowers disk accesses and speeds up the query when accessing
a range of indexed values
Setting maintenance_work_mem to a large values is recommended
before running CLUSTER
Run ANALYZE afterwards
196
Example - CLUSTER
197
Module Summary
Database Maintenance
Maintenance Tools
Optimizer Statistics
Data Fragmentation
Routine Vacuuming
Vacuuming Commands
Preventing Transaction ID
Wraparound Failures
Vacuum Freeze
The Visibility Map
Vacuumdb
Autovacuuming
198
Lab Exercise - 1
1. While monitoring table statistics on the edbstore database, you found that
some tables are not automatically maintained by autovacuum. You decided
to perform manual maintenance on these tables. Write a SQL script to
perform the following maintenance: Write a SQL script to perform the
following maintenance Reclaim obsolete row space from the customers
table.
199
Lab Exercise 2
1. The composite index named ix_orderlines_orderid on (orderid,
orderlineid) columns of the orderlines table is performing very
slow. Write a statement to reindex this index for better performance.
200
Module 11
Backup and Recovery
201
Objectives
In this module we will cover:
Backup Types
SQL Dump
Backup using PGADMIN
Cluster Dump
Offline Copy Backup
Continuous Archiving
Point-In Time Recovery
202
Backup
As with any database, Postgres database should be backed up regularly.
There are three fundamentally different approaches to backing up
Postgres data:
- SQL dump
- File system level backup
- Continuous Archiving
203
204
205
Click Backup
206
PGADMIN-III: Backup
Specify the backup
filename
207
Standard Unix tools can be used to work around this potential problem.
- You can use your favorite compression program, for example gzip:
- pg_dump dbname | gzip > filename.gz
- Also the split command allows you to split the output into smaller files:
- pg_dump dbname | split -b 1m - filename
208
Syntax:
pg_restore [options] [filename.backup]
209
-C: Create the database named in the dump file & restore directly into it.
-a: Restore the data only, not the data definitions (schema).
-s: Restore the data definitions (schema) only, not the data.
-n <schema>: Restore only objects from specified schema.
-t <table>: Restore only specified table.
210
PGADMIN-III: Restore
Right click on any database.
Click Restore.
211
PGADMIN-III: Restore
212
213
214
215
The database server must be shut down in order to get a usable backup.
File system backups only work for complete backup and restoration of an
entire database cluster.
File system snapshots work for live servers.
216
Requirements:
- wal_level must be set to archive
- archive_mode must be set to on
- archive_command must be set in postgresql.conf which archives WAL
logs and supports PITR
217
archive_mode=on
Unix:
archive_command= cp i %p /mnt/server/archivedir/%f </dev/null
Windows:
archive_command= 'copy "%p" c:\\mnt\\server\\archivedir\\"%f"'
%p is the absolute path of WAL otherwise you can define the path
%f is a unique file name which will be created on above path.
218
219
Point-in-Time Recovery
Point-in-time recovery (PITR) is the ability to restore a database cluster up
to the present or to a specified point of time in the past.
Uses a full database cluster backup and the write-ahead logs found in the
/pg_xlog subdirectory.
Must be configured before it is needed (write-ahead log archiving must be
enabled).
220
221
Point-in-Time Recovery
Settings in the recovery.conf file:
restore_command(string)
Unix:
recovery_target_time(timestamp)
222
Summary
In this module we covered:
Backup Types
SQL Dump
Backup using PGADMIN
Cluster Dump
PGADMIN Backup Globals
Offline Copy Backup
Continuous Archiving
Point-In Time Recovery
223
Module 12
Data Dictionary
224
Module Objectives
The System Catalog Schema
System Information Tables
225
226
Example:
- pg_tables list of tables
- pg_constraints list of constraints
- pg_indexes list of indexes
- pg_trigger list of triggers
227
228
229
230
231
Module Summary
The System Catalog Schema
System Information Tables
232
Lab Exercise - 1
1. You are working with different schemas in a database. After a while you
need to determine all the schemas in your search path. Write a query to
find the list of schemas currently in your search path.
233
Lab Exercise - 2
1. You need to determine the names and definitions of all of the views in
your schema. Create a report that retrieves view information - the view
name and definition text.
234
Lab Exercise - 3
1. Create of report of all the users who are currently connected. The report
must display total session time of all connected users.
2. You found that a user has connected to the server for a very long time
and have decided to gracefully kill its connection. Write a statement to
perform this task.
235
Lab Exercise - 4
1. Write a query to display the name and size of all the databases in your
cluster. Size must be displayed using a meaningful unit.
236
Module 13
Performance Tuning
237
Objectives
In this module we will cover:
Hardware Configuration
OS Configuration
SQL Tuning
Query Plan
Configuration (postgresql.conf)
238
Hardware Configuration
239
Hardware Configuration
240
OS Configuration
241
SQL Tuning
242
SQL Tuning
Statement Processing
Common Query Performance Issues
243
Statement Processing
244
245
246
247
248
PGBadger Log Manager Wizard can configure log min duration statements
Review log messages in PGBadger
249
250
251
Execution Plan
An execution plan shows the detailed steps necessary to execute a SQL
statement
Planner is responsible for generating the execution plan
The Optimizer determines the most efficient execution plan
Optimization is cost-based, cost is estimated resource usage for a plan
252
Optimizer Statistics
The PostgreSQL Optimizer and Planner use table statistics for generating
query plans
Choice of query plans are only as good as table stats
Table statistics
- Stored in catalog tables like pg_class, pg_stats etc.
- Stores row sampling information such as number of rows, distribution frequency, etc.
253
254
255
Output:
Hash Join (cost=25.30..71.16 rows=1510 width=102) (actual
time=0.002..0.002 rows=0 loops=1)
Hash Cond: (office.cityid = city.cityid)
-> Seq Scan on office (cost=0.00..25.10 rows=1510 width=24) (actual
time=0.001..0.001 rows=0 loops=1)
-> Hash (cost=16.80..16.80 rows=680 width=90) (never executed)
-> Seq Scan on city (cost=0.00..16.80 rows=680 width=90)
(never executed)
Planning time: 0.456 ms
Execution time: 0.067 ms
Copyright EnterpriseDB Corporation, 2015. All rights reserved.
256
257
258
259
260
261
262
Avoid expressions as the optimizer may ignore the indexes on such columns
Use equijoins wherever possible to improve SQL efficiency
Try changing the access path and join orders with hints
Avoid full table scans if using an index is more efficient
The join order can have a significant effect on performance
263
264
Adding an index for one SQL can slow down other queries
Verify index usage using the EXPLAIN command
265
Indexes
PostgreSQL provides several index types:
B-tree- equality and range queries on data that can be sorted (Default index type)
Hash- only simple equality comparisons.
266
Multicolumn Indexes
An index can be defined on more than one column of a table
Currently, only the B-tree, GiST, and GIN index types support multicolumn
indexes
Example:
CREATE INDEX test_idx1 ON test (id_1, id_2);
This index will be used when you write:
SELECT * FROM test WHERE id_1=1880 AND id_2= 4500;
267
268
Unique Indexes
Indexes can also be used to enforce uniqueness of a column's value or the
uniqueness of the combined values of more than one column
CREATE UNIQUE INDEX name ON table (column [, ...]);
Currently, only B-tree indexes can be declared unique.
269
Functional Indexes
An index can be created on a computed value from the table columns
For example:
CREATE INDEX test1_lower_col1_idx ON test1 (lower(col1));
270
Partial Indexes
A partial index is an index built over a subset of a table
The index contains entries only for those table rows that satisfy
the predicate
Example:
CREATE INDEX idx_test2 on test(id_1) where province=AB;
271
272
273
274
275
Summary
In this module we will cover:
Hardware Configuration
OS Configuration
SQL Tuning
Query Plan
Configuration (postgresql.conf)
276
13.PGPool
277
PGPOOL-II
pgpool-II is a middleware that works between PostgreSQL servers and a
PostgreSQL database client.
278
Connection Pooling
pgpool-II saves connections to the PostgreSQL servers, and reuse them
whenever a new connection with the same properties (i.e. username,
database, protocol version) comes in. It reduces connection overhead, and
improves system's overall throughput.
279
Replication
pgpool-II can manage multiple PostgreSQL servers. Using the replication
function enables creating a realtime backup on 2 or more physical disks,
so that the service can continue without stopping servers in case of a disk
failure.
280
Load Balance
If a database is replicated, executing a SELECT query on any server will
return the same result. pgpool-II takes an advantage of the replication
feature to reduce the load on each PostgreSQL server by distributing
SELECT queries among multiple servers, improving system's overall
throughput. At best, performance improves proportionally to the number of
PostgreSQL servers. Load balance works best in a situation where there
are a lot of users executing many queries at the same time.
281
282
What is a Transaction?
The essential point of a transaction is that it bundles multiple steps into a
single ACID event characterized by:
283
284
15.Table Partitioning
285
Objective
Partitioning
Partition Methods
Partition Setup
Partitioning Example
Partition Table Explain Plan
286
What is Partitioning?
PostgreSQL supports basic table partitioning
Splitting one large table into smaller pieces
287
Partitioning
Partitioning refers to splitting what is logically one large table into smaller
physical pieces.
288
Partitioning
Bulk deletes may be accomplished by simply removing one of the
partitions
289
Partitioning Methods
Range Partitioning
Range partitions are defined via key column(s) with no overlap or gaps
List Partitioning
Each key value is explicitly listed for the partitioning scheme Partitioning Methods
290
Manageability
Allows data to be added and removed easier
Maintenance is easier (vacuum, reindex, cluster), can focus on active data
Scalability
Manage larger amounts of data easier
Remove any hardware constraints
291
292
Partition 1:
create table child1 (check (id between 1 and 100)) inherits (master_table);
Partition 2:
create table child2 (check (id between 101 and 200)) inherits (master_table);
293
else
RAISE NOTICE 'INVALID ID RANGE';
end if;
return null;
end;
$$ language plpgsql;
294
Creating Trigger
postgres=# create trigger trg_ins_cit BEFORE INSERT on master_table
FOR EACH ROW EXECUTE PROCEDURE trg_cit_ins();
CREATE TRIGGER
postgres=#
295
Triggers
Created with the CREATE FUNCTION command
Several special variables are created automatically in the top-level block.
NEW
Data type RECORD; variable holding the new database row for INSERT/UPDATE
operations in row-level triggers. This variable is NULL in statement-level triggers. "
OLD
Data type RECORD; variable holding the old database row for UPDATE/DELETE
operations in row-level triggers. This variable is NULL in statement-level triggers. "
TG_NAME
Data type name; variable that contains the name of the trigger actually fired. "
296
Triggers (Contd.)
TG_WHEN
Data type text; a string of either BEFORE or AFTER depending on the trigger's definition. "
TG_LEVEL
Data type text; a string of either ROW or STATEMENT depending on the trigger's definition. "
TG_OP
Data type text; a string of INSERT, UPDATE, or DELETE telling for which operation the trigger
was fired. "
TG_RELNAME
Data type name; the name of the table that caused the trigger invocation. "
TG_NARGS
Data type integer; the number of arguments given to the trigger procedure in the CREATE
TRIGGER statement. "
297
Triggers (Contd.)
TRIGGERS TYPES : in order to create trigger we need to create function
first and return triggers
ROW LEVEL :- will get fire on each row affected on the database, for eg if
you are executing an update query on 1000 rows the trigger will get fired
for 1000 times.
STATEMENT LEVEL :- here it only fires only one time compared to above.
298
16.PG_UPGRADE
299
Pg_upgrade
Typically pg_dump is required for major version upgrades (e.g. 8.3.X to
8.4.X, or 8.4.X to 9.0.X)
pg_upgrade is used to migrate PostgreSQL data files to a later
PostgreSQL major version without the data dump/reload
pg_upgrade does its best to make sure the old and new clusters are
binary-compatible, e.g. by checking for compatible compile time settings,
including 32/64-bit binaries.
It is important that any external modules are also binary compatible,
though this cannot be checked by pg_upgrade.
pg_upgrade supports upgrades from 8.3.X and later to the current major
release of PostgreSQL.
300
Pg_upgrade
You can use pg_upgrade utility for migrating old cluster data
directories to new version.
Syntax:
pg_upgrade [OPTIONS]...
301
Pg_upgrade
Options:
-b, --old-bindir old cluster executable directory
-B, --new-bindir new cluster executable directory
-c, --check check clusters only, don't change any data
-d, --old-datadir old cluster data directory
-D, --new-datadir new cluster data directory
-l, --logfile log session activity to file
-p, --old-port old cluster port number (default 5432)
-P, --new-port new cluster port number (default 5432)
-u, --user clusters superuser (default "postgres")
-v, --verbose enable verbose output
302
Pg_upgrade
Before running pg_upgrade you must:
create a new database cluster (using the new version of initdb)
shutdown the postmaster servicing the old cluster
shutdown the postmaster servicing the new cluster
When you run pg_upgrade, you must provide the following information:
For example:
pg_upgrade -d oldCluster/data -D newCluster/data -b oldCluster/bin -B newCluster/bin
303
17.Loading/Unloading
304
Copy
postgres=# COPY emp (empno,ename,job,sal,comm,hiredate) TO
'/tmp/emp.csv' CSV HEADER;
COPY
postgres=# \! cat /tmp/emp.csv
empno,ename,job,sal,comm,hiredate
7369,SMITH,CLERK,800.00,,17-DEC-80 00:00:00
7499,ALLEN,SALESMAN,1600.00,300.00,20-FEB-81 00:00:00
7521,WARD,SALESMAN,1250.00,500.00,22-FEB-81 00:00:00
7566,JONES,MANAGER,2975.00,,02-APR-81 00:00:00
305
Copy
Postgres =# CREATE TEMP TABLE empcsv (LIKE emp);
CREATE TABLE
-------+--------+-----------+-----+--------------------+---------+---------+-------
306
Copy
Options:
-b, --old-bindir old cluster executable directory
-B, --new-bindir new cluster executable directory
-c, --check check clusters only, don't change any data
-d, --old-datadir old cluster data directory
-D, --new-datadir new cluster data directory
-l, --logfile log session activity to file
-p, --old-port old cluster port number (default 5432)
-P, --new-port new cluster port number (default 5432)
-u, --user clusters superuser (default "postgres")
-v, --verbose enable verbose output
307
Thank You
shankarnag@taashee.com
308