About Me
PostgreSQL ~ 6.5
CTO @myYearbook.com
Scaled initial infrastructure
Not as involved in day-to-day database operations and development
Twitter: @Crad
Scaling?
Concurrency
[Chart: Requests per Second, hourly breakdown over 24 hours, 6am to 6am]
Increasing Size-On-Disk
[Chart: Size in GB by month, Jan to Dec]
Size on Disk
Relations
Indexes
Constraints
Available Memory
Disk Speed
IO Bus Speed
Keep it in memory.
Client Connections
Apache+PHP
One connection per backend for each pg_connect
Python
One connection per connection*
ODBC
One connection to Postgres per ODBC connection
Lock Contention?
Each backend for a connected client has to check for locks
[Diagram, shown in three steps: the Postgres Master Process with its Stats Collector, Autovacuum, and WAL Writer processes, plus one Connection Backend per Client Connection. The steps show one client, then two clients, then many clients.]
250 Apache Backends x 1 Connection per Backend x 250 Servers = 62,500 Connections
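The multiplication above is easy to check, and the same arithmetic shows what a pooler changes. This is a minimal sketch: the function names are hypothetical, and the pooled layout (one pooler per web server holding 20 server connections) is an assumption for illustration, not a figure from the talk.

```python
def direct_connections(backends_per_server: int, conns_per_backend: int, servers: int) -> int:
    """Postgres connections when every Apache backend connects directly."""
    return backends_per_server * conns_per_backend * servers


def pooled_connections(poolers: int, pool_size: int) -> int:
    """Postgres connections when clients go through poolers (hypothetical layout)."""
    return poolers * pool_size


if __name__ == "__main__":
    # Figures from the slide: 250 backends x 1 connection x 250 servers.
    print(direct_connections(250, 1, 250))
    # Assumption: one pooler per web server, 20 server connections each.
    print(pooled_connections(250, 20))
```

Under these assumed numbers the server-side connection count drops by more than an order of magnitude, which is the motivation for the pooling slides that follow.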
Solvable Problems!
The Trailblazers
Solving Concurrency
pgBouncer
Session Pooling
Transactional Pooling
Statement Pooling
Connection Pooling
[Diagram 1: three groups of Clients (hundreds of connections each) connect to a single pgBouncer, which holds tens of connections to each of Postgres Server #1, #2, and #3.]
[Diagram 2: three groups of Clients (hundreds of connections each) connect to a Local pgBouncer each; each Local pgBouncer holds tens of connections to a central pgBouncer, which holds tens of connections to each of Postgres Server #1, #2, and #3.]
Easy to run
Usage: pgbouncer [OPTION]... config.ini
  -d, --daemon           Run in background (as a daemon)
  -R, --restart          Do a online restart
  -q, --quiet            Run quietly
  -v, --verbose          Increase verbosity
  -u, --user=<username>  Assume identity of <username>
  -V, --version          Show version
  -h, --help             Show this help screen and exit
userlist.txt
username  password
foo       bar
pgbouncer.ini
Specifying Connections
[databases]
; foodb over unix socket
foodb =
; redirect bardb to bazdb on localhost
bardb = host=localhost dbname=bazdb
; access to dest database will go with single user
forcedb = host=127.0.0.1 port=300 user=baz password=foo client_encoding=UNICODE datestyle=ISO connect_query='SELECT 1'
Authentication
; any, trust, plain, crypt, md5
auth_type = trust
#auth_file = 8.0/main/global/pg_auth
auth_file = etc/userlist.txt
admin_users = user2, someadmin, otheradmin
stats_users = stats, root
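The slide uses auth_type = trust, so the password column in userlist.txt is not checked. If md5 from the list of options were used instead, the auth file would hold a PostgreSQL-style hash: the string md5 followed by the hex MD5 digest of the password concatenated with the username, with both fields in double quotes. A minimal sketch of producing such a line; the helper name is hypothetical and the foo/bar credentials are the sample values from the userlist.txt slide.

```python
import hashlib


def md5_userlist_entry(username: str, password: str) -> str:
    """Build one userlist.txt line with a PostgreSQL-style MD5 password hash."""
    # PostgreSQL's md5 scheme hashes password + username, then prefixes "md5".
    digest = hashlib.md5((password + username).encode("utf-8")).hexdigest()
    return f'"{username}" "md5{digest}"'


if __name__ == "__main__":
    # Sample credentials from the slide; never ship real passwords this way.
    print(md5_userlist_entry("foo", "bar"))
```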
Stats Users?
SHOW HELP|CONFIG|DATABASES|POOLS|CLIENTS|SERVERS|VERSION
SHOW FDS|SOCKETS|ACTIVE_SOCKETS|LISTS|MEM

pgbouncer=# SHOW CLIENTS;
 type | user  | database  | state  |   addr    | port  | local_addr | local_port |    connect_time
------+-------+-----------+--------+-----------+-------+------------+------------+---------------------
 C    | stats | pgbouncer | active | 127.0.0.1 | 47229 | 127.0.0.1  |       6000 | 2011-09-13 17:55:46

* Truncated columns for display purposes
Pooling Behavior
pool_mode = statement
server_check_query = select 1
server_check_delay = 10
max_client_conn = 1000
default_pool_size = 20
server_connect_timeout = 15
server_lifetime = 1200
server_idle_timeout = 60
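To make the ratio in these settings concrete, the sketch below reads the pooling values with Python's standard configparser and reports how many client connections may share each server connection. It is an illustration only: the [pgbouncer] section header and the pool_summary helper are assumptions added here, and default_pool_size applies per user/database pair, so the ratio describes a single pair.

```python
import configparser

# Subset of the settings from the slide, wrapped in an assumed section header.
SAMPLE = """
[pgbouncer]
pool_mode = statement
max_client_conn = 1000
default_pool_size = 20
"""


def pool_summary(ini_text: str) -> dict:
    """Summarise the client-to-server connection ratio for one pool."""
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    section = parser["pgbouncer"]
    clients = section.getint("max_client_conn")
    pool = section.getint("default_pool_size")
    return {
        "pool_mode": section["pool_mode"],
        "max_clients": clients,
        "pool_size": pool,
        "clients_per_server_conn": clients / pool,
    }


if __name__ == "__main__":
    print(pool_summary(SAMPLE))
```

With the slide's values, up to 1000 clients are multiplexed onto 20 server connections for a given user/database pair.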
Skytools
Scale-Out Reads
[Diagram: four groups of Clients, pgBouncer, Load Balancer, and the Canonical Database]
PGQ
The Ticker
ticker.ini
[pgqadm]
job_name = pgopen_ticker
db = dbname=pgopen

# how often to run maintenance [seconds]
maint_delay = 600

# how often to check for activity [seconds]
loop_delay = 0.1

logfile = ~/Source/pgopen_skytools/%(job_name)s.log
pidfile = ~/Source/pgopen_skytools/%(job_name)s.pid
Londiste
replication.ini
[londiste]
job_name = pgopen_to_destination

provider_db = dbname=pgopen
subscriber_db = dbname=destination

# it will be used as sql ident so no dots/spaces
pgq_queue_name = pgopen

logfile = ~/Source/pgopen_skytools/%(job_name)s.log
pidfile = ~/Source/pgopen_skytools/%(job_name)s.pid
Install Londiste
londiste.py replication.ini provider install
londiste.py replication.ini subscriber install
DDL?
Great Success!
PL/Proxy
plProxy Server
A-F Server
G-L Server
M-R Server
S-Z Server
Sharded Request
CREATE FUNCTION get_user_email(username text)
RETURNS SETOF text AS $$
    CLUSTER usercluster;
    RUN ON hashtext(username);
$$ LANGUAGE plproxy;
Sharding Setup
Need 3 Functions:
plproxy.get_cluster_partitions(cluster_name text)
plproxy.get_cluster_version(cluster_name text)
plproxy.get_cluster_config(in cluster_name text, out key text, out val text)
get_cluster_partitions
CREATE OR REPLACE FUNCTION plproxy.get_cluster_partitions(cluster_name text)
RETURNS SETOF text AS $$
BEGIN
    IF cluster_name = 'usercluster' THEN
        RETURN NEXT 'dbname=part00 host=127.0.0.1';
        RETURN NEXT 'dbname=part01 host=127.0.0.1';
        RETURN;
    END IF;
    RAISE EXCEPTION 'Unknown cluster';
END;
$$ LANGUAGE plpgsql;
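RUN ON with a hash value selects a partition from the low bits of the hash, which is why PL/Proxy expects the number of partitions to be a power of two. The sketch below imitates that routing for the two partitions listed in get_cluster_partitions. It is an illustration under stated assumptions: zlib.crc32 stands in for hashtext() and does not produce the values PostgreSQL would, and both helper names are hypothetical.

```python
import zlib

# Connection strings as returned by get_cluster_partitions for 'usercluster'.
PARTITIONS = [
    "dbname=part00 host=127.0.0.1",
    "dbname=part01 host=127.0.0.1",
]


def stand_in_hash(key: str) -> int:
    """Stand-in for hashtext(); NOT the values PostgreSQL would produce."""
    return zlib.crc32(key.encode("utf-8"))


def pick_partition(key: str, partitions: list) -> str:
    """Route a key to a partition using the low bits of its hash."""
    count = len(partitions)
    # A power of two has exactly one bit set, so count & (count - 1) is zero.
    if count == 0 or count & (count - 1):
        raise ValueError("partition count must be a power of two")
    return partitions[stand_in_hash(key) & (count - 1)]


if __name__ == "__main__":
    for name in ("alice", "bob", "carol"):
        print(name, "->", pick_partition(name, PARTITIONS))
```

The same key always lands on the same partition, so a function such as get_user_email runs on exactly one shard per call.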
get_cluster_version
CREATE OR REPLACE FUNCTION plproxy.get_cluster_version(cluster_name text)
RETURNS int4 AS $$
BEGIN
    IF cluster_name = 'usercluster' THEN
        RETURN 1;
    END IF;
    RAISE EXCEPTION 'Unknown cluster';
END;
$$ LANGUAGE plpgsql;
get_cluster_config
CREATE OR REPLACE FUNCTION plproxy.get_cluster_config(
    in cluster_name text,
    out key text,
    out val text)
RETURNS SETOF record AS $$
BEGIN
    -- lets use same config for all clusters
    key := 'connection_lifetime';
    val := 30*60; -- 30m
    RETURN NEXT;
    RETURN;
END;
$$ LANGUAGE plpgsql;
get_cluster_config values
connection_lifetime
query_timeout
disable_binary
keepalive_idle
keepalive_interval
keepalive_count
SQL/MED
plproxyrc
PL/pgSQL-based API for table-based management of PL/Proxy
BSD Licensed
https://github.com/myYearbook/plproxyrc
Server-to-Server
[Diagram: two groups of Clients, each behind a Local pgBouncer, reach a Load Balancer in front of two plProxy Servers; pgBouncer instances sit in front of Postgres Server #1, #2, and #3.]
Questions?