Académique Documents
Professionnel Documents
Culture Documents
Michael Tate
Galen Charlton
Scope of workshop
What to monitor.
When to monitor.
What is
Availability Monitoring?
Stand-alone server
(Virtual or Physical)
Existing server within the cluster
Utility server
Load balancer
Logging server
What to monitor?
Is brick in rotation?
{bricks|utility|sip}:~$ for i in \
`ps wax | grep -v grep | grep -i listener | cut -d"[" -f2 | cut -d"]" -f1`; \
do echo "($i) \
Listener count: `ps wax | grep -v grep | grep $i | grep -ic listener` \
Drone count: `ps wax | grep -v grep | grep $i | grep -ic drone`"\
; done
OpenSRF drones...cont
(opensrf.settings)
(open-ils.acq)
(open-ils.booking)
(open-ils.cat)
(open-ils.supercat)
(open-ils.search)
(open-ils.circ)
(open-ils.actor)
(open-ils.storage)
(open-ils.penalty)
(open-ils.collections)
(open-ils.ingest)
(open-ils.reporter)
(open-ils.permacrud)
(open-ils.trigger)
(open-ils.fielder)
(open-ils.vandelay)
(opensrf.math)
(opensrf.dbmath)
(open-ils.auth)
(open-ils.cstore)
(open-ils.reporter-store)
(open-ils.pcrud)
Listener
Listener
Listener
Listener
Listener
Listener
Listener
Listener
Listener
Listener
Listener
Listener
Listener
Listener
Listener
Listener
Listener
Listener
Listener
Listener
Listener
Listener
Listener
count:
count:
count:
count:
count:
count:
count:
count:
count:
count:
count:
count:
count:
count:
count:
count:
count:
count:
count:
count:
count:
count:
count:
1
1
1
1
1
1
1
1
1
1
1
1
2
1
1
1
1
1
1
1
1
1
1
Drone
Drone
Drone
Drone
Drone
Drone
Drone
Drone
Drone
Drone
Drone
Drone
Drone
Drone
Drone
Drone
Drone
Drone
Drone
Drone
Drone
Drone
Drone
count:
count:
count:
count:
count:
count:
count:
count:
count:
count:
count:
count:
count:
count:
count:
count:
count:
count:
count:
count:
count:
count:
count:
5
1
1
1
1
1
1
2
4
1
1
5
2
5
5
5
1
1
1
1
1
1
1
Is postgres running?
Is pgpool running?
db1$
/usr/lib64/nagios/plugins/check_tcp -p 5432
TCP OK - 0.000 second response time on port 5432|time=0.000093s;;;0.000000;10.000000
Is slony running?
{all-servers}:~$ /usr/lib/nagios/plugins/check_load -w 3 -c 4
OK - load average: 0.00, 0.00, 0.00|load1=0.000;3.000;4.000;0;
load5=0.000;3.000;4.000;0; load15=0.000;3.000;4.000;0;
When to monitor?
check_period
notification_period
known exclusions
db snapshots
log housekeeping
intensive reports
etc.
Q&A
Michael Tate
Galen Charlton
Thank You!
mtate@esilibrary.com
Michael Tate
Galen Charlton