assigned the task of implementing a monitoring system for the new Data Centre (DC),
recently completed and used by the University for its central administrative computing.
Already implemented was an OSS package called Cacti; it had been selected by
management due to their familiarity with it, having used it extensively to monitor Java
Virtual Machine (JVM) for the Learning Management System called Blackboard Learn.
In addition to JVM monitoring, Cacti had been leveraged to monitor host network
interfaces, memory usage, CPU usage, and Apache Web Server statistics. The goal was
devices, physical and virtual within and connected to this new facility.
Questions arose regarding Cacti's limitations, which staff had experienced and which
motivated a review of available monitoring packages. These questions included: does our
current monitoring package meet all our needs? Can the package be scaled up as the DC
and its virtual infrastructure and systems grow? Should we change to another OSS
solution like Zabbix or Nagios? Each person had expressed his or her own preference and
This model revealed to management the strengths and weaknesses of Cacti in
addition to Zabbix and Nagios. These two OSS Monitoring packages are commonly used
in the community.
The target users of this system are the staff responsible for this facility, thus we
have implemented with the assistance of the Hardware Infrastructure Group (HIG) in
consultation with the Manager of the Data Centre. In this case we have included real
employee salary, hardware, and services costs in Canadian dollars (sign: $; code: CAD),
Starting with the Manager, Data Centers, John Calvin, and in consultation with other
members of his team, we have conducted several interviews to understand why the
customer seeks to use OSS instead of CSS. Additionally, we needed to establish the
Services division (I+TS) of The University and outlines the importance of this system.
We then show how the model was applied to three OSS packages, and finally how these
results justify retaining the already implemented Cacti solution. In addition, the outcome
of the evaluation suggested the use of additional plugins and performance tuning to
6.1 Introduction
research university in Toronto, Canada. At the time of writing it had 65,612 full time
undergraduates, of which 56,380 are domestic and 9,232 are international students, and
15,287 graduate students, of which 13,210 are domestic and 2,077 are international. The
University's annual operating budget for the current fiscal year is $1.8B CAD (University
of Toronto, 2012).
According to the I+TS official web site61, the Office of the Chief Information
Officer (CIO) is responsible for planning and provision [sic] of central IT services at the
University of Toronto (University of Toronto, 2010). Eight key areas integrate I+TS as
Within the portfolio of the CIO, the Enterprise Infrastructure Solutions group is
responsible for designing and implementing networking, server, storage and other
enterprise-level solutions that are secure, efficient, reliable and cost-effective (University
of Toronto, 2012).
the I+TS Roadmap (University of Toronto, 2011), in order to provide reliable and secure
61 Information + Technology services - http://jmll.me/tbc2
network, server, and storage services while decreasing the use of space and energy. In
May of 2011 the Data Centre redesign started with an approved budget of $5.1M CAD
and ROI in two years. The DC redesign project was led by Patrick Hopewell, Director of EIS,
Tom Molnar, Manager HIG, and John Calvin, Manager, Data Centers, working with
Ehvert Mission Critical to build a state of the art, sustainable and efficient Data Centre.
devices like switches, routers, fiber channel switches, servers, and storage heads, but it
needs to include very specific devices such as Uninterruptible Power Supplies (UPS),
Power Distribution Units (PDUs), and airflow, temperature and humidity sensors, among
other instrumentation devices. For instance, according to Calvin (2012), the accepted
safe rate of change of temperature for most server equipment is less than 10°C/hr. A loss
of airflow in a sealed vented cabinet operating at about 10 kW would cause a 20°C rise of
The I+TS ecosystem is categorized as hybrid, due to the mixture of OSS and
CSS-based information technology services. However, there is a push toward the adoption
of OSS to cap and reduce licensing costs. One scenario often mentioned by Calvin (2013) is
the monitoring of large-scale networks having many thousands of devices, which in the
case of some CSS licensing models would require licensing on a per-device basis. On the
other hand, OSS offers no added software purchase or licensing costs as the number of
devices increases. Other disadvantages of using CSS were enumerated, such as the extra
costs for integrating software via pay-for-use APIs, software lifecycles imposed by the
According to Gartner (2011), Cacti, Nagios and Zabbix are among the most
popular OSS web-based monitoring systems in the current market. Thus, these three
candidates were evaluated using The Integral OSS Evaluation Model, defined in this
document. It is important to reiterate that Cacti has already been implemented, and thus
the outcome of this analysis is to be used to support its continued use or justify the
The primary user community for the monitoring software is a group of highly
skilled technical staff responsible for the design and implementation of datacenter
networking, servers, storage and other enterprise level solutions. The secondary user
responsible for the development and implementation of all computer applications offered
During the interview conducted with the manager of the
Data Centre, we asked: "What are the requirements for the IT Infrastructure Monitoring System?"
The answers to this question and others are documented in Appendix G. As a result,
almost 50 elements were listed, and for a better understanding as the model recommends,
they were classified into three main categories: functional, non-functional and
ensure that the chosen software will fulfill the needs expressed by the user. In this case
the evaluator must have a fair amount of technological knowledge to recognize for
instance, how a monitoring system will handle 64-bit values, and of course if the
technological requirement is not well defined, working closely with the user to gain a
Tables 43, 44, and 45 show the identification cards for the evaluated packages;
they provide a quick reference of the relevant characteristics of each package. For
instance, Cacti is written in PHP, Nagios in C and Zabbix in three different languages
Areas in which Zabbix seems to be very strong are official documentation,
community support, and issue tracker systems. On the other hand, Cacti has a user
based off [sic] CentOS that sets up and configures a customized Cacti install (Conner,
2012).
Another important fact is that both Nagios and Zabbix have their own business
62 CactiEZ - http://jmll.me/tbc19
Category             Sub-category        Description
General              Name                Zabbix
                     Version             2.0.4
                     License             GNU General Public License version 2
                     Type                Monitoring System
                     Site                http://www.zabbix.org/
                     Language            C (server, proxy, agent), PHP (frontend), Java (Java gateway)
Requirements         Hardware            Network Access, 100MB Disk Space, 256M RAM, Pentium IV or equivalent
                     Software            Apache Web Server, MySQL, PostgreSQL, SQLite, Oracle or IBM DB2
Documentation        Official            https://www.zabbix.com/wiki/doku.php; http://blog.zabbix.com/
                     Non-Official        https://s3.amazonaws.com/analyticarts/zabbix/Zabbix2-0Manual.pdf
                     Relevant            N/A
                     Books               Zabbix 1.8 Network Monitoring By: Rihards Olups
Support & Community  Official            https://www.zabbix.com/forum/;
                                         https://support.zabbix.com/secure/Dashboard.jspa;
                                         https://lists.sourceforge.net/lists/listinfo/zabbix-announce;
                                         https://lists.sourceforge.net/lists/listinfo/zabbix-users
                     Non-Official        N/A
                     Issue tracker site  https://support.zabbix.com/browse/ZBX
                     Relevant            N/A
Distribution         Source              http://sourceforge.net/projects/zabbix/files/ZABBIX%20Latest%20Stable/2.0.4/zabbix-2.0.4.tar.gz/download
                     Binaries            http://www.zabbix.com/download.php
                     Platforms           Cross Platform
Architecture         Modularity          Plugins
                     Plugins             N/A
+ Services           Training            http://www.zabbix.com/business_solutions.php
                     Support             http://www.zabbix.com/business_solutions.php
                     Consulting          http://www.zabbix.com/business_solutions.php
Using the results from the Definition phase, the identification cards for the three
candidates, and the criterion previously defined in the model, will be scored by first
6.4.1 Functionality
Recall that requirements shown in Table 42 have been classified in three main
sections. Each requirement has been assigned a weight relative to its importance to the
customer. These requirements are listed in Appendix I. Summarizing, Cacti met 14 out of
meets more requirements for an IT Infrastructure Monitoring System than Nagios and
Cacti.
Functionality
                Cacti  Nagios  Zabbix
Functional      0.8    1.0     1.0
Non-functional  0.7    0.7     0.9
Technological   1.3    1.3     1.3
Total           0.9    1.0     1.1
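The Total row above is consistent with an unweighted mean of the three category scores, rounded to one decimal place. The following is a minimal sketch under that assumption; the aggregation rule is inferred from the numbers in the table and is not quoted from the model definition, and the function and variable names are illustrative only.

```python
from statistics import mean

# Category scores transcribed from the Functionality table above.
scores = {
    "Cacti":  {"functional": 0.8, "non_functional": 0.7, "technological": 1.3},
    "Nagios": {"functional": 1.0, "non_functional": 0.7, "technological": 1.3},
    "Zabbix": {"functional": 1.0, "non_functional": 0.9, "technological": 1.3},
}


def functionality_total(category_scores):
    """Assumed aggregation: unweighted mean of the categories, one decimal."""
    return round(mean(category_scores.values()), 1)


# Total per package, keyed by package name.
totals = {package: functionality_total(cats) for package, cats in scores.items()}
```

Under this assumption the computed totals match the Total row of the table for all three packages.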
6.4.2 License
The user did not specify a type of OSS license agreement as being a requirement.
However, the user did specify a preference for OSS vs. CSS. Therefore, Table 47 scores
License
             Cacti  Nagios  Zabbix
License      GPL    GPL     GPL
Required     N/A
Total Score  1      1       1
6.4.3 Community
bug tracking system called bug.cacti.net that is updated frequently. According to the
Cacti Forums, the official community site, it has 228,715 posts, 43,831 topics and a total
cacti-user for users in general and cacti-devel for developers (Cacti.net, 2012). The
mailing list cacti-user is the one with the most activity. Further analysis of data gathered
from the List Archive site provided by SourceForge.net from 2009 to 2012 (illustrated in
Figure 27) showed that 2009 was the most active year to date, with 633 messages; in
2012 the standard deviation of the year's data was 16.5, which shows how the activity
Figure 26 Cacti-user mailing list activity from 2009 to 2012. Data gathered from SourceForge.net (2013).
scripts, data, graph, hosts templates and data queries, both official (supported) and user-
distributions with additional features for an extra fee, Nagios Core, the foundation, is free
and is led by Nagios.org, a community site funded by NE. Nagios.org estimates its
worldwide community at more than 1 million users, including individuals and companies
(Nagios.org, 2013). Exchange is a sub-site of Nagios.org where all types of projects such
as plugins (2500+), add-ons (500+) and utilities (16) among others can be found
of 40,009 posts and 18,312 members. This forum includes private areas for customer
Figure 27 Nagios-user mailing list activity from 2009 to 2012. Data gathered from SourceForge.net (2013).
and nagios-users-ru. Similar to the results we saw for Cacti, the Nagios user mailing list
had more activity than others with a total of 77,205 messages from 2001 to 2012
(SourceForge.net, 2013).
Further analysis on data gathered from the List Archive site provided by
SourceForge.net for the period 2009 to 2012 (illustrated in Figure 28) showed that 2009
was the most active year, with 7,017 messages; in 2012 the standard deviation of the
year's monthly data was 40.9. Finally, there is a World Conference
The Zabbix Community official site is at Zabbix.org, which names itself as the
107,561 posts and 12,185 members (Zabbix SIA, 2013; Zabbix SIA, 2013) and in the
Blog (blog.zabbix.com) site, users find a fair amount of information about Zabbix
versions, solved issues, useful hints and recommendations from Zabbix experts and
mailing lists for announcements, users, developers and translators who translate the
product and documentation into various languages. Among these four, the Zabbix-users
list is the most used, with 1,266 messages from 565 subscribers (SourceForge.net, 2013).
Figure 29 presents an analysis of the Zabbix mailing list data from
SourceForge.net for the period 2009 to 2012, showing that
2009 was the most active year, with 195 messages; in 2012 the standard deviation of the
year's data was 8.16, representing the monthly change in the number of messages.
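The standard deviation figures quoted for the three mailing lists can be obtained from twelve monthly message counts. The sketch below assumes a population standard deviation; the text does not state whether the population or the sample form was used, and the monthly counts shown are hypothetical placeholders, not the SourceForge.net data.

```python
from statistics import pstdev

# Hypothetical monthly message counts for one year (NOT the SourceForge.net data).
monthly_messages = [12, 30, 18, 25, 9, 14, 22, 17, 28, 11, 20, 16]


def monthly_spread(counts):
    """Population standard deviation of monthly message counts (an assumed choice)."""
    return pstdev(counts)


spread = monthly_spread(monthly_messages)
```

A larger value indicates that list activity varied more from month to month during the year. Replacing `pstdev` with `statistics.stdev` gives the sample form if that is the one intended.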
Figure 28 Zabbix-user mailing list activity from 2009 to 2012. Data gathered from SourceForge.net (2013).
We know that Zabbix has a modular architecture; however, we could not identify
an official site hosting plugins or modules. On the other hand, there are posts in the
official forums, but nothing specific with regard to obtaining plugins. A simple Google
search demonstrates how these plugins are hosted on various source-code hosting sites
The Nagios community is certainly well structured and NE has done a great job
involving users in events, blogs, etc. Therefore, the Nagios community is categorized as
Commercial Organization.
Cacti, on the other hand, relies completely on its community, and although no
Organization.
Community
OSS Package  Type  Score
Cacti        ORG   1
Nagios       COR   4
Zabbix       ORG   1
6.4.4 Seniority
Cacti's initial version, 0.5, was released on September 23, 2001,
approximately 11 years ago (SourceForge.net, 2001). The latest version, and the one
being evaluated in this case, is 0.8.8a released on April 29, 2012. This is the 40th release
in its lifetime.
NetSaint (Nagios version 0.0.1) was made publicly available on March 14, 1999
and was renamed Nagios 34 releases later in 2002. After 14 years of development (2013)
it is now at version 3.4.4 (Nagios Core), which is the version evaluated in this case.
and was not released until March 23rd 2004 as Zabbix 1.0. To date, Zabbix has had a total
life; the latest version (2.0.4) was released on December 8th, 2012 and is the version
However, in this case we have chosen the latest stable release, meaning this sub-criterion
can be ignored. On the other hand, lifespan has been taken into account and, in
this case, Nagios with 14 years ranks first; Cacti is in second place with 11 years and
Seniority
Package  Lifespan  Score  Final Score
Cacti    11        3      3
Nagios   14        3      3
Zabbix   9         2      2
Cacti and Nagios are the senior OSS offerings in the IT infrastructure monitoring
6.4.5 Support
forum divided into four main sub-forums: General, with 80,184 posts and 18,184 topics;
Linux/Unix Specific, with 45,263 posts and 93,337 topics; Windows Specific, with
21,514 posts and 3,469 topics; and Unstable Development Versions, with 1,664 posts and
294 topics (Cacti.net, 2013). With respect to paid support, an England-based company
called Transitiv Technologies specializes in Open Source Support & Services and can
provide support through their Cacti Consultancy Services (Transitiv Technologies, 2013).
An American company called credativ LLC also provides OSS support in the US, UK,
Germany, and Canada. credativ's Cacti support covers all of the following Linux
supported distributions: Debian, Ubuntu, Red Hat, SuSE, openSuSE, CentOS, Xandros,
OpenBSD, and FreeBSD for a monthly fee of $305 USD (credativ LLC, 2013).
Nagios Enterprises offers professional annual support for organizations that
require this service (Nagios Enterprises, 2013). NE provides support directly in the
public forum63; however, a specific forum called Customer Support requires credentials
to access, as well as a product license key associated with one of the following products:
Nagios XI, Fusion, Core or Incident Manager. A General Support forum is also
available for community support. The Nagios XI license includes support and
maintenance, from $1,295USD to $2,495USD per year (Nagios Enterprises, 2013). For
Nagios Core, the OSS version, annual support plans are available starting at $2,495USD
The Zabbix community support is provided through the Zabbix Help Forum64,
which hosts 35,560 posts and 10,047 topics. Zabbix offers its customers five different
support tiers: per-incident based support plans up to complex support tiers, version
upgrade, on-site training and on-site consulting (Zabbix SIA, 2013). Nonetheless, there is
no publicly available pricing for this kind of support. The support web site states that the
During the interview of the Manager, Data Centres, we asked his opinion of
supporting the OSS package internally, and he pointed out that he doubted that the
expertise existed within the organization and could be leveraged to support this package.
Staffing levels pose a challenge for any complex monitoring system. If monitoring
63 Nagios Support - http://jmll.me/tbc34
64 Zabbix Support - http://jmll.me/tbc9
solutions are to remain effective, they must constantly evolve; not only with moves, adds,
and changes, but by supporting new devices and new alerting mechanisms - from
numeric pagers to the thing after twitter. Influencing the direction of that evolution is
simpler when you have the source code and one good programmer (Calvin, 2013).
Consolidating the information above, Table 50 shows the score for each kind of
support and shows that support is not a significant concern for any of them.
Support
Kind               Cacti  Nagios  Zabbix
Self-Support       1      0       1
Paid Support       1      1       1
Community Support  1      1       1
Total              3      2       3
However, an organization might face different challenges down the road when
opting for community support. For instance, support response times and the quality of
response provided by an unpaid entity likely have no SLA; the response depends on the
6.4.6 Interoperability
For this criterion, technological requirements were classified into four main
categories: interface, standards, protocols and data formats, in order to evaluate the OSS
Cacti, for instance, has a web interface following the W3C standards and mobile
clients, such as iCacti65 for iOS devices; for Android devices, there is CactiViewer66 and
nmidClient Cacti67. Furthermore, with the support of additional modules or plugins this
OSS can handle PDF and HTML reporting; out-of-the-box it has PNG as its default
image standard. Cacti supports SNMP v1, v2c and v3 out-of-the-box and can export raw
Nagios also has a standardized web interface and mobile clients developed by the
community and the proof of this can be seen simply by searching the Google Play Store
(Android store) for the word Nagios where one can find aNag68 and Nagbag69, among
others; for iOS there are OnCall70 and iNag71, for example. Between core functionalities
and additional plugins, Nagios handles PDF, PNG and RRD, but appears to lack HTML
support. Nagios can export data in all the previously stated required formats.
Finally, Zabbix has both a web interface following W3C and community
developed mobile clients for iOS, such as MobileOp72 and Mozaby73; for Android
65 iCacti, iOS Cacti client - http://jmll.me/tbc35
66 CactiViewer, Android Cacti client - http://jmll.me/tbc36
67 nmidClient Cacti, Android Cacti client - http://jmll.me/tbc37
68 aNag, Android Nagios client - http://jmll.me/tbc38
69 Nagbag, Android Nagios client - http://jmll.me/tbc39
70 OnCall, iOS Nagios client - http://jmll.me/tbc40
71 iNag, iOS Nagios client - http://jmll.me/tbc41
72 MobileOp, iOS Zabbix client - http://jmll.me/tbc42
73 Mozaby, iOS Zabbix client - http://jmll.me/tbc43
devices there are apps including ZAX Zabbix74 and Zabbix on the go, among others in
the Google Play Store. This OSS can handle PDF and PNG but lacks RRD and
HTML support. For the protocols listed, it is fully compatible with SNMP v1,
v2c and v3. Lastly, the only means to export data from Zabbix is by using plain text and
Interoperability
                             Importance  Cacti  Nagios  Zabbix  Cacti Score  Nagios Score  Zabbix Score
Interface
  W3C                        1           1      1       1       1            1             1
  Mobile                     1           1      1       1       1            1             1
Standards
  Portable Document Format   2           1      1       1       2            2             2
  Portable Network Graphics  1           1      1       1       1            1             1
  Hypertext Markup Language  2           1      0       0       2            0             0
  RRDTool                    2           1      1       0       2            2             0
Protocols
  SNMP v1                    2           1      1       1       2            2             2
  SNMP v2c                   2           1      1       1       2            2             2
  SNMP v3                    0           1      1       1       0            0             0
Data formats
  Spreadsheets (XLS)         1           0      1       0       0            1             0
  Comma-separated Values     2           1      1       0       2            2             0
  Plain text                 1           1      1       1       1            1             1
  XML                        2           0      1       1       0            2             2
Final Score                                                     1.2          1.3           0.9
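The per-element scores in the Interoperability scorecard equal the Importance weight multiplied by a 0/1 flag indicating whether the package supports the element, and the Final Score row is consistent with the sum of those products divided by the number of elements (13). The following sketch reproduces that arithmetic; the aggregation rule is inferred from the table values and is an assumption, and the names `ROWS`, `PACKAGES` and `final_scores` are illustrative.

```python
# (importance, cacti_met, nagios_met, zabbix_met), transcribed from the
# Interoperability scorecard above.
ROWS = {
    "W3C": (1, 1, 1, 1),
    "Mobile": (1, 1, 1, 1),
    "Portable Document Format": (2, 1, 1, 1),
    "Portable Network Graphics": (1, 1, 1, 1),
    "Hypertext Markup Language": (2, 1, 0, 0),
    "RRDTool": (2, 1, 1, 0),
    "SNMP v1": (2, 1, 1, 1),
    "SNMP v2c": (2, 1, 1, 1),
    "SNMP v3": (0, 1, 1, 1),
    "Spreadsheets (XLS)": (1, 0, 1, 0),
    "Comma-separated Values": (2, 1, 1, 0),
    "Plain text": (1, 1, 1, 1),
    "XML": (2, 0, 1, 1),
}
PACKAGES = ("Cacti", "Nagios", "Zabbix")


def final_scores(rows):
    """Assumed rule: sum(importance * met) / number of elements, one decimal."""
    totals = [0] * len(PACKAGES)
    for importance, *met_flags in rows.values():
        for index, met in enumerate(met_flags):
            totals[index] += importance * met
    return {
        package: round(total / len(rows), 1)
        for package, total in zip(PACKAGES, totals)
    }


interoperability = final_scores(ROWS)
```

The same weighted arithmetic also reproduces the totals of the user and technical documentation scorecards later in this chapter.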
Table 51 shows the scores for every Interoperability element evaluated as well as
74 ZAX Zabbix - http://jmll.me/tbc48
6.4.7 Security
Cacti, Nagios, and Zabbix were queried in the CERT Vulnerability Notes
database and no results were obtained. However, to check for vulnerabilities
already registered for these packages in 2013, we also consulted the Common Vulnerabilities and
Exposures (CVE) list, which registered two incidents for Nagios, but they affect versions prior
to 3.4.4 (CVE, 2013); thus the scores remain equal at one for all three evaluated OSS
packages. For both Zabbix75 and Cacti76, one vulnerability in 2012 and none in 2013
6.4.8 Roadmap
The Cacti development roadmap is available for future releases, including the one
that was targeted. Cacti 1.0.0 will be released during the first quarter of 2013; Cacti 1.1.0
in the third quarter of 2013 (Cacti.net, 2011). The roadmap also contains major features
for each version. Additionally, the last development check-in in Cacti's SVN repository
(Cacti.net, 2013) was on January 4th, 2013, by the author "Gandalf", the primary
contributor, which indicates that the primary contributor is still involved in the project.
75 Zabbix: Vulnerability Statistics - http://jmll.me/tbc44
76 Cacti: Vulnerability Statistics - http://jmll.me/tbc17
No product roadmap resource was found for Nagios, either on the NE site or on the
Nagios Community site. With respect to activity, the last Nagios Core stable version
The Zabbix roadmap for the next version (2.2) is available at the Zabbix.org
wiki77. The roadmap is divided into two main branches: time and functional. The time
roadmap has no explicit year, but because the page, last edited January 9th, 2013, states
May 1st (no year) we anticipate the release date for version 2.2 on that date in 2013. The
document called What's new available in the documentation site78. The most recent
modification to the Pre-2.1.0 (alpha) source code was made on February 9th, 2013
Using the values from Figure 24 we completed Table 52 to obtain the final score
Roadmap
Indicator\Package  Cacti  Nagios  Zabbix
Roadmap            2      0       2
Project Activity   3      3       3
Final Score        2.5    1.5     2.5
77 Zabbix Roadmap - http://jmll.me/tbc45
78 What's new on Zabbix 2.2 - http://jmll.me/tbc46
6.4.9 Performance
In this case a simple question is asked: Does the OSS package have performance
tuning parameters?
Consider first Cacti. It can use more memory by installing a plugin called
BOOST, which enables it to process a much greater number of data sources
per pass (up to 400,000). It also enables image caching, in order to save
rendering and processing resources and improve front-end performance. Also, Cacti's
default poller (written in PHP) can be replaced with Spine (C based) for high-
large number of hosts and services (more than 1000). Subjects include optimizing
hardware for maximum performance and a set of tunable configuration parameters for the
Zabbix, in its main documentation site, has a best practices article where it
recommends some general advice on hardware, such as using SCSI or SAS instead of
IDE or SATA, fast RAID storage, fast Ethernet adapter and plenty of memory. While it
does not specify how much of each resource to use, it does discuss which resources
performance are shown. This is a multi-tier system and also includes best practices for
the database engine, which is the most important part of Zabbix tuning (Zabbix SIA,
2013).
In this case Cacti is assigned a score of one (1) for this criterion, because it has
been designed to handle large environments and offers plugins and other options to tune
for a larger environment; Nagios is assigned a score of one (1) notwithstanding its limited
performance tuning options in the OSS version Nagios Core, but being pre-compiled is
a significant performance advantage which we felt offsets the limited tuning options; and
finally, a score of two (2) is assigned to Zabbix for providing explicit recommendations
6.4.10 Scalability
Cacti can be considered a vertically scalable monitoring system, because its capacity
is increased by adding memory, CPU and storage, rather than by adding more nodes, which
makes evident its lack of horizontal scalability. It has been documented that the largest Cacti
installation comprises more than 1,000,000 data sources. To accomplish this, a couple of
resources were required: BOOST plugin and MySQL memory tables (Scheck, 2012). The
current 0.8.8 architecture meets the multi-polling strategy to gather data from various
installation is difficult and differs significantly from the default installation. In order to
accomplish horizontal scalability plugins like Distributed Nagios eXecutor79 are needed.
This plugin basically offloads a significant portion of the work normally done by Nagios
2011). Zabbix is self-described as Enterprise Ready, due to its ability to scale from small
environments to large ones with thousands of devices. There are Zabbix installations with
over 100,000 devices monitored, showing that Zabbix is able to process more than
1,000,000 checks per minute using mid-range hardware and collecting gigabytes of
79 Distributed Nagios eXecutor - http://jmll.me/tbc47
According to Calvin (2013) when comparing vertical and horizontal scalability to
linear scalability, linear scalability wins every time; there is no advantage to horizontal
boundaries, for example, exceeding the maximum number of devices may push an
application beyond its linearly scalable window. Since all evaluated OSS packages have
linear scalability, the linear scalability score is meaningless and can be ignored. However,
given a choice between linear and non-linear scalability we would always choose the
linear scalable application. While Nagios should have received a score of three (3) for
being both horizontally and vertically scalable, we have elected to deduct a point due to
6.4.11 Documentation
Our investigation clearly showed that all three packages have adequate
documentation sites and most of the sub-criteria covered; however, there are certain
elements that were not identified. Appendix H lists all the available documentation
resources that we discovered for Cacti, Zabbix and Nagios. It is important to mention that
Table 54 shows both sub-criteria of documentation, user and technical. Our three
candidates have been scored considering the official sources only. The Importance
weighting that has been assigned to each sub-criterion was determined by the customer.
User Documentation
                  Importance  Cacti  Nagios  Zabbix  Cacti Score  Nagios Score  Zabbix Score
U-Guides          2           1      1       1       2            2             2
How-Tos           2           1      1       1       2            2             2
FAQs              1           1      1       1       1            1             1
Total Score                                          1.67         1.67          1.67
Technical Documentation
                  Importance  Cacti  Nagios  Zabbix  Cacti Score  Nagios Score  Zabbix Score
Developers
  API             1           1      1       1       1            1             1
  SDK             0           0      0       0       0            0             0
  SC              1           0      0       0       0            0             0
KB
  Known Issues    2           1      1       1       2            2             2
  FAQS            2           1      1       1       2            2             2
  Problems        1           1      1       1       1            1             1
Troubleshooting
  Diagnosis       1           1      0       1       1            0             1
  Logs            2           1      1       1       2            2             2
Maintenance
  Install         2           1      1       1       2            2             2
  Configure       2           1      1       1       2            2             2
  Optimize        2           1      0       0       2            0             0
Total Score                                          1.36         1.09          1.18
Table 54 User and Technical documentation scorecard: Cacti, Nagios and Zabbix.
Finally, Table 55 provides the final score for each OSS. While the user
documentation for all three packages was equally useful, the technical documentation is
arguably more important for the use of the OSS as the enterprise monitoring system.
Documentation
           Cacti  Nagios  Zabbix
Overall    2.00   1.00    2.00
Technical  1.36   1.09    1.18
User       1.67   1.67    1.67
Final      1.68   1.25    1.62
6.4.12 TCO
There is a belief in the absolute freeness of OSS in many organizations, and this is
also perceptible in the University of Toronto culture. Many of the software packages that
UofT runs are OSS; such is the case of the official Identity Provider Service, which is
Shibboleth and the Next Generation Student Information Services (NGSIS)80 based on
the Kuali Foundation81. With the application of this model to Enterprise Monitoring
Software, we have exposed many of the hidden costs that free-software imposes.
The United Steelworkers (USWA), Staff-Appointed Unit, Local 1998 (USW 1998
I+TS staff falls in the scope of the union, which henceforth will be called unionized staff.
The USW Salary Grid (effective July 1, 2012)82 mandates unionized staff salaries.
The annual salary used in our TCO calculation is based on the hourly rate for pay band 16 at
the hiring rate: $77,568 CAD (subject to deductions required by law), thus the hourly rate
would be $42.61 CAD before taxes. The hourly rate also includes an annual increase in
salary of 2%. As previously mentioned we have not included the overhead for the
80 NGSIS - http://jmll.me/tbc49
81 Kuali Foundation - http://jmll.me/tbc53
82 USW1998 - http://jmll.me/tbc50
employee in our TCO calculations, only the salary. From now on, we will refer to this
As previously mentioned, our analysis of OSS packages has a cost, which we will
include as part of the Up-front evaluation study category within the TCO calculation.
The application of this model to IT Enterprise Monitoring Software has taken roughly
four hours a day for a week, which makes a total of 20 hours or $852.20CAD (ITA-16).
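The staff-time figures used throughout this TCO section follow from multiplying the estimated hours by the hourly rate. The sketch below shows that arithmetic for the ITA-16 rate of $42.61 CAD quoted above; the function name and the rounding mode are assumptions made for illustration, and `Decimal` is used only to avoid floating-point artifacts in currency values.

```python
from decimal import Decimal, ROUND_HALF_UP

# ITA-16 hourly rate in CAD, as quoted in the text.
ITA_16_HOURLY = Decimal("42.61")


def staff_cost(hours, hourly_rate=ITA_16_HOURLY):
    """Cost of staff time in CAD, rounded to cents (rounding mode is an assumption)."""
    cost = Decimal(hours) * hourly_rate
    return cost.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


# Four hours a day over a five-day week, as in the evaluation study.
evaluation_hours = 4 * 5
evaluation_cost = staff_cost(evaluation_hours)
```

The same function covers the other ITA-16 estimates in this section, for example `staff_cost(10)` for the 10-hour customization estimate or `staff_cost(24)` for annual maintenance.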
service called JumpBox83 that provides small virtual machine instances "ready-to-use" for
testing or POC with pre-packaged configurations and can be run in any computing
environment that supports virtualization, i.e. VMware, OpenStack, VirtualBox, etc. The
cost for a Gold license of this service is $150 USD/month. We chose the Gold offering
The internal cost of virtual hardware for a medium-size virtual machine (1vCPU,
groups with 7200-RPM SATA), provided by the I+TS Virtualization and Storage
Services84, is $1,210CAD per annum, which includes VM Support but it does not include
Operating System installation and administration. HIG staff would perform those tasks,
and the cost is reflected in the TCO as the man-hour cost (ITA-16). Using a 30-day trial
83 JumpBox - http://jmll.me/tbc51
84 I+TS Digital Assets - http://jmll.me/tbc52
offering we perform the POC without committing to this cost (University of Toronto,
I+TS, 2012).
The Initial Configuration cost is the time invested in configuring the OSS for
the first run, and thus there is neither a migration cost (data & users) nor a training cost.
Process and best practices refer to the proper configuration of the core elements and
a knowledge base for each product, which may take significant time to acquire.
resource tasked on a 5x5 basis (five hours a day, five days a week) in pay band 10:
unionized staff (ITA-16) with neither external consulting services nor external training.
at the application level includes additional modules, templates (data, graphs, etc.),
Customization for business needs, initial configuration is simply the branding of the
login page and specific data-gathering templates, estimated at 10 hours (ITA-16) or
$426.1CAD.
Training, Initial training consists of gathering all the required information for
the installation and use of the OSS, which is related to the documentation criterion;
considering that Cacti has more technological documentation, we have estimated 5 hours
Cost of Support services in this case will rely on a unionized staff member
(ITA-10). We have estimated 30 hours (ITA-10) of Support, training for support per
annum or $891.9CAD.
(ITA-16) per annum for ongoing training for users or $1,278.3 CAD; 24 hours (ITA-
16) per annum or $1,022.64CAD for Maintenance. Thus, the TCO for Cacti is
$72,073.92 with a discount factor of 5%, previously described. Further details can be
seen in Appendix J.
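To illustrate how a discount factor affects recurring costs, the sketch below discounts the three per-annum Cacti items quoted above (support, ongoing training and maintenance) at 5% over three years. It assumes an end-of-year present-value convention, which may differ from the formula defined earlier in the model, and it deliberately covers only these three items; it does not reproduce the full $72,073.92 total, whose breakdown is given in Appendix J.

```python
def present_value(annual_cost, rate, years):
    """Sum of annual_cost / (1 + rate)**t for t = 1..years.

    End-of-year discounting is an assumption made for this sketch.
    """
    return sum(annual_cost / (1 + rate) ** t for t in range(1, years + 1))


# Per-annum recurring costs for Cacti in CAD, as quoted in the text.
recurring = {
    "support_training": 891.90,
    "ongoing_training": 1278.30,
    "maintenance": 1022.64,
}
annual_recurring = sum(recurring.values())

# Discounted value of the recurring items over a three-year period at 5%.
discounted_recurring = present_value(annual_recurring, rate=0.05, years=3)
```

Because each future year is divided by a growing power of 1.05, the discounted figure is lower than three times the annual amount; one-time costs such as the initial configuration would be added undiscounted under this convention.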
expertise within the organization, and another 30 hours (ITA-16) or $1,278.3CAD for the
Initial configuration.
Customization for business needs, initial resources cost includes the data-gathering
templates, which will have to be done from scratch, or by further study. In either case, 20
The Training, Initial training cost assumes self-training. The staff responsible
will investigate, learn and share knowledge in order to achieve an acceptable level of
Additionally, 30 hours (ITA-10) or $891.9CAD per annum, have been budgeted for
(ITA-16) per annum or $1,022.64CAD. The Nagios TCO has been estimated at
For Zabbix, the Integration, initial Configuration cost consists of two elements:
considering that the available documentation is good enough, and the Initial
Customization for business needs, initial resources cost, like Nagios, includes
the data-gathering templates, which need to be built from scratch unless an investigation
has been done previously. Hence 15 hours (ITA-16) or $639.15CAD have been
estimated. The Processes and Best Practices, Initial Configuration cost includes
quality of the OSS documentation. For Zabbix, the documentation scored well, thus an
(ITA-10) or $891.9CAD per annum have been estimated for Support, Training. Finally,
The calculated Zabbix TCO, including a discount factor of 5% for a three-year period, is
Appendices J, K and L show the TCO breakdown for Cacti, Nagios and Zabbix
respectively. Summarizing, Cacti has the lowest TCO of the three, primarily because of
the staff time investment required for the other two. The reason for this is evident in the
documentation scores. Documentation and resource availability are critical when staff
Therefore, the cost reduction perceived by EIS in using OSS does not bear
scrutiny. Hidden costs exist. For instance, a week of general training on any package
might reduce the learning curve and the time investment for several of the activities that
impact TCO.
Support costs can be all but dismissed when compared with CSS support costs,
which routinely exceed one hundred thousand dollars per year for an institution the size
of UofT. However, OSS paid annual support solutions can decrease the time wasted in
By calculating the TCO for a Free software product we dispel the myth that free
software has no cost. Furthermore, it shows that only the purchase price is reduced with
From this TCO analysis we are now prepared to rank our three OSS candidates
for IT Enterprise Monitoring software: Cacti had the lowest TCO; Zabbix was second
and Nagios was third, but in all fairness, the delta between highest and lowest is less than
3.5% of the lowest TCO or about $2,400CAD. Thus we assigned Cacti a score of three
(3), Zabbix a score of two (2) and Nagios a score of one (1).
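The ranking step can be sketched as follows: the candidate with the lowest TCO receives the highest score, and the relative spread is the gap between highest and lowest TCO divided by the lowest. Only the Cacti TCO ($72,073.92) and the approximate $2,400 delta come from the text. The candidate labels and the middle value in the example are hypothetical placeholders, and the function names are illustrative.

```python
def rank_scores(tco_by_candidate):
    """Give the lowest TCO the highest score (n) and the highest TCO a score of 1.

    Ties are not handled specially; candidates keep their sorted order.
    """
    ordered = sorted(tco_by_candidate, key=tco_by_candidate.get)
    n = len(ordered)
    return {name: n - position for position, name in enumerate(ordered)}


def relative_spread(tco_by_candidate):
    """Gap between highest and lowest TCO as a fraction of the lowest."""
    lowest = min(tco_by_candidate.values())
    highest = max(tco_by_candidate.values())
    return (highest - lowest) / lowest


# Lowest value is the Cacti TCO from the text; "highest" adds the approximate
# $2,400 delta; "middle" is a hypothetical placeholder.
example = {"lowest": 72073.92, "middle": 73000.00, "highest": 72073.92 + 2400.00}
print(rank_scores(example))
print(f"{relative_spread(example):.2%}")  # roughly 3.3%, under the 3.5% bound
```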
6.5 Phase 4 Valuation
In Table 56 all of the criteria scores have been consolidated along with a
weighting or importance assigned to each criterion. By multiplying each individual score by its
weighting factor and then totaling the weighted scores, we obtained a final score for
each candidate. The right-hand column holds the maximum score attainable for each
Using this technique affords us an upper bound against which we can judge the
relative importance of the actual scores of the three candidates. This is made clear by
converting the scores into a percentage. A margin of error in our evaluation process of
more than 1% would, in this case, mean that all three scores were effectively identical.
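The valuation arithmetic described above can be sketched as a weighted sum expressed as a percentage of the maximum attainable score. The criterion names are taken from the radar graph; the weights, the scores and the maximum per-criterion score of 3 are hypothetical values chosen for illustration and are not the entries of the consolidated table.

```python
def weighted_total(scores, weights):
    """Sum of each criterion score multiplied by its weighting factor."""
    return sum(scores[criterion] * weights[criterion] for criterion in weights)


def percent_of_max(scores, weights, max_score):
    """Weighted total as a percentage of the maximum attainable weighted total."""
    attainable = sum(max_score * weight for weight in weights.values())
    return 100.0 * weighted_total(scores, weights) / attainable


# Hypothetical weights and scores, for illustration only.
weights = {"Functionality": 3, "TCO": 2}
scores = {"Functionality": 2, "TCO": 3}

print(weighted_total(scores, weights))               # 12
print(percent_of_max(scores, weights, max_score=3))  # 80.0
```

Expressing each candidate as a percentage of the same upper bound makes small differences between candidates easy to compare against the margin of error.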
[Radar graph, radial scale 0% to 100%. Axes: Functionality, License, Community, Seniority, Support, Interoperability, Security, Roadmap, Performance, Scalability, Documentation, TCO.]
Figure 29 Radar graph comparing criteria scores for Cacti, Nagios and Zabbix.
6.6 Phase 5 Selection
Consider Table 54. The scores tell us that we have a statistical dead heat; that is, there
is no obvious best choice based on the final scores alone. However, Table 54 gives us
insight into the deficiencies of the existing OSS package (Cacti) and, since Zabbix's
scalability and core features fulfill most of UofT's instrumentation needs, suggests that a
The results do not suggest that a change from Cacti to Zabbix would substantially
change the cost or efficacy of the Enterprise Monitoring Solution. Reviewing these
results with the customer it was concluded that operating both could offer significant
with the current Cacti instance, in order to work as a distributable monitoring system, and
Since EIS and HIG are responsible for the installation, operation and maintenance of
the Data Centre and the entire Virtualization Infrastructure, and considering the
remarkable expertise of the staff, the TCO of a second package is dwarfed by the other
The current EIS Cacti installation login screen is shown in Figure 31, and further
parameters are managed. Considering that Cacti is already implemented and monitoring
the electrical, environmental and operational indicators, and considering that it is already
in use to monitor the Learning Management System, it makes little sense to retire it.
implement Spine, the fast C-compiled replacement for the PHP poller. Another strategy
to mitigate this issue is to create isolated Cacti instances and create a development
position in order to customize or add the features that it lacks. Ironically, even student