
E-Guide

Power and Cooling best practices for large data centers
Data center infrastructure is constantly evolving. Increases in
computer room and data center density and diversity are driving
change in the power and cooling systems that business critical servers
and communications devices depend on for their performance and
reliability. In this expert e-guide from SearchDataCenter.com, discover
power and cooling tips in order to optimize energy efficiency in your
data center. Also, learn best practices for backup power maintenance
to ensure uninterrupted power.

Sponsored By:

SearchDataCenter.com E-Guide
Power and Cooling best practices for large data centers

Table of Contents
Data center infrastructure management: Power and cooling tips
Best practices for backup power maintenance
Resources from Schneider Electric


Data center infrastructure management: Power and cooling tips
By Omar McKee, Contributor
Data center infrastructure is constantly evolving. Increases in computer room and data
center density and diversity are driving change in the power and cooling systems that
business critical servers and communications devices depend on for their performance and
reliability.
As equipment density rises, hardware becomes mission critical because each application
deployed increases business dependence on data center IT systems. At the same time,
entire facilities, as well as individual racks, are supporting an escalating number of devices
as server form factors continue to shrink.
Density is an issue felt across all businesses, according to a 2006 Data Center Users Group
study released in October. Heat density and power density represented two of the top three
issues driving change in the data center, as more than 40% of the respondents noted these
as top trends related to infrastructure.
For many organizations, the IT infrastructure has evolved into an interdependent business
critical network with the data center as the hub. A power failure at any point along the
network can impact the entire operation -- and have serious consequences for the business.
As a result, there exists a valuable opportunity for resellers to work closely with customers
to proactively identify problems within their power systems that could adversely affect
availability of their critical systems and operational performance of their facility.
Preventive maintenance usually requires a shut-down to ensure electrical connection
integrity. Most preventive maintenance measures should only be attempted by qualified
personnel.
The following are preventive maintenance tips for resellers to use when reviewing a
customer's power systems:


Small UPS devices should be inspected annually.

Medium and large UPS systems should be inspected twice a year to ensure proper
operation and confirm that the unit is operating within the manufacturer's
specifications.

Data center semi-annual service

Perform temperature checks on all breakers, connections and associated controls.


Repair and/or report all high temperature areas.

Perform complete visual inspection of equipment including subassemblies, wiring
harnesses, contacts, cables and major components. Check air filters for cleanliness.

Check modules for the following:

  o Rectifier and inverter snubber boards for discoloration.

  o Power capacitors for swelling or leaking oil.

  o DC capacitor vent caps that have extruded more than 1/8".

Record all voltage and current meter readings on module control cabinet or system
control cabinet.

Measure and record harmonic trap filter currents.

Data center annual services

Check inverter and rectifier snubbers for burned or broken wires.

Check all nuts, bolts, screws and connectors for tightness and heat discoloration.

Verify fuses on the DC capacitor deck for continuity (if applicable).

With customer approval, perform operational test of the system, including unit
transfer and battery discharge.

Calibrate and record all electronics to system specifications.

Install or perform Engineering Field Change Notices (FCN), as necessary.

Measure and record all low-voltage power supply levels.

Measure and record phase to phase input voltage and currents.

Review system performance with customer to address any questions and to schedule
any repairs.


Data center battery inspection service


This visual inspection should be performed during the UPS semi-annual and annual
preventive maintenance services.

Check integrity of battery cabinet (if applicable).

Visual inspection of the battery cabinet or room to include:

  o Check for NO-OX grease or oil on all connections.

  o Check battery jars for proper liquid level (if flooded cells).

  o Check for corrosion on all terminals and cables.

  o Examine physical cleanliness of battery room and jars.

Measure and record DC bus ripple voltage.

Measure and record total battery float voltage.

Data center preventive maintenance service on power management systems

Perform complete visual inspection of internal sub-assemblies, wiring harnesses,
contactors, cables, major components, and check for proper clearance around the
unit.

Examine all transformer, terminal block and ground/neutral bus bar connections, as
well as input and output breakers for tightness.

Inspect high and low voltage junction box terminals for tightness.

Inspect all option wiring for tightness. (Spike suppressor, ground fault, phase
rotation/loss).

Inspect all capacitor bank connections for a solid fit.

Verify that all cooling fans are functional and air ducts are open.

Confirm continuity of all fuses and that they are correctly rated.

Measure input and output phase to phase voltage.

Determine the output, neutral and ground current.

Verify kVA load and capacity per phase.

Validate grounding electrode conductor and any isolated grounds.

Measure all filter capacitor currents at no load for all three phases.

Measure primary, secondary, second harmonic and third harmonic (if applicable). All
should be balanced within 2.5% deviation.


Verify EPO lamps are illuminated.

Check that local and remote EPOs are functioning properly if permitted.

Confirm that monitor is recording within +/- 2% of those values measured.

Activate transformer over-temp alarm and shutdown circuits to confirm proper
operation if permitted.

Verify operation of any option for alarm or shutdown sequence if permitted and of
any customer alarm circuits and specified messages.

Confirm the specified restart capability, either manual or auto-restart.

Verify operation of the bypass switch and the bypass transformer over temp alarm.
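
The two numeric tolerances in the checklist above (readings balanced within 2.5%, and the monitor within +/- 2% of measured values) can be checked with a few lines of Python. This is only a sketch: the checklist does not say how "deviation" is computed, so the code assumes it means the largest deviation of any phase from the three-phase average, and the function names and sample voltages are hypothetical.

```python
from statistics import mean


def max_deviation_pct(readings):
    """Largest deviation of any reading from the mean, as a percent of the mean."""
    avg = mean(readings)
    return max(abs(r - avg) for r in readings) / avg * 100.0


def is_balanced(readings, limit_pct=2.5):
    """True when every reading is within limit_pct of the average (assumed definition)."""
    return max_deviation_pct(readings) <= limit_pct


def monitor_agrees(monitor_value, measured_value, tolerance_pct=2.0):
    """True when the monitor reading is within +/- tolerance_pct of the measured value."""
    return abs(monitor_value - measured_value) <= abs(measured_value) * tolerance_pct / 100.0


# Hypothetical phase-to-phase voltages (A-B, B-C, C-A), in volts.
phase_voltages = [480.0, 478.0, 483.0]
print(is_balanced(phase_voltages))
print(monitor_agrees(monitor_value=476.0, measured_value=480.0))
```

If the equipment manufacturer defines imbalance differently, only `max_deviation_pct` would need to change.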


Best practices for backup power maintenance


By Julius Neudorfer, Contributor
Having uninterrupted clean power for the critical load is the objective of every data center.
To achieve this goal, the systems in the power path must be properly maintained and
tested. Ideally, this would be done without interrupting or potentially exposing the critical
load to power loss.
However, maintenance is sometimes seen as a disruptive, (un)necessary evil and expense
by some senior managers. This is especially true in today's economic climate, where every
expense is examined to see if it can be reduced or eliminated. Nonetheless, periodic
maintenance is required to achieve the projected level of equipment reliability and critical
load uptime. Of course, this requires that some level of redundancy be built into the power
chain to allow for concurrent operation during maintenance (i.e., tier 2-4).
The higher the level of power system redundancy (N+1, 2N or S+S, corresponding to tier
levels 2-4), the lower the probability that power to the critical load will need to be
interrupted during scheduled maintenance procedures. However, redundant equipment is
meaningless unless it is properly maintained and tested. Improper procedures and human
error have caused outages, even in tier 3- and 4-level systems.
Assuming there is redundancy available to allow for maintenance, let's examine these key
components and best practices for backup power maintenance.
Main utility power panel
The main utility power panel is the first panel in the data center power path. At the utility
service entrance, the utility hands off the power to the entire facility. Although this panel is
normally untouched during normal operation, it is recommended that it be visually and
thermally inspected on a quarterly or semiannual basis, and no less than annually.


Generators
The need to constantly test and maintain backup generators is well recognized by data
center facilities managers. In many cases, there is an automated weekly generator exercise
routine initiated by the ATS. It is also imperative that all staff be apprised and immediately
available during any scheduled maintenance or test. Virtually any type of testing requires
constant supervision. For example, starting a generator and ATS load transfer test to the
generator, and then moving on to other tasks (or going out to lunch) once the initial load
transfer is successful, is poor practice and invites exposure to failure.
While it may be boring to stand around for 30 to 60 minutes just looking at a running
generator, it is a good time to listen for unusual sounds and inspect the generator for fluid
leaks. It is also good practice to take some voltage and current measurements, as well as
rpm and frequency readings. Observe and record oil pressure and temperature gauges and
also scan specific areas of the motor-generator with a hand-held IR thermometer or thermal
scanner. By recording these readings, you will have a baseline and running record for
reference that can be analyzed. You can also use the readings to help monitor for any
problems and facilitate preventive service on the suspect areas. Maintenance schedules,
such as oil and filter changes, are based on run-time hours, as well as periodic intervals,
and are usually prescribed by the engine manufacturer. In addition, diesel fuel should be
checked for quality semi-annually, or even more frequently, when warranted.
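
As a rough illustration of the baseline and running record described above, the following sketch averages past readings and flags any new reading that drifts from that average. The field names, sample values and the single 10% limit are hypothetical; real limits would come from the engine manufacturer and would normally differ per measurement.

```python
from statistics import mean

# Hypothetical log of readings taken during past generator test runs.
history = [
    {"oil_pressure_psi": 58.0, "engine_temp_f": 182.0, "frequency_hz": 60.0},
    {"oil_pressure_psi": 57.0, "engine_temp_f": 184.0, "frequency_hz": 60.1},
    {"oil_pressure_psi": 59.0, "engine_temp_f": 183.0, "frequency_hz": 59.9},
]


def baseline(records):
    """Average each field across past runs to form a baseline."""
    return {key: mean(r[key] for r in records) for key in records[0]}


def drifted(reading, base, limit_pct=10.0):
    """Return the fields whose value differs from baseline by more than limit_pct."""
    return sorted(
        key for key, value in reading.items()
        if abs(value - base[key]) > abs(base[key]) * limit_pct / 100.0
    )


# Hypothetical readings from today's run; the oil pressure is well below baseline.
today = {"oil_pressure_psi": 49.0, "engine_temp_f": 186.0, "frequency_hz": 60.0}
print(drifted(today, baseline(history)))
```

Appending each run's readings to the history keeps the record current, so a slow trend shows up before it becomes a failure.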
Generator paralleling switchgear
In larger sites with multiple generators, paralleling switchgear is required. This extra
equipment increases the complexity of the data center's backup power system, as the
generator synchronization controls and paralleling switchgear require special attention.
Ensuring that the sync controls are working correctly is critical, and regular testing and
inspections should coincide with the generators' physical maintenance. If all the generators
aren't synchronized -- rotating at the exact same rpm and in phase with each other -- the
load won't be able to be transferred to the generator array. The data center may go down
even if some (or even all) generators are running, if they are not in sync.
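
The synchronization condition described above (matching speed and phase) can be expressed as a simple comparison. The sketch below is illustrative only: the tolerance values and the dictionary field names are hypothetical, and the actual limits are set by the paralleling switchgear and its manufacturer.

```python
def phase_difference_deg(angle_a, angle_b):
    """Smallest absolute angular difference between two phase angles, in degrees."""
    diff = abs(angle_a - angle_b) % 360.0
    return min(diff, 360.0 - diff)


def in_sync(gen_a, gen_b, max_freq_diff_hz=0.1, max_phase_diff_deg=10.0):
    """True when two generators match in frequency and phase within the given limits."""
    freq_ok = abs(gen_a["frequency_hz"] - gen_b["frequency_hz"]) <= max_freq_diff_hz
    phase_ok = (
        phase_difference_deg(gen_a["phase_deg"], gen_b["phase_deg"]) <= max_phase_diff_deg
    )
    return freq_ok and phase_ok


def array_in_sync(generators):
    """True when every generator is in sync with the first one in the list."""
    reference = generators[0]
    return all(in_sync(reference, g) for g in generators[1:])


# Hypothetical readings: the second generator lags the first by 7 degrees.
generators = [
    {"frequency_hz": 60.0, "phase_deg": 5.0},
    {"frequency_hz": 60.02, "phase_deg": 358.0},
]
print(array_in_sync(generators))
```

The phase comparison wraps around 360 degrees, so 358 and 5 degrees are treated as 7 degrees apart rather than 353.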


Of course, some components of the sync controls are also part of the systems mounted on
the generators, and as such must be coordinated with the generator maintenance program.
It is very common for the generator, ATS and paralleling gear to be maintained by the same
vendor. Focusing on the specialized requirements of the sync controls is recommended, as
are regular visual and thermal inspections like those performed on other switchgear.
Automatic transfer switch (ATS)
Note that unlike most types of switchgear that typically remain in static positions and
untouched during their service life, ATS equipment is far more frequently used to make,
break and transfer power under load. Therefore, it must be closely watched to see if the
contacts need to be serviced or replaced. Every time an ATS effects a power transfer, it
essentially uses up the contacts through the arcing caused by the making and breaking of
high-energy circuits. In most cases, the ATS gear must be disassembled to examine or
replace the contacts. The electromechanical transfer mechanism must also be serviced to
make sure it can move freely and is free from contaminants.
For complete maintenance, the ATS needs to be de-energized. The ATS also needs to have
a functional isolation (bypass) path, either internally or externally, to allow for uninterrupted
power to the load during maintenance. Not all ATS installations have this feature; those that
don't have it require that power be interrupted for ATS servicing. The ATS bypass must be
part of the original design requirements to ensure that it can be serviced without
interrupting power to the load. The ATS should be inspected quarterly or semiannually, and
maintained annually.
Note that some data centers will operate on generator during UPS or battery bypass
operations to avoid possible exposure to a utility outage during maintenance, as there
would not be UPS power available to provide ride-through while the generator starts and is
ready to accept the load.
In addition to the major equipment categories listed above, larger sites with A-B power
systems (2N or S+S) may also have one or more tie circuit breakers. The circuit breakers
allow power sources to transfer to the alternate A-B side and permit concurrent operation
during maintenance. This is normally done hot-to-hot (both sides are energized and must
be in phase) to keep the critical load energized during the power source transfer. There can
be multiple ties located at different points in the electrical system, such as both before and
after the ATS and even downstream of the UPS, depending on the level and type of design
redundancy. This allows for different sections of the power path to be separately bypassed
or shut down, while still permitting delivery of both sides of the A-B power to the racks.
However, to prevent a system outage, it is extremely important to ensure that these tie
circuit breakers are only operated in the proper sequence by authorized and fully trained
personnel. Normally, tie breaker handles are kept locked to prevent unauthorized or
out-of-sequence operation.
Main power distribution panel
After power has passed through the ATS, it goes into the main power distribution panel.
Typically, this panel feeds the UPS and cooling equipment, as well as lighting and other data
center systems. Like the main utility panel, it is normally untouched during typical
operation, and it should be visually and thermally inspected annually (at a minimum).
Maintenance bypass panel (MBP) for the UPS
Power into and out of the UPS passes through the MBP and out to the critical load, so it is
extremely important that it is visually and thermally inspected. Sometimes, in smaller data
center sites, external MBPs are not installed, either to lower initial UPS purchase and
installation costs or because someone assumed that since the UPS already had an internal
bypass, they would not need to also purchase an external bypass panel.
Unfortunately, this assumption is a fairly common occurrence for smaller sites, and it has
major consequences if the UPS needs to be de-energized or replaced. These same smaller
sites also usually only have a single UPS, so they are forced to cut power to the critical load
in the event the UPS needs to be de-energized.
In many cases, the MBP is matched to the UPS and manufactured and installed by the UPS
vendor. These matching MBPs can also be equipped with Kirk Key Interlocks and can
interactively communicate with the UPS controls to prevent mis-operation. They are usually
also covered and maintained under the same UPS service contract. A written procedure and
clear understanding of how to operate the MBP should be imparted to key site personnel to
help avoid a problem should the need arise to safely bypass the UPS.
Uninterruptible power supply (UPS)
Internal systems are electrically checked and visually and thermally inspected.
Factory-trained service technicians may also run diagnostics. In some cases, the UPS can be
placed in internal bypass; other tests or maintenance procedures require that the UPS be
de-energized and externally bypassed via the MBP. In either case, the critical load is then
exposed to utility failure unless there is a redundant UPS. As noted above, some data centers
will operate on a generator during UPS or battery bypass operations to avoid the possibility
of a utility outage. Physical maintenance, such as cleaning the UPS fans and changing or
cleaning the air filters, is also performed. This is typically done semi-annually, but should be
done annually at a minimum.
Battery plant or other energy storage for the UPS
For the UPS to support the critical load from when the utility failure occurs until backup
power returns from the generator, stored energy must always be instantly available. Energy
is most commonly provided from one or more strings of batteries.
Battery banks require regular maintenance and inspection for signs of corrosion, leakage
and temperature variations from cell to cell. Each battery is connected to the next in
series via a jumper cable, and each cable must be checked to ensure it is tightly connected
and free of corrosion. In a typical 480V battery cabinet, there are forty 12-volt batteries
and therefore 80 terminals that need to be inspected. This is in addition to the electrical
voltage and internal impedance testing, as well as periodic load testing.
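
The arithmetic behind the 480V cabinet example can be written out as a short sketch. The function name is hypothetical, and it assumes two terminals per battery, which is what the figure of 80 terminals for 40 batteries implies.

```python
def string_summary(battery_count, nominal_volts_each, terminals_per_battery=2):
    """Nominal string voltage plus the number of terminals and series jumpers."""
    return {
        "string_voltage": battery_count * nominal_volts_each,
        "terminals": battery_count * terminals_per_battery,
        # A series string of n batteries has n - 1 battery-to-battery jumpers.
        "jumpers": battery_count - 1,
    }


# Forty 12-volt batteries in series: 480 V nominal, 80 terminals to inspect.
print(string_summary(40, 12))
```

The same function gives the inspection workload for other string sizes, which is useful when estimating how long a battery inspection visit will take.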
Note that some data centers will operate on a generator during UPS, battery bypass or load
testing operations. Using a generator is necessary to avoid a utility outage while there is no
UPS power available.


Many larger sites have dedicated battery-monitoring systems that can monitor each battery
individually, not just the entire string. This is useful for detecting early signs that an
individual battery is deteriorating and endangering the integrity of the entire string.
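
One way such per-battery monitoring data might be screened is to compare each battery against the rest of its string. The sketch below flags batteries whose internal impedance is well above the string median. The 25% limit and the sample readings are hypothetical, and this is not a description of how any particular monitoring product works.

```python
from statistics import median


def weak_batteries(impedances_mohm, limit_pct=25.0):
    """Return positions of batteries whose impedance exceeds the string median by more than limit_pct."""
    typical = median(impedances_mohm)
    threshold = typical * (1.0 + limit_pct / 100.0)
    return [i for i, z in enumerate(impedances_mohm) if z > threshold]


# Hypothetical internal impedance readings in milliohms, one per battery in the string.
readings = [3.1, 3.0, 3.2, 4.6, 3.1, 3.0]
print(weak_batteries(readings))
```

Using the median rather than the mean keeps one badly degraded battery from shifting the reference point it is compared against.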
While other forms of short-term energy storage are also used in the data center, such as
the flywheel or the so-called rotary UPS, their maintenance is primarily mechanical in
nature and varies according to each manufacturer's recommendations.
Batteries need maintenance, testing and replacement more than any other power-related
component. Depending on the type of battery -- VRLA, wet cell or NiCad -- testing should
be done quarterly or semi-annually, but annually at a very minimum. Unless there is an
allotted budget for the procedure, it is often deferred or ignored. It is worth noting that,
statistically speaking, battery failure is the most common cause of downtime, other than
human error.
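
Because deferred maintenance is the recurring risk, the minimum inspection intervals stated in the sections above can be tracked mechanically. The sketch below covers only the components for which the article gives an explicit annual minimum; the function name, dates and the 365-day approximation of "annually" are assumptions made for illustration.

```python
from datetime import date, timedelta

# Minimum intervals in days, from the article's "annually at a minimum" guidance.
MIN_INTERVAL_DAYS = {
    "main utility power panel": 365,
    "main power distribution panel": 365,
    "UPS": 365,
    "battery plant": 365,
}


def overdue(last_inspected, today):
    """Return the components whose last inspection is older than the minimum interval."""
    return sorted(
        name for name, last in last_inspected.items()
        if today - last > timedelta(days=MIN_INTERVAL_DAYS[name])
    )


# Hypothetical inspection log.
log = {"UPS": date(2023, 1, 1), "battery plant": date(2024, 3, 1)}
print(overdue(log, today=date(2024, 6, 1)))
```

The recommended quarterly or semi-annual intervals could be tracked the same way with a second table of shorter limits.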
Load testing
Load testing is usually performed at the initial commissioning of the data center. Typically, it
covers all the critical areas in the power path described above. However, once a site is
operational, it is difficult to perform load testing without interrupting power, unless it is a
tier 3- or 4-level facility. Opinions on the necessity for continued load testing are mixed.
Purists will insist that it should be performed regularly. Some larger sites even have load
banks onsite and they may be pre-wired into key points in the electrical system.
Other data center operators see load testing as unnecessary and, under normal
conditions, as an additional exposure to failure that is done only if a piece of equipment is
suspect or has been replaced. This is especially true for smaller tier 1- and 2-type sites,
where the load banks need to be rented and temporarily wired into panels. Of course, in
those cases, the critical load must have another source of power, and the switchgear must
be already in place to bridge the power without dropping the load, or it must be shut down
during the load test.
One of the more debated issues is runtime testing the battery banks, either directly or while
powering the load bank from the UPS, because each full runtime discharge diminishes the
working life and capacity of the cells. Even after a successful load test, a single cell can fail
the next day, and if utility power is lost, the critical load will be dropped. The only way to
mitigate this potential exposure is by having multiple battery strings.
Planning, documentation, training and supervision
Needless to say, this article provides only a top-level view of data center backup power
maintenance issues. Actual maintenance procedures vary according to each manufacturer's
service recommendations and requirements and should only be performed by properly
trained service personnel. Moreover, key data center staff, such as shift supervisors, should
also observe normal maintenance that is performed by outside vendors and in-house
technical resources to ensure procedures are followed. Staff should be familiar with, and
even able to perform, some basic and emergency procedures, such as manual operation of
equipment, starting the generator, ATS power transfers and operation of the UPS bypass
gear.
These procedures should be well documented, reviewed and updated as needed. Equipment
vendors or service personnel should conduct training, as well as semi-annual or annual
refresher courses. In fact, the ability of in-house staff to properly manually operate critical
bypass gear may help avert a downtime incident.
Moreover, properly documented, detailed procedures and supervision by on-site personnel
may avert a total data center shutdown. This circumstance can arise if it becomes necessary
to stop improper maintenance by new service personnel who are not fully
familiar with the site's equipment and systems. Emergency procedure documents should be
readily available and accessible to key personnel. Documents should contain clearly labeled
photos of the equipment's controls, along with instructions on the exact sequence
of operation and emergency use. Also consider having one- to two-page emergency
procedure cards that can be posted at or near the UPS-MBP and that also include
information for manually operating the ATS.
The quality and frequency of maintenance is sometimes based on the size of the data center
and facilities department. Facilities staff are often far more sophisticated if the organization
is running a dedicated data center. Alternately, a facilities department supporting a 2,000
square-foot data center in a large mixed-use building may not be as sensitive to some of
these specialized data center requirements, because the emphasis and expectations are
more often based on the building's systems. The overall culture and training level of the
facilities staff makes a huge difference. Also, because many maintenance procedures are
contracted out to either the equipment manufacturers or one or more service or
subcontractors, it is imperative that someone from the organization's own management team
be aware of the scheduling, what work is being performed by whom, as well as who is
supervising it.
Each data center site may differ in the types of equipment and maintenance requirements,
yet all sites need to have preventive services that don't affect the operation of the IT
equipment. Some managers try to avoid full failover testing and major maintenance of
critical systems, as it could potentially go wrong. This simply moves the limited (and
presumably avoidable) known risk on the planned maintenance day to the unknown
exposure on the other 364 days of the year.
By avoiding or deferring maintenance, IT personnel could be exposing the data center to
downtime from a variety of undetected malfunctions that went unnoticed while on normal
power, but failed during a utility outage. Proper training, planning, supervision and
documentation of maintenance procedures, as well as upper management support, are crucial
to ensuring that a normal scheduled event doesn't turn into a downtime debacle.


Resources from Schneider Electric

Power Monitoring for Modern Data Centers


Switchgear Design Impacts the Reliability of Backup Power Systems
Low Voltage Circuit Breaker Guidelines for Data Centers

About Schneider Electric


Schneider Electric delivers engineered solutions designed to increase safety, lower life cycle
cost and maximize power system reliability. Whether you require a new data center
installation, refurbishment, replacement, or recommendations for optimizing existing
equipment, our nationwide network of qualified experts provides the expertise and
accessibility necessary to deliver a complete solution specific to your needs.
