This whitepaper is a supplement to the Cisco Press publication The ISP Essentials by Barry Raveendran Greene and Philip Smith.
Materials can be used with the permission of the authors and Cisco Press. Public
copies are available at www.cisco.com/public/cons/isp/essentials/ or www.ispbook.com.
Never attempt an upgrade without understanding the potential side effects of
problems that may arise during the upgrade.
Never mix a hardware and software upgrade. Do the software upgrade first, gain
confidence in the image, then do the hardware upgrade. This minimizes confusion when
you need to troubleshoot.
Never attempt an upgrade without having read the release notes that come with the
software release. It also helps to read the release notes for all intermediate releases
because that will give the engineer good information about what has changed in the
software over the release cycle.
Key Guidelines:
Stability is the Objective
Know the Potential Side Effects during an Upgrade
Always have a Backout Plan
Do not mix Software and Hardware Upgrades
Read the Release Notes
Dial access layer: A common software release is run on all access platforms. As with
the previous example, a more frequent cycle might be necessary. Some ISPs build new
infrastructure for new services, so when infrastructure is unchanging, it makes little sense
to upgrade software. Some dialup networks that we have had experience with have
hardware running the same software image for several years.
VPN access layer: A common software release is run on all platforms. This example is
included because it is the current fashion in the industry. Often ISPs use bleeding-edge
software and hardware to deliver VPN services, and frequent upgrades for new features
can be necessary from time to time. Again, the usual rule applies: Don't change it unless
new features are necessary; it saves the customers from going through pain.
a network problem. Having strong control over software versions will mean that diagnosing
network problems can be achieved more easily.
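As a rough illustration of what control over software versions can look like in practice, the sketch below groups routers by network layer and reports any layer running more than one release. The inventory format, hostnames, layer names, and release labels are all hypothetical; a real audit would pull this data from whatever inventory system the ISP already uses.

```python
from collections import defaultdict


def audit_versions(inventory):
    """Return the layers that run more than one software release.

    `inventory` is a list of (hostname, layer, version) tuples.
    This shape is an assumption made for the example.
    """
    versions_by_layer = defaultdict(lambda: defaultdict(list))
    for hostname, layer, version in inventory:
        versions_by_layer[layer][version].append(hostname)

    # Keep only the layers where the release is not uniform.
    return {
        layer: {version: sorted(hosts) for version, hosts in versions.items()}
        for layer, versions in versions_by_layer.items()
        if len(versions) > 1
    }


# Hypothetical inventory: the dial layer is uniform, the VPN layer is not.
inventory = [
    ("dial1", "dial-access", "release-A"),
    ("dial2", "dial-access", "release-A"),
    ("vpn1", "vpn-access", "release-B"),
    ("vpn2", "vpn-access", "release-C"),
]

mixed = audit_versions(inventory)
for layer, versions in mixed.items():
    print(layer, versions)
```

A layer that appears in the result is one where troubleshooting has to account for version differences as well as the fault itself.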
Ask your Cisco Systems Engineer for the Network Verification Tool (NVS). This is the public name for the Pagent
suite of testing software designed to run on Cisco's routers.
Cisco Systems, Inc.
170 West Tasman Drive.
San Jose, CA 95134-1706
Phone: +1 408 526-4000
Fax: +1 408 536-4100
Minimize Risk. Selecting specific routers and sections of the network that have a low
impact on the overall operations of the network minimizes risk. For example, some ISPs
start with routers in a part of the network that has a lot of redundancy. If the router used
for the phased deployment has an unexpected failure, service impact is minimized
through the redundancy.
Software Validation. Validating the features, functionality, and stability of the software
is paramount. The ultimate validation is running the software live in the network. Most
ISPs will do this initial step on a router with maximum redundancy (minimizing risk).
They will leave the software on for a given period of time, watching a variety of metrics,
seeking to determine if something was missed in their lab testing. The next phase of the
software rollout will be triggered once a given time has passed on this software validation
router.
Incremental Backouts. Backout plans are essential. What is even more essential is for
the backout plans to be incremental. One of the most common backout techniques is for
the ISP to roll back every router in the network to the original image, even though
the new software is running fine on other parts of the phased rollout. Often something
happens during a deployment phase that has nothing to do with the software. Other times
it has everything to do with the software. So an ISP that has completed three phases of
a rollout and runs into a problem in phase four should fall back to phase three. At that
point, a detailed analysis of what happened should take place. That analysis will determine
whether the backout should continue or whether something else was happening in the network.
Maximum Confidence. The ultimate goal of any phased rollout is maximum
confidence in the software. This only comes with time live in the network. Many ISPs
have key time periods between each rollout phase, ensuring the team has confidence and
consensus before moving forward to the next phase. When everything works, the
operations team can assure itself that everything that could be done has been done.
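The incremental backout idea can be sketched in a few lines. The example below is only an illustration under stated assumptions: the phase layout, the router names, and the `upgrade`, `rollback`, and `is_healthy` callbacks are hypothetical stand-ins for whatever tooling and metrics an ISP actually uses, and the observation period between phases is assumed to be folded into the health check.

```python
def run_phased_rollout(phases, upgrade, is_healthy, rollback):
    """Upgrade one phase at a time; on a problem, back out only that phase.

    Returns (names_of_completed_phases, name_of_failed_phase_or_None).
    Earlier phases keep the new image so the problem can be analysed
    before deciding whether the backout should continue.
    """
    completed = []
    for phase in phases:
        for router in phase["routers"]:
            upgrade(router)

        if all(is_healthy(router) for router in phase["routers"]):
            completed.append(phase["name"])
            continue

        # Incremental backout: only the failing phase returns to the old image.
        for router in phase["routers"]:
            rollback(router)
        return completed, phase["name"]

    return completed, None


# Hypothetical callbacks that record which image each router is running.
running = {}

def upgrade(router):
    running[router] = "new"

def rollback(router):
    running[router] = "old"

def is_healthy(router):
    return router != "core2"  # pretend core2 misbehaves after its upgrade

phases = [
    {"name": "phase1", "routers": ["edge1"]},
    {"name": "phase2", "routers": ["edge2", "edge3"]},
    {"name": "phase3", "routers": ["core1", "core2"]},
    {"name": "phase4", "routers": ["core3"]},
]

completed, failed = run_phased_rollout(phases, upgrade, is_healthy, rollback)
print(completed, failed)
```

The function stops at the failing phase on purpose. Whether to continue backing out through earlier phases is a decision for the analysis that follows, not something the sketch tries to automate.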
Lessons Learned
Learning from past experience and the experience of your peers will minimize risk during an
upgrade cycle. After each successful software upgrade on a network, review what worked, what
did not work, and where improvements can be made. Update and document this experience so
that the next upgrade will build on the past experience. Besides what has been highlighted
earlier, some of the lessons we've learned include:
Check Flash on all the Devices. Many times ISPs get into the middle of an
upgrade cycle and find that a device's flash is not large enough to hold the
image load (normally the old image and the new image).
Check the Route Processor Memory. As a rule of thumb, always assume a new
IOS image will consume more memory than previous versions. New features and
functions will add to this memory consumption. Check all the Route Processor
memory (including secondary route processors) to ensure there is enough
memory.
Check Line Card/VIP Memory. Distributed architectures have line cards with
their own memory and processor requirements. A general rule of thumb is that
the line card's memory should be half of the required memory on the Route
Processor. 3
The Product Security Incident Response Team (PSIRT) handles all Cisco Security Vulnerabilities. Information can
be found at http://www.cisco.com/go/psirt/
Check the CPU Load of all Route Processors and Line Cards. Know your
network's condition before the upgrade starts. Many times an engineer will
upgrade a router, get it back into production, then notice the CPU level is at
99%/99%. The immediate assumption is that this CPU spike is caused by the
upgrade. That assumption can be false. Without a data point from before the upgrade,
the engineer will not know if the CPU spike is caused by something in the
software or if there was a pre-existing problem on the device.
Check the Logs. What was happening in the day before the upgrade?
Examination of the logs will give the ISP engineer insight into potential pre-existing problems.
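The capacity checks above lend themselves to a simple pre-upgrade script. The following is a minimal sketch, assuming the device figures have already been collected by some other means; the `Device` fields, the sample sizes, and the 10-point CPU tolerance are hypothetical choices made for the example, not values from any Cisco tool.

```python
from dataclasses import dataclass, field


@dataclass
class Device:
    name: str
    flash_mb: int
    old_image_mb: int
    rp_memory_mb: list  # one entry per route processor, including secondaries
    line_card_memory_mb: list = field(default_factory=list)


def pre_upgrade_problems(device, new_image_mb, required_rp_mb):
    """Apply the flash and memory rules of thumb and list anything that fails."""
    problems = []
    # Flash normally has to hold the old image and the new image together.
    if device.flash_mb < device.old_image_mb + new_image_mb:
        problems.append("flash cannot hold old and new image")
    if any(mem < required_rp_mb for mem in device.rp_memory_mb):
        problems.append("route processor memory below requirement")
    # Rule of thumb: line card memory at least half of the RP requirement.
    if any(mem < required_rp_mb / 2 for mem in device.line_card_memory_mb):
        problems.append("line card memory below half of RP requirement")
    return problems


def cpu_regressed(baseline_pct, after_pct, tolerance_pct=10):
    """Compare post-upgrade CPU with the baseline recorded before the upgrade."""
    return after_pct > baseline_pct + tolerance_pct


# Hypothetical devices and image requirements.
small = Device("gw1", flash_mb=32, old_image_mb=14,
               rp_memory_mb=[256, 128], line_card_memory_mb=[128, 64])
large = Device("gw2", flash_mb=64, old_image_mb=14,
               rp_memory_mb=[256, 256], line_card_memory_mb=[128, 128])

for device in (small, large):
    print(device.name, pre_upgrade_problems(device, new_image_mb=20, required_rp_mb=256))

# A router already at 95% before the upgrade is not an upgrade regression at 99%.
print(cpu_regressed(baseline_pct=95, after_pct=99))
print(cpu_regressed(baseline_pct=20, after_pct=99))
```

The CPU comparison only works if the baseline was recorded before the upgrade, which is the point of the CPU item in the list above.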
Of course the authors advise all ISPs to max out all memory. It is more expensive to execute field upgrades of
memory than it is to max out the memory at the time of purchase. Memory at the time of purchase is a depreciated
capital expense. Field upgrades are an operational cost that incurs downtime, field upgrade time, and potential
outages.