
Breaking Murphy's Law: Project Risk Management

By Evin J. Stump, P.E.
Table of Contents
Preface ........................................................................................................................................................................ 3
Chapter 1: Basic Ideas ............................................................................................................................................ 7
1.1 Introduction ..................................................................................................................................................... 7
1.2 Fighting Murphy ............................................................................................................................................. 8
1.3 What is a project?.......................................................................................................................................... 11
1.4 How do risks get into projects? .................................................................................................................... 18
1.4.1 Just plain bad luck.................................................................................................................................. 18
1.4.2 Poorly designed goals. ........................................................................................................................... 18
1.4.3 Brittle vs. robust plans. .......................................................................................................................... 24
1.4.4 Poor estimating and estimate nullification. ........................................................................................... 26
1.4.5 Unintended consequences...................................................................................................................... 35
1.4.6 Lack of communication. ........................................................................................................................ 35
1.4.7 Failure to do due diligence. ................................................................................................................... 37
1.4.8 Micromanagement. ................................................................................................................................ 38
1.4.9 Unethical behavior. ................................................................................................................................ 39
1.4.10 Regulatory problems. ........................................................................................................................... 40
1.4.11 "I forgot." ......................................................................................................................... 41
1.4.12 Advancing the state-of-the-art. ............................................................................................................ 41
1.4.13 New integrations. ................................................................................................................................. 43
1.4.14 Make versus buy. ................................................................................................................................. 43
1.4.15 Items furnished by customers or others............................................................................................... 43
1.4.16 Procrastination ..................................................................................................................................... 45
1.4.17 Resource shortages. ............................................................................................................................. 45
1.4.18 Revenue forecasts. ............................................................................................................................... 46
1.4.19 A system risk checklist. ....................................................................................................................... 47
1.4.20 Poor project execution. ........................................................................................................................ 49
1.4.21 Business continuity risks. .................................................................................................................... 49
1.5 Risk drivers ................................................................................................................................................... 53
1.6 Whose Risk Is It? .......................................................................................................................................... 55
Chapter 1 review questions .............................................................................................................................. 56
Chapter 2: Classic Project Risk Management (CPRM) ...................................................................................... 58
2.1 Introduction ................................................................................................................................................... 58
2.2 The overall process ........................................................................................................................................ 58
2.3 Identification ................................................................................................................................................. 60
2.4 Analysis ......................................................................................................................................................... 66
2.4.1 Task elements affected. ......................................................................................................................... 66
2.4.2 Cost impact. ........................................................................................................................................... 68
2.4.3 Schedule impact. .................................................................................................................................... 70
2.4.4 Standing armies, idle assets, and hammocks. ....................................................................................... 73
2.4.5 Performance impact. .............................................................................................................................. 74
2.5 Planning......................................................................................................................................................... 79
2.6 Tracking ........................................................................................................................................................ 84
2.7 Controlling .................................................................................................................................................... 85
2.8 Communicating ............................................................................................................................................. 85
Chapter 2 review questions .................................................................................................................................. 86
Chapter 3: Advanced Project Risk Management (APRM) ................................................................................. 88
3.1 Introduction ................................................................................................................................................... 88
3.2 Robust team................................................................................................................................................... 90
3.2.1 Competence. ........................................................................................................................................... 90
3.2.2 Dedication. ............................................................................................................................................. 91
3.2.3 Tenacity. ................................................................................................................................................. 92
3.3 Robust plan.................................................................................................................................................... 92
3.3.1 How robust is your plan? ....................................................................................................................... 93
3.3.2 Is your plan robust enough?................................................................................................................... 94
3.3.3 Should you try to make your plan more robust? ......................................................................................
3.4 Rapid re-planning .........................................................................................................................................
3.4.1 Plan awareness. ...................................................................................................................................... 95
3.4.2 Minimal plan complexity..................................................................................................................... 100
3.4.3 Centralized vs. decentralized planning ................................................................................................ 101
3.5 Putting it all together................................................................................................................................... 102
3.5.1 Organizing the team ............................................................................................................................. 102
3.5.2 Scenarios. ............................................................................................................................................. 103
3.5.3 Simulation. ........................................................................................................................................... 105
3.5.4 The paper aircraft game. ...................................................................................................................... 108
Chapter 3 review questions ............................................................................................................................ 116
Chapter 4: Sample Implementations .................................................................................................................. 117
4.1 Introduction ................................................................................................................................................. 117
4.2 Project organization and operations ........................................................................................................... 117
4.2.1 Activity settings ................................................................................................................................... 120
4.2.2 Risk database. ...................................................................................................................................... 121
4.2.3 Risk closure. ......................................................................................................................................... 122
4.2.4 Risk management plan. ........................................................................................................................ 122
4.3 Typical methods and tools .......................................................................................................................... 122
4.3.1 Root cause analysis. ............................................................................................................................. 122
4.3.2 Multi-voting. ........................................................................................................................................ 125
4.3.3 Tri-level attribute evaluation. .............................................................................................................. 126
4.3.4 Brainstorming. ..................................................................................................................................... 128
4.3.5 Systematic risk identification. ............................................................................................................. 130
4.3.6 PC charts. ............................................................................................................................................. 136
4.3.7 Communicating risks to management or to clients. ............................................................................ 137
4.3.8 Communicating baseline change to the project team.......................................................................... 137
Chapter 4 review questions ............................................................................................................................ 140
Chapter 5: Statistical Analysis of Project Risks ................................................................................................ 142
5.1 Introduction ................................................................................................................................................. 142
5.2 Basics of probability ................................................................................................................................... 144
5.3 Expected value ............................................................................................................................................ 148
5.4 Description and aggregation of risk drivers ............................................................................................... 150
5.5 Monte Carlo Simulation.............................................................................................................................. 158
Chapter 5 review questions ............................................................................................................................ 160
Appendix: Gantt Charts and Schedule Networks .............................................................................................. 164
Preface

For now, and possibly forever, project risk management must be classified as an art.
It is too new, and its body of knowledge is too amorphous, for it to be considered a
science. That means that it is subject to individual artistic interpretation. I therefore
do not blush when admitting that this book is a personal interpretation. But I hope,
and believe, that it is an astute and helpful interpretation. I have tried hard to make it
so.

I write from the viewpoint of an engineer who has labored in and helped manage
projects large and small over more than half a century. These projects have all been
in either the aerospace/defense or the construction industries. This may predispose
me to think of project risks in terms of potential engineering problems, but I have
tried to balance the presentation to include helpful information for those whose
concerns are with projects that are not of an engineering nature.

Most of us think of risks as bad things that can happen. There's certainly a lot of
truth in that view, but as applied to projects, we need to be a bit more general, and at
the same time, a bit more specific. We need to keep in mind that the reason people
take risks in projects is the expectation that something good will happen. So in our
accounting for risks, the good things often more than balance out the bad things,
else projects would not come into being as often as they do.

We also need to keep in mind a distinction made by academicians, but often lost on
the rest of us: the distinction between risk and uncertainty. Risk, they say, involves
things that could happen that we are willing to quantify, while uncertainty is a kind of
angst so vague that we are not willing to quantify it. For example, weather forecasters are
willing to quantify the likelihood of rain tomorrow, and perhaps for a few days beyond
tomorrow, but are much less willing to quantify the likelihood of rain on this same
date a year from now. Similarly, we may be willing to quantify the likelihood of
winning a project in a specific bidding situation, but are not willing to quantify the
likelihood of a new competitor appearing from out of nowhere in the next year with a
product that will render ours obsolete. Alertness and market intelligence, not
quantification, are the best responses to that threat.

We are willing to use information on the likelihood of rain tomorrow to make certain
decisions, such as whether to carry an umbrella, or have a picnic, or plant a crop.
But we will also use the very vagueness of information about rain a year from now to
make other decisions, and these decisions may be just as important. For example,
we make decisions based just on the mere existence of the phenomenon of rain.
That's why we have roofs, gutters, and drainage ditches, even though they may
mostly stand idle.

So, you are entitled to ask: in a book about project risk management, will we only be
talking about things we are willing to quantify? No, we will also talk about the
uncertainties we are unwilling to quantify, though in some circumstances we will
assume a willingness to at least rank them.

I suppose this means I should have titled this book something like "Management to
Protect Projects from Failure," or "Management to Help Projects Succeed," avoiding
use of the word "risk," but I didn't. My excuse is that most people don't make the
academic distinction between risk and uncertainty that I described above. So I often
use the word "risk" to mean both risk and uncertainty, just as most people do. It's a
handy convention.

Quantification of risks, as opposed to mere ranking, is a somewhat controversial
subject. Some people feel that certain sophisticated efforts at quantification go too
far. And probably some do, especially as applied to smaller, less consequential
projects. On the other hand, I firmly believe that some level of quantification is not
only appropriate, but even necessary to clearly convey meaning, especially among
team members in large projects. Of all of the symbols we deal with in life, numbers
are the least ambiguous. I have tried in this book to use quantification in ways that
are both reasonable and understandable. You can judge whether I have succeeded.

Semantic difficulties surround the word "risk." To mitigate these, I rely on two
concepts of risk. The first is the notion of structural risk. This is the intuitively
sensed danger that becomes apparent as soon as the project goals are stated,
even before a specific plan comes to mind. It is based on background knowledge or
opinion that suggests the likelihood of project failure based on common sense.
Structural risk is essentially angst. It can be valuable because it tends to warn us
away from the foolhardy. But at the same time, it can inhibit the formation of daring
projects that turn out to be of great benefit. Structural risk, being intuitive and only
partially reasoned, ultimately is not reliable as a guide to the possibility of project
success. But it can be a strong indicator of the need for thorough planning and good
risk management.

The other risk concept is the risk driver. In this book I define this concept carefully,
to distinguish it from the "risk drivers" sometimes mentioned by others, usually with
little or no explanation of their meaning. My risk driver concept embraces both risks
and uncertainties. When I can legitimately quantify, I assign real numbers to risk
drivers. When I cannot, I may assign rank order in a system of values as a proxy for
quantification. If even that seems impractical, I may simply designate a risk driver as
needing "watchful waiting." Either way, I refer to this as valuation of the risk driver.
Valuation aids in deciding what, if anything, to do about a risk driver.
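As an illustration of this three-level valuation scheme (my sketch, not the author's; the driver names and dollar figures are hypothetical), a risk register might record whichever level the available data supports:

```python
from dataclasses import dataclass
from typing import Optional

# Illustrative sketch of the three valuation levels described above:
#   1. real-number quantification (probability and cost impact),
#   2. rank order as a proxy when quantification is not legitimate,
#   3. plain "watchful waiting" when even ranking seems impractical.

@dataclass
class RiskDriver:
    name: str
    probability: Optional[float] = None   # chance the weakness bites, 0..1
    cost_impact: Optional[float] = None   # consequence if it does, in dollars
    rank: Optional[int] = None            # 1 = most worrisome, 2 = next, ...

    def valuation(self) -> str:
        """Report the strongest valuation the available data supports."""
        if self.probability is not None and self.cost_impact is not None:
            # Fully quantified: expected exposure = probability x impact.
            return f"quantified: expected exposure ${self.probability * self.cost_impact:,.0f}"
        if self.rank is not None:
            return f"ranked: #{self.rank}"
        return "watchful waiting"

drivers = [
    RiskDriver("Vendor delivers late", probability=0.3, cost_impact=50_000),
    RiskDriver("Key engineer resigns", rank=2),
    RiskDriver("New competitor appears"),
]
for d in drivers:
    print(f"{d.name}: {d.valuation()}")
```

The point of the sketch is only that the three levels can coexist in one register, so a driver can be promoted from watchful waiting to a rank, and later to real numbers, as knowledge improves.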

A recurring theme throughout the book is that a risk driver cannot be identified until a
project plan emerges. Risk drivers are essentially weaknesses in the plan, places
where it can possibly (but not necessarily) fail. The basic idea behind what I call
classic project risk management is to try to find these weak places and fix them, if
that is possible and economically reasonable. But I also discuss what I call
advanced project risk management, which goes beyond trying to identify and fix risk
drivers. The basic idea in advanced project risk management is that a sufficiently
robust project team with a sufficiently robust project plan may be able to
overcome even unidentified risk drivers. The emphasis is on achieving robustness.

In this book, I make frequent references to the "project team," or simply to the
"team." To whom do I refer? The modern concept in project management is to form
a tightly integrated, multi-discipline team that works closely together to advance the
development and implementation of a product that will satisfy the project goals. In
small projects, the entire team might fit in one room. In mid-size projects, the team
might fit in a single building, or in a complex of buildings, with perhaps a few
members remotely located. In very large projects, the team might be scattered all
over the world.

In large projects, it may be necessary to distinguish between the "core team" and
the "peripheral team." The core team will typically be a group that will fit in a large
conference room, and the peripheral team will be everyone else. Although this may
be a physical necessity, it introduces the risk of poor communications, and either that
risk must be accepted, or means must be provided to mitigate it.[1]

In the opening chapter of the book I introduce some basic ideas to clarify the nature
of projects, as opposed to other human activities, and to provide some insight into
ways that risks can creep into projects. Those of you who work in projects will
readily recognize many of these, but you may beneficially learn of others you havent
yet encountered. A key element of project risk management is to have a project
team that is risk aware, that is, one that has been sensitized to the possibility of dangers
and opportunities. If you know of project risk situations I have not covered, and I
am sure there are many, you should add them to whatever knowledge you gain from
this book.

The second chapter is devoted to what I call classic project risk management. This
is the most commonly practiced form, and is appropriate for many, if not most,
projects.

Chapter 3 is about advanced project risk management, for projects that are of critical
importance, where failure would have serious consequences.

The fourth chapter discusses examples of implementations. These are basically
management techniques that can help in implementing project risk management.
They can be useful in getting you started, and keeping you on track.

The fifth and final chapter is about statistical analysis of risk drivers. When risk
drivers are quantified, the methods usually come at least in part from the area of
mathematics called statistics, and in particular a subset of statistics called
probability. Many readers will be unfamiliar with these subjects, so I have included
this chapter with the hope of imparting at least a conversational familiarity. No
attempt is made to make a statistician out of any reader. That would surely fail, both
because I am merely a dabbler in statistics, and also because few if any readers will
harbor a secret impulse to be expert in this field.

[1] This is not to imply that poor communications cannot be a problem in small projects.

As recently as the 1980s and early 1990s, most project teams knew little or nothing
of project risk management. Today, many teams practice it with fair to good
success. That is commendable, to be sure. Unfortunately, there are still many
teams who do not practice it at all, and others whose practice is marred by ignorance
and perhaps by poor attitudes. My hope is that this book will reach and influence some
project teams that are not enjoying the full benefits of project risk management, and
help propel them to greater success. My further hope is that it will reach and help
some teams to do even better what they are already doing fairly well.

It is customary for authors to acknowledge assistance given by others in creating a
literary work. Assistance has come from many directions over the years, and in
many cases I can't recall names, faces, or sources. However, I would like to
especially acknowledge the many good ideas received from Mr. Barney Roberts of
Futron Corporation, and from Ms. Virginia Cook, formerly of Futron. Also, I would
like to acknowledge the support of Mr. Greg Ward, who encouraged my interest in
project risk management, and helped crystallize my ideas on the subject. Then also,
there was Mr. Michael Gallo of Kelly Space and Technology, who worked closely
and productively with me on one of my early consulting assignments in project risk
management, and Ms. Jamie Wilson, of Phillips Petroleum, who has given me
several training and consulting opportunities all over the world. Finally, I must
certainly mention Mr. Ralph Tourino of Lockheed Martin, who gave me a major
opportunity to mature my thinking in project risk management, and who made it clear
to me that good project risk management provides a distinct advantage when bidding
competitively for project opportunities. I can never adequately repay the help I have
received from these individuals or from the many others who remain unnamed.

I am solely responsible for all errors in this book.


Chapter 1: Basic Ideas

1.1 Introduction

The purpose of this chapter is to introduce some basic ideas about project risk
management (PRM). I will first explain the reference to Murphy's Law in the title of
this book, and the concept of "fighting Murphy" in order to achieve project success.
This may come across as a bit corny, but I took this approach because it introduces
a bit of levity into what might otherwise be a boring or even distressing subject (who
likes to talk about the possibility of failure?).[2]

In this chapter I also make a stab at defining what a project is, so that we are not
talking past each other. Then, we will examine together some of the principal and
common ways risks get into projects. I will not try to list sources of risk exhaustively
because that is clearly impossible. But I will try to introduce many of the common
ways, so you are at least sensitized to those.

I then introduce the notion of a risk driver. In subsequent chapters, risk drivers
provide a convenient and fairly rigorous framework for thinking about and managing
project risks.

I end this chapter with a brief discussion of "whose risk is it?" This is important
because different project stakeholders have different perspectives about project risk.

"You can't expect to hit the jackpot if you don't put a few nickels in the machine."

If you agree in principle that projects can be very important to their stakeholders,
and that some projects can be very risky in some sense of that word, you still may
not find it self-evident that there should be such a thing as PRM. Why should we
have a special kind of project management with the word "risk" tossed in? Indeed,
no need at all, if ordinary project management enjoyed a history of consistent
success. The sad fact is, it does not. History, even recent history, is full of
projects that have failed one way or another to meet
their expectations. The failures include outright inability to complete the mission,
being over budget or behind schedule, and serious unintended consequences. They
represent a loss to stakeholders, and often harm human society collectively.

The rationale for the existence of a special kind of project management called PRM
is that it is a corrective for poor project management. Hopefully, as more project
teams learn to practice effective PRM, it will become a part of project management
culture worldwide. If and when that happens, the advice given in this book will no
longer be needed. But I believe that will not happen for a long time. The logic of
failure is still prevalent everywhere in places high and low.

[2] If you are one of the probably very few people in the industrialized world who have not heard of
Murphy's Law, read on. It will be made clear to you shortly.

Before we begin, a cautionary note is in order. This book is about managing risk in
projects, about making it more likely that they will succeed. I make no claim that by
following the advice in this book you will eliminate all project dangers. You cannot
eliminate all of them unless you decree that it makes no difference what the project
outcome is. Nor do I argue that a risky project should not be attempted. Some of
mankind's greatest gains have come from risky projects that were successful. All I
say is, if a project is attempting something worthwhile, isn't it better for all concerned
that it succeed?

1.2 Fighting Murphy


Think for a moment about the title of this book: Breaking Murphy's Law: Project
Risk Management. Who is this Murphy? What is his law? Murphy is famous for his
dictum that what can go wrong, will go wrong. While we have all observed
counterexamples proving that Murphy's Law is not always true, we have also
observed many examples of things going wrong when it seemed unlikely that they
would. Hence, the law. (See Exhibit 1 for additional information.)

Exhibit 1
About Murphy's Law

The purported origin of Murphy's Law was at Edwards Air Force Base,
California in 1949. It is attributed to Captain Edward A. Murphy, who was
involved in a test program in which rocket sleds were used to determine human
tolerance to acceleration and deceleration. Apparently, the original statement of
Murphy's Law was simply "If anything can go wrong, it will." Creative and
cynically humorous corollaries and commentaries on the "Law" abound.
Several of them are sprinkled throughout this book in scroll boxes.

A medical doctor, Col. John Stapp, famous as leader and self-designated
"guinea pig" for the rocket sled experiments, offered Stapp's Ironical Paradox:
"The universal aptitude for ineptitude makes any human accomplishment an
incredible miracle." But Stapp also credits the flawless safety record of his
project to a determined effort to overcome Murphy's Law. He recognized that in
important affairs, fighting Murphy is worthwhile.

This book is closely aligned with Hill's Commentaries on Murphy's Law:

If we lose much by having things go wrong, take all possible care.
If we have nothing to lose by change, relax.
If we have everything to gain by change, relax.
If it doesn't matter, it does not matter.

The following military versions of Murphy's Law are among my favorites:

Ask the average person what he or she means by "risk" and you will probably get an
answer something like "the possibility of something bad happening." Then he or she
might add, "even if I don't want it to." Ask the average high-stakes entrepreneur the
same question, and you will probably get an answer something like "the possibility of
something good happening." Then he or she might add, "provided I make it happen."
As I will from time to time remind you in this book, risk has a dual nature. It is both
bad things that can happen, which you may be able to prevent, and good things you
might be able to activate. In this book I will spill far more ink talking about how to
prevent the bad things, but the possibility of good things is why we take project risks
in the first place.

Behavioral scientists have noticed that these are among the propensities of human
beings:

We hate to lose, and hate even more admitting to a loss.
We are overoptimistic and overconfident.
We believe randomness is the exception rather than the rule; we think patterns
exist when they don't, especially when the existence of a pattern would be
favorable to our interests.

If these tendencies did not have an overall survival value for the human species, we
probably would not have them. They probably cause us to take risks that work out to
have a greater advantage to us overall than if we had abstained from taking them.
So, far be it from me to discourage risk taking in general. But clearly, there is such a
thing as a carefully calculated risk, and there is such a thing as a foolish risk. I hope
that in this book I can help you learn to recognize and minimize your dangerous
project risks, and to avoid taking foolish ones.

Here are some synonyms of risk, taken from a good dictionary. In Exhibit 2, they are
divided into two groups, labeled (perhaps imprecisely, but indulge me) Bad Risk and
Good Risk.

Exhibit 2: Bad and Good Risk

Bad Risk      Good Risk
Hazard        Chance
Peril         Venture
Danger        Speculation
Jeopardy      Gamble

The cultural bias toward thinking mostly about the bad things that can happen is
reflected in approaches to PRM as it is usually practiced. Indeed, some PRM
consultants and experts proclaim that PRM is only about preventing the bad things.
In this book we are a bit more sanguine. We remain open to the possibility that good
things can happen too, even though they may need a push. In my view, PRM can be
thought of as an activity that combines resisting bad things while pushing to make
good things happen. Even though much of our discussion is about preventing bad
things, please keep in mind that pushing the good things can be even more
rewarding.
In my view there are only three basic strategies (with numerous variations) for
breaking Murphy's Law, as it applies to projects.3
The first is to hope you get lucky. This strategy might also be called muddling
through. It could be an adequate strategy in a project where not much is at stake,
where no one really much cares what the outcome is. It can be a terrible strategy if
anything valuable or important to people is associated with the project or its
outcomes.

The second strategy is to try to think of everything that could go wrong, make and
frequently review a list of those things, and try to think of low cost, quick ways to
keep them from happening. Then, you take whatever actions you deem prudent to
avoid bad things happening and to help good things happen. You also take note of
bad things that have occurred in spite of your best efforts, or that have been evaded,
or simply overtaken by events. This might be called classic project risk
management (CPRM). It is surely the most frequently practiced variety on projects
of importance.

Sometimes on especially sensitive projects, project teams will supplement the basic
CPRM approach with statistical and other techniques that evaluate risks. This
requires a fair amount of work, and also requires some understanding of statistics (at
about the level of business school Statistics 101), but may be justified for several
reasons we will discuss later. I introduce some statistical concepts in a relatively
painless manner in chapter 5. The information in that chapter will not make you
expert enough to do all of the quantitative risk analyses that are sometimes done in
projects, but it should make you expert enough to talk intelligently with the people
who do them.
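To give a concrete taste of the quantitative techniques just mentioned, here is a
minimal sketch of a Monte Carlo cost-risk simulation, one of the most common
statistical methods used on sensitive projects. Everything in it is invented for
illustration: the task names, the triangular (low, most likely, high) cost ranges,
and the trial count are assumptions, not figures from any real project.

```python
# Hypothetical Monte Carlo cost-risk sketch. Task names and triangular
# (low, most likely, high) cost ranges in $K are invented for illustration.
import random

tasks = {
    "design": (40, 50, 80),
    "build":  (90, 120, 200),
    "test":   (20, 25, 60),
}

def simulate_total_cost(tasks, trials=10_000, seed=1):
    """Sample total project cost many times; return the P50 and P90 costs."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    totals = []
    for _ in range(trials):
        # Draw one cost per task from its triangular distribution and sum them.
        totals.append(sum(rng.triangular(lo, hi, mode)
                          for lo, mode, hi in tasks.values()))
    totals.sort()
    return totals[trials // 2], totals[int(trials * 0.9)]

p50, p90 = simulate_total_cost(tasks)
```

The spread between the median (P50) and the P90 cost is one simple way such an
analysis expresses cost risk to a sponsor.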

Chapter 2 is dedicated to CPRM. You need to understand how it works because
most of you will probably decide that CPRM is what you need to do for your
particular project. Also, it is a necessary foundation for what I call advanced project
risk management (APRM). APRM is the third of the risk management strategies
mentioned above. It scales PRM up to a high level of intensity. APRM is
appropriate for very risky, very important projects. It is discussed in chapter 3.

When you are operating a really critical project where failure is not an option you
even want to think about, you should consider going with APRM. Here is a quick
insight into what I mean. Visualize a Special Forces team like the Navy SEALs or
the Army Green Berets planning a difficult military mission where success is vitally
necessary, and especially where somebody could get hurt or killed. They will of
course use the ordinary techniques of CPRM, where they try to visualize all that can
go wrong, and think of ways to counter it.

But they will do a lot more than that. They will also be sure that each team member
(from the lowest ranks to the highest) is as well trained and competent as possible.
They will cross-train people so that if people are killed or wounded, they still have a
good chance to complete the mission. And they will not make a fetish of the chain of
command, especially when things get difficult. When the situation is dicey,
leadership will shift quickly and automatically to the person who has the most
expertise with the problem at hand, even if that person is a junior enlisted person.
Their attitude will be: forget egos, titles, and ranks; "It's the mission, stupid!" To
achieve APRM status, your team (not group, not organization, and especially not
committee) needs to be able to say, and mean, "It's the mission, stupid!"

3 Actually, there is a fourth strategy: don't do the project at all, or delay it until
conditions are more favorable. I assume here that you need to, or have decided it is
best to, get the project started now, or at least pretty soon.

APRM may not be for your project. After all, in the projects of most readers of this
book, nobody will get shot at. If somebody dies it will be purely an accident.
Besides, APRM costs more than CPRM, and maybe you would like to save the extra
money. But before you make that decision, consider this:

Although most likely nobody will get shot at or have a hand grenade tossed into
their cubicle, your competition can still be figuratively shooting at you, and so
can Murphy. Those economic bullets can really sting!
Human beings are famously not very good at anticipating the future, which is the
whole basis of CPRM. A risk you overlook might severely cripple your project.
While probably no one will actually die on your project,4 it could happen that
someone gets injured, or sick, or pregnant, or has major family problems, or
simply quits and goes to work somewhere else. Could that happen? Could it
damage your project?
APRM is more about a fierce commitment to being able to recover when
something bad happens, as opposed to trying to anticipate all possible bad
events and dreaming up ways to counter them. In examining a particular project,
we can be pretty sure we will miss some risks. The ones we miss can become
problems, as opposed to risks, which are merely hypothetical problems. Success
will be defined by our ability to find timely solutions.
APRM is about having a robust plan with a rapid re-planning capability, and
capability to communicate the new plan almost instantly, and robust people to
execute it, so that the project can ride over the potholes and maybe crash into a
few barriers without breaking down. Would it really be so hard to get yourself to
that condition? Should you want to? Read chapter 3!

1.3 What is a project?


Webster's New Illustrated Dictionary gives the following primary definition of
"project":

Something proposed or mapped out in the mind, as a course of action, a plan.

An Internet dictionary site provides:

A plan or proposal; a scheme. Also, an undertaking requiring concerted effort.

4 Unless, of course, you have a project to clear land mines or maybe dive for
submerged treasure in shark-infested waters.
Clearly, the notion of a project is closely associated with the notion of a plan. The
notion of a concerted effort also rings true. Implicit in these definitions, but not
clearly enunciated, is the notion of goals. It's hard to imagine a serious project that
does not begin with goals of some kind.

To further flesh out the definition, consider that a project is an agent of change. Its
purpose is to change something that already exists, and often to create something
entirely new. In a project, that which exists may be replaced, enhanced, or
eliminated.

I bring the notion of change agent into the dialogue because our instincts warn us
that attempts to institute changes can have unforeseen outcomes. Projects of all
sizes and shapes are inherently risky to some degree, larger ones generally much
more than smaller ones. Therefore, to effectively manage a project requires some
understanding of how to manage risk. Some people go so far as to say that project
management is project risk management, because without risks, project
management could be turned over to the project team, or nowadays even to
computer programs that remind everybody what to do, once the plan is set in motion.
I don't go that far, but I do believe that management of risks is most of what a project
manager should be doing, and much of what his team should be doing. The purpose
of this book is to help you gain a good understanding of this vital subject, so you can
perform appropriately.

The concerted effort mentioned in one of the above definitions calls to mind that
projects virtually always require a mustering of the resources needed to conduct
them. The needed resources are seldom standing idle and available where and
when the project needs them. Resources needed by projects include time and
money, of course, but also people, materials, equipment, data, information, and
infrastructure. This mustering and subsequent release of assets is a key
characteristic that distinguishes projects from routine operations in business or
government.

    "Whenever you set out to do something, something else must always be done first."

A close analogy for the role of projects in human
affairs is the operation of an aircraft. Getting the aircraft airborne is a project. Once
it is airborne, it flies uneventfully (most of the time!) to its destination, and the
resources used to get it airborne are diverted to less stressful pursuits, like drinking
coffee and taking naps, assuming you have a copilot. A second project is required to
get the aircraft safely back on the ground. You should note that common knowledge
among pilots is that the most dangerous parts of a flight are the takeoff and the
landing (especially the takeoff). By analogy, projects are generally riskier than
routine business operations.

Projects begin with the perception of a need for change. Typically, some project
champion suddenly sees an important unmet need or opportunity, and the project is
born. The first step after recognition of the need for change is to define the project's
goals and constraints. A goal is some new state of nature that we desire to create.
A constraint is a condition we wish not to violate.5 Typical constraints are caps on
costs, limitations on project duration, environmental restrictions, avoidance of
undesirable secondary consequences, required practices, licenses, clearances, and
the like. Note that the coexistence of goals and constraints is inevitable in any
project, and that this condition assures that there will be risk.

The goal creation step is often contentious, because different people may have
different perceptions of what changes are needed, how much time and money
should be spent, and what compromises should be made. Project champions and
the sponsors who support them typically screen a significant project proposal by
comparing it to many alternatives and options before accepting it. Two questions
must be answered for any project: 1) why do this project, and 2) why not do some
other project instead?

    "If there is a possibility of several things going wrong, the one that will
    cause the most damage will be the one to go wrong."

Ultimately, goals are the criteria we use to determine if a project has been
successfully completed. If all of the goals are unequivocally met, we say that the
project is hugely successful. If a few are not quite met, but the deviations are small,
we generally concede moderate to good success. Anything less is generally viewed
as a failure or near-failure. Most would concede that the following are examples of
project failures:

The project was cancelled because of actual or likely cost or schedule overruns,
or apparent inability to meet key project goals
The cost or schedule constraints were seriously violated, with serious
consequences, even though the project was completed
There had to be a major and unwanted restructuring of the goals after the project
started in order to avoid an overrun condition in cost, schedule, or both
The project created seriously bad unintended consequences for the sponsors or
for others
A vital mission was not completed
Revenues were so small that a major financial loss resulted (for projects that
have revenue goals).

A project plan is the medium through which the goals of the project and the proposed
means of meeting them are communicated within the project team and to interested
stakeholders outside the team. The project plan is a living, not a static document.6
The rough plan that was drawn up when the project was first conceived will seldom
be a suitable guide to action when the project is running at full speed later on, when
50 or 500 or 5,000 people are involved. The plan is merely an expression of what is
currently considered to be the best course of action to achieve the goals. It is not
God. It can and should be changed when it needs to be changed.

5 Another view is that goals and constraints are statements of what we would like to
have, and what we are not willing to give up in order to have it.
6 I use the word "document" merely for convenience. A plan is more often a set of
documents, and may be presented and contained in media as diverse as wall
displays, computer data files, video presentations, and bound volumes.

In today's world, it is not uncommon for the plan in a major project to change in
some degree almost every day due to new or unforeseen circumstances. This is
both good news and bad news. The bad news is that changing a plan takes effort
and costs money. And, the new plan must be checked to see if it still permits the
goals to be accomplished with acceptable probability. The good news is that the
project team is kept current in what is supposed to happen, so they can adjust their
actions accordingly; that is, provided the changes are promptly and clearly
communicated to them.

A well-conceived project plan includes most if not all of the following elements, in
one form or another:

The goals and constraints, as agreed with the project sponsor (recall that these
are the criteria used to eventually measure project success)
The statement of work (SOW), which describes generically what the team is
supposed to accomplish and by when (the SOW is sometimes integral with the
goals, but may also supplement them separately).

    "A carelessly planned project takes three times longer than expected; a
    carefully planned project takes only twice as long."

The baseline product design, a description of the current concept for the product
design7 solution that we believe will satisfy the goals (typically in the form of
drawings or specifications for engineering projects, but for some types of projects,
the baseline design may be captured in a simple text document, or perhaps just in
the memories of the participants). Derivative product requirements and all existing
design and implementation methodology details are also included in the baseline.
These are requirements and details that flow from the selection of a particular
design and implementation scheme, as opposed to requirements directly
associated with the project sponsor's goals. A baseline is not necessarily a
completed design. Early in the project, it could be little more than a sketch with
some explanatory text. But as the project progresses, the baseline should flower
into something much more comprehensive.
The work breakdown structure (WBS), a hierarchical list of generally product-
oriented tasks to be accomplished by the team to create the product defined by
the baseline design. Some elements of the WBS may not be directly product-
oriented. For example, there could be a project management element, or a legal
matters element, or a contract management element, or elements for other
activities that are critical to project success.

7 Every project has a product of some kind. The product can be tangible or intangible,
or some combination. Every product has a design, which simply means that it has a
description of some kind.
The WBS dictionary, a description of the work to be
accomplished for each WBS element (such descriptions are
sometimes referred to as work packages and may include budget and schedule
allocations).
The schedule, often a Gantt chart on smaller, less risky projects, or a network
diagram on larger, riskier projects.8 When individual tasks in the schedule are
assigned durations, a network diagram performs the vital function of permitting
computation of the length and location of the critical path. This provides
guidance to the team as to which tasks are most critical to complete on time, and
which have some slack in their completion time. Most beneficially, the work
elements of the schedule are also the work elements of the WBS dictionary.
The cost estimates, usually broken into labor hours and associated costs,
material costs, and other costs, for each task listed in the WBS, and each fiscal
period of interest (often monthly, but sometimes weekly). The cost estimates
must be based on the baseline design and other parts of the plan concerned with
implementing the baseline design.
The budgets, which are financial allocations to groups or functions within the
team that show how much they are authorized to spend and when. Typically, the
budgets are based on the cost estimates, except for funds held in reserve by the
project manager as a means of dealing with risks and problems. (Sometimes
project budgets have no connection to estimates. This can happen when the
project sponsor says "This is all the money I have; live with it," and the project
does not shrink to conform. This unfortunately common situation is a recipe for
project disaster.)
The staffing plan, which describes by name or at least by skill category, each
person expected to work on the project. The plan also identifies which functional
group in the project team each person is to work in, and who leads that group.
The staffing plan also shows how many hours each person or skill category is
supposed to work in each fiscal period, and on which tasks.
The risk management plan, usually comprising descriptions of identified risks
and their possible project impact, plans for avoiding or modifying risks, the
accomplishments to date in modifying risks, and the special expenditures
budgeted or incurred to modify risks.
A statement of earned value. This compares what has been done and what
has been spent at the present time with what the plan calls for.9 It is a measure
of success to date in overcoming risks and problems. It also can be extrapolated
roughly to predict future success (or failure!).
If appropriate, there may also be other plans for the acquisition, retention, and
use of other assets, such as capital assets, raw materials, manufactured items,
expendable items, computers, data services, consultants, utilities, local
transportation, etc. From a risk management standpoint, such plans should
always exist as appropriate and should be scrupulously executed.

8 If you are not familiar with Gantt charts and network diagrams as scheduling
devices, see the Appendix. Gantt charts are useful for a general understanding of the
flow of the work, but are not very useful for analysis of schedule risk. Schedule risk
analysis is generally based on schedule networks. An elementary level grasp of
schedule networks is vital to understanding some of the discussions in later chapters.
The most important single concept in schedule networks is the critical path. It is the
longest duration path through the network.
9 Plans can and do change. The usual interpretation of earned value is what has been
done versus what the original plan called for, with the original plan adjusted for
changes that were requested by the sponsor.
As appropriate, disaster recovery plans and business continuity plans that
are not necessarily limited to natural disasters. A disgruntled current or former
employee may do more damage than a hurricane. An embezzler can wreak
financial havoc. A server failure can put a project out of business for a time.
Various other plans that may be needed, such as for human safety, quality
control, hiring, etc. The general rule should be, if something is important to
avoiding project failure, or hurting people, or loss of vital assets, have a plan for
it, and make sure the plan is followed.
Record of tradeoff analyses accomplished. The baseline product design, as it
evolves, is always the result of tradeoffs between design options. These
tradeoffs typically include customer satisfaction, cost, schedule, and risk, among
other factors. It is important that the project team not lose sight of how it came to
select the current design baseline, while moving away from previous design
baselines. Loss of this understanding can result in confusion and reinventing the
wheel.
Revenue forecast: Many projects are managed totally from a cost perspective,
even though they may generate revenue after or even before project completion.
But if revenue generation is expected as part of the scope of the project, then the
plan for revenue generation should be included as part of the project plan. Once
the time has come when revenue is expected to flow, the actual flow of revenue
should be recorded and compared to the plan. Activities to ensure or enhance
revenue flow should be recorded.
Project metrics are records over time of major results related to satisfaction of
goals and derivative requirements. For example, if maximum product weight is a
key requirement, then a record that tracks computed or actual weight as the
project proceeds is a vital metric. Metrics are important to an understanding of
how well the project is meeting goals, constraints, and derivative requirements.
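The critical path computation mentioned under the schedule element above can be
sketched in a few lines. This is only an illustration: the task names, durations, and
dependency links are invented, and real scheduling tools also handle calendars,
lags, and resource constraints that this toy example ignores.

```python
# Toy activity network. Durations (days) and predecessor lists are invented.
durations = {"A": 3, "B": 5, "C": 2, "D": 4, "E": 1}
predecessors = {"A": [], "B": ["A"], "C": ["A"], "D": ["B", "C"], "E": ["D"]}

def critical_path(durations, predecessors):
    """Return (project duration, tasks on the critical path)."""
    finish = {}     # earliest finish time of each task
    best_pred = {}  # the predecessor that determines that finish time
    pending = set(durations)
    while pending:
        # Pick any task whose predecessors are all scheduled (topological order).
        task = next(t for t in sorted(pending)
                    if all(p in finish for p in predecessors[t]))
        preds = predecessors[task]
        start = max((finish[p] for p in preds), default=0)
        finish[task] = start + durations[task]
        best_pred[task] = max(preds, key=lambda p: finish[p]) if preds else None
        pending.remove(task)
    # Walk backward from the latest-finishing task to recover the path.
    node = max(finish, key=finish.get)
    length, path = finish[node], []
    while node is not None:
        path.append(node)
        node = best_pred[node]
    return length, path[::-1]

result = critical_path(durations, predecessors)  # -> (13, ['A', 'B', 'D', 'E'])
```

Tasks on the returned path have no slack; delaying any of them delays the whole
project, which is exactly the guidance the text says a network diagram provides.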

One last aspect of project plans needs to be discussed before moving on: project
phases. Larger projects typically are divided into several major phases. There are
many preferences for how to do this, and often the project team must follow the
scheme of phases decreed by the customer or sponsor. A generic scheme for
phases that has wide acceptance follows:

Concept: the initial phase wherein the need for the project is first understood,
and advocacy for it begins to build. The key goals and constraints are roughed
out. Steps may be taken to alert or even solicit help from organizations that may
later be involved, such as potential prime and subcontractors, or other
departments within the same company, and to solicit from them advice on how to
structure the project. The costs of the concept phase are often not considered to
be project costs, but rather separately funded bid, proposal, or business
development overhead expenses.
Business case development: the phase where the value of the project is
initially assessed. This phase often runs more or less concurrently with the
concept phase. The assessment may be done based on a
number of different criteria, such as return on investment, net
present value, internal rate of return, competitive positioning, customer
satisfaction, long term stability, effectiveness, survival, mission objective
accomplished, etc. Government projects often use value measures such as
benefit/cost ratio or cost effectiveness. It may be more appropriate in a military
project to use the phrase threat case development rather than business case
development. Usually business (or threat) case development includes some
consideration of structural risk, as well as it can be understood at the time.
Contractors or other performance agents are contacted and qualified. They may
submit rough order of magnitude cost estimates. If a formal proposal is required
either internally or externally, it is prepared as part of this phase. The costs of the
business case development may not be considered to be project costs, but rather
separately funded bid, proposal, or business development overhead expenses.
Preliminary design: in this phase, available product design options are
explored and traded off, and at least the first baseline design is created. High-
level derivative requirements and specifications are developed. Needed project
sites are explored and surveyed. Some may even be acquired. Study of
logistics, safety, reliability and support begins. Critical designs requiring long-
lead procurement are detailed. Development and qualification testing is
performed. Analyses, simulations, studies, etc. are done, and models,
prototypes, and pilot plants are created and tested. The phase typically ends
with a comprehensive product design review.
Detail design: the entire design is detailed in specifications, drawings, and
reports. Tooling and other implementation aids are designed and built. Support
equipment and facilities are designed and built. Implementation planning is
completed. Critical material is acquired. Sites not already owned or otherwise
available are acquired. Sites are prepared for implementation. The phase
typically ends with a comprehensive product design review.
Product validation: provides objective evidence that the
design solution is mature, or is assuredly progressing to maturity. It provides a
level of confidence in meeting the planned capabilities and levels of performance.
Extensive testing is typically performed. If test failures indicate less than
satisfactory performance, fixes are devised and retesting is done.
Implementation: the product is created. The means of creation vary with the
nature of the project. In hardware projects, implementation generally means
fabrication and assembly. In software projects, it means design, coding, test, and
related activities. In construction projects, it means construction work at the site.
In a health care project, the product could be a set of procedures to be followed,
or it could involve the acquisition and use of new diagnostic tools. In a financial
project, the product might be a new instrument for investment. Creation of the
product ends with acceptance and deployment.
Operations and support: the product is operated and supported in its deployed
state. Maintenance, adjustments, retrofits and repairs are performed. In many
projects, this phase is not the responsibility of the team that created the product,
yet it may have large structural risks for the sponsor or other stakeholders.
Retirement: the product is retired and disposed of. This may be irrelevant to
some projects, but to others, such as the aircraft industry, chemical plants,
factories, or especially the nuclear power industry, it is a major
factor. Retirement is not always considered to be a part of the
project. In smaller projects, it may be regarded as simply a routine business
activity.

Not every project has all of these phases. And in projects that have them,
sometimes they are at least partially concurrent. Software development projects
usually combine design and implementation. If the project includes revenue
generation, that is usually included in operations and support. That phase also
typically includes warranty actions.
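The value measures named under business case development, such as net present
value and benefit/cost ratio, are simple to compute once cash flows have been
estimated. The following sketch is illustrative only: the cash flows and the 8%
discount rate are assumed figures, not data from any actual project.

```python
def npv(rate, cash_flows):
    """Net present value: discount each year's cash flow back to year 0."""
    return sum(cf / (1 + rate) ** t for t, cf in enumerate(cash_flows))

# Year 0 is the project investment; years 1-4 are net benefits (assumed).
flows = [-1000, 300, 400, 400, 300]

value = npv(0.08, flows)  # positive NPV suggests the project adds value

# A simple undiscounted benefit/cost ratio, as often used on government projects.
benefits = sum(cf for cf in flows if cf > 0)
costs = -sum(cf for cf in flows if cf < 0)
bc_ratio = benefits / costs  # 1400 / 1000 = 1.4
```

A business case would typically compare these figures across the alternative
projects mentioned earlier ("why not do some other project instead?").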

1.4 How do risks get into projects?


There are so many ways that risks can get into projects that one might think it does
little good to try to identify or classify them. If Murphy is even half right, the risk that
will hit your project is one you didn't think about. Perhaps so, but risk awareness is
a good thing, and it helps to take a look at some areas that are common sources of
project risk. At least you can be on your guard against these common sources of
failure.

1.4.1 Just plain bad luck.

As the saying goes, just plain bad luck happens. (Well, that's not quite the saying,
but you know what I mean.) But think for a moment of the last time you had bad
luck. Was it really unpreventable? If you had been a little more attentive, a little
more careful, could what happened have been prevented, or at least mitigated?
What lessons, if any, can be learned from your experiences with just plain bad luck?

1.4.2 Poorly designed goals.

In this book, I don't try to discourage people from setting difficult goals. I merely
point out that 1) absent means to modify risks, difficult goals increase the chance of
failure, and 2) poorly drafted goal statements increase risks unnecessarily.

What are some of the main considerations in the design of project goals, from the
PRM standpoint? Because of its huge importance, let's explore that question in
some depth.

1.4.2.1 Stability. One of the most difficult risks a project team can face is
instability of the project goals. Conscientious as a project team may be, it is hard to
satisfy a sponsor who keeps changing his mind.10 Of course, in the fixed price
contract project environment where the work content generally is tightly defined by
the sponsor, the project team can handle this rather neatly by recording all of the
changes and presenting the project sponsor with the bill. This may or may not result
in an unhappy sponsor.

10 What may even be worse is a project team that cannot make up its mind about how
to approach an otherwise well defined job.

In other situations, goal instability can be a difficult problem for the project team in
controlling the sponsor's risk. Failure to control the sponsor's risk can result in an
unhappy sponsor. Even though the sponsor is at least a major contributing cause of
the problem, the sponsor may blame the project team.

I believe instability of the goals has two main causes: 1) a poorly organized sponsor,
and/or 2) technological immaturity. Often the project team can help relieve the
instability problem if the project sponsor is cooperative. The key is trying to identify
why instability may exist. If it is due to technological immaturity, it might be well to
delay the project (or create a less ambitious project) until the technology is more
mature. If it is due to conflicts or uncoordinated decision-making in the sponsor's
organization, it may be possible to resolve the problem at a higher level of
management.

    "Fuzzy project objectives are used to avoid the embarrassment of estimating
    the corresponding costs."

A common source of instability is multiple sponsors who keep disagreeing about
what to do. If it is necessary to have multiple sponsors, the distribution of power
among them and the protocols for exercising it must be carefully considered.

Sometimes it is helpful to do more prototyping. This means communicating to the
sponsor what the product will be like without actually building it. There are many
forms of prototypes. Examples are brass boards (for electronic products), static
and working models, mockups, and computer simulations. Software projects often
create so-called rapid prototypes that are little more than inactive or semi-active
screens the sponsor can look at to better understand what the final product will
look like. If the sponsor dislikes what he sees, changes to a prototype are still
relatively inexpensive. Prototyping helps satisfy the sponsor who says, "I'm not
sure what I want, but I'll know it when I see it."

Anticipatory documentation is another form of prototype that has been used
successfully to assure customer satisfaction before proceeding with expensive
detailed development of a product. It involves creation of draft user guides or
training materials based on what the product will be like. By simply reading the
documentation, the sponsor is able to obtain an excellent mental picture of the final
product configuration.
Exhibit 3
16 Times Wrong

I worked on a project where a physical impossibility resulted in project cancellation.
The leader of our project team had a brainstorm about a potential product, and went to
see a potential customer about it. On the flight to see the customer, he roughed out
some calculations to show that it would work. He did them on a yellow pad with the
help of a slide rule (no pocket calculators then). The customer gave us a $3 million
contract (a fair amount of money in the 1960s) to develop and test the technology.

Several months later, we conducted a separately funded two million dollar test of the
hardware we had developed. The results were only about 5% of what we expected to
see. Stunned, the project engineer (me) went back through the calculations
meticulously, and even had a programmer write a computer routine to test them. Lo
and behold, we analytically confirmed the 5% test results. Our leader had made a
serious error in his calculations. In a key equation, he had forgotten to divide by 16.
The customer immediately cancelled the project upon hearing of the results. He was
not a happy camper. I definitely learned some lessons from this experience.

There is a form of volatility of the goals that can be benign, but is not necessarily so.
It is called "goals creep." Sometimes when a project is in progress, it becomes clear
that for a relatively small amount of additional time, money, or both, the project can
be made to produce benefits that outweigh the added costs. Or sometimes, for
example in military projects, a new or modified threat or problem is discovered that
must be countered, causing goals to increase in scope. But all too frequently, goals
creep represents nothing more than a hidden agenda that is gradually revealed.
When this is the case, it is just a form of bait and switch. Hidden agendas and bait
and switch are discussed later in this section.

1.4.2.2 Basis in reality. Aside from goal instability, little can make a project
more risky than to have goals not based in reality. You may have had the
experience of working on a project where the goals were poorly grounded. Even
though the plan may have been thorough and the project team may have performed
as well as could be expected, the problem didn't get solved because its true nature
was never recognized.

Goals can be at variance with reality in several ways. Some of the most common
are:

- Mathematical or logical error or misapplication of scientific principles,
misunderstanding of laws and customs, etc. See Exhibit 3 for a real life example.
- Failure to appreciate the size of the gap separating what is known and familiar
from the expected end state at project completion. Here's an extreme example of
what I mean: in 1492, suppose that Christopher Columbus had decided to go, not to
India, but to the moon. He would have been about 500 years ahead of the needed
technology. No matter how much money Queen Isabella gave him, he would never
have made it in his lifetime. As I write, there are probably projects out there that will
fail because their proponents do not understand that what they want to do
will not be possible for another 20 years, or more. However, we should keep in
mind that technology moves much faster today than it did in the time of
Columbus. Projects that anticipate new technology are not necessarily a bad
thing.
- Attempting to do too many new and untried things in one project. This requires a
long learning curve, the scope of which is almost always underestimated. A good
rule of thumb is that if you are trying to do more than two new things in one
project, the project is very risky. The usual remedy for this is to divide the larger
project into two or more consecutive smaller projects. If there is a failure, the
impact is likely to be less.
- Failure to determine the availability of a critical resource. The resource might be
people skills, equipment, infrastructure, or even data. It could also be political
support.
- Failure to understand the nature of an existing situation, especially the nature of
complex systems that the project is intended to influence or with which it will
interact.

The last mentioned way of being disconnected from reality is sometimes recognized
after the fact by unintended consequences. A classic example is the introduction
into agricultural practice of the insecticide DDT. It was hailed as a boon to mankind,
especially in third world countries, where it significantly raised crop yields. Yet only a
few years after it was introduced, it was condemned as an environmental disaster.

A post-project excuse commonly heard when goals are not based in reality is "We
didn't know then what we know now." While that may be true, it is often equally true
that someone could have, but did not, take the trouble to work the problem through
before starting the project. There used to be a common saying in project teams:
"There is never time and money to do it right, but there is always time and money to
do it over." In today's competitive, resource-limited environment, there is seldom
time and money to do it over. It needs to be done right the first time.

Clearly stated instructions will consistently produce multiple interpretations.

Another aspect of poor basis in reality is when cash flow or elapsed time constraints
are overly tight. This often happens because the constraints are rigidly set before
there is a good understanding of the desired product. This problem cannot always
be avoided when the project deals with very sophisticated state-of-the-art systems.
But it should be avoidable for low-tech public construction projects, infrastructure
projects, and other projects where there is a huge database of consistent cost and
schedule experience that can easily be used to verify estimates. Yet amazingly,
these projects often experience serious cost and schedule overruns. The difficulty is
often traceable to a lack of realism in the goals, often introduced by
micromanagement and political meddling. It can also be introduced by the opposite
of micromanagement: a failure of proper oversight. "Trust but verify," as the saying
goes.
1.4.2.3 Clarity. A clear goal is one that has definite criteria
that can be used to decide whether or not it has been met. A counterexample is: "The
software shall be user friendly." This goal is vague and unclear. The project
sponsor and the project team are likely to have radically different ideas about what
"user friendly" means. The difference in these ideas could span a cost range of
millions of dollars. If these extra millions are not in the plan, failure is likely.

An unclear goal is often in reality a complex of linked sub-goals, which are not well
articulated. In failing to articulate them, risk is introduced into the project. Faced
with unclear goals, the project team will tend to work on what is most familiar, or
what is most convenient. The result may be completely unsatisfactory.

It is seldom necessary to define every goal in a project with mathematical precision.
But certainly all of the important ones should be clearly expressed. Not all problems
with goals are preventable, but lack of clarity almost always is.

1.4.2.4 Generality. General goals have one or a few criteria. Specific goals
have many. A general goal in a project might be to create a system that will move all
baggage from an arriving aircraft into the baggage claim area within ten minutes of
aircraft arrival. There are many possible means for achieving this, and each of them
can be defined by a set of specific design requirements.

A great source of risk in projects is the premature transition of goals from general to
design-specific. It is very risky for project goals to prescribe specific design solutions
before the nature of the problem is clearly understood. What is generally best is an
orderly and careful transition from the general goals to design-specific derivative
requirements, as the situation unfolds.

Particularly in complex cost-plus contract work, the project sponsor must avoid the
temptation to over-specify the end product in the concept phase. This may force the
project team down a path that leads to a poor, costly design solution.

1.4.2.5 Linkages. I have already noted that vague goals can be a cover for a
set of multiple, linked sub-goals. When a set of linked goals is examined in depth,
there can be several types of linkages. Most notably, there can be positive and
negative linkages between goals. The result can be stresses and risks in the project.

If two goals are positively linked, satisfying one tends to satisfy the other. But if they
are negatively linked, satisfying one tends to dissatisfy the other. Product
performance goals and cost and schedule constraints have strong negative
linkages. Negative linkages are revealed in expressions such as cost/benefit, cost
efficiency, and cost effectiveness. Real projects must deal successfully with
negatively linked goals. They are unavoidable. The first step in dealing with them is
to recognize that they exist.

One way to proceed is to give priority to one goal and let all negatively linked goals
find their own level. This minimizes risk. The more usual situation is to set limits on
all goals. But be aware that the tighter the limits, the higher the risk.

Exhibit 4
Desirable Properties of Goals

It may be helpful to you to have a short checklist of desirable goal properties. I
show one here, but make no claim that it can't be improved upon.

- Easy to understand
- Meet human needs
- Balanced and realistic given the state of the world
- Consider possible unintended bad consequences
- Precisely specified
- Testable as to when they have been satisfied
- Define assumptions that must be valid for achievement
- Minimize internal conflicts
- No hidden agendas
- Clearly state all applicable constraints

1.4.2.6 Visibility. A perilous situation arises when negatively linked goals
comprise a hidden threat that only becomes visible when trouble arises. It is
important that goals be visible when the project begins.

Unfortunately, hidden threats are not always easy to find, especially when they take
the form of a secret agenda that someone wants to keep under wraps as long as
possible. The best defense is generally a thorough review of all of the goals for
consistency and coherency. Hidden threats may reveal themselves simply because
they distort the pattern established by the goals that are clearly visible. It's a bit like
having a snake sleeping under a sheet. You can't see the snake, but you can tell
there's something under the sheet that doesn't belong there.

Ambiguous, obfuscating language often betrays a hidden threat. This is especially
true when it occurs in specifications that govern the performance of the project's
product. Whenever a specification is written in a confusing way, you should strongly
suspect that whoever wrote it doesn't know what he wants.11 Later, when he has
finally decided, you may be blamed for not understanding what was (according to
him) always as clear as spring water. This sounds like a rare condition, but actually
it is not all that uncommon.

11
Or knows, but doesn't want you to see it clearly just yet.
An especially pernicious form of hidden agenda is the bait and
switch. The expression "bait and switch" comes from the retail
world. It happens when a merchant advertises a product at a very low price to entice
customers into the store. But when they arrive there, they find that the advertised
low cost product is mysteriously sold out, and will never be stocked again. So the
merchant attempts to sell them a much higher priced product. Something similar
often happens in the world of projects, especially government sponsored projects.
Someone, often a project performer or interest group, successfully sells a project to
a sponsor (such as Congress or a city government) on the basis of a phony, low cost
estimate. Once the project has begun, the true extent of the cost is gradually
revealed. It is typically much higher than the phony estimate. The sponsor, to avoid
losing face or for other reasons, goes along and continues the project.

A good plan today is better than a perfect plan tomorrow.

The bait and switch phenomenon in projects is not necessarily due to overt
dishonesty. It can also be due to pride, greed, and simple hubris. Truly independent
cost reviews by disinterested (but expert!) parties are usually an effective remedy for
the bait and switch.

In the visibility category, there is a subtle form of risk that might be likened to not
being able to see because you have a hundred bright lights shining in your face.
Sponsors of projects who have some uncertainties as to what they really want and
are willing to accept sometimes load up project contracts with extensive boilerplate
language and lists of related specifications to the point that even they don't fully
understand what they are asking for. This is especially common in government
procurements, where 50 or 100 or more specifications may be listed. This gives the
sponsor comfort that he will not be criticized for leaving out something important.
But it leaves the performer of the work in an uncomfortable condition.

Often in a long list of boilerplate specifications one will find ambiguities and even
conflicts. Boilerplate specs are often neglected and get out of date. The project
performer either must try to sort it all out, and insist on exceptions when troublesome
items are found, or go ahead and accept the work as is and hope for the best.
Chances are, the fine print will never become an issue, but if it does, the problems
could be huge.

A sponsor who wants a successful project will not load it up with unnecessary
boilerplate requirements that take a team of lawyers and engineers a month to
sort out. This practice is unprofessional. The dubious protections it provides are
likely to be more than offset by a higher price for the work.

1.4.3 Brittle vs. robust plans.

Imagine you have a project in which the general goal is to build a ten-story,
sophisticated, environmentally green office building. You engage an architect
whose experience is limited to the design of low cost tract housing.
The result is a design disaster.

Or, imagine you have the (previously noted) project whose main goal is to move
baggage from arriving aircraft in a major airport to baggage carousels within ten
minutes of aircraft arrival. You design the system and install it, thinking to test it after
it is installed. The system tears up luggage, cannot be economically fixed, and has
to be replaced with a more traditional design. (If that sounds familiar, something like
it actually happened and was in the news a few years ago. The project was the
recently built new airport in Denver, Colorado.)

These are examples of brittle plans. They are plans that are readily susceptible to
breakage, especially when stressed. Stress may be inevitable, and breakage
therefore likely.

Brittle plans are a frequent result when constraints,
especially cost and schedule constraints, are set
unrealistically tight. The tightness of the constraints
causes the project team to skimp on materials quality,
expertise, testing and other areas where prudence would
indicate that skimping is a bad idea. Brittle plans can also be
the result of micromanagement.

New systems generate new problems.

Another common source of brittle plans is the low bidder
syndrome. The contractor is selected based only on bid
amount. It often turns out that the winning bidder is lowest in cost because he
doesn't understand the true expectations of the goals, or he understands them and
intends to cheat. He has insufficient money in his contract to provide the required
quality. So the quality provided turns out not to be what was expected. The result is
seemingly endless problems.

Yet another source of brittle plans is an inexperienced project team. The team is
attempting to accomplish something foreign to its skills and/or experience. Its plan is
not likely to account well for reality.

The opposite of a brittle plan is a robust plan. A robust plan is one that is so
structured that it is resistant to risks, and results in accomplishment of the goals.
Robustness can be described at several levels. I like to think of it as having three
discernable levels, low, intermediate, and high, with various shadings in between:

- Low-level robustness. A plan with low-level robustness is like a car with good
shock absorbers. It goes smoothly over the smaller potholes. Some sources of
low-level robustness are team cross-training, flexible (i.e., redundant or
interchangeable) resources, good team spirit and morale, and a clear
understanding of the plan by the team, even as the plan evolves.
- Intermediate-level robustness. The plan is reality seeking. Special efforts
are made to understand reality, and especially what is best in terms of meeting
the goals. A reality-seeking plan includes plenty of tests, tradeoffs, and what-if
analyses. Many hypotheses are tested and rejected in arriving
at the baseline product design. Windows of opportunity are not
hastily closed. Redundant approaches are taken in areas of high criticality.
- High-level robustness. This is a plan in which contingency escapes and fast
recovery from problems are given careful attention. The team is highly qualified
and well trained. The team in some sense practices the project before actually
beginning work on it.

Robustness at all three levels is a characteristic of
advanced project risk management (APRM).

1.4.4 Poor estimating and estimate nullification.

The observant reader will notice that I spend the next several pages on the subject
of poor estimating, and less than a page on many of the other sources of risk. The
reason is that poor estimating is a common and huge contributor to project risk,
while many of the other sources are occasional or sometimes minor contributors. I
believe that many project leaders and teams have a poor understanding of how
estimating gets done, and especially how to avoid poor estimates. I hope here to
impart some of that missing knowledge.

Also discussed is a phenomenon I call estimate
nullification. If a project team has an excellent, skilled, experienced estimator, and
the work to be done is well understood, it will not get a poor estimate. But it can still
have major cost risks if persons in authority choose to nullify the professionally done
estimate and replace it with a phony one. Commonly, project managers, interest
groups, and others do this. One reason is to "sell" the project, in the hope that once
it gets started, political factors can be counted on to keep it going. This bait and
switch tactic was discussed earlier.

Every project plan contains estimates. Estimates are statements or assumptions
about the anticipated resources that will be needed to perform the project.
Resources can be of many kinds, including money, time, labor skills, materials,
manufactured components, equipment, infrastructure, services, utilities, data, raw
materials, and professional services. Typically, the project team creates estimates
based on a particular baseline design and a sequence of activities needed to create
it. Estimates are commonly collected and organized by a person (or group) here
designated as the estimator.12

The estimator should have the authority and the responsibility to examine the
estimates for realism. A realistic estimate must be based on a clear understanding
of what is to be done and the resources required. The degree of realism (and
accuracy) of an estimate tends to increase as the project proceeds and grows in
maturity. In the concept phase, estimates are commonly only rough approximations.
During preliminary and detail design, they tend to steadily increase in accuracy. It is
only when the project is complete and the costs and duration values are summed
that the estimate is dead-on accurate, but by then, you have no use for an estimate.

12
Other terms are in use, such as cost analyst or cost engineer. The estimator is not necessarily a
single person. On large projects, there may be several estimators reporting to a lead estimator.

Exhibit 5
An Unfortunate Syndrome

An unfortunate syndrome plagues many projects. It introduces cost and possibly
schedule risk into the project. It works like this. Team leads estimate what they think it
will really take to do the job. Then they pad it a bit, or more than a bit, just in case.

That's not always good. Team leads don't always have a good grasp on how much risk
they face, because risk in other team components may affect them, and they may not
fully realize this. There may be much more or much less risk than they have in mind.
What they should do is give their best estimate of what will happen in their area if
everything goes according to plan. Risk contingency estimating should be a separate
exercise from cost estimating, with input from the entire team and the project manager.

What's even worse is when each lead pads the estimate a second time in anticipation of
a reflex action cut by the project manager. If the cut is not forthcoming, the estimate is
too high. If the project manager cuts too much, the estimate is too low. A project
manager should not arbitrarily cut estimates made by the people who have to do the
work. If he thinks a lead cannot or will not give an honest estimate, perhaps he should
consider getting a new lead.

Project sponsors sometimes push project teams to commit to accurate estimates
very early in the project, at the very time when the team has the least understanding
of the project. This is especially true in technology projects. If these estimates are
used as the basis for the project budget, serious risk is introduced, in the sense that
it is unlikely that the project can be completed for the amounts budgeted. The
estimating community is well aware of this phenomenon, and has tried to develop
estimating techniques that provide more accuracy with less information. I will
discuss estimating techniques shortly.

I learned the following two universal axioms of estimating from a very wise project
leader:

- You can't estimate anything if you don't understand what it is.
- All estimates contain error; the estimator's job is to minimize it.

If you only partially understand something, any estimate you make about it will
contain error. The size of the likely error corresponds roughly to your lack of
knowledge. It is well known that the early estimates of a project at the concept
phase are often wrong by 50% or even 100% or more. This is often partly due to
instability of the goals, but also may be due to incomplete understanding of all of
the ramifications of the current baseline or pre-baseline design, and what it takes to
develop and implement it.

Accordingly, what is ideal is not to produce estimates that are highly accurate from
the first day of the project, but to produce a sequence of increasingly accurate
estimates as the project proceeds. The accuracy of the estimates produced at a
given point in time should be consistent with what is known about the scope of
project work.

But what is ideal is not always attainable. Sometimes estimates have to be made
and converted to budgets while information is still incomplete. To the extent that the
nature of the work is unknown or not clearly understood when a funding, schedule,
or other commitment must be made, there is risk in the project. Clearly, risk is
reduced if such commitments are made as late as possible. On the other hand, it
costs money to wait and collect more information. And waiting may not be possible
for other reasons.

How can poor estimating be avoided? Sometimes it cannot be totally avoided. My
recommendations are few and simple:

- Don't have high expectations for accuracy that are inconsistent with what is
known about the work to be estimated. Keep in mind that the estimating error is
likely to be at least as big as the lack of understanding of the scope and nature of
the project. Remember that estimating risk tends to decrease as more is learned
about the project. Wait as long as you can before committing to a particular set
of estimates.
- Use a lead estimator who is well experienced in estimating the type of project you
are engaged in.
- Give your lead estimator freedom to challenge the credibility of estimates offered
by project team members. Establish a process for resolving differences of
opinion.
- Thoroughly document the ground rules and assumptions behind all estimates.
This avoids later misunderstandings, and provides a foundation for negotiations,
if there are to be any (there almost always are).
- Discourage both fat and lean estimates, fat estimates being those that are so
padded that they virtually cannot be overrun, and lean ones being those that
express unwarranted optimism about what it takes to do the work. Strive for
estimates that represent what it will really take to do the job, neither fat nor lean.
Fat estimates can kill a project, or lose a bidding competition. Lean estimates
risk overruns. See Exhibit 5 for an unfortunate syndrome that should be avoided.
- Cross-check all large and critical estimates using a separate and distinct
estimating method as a sanity check.
The last item may cause some of you to ask, "What do you mean,
separate and distinct method? How many ways are there to make an estimate?"
Interestingly enough, there is more than one way to arrive at a reasonable estimate
in most situations. Which one is best depends on the situation. Let's briefly examine
the most common ones.

1.4.4.1 Bottom up. Bottom up (also called grass roots) estimating is the
method most familiar to most people. In this method, the lead estimator interviews
or otherwise collects information from each of the project team leads, i.e., the people
who lead each of the primary work areas. The leads estimate all of the resources
they think they will need to do the work. Usually bottom up estimates are entirely
judgmental, and are based on past experience, which can be highly variable with
respect to the project at hand.

Bottom up estimating can be quite accurate provided the leads have a complete
understanding of the work to be done, and are experienced in doing that kind of
work. From a management standpoint, an advantage of this method is that it
represents a personal commitment from the lead to perform within the limits of the
estimate (but only if the lead is going to stay with the project!). A disadvantage is
that this method can hugely miss the mark if the work is not well understood (most
such misses will err in the direction of being too low).

Another disadvantage is that grass roots estimates are expensive and time
consuming. They tie up resources that might be better employed. In a large project,
it might take a month or more to generate a good grass roots estimate.
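Mechanically, a bottom up roll-up is just an aggregation of the leads' inputs. Here is a minimal sketch; the work areas, hour figures, and labor rate are invented for illustration:

```python
# Bottom up (grass roots) roll-up: sum each team lead's resource estimate.
# All names and numbers below are hypothetical.
lead_estimates = {        # labor hours estimated by each team lead
    "design": 1200,
    "fabrication": 3400,
    "test": 800,
    "integration": 600,
}
labor_rate = 95           # assumed fully burdened dollars per hour

total_hours = sum(lead_estimates.values())   # 6000 hours
total_cost = total_hours * labor_rate        # $570,000
print(total_hours, total_cost)
```

The real work, of course, is not the summation but verifying that each lead's input reflects a complete understanding of the work.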

My recommended practice is that grass roots estimates, if used, be used only to
make the final estimate before committing to perform. Using this method early in the
project when there are large gaps in knowledge is wasteful of resources and can
result in wildly inaccurate estimates.

1.4.4.2 Standards. Standards estimating is most commonly used for the
implementation phase of a project, but can be used for other phases if good
standards exist. A standard is an accepted resource amount to do a particular kind
of work that is based on careful past measurements. For example, many companies
maintain standards for various manufacturing or construction operations. These are
collected in books or computer files. The estimator looks at each activity that must
be done, and finds the standard for it. If there is a sequence of elementary activities,
the standards for each are looked up and the results are summed.

Standards result in very accurate estimates when the following are both true:

- The current work conditions are the same as the conditions under which the
standards were derived. For example, standards often are derived by observing
the performance of a highly skilled worker, working under ideal conditions. If the
workers in the current project are not all highly skilled, or if they are not working
under ideal conditions, standards will need to be adjusted. Sometimes standards
files or books provide realistic adjustment factors or formulas for various
non-standard conditions, and sometimes they don't.
- Standards have been recently updated to account for newer, better ways of doing
things, when those new ways will be used on the project.

Keeping standards current can be a big job. It is not uncommon for standards to be
out of date. Obviously, out of date standards should be used with great care.

A common practice is to first use standards to estimate the work, then to multiply the
result by a "variance factor" based on recent experience. I worked in a
manufacturing operation that used the standards of another division of the company,
then multiplied the results by 3.2. The large variance was due to the fact that the
standards were developed for high rate production, but were being applied in a less
efficient factory that was essentially a job shop.
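The standards-plus-variance-factor calculation is simple arithmetic. This sketch shows the shape of it; the operation names and standard hours are made up for illustration, and only the 3.2 factor comes from the example above:

```python
# Standards estimating with a variance factor (illustrative sketch).
# Operations and their standard hours are hypothetical.
standard_hours = {        # accepted hours per unit, from a standards file
    "drill": 0.5,
    "deburr": 0.2,
    "inspect": 0.3,
}
units = 100
variance_factor = 3.2     # job shop vs. the high rate shop the standards assume

raw_standard = sum(standard_hours.values()) * units   # 100.0 standard hours
estimate = raw_standard * variance_factor             # 320.0 adjusted hours
print(estimate)
```

Note that the variance factor is an empirical correction; it is only as good as the recent experience it is based on.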

1.4.4.3 Analogy. Analogy estimates are simply comparisons of current work
with previously completed work for which resource requirements are a matter of
record. Analogies can be made at virtually any level of the project if there is history
for comparison. Here are a few examples to clarify the possibilities.

- As a low level example, suppose that the current project needs a 50 horsepower
air compressor. A previous project bought a 30 horsepower air compressor for
$3,000. A quick rough estimate is needed. The estimator multiplies $3,000 by
50/30 (the horsepower ratio) to get $5,000. Estimates such as this are usually
conservative when the estimate of something bigger is based on something
smaller, because of economies of scale.13 But they are not always
conservative. Famously, software development projects require more labor
hours per line of code in large projects than in small ones. This is usually
attributed to increased communications overhead in large software
development projects.
- As an intermediate level example, suppose that a pipeline construction project
needs a temporary remote site camp for 500 workers for 90 days. In a previous
project, there was a temporary camp of similar nature for 200 workers for 40
days. This camp cost $150 per day per worker. The estimated cost of the new
camp is therefore (150)(500)(90) = $6.75 million.
- As a high level example, suppose that NASA wants to launch a new space
satellite for communications purposes. The new satellite has been estimated as
weighing 450 pounds. Looking at previous communications satellites, two were
found of roughly similar complexity. One weighed 350 pounds and cost $10,000
per pound; the other weighed 600 pounds and cost $9,000 per pound. By
interpolation, the new satellite was estimated at $9,600 per pound.
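These examples, plus the square-root scaling rule of thumb in footnote 13, reduce to a few lines of arithmetic. A sketch, using the same numbers as the examples:

```python
# Analogy estimating: scale a known cost by a size ratio, or
# interpolate unit cost between two reference points.

def ratio_estimate(ref_cost, ref_size, new_size, exponent=1.0):
    """Scale a reference cost by the size ratio raised to an exponent.
    exponent=1.0 is the simple linear ratio (compressor example);
    exponent=0.5 is the square-root rule of thumb from footnote 13."""
    return ref_cost * (new_size / ref_size) ** exponent

def interpolate_unit_cost(size1, unit_cost1, size2, unit_cost2, new_size):
    """Linear interpolation of unit cost between two reference items."""
    frac = (new_size - size1) / (size2 - size1)
    return unit_cost1 + frac * (unit_cost2 - unit_cost1)

compressor = ratio_estimate(3000, 30, 50)            # $5,000 (linear ratio)
compressor_sqrt = ratio_estimate(3000, 30, 50, 0.5)  # ~$3,873 (square-root rule)
camp = 150 * 500 * 90                                # $6,750,000
satellite = interpolate_unit_cost(350, 10_000, 600, 9_000, 450)  # $9,600 per pound
```

Notice that the square-root rule gives a lower figure than the linear ratio for the compressor; that difference is exactly the economies-of-scale effect the text describes.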

Analogy estimates tend not to be highly accurate. On the other hand, they tend not
to be highly inaccurate, either. They are robust in the sense that they will seldom
lead to gross error. However, a necessary condition of robustness is that the
historical comparison must be at least roughly equivalent.

13
A useful approximate rule of thumb for many mechanical items that are similar except for size is that
CM = CN(M/N)^(1/2), where CM is the cost of the larger one, CN is the cost of the smaller one, and M/N is
the size ratio. N and M must be in the same units, such as pounds, inches, horsepower, etc.

In estimating, equivalence is a catchall term that covers a host of possible necessary
adjustments, such as adjustments for complexity, level of difficulty, learning
associated with production quantity, currency inflation, currency exchange rates,
quality level, documentation required, reliability, technology maturity, etc. An
experienced estimator can usually detect most of the key adjustments that are
needed for equivalency and make them with reasonable accuracy, while an
inexperienced estimator is more likely to miss some, or to make them inaccurately.14

Analogy estimates are commonly used early in a project to get a rough idea of cost.
They are also used to trade off options being considered for the baseline design.
They are generally fast and inexpensive. Often, the estimator can do them with only
a little help from others.

1.4.4.4 Parametric models. Parametric estimating is analogy estimation gone
sophisticated. This form of estimating uses models that contain mathematical
equations and algorithms. Some parametric models are complex, but not all are.
Very simple ones are sometimes called rules of thumb. An example of a rule of
thumb is that residential construction in a certain area currently costs about $120 a
square foot. The mathematical equation would be:

Cost in $ = 120 x number of square feet

Here is a slightly more sophisticated parametric model:

Man-months of effort to develop software = 2.4 × (KSLOC)^1.05

(In this equation, KSLOC stands for thousands of source lines of code.)

The two examples just cited are too simplistic to give very accurate results, because
many other variables can affect the costs being estimated. Much more sophisticated
parametric models are available, however. Some of these require inputs regarding
30 or more project parameters. For example, a sophisticated model for estimating
software development costs might ask for information about all of the following
parameters before it produces an estimate:15

Required software reliability
Data base size
Execution time constraint
Main storage constraint
Virtual machine volatility
Analyst capability
Application experience
Programmer capability
Virtual machine experience
Programming language experience
Modern programming practices
Use of software tools
Required development schedule

14 Design engineers and others often try to generate their own cost estimates for tradeoffs. They
assume that cost estimating is simple, especially estimating by analogy. They commonly overlook
issues of equivalence, and these issues can have huge effects. My recommendation is to beware of
estimates not checked by the project's lead estimator.
15 This list is from the public domain Intermediate COCOMO model published in Boehm, Software
Engineering Economics, Prentice Hall, 1981.

Some parametric models are sophisticated enough to provide features such as:

Currency inflation tables
Currency conversion tables
Cost spreads over time
Cost allocations by department
Cost allocations by labor category
Cost allocations by project phase
Risk analyses
Estimates to complete for partially completed projects
Project schedules
Manpower constraints
Custom reports

Sophisticated parametric models are almost invariably computer models. Most of
them can run on personal computers. They are far too complex for manual
computation.

"History repeats itself. That's one of the things wrong with history."

Parametric models typically are developed from statistical analysis of large pools of
data. Where statistical analysis cannot provide needed information, sometimes
expert opinion is incorporated into the model using techniques such as the Delphi
method.

A project team wanting to use a parametric model basically has two choices: 1) buy
a license for or otherwise obtain a commercially available model, or 2) create a
parametric model based on their own historical experience.16 The latter choice is a
poor one for most teams, for these reasons:

Most project teams have only a small historical database to work with, compared
to the data available to commercial modelers
Models can quickly become obsolete and inaccurate if not maintained, and
maintenance is expensive; project teams tend to neglect model maintenance
Most teams don't have access to people with good model building skills.

16 Some trade associations accumulate project cost and other data from members and organize it into
what might be considered parametric models, or at least relatively sophisticated analogy models.

When should you use a parametric model? The following should all be considered:

A model must be available for your industry and product type. I am aware of
excellent models for development, production, and support of sophisticated or
complex hardware (especially for aerospace and defense or high-end commercial
use), all commonly used manufacturing processes (and a few not so commonly
used), development and maintenance of software, and construction. High quality
models also exist for the situation where software is integrated into hardware.
The model builder should have convincing information as to accuracy (if they
have been in business for several years, they probably do).
The model builder should provide excellent technical support. You will
understand why if your model screws up the night you are finishing a critical
estimate.
Be careful in committing contractually to a cost or schedule generated solely by a
parametric model. It is best to always cross-check a parametric estimate using at
least one other method. This is especially true if the parametric model does not
have a history of successful use in your team.
Most highly sophisticated parametric models must be applied at the component
level of a product to get good accuracy. They generally cannot be expected to
work well if you start at the subassembly or total system level. However, if you
make your inputs at the component level, a good model will aggregate costs and
schedules to higher levels, including costs of integration, assembly, and test.
Parametric estimates are usually quick and inexpensive compared to bottom up
or standards estimates, even taking into account costs of licensing the model and
training people to use it.
Parametric models can provide a competitive advantage in the following sense.
When design options are being considered and traded off, parametric models can
usually produce cost, schedule, and sometimes risk estimates with reasonable
accuracy much faster and more economically than any other method. This
permits the examination of more options in a given time than would otherwise be
possible. The examination of more options increases the likelihood that a team
will hit upon a design that is better and more cost effective than its competitors.
This can be a major factor in winning competitions. Of course, this advantage
can vanish if your competitors also use parametric models effectively.
It is best if the model you acquire has calibration features. The raw output from a
model typically represents approximate industry average costs and schedule.
Your teams may consistently perform above or below average on certain types of
project. Calibration features allow you to tailor the model to your experience,
providing a more accurate result. But note that calibration is a fairly complex
matter that should be handled only by a skilled estimator who is familiar with the
model and its characteristics.
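As a highly simplified sketch of the calibration idea (real calibration, as noted above, is more involved and model-specific), a single multiplicative calibration factor might be derived from past projects like this; the numbers are hypothetical:

```python
def calibration_factor(history):
    """Average ratio of actual cost to raw model estimate over a
    team's completed projects; history is a list of
    (model_estimate, actual_cost) pairs."""
    ratios = [actual / estimate for estimate, actual in history]
    return sum(ratios) / len(ratios)

# Hypothetical history: this team runs about 10% over the raw
# industry-average output of the model.
k = calibration_factor([(100, 112), (250, 270), (80, 88)])
print(round(k, 2))     # 1.1
print(round(k * 500))  # 550 -- calibrated version of a raw estimate of 500
```

A real model's calibration machinery typically adjusts individual parameters rather than applying one blanket factor, which is why the text recommends leaving it to a skilled estimator.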
Some parametric models may be able to help reduce risk by
virtue of their ability to express cost (and sometimes schedule)
risk quantitatively. When risk is made visible by the model, it may call attention to
a need for risk mitigation.
Sophisticated parametric models require training, typically a few days or as long
as a week. Training is best given to people who are already experienced
estimators, although engineers and certain others may also benefit from it.
Training should be supplemented for a period of time with on-site support from
the model builder. This tends to eliminate the possibility that the model will be
misused due to misunderstanding of its functionality.
If you are going to depend on a parametric model, send more than one person to
training. If your sole model expert is sick, or goes to work for someone else, you
can be crippled.
The parameters used by the model must be parameters whose values are
available to you when you need to do the estimate. Typically, parametric model
builders are careful to select a minimum set of parameters that 1) you will tend to
have values for very early in the project, and 2) will provide a reasonably
accurate estimate. For example, certain hardware models use as parameters
product weight, materials used to make the product, machining and assembly
tolerances, integration complexity, experience levels of the development and
manufacturing teams, production quantity, design and manufacturing tools
available, and other parameters about which you are likely to have early
information based on sketchy design and programmatic information. The ideal
for a parametric model is to quickly get the best possible estimating accuracy out
of the information you are likeliest to know in the project's early stages. Some
parametric models have knowledge bases that provide average defaults for
parameters about which you temporarily may not have any information. Thus
you can always get a reasonable estimate with an absolute minimum of
information.

1.4.4.5 Quotes. A quotation from a supplier can be considered accurate and
virtually risk free under certain conditions. All of the following should be considered:

The supplier is known, stable, and trusted.
The quantity required that you have specified and that has been quoted is known
to be stable. If it may be unstable, you should have the supplier quote more than
one quantity. You will have to decide what to do about quantity risk. Be aware
that quantity errors are common in projects. They can arise in many ways.
The quote is fixed price, and guaranteed for a time, typically six months. You are
sure your procurement will take place before the quote expires.
You and the supplier both clearly understand what is included in the quote and
what is not.
The supplier has inventory to meet your needs, or can otherwise guarantee
timely delivery.

It is tempting to consider a quote to be risk free just because it is a quote from a
trusted supplier. Sad experience often shows otherwise. A major reason why
quotes don't turn out to be accurate is because the vendor providing the quote was
given inaccurate information about quantity required. Another frequent problem is
inaccurate information about need date.

The above criteria that apply to quotes generally also apply to bids and proposals.
Inherently risky to the buyer are cost plus and time and materials bids or
proposals.

1.4.5 Unintended consequences.

"If everything seems to be going well, you have obviously overlooked something."

An unintended consequence is something that happens unexpectedly as a result of
a decision or action. The main reason unintended consequences occur is that
people take action or make decisions without thinking very hard about the
consequences. Nobody is smart, clever, or knowledgeable enough to think of every
possible consequence of an action. In chaos theory, we are told that a butterfly
flapping its wings in Asia could cause a hurricane in Florida. We certainly can't
blame a butterfly for not thinking of that, and we couldn't blame a human, either.

Still, we can hold humans responsible for outcomes that a reasonable person should
have been able to foresee. The law requires that of the unwashed masses, so
certainly project teams, which are often led and manned by our best and brightest,
should be even more accountable. Project teams should catch a lot more
unintended consequences than they do, especially when the outcomes are bad.

There are several ways to combat unintended consequences. Unfortunately, many
of them are not applied when they should be. Here are a few:

Lessons learned programs, in which problems encountered and their solutions in
real projects are logged and taught to future project teams
Risk brainstorming sessions, which encourage people to think expansively and
outside the box about what could go wrong, not just within the project, but also
in its environment
Safety and reliability programs, which seek to identify all possible failure modes
and describe their effects, so that steps can be taken to avoid them.

Army axiom: "Any order that can be misunderstood will be misunderstood."

Unintended consequences don't necessarily occur while a project is active. They
may occur long after it is done and gone. But they can be costly and damaging
nevertheless, and liability may linger for years.

1.4.6 Lack of communication.


Continuous and effective communication is a central requirement
for effective PRM. Think of a police SWAT team without their
personal radios and not allowed to use their hand signals. Someone on the team
would be much more likely to be killed or injured. Think of an aircraft carrier without
radar, intercommunications systems, radio, and engine telegraphs. Airplanes would
crash, the ship could run into a pier, and the enemy could approach without being
seen.

Today, when a new project organizes and moves into offices, someone may
question whether the team needs a break room or a child's day care area, but hardly
anybody would question that each team member of any standing needs a telephone.
And the team will also likely have a fax machine, a copier, and e-mail. Even more
sophisticated fast means of communication are likely to be available, such as the
Internet. Some of these are things that no team would have had even 25 years ago.
One hundred years ago, the list would have been shortened down to virtually
nothing.

Today we place high emphasis on the means or media of communication. We
readily justify the expense of all of the latest gadgets and software. Individual team
members often buy with their own money items that the team itself is not willing to
buy, so they can stay more in the loop. A prime example is cell phones. Another is
personal digital assistants.

In spite of all the gadgets, projects have a lot of problems with communications.
Even as the media become increasingly sophisticated, the communications
themselves often are prolix, confused, or unclear. Sometimes they are just
thoughtless. Here are a few examples you may have experienced, ranging from the
merely irritating to the serious:

Someone leaves a message on your answering machine. At the end of the
message is a phone number you are supposed to call. It is recited so quickly that
you have no opportunity to grab a pen and paper and write it down. You save
and replay the message, with pen in hand and paper at the ready. Still, the digits
of the phone number are recited so rapidly you can't distinguish them. You invite
the person in the next office to listen to the message. She can't sort out the
phone number either. You either ignore the call or waste time trying to track
down the person who placed it.
A similar phenomenon can occur when people leave their names. Virginia
James comes through well enough on a single try (or maybe it's Jaymes?), but
what about Evin Stump? The latter needs to be spelled. Even then, many will
get it wrong. Ivan Strump? Evan Stunt? Kevin Stumpy? Even Stamp? If it's
important to get something right, it's worth taking the time and trouble of getting it
right.
A vital engineering report is scrupulously accurate in its equations and in its data.
But the explanations of what was done are incomprehensible. They are written in
third grade English. (How is it that some people are able to earn master's and
even doctoral degrees in challenging disciplines, but cannot, or will not, write
clear, literate English?)
A small group of senior engineers is doing leading edge work
on what the next product design baseline should be. They
arrive at what they believe to be a truly elegant design solution. They meet with
the project manager, who formerly headed the advanced design group, and who
talks their language. He wholeheartedly approves of their design solution. Two
weeks later, it is discovered that half the people in the project are still doing work
that applies only to the previous baseline. Work on the new baseline has lagged.
The change in baseline was not communicated throughout the team. (This was a
personal experience of the author.)
A part of the team located in Houston discovers a serious potential problem, and
proceeds to fix it. But the part of the team in Alaska is not aware of the fix.
Unfortunately, the part of the team in Bahrain takes action that renders the fix not
doable. The consequences are serious.
A personnel directive about vacations is issued three times because it provided
the wrong information two times. Some people acted on the incorrect
information. Key people go on vacation when they are urgently needed at work.

1.4.7 Failure to do due diligence.

Due diligence is an expression that seems to have come to us from the financial
world, especially the world of corporate acquisitions and credit and veracity checks.
The message seems to be that when an important transaction is contemplated, we
need to be especially careful to learn and verify the facts surrounding the
transaction, then to build safeguards into the transaction that minimize the chance of
failure. Failure to do due diligence may result in a transaction that is not in the best
interests of one or more of the parties.

All projects of any consequence involve financial transactions, and due diligence is
important in reducing the risk inherent in any financial transaction. But projects also
require non-financial transactions and these too need due diligence. Examples of
transactions that may not have a financial component are access agreements,
promises to provide at no cost, promises to provide by a certain date, promises to
disclose intellectual property, and promises relating to quality and fitness for a
specified use.

Here are some considerations to be kept in mind in any transaction, financial or
otherwise:

What safeguards are in place to assure that the subject matter of a transaction is
suited for its intended use in the project plan?
Would it be cost effective to spend more money or acquire more resources to
increase the level of security provided by existing safeguards?
Would penalty or incentive clauses in the transaction agreement be useful in
reducing the likelihood of failure of the transaction?
What guarantee and / or litigation avoidance clauses should be included in the
transaction agreement to reduce the likelihood of failure?
If the subject matter of the transaction is not suited for its
intended use, or does not meet the needs of the project plan,
will that become known in a timely manner, such that timely corrective action is
not foreclosed?
If the subject matter of the transaction is not suited for its intended use, will the
other party take timely and complete corrective action? If not, can the project
team correct the problem and still have a successful project?
Will any delays in corrective action be accommodated by slack in the project
schedule and reserves in the budget?
If safeguards fail, what tasks in the project will be impacted, and how badly?
Which of these tasks are on the project critical path?

1.4.8 Micromanagement.

I believe micromanagement is most usefully defined as an attempt to control details
of an outcome by a person or group other than the people who best understand what
needs to be done. Most commonly, it is realized as an attempt by a senior person or
group to interfere with the work being done by more junior people, even though the
senior person is not close to the action, and may not fully understand the
circumstances. The consequences of micromanagement can be serious.

Sometimes very junior people complain of micromanagement, even though the
senior people who are interfering may have good reason to do so. Sometimes
senior people recognize that the junior person is making mistakes that will have
serious consequences if they are not promptly corrected. This seems to imply that
bad micromanagement can cause problems, but good micromanagement can
stop problems. How do you know which is which? How do you stop bad
micromanagement and promote good micromanagement? My thought is that this
is the wrong way to look at the problem. What you want to do is stop all
micromanagement, insofar as possible. You want all management to be done by
the people who are closest to the action and who best understand what to do, even
if sometimes the formal chain of command is temporarily broken.

"When two people meet to decide how to spend a third person's money, fraud will
result."

But what if the person closest to the action is so green that he or she doesn't
really understand what to do? My answer to that is a question: why did the project
team allow that to happen? Why is a green and inexperienced person put into that
situation, when the team has (or should have) people who would know what to do?
Why hasn't the team trained its people so that anyone in a position to make
decisions knows what to do? Why isnt a green person paired with an experienced
person until he or she gains the knowledge and confidence to make good decisions?

I believe the proper role of project management is to set policy and establish
strategic ground rules for the team. Except in the smallest projects, project
management does not get involved in day-to-day tactical
problems, unless the team feels it needs to push them upstairs because it can't
handle them. Most of the time, it should be able to handle them.

Exhibit 6
The Purloined File

Jane Doe was recently hired as an administrative assistant at a large firm in Los
Angeles. She is responsible for maintaining central files. Among the files in her custody
are files containing highly confidential information relating to pricing of proposals made
by the company to government agencies.

After just a few days on the job she got a call from Mr. Hannity. Mr. Hannity wanted to
know what happened to the pricing file on the XYZ project that he asked to have sent
overnight mail to his hotel room in Washington, DC. He urgently needs it, he said, for
negotiations with the customer.

Jane explains to Mr. Hannity that his request for the file must have gotten lost, because
she doesn't remember seeing it. And besides, she explains, only Imogene Thatt,
manager of pricing, can release the file. She tells Mr. Hannity he will have to get her to
sign a release memo.

That same afternoon, a release memo bearing the signature of Imogene Thatt appears
in Jane's company mail envelope. It instructs her to send the file by overnight mail to
Mr. Hannity at a hotel in Washington, DC. Jane complies.

A few days later, a woman comes by Jane's workstation and introduces herself as
Imogene Thatt. Imogene asks for the pricing file on the XYZ project. Jane finds a
sign-out sheet showing that it was sent to a Mr. Hannity in Washington DC. A quick
investigation discloses that there is no person named Hannity working for the company,
and there is no Mr. Hannity registered at the hotel. He checked out the day the
overnight mail containing the file would have arrived. He paid cash for the room and left
an invalid forwarding address. The signature on the authorizing memo was forged.

1.4.9 Unethical behavior.

Unethical behavior includes all forms of lying and stealing. I would go so far as to
say that it also includes violations of the Golden Rule, at least the more serious
ones. That would include greed and overly aggressive and damaging behavior.

Many organizations today have found it beneficial to stress the importance of ethical
behavior on the part of employees, and to establish penalties for unethical behavior.
These rules and sanctions have undoubtedly had a deterrent effect. But sometimes
projects have special vulnerabilities to bad behavior that make it wise to go a bit
further in the direction of prevention.

One obvious consideration is that non-employees can also damage a project. This
is especially true of non-employees who are former employees. Sabotage of a
project is also a possibility. Most of us work in environments
where we believe sabotage to be extremely unlikely, which tends
to make us all the more vulnerable if someone thinks the stakes are high enough to
commit an act of sabotage.

Sabotage does not necessarily mean that someone will come into your office or
factory or site and destroy or steal something. It can be much more subtle than that.
For example, someone might call one of your less cautious employees and make a
seemingly casual inquiry about your processes for doing certain kinds of work.
Using this information, the saboteur proceeds to set up a situation that causes the
process to fail. An example is given in Exhibit 6.

Here are some other examples of unethical conduct intended to wrongfully modify
the outcome of a project:

Modification of a test or financial result, usually to show a better outcome than
actually happened
Off-loading costs to other projects; known to happen especially when a company
has both fixed price and cost plus projects (the cost plus projects unethically and
illegally absorb cost overruns from the fixed price projects)
Intentional administrative mistake, usually designed to slow down the project so
that the perpetrators have time or opportunity to modify a project outcome or to
cover evidence of wrongdoing.
Theft of vital project resources or information.

1.4.10 Regulatory problems.

Two kinds of risks are connected with government regulations:

Unsuspected difficulty in dealing with regulations now on the books.
The difficulty of dealing with new regulations that may come into being
unexpectedly while the project is in progress.

Not all kinds of projects have to deal with significant regulatory problems, but most
major projects do, in one or more ways. The problems can range from as minor as
getting a building permit to as difficult as filing an environmental impact statement in a
contentious situation.

Projects that frequently have to deal with government agencies, especially those of
foreign governments, do well to stay abreast of the agency's thinking about
regulatory issues. If changes are coming, it is best to get involved and make your
voice heard at hearings held by the agencies to get public input.

Regulatory problems in foreign countries can be especially vexing. The safest
course is to have in-country partners who are well enough connected to see these
coming and knowledgeable enough to know how to deal with them.
1.4.11 I forgot.

It seems that every major project has one or more "I forgot" stories. Often, it is a
forgotten task that was left out of the planning. Sometimes, it is a needed resource
that nobody remembered to acquire until its absence became painfully obvious. At
other times, it may be a vital communication that was overlooked until it was too late.

At one time my view of "I forgots" was relatively benign. I viewed them simply as
holes in the plan that weren't really risks, but just items that needed to be put into a
tickler file and promptly taken care of by somebody. My view has changed. "I
forgots" can do real damage and should be called risk drivers. The tough problem
is, how do you convert "I forgots" into "somebody remembereds"?

Consider a heavily loaded aircraft preparing to take off. At takeoff with a heavy load,
the aircraft's flaps are supposed to be set at a certain position. It is the co-pilot's
responsibility to do this. He forgets to do it, and the aircraft overruns the end of the
runway and crashes into a building, destroying the airplane, the building, everyone
on board the aircraft, and several people in the building. The subsequent
investigation reveals that there was no pre-flight checklist in the cockpit. The
presumption is pilot error. But could you call it pilot error if no pilot ever used a
checklist? If it wasn't an accepted, even mandated, practice? You would have to
call it an "I forgot." That's what often happens in projects. Failures to have safety-
ensuring processes are mistakenly called "I forgots."

Project teams must share the blame for "I forgots." The blame should not be
assigned to one or a few people in whose area of work the "I forgot" occurred.
Project teams should be aware by now that "I forgots" occur easily in the rush to get
a project going and completed. Yet only rarely do I hear of a project creating and
using a readiness checklist, or any kind of process to be sure that things that need to
be done got done.

To be sure, aircraft takeoff checklists are used repeatedly, and have evolved until we
are pretty confident nothing important has been left off. A project, on the other hand,
is only done once, so how can we be sure nothing has been left off its checklist? My
answer to that is project readiness reviews and simulations, as discussed in chapters
3 and 4. These are the project equivalents of aircraft checklists.

1.4.12 Advancing the state-of-the-art.

Most project work uses well-established or at least proven technology. But there are
projects that intentionally move in new directions. They attempt to advance the
frontiers of knowledge. Because they are to some extent ventures into the unknown,
we cant be completely confident of the outcome.

Unless there is extreme urgency, it is wise to take small steps toward new
technology. Instead of creating a single project that goes in one jump from
scientist's dream to an operational system, one should create a research project that
demonstrates the basic idea, then perhaps two or three follow-on projects that
gradually increase knowledge of how to apply the idea. Only when
the idea is well understood should there be an attempt to
commercialize or otherwise implement it as a mature system.

Projects that try to bite off too big a chunk of new technology in a single project
often fail in one or more respects. They are prone to both cost and schedule
overruns, and they also often fail to produce the desired technical results, due to
resource shortages.

I cite here an extreme example just to make the point. Suppose that the U.S.
government were to decide to create a single massive project whose goal is to set
up, within 10 years, a number of power generating plants based on the principle of
nuclear fusion. The budget is $50 billion.

Exhibit 7
Two Airplanes in One

The C-17 cargo aircraft, a product of Boeing, is now an effective and successful part of
the U.S. Air Force inventory. But in the early stages of development, the project almost
failed.

Originally the cost of development was supposed to be on the order of $2 billion. The
eventual cost of development was about $5 billion. There were several reasons for this
huge cost growth, but probably the biggest driver of cost growth was the difficulties
experienced in trying to integrate all of the new ideas into one aircraft.

Cargo aircraft usually are either strategic or tactical in nature. The C-17 is both. In a
sense, it is two airplanes in one. It can carry heavy payloads for long distances to get
people and materials from the United States to a distant combat zone, and it can also
carry payloads to rough airfields right in the combat zone, to provide close-in tactical
support.

Where other cargo aircraft may have five or more crewmembers for a certain mission,
the C-17 will have only three. This is a huge savings in operational cost, but it had to be
paid for in the development of sophisticated, automated cargo handling equipment that
could be operated by one person.

Other new and unusual requirements also added to development cost, such as being
able to back up an incline, and turn in a very short radius.

Under present circumstances, the project will almost certainly fail. The reason is that
controlled nuclear fusion is a technology that clearly has a long learning curve.
Some doubt it will ever be practical for useful power generation. A more reasonable
approach is what is actually being done now: funding the research at a modest level,
with no pass/fail criteria. It's a "let's see what happens" as opposed to a "let's make
it happen" approach. We are gambling small amounts we can afford to lose in hope
of winning a scientific lottery.
1.4.13 New integrations.

New integrations refers to the possibility that project goals will require the project
team to integrate existing technologies in ways they have not been integrated before.
It also refers to the possibility that the project team will choose a design solution that
requires such integrations, even though the goals do not.

Well-established technologies may not be particularly risky in and of themselves, but
combining them in new ways can be very risky, simply because of the unexpected
difficulty of doing it. An occasionally cited rule of thumb is that any project that
combines more than two new things is very risky. While I cannot vouch for this rule
of thumb, I can cite an example where multiple new integrations eventually were
successfully done, but the project was almost cancelled due to huge cost overruns.
See Exhibit 7.

1.4.14 Make versus buy.

The make versus buy issue arises most famously in projects involving manufacture,
where the issue is whether to build something in-house or contract it out. Actually,
the issue has a much wider scope than manufacturing. Similar decisions can arise
in software projects, construction projects, provisioning projects, and in fact in almost
any kind of project.

This issue clearly exposes the dual nature of risk. There can be both upsides and
downsides to either decision. Here are some of the potential upsides to a make
decision:

Learning and technology grow internally, increasing the internal knowledge base
May prevent layoffs in the workforce and shutdowns of certain capabilities
May lower internal overhead rates, making for a more competitive posture
May permit better management control of the work, reducing risks

Here are some of the potential downsides to a make decision:

May require unwanted hiring and overtime work
May overtax existing facilities
May tie up a lot of management time
May result in higher costs and poorer quality because of unfamiliarity with the
product

The upsides of the buy decision are the mirror images of the downsides of the make
decision, and vice versa.

1.4.15 Items furnished by customers or others.

In an attempt to save time, money, or both, the project customer or sponsor
specifies that the project team, in the execution of the project, must make use of
certain infrastructure, equipment, systems, processes, materials, or people that are
already available to the customer / sponsor. This situation may be fraught with risk.

What is offered for use is often too little, too late, the wrong thing, or defective. Even
if it is available on time, is not defective, is the right thing, and is in the right amount, it
may be strange and unusual to the project team. They may have to spend a lot of
time and energy becoming familiar enough with the item to use it effectively. Some
or all of that time and energy may not have been included in their plan.

What is true of customer furnished items can also be true of what are commonly
called commercial off the shelf (COTS) items. The COTS problem has lately
received a lot of attention in software projects. The problem typically goes
something like this: Instead of writing code to perform a certain needed functionality
in a software program, the team elects instead to use a certain COTS product that
has that functionality. The team's tradeoff to reach this decision was that the COTS
license would only cost $1,000, but to write the code themselves would require about
$20,000 in labor. So they buy a license for the COTS product.

Exhibit 8
Customer Provided Computers and Software

A sad example of ineffective customer provided material came to my attention
recently. It involves an American construction management company that was
engaged to build a large bridge in an Asian country. The in-country sponsor
told the construction company that it would provide personal computers and
software for project management purposes, such as scheduling, estimating,
budgeting, etc.

The American team arrived in-country and set up shop. They were extremely
hampered in their operations for several weeks because the promised
computers were not available. Finally, the long awaited computers arrived at
their offices. It turned out that they were old computers, going back to the time
the original IBM PCs came on the market. Their memories and processing
speeds were totally inadequate for modern project management software. And
the monitors did not do color.

So the team quickly went out and bought one up to date computer so it could at
least get started in its work. They loaded the software that had been provided
by the customer. It was up to date, modern software. There was only one
problem: all screens were in the language of the host country. Not one person
on the American team could speak or read that language. This comedy of
errors caused weeks of delay, and thousands of dollars in unexpected costs.

They then discover that the supplier of the COTS software is not very forthcoming
about how to engineer it into their system, so they have to do a lot of reverse
engineering to make it work. They also discover that the COTS
supplier is just about to issue a new release and soon will no
longer support the old release. Finally, they discover that they have to write
hundreds of lines of so-called glue code to make the COTS stick properly to their
software program. By the time they are through, they have spent $50,000, when
they could have done it themselves for only $20,000.

These kinds of problems don't always occur with customer furnished items, COTS,
etc., but they occur frequently. A project team should always carefully scrutinize any
decision about using unfamiliar items of any kind, whatever their source, when it is
possible to use a more familiar approach. I am not trying to imply that a COTS
product will not be a better solution; merely that it should be investigated carefully.
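One way to scrutinize such a decision is to compare expected costs over a range of outcomes rather than nominal price tags. The sketch below is purely illustrative; the scenario probabilities and cost figures are assumptions invented for this example, not data from the story above.

```python
# Illustrative expected-cost comparison for a make-vs-buy (COTS) decision.
# All probabilities and dollar figures are hypothetical assumptions.

def expected_cost(scenarios):
    """Sum of probability-weighted costs over mutually exclusive scenarios."""
    return sum(p * cost for p, cost in scenarios)

# "Make": write the code in-house. Nominal $20,000, with some overrun risk.
make = [
    (0.7, 20_000),   # goes roughly as planned
    (0.3, 30_000),   # typical overrun
]

# "Buy": $1,000 license, but integration risk (glue code, reverse engineering,
# an unsupported release) can dominate the nominal price.
buy = [
    (0.5,  1_000 + 5_000),    # smooth integration
    (0.3,  1_000 + 20_000),   # significant glue code needed
    (0.2,  1_000 + 49_000),   # worst case, like the $50,000 outcome
]

print(f"Expected cost, make: ${expected_cost(make):,.0f}")
print(f"Expected cost, buy:  ${expected_cost(buy):,.0f}")
```

Even with the worst case included, the buy option can still win in expectation; the point is that the decision should rest on the distribution of possible outcomes, not on the naive $1,000 versus $20,000 comparison.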

1.4.16 Procrastination.

Most likely we have all had the experience of having to carry out a task working with
other people who have not held up their end of the pole. It can be most frustrating.
All you need from them is a small action they could take care of in a few hours or
perhaps even minutes, but days go by and what you need from them simply does
not get done. You know they are aware of the responsibility they have to you. You
may think they should feel guilty about not taking care of business. You call and
leave messages. You may get excuses and renewed promises, like, "You know,
we've been up to our neck in alligators, but we'll try to get to it in the morning."

Procrastinate today! Tomorrow may be too late.

Eventually, a crisis time comes. What they have failed to do is about to keep you
from meeting a critical deadline. A failure to meet the deadline will result in costs,
delays, inconvenience, and workarounds. It may even result in damage to
reputations. With luck, at the last minute the other people come through with what
you need and the day is saved. Or, they don't come through and disaster happens.

If you have sufficient leverage, such as recognized command authority, or maybe as
the source of project money, you can often overcome procrastination by simply
ordering "do it now," with an implied if unspoken "or else." But with little or no
leverage, the problem becomes one of persuasion. My approach to this problem
usually is to first try to find if there is a hidden reason for non-cooperation that needs
to be resolved, such as a real or imagined slight, or maybe an unpaid invoice. If that
seems not to be the case, I inquire into their list of priorities, and where in that list my
work stands. Most often the priorities are vague or personal, and rather than admit
that my important work ranks below an office birthday party, my work will get done.

1.4.17 Resource shortages.


Over the years, I have observed or heard about many problems with resource
shortages. While some of these relate to items such as raw materials, office space,
computer chips, and the like, the one that jumps to mind as being pervasive is
people. Here's an example. Many project managers have had the experience that
their project got off to a slow start because they couldn't get the people with the
right skills early in the project. And once they got off
to a slow start, the project was always playing catch-up afterward. The delays
created extra costs, and sometimes affected product quality.

Two phenomena seem to exacerbate this problem. One is that in many projects
there is something I call a skill mix transition. The project plan is laid out in such a
way that the early tasks require people with the highest skills (some organizations
call them "superheroes"). Later, these people may drift off to other project teams and
be replaced by people with lower or perhaps simply different skill levels. The other
factor is that highly skilled people are expensive, and most organizations can only
hire a limited number of them. Or if they could afford to hire them, they can't find
them. They tend to be a scarce commodity.

The overall result is that when the project team is being mustered, the highly skilled
people already have other assignments, from which it is difficult to get them
released. And if they do become available, there is usually a fight about who gets
them for their project.

Many if not most resource shortages arise due to project plans that are out of touch
with reality. The failure is often one of due diligence. If any resource is critical to a
project, its timely availability should be carefully checked. Sometimes it is cheaper
and safer to reserve a critical resource by procuring it in advance (if that is
possible) rather than waiting until it is needed, then finding it is not available.
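The reserve-in-advance tradeoff mentioned above can be framed as a simple expected-value comparison. The figures below are hypothetical; a real analysis would substitute project-specific estimates.

```python
# Comparing "reserve a critical resource in advance" against "wait and see".
# All figures are hypothetical assumptions for illustration.

carrying_cost = 15_000               # cost of procuring and holding the resource early
p_unavailable = 0.25                 # chance it cannot be had when finally needed
delay_cost_if_unavailable = 120_000  # schedule slip and workaround cost if it isn't

cost_reserve = carrying_cost
cost_wait = p_unavailable * delay_cost_if_unavailable  # expected cost of waiting

print(f"Reserve in advance: ${cost_reserve:,}")
print(f"Wait (expected):    ${cost_wait:,.0f}")
print("Cheaper choice:", "reserve" if cost_reserve < cost_wait else "wait")
```

With these invented numbers the early reservation is cheaper in expectation; with a lower probability of shortage, waiting would win. The value of the sketch is that it forces the team to state the probability and the delay cost explicitly.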

1.4.18 Revenue forecasts.

In projects where the team is graded on revenue, a common problem is that revenue
expectations are not met. Revenue forecasting is inherently risky because so many
factors influence it.

Typically, the situation is that the project team has developed a product that
hopefully other people can be enticed to buy. The attitude of the team is that of
course people will like and buy it, because "we're a great team and we built it!" But
many things can get in the way of that expectation. Here are a few of them.

The intended market simply does not like the product. It has other options it likes
better.
The price is set so high that sales are reduced below expectations.
The price is set so low that there is no profit.
The people who did the forecast simply did not do enough research or ask
enough questions.
The product is late to market and encounters unexpected competition.
The distribution or deployment planning is flawed; the product
has failed to reach customers in a timely fashion.
The economy enters a recession and slows demand.
The marketing plan does not square with reality.
The sales force is not sufficiently motivated to push this product; they have other
priorities.
The competition unexpectedly introduces an effective competing product.
The new product competes with existing products of the same firm, dragging all
of them down.
Existing sources of competition were overlooked.
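A common hedge against the fragility of a single-point revenue forecast is to express the forecast as a range, for example by Monte Carlo sampling over the major uncertain factors. The sketch below is a minimal illustration; every distribution and parameter in it is an assumption invented for the example, and a real forecast would calibrate them from market research.

```python
# Minimal Monte Carlo revenue forecast over a few uncertain factors.
# All distribution parameters are hypothetical.
import random

random.seed(42)

def simulate_revenue():
    # triangular(low, high, mode): pessimistic, optimistic, most likely unit sales
    units = random.triangular(5_000, 30_000, 12_000)
    price = random.uniform(40, 60)      # achievable unit price
    if random.random() < 0.2:           # assumed 20% chance a rival launches first
        units *= 0.6
    return units * price

trials = sorted(simulate_revenue() for _ in range(10_000))
print(f"10th percentile: ${trials[1_000]:,.0f}")
print(f"Median:          ${trials[5_000]:,.0f}")
print(f"90th percentile: ${trials[9_000]:,.0f}")
```

Reporting the 10th/50th/90th percentiles instead of a single number makes the downside visible to whoever grades the team on revenue.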

1.4.19 A system risk checklist.

The preceding sections of this chapter have attempted to increase your risk
awareness by describing risk situations that often arise in real projects. I could
continue in this manner for another 100 pages and still not name all of the possible
problems that could arise.

The closest thing I have seen to an exhaustive checklist of sources of project risk is
a list that was created by the Risk Management Working Group of the International
Council of Systems Engineers (INCOSE). That list is reproduced below. Probably
the best way to use it is as a source of ideas in a risk brainstorming session.

Exhibit 9--INCOSE System Risk Checklist

PROJECT
Administration
Certification (customer procedure)
Change management (requirements creep)
Competition
Configuration management
Contract
Contract performance
Cost
Customer relations
Earned value
Environmental
Facilities
Financial markets
Funding (customer, builder, investor, partner)
Industrial benefits (local content)
Information management
Infrastructure
Legal (intellectual property, trademarks, product liability)
Logistics
Market analysis
Market window
Operations and maintenance
Partners
People (manpower) skill mix
Performance management system
Personnel management
Personnel development
Political
Quality assurance
Reliability
Resources (computer, people, facilities)
Safety (product, OSHA)
Schedule
Security (product, environment, computer)
Sponsor
Standards compliance
Strategic plan
Suppliers
Subcontractors
Training

PROCESS
Activity description
Entry criteria
Exit criteria
Feedback from process users
Flow diagram
Inheritance (reuse, tailored)
Inputs
Interfaces
Lessons learned
Maintenance
Maturity
Meets needs
Metrics (performance)
Outputs
Ownership
Process defined
Process elements
Process exists
Roles and responsibilities
Standard compliance
Tools
Top-level description

PRODUCT
Activation plan
Allocation
Architecture
Concept of operations
Commercial off the shelf (COTS)
o Long term availability
Deactivation plan
Decision (make vs. buy)
Declassification procedures
Detailed design drawings
Documentation
End product
Environmental impact assessment
Functional model
Improvement
Integrated disciplines / integrated project team
Integrated master plan
Integrated master schedule
Interface definitions
Integration
Logical model
Hardware
Human factors
Maintainability
Manufacturing sources
Obsolescence and life cycle cost
Operations and maintenance plan
Physical model
Producibility
Program management plan
Request for proposal
Requirements document
o Ambiguity
o Completeness
o Compliance (format, content)
o Consistency
o Implementation independent
o Testability
o Traceability
Reliability
Requirements traceability matrix
Risk management plan
Security (network, product)
Safety analysis
Safety of use
Safety plan
Software
Specifications
Statement of need
Statement of work
Survivability
System design
System engineering master plan
Technology
o Availability
Test plan
Test report
Testability
Traceability
Trade study report
Training
Upgrade
User manual
User training
Validation plan
Vulnerability

1.4.20 Poor project execution.

Poor project execution may be the leading proximate cause of project failure. At
least, several studies have made that claim. But what is it, exactly?

An analogy that I like is the high school band when it is first organized and is first
tackling a Sousa march. Most members are off key, and they are not all at the same
point in the score at the same time. The result is more like noise than music.

A high school band with good leadership and reasonable talent will in time overcome
its execution problems. A high school band with poor leadership or little talent will
improve only marginally, or could get worse, due to discouragement.

My analogy mentions both leadership and talent. Not mentioned, but implied, is
practice at working together. All are important. Some organizations (3M and
Lockheed Martin come to mind) maintain project teams that stay together and go
from project to project. These teams have proven leadership and long experience at
working together. The team members are highly qualified and motivated. Project
execution is virtually flawless. Sometimes such teams can do in weeks what
ordinary teams need months, or even years to accomplish. Being a member of such
a team is a point of pride, and draws the respect of fellow professionals.

By the definitions used in this book, if the budget and the schedule and other goals
and plans make sufficient allowance for poor execution, there is no risk. I have seen
this happen once, on a small project, but it is rare. Almost always, the plan is drawn
up on the expectation of reasonable or at least typical execution, so the possibility of
weak execution is a risk.

Many if not most project teams are pulled together on an ad hoc basis. Members
may come on loan from functional organizations, other projects that are winding
down or shifting focus, or they may be newly hired. A few members may have
worked together before, but many will be new to the team. An experienced,
successful project manager will understand how to pull the team together and get
them working together at reasonable efficiency. An inexperienced project manager
with poor people skills will most likely execute poorly. If poor execution is a
possibility, steps can be taken to improve the situation. See chapter 3 for thoughts
on that subject.

1.4.21 Business continuity risks.


Every business is supported by some kind of infrastructure, that is,
the set of properties and systems that support business operations
on a day-to-day basis. The infrastructure for a one-woman home office business
might comprise as little as a room set aside in the home, a desk, chair, computer,
and telephone, and perhaps a checking account. For Toyota Motor Company the
infrastructure is huge: sprawling assembly plants, vast computer networks, testing
tracks and laboratories, and much more.

Projects also have infrastructure. And more often than not, a failure of that
infrastructure will have serious implications for project success. Such failures are
likely to interrupt the normal flow of business unless suitable precautions are taken.

Most business continuity issues boil down to seven areas of concern. Some
protective measures may work pretty well for all projects, but most projects should
do (or have done for them by a qualified professional) a tailored risk analysis to
determine their specific vulnerabilities and how best to protect themselves. I now
briefly discuss each of the seven areas just to create awareness. Applicability and
specific remedies should be determined by individual projects, or for a cluster of like
projects.

Disasters--Disasters that affect projects can be natural or manmade. The
principal natural disasters are due to extreme weather, earthquakes or fires.
Other possibilities are epidemics and infestations. Manmade disasters usually
are due to human error or deliberate action, such as sabotage or terrorist action.
A project should consider the most likely disaster possibilities and should analyze
their potential impact. The range of protective measures might include better
links to law enforcement and other government resources, hardened
communications capabilities, redundant sites, and emergency operations centers.
Affordable insurance is available for many types of potential disasters.
Data loss--Today projects rely heavily on data stored in various electronic or
electro-mechanical memory devices. A variety of events can cause temporary or
permanent loss of critical data. The effect of such losses, if there is no backup,
can range from small delays in isolated areas of the project, to shutting the whole
project down for days. Fortunately, this problem has received a lot of attention,
and excellent data backup capabilities are available.
Information security--Leaks of critical information to competitors can result in
being put at a competitive disadvantage. Sometimes competitors can use even
seemingly benign information to piece together valid conclusions about your likely
actions and defenses. The key to information security is awareness of all
possible paths information might take from your project to a competitor. This
must begin with knowledge of where the information resides within your project.
Once that is known, there must be cost-effectiveness tradeoffs to determine
where to erect barriers, and what those barriers should be. Erecting a huge
screen to prevent a competitor from counting the cars in your parking lot would
probably be deemed cost-ineffective. But having employees and certain others
sign patent assignment or non-disclosure agreements would probably be deemed
very cost-effective. Encryption is now available for critical information and is
usually cost effective. Typically so are paper shredding and computer firewalls.
Probably the biggest flow of critical data is out your doors in briefcases. This is
not necessarily bad, as long as you have reasonable assurances that the
destination is not a competitor's briefcase. That
assurance can never be absolute, and maximizing it may involve measures
ranging from employee education to briefcase inspection, depending on
circumstances.
Availability--Systems critical to project success often become unavailable due
to failure or the need to be down for periodic maintenance. In many projects
today, the most critical system is the system of client / server computers, and
specialized software, which support a wide range of project tasks. But other
types of system can also be of critical importance, such as telephones, electrical
power, radio communications, and even copiers and printers. Sometimes one
can schedule around periodic maintenance, but increasingly, the demands of
international business make that more difficult. The best solution is often a
combination of mitigation methods, especially redundancy and training.
Electrical power--I mentioned electrical power in the previous section on
availability, but because of its crucial importance, it deserves a separate if brief
discussion. Today, virtually all projects are critically dependent on a stable
supply of electrical power. Stability of power in the U.S. is highly variable by
region. But all regions are subject to at least occasional outages that can range
from a few seconds to many hours. Even a one second outage can cause loss of
unsaved data, or loss of a critical machine calibration. The most common
remedy is backup (redundant) systems. These usually involve batteries or
independent generators (or both) that kick in when an outage occurs. In certain
circumstances, they can also involve solar or wind power. Power stability is not
just a matter of off or on. Sometimes even when it is on, it is subject to voltage
surges that can damage equipment, or affect its operation adversely. Today,
equipment is available at usually reasonable costs to prevent this.
Facility issues--Many project teams operate in facilities with varying degrees of
physical security. The degree of physical security generally reflects a judgment
as to how important it is to keep the information contained in the facility from
falling into the wrong hands. Modern security provisions for facilities may include
one or more of the following:
o Identification badges or cards
o Biometric systems for recognition of faces, fingerprints, or retinal patterns
o Measures to minimize potential falsification of lost or stolen identification
badges or cards
o Visual recognition and confirmation by another employee
o Means for knowing where in a facility an employee is located
o Means for knowing what critical information an employee has accessed
and when
o Compartmentalization, specifically making certain parts of a facility or
access to certain information off-limits except to people with appropriate
clearances and need to know.
Emergency notification--Failures, compromises, and potential compromises of
business integrity by any cause generally need to be immediately communicated
to at least a key core of the project team. Once they get the word, they need
specific guiding principles to follow to minimize their need to be creative in the
midst of crisis. The immediate communication need can
usually be organized around some combination of landline
telephones, cell phones, radios, pagers and e-mail. Technologies available
include conference calls, semi-automated response team building, prioritization of
contact methods, and type-to-voice conversion. The specific guiding principles
often are incorporated into hardcopy documents contained in notebooks and
kept at particular stations where they are not likely to become unavailable in a
crisis. These documents typically contain lists of government agencies and other
resources in the community or in the company that can be called on, or who must
be notified, and provide names and phone numbers. They also may contain
information about specific mobilization actions that must be taken and possibly
shut-down or start-up procedures for critical equipment or processes.

Projects often rely on the business continuity processes of the organization in which
they reside. This may be sufficient, but often is not, especially for critical projects.
An alert project team will check it out.

1.5 Risk drivers

As we have seen, the word "risk" has several connotations. Sometimes when we
use it, it's easy to be misunderstood. So I have developed the concept of a "risk
driver" to increase the precision of discussions of risk. A word of caution: the
expression "risk driver" appears elsewhere in the risk management literature, but not,
as far as I am aware, with anything like the precision of meaning I give it here.

In the sense I use the expression, it is not possible for a risk driver to exist
independently of the existence of a project plan. I took some pains earlier in this
chapter to define what I mean by a plan in an ideal sense. But a plan need not
conform totally to those ideals to be subject to risk drivers. A rough analogy would
be that you can have fleas on a healthy dog, but you can also have fleas on a sick
dog. A further analogy would be that fleas are likely to be more distressing to a sick
dog than to a healthy dog.

I hereby define a risk driver as follows:

A risk driver is any root cause that MAY force a project to have outcomes
different than the plan.

Regarding this definition, please note the following:

If there is no plan, there can be no risk drivers. This is not a mere quibble.
People are known to talk with all seriousness about projects being risky when
they don't have a clue as to the plan. The right plan might reduce an otherwise
risky project to a walk in the park. (What I have called structural risk is a feeling
of angst that it may not be possible to have a low risk plan.)
The number and severity of its risk drivers define the risk level of a project. They
are like patches on the hull of a ship. They are the places leaks are most likely to
start. But without a hull, there can be no patches. It is entirely speculative to talk
about how risky a project is until its risk drivers, or at least its major ones, have
been identified and assessed vis-à-vis the plan.
Typically, the less robust (more brittle) the plan, the more impact a risk driver can
have.
Even though the skills of the project team may not be defined explicitly in a
project plan, they are nevertheless implicitly part of the plan.
Any change in plan has the potential to eliminate or modify some risk drivers and
possibly introduce others.
The more complex the project, the more risk drivers (generally). Larger projects
are usually more complex than smaller ones. Technology projects are generally
more complex than infrastructure projects.
The expression "root cause" refers to the deepest identifiable and describable
underlying source of a potential forced change in the plan. Sometimes what
appears to be the source of a potential problem is only a symptom of a root
cause. There are several reasons why we want to find and deal with root causes,
not symptoms (see Exhibit 10).
Due to the consideration of only root causes, we can reasonably regard risk
drivers to be statistically independent (that means essentially that one risk driver
does not interfere with or enhance another).

Exhibit 10
Why Do We Prefer to Deal Only with Root Causes?

It may be helpful in the short run to treat symptoms. Medical doctors do that
when they can't figure out what the root cause is, or don't know how to treat it.

But it is better in projects to ferret out the deepest root cause we can, and to
treat it. Here are three reasons why.

1. It facilitates finding an effective course of action for mitigation
2. It is easier to make a focused choice for assignment of responsibility for
risk modification
3. If quantitative risk analysis methods are in use, it minimizes the thorny
problem of statistical correlation between symptoms of risk drivers. These
interactions are very difficult to deal with, logically or mathematically.

The word MAY is capitalized in the definition to emphasize its importance. MAY
means it might or it might not. If we know that some hypothetical root cause
cannot possibly force a change in the plan, that so-called root cause is not a root
cause at all, and is certainly not a risk driver. If we identify a root cause that
absolutely WILL change the plan and there is nothing we can do about it, that so-
called root cause also is not really a risk driver. It is a perhaps unrecognized de
facto change to the plan. A characteristic of all risk drivers is lack of complete
knowledge about the future. Sometimes project teams for convenience limit the
definition of risk drivers to possible changes that have at least a 10% chance of
happening, and not more than a 90% chance. The rationale is that if something
has under a 10% chance of happening, we can better spend our time worrying
about other things that are more likely to happen, and if something has better
than a 90% chance of happening, it is regarded as a virtual certainty and is
included in the plan. But be careful when you omit low probability outcomes.
Something that has a 10% chance of happening that could wreck the whole
project may be something that you want to do something about.
The word "outcomes" in the definition is plural to express the notion that a risk
driver, if it comes to pass, might have many different outcomes, not just one.
Outcomes can be good or bad. For example, a risk driver
might cause a schedule slip, or it might cause an early completion. Recall the
dual nature of risk.
Mitigation or modification of a risk driver refers to any actions we contemplate
and may have started to implement to reduce the size of a risk driver.
The size of a risk driver is independent of any measures we might contemplate
for mitigating it. However, during the application of mitigation measures, we can
at any time reevaluate the potential impact of a risk driver. We may have been
able to shrink it.
At any time, we may perceive that a risk driver has been overtaken by events
(sometimes called OBE). Being overtaken by events means that the project
has reached a point where there is reason to believe that the size of a risk driver
has increased or decreased significantly from its previously assessed impact
values, even though we took no specific action to make that happen.
The project team normally does assessment of the size of a risk driver, usually
periodically in accordance with a risk management plan. However, there can be
independent risk audits where others do it. As the final arbiter of risk
management, the project manager may choose to make his or her own
independent assessment.
It is premature at this point to attempt a definition of the expression size of a risk
driver. That is a complex concept that can be defined in more than one way, as
will become clear later. For now, simply note that risk drivers can have complex
impacts, and some are instinctively seen as bigger or worse than others.
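The informal 10%/90% screening convention described above, together with the caution about low-probability drivers that could wreck the whole project, can be sketched as a simple filter. The risk driver records and the "wreck" threshold below are hypothetical.

```python
# Screening risk drivers by the informal 10%-90% convention, while keeping
# low-probability drivers whose impact could wreck the project.
# All entries and thresholds are hypothetical.

risk_drivers = [
    {"name": "Key supplier slips delivery", "prob": 0.40, "impact_usd": 250_000},
    {"name": "Hundred-year flood at site",  "prob": 0.02, "impact_usd": 5_000_000},
    {"name": "Minor rework on test rig",    "prob": 0.05, "impact_usd": 10_000},
    {"name": "Exchange rate drifts",        "prob": 0.95, "impact_usd": 80_000},
]

WRECK_THRESHOLD = 1_000_000  # impact big enough to threaten the whole project

def classify(driver):
    if driver["prob"] > 0.90:
        return "treat as certain: fold into the plan"
    if driver["prob"] < 0.10:
        # The convention says drop it, unless it could wreck the project.
        if driver["impact_usd"] >= WRECK_THRESHOLD:
            return "track anyway"
        return "drop from watch list"
    return "manage as a risk driver"

for d in risk_drivers:
    print(f'{d["name"]}: {classify(d)}')
```

The filter is deliberately crude; its only purpose is to make the convention, and its exception for project-wrecking outliers, explicit rather than tacit.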

1.6 Whose Risk Is It?

I conclude this chapter on basic ideas about PRM with a question any project
stakeholder should ask: Whose risk is it? Projects can be structured in any number
of ways. Many types of relationships can exist between stakeholders. Some risk
drivers may damage or benefit some stakeholders more than others.

The ideal arrangement in any project is when risk is allocated to the stakeholders
best able to manage it. Managing it includes having the resources to pay for bad
outcomes that happen in spite of everyones best efforts. Poor allocation of risk is
itself potentially a risk driver, because if a bad thing happens to a project team
member (e.g., a subcontractor) who is vulnerable, he may fail, and bring the project
down with him.

There are many devices for distributing risk. A success-oriented project should look
for a win-win selection of these devices. Here are a few:

• Indemnification clauses in contracts
• Escalation clauses or other protections against unexpected currency inflation or
  exchange rate changes in international projects
• Subcontracting to give some of the work to teams that are better qualified to do
  it (but be aware, every subcontract removes some control from the prime
  contractor; poorly qualified or unethical subcontractors can create big problems)
• Incentive and penalty clauses to encourage good management and discourage
  bad management
• Use of fixed price contracts to protect the sponsor (but be aware, this can
  increase risk if the executing team might fail due to underbidding the work)
• Use of cost-plus contracts to protect the contractor (but be aware, this can
  potentially cause the contractor to manage carelessly, running up the sponsor's
  costs)
• Buying insurance for those risks that can be cost-effectively mitigated this way
• Requiring a risk management plan with frequent status reports
• Observing the work in progress first hand, not through intermediaries.

Chapter 1 review questions

1. Think of a project failure that has been in the news. What do you perceive to be
the primary root cause of the failure?
2. Do you think projects are more or less likely to fail now than 20 years ago? Why?
3. Consider the statement "Risk is a component of human value systems." Is it
true?
4. Explain the difference between the ordinary person's view of risk, and the view of
a dedicated entrepreneur.
5. What is meant by the duality of risk?
6. When is "muddling through" an appropriate project risk management strategy?
7. What are the main elements of classic PRM, as defined in this chapter?
8. Is change always risky? When is it not?
9. What perception begins a project, as defined in this chapter?
10. Is it more accurate to say that risk is real, or that it is perceived?
11. What are the criteria we ultimately use to determine whether a project has been
successfully completed?
12. We should wait as long as possible in a project to create a baseline design, true
or false? Why?
13. In which phase of a project, as defined in this chapter, do you normally prepare a
formal proposal?
14. Doing more prototyping is a potential remedy for what problem with project
goals?
15. An unclear goal is often in reality a complex of what?
16. What does it mean to say that two goals are positively linked?
17. What are two universal axioms of estimating, as defined in this chapter?
18. Why should the lead estimator have the power to challenge the realism of
estimates made by other team members?
19. If it is early in a project, and you are trying to decide between two baseline
concepts, would you make two bottom-up estimates to compare the costs?
Why?
20. Explain the meaning of equivalence in estimating.
21. In general, should designers be allowed to make their own
estimates of cost in design trades? Why?
22. Why shouldn't most project teams build their own parametric cost models?
23. Explain briefly the nature of due diligence as applied to projects.
24. Would you always buy COTS software for a $1,000 license fee if that would
cause you to avoid development costs of $20,000? Why?
25. This chapter identifies several major areas of commonly occurring project risk.
Name a significant project risk area that could apply to almost any project that
has not been mentioned in this chapter.
26. Is the following statement a risk driver? "The hole pattern drilled in the base plate
does not match with the hole pattern in the spindle." Explain your answer.
27. Name at least one reason we would prefer to deal with root causes rather than
with symptoms.
Chapter 2: Classic Project Risk Management (CPRM)

2.1 Introduction

In the preceding chapter, I attempted to give you some basic ideas about the nature
of projects and about how Murphy might work against you to cause your project to
fail. I will now move on to what I like to call the conventional or classic approach to
project risk management, CPRM.

To a large degree, CPRM is intuitive. Many of the things I will talk about are pretty
much common sense. My main contribution to the dialog will be to organize and
clarify the process, and toss in helpful hints along the way.

To get the conversation started, I offer in Exhibit 11 a simple process flow chart for
CPRM:

Exhibit 11
CPRM Process Flow Chart

Identify → Analyze → Plan → Track → Control

Communicate

Exhibit 11 only approximates the true nature of the CPRM process. To make its
workings clear, I will elaborate not only on the overall process, but also on each of
the boxes in the process.

2.2 The overall process


Exhibit 11 is not intended to imply a one-shot process, but rather a continuous,
ongoing, repeated process. It begins as early in the project as there is enough of a
plan for risk drivers to be identified, and continues until risk is perceived to be
negligible. Also, the process should be repeated for each identified risk driver,
beginning within a reasonable time after its identification. Moreover, in some
situations, the process may be repeated more than once, totally or in part for each
risk driver, whenever the risk driver is mitigated, partially mitigated, or overcome by
events.
A risk driver may be closed out at any point in the process if it has
become moot. Risk drivers can be overtaken by events, or can be
mitigated. "Overtaken by events" means that the perceived concern has vanished
for some reason other than specific action by the project team. "Mitigated," of
course, means reduced or eliminated by team action.

Most of the time, the sequence of activities for a given risk driver will be in the order
shown in Exhibit 11, although exceptions are a possibility. The Communicate
activity is an ongoing activity that permeates and enhances each of the other
activities. Without effective communication, none of the other activities are possible.

Let's recapitulate and build on some of the key ideas in chapter 1:

• A project begins with a perceived need for change.
• Establishing goals is the first formal step in creating the project.
• Goals are both positive and negative: positive goals are the things we want the
  project to accomplish, and negative goals are the constraints we do not wish to
  violate.
• Goals can inject a propensity for risk into a project in many ways, a number of
  which are described in chapter 1. These can be thought of as structural risks
  brought about by the nature of the changes the project contemplates. We can
  intuitively sense structural risks, but cannot quantify them. They remain vague
  until a plan has been formulated. Recall that absent a plan there can be no risk
  drivers. Risk drivers are weaknesses in a plan that might lead to some
  appreciable degree of failure.
• Some plans are more robust than others. A robust plan has a higher likelihood of
  success than a brittle plan, and may have fewer risk drivers, smaller risk drivers,
  or both.
• The initial baseline plan is likely to be immature and brittle. Plans tend to become
  more mature and robust as the project proceeds and knowledge increases.
• The ideal is to continue the planning process, upgrading the baseline plan until it
  is sufficiently robust that all stakeholders are satisfied with the level of risk and
  the potential for mitigation of remaining risks. Less than ideal, but often a
  practical necessity, is to put the plan into full execution even while significant
  risks remain.
• If the planning process reaches an impasse where the stakeholders cannot be
  satisfied, the goals should be modified, the project should be abandoned, or
  less risk-averse stakeholders should be found.

Identification of risk drivers is possible only when we have set forth a plan for
accomplishing the goals, however tentative and incomplete that plan might be. The
more concrete the plan, the better our ability to identify and analyze risk drivers. For
that reason, wisdom dictates that we set forth the first baseline plan very early in the
project, so that we begin to get visibility into what is likely to happen to us. In terms
of the typical project phases defined in chapter 1, the first baseline
plan should be created in the concept phase, or in the business
case development phase.17

I am aware that some project managers resist as long as possible establishing any
kind of baseline. I worked on one project where that was egregiously the case. The
rationale seems to be to keep options open. I believe in keeping options open, but
argue that the best way to do this is to progress from one baseline to another,
without committing significant resources, until a sufficiently robust plan is found. The
problem I have with not creating and communicating a baseline as early as possible
is that without a baseline to work on, the team will drift and work at cross-purposes. I
have seen this happen more than once. The best way to reach a robust baseline is
to have the team as a whole work on improving the current baseline. A change of
baseline must be promptly and unambiguously communicated to the team.

As we work toward refining and improving the initial baseline plan, and moving on to
the next, we strive to identify risk drivers that relate to the current baseline plan, and
to take them through the steps of CPRM defined in Exhibit 11. This helps us to
better define a new baseline, one that is less vulnerable to risk drivers. Said another
way, we use the CPRM process to aid the planning process. In fact, CPRM
becomes an integral part of project planning.

It is not unusual that the early evolution of project baselines will expose weaknesses
in the goals that can be corrected if they are communicated to the project sponsor in
a timely manner. For this to happen there must be free communication between
stakeholders. Lack of communication and trust between the team and the sponsors
can result in unnecessary risk.

2.3 Identification

The Identify box in Exhibit 11 pertains to the identification of risk drivers. Recall
from chapter 1 the definition of a risk driver:

A risk driver is any root cause that MAY force a project to have outcomes
different than the plan.

Recall also the importance of the key word MAY. Risk drivers contain uncertainty.
That distinguishes them from problems, which can be thought of as risk drivers that
were never mitigated and finally matured. If a risk driver is not mitigated, it could be
because mitigation was found to be infeasible or uneconomical. But it could also be
because it was never identified.

Recall also that a risk driver is a root cause. That means that to properly identify it,
we must go beyond symptoms and minor complications to get as close as we can to

17
Recall (chapter 1) that the plan includes both the definition of the product and all descriptions of
the means for achieving it.
the real source of potential difficulty. Chapter 4 presents excellent
techniques for doing that.

The identification process is not complete until the risk driver has been reduced to a
root cause. However, a rapid identification process such as brainstorming need not
be burdened with a search for the root cause. It is sufficient in such a process that
potential risk drivers be simply marked, however tentatively. They can be reduced
to root causes in due course. Frequently, several potential risk drivers will turn
out to be different manifestations (symptoms) of the same root cause. When that
happens, the potential risk drivers can be consolidated in terms of the common
root cause.

A risk driver may or may not occur. If it does not occur, we are able to follow the
plan as it has been laid out. If it does occur, some changes to the plan are forced.
Because of the dual nature of risk, these changes could be either detrimental or
beneficial. Sometimes it happens that a beneficial change is not forced, but is
available as an option to be considered.

A risk driver that occurs may have various outcomes. In this book, I treat description
of the outcomes as part of analysis, discussed in the next section. The end product
of the identification process should be a clear statement of all root causes that could
create difficulties (or benefits) in the future, insofar as they are known at a given
point in time.

How do you describe a root cause? To assure that a risk driver is not mistaken for a
current problem, it is helpful to include a conditional word such as "may," "might," or
"could" in the description. It is also helpful if the description is at least one complete
sentence, written with clarity in mind. Terse sentence fragments can wrongly
communicate the true nature of a risk driver. I offer here a few examples of (I hope)
clearly written risk driver descriptions.

• The available programmers may turn out to be less skilled in the C++ language
  than the programmers we wanted to have working on the project.
• Airtron Corp. may not meet its schedule commitment for delivery of the 50
  horsepower air compressor. They blame material shortages.
• We were unable to visit the Delta site. There may be large rocks below the
  surface that could increase the difficulty of excavation.
• The frammis could fail its acceptance test. Most likely problem: overheating.
• Sponsor authorization to start work on phase 3 could be late.

For contrast, I now transform each of the above into a less clear risk driver
description, which could easily be misinterpreted:

• Inexperienced programmers
• Air compressor late
• Rocks below the surface
• Frammis test failure
• Late authorization for phase 3
In a very small project, where everyone is always fully familiar with
what is going on, description shortcuts like those above may not cause any
problems. But as the size of a project increases, it becomes increasingly likely that
someone will misunderstand a tersely stated risk driver description and act
inappropriately.

Because some projects resist doing risk driver identification, I should say a few
words about its justification. Fire drills to solve problems can be expensive.
Identification, on the other hand, is relatively cheap. Identification that leads to
mitigation is almost always cheaper than problem solving fire drills. The obvious
conclusion is that identification is a worthwhile activity. The more structural risk in
the project, the more worthwhile it is likely to be. There is often a good reason
behind angst.

Here are some frequently heard arguments against doing risk identification, and my
favorite responses to them:

• "I don't have time. There's too much regular project work."
  o This is indicative of a project that is in a hurry to get started without
    much planning. If the structural risk is high, the probability of failure is
    also high. Isn't it naive to believe that risk identification is not regular
    project work? Should the team proceed without any consideration that
    what they are doing could fail in one or more ways?
• "It's not rewarded. Nobody wants to hear what we can't do."
  o If this is true, then project management is not doing its job. Hubris has
    set in.
• "I don't want to look stupid, especially in front of upper management."
  o It is possible to communicate the presence of risk drivers without
    looking stupid. (See section 4.3.7 for thoughts on that subject.)
• "We already know our risks. We did an assessment at the beginning of the
  project. Once is enough!"
  o CPRM is necessarily a continuous process. As the plan evolves, the
    risk picture can change dramatically. An assessment at the beginning
    of the project seldom picks up all important risk drivers that will become
    visible over the life of the project.
• "This is just another management initiative. I'll wait to see if they're serious
  before I put any effort into it."
  o If they are serious, and you don't get involved, you can get left behind.
    If they aren't serious, you might be able to make points with a
    convincing story as to why they should be.
• "They shoot the messenger. Bring bad news and you might get punished."
  o It's true that the "shoot the messenger" syndrome is still in effect on
    some projects. I have experienced being shot at for bringing bad
    news, as perhaps you have been. I think that the best way to deal with
    this is to bring a proposed solution along with the bad news. But if the
    "shoot the messenger" attitude is prevalent, then CPRM does not have
    fertile ground in which to grow. Also, keep in mind that the timing of
    delivery of bad news can be critical to its acceptance.
    One more thought: I have learned never to deliver
    bad news in writing.

The last bullet above brings to mind that despite the current interest in PRM, there
are probably still many projects that aren't knowledgeable or interested. Lack of
knowledge is rather easily overcome (just read this book!), but lack of interest is a bit
tougher. One reason: some project managers are truly addicted to fire drills. They
relish the challenge of putting out fires single handedly. They think it makes them
look like macho superheroes to be continually battling against problems. Because of
their passion for dealing with problems, it is hard to get them to think in terms of
preventing problems. You may never convince this type of project manager to
embrace CPRM.

I wish I could tell you how to cope with the situation where the project manager will
not embrace team-based CPRM. In my sole experience with that situation, I was
fortunate in that a vice president forced the project manager to do it. But you may be
less fortunate. The fact is that there are three unavoidable prerequisites to
successful CPRM. They are:

• Management commitment: Culture comes from the top. The project manager
  must be personally committed. If he or she does not embrace CPRM, it may be
  that it can be enforced from above. As I said, I have seen that happen once, but
  it wasn't a particularly pretty sight, at least at first. Later, the project manager
  came to realize the benefits and enthusiastically embraced it. His comment to
  me: "It's like having a glass-bottomed boat. You can see the rocks coming before
  you run into them, and you can steer to avoid them."
• Training for risk awareness: The reason training for risk awareness is (almost)
  always necessary is that CPRM is relatively new, and many team members will
  not understand or be accustomed to the level of risk awareness that is
  appropriate. This need may gradually go away as teams become used to
  working with CPRM.
• Risk management processes: Knowing the essentials of what to do is one
  thing, but being able to get anything done requires established processes within
  which to work. Chapter 4 outlines some proven process possibilities.

Exhibit 12
The Case of the Arrogant Project Manager

One project manager in a major government agency, in front of several leads of his
project team, told me that he knew all of his project risks, and was positive he could
overcome them. He named four potential technical problems. His team leads sat quietly
and said nothing. A few moments later, the project manager was called away to a phone
call. While he was out of the room, his team members deluged me with questions about
how I could help them. They all fell silent the moment the project manager re-entered the
room. The project was soon cancelled because it had grown well beyond the available
funding, without accomplishing any of its technical goals. This failure had the unfortunate
effect of seriously limiting the scope of a related high visibility project of international
importance.

This project manager, a brilliant and successful engineer, thought one-dimensionally of
his project risks, only in terms of the technical problems he had to solve, and he was
confident that he could solve them. Given enough time and money, he probably could
have. The problem was, he didn't have enough time and money. And he had no
reserves for unexpected problems. His was a very visible and career-damaging failure.
I have not yet addressed a couple of key issues related to
identification. I have talked about what a risk driver is and how you should write it
up, but I have not talked about how you discover it in the first place. I also have not
talked about whether you can be sure you have discovered all of the risk drivers.
The how-to of discovery I defer to chapter 4, where it is discussed in some detail.

The sad fact is that you can't discover all possible risk drivers. In terms of the
potential for discovery, you can usefully classify risk drivers into three types:

• Obvious: These are the ones that just fall into your lap. You can see them
  coming, so to speak, a mile away. They are things like new and untried
  technology, new processes, areas of uncertainty as to acceptance by customers,
  unknown responses by competitors, and so on.
• Unknown: These are the risk drivers you have to look for. They don't just jump
  out at you, but they are discoverable if you make an effort. The problem with
  them is that as you continue to pursue discovery, you eventually encounter
  diminishing returns. Eventually you reach a point where discovery efforts turn up
  only "nit picks," that is, risk drivers of very low probability, very low impact, or
  both. Nevertheless, it is important to look for them at least up to a point, and to
  use more than one discovery technique. Some useful discovery techniques are
  discussed in chapter 4.
Exhibit 13
A Complex Risk Driver

This example is hypothetical, but it does closely parallel something that actually
happened.

A major defense contractor was under contract to develop and produce a low cost
missile. Because of the goal of keeping costs unusually low, the plan called for using
low cost components wherever possible. One key component of concern was a device
called a flight control actuator. This was an electromechanical device that delivered
torque to the missile's aerodynamic control surfaces to cause it to keep to a desired
flight path.

There were no off the shelf actuators that met the requirements. But two highly qualified
subcontractors were available who could design and build a suitable device.
Unfortunately, the unit production costs they quoted were over $5,000 a set, which was
deemed to be too high, given the goal of low cost.

So instead of having a subcontractor develop the actuator, the contractor elected to
develop it in-house. After a year of trying, the result was total failure. The contractor
simply did not have the skills to design and produce an actuator. Its efforts to do this
were a technical failure.

Eventually, one of the subcontractors had to be hired to do the job. The resulting delays
affected several work elements, increased costs, lengthened schedules, and had a
serious effect on the entire project.

The contractors decision to try to build its own actuator was never subjected to a risk
analysis. If it had been, a better solution would likely have been found.

• Unknown unknowns: These are also referred to as "unk-unks." They are risk
  drivers that seem to come out of nowhere. No reasonable amount of discovery
  effort could have been expected to find them. My belief, however, is that a
  project only rarely fails due to an unk-unk. I can think of only one example. It
  has been suggested to me that certain projects were damaged by unk-unks, but
  on consideration, in every case I recognized that there clearly was a failure of
  due diligence. The so-called unk-unks should have been discovered. I invite
  readers to try to think of examples of true unk-unks that have seriously damaged
  a project.
2.4 Analysis

The purpose of analysis is to estimate how a risk driver can potentially impact the
project. The potential impact of a risk driver can have several dimensions. The five
most important are usually:

• Which task elements are affected
• Cost impact (on each affected task element)
• Schedule impact (on each affected task element)
• Product performance impact
• Impact timing

The potential multi-dimensionality of impacts is why measuring the size of a risk
driver is difficult. I now explore these one at a time. But first I note that these
impacts are seldom independent. Quite commonly, if one of them appears, at least
one other will also appear, and they will usually be interactive. The scenario in
Exhibit 13 illustrates this.

2.4.1 Task elements affected.

Sometimes a risk driver will affect only one task element in the projects work
breakdown structure. When this happens, the impact is in a sense isolated, but it
could still result in project failure.

From a schedule risk standpoint, it matters whether or not the affected task is on the
project critical path (see the appendix if you are not familiar with this term). If it is
not, a schedule impact on the single task might not affect the overall schedule. If it
is, the impact will directly affect the length of the overall project critical path.

Here are some examples of risk drivers that might conceivably affect only a single
task:

• A few days of rainy weather while the task is in progress
• A temporary shortage of materials
• Temporary unavailability of a key person
• A minor design error
• Estimating error with regard to resources required by the task.

Far more serious, generally, and quite common, is the case where a risk driver
affects more than one task element. Here are some possible examples:

• A major test failure
• A failure to get a timely approval or permission
• Late delivery of a key item from a vendor
• Poor understanding of a principal requirement
• Volatility of the sponsor with respect to project goals.

(Margin note: "Logic is a systematic method of coming to the wrong conclusion.")

One important measure of the criticality of a risk driver is the number of work
elements affected. In my experience, a risk driver that can potentially affect several
work elements is far more likely to be a serious threat to the project than one that
can affect only one work element. It is also more likely to be expensive to mitigate.
Multiple impact risk drivers typically will engender some soul searching as to how
much to spend on mitigation. I will have more to say on that later.

A common phenomenon is what some call the standing army effect, or its close
relative, the idle asset effect. Consider the situation in Exhibit 14. A small group is
performing task A. Task B is being performed concurrently by a larger group of 50
people. According to the plan, Task A will be completed two weeks before Task B is
complete.

The large group doing Task B is supposed to move on to Task C when they have
finished Task B. Unfortunately, Task A overruns its schedule by six weeks. Task C
cannot begin until Task A is complete. The 50 people who are doing Task B finish
on time, but have nothing to do for four weeks. They have become an idle standing
army.

The idle asset effect is similar, except that it involves a physical asset other than
people. Such assets could be any building, infrastructure, or machine that has
period costs.

Exhibit 14
Example of Standing Army Situation

[Schedule network: Task A (20 weeks) and Task B (10 weeks) run concurrently
and merge at "Begin Task C." Task A completes 8 weeks late; the Task B work
force continues to Task C.]

The standing army and idle asset phenomena are prime examples of how a risk
driver can impact more than one task. In Exhibit 14, whatever was the root cause
of the delay in Task A likely had all of the following effects, as a minimum:

• Delay in Task A
• Delay in Task C
• Cost overrun in Task A
• Cost overrun in Task C
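The arithmetic behind the standing army effect can be sketched in a few lines, using the numbers from the narrative above (Task A planned to finish two weeks ahead of Task B, then slipping six weeks). The weekly labor rate is a hypothetical figure I have added purely for illustration; it does not come from the text.

```python
# Sketch of the standing army cost arithmetic. The crew size (50) and the
# week figures come from the narrative; the weekly rate is hypothetical.

def idle_weeks(planned_lead_weeks: float, slip_weeks: float) -> float:
    """Weeks the Task B crew waits idle at the merge point.

    planned_lead_weeks: how far ahead of Task B's finish Task A was
        planned to finish (2 weeks in the example).
    slip_weeks: Task A's schedule overrun (6 weeks in the example).
    """
    return max(0.0, slip_weeks - planned_lead_weeks)

def standing_army_cost(crew_size: int, weeks: float, weekly_rate: float) -> float:
    """Cost of paying a crew that is waiting with nothing to do."""
    return crew_size * weeks * weekly_rate

wait = idle_weeks(planned_lead_weeks=2, slip_weeks=6)
cost = standing_army_cost(crew_size=50, weeks=wait, weekly_rate=2_000)
print(wait)  # 4 -- weeks of idle time, matching the narrative
print(cost)  # 400000 -- idle labor cost at the assumed rate
```

Note that if Task A had slipped less than its two-week lead, the idle time would be zero; the effect only bites once the slip consumes the planned float.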

Whenever you examine a project schedule network, look for merge points, like the
circled point in Exhibit 14. Any task entering the merge late affects every task
beyond the merge, whether or not a standing army event transpires. One solution
to that, of course, is to eliminate merge points insofar as possible. But this solution
can significantly increase the duration of the project, which may be undesirable. You
hear a lot about the benefits of concurrency in projects, and there are important
benefits, such as getting a product to market ahead of the competition. But
concurrency generates merge points and makes it more likely that risk drivers will
have multiple impacts. I am not saying you should never use concurrency; that
would be foolish. But I am saying that when you do, consider the risks.
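The way a merge point transmits delay can be sketched in a few lines: a task fed by a merge cannot start until all entering tasks have finished, so its start time is the maximum of their finish times. The task names and week values below are illustrative, patterned on Exhibit 14.

```python
# Sketch of delay propagation through a merge point. Any slip in the latest
# entering task passes straight through to every task beyond the merge.

def merge_start(finish_times: dict[str, float]) -> float:
    """Earliest start of a task fed by a merge point: the latest
    finish time among all tasks entering the merge."""
    return max(finish_times.values())

# Planned finishes, in weeks from project start (illustrative values):
planned = {"Task A": 20, "Task B": 10}
# Actual finishes: Task A slips 8 weeks; Task B is on time.
actual = {"Task A": 28, "Task B": 10}

delay_to_task_c = merge_start(actual) - merge_start(planned)
print(delay_to_task_c)  # 8 -- Task C inherits the full slip of Task A
```

Note that Task B finishing early or on time does nothing to help: only the latest entering task determines the merge, which is why merge points concentrate schedule risk.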

2.4.2 Cost impact.

Many projects are severely constrained with respect to cost. Sometimes the sponsor
is willing to pay just so much, and not a penny more. But in a cost-plus-fixed-fee
contracting environment, the risk is mostly on the sponsor, because if the contractor
reaches the cost limit but still has not completed the work, the sponsor may have
little or nothing to show for his investment. In a fixed-price contracting environment,
the risk is mainly on the contractor, because if he cannot finish within the available
funds, he must continue with his own funds, or go bankrupt.

In the rare situation where the sponsor is willing to pay whatever it takes, cost risk
vanishes, but most people can spend an entire career and never work in that
environment. In most projects, there is considerable accountability for cost.

Perhaps the simplest kind of cost impact analysis is to estimate a single most likely
cost increment (or decrement) for each work element that can be affected by a given
risk driver. Then, an estimate is made of the likelihood of each impact happening.
Likelihood, also called probability, is traditionally expressed as a number between
zero and one. (See chapter 5 for a discussion of probability.)

The next step is to aggregate the cost impact across all tasks and all risk drivers as
illustrated in Exhibit 15.

Exhibit 15--Example of Cost Risk Impact Table

                  Risk Driver #1          Risk Driver #2
                  (Prob = 0.3)            (Prob = 0.5)
Task           Impact K$  Result K$    Impact K$  Result K$    Totals for Tasks
1.2                5         1.5          10         5.0             6.5
2.2                0         0.0           7         3.5             3.5
3.4               12         3.6           0         0.0             3.6
3.5                2         0.6           5         2.5             3.1
5.1                3         0.9           0         0.0             0.9
Project Impact               6.6                    11.0            17.6

Exhibit 15 illustrates a small project where there are five tasks subject to risk (1.2,
2.2, 3.4, 3.5, and 5.1), and only two risk drivers, numbered 1 and 2. If it occurs, risk
driver #1 is deemed to have a most likely impact of $5K on task 1.2, a most likely
impact of $12K on task 3.4, and so on. A similar analysis for each task is done on
risk driver #2, assuming it occurs.
A probability is assigned to each risk driver. The probability 0.3 for
risk driver #1 is equivalent to a statement that it is judged that this
risk driver has a 30% chance of happening, and a 70% chance of not happening.
(Recall that weather forecasts use a similar system for reporting the probability of
rain.)

Each impact is multiplied by the probability of the risk driver occurring. The results
are shown in the columns headed Result K$.

• Summation vertically of each of the columns headed "Result K$" results in an
  aggregated, risk-weighted impact of each risk driver across the entire project. For
  example, the value for risk driver #1 is $6.6K.

• Summation horizontally of each of the task rows results in an aggregated, risk-
  weighted impact on each task. For example, the value for task 1.2 is $6.5K.

The value $17.6K in the lower right corner is the risk weighted sum for the entire
project. It represents a best guess at the likely overrun of the present plan. If the
plan includes reserve funds sufficient to cover it, the project may be in good shape.

Obviously, this scheme can easily be extended to any number of tasks and any
number of risk drivers. A spreadsheet is an excellent vehicle for this kind of
analysis.18

What can be learned from a table such as Exhibit 15? Obviously, it shows expected
impacts on each task due to each driver and as an aggregate. It also shows
expected overall impacts due to each driver. In Exhibit 15, risk driver #2 is the most
powerful, and task 1.2 is the riskiest. The expected cost overrun for the project is
$17.6K.
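The spreadsheet-style scheme described above can be sketched in a few lines of Python; the figures are those of Exhibit 15. This is one possible implementation of the probability-weighted aggregation, not the author's own tooling.

```python
# Risk-weighted cost aggregation from Exhibit 15: each impact is multiplied
# by its driver's probability, then summed by task (rows) and by driver
# (columns). All impacts are in K$.

impacts = {  # task -> {driver: most likely cost impact in K$ if driver occurs}
    "1.2": {1: 5,  2: 10},
    "2.2": {1: 0,  2: 7},
    "3.4": {1: 12, 2: 0},
    "3.5": {1: 2,  2: 5},
    "5.1": {1: 3,  2: 0},
}
prob = {1: 0.3, 2: 0.5}  # probability each risk driver occurs

# Horizontal sums: risk-weighted impact on each task.
task_totals = {t: sum(prob[d] * k for d, k in row.items())
               for t, row in impacts.items()}

# Vertical sums: risk-weighted impact of each driver across the project.
driver_totals = {d: sum(prob[d] * row[d] for row in impacts.values())
                 for d in prob}

project_impact = sum(task_totals.values())
print(round(task_totals["1.2"], 1))   # 6.5 -- riskiest task
print(round(driver_totals[1], 1))     # 6.6
print(round(driver_totals[2], 1))     # 11.0 -- most powerful driver
print(round(project_impact, 1))       # 17.6 -- expected overrun
```

Extending this to any number of tasks and drivers is just a matter of adding rows to `impacts` and entries to `prob`, which is what makes a spreadsheet (or a few lines like these) such a convenient vehicle for the analysis.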

This information provides clues for risk mitigation. If we could get rid entirely of risk
driver #2, that would have the greatest benefit. But it would also be beneficial if we
could somehow modify task 1.2 so that risk driver #2 no longer affects it.

What does this exhibit suggest with regard to how much we might be willing to spend to mitigate risk? In principle, if we could entirely mitigate risk driver #2, we might be willing to spend up to $11K, the risk-weighted amount it contributes. Realistically, we would not do this, because money spent for mitigation is hard money, while a risk-weighted amount is soft money, meaning in effect that if the risk driver is destined never to occur, money spent mitigating it is wasted. A rule of thumb I recommend is to spend no more than 20% of the risk-adjusted amount for mitigation. Thus, to mitigate risk driver #2 entirely I would spend no more than (0.2)($11K) = $2.2K. If I could only partially mitigate it, I would spend even less.

18 As an alternative to the type of analysis I describe here, some analysts attempt to estimate the gross overall cost effect on work elements without considering the separate root causes. I recommend against this practice because it can easily lead to overlooking significant possible cost impacts.

If you only have a fixed amount to spend for mitigation, a table such as Exhibit 15 obviously can also provide guidance for mitigation priorities.

The cost risk model illustrated here is a good compromise between a very simplistic
qualitative approach, and a more rigorous quantitative approach. A simplistic
qualitative approach is shown in chapter 4, and more sophisticated quantitative
approaches are discussed in chapter 5.

2.4.3 Schedule impact.

Meeting a prescribed schedule is frequently a cherished goal of project sponsors.


There can be many reasons for this. Among the possibilities are getting to market
ahead of the competition, meeting schedules on other, related projects, meeting
scheduled occupancy dates, or getting the project completed within a funding cycle,
while money is available.

Cost impacts are relatively simple to deal with, as illustrated in the previous section. Schedule impacts are a bit more complex. You could build a table such as Exhibit 15 for schedule impact, but it could have a serious flaw. To show why, let's lay out in Exhibit 16 a hypothetical schedule network for a small project; we'll call it Project X.

Exhibit 16--Schedule for Project X

  Begin --> Task A (5 wks) --> Task D (6 wks) -----------------------> End
  Begin --> Task B (4 wks) --+--> Task E (3 wks) --+--> Task F (2 wks) --> End
  Begin --> Task C (7 wks) --+                     +--> Task G (8 wks) --> End

  Path lengths:
    AD   11 weeks
    BEF   9 weeks
    BEG  15 weeks
    CEF  12 weeks
    CEG  18 weeks

There are five paths through this network, AD, BEF, BEG, CEF, and CEG, with path lengths as noted above. The longest path, CEG, is called the critical path; its length is the duration of the project.

Everything takes longer than you think.

Now consider a risk driver impact of five weeks on Task A, doubling it to ten weeks. The path AD will now be 16 weeks long. The duration increase on task A may also increase the cost of task A, but it has no effect on the duration of the overall project, which is still 18 weeks. Clearly, risk driver impacts off the critical path do not affect the project overall duration if they are small enough.

Now consider a risk driver impact of five weeks on task C. This lengthens the critical
path by five weeks, causing the overall project duration to increase from 18 to 23
weeks. Clearly, risk driver impacts on the critical path affect the project duration
directly.

Finally, consider a risk driver impact of four weeks on Task B. This will change the
critical path from CEG to BEG, and will increase its length by one week to 19 weeks.
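The three scenarios above can be checked with a short sketch. The durations are from Exhibit 16; the five paths are hard-coded here, though a real scheduling tool would derive them from the network itself.

```python
# Path arithmetic for Project X (Exhibit 16). Durations are in weeks.
durations = {"A": 5, "B": 4, "C": 7, "D": 6, "E": 3, "F": 2, "G": 8}
paths = ["AD", "BEF", "BEG", "CEF", "CEG"]

def path_length(path, dur):
    """Length of a path, in weeks: the sum of its task durations."""
    return sum(dur[task] for task in path)

def critical_path(dur):
    """The longest path; its length is the project duration."""
    return max(paths, key=lambda p: path_length(p, dur))

print(critical_path(durations))                          # CEG (18 weeks)

# A 5-week slip on Task A (off the critical path): AD grows to 16
# weeks, but the project duration is still 18 weeks.
slipped_a = dict(durations, A=10)
print(path_length(critical_path(slipped_a), slipped_a))  # 18

# A 4-week slip on Task B shifts the critical path to BEG, 19 weeks.
slipped_b = dict(durations, B=8)
print(critical_path(slipped_b),
      path_length(critical_path(slipped_b), slipped_b))  # BEG 19
```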

This gives you a notion of the complexities inherent in schedule risk analysis. Some sophisticated methods are available for dealing with this complexity. Some involve a technique called Monte Carlo simulation, which we will discuss in chapter 5. But for now, let's look at a simpler approach that gives fairly good answers.

To begin with, it should be noted that some large projects build huge working level
schedule networks that may have thousands of detailed tasks. Schedule risk
analysis at this level of detail is difficult, partly because of the necessity of assigning
risks to so many tasks. What you need to do is to work with a management level
schedule in which most of the detailed tasks are condensed into closely related task
groups. This is the kind of high-level schedule network typically seen in management
meetings. It should have no more than (about) 50 to 100 tasks.

If you have an appropriate computer tool (such as MS Project), you can set up the
project in the tool, and print out a network diagram. Otherwise, draw the diagram on
a long strip of paper, such as butcher paper or printer paper. By inspection, trace
out all of the paths through the networks, and add up their lengths.19 The longest will
be the critical path. Most scheduling tools will do this computation for you.

Build a schedule impact table similar to the cost impact table shown in Exhibit 15. Such a table is shown in Exhibit 17. (If you are clever, you will combine your cost and schedule tables into a single table. Only then can you clearly portray phenomena such as the "standing army" or "idle asset" effects that have both cost and schedule impacts. I will give an example of that shortly.)

Exhibit 17--Example of Schedule Risk Impact Table

                  Risk Driver #1 (Prob = 0.3)   Risk Driver #2 (Prob = 0.5)
  Task            Impact Wks   Result Wks       Impact Wks   Result Wks      Totals for Tasks
  1.5             2            0.6              2            1.0             1.6
  1.8             1            0.3              1            0.5             0.8
  2.2             3            0.9              7            3.5             4.4
  4.7             8            2.4              5            2.5             4.9
  8.2             1            0.3              0            0               0.3
  Project Impact               4.5                           7.5             12

  Note: Tasks listed are only those on the critical path.

19 This will usually not be as difficult as it sounds. At a glance you will see that many paths obviously are not critical. There is a systematic way to do this. If you are interested, consult any good college level text on quantitative management.

In Exhibit 17, the only tasks of concern for schedule impact analysis are those on the
critical path. These can be treated mathematically just like costs. The risk driver
impacts are additive. But ah, you say, what if the critical path changes due to a large
risk driver impacting a task not on the critical path? The answer is that this simple
method does not handle that situation. But in many real world projects, it is not a
significant factor. Many real world projects have a very stable critical path. But if
you want to explore the issue, you might look at the second longest path, and see if
any large risk drivers impact tasks along that path, especially if they have a high
probability of occurrence. You could then make a judgmental adjustment of your
overall schedule risk based on that assessment.

What can you do with a table such as Exhibit 17? You can of course note that the
expected project schedule overrun is 12 weeks. If that is not acceptable, you can
start thinking about ways to reduce some of the impacts, especially the larger ones,
such as the impact of risk driver #2 on Task 2.2.
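The arithmetic of Exhibit 17 can be reproduced directly; all of the figures below are taken from the exhibit, and the results are expected slips in weeks.

```python
# Schedule risk impact table of Exhibit 17. Impacts are in weeks, and
# only critical-path tasks appear.
probs = {"RD#1": 0.3, "RD#2": 0.5}
impacts = {   # critical-path task -> impact of each driver, in weeks
    "1.5": {"RD#1": 2, "RD#2": 2},
    "1.8": {"RD#1": 1, "RD#2": 1},
    "2.2": {"RD#1": 3, "RD#2": 7},
    "4.7": {"RD#1": 8, "RD#2": 5},
    "8.2": {"RD#1": 1, "RD#2": 0},
}

# Risk-weighted totals for each task (row sums of the Result columns).
task_totals = {task: round(sum(probs[d] * wks for d, wks in row.items()), 1)
               for task, row in impacts.items()}

# Expected overall schedule slip for the project.
project_impact = round(sum(task_totals.values()), 1)

print(task_totals["2.2"])     # 4.4
print(project_impact)         # 12.0
```

Because critical-path impacts are simply additive, the same spreadsheet mechanics used for cost risk carry over unchanged.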

Exhibit 18--Schedule for Project Y

  Begin --> Task A (5 w, $10k) --> Task D (6 w, $15k) --------------------------> End
  Begin --> Task B (6 w, $6k)  --+--> Task E (3 w, $6k) --+--> Task F (2 w, $8k)  --> End
  Begin --> Task C (7 w, $14k) --+                        +--> Task G (8 w, $10k) --> End

  Task H (hammock, spans the whole project): 18 w, $18k
2.4.4 Standing armies, idle assets, and hammocks.

Earlier we discussed the standing army and the idle asset phenomena. To clearly portray the impact of these, you need to combine cost and duration impacts into a single impact table, which we will do shortly. There is yet a third phenomenon that requires this treatment. I call it a hammock activity (the name is not unique to me). In Exhibit 18 I show a schedule network for Project Y that has the potential to generate a standing army, and also has a hammock (spanning) activity in task H. Note that the planned duration and the planned cost for each task are shown in the exhibit. Summing all eight tasks, the planned project cost is $87k. Summing along the critical path, the planned duration is 18 weeks.

Tasks B and C merge at the start of Task E. If the labor force from Task C is to be the same labor force that does Task E, and if Task B is more than one week late (B has one week of float relative to C), then Task E will have an idle, standing army until Task B finishes.

Task H is a hammock task. It is called that because it spans all of the other tasks.
Typical hammock tasks are project management, project control, and sometimes
functions such as specialty engineering, quality assurance, public relations, etc.

Note that Task H is shown as being 18 weeks in duration. The reason is that the
critical path of the project (the path CEG) is 18 weeks in duration. If the critical path
length of the project should change due to the action of risk drivers, then the length
of the hammock task would change to have the same length. The usual
assumptions in analyzing the effects of risk drivers on hammock tasks (absent
information to the contrary) are these.


Hammock tasks have no duration risk of their own. Any change in their duration will be due to changes in the critical path length of the tasks they span.

Hammock tasks have two distinct kinds of cost risk. The first is the cost response to risk drivers acting directly on them. In this, they are like any other task. The second is increased cost due to the stretching of their duration (or possibly decreased cost due to a contraction of their duration). The increase due to stretching usually is assumed to be proportional to the amount of the stretch. Similarly for a contraction.

Let's now build, in Exhibit 19, a combined cost and schedule impact table that takes all of this into account.

Exhibit 19--Combined Cost / Schedule Impact Table

            Risk Driver #1 (Prob = 0.2)       Risk Driver #2 (Prob = 0.8)
  Task    Wks  Result Wks   $k  Result $k   Wks  Result Wks   $k  Result $k   Duration Total   Cost Total
  A                         2   0.4                                                            0.4
  B                                         4    3.2          5   4           3.2              4
  C                             6                                                              6
  D
  E
  F
  G       4    0.8          5   1                                             0.8              1
  H                                                                                            4
  Totals       0.8              7.4              3.2              4           4                15.4

Tasks C, E, and G are the original critical path tasks in this schedule, but a risk driver
impacting task B has elevated it to the critical path, replacing task C. As a result,
Task C has a standing army cost impact. The whole project stretches out for an
expected 4 weeks, resulting in added cost in the hammock task H of $4k. The
overall result is an expected 4 week schedule slip, and an expected $15.4k cost
increase.
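The combined result can be sketched numerically. The probabilities, slips, and the 4-week / $15.4k results are from the text; the raw $2k cost impact on Task A is a hypothetical figure, chosen to be consistent with the $0.4k weighted result shown in the exhibit.

```python
# Combined cost / schedule impact for Project Y (after Exhibit 19).
p1, p2 = 0.2, 0.8                        # risk driver probabilities

# Expected slip: RD#2 slips Task B 4 weeks, RD#1 slips Task G 4 weeks.
expected_slip = 4 * p2 + 4 * p1          # 3.2 + 0.8 = 4.0 weeks

# Hammock task H costs $18k over 18 weeks ($1k/week), and stretches
# in proportion to the critical path.
hammock_cost = (18 / 18) * expected_slip             # $4k

# Standing army: Task C's crew ($14k / 7 wks = $2k/week) stands idle
# 3 weeks while a 4-week-late Task B finishes (B has 1 week of float).
standing_army_cost = (14 / 7) * 3                    # $6k

# Direct, probability-weighted cost impacts on tasks A, B, and G.
# The $2k raw impact on A is a hypothetical figure (see lead-in).
direct_cost = 2 * p1 + 5 * p2 + 5 * p1               # 0.4 + 4.0 + 1.0

total_cost = round(direct_cost + standing_army_cost + hammock_cost, 1)
print(expected_slip, total_cost)                     # 4.0 15.4
```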

These expectations are not certain. Mathematically, they approximate what is most likely to happen. They also approximate the median result, namely the result that has a 50% chance of being exceeded, and also a 50% chance of not being exceeded. Exceeded or not exceeded by how much? This simplistic analysis provides little insight into that subject. See chapter 5 for a discussion of methods that can give information about potential ranges.

2.4.5 Performance impact.

I first must be clear on what I mean by performance impact. Performance impact is not the same as execution impact. It does not deal with the performance of the project team. It deals with the performance of the product produced by the team, relative to the goals that have been established. While there may well be a correlation between the two types of performance, that does not concern us here. Suffice it to say, having a weak team increases the chances of having a poor product. A weak team could be a risk driver, if it is a root cause.

Our approach will be to assume that, in the absence of identified risk drivers, the project's product, whatever that may be, will comply with all of the goals set for it by the project's sponsor. If there is any significant chance of that not happening, it should be attributable to risk drivers that are mostly identifiable.

When you boil it all down, performance impacts come in only three categories. One
is the situation where the product fails to meet its goals due to the action of one or
more risk drivers, and that is the end of the matter. Nothing can be done to repair
the situation, or else we are unwilling to try, for one reason or another. The project
soon terminates. There may be further termination activities and costs, but they are
not directed toward product performance. I will call this Type 1 performance risk.

The second is the situation where the product also fails to meet its goals due to one or more risk drivers, and we have decided we want to (or must) spend more time and money in the hope of making the product goal-compliant. In the first category, we are defeated and accept defeat. In this second category, we are damaged, but refuse to accept defeat. But in refusing to accept defeat, we may spend more money and more time than our plan calls for. We expect to finally prevail and have a successful product. I will call this Type 2 performance risk.

The third is the situation where, again, the product fails to satisfy, and again we spend time and money trying for success, but after a time we fail and concede defeat. I will call this Type 3 performance risk. Let's look at examples of each of these situations.

Consider one type of project commonly worked by the National Aeronautics and Space Administration (NASA). The product is a body of scientific data, typically gathered by instruments carried aboard a spacecraft. NASA characteristically writes success criteria for these projects. These criteria are among the project's positive goals. NASA also creates a schedule and a budget, which are the negative goals (constraints). A failure of the launch vehicle (the rocket that boosts the instrument-laden spacecraft into earth orbit) will result in loss of the spacecraft and all of its instruments. This is a total mission failure. Once it happens, nothing can be done to repair the situation. It is the end of the matter. This is a Type 1 performance failure.

Suppose that the mission is to go to Mars, land on its surface, take pictures, and measure soil composition and atmospheric properties. The launch vehicle succeeds, the interplanetary flight succeeds, the Mars landing succeeds, but one of the cameras fails, reducing the amount of data gathered somewhat below the level contemplated by the project goals. The loss is less than a total mission failure, but again, nothing can be done about it. We can't (yet) send a repair crew to Mars. This is a Type 1 performance failure, but is not a total failure.

To generate an example of the second type, suppose that during the development
phase for the vehicle that will go to Mars, one of the key scientific instruments failed
a test designed to demonstrate its ability to withstand the severe vibration
environment encountered during launch. Taking that instrument out of the manifest
would have meant not being able to meet all of the project goals. Leaving it in would
require some redesign and retest of the instrument. Assuming that we must hold to
the positive goals, and the money to do the added work is not in the current plan, it
would mean a cost overrun and possibly a schedule overrun as well.20 The project
sponsor would be obliged to augment the funds to accommodate the overrun, unless
the project already had contingency funds that would cover the loss.

With regard to the third type, suppose that we have set out to build a fuel cell
powered car that will sell for under $20k. We overrun the initial project budget and
schedule, but decide to put additional funds into the project and try again. At length,
we fail and quit trying.

20 Schedule overruns can be a serious matter for space vehicle projects, because they might cause missing a launch window.

In the second and third types of performance risk, the technical failure is not the ultimate impact. The ultimate impact is a cost and possibly a schedule impact. We can boil the previous few paragraphs down into the following three definitions:

Type 1 performance impact is a failure of the product to meet project goals, a failure that cannot or will not be reversed. There is no secondary cost or schedule impact aimed at fixing the failure (there could be secondary impacts such as project cancellation charges).

Type 2 performance impact is a failure of the product to meet project goals, it being both possible and likely that such failure will be reversed, resulting in a secondary cost or schedule impact, or both. The impacts are due to work needed to mitigate the failure.

Type 3 performance impact is a failure of the product to meet project goals, where we assume the failure is reversible and continue trying, but ultimately concede defeat and end the project.

These definitions clarify the nature of what some call "technical risk," an unfortunately ambiguous and often-misused term. As I have shown, there are three kinds of technical risk, and two of them are just special cases of cost or schedule risk. I dislike the term "technical risk" because many projects are not technically oriented. But all projects have a performance orientation. They all have a product they want to succeed.

I have shown that Type 1 performance risks can result in either partial or total
compromise of project goals. It would seem natural to think of each discrete positive
performance goal as having a percentage importance, so that all of them taken
together add up to 100% of what is wanted. Assigning quantitative relative
importance measures to goals is a very useful practice in all projects, and especially
in projects that have Type 1 performance risks. It focuses attention and effort on
what is important, and reduces the likelihood that the project team will spend
inappropriately large amounts of time and money on things that are not very
important. It also promotes the likelihood of customer satisfaction.

Once they have a list of performance goals that may be subject to Type 1, 2, or 3
performance risks, the project team will in due course develop a baseline design for
meeting them. Given a specific design, the team can do analyses called failure
mode analyses. These are searches for ways the product might fail. A failure mode
analysis is essentially a form of search for root causes that can cause product failure
or degraded performance. The more sophisticated forms of this type of analysis for
advanced hardware and software systems are beyond the scope of this book.
Engineering specialists should always perform them. But in simpler situations, the
methods presented next may suffice.

We have previously described cost impacts in terms of money, and schedule impacts in terms of a span of time. These dimensions are readily quantifiable. But how should we describe Type 1 performance impacts? Keep in mind that these impacts are final outcomes for the project, and that they are presumed to be irreversible. Not all goals are easily quantifiable, so the approach I recommend here is to quantify goals in terms of their perceived importance.

For example, suppose that a Mars project has just these three goals:

Goal 1: Take and transmit to Earth a minimum of 500 high resolution digital
pictures of the Mars surface around Landing Site X
Goal 2: Determine the gaseous content of the Mars atmosphere at Landing Site
X in terms of the percentage of each component, and transmit the data to Earth
Goal 3: Perform a complete chemical analysis of the soil at Landing Site X, and
transmit the data to Earth.

The physical reality is that none of these goals can be accomplished unless a
spacecraft is successfully directed to and lands on Mars. Exhibit 20 shows the
necessary sequence of events. It also shows the assigned mission values in
percentages for the three goals (summing to 100%), and shows nine hypothetical
risk drivers (RD) and the performance areas they can impact.

Let's now build a table, in Exhibit 21, of the risk drivers and their probabilities of occurrence. (We assume that reliability engineers have calculated these probabilities based on a pool of historical data.)

Exhibit 20--Example of Treatment of Type 1 Performance Risks

  Launch --> Flight --> Landing --+--> Take Pictures (30%)
  (RD1, RD2, RD3, RD4, RD5)       +--> Measure Atmosphere (20%)
                                  +--> Measure Soil (50%)

  Risk drivers RD1 through RD5 act on the launch, flight, and landing;
  RD6 acts on taking pictures, RD7 and RD8 on measuring the atmosphere,
  and RD9 on measuring the soil.

Exhibit 21--Risk Driver Occurrence Probabilities

  Risk Driver   Probability
  1             0.001
  2             0.032
  3             0.011
  4             0.0025
  5             0.008
  6             0.003
  7             0.002
  8             0.03
  9             0.0021

We first need to find the probability of reaching Mars successfully. That is the same thing as the probability that none of the first five risk drivers happens. Mathematically, it can be calculated as:21

(1-0.001)(1-0.032)(1-0.011)(1-0.0025)(1-0.008) = 0.9464

(NASA would probably never accept a mission probability this low, but remember,
this is just an example.)

The probability of total success in taking pictures is given by:

(0.9464)(1-0.003) = 0.9436

Note that the second factor is the probability that risk driver #6 will not occur. The
probability of total success in measuring the atmosphere is given by:

(0.9464)(1-0.002)(1-0.03) = 0.9162

Note that the second and third factors are, respectively, the probabilities that risk
drivers #7 and #8 do not occur. The probability of success in measuring the soil is
given by:

(0.9464)(1-0.0021) = 0.9444

Finally, we calculate the expected project value with respect to the mission goals.
This is a weighted sum, wherein the goal values are multiplied by their respective
probabilities of achievement and then are summed:

Project value = (30)(0.9436) + (20)(0.9161) +(50)(0.9444) = 93.85%
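The full chain of calculations can be sketched as follows. The probabilities are the risk driver occurrence figures from Exhibit 21, with RD8 read as 0.03 for consistency with the 0.9162 atmosphere result quoted in the text.

```python
# Expected project value for the Type 1 performance risk example
# (Exhibits 20 and 21).
rd = {1: 0.001, 2: 0.032, 3: 0.011, 4: 0.0025, 5: 0.008,
      6: 0.003, 7: 0.002, 8: 0.03, 9: 0.0021}

def p_none(drivers):
    """Probability that none of the listed risk drivers occurs."""
    p = 1.0
    for d in drivers:
        p *= 1 - rd[d]
    return p

p_mars = p_none([1, 2, 3, 4, 5])        # reach Mars: about 0.9464
p_pictures = p_mars * p_none([6])       # about 0.9436
p_atmosphere = p_mars * p_none([7, 8])  # about 0.9162
p_soil = p_mars * p_none([9])           # about 0.9444

# Weighted sum of the goal values (30 / 20 / 50 percent).
project_value = 30 * p_pictures + 20 * p_atmosphere + 50 * p_soil
print(round(project_value, 2))          # 93.85
```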

To recapitulate, some projects have risk drivers that can create Type 1 performance risks. These are essentially irreversible failures. I have suggested that a good way to describe these is to allocate values as percentages to all of the positive goals that can be affected by these failures, such that the percentages add up to 100%. Each goal value is multiplied by its probability of achievement, and the weighted values are summed. The result is a project score or grade, with the highest possible grade being 100%.

21 This calculation uses two rules of probability. The first is that if the probability of something happening is P, a number between zero and one, then the probability of it not happening is 1-P. The second is that if the probability of success of the first of a series of independent events is X, the probability of the second is Y, the probability of the third is Z, etc., then the probability of them all succeeding is the product XYZ.

It may be possible to improve the mission value score of a project by mitigating risk drivers. Mitigation may cost money, and may also add time to the critical path. So a judgmental tradeoff is possible between cost, schedule, and project value. The tools described in the above discussion make these tradeoffs possible on a rational basis.

Most accidents in well-designed systems involve two or more events of low probability.

Please don't think that the process for Type 1 performance risk drivers described here applies only to NASA projects that launch spacecraft. It applies to any mission-oriented project that can have irreversible failures of the product that essentially kill the project. For example, a project to develop a new financial instrument could have an irreversible failure if investors did not accept it, resulting in a shutdown of the project. The probability of success might be estimated using focus groups.

Type 2 and 3 performance risk drivers have further consequences beyond the technical failure itself that can be measured as cost impacts, schedule impacts, or both. Note that the failure itself is not the risk driver; the cause of the failure is. There is a great temptation when describing this type of risk driver to write a simple statement such as "The gizmo might fail its acceptance tests." I have done this myself. The conscientious risk analyst will try to write such a statement for every known failure mode of the gizmo. The problem is that this might result in 50 risk driver statements, which is more trouble than it is worth. My advice: do the best you can to maintain fidelity in risk description, but don't try to make risk management the only activity in the project. If you do, you will soon run out of friends.

Type 2 and 3 performance risk drivers can easily be included in cost and schedule
risk impact tables.

2.5 Planning

You have identified a risk driver, and you have analyzed its potential impact. What's next? Planning.

Planning is the process of deciding what, if anything, should be done about identified risk drivers. Other names for planning are mitigation and abatement. I prefer the name "planning" because it seems more inclusive of the possibility that you might be able to do more than just stop the bad stuff from happening. You might also encourage good things to happen. Remember the duality of risk?

Before undertaking a plan for doing something about risk drivers, it is well to recall
(chapter 1) that risks affect project stakeholders differently, and also that the ability of
stakeholders to deal with a given risk driver can vary significantly.
So the first step in the planning process is to allocate the work to
the various stakeholders. Generally, the stakeholder who could be most impacted
should take on the planning task. However, I have seen more than one situation
where a weak stakeholder relied on a stronger one to mitigate risks that most directly
affected the weak stakeholder. In all cases that I can recall, this was done because
it was in the self-interest of the stronger stakeholder that the weaker one not fail.

Once the lead stakeholder has been determined for each risk driver, the next step is
to identify the particular group or individual who should lead the effort. The usual
rule here is to assign the people who have the most knowledge of the risk driver,
although there could be exceptions.

If a good job has been done in identification and analysis, the planning task begins with knowledge of the nature of the risk driver and its size, in some sense of that word. Depending on the type and sophistication of the analysis, we might have knowledge of one or more of the following indicators of size:

Number and identity of task elements affected
Cost impact (across all affected task elements)
Schedule impact (across all affected task elements on the critical path)
Product performance impact
Impact timing
Probability of occurrence

The precision of the knowledge resulting from analysis depends both on the nature
of the analysis and on our ability to collect good data. These, in turn, are highly likely
to depend on how much time and effort we are willing to put into the analysis task.

Probably the fastest, cheapest form of analysis is to simply write a text description of the risk driver, and then decide whether it is big, small, or somewhere in between. One might improve on that by using the simple tabular analyses presented in section 2.4. At a more sophisticated level, one could use the mathematical tools described in chapter 5. The conundrum is that the simpler and cheaper the analysis, the less information we have for making a plan.22 The less information we have for making a plan, the more likely the plan is to be a brittle one. As the saying goes, you pay your money and you make your choice.

There is no free lunch.

Whatever knowledge has been gathered in analysis will be used to plan what to do about a given risk driver. That could range from nothing to a whole lot. Why would we decide to do nothing? Here are two reasons:

Mitigation23 is impossible; it is beyond our capability.
Mitigation has been judged economically infeasible; it will cost more than it is worth.

22 However, I would caution that there is such a thing as over analysis. This happens when the available data does not support the sophistication of the tools. When a risk driver's impacts are over analyzed, time and money are consumed, but there is no new information that is useful for deciding what to do.

In principle, the issue of economic infeasibility should relate quantitatively to the expected damage from the risk driver. Previously, I offered a rule of thumb that not more than 20% of the mathematically expected cost impact should be spent on mitigation. In my opinion, that is about as far as anyone should go in quantitatively relating effort spent on mitigation to risk driver damage. A big problem is that the rule of thumb relates only to cost impact. What about schedule impact? What about mission success impacts when Type 1 performance risk drivers are present? What about a host of other issues too numerous to describe here?

As an engineer, I am a fan of quantifying whenever it makes sense. But when it comes to prioritizing risk drivers for mitigation action, or for deciding how much to spend in time, money, or other resources, the decision is usually too complicated for purely quantitative methods. It usually must have a subjective component. Here are some factors that may bear on a subjective decision:

The number of risk drivers that have been identified
The perceived likelihood that more risk drivers will be identified in the future, and their likely severity
The presence of "must mitigate" risk drivers that can be project show stoppers
The amount of discretionary resources available at the functional group level where the mitigation actions will be undertaken
The amount of money the project manager is willing to contribute from project management reserves
The possibility that an effort to mitigate will totally or partially fail
The reputation of the team if the risk happens
Possible injuries or loss of life
The possibility of project cancellation if the risk happens.

Once a decision is made to do something about a risk driver, it should be recorded in a written plan. I have more to say about that in chapter 4.

I will end this discussion about planning with a review of eleven mitigation
techniques. I have yet to observe a risk mitigation effort that did not use one or more
of these, but if you discover a new one that is useful, the world would be pleased to
hear about it.

23 Unfortunately, the English language has no single word that means both mitigation of bad things and enhancement of good things. So, please understand my use of the word "mitigation" to mean either mitigation or enhancement, depending on circumstances. Interestingly, the claim has frequently been made that the Chinese character for "crisis" is a combination of two symbols, the one for "danger" and the one for "opportunity." Unfortunately, the claim has been debunked by Sinologists. Even so, it's too bad there is no equivalent word in English.
Analysis--Analysis includes studies, models, simulations, and research, both primary and secondary.24 Analysis usually is the mitigation tool of choice when the root cause of risk is ignorance, and the desired knowledge is not readily available from trainers. If knowledge is available from trainers, education is usually faster and cheaper than analysis.
Insurance--The general idea of insurance is that a third party agrees to pay for
an unfavorable outcome that matches a carefully worded description contained in
the insurance policy. In return, you pay a fee, called a premium. Generally,
before a third party will agree to do this, there must be a clear pattern of
occurrences of the unfavorable outcome, from which probability and size of loss
can be ascertained with high confidence. The insurer sets his premiums at a
level that he expects will cover his loss liabilities, his internal costs, and his profit.
Generally, projects cannot buy insurance for losses other than the ones
traditionally covered by insurers, such as fire, product liability, key man, etc. The
reason is that there is no history. Insurers, unlike many project risk analysts, are
seldom willing to assign probabilities based on intuitive or scant evidence. The
phenomenon called "self-insurance" occurs when a project is willing to (or must)
absorb its own losses. This amounts to acceptance of the risk, and is not by itself
a mitigation technique.
Buy-out--A significant risk in some projects is that a key resource may not be
available when it is needed. The key resource might be anything from an
engineer with a special skill, to a fortuitously sited and equipped warehouse, to a
particular type of application specific integrated circuit that is about to go out of
production. The idea of buy-out is to acquire the critical asset while it is available,
and before it is actually needed. I once worked on a project where we had an
antenna design problem, and none of our engineering staff was adequately
familiar with that body of knowledge. It was still six months before we needed the
services of an antenna specialist, but we proceeded to hire a recent PhD level
graduate with the requisite skills. We thus preempted the possibility of not having
the necessary skills when we needed them, but we paid the price of six months
salary for the engineer, who was given other, fairly trivial work that could have
been done by a lower paid engineer.
Transference / sharing: The basic idea here is to spread risk around so that the impact of
an unfavorable event is distributed over more than one project stakeholder (sharing), or is
loaded onto the stakeholder best able to absorb it (transference). As an example of
transference, in cost plus contracts the sponsor takes most of the risk, because structural
risk is deemed to be high and the impact on the performer might be so severe as to put him
out of business. Without the transference, it might be that no performer could be found to
attempt the work. As an example of sharing, there are arrangements where a cost overrun is
shared between the performer and the sponsor. This creates an incentive for the performer to
avoid an overrun condition, but gives him some protection if he can't avoid it.

"If you think the problem is bad now, just wait until we've solved it."

24 Primary research is a search for new knowledge independently from what has been learned
in the past. Secondary research is a search for catalogued or distributed knowledge, such as
a library search, Internet search, or a survey.
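A sharing arrangement like the one above can be sketched as a simple share line. The 75/25 split below is a hypothetical ratio of my own; the text does not specify one:

```python
# Sketch of a cost-overrun sharing arrangement. The split ratio and the
# dollar figures are hypothetical, for illustration only.
def overrun_shares(target_cost, actual_cost, sponsor_share=0.75):
    """Split any overrun between sponsor and performer per the share ratio."""
    overrun = max(0.0, actual_cost - target_cost)
    return sponsor_share * overrun, (1 - sponsor_share) * overrun

sponsor, performer = overrun_shares(1_000_000, 1_200_000)
print(sponsor, performer)  # 150000.0 50000.0
```

The performer's 25% exposure preserves his incentive to control costs, while the sponsor's 75% keeps a bad outcome from becoming ruinous.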
Incentives: An incentive is a conditional payment or inducement offered to the performer in
the hope that he will work more efficiently than usual. A common use of incentives is to get
a project completed quickly. In an earthquake in California some years ago, there was severe
damage to a major freeway. The result was huge traffic jams. A construction company was
offered an unusual cash incentive if the work could be completed in only three months. The
construction company offered to pass most of the incentive money down to its employees if
the very tight schedule was met. It was. The incentive eliminated an obvious risk driver.
(Naturally, there was much criticism of the contractor's "obscene" profits.)

"It's easy to tell when you've got a bargain: it doesn't fit."

Penalties: A penalty is a conditional reduction in the contract price, imposed if the
performer fails to meet certain conditions. Probably the most common use of penalties is to
encourage the performer not to finish later than a critical need date. But there are other
uses as well. For example, penalties can be used to mitigate the possibility of poor
quality, unfair labor practices, and a host of other perceived ills.
Redundancy: Redundancy is the use of multiple means to ensure a critical outcome. The idea
is that if one or more means fails, there will be backup means to prevent a complete
failure. Many types of redundancy have been used in projects. Here are some examples:

o Awarding study contracts to three potential suppliers of a new, high-tech product, to see
which one can come up with the best product design
o Installing a backup electrical generator in a hospital in case local electrical utility
service goes down
o Using two hard drives in a critical computer, in case one fails.
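The hard-drive example illustrates the arithmetic behind redundancy: if the backups fail independently, their failure probabilities multiply. The failure rate below is hypothetical:

```python
# Why redundancy works: with independent means, a complete failure requires
# every means to fail, so the probabilities multiply. Rates are hypothetical.
def all_fail_probability(p_fail, copies):
    """Probability that all redundant means fail, assuming independence."""
    return p_fail ** copies

print(all_fail_probability(0.05, 1))           # 0.05   -- one hard drive
print(round(all_fail_probability(0.05, 2), 4))  # 0.0025 -- two drives, both must fail
```

The "assuming independence" caveat matters: two drives on the same failing power supply are not independent, and the real risk reduction is smaller.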
Cost reductions: Suppose that in the baseline design of a product there is a part called a
gizmo that costs an estimated $100. But no gizmo has ever been built, so there is some
uncertainty in the cost. Assume we believe that the gizmo might cost as much as $110, i.e.,
we believe the risk is $10. Now, suppose that an enterprising engineer says he can duplicate
the function of the gizmo with a part called the frammis that is already in production, and
that costs $100. If we decide to use the frammis and not the gizmo, we have mitigated the
$10 risk, provided that we have not inadvertently introduced new risks that exceed $10.
Cost reduction tools, such as target costing and value engineering, can be powerful risk
mitigation tools. But we must be certain that the lower cost doesn't introduce new risks
that offset the benefits. All too often, what looks like a clever cost reduction idea turns
out to be high risk.
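The gizmo-versus-frammis trade can be written out as a worst-case cost comparison, treating "risk" as the possible cost growth above the estimate:

```python
# Sketch of the gizmo-vs-frammis trade from the text, comparing worst-case
# cost exposure. "risk" here means possible growth above the estimate.
gizmo = {"estimate": 100, "risk": 10}    # never built: could run to $110
frammis = {"estimate": 100, "risk": 0}   # already in production: cost is known

def worst_case(part):
    """Worst-case cost exposure: estimate plus possible growth."""
    return part["estimate"] + part["risk"]

risk_mitigated = worst_case(gizmo) - worst_case(frammis)
print(risk_mitigated)  # 10 -- valid only if the frammis introduces no new risks
```

If switching parts quietly adds, say, an integration risk worth more than $10, the comparison flips and the "cost reduction" is a net loss.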
Education: Education is a quick and inexpensive way for a project team to acquire needed
knowledge that already exists. Education can mitigate ignorance. Suppose, for example, that
a factory wants to start using programmable logic controllers to automate certain functions,
but the factory engineers have never worked with these devices. A vendor of training offers
a well-regarded five-day course in the subject, and will put a subject matter expert in the
factory for a month, at a reasonable price, to aid the transition. Risk removed, but at some
cost.
Avoidance: Avoidance means backing away from project goals that have marginal value and high
cost or risk. The typical situation is that the project performer finds that to accomplish a
certain goal, costs will exceed the budget. The performer appeals to the sponsor to remove
the goal. The sponsor agrees that the goal can be dispensed with, in the interests of
containing costs. Sponsors may not always agree with an avoidance appeal, especially in
fixed price work where the contractor "bought in" by bidding too low. This is a major risk
for contractors who like to bid low in the hope of making it up later with directed change
work.
Helicopters: A significant risk in many projects is that a poorly informed or alpha male
(alpha female?) sponsor will decide to cut something vital in a project proposal just to
demonstrate control. This can make the likelihood of failure significant. A cynical but
often useful way to fend this off was (so the story goes) developed by a Hollywood
businessman in the public relations business. He would add helicopter rental to his
proposals at a price of $125,000, claiming that a helicopter was needed to take aerial
photos to add visual interest. His sponsors would see this obvious, marginal and expensive
item and cut it, leaving everything else intact. People creating reports and presentations
often follow a similar strategy. They will put in something wrong or even mildly outrageous,
expecting management to cut it, and usually that happens. Their hope is that the message
they really wanted to get across remains intact. The helicopter strategy may not work with
hawk-eyed sponsors. And, so I have found, there is always the possibility that the
helicopter will fly, but something you really need will get cut. A good axiom to remember:
have the strongest justification for the things that are most important, and only minimal
justification for items you can do without. And if possible, maintain full flexibility in
allocation of resources, in case your helicopter flies but something important does not.

"If you think education is expensive, try ignorance."
I offer a final comment about risk mitigation. Sometimes a risk mitigation effort will
find that the risk identification activity has inadvertently focused on a symptom and
not on the true root cause. There is then a choice. Either the mitigation effort can
simply try to alleviate the symptom, or it can try to cure the root cause. In medicine,
relief of symptoms is often a worthy goal, especially when symptoms are painful
and/or not much can be done about the root cause. But on projects, the best thing
usually is to repeat the risk analysis in light of the now known root cause, and to
again assess the need for and style of mitigation.
2.6 Tracking
Tracking pertains to the acquisition, compilation, and reporting of data relating to the
status of identified risk drivers. One purpose of tracking is to keep the team informed
about progress in identification, analysis, and mitigation of risks. Another purpose is to
trigger needed actions.
Project teams will vary considerably in what they consider to be information worth tracking,
and also in the medium used for tracking. Additional discussion of these subjects can be
found in chapter 4. Meanwhile, here is a suggestive list of commonly tracked information
about risk drivers:
- ID number
- Date of first entry
- Name of responsible person
- Text description of root cause
- List of tasks potentially impacted
  o Cost impact (each task)
  o Duration impact (each task)
  o Type 1 performance impact (each task)
  o Hammock? (Yes/No)
- Text description of mitigation plan
- Current status of mitigation actions (updated regularly)
- Lessons learned
- Remarks
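The list of tracked items can be rendered as a simple record structure. The field names, types, and sample values below are my own rendering of the list, not a prescribed schema:

```python
# Sketch of one tracked risk-driver record, mirroring the fields listed above.
# Names, types, and sample values are illustrative only.
from dataclasses import dataclass, field

@dataclass
class TaskImpact:
    task: str
    cost_impact: float = 0.0         # dollars
    duration_impact: float = 0.0     # weeks
    type1_performance_impact: str = ""
    hammock: bool = False            # Yes/No in the tracked list

@dataclass
class RiskDriver:
    id_number: int
    date_of_first_entry: str
    responsible_person: str
    root_cause: str
    impacted_tasks: list = field(default_factory=list)
    mitigation_plan: str = ""
    mitigation_status: str = ""      # updated regularly
    lessons_learned: str = ""
    remarks: str = ""

# Hypothetical entry:
r = RiskDriver(17, "2004-06-01", "J. Smith", "Vendor part may go out of production")
r.impacted_tasks.append(TaskImpact("Fabricate controller", cost_impact=12000.0))
```

Whatever the tracking medium, keeping one record per risk driver in a fixed shape like this makes regular status updates and reporting straightforward.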
2.7 Controlling
The initial decisions concerning what to do about a risk driver are made in the Planning
task, described in section 2.5 above. Subsequent decisions are made in this task. The
controlling task involves reviewing status information, deciding how to proceed, and
executing the decisions.
The task may involve some re-analysis of the risk driver, if its status appears to have
changed. Changes happen for several reasons, including new knowledge, mitigation
actions, and simply being overtaken by events.
Decisions about new actions are generally made using the same processes as the
original decisions.
2.8 Communicating

Without effective communication, all of the risk management activities described above will
be ineffective. Communication about risk drivers may be more difficult than most other
project communications because of the nature of the subject matter. People may not be
accustomed to talking about likelihood, especially when it is quantified as probability.
Assigning probability values strikes some people as a totally foreign activity. They may be
quite comfortable with expressions like "Pretty sure this will happen, but not positive,"
but translating that to 90% probability is a big step for them.

"The information you have is not what you want. The information you want is not what you
need. The information you need is not what you can obtain. ..."
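One way to ease that translation is for the team to agree in advance on a word-to-number scale. The phrases and values in this sketch are illustrative choices of mine, not a standard from the text:

```python
# Sketch of a team-agreed calibration scale mapping likelihood phrases to
# probabilities. Phrases and values are illustrative, not a standard.
LIKELIHOOD_SCALE = {
    "almost certain": 0.95,
    "pretty sure, but not positive": 0.90,
    "probable": 0.70,
    "even chance": 0.50,
    "unlikely": 0.20,
    "remote": 0.05,
}

def to_probability(phrase):
    """Return the agreed probability for a phrase, or None if unrecognized."""
    return LIKELIHOOD_SCALE.get(phrase.lower())

print(to_probability("Pretty sure, but not positive"))  # 0.9
```

The particular numbers matter less than the agreement itself: once the team shares a scale, "pretty sure" means the same thing to everyone.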
Another sometimes-uncomfortable topic is negative consequences. Most people are optimistic
and cheerful most of the time. Talking about what could go wrong, and especially estimating
what could go wrong in dollars of cost and weeks of schedule, seems somewhat defeatist.
A principal reason that projects get into trouble is that risks are ignored. One reason
they are ignored is that the people who discover them are reluctant to talk about
them. The only way out of this death spiral is to freely communicate risk information,
and the actions to be taken to mitigate risks.
There are numerous modes of communication of risk information. They range from simple
peer-to-peer discussions between team members to formal management reports. The modes of
communication should not be left to chance. They should be set forth in the project's risk
management plan. For more, see Chapter 4.
Chapter 2 review questions

In a recent editorial in a newspaper based in an urban county in the western United States,
the writer discussed some problems encountered by the county board of supervisors. They were
asked to approve what appeared to be a small construction project for a maintenance
building. At first it seemed to be routine business. But on a closer look, the request,
which came from the county's facilities management group, was to add money to a project that
was already in an overrun condition.

Several months before, the board had approved the project with authorized expenditures not
to exceed $600,000. The lowest bid was over $800,000. Later there were discussions about
whether all bidders got the same information, so new bids were sought. Ultimately, the price
tag was a bit over $1 million. The supervisors noted with chagrin that this kind of thing
was a common occurrence. One supervisor commented, "We're obviously not doing something
right in Facilities Management."

To further raise their ire, the supervisors had just received a grand jury report on the
recently opened new county hospital. It turned out that there was a considerable list of
post-occupancy projects, such as an unfinished ultrasound room, faulty flooring, etc.
According to the grand jury, one reason for the problems was a lack of communication and
coordination between the hospital and the county's Facilities Management shop. There were
further revelations by the grand jury. The hospital's price tag, on opening, was about $188
million. But there were contract conflicts, and the contractor left with warranty issues
still unresolved. As a result, the supervisors had to add an additional $5 million to pay
new contractors to finish the work in a timely manner.
Using what you have learned in chapter 1 and in this chapter, plus
your common sense, answer questions 1 through 10.
1. How do you think the county supervisors could improve their communications
with the Facility Management project teams?
2. Obviously, there are some problems in the Facility Management shop. If you
were hired by the county supervisors as a consultant to find the problems and
recommend fixes, what would be your plan of attack?
3. Do you think the project teams in the Facility Management shop are sufficiently
risk aware? How do you think they might become significantly more risk aware?
4. Do you think it likely that the Facility Management shop might not really have
project teams, in the sense that term is used in this book? (See the Preface.)
5. Same as question 3, but for the county supervisors?
6. Why do you think that all contractors did not get the same bid information on the
maintenance building? Can you think of a process for being sure that they all
did?
7. Why do you think the lowest bid on the maintenance building considerably
exceeded what the supervisors had authorized?
8. The maintenance building was for the use of the Facility Management group. They were both
the customer and the performer of this project. Evidently, there was "scope creep," a common
name for project goals that continue to grow after a baseline plan is created. This is a
special kind of goal instability. Is scope creep likely to be a problem when the project
team is building a product for its own use? In the bigger picture, is scope creep ever
harmful? Can it be beneficial? When? If it can be harmful, can you think of ways to
mitigate it?
9. Contract conflicts and warranty issues are rampant in the construction industry,
especially in public projects. Can you think of ways to minimize the possibility
that a construction contractor will walk off the job, leaving unfinished work?
10. How could you be sure that the hospital staff understood its needs fully and
communicated them fully to the Facility Management group?
Chapter 3: Advanced Project Risk Management (APRM)

3.1 Introduction

APRM is CPRM plus. The main differences typically are in training, flexibility, attitude,
loyalty, morale, functioning as a team, overcoming problems, and leadership. Reasonable
extra costs of CPRM can usually be justified on the basis of loss prevention. The same is
not necessarily true of APRM. APRM may be justified only when project success is vitally
important. That's usually when losses can be very damaging, such as potential injury to
persons or loss of life, possibly crippling financial losses, or significant loss of
business momentum to the competition.

"If anything simply cannot go wrong, it will anyway."

Like CPRM, APRM is scalable. For the riskiest projects, you might feel justified in doing
full APRM, but for less risky projects you might only feel justified in doing some APRM. It
is difficult to tell a project team when they should use APRM, and how much to use. It's
mainly a matter of risk tolerance versus the extra costs. Those kinds of judgments tend to
be very personal, and often are based on previous bad experiences. That is not necessarily
a bad thing.
APRM goes beyond identifying, analyzing, and attempting to mitigate risk drivers. It will do
some of that, of course, just for the sake of creating a reduced structural risk
environment. An analogy might be that if you are going to swim across a raging,
flood-swollen tropical river, first shoot all of the crocodiles you can see. After that, be
sure you have the necessary swimming skills and endurance. But what if you missed a croc or
two? Do you have a contingency plan that has a good chance of working? And if that fails,
what then?

"Only devising a defense against failure of the contingency plan can insure ..."

APRM starts with the assumption that the project's positive goals have been pared down to an
irreducible minimum, and that its negative goals (constraints) have been made as loose as
possible. To do otherwise in a must-not-fail project would be irrational. When a project's
success is critical, it doesn't make sense to burden the project team with nice-to-have but
not really necessary objectives. Nor should the team be shorted on vital resources if they
are by any ethical means affordable and obtainable. Also, all usual and customary
constraints that can be waived should be. A critical project should be allowed to operate
with a minimum of bureaucratic red tape.

Given that the goals, positive and negative, have been carefully massaged, APRM can usefully
be based on the iron triangle shown in Exhibit 22:
Exhibit 22: The Iron Triangle of APRM
[Figure: a triangle whose three corners are labeled Rapid Re-planning, Robust Team, and
Robust Plan]

"The buddy system is vital to your survival. It gives the enemy somebody else to shoot at."
In this chapter, I will explore each corner of the iron triangle of APRM separately, and
conclude the chapter by examining the functioning of the integrated structure. The gist of
what I will have to say should already be obvious. You begin with the strongest team you can
create, and the best plan you can devise given the pared down positive goals and loosened
constraints. Then, when things go wrong, you so arrange affairs that you can very quickly
devise, communicate, and implement an effective new plan that will work in the new
environment. If you already know how to do that, you don't need to read the rest of this
chapter.
3.2 Robust team

What does it mean for a team to be robust? Does it mean that every morning they eat their
vitamin and mineral fortified cereal, and do ten laps around the block carrying 50 pound
weights? No. Being physically fit can be a component of robustness in certain projects, but
the following elements of robustness are usually more important:

- Competence
- Dedication
- Tenacity

Let's explore these now.
3.2.1 Competence.

The usual image invoked by the word competence is knowing how to do the job. In APRM, we
might go beyond that simple image. Let's explore just how far we might go.

- Expert status. Most team members are expert and current in at least one discipline
necessary to the success of the project. Those who are not must be mentored by a team member
who is.
- Multi-disciplined. Most team members are significantly competent in more than one
discipline. This parallels the notion, once popular in management theory, of the "T-shaped
man."25 The T-shaped man was supposed to have great strength in at least one discipline, but
to also have broad knowledge of other disciplines. The multi-disciplinary approach is
central to the way crews are trained in the U.S. Navy's submarine fleet. Every crewmember,
of whatever grade or rank, must have a key skill, and if that skill is weak, an expert
mentors him. In addition, he must have broadly based knowledge, but not necessarily
expertise, in all of the key systems of the ship and their operation. An electronics
technician must have some knowledge of the torpedo tubes and how they work. A quartermaster
must understand how the ship's buoyancy system functions, and so on. This mode of operation
enables submarines to better survive in a hostile environment. Because they operate in a
stressful environment, submariners get extra pay, which is a good idea for all project teams
that operate in APRM mode.
- Know what they know and what they don't know. The idea here is that team members live
examined lives, following Socrates' dictum that the unexamined life is not worth living.
They have looked within themselves and they know their capabilities and their limitations.
They are realists. They are immune to hubris, yet they approach their work with confidence.
They are balanced and wise. Even if young and energetic, they are mature yet creative in
their thinking, more so than the average adult. Typically they are well read and have wide
ranging interests.
- Know what their teammates can do. In APRM it's crucial that team members understand the
capabilities of their teammates. On a football team, the coach is expected to have a firm
grasp of the capabilities of each team member, and to use each team member to best effect.
But APRM goes further: each team member must understand what the other team members can do,
and what they can't. In critical projects, the coach can't be everywhere. When a problem
arises that a team member doesn't know how to handle, he or she should be able to quickly
find the needed expertise. This short-circuits the delay of passing information through the
coach that he doesn't need to have.
- Fast execution. Have you ever watched a professional baseball team do its warm-ups? It's a
bit like watching a ballet. The motions have a sort of fluid quality as the ball goes from
player to player. With seemingly no effort, the players move the ball around with blazing
speed. Fast execution can also be important in some projects. These are likely to be the
same projects where APRM will be employed. They are the projects where schedule is critical,
where waiting a day for a memo won't get the job done, where every day that passes makes
life more unpleasant for some of the stakeholders.

25 That was before our current era of political correctness. Today's expression would
probably be "T-shaped person."
3.2.2 Dedication.

Another name for dedication is team spirit. At its center is the notion that team members
think the project is important to them personally. They are therefore willing to give it a
high priority in their lives, and to work enthusiastically to achieve its ends. Dedicated
team members don't just show up for work and do the minimum expected. They have a degree of
emotional involvement. They have fervor. They struggle to achieve the best possible outcome
for the team, believing that outcome will also be best for them.

Dedicated team members do not build silos. Silos are artificial organizational barriers that
obstruct the full and open flow of information around and across the team. Creation of silos
is the beginning phase of future turf wars based on "ownership" of certain functions, and
creation of walls around the information they contain. Silos prevent free and open
communication between team members. They create bureaucratic barriers to getting things
done. Silos inhibit fast execution while people wait for the paperwork to get done.
3.2.3 Tenacity.

Competence and dedication may get you an assignment on an APRM-oriented project team, but it
is tenacity that will keep you there. The APRM environment can be stressful. For example, in
the world of software development, getting the product to market quickly is sometimes deemed
to be the only success factor. These rapid development projects are sometimes called death
marches. It is famously said that you can tell it's a death-march project by the number of
empty pizza boxes you can see in the project offices at two o'clock in the morning. Team
members must be tenacious to survive in this environment. Of course, being well rewarded
tends to make them more tenacious.

As I have already noted, APRM is scalable. Not every APRM project should be a death march.
Most don't need to be. But every APRM project organizes itself to have the same or higher
probability of success in an unusually difficult or critical project than a typical team has
in a typical project. Adoption of APRM implies a need and a willingness to go the extra
mile.
3.3 Robust plan

No matter how robust the team itself may be, if it has a poor plan for accomplishing the
project goals, it is more likely to fail than if it has a better plan.26 Most projects
reckon failure in terms of excess cost, excess duration, or both. Some measure failure in
terms of inability to stay within certain constraints, such as environmental pollution rules
or required norms for doing business. Others measure failure in terms of possibly succumbing
to type 1 and type 2 or 3 performance risks (recall that these were discussed in section
2.4.5).

Three questions arise:

- How robust is your plan?
- Is your plan robust enough?
- Should you try to make your plan more robust?

Let's take these questions one at a time.

26 This sounds self-evident, but I have found that many teams give very little thought to
bettering their project plan.
3.3.1 How robust is your plan?

The logical way to measure robustness is probability of success. But probability is
quantitative (see chapter 5), and not all teams are equipped to deal rationally with
probabilities, for various reasons. In that case, relative likelihood expressed in terms
such as minimally, moderately, or highly likely must suffice.

Regardless of the way you measure the probability of success, you can't effectively measure
it unless the criteria for success are well understood, not just by the leader, but by the
team as a whole. A prerequisite for knowing how robust your plan is, is to know clearly and
concisely, team-wide, what must be accomplished.

To determine likelihood of success, we need to go through processes such as those described
in chapter 2 (CPRM). More sophisticated project teams might employ the statistical processes
described in chapter 5. Later in this chapter, I describe another process I call scenarios
that can also be effectively used in critical projects. The scenario process is a technique
in which one starts with a baseline plan, then iteratively improves it by developing
contingency plans for weak areas in the project plan. This is not quite the same as looking
for ways to mitigate risk drivers. It is more like preparing alternative plans in advance to
work around risk impacts that were not mitigated. It works best when the project goals are
not too complicated.

Additional studies of the plan typically increase robustness. Oddly, some teams assume the
outcomes of studies and avoid the work of doing them. Here is an important rule that
minimizes the possibility of self-deception about what additional studies might reveal:

Analysis and measurement of robustness of the plan must be based on the current
understanding of conditions, not on assumed conditions that may or may not be revealed after
additional studies or analysis are done.

I mention this because I have seen this error made more than once. The way to find out what
more studies will reveal is to do them.
3.3.2 Is your plan robust enough?

This is of course a judgment call. But you can't in all honesty make it unless you have done
at least the due diligence described in the previous section, namely, estimate the
likelihood of success, qualitatively if not quantitatively.

Then you should do more. Even if you think the probability of success is reasonably high,
achieving success could have serious side effects. This is because critical projects tend to
focus on limited objectives and have relatively few and loose constraints. They may pay
little attention to unintended consequences. So consideration of the possibility of doing
damage is a good idea. If you can identify possible damage, you might be able to include in
your plan a remedy for it.
3.3.3 Should you try to make your plan more robust?

Again, this is a judgment call. Can you afford to spend more time and resources on improving
the plan? If you can, and you think improvement is possible, you generally should. An
exception could be if your last attempt resulted in negligible improvement. Unfortunately,
critical projects often need to get underway quickly, or else the chances of success
dwindle. Additional time for planning often is simply not available.
3.4 Rapid re-planning

It is a rare project that is completed entirely in accordance with the initial baseline
plan. In my experience, initial plans are generally overtaken by events within a few days to
a few weeks after the project is rolling. Delays happen. Artifacts and people that were
expected to be available are not. Ideas that were expected to work turn out to be
impractical. Workaround plans are needed to keep the project on track. In some projects, the
team can take days to create and communicate a workaround plan. But in other projects, and
especially in critical projects, the workaround plan must be generated and communicated in
near real time, perhaps in the same day, and sometimes within minutes. This argues for
creation in advance of workaround plans likely to be needed, also known as contingency
plans. Canned contingency plans can be executed much more rapidly than plans that have to be
developed on the fly.
3.4.1 Plan awareness.

To have any hope of achieving rapid re-planning, it is vital that all team members be fully
aware of the current plan and their role in it. If they are not, news of the new plan is
likely to result in confusion. Therefore there must at the outset be a process for making
team members aware of the plan and their role, and this process must be sufficiently robust
that changes don't fall through a crack.

Plan awareness is more than just being aware of what is going on around you. It is also
being aware of the other players in the project, especially the ones who owe you input (your
"suppliers"), and also the ones to whom you owe output (your "customers"). Are your
suppliers holding to the plan? Will you be able to hold to the plan given the situation of
your suppliers? Can you rapidly switch suppliers if the planned supplier defaults for any
reason?

Plan awareness requires demolition of organizational barriers to communication. The project
must be regarded as a mission whose completion is vital to the entire team. The team
attitude must be "It's the mission, stupid!"

Plan awareness requires full knowledge and understanding of changes in the plan. This
further requires that changes be communicated crisply, much as a drill sergeant communicates
changes in direction to a marching body of troops. Pursuing that analogy a bit further, we
have all seen movies (or perhaps have personally experienced the situation) where green
troops fall all over themselves when being drilled, because they are unpracticed in
understanding the meaning of the commands and responding to them. A few weeks later, after
repeated drills, the same troops implement the same commands crisply, with precision and
pride. There is a message here for APRM. If the project team is to implement changes
crisply:

- The changes must be communicated clearly in a command language that the team understands
- The team must have experience in responding to changes ordered in the command language, so
that they respond virtually by reflex action.
The typical project team is not a body of troops, and the small number of commands that a
drill sergeant needs to move his troops where he wants them is not sufficient to direct
changes in most project teams. In a very complex project, especially a high technology
project, the development and communication of changes in the baseline design might require
hours of meetings, poring over drawings or computer printouts, scribbling on a white board,
and consultation with subject matter experts. In this context, how can there be anything
resembling a crisp command language?

Having attended many such meetings, having pored over drawings and computer printouts,
scribbled on white boards, and consulted with subject matter experts in stressful
environments, I can answer that question in this manner: when all is said and done and a
consensus is reached as to how the plan shall change, the project manager (or if necessary
his high level designee), using a form (or process) created for this explicit purpose,
records the changes to be made in the plan.27 This record should indicate or provide for, as
a minimum:

- Which existing tasks are affected
- Any new tasks created
- Changes in budget for affected tasks
- Changes in planned duration of affected tasks
- Changes in the length of the critical path
- A description of product design changes with respect to the previous baseline
- Immediate issue of the document to the entire team, or at least to the team leads, using the team's most reliable communications channel, or better yet redundant channels
- Response from the team that acknowledges both receipt and immediate compliance (compliance problems, if any, must be cited in the response)
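On projects that track these notices electronically, such a record might be captured as a simple structured object. The following is only a sketch; the field names are my own invention, not the form shown in chapter 4, section 4.3.8:

```python
from dataclasses import dataclass, field

@dataclass
class ChangeNotice:
    """Illustrative baseline change notice (all field names are hypothetical)."""
    affected_tasks: list        # existing tasks affected
    new_tasks: list             # any new tasks created
    budget_changes: dict        # task id -> budget delta (dollars)
    duration_changes: dict      # task id -> duration delta (days)
    critical_path_delta: float  # change in critical path length (days)
    design_change_summary: str  # product design changes vs. previous baseline
    acknowledged_by: list = field(default_factory=list)

    def acknowledge(self, lead: str, problem: str = "") -> None:
        # Each team lead confirms receipt; compliance problems, if any, are cited.
        self.acknowledged_by.append((lead, problem))

notice = ChangeNotice(
    affected_tasks=["T-102"], new_tasks=["T-250"],
    budget_changes={"T-102": 12_000}, duration_changes={"T-102": 5},
    critical_path_delta=5, design_change_summary="Rev B enclosure",
)
notice.acknowledge("structures lead")
```

The point of the structure is the same as the paper form: one concise, unambiguous record per change, with an explicit acknowledgment trail.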

Here is my rationale for these requirements:

- Why is this a responsibility of the project manager?
  o The process ensures that the project manager understands and has sorted out the full impact of the change, avoiding possible confusion at the top level of the project when pieces of the change come up from below as possibly conflicting recommendations. To fill out the form, the project manager must personally explore the logic of the change and all certain and likely consequences so he can communicate them clearly to the team.
  o Universal and immediate acceptance by the team is likely when the changes are issued over the signature of the project manager. The form is a command language that the team has been trained to understand and respond to quickly.
- Why not use standard change board procedures?
  o There is nothing inherently wrong with such procedures, but they may be too slow for critical projects.
  o Typically, ordinary change board procedures are oriented more toward the technical consequences of a proposed change; they may not adequately address cost and schedule impacts.
- What are some other advantages of this process?
  o Fast and reliable communication of changes to the team
  o Assurance to project management that compliance is happening
  o Crisp execution of changes of baseline
- Should every change, even minor ones, use this process?
  o Not necessarily. If the change can be made within the scope of a team lead's current schedule and budget, and the lead is reasonably sure that his action will affect only his area of responsibility, the change could be made without a formal change notice issued by the project manager.

27 A generic sample form for this purpose is shown in chapter 4, section 4.3.8. The form can be tailored to the needs of the project. The basic idea is to use the form as a command language that is concise and clear.

There is another dimension to plan awareness that I have not yet mentioned. It
is status awareness. You may fully understand your goals and how you expect
to get there, but you dont have complete information unless you also know how
much progress you have made versus what progress you should have made,
and whether or not you are in danger of violating constraints.

If a project team has used half the allotted duration of the project but is deficient
in positive accomplishments, it must have the means to know that. If a project
team has used half the available budget, but has not completed half the work, it
should know that too. Projects, especially critical and fast moving ones, must
have an efficient system of measurements to keep the team continuously aware
of progress toward positive goals, and danger of violating constraints. This is the
issue of project metrics, about which much has been said in recent years.

Metrics are measurements, continued at regular intervals over time, that provide
clues as to project health. The choice of what should be measured to determine
health should be made by the project team, because they best understand what
health means to their project. Given that, most projects need to measure, at a minimum, actual versus planned monetary costs, and technical (i.e., product) progress versus expectations.

Probably the most useful form of cost metric is the cumulative planned versus
actual cost graphic. It is illustrated in Exhibit 23. It immediately and clearly
shows the situation.
Exhibit 23 - Example of Cumulative Planned vs. Actual Cost Chart

[Figure: cumulative cost plotted against elapsed time, showing the planned cumulative cost curve, the actual cumulative cost curve, the total budget line, and a "Time = Now" marker.]

Many projects not using APRM use such charts, but often in those projects there
is a lag of a month or more before the chart is updated. In APRM, the charts
may need to be updated weekly, or possibly even more often. Updates must not
be just at the total project level, but must also provide lower level information for
project leads. Updates must include changes in budgets due to baseline change
notices and other causes.

Most projects use legally mandated but cumbersome time card systems to collect
labor costs, and similar systems to collect material and other costs. These may
supply sufficiently rapid feedback for most projects, but for APRM projects it may
be necessary to devise faster systems. Fortunately, in this age of widely linked
computers, that is usually not hard to do.
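As a minimal illustration of the cumulative planned-versus-actual comparison that underlies such charts (the weekly figures here are invented):

```python
from itertools import accumulate

# Hypothetical weekly costs in $1,000s; the numbers are for illustration only.
planned = [10, 12, 15, 15, 18]
actual  = [11, 14, 15, 17, 20]

cum_planned = list(accumulate(planned))  # running total of the plan
cum_actual  = list(accumulate(actual))   # running total of actuals

# Cost variance at "Time = Now" (the latest week); negative means overrun.
variance = cum_planned[-1] - cum_actual[-1]
print(cum_planned[-1], cum_actual[-1], variance)  # 70 77 -7
```

In APRM, the same computation would be refreshed weekly or more often, and broken out by task so that team leads see their own variances, not just the project total.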

Measurement of technical progress over time is not quite as straightforward. Often some creative thought is needed to decide just which minimal set of parameters needs to be measured. Frequent considerations are ease of getting the data, and the frequency with which it should be updated. Here are some metrics I have seen used in projects. Some of these must be estimated by indirect calculation early in the project, as opposed to being directly measured later. All of these metrics are typically plotted over time and compared to the current target value.

Weight
Maximum speed
Accuracy
Radiated power
Thrust
Resolution
Range
Efficiency
Customer acceptance
Diversity
Drawings released
Cubic yards poured
Units produced
Lines of code written and tested

3.4.2 Minimal plan complexity.

This subject has been mentioned before, but due to its importance, it is worth a bit more discussion. Giving a project the APRM treatment is difficult enough, but becomes ever more difficult as the complexity of the goals and plan increases. To assure the success of a critical project, it is needful that the goals, and the plan to achieve them, be as simple as possible.28 The addition of nice-to-
have but unnecessary product features should be avoided. Likewise,
unnecessary or overly tight constraints should be removed wherever possible. A
few examples should suffice to illustrate this. They all come from actual projects.

- The sponsor urgently needed the project to be completed, but constrained the funding to that which could be set aside from monthly cash flow. This was a somewhat volatile amount, and was never known until about two weeks after the end of each month. The project team had to live with a variable funding constraint that changed every month. The team overran both the schedule and the total budget, largely because of work inefficiencies. The cost of borrowing all of the needed money would have been much less than the costs of the overruns. It was likely that the sponsor had sufficiently good credit to borrow the money.

28 They should follow the well-known KISS principle: Keep It Simple, Stupid.
- The sponsor asked for software that would estimate the costs of certain types of sophisticated hardware, given certain information about the products. The schedule was tight and the funding was minimal. But he also prescribed that the mathematical algorithms be developed in accordance with a new and untested theory that had been developed by a couple of university professors, and he also wanted the software to be linked with other existing software in difficult and complex ways. The latter two very difficult requirements were in the category of nice-to-have, but did not contribute significantly to the utility of the product. Addition of the extra requirements severely limited what could be accomplished on the primary task.
- Severe security requirements designed to limit or eliminate the leakage of information from the project can add significantly to project cost, and can slow down the work considerably.29 This is not necessarily a bad thing, but it can become a bad thing when the inefficiencies are not recognized in creating the schedule and the budget. This is most likely to happen when:
  o The project is urgent
  o The sponsor normally does not have high security requirements
  o The project team is not used to working in a high security environment.

3.4.3 Centralized vs. decentralized planning

Project planning has two basic opposing styles, but there can be useful mixtures
of them. One style is centralized, top-down planning. The other is decentralized,
bottom-up planning.

The demise of the Soviet state should give sufficient warning of the dangers of
sole reliance on top-down planning, as applied to an entire nation. I would
venture that it is at least equally risky for a project manager, perhaps assisted by
a small planning group, to plan an entire project and then insist on running it
according to that plan. The obvious problem is that few small groups, however
bright and experienced, can successfully plan a multi-disciplinary project. They
just don't have the necessary visibility into the difficulties of what has to be done.
This is especially true in the kinds of projects where APRM is likely to be applied.

29 Added cost and duration of 30% or more have been observed.
There can be good reason for some early top-down planning in slower paced,
lower risk projects, for purposes of establishing rough time spans and order-of-
magnitude budgets. But those good reasons often dry up when considering
critical projects. The early top-level work should focus on selecting reasonable
positive goals and picking the right leader and core team members. The core
team should be allowed to select additional team members, determine (insofar as
possible) the monetary and other resources they will need, and (again, insofar as
possible) determine their own schedule. Moreover, both schedules and budgets
should have some elasticity, wherever that is possible.

3.5 Putting it all together

This section will focus on three critical activities that may distinguish upscale APRM projects, at least as I view them, from ordinary projects. They are:

- Organizing the team
- Scenarios
- Simulation

All projects organize teams, unless they happen to already be in place, but there
are some special considerations in organizing a team for the APRM environment.
What I call scenarios is a special kind of response to risk impacts that is more
robust than that commonly employed under CPRM. Simulation is a form of
project dry run practice that better prepares teams for the real thing.

3.5.1 Organizing the team

Selection of a qualified project manager is important in any project, but is even more important if the project is to be organized along APRM lines. An APRM
project manager must have all of the usual book learning concerning
management of schedules and budgets, and he or she must also have a deep
understanding of the project goals, and at least most of the technology required.
On top of that, he must have good political skills for two reasons. One is that he
must be able to attract and inspire competent, dedicated, and tenacious people
to work with him. He should be the kind of person who is a magnet for quality
people.30
30 I apologize to female readers for occasionally not using gender-neutral language. Unfortunately, the English language is currently missing a few words that would make it a lot easier to be gender-neutral without creating grotesque sentence structures. "Ms" helps when you don't know or care about marital status, but it is not the only new word we need. Can anyone suggest a new word that means "he or she"?
The other need for political skills arises from the fact that in most organizations,
APRM-type projects will be the exception rather than the rule. They will arouse
resentments because quality people are pulled away from other projects, other
resources may also be diverted away, and special perks may be given to team
members. Often, they will operate in an isolated location, perhaps beyond
locked doors, somewhat out of contact with most people in the organization. But
in spite of its special circumstances, the APRM team will generally still need
support from other groups, and will have to be able to negotiate to get it.

An APRM project manager should be experienced in running APRM-style projects, or at least should have worked under someone who is. One company
for which I consulted considers all competitively bid fixed price projects,
especially those involving software development, to be critical. The reason is
that on most of them in past years, they lost money. All project managers for
these projects must have successfully managed at least one other such project,
or have been the deputy manager. In addition, they have to attend a special
one-week company-developed course on project planning in the fixed price
environment. Moreover, a special corporate fixed price review committee must
review their initial baseline project plan. The project managers who meet all of
these requirements are a select group who are typically assigned to the toughest
projects.

Ideally, an APRM project team will be a team already in place and practiced in
APRM disciplines. If you form an ad hoc team, pulling people from throughout
the organization to perform a critical project, there is high risk of failure unless
you have time and money to train the team in the APRM disciplines. This is
especially true if many of the people owe first loyalty to some functional manager
outside the project who will write their performance reviews and determine their
compensation. The best situation is when an APRM team is more or less
dedicated to critical projects in general, has a low turnover rate, and carefully
selects and mentors newcomers.

3.5.2 Scenarios.

In CPRM, the focus is on identifying risk drivers and trying to figure out ways to mitigate them. Mitigation consists of reducing their impact on the project, their likelihood of occurrence, or both. Unfortunately, not every identified risk driver can be fully mitigated, in the sense that there will either be zero probability of occurrence, zero bad impact on occurrence, or both. Moreover, not every risk driver can be identified. Many projects still suffer some risk impacts in spite of careful CPRM.

APRM embraces CPRM as a subset of its activities, to eliminate as many risks as possible, but the real focus is on resiliency and fast correction when bad things happen. That's the importance of the iron triangle of APRM we have been discussing: robust teams, robust plans, and rapid re-planning. But there is yet another tool that APRM teams can bring to bear to better ensure success. I call it scenarios.

What is a scenario in the project context? It is an activity of the team. It uses an exercise somewhat akin to brainstorming. Here's how it works.

- In small projects, the entire team assembles in a single room. In larger projects, the team breaks into smaller, manageable groups, each of which does the scenario exercise separately. The exercises by various groups need not all be done at the same time.
- Each group appoints a facilitator to move the exercise forward and to keep records. The facilitator should have one or two assistants. The facilitator and his assistants should all be senior people because they will need to have a good understanding of the project.
- The facilitators bring to the meetings a common, pre-prepared list of known risk drivers that can have a serious effect on the project. Usually the list should include no more than the top ten to twenty. The risk drivers should have already been stated in terms of what are believed to be root causes. Possible impacts and the tasks they impact should also have been identified.31
- The facilitator announces a risk driver, reading it off from the pre-prepared list. He then reads off the most likely impact, describing it in words and (at least roughly) quantifying it. The facilitator next says, "This just happened. What should we do?"

31 It is especially important in APRM projects to get as close as possible to root causes rather than just symptoms, because it aids the team in quickly taking the most effective action if something goes wrong. It is generally better to be able to fix the root cause than to deal with a symptom.
- Group members proceed in brainstorm fashion to develop partial or complete contingency plans for responding to the problems.32 These are all recorded by the facilitators (tape recording is a good idea, so that the exercise is not slowed down by manual recording).
- As time permits, the facilitators work through as many risk drivers and risk driver outcomes as possible, recording proposed contingency plans. A single session should not last more than about four hours, with two or three breaks, to avoid mental fatigue.
- If coverage of the key risk drivers and their outcomes was sparse in the first session, subsequent sessions can be held to cover more situations.
- After the sessions are complete, the facilitators meet to review the possible contingency plans. Some suggestions will be infeasible, and others will be dubious. But there may be some that have a chance of working. These should be written up with brief descriptions of the outcome(s) that could activate them, the actions to be taken if they are activated, and who should take those actions. It may be that some actions should be made conditional on the occurrence of certain conditions or events.
- The facilitators recommend to the project manager that selected contingency plans be established as project policy. A training session is then held to brief all team members on these policies and how they will work. If any preparatory work needs to be done to lay foundations to make the contingency plans easier to carry out, those work assignments should be made as special project tasks, funded by the project manager.

3.5.3 Simulation.

The purpose of simulation is to increase confidence that contingency plans will be properly executed if needed. I believe effective simulations are of two kinds. One kind is games whose main purpose is to inject a spirit of teamwork into the team. The other is specific simulation of the activation of contingency plans. I will shortly give two examples of the latter kind. Then, I will close this chapter with a description of a teamwork game called the Paper Aircraft Game. It vividly demonstrates the power of teamwork, and the inefficiencies created by organizational silos.

3.5.3.1 Simulation of emergency procedures. Imagine a project to build a new oil refinery. As the refinery concept nears completion, the project plan calls for development and simulation of emergency procedures activated by a fire or explosion. The purposes of the simulation are to 1) test the procedures, especially the time it takes to carry them out and their vulnerability to human and other errors, and 2) determine what design changes, if any, in the refinery would make accidents more manageable.

32 Brainstorming rules apply. See section 4.3.4.

Part of the evaluation of the simulation will be an assessment of how much organic fire fighting equipment and training the refinery needs, and how much of the burden can be placed in the hands of nearby community fire-fighting resources under various conditions.

An event design group (EDG) independently designs a set of events to be used in the simulation. These remain confidential and unknown to the designated accident response team until they are announced at the time of the simulation. Because the refinery is only in the conceptual design stage, the simulation necessarily involves a certain amount of "let's pretend," but every effort is made
to make it as realistic as possible. For example, written emergency procedures
are put into play, and the cooperation of local firefighters is obtained. There may
be primitive dummy structures. There are simulated injured and dead, and they
are transported to a simulated triage facility. From time to time, the EDG injects
random but credible stressful events into the scenarios as challenges. The same
group also acts as observers and takes notes on outcomes and possible
improvements for later lessons learned meetings.

Readers might ask if the "let's pretend" couldn't all be done on paper, or in a computer. The answer is yes, but the computer may not be smart enough to know about certain human errors, the effects of fatigue, or other unexpected conflicts that will turn up in a more realistic simulation.

3.5.3.2 Simulation of the risk adjusted project plan. The simulation described in the previous section is essentially a drill, but this simulation is more
of a game, and can be a lot of fun, as well as instructive, for the project team. A
definite benefit is that it forces acute awareness of the project plan and its known
risks. This game assumes a completed baseline plan, and a completed risk
analysis, with risk drivers identified, including statistical characterization at least
at the level of chapter 2, if not at the level of chapter 5. The plan must include
the planned risk mitigation efforts as distinct tasks or subtasks. The best time to
do this type of simulation is just as the project is about to kick off.

The core members of the project team assemble in a room. Prominently displayed in the room are the project schedule in condensed network form, and task budgets.33 The project's scheduling and financial control people participate in the game by adjusting the schedule and tracking actual versus planned costs as the game progresses. They can do this most effectively if they use the same computer tools they will use in the actual project. (Thus the tools are also tested.) Using a computer and a projector to project the cost, schedule, and other key results on a screen is most helpful.

As in the previously described simulation, there is an EDG. Typically this group will comprise senior people who have a good understanding of the project plan,
and are mature enough to understand the risk drivers and their possible
consequences.

A time compression scale is agreed upon. For example, ten minutes of real time
might represent one month of project schedule time. With this scale, in a project
expected to last four years (48 months), the game would last 480 minutes (8
hours), plus time for breaks and timeouts to take care of discussions and
administrative details. It might take more than a single workday to play the game
for a long project.
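The time compression arithmetic is simple enough to sketch directly (the function name is mine):

```python
def game_minutes(project_months: int, minutes_per_month: int = 10) -> int:
    """Convert project schedule time to game (wall-clock) time under the
    agreed time compression scale, e.g. ten minutes per project month."""
    return project_months * minutes_per_month

# A 48-month project at ten game-minutes per month takes 480 minutes (8 hours),
# before adding breaks and timeouts.
print(game_minutes(48))       # 480
print(game_minutes(48) / 60)  # 8.0
```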

Each task in the schedule is assigned a task lead. Generally, this should be the
same person who is expected to lead the task in the project. The task lead
should be intimately familiar with all tasks under his or her purview. It is best that
he or she have a brief written description of each task, as opposed to trying to
commit the nature of each task to memory. In particular, each lead should know
the expected timing and nature of all deliverables that are to be produced by
the task(s) he or she leads. Deliverables include all products of each task. They
can be anything from a document such as a report to lines of code written to
hardware tested or built. For purposes of the game, evidence of completing the
deliverable shall be a simple handwritten note, as opposed to the deliverable
itself. The note is handed to the task customer, who can be either an internal or
an external customer. The EDG represents all external customers.

Before the game begins, the EDG should meet and go over the list of risk drivers
and their possible outcomes and associated probabilities so that they are
thoroughly familiar with things that might happen, when, and which tasks can be
affected. The EDG must be equipped with a computer, pocket calculator,
random number table, or other device for drawing random numbers. They will

33 "Condensed network form" means a form that is appropriate for a game. A network with 1,000 tasks would be unwieldy. On the order of 100 tasks or fewer is appropriate.
from time to time use this to determine if a task goes according to plan, or if there
is a risk driver impact that drives it off plan. When something happens to drive
the project off plan, the EDG announces the outcome to the team. The affected
team leader(s) must devise a response, and announce it. The EDG shall be the
sole judge of the effectiveness of the response.

The game begins by pretending to kick off all tasks that are scheduled to begin
at time zero. During the first month, which might be only ten minutes,
depending on the selected time compression ratio, the tasks are performed
according to plan. If any risk drivers can affect any ongoing task in the first
month, the EDG draws random numbers to determine which if any of them act,
and also their impact. They announce all such impacts for the benefit of the
team, and the scheduler and controller make appropriate adjustments to cost and
schedule. The game is then advanced to the next month, and continues in a
similar manner. The EDG also at appropriate times judges the success of
attempts to mitigate, and the risk numbers are adjusted accordingly. This will
typically involve some salesmanship on the part of team leaders to convince the
EDG of the viability of a proposed solution. The solution could be ad hoc, or one
worked out in advance as a contingency plan. Notes verifying delivery of
deliverables are passed at the appropriate risk adjusted time. Passing these
notes permits starting of new tasks whose start is dependent on the deliverable.
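The EDG's monthly random draw could be mechanized as follows. The risk drivers, their monthly probabilities, and their impacts are invented for illustration; the book does not prescribe any particular implementation:

```python
import random

# Hypothetical risk drivers: (name, monthly probability of acting, schedule slip in weeks).
RISK_DRIVERS = [
    ("vendor part late", 0.10, 3),
    ("test facility down", 0.05, 2),
]

def advance_month(rng):
    """Draw a random number per risk driver to decide which ones act this game month."""
    impacts = []
    for name, probability, slip_weeks in RISK_DRIVERS:
        if rng.random() < probability:  # the driver acts this month
            impacts.append((name, slip_weeks))
    return impacts

rng = random.Random(42)  # seeded so a replay of the drill is reproducible
for month in range(1, 49):
    for name, slip in advance_month(rng):
        print(f"month {month}: {name} -> announce a {slip}-week slip to the team")
```

In play, each announced impact would be followed by the affected team leads devising a response and the scheduler adjusting the network, exactly as described above.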

This game almost always results in detection of built-in and avoidable schedule
and budget problems. It also serves to make the team much more plan aware
and risk aware. If time is available, the game can be played more than once. I
suggest that if this is done, each replay should incorporate lessons learned in the
previous play. I also suggest that in each replay, the EDG slightly increase the
toughness of its decisions.

3.5.4 The paper aircraft game.

This is an instructional game played by two competing project teams, aided by a facilitator who has at least one assistant. The purpose is to teach the importance
of good communications and teamwork in the development and production of a
product within cost and schedule constraints. It is also an exercise in meeting
requirements at minimum cost and with minimum risk.

The facilitator divides the participants into two teams of approximately equal size
by random selection. They are called the Red Team and the Blue Team. There
must be at least five people in each team. Each team will designate one of its
members as Chief of Design, a second member as Chief of Production, a third
member as Chief of Testing, and a fourth as Chief of Quality. These four
represent the four major areas of product development: design, production,
testing, and quality. A fifth member is designated the Customer for that team.

If there are more than five members in a team, they can be assigned to help any
of the four chiefs, or to assist as customers. Or, they can be silent observers,
perhaps assisting the facilitator and recording events that transpire for future
review and analysis of lessons learned.

The two teams will be assigned to separate rooms. Red Team members,
including the Customer, will all sit at the same table or in closely adjoining tables
so they can readily communicate throughout the product life cycle. The Blue
Team design, production, testing, quality and customer functions will all sit
separately, far enough apart that they cannot hear each other's conversations. If
necessary to make this happen, they will be put in separate rooms.

To begin the game, the facilitator discusses the Project Goals, spending enough
time to be sure that all participants clearly understand them. The positive goals
(desired accomplishments) are 1) to create a paper aircraft that when launched
by hand will achieve a distance of at least twenty feet horizontally away from the
launch site, and 2) to comply with certain product quality requirements, noted
later.

The negative goals (constraints) are to complete the project on schedule and in
budget. The schedule goal is to meet the positive goal requirements within
twenty minutes from the time that the materials are delivered to the team. The
cost goal is to not exceed a cost of $40. Cost is calculated at the rate of $1 per
minute for the first five minutes, $2 per minute for the next ten minutes, and $3
per minute thereafter. Fractional minutes count as a whole minute. As later
noted, there are opportunities to reduce costs by reducing the materials
consumed.

A team completing in twenty minutes and using all of the materials will
experience the budgeted cost of $40. A team completing five minutes late will
experience a $15 budget overrun, less any materials savings. The game is
terminated at 20 minutes, regardless of the completion status of any team.
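The cost rule above can be expressed as a small piecewise function (a sketch; the function and parameter names are mine):

```python
import math

def project_cost(minutes_used: float, materials_credit: float = 0.0) -> float:
    """Game cost: $1/min for the first 5 minutes, $2/min for the next 10,
    and $3/min thereafter. Fractional minutes count as a whole minute."""
    m = math.ceil(minutes_used)
    cost = min(m, 5) * 1 + min(max(m - 5, 0), 10) * 2 + max(m - 15, 0) * 3
    return cost - materials_credit

print(project_cost(20))  # 40 : on-time finish using all materials
print(project_cost(25))  # 55 : five minutes late -> a $15 overrun
```

Returning unused materials reduces the total via the credit, e.g. `project_cost(20, materials_credit=4)` for two unused sheets of stationery.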
The facilitator supplies each team with the necessary raw materials and tools. All
of the following are delivered to the Red Team. As noted, the team may achieve
some cost reductions by not using all of the materials:

- Four sheets of white 8-1/2 x 11 stationery (team gets a credit of $2 per sheet returned unused)
- Three paper clips (team gets a credit of $2 per paper clip returned unused)
- Four pencils with erasers
- Two rulers marked in 1/16ths of an inch
- Four rubber bands (team gets a credit of $3 per rubber band returned unused)
- One roll of adhesive transparent tape, wide
- One pair of scissors
- One pad of scratch paper

Deliveries to the Blue Team are made to functional groups as follows. Each
group is made aware of the resources available to the other groups. The Blue
Team may also achieve some cost reductions by not using all of the materials:

Design
o Two pencils with erasers
o One ruler marked in 1/16ths of an inch
o One pad of scratch paper
o Six blank Memo forms for communicating with other groups
Production
o Four sheets of white 8-1/2 x 11 stationery (team gets a credit of $2
per sheet returned unused)
o Three paper clips (team gets a credit of $2 per paper clip returned
unused)
o Two pencils with erasers
o One ruler marked in 1/16ths of an inch
o Four rubber bands (team gets a credit of $3 per rubber band returned
unused)

o One roll of adhesive transparent tape, wide
o One pair of scissors
o Six blank Memo forms for communicating with other groups
Quality
o One ruler marked in 1/16ths of an inch
o Six blank Memo forms for communicating with other groups
Test
o Six blank Memo forms for communicating with other groups.

Exhibit 24 - Robust Communications in a Multifunctional Team

[Figure: Customer, Design, Quality, Production, and Test, all linked directly to one another, so that communication flows freely in every direction.]

Exhibit 25 - Linear Model for Communicating Product Development Information

[Figure: a one-way chain: Customer -> Design -> Production -> Quality -> Test.]

Communication rules:
1. All communications must be written or pictorial! The facilitator is the messenger.
2. The Customer delivers the Project Goals document to Design. Thereafter, the only communication with the Customer is when Test notifies the Customer that flight testing was successful, thus completing the project.
3. Design sends Production completed designs and specifications for the product. If Production has a problem interpreting or implementing them, it may communicate in writing with Design, who may respond in writing.
4. Production delivers the product but no other communication to Quality. If Quality rejects the product, it sends Production a written explanation of why, and returns the product to Production. Otherwise, it forwards the product to Test.
5. Test must flight test the product as it was delivered to them. They may not make any adjustments to it. If flight test is successful, Test notifies the Customer. If it is not, Test returns the product to Production and notifies Design of the test results. Design must then send new design information to Production.

The members of the Red Team are all seated together, and they work together to
design, build, test, and assure the quality of their paper aircraft. There are no
limits to the level at which they communicate and cooperate. Their common goal
is to build their paper aircraft and meet the quality and flight requirements within
the 20-minute constraint.

The Red Team operates under the modern concept of a core functional team.
Communication is unfettered in all directions, as depicted in Exhibit 24. The
solution of problems is a shared responsibility, and anyone who offers an
effective solution is listened to, regardless of their occupational specialty.

The Blue Team design, production, test, quality, and customer groups are all
seated separately. They may not talk directly to each other. Their exchange of
information is done in accordance with the linear model that was almost
universally used in major projects in the 20th century, and whose use is still
widespread today. This is sometimes referred to as the "over the wall" mode of
operations. The various product development functions (customer, design, etc.)
each reside in a "silo" that isolates them from the other functions. Their
communication is through documents, and only through prescribed channels.
They produce the documents, and toss them "over the wall" to the other
functional groups.

Not all linear models work exactly the same way. For purposes of this game, the
linear model will be that depicted in Exhibit 25.

The following information regarding project goals shall be made available to all
members of both teams.

Information for Teams


Each team shall design and build an aircraft (a device that travels through the air)
made primarily of the paper that is supplied to them (not including the scratch
paper or Memo forms). The design shall incorporate at least 100 square inches
of paper (note that an 8-1/2 x 11 sheet of paper has an area of 93-1/2 square
inches). Other materials that may be incorporated into the design include any
amount of the transparent adhesive tape supplied to the teams, and/or the paper
clips and rubber bands. No other materials may be used.

The aircraft shall travel a distance of at least 25 feet horizontally from the launch
site when launched one time by a single person. One or more of the rubber
bands supplied to the teams may be used to assist the launch.

The design must be documented in specifications and dimensioned drawings
with tolerances (these can be freehand) sufficient to permit production of
additional quantities of the aircraft at a later time. The produced aircraft shall
conform to the drawings and specifications.

The Chief of Quality for each team must certify (orally) to the Facilitator upon
project completion that the produced aircraft:

Is made only from authorized materials and contains at least 100 square
inches of paper
Traveled at least 25 feet horizontally from the launch site when launched as
prescribed
Is properly documented in drawings and specifications and complies with the
documentation.

The facilitator may inspect the aircraft and documentation and shall be the sole
judge of the accuracy of the above certifications, and may disqualify a team that
is not compliant.

The time allowed for design, production, and test is twenty minutes. The game
will be terminated at 25 minutes. The budget is $40. Time is charged at $1 per
minute for the first five minutes, $2 per minute for the next ten minutes, and $3
per minute thereafter. Partial minutes count as whole minutes. Credits are given
for unused materials as follows: $2 per sheet of white paper; $2 per paper clip,
and $3 per rubber band.

Read the requirements carefully before beginning design. Creative, low-cost
design is highly encouraged.

For all communications between groups the Blue Team shall use the prescribed
memo form. A copier is available for Blue Team use only. All copies shall be
made by the facilitator or his designee.
MEMO
All communications between Blue Team groups must use this form. Copies of
drawings and specifications may be attached. You may NOT write on the back.
The facilitator will effect delivery, and if requested will make copies.

TO

FROM

REGARDING

MESSAGE
Chapter 3 review questions

1. If you had to describe the differences between CPRM and APRM in a single
word, what would that word be?
2. Consider a project you once worked on that used neither CPRM nor APRM,
where at least one thing went seriously wrong during the project. Could
CPRM have detected and at least partially corrected it? What about APRM?
3. For the project in question 2, make your best guess at the extra costs of all
things that went wrong. Then, make your best guess at the costs of
implementing APRM on that project, including training. Finally, make your
best guess at the money that APRM likely would have saved. What are your
conclusions, if any?
4. Describe some of the problems that would have to be overcome to ready an
ad hoc team, whose members had never worked together before, and also
had not worked in the APRM environment, for operation in the APRM
environment.
5. What issues do you think you would have to resolve with other organizations
if you decided to operate a project in APRM mode in your company?
Chapter 4--Sample Implementations

4.1 Introduction

The purpose of this chapter is to give you hints as to how you might implement
PRM in your project if you have not already done so, or if you think your
implementation is weak. The examples given here are purely tutorial. They
should not be regarded as norms, nor are they necessarily suitable for every
project.

The chapter is divided into two major parts. The first part suggests project
organization and operations; the second, typical methods and tools, all of which
have been used successfully in at least one project.

4.2 Project organization and operations

How should PRM fit into project management? Of course, it should be seamless,
a routine part of operations, and not a heavy burden on the project team. But
what should the various players actually do?

Exhibit 26--Example Project Organization and Operations

[Figure: three levels. Project manager -- Control: review the top N risks,
integrate across teams, assign responsibility. Technical leads -- Analyze:
review, recommend, prioritize, evaluate, classify; Plan: review and approve
plans. Team members -- Identify risk drivers; Track: report status/forecasts.
Top N risks, plans, and status/forecasts flow between the levels.]
Because there are many ways to organize a project, the suggested
organizational process example shown in Exhibit 26 is generic. It assumes that
there is a project manager or equivalent, and that there is a set of persons who
are called technical leads, even though the project may not be about
technology.34 Finally, it assumes that under each technical lead, there are team
members who do the actual work.

Note that Exhibit 26 allocates all of the PRM functions defined in chapter 2
except communications, which is everyone's responsibility. The project manager
is presumed to exercise a management function, and to oversee the control
function. This includes reviewing PRM actions and ideas and coordinating
actions across the team. It includes setting policy on all matters germane to the
project, including risk management. It generally also includes approval of
mitigation actions that will cost money or tie up resources, although sometimes
this is delegated to technical leads if the impacts are within their discretion as
defined by existing policy. Most of the actual work of PRM is delegated
downward.

In very large or complex projects, it is not unusual for the project manager to
have a deputy or other person who assumes some of his duties in PRM. This
person also may be designated the Project Risk Coordinator, or some such title.
The job of project risk coordinator is seldom a full time job, except possibly in a
very large and complex project.

"If you do something you are sure meets with everyone's approval, somebody
won't like it."

Some project teams advocate the use of a PRM Board or equivalent. This is a
board or committee that acts much as a change board does in engineering
projects. It typically reviews all risk findings, perhaps estimates impacts of
mitigation, and makes recommendations to the project manager. It is my belief
that standing boards of this type are an unnecessary complication. They may
delay the management of risks, and they create an extra layer of management
and extra cost.

34
By technical lead, I mean a person who has expertise and has been designated the leader of
a major work area of the project. In a large construction project, one technical lead might be the
framing supervisor. In a financial project, one technical lead might be a CPA who has expertise
in auditing.
The primary responsibility for identifying risk drivers rests with the team
members. The reason: they are closest to the actual work and are generally
better equipped to discover the nuances of causes and possible effects. This is
not to say that everyone else should be blind to risk drivers. There certainly can
be major risks that are visible only to the project manager or to the technical leads.

The analysis function is assigned at the level of the technical leads. They have
responsibility for reviewing risk driver status from time to time, prioritizing
mitigation and other actions, evaluating risk driver impacts, and classifying risks
into accepted categories.

In this implementation scenario, it is assumed that only the top N risks are
actively studied at a given point in time, and it is only the top N whose status is
reported to the project manager. N is a number of risk drivers that the team has
agreed it has the time and resources to actively manage. Commonly used
values of N are N = 10 or N = 20. It is not uncommon on large, complex projects
to have over 100 reported risk drivers, many of which turn out to be trivial. To
give significant management attention to this many risk drivers usually would be
to dilute the effectiveness of the risk management effort.

The question of course arises, how do you choose the top N? To do this
rationally, it is often helpful to bring in quantitative notions, although there may be
many qualitative considerations as well. In chapter 2 and again in chapter 5, the
notion of expected value of a risk driver is examined. Expected value of cost
impact, schedule impact, or some combination of both, usually is an excellent
way to choose the top N, all else being equal. Of course, a risk driver that is a
strongly potential project stopper belongs in the top N, even if, by some fluke, its
expected values are relatively low.
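The selection rule just described can be sketched in a few lines. The risk driver names, probabilities, and dollar impacts below are hypothetical, and the "project stopper" flag is simply the override mentioned above:

```python
# Sketch of choosing the top N risk drivers by expected value.
# Drivers, probabilities, and impacts are hypothetical examples.
risk_drivers = [
    # (name, probability, cost impact in $K, potential project stopper?)
    ("Supplier labor difficulties", 0.30, 120, False),
    ("Key personnel unavailable",   0.50,  80, False),
    ("Inadequate test system",      0.20,  60, False),
    ("Loss of project funding",     0.05, 900, True),
    ("Higher learning curve",       0.60,  15, False),
]

N = 3

def expected_value(driver):
    _, probability, impact, _ = driver
    return probability * impact

# Potential project stoppers go into the top N regardless of expected
# value; the remaining slots are filled in descending expected value.
stoppers = [d for d in risk_drivers if d[3]]
others = sorted((d for d in risk_drivers if not d[3]),
                key=expected_value, reverse=True)
top_n = stoppers + others[:N - len(stoppers)]

for name, p, impact, _ in top_n:
    print(f"{name}: expected value ${p * impact:.0f}K")
```

Note that "Loss of project funding" makes the cut even though its expected value is modest, exactly because it is flagged as a potential project stopper.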

After the project manager has completed his or her review, one or more
assignments of responsibility for action are made at the technical lead level.
These typically include studying appropriate actions, and recommending and
approving plans.

The plans are entered into the tracking system, which is maintained by the team
members. Status and trend information from the tracking system is assumed to
be available to team members, technical leads, and the project manager. This
can be done in most complex projects using computer networks.
4.2.1 Activity settings

PRM activities occur in three basic settings:

Individual activities performed on any day by engineers, specialists, technical
leads, managers, etc.
Weekly team meetings led by technical leads
Monthly project meetings led by the project manager and attended by the
technical leads and other key representatives

Individuals (or small sub-teams) at the appropriate level are responsible for
identifying new risks, classifying them, and for making initial rough screening
estimates of their impact, probability, and time frame. An individual or small sub-
team assigned responsibility for a new risk must decide whether it should be
accepted, researched, watched, or mitigated. If a risk is to be mitigated, the
scope of the mitigation effort must be decided and converted to either an action
item or a mitigation plan. Assigned risks must be tracked, and periodic status
reports must be prepared.

Weekly is often a good interval for team meetings. These are generally multi-
purpose team meetings, and PRM is not normally the only subject discussed.
Risk related activities at weekly team meetings typically include:

Technical lead establishes the priority of the team's risks and recommends which
ones should be reported as the top N--these are the risk drivers reported to the
project manager
Technical lead assigns responsibilities for new risks
Mitigation plans are reviewed and approved (possibly after modification)
Status reports are presented, and the team decides if risk drivers can be
closed, or if mitigation plans need to be changed, and if tracking should
continue for specific risk drivers.

Monthly is a common interval for project-wide meetings. Typical activities related
to PRM at monthly project meetings include:

Technical leads bring their top N risks to the meeting for review and possible
re-prioritization
Project manager and technical leads review potential risk closings
recommended by teams
Mitigation activities and results are reviewed
Major mitigation plans and resource allocations are reviewed and approved or
disapproved
Successful mitigation efforts are recognized, and informal awards are made.

4.2.2 Risk database.

Every project in which risk is a serious consideration should have a risk driver
database. In smaller projects, and sometimes in large ones, the risk driver
database is combined with the project's problem action item list. This
consolidation avoids maintenance of two separate databases, and is a good fit
generally. Risk drivers, after all, are just conditional problems, while problems
are just risk drivers that have gotten ripe and fallen to the ground, so to speak.

An excellent practice, and one frequently overlooked, is to add lessons learned
to the database whenever risk drivers and problems are closed out. This can be
a valuable source of robust planning information across all future projects, or
even, sometimes, for the same project. Lessons learned statements ideally
should contain a complete history of the problem, or risk driver, including actions
taken, dates, and key players.

The format of the database, and the medium where it is kept, are decisions for
the team to make. Options range from a simple notebook to a networked
database with query features. The former is appropriate for a small, compact
project; the latter is almost a necessity for a large, dispersed project.

Section 2.7 listed some typical categories of information that a team might want
in its database. That list is repeated here for convenience.

ID number
Date of first entry
Name of responsible person
Text description of root cause
List of tasks potentially impacted
o Cost impact (each task)
o Duration impact (each task)
o Type 1 performance impact (each task)
o Hammock? (Yes/No)
Text description of mitigation plan
Current status of mitigation actions (updated regularly)
Lessons learned
Remarks
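The field list above maps naturally onto a simple record structure. A minimal sketch follows; the field names, types, and the example entry are illustrative choices of mine, not prescribed by the text:

```python
# Sketch of a risk driver database record with the fields listed above.
# Field names, types, and the sample entry are illustrative only.
from dataclasses import dataclass, field

@dataclass
class TaskImpact:
    task: str
    cost_impact: float = 0.0        # dollars
    duration_impact: float = 0.0    # days
    performance_impact: str = ""    # type 1 performance impact
    hammock: bool = False           # Yes/No in the listing above

@dataclass
class RiskDriver:
    id_number: int
    date_of_first_entry: str
    responsible_person: str
    root_cause: str
    impacted_tasks: list = field(default_factory=list)
    mitigation_plan: str = ""
    mitigation_status: str = ""     # updated regularly
    lessons_learned: str = ""       # filled in at closure
    remarks: str = ""
    active: bool = True             # closed risks go inactive, not deleted

# Hypothetical example entry:
r = RiskDriver(
    id_number=17,
    date_of_first_entry="2024-03-01",
    responsible_person="J. Smith",
    root_cause="Key personnel may be assigned to another program",
    impacted_tasks=[TaskImpact("ABC subsystem coding",
                               cost_impact=40000, duration_impact=20)],
)
print(r.active, len(r.impacted_tasks))
```

Whether this lives in a notebook, a spreadsheet, or a networked database with query features, the same fields apply.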

4.2.3 Risk closure.

When the risk exposure due to a particular risk driver has reached a sufficiently
low level, a level agreed to within the team, the risk driver is put on inactive
status. This does not mean it should be purged from the database. For several
reasons, it is always wise to retain records of risk drivers until the end of the
project, and perhaps longer in some cases. Inactive status simply means that it
is no longer tracked or reported, and it cannot be in the top N (unless it rises from
the grave!). As previously mentioned, when a risk is closed, the lessons learned
should be documented.

4.2.4 Risk management plan.

The previous three sections (4.2.1, 4.2.2, 4.2.3) suggest one scheme for
implementing PRM. It is a practical scheme in most situations, but is not the only
such scheme.

To maintain order and avoid misunderstandings, whatever scheme is adopted
should be incorporated into a written risk management plan. All team members,
and especially all new team members, should be briefed on this plan, and any
changes made to it. The project manager should formally approve the plan.

"Being sure mistakes will occur is a good frame of mind for catching them."

4.3 Typical methods and tools

This section of the chapter describes some typical methods and tools that might
be used in PRM. These are suggestions only, and should not be regarded as
norms. They are intended merely to be a source of ideas.

4.3.1 Root cause analysis.

Some seeming risk drivers are in reality just symptoms of the true root cause.
How can you distinguish a symptom from the true root cause? It isn't always
easy, but most of the time it is worth the effort, because it gives great insight into
the nature of the risk driver and the consequences that could flow from it. This,
in turn, promotes the possibility of more comprehensive risk mitigation.

In principle, the method of finding the root cause is simple. The basic idea is that
you first describe the most obvious manifestation of the potential problem, such
as a possible late delivery of some newly coded software. This manifestation is
generally the one that first occurs to the project team, because their thoughts are
typically focused on events and their proximate causes, as opposed to root
causes.

Once you have stated a symptom or proximate cause of a potential problem, you
ask the question, "How could that happen?" You may come up with just one
answer, or there may be several answers. For each of these answers, you can
ask again, "How could that happen?" And again, you may come up with even
deeper answers. At some point you decide that you don't really know a deeper
cause, or that the only explanation lies in some mystical expression such as "It's
a mystery--God must have willed it." Such expressions are not useful as root
causes because they put mitigation out of reach of mere mortals. (Sometimes
it's out of reach even when the expression is not at all mystical!)
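The repeated questioning produces a tree of causes, and the deepest answers are the candidate root causes. A small sketch, using the late-software example from above with entirely hypothetical causes:

```python
# Sketch of the repeated "How could that happen?" questioning as a cause
# tree. Each key is a symptom or cause; its children are the answers to
# "How could that happen?" All entries below are hypothetical. Leaves
# with no deeper answer are the candidate root causes (mystical
# "explanations" would simply not be entered).
cause_tree = {
    "Late delivery of newly coded software": {
        "Key programmers pulled onto another program": {},
        "Underestimated coding effort": {
            "Less reuse of old code than planned": {},
            "Optimistic original estimate": {},
        },
    },
}

def root_cause_candidates(tree):
    """Collect the deepest causes reached by the questioning."""
    leaves = []
    for cause, deeper in tree.items():
        if deeper:
            leaves.extend(root_cause_candidates(deeper))
        else:
            leaves.append(cause)
    return leaves

print(root_cause_candidates(cause_tree))
```

Each leaf that emerges may itself be a new risk driver worth analyzing for potential impacts.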

It is not hard to see that this process of reaching for deeper causes could lead to
the discovery of entirely new risk drivers. These, in turn, can be analyzed for
potential impacts. This is yet another good reason for a careful evaluation of
root causes.

"The leak in the roof is never in the same location as the drip."

Cause and effect diagramming is a method for illustrating the relationships
between a perceived risk and the many factors (risk drivers) that can cause it.
The method can be used by an individual or by a group. When used by three or
more persons, one should act as facilitator and recorder.
The method uses brainstorming to help identify factors populating a cause and
effect diagram, as demonstrated in Exhibit 27. Brainstorming rules apply (see
section 4.3.4).

Here are the typical steps in the process.

The facilitator presents a risk driver symptom for which root causes are to
be found (in Exhibit 27, the perceived symptom is a possible schedule slip
in the delivery of the ABC system)
The facilitator explains the cause and effect process
The group brainstorms and constructs a "fishbone" diagram on a board or
on paper
The group identifies cause factors and adds them to the fishbone structure
The group identifies the most significant causes and circles them
The group collects additional information if necessary to refine the
diagram
From the process should emerge a clearer vision of the nature of the risk
drivers and how to describe them

Exhibit 27--Typical Cause and Effect Diagram

[Fishbone diagram. Head (symptom): Potential ABC Subsystem Schedule Slip.
Process/Policy branch: original estimate exceeded, too high learning curve,
less reuse of old designs than planned, poor estimating. Personnel branch:
training inadequate, key personnel not available (on other program), leave
when trained. Hardware branch: inadequate support, equipment problems,
cost containment. Tools/Environment branch: inadequate test system, cost
containment, supplier labor difficulties.]

Note in Exhibit 27 that four important risk drivers have been identified as
contributing to the symptom shown (assuming that cost containment is actually
two separate risk drivers). An interesting possibility is that each of these drivers
may have other outcomes that should be considered. Another interesting
possibility is that other minor drivers not considered important at this time, such
as poor estimating, or supplier labor difficulties, have been identified and can be
put on a watch list.
The cause and effect diagram is a powerful tool for analyzing risk symptoms to
find root causes. Interestingly, it has at least one other benefit in PRM: Once the
chain of causality is established, it is easier to visualize the most cost effective
approach to risk mitigation.

4.3.2 Multi-voting.

In processes such as cause and effect analysis, opinions can differ. It is not
always easy to get a clear consensus on what is important and what is not. The
idea behind multi-voting is that it forces a consensus in the democratic sense. It
essentially collects the weight of opinion. Of course, there is no guarantee that
the collective weight of opinion is more accurate than the opinion of any
particular individual, but chances are that it is in most cases; else democracy as
a form of government probably would have died off by now.

Multi-voting requires at least three participants, and is simple and fast. It goes
best if one of the participants also acts as a facilitator. Here is the process
(looking at Exhibit 28 will help you understand it):

Exhibit 28--Multi-voting

[Figure: a blank voting form listing Items A through E, with a points column
for one participant's vote and the number of votes noted at the bottom, and a
tally of votes showing total points per item: Item A: 6, 10, 12, 5, 9 = 42;
Item B: 15, 14, 15, 15, 13 = 72; Item C: 12, 12, 11 = 35.]

Make a list of the items under considerationthe goal of the group is to rank
them by importance or likelihood
Assure that all participants understand all of the items on the list
Choose the number of votes each participant can cast in each round (long
lists may need several rounds of votinga practical limit for the number of
items to be considered in each round is about 20). A good rule of thumb is that
each participant can cast a number of votes equal to one-third of the
number of items in the current round. Each participant can distribute his
votes among the items in the list any way he chooses. Each vote awarded to
an item by any participant is worth one point.
Conduct voting and tally the points.
In each round, eliminate all items that got no votes at all. In the final round,
retain only the number of items you want, and select them from the highest
ranked items, in rank order
Review rankings with participants to be sure there are no misunderstandings.

Note that multi-voting can be used either to select the most important risk drivers
from a list, or to determine the groups best consensus of the likelihood of
occurrence of a list of outcomes. The number of votes awarded to each outcome
can be used as an indicator of probability.
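One round of the process above can be sketched as follows. The items, the ballots, and the rounding choice for the one-third rule are all hypothetical:

```python
# Sketch of one multi-voting round, following the rules above. Items and
# ballots are hypothetical; the one-third rule is rounded up here.
import math

items = ["Item A", "Item B", "Item C", "Item D", "Item E"]
votes_per_participant = math.ceil(len(items) / 3)   # 2 votes each here

# Each ballot maps item -> points awarded by one participant; each
# participant may spread his votes any way he chooses.
ballots = [
    {"Item B": 2},
    {"Item A": 1, "Item B": 1},
    {"Item B": 1, "Item C": 1},
    {"Item A": 2},
]

tally = {item: 0 for item in items}
for ballot in ballots:
    assert sum(ballot.values()) <= votes_per_participant
    for item, points in ballot.items():
        tally[item] += points

# Eliminate items with no votes; rank the rest by total points.
ranked = sorted((item for item in items if tally[item] > 0),
                key=lambda item: tally[item], reverse=True)
print(ranked)
```

Items D and E drew no votes and drop out of the next round; the point totals for the survivors can serve as the indicator of relative importance or probability.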

4.3.3 Tri-level attribute evaluation.

This is a simple method for making a coarse or preliminary screening evaluation
of a risk driver. It is mildly quantitative, but not demandingly so. The
result of a tri-level attribute evaluation could be the final evaluation made by the
team, or it could be a screening device to determine which risk drivers should be
subjected to a more detailed analysis.

Because the method is coarse in nature, it cannot distinguish fine shades of gray.
Risk drivers that a more exacting analysis would consider to be somewhat
different could be considered as about the same using this method. The method
essentially lumps cost and schedule impacts. If it is important that these be
segregated, this is not an appropriate method. An advantage of the method is
that it does not require much, if any, research--it can be used in a meeting, using
only the information (or opinions) people already have. It is therefore an
inexpensive way to evaluate risk drivers.
In this method, a standard table is used to convert probability and impact into a
single index of risk exposure. The index has three levels (hence the name):
high, moderate, and low. This is the conversion table:

Exhibit 29--Tri-Level Attribute Risk Exposure Table

                               Probability
                     Very Likely   Probable     Improbable
Impact Catastrophic     High          High         Moderate
       Critical         High          Moderate     Low
       Marginal         Moderate      Low          Low

To use this table, one must first make two decisions: 1) is the impact
catastrophic, critical, or marginal, and 2) is the probability of occurrence very
likely, probable, or improbable. Notice that "impact" is vague, and so is
"probability." Also vague is just which project tasks might be affected. To make
these decisions consistently, it is necessary that the project team agree in
advance on some criteria for making them. It isn't extremely important what the
criteria are, but it is important that they be used consistently.

I now present an example of criteria that might have been agreed to by a project
team. The listing is pretty much self-explanatory.

Impact: Catastrophic--impact is judged to be catastrophic if one of the
following could happen:
o Schedule slips >20%
o Cost overrun >20%
o Project loses funding
o Higher life cycle costs
o Client can't use product
o Morale suffers, people leave
Impact: Critical--impact is judged to be critical if one of the following could
happen:
o Schedule slips 10-20%
o Cost overrun 10-20%
o Workarounds for quality problems
o Morale suffers
Impact: Marginal--impact is judged to be marginal if it is neither catastrophic
nor critical
Probability: Very Likely--probability is judged to be very likely if there is a
>70% chance of occurrence
Probability: Probable--probability is judged to be probable if there is a 30-
70% chance of occurrence
Probability: Improbable--probability is judged to be improbable if there is a
<30% chance of occurrence

"A complex system that works is invariably found to have evolved from a simple
system that worked."

Once the decisions are made as to the level of impact and the level of
probability, the table provides the level of risk exposure. For example, if the level
of impact is assessed as "critical," and the level of probability is assessed as
"very likely," the table assigns the level of risk exposure as "High." The
advantage of this method is that it can resolve complex information into a one-
word description, which is often a useful thing to do. Disadvantages are that
distinctions between cost and schedule and other impacts are lost, and there is
no information concerning which tasks might be affected.
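The table and the example criteria combine into a simple lookup. This sketch uses the probability thresholds from the criteria above; the function names are mine:

```python
# Exhibit 29 as a lookup table, with probability classified per the
# example criteria above. Function names are illustrative.
EXPOSURE = {
    ("Catastrophic", "Very Likely"): "High",
    ("Catastrophic", "Probable"):    "High",
    ("Catastrophic", "Improbable"):  "Moderate",
    ("Critical",     "Very Likely"): "High",
    ("Critical",     "Probable"):    "Moderate",
    ("Critical",     "Improbable"):  "Low",
    ("Marginal",     "Very Likely"): "Moderate",
    ("Marginal",     "Probable"):    "Low",
    ("Marginal",     "Improbable"):  "Low",
}

def probability_level(p):
    """Classify a probability per the example criteria (>70%, 30-70%, <30%)."""
    if p > 0.70:
        return "Very Likely"
    if p >= 0.30:
        return "Probable"
    return "Improbable"

def risk_exposure(impact_level, p):
    """Resolve an impact level and a probability into one word."""
    return EXPOSURE[(impact_level, probability_level(p))]

# The worked example from the text: critical impact, very likely occurrence.
print(risk_exposure("Critical", 0.80))   # prints High
```

Note that the one-word answer carries no information about which tasks are affected, which is exactly the limitation the text points out.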

4.3.4 Brainstorming.

Brainstorming is a group method for generating ideas, which can be inputs to
other methods for screening, grouping, prioritizing, or evaluating. It is particularly
valuable for identifying risk drivers.35 The method actually tends to initially
identify possible impacts or symptoms of risk drivers; the risk drivers themselves
typically are ferreted out using root cause analysis.

Here are some characteristics of brainstorming:


Often used for identification, but can also be useful for analysis, tracking,
controlling, etc.
Participants verbally identify ideas; participants may build on each other's
ideas
Best used in a small group (<10 people)
Requires a skilled facilitator to deal with conflict and to encourage shy people
to participate
Does not require participant training
Is an enjoyable exercise
Generates a lot of ideas in a short time

35
It is also an effective method for generating ideas for risk mitigation.

Here are suggestions on conducting a brainstorming exercise:

Facilitator explains subject matter, process, and the rules:
o Do not judge or criticize ideas of the speaker
o Encourage wild ideas and "out of the box" thinking
o Build on ideas of others (chaining)
o Go for quantity of ideas; don't try to develop ideas into plans or root
causes (do that later)
Participants generate ideas using one of two processes:
o Unstructured: call out ideas spontaneously
o Round-robin: each participant takes a turn, in order, to state an idea
(forces shy people to contribute)
Record ideas--facilitator writes on a visual medium in sight of all participants
Review list--all review for clarity and understanding; revise as needed

As an alternative to free-form brainstorming, it is sometimes helpful to do a more
structured version. In structured brainstorming, the group starts with a
memory-jogging checklist, such as, for example, a subset of the INCOSE
checklist in chapter 2. They begin at the top of the list, and generate ideas about
each item in the list, in turn, until no more ideas seem to be forthcoming. At that
point, the facilitator moves on to the next item on the list.

I once participated in an effective structured brainstorming session that used a
list of only four words to trigger ideas. The acronym WFHN ("wiffen") serves as
a reminder of these words, which are Wild, Far, Hard, and New. In this context,
the word "wild" connotes anything untamed or difficult to control or predict. "Far"
can mean something geographically distant, but it can also mean distant in time,
or something very unusual. "Hard" means complex or difficult, and "new" of
course means unfamiliar.

The output of a brainstorming session is normally passed on to a designated
person or group for coarse screening and perhaps root cause and other
analyses. The better ideas will become risk drivers, and the best of those will
make the Top N list.

It is a fairly prevalent idea that brainstorming is the way to do risk identification. It
is merely one way. A potential problem with brainstorming is that it is strongly
biased to reflect the ideas of the participants, however much they strive to think
outside the box. The team that uses only brainstorming as a method of risk
identification is likely to miss some significant risk drivers. Other effective, more
systematic methods of risk identification are presented in the next section. If you
have been relying only on brainstorming, please consider using these other
methods as well.

4.3.5 Systematic risk identification.

Brainstorming, discussed in the previous section, might be characterized as a
chaotic, almost random search for risk drivers. In this section, we will present
methods that might fairly be called systematic. They are systematic in the
sense that they embody a specific methodology that may vary somewhat in its
details from project to project.

4.3.5.1 Project and corporate checklists. Many firms and government
agencies that repeatedly do projects incorporate their lessons learned into
project or corporate checklists. Often, they require that a project team complete
these checklists before beginning or bidding on a project. Typically, these
checklists are not immersed in the details of the project; they are more general in
nature. Their usual purpose is to communicate to higher management a feeling
of comfort that the project team has done its homework and has not bitten off
more than it can chew. The focus often is on business relationships, such as
credit ratings of customers and subcontractors, the nature of contractual
relationships, the presence or absence of certain clauses in contracts,
competitive positioning, terms of payment, etc.

4.3.5.2 Professional checklists. Modern projects commonly require and
use highly skilled professional people in their execution. Depending on the
nature of the project, these people could be engineers, accountants, scientists,
medical doctors, trainers, or other highly skilled persons.

Given their skills and their experience, these professionals can usually sense
when a given situation may be risky (structural risk). For example, they may
have strong doubts that what they are being asked to do can be done with the
resources available, or even with unlimited resources. But they may be unwilling
to speak for fear of a common syndrome in which project management's
response to their concerns may be something like "If you can't do it for that, I'll
find someone who can." That kind of response can quickly chill any overt
attempt at project risk management. And to be successful, project risk
management must be in the open.

They are likely to feel they need a buffer against being accused of whining. One
such buffer is a professional checklist. Such a checklist could take several
forms, but one that I recommend is a summary listing of their understanding of
what exactly they are supposed to do as members of the project team, coupled
with an estimate of the labor hours, material dollars and information flow they
need to do each item. It should also recite the timing and sources of these
resources if they are to meet the preliminary schedules that have been set for the
project.

A very successful project manager for whom I once worked required such
checklists from every professional area of expertise associated with his projects.
By collecting and comparing these checklists very early in the project planning,
he was able to spot weaknesses, and to make appropriate corrections.
Ultimately, he negotiated the checklists with the professional groups on his
project, and put them into a file that was available to every project team member.
The message of this file was, "This is the way we have all agreed to do it; it's now
up to you to make it happen. You will be graded on how well you make it
happen." The file was used as the basis for the project schedule and budget. It
was updated as the project evolved. Because the entire team had good visibility
into the plan, including what everybody else was supposed to be doing, this
project manager never had a project failure in the 12 years that I knew him.

In many projects, timing (and quality, too) of the information flow is critical to
success. For example, in aircraft engineering projects, the design proceeds from
skill to skill in a fairly conventional sequence: preliminary sketches, then weights
analysis, then loads analysis, then stress analysis, etc. For each skill area to
meet its schedule, it must have information from other groups that are its
suppliers, and that information must appear when it is needed and expected.
Unfortunately, formal project schedules sometimes seem to be unaware of this
critical flow of information. For example, the schedule may note the task of
writing a stress report, but the stress information may have been needed
elsewhere long before the report is issued. The need date for the information
goes unrecognized on the schedule.

4.3.5.3 Goals and requirements reviews. Recall that I have used the term
"positive goals" to refer to what the project sponsor wants to accomplish with the
project. I use the word "requirements" to refer to what is necessary to perform a
particular selected baseline implementation of the goals. I have also pointed out
that there are many ways goals and requirements can introduce risks into
projects. The purpose of a goals and requirements review is to try to detect and
fix conflicts, inconsistencies, ambiguities, and lack of clarity among and between
goals and requirements.

Such a review should begin with the goals, because they are senior to the
requirements. Examination of goals should be done very early in the project, or else
the whole project effort could suffer. The discussion of goals in chapter 1 could
be a basis for this inquiry. If the project sponsor changes the goals in midstream,
the examination of the goals should be repeated.

Reviews of requirements should be done each time a new baseline design is
designated, and each time the sponsor changes the goals. Questions to be
asked include:

Are there conflicts between the goals and the requirements?
Are the requirements internally self-consistent?
Are the requirements clear and unambiguous?
Are the requirements doable?
What new risks have the requirements introduced into the project that the
goals did not?
What old risks have the requirements removed from the project, or reduced?

These questions should be asked not just about the requirements within each
specialty group, but across the project as a whole.36

4.3.5.4 Basis of estimate reviews. Every project of significance makes
estimates of the resources required to perform its tasks, and these estimates
form the basis of the project plan.37 Every estimate has what estimators
commonly call a "basis." The basis of an estimate is the set of ground rules and
assumptions that underlie it, the foundation upon which it rests. Sound

36
. I have worked on one multi-billion dollar project where failure to take the steps described here
eventually tied the project into knots of confusion and recrimination and almost resulted in its
cancellation.
37
I should qualify this statement. I have found that common practice in a certain highly visible
government agency, with an annual budget in the billions of dollars, has been to create politically
viable project budgets independently of the estimates made by the project teams, and typically
much lower. As a result, the agency is frequently accused of bad estimating when its projects
overrun their budgets. The problem is not always bad estimating, but an administration direly in
need of adult supervision (the agency has been referred to as the world's largest sheltered
workshop). I am happy to report that, as this is written, this situation is apparently on its way to
being corrected.

"Nothing ever gets built on schedule or within budget."
estimating practice requires that when an estimator does an estimate, the basis
must be faithfully recorded for future reference. Here are four paraphrased and
sanitized (to protect the innocent) notes made by estimators concerning the basis
of a set of estimates:

Telephone conversation with buyer Jane Doe (date)
Engineering judgment of Hydraulics Group lead, Jay Smythers (date)
Quote from Maelstrom Corporation (date)
Average percentage for project management in last five projects (date)

Even if the estimator conscientiously does his or her job of recording the basis of
estimate, there can easily be significant risks in the estimates. To provide
concrete examples of how this might happen, let's take each of the above four
bases and examine its context:

The telephone conversation with Jane Doe was about a proposed purchase
of 200 telephone handsets with accompanying wiring within several buildings.
Jane assured the estimator that she had a firm quote from a reliable supplier
that was good for six months. There was no risk. Later follow-up by a risk
audit team revealed that indeed the supplier's prices were valid and
reasonable, but that the quantity 200 was a snap judgment by the
communications engineering group. They had not had the opportunity to
inspect the site to see how many phones were actually needed, so they used
200 as a guess in the vein of "surely that will be enough." It turned out to be
about half enough.
Whenever an estimator has to record the word "judgment" in an estimate
basis, it should set his or her risk antennae to quivering. There are so many
things that can go wrong with a "judgment." Among the obvious questions
are:
o Do the people who made the judgment fully understand the work?
o Do they have enough experience in this kind of work to know what they
are doing?
o Are they being overly optimistic to impress somebody?
o Haven't they done anything similar before that they could use as a
basis?
The quote from Maelstrom Corporation, it turned out, was a "budgetary
quote." In other words, it was a rough estimate, not a commitment. The word
"quote" can have many meanings.
This one seems pretty reasonable, at least on the surface. If project
management has been averaging, say, twelve percent of project costs over
the past five projects, it might be reasonable to use twelve percent in the
instant estimate. Then again, it might not be. A closer look at these statistics
revealed that, although the average of five recent projects is indeed twelve
percent, the average for the two largest projects was almost sixteen percent.
The instant project is large.

4.3.5.5 Team readiness reviews. When an aircrew is preparing for
takeoff in an airliner, it goes through an elaborate checklist to be sure everything
is in its proper condition. Not to go through this checklist, or even to do it sloppily,
would be an unforgivable offense except possibly under the direst conditions.
The checklist is a reasonable and necessary means of protecting lives and
property.

Lives may not be at stake in most projects, but property almost always is,
sometimes lots of property. So why, when a critical, complex project begins,
should there not be a readiness checklist that the team goes through? I have
only known one project manager who does anything even remotely like a
proper readiness checklist.

How could you design such a checklist and be reasonably sure nothing essential
was left out? The honest answer is that maybe you can't be absolutely sure. But
there is a process that can be followed that gives some assurances. Consider an
aircraft checklist. It varies somewhat from aircraft type to aircraft type. But
inevitably, it is based on 1) the flight and mechanical characteristics of the
aircraft, and 2) the local conditions at time of takeoff. How does this apply to
project readiness?
I suggest that the appropriate analogy for a project readiness checklist is 1) the
nature of the project plan, and 2) the local conditions at time of project start.
Every plan makes certain assumptions. The checklist should list those, and ask
if they are currently valid. If they are not currently valid, what is the likelihood
they will be valid at the time they are expected to be valid? What will it take to
make them valid? Is the plan sufficiently robust to accommodate some variances
if things don't go quite according to plan? As for local conditions, you ask

Exhibit 30
An Overstressed Process

It is often the case that a new project will overly stress a process that was
not designed to handle it. I once went to work as a systems engineer for a
large company that had just acquired a huge project. The project utterly
overwhelmed the factory's material handling process. Many engineers were
diverted from doing engineering work to chasing parts on the factory floor,
flying to see suppliers, cashier's checks in hand, to expedite payment and
delivery, and going to local machine shops with drawings in hand to get rush
orders filled. It took six months to recover from this problem. Was this
foreseeable? Absolutely!
questions like "Does the team have adequate facilities?" or "Is our hiring program
in place? Is there anybody out there willing to be hired?"

4.3.5.6 Process readiness reviews. Some projects are highly dependent
on certain processes being in place and working properly. Often, these
processes are external to the project boundaries and may be controlled by others
outside the project. If any of the processes break down, the project can be in
serious trouble. An example is a project that is dependent on the firms customer
order handling process, or perhaps on its quality process.38

A process readiness review will often find that the vital process is currently
working well. It should dig a bit deeper. Questions like "What problems have
you had in the past?" should be asked. If there have been problems, their
causes and solutions should be explored. Will conditions change enough in this
project to overwhelm existing processes? (See Exhibit 30 for a real life
example.) Are there plans for changing the process, and if so, how will this affect
our project?

38
Worse yet is dependence on somebody else's processes.
4.3.6 PC charts.

I am talking about neither "personal computer" nor "politically correct." Here, PC
means "Probability/Consequence."

This is a type of chart that displays graphically both the probabilities and the
outcomes (consequences) of the most severe (top 10?) risk drivers. The basic
idea could be used in several variations, but I think the risk driver display is the
most interesting. In a PC chart, probability is plotted on one axis, and
consequence on the other. Which axis is which matters little. Exhibit 31 is an
example.

The probability scale conventionally is 0-100%, but ordinarily a risk driver having
less than 10% probability or more than 90% will not be plotted.

The consequence scale could be in dollars, if cost is the most important aspect of
risk. Or it could be in weeks, if schedule is most important. Other possibilities

Exhibit 31 Example PC Chart

[Scatter chart: Probability % (0 to 100) on the vertical axis, Consequence % (0 to
100) on the horizontal axis, with the ten risk drivers, numbered 1 to 10, plotted
as points.]

exist, such as a weighted risk metric based both on cost and schedule impacts,
as well as, possibly, Type 1 performance of other risk impacts. I generally prefer
the weighted risk metric approach, with a scale running from 0-100%, or
equivalently, from 0 to 1. The design of such a metric is an option for the project
team.
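As one sketch of how such a weighted metric might look (the weights, the normalization caps, and the function name here are my own illustrative assumptions, not a prescription from the text), a cost impact and a schedule impact can be normalized and blended into a single 0-100% consequence score:

```python
# Sketch of a weighted risk metric combining cost and schedule impacts.
# The weights (0.6/0.4) and the normalization caps are illustrative
# assumptions; each project team would choose its own.

def weighted_consequence(cost_impact_k, sched_impact_wk,
                         max_cost_k=500.0, max_sched_wk=26.0,
                         w_cost=0.6, w_sched=0.4):
    """Map a cost impact (K$) and schedule impact (weeks) to a 0-100% score."""
    cost_frac = min(cost_impact_k / max_cost_k, 1.0)     # normalize to 0-1
    sched_frac = min(sched_impact_wk / max_sched_wk, 1.0)
    return 100.0 * (w_cost * cost_frac + w_sched * sched_frac)

# A driver costing $250K and 13 weeks scores 50% on this example scale.
print(weighted_consequence(250.0, 13.0))
```

The caps act as the "worst credible impact" for the project, so a driver at or beyond both caps scores 100%.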

Ten risk drivers have been plotted in Exhibit 31. They are numbered 1 to 10. A
legend should be provided (not shown) associating each number with a brief text
description.

The numbering scheme could be merely random, but it is useful to number the risk
drivers beginning with the earliest likely to occur in the project as number 1, with
the highest number assigned to the last likely to occur. This gives the chart a
crude time dimension.

The distance of the risk driver plot point from the origin of the coordinate system
is a rough indication of the severity of the driver. For example, #8 in Exhibit
31 is apparently the most severe risk driver.
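To make the distance idea concrete, here is a small sketch that ranks drivers by their distance from the origin; the driver numbers and their probability/consequence values are invented for illustration, not taken from the exhibit:

```python
import math

# Hypothetical PC-chart points: driver number -> (probability %, consequence %).
drivers = {1: (60, 30), 2: (75, 40), 4: (25, 85), 8: (80, 70)}

def severity(p, c):
    """Distance of the plot point from the origin: sqrt(p^2 + c^2)."""
    return math.hypot(p, c)

# Sort drivers from most to least severe by this distance measure.
ranked = sorted(drivers, key=lambda n: severity(*drivers[n]), reverse=True)
print(ranked)  # driver 8 comes out most severe with these made-up values
```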

This is a good chart for quickly communicating the risk picture to management.

4.3.7 Communicating risks to management or to clients.

Risk information generally is most dense at the lower levels of the project team,
and must be communicated upward from time to time. Effective upward
communication of risk information can be dicey. Here are my thoughts on how
that should be done.

Give the big picture first
Answer obvious key questions
Provide a qualitative description, not just a number
Use real-life stories and powerful analogies
Tell not only what you know, but also what you suspect
Spare the minute details
Point out where the data are weak
Give a sense of the uncertainty
Identify the positions of the stakeholders (who could be affected and how?)

4.3.8 Communicating baseline change to the project team.

"Information flows efficiently throughout organizations, except that bad news
encounters high impedance."
One quick way for a project to become chaotic is when changes in the project
baseline are not crisply executed. This can happen several ways. One is when
the change is not clearly described in the first place. When this happens, team
members will likely come to different understandings of what is to be done, and
will move in different directions. It's like what typically happens in the game of
"Simon says." Some people will not remember that the command is executed
only when Simon orders it, and will do it when Simon does not order it. And
some will fail to do it even if Simon does order it. Confusion reigns unless team
members are alert!

Another road to chaos is when the change is not uniformly communicated. As
the old military saying goes, there is always ten percent who don't get the word.
Allowing ten percent not to get the word can be disastrous in a critical project.

Even worse than not getting the word is not knowing what to do with it. The
project must have a "command language" for baseline changes that all have
been trained to understand.

Exhibit 32 illustrates a generic command language in the shape of a form that is
completed by (or on behalf of) the project manager. This form identifies affected
old tasks, new tasks, new budgets, and new duration values. It also states the
nature of the product changes. Allowed exceptions are listed in the event that
local conditions do not permit exact implementation of the changes.

A form such as that shown in Exhibit 32 is very useful in preventing a project
from descending into chaos. It uniformly and crisply communicates the change,
and asserts that it is the desire of the project manager. It is up to the project
team to be sure that everybody who needs to know gets a copy and understands
it. Because a baseline change is generally a momentous event in a project, it
often warrants an all hands meeting to be sure that everyone gets the word.

Note that this particular example form is designed so that budget and duration
values by task are stated for both the old and new conditions. This is generally
easy to do. But product changes are given by reference to other documents,
such as drawings and specifications, because full descriptions of such changes
are often voluminous, and cannot be put directly into a one or two page change
notice. However, a brief summary of the change and the reasons for it are
shown.
The reader experienced in project work may well ask why the above approach is
or could be preferred to the typical change board approach, with its rather
ponderous bureaucratic procedures. It is not necessarily to be preferred. In a
project where there is plenty of time to dwell on a change and make it
deliberately, the traditional approach may be preferable. But in fast moving,
critical projects, the above approach allows the project manager to respond
quickly using "change bullets" to solve problems that require changes. It does,
however, require the project manager to fully master the logic and impacts of the
change, generally with the help of his project leads.

Another reasonable question might well be, is this form needed for every change,
even the smallest? My answer would be no. It is unnecessarily tedious to
change the baseline for every small change in the plan or in the product. A
baseline should be thought of as a general solution to the project goals that
contains some flexibility for small changes. The team should make its own
definition of the limits of a "small" change that does not amount to a baseline
change. Cost and schedule impact should probably be a consideration, and
perhaps so also should be impacts on multiple tasks.

Exhibit 32Generic Notice of Change of Baseline

Notice of Change of Project Baseline This Baseline Is #: 3 Previous Baseline #: 2


Project Title: Development of X-2 Hybrid Car Date Effective: 8/15/03
Authorized by: Imin Charj, Project Manager Signature:

Description and Reason for Change:

The current rear axle assembly (Dwg. 342-3) does not reliably withstand the stress of maximum acceleration with full load.
Wear will likely be excessive, creating warranty problems. Analysis and development testing shows that substituting
titanium for steel in the housing (only) is the most cost effective solution. Make this change immediately, and retest
according to amended test procedure TP431-4.
Existing Tasks Affected:
Budget (K$) Duration (Weeks)
Task ID Task Title Old New Old New
3.1.1 Loads analysis 18.5 19.5 12 12
3.1.2 Stress analysis 21.8 23 13 14
3.1.3 Engineering test 12.2 13.1 10 11

Sums 52.5 55.6


New Tasks Created
Budget (K$) Duration (Weeks)
Task ID Task Title Old New Old New
3.1.1a Materials research 3.4 3

Sums 3.4
Increase in critical path (weeks): 2
Increase in total budget (K$): 6.5
Chapter 4 review questions

1. Do you believe that the primary responsibility for risk driver identification
should rest with team members at the working level? Why?
2. Section 4.2.2 lists some items commonly included in risk management
databases. Can you think of any others that should be added?
3. What reasons can you think of for retaining records of risk drivers at least until
the end of the project?
4. Alone, or with one or more friends or colleagues, identify a potential problem
that could occur in an area of familiarity in the next few months. Have you
identified a symptom, or have you identified the root cause? Most likely, you
have identified a symptom. Create a fishbone diagram to search for its
probable root causes. For each identified root cause, try to find other
potential impacts.
5. Make a list of about 20 things your organization needs to accomplish in the
next few months. Together with two or more of your colleagues, use multi-
voting to prioritize them.
6. Try to think of a way to extend the tri-level attribute method to make it
sensitive to both cost and schedule risk.
7. Together with a few of your colleagues, think of a problem facing your
organization, and brainstorm solutions to it.
8. Do you have a corporate checklist for projects? Is it available to team
members? What is the process for completing it and getting approval to
proceed?
9. Consider the kind of work you most commonly do as a professional. Create a
generic checklist that you could apply to your next project.
10. A goals and requirements review can be conducted at any time in the life of a
project. Conduct one on your current project.
11. Select a key estimate in your current project, one upon which much is riding.
Review the basis of estimate, looking for holes or unstated assumptions.
12. With the benefit of hindsight, develop a team readiness checklist for your
current project.
13. With the benefit of hindsight, develop a process readiness checklist for your
current project.
14. In section 4.3.6, the following statement was made: "The distance of the risk
driver plot point from the origin of the coordinate system is a rough indication
of the severity of the driver." The expected value is also considered a good
measure. It is based on the product of the probability and consequence.
Create a hypothetical PC chart, and compare the two measures for each
point, after having first proportionally scaled them so that the largest value in
each system is 100. Are there any serious inconsistencies between the two
systems?
15. Section 4.3.7 lists nine suggestions for communicating risks to management.
What additional suggestions do you think should be added?
16. Section 4.3.8 illustrates a generic form for generating and communicating
changes in critical projects. How would you improve on this form for your own
project?
Chapter 5Statistical Analysis of Project Risks

5.1 Introduction

This chapter is a relatively (but, alas, not totally) painless excursion into the world
of statistics as applied to project risk management. All I try to do here is make
you comfortable with a few statistical concepts so that you can talk intelligently
with real experts who actually will do the statistical work on your project. (I
obviously am assuming you are not one of those experts and that somehow a
decision has been made to do statistics on your project. If you are one of those
experts, chances are you know at least as much about this subject as I do, and
very possibly you know more.)

For openers, not every project needs the statistical treatment. It involves some
work, and some extra cost. It requires the skills of someone who is at least
minimally qualified to do statistical analysis. And in today's world, it generally
requires the acquisition of software tools that may range in cost from as low as
$500 to as high as $20,000, perhaps more, depending on the sophistication you
want. The benefits of the statistical treatment are minimal, or even negative, on
small, simple projects, but can be significant in big, complex projects. Where
exactly does that line get drawn? It's vague, but if I had to draw it, I would
probably want to use some kind of statistical treatment for a project fitting any of
the following categories.39

Any fixed price project that actively involves more than 50 people at a given
point in the project, including subcontractors
Any cost plus fee project that actively involves more than 50 people at a given
point in the project, including subcontractors, if an overrun of cost or schedule
will likely jeopardize your future relationship with your customer

39
Some argue that the statistical treatment is never really needed; you can manage project risk
quite well without it. Perhaps so, but in my opinion quantification has great value in complex
projects where some decisions involve millions of dollars. It provides a defendable basis for
decisions that otherwise would tend to be made on a gut feel basis. It also provides indicators
very early in the project of the likelihood of failure, so that early corrective action can be taken that
is proportionate to the danger. Further, even advocates of non-quantitative PRM frequently use a
low level form of quantification for purposes of effective communication of risk drivers. It is easy
to misinterpret non-quantitative measures such as "low, medium, high risk" unless they are
associated with probability numbers, and cost/schedule impacts.
Any project where you believe your customer expects the thoroughness and
attention to detail implied by a statistical treatment, especially if that customer
wants to review the statistical results from time to time, and is paying all of the
costs
Any multinational or transcontinental project that has major teams in more
than one city, state, or country, that have to do a lot of close coordination
Any public project in which there has been major and noisy disagreement and
conflict as to whether the project is needed, and where blame certainly will be
placed if there is a failure
Any large project in which more than two major new ideas or concepts will be
introduced where successful integration of these ideas or concepts is vital to
project success
Any project in which technologies that have not been previously integrated
are integrated for the first time
Any project, especially large projects, where a significant cost or schedule
overrun could be damaging to your organization
Any project more than two years in length.
Any project that attempts to significantly advance the state-of-the-art
Any large project where it is believed that the estimates, especially of cost
and duration, are somewhat speculative and for one or more reasons that
condition cannot be completely corrected.

Why would you want to do a statistical analysis on any of the above project
types? Here are some thoughts on that subject:

High-risk projects are likely to be conceived in an environment of budgetary
and schedule conflict. Those who want the project to proceed may be
somewhat at odds with those who will pay for it. Those who want it may want
it as soon as possible. Those who pay for it might like to stretch it out, or
otherwise modify it. There can be severe conflicts in the project goals.
Those who want the project to proceed tend to want the most possible in
terms of performance goals. In response to pressures from those who will
pay for the project, they tend to claim that a lot can be done with a little
money, in the hope of selling the project. Hubris sets in.
Applying a statistical process to a project is not a silver bullet that cures all
ills, but if done conscientiously, it tends to inject reality into the project goals
and planning.
It is likely to bring to the surface potential problems that might otherwise have
been overlooked.
It provides a quantitative measure of risks that is superior to non-quantitative
measures such as major and minor risks that can easily be misinterpreted.
A quantitative process can better account for risk interaction effects. As a
simple example of an interaction effect, consider that a delay for any reason
may increase project costs.

The above should be helpful, but I cannot replace your judgment with mine. You
need to decide whether a statistical treatment is right for your project. It will
probably be easier for you to make that decision after you have read this chapter.

Mathematically sophisticated readers might possibly stop reading at almost any
paragraph, because what I am saying is old hat to them. But if you do continue, I
ask that you please not be offended by the lack of mathematical rigor. I have
pursued an intuitive, simplistic approach for the very reason that math-fearing
readers will become very turned off if I start talking about theorems, lemmas,
proofs, and the like. Those math-fearing folks are the very people I want to
reach.

5.2 Basics of probability


Probability is a branch of mathematics closely allied with statistics. Some would
say it is a part of statistics. However you classify it, we need to look at some very
basic ideas in this field of knowledge. We will attempt nothing terribly
complicated, though.

Perhaps the most basic idea of probability is that for a given event, there may be
more than one way it can happen, and these various outcomes may not be
equally likely. Said more precisely, a given event can have a number of different
mutually exclusive outcomes. In a race of eight horses, any one of the eight
might be the winner. And some of the horses have better odds of winning than
the rest. In a test of hardware, the hardware might pass, or it might fail. Perhaps
failure would be divisible into several sub-outcomes, based on the mode of
failure. Other sub-outcomes could relate to various combined failure modes.

As a practical matter, in quantitative PRM we seldom attempt to describe more
than five discrete ways in which an outcome can occur. Even though there may
actually be hundreds or even thousands of ways, we want to group these into
ways that are not much different in impact to avoid tedious descriptions that don't
do very much to improve the fidelity of the analysis. Consider that if we tried to
describe ten different outcomes, their average probability of occurrence would be
only 10% each. Ten percent is at the edge of "who cares?" whereas if we only
describe five different outcomes, they have an average probability of occurrence
of 20%, which is worthy of some concern.

As you will probably have guessed from the preceding paragraph, all of the
possible outcomes taken together are said to have a probability of one, or
equivalently, 100%. For example, if the weather forecast says the probability of
rain tomorrow is 80% (0.8), the implication is that the probability of it not raining
must be 20% (0.2). The statement that all possible outcomes have a probability
of 1, or 100%, is equivalent to saying that exactly one of the outcomes is bound
to happen. For sure, it will either rain or not rain.

If we say that an outcome has a probability of zero, we mean that it is impossible.
It can never occur. If we say that an outcome has a probability of 1 or 100%, we
mean that it is certain to happen.

How do we assign probability numbers to outcomes? If the outcome is either
impossible or certain, we know how, but what if it is somewhere in the gray area
between impossible and certain? One way is to look at historical data. Suppose
we have bid on 36 projects in the past few years. Of those, we have won the
work 9 times. That's 9 out of 36, or 25% of the time. So if somebody asked
"What is the probability we will win this new project we are bidding?" the best
answer, based on the available evidence, would be 0.25. The probability of
losing would therefore be 1 - 0.25 = 0.75.

Lest we get into an overly simplistic frame of mind about this, we should
recognize that more information could lead to a different result. For example,
suppose that of the 36 projects we have bid, fifteen were over $5 million; the
others were smaller. Of those over $5 million, we won 6, or 40%. So, at least
apparently, we are somewhat more formidable as competitors on the larger
projects than on the small ones.
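The frequency arithmetic in these two paragraphs can be written out directly, using the numbers given above (the derived small-project rate of 3 wins in 21 bids follows from the same figures):

```python
# Win probabilities from the bid history described above:
# 36 bids with 9 wins overall; 15 bids over $5 million with 6 wins.
bids_total, wins_total = 36, 9
bids_large, wins_large = 15, 6

p_win = wins_total / bids_total                # 9/36 = 0.25
p_lose = 1 - p_win                             # 0.75
p_win_large = wins_large / bids_large          # 6/15 = 0.40
p_win_small = (wins_total - wins_large) / (bids_total - bids_large)  # 3/21

print(p_win, p_lose, p_win_large, round(p_win_small, 2))
```

The split makes the point numerically: about 40% on large bids against roughly 14% on small ones.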

In using historically derived probabilities, we need to keep in mind that the bonds
of history are not unbreakable. By aggressively cutting our costs, and working
hard to understand the customer's requirements better than anyone else, we
might be able to significantly raise our win probability beyond what history
suggests. On the other hand, if things stay the same, history probably will tend to
repeat itself, at least approximately.
Another way to get probability numbers is through something called the Principle
of Insufficient Reason, or PIR. In certain circumstances, this is an excellent
approach. Basically, the PIR says that if an event can have n different outcomes,
n being an integer number, and we have insufficient reason to believe that any of
the outcomes is more likely than any of the others, we should assign all of the
outcomes a probability of 1/n.

The classic example is the coin toss. Disregarding the highly improbable
outcome of the coin landing on its edge, there are two outcomes, heads and tails.
Having insufficient reason to believe that heads is more or less likely than tails,
we assign each outcome a probability of ½. This is equivalent to believing that
the coin is "fair." We normally would assume that, unless we saw that the coin
was bent, or we had some evidence of skullduggery.

An obvious extension of this argument is to a fair six-sided die. If we have
insufficient reason to believe a die to be "loaded," we assign a probability of 1/6
to each side coming up in a toss.

An assignment of probability creates a certain expectation. For example, if we
tossed a coin we believe to be fair 1,000 times, we will expect the number of
heads to be very close to the number of tails. If the number of heads turns out to
exceed the number of tails by more than a few, we would be tempted to believe
that the coin is not fair. How big "a few" must be is a statistical consideration
that is well beyond the scope of this simple discussion.

The PIR can be used directly in project risk management in those circumstances
where outcomes are deemed roughly equal in likelihood. This is often the case
when a project is very young and outcomes can be named, but not fully
described.

In PRM, the most prevalent way of getting probability numbers is called
subjective assignment. For example, if a certain hardware test is a critical event
in a project, we might subjectively assign passing the test a probability of 0.7.
This assignment is only as valid as the evidence that supports it. But even if the
evidence is thin, the assignment can be of value, as opposed to having no idea
at all of the probability of passing.

Subjective assignments are often made by a consensus of the most
knowledgeable people on the project. Usually, it is poor practice to have them
made by a single person connected with the project, no matter how
knowledgeable he or she may be. A single person is too likely to be biased. Of
course, there is such a thing as group bias, but it is probably less likely than
individual bias.

One possible bias is excessive pessimism, but a more common one is hubris,
overweening pride and optimism. Hubris is common because people on
projects, like teenagers, like to believe they are invincible, and that their project is
bound to succeed. If hubris is rampant, you can almost smell it when you have
been around the project team for a day or two. When a state of hubris exists, it is
a pretty safe bet that the project will have some major problems, because people
will not be on their guard against risk drivers. When people are not on their
guard, crashing and burning becomes likely in any but the most benign project
environment.

A useful and interesting way to home in on a subjective probability assignment is
to begin with the PIR, then modify it using rational arguments. This is best
illustrated by example.40 Suppose that a piece of hardware has two major
components, the widget and the gadget. A critical test of the hardware has been
assigned the following four possible outcomes:

A. Success (test passed)
B. Failure of the widget only
C. Failure of the gadget only
D. Failure of both the widget and the gadget

We begin by pretending that we have insufficient reason to believe that any of
these outcomes is more likely than any other. So we tentatively assign a
probability of 1/4 (0.25) to each outcome. Looking more closely at the situation,
we realize that P(D) (shorthand for the probability of outcome D) has got to be
less than either P(B) or P(C). In fact, appealing to something called the
multiplication rule, we believe that P(D) = P(B)P(C).41 So a preliminary estimate
of P(D) is (1/4)(1/4) = 0.0625.

40 Please do not think that this example is the only possible version of this process. The
approaches are limited only by your state of knowledge, and willingness to be guided by common
sense.
41 We already know that P(A) + P(B) + P(C) + P(D) = 1. But a rule in probability, which we will not
attempt to prove here, says that if B and C are independent events (not influencing each other),
then the probability of both of them occurring is given by P(B)P(C), i.e., the product of P(B) and
P(C). The multiplication rule does not require that the occurrence of both events be simultaneous
in time, although it could be.
Now we have, tentatively:

P(A) = 0.25
P(B) = 0.25
P(C) = 0.25
P(D) = 0.0625

But this can't be correct, because the sum is no longer 1! Additional adjustments
need to be made. Looking at P(B) and P(C), perhaps engineering estimates lead
us to believe that they are approximately equal, but not necessarily equal to 0.25.
We look even further, and come to a belief that the probability of passing the test
is 0.8. Summarizing our current state of knowledge:

P(A) = 0.8
P(B) = x
P(C) = x
P(D) = x2

A little algebra, or just trial and error arithmetic, quickly reveals that x = 0.095:

P(A) = 0.8
P(B) = 0.095
P(C) = 0.095
P(D) = 0.009

Except for rounding error, the sum is 1, as it must be.
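For readers who want to check the "little algebra," the computation can be scripted in a few lines. The quadratic below comes from requiring the four probabilities, 0.8 + x + x + x², to sum to 1; the variable names are ours, not part of any standard tool:

```python
import math

# Probabilities must sum to one:  0.8 + x + x + x**2 = 1,
# i.e. x**2 + 2x - 0.2 = 0.  Take the positive root of the quadratic.
x = (-2 + math.sqrt(2**2 + 4 * 0.2)) / 2

p = {"A": 0.8, "B": x, "C": x, "D": x**2}
print(round(x, 3))                # 0.095
print(round(sum(p.values()), 3))  # 1.0
```

Because x is an exact root of the quadratic, the four probabilities sum to 1 to within floating point rounding.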

This technique is a mixture of PIR and subjective assignment. I have found this
to be a most effective way to arrive at subjective probability assignments. It
forces the people assigning the probabilities to come up with reasons why the
various outcomes should have different probabilities, and their relative likelihood.
Absent that thought process, they must all be assigned the same probability.

[Sidebar: The probability of anything happening is in inverse ratio to ...]

5.3 Expected value


While probability assignments by whatever means are interesting, they are not of
themselves of great value in the practical sense. What really makes them useful
is associating them with outcome values. In the sense that we will be using the
word "value," I mean a real number. I am limiting the conversation to outcomes
with which a real number can be associated. When we can assign numbers to
outcomes, we can talk meaningfully about a concept called expected value.

Consider a coin toss. The outcome is either heads or tails. Neither "heads" nor
"tails" is a number, so we cannot talk about the expected value of a coin toss.
We can arbitrarily replace the name "heads" with the name "1," and "tails" with
the name "2." But these aren't really numbers either; they are simply the names
of numbers used as proxies for the names heads and tails. But if we lay bets on
heads and tails of, say, $1 on heads and $2 on tails, we can talk about expected
value of the outcomes. $1 and $2 are real numbers.

We can also assign numbers to measurable product characteristics, such as
weight or reliability or mean hours to repair. We can therefore compute expected
values for these parameters. But what about project outcomes such as customer
satisfaction or competitive advantage? It can be done if we feel that it would be
helpful. One technique would be to postulate, for example, that perfect
completion of the project (meeting all goals) wins us a perfect score of 100% in
customer satisfaction and also in competitive advantage. Then we would
subtract points (subjectively, unless the customer agreed to participate in the
exercise) if certain outcomes occurred short of perfect completion.

When we talk about outcomes, we should in principle enumerate all of them to
make our meaning clear. If there are many possible outcomes, this is both
tedious and perhaps not very enlightening. The notion of expected value allows
us to characterize an entire list of outcomes with a single number. It is in a sense
an average of all of the outcomes. Mathematically, it will be the average
outcome if the event is repeated many times.

The calculation of expected value is simple and easy to remember. It works like
this. Multiply the value of each outcome by its probability, and then sum the
results. For example, suppose that a given event with three outcomes can be
characterized as follows:

Exhibit 33--Example Expected Value Calculation

  Probability    Value    Product
  0.35           $500     $175.0
  0.45           $650     $292.5
  0.20           $480     $96.0
  Expected Value          $563.5
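The tabulated calculation can be scripted directly. A minimal sketch using the probabilities and values from Exhibit 33:

```python
# Outcomes of Exhibit 33 as (probability, value) pairs.
outcomes = [(0.35, 500.0), (0.45, 650.0), (0.20, 480.0)]

# Expected value: the probability-weighted sum of the outcome values.
expected = sum(p * v for p, v in outcomes)
print(round(expected, 1))  # 563.5
```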

I do not show a proof, but the fact is that if we repeatedly did an experiment or a
game where the above probabilities and values prevailed, the long-run average
outcome would be $563.5. So to repeatedly play a game defined by the above
values and probabilities, you would not want to pay or bet more than $563.5
each time you played. If you bet less than that amount, in the long run you
would be a winner. If you bet more, you would in the long run be a loser.

The expected value has a certain appeal to decision makers because of this long
term averaging effect. If the above table represents your best estimates of the
costs and associated probabilities of a particular project, you can say that if that
project were repeated many times with the same people and circumstances, the
long run average cost would be $563.5. (Of course, real projects are never
repeated. A skeptic might note that they are not even done once, at least not as
originally planned.)

Should $563.5 be your budget for the project? It would not be my budget,
unless I was really squeezed by my customer or by my competitors! Why not?
Look again at the above table. The most likely outcome (i.e., the outcome with
the highest probability) is $650, and there is at least a 45% probability of
exceeding the expected value. I don't like those odds, but somebody else might!

Expected value has great strength for predicting averages, but is weak at
predicting what a specific outcome will be. The most probable outcome is better
for that. The most probable outcome in Exhibit 33 is $650. In spite of this
weakness, the expected value has its uses, as we shall see.

5.4 Description and aggregation of risk drivers

Assuming that we have been able to identify a risk driver, how would we go
about describing it quantitatively? There are basically two approaches, discrete
and continuous. Discrete descriptions are a bit easier to deal with
mathematically, so I take them up first.

Recall from chapter 1 the definition of a risk driver:

A risk driver is any root cause that MAY force a project to have outcomes
different than the plan.

Before we give a risk driver a quantitative description, we normally will give it
some sort of identification code for keeping track of its status, and we also will
want to give it a text description so we can remember what it is about.

Assume that we are working on a project where no risk drivers have yet been
identified, and our numbering scheme is the most obvious one. We are able to
identify a certain risk driver, so its assigned ID is 1.

The text description can be done in many ways, some of which are not too
satisfactory. Using a sentence fragment such as "Air compressor problem" is not
particularly good, because it doesn't say what the potential problem is.42 It also
doesn't do much to reveal the presumed root cause, which is a good thing to do.
And, it gives us no hints as to what we might be able to do to mitigate it. Simply
put, it is an example of poor project communications.

We could improve on this somewhat by saying "Possible late delivery of 30
horsepower air compressor." This certainly provides more information, but we
still have not communicated too well. Here is the kind of text description I
recommend:

MegaAir may not deliver the 30 hp air compressor on time due to other high
priority commitments in their factory.

Notice the word "may." This clearly indicates a risk driver as opposed to a
problem. A problem would be if the risk is realized and the compressor is
actually late, or if we know for sure it will be late. Because teams sometimes
track risk drivers and problems in the same database, it is helpful to use words
such as "may" or "might" or "could" to clearly distinguish risk drivers from
problems.

42 Unfortunately, many project teams use overly terse descriptions such as these. They have the
potential for creating misunderstandings except in very small projects where communication is
not much of a problem.

Notice that the presumed root cause is clearly stated. The problem is in
MegaAir. It is due to other commitments they have made, which make them
uncertain as to their ability to meet our schedule. (We will assume that there is
no deeper cause that could be identified, such as "We stupidly ordered a custom
designed 30 hp air compressor that only MegaAir can produce, so we are stuck
with the possibility that they may not meet our impossible schedule due to their
internal workload and the fact that our credit rating with them is the pits and they
didn't want the job in the first place.")

Once an ID has been assigned, and a root cause statement has been written, we
begin to think of quantification. As a precursor to that, we need to identify
discrete outcomes (recall that we are looking at a discrete method of description).
As we explained earlier, we probably don't want to identify more than five or so
discrete outcomes.

Three rules must be observed in identifying discrete outcomes. They are:

1. One of the outcomes must be "Meet the plan," or words to that effect.
   Presumably, there is always a non-zero probability that the plan can be met;
   else we would not have chosen it as the plan. If there is no possibility of
   meeting the current baseline plan, good risk management practice requires
   that we develop a new baseline that has some possibility of being realized
   before we begin to talk about risk drivers. An impossible plan does not have
   integrity or credibility. It is even fair to say that it is not a plan at all.
2. The outcomes must be mutually exclusive. Said another way, they must not
   overlap. A simple example of outcomes that are not mutually exclusive would
   be describing the outcomes of the toss of a single die as 1, 2, 3, 4, 5, 6, "1
   or 2." The outcome "1 or 2" is not mutually exclusive with respect to the
   outcomes 1 and 2.
3. The outcomes must encompass all reasonably expected possibilities. A
   simple example of outcomes that fail to do this would be describing the
   outcomes of the toss of a single die as 1, 2, 3, 4, 5. The number 6 is a
   reasonably expected possibility, and it was omitted.

Let's continue with the air compressor example:

Root cause statement: MegaAir may not deliver the 30 hp air compressor on
time due to other high priority commitments.
Outcomes:
o Meet the plandelivery on time or early
o Delivery 1 week late
o Delivery 2 weeks late
o Delivery 4 weeks late
o Delivery 6 weeks late

Note that we have observed all three of the rules stated above. The third rule is
met because we assumed that being more than 6 weeks late is not a reasonably
expected possibility, and also because we consider intermediate outcomes such
as 2.4 weeks late to be counted as having about the same effect as being 2
weeks late.

Having identified outcomes, the next step is to assign probabilities to the
outcomes. We have already discussed means for doing that. Suppose that the
following is the result for the air compressor risk driver:

Root cause statement: MegaAir may not deliver the 30 hp air compressor on
time due to other high priority commitments.
Outcomes: Probability
o Meet the plandelivery on time or early 50%
o Delivery 1 week late 20%
o Delivery 2 weeks late 10%
o Delivery 4 weeks late 10%
o Delivery 6 weeks late 10%

The probabilities are stated as percentages. Note that they add up to 100%, as
they must.
Exhibit 34--Project Plan, Tank Purging System
(schedule network, redrawn here as a task list; arrows show precedence)

Begin -> Order Air Compressor (6 weeks, $20K) -> Install Air Compressor (1 week, $15K) -> Integrate & Test System (1 week, $15K) -> End
Begin -> Construct Foundation (3 weeks, $5K) -> Install Air Compressor (as above)
Begin -> Order Piping (1 week, $15K) -> Install Piping (4 weeks, $50K) -> Integrate & Test System (as above)
Begin -> Order Controls (5 weeks, $15K) -> Install Controls (4 weeks, $25K) -> Integrate & Test System (as above)
Manage Project (10 weeks, $60K) runs the full length of the project.

The shaded (critical path) tasks in the original diagram are Order Controls,
Install Controls, and Integrate & Test System.

The next step is to assign impacts to each outcome. The air compressor risk
driver by its nature involves schedule impacts. But schedule impacts on what,
exactly? And are there cost impacts as well?

To sort out these questions, we need to recall that the project plan for a complex
project will have both a work breakdown structure (WBS) and a schedule
network. The schedule network comprises the tasks described in the WBS.
Each task has an estimated cost and duration. In principle, a risk driver may
impact more than one task. The impact could be on cost, duration, or both, for a
given task. The situation is best illustrated by a simple example.

Assume that the project calls for installing the 30 hp air compressor on a
foundation, then connecting it to existing tanks via new piping. There are also
new controls to regulate flow. The purpose of the air supply is to purge the tanks
from time to time. We further assume that the design was completed at some
earlier time, so the project involves only implementation. Exhibit 34 describes
the project.
Summation of costs across all tasks yields a planned $220K. The critical path is
10 weeks long and runs through Order Controls, Install Controls, and Integrate &
Test System (the shaded tasks in Exhibit 34). For convenience, we again
restate the risk driver regarding possible late delivery of the air compressor.

Root cause statement: MegaAir may not deliver the 30 hp air compressor on
time due to other high priority commitments.
Outcomes: Probability Duration
o Meet the plandelivery on time or early 50% 0
o Delivery 1 week late 20% 1
o Delivery 2 weeks late 10% 2
o Delivery 4 weeks late 10% 4
o Delivery 6 weeks late 10% 6

We have added another column on the right that shows the duration impacts on
the Order Air Compressor task. If this risk driver had cost impacts, we could
have added yet another column on the right to show our estimates of those.

We now compute the expected value of the impact of this risk driver with respect
to duration of the Order Air Compressor task. The expected value of late delivery
is computed as follows:

(0.5)(0) + (0.2)(1) + (0.1)(2) + (0.1)(4) + (0.1)(6) = 1.4 weeks

Inspection of the above schedule network reveals that, on average, this risk
driver has no impact on the project. Adding 1.4 weeks to the compressor
delivery schedule has no impact on the critical path, and the driver has no cost
impact. However, the driver has the potential for adding as much as four weeks
to the schedule, but with low probability. If compressor delivery is late by more
than two weeks, it becomes the critical path, and the overall schedule can
increase beyond ten weeks to as much as 14 weeks. (To understand fully what
has just been said, you will need to carefully study Exhibit 34.)
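The path arithmetic just described can be sketched as a short script. The durations are taken from Exhibit 34; the key point, visible in the code, is that the compressor path has two weeks of slack, so small delivery delays are absorbed:

```python
def project_duration(compressor_delay):
    """Overall duration in weeks of the Exhibit 34 network, given a
    delivery delay (in weeks) on the Order Air Compressor task."""
    compressor_path = (6 + compressor_delay) + 1 + 1  # order, install, test
    foundation_path = 3 + 1 + 1                       # foundation, install, test
    piping_path = 1 + 4 + 1                           # order, install, test
    controls_path = 5 + 4 + 1                         # the planned critical path
    return max(compressor_path, foundation_path, piping_path, controls_path)

# The discrete risk driver: (probability, delivery delay in weeks).
outcomes = [(0.5, 0), (0.2, 1), (0.1, 2), (0.1, 4), (0.1, 6)]

# Expected delay of the delivery task itself (the 1.4 weeks in the text):
task_ev = sum(p * d for p, d in outcomes)

# Expected overrun of the whole project, after slack is accounted for:
overrun_ev = sum(p * (project_duration(d) - 10) for p, d in outcomes)
print(round(task_ev, 3), round(overrun_ev, 3))  # 1.4 0.6
```

Note that the two statistics differ: on-average delivery slips 1.4 weeks, but the project as a whole slips much less, because only the four- and six-week delays push the compressor path past the ten-week critical path.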

Presumably, the project management task must increase in length to match the
overall length of the project. For simplicity, assume that project management
costs $6K per week at any duration. If the critical path increases from 10 weeks
to 14 weeks, project management costs increase by $24K in a $220K project.
Therefore $24K is clearly an absolute upper bound on what we are willing to
spend to mitigate this risk driver. In practice, we will not be willing to spend even
that much, because the probability of schedule overrun is not high. Recalling
that the compressor path has two weeks of slack, a four-week delivery delay
overruns the schedule by two weeks and a six-week delay by four, so we can
estimate the expected value of the project duration overrun as follows:

(0.8)(0) + (0.1)(2) + (0.1)(4) = 0.6 week

The expected cost overrun is therefore $3.6K, and that is more realistically an
upper bound on what we would be willing to spend to mitigate this potential
problem. For example, if the MegaAir factory is willing to guarantee on-time
delivery of the 30 hp air compressor if we pay a premium of $3K, we might be
willing to do it. It depends on how important it is to meet the project schedule.43

Let's consider one additional risk driver in this project. By doing that, we can look
at another form of analysis. We also can illustrate the concept of aggregation of
risk drivers.

We expressed the air compressor late delivery driver as a discrete driver. We
now look at an example of a continuous driver. Again, we assign an ID, then we
write a root cause statement (for convenience, we include the impact in the root
cause statement; this is acceptable for simple impacts):

Root cause statement: Because of potential rainy weather and resultant poor site
access, piping installation may take as long as six weeks, versus the four weeks
planned.

Here, we are given a continuous range between 4 and 6 weeks, hence the name
continuous driver. If we treat every duration in the range as equally likely (a
continuous analog of the PIR), the expected value is midway through the range,
in this case, five weeks.

Is there a potential cost impact as well? Probably, because if it rains the crews
installing the piping will have a more difficult time of it, and will have to work more
hours. The planned labor cost is $60K in a four-week period, so perhaps it is
reasonable to assume that in a six-week period the cost can rise to $90K. The
expected value of cost is therefore $75K, an increase of $15K above the plan.

This driver has no impact on the critical path, even in the extreme case.

[Sidebar: The Laws of Thermodynamics. 1. You can't win. 2. You can't break
even. 3. You can't get out of the game.]

43 A benefit of quantitative risk analysis is that it gives us a good indication of appropriate limits on
expenses to mitigate risk drivers.
Suppose that the people in charge of installing the controls suddenly realize that
their task also can be influenced by the same possibility of rainy weather and
consequent problems of site access. The risk coordinator on the project decides
to modify the current root cause statement to read as follows (again, for
convenience we state the impacts in the root cause statement):

Root cause statement (amended): Because of potential rainy weather and
resultant poor site access, piping installation may take as long as six weeks,
versus the four weeks planned, and controls installation may take as long as five
weeks, versus the four weeks planned.

We have here a root cause of rainy weather that can potentially impact two
different tasks. The expected value of the duration of controls installation is 4.5
weeks. We can assume that its expected cost has increased from $25K to
$28.125K (we leave it to the reader to verify that).
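That verification can be sketched as follows, assuming (as the text implies) that the duration is uniform over the range and that cost scales linearly with duration:

```python
# Controls installation under the rain driver: duration uniform on
# [4, 5] weeks; assume cost is proportional to duration ($25K at 4 weeks).
lo, hi = 4.0, 5.0
expected_duration = (lo + hi) / 2        # midpoint of a uniform range

def cost(weeks):
    return 25.0 * weeks / 4.0            # cost in $K, linear in duration

# The mean of a linear function of a uniform variable is the average of
# the function's values at the endpoints of the range:
expected_cost = (cost(lo) + cost(hi)) / 2
print(expected_duration, expected_cost)  # 4.5 28.125
```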

Because the control installation is on the critical path, this driver lengthens the
expected duration of the entire project! We may be willing to let that ride if the
extra duration does no real harm, but if there is a lateness penalty, we might
want to consider increasing the crew size, or taking other mitigation measures, in
the event of rain. What would we be willing to spend on such measures?
Certainly no more than the amount of the penalty!

The aggregated expected value of the cost impact of this driver is the sum of the
impacts on the piping installation task and on the controls installation task,
i.e., $15K + $3.125K = $18.125K.

A neat thing about expected values, and a major reason they are used in risk
analysis in spite of their shortcomings, is that they are always additive. This is
not true of any other commonly used statistic that you may have heard of, e.g.,
median, mode, standard deviation, etc.

Another nice characteristic of the expected value is that as you continue to add
risk drivers to a project, the aggregated expected value is not equal to the
aggregated most likely value, but it tends to move closer and closer to it. While it
is moving toward it, in most situations it is a conservative estimate of it, that is, it
tends to be a bit higher.

If multiple risk drivers impact the cost or duration of a given task, the expected
values of those impacts are additive within the task. The expected values of the
cost impacts are in fact readily additive across the entire project, but the duration
impacts must take into account the nature of the schedule network, and slack in
the schedule in non-critical paths.

5.5 Monte Carlo Simulation

The analyses we just did by hand would be impractical on a project with a
thousand tasks and a hundred risk drivers. Those are not unusual numbers for
complex projects. In today's world they are done on a computer using
specialized software.

A common capability of such software is a statistical process called Monte Carlo
simulation. The name, of course, comes from the gambling casinos in Monte
Carlo. What the software does is rapid random sampling of possible outcomes in
the project. After randomly sampling the actions of all of the risk drivers
hundreds or even thousands of times, the computer tallies up the results.
Typically, it produces a histogram of cost or schedule risk or both.
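As an illustration only, here is a bare-bones Monte Carlo engine for the Exhibit 34 network and the two risk drivers discussed earlier. For simplicity it samples the two weather-affected durations independently, even though they share a root cause; real risk software can model that correlation:

```python
import random

random.seed(1)  # fix the seed so runs are reproducible

def sample_project_duration():
    """One random sample of overall duration (weeks) for the
    Exhibit 34 network under the two risk drivers in the text."""
    # Discrete driver: compressor delivery delay in weeks,
    # with the 50/20/10/10/10 probabilities from its description.
    delay = random.choices([0, 1, 2, 4, 6], weights=[50, 20, 10, 10, 10])[0]
    # Continuous driver: rain may stretch piping installation to 4-6
    # weeks and controls installation to 4-5 weeks (sampled uniformly).
    piping = random.uniform(4.0, 6.0)
    controls = random.uniform(4.0, 5.0)
    compressor_path = (6 + delay) + 1 + 1   # order, install, test
    foundation_path = 3 + 1 + 1
    piping_path = 1 + piping + 1
    controls_path = 5 + controls + 1
    return max(compressor_path, foundation_path, piping_path, controls_path)

samples = [sample_project_duration() for _ in range(10_000)]
mean_duration = sum(samples) / len(samples)  # close to 11 weeks here
```

A histogram of `samples` is exactly the kind of chart shown in Exhibit 35, except that the horizontal axis would be weeks rather than dollars.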

Exhibit 35 is a typical histogram for cost risk. This particular histogram shows
that costs may range from about $20M to about $35M. The height of each
vertical bar represents the relative likelihood of a cost in the range shown at
bottom. For example, a cost between $25M and $26M is about twice as likely as
a cost between $22M and $23M.
Exhibit 35--Example of a Histogram

[Bar chart: relative likelihood (vertical scale 0 to 16) versus cost bins from
$20-21M through $34-35M.]

Most risk management software will also produce cumulative histograms, also
called "ogives." These are plots of the running sum of the bar heights in the
histogram. The ogive for the above histogram is shown below in Exhibit 36.
Instead of the actual sum, which is meaningless, the sum has been normalized
to 100%.

From the histogram various statistics can be generated, such as the cost range,
most likely cost, the expected value of cost, the median cost, standard deviation,
etc.

If the software is sophisticated enough, it can generate histograms and ogives for
both cost and schedule risk.

Another statistic commonly used in risk analysis is the percentile. Most
commonly used are percentiles by tens. For example, the tenth percentile is the
cost that has a 90% chance of being exceeded, the 40th percentile is the cost that
has a 60% chance of being exceeded, the 90th is the cost that has only a 10%
chance of being exceeded, etc. It is fairly common that budgets will be based on
percentiles. For example, the project manager may set the project budget or bid
price at the 70th percentile with the expectation that there is only a 30% chance
of cost overrun. On the other hand, in a tight competitive bidding situation, the
budget may be set at the 40th percentile in the hope of underbidding the
competition. This type of decision, of course, represents a tradeoff between
project cost overrun risk and the possibility of not winning a bidding competition.44

Exhibit 36--Relative Likelihood Ogive

[Cumulative plot: percent (0 to 100) versus cost, $20M to $35M.]

Percentiles can be read directly from ogives. In the ogive above, for example,
the 10th percentile is about $23M, and the 70th is about $27.5M. If the budget
were set at the 70th percentile, there would be a 30% chance of overrunning the
budget.
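Extracting a percentile from a pile of simulated outcomes can be sketched as follows. Linear interpolation between closest ranks is one common convention, not necessarily the one your risk software uses, and the cost data here are illustrative only:

```python
def percentile(samples, p):
    """Return the p-th percentile (0 <= p <= 100) of a sample, using
    linear interpolation between closest ranks (one common convention)."""
    s = sorted(samples)
    k = (len(s) - 1) * p / 100.0
    lo = int(k)
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (s[hi] - s[lo]) * (k - lo)

# Illustrative data only: 100 equally spaced cost outcomes, $20M to $35M.
costs = [20 + 15 * i / 99 for i in range(100)]
budget = percentile(costs, 70)  # the cost exceeded only 30% of the time
```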

Chapter 5 review questions

1. In a project you have previously worked on, can you identify a first order
interaction where a schedule delay increased cost? Where an increase in
cost caused a schedule delay? What might you conclude, if anything, from
your answers?

44 This issue is studied in detail in another book of this series, Bidder's Perspective: Designing
and Bidding to Win.
2. In general, do you think that "estimating error" should be assigned as the root
cause for a poor estimate? Does that convey any helpful information?
3. Your local weatherperson has said that the probability of rain tomorrow is
50%. Does this mean that the probability of cloudless skies is 50%?
4. A small business averages $1,000 in revenue on clear (not rainy) days, and
$800 in revenue on rainy days. The average number of rainy days per year in
the area is 30. The business is open 300 days per year. What is the
expected value of the annual revenue of this business?
5. Due to the job opportunity interview of a lifetime, you need to drive 800 miles
in a ten-year old car that is wearing out. If the car has no problems, you
expect to make 400 miles per day, arriving in two days. You have very little
money, and don't want to spend money on repairs before you start, but if you
do need repairs on the way, you can cover them, most likely. You identify the
following outcomes, probabilities, and delays in days. What is the expected
value of the length of the trip, in days?

Outcome                   Probability   Delay, days
Meet the plan             50%           0
Fan belt fails            10%           1
Flat tire                 10%           0
Fuel pump fails           10%           2
Miscellaneous failures    20%           1
6. In problem 5, you expect to get 20 miles per gallon, and gasoline costs $1.50
per gallon. Your gasoline cost will therefore be $60, and is assumed to be
risk free. If there are no problems, you will also need to pay $50 for one night
in a motel. A friend has made sandwiches for you to eat, so your food is free.
Having a flat tire will not cause you to have to spend an extra night in a motel,
but either of the two outcomes that have a one-day delay will cause that to
happen. A fuel pump failure will cause you to spend two extra days in a
motel. The repair cost for a flat tire is $20. For a fan belt, or a
miscellaneous failure, it is $100, including towing. For a fuel pump failure, it
is $150, including towing. What is the expected value of your trip cost?
7. You have been invited to a horse race, but you have no experience with
horse racing. Three horses will race: Alpha, Beta, and Gamma. The track
allows only "win" bets, not bets for "place" (second place) or "show" (third
place). An Alpha win will pay $2 for every $1 bet; a Beta win will pay $3, and
a Gamma win will pay $4. You have $1 you can bet. If you follow the PIR,
which horse should you bet on to maximize the expected value of your
winnings? (This next part is a bit difficult!) If you believe that each horse's
probability of winning is inversely proportional to its win payoff, which horse
should you bet on to maximize the expected value of your winnings?
("Inversely proportional" means proportional to the reciprocal.)
8. In a discrete risk driver description, it is important that one of the outcomes be
to meet the plan. But this is not necessary in a continuous risk driver
description. Explain why.
9. It is possible to have a hybrid risk driver description where there are discrete
outcomes, but each outcome is continuous (i.e., expressed as a range). Can
you think of at least one situation where this would be useful?
10. A project comprises three tasks, A, B, and C. Tasks A and B can both be
started when the project starts. Task C can only start when task B has been
completed. The planned costs and durations of the tasks are as follows:

A: $1,000 2 weeks
B: $1,500 4 weeks
C: $500 2 weeks

What is the planned aggregated cost of the project? Which tasks are on the
critical path? How long is the critical path in weeks?
11. In the project described in question 10, a discrete risk driver (#1) has been
identified that might impact only task A. No other risk drivers have been
identified. There is a 50% chance of meeting the plan for task A, and a 50%
chance of adding 3 weeks to the task. Assume that this task costs $500 per
week at any length. Given that #1 is the only risk driver, what is the expected
value of the duration of the project, and what is the expected value of the
project cost?
12. Toss two coins simultaneously 100 times, and record how many times you
got two heads, how many times two tails, and how many times one head and
one tail. Logic based on the PIR says that if the coins are fair, you should get
two heads about 25 times, two tails about 25 times, and one head and one
tail about 50 times. What did you get?
13. A small project has only one task. The cost is risk free. The duration is
planned for ten weeks. There is a 50% chance of meeting the plan, a 25%
chance of adding three weeks, and a 25% chance of subtracting one week.
Calculate the expected value of the task duration. Also, explain how the
coin toss experiment described in problem 12 can be used as a Monte Carlo
engine to simulate the project. (Hint: Let two tails correspond to adding
three weeks to the project.)

[Cartoon caption: "All's well that ends."]
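The coin-toss engine of question 12 maps naturally onto question 13, because the two-coin outcomes occur with probabilities of 25%, 50%, and 25%. The sketch below substitutes random draws for physical coins; it is a check on the hand calculation, not a replacement for it:

```python
import random

random.seed(1)  # fixed seed so runs are reproducible

def simulate_task_duration():
    """One Monte Carlo trial: toss two fair coins and map the outcome
    to a task duration per question 13's probabilities."""
    heads = sum(random.choice((0, 1)) for _ in range(2))
    if heads == 0:      # two tails (25%): add three weeks
        return 10 + 3
    elif heads == 2:    # two heads (25%): subtract one week
        return 10 - 1
    else:               # one head, one tail (50%): meet the ten-week plan
        return 10

trials = [simulate_task_duration() for _ in range(100_000)]
print(sum(trials) / len(trials))  # sample mean; converges toward the expected value
```

With many trials, the sample mean of the simulated durations settles near the expected value you calculated analytically.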
Appendix: Gantt Charts and Schedule Networks
Most projects of any consequence have a schedule that is supposed to be
followed. If a project comprises nothing more than a simple chain of events,
none of which are worked concurrently, it suffices to simply list the events in the
order of presumed occurrence, and assign each one a start date. The only rule
is that the start date of any task cannot be earlier than the start dates of any prior
tasks. It is presumed that the end date of any task is the day before the start
date of the following task. Only the last task needs to have its end date
specified.

Here is a simple example:

Exhibit 37: Simple Sequential Schedule

Prepare kickoff briefing Jun 1
Present kickoff briefing Jun 8
Prepare list of subject matter experts Jun 9
Contact subject matter experts (SMEs) Jun 15
Prepare SME statements of work Aug 3
Negotiate SME statements of work Aug 12
Contract with SMEs Aug 22
SMEs prepare course materials Aug 30
Review and approve course materials Oct 15
Issue course materials Oct 28-Nov 5

To improve visualization, this simple schedule can be converted to a Gantt
chart. A Gantt chart shows the period of each task as a horizontal bar, as
shown in Exhibit 38 below. This exhibit was prepared in MS Excel, but many
prefer to use a professional scheduling tool such as MS Project.

Exhibit 38: Simple Sequential Gantt Chart

[Gantt chart, Jun through Nov: one horizontal bar per task, from "Prepare
kickoff briefing" through "Issue course materials," each bar beginning after
the preceding bar ends.]
In Gantt format, the sequential nature of this project is obvious. But the Gantt
format is not limited to sequential tasks. Concurrent or parallel tasks can also be
represented. This tends to shorten the project, but may also introduce risks, not
the least of which can be availability of resources. In Exhibit 39, the project
shown in Exhibit 38 is rescheduled with some tasks carried out with partial
concurrency.

Exhibit 39: Sample Gantt Chart with Concurrency


[Gantt chart, Jun through Oct: the same ten tasks, with some bars now
overlapping their predecessors.]

Note that by using some concurrency, the schedule has been shortened by one
month. This has been done without changing the time span for any individual
task. The tasks are all still essentially sequential, but overlap has been used to
shorten the schedule. Use of overlap is based on the assumption that the next
task can be started when the preceding task is only partially completed.

Exhibit 40: Schedule Network for Building a House

[Network diagram: task boxes connected by arrows showing the flow of work and
precedence, from Go through Landscaping, Foundation, Framing, Roofing,
Outside Electrical, Inside Electrical, Dry Wall, Outside Plumbing, Inside
Plumbing, Floors, Outside Finishing, Doors/Windows, Carpeting, and Cleanup,
to Stop.]
In more complex projects, there may be many chains of sequential tasks running
parallel with each other, and interconnecting in complex ways. While the Gantt
format can be used for such projects, the precedence relationships quickly get
lost. For these projects, a schedule network helps visualize the situation much
better. Exhibit 40 is a fairly simple network diagram for building a house. Each
of the tasks is shown in a box. Connecting lines show both the flow of work and
the precedence relationships. Another convention, less frequently used, is to put
the tasks on the connecting lines and use nodes (dots or small circles) to
indicate precedence connections.

If an estimate of duration is applied to each task in a schedule network, it is
possible to determine the duration of each path through the network. The path
with the longest duration is called the critical path. All other paths have slack,
that is, extra time for completion of the tasks on that path. Management typically
focuses on completing each task on the critical path on time, because any delay
on that path lengthens the entire project.
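The forward pass that finds the longest path can be sketched in a few lines. The network and durations below are invented for illustration, loosely echoing the house example; they are not taken from Exhibit 40:

```python
# Forward-pass critical path calculation on a small precedence network.
# durations are in weeks; preds maps each task to the tasks it must wait for.
durations = {"Foundation": 3, "Framing": 4, "Roofing": 2,
             "Plumbing": 3, "Electrical": 2, "Drywall": 2}
preds = {"Foundation": [], "Framing": ["Foundation"], "Roofing": ["Framing"],
         "Plumbing": ["Framing"], "Electrical": ["Framing"],
         "Drywall": ["Plumbing", "Electrical"]}

earliest_finish = {}  # memoized earliest finish time of each task

def finish(task):
    """Earliest finish = task duration + latest earliest finish among predecessors."""
    if task not in earliest_finish:
        start = max((finish(p) for p in preds[task]), default=0)
        earliest_finish[task] = start + durations[task]
    return earliest_finish[task]

project_length = max(finish(t) for t in durations)
print(project_length)  # length of the critical path, in weeks
```

In this toy network the longest chain runs Foundation, Framing, Plumbing, Drywall; the other tasks finish earlier than the project end and therefore carry slack, exactly the property the appendix describes.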
