A powerful language,
with such promise,
is largely being wasted!
People need information to make decisions. They don’t need reams of data; they
need straightforward answers to their questions. They just want to see the
numbers right now!
Right?
Wrong! We’re getting worse. Despite great progress in our ability to gather and
warehouse data, we’re still missing the boat if we don’t communicate the numbers
effectively. Contrary to popular wisdom, information cannot always speak for
itself.
In 1954, Darrell Huff wrote his best-selling book, How to Lie with Statistics, about how people often
intentionally use graphs to spread misinformation, especially in favor of their
own products or causes. Today, vastly more misinformation is disseminated
unintentionally because people don’t know how to use charts to communicate
what they intend.
Example #1
I found this table on the Web site for Bill Moyers’ public television show “Now”.
I felt that it provided important information that deserved a better form of
presentation. In this case the story could be told much better in visual form.
Example #1 - Improved
This series of related graphs tells the story in vivid terms and brings facts to
light that might not ever be noticed in the table.
Example #2
Here’s an example that I pulled from one of your reports. This is typical of many
graphs today—all dressed up, but overdressed to the point of distraction.
Example #2 - Improved
Here’s the same information, presented in a way that tells the story plainly and
clearly.
Example #3
Here’s another public health example from the state of Maine. This graph
contains important patterns that are difficult to discern due to clutter.
Example #3 - Improved
Example #4
If you were asked to tell the story contained in this display, it would take you
some time to put it together before you could even begin to explain it to others.
Example #4 - Improved
In this display of the same information, however, the story is clear and aspects of
the story that weren’t apparent in the pie charts jump right out.
Think
and
Communicate
“Above all else show the data.”
Edward Tufte
This Edward R. Tufte quote is from his milestone work, The Visual Display of
Quantitative Information, published by Graphics Press in 1983.
[Figure: sample tables and graphs of quarterly values (1st Qtr to 4th Qtr, Year 2003) for departments (Sales, Finance, Operations, Marketing) and regions (East, West, North), with revenue in U.S.$. The slide asks: table or graph, and which kind?]
1. You begin by determining the best medium for your data and the
message you wish to emphasize. Does it require a table or a graph?
Which kind of table or graph?
2. Once you’ve decided, you must then design the individual components
of that display to present the data and your message as clearly and
efficiently as possible.
The old saying, “A picture is worth a thousand words,” applies quite literally to
quantitative graphs. By displaying quantitative information in visual form,
graphs efficiently reveal information that would otherwise require a thousand
words or more to adequately describe.
“[When] we visualize the data effectively and suddenly, there is what Joseph
Berkson called ‘interocular traumatic impact’: a conclusion that hits us between
the eyes.” William S. Cleveland, Visualizing Data, Hobart Press, 1993.
Take a moment to identify the various types of information that are revealed by
the shape of the data in this graph.
Tables work great when you need to look up individual facts, but they don’t reveal
trends, patterns, and exceptions very well. This particular table could be improved
through some simple formatting changes to make it easier to connect the data
from column to column,…
Now, however, by expressing this same information visually, thereby giving shape
to the data, the trends come alive.
[Figure: the same data plotted on a normal scale and on a log scale, annotated "Faster rate of increase"]
Not just any graph will do, however. The graph must be designed properly to
display the intended message. In this case, because there is such a big
difference between the total population and that portion of the population that falls
into the highest age groups, a normal scale does not allow us to compare rates of
change. A logarithmic scale, however, supports this nicely. With a log scale, the
same rate of change equals the same slope of the line. Now we can see that the
oldest portion of the population has grown at a faster rate than the population as
a whole through the year 2000 and will resume this faster rate from 2004 through
2040.
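To make the point about slopes concrete, here is a minimal sketch using made-up numbers, not the population data in the graph. The two series, the 10% growth rate, and the helper names `grow` and `steps` are all assumptions for illustration. Both series grow at the same rate, yet only the log-transformed values rise by the same amount each period.

```python
import math

def grow(start, rate, periods):
    """Return values that grow by a constant rate each period."""
    return [start * (1 + rate) ** i for i in range(periods)]

# Hypothetical series: same 10% growth rate, very different sizes.
total = grow(1_000_000, 0.10, 5)   # stands in for a large group
oldest = grow(10_000, 0.10, 5)     # stands in for a small group

def steps(values):
    """Change from one period to the next (the slope per period)."""
    return [b - a for a, b in zip(values, values[1:])]

# On a normal scale the large series climbs far more steeply.
linear_total = steps(total)
linear_oldest = steps(oldest)

# On a log scale both series climb by log10(1.10) every period.
log_total = steps([math.log10(v) for v in total])
log_oldest = steps([math.log10(v) for v in oldest])

print("normal scale, first step:", linear_total[0], linear_oldest[0])
print("log scale, first step:   ", round(log_total[0], 4), round(log_oldest[0], 4))
```

If one series grew at a faster rate, its log-scale steps would be larger, which is what a steeper line on the log-scale graph shows.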
The stories contained in numbers all revolve around relationships. The stories
contained in the numbers that measure public health, in fact, involve six
fundamental types of relationships. If you know the relationship that you’re
trying to communicate graphically and you know the best ways to graphically
encode that relationship, you possess a simple vocabulary that anyone can
learn to communicate numbers effectively.
Allow me to introduce the six relationships that you should get to know.
Relationship? Time-Series
Here’s the same exact data presented in two ways: one using bars and one using
a line. If you want to show trends and patterns of change through time, lines do
this job much more clearly than bars.
Here’s a graph that shows change through time arranged vertically from top to
bottom as a sequence of bars,…
Relationship? Ranking
In this display of trauma registry injuries by county, notice how much more difficult
it is to compare the values and to get a sense of rank when they aren’t
sequenced to reveal the ranking.
Here’s the same data, with the counties arranged alphabetically on the left and by
number of injuries on the right. If the purpose of the display is to look up individual
values, which is the only thing that alphabetical order supports, a table would
work much better. The ranking display on the right, however, tells a story.
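As a rough sketch of the difference, the snippet below orders the same values both ways. The county names and injury counts are invented for illustration, and `text_bars` is a hypothetical helper that draws crude text bars in place of a real graph.

```python
# Hypothetical injury counts by county (illustrative values only).
injuries = {"Adams": 120, "Baker": 340, "Clark": 75, "Dixon": 210}

# Alphabetical order supports lookup of a single value.
alphabetical = sorted(injuries.items())

# Sorting by value, largest first, reveals the ranking.
ranked = sorted(injuries.items(), key=lambda item: item[1], reverse=True)

def text_bars(rows, scale=10):
    """Render (name, value) rows as text bars, one '#' per `scale` units."""
    return [f"{name:<6} {'#' * (value // scale)} {value}" for name, value in rows]

print("\n".join(text_bars(alphabetical)))
print()
print("\n".join(text_bars(ranked)))
```

In the second listing the bars shrink steadily from top to bottom, so the rank of each county can be read from its position alone.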
Relationship? Part-to-Whole
[Figure: individual parts adding up to 100%]
A part-to-whole graph shows how the values associated with the individual
items in a full set of items relate to the whole and to one another.
Part-to-whole relationships are typically displayed as pie charts, but pie charts don’t
communicate very effectively. If you want to see the order of items and to
compare the size of one to another, with this display you would struggle,…
…but with this simple bar graph the story is told simply and clearly.
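A small sketch of the arithmetic behind such a bar graph follows. The group names and counts are assumptions made up for illustration. Each part is converted to a percentage of the whole and the parts are then sorted, so both order and relative size are visible.

```python
# Hypothetical counts per category (illustrative values only).
counts = {"Group A": 45, "Group B": 30, "Group C": 15, "Group D": 10}

whole = sum(counts.values())

# Express each part as a percentage of the whole.
shares = {name: 100 * value / whole for name, value in counts.items()}

# Order the parts from largest to smallest share.
ordered = sorted(shares.items(), key=lambda item: item[1], reverse=True)

for name, pct in ordered:
    # One '#' per two percentage points, as a crude text bar.
    print(f"{name:<8} {'#' * round(pct / 2):<25} {pct:.1f}%")
```

Because the bars share a common baseline, comparing two parts means comparing two lengths rather than two angles or areas.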
Relationship? Deviation
A deviation graph shows how one or more sets of values differ from a
reference set of values, such as the deviation between expected and actual
cases of flu shown here.
When people need to see the differences between things, show them the
difference directly, rather than showing them the two sets of values and forcing
them to build a new picture in their heads of how they differ. The difference
between the median annual household income in Utah and in the U.S. as a whole
isn’t as easy to see in this graph,…
…as it is in this one, which directly expresses how household income in Utah
differs from the U.S. as a whole in positive and negative dollars.
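Showing the difference directly amounts to a simple subtraction done before graphing. The sketch below uses invented monthly figures, not the flu or income data in the examples, and plots nothing. It only computes the signed deviations that a deviation graph would encode.

```python
# Hypothetical expected vs. actual case counts by month (illustrative only).
months = ["Jan", "Feb", "Mar", "Apr"]
expected = [200, 180, 150, 100]
actual = [230, 170, 150, 125]

# Deviation of each actual value from its reference value.
deviation = [a - e for a, e in zip(actual, expected)]

for month, diff in zip(months, deviation):
    # The explicit sign shows at a glance whether a month ran above or below.
    print(f"{month} {diff:+d}")
```

Graphing `deviation` around a zero line spares the reader from subtracting one set of values from the other mentally.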
Relationship? Distribution
This pair of histograms, one for boys and one for girls, is arranged in a way
that makes the patterns of each easy to see, yet still easy to compare.
Even better, by using lines rather than bars, the separate patterns can be shown
in the same graph in a way that features the shape of the patterns and how they
differ.
Relationship? Correlation
A correlation graph shows whether two paired sets of measures vary in relation
to one another, and if so, in which direction (positive or negative) and to what
degree (strong or weak). If the trend line moves upwards, the correlation is
positive; if it moves downwards, it is negative. A positive correlation indicates
that as the values in one data set increase, so do the values in the other data
set. A negative correlation indicates that as the values in one data set increase,
the values in the other data set decrease. In a scatter plot like this, the more
tightly the data points are grouped around the trend line, the stronger the
correlation.
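Direction and strength can also be summarized as a single number. The sketch below computes the Pearson correlation coefficient, a standard measure chosen here for illustration rather than something this presentation prescribes. The paired values and the function name `pearson_r` are assumptions.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical paired measures (illustrative values only).
measure_a = [30, 45, 60, 75, 90]
measure_b = [6.5, 5.8, 4.1, 3.0, 1.9]

r = pearson_r(measure_a, measure_b)

# The sign gives the direction; a magnitude near 1 means a strong correlation.
print(round(r, 2))
```

A value near +1 or -1 corresponds to points hugging the trend line, and a value near 0 corresponds to a loose cloud.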
I didn’t see many displays of correlations in the public health data that I
reviewed before the conference, but many interesting correlations live in your
data. In this example, I’m using WHO data to explore the correlation between
adult literacy and fertility rate by country. A correlation clearly exists: higher
literacy corresponds to lower rates of fertility. It is also clear from this display
that the highest rates of fertility all occur in Africa (the blue circles), with the
one exception of Yemen (the one green circle at the high end of fertility).
• Time-series
• Ranking
• Part-to-whole
• Deviation
• Distribution
• Correlation
Sometimes the stories that numbers tell involve multiple relationships that must
be shown together. Although these pie charts don’t do a good job of showing the
changing relationship between these age groups from year to year, the line graph
below works like a charm. It is now easy to see that in 2004 the number of people
75 years and older surpassed in population those from 65-74, and that it is
projected that before 2050 those 75 and older will surpass the 55-64 year old age
group as well.
It all makes perfect sense if you think about it. Here, the use of varying intensities
of color along a gray scale, from black for the rate of asthma diagnoses among
males, medium gray for rate among females, to light gray for the combined rate of
both genders, visually suggests a ranking of importance, the darker the greater,
which isn’t appropriate. Variations in intensity work wonderfully when you want to
communicate a ranked relationship, but shouldn’t be used arbitrarily. Entirely
different hues would do this job better.
Simple guidelines exist for keeping clutter to a minimum so people can clearly
see and think about the data. Follow the advice of Thoreau when he wrote
“Simplify, simplify, simplify.”
Here’s an example of a clutter reduction technique that I found in one of your data
displays. Trying to tell this entire story with a single graph would have been
cluttered beyond comprehension, but by separating it into two graphs that share
the same quantitative scale and arranging them close together as you see here,
the information is clear and, though separated, the percentages of asthma attacks
among children three to ten years of age in the top graph and children 11-17
years of age in the bottom graph can still be compared quite easily.
Once you have the basics down, you can begin to tell more complex stories using
more advanced techniques, such as this example from GapMinder, which uses
an animated display to tell the story of how the correlation between the number of
births per woman and mortality among young children throughout the world has
changed through time.
To communicate
or not to communicate,
that is the question!
This important health information relies on
you to give it a clear voice.
The good news is, although the skills required to present data effectively are
not all intuitive, they are easy to learn. This won’t happen, however, until you
recognize the seriousness of the problem and commit yourself to solving it. It is
up to you. It’s worth the effort. If the data is important enough to communicate,
it is worth communicating well.