Data Visualization,
Volume 1
Recent Trends and Applications
Using Conventional and Big Data
Amar Sahay
Abstract
Data visualization involves graphical and visual tools used in data analysis
and decision making. The emphasis in this book is on recent trends and
applications of visualization tools using conventional and big data. These
tools are widely used in data visualization and quality improvement to
analyze, enhance, and improve the quality of products and services. Data
visualization is an easy way to obtain a first look at the data. The
book provides a collection of visual and graphical tools widely used to
gain an insight into the data before applying more complex analysis. The
focus is on the key application areas of these tools including business
process improvement, business data analysis, health care, finance, manufacturing, engineering, process improvement, and Lean Six Sigma. The
topics covered include data and data analysis concepts, recent
trends in data visualization and Big Data, widely used charts and graphs
and their applications, analysis of the relationships between two or more
variables graphically using scatterplots, bubble graphs, matrix plots, etc.,
data visualization with big data, computer applications and implementation of widely used graphical and visual tools, and computer instructions
to create the graphics presented along with the data files.
Keywords
big data, big data software, business analytics, business intelligence, charts
and graphs, data, data analysis, data visualization, information visualization, quality tools, software applications, visual representation
Contents
Preface
Acknowledgments
Computer Software Integration, Computer Instructions and Data Files
Preface
The purpose of this book is to introduce graphical and data visualization tools. These tools are widely used in data visualization and quality
improvement to analyze, enhance, and improve the quality of products
and services. Visual tools are an easy way to gain a first look at the data,
and they are used to gain insight into the data before applying more complex analysis. This book presents a collection of visual and
graphical tools, commonly referred to simply as graphical tools.
A number of charts and graphs are commonly used to create visuals that
provide a quick summary and reveal trends and patterns in the data that are not
usually apparent from the data in raw form. The first part of the book
presents the applications of widely used charts and graphs.
This book provides a set of graphical and information visualization tools
that have been developed and used over the years in quality improvement and Lean Six Sigma programs. The use of these data visualization
and quality tools is not limited to quality programs. Some of the key
areas where these tools are applied include business process improvement,
business data analysis, health care, finance, manufacturing, engineering
process improvement, and product and process design.
The visuals and graphs in this book represent data visually, which enables
an analyst to immediately view the important features and characteristics
of data. The graphs and charts explain the current state of a process and
also provide opportunities for improvement. Some of the visual displays,
for example, flow diagrams and value stream mapping, have been successfully used in studying, developing, and improving business and engineering processes. They also help redesign processes to be more efficient. Besides
improving the process design, many specially designed graphs and charts
are used in product and process design and improvement. In many cases,
these visual tools give an idea of the variation in the process, which
creates an opportunity to reduce it. Variation reduction is one
of the major goals of process improvement and quality improvement.
These graphical tools are critical in problem solving.
The graphical tools discussed in this book have been successfully applied to
This list consists of the goals of the overall quality program, but many
of these problems can be solved using simple but effective graphical tools
leading to product and service excellence.
Current tools for data analysis, data visualization, and visual analytics are capable of processing large amounts of data, often referred to as
big data. Using big data and data visualization tools, several variables of a
process (e.g., a business process) can be plotted simultaneously and presented using dashboards. These graphs and charts immediately provide an
overall visualization of a business process, including sales, revenue, and profitability, and they point out problems across the entire supply chain.
Dashboards created from business process data can tell the entire story and can help management identify the strengths and weaknesses of the business. Since the human brain cannot visualize information
beyond three dimensions, plotting several variables simultaneously in two
dimensions can provide valuable insight.
The following are the highlights and areas this book discusses:
Data and data analysis concepts
Recent trends in data visualization and big data
Visual representation of data: widely used charts and graphs and
their applications
Investigation of the relationships between two or more variables
graphically: scatter plots, bubble graphs, and matrix plots
Data visualization with big data
Acknowledgments
I would like to thank the reviewers who took the time to provide
excellent insights which helped shape this book.
I would especially like to thank Mr. Karun Mehta, a friend and engineer. I greatly appreciate the numerous hours he spent in correcting,
formatting, and supplying distinctive comments. The book would not be
possible without his tireless effort.
I would like to express my gratitude to Prof. Susumu Kasai, professor
of CSIS, for reviewing and offering invaluable suggestions.
I am very thankful to Prof. Edward Engh for his thoughtful advice
and counsel. Ed has been a wonderful friend and colleague.
Special thanks go to Mr. Anand Kumar, a senior consultant at Tata
Consultancy Services (TCS), for reviewing and providing invaluable
suggestions.
Thanks to all of my students for their input in making this book possible. They have helped me pursue a dream filled with lifelong learning.
This book would not be a reality without them.
I am indebted to the senior acquisitions editor Scott Isenberg, director of production Charlene Kronstedt, all the reviewers, and the publishing team at Business Expert Press for their counsel and support during the
preparation of this book. I also wish to thank Donald N. Stengel, editor,
for reviewing the manuscript and providing helpful suggestions for improvement. I acknowledge the help and support of the team at S4Carlisle Publishing
Services, Chennai, India, with editing and publishing.
I would like to thank my parents who always emphasized the importance of what education brings to the world. Lastly, I would like to
express a special appreciation to my wife Nilima, to my daughter Neha
and her husband David, my daughter Smita, and my son Rajeev for their
love, support, and encouragement.
Computer Software Integration, Computer Instructions and Data Files
The book uses widely used software packages. The materials with the computer instructions are included in Appendix A of
the book. To facilitate using the book, the computer instructions are provided for both EXCEL and
MINITAB. The following
data files can be downloaded using the link provided:
EXCEL Data Files
MINITAB Data Files
The data files can be downloaded from the web using the following link:
URL: http://www.businessexpertpress.com/books/data-visualizationvolume-one-recent-trends-and-applications-using-conventional-andbig-data
CHAPTER 1
[Figures: charts of sales ($ billion) by year, 2002-2011; company labels include Dell, Yahoo, eBay, and Amazon]
[Figure: pie chart of sales shares: CPU Sales 14%, iPhone Sales 54%, iPod Sales 5%, Other 3%, iTunes Store 4%, iPad Sales 20%]
Most of the graphs in this text can be produced using statistical and
data visualization software. We will illustrate several examples where computer software such as EXCEL and MINITAB is used to construct
the charts and graphs. Some other graphical displays, for example, flow
diagrams, process maps, and value stream maps, are widely used in studying and improving processes. These are created using specialized software.
MINITAB's Quality Companion, Microsoft Visio, and SmartDraw
are some of the widely used programs for this purpose. Another widely
used software package for data visualization and visual analytics is Tableau Software. This software is capable of handling big data and creates high-level
graphs and charts to visually display data. An added feature of Tableau is
its built-in analytics, which can answer many queries not apparent from the graphs and charts alone.
plots, and symmetry plots and their applications are explained with
examples.
Chapter 4
Chapter 4 presents graphical techniques of investigating the relationships
between two or more variables. The most commonly used graphs and
plots for this purpose are as follows:
Scatter plots and variations of scatter plots
Scatter plots with histogram, box plots and dot plots
Scatter plot with fitted line or curve
Scatter plot showing an inverse relationship between X and Y
Scatter plot showing a nonlinear relationship between X and Y
Scatter plot showing a nonlinear (cubic) relationship between X and Y
Bubble graphs showing the relationship between three variables
Matrix plots that investigate the relationship between several independent variables and the response or dependent variable
Three-dimensional plots, surface plots, and contour plots
These graphs along with their applications are explained.
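As a rough illustration of the "scatter plot with fitted line" idea listed above, the sketch below computes an ordinary least-squares line through a set of points and reports whether the relationship is positive or inverse. The book itself works in EXCEL and MINITAB; Python is used here only as a stand-in, and the function names `fit_line` and `relationship` and the sample values are hypothetical, invented for this example.

```python
# Minimal sketch: least-squares line for a scatter plot, standard library only.
# The data below are made-up illustration values, not data from the book.

def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line y = intercept + slope * x."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least two points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and sum of cross-deviations of x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0:
        raise ValueError("all x values are identical; the slope is undefined")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def relationship(slope):
    """Describe the direction of a linear relationship from the sign of the slope."""
    if slope > 0:
        return "positive"
    if slope < 0:
        return "inverse"
    return "none"


if __name__ == "__main__":
    x = [1.0, 2.0, 3.0, 4.0, 5.0]   # hypothetical independent variable
    y = [2.1, 3.9, 6.2, 7.8, 10.1]  # hypothetical response variable
    slope, intercept = fit_line(x, y)
    print(f"fitted line: y = {intercept:.2f} + {slope:.2f} x ({relationship(slope)})")
```

The fitted slope and intercept define the line that would be drawn over the scatter plot; a straight line is only appropriate when the points look roughly linear, so the nonlinear and cubic cases listed above would need a curve instead.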
Chapter 5
This chapter discusses data visualization techniques using big data. The
current trend is visualization with big data. Data visualization makes
complex and large data understandable. The chapter provides an introduction to big data and its applications in different fields, including business, health care, government, manufacturing, and others. The
emerging trends in big data, visual analytics, and software products in
this area are introduced. Examples of processing business data using the
Tableau software and dashboards are presented. Big data software provides
a number of views and graphs of the same data. The chapter discusses the
emerging need for visualization with big data in the light of the increase
in the volume of data being collected and stored and the challenges of
Summary
This chapter provided an overview of data visualization and its importance. It has laid a foundation for the rest of the book by outlining the
chapter contents. Each chapter presents a class of graphical tools that can
be applied in areas ranging from simple to advanced analysis. The charts
and graphs find wide application in data analysis and also in quality improvement projects to detect and solve a number of problems. These
graphs and charts are critical in understanding the process from which
data are collected. Each chapter in the book is devoted to a particular class
of graphical and visual tools ranging from most commonly used graphical
tools to data visualization using big data.
Index
Add trend line, 136
Amazon.com, 102
Analytics, 5
Apple Inc., 4
Area graph, 58–59
Area plot, 71
Bar chart, 36, 67, 145–146
  categorical data, 37–38
  of categorical variables, 156–157
  creating, 42–44
  description of, 124, 154–155
  horizontal, 155–156
  instructions for constructing, 125
  of monthly sales, 37
  of sales vs. month, 37
BI. See business intelligence (BI)
Big data, 5, 17–18
  applications of, 98
  businesses, 101–102
  characteristics, 98–99
  description of, 98–99
  education, 101
  health care, 101
  Internet of Things (IoT), 102–103
  introduction and applications of, 98–99
  managing and handling, 108
  manufacturing/operations, 100–101
  media, 102
  real estate, 103
  science and research, 103
  software and applications of, 99–100
  tools to process, 17
  United States of America, 100
Big Data Software Tableau, 109–110
Bins definition, 117–118
Bivariate relationship, 75
3D scatterplot, 85–89, 95
  with projected lines, 89–90
3D surface plot, 90–91
  with projected line, 90
Data, 10–11
  available, 12
  based on types of measurement scales, 13–16
  classification of, 10–11, 15
  collection and presentation of, 19–20
  for research and analysis, 11–13
  graphical summary of, 27, 65
  hidden patterns in, 16
  histogram, 23–27
  key features of, 3
  levels of measurements, 13
  objectives of, 15
  organizing, 20–21
  sets of, 127
  sources of, 11–13
  summarizing quantitative, 21–23
  visual representation of, 3, 19
Data analysis
  concepts of, 114
  functions of, 9
  statistical thinking in, 15–16
Data analytics, 97–98, 104
Data mining, 16–17, 98, 104
  tools, 102
Data set, description of, 9
Data visualization, 97–98, 107–108, 113
  advancements in, 99–100
  applications, 107
  forms of, 104
  fundamental concepts in, 103–104
  information displays, 107–108
  objectives of, 97
  quantitative messages conveyed by, 105–106
  software, 5, 106–110
  techniques of, 99
  terminology for, 107
Default histogram, 150
Dependent variable, 75
Discrete data, 10–11, 18
Dot plot, 35–36, 67
ebay.com, 102
Economic data, 12
Education, 101
Effective graphical displays, 105
Exabytes, 17
EXCEL, 5, 114
  bar chart, 124–128
  frequency polygon, 132–134
  histogram using, 117–121
  line chart or connected line plot, 128–129
  pie chart, 130–132
  plot of cumulative frequency, 122–124
  scatterplots, 134–135
Experimental design, 13
Fact-based decision making, 17
Fitted distribution, 65
Format trend line, 136
Frequency distribution, 21–23
  of categorical data, 123
  graph of, 23–27
  using pivot chart, 140–141
Frequency polygon
  description of, 132–133
  of income data, 134
  instructions for constructing, 133
Geographic positioning, 102
Google, 12, 102
Government agencies, 13
Graphical displays, 105
  of variation, 27–28
Graphical summary of data, 27
Graphical techniques, description of, 9
Graphical tools, 5, 113
Graphs, 3, 10, 11, 107–108
  data and process analysis, 16
  editing tools for, 120
  summary of, 65–73, 92–96
Graphs displaying variation, 65
Hand-drawn visualizations, 106–107
Health care, 101
Histogram, 62, 65, 72
  constructing, 149
  default, 150
Independent variable, 75
Infographics, 104
Information displays, 107–108
Information graphics, 104
Information visualization, 107, 108
Internet of Things (IoT), 102–103
Internet sites, 12
Interval plot, 50–54, 70
  description of, 50
  of piston ring diameters, 51
  variation in sample data, 50–51
  yield grouped with temperature time series, 52
Interval scale, 14
Inverse relationship, 77
IoT. See Internet of Things (IoT)
  of response variable, 88
  showing relationship between multiple variables, 87
McKinsey Global Institute study, 101
Media, 102
Microsoft Visio, 5, 114
MINITAB, 5, 114
  constructing a default histogram, 149–153
  histograms using, 147–149
More than ogive, 162
Multiple time series plot, 138, 173
Negative relationship, 77
Neural network techniques, 100–101
Nominal scale, 14
Petabytes, 17
Pie chart, 48–50, 69
  description of, 48, 130, 159–160
  federal expenditure, 160
  instructions for constructing, 130–131
  percent in each category, 132
  showing actual values in each category, 131
  variations of, 49–50
Pivot chart, 139
  frequency distribution and histogram using, 140–141
  to summarize qualitative or categorical data, 142–144
Pivot table, 139
Positive relationship, 76
Probability, 114
Probability plot, 59–62, 71, 72
Process analysis, objectives of, 15
Processes of data, 13
Qualitative data, 10
Qualitative variable, 14