A brief overview
An Introduction
Big data analytics is where advanced analytic techniques operate on big data
sets. Hence, big data analytics is really about two things plus how the two
have teamed up to create one of the most profound trends in business
intelligence (BI) today.
Analytics helps us discover what has changed and how we should react. It is
also the best way to discover new customer segments, identify the best
suppliers, associate products of affinity, and understand sales seasonality.
To help user organizations select the right form of analytics and prepare big
data for analysis, this report will discuss new options for advanced analytics
and analytic databases for big data so that users can make intelligent
decisions as they embrace analytics.
Advanced analytics is a collection of related techniques and tool types, usually including
predictive analytics, data mining, statistical analysis, and complex SQL.
The three Vs of big data (volume, variety, and velocity) constitute a
comprehensive definition, and they bust the myth that big data is only about
data volume. In addition, each of the three Vs has its own ramifications for
analytics.
It's obvious that data volume is the primary attribute of big data. With that
in mind, most people define big data in terabytes. Some organizations find
it more useful to quantify big data in terms of time.
One of the things that make big data really big is that it's coming from a
greater variety of sources than ever before. Many of the newer ones are Web
sources, including logs, clickstreams, and social media.
The few organizations that have been analyzing this data now do so at a
more complex and sophisticated level. Big data isn't new, but the effective
analytical leveraging of big data is. The recent tapping of these sources for
analytics means that so-called structured data is now joined by unstructured
data and semi-structured data.
With big data, variety is just as big as volume. In addition, variety and
volume tend to fuel each other. Big data can also be described by its velocity
or speed.
For example, think of the stream of data coming off of any kind of device or
sensor, say robotic manufacturing machines, thermometers sensing
temperature, microphones listening for movement in a secure area, or video
cameras scanning for a specific face in a crowd.
Firms have been collecting data from Web sites for years, using streaming data
to make purchase recommendations to Web visitors. With sensor and Web data
flying at you relentlessly in real time, data volumes get big in a hurry.
Big data is an all-encompassing term for any collection of data sets so large
and complex that it becomes difficult to process using on-hand data
management tools or traditional data processing applications.
The trend to larger data sets is due to the additional information derivable
from analysis of a single large set of related data, as compared to separate
smaller sets with the same total amount of data, allowing correlations to be
found to "spot business trends, prevent diseases, combat crime and so on."
The primary goal of big data analytics is to help companies make better
business decisions by enabling data scientists and other users to analyze
huge volumes of transaction data as well as other data sources that may be
left untapped by conventional business intelligence (BI) programs.
Big data analytics can be done with the software tools commonly used as
part of advanced analytics disciplines such as predictive analytics and data
mining.
The technologies associated with big data analytics include NoSQL
databases, Hadoop and MapReduce.
Predictive analytics uses techniques like simulation, statistics, and machine
learning to extrapolate from past data or behavior to predict what might
happen. Variations might be introduced so that you can get an idea of future
results if you increase your sales force by 10%, decrease your price by 5%, or
increase your manufacturing capacity.
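The what-if idea above can be sketched in a few lines. This is only an
illustration under stated assumptions: the price and sales history is made up,
and a straight-line least-squares fit stands in for whatever statistical or
machine learning model an organization would actually use.

```python
from statistics import mean

def fit_line(xs, ys):
    """Ordinary least-squares fit of y = a + b * x; returns (a, b)."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = y_bar - b * x_bar
    return a, b

# Hypothetical history (not from the source): unit price and units sold.
prices = [10.0, 11.0, 12.0, 13.0, 14.0]
units = [200.0, 190.0, 180.0, 170.0, 160.0]

intercept, slope = fit_line(prices, units)

def predict_units(price):
    """Extrapolate expected unit sales from the fitted line."""
    return intercept + slope * price

# What-if scenario: decrease the current price by 5%.
current_price = 12.0
scenario_price = current_price * 0.95
baseline = predict_units(current_price)
scenario = predict_units(scenario_price)

print(f"baseline units: {baseline:.1f}")
print(f"units after a 5% price cut: {scenario:.1f}")
```

The same pattern applies to the other variations mentioned: change one input
(sales force, capacity), rerun the model, and compare against the baseline.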
Big science
The Large Hadron Collider experiments represent about 150 million sensors
delivering data 40 million times per second. There are nearly 600 million
collisions per second. After filtering discards more than 99.999% of these
streams, about 100 collisions of interest per second remain.
As a result, working with less than 0.001% of the sensor stream data, the
data flow from all four LHC experiments amounts to an annual rate of 25
petabytes before replication (as of 2012), and nearly 200 petabytes after
replication.
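As a quick sanity check on these figures, the following sketch recomputes the
retained fraction and the replication factor. It uses only the rounded numbers
quoted above, so the results are approximate.

```python
# Rounded figures quoted in the text.
collisions_per_second = 600e6   # "nearly 600 million" collisions per second
kept_per_second = 100           # collisions of interest per second

# Share of collisions that survive filtering, as a percentage.
kept_percent = kept_per_second / collisions_per_second * 100

# Annual data volume before and after replication, in petabytes.
annual_pb_before = 25
annual_pb_after = 200
replication_factor = annual_pb_after / annual_pb_before

print(f"kept: {kept_percent:.7f}% of collisions")
print(f"replication multiplies storage by about {replication_factor:.0f}x")
```

The retained share comes out well below the 0.001% ceiling stated in the text,
and replication multiplies the stored volume roughly eightfold.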
International development
Research on the effective usage of information and communication
technologies for development suggests that big data technology can make
important contributions to international development, but that it also
presents unique challenges.
Market
Big data has increased the demand for information management specialists so much that
Software AG, Oracle Corporation, IBM, FICO, Microsoft, SAP, EMC, HP and Dell
have spent more than $15 billion on software firms specializing in data
management and analytics.
In 2010, this industry on its own was worth more than $100 billion and was growing
at almost 10 percent a year: about twice as fast as the software business as a whole.
Developed economies make increasing use of data-intensive technologies. There are
4.6 billion mobile-phone subscriptions worldwide and there are between 1 billion and
2 billion people accessing the internet.
The world's effective capacity to exchange information through telecommunication
networks was 281 petabytes in 1986, 471 petabytes in 1993, 2.2 exabytes in 2000, 65
exabytes in 2007 and it is predicted that the amount of traffic flowing over the internet
will reach 667 exabytes annually by 2014.
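To make the growth in these capacity figures concrete, the sketch below
computes the compound annual growth rate between consecutive data points. It
assumes the decimal convention of 1 exabyte = 1000 petabytes, and it leaves out
the 667 exabyte forecast because that figure measures annual internet traffic
rather than exchange capacity.

```python
# Capacity figures from the text, converted to petabytes
# (assuming 1 exabyte = 1000 petabytes).
capacity_pb = {1986: 281, 1993: 471, 2000: 2200, 2007: 65000}

def cagr(start_value, end_value, years):
    """Compound annual growth rate between two values."""
    return (end_value / start_value) ** (1 / years) - 1

years = sorted(capacity_pb)
for start, end in zip(years, years[1:]):
    rate = cagr(capacity_pb[start], capacity_pb[end], end - start)
    print(f"{start}-{end}: {rate:.1%} per year")
```

The rates rise sharply from one seven-year period to the next, which is the
acceleration the raw figures imply.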
Technologies
Big data requires exceptional technologies to efficiently process large
quantities of data within tolerable elapsed times.
A 2011 McKinsey report suggests suitable technologies include A/B testing,
crowdsourcing, data fusion and integration, genetic algorithms, machine
learning, natural language processing, signal processing, simulation, time
series analysis and visualisation.
Practitioners of big data analytics generally avoid slower shared storage,
preferring direct-attached storage (DAS) in its various forms, from solid
state drives (SSD) to high-capacity SATA disks buried inside parallel
processing nodes.
Drivers
Marketing optimization
Marketing has evolved from a creative process into a highly data-driven
process. Marketing organizations use analytics to determine the outcomes
of campaigns or efforts and to guide decisions for investment and consumer
targeting.
Portfolio analysis
A common application of business analytics is portfolio analysis. In this, a
bank or lending agency has a collection of accounts of varying value and
risk. The accounts may differ by the social status of the holder, their
geographical location, their net value, and many other factors. The lender must
balance the return on the loan with the risk of default for each loan.
The analytics solution may combine time series analysis with many other
factors in order to decide when to lend money to these different borrower
segments, or what interest rate to charge members of a portfolio segment to
cover any losses among members in that segment.
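The idea of pricing a segment so that its interest covers its own losses can be
sketched as a break-even calculation. This is a simplified one-period
illustration, not a lender's actual model: the default probabilities, funding
cost, and recovery rate are hypothetical inputs, and the function name is
invented for this example.

```python
def break_even_rate(default_prob, funding_cost=0.0, recovery=0.0):
    """Interest rate at which expected repayment equals the lender's cost.

    Solves (1 - p) * (1 + r) + p * recovery = 1 + funding_cost for r,
    where p is the segment's default probability and recovery is the
    fraction of principal recovered from a defaulted loan.
    """
    if not 0.0 <= default_prob < 1.0:
        raise ValueError("default_prob must be in [0, 1)")
    return (1.0 + funding_cost - default_prob * recovery) / (1.0 - default_prob) - 1.0

# Hypothetical borrower segments with assumed default probabilities.
segments = {"low_risk": 0.01, "medium_risk": 0.05, "high_risk": 0.15}

for name, p in segments.items():
    rate = break_even_rate(p, funding_cost=0.03)
    print(f"{name}: charge at least {rate:.2%}")
```

Riskier segments need a higher rate because fewer borrowers are expected to
repay, so each repaying borrower must cover a larger share of the losses.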
Risk analytics
Predictive models are widely developed in the banking industry to bring more
certainty to the risk scores of individual customers. Credit scores are
built to predict an individual's delinquency behaviour and are widely used
to evaluate the creditworthiness of each applicant when processing a loan
application. Furthermore, risk analyses are carried out in the scientific
world and the insurance industry.
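A credit score of this kind is often implemented as a model that maps applicant
attributes to a probability of delinquency. The sketch below uses a logistic
function for that purpose. It is illustrative only: the feature names,
coefficients, and decision threshold are hypothetical, whereas a real scorecard
would estimate them from historical loan data.

```python
import math

# Hypothetical coefficients; a real model would be fitted to past loans.
INTERCEPT = -3.0
WEIGHTS = {"missed_payments": 0.8, "utilization": 1.5, "years_of_history": -0.1}

def delinquency_probability(applicant):
    """Logistic model: probability that the applicant becomes delinquent."""
    z = INTERCEPT + sum(WEIGHTS[k] * applicant[k] for k in WEIGHTS)
    return 1.0 / (1.0 + math.exp(-z))

def decide(applicant, threshold=0.2):
    """Approve when predicted risk is below the (assumed) threshold."""
    return "approve" if delinquency_probability(applicant) < threshold else "review"

good = {"missed_payments": 0, "utilization": 0.2, "years_of_history": 10}
risky = {"missed_payments": 3, "utilization": 0.9, "years_of_history": 1}

for label, applicant in (("good", good), ("risky", risky)):
    p = delinquency_probability(applicant)
    print(f"{label}: p(delinquent) = {p:.3f}, decision = {decide(applicant)}")
```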
Discovery
Iteration
Flexible Capacity
Mining and Predicting
Decision Management
Across all industries, the business case for big data is strongly
focused on addressing customer-centric objectives.
A scalable and extensible information management foundation is a
prerequisite for big data advancement.
Organizations are beginning their pilots and implementations by
using existing and newly accessible internal sources of data.
Advanced analytic capabilities are required, yet often lacking, for
organizations to get the most value from big data.
As awareness and involvement in big data grows, four key stages of
big data adoption emerge along a continuum.
Based on the results of the above study, the following approaches were devised
to help big players as well as small firms compete in the market:
Predict exactly what customers want before they ask for it.
Get customers excited about their own data.
Improve customer service interactions.
Identify customer pain points and solve them.
Reduce health care costs and improve treatment.
GE
KAGGLE
AYASDI
MOUNT SINAI ICAHN SCHOOL OF MEDICINE
THE WEATHER COMPANY
KNEWTON
SPLUNK
GNIP
EVOLV
Market Opportunities
Big Data is the biggest game-changing opportunity for marketing and sales
since the Internet went mainstream almost 20 years ago. New technologies
as well as rapidly proliferating channels and platforms have created a
massively complex environment. At the same time, the explosion in data and
digital technologies has opened up an unprecedented array of insights into
customer needs and behaviors. Companies that use Big Data and analytics
effectively show productivity rates and profitability that are 5 to 6 percent
higher than those of their peers. Data on its own, however, is nothing more
than 1s and 0s.
The companies that succeed today do three things well:
Use analytics to identify valuable opportunities.
Start with the consumer decision journey.
Keep it fast and simple.
Market Scenario
Chances of Growth
Recent Deals