Leigh 1

Ryan Leigh
Connie Douglas
ENG-2116
21 November 2016

Purpose of Report
The purpose of this report is to inform the reader about the current state of Big Data and
data science applications. This paper will focus on how Big Data and data science, also
referred to in this paper as data analytics, are being applied in businesses across various
industries. Included are analytics trends that are frequently seen in the business world,
discussion of frequently used technical practices, as well as some of the opportunities and
constraints facing data and analytics.
Current Situation
Over the past decade, there has been a subtle, yet significant shift in how businesses and
society as a whole are producing and utilizing data. Data has long been collected
for different purposes, but it was often misused or underutilized. Data analysts were often dedicated to
scientific disciplines and had little role in other industries. More recently, new technologies have
allowed for better production and collection of massive amounts of information, known as Big
Data. Businesses and other areas of society began to realize the benefits this data could produce,
but needed the people and methods to properly understand and utilize it. As a result,
researchers have combined methods from multiple scientific and mathematical backgrounds to
develop a new discipline known as data science. Data science can be defined as a set of
disciplines necessary to solve big data challenges (Song and Zhu, 2016).
The demand for analytics in Big Data is increasing exponentially, but there is still a
shortage in the talent and technology needed to see the full benefits. There are few educational
institutions in the United States (U.S.) that offer data science programs, and those that do have
just started to fully integrate data science programs into their curriculum. Despite more people
entering the field, the U.S. will still face a shortage of 140,000 to 190,000 data
analysts by 2018 (Carter and Sholler, 2016). As businesses increase their focus on acquiring
and analyzing data, the hope is that there will be more education and interest in the field in the
future.
Big Data in the Business World
Businesses across a diverse variety of industries are taking advantage of the benefits Big
Data provides. There are several examples of how Big Data practices are being implemented in
the following industries: retail, pharmacology, law enforcement, and healthcare.
Retail Industry
The retail industry was one of the first industries to begin using data to streamline
business practices. Big Data has helped retailers with marketing and advertising, determining
price, risk analysis, and more. Facebook and Google are two of the most recognizable examples
of companies that use the massive amounts of data they collect from users to target products,
services, and advertisements (Ranjan Das, 2016). Companies can also use this information to
refine price discrimination tactics, in which prices are adjusted to match what different
customers are willing to pay. Retailers are also able to gain access to better data in real time, allowing them to
forecast events faster and more accurately.
Healthcare
In pharmacology, analytics techniques are being used to study topics such as the
effectiveness of a particular drug against a disease, or to track where a disease is most likely to break
out and who will be most susceptible to contracting it. Other fields in the healthcare industry use
data gathered from electronic healthcare records, sensors, and biomedical imaging, such as CT
scans, to provide better methods of care (Yang and Veltri, 2015). In addition to improved
predictive modeling techniques used to develop better treatment plans for patients, hospitals and
doctors are using Big Data to enhance fields like telemedicine, where they can monitor a
patient's vital signs remotely and help treat injury or illness without the patient having to
visit.
Law Enforcement
Law enforcement agencies are also using Big Data and analytics to help prevent crime
before it happens and to solve existing cases. A technique known as cluster analysis is used to
detect criminal hot spots by comparing relationships between different criminal entities
(Hassani, et al., 2016). Police are able to discover crime patterns by identifying suspects and
linking them to related criminal activity. Another example includes techniques used to organize
crime data and activities into large data sets.
These industries are far from the only ones to have adopted Big Data and data science
practices with the hopes of becoming more efficient and successful. While each field has
benefited immensely from the value of data analytics, data science itself is still in its early stages.
As such, there are several constraints facing those in the field, all of which provide opportunities
to improve in the future.
Constraints and Opportunities
While Big Data continues to grow in many aspects, there are many obstacles in place
preventing companies from realizing its maximum potential. The most commonly recognized
problem companies face when trying to analyze data today is simply the sheer volume that they
have to manage. Society, as a whole, is producing massive amounts of data each day; it is
estimated that "90% of the data created in the world has been produced in the past two years"
(Wang, et al., 2016). To feel comfortable with their observations and predictions,
most companies try to acquire as much information as they can. Unfortunately, a combination of inadequate
technology and under-skilled personnel has made it difficult to keep pace with the amount of
data being generated each day. Visualization techniques, in particular, struggle with this volume,
making it difficult to make sense of the data being presented. Often, an algorithm is
developed, but existing technology may not be able to implement it optimally for years to come
(Carbone, et al., 2016).
Rapid Growth
For most data scientists, however, the biggest concern is not acquiring the information; it
is what to actually do with it. One of the main challenges in Big Data can be summarized by
quoting the British statistician David Spiegelhalter, who said: "there are lots of small data
problems that occur in Big Data. They don't disappear because you've got lots of stuff. They
get worse" (Flockhart, et al., 2016). Businesses are discovering that as more variables are
introduced, it becomes more difficult to draw meaningful conclusions from the information.
Traditionally, challenges for data scientists or analysts include missing data, lack of
randomization, and inadequate control variables. These problems do not go away as more data is
introduced; they often become harder to identify and manage.
Privacy and Security
One other area of concern is privacy and security. There is a large portion of the public
that is uncomfortable with how their personal information is being used and who is using it.
They are also worried about whether or not that information is being adequately secured from
hackers, who may use it to steal someone's identity or financial information. Due to these
concerns, researchers and scientists often run into roadblocks when trying to gain access to
especially sensitive data, known as administrative data, such as tax returns, medical records, or
criminal history (Check Hayden, 2015). Without strict regulations around keeping data safe and
anonymous, when necessary, data scientists will often find it difficult to access all of the
information they need.
Despite these problems, there are reasons to be optimistic about the future of Big Data.
As businesses discover new obstacles, and continue to work with existing ones, they are able to
identify the skills needed in an analyst or data scientist to help solve them. As previously
mentioned, there are a small number of educational institutions that offer data science-related
courses, but those that do are offering training in real-world practices, such as data mining,
distributed storage, and parallel processing (Song and Zhu, 2016).
Artificial Intelligence
Advancements in artificial intelligence also provide an opportunity for the field to
develop. Machine learning algorithms are making it easier for computers to recognize patterns
and make intelligent predictions. Computers are increasingly able to make many of these
decisions on their own, without having to be explicitly programmed.
Complex Systems Science
Some researchers in the data science community have begun to look at methods outside
of the typical disciplines used in analytics. Complex Systems Science is a field in which various
interacting variables are studied to see how they create the collective behaviors of the whole
dataset (Carbone, et al., 2016). By including other fields of study, such as economics,
psychology, or biology, Big Data can be viewed from a variety of perspectives, offering
additional insights that traditional statistics or computer science methods might not.
Technical Practices
Despite the room for growth in data analytics, the current state of technology has still
offered many benefits for both businesses and researchers. Improvements in hardware and
software programs allow data analysts to acquire more data than ever before and are
helping them form intelligent conclusions and predictions. The following is a brief overview of
some of the techniques and methods used in processing Big Data.
Data Mining
One of the biggest challenges in Big Data is extracting meaningful conclusions from a
data set. One method designed to help solve this problem is known as data mining. Data mining
is the process of collecting, searching through, and analyzing a large data set with the purpose of
discovering patterns or relationships. Traditional data mining techniques were previously used to
analyze an entire data set, but as these collections begin to increase in size, new techniques are
being developed. One variation is known as data stream mining, where large data sets are
broken up into more manageable segments (Fong, et al., 2016). Segments can be processed in
parallel by using multiple processors, making the process faster and more efficient. This
technology is used to help businesses draw conclusions about questions such as which products
are bought as complements to each other, the best times to alter prices, the optimal amount of a
product to order, and more.
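The segment-and-merge idea described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the method used by Fong, et al.: the baskets are invented, the function names (segments, count_pairs, mine_pairs) are hypothetical, and worker threads stand in for the multiple processors mentioned in the text. Each segment is counted on its own, and the partial counts are then combined into a single result.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from itertools import combinations


def segments(transactions, size):
    """Yield consecutive slices of at most `size` transactions."""
    for start in range(0, len(transactions), size):
        yield transactions[start:start + size]


def count_pairs(segment):
    """Count how often each pair of products shares a basket in one segment."""
    counts = Counter()
    for basket in segment:
        for pair in combinations(sorted(set(basket)), 2):
            counts[pair] += 1
    return counts


def mine_pairs(transactions, size=2, workers=2):
    """Count pairs per segment in parallel, then merge the partial counts."""
    total = Counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for partial in pool.map(count_pairs, segments(transactions, size)):
            total.update(partial)  # Counter.update adds counts together
    return total


# Hypothetical baskets, invented for illustration only.
baskets = [
    ["chips", "salsa"],
    ["chips", "salsa", "soda"],
    ["bread", "butter"],
    ["chips", "soda"],
    ["bread", "butter", "jam"],
]
pair_counts = mine_pairs(baskets)
```

Because each segment is counted independently, the merged result does not depend on how the data is split, which is what allows the segments to be handed to separate workers. Pairs with high counts, such as chips and salsa in this invented data, are candidates for products bought as complements.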
Machine Learning
Machine learning is an artificial intelligence technique that allows computers to
recognize patterns in data and draw conclusions without having to be explicitly programmed.
As a computer is fed new data, it is able to categorize information into various subsets on its
own, where it can provide its own conclusions or predictions or neatly present information to an
analyst to make the process easier on them.
Supervised learning is a method where the data that a computer processes is tagged with
some type of identifier. For example, a computer may be given thousands of pictures of human
faces. Some photos show people smiling and are tagged as such, while the others show people who are not smiling.
The computer is then able to learn what a smile looks like by comparing the physical
characteristics of the faces in the photos tagged "smile."
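A minimal sketch of the tagged-photo example follows. It assumes each face has already been reduced to two hypothetical measurements (mouth curvature and cheek lift), and it uses a nearest-centroid classifier, one of the simplest supervised methods, chosen here only for illustration. The numbers, the "no smile" tag, and the function names train and predict are all invented for this sketch.

```python
from math import dist


def train(examples):
    """Average the feature vectors of each tag to get one centroid per tag."""
    sums, counts = {}, {}
    for features, label in examples:
        running = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            running[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: tuple(v / counts[label] for v in total)
            for label, total in sums.items()}


def predict(centroids, features):
    """Return the tag whose centroid is closest to the given features."""
    return min(centroids, key=lambda label: dist(centroids[label], features))


# Hypothetical measurements (mouth curvature, cheek lift), invented for illustration.
tagged_faces = [
    ((0.9, 0.8), "smile"),
    ((0.8, 0.7), "smile"),
    ((0.7, 0.9), "smile"),
    ((0.1, 0.2), "no smile"),
    ((0.2, 0.1), "no smile"),
    ((0.3, 0.3), "no smile"),
]
model = train(tagged_faces)
new_face = predict(model, (0.75, 0.6))      # closer to the "smile" average
other_face = predict(model, (0.15, 0.25))   # closer to the "no smile" average
```

The tags do the teaching here: the program never receives a rule for what a smile is, only examples that have been labeled, and it classifies a new face by asking which labeled group it most resembles.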
Unsupervised learning is a process where the data a computer receives is not tagged
with any sort of identifier. The computer separates the data into different categories based on
similarities or patterns that it finds. This technique is often referred to as clustering. A data
scientist is able to view the various categories the computer creates and determine what
relationships or correlations can be made.
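Clustering can be sketched with k-means, a common clustering algorithm that is used here as an example and is not named in the sources cited above. The points are invented, the function names are hypothetical, and the sketch makes two simplifying assumptions: the first k points serve as starting centers, and the loop runs for a fixed number of rounds rather than checking for convergence.

```python
from math import dist


def mean(points):
    """Component-wise average of a non-empty list of points."""
    return tuple(sum(axis) / len(points) for axis in zip(*points))


def k_means(points, k, rounds=10):
    """Group points into k clusters by repeatedly assigning and re-averaging."""
    centroids = list(points[:k])  # simplistic start: the first k points
    clusters = [[] for _ in range(k)]
    for _ in range(rounds):
        clusters = [[] for _ in range(k)]
        for point in points:
            nearest = min(range(k), key=lambda i: dist(point, centroids[i]))
            clusters[nearest].append(point)
        # Keep the old centroid if a cluster ends up empty.
        centroids = [mean(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return clusters, centroids


# Hypothetical untagged observations, invented for illustration.
observations = [(1, 2), (8, 9), (1.5, 1.8), (9, 8), (1, 1), (8.5, 9.5)]
clusters, centroids = k_means(observations, k=2)
```

No tags are supplied at any point. The two groups emerge only from the distances between points, and it is then up to the data scientist to decide what, if anything, each group means.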
Conclusion
The collision of Big Data and data science has opened up new opportunities for
businesses and countless other sectors of society. Overall, the net benefits appear to outweigh
potential constraints. Data analytics is being used to save money for businesses and consumers
alike, provide better healthcare opportunities, improve the quality and efficiency of artificial
intelligence, and much more. In addition, the number of career opportunities has increased for
people with backgrounds in fields including, but not limited to, computer science, statistics, and
business. These opportunities should only continue to develop as advancements are made in the
logical and algorithmic aspects of the field.
Despite the progress that has been made, there is much room for improvement in how
data scientists are utilizing Big Data. As more data is produced, new ways need to be developed
to store and process it. Technology needs to continue to be improved to fully take advantage of
the algorithms used to mine data. Perhaps most importantly, education in the fields of data
science and Big Data needs significant improvement. More universities and colleges need to
offer comprehensive programs, particularly at the undergraduate level, where there are currently
very few options available.
If practitioners continue to recognize and develop opportunities for improvement, and to
enhance current methods and techniques, Big Data has the potential to transform the way
businesses and societies operate in new and imaginative ways.

Works Cited
Carbone, Anna, Meiko Jensen, and Aki-Hiro Sato. "Challenges In Data Science: A
Complex Systems Perspective." Chaos, Solitons & Fractals 90 (2016): 1-7.
Academic Search Complete. Web. 12 Oct. 2016.
Carter, Daniel, and Dan Sholler. "Data Science On The Ground: Hype, Criticism, And
Everyday Work." Journal Of The Association For Information Science &
Technology 67.10 (2016): 2309-2319. Academic Search Complete. Web. 30 Sept. 2016.
Check Hayden, Erika. "Researchers Wrestle with A Privacy Problem." Nature 525.7570
(2015): 440-442. Academic Search Complete. Web. 27 Oct. 2016.
Flockhart, David, et al. "Big Data: Challenges and opportunities for clinical
pharmacology." British Journal of Clinical Pharmacology May 2016: 804+. Academic
Search Complete. Web. 20 Oct. 2016.
Fong, Simon, et al. "Improvised Methods For Tackling Big Data Stream Mining
Challenges: Case Study of Human Activity Recognition." Journal of
Supercomputing 72.10 (2016): 3927-3959. Academic Search Complete. Web. 21
Oct. 2016.
Hassani, Hossein, et al. "A Review Of Data Mining Applications In Crime." Statistical
Analysis & Data Mining 9.3 (2016): 139-154. Academic Search Complete. Web.
22 Oct. 2016.
Ranjan Das, Sanjiv. "Big Data's Big Muscle." Finance & Development 53.3 (2016):
26-28. Business Source Complete. Web. 13 Oct. 2016.
Song, Il-Yeol, and Yongjun Zhu. "Big Data And Data Science: What Should We Teach?."
Expert Systems 33.4 (2016): 364-373. Academic Search Complete. Web. 12 Oct.
2016.
Wang, Hai, et al. "Towards Felicitous Decision Making: An Overview On Challenges
And Trends Of Big Data." Information Sciences 367 (2016): 747-765. Academic
Search Complete. Web. 8 Oct. 2016.
Yang, Christopher C., and Pierangelo Veltri. "Intelligent Healthcare Informatics in Big
Data Era." Artificial Intelligence in Medicine 65.2 (2015): 75-77. Academic
Search Complete. Web. 26 Oct. 2016.
