
Term Project

Erik Kantrowitz
STA 3032
11/24/2014

PART A
1.
General Electric vs. Siemens

A.

General Electric Closing Price (September 2011 to September 2014)

Statistic            Value
Mean                 22.50102
Standard Error       0.26476
Standard Deviation   3.31743
Variance             11.00534
Q1                   19.84
Median               23.12
Q3                   25.51
IQR                  5.67
Range                13.13

Siemens Closing Price (September 2011 to September 2014)

Statistic            Value
Mean                 109.7586
Standard Error       1.265151
Standard Deviation   15.8523
Variance             251.2953
Q1                   98.52
Median               106.16
Q3                   125.715
IQR                  27.195
Range                57.52
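
The derived entries in the two tables can be cross-checked against each other. The sketch below is only a sanity check on the reported numbers, not the procedure used to produce them: it recomputes the variance from the standard deviation, the IQR from the quartiles, and the sample size implied by SE = SD / sqrt(n). The dictionary keys and the function name are illustrative.

```python
# Summary statistics copied from the two tables above.
summary = {
    "General Electric": {"se": 0.26476, "sd": 3.31743, "variance": 11.00534,
                         "q1": 19.84, "q3": 25.51, "iqr": 5.67},
    "Siemens": {"se": 1.265151, "sd": 15.8523, "variance": 251.2953,
                "q1": 98.52, "q3": 125.715, "iqr": 27.195},
}

def check(stats):
    """Recompute the derived statistics from the reported ones."""
    return {
        "variance_from_sd": stats["sd"] ** 2,
        "iqr_from_quartiles": stats["q3"] - stats["q1"],
        # SE = SD / sqrt(n)  =>  n = (SD / SE)^2
        "implied_n": round((stats["sd"] / stats["se"]) ** 2),
    }

results = {name: check(stats) for name, stats in summary.items()}
for name, recomputed in results.items():
    print(name, recomputed)
```

Both stocks give the same implied sample size, which is what one would expect if the closing prices were recorded on the same dates.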

B.
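The summary statistics show that Siemens displays greater variability than General Electric over
the period of September 2011 through September 2014: its standard deviation (15.8523 vs. 3.31743),
IQR (27.195 vs. 5.67), and range (57.52 vs. 13.13) are all larger.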

C.
General Electric
20% trimmed mean of volume of shares purchased: 194248489.1

Siemens
20% trimmed mean of volume of shares purchased: 2287490.583
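
A trimmed mean drops a fixed share of the smallest and largest observations before averaging. The following is a minimal sketch, assuming that "20%" means 20% removed from each tail; the report does not say whether the trimming is per tail or in total, and software packages differ on this. The function name and the sample volumes are hypothetical.

```python
def trimmed_mean(values, proportion):
    """Mean after dropping `proportion` of the observations from EACH end."""
    if not 0 <= proportion < 0.5:
        raise ValueError("proportion must be in [0, 0.5)")
    ordered = sorted(values)
    k = int(len(ordered) * proportion)  # observations cut per tail, rounded down
    kept = ordered[k:len(ordered) - k]
    return sum(kept) / len(kept)

# Hypothetical trading volumes with one extreme day.
volumes = [100, 120, 130, 150, 5000]
result = trimmed_mean(volumes, 0.20)
print(result)  # average of 120, 130, 150
```

The extreme day no longer dominates the average, which is why a trimmed mean is a reasonable summary for share volumes.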

2.
United States Census

A.
[Figure: Normal probability plot of 50 STATES CENSUS (x-axis: 50 STATES CENSUS, y-axis: Percent).
Mean = 6053834, StDev = 6823984, N = 51, AD = 3.958, P-Value <0.005]

B.
The census data do not follow a normal distribution. This is made obvious by how far the plotted
points depart from the reference line in the normal probability plot. After further analysis of the
data, the distribution appears to resemble a lognormal distribution rather than a normal one.

[Figure: Probability plots for 50 STATES CENSUS with 95% CI, four panels: Normal, Exponential,
Lognormal, Weibull (x-axis: 50 STATES CENSUS, y-axis: Percent)]

Goodness of Fit Test

Distribution   AD      P-Value
Normal         3.958   <0.005
Exponential    0.480   0.511
Lognormal      0.366   0.422
Weibull        0.479   0.233
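
The AD values above are Anderson-Darling statistics, where a smaller value indicates a closer fit. As a rough illustration of how such a comparison works, the sketch below computes the statistic for a normal fit and applies it to right-skewed data before and after a log transform (a lognormal fit is a normal fit on the logarithms). The data are hypothetical, not the census figures, and no p-values are computed. It is an assumption-laden sketch, not a reproduction of the statistical software's output.

```python
import math
from statistics import NormalDist, fmean, stdev

def anderson_darling_normal(values):
    """Anderson-Darling statistic A^2 for a normal fit (mean and SD estimated)."""
    x = sorted(values)
    n = len(x)
    fitted = NormalDist(fmean(x), stdev(x))
    cdf = [fitted.cdf(v) for v in x]
    total = sum(
        (2 * i - 1) * (math.log(cdf[i - 1]) + math.log(1 - cdf[n - i]))
        for i in range(1, n + 1)
    )
    return -n - total / n

# Hypothetical right-skewed sample: exponentials of evenly spaced normal quantiles.
n_points = 51
skewed = [math.exp(NormalDist().inv_cdf((i - 0.5) / n_points))
          for i in range(1, n_points + 1)]

ad_raw = anderson_darling_normal(skewed)
ad_log = anderson_darling_normal([math.log(v) for v in skewed])
print(f"normal fit on raw data: {ad_raw:.3f}")
print(f"normal fit on log data: {ad_log:.3f}")
```

For data of this shape the statistic is much smaller after the log transform, which is the same pattern the goodness-of-fit table shows for the lognormal fit.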

C.

[Figure: Pie chart, Population of states]

State        Share
Florida      13.9%
Illinois     9.5%
New York     14.3%
California   27.5%
Indiana      4.8%
Georgia      7.1%
Texas        18.5%
Missouri     4.4%

3.
Golf Association Distances


A.
Stem-and-Leaf: Distances
Leaf Unit = 1.0

Stem  Leaf
22    6
23    2334677
24    001124445578
25    01111223344445555556677899
26    00001112333444456677888
27    000011222223333344466788999
28    0035
The stem-and-leaf display shows that most distances fall between 250 and 280, with the median
at around 260.
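
A stem-and-leaf display with a leaf unit of 1.0 can be built by truncating each value to whole units and splitting off the last digit. The sketch below assumes non-negative values; the function name and the sample distances are hypothetical.

```python
from collections import defaultdict

def stem_and_leaf(values, leaf_unit=1):
    """Return {stem: leaves} with each value truncated to the leaf unit."""
    rows = defaultdict(list)
    for v in sorted(values):
        units = int(v / leaf_unit)        # truncate, e.g. 260.85 -> 260
        stem, leaf = divmod(units, 10)    # 260 -> stem 26, leaf 0
        rows[stem].append(leaf)
    return {stem: "".join(str(d) for d in leaves) for stem, leaves in rows.items()}

# Hypothetical driving distances.
sample = [226.4, 232.1, 233.9, 240.0, 251.7, 255.2, 255.8, 260.85, 271.3]
display = stem_and_leaf(sample)
for stem, leaves in display.items():
    print(f"{stem:>3} | {leaves}")
```

Because the leaves are truncated rather than rounded, statistics read off the display can differ slightly from those computed on the raw data.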

B.
Statistic            Value
Mean                 260.302
Standard Deviation   13.40828
Median               260.85
90th percentile      277.68
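
These four statistics can be computed with Python's standard `statistics` module, as in the minimal sketch below. The data are hypothetical. Percentile definitions vary between packages, so the 90th percentile from `statistics.quantiles` (default "exclusive" method) is an assumption and may not match other software to the last digit.

```python
import statistics

def summarize(values):
    """Mean, sample SD, median, and 90th percentile of a list of numbers."""
    deciles = statistics.quantiles(values, n=10)  # 9 cut points: 10th ... 90th
    return {
        "mean": statistics.fmean(values),
        "stdev": statistics.stdev(values),        # sample SD (n - 1 denominator)
        "median": statistics.median(values),
        "p90": deciles[-1],
    }

sample = list(range(1, 20))   # hypothetical data: 1, 2, ..., 19
summary = summarize(sample)
print(summary)
```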

C.

[Figure: Histogram of Distances with fitted normal curve (x-axis: Distances, 230 to 290;
y-axis: Frequency, 0 to 16). Mean = 260.3, StDev = 13.41, N = 100]

The histogram above has a left-skewed distribution, with a peak just after 250 and a drop just
before 280.

D.
The histogram's peak lines up with what was shown in the stem-and-leaf diagram. If you were to
tilt the stem-and-leaf diagram onto its side, it would look much the same as the histogram.

E.
[Figure: Boxplot of Distances (y-axis: Distances, 220 to 290)]

The boxplot shows the interquartile range running from about 250 to just over 270, very similar
to what was seen in the histogram and the stem-and-leaf diagram. It also shows the median at
around 260.

F.
The boxplot is a much better interpretive tool than the stem-and-leaf diagram if the median and
quartiles are what is needed, although the stem-and-leaf diagram is better if access to the
individual values is needed. For instance, if I were looking for the mode of the data, the boxplot
would not be much help, but with the stem-and-leaf diagram it is easily determined to be 255 yards.
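
Because the stem-and-leaf diagram keeps the individual (truncated) values, they can be recovered and tallied directly. The sketch below transcribes the stems and leaves from part A and reads off the most frequent value and the median. With a leaf unit of 1.0 the decimals are lost, so these are statistics of the truncated values, not of the raw data.

```python
from collections import Counter
from statistics import median

# Stems and leaves transcribed from the display in part A (leaf unit = 1.0).
display = {
    22: "6",
    23: "2334677",
    24: "001124445578",
    25: "01111223344445555556677899",
    26: "00001112333444456677888",
    27: "000011222223333344466788999",
    28: "0035",
}

# Each leaf is the last digit of a value whose leading digits are the stem.
values = [stem * 10 + int(leaf) for stem, leaves in display.items() for leaf in leaves]

mode_value, mode_count = Counter(values).most_common(1)[0]
print("observations:", len(values))
print("mode:", mode_value, "occurring", mode_count, "times")
print("median of truncated values:", median(values))
```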

G.
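Treating roughly two standard errors on either side of the mean as 95 percent coverage, the
95 percent confidence interval for the mean distance is (257.64, 262.96).

Alternatively, taking two standard deviations on either side of the mean as covering about
95 percent of the individual distances gives the interval (233.48, 287.12).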
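
Both intervals can be reproduced from the summary statistics in part B. The sketch below assumes the first interval is a t-based interval for the mean with 99 degrees of freedom; the critical value 1.984 is hard-coded as an assumption, not taken from the report. It also assumes the alternative interval is simply the mean plus or minus two standard deviations.

```python
import math

mean, sd, n = 260.302, 13.40828, 100   # summary statistics from part B
T_CRIT_95 = 1.984                      # assumed t critical value, 99 degrees of freedom

se = sd / math.sqrt(n)                 # standard error of the mean

# Interval for the population mean: mean +/- t * SE.
ci_mean = (mean - T_CRIT_95 * se, mean + T_CRIT_95 * se)

# Range expected to hold about 95% of individual values: mean +/- 2 * SD.
two_sd = (mean - 2 * sd, mean + 2 * sd)

print(f"95% CI for the mean: ({ci_mean[0]:.2f}, {ci_mean[1]:.2f})")
print(f"mean +/- 2 SD:       ({two_sd[0]:.2f}, {two_sd[1]:.2f})")
```

The two intervals answer different questions: the first describes uncertainty about the average distance, the second describes where individual distances tend to fall.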

4.
A.
[Figure: Histogram of WEIGHT with fitted normal curve (x-axis: WEIGHT, 2000 to 4400;
y-axis: Frequency, 0 to 14). Mean = 3238, StDev = 566.8, N = 98]

B.
DRIVSTAR   Count   % of Row   % of Column
2          4       100        4.08
3          17      100        17.35
4          59      100        60.20
5          18      100        18.37
All        98      100        100.00

Percent of cars with 3 stars or less is 21.43%

PASSSTAR   Count   % of Row   % of Column
2          2       100        2.04
3          23      100        23.47
4          52      100        53.06
5          21      100        21.43
All        98      100        100.00

Percent of cars with 3 stars or less is 25.51%

The percentage of cars with three stars or fewer depends on which rating we look at (21.43% for
DRIVSTAR, 25.51% for PASSSTAR), but both are close to one quarter of the cars.
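
The column percentages and the "3 stars or less" figures follow directly from the counts. A minimal sketch, using the counts from the two tallies above; the function names are illustrative.

```python
def column_percent(counts):
    """Percentage of all cars falling in each star rating."""
    total = sum(counts.values())
    return {rating: round(100 * c / total, 2) for rating, c in counts.items()}

def share_at_most(counts, max_rating):
    """Percentage of cars rated max_rating stars or fewer."""
    total = sum(counts.values())
    low = sum(c for rating, c in counts.items() if rating <= max_rating)
    return round(100 * low / total, 2)

# Counts per star rating, copied from the tallies above.
drivstar = {2: 4, 3: 17, 4: 59, 5: 18}
passstar = {2: 2, 3: 23, 4: 52, 5: 21}

print(column_percent(drivstar), share_at_most(drivstar, 3))
print(column_percent(passstar), share_at_most(passstar, 3))
```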

C.
The PASSCHEST mean is 50.224 while the DRIVCHEST mean is 49.663; therefore the DRIVCHEST mean
does not exceed the PASSCHEST mean. The difference between the two is 0.561, so PASSCHEST has
the higher injury rate.

D.
Variable    N    Mean     St Deviation   SE Mean   99% CI
DRIVCHEST   98   49.663   6.670          0.674     (47.893, 51.434)
PASSCHEST   98   50.224   7.107          0.718     (48.338, 52.111)

Part B
Big data is defined as data that is too large to be processed with current conventional
processing power. Thomas Davenport, author of the book Big Data at Work, describes big data as
"too big to be processed on one server, too fast-moving to be sequestered in a data warehouse,
or too unstructured to fit into a conventional database." [1] This definition will change as
technologies are developed and new data emerges. Big data is not limited to one field of study,
or even to a small cluster of fields. As Davenport describes, "across almost every field of human
endeavor, research suggests that data and analysis yield more accurate and reliable decisions." [1]
Big data brings with it seemingly endless possibilities, from the Internet of Things, which can
lead to smarter homes, cars, and factories, to improvements in how companies do business. Big data
is changing the way data is analyzed and how it affects everyday life.
One way big data is changing the world we live in is through the Internet of Things, a term used
to describe the ever-increasing connectivity of the objects in our lives: a TV that we can control
with a tablet, a smartphone that remotely locks and unlocks our doors, or an app that dims the
lights from halfway across the world. These devices collect data and use it to give the user a
more personalized experience, for example by letting an alarm clock slowly raise the intensity of
the room lights to create a more ideal wake-up for the individual. Currently, though, most
Internet of Things smart devices aren't in your home or phone; they are in factories, businesses,
and healthcare. [2]
One big (excuse the pun) example of how the Internet of Things is affecting everyday life is the
smart factory. The idea of a smart factory is to have a fully functional factory using intelligent
machines connected by an Industrial Internet. [3] GE has partnered with Amazon and is already
developing a Hadoop-based software platform that would allow for big data analytics within an
interconnected smart factory system. That is not the only way big data is making factories
smarter, though: by analyzing the data gathered on the factory floor, errors can be identified and
corrected, optimizing the factory's production and saving money in the long run. [3]
Optimization is hugely important for factories, and the amount of data these factories can create
is enormous. For instance, Raytheon Corp. monitors its assembly operations down to the turn of a
screw, and the factory will shut down when it detects a problem in order to decrease manufacturing
defects. "[The] industry stands to reap many benefits from Big Data as more sophisticated and
automated data analytics technologies are developed." [4] Through this optimization, Big Data
can also decrease workplace accidents by reducing the number of people needed to run a plant.
For example, FANUC Robotics, a Japanese producer of industrial robots, has automated its newest
tool plant to the point where it can run mostly unattended for 700 hours. [4] That is not to say
that a factory can operate completely on its own, but the people running the plant are no longer
in as dangerous a position: they can sit at a command module rather than right next to the
dangerous machinery.
As Big Data expands and more tools are developed to gather data, more data becomes available to
use. One example is the new Ford Focus Electric. The car gathers data while being driven and
gives the driver all sorts of useful information derived from it, such as how far away the nearest
charging station is. The data collection does not end there: even when the car is not being
driven, the sensors continue to stream data about the tire pressure and battery system. [4] Ford
gets information from the car as well. Engineers in Detroit use the car's communication modules
and remote application management software to aggregate information about driving behaviors.
Using this data, Ford can better understand how its cars are used, so that when it goes to
develop the next one it has all this data at its fingertips. [4]
For all the great things about Big Data, there are also plenty of issues with it. One is that,
because the field is still so new and the data sets are so big, analysis technology cannot keep
up; part of the definition of big data given earlier is that the data sets are too big for current
processing technology. Because of this, big data analysis runs the risk of finding bogus
correlations. Bogus correlations can be thought of as coincidences: relationships that arise
randomly but, in such large data sets, can appear to be significant. "If you look 100 times for
correlations between two variables, you risk finding, purely by chance, about five bogus
correlations that appear statistically significant." [5]
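
The "about five in 100" figure is the expected number of false positives when each test uses a 5 percent significance level. The sketch below works this out directly and checks it with a small seeded simulation. The 0.05 level is an assumption implied by the quoted numbers rather than stated in the source, and the simulation relies on the fact that a p-value is uniformly distributed when there is no real effect.

```python
import random

ALPHA = 0.05     # assumed significance level (implied by "about five" in 100)
N_TESTS = 100

expected_false_positives = N_TESTS * ALPHA
prob_at_least_one = 1 - (1 - ALPHA) ** N_TESTS

def simulate(n_tests, alpha, rng):
    """Count 'significant' results when no real correlation exists.

    Under the null hypothesis a p-value is uniform on [0, 1], so each
    test is falsely significant with probability alpha.
    """
    return sum(rng.random() < alpha for _ in range(n_tests))

rng = random.Random(2014)                     # fixed seed for repeatability
trials = [simulate(N_TESTS, ALPHA, rng) for _ in range(2000)]
average = sum(trials) / len(trials)

print("expected bogus correlations:", expected_false_positives)
print(f"chance of at least one: {prob_at_least_one:.3f}")
print(f"simulated average: {average:.2f}")
```

The chance of seeing at least one bogus correlation in 100 looks is above 99 percent, which is why large exploratory analyses need some correction for multiple comparisons.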
Despite these issues, big data is constantly growing and maturing, and with this continuing
maturity our ability to use data to better the world we live in grows as well. One example would
be tracking and predicting inventory so that your groceries are shipped to your house before you
even realize you need more. It is not hard to see a world so interconnected that our objects start
to work together to make life easier, and as factories become more interconnected, their ability
to manufacture more proficiently will spark a drive for more innovation. Big data and smart
technology, new as they are, are changing the world around us. As they continue to develop, they
bring limitless possibilities to improve every facet of people's lives, turning the future that
science fiction movies have been portraying for decades into a reality. With the way big data is
advancing, it is not hard to imagine a world like the Jetsons in the near future.

[1] Smith, N. "What's the big (data) idea? [Book interview]," Engineering & Technology,
vol. 9, no. 4, pp. 92-93, May 2014. doi: 10.1049/et.2014.0434.
http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=6823999&isnumber=6809530
[2] Data Floq. How the internet of things will make a smart world [online], 2014.
https://datafloq.com/read/internet-of-things-will-make-our-world-smart-infographic/302
(Accessed: 20 November 2014).
[3] Data Floq. The industrial internet will bring a revolution to the manufacturing industry [online].
https://datafloq.com/read/industrial-internet-bring-revolutionmanufacturing-industry/141
(Accessed: 20 November 2014).
[4] Noor, Ahmed. Putting big data to work, Mechanical Engineering magazine [online], 2014.
https://www.asme.org/wwwasmeorg/media/ResourceFiles/Network/Media/Mechanical%20Engineering%20Magazine/1013BigData.pdf
(Accessed: 20 November 2014).
[5] Marcus, G. and Davis, E. Nine large problems with big data, BRW [online], 10 April 2014.
http://www.brw.com.au/p/techgadgets/nine_large_problems_with_big_data_BOkbvT5G7f6Y2Jc2qiMgGM
(Accessed: 20 November 2014).
