Kruno Martinovic
Saturday, December 20, 2014
Executive Summary
Synopsis
You work for Motor Trend, a magazine about the automobile industry. Looking at a data set of a
collection of cars, they are interested in exploring the relationship between a set of variables and
miles per gallon (MPG) (outcome). They are particularly interested in the following two questions:
Conclusion
Manual transmission cars have slightly better fuel efficiency than automatic cars (about 1.80 MPG,
holding number of cylinders, horsepower and weight constant); however, transmission type is not the
most significant predictor of MPG. Number of cylinders, horsepower and weight are better predictors
of fuel efficiency.
The Analysis
The data set
The data was extracted from the 1974 Motor Trend US magazine, and comprises fuel consumption
and 10 aspects of automobile design and performance for 32 automobiles (1973–74 models). It
consists of 32 observations on 11 variables:
## $ wt  : num
## $ qsec: num
## $ vs  : num
## $ am  : num
## $ gear: num
## $ carb: num
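The listing above has the shape of `str()` output. As a minimal sketch, assuming the built-in `mtcars` data set that ships with R is the one being analysed, the structure and dimensions can be inspected like this:

```r
# Load the built-in data set (assumed to be the one used in this report).
data(mtcars)

# One line per variable: name, storage type and the first few values.
str(mtcars)

# Number of rows (cars) and columns (variables).
dim(mtcars)
```

Running `str()` before any conversion shows every column stored as `num`, which is why the categorical variables are turned into factors in the next step.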
Factorize variables
mtcars$cyl <- factor(mtcars$cyl)
mtcars$vs <- factor(mtcars$vs)
mtcars$gear <- factor(mtcars$gear)
mtcars$carb <- factor(mtcars$carb)
mtcars$am <- factor(mtcars$am,labels=c('Automatic','Manual'))
The boxplot in Figure 1 (in the appendix) shows that cars with a manual transmission generally have
better fuel efficiency than cars with an automatic transmission. Similarly, if we build a linear
regression model that predicts fuel efficiency from transmission type alone, the coefficient indicates
that cars with a manual transmission achieve, on average, 7.24 miles per gallon more than cars with an
automatic transmission.
model1 <- lm(mpg ~ am, data = mtcars)
summary(model1)$coef
##             Estimate Std. Error t value  Pr(>|t|)
## (Intercept)   17.147      1.125  15.247 1.134e-15
## amManual       7.245      1.764   4.106 2.850e-04
summary(model1)$adj.r.squared
## [1] 0.3385
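With a single two-level factor as the only predictor, the intercept is the mean MPG of the reference level (Automatic) and the `amManual` coefficient is the difference between the two group means. The sketch below illustrates this on a fresh copy of the data; the names `dat`, `group_means` and `fit` are illustrative and not part of the original analysis.

```r
# Work on a fresh copy so earlier factor conversions do not interfere.
dat <- datasets::mtcars
dat$am <- factor(dat$am, labels = c("Automatic", "Manual"))

# Mean MPG for each transmission type.
group_means <- tapply(dat$mpg, dat$am, mean)

# Simple regression of MPG on transmission type.
fit <- lm(mpg ~ am, data = dat)

# The intercept equals the Automatic mean, and the amManual
# coefficient equals the Manual-minus-Automatic difference.
print(group_means)
print(coef(fit))
print(unname(group_means["Manual"] - group_means["Automatic"]))
```

This is why the 7.24 MPG figure is a raw difference in averages: it does not adjust for weight, horsepower or any other characteristic that differs between the two groups.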
Next, we build a regression model that starts from all variables and uses a backward stepwise
algorithm to select the best model. The resulting model achieves a much better adjusted R-squared
value (0.84) than model1 (0.34), which is based solely on the transmission type. The coefficient for
transmission type is also not statistically significant (p-value 0.21), unlike several other
coefficients returned by the step model. Its estimate has decreased as well: for the same number of
cylinders, horsepower and car weight, this model predicts that a car with a manual transmission would
achieve an improvement in fuel efficiency of just 1.80 miles per gallon.
model2 <- step(lm(mpg ~ ., data = mtcars), direction = "backward", trace = 0)
summary(model2)$coef
##             t value  Pr(>|t|)
## (Intercept) 12.9404 7.733e-13
## cyl6        -2.1540 4.068e-02
## cyl8        -0.9472 3.523e-01
## hp          -2.3450 2.693e-02
## wt          -2.8194 9.081e-03
## amManual     1.2957 2.065e-01
summary(model2)$adj.r.squared
## [1] 0.8401
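One way to express the weak evidence for a transmission effect in MPG units is a 95% confidence interval for the `amManual` coefficient. The sketch below assumes the stepwise search kept `cyl`, `hp`, `wt` and `am`, as the coefficient names in the output suggest, and fits that formula directly instead of rerunning `step()`. The names `dat`, `fit`, `ci` and `spans_zero` are illustrative.

```r
# Fresh copy of the data with the relevant factors set up.
dat <- datasets::mtcars
dat$cyl <- factor(dat$cyl)
dat$am  <- factor(dat$am, labels = c("Automatic", "Manual"))

# Assumed formula: predictors read off the coefficient table above.
fit <- lm(mpg ~ cyl + hp + wt + am, data = dat)

# 95% confidence interval for the manual-vs-automatic difference,
# holding cylinders, horsepower and weight fixed.
ci <- confint(fit, "amManual", level = 0.95)
print(ci)

# An interval that contains zero is consistent with a p-value above 0.05.
spans_zero <- ci[1, 1] < 0 && ci[1, 2] > 0
print(spans_zero)
```

An interval that includes zero means the data are compatible with no transmission effect at all once the other three predictors are accounted for, which matches the p-value of about 0.21 reported for `amManual`.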
par(mfrow=c(2,2))
plot(model2)