
Automatic versus Manual Transmissions:

Mtcars Dataset Analysis


Domingos Savio Apolonio Santos
Tuesday, January 13, 2015

Executive Summary
This report presents the results of the course project for the Regression Models course, part of the Johns Hopkins Data Science Specialization on
Coursera.
It analyzes the Mtcars data in the R datasets package. The data is from the 1974 Motor Trend US magazine and comprises fuel consumption and ten
characteristics of automobile design and performance for 32 cars. The goals of this analysis are to:
Check if automatic or manual transmission is better for miles per gallon (mpg).
Quantify the MPG difference between automatic and manual transmissions.
This analysis was made using regression models and exploratory data analyses, and the findings are as follows:
Manual transmission is better than the automatic.
Cars analyzed with manual transmission can travel 7.24 more miles per gallon on average than the cars with automatic transmission.
Using multivariable regression analysis, the results reveal that manual transmission cars get 2.9358 miles per gallon more than automatic transmission
cars for the same weight (wt) and quarter mile time (qsec); the standard error of this estimate is 1.4109.
The analysis showed that transmission type, weight and quarter mile time all significantly influence miles per gallon.

Exploratory Data Analyses


The Mtcars dataset description (http://www.mortality.org/INdb/2008/02/12/8/document.pdf) shows us that the data frame has 32 observations on 11
variables: mpg - Miles/(US) gallon, cyl - Number of cylinders, disp - Displacement (cu.in.), hp - Gross horsepower, drat - Rear axle ratio, wt - Weight
(lb/1000), qsec - 1/4 mile time, vs - V/S (0 = V engine, 1 = straight engine), am - Transmission (0 = automatic, 1 = manual), gear - Number of forward
gears, and carb - Number of carburetors.
The first step is an exploratory analysis restricted to the variables am and mpg, because the questions under study relate directly to these two variables.
The first question can then be answered from the boxplot shown in Appendix A (Miles per Gallon - mpg by Transmission - am).
The density plot in Appendix A also suggests that the distribution of mpg is approximately normal. In addition, a t-test was used to compare the mean
mpg of the two transmission groups (Manual and Auto).

The 95% confidence interval for the difference in means (-11.28, -3.21) does not contain zero, and the p-value is less than 0.005. It can therefore be
concluded that the average fuel economy, in miles per gallon, is higher with manual transmission than with automatic transmission. This comparison of
means also quantifies the MPG difference between automatic and manual transmissions: subtracting the group means gives 7.24 mpg in favor of manual.
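The comparison of means described above can be sketched in R as follows. This is a minimal sketch, assuming the report used the default Welch two-sample t-test from base R; the object names (tt, group_means, mpg_diff) are illustrative and do not come from the original report.

```r
# Welch two-sample t-test of mpg by transmission type
# (am: 0 = automatic, 1 = manual). Assumes default t.test() settings.
data(mtcars)

tt <- t.test(mpg ~ am, data = mtcars)

# Mean mpg per group, named "0" (automatic) and "1" (manual)
group_means <- tapply(mtcars$mpg, mtcars$am, mean)

# Difference of means: manual minus automatic
mpg_diff <- unname(group_means["1"] - group_means["0"])

print(tt$conf.int)        # interval for mean(automatic) - mean(manual)
print(tt$p.value)
print(round(mpg_diff, 2))
```

Because the formula interface orders the groups as am = 0 then am = 1, the interval refers to automatic minus manual, so a fully negative interval corresponds to higher mpg for manual cars.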

Additionally, the graph analysis in Appendix B (Pairs Panel - Mtcars variables) shows that other variables are also correlated with mpg. These
correlations are evaluated in the regression analysis, which is the main topic of this report.
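One way to inspect these correlations numerically is sketched below. It assumes Pearson correlations and uses the base-graphics pairs function; the original Appendix B may have been drawn with a different function, and the name mpg_cor is illustrative.

```r
data(mtcars)

# Pearson correlation of every variable with mpg
mpg_cor <- cor(mtcars)[, "mpg"]

# Drop the trivial self-correlation and sort by absolute strength
mpg_cor <- mpg_cor[names(mpg_cor) != "mpg"]
mpg_cor <- mpg_cor[order(abs(mpg_cor), decreasing = TRUE)]
print(round(mpg_cor, 2))

# Scatterplot matrix of all variables (a stand-in for the pairs panel)
pairs(mtcars, main = "Pairs Panel - Mtcars variables")
```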

Regression Analysis
At this stage of the analysis, a single-predictor model is evaluated first, with transmission type as the predictor and miles per gallon as the outcome. A
multivariable analysis is then performed.

Single Model
This analysis is made to check the results of the exploratory analysis. The null hypothesis is that the difference in mean mpg between the two
transmission types is zero (that is, the slope of am is zero), with transmission type as the only predictor. The alternative hypothesis is that the difference is not zero.

The results show that the p-value of the slope is less than 0.005. The null hypothesis can therefore be rejected, and the results of the exploratory analysis
are confirmed: manual transmission results are 7.245 miles per gallon greater. Because the slope is greater than zero, manual transmission is better than
automatic.
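The single-predictor model can be sketched as below, assuming am is left as the numeric 0/1 variable shipped with the dataset. The names fit_single, coefs, intercept, slope and slope_p are illustrative.

```r
data(mtcars)

# Simple linear regression: mpg explained by transmission type only
fit_single <- lm(mpg ~ am, data = mtcars)
coefs <- summary(fit_single)$coefficients
print(coefs)

# With a 0/1 predictor, the intercept is the mean mpg of automatic cars
# and the slope is the mean difference (manual minus automatic)
intercept <- coefs["(Intercept)", "Estimate"]
slope     <- coefs["am", "Estimate"]
slope_p   <- coefs["am", "Pr(>|t|)"]
```

With a single binary predictor, the slope equals the difference of the two group means, which is why this model reproduces the figure from the exploratory analysis.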

Multivariable Analysis
Multivariable regression gives a better estimate of the impact of transmission type on mpg by adjusting for confounding variables such as weight (wt)
and quarter mile time (qsec). The stepAIC function from the MASS package was chosen for this analysis because it automates the choice of the best
model. The results are the following:

The best model indicated by the automated analysis uses wt, qsec and am as predictors, with mpg as the outcome.
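The automated selection can be sketched as follows. The report does not state the starting model or the search direction, so starting from the full model with direction = "both" is an assumption here, and the names full_fit, best_fit and selected are illustrative.

```r
library(MASS)
data(mtcars)

# Start from the model with all ten candidate predictors (assumption)
full_fit <- lm(mpg ~ ., data = mtcars)

# Stepwise search by AIC; trace = FALSE suppresses the step-by-step log
best_fit <- stepAIC(full_fit, direction = "both", trace = FALSE)

# Predictors kept in the selected model, then its coefficient table
selected <- attr(terms(best_fit), "term.labels")
print(selected)
print(summary(best_fit)$coefficients)
```

Stepwise selection by AIC is a convenience rather than a guarantee of the best model, so the selected predictors are worth checking against subject knowledge.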

The regression equation is then mpg = 9.618 - 3.917 wt + 1.226 qsec + 2.9358 am, where the am coefficient has a standard error of 1.4109. The errors are
assumed to have mean zero. As the two-sided p-value for the am coefficient is 0.04672, smaller than 0.05, the null hypothesis that this coefficient is zero can be rejected.
Looking at the plots in Appendix C (Final Model Residuals), visual analysis shows that the behavior of the best model is adequate: the residuals appear
normal with constant variability, and the leverage is within a reasonable upper limit.
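The residual checks can be sketched as below, assuming the final model is the one with wt, qsec and am as predictors. The Shapiro-Wilk test and the hat values are added here as numeric companions to the plots and are not part of the original report; the names final_fit, sw and lev are illustrative.

```r
data(mtcars)

final_fit <- lm(mpg ~ wt + qsec + am, data = mtcars)

# Standard diagnostic plots: residuals vs fitted, normal Q-Q,
# scale-location, and residuals vs leverage
op <- par(mfrow = c(2, 2))
plot(final_fit)
par(op)

# Numeric companions to the plots (not in the original report)
sw  <- shapiro.test(residuals(final_fit))  # normality of residuals
lev <- hatvalues(final_fit)                # leverage of each car

print(sw$p.value)
print(head(sort(lev, decreasing = TRUE), 3))
```

The hat values sum to the number of fitted coefficients (four here), so an average leverage of 4/32 gives a rough baseline for judging which cars stand out.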

Conclusions
These are the conclusions of the analysis:
Manual transmission is better than the automatic.
Cars analyzed with manual transmission can travel 7.24 more miles per gallon on average than the cars with automatic transmission.
There is a correlation between mpg and transmission type, but other variables, such as qsec and wt, should also be considered.
The obtained regression equation is mpg = 9.618 - 3.917 wt + 1.226 qsec + 2.9358 am (the standard error of the am coefficient is 1.4109). Then, for the
same weight (wt) and quarter mile time (qsec), manual transmission cars get 2.9358 miles per gallon more than automatic transmission cars.
Intuitively, the cyl variable would be expected to appear as related to mpg in the multivariable analysis. Additional analysis is needed to clarify this.
The online version of this report is available at http://rpubs.com/dsasas/mtcars.

Appendix
A - Boxplot - Miles per Gallon (MPG) by Transmission and Density Plot

B - Pairs Panel - Mtcars variables

C - Final Model Residuals
