RELATIONSHIP BETWEEN DIFFERENT PHYSICOCHEMICAL TEST ATTRIBUTES AND THE QUALITY OF WHITE AND RED WINE

BI MINI PROJECT

The two datasets are related to red and white variants of the Portuguese "Vinho Verde" wine. Each dataset has 12 attributes; the white wine dataset has 4898 instances and the red wine dataset has 1599 instances. Because the target attribute is numeric rather than nominal and the number of instances is large, the M5P algorithm was chosen over the J48 algorithm.

SUBMITTED BY: (GROUP 5A) ARPIT JAIN (1221007) SAJAN K SAKHARIA (1221027) MELVIN ROY (1221123) AKSHAT TYAGI (1221103)

INTRODUCTION
This BI project mainly aims to find patterns that affect the quality of both red and white wine in a given data set. In the source listed under the references, two datasets were created, using red and white wine samples. The inputs include objective tests (e.g. pH values) and the output is based on sensory data (median of at least 3 evaluations made by wine experts). Each expert graded the wine quality between 0 (very bad) and 10 (excellent). The two datasets are related to red and white variants of the Portuguese "Vinho Verde" wine. Each dataset has 12 attributes; the white wine dataset has 4898 instances and the red wine dataset has 1599 instances. Because the target attribute is numeric rather than nominal and the number of instances is large, the M5P algorithm was chosen over the J48 algorithm.

ATTRIBUTES CONSIDERED
Input variables (based on physicochemical tests):
1 - Fixed acidity
2 - Volatile acidity
3 - Citric acid
4 - Residual sugar
5 - Chlorides
6 - Free sulfur dioxide
7 - Total sulfur dioxide
8 - Density
9 - pH
10 - Sulphates
11 - Alcohol

Output variable (based on sensory data):
12 - Quality (score between 0 and 10)

SOFTWARE USED
WEKA 3.6.9

M5P ALGORITHM
The M5P algorithm is used for inducing trees of regression models. M5P combines a conventional decision tree with the possibility of linear regression functions at the nodes. It learns a "model tree": a decision tree with linear regression functions at the leaves, which can be used to predict a numeric target (class) attribute and produces a piecewise linear fit to the target. First, a decision-tree induction algorithm is used to build a tree, but instead of maximizing the information gain at each inner node, a splitting criterion is used that minimizes the intra-subset variation in the class values down each branch. The splitting procedure stops if the class values of all instances that reach a node vary only slightly, or if only a few instances remain. Second, the tree is pruned back from each leaf; when pruning, an inner node is turned into a leaf with a regression plane. Third, to avoid sharp discontinuities between the subtrees, a smoothing procedure is applied: the leaf model's prediction is passed along the path back to the root and, at each node on that path, combined with the value predicted by the linear model for that node.
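The splitting and smoothing steps described above can be illustrated with a small Python sketch. It is not WEKA's implementation: it assumes the standard deviation reduction (SDR) criterion and the smoothing formula usually quoted for M5 model trees, and the function names, the toy data and the constant k = 15 are assumptions made for illustration only.

```python
from statistics import pstdev


def sd_reduction(values, left, right):
    """Standard deviation reduction obtained by splitting `values` into two subsets."""
    n = len(values)
    weighted = sum(len(part) / n * pstdev(part) for part in (left, right) if part)
    return pstdev(values) - weighted


def best_threshold(xs, ys):
    """Return (threshold, sdr) for the best binary split `x <= threshold` on one attribute."""
    pairs = sorted(zip(xs, ys))
    all_y = [y for _, y in pairs]
    best = (None, 0.0)
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold fits between two equal attribute values
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        gain = sd_reduction(all_y, all_y[:i], all_y[i:])
        if gain > best[1]:
            best = (threshold, gain)
    return best


def smooth(child_prediction, node_prediction, n_child, k=15.0):
    """Blend a prediction from below with the current node's own model prediction.

    n_child is the number of training instances that reached the child node;
    k = 15 is an assumed smoothing constant, not a value taken from the log.
    """
    return (n_child * child_prediction + k * node_prediction) / (n_child + k)


# Toy data (made up): alcohol content and quality score for six wines.
alcohol = [9.0, 9.5, 10.0, 11.5, 12.0, 12.5]
quality = [5, 5, 5, 7, 7, 7]
threshold, gain = best_threshold(alcohol, quality)
smoothed = smooth(child_prediction=6.0, node_prediction=5.0, n_child=15)
```

On the toy data the best split falls between the two groups of wines, because that split leaves no variation in quality inside either subset.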

TEST OPTIONS FOR RED WINE


For red wine, the percentage split test option was used: the data was divided 60-40, with 60 percent of the data used as training data and the remaining 40 percent as test data. The log obtained from WEKA for red wine is as follows:
=== Run information ===

Scheme:       weka.classifiers.trees.M5P -M 4.0
Relation:     Untitled2-weka.filters.unsupervised.attribute.Remove-R13
Instances:    1599
Attributes:   12
              fixed acidity
              volatile acidity
              citric acid
              residual sugar
              chlorides
              free sulfur dioxide
              total sulfur dioxide
              density
              pH
              sulphates
              alcohol
              quality
Test mode:    split 60.0% train, remainder test

=== Classifier model (full training set) ===

M5 pruned model tree:
(using smoothed linear models)
LM1 (1599/80.033%)

LM num: 1
quality =
    -1.0128 * volatile acidity
    - 2.0178 * chlorides
    + 0.0051 * free sulfur dioxide
    - 0.0035 * total sulfur dioxide
    - 0.4827 * pH
    + 0.8827 * sulphates
    + 0.2893 * alcohol
    + 4.4301

Number of Rules : 1

Time taken to build model: 0.25 seconds

=== Evaluation on test split ===
=== Summary ===

Correlation coefficient                  0.5625
Mean absolute error                      0.5138
Root mean squared error                  0.6611
Relative absolute error                 76.3094 %
Root relative squared error             82.945  %
Total Number of Instances              640
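The 60-40 percentage split can be sketched in a few lines of Python. This is only an illustration, not WEKA code: the half-up rounding rule is an assumption chosen because it reproduces the test-set sizes reported in the logs (640 for red wine, 1959 for white wine), and the function names and the seed are hypothetical.

```python
import random


def split_sizes(n_instances, train_percent):
    """Return (train_size, test_size), rounding the training share half-up."""
    train = int(n_instances * train_percent / 100 + 0.5)
    return train, n_instances - train


def percentage_split(rows, train_percent, seed=1):
    """Shuffle a copy of `rows` and cut it into (train, test) lists."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    train_size, _ = split_sizes(len(shuffled), train_percent)
    return shuffled[:train_size], shuffled[train_size:]


red_sizes = split_sizes(1599, 60)    # (959, 640)
white_sizes = split_sizes(4898, 60)  # (2939, 1959)

# Row indices stand in for the real wine records here.
train, test = percentage_split(range(1599), 60)
```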

INFERENCES
The tree formed from the above data set is as follows:

[Figure: M5P model tree for red wine]

We can see that from the above data set only one linear model is constructed, i.e. the tree is a single leaf and the same linear model is applied to every instance. The linear model is
LM num: 1 quality = -1.0128 * volatile acidity - 2.0178 * chlorides + 0.0051 * free sulfur dioxide - 0.0035 * total sulfur dioxide - 0.4827 * pH + 0.8827 * sulphates + 0.2893 * alcohol + 4.4301
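As a worked example, the linear model LM1 can be written as a short Python function. The coefficients are copied from the WEKA log; the function name, the argument names and the sample values are assumptions made for illustration.

```python
def predict_red_quality(volatile_acidity, chlorides, free_sulfur_dioxide,
                        total_sulfur_dioxide, ph, sulphates, alcohol):
    """Apply the red wine linear model LM1 to one sample."""
    return (-1.0128 * volatile_acidity
            - 2.0178 * chlorides
            + 0.0051 * free_sulfur_dioxide
            - 0.0035 * total_sulfur_dioxide
            - 0.4827 * ph
            + 0.8827 * sulphates
            + 0.2893 * alcohol
            + 4.4301)


# Illustrative sample (assumed values, chosen to be typical of a red wine).
prediction = predict_red_quality(
    volatile_acidity=0.7, chlorides=0.076, free_sulfur_dioxide=11,
    total_sulfur_dioxide=34, ph=3.51, sulphates=0.56, alcohol=9.4)
print(round(prediction, 2))  # 5.02
```

Raising alcohol by one unit while holding the other inputs fixed raises the predicted quality by 0.2893, which is how the coefficients should be read.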

The correlation coefficient is 0.56 (56 %), which shows a moderate positive correlation between the predicted and the actual quality scores. The relative absolute error is 76.3094 %. This does not mean that the forecasts are 76 % close to the eventual outcomes: it means that the model's total absolute error is about 76 % of the error of a baseline that always predicts the mean quality, i.e. roughly a 24 % improvement over that baseline. The model can therefore be used as a rough guide for similar wine samples, but its predictions should be treated with caution.
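The two relative error measures in the log can be computed as in the following Python sketch. It is a simplification under stated assumptions: the baseline here is the mean of the test values themselves, the toy data is made up, and the function name is hypothetical.

```python
from math import sqrt


def relative_errors(actual, predicted):
    """Return (RAE, RRSE) in percent, using the mean of `actual` as the baseline."""
    mean = sum(actual) / len(actual)
    abs_model = sum(abs(a - p) for a, p in zip(actual, predicted))
    abs_base = sum(abs(a - mean) for a in actual)
    sq_model = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    sq_base = sum((a - mean) ** 2 for a in actual)
    return 100 * abs_model / abs_base, 100 * sqrt(sq_model / sq_base)


# Toy data (made up for illustration, not taken from the wine datasets).
actual = [5, 6, 7, 5]
predicted = [5.5, 5.5, 6.5, 5.5]
rae, rrse = relative_errors(actual, predicted)

# Baseline mean absolute error implied by the red wine log: MAE / (RAE / 100).
implied_baseline_mae = 0.5138 / (76.3094 / 100)
```

A value of 100 % would mean the model is no better than always predicting the mean, and 0 % would mean perfect predictions. For the red wine log, the implied baseline error is about 0.67 quality points against 0.51 for the model.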

TEST OPTIONS FOR WHITE WINE


For white wine, the same percentage split test option was used: the data was divided 60-40, with 60 percent of the data used as training data and the remaining 40 percent as test data. The log from WEKA is as follows:
=== Run information ===

Scheme:       weka.classifiers.trees.M5P -M 4.0
Relation:     White wine
Instances:    4898
Attributes:   12
              fixed acidity
              volatile acidity
              citric acid
              residual sugar
              chlorides
              free sulfur dioxide
              total sulfur dioxide
              density
              pH
              sulphates
              alcohol
              quality
Test mode:    split 60.0% train, remainder test

=== Classifier model (full training set) ===

M5 pruned model tree:
(using smoothed linear models)

alcohol <= 10.85 :
|   volatile acidity <= 0.283 :
|   |   volatile acidity <= 0.208 :
|   |   |   residual sugar <= 12.575 :
|   |   |   |   free sulfur dioxide <= 25.5 :
|   |   |   |   |   residual sugar <= 2.95 : LM1 (127/77.651%)
|   |   |   |   |   residual sugar > 2.95 :
|   |   |   |   |   |   total sulfur dioxide <= 92 : LM2 (17/0%)
|   |   |   |   |   |   total sulfur dioxide > 92 :
|   |   |   |   |   |   |   alcohol <= 10.3 : LM3 (29/47.578%)
|   |   |   |   |   |   |   alcohol > 10.3 : LM4 (15/66.397%)
|   |   |   |   free sulfur dioxide > 25.5 : LM5 (393/74.992%)
|   |   |   residual sugar > 12.575 :
|   |   |   |   alcohol <= 9.05 :
|   |   |   |   |   free sulfur dioxide <= 30.5 : LM6 (21/0%)
|   |   |   |   |   free sulfur dioxide > 30.5 :
|   |   |   |   |   |   density <= 0.998 : LM7 (8/0%)
|   |   |   |   |   |   density > 0.998 :
|   |   |   |   |   |   |   density <= 0.998 : LM8 (7/0%)
|   |   |   |   |   |   |   density > 0.998 :
|   |   |   |   |   |   |   |   fixed acidity <= 7.15 : LM9 (8/37.346%)
|   |   |   |   |   |   |   |   fixed acidity > 7.15 :
|   |   |   |   |   |   |   |   |   chlorides <= 0.056 : LM10 (9/81.289%)
|   |   |   |   |   |   |   |   |   chlorides > 0.056 : LM11 (6/0%)
|   |   |   |   alcohol > 9.05 : LM12 (91/58.694%)
|   |   volatile acidity > 0.208 :
|   |   |   alcohol <= 9.85 :
|   |   |   |   residual sugar <= 12.65 :
|   |   |   |   |   chlorides <= 0.045 :
|   |   |   |   |   |   pH <= 3.275 : LM13 (116/60.469%)
|   |   |   |   |   |   pH > 3.275 : LM14 (33/80.362%)
|   |   |   |   |   chlorides > 0.045 : LM15 (296/62.13%)
|   |   |   |   residual sugar > 12.65 :
|   |   |   |   |   residual sugar <= 15.05 :
|   |   |   |   |   |   density <= 0.999 :
|   |   |   |   |   |   |   chlorides <= 0.056 :
|   |   |   |   |   |   |   |   sulphates <= 0.445 : LM16 (13/53.547%)
|   |   |   |   |   |   |   |   sulphates > 0.445 :
|   |   |   |   |   |   |   |   |   volatile acidity <= 0.265 :
|   |   |   |   |   |   |   |   |   |   residual sugar <= 13.25 :
|   |   |   |   |   |   |   |   |   |   |   total sulfur dioxide <= 167.25 : LM17 (4/29.624%)
|   |   |   |   |   |   |   |   |   |   |   total sulfur dioxide > 167.25 : LM18 (5/0%)
|   |   |   |   |   |   |   |   |   |   residual sugar > 13.25 :
|   |   |   |   |   |   |   |   |   |   |   chlorides <= 0.053 : LM19 (10/0%)
|   |   |   |   |   |   |   |   |   |   |   chlorides > 0.053 :
|   |   |   |   |   |   |   |   |   |   |   |   fixed acidity <= 7.45 : LM20 (2/0%)
|   |   |   |   |   |   |   |   |   |   |   |   fixed acidity > 7.45 : LM21 (2/0%)
|   |   |   |   |   |   |   |   |   volatile acidity > 0.265 : LM22 (12/0%)
|   |   |   |   |   |   |   chlorides > 0.056 :
|   |   |   |   |   |   |   |   chlorides <= 0.062 :
|   |   |   |   |   |   |   |   |   sulphates <= 0.47 :
|   |   |   |   |   |   |   |   |   |   fixed acidity <= 7.3 : LM23 (7/0%)
|   |   |   |   |   |   |   |   |   |   fixed acidity > 7.3 : LM24 (2/0%)
|   |   |   |   |   |   |   |   |   sulphates > 0.47 : LM25 (4/0%)
|   |   |   |   |   |   |   |   chlorides > 0.062 : LM26 (13/0%)
|   |   |   |   |   |   density > 0.999 :
|   |   |   |   |   |   |   density <= 0.999 :
|   |   |   |   |   |   |   |   residual sugar <= 14.2 :
|   |   |   |   |   |   |   |   |   citric acid <= 0.35 : LM27 (2/0%)
|   |   |   |   |   |   |   |   |   citric acid > 0.35 : LM28 (4/0%)
|   |   |   |   |   |   |   |   residual sugar > 14.2 : LM29 (8/0%)
|   |   |   |   |   |   |   density > 0.999 :
|   |   |   |   |   |   |   |   pH <= 3.085 : LM30 (8/0%)
|   |   |   |   |   |   |   |   pH > 3.085 :
|   |   |   |   |   |   |   |   |   total sulfur dioxide <= 182.5 :
|   |   |   |   |   |   |   |   |   |   sulphates <= 0.68 :
|   |   |   |   |   |   |   |   |   |   |   fixed acidity <= 8.5 :
|   |   |   |   |   |   |   |   |   |   |   |   free sulfur dioxide <= 43.25 : LM31 (5/0%)
|   |   |   |   |   |   |   |   |   |   |   |   free sulfur dioxide > 43.25 :
|   |   |   |   |   |   |   |   |   |   |   |   |   fixed acidity <= 7.15 : LM32 (4/0%)
|   |   |   |   |   |   |   |   |   |   |   |   |   fixed acidity > 7.15 : LM33 (2/0%)
|   |   |   |   |   |   |   |   |   |   |   fixed acidity > 8.5 : LM34 (3/0%)
|   |   |   |   |   |   |   |   |   |   sulphates > 0.68 : LM35 (7/0%)
|   |   |   |   |   |   |   |   |   total sulfur dioxide > 182.5 :
|   |   |   |   |   |   |   |   |   |   citric acid <= 0.375 : LM36 (6/0%)
|   |   |   |   |   |   |   |   |   |   citric acid > 0.375 : LM37 (3/0%)
|   |   |   |   |   residual sugar > 15.05 :
|   |   |   |   |   |   total sulfur dioxide <= 131.5 :
|   |   |   |   |   |   |   sulphates <= 0.475 : LM38 (26/0%)
|   |   |   |   |   |   |   sulphates > 0.475 :
|   |   |   |   |   |   |   |   fixed acidity <= 6.25 : LM39 (4/0%)
|   |   |   |   |   |   |   |   fixed acidity > 6.25 : LM40 (7/0%)
|   |   |   |   |   |   total sulfur dioxide > 131.5 :
|   |   |   |   |   |   |   total sulfur dioxide <= 150 : LM41 (32/0%)
|   |   |   |   |   |   |   total sulfur dioxide > 150 : LM42 (83/69.564%)
|   |   |   alcohol > 9.85 :
|   |   |   |   pH <= 3.325 :
|   |   |   |   |   free sulfur dioxide <= 27.5 : LM43 (130/73.807%)
|   |   |   |   |   free sulfur dioxide > 27.5 :
|   |   |   |   |   |   sulphates <= 0.525 : LM44 (167/77.208%)
|   |   |   |   |   |   sulphates > 0.525 :
|   |   |   |   |   |   |   total sulfur dioxide <= 172.5 : LM45 (65/62.966%)
|   |   |   |   |   |   |   total sulfur dioxide > 172.5 : LM46 (26/63.239%)
|   |   |   |   pH > 3.325 :
|   |   |   |   |   density <= 0.994 : LM47 (45/76.232%)
|   |   |   |   |   density > 0.994 : LM48 (77/83.245%)
|   volatile acidity > 0.283 :
|   |   free sulfur dioxide <= 20.5 : LM49 (243/71.159%)
|   |   free sulfur dioxide > 20.5 :
|   |   |   alcohol <= 9.85 :
|   |   |   |   citric acid <= 0.235 : LM50 (165/53.83%)
|   |   |   |   citric acid > 0.235 : LM51 (462/60.182%)
|   |   |   alcohol > 9.85 : LM52 (251/66.801%)
alcohol > 10.85 :
|   free sulfur dioxide <= 21.5 :
|   |   free sulfur dioxide <= 10.5 : LM53 (98/107.924%)
|   |   free sulfur dioxide > 10.5 : LM54 (366/88.527%)
|   free sulfur dioxide > 21.5 : LM55 (1349/83.105%)

LM num: 1 quality = 0.0336 * fixed acidity - 0.2924 * volatile acidity - 0.0222 * citric acid + 0.0288 * residual sugar + 0.3607 * chlorides + 0.0384 * free sulfur dioxide - 0.0038 * total sulfur dioxide - 51.8314 * density + 0.7292 * pH + 0.1396 * sulphates + 0.2479 * alcohol + 51.6014 LM num: 2 quality = 0.1562 * fixed acidity - 0.2924 * volatile acidity - 1.3932 * citric acid + 0.0409 * residual sugar - 6.7054 * chlorides + 0.0046 * free sulfur dioxide - 70.8129 * density + 0.6557 * pH + 0.1396 * sulphates - 0.004 * alcohol + 73.6485 LM num: 3 quality = 0.2125 * fixed acidity + 4.0183 * volatile acidity - 0.7441 * citric acid + 0.0409 * residual sugar - 9.9275 * chlorides + 0.0046 * free sulfur dioxide - 70.8129 * density + 0.7846 * pH + 0.1396 * sulphates - 0.004 * alcohol + 72.1055 LM num: 4 quality = 0.2625 * fixed acidity - 0.2924 * volatile acidity - 3.9978 * citric acid + 0.0409 * residual sugar - 41.4714 * chlorides + 0.0046 * free sulfur dioxide - 70.8129 * density + 0.9107 * pH + 0.1396 * sulphates - 0.004 * alcohol

+ 74.4148 LM num: 5 quality = 0.2232 * fixed acidity - 3.1992 * volatile acidity - 0.0222 * citric acid + 0.1143 * residual sugar + 3.4432 * chlorides + 0.0002 * free sulfur dioxide - 0.0026 * total sulfur dioxide - 257.0831 * density + 0.7719 * pH + 1.61 * sulphates - 0.004 * alcohol + 257.0153 LM num: 6 quality = 0.3587 * fixed acidity + 0.9029 * volatile acidity - 0.2718 * citric acid + 0.0136 * residual sugar + 16.8487 * chlorides + 0 * free sulfur dioxide - 189.0096 * density + 0.0807 * pH + 0.1184 * sulphates - 0.1169 * alcohol + 192.6218 LM num: 7 quality = 0.5325 * fixed acidity + 0.9029 * volatile acidity - 0.2718 * citric acid + 0.0136 * residual sugar + 27.6223 * chlorides + 0 * free sulfur dioxide - 294.3907 * density + 0.0807 * pH + 0.1184 * sulphates - 0.1169 * alcohol + 296.246 LM num: 8 quality = 0.5713 * fixed acidity + 0.9029 * volatile acidity - 0.2718 * citric acid + 0.0136 * residual sugar + 20.6484 * chlorides + 0 * free sulfur dioxide - 289.2966 * density + 0.0807 * pH + 0.1184 * sulphates

- 0.1169 * alcohol + 291.038 LM num: 9 quality = 0.5835 * fixed acidity + 0.9029 * volatile acidity - 0.2718 * citric acid + 0.0136 * residual sugar + 20.6484 * chlorides + 0 * free sulfur dioxide - 262.6904 * density + 0.0807 * pH + 0.1184 * sulphates - 0.1169 * alcohol + 264.2695 LM num: 10 quality = 0.5649 * fixed acidity + 0.9029 * volatile acidity - 0.2718 * citric acid + 0.0136 * residual sugar + 26.5373 * chlorides + 0 * free sulfur dioxide - 0.0015 * total sulfur dioxide - 262.6904 * density + 0.0807 * pH + 0.1184 * sulphates - 0.1169 * alcohol + 264.4318 LM num: 11 quality = 0.5649 * fixed acidity + 0.9029 * volatile acidity - 0.2718 * citric acid + 0.0136 * residual sugar + 27.3786 * chlorides + 0 * free sulfur dioxide - 262.6904 * density + 0.0807 * pH + 0.1184 * sulphates - 0.1169 * alcohol + 264.1874 LM num: 12 quality = 0.0928 * fixed acidity + 0.5954 * volatile acidity - 0.2088 * citric acid + 0.0136 * residual sugar + 25.0042 * chlorides + 0 * free sulfur dioxide + 0.0031 * total sulfur dioxide - 64.9119 * density

+ 0.0807 * pH + 0.1184 * sulphates + 0.0703 * alcohol + 67.449 LM num: 13 quality = 0.0132 * fixed acidity - 6.3384 * volatile acidity - 0.004 * citric acid + 0.0645 * residual sugar + 0.2219 * chlorides - 0.0011 * free sulfur dioxide + 0.0004 * total sulfur dioxide - 159.7261 * density + 0.0278 * pH + 0.057 * sulphates + 0.1049 * alcohol + 164.4112 LM num: 14 quality = 0.0132 * fixed acidity - 0.5499 * volatile acidity - 0.004 * citric acid + 0.0331 * residual sugar + 32.4025 * chlorides - 0.0147 * free sulfur dioxide + 0.001 * total sulfur dioxide - 63.4293 * density + 0.0278 * pH + 0.057 * sulphates + 1.256 * alcohol + 55.8515 LM num: 15 quality = 0.0108 * fixed acidity - 2.8479 * volatile acidity - 0.004 * citric acid + 0.0091 * residual sugar + 0.155 * chlorides + 0.0124 * free sulfur dioxide - 0.0019 * total sulfur dioxide - 91.1062 * density + 0.0278 * pH + 0.057 * sulphates + 0.0175 * alcohol + 96.1051 LM num: 16 quality = 0.2524 * fixed acidity + 3.7939 * volatile acidity - 0.2162 * citric acid + 0.1767 * residual sugar

- 6.674 * chlorides - 0.0011 * free sulfur dioxide + 0.0006 * total sulfur dioxide - 9.0127 * density - 0.0544 * pH + 0.0684 * sulphates + 0.0009 * alcohol + 9.9374 LM num: 17 quality = 0.2125 * fixed acidity + 4.3001 * volatile acidity - 0.1228 * citric acid + 0.2029 * residual sugar - 3.9577 * chlorides - 0.0011 * free sulfur dioxide - 0.0002 * total sulfur dioxide - 9.0127 * density - 0.0544 * pH + 0.0684 * sulphates + 0.0009 * alcohol + 9.8307 LM num: 18 quality = 0.2 * fixed acidity + 4.3001 * volatile acidity - 0.1228 * citric acid + 0.2029 * residual sugar - 3.9577 * chlorides - 0.0011 * free sulfur dioxide - 0.0001 * total sulfur dioxide - 9.0127 * density - 0.0544 * pH + 0.0684 * sulphates + 0.0009 * alcohol + 9.8961 LM num: 19 quality = 0.2 * fixed acidity + 4.3001 * volatile acidity - 0.1228 * citric acid + 0.198 * residual sugar - 6.9238 * chlorides - 0.0011 * free sulfur dioxide + 0.0006 * total sulfur dioxide - 9.0127 * density - 0.0544 * pH + 0.0684 * sulphates + 0.0009 * alcohol + 10.0472 LM num: 20 quality =

0.216 * fixed acidity + 4.3001 * volatile acidity - 0.1228 * citric acid + 0.198 * residual sugar - 7.8604 * chlorides - 0.0011 * free sulfur dioxide + 0.0006 * total sulfur dioxide - 9.0127 * density - 0.0544 * pH + 0.0684 * sulphates + 0.0009 * alcohol + 9.9604 LM num: 21 quality = 0.216 * fixed acidity + 4.3001 * volatile acidity - 0.1228 * citric acid + 0.198 * residual sugar - 7.8604 * chlorides - 0.0011 * free sulfur dioxide + 0.0006 * total sulfur dioxide - 9.0127 * density - 0.0544 * pH + 0.0684 * sulphates + 0.0009 * alcohol + 9.963 LM num: 22 quality = 0.2 * fixed acidity + 4.911 * volatile acidity - 0.1228 * citric acid + 0.1877 * residual sugar - 3.9577 * chlorides - 0.0011 * free sulfur dioxide + 0.0006 * total sulfur dioxide - 9.0127 * density - 0.0544 * pH + 0.0684 * sulphates + 0.0009 * alcohol + 9.9252 LM num: 23 quality = -0.4953 * fixed acidity + 2.0389 * volatile acidity - 0.004 * citric acid + 0.1245 * residual sugar - 0.5005 * chlorides - 0.0011 * free sulfur dioxide + 0.0006 * total sulfur dioxide - 9.0127 * density - 0.0544 * pH + 4.949 * sulphates + 0.0009 * alcohol

+ 14.2031 LM num: 24 quality = -0.5286 * fixed acidity + 2.0389 * volatile acidity - 0.004 * citric acid + 0.1245 * residual sugar - 0.5005 * chlorides - 0.0011 * free sulfur dioxide + 0.0006 * total sulfur dioxide - 9.0127 * density - 0.0544 * pH + 4.949 * sulphates + 0.0009 * alcohol + 14.4308 LM num: 25 quality = -0.4643 * fixed acidity + 2.0389 * volatile acidity - 0.004 * citric acid + 0.1245 * residual sugar - 0.5005 * chlorides - 0.0011 * free sulfur dioxide + 0.0006 * total sulfur dioxide - 9.0127 * density - 0.0544 * pH + 5.8489 * sulphates + 0.0009 * alcohol + 13.6303 LM num: 26 quality = -0.0687 * fixed acidity + 2.0389 * volatile acidity - 0.004 * citric acid + 0.1245 * residual sugar - 0.5005 * chlorides - 0.0011 * free sulfur dioxide + 0.0006 * total sulfur dioxide - 9.0127 * density - 0.0544 * pH + 1.5292 * sulphates + 0.0009 * alcohol + 12.6794 LM num: 27 quality = 0.072 * fixed acidity + 0.9101 * volatile acidity + 2.2109 * citric acid + 0.3163 * residual sugar - 0.5005 * chlorides - 0.0088 * free sulfur dioxide + 0.0042 * total sulfur dioxide

- 9.0127 * density - 0.0544 * pH + 0.0684 * sulphates + 0.0009 * alcohol + 9.1785 LM num: 28 quality = 0.072 * fixed acidity + 0.9101 * volatile acidity + 2.1304 * citric acid + 0.3163 * residual sugar - 0.5005 * chlorides - 0.0088 * free sulfur dioxide + 0.0042 * total sulfur dioxide - 9.0127 * density - 0.0544 * pH + 0.0684 * sulphates + 0.0009 * alcohol + 9.2188 LM num: 29 quality = 0.072 * fixed acidity + 0.9101 * volatile acidity + 1.3193 * citric acid + 0.3089 * residual sugar - 0.5005 * chlorides - 0.0088 * free sulfur dioxide + 0.0042 * total sulfur dioxide - 9.0127 * density - 0.0544 * pH + 0.0684 * sulphates + 0.0009 * alcohol + 9.6209 LM num: 30 quality = 0.1642 * fixed acidity + 0.9101 * volatile acidity + 1.2122 * citric acid + 0.1552 * residual sugar - 0.5005 * chlorides - 0.0103 * free sulfur dioxide + 0.0027 * total sulfur dioxide - 9.0127 * density - 0.0544 * pH + 0.6446 * sulphates + 0.0009 * alcohol + 10.7649 LM num: 31 quality = 0.2134 * fixed acidity + 0.9101 * volatile acidity + 1.2567 * citric acid

+ 0.1552 * residual sugar - 0.5005 * chlorides - 0.0003 * free sulfur dioxide + 0.0027 * total sulfur dioxide - 9.0127 * density - 0.0544 * pH + 0.7055 * sulphates + 0.0009 * alcohol + 9.946 LM num: 32 quality = 0.2329 * fixed acidity + 0.9101 * volatile acidity + 1.2567 * citric acid + 0.1552 * residual sugar - 0.5005 * chlorides - 0.0003 * free sulfur dioxide + 0.0027 * total sulfur dioxide - 9.0127 * density - 0.0544 * pH + 0.7055 * sulphates + 0.0009 * alcohol + 9.8239 LM num: 33 quality = 0.2352 * fixed acidity + 0.9101 * volatile acidity + 1.2567 * citric acid + 0.1552 * residual sugar - 0.5005 * chlorides - 0.0003 * free sulfur dioxide + 0.0027 * total sulfur dioxide - 9.0127 * density - 0.0544 * pH + 0.7055 * sulphates + 0.0009 * alcohol + 9.8092 LM num: 34 quality = 0.2272 * fixed acidity + 0.9101 * volatile acidity + 1.2567 * citric acid + 0.1552 * residual sugar - 0.5005 * chlorides - 0.0008 * free sulfur dioxide + 0.0027 * total sulfur dioxide - 9.0127 * density - 0.0544 * pH + 0.7055 * sulphates + 0.0009 * alcohol + 9.8955 LM num: 35

quality = 0.2025 * fixed acidity + 0.9101 * volatile acidity + 1.2567 * citric acid + 0.1552 * residual sugar - 0.5005 * chlorides - 0.0032 * free sulfur dioxide + 0.0027 * total sulfur dioxide - 9.0127 * density - 0.0544 * pH + 0.7055 * sulphates + 0.0009 * alcohol + 10.2202 LM num: 36 quality = 0.1191 * fixed acidity + 0.9101 * volatile acidity + 1.8798 * citric acid + 0.1552 * residual sugar - 0.5005 * chlorides - 0.0138 * free sulfur dioxide + 0.0027 * total sulfur dioxide - 9.0127 * density - 0.0544 * pH + 0.8768 * sulphates + 0.0009 * alcohol + 10.8179 LM num: 37 quality = 0.1191 * fixed acidity + 0.9101 * volatile acidity + 1.9304 * citric acid + 0.1552 * residual sugar - 0.5005 * chlorides - 0.0138 * free sulfur dioxide + 0.0027 * total sulfur dioxide - 9.0127 * density - 0.0544 * pH + 0.8768 * sulphates + 0.0009 * alcohol + 10.8241 LM num: 38 quality = -0.1201 * fixed acidity - 1.2894 * volatile acidity - 0.004 * citric acid - 0.0016 * residual sugar - 2.8182 * chlorides + 0 * free sulfur dioxide - 10.0117 * density - 0.2678 * pH - 1.5794 * sulphates + 0.0009 * alcohol

+ 18.6031 LM num: 39 quality = -0.2708 * fixed acidity - 1.2894 * volatile acidity - 0.004 * citric acid - 0.0016 * residual sugar - 2.8182 * chlorides + 0 * free sulfur dioxide - 10.0117 * density - 0.2678 * pH - 2.5301 * sulphates + 0.0009 * alcohol + 19.8926 LM num: 40 quality = -0.2563 * fixed acidity - 1.2894 * volatile acidity - 0.004 * citric acid - 0.0016 * residual sugar - 2.8182 * chlorides + 0 * free sulfur dioxide - 10.0117 * density - 0.2678 * pH - 2.5301 * sulphates + 0.0009 * alcohol + 19.7332 LM num: 41 quality = -0.0115 * fixed acidity - 0.7223 * volatile acidity - 0.004 * citric acid - 0.0016 * residual sugar - 3.6614 * chlorides + 0 * free sulfur dioxide - 10.0117 * density - 0.1321 * pH + 0.0684 * sulphates + 0.0009 * alcohol + 16.0749 LM num: 42 quality = -0.0115 * fixed acidity - 4.5975 * volatile acidity - 0.004 * citric acid - 0.0016 * residual sugar - 11.6292 * chlorides + 0 * free sulfur dioxide - 10.0117 * density - 0.1321 * pH - 1.0566 * sulphates + 0.0009 * alcohol

+ 18.2542 LM num: 43 quality = 0.0136 * fixed acidity - 5.5588 * volatile acidity - 0.004 * citric acid + 0.0488 * residual sugar + 0.0274 * chlorides + 0.0132 * free sulfur dioxide - 148.7779 * density + 0.0886 * pH + 0.1687 * sulphates + 0.0009 * alcohol + 153.8403 LM num: 44 quality = 0.0136 * fixed acidity - 4.1623 * volatile acidity - 0.004 * citric acid + 0.0094 * residual sugar - 0.5097 * chlorides + 0.0002 * free sulfur dioxide - 0.0002 * total sulfur dioxide - 23.2666 * density + 0.0886 * pH + 0.2333 * sulphates + 0.0009 * alcohol + 29.3753 LM num: 45 quality = 0.0136 * fixed acidity - 1.6725 * volatile acidity - 0.004 * citric acid + 0.0164 * residual sugar - 13.3806 * chlorides + 0.0002 * free sulfur dioxide - 0.0012 * total sulfur dioxide - 35.2451 * density + 0.0886 * pH + 0.3083 * sulphates + 0.0009 * alcohol + 41.6924 LM num: 46 quality = 0.0136 * fixed acidity - 2.3622 * volatile acidity - 0.004 * citric acid + 0.0231 * residual sugar - 5.0267 * chlorides + 0.0002 * free sulfur dioxide - 0.0019 * total sulfur dioxide - 46.6393 * density

+ 0.0886 * pH + 0.3083 * sulphates + 0.0009 * alcohol + 52.566 LM num: 47 quality = 0.0958 * fixed acidity - 0.4684 * volatile acidity + 2.9888 * citric acid + 0.17 * residual sugar + 0.0274 * chlorides - 0.0011 * free sulfur dioxide - 0.0066 * total sulfur dioxide - 116.0895 * density + 0.1952 * pH + 0.1603 * sulphates + 0.0009 * alcohol + 120.1991 LM num: 48 quality = 0.072 * fixed acidity - 0.4684 * volatile acidity - 0.004 * citric acid + 0.0272 * residual sugar + 0.0274 * chlorides + 0.0108 * free sulfur dioxide - 88.7785 * density - 1.1684 * pH + 0.1603 * sulphates + 0.0009 * alcohol + 97.247 LM num: 49 quality = 0.1335 * fixed acidity - 0.9812 * volatile acidity + 0.014 * citric acid + 0.1021 * residual sugar - 0.0516 * chlorides + 0.0313 * free sulfur dioxide + 0.0029 * total sulfur dioxide - 306.7813 * density + 1.4085 * pH + 1.0316 * sulphates - 0.1888 * alcohol + 305.2403 LM num: 50 quality = -0.0042 * fixed acidity - 0.1751 * volatile acidity + 0.0352 * citric acid + 0.0027 * residual sugar - 0.19 * chlorides

+ 0.0068 * free sulfur dioxide - 0 * total sulfur dioxide - 1.9899 * density - 0.6437 * pH + 0.0182 * sulphates + 0.0316 * alcohol + 8.6481 LM num: 51 quality = -0.1349 * fixed acidity - 1.6249 * volatile acidity + 0.0212 * citric acid - 0.0195 * residual sugar - 1.9931 * chlorides - 0.0034 * free sulfur dioxide - 0 * total sulfur dioxide + 81.6715 * density + 0.0312 * pH + 0.0182 * sulphates + 0.2681 * alcohol - 76.6942 LM num: 52 quality = -0.0015 * fixed acidity - 0.9451 * volatile acidity + 0.6862 * citric acid + 0.0713 * residual sugar - 0.0909 * chlorides + 0.0001 * free sulfur dioxide - 0.0045 * total sulfur dioxide - 126.2933 * density + 1.1068 * pH + 0.0182 * sulphates + 0.018 * alcohol + 127.7915 LM num: 53 quality = -0.0181 * fixed acidity - 0.3309 * volatile acidity + 1.8591 * citric acid + 0.1506 * residual sugar - 0.1628 * chlorides + 0.007 * free sulfur dioxide - 208.2595 * density + 0.0389 * pH + 1.5397 * sulphates + 0.038 * alcohol + 209.5885 LM num: 54 quality = -0.0036 * fixed acidity - 2.0544 * volatile acidity

+ 0.1239 * residual sugar - 0.1628 * chlorides + 0.0023 * free sulfur dioxide - 280.443 * density + 0.9362 * pH + 0.0287 * sulphates + 0.0163 * alcohol + 281.1013
LM num: 55 quality = 0.154 * fixed acidity - 0.4969 * volatile acidity + 0.1111 * residual sugar - 6.8952 * chlorides + 0.0001 * free sulfur dioxide - 218.5719 * density + 1.3862 * pH + 1.1006 * sulphates + 0.0966 * alcohol + 215.9405

Number of Rules : 55

Time taken to build model: 0.9 seconds

=== Evaluation on test split ===
=== Summary ===

Correlation coefficient                  0.5432
Mean absolute error                      0.5726
Root mean squared error                  0.7575
Relative absolute error                 85.6452 %
Root relative squared error             84.7088 %
Total Number of Instances             1959

INFERENCES
While evaluating and creating a model for the above data set, 55 rules were formed, each of which is followed in a different scenario. An overview can be seen from the following tree:

As the image is not very clear, we are also embedding an image file which gives a clearer view.

[Embedded bitmap image of the model tree]

As we can see, the leaf node of every branch is a rule, so a given linear model is applied whenever a data instance satisfies all the conditions along that branch. The correlation coefficient is 0.54 (54 %), which shows a moderate positive correlation between the predicted and the actual quality scores. The relative absolute error is 85.64 %. This does not mean that the forecasts are 85.64 % close to the eventual outcomes: it means that the model's total absolute error is about 86 % of the error of a baseline that always predicts the mean quality, i.e. roughly a 14 % improvement over that baseline. The model can therefore be used as a rough guide for similar wine samples, but its predictions should be treated with caution.
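How a branch selects a rule can be sketched in Python for the smallest part of the white wine tree, the "alcohol > 10.85" branch. The thresholds and the LM55 coefficients are copied from the WEKA log; the function names, the dictionary layout and the sample values are assumptions made for illustration. The sketch also assumes that the printed leaf equation can be applied directly, since the log states that the models are already smoothed.

```python
def select_leaf(alcohol, free_sulfur_dioxide):
    """Route an instance through the 'alcohol > 10.85' branch of the white wine tree."""
    if alcohol <= 10.85:
        return None  # the other branch (LM1 to LM52) is not reproduced in this sketch
    if free_sulfur_dioxide > 21.5:
        return "LM55"
    return "LM53" if free_sulfur_dioxide <= 10.5 else "LM54"


# Coefficients of LM55 as printed in the log.
LM55 = {
    "fixed acidity": 0.154,
    "volatile acidity": -0.4969,
    "residual sugar": 0.1111,
    "chlorides": -6.8952,
    "free sulfur dioxide": 0.0001,
    "density": -218.5719,
    "pH": 1.3862,
    "sulphates": 1.1006,
    "alcohol": 0.0966,
}
LM55_INTERCEPT = 215.9405


def predict_lm55(sample):
    """Apply the LM55 linear model to a dict of attribute values."""
    return LM55_INTERCEPT + sum(coef * sample[name] for name, coef in LM55.items())


# Illustrative white wine sample (assumed values, not a record from the dataset).
sample = {
    "fixed acidity": 7.0, "volatile acidity": 0.3, "residual sugar": 2.0,
    "chlorides": 0.04, "free sulfur dioxide": 30, "density": 0.992,
    "pH": 3.2, "sulphates": 0.5, "alcohol": 11.5,
}
leaf = select_leaf(sample["alcohol"], sample["free sulfur dioxide"])
prediction = predict_lm55(sample) if leaf == "LM55" else None
```

For this sample the branch conditions lead to LM55, which covers 1349 of the 4898 instances, and the model predicts a quality of about 6.09.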

CONCLUSION
From the M5P algorithm used to determine the relationship between different attributes and the quality of wine, it can be concluded that in both white and red wine the different attributes somewhat affect the quality of wine, as the correlation in both cases is above 50 %. The relative absolute error is roughly 80 % (76.31 % for red wine and 85.65 % for white wine), so the models improve only modestly on a baseline that always predicts the mean quality and should be used as approximate guides. Another thing that we can infer from the whole analysis is that alcohol content is closely related to the quality of wine: it is the root split of the white wine tree and has a positive coefficient in the red wine model. The other attributes also make their contribution, but the most influential attribute appears to be alcohol.

Wine companies can therefore use this kind of model to estimate which composition would improve the quality of their wine, thus lowering their investment in wine quality testing. It can also help companies produce wines of different quality grades with less effort.

REFERENCES
Dataset source: a) http://archive.ics.uci.edu/ml/datasets/Wine+Quality
