
TERM PAPER

On

Dispersion: A statistical tool

Department of Finance
University of Dhaka

Submitted To

Dr. M. Khairul Hossain
Professor, Department of Finance, University of Dhaka

Submitted By

GROUP-2
Apel Mahmood Rifat (15-007)
Sumaiya Amena (15-051)
Shakira Mahzabeen (15-085)
Khairul Bashar (15-153)

Date of Submission: July 24, 2010

Letter of Transmittal
July 24, 2010
Dr. M. Khairul Hossain
Professor
Department of Finance
University of Dhaka

Subject: Submission of the report "Dispersion: A statistical tool"

Dear Sir,
We are pleased to submit the report that you assigned us as a partial requirement for the course Business Statistics I (F-202). The report is prepared on "Dispersion: A statistical tool". We sincerely hope that you will enjoy going through this report as much as we enjoyed preparing it. If any further information or clarification is required, we will be pleased to provide it.

We have tried our best to make this report as good as possible, and we think it can serve as a tool for solving business decision problems. Finally, we would like to thank you for giving us the opportunity to work on such an interesting report; we enjoyed preparing it and learned a great deal.

Sincerely,

Apel Mahmood Rifat
On behalf of Group 2
15th batch, Section A
Department of Finance
University of Dhaka

Acknowledgement
We cannot claim all the credit for the completion of this study. Many people helped us by providing valuable information, advice and guidance so that the report could be completed on schedule. A course report is an essential part of the BBA program, as one can gather practical knowledge within a short period of time by studying and working on the chosen topic. In this regard, our report has been prepared on dispersion. First, we give our thanks to almighty Allah for helping us to complete all the work. We would like to express our gratitude to our supervising course teacher, Prof. Dr. M. Khairul Hossain, who instructed us in the right way and gave us proper guidelines for preparing this report. Finally, we must mention the wonderful working environment and group commitment that enabled us to accomplish a great deal during this time.

Table of Contents

01. Executive Summary
02. Introduction
03. Macroenvironmental Forces
04. Pharmaceuticals Industry
05. Square Pharmaceuticals Ltd.
06. Impact of Macroenvironment in Square Pharma
07. Beximco Pharmaceuticals Ltd.
08. Macroenvironmental factors affecting Beximco Pharma to launch a new product
09. Findings
10. Conclusion
11. Reference
12. Bibliography

Introduction
Origin of the report
This report was prepared under the academic supervision of our course teacher, Prof. Dr. M. Khairul Hossain, Department of Finance, University of Dhaka, as a requirement of the Business Statistics course. The topic is Dispersion: A statistical tool.

Methodology
The methodology of the report is inductive. The report is based on secondary information.

Secondary Information: The secondary sources of data are different reference books, websites, etc.

Key Parts of the report
The main purpose of the report is to discuss dispersion as a statistical tool. Different measures of dispersion and their uses are discussed in this report.

Objectives of the report

Broad Objectives: The main objective of the study is to evaluate the impact of macroenvironment forces in decision making of launching new product in pharmaceutical industry.

Specific Objectives:

To be acquainted with the Pharmaceutical industry
To learn clear knowledge of macro-environment forces
To learn about new product launching process of Pharmaceutical industry
To have the practical knowledge of theoretical knowledge of Marketing theory

Scope
In this report, we first cover the preliminary concept of dispersion. We then go on to the classification of dispersion factors on launching a new product of a pharmaceuticals company.

Limitations
We faced certain limitations in preparing the report.

Unavoidable conditions: Some unavoidable conditions had a deterring effect on the preparation of the report.

Restrictions that we faced: Lack of information, lack of technology, etc. were the restrictions we faced.

Absence of some information regarding data compilation: While collecting data, we faced problems. Some of the information that was really essential was hard to collect.

Introduction
While measures of central tendency are used to estimate "normal" values of a dataset, measures of dispersion describe the spread of the data, or its variation around a central value. Two distinct samples may have the same mean or median but completely different levels of variability, or vice versa. A proper description of a set of data should include both of these characteristics. There are various methods for measuring the dispersion of a dataset, each with its own set of advantages and disadvantages.

In statistics, statistical dispersion (also called statistical variability or variation) is the variability or spread in a variable or a probability distribution. Measures of dispersion express quantitatively the degree of variation of values in a population or in a sample. Along with measures of central tendency, they are widely used in practice as descriptive statistics. Common measures of dispersion are the variance, the standard deviation, the average deviation, the range and the interquartile range.

For example, the dispersion in the sample of 5 values (98, 99, 100, 101, 102) is smaller than the dispersion in the sample (80, 90, 100, 110, 120), although both samples have the same central location, 100, as measured by, say, the mean or the median. Most measures of dispersion would be 10 times greater for the second sample than for the first one (although the values themselves differ from one measure of dispersion to another).

Dispersion is contrasted with location or central tendency, and together they are the most used properties of distributions. A measure of statistical dispersion is a real number that is zero if all the data are identical and increases as the data become more diverse. It cannot be less than zero. Most measures of dispersion have the same scale as the quantity being measured. In other words, if the measurements have units, such as meters or seconds, the measure of dispersion has the same units.
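The comparison of the two samples can be checked with a short Python sketch. The helper name `describe` is only illustrative, and the population form of the standard deviation is an assumption made here for simplicity.

```python
import statistics

# The two samples quoted in the text.
sample_a = [98, 99, 100, 101, 102]
sample_b = [80, 90, 100, 110, 120]

def describe(values):
    """Return the mean, the range and the population standard deviation."""
    return {
        "mean": statistics.mean(values),
        "range": max(values) - min(values),
        "stdev": statistics.pstdev(values),
    }

summary_a = describe(sample_a)
summary_b = describe(sample_b)
print(summary_a)
print(summary_b)

# Both ratios come out as 10: same centre, ten times the spread.
ratio_range = summary_b["range"] / summary_a["range"]
ratio_sd = summary_b["stdev"] / summary_a["stdev"]
print(ratio_range, ratio_sd)
```

Both samples report a mean of 100, while the range and the standard deviation of the second sample are ten times those of the first.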

Importance
A study of dispersion enables us to get additional information about the composition of data. Confining ourselves to the mean will not provide this vital information: central tendency only gives information on the location of the data, while dispersion describes its spread. In addition, shape should also be part of the defining criteria of data. So location, spread and shape together are the best measures to define data. Two different sets of data can have different means but the same variability. On the other hand, two sets of data can have the same mean but different variability.

[Figure: Curve A and Curve B have the same mean but different variability.]

[Figure: Curve A and Curve B have different means but the same variability.]

Variability or variation is something connected with human life, and its study is very important for mankind. The total area of the earth may not be very important to a research-minded person, but the area under different crops, the area covered by forests and the area covered by residential and commercial buildings are figures of great importance, because these figures keep changing from time to time and from place to place. A very large number of experts are engaged in the study of changing phenomena. Experts working in different countries of the world keep watch on the forces responsible for bringing changes in the fields of human interest. Agricultural, industrial and mineral production and their transportation from one part of the world to another are matters of great interest to economists, statisticians and other experts. Changes in human population, in the standard of living, in the literacy rate and in prices attract experts to make detailed studies of them and then correlate these changes with human life.

The study of dispersion is very important in statistical data. It is used to:
Test the reliability of an average
Control the variability
Compare two or more sets of data with respect to their variability
Facilitate the use of other statistical techniques

If the wages of workers in a certain factory are consistent, the workers will be satisfied. But if some workers have high wages and some have low wages, there will be unrest among the low-paid workers, and they might go on strike and arrange demonstrations. If in a certain country some people are very poor and some are very rich, we say there is economic disparity. It means that dispersion is large. The idea of dispersion is important in the study of wages of workers, prices of commodities, standard of living of different people, distribution of wealth, distribution of land among farmers and various other fields of life.

Measures of dispersion are known as averages of the second order because they indicate the average deviation of individual observations from the mean. Measures of dispersion can be described in two forms:
1. Absolute form
2. Relative form
The measures under each form are shown below:

Measures of Dispersion

Absolute Form          Relative Form
Range                  Coefficient of Range
Quartile Deviation     Coefficient of Quartile Deviation
Mean Deviation         Coefficient of Mean Deviation
Standard Deviation     Coefficient of Variation

Range: Of the several measures of dispersion, the range is the first measure in absolute form. It is the difference between the largest and the smallest values in a data set, and it is known as the simplest measure of dispersion. However, the range only provides information about the maximum and minimum values and says nothing about the values in between. In the form of an equation, after arranging the data in order:

Range = Largest value - Smallest value

The range is widely used in statistical process control (SPC) applications because it is very easy to calculate and understand.
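As a small sketch of the calculation, the Python snippet below computes the range of the six share prices used in the coefficient of range example of this report. The second list, `clustered`, is made-up data added only to show that the range ignores the values in between.

```python
def value_range(values):
    """Range = largest value - smallest value."""
    return max(values) - min(values)

# Share prices from the coefficient of range example in this report.
share_prices = [200, 210, 208, 160, 220, 250]
print(value_range(share_prices))  # 90

# Made-up data: same extremes, very different middle values.
clustered = [160, 205, 205, 205, 205, 250]
print(value_range(clustered))  # 90
```

The two lists have the same range even though the second one is far more tightly clustered, which is the weakness of the range noted above.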

Quartile Deviation: The quartile deviation is half the difference between the upper and lower quartiles in a distribution. It is a measure of the spread through the middle half of a distribution. It can be useful because it is not influenced by extremely high or extremely low scores. The quartile deviation is an ordinal statistic and is most often used in conjunction with the median. The formula to calculate the quartile deviation is:

$QD = \frac{Q_3 - Q_1}{2}$

Where,
QD = Quartile Deviation
Q3 = Third Quartile
Q1 = First Quartile
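A minimal Python sketch of the quartile deviation follows. It assumes the quartile convention of `statistics.quantiles` with `method="exclusive"` (positions based on n + 1), which is only one of several conventions in use, and the data with an extreme value are made up for illustration.

```python
import statistics

def quartile_deviation(values):
    """Half the difference between the third and first quartiles."""
    q1, _median, q3 = statistics.quantiles(values, n=4, method="exclusive")
    return (q3 - q1) / 2

data = [2, 4, 6, 8, 10, 12, 14, 16, 18]
print(quartile_deviation(data))

# Made-up variant: the largest value is replaced by an extreme one.
with_outlier = [2, 4, 6, 8, 10, 12, 14, 16, 180]
print(quartile_deviation(with_outlier))
```

Both calls give the same result, which illustrates that the quartile deviation is not influenced by an extremely high value.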

Mean Deviation: A defect of the range is that it is based on only two values, the highest and the lowest. It does not take into consideration all of the values. The mean deviation does. It measures the mean amount by which the values in a population or sample vary from their mean. In terms of a definition, the mean deviation is the arithmetic mean of the absolute values of the deviations from the arithmetic mean. The formula is:

$MD = \frac{\sum |X - \bar{X}|}{n}$

Where,
$X$ is the value of each observation
$\bar{X}$ is the arithmetic mean of the values
$n$ is the number of observations in the sample
$| \; |$ indicates the absolute value
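The formula can be sketched in Python as follows, using the two five-value samples from the introduction of this report. The function name `mean_deviation` is only illustrative.

```python
def mean_deviation(values):
    """Arithmetic mean of the absolute deviations from the arithmetic mean."""
    n = len(values)
    mean = sum(values) / n
    return sum(abs(x - mean) for x in values) / n

print(mean_deviation([98, 99, 100, 101, 102]))   # 1.2
print(mean_deviation([80, 90, 100, 110, 120]))   # 12.0
```

Unlike the range, every observation contributes to the result.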

Standard Deviation: The variance and the standard deviation are also based on the deviations from the mean. However, instead of using the absolute value of the deviations, the variance and the standard deviation square the deviations. Features of the standard deviation are as follows:

The standard deviation is the square root of the sample variance.
It is defined so that it can be used to make inferences about the population variance.
It is calculated using the formula:

$s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}}$

The values computed in the squared term, $x_i - \bar{x}$, are anomalies, which are discussed in another section.
It is not restricted to large sample data sets, unlike the root mean square anomaly.

Variance: The arithmetic mean of the squared deviations from the mean is known as the variance. The variance is nonnegative and is zero only if all the observations are the same. The formula is:

$\sigma^2 = \frac{\sum (X - \bar{X})^2}{n}$
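The following Python sketch contrasts the two calculations. It assumes the divide-by-n form for the variance (the arithmetic mean of the squared deviations) and the n - 1 form for the sample standard deviation; the function names are illustrative, and the `statistics` module is used only as a cross-check.

```python
import math
import statistics

def population_variance(values):
    """Arithmetic mean of the squared deviations from the mean (divide by n)."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / n

def sample_standard_deviation(values):
    """Square root of the sample variance (divide by n - 1)."""
    n = len(values)
    mean = sum(values) / n
    return math.sqrt(sum((x - mean) ** 2 for x in values) / (n - 1))

sample = [80, 90, 100, 110, 120]
print(population_variance(sample))        # 200.0
print(sample_standard_deviation(sample))  # square root of 250

# Cross-check against the standard library.
print(statistics.pvariance(sample), statistics.stdev(sample))
```

The variance is expressed in squared units, while the standard deviation returns to the units of the data.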

Measures of Relative Dispersion


A measure of relative variation is the ratio of a measure of absolute variation to an average. It is sometimes called a coefficient of dispersion, because a coefficient is a pure number that is independent of the unit of measurement. It should be remembered that, in computing the relative variation, the average used as the base should be the same one from which the absolute variations were measured. The relative measures are:

Coefficient of range

Coefficient of mean deviation

Coefficient of quartile deviation

Coefficient of variation
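As a rough illustration of why a relative measure is a pure number, the Python sketch below divides the standard deviation by the mean (the coefficient of variation described later in this section) for the same made-up heights expressed in meters and in centimeters. The data and the use of the population standard deviation are assumptions for this example.

```python
import statistics

def coefficient_of_variation(values):
    """Standard deviation divided by the mean, expressed as a percentage."""
    return statistics.pstdev(values) / statistics.mean(values) * 100

# Made-up heights, first in meters and then converted to centimeters.
heights_m = [1.50, 1.60, 1.70, 1.80, 1.90]
heights_cm = [h * 100 for h in heights_m]

# The absolute measure changes with the unit of measurement ...
print(statistics.pstdev(heights_m), statistics.pstdev(heights_cm))

# ... but the relative measure does not.
cv_m = coefficient_of_variation(heights_m)
cv_cm = coefficient_of_variation(heights_cm)
print(cv_m, cv_cm)
```

The standard deviation is 100 times larger in centimeters, while the coefficient of variation is the same in both units up to floating-point rounding.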
Coefficient of range
The relative measure corresponding to the range, called the coefficient of range, is obtained by applying the following formula:

Coefficient of range = $\frac{L - S}{L + S}$

where L is the largest value and S is the smallest value. In a frequency distribution, the range is calculated by taking the difference between the lower limit of the lowest class and the upper limit of the highest class.

Example: The following are the prices of shares of a company from Monday to Saturday:

Day          Price
Monday       200
Tuesday      210
Wednesday    208
Thursday     160
Friday       220
Saturday     250

Solution: Here, largest value L = 250 and smallest value S = 160.
Range = L - S = 250 - 160 = 90
Coefficient of range = (L - S) / (L + S) = 90 / 410 = 0.2195

Coefficient of quartile deviation
The relative measure corresponding to the quartile deviation, called the coefficient of quartile deviation, is calculated as follows:

Coefficient of quartile deviation = $\frac{Q_3 - Q_1}{Q_3 + Q_1}$

The coefficient of quartile deviation can be used to compare the degree of variation in different distributions.

Coefficient of mean deviation
The relative measure corresponding to the mean deviation, called the coefficient of mean deviation, is calculated as follows:

Coefficient of mean deviation = $\frac{\text{Mean deviation}}{\text{Mean (or median)}}$

If the mean has been used in calculating the mean deviation, the coefficient of mean deviation is obtained by dividing the mean deviation by the mean.

Coefficient of variation
The relative measure corresponding to the standard deviation is called the coefficient of variation. This measure, developed by Karl Pearson, is the most commonly used measure of relative variation. It is used in problems where we want to compare the variability of two or more series. The coefficient of variation, denoted by C.V., is obtained as follows:

C.V. = $\frac{\sigma}{\bar{X}} \times 100$

Percentile: If the data are arranged in ascending order, the values that divide the data into one hundred equal parts are called percentiles.

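$P_i$ = value of the $\frac{i(n+1)}{100}$-th observation; i = 1, 2, 3, ..., 99

If $\frac{i(n+1)}{100}$ is a fraction, the percentile is found by interpolation: take the value at the whole-number position and add the fractional part times the difference between the next value and that value.

For a frequency distribution, the percentile is found by interpolating within the class that contains the required observation.

Example: Find the 50th percentile of 2, 4, 6, 8, 10, 12, 14, 16, 18.

Solution: Here, n = 9.
50th percentile = value of the $\frac{50(9+1)}{100}$-th observation = value of the 5th observation = 10.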

Decile: In descriptive statistics, a decile is any of the nine values that divide the sorted data into ten equal parts, so that each part represents 1/10 of the sample or population. Thus:

The 1st decile cuts off the lowest 10% of data, i.e., the 10th percentile.
The 5th decile cuts off the lowest 50% of data, i.e., the 50th percentile, 2nd quartile, or median.
The 9th decile cuts off the lowest 90% of data, i.e., the 90th percentile.
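Percentiles and deciles for ungrouped data can be sketched in Python as below. The sketch assumes the position rule i(n + 1)/100 with linear interpolation for fractional positions; clamping positions that fall outside the data to the smallest or largest value is an additional assumption made here, and the function names are illustrative.

```python
def percentile(values, i):
    """i-th percentile (i = 1..99) using the i * (n + 1) / 100 position rule."""
    data = sorted(values)
    n = len(data)
    position = i * (n + 1) / 100   # 1-based position in the sorted data
    lower = int(position)          # whole-number part of the position
    fraction = position - lower    # fractional part of the position
    if lower < 1:                  # assumption: clamp below the first value
        return data[0]
    if lower >= n:                 # assumption: clamp above the last value
        return data[-1]
    # Interpolate between the lower-th and (lower + 1)-th observations.
    return data[lower - 1] + fraction * (data[lower] - data[lower - 1])

def decile(values, k):
    """k-th decile (k = 1..9), i.e. the (10 * k)-th percentile."""
    return percentile(values, 10 * k)

data = [2, 4, 6, 8, 10, 12, 14, 16, 18]
print(percentile(data, 50))  # the 5th observation, 10
print(percentile(data, 25))  # position 2.5, halfway between 4 and 6
print(decile(data, 1), decile(data, 5), decile(data, 9))
```

For these nine values the 5th decile coincides with the 50th percentile and the median.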

Empirical Rule:

The empirical rule provides significant information about the distribution of data around the mean when the data are approximately normal.
1. The mean ± one standard deviation contains approximately 68.26% of the measurements in the series.
2. The mean ± two standard deviations contains approximately 95.5% of the measurements in the series.
3. The mean ± three standard deviations contains approximately 99.7% of the measurements in the series.

Climatologists often use standard deviations to help classify abnormal climatic conditions. The chart below describes the abnormality of a data value by how many standard deviations it is located away from the mean. The probabilities in the third column assume the data is normally distributed.

Standard Deviations Away From Mean    Abnormality               Probability of Occurrence
beyond -3 sd                          extremely subnormal       0.15%
-3 to -2 sd                           greatly subnormal         2.35%
-2 to -1 sd                           subnormal                 13.5%
-1 to +1 sd                           normal                    68.0%
+1 to +2 sd                           above normal              13.5%
+2 to +3 sd                           greatly above normal      2.35%
beyond +3 sd                          extremely above normal    0.15%

Oliver, John E. Climatology: Selected Applications. p 45.
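The probabilities in the chart are rounded figures that follow from the 68, 95 and 99.7 percent rule. As a rough check, the Python sketch below computes the corresponding probabilities for an exactly normal distribution using the error function; the function names are illustrative.

```python
import math

def normal_cdf(z):
    """Cumulative probability of a standard normal variable up to z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def band_probability(low, high):
    """Probability of lying between low and high standard deviations from the mean."""
    return normal_cdf(high) - normal_cdf(low)

bands = [
    (-math.inf, -3), (-3, -2), (-2, -1), (-1, 1), (1, 2), (2, 3), (3, math.inf),
]
for low, high in bands:
    print(f"{low:>5} to {high:>5} sd: {band_probability(low, high) * 100:.2f}%")
```

The exact values (about 0.13%, 2.14%, 13.59% and 68.27%) are close to, but not identical with, the rounded figures in the chart.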

Chebyshev's Theorem: A large standard deviation reveals that the observations are widely scattered about the mean. The Russian mathematician P. L. Chebyshev (1821-1894) developed a theorem that allows us to determine the minimum proportion of the values that lie within a specified number of standard deviations of the mean. For example, according to Chebyshev's Theorem, at least three of four values, or 75 percent, must lie between the mean plus two standard deviations and the mean minus two standard deviations. This relationship applies regardless of the shape of the distribution. Further, at least eight of nine values, or 88.9 percent, will lie between plus three standard deviations and minus three standard deviations of the mean. At least 24 of 25 values, or 96 percent, will lie between plus and minus five standard deviations of the mean.

For any set of observations, the proportion of the values that lie within k standard deviations of the mean is at least $1 - \frac{1}{k^2}$, where k is any constant greater than 1.
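The bound can be evaluated for the three cases mentioned above with a few lines of Python. The function name is illustrative.

```python
def chebyshev_minimum_proportion(k):
    """Minimum proportion of values within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("k must be greater than 1")
    return 1 - 1 / k ** 2

for k in (2, 3, 5):
    print(k, round(chebyshev_minimum_proportion(k) * 100, 1))
```

This reproduces the 75, 88.9 and 96 percent figures for k = 2, 3 and 5.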

Which Measure of Variation to Use
The choice of a suitable measure of dispersion depends on the following two factors:

1. The type of data available: If observations are few in number, avoid the standard deviation. If they are generally skewed, avoid the mean deviation as well. If they have gaps around the quartiles, the quartile deviation should be avoided. If there are open-end classes, the quartile measure of variation should be preferred.

2. The purpose of investigation: In an elementary treatment of a statistical series in which a measure of variability is desired only for itself, any of the three measures, namely range, mean deviation and quartile deviation, would be acceptable. Probably the mean deviation would be superior. In usual practice, however, the measure of variability is employed in further statistical analysis. For such a purpose, the standard deviation is by far the most popular. It is free from the defects from which the other measures suffer. It lends itself to the analysis of variability in terms of the normal curve of error. Practically all advanced statistical methods deal with variability and centre around the standard deviation. Hence, unless the circumstances warrant the use of another measure, we should use the standard deviation for measuring variability.

The measures of dispersion are compared in the following table:

Characteristics                     Range           Quartile Deviation   Mean Deviation   Standard Deviation
Clear Definition                    Yes             Yes                  Yes              Yes
Easily Understandable               Yes             Yes                  No               Yes
Determination Procedures            Easy            Average              Average          Not that easy
For further Algebraic Process       Not Eligible    Not Eligible         Not Eligible     Eligible
Usage of all items in a data set    No              No                   Yes              Yes
Effect of extreme values            Yes             Not much             Yes              Not much
Effect of sample fluctuations       No              Not much             Not much         Not much

From the above discussion, it is seen that the standard deviation has almost all the characteristics of an ideal measure of dispersion. Therefore, we can say that the standard deviation is the ideal measure of dispersion.

Some practical applications of Measures of Dispersion:

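The following data show the lifetime of laptops of two different brands.

Life time (Years)    No. of laptops
                     Dell    HP
2 - 4                20      15
4 - 6                15      20
6 - 8                25      20
8 - 10               30      25
10 - 12              35      15

i. Find which of the brands shows a greater lifetime.
ii. Which of the brands would you prefer if the prices were the same? Why?

Solution: The brand which has the greater mean has the greater lifetime. If the prices were the same, the brand which has less variability is to be preferred. The brand which has the smaller coefficient of variation has less variability.

At first let us take Dell,

Life time (Years)    Dell (f)    X     fX     fX^2
2 - 4                20          3     60     180
4 - 6                15          5     75     375
6 - 8                25          7     175    1225
8 - 10               30          9     270    2430
10 - 12              35          11    385    4235
Total                125               965    8445

$\bar{X} = \frac{\sum fX}{N} = \frac{965}{125} = 7.72$

$\sigma = \sqrt{\frac{\sum fX^2}{N} - \left(\frac{\sum fX}{N}\right)^2} = \sqrt{\frac{8445}{125} - \left(\frac{965}{125}\right)^2} = \sqrt{7.9616} = 2.82$

C.V. = $\frac{\sigma}{\bar{X}} \times 100 = \frac{2.82}{7.72} \times 100 = 36.53\%$

Then let us take HP,

Life time (Years)    HP (f)    X     fX     fX^2
2 - 4                15        3     45     135
4 - 6                20        5     100    500
6 - 8                20        7     140    980
8 - 10               25        9     225    2025
10 - 12              15        11    165    1815
Total                95              675    5455

$\bar{X} = \frac{\sum fX}{N} = \frac{675}{95} = 7.11$

$\sigma = \sqrt{\frac{\sum fX^2}{N} - \left(\frac{\sum fX}{N}\right)^2} = \sqrt{\frac{5455}{95} - \left(\frac{675}{95}\right)^2} = \sqrt{6.9363} = 2.63$

C.V. = $\frac{\sigma}{\bar{X}} \times 100 = \frac{2.63}{7.11} \times 100 = 36.99\%$

Dell has a mean of 7.72 years and HP has a mean of 7.11 years. As Dell has the greater mean, Dell has the greater lifetime. The coefficient of variation of Dell is 36.53% and the coefficient of variation of HP is 36.99%. If the prices were the same, Dell would be preferable, as it has less variability, which indicates better quality and higher consistency.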
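The grouped-data calculation can be reproduced with a short Python sketch. It assumes class midpoints of 3, 5, 7, 9 and 11 years and the population (divide by N) form of the standard deviation; the function name `grouped_summary` is illustrative. Because the sketch does not round intermediate values, its coefficients of variation differ slightly in the second decimal place from hand-rounded figures.

```python
import math

midpoints = [3, 5, 7, 9, 11]       # class midpoints X, in years
dell = [20, 15, 25, 30, 35]        # frequencies f for Dell
hp = [15, 20, 20, 25, 15]          # frequencies f for HP

def grouped_summary(midpoints, frequencies):
    """Return (mean, standard deviation, C.V. in percent) for grouped data."""
    n = sum(frequencies)
    sum_fx = sum(f * x for f, x in zip(frequencies, midpoints))
    sum_fx2 = sum(f * x * x for f, x in zip(frequencies, midpoints))
    mean = sum_fx / n
    sd = math.sqrt(sum_fx2 / n - mean ** 2)
    return mean, sd, sd / mean * 100

dell_mean, dell_sd, dell_cv = grouped_summary(midpoints, dell)
hp_mean, hp_sd, hp_cv = grouped_summary(midpoints, hp)
print(f"Dell: mean={dell_mean:.2f}, sd={dell_sd:.2f}, CV={dell_cv:.2f}%")
print(f"HP:   mean={hp_mean:.2f}, sd={hp_sd:.2f}, CV={hp_cv:.2f}%")
```

Either way, Dell shows both the higher mean lifetime and the lower coefficient of variation.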

Application of Empirical Rule: A sample of the monthly rental rates at University Park Apartments approximates a symmetrical, bell-shaped distribution. The sample mean is 500 taka and the standard deviation is 20 taka. Using the empirical rule, we have to determine:
1. About 68 percent of the monthly rental rates are between what two amounts?
2. About 95 percent of the monthly rental rates are between what two amounts?
3. Almost all of the monthly rental rates are between what two amounts?

Solution:
1. About 68 percent of the monthly rental rates are between $\bar{X} \pm 1s = 500 \pm 1(20)$, that is, 480 and 520 taka.

2. About 95 percent of the monthly rental rates are between $\bar{X} \pm 2s = 500 \pm 2(20)$, that is, 460 and 540 taka.

3. Almost all (99.7 percent) are between $\bar{X} \pm 3s = 500 \pm 3(20)$, that is, 440 and 560 taka.
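The three intervals can be generated with a short Python sketch; the function name is illustrative.

```python
def empirical_rule_intervals(mean, sd):
    """Intervals of mean +/- k standard deviations for k = 1, 2, 3."""
    return {k: (mean - k * sd, mean + k * sd) for k in (1, 2, 3)}

intervals = empirical_rule_intervals(500, 20)
for k, (low, high) in intervals.items():
    print(f"mean +/- {k} sd: {low} to {high} taka")
```

The same function applies to any approximately bell-shaped sample once its mean and standard deviation are known.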
