
QUANTITATIVE METHOD-CP-102

Q.1. What do you mean by measures of central tendency?


Measures of central tendency are very useful in Statistics. Their
importance is because of the following reasons:
(i) To find representative value:
Measures of central tendency or averages give us one value for the
distribution and this value represents the entire distribution. In this way
averages convert a group of figures into one value.
(ii) To condense data:
Collected and classified figures are vast. To condense these figures we
use average. Average converts the whole set of figures into just one
figure and thus helps in condensation.
(iii) To make comparisons:
To make comparisons of two or more than two distributions, we have to
find the representative values of these distributions. These
representative values are found with the help of measures of the central
tendency.
(iv) Helpful in further statistical analysis:
Many techniques of statistical analysis like Measures of Dispersion,
Measures of Skewness, Measures of Correlation, and Index Numbers
are based on measures of central tendency. That is why measures of
central tendency are also called measures of the first order.
Seeing this importance of averages in statistics, Prof. Bowley said,
"Statistics may rightly be called the science of averages."
Averages are very useful in Economics. It is because of the following
reasons:
(i) Helpful in knowing the structure of any economy:
For studying the structure of any economy we use per capita income,
per capita consumption, per capita saving, per hectare production, per
worker production, etc. All these are averages.
(ii) Helpful in comparing different economies:
Suppose we are to compare the economies of Punjab, Haryana and
Himachal. For this purpose we shall use per capita income which is
nothing but an average.
(iii) Helpful in studying various economic problems:
These days the different economic problems are studied with the help
of index numbers. For example, the problem of inflation is studied with the
help of the price index number. Index numbers are nothing but a special type
of average.
(iv) Helpful in formulating and evaluating economic policy:
Averages are used in formulating and evaluating economic policy. For
example, if we are to study the effect of economic planning on Indian
economy, we may use per capita income.
(v) Helpful in research:
Measures of central tendency are used in statistical analysis. Therefore,
these are used for research in Economics.
Limitations
In spite of this importance, measures of central tendency have many
limitations, which are as follows:
(i) They can be used properly only by skilled persons.
(ii) Sometimes the average is a value that does not occur in the distribution
and hence is not a true representative. For example, the mean of 100, 300, 100, 50
and 250 is 160, which is not in the distribution and hence is not a true
representative.
(iii) Sometimes the average gives absurd results. For example, we may find
the average number of members per family to be 2.3.
(iv) Measures of central tendency do not describe the true structure of
the distribution. Two or more distributions may have the same
mean but a different structure.
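Limitations (ii) and (iv) can be checked with a few lines of Python. This is only a minimal sketch using the standard `statistics` module. The list `values` comes from the example above, while the lists `a` and `b` are made-up data chosen for illustration.

```python
from statistics import mean

# Data from limitation (ii): the mean is not one of the observed values.
values = [100, 300, 100, 50, 250]
avg = mean(values)
print(avg, avg in values)

# Hypothetical data for limitation (iv): same mean, very different spread.
a = [48, 49, 50, 51, 52]
b = [0, 25, 50, 75, 100]
print(mean(a), mean(b))                   # the two means agree
print(max(a) - min(a), max(b) - min(b))   # the two ranges do not
```

The two lists share one average even though `b` is spread over a far wider interval, which is why an average alone does not describe the structure of a distribution.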
Q.2. What is a measure of dispersion? State its range & standard deviation.
Ans: A measure of statistical dispersion is a nonnegative real
number that is zero if all the data are the same and increases as the
data become more diverse.
Most measures of dispersion have the same units as the quantity being
measured. In other words, if the measurements are in metres or
seconds, so is the measure of dispersion. Such measures of dispersion
include:
Sample standard deviation
Interquartile range (IQR)
Range
Mean absolute difference (also known as Gini mean absolute
difference)
Median absolute deviation (MAD)
Average absolute deviation (or simply called average deviation)
Distance standard deviation
These are frequently used (together with scale factors)
as estimators of scale parameters, in which capacity they are
called estimates of scale. Robust measures of scale are those unaffected
by a small number of outliers, and include the IQR and MAD.
All the above measures of statistical dispersion have the useful property
that they are location-invariant and linear in scale.
So if a random variable X has a dispersion of SX, then a linear
transformation Y = aX + b for real a and b should have
dispersion SY = |a|SX.
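The property SY = |a|SX can be illustrated numerically for the sample standard deviation. The sketch below uses made-up data and arbitrary constants `a` and `b`. It is a check on one sample, not a proof.

```python
from statistics import stdev

# Hypothetical sample X and an arbitrary linear transformation Y = a*X + b.
x = [2.0, 4.0, 4.0, 5.0, 7.0, 9.0]
a, b = -3.0, 10.0
y = [a * xi + b for xi in x]

s_x = stdev(x)
s_y = stdev(y)

# Location-invariant: the shift b has no effect on the spread.
# Linear in scale: the spread is multiplied by |a|.
print(s_y, abs(a) * s_x)
```

The two printed numbers should agree up to floating-point rounding, whatever value of `b` is chosen.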
Other measures of dispersion are dimensionless. In other words, they
have no units even if the variable itself has units. These include:
Coefficient of variation
Quartile coefficient of dispersion
Relative mean difference, equal to twice the Gini coefficient
Entropy: While the entropy of a discrete variable is location-invariant and scale-independent, and therefore not a measure of
dispersion in the above sense, the entropy of a continuous variable
is location-invariant and additive in scale: if Hx is the entropy of a
continuous variable x and y = ax + b with a > 0, then Hy = Hx + log(a).
There are other measures of dispersion:
Variance (the square of the standard deviation): location-invariant but not linear in scale.
Variance-to-mean ratio: mostly used for count data, where it is also
called the coefficient of dispersion. It is dimensionless only when the
data are themselves dimensionless, as count data are.
Some measures of dispersion have specialized purposes, among them
the Allan variance and the Hadamard variance.
For categorical variables, it is less common to measure dispersion by a
single number; see qualitative variation. One measure that does so is
the discrete entropy.
6 properties of a good Measure of Dispersion
Since measures of dispersion are usually called averages of the
second order, they should possess all the qualities of a good average.
According to Yule and Kendall, these are as follows:
1) It should be easy to calculate and simple to follow.
2) It should be rigidly defined: For the same data, all the methods should
produce the same result.
3) It should be based on all the items so as to be more representative.
4) It should be amenable to further algebraic treatment.
5) It should have sampling stability.
6) It should not be unduly affected by the extreme items.
Q.3. What is skewness? State the different coefficients of variation.
Ans: Comparison of skewness coefficient, coefficient of variation,
and Gini coefficient as inequality measures within populations
Summary. The moment skewness coefficient, coefficient of variation
and Gini coefficient are contrasted as statistical measures of inequality
among members of plant populations. Constructed examples, real data
examples, and distributional considerations are used to illustrate
pertinent properties of these statistics to assess inequality. All three
statistics possess some undesirable properties but these properties are
shown to be often unimportant with real data. If the underlying
distribution of the variable follows the often assumed two-parameter
lognormal model, it is shown that all three statistics are likely to be
highly and positively correlated. In contrast, for distributions which are
not two-parameter lognormally distributed, and when the distribution is
not concentrated near zero, the coefficient of variation and Gini
coefficient, which are sensitive to small shifts in the mean, are often of
little practical use in ordering the equality of populations. The coefficient
of variation is more sensitive to individuals in the right-hand tail of a
distribution than is the Gini coefficient. Therefore, the coefficient of
variation may often be recommended over the Gini coefficient if a
measure of relative precision is selected to assess inequality. The
skewness coefficient is suggested when the distribution is either three-parameter lognormally distributed (or close to such), or when a
measure of relative precision is not indicated.
Measures of Variance
Common Measures of Variance
Range
The range is the difference between the highest and lowest values. Since it
uses only the extreme values, it is greatly affected by extreme values.
Procedure for finding
1. Take the largest value and subtract the smallest value
Formula
$\text{range} = x_{\max} - x_{\min}$

Variation
The variation is the sum of the squares of the deviations from the mean.
Its units are the square of the units of the original data, and
it does not take the sample size into account.
Procedure for finding
1. Find the mean of the data
2. Subtract the mean from each value to find the deviation from the mean
3. Square each deviation from the mean
4. Total the squares of the deviations from the mean
Formula
$\text{variation} = \sum (x - \bar{x})^2$

Variance
The variance is the average squared deviation from the mean. Its
usefulness is limited because the units are squared and not the same as
those of the original data. The sample variance is denoted by $s^2$; it is an unbiased
estimator of the population variance.
Procedure for finding
1. Find the mean of the data
2. Subtract the mean from each value to find the deviation from the mean
3. Square each deviation from the mean
4. Total the squares of the deviations from the mean
5. Divide by the degrees of freedom (one less than the sample size)
Formula
$s^2 = \frac{\sum (x - \bar{x})^2}{n - 1}$

Standard Deviation
The standard deviation measures the typical deviation from the mean. It is
found by taking the square root of the variance, which solves the problem
of not having the same units as the original data. The sample standard
deviation is denoted by $s$. It is not an unbiased estimator of the
population standard deviation.
Procedure for finding
1. Find the variance
2. Take the square root
Formula
$s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$

Less Common Measures of Variance
Mean Absolute Deviation
The sum of the deviations from the mean will always be zero, so the
deviations cannot simply be averaged. We need to make sure that none of
the deviations is negative. We can do this by squaring each deviation (as
we do in the variance or standard deviation) or by taking the absolute
value (as we do in the mean absolute deviation).
Procedure for finding
1. Find the mean of the data
2. Subtract the mean from each data value to get the deviation from the mean
3. Take the absolute value of each deviation from the mean
4. Total the absolute values of the deviations from the mean
5. Divide the total by the sample size
Formula
$\text{MAD} = \frac{\sum |x - \bar{x}|}{n}$

Range Rule of Thumb
The range rule of thumb says that the range is approximately four times
the standard deviation. Alternatively, the standard deviation is
approximately one-fourth the range. This reflects the fact that most of the data
lie within two standard deviations of the mean.
Procedure for finding
1. Find the range
2. Divide it by four
Formula
$s \approx \frac{\text{range}}{4}$
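The procedures in this section can be followed step by step in code. The function below is a minimal sketch, not a reference implementation. The name `dispersion_summary` and the sample data are assumptions made for illustration.

```python
from math import sqrt

def dispersion_summary(data):
    """Return the measures described above for a list of numbers."""
    n = len(data)
    mean = sum(data) / n
    deviations = [x - mean for x in data]

    data_range = max(data) - min(data)
    mad = sum(abs(d) for d in deviations) / n    # mean absolute deviation
    variation = sum(d * d for d in deviations)   # sum of squared deviations
    variance = variation / (n - 1)               # sample variance s^2
    std_dev = sqrt(variance)                     # sample standard deviation s
    return {
        "range": data_range,
        "mean_abs_dev": mad,
        "variation": variation,
        "variance": variance,
        "std_dev": std_dev,
        "range_rule_estimate": data_range / 4,   # rough guess at s
    }

# Hypothetical sample, used only to exercise the function.
sample = [4, 8, 6, 5, 3, 7, 9]
summary = dispersion_summary(sample)
for name, value in summary.items():
    print(f"{name}: {value:.4f}")
```

Comparing `std_dev` with `range_rule_estimate` on a given data set shows how rough the range rule of thumb can be for small samples.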

Pearson's Index of Skewness
Pearson's index of skewness can be used to determine whether the data
are symmetric or skewed. If the index is between -1 and 1, then the
distribution is approximately symmetric. If the index is no more than -1, then it is
skewed to the left, and if it is at least 1, then it is skewed to the right.
Procedure for finding
1. Find the mean, median, and standard deviation of the data.
2. Subtract the median from the mean.
3. Multiply by 3.
4. Divide by the standard deviation.
Formula
$I = \frac{3(\bar{x} - \text{median})}{s}$
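The four steps for Pearson's index translate directly into code. This is a small sketch using the standard `statistics` module. The function name and both samples are hypothetical.

```python
from statistics import mean, median, stdev

def pearson_skewness(data):
    """Pearson's index of skewness: 3 * (mean - median) / s."""
    return 3 * (mean(data) - median(data)) / stdev(data)

# Hypothetical samples: one symmetric, one with a long right-hand tail.
symmetric = [1, 2, 3, 4, 5]
right_tailed = [1, 2, 2, 3, 3, 4, 20]

print(pearson_skewness(symmetric))
index = pearson_skewness(right_tailed)
print(index)
```

For the symmetric sample the mean equals the median, so the index is zero. For the second sample the large value pulls the mean above the median and the index is positive.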

Coefficient of Variation
The coefficient of variation is expressed as a percent and describes the
standard deviation relative to the mean. It can be used to compare
variability when the units are different (the units will divide out,
providing just a raw number).
Procedure for finding
1. Find the mean and standard deviation for the data
2. Divide the standard deviation by the mean
3. Multiply by 100
Formula
$CV = \frac{s}{\bar{x}} \times 100\%$
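Because the units divide out, the coefficient of variation lets us compare the spread of quantities measured on different scales. The sketch below assumes made-up height and weight data. The function name is illustrative.

```python
from statistics import mean, stdev

def coefficient_of_variation(data):
    """Sample standard deviation as a percentage of the mean."""
    return stdev(data) / mean(data) * 100

# Hypothetical measurements in different units.
heights_cm = [160, 165, 170, 175, 180]
weights_kg = [55, 60, 70, 80, 95]

cv_height = coefficient_of_variation(heights_cm)
cv_weight = coefficient_of_variation(weights_kg)
print(f"height CV: {cv_height:.2f}%  weight CV: {cv_weight:.2f}%")

# Changing the unit (cm to m) leaves the coefficient unchanged.
heights_m = [h / 100 for h in heights_cm]
print(f"height CV in metres: {coefficient_of_variation(heights_m):.2f}%")
```

In this made-up data the weights vary more, relative to their mean, than the heights do, even though the two are recorded in different units.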

Q.4. What is maximum & minimum & its application?
Ans: The Main Provisions
The following are the important provisions under the Banking Regulation
Act, 1949 regarding control and regulation of the banking sector in India.
1. Requirements regarding the minimum paid-up capital and reserves for commencement of banking business.
2. Prohibition of charge on unpaid capital.
3. Payment of dividends only after writing off all capitalized expenses.
4. Transfer to reserve fund out of profits (minimum 20 per cent).
5. Maintenance of cash reserves by the non-scheduled banks (minimum 3 per cent).
6. Restrictions on holding shares in other companies.
7. Restrictions on loans and advances to directors and others.
8. Licensing of banking companies.
9. Licences for opening of new branches and transfer of existing place of business.
10. Maintenance of a percentage of liquid assets (SLR) (minimum 25 per cent and maximum 40 per cent).
11. Maintenance of assets in India by a banking company (minimum 75 per cent of DTL).
12. Submission of return of unclaimed deposits.
13. Power to call for and publish information.
14. Preparation of accounts and balance sheets.
15. Audit of the balance sheet and profit & loss account.
16. Publication of audited accounts and balance sheet.
17. Inspection of books and accounts of banking companies by RBI.
18. Giving directions to banking companies.
19. Prior approval from RBI for appointment of managing directors.
20. Removal of managerial and any other persons from office.
21. Power of RBI to appoint additional directors.
22. Moratorium under the orders of a High Court.
23. Winding up of banking companies.
24. Scheme of amalgamation to be sanctioned by the RBI.
25. Power of RBI to apply to the Central Government for an order of moratorium in respect of a banking company and for a scheme of reconstruction or amalgamation.
26. Power of RBI to examine the record of proceedings and tender advice in winding up proceedings.
27. Power of RBI to inspect and make its report on winding up.
28. Power of RBI to call for returns and information from the liquidator of a banking company.
29. Issue of No Objection Certificate for change of name.
30. Issue of No Objection Certificate for the alteration of memorandum of a banking company.
31. Central Government to consult the RBI for making rules regarding banking companies.
32. Recommendation to the Central Government for exempting any bank from the provisions of the Banking Regulation Act, 1949.
Q.5. What do you mean by the concept of probability & what is a probability
distribution?
Ans: A probability is a number that reflects the chance or likelihood that a particular
event will occur. Probabilities can be expressed as proportions that range from 0
to 1, and they can also be expressed as percentages ranging from 0% to 100%. A
probability of 0 indicates that there is no chance that a particular event will
occur, whereas a probability of 1 indicates that an event is certain to occur. A
probability of 0.45 (45%) indicates that there are 45 chances out of 100 of the
event occurring.
The concept of probability can be illustrated in the context of a study of obesity
in children 5-10 years of age who are seeking medical care at a particular
pediatric practice. The population (sampling frame) includes all children who
were seen in the practice in the past 12 months and is summarized below.
Age (years)      5      6      7      8      9     10    Total
Boys           432    379    501    410    420    418    2,560
Girls          408    513    412    436    461    500    2,730
Totals         840    892    913    846    881    918    5,290
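With the counts in the table, probabilities are simple ratios of counts. The sketch below assumes that each child is equally likely to be selected and that the six columns correspond to ages 5 through 10 in order. The three events are examples chosen for illustration.

```python
# Counts of children by age (5 to 10) from the table above.
ages = [5, 6, 7, 8, 9, 10]
boys = [432, 379, 501, 410, 420, 418]
girls = [408, 513, 412, 436, 461, 500]

total = sum(boys) + sum(girls)

# Probability that a randomly selected child is a boy.
p_boy = sum(boys) / total

# Probability that a randomly selected child is a 7-year-old girl.
p_girl_7 = girls[ages.index(7)] / total

# Probability that a randomly selected child is at least 8 years old.
p_age_8_plus = sum(
    b + g for age, b, g in zip(ages, boys, girls) if age >= 8
) / total

print(total, round(p_boy, 4), round(p_girl_7, 4), p_age_8_plus)
```

Each result is a proportion between 0 and 1 and can equally be read as a percentage.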

A probability function is a function which assigns probabilities to the values of a
random variable. It must satisfy two conditions:
All the probabilities must be between 0 and 1 inclusive.
The sum of the probabilities of the outcomes must be 1.
If these two conditions aren't met, then the function isn't a probability function.
There is no requirement that the values of the random variable only be between
0 and 1, only that the probabilities be between 0 and 1.
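The two conditions can be turned into a small check. This is a sketch for a discrete random variable stored as a dictionary. The function name, the tolerance `tol`, and the three example dictionaries are assumptions for illustration.

```python
def is_probability_function(pmf, tol=1e-9):
    """Check the two conditions for a discrete probability function.

    pmf maps each value of the random variable to its probability.
    """
    in_range = all(0 <= p <= 1 for p in pmf.values())
    sums_to_one = abs(sum(pmf.values()) - 1) <= tol
    return in_range and sums_to_one

# Hypothetical example: number of heads in two tosses of a fair coin.
heads = {0: 0.25, 1: 0.50, 2: 0.25}

# Values of the variable may lie outside [0, 1]; only the
# probabilities are restricted.
payoff = {-10: 0.5, 100: 0.5}

# Not a probability function: the probabilities sum to 1.2.
broken = {1: 0.6, 2: 0.6}

print(is_probability_function(heads))
print(is_probability_function(payoff))
print(is_probability_function(broken))
```

The tolerance is there because sums of floating-point probabilities rarely equal 1 exactly.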
Probability Distributions

A listing of all the values the random variable can assume, together with their
corresponding probabilities, makes a probability distribution.
A note about random variables. A random variable does not mean that the values
can be anything (a random number). Random variables have a well-defined set of
outcomes and well-defined probabilities for the occurrence of each outcome. The
word "random" refers to the fact that the outcomes happen by chance; that is, you
don't know which outcome will occur next.

Q.6. What is sampling & what are its types & techniques?
Ans: Sampling
A finite subset of the population, selected from it with the objective of
investigating its properties is called a sample and the number of units in
the sample is known as the sample size.
Sampling is a tool which enables us to draw conclusions about the
characteristics of the population after studying only those objects or
items that are included in the sample. The main objectives of the
sampling theory are:
(i) To obtain the optimum results, i.e., the maximum information about
the characteristics of the population with the available sources at our
disposal in terms of time, money and manpower by studying the sample
values only.
(ii) To obtain the best possible estimates of the population parameters.
The Essentials of Good Sampling
In order to reach the right conclusions, a sample must possess the
following essential characteristics.
1. Representative:
The sample should truly represent the characteristics of the universe. For
this, the investigator should be free from bias and the method of collection
should be appropriate.
2. Adequacy:
The size of the sample should be adequate, i.e., neither too large nor
too small but commensurate with the size of the population.
3. Homogeneity:
There should be homogeneity in the nature of all the units selected for
the sample. If the units of the sample are of heterogeneous character, it
will be impossible to make a comparative study with them.
4. Independence:
The method of selection of the sample should be such that the items of
the sample are selected in an independent manner. This means that the
selection of one item should not influence the selection of another item in
any manner and that each item should be selected on the basis of its own
merit.
Types of Purposive Sampling
Following are the main types of purposive sampling.
A. Quota Sampling:
It is a type of purposive sampling in which the whole universe is divided
first into certain parts and the total sample is allocated among these
parts. Each part of the population is assigned to an investigator, for
whom the quota of units to be examined is fixed in advance
according to certain specified characteristics such as sex, age,
occupation, income group, political or religious affiliation.
The investigator is asked to select the required number of units of the
sample of his own accord and examine them to get the desired
information as quickly as possible.
He is also authorized to substitute new units in the quota if he finds
that any unit of the sample so selected is not responding up to the
mark. This method is very often used in opinion poll surveys, market
surveys and political surveys.
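The allocation of the total sample among the parts of the universe can be sketched in code. The text does not say how quotas are fixed, so the example assumes proportional allocation, which is one common choice. The function name and the age-group sizes are hypothetical.

```python
def allocate_quotas(part_sizes, sample_size):
    """Split sample_size among parts in proportion to their sizes.

    Uses largest-remainder rounding so the quotas add up exactly.
    """
    total = sum(part_sizes.values())
    exact = {k: sample_size * v / total for k, v in part_sizes.items()}
    quotas = {k: int(q) for k, q in exact.items()}
    leftover = sample_size - sum(quotas.values())
    # Hand the remaining units to the parts with the largest fractions.
    by_remainder = sorted(
        exact, key=lambda k: exact[k] - quotas[k], reverse=True
    )
    for k in by_remainder[:leftover]:
        quotas[k] += 1
    return quotas

# Hypothetical universe divided by age group.
universe = {"18-29": 1200, "30-44": 1500, "45-59": 900, "60+": 400}
quotas = allocate_quotas(universe, 100)
print(quotas)
```

Each investigator would then fill the quota for his part by his own choice of units, which is what makes the method purposive rather than random.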
B. Convenience Sampling
It is a type of purposive sampling in which the sample units are selected
purposively by the investigator to suit his convenience in the matter of
location and contact with the units.
This method of selecting the sample is also called 'chunk' sampling, since the
samples under this method are selected neither on the basis of the rules
of probability nor on the basis of the judgment of the investigator but
on the basis of convenience on the part of the investigator.
For example, a sample obtained from a list of students in a college for
enquiring into their educational problems, is a matter of convenience
sampling as in such a case it will be convenient for the investigator to

Categories of Sampling
There are two major categories of sampling:
1. Random or Probability Sampling: Random sampling is also called
probability sampling because the laws of probability can be applied to it.
Note that the term 'random' does not describe the data in a
sample; it describes the process used to select the sample from a population.
Random Sampling does not depend upon the existence of detailed
information about the universe. It also provides such data as are
unbiased. Also, we can measure the relative efficiency of different
sample designs with random sampling methods.
Limitations of this type of sampling cannot be ignored either. It requires
high levels of skill. Also, it consumes a lot of time for planning the
process of actual sampling. The cost of execution of this sampling
method is very high.
2. Non-Random or Judgment Sampling: This is a process of sample
selection where we do not use random methods. A non-random sample
is selected on the basis of judgment or convenience. There is no
selection on the basis of probability considerations. The pattern of
sample variability in the process cannot be known.
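The difference between the two categories can be seen in a short sketch. It uses Python's standard `random` module. The sampling frame of 500 numbered units, the seed, and the sample size are all assumptions for illustration.

```python
import random

# Hypothetical sampling frame: 500 numbered units.
population = list(range(1, 501))

# A seeded generator makes the draw repeatable for this illustration.
rng = random.Random(42)

# Random (probability) sampling: simple random sampling without
# replacement, so every unit has the same chance of selection.
random_sample = rng.sample(population, k=25)

# Non-random sampling: the first 25 units on the list are taken
# because they are the easiest to reach (a convenience sample).
convenience_sample = population[:25]

print(sorted(random_sample))
print(convenience_sample)
```

Only for the first sample is the chance of selection known for every unit, which is what allows the laws of probability to be applied.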
Q.7. What is hypothesis testing? State its technique using the chi-square
test.
Ans: Hypothesis: Formulation, Types and Testing
In hypothesis testing, we must state the hypothesized value of the
population parameter before we begin sampling. The hypothesis we
wish to test is called the Null Hypothesis and is denoted as $H_0$. Example: If
we want to test the hypothesis that the population mean is equal to
600, we can write it as $H_0: \mu = 600$ and read it as, "The null
hypothesis is that the population mean is equal to 600."
Hypothesis Testing: Test of Means
1. By One-Tailed Test: Take the example of a drug which is frequently
used by a hospital. The individual dose of this drug is 125 cc. There is no
harm when the body takes an excessive dose of this drug. On the other
hand, insufficient doses do not assist doctors in the necessary medical
treatment. The hospital has been purchasing the same drug from the
same manufacturer for many years, and the population's standard
deviation is 4 cc. The hospital inspects 50 doses of this drug at random
from a very large consignment and calculates the mean of these doses
to be 99.5 cc. The data in this case are:
$\mu_{H_0} = 125$ (hypothesised value of the population mean)
$\sigma = 4$ (population standard deviation)
$n = 50$ (sample size)
$\bar{x} = 99.5$ (sample mean)
The hospital sets a 0.10 significance level. We have to find out "whether
the dosages in this consignment are too small."
In order to find the answer, we can state the problem as follows:
$H_0: \mu = 125$ (null hypothesis)
$H_1: \mu < 125$ (alternative hypothesis)
$\alpha = 0.10$ (level of significance for testing this hypothesis)
Here, we calculate the standard error of the mean (the
population size is assumed to be infinite):
$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{4}{\sqrt{50}} \approx 0.566$ cc
The hospital wants to know whether the actual dosage is 125 cc or whether the
dosage is too small. The dosage must be at least a certain amount; otherwise
the hospital should reject the consignment.
This is a one-tailed test, and the shaded portion represents the 0.10
significance level. In Fig. 8.9, the acceptance region includes 40 per cent
of the area on the left side of the mean and 50 per cent of the area on
the right side of the mean. The non-acceptance region has an area of 10
per cent, shown by the shaded portion.
As we know the population standard deviation and n is larger than 30,
we can use the normal distribution. The appropriate z value for 40 per cent
of the area under the curve is 1.28. Using this information, we can calculate
the acceptance region's lower limit:
$\mu_{H_0} - 1.28\,\sigma_{\bar{x}} = 125 - 1.28(0.5657) = 125 - 0.7241 = 124.276$ cc (lower limit)
The observed sample mean of 99.5 cc lies far below this lower limit, so it
falls in the non-acceptance region. As a result, the hospital should reject
the null hypothesis: the difference between the hypothesized mean of
125 cc and the observed sample mean is significant. On the basis of
this sample of 50 doses, the hospital should reject the consignment.
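The one-tailed calculation can be reproduced with Python's standard library. This sketch takes the figures exactly as stated in the example and uses `statistics.NormalDist` to look up the z value instead of a printed table. The variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu_0 = 125.0    # hypothesised population mean (cc)
sigma = 4.0     # population standard deviation (cc)
n = 50          # sample size
x_bar = 99.5    # sample mean (cc), as stated in the example
alpha = 0.10    # significance level

std_error = sigma / sqrt(n)

# Left-tailed test: z value cutting off the lowest 10% of the curve
# (about -1.28).
z_crit = NormalDist().inv_cdf(alpha)
lower_limit = mu_0 + z_crit * std_error

# Reject the null hypothesis when the sample mean is below the limit.
reject_h0 = x_bar < lower_limit

print(f"standard error = {std_error:.4f}")
print(f"lower limit    = {lower_limit:.3f}")
print(f"reject H0      = {reject_h0}")
```

With the figures as stated, the sample mean lies well below the lower limit of the acceptance region.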
2. By Two-Tailed Test: An engineering firm supplies water pumps to a
hotel. These pumps must have a pumping capacity of 40,000 gallons per
minute each. If the pumping capacity has to be very exact, then
production costs rise (because of the use of extremely sturdy impellers
and other testing procedures). Researchers have concluded that the
standard deviation of the pumping capacity is 2,000 gallons per minute. The
manufacturer selects a sample of 40 pumps from the fresh production
lot. These pumps are subjected to tests. It is found that the mean
pumping capacity of the sample of pumps is 39,500 gallons per minute.
If we write these data in symbolic notation, we get:
$\mu_{H_0} = 40000$ gpm (hypothesised value of the population mean)
$\sigma = 2000$ gpm (population standard deviation)
$n = 40$ pumps (sample size)
$\bar{x} = 39500$ gpm (sample mean)
If the supplier uses a significance level ($\alpha$) of 0.05 in testing, we can state
the problem as:
$H_0: \mu = 40000$ (null hypothesis: the true mean is 40,000)
$H_1: \mu \neq 40000$ (alternative hypothesis: the true mean is not 40,000)
$\alpha = 0.05$ (level of significance for testing this hypothesis)
We know the population standard deviation, and the population is large
enough to be treated as infinite. So, we may use the normal distribution in
our testing. First of all, we calculate the standard error of the mean by using
the following equation:
$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{2000}{\sqrt{40}} = \frac{2000}{6.3246} \approx 316.23$ (standard error of the mean)
Figure 8.10 shows the significance level of 0.05 (the two shaded regions
each contain 0.025 of the area). The 0.95 acceptance region contains two
equal areas of 0.475 each. The appropriate z value for 0.475 of the area
under the curve is 1.96. Now, we can find the limits of the acceptance region:
$\mu_{H_0} + 1.96\,\sigma_{\bar{x}} = 40000 + 1.96(316.23) = 40000 + 619.81 = 40619.81$ (upper limit)
$\mu_{H_0} - 1.96\,\sigma_{\bar{x}} = 40000 - 1.96(316.23) = 40000 - 619.81 = 39380.19$ (lower limit)
The two limits of the acceptance region are therefore 40,619.81 and
39,380.19. The sample mean, 39,500, lies within the
acceptance region. The supplier should accept the null hypothesis
because there is no significant difference between the hypothesized mean
of 40,000 and the observed mean of the sample.
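The two-tailed limits can be checked the same way. This sketch uses `statistics.NormalDist` for the z value, which comes out a little below the rounded 1.96, so the limits may differ from hand calculations in the second decimal place. The variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu_0 = 40000.0   # hypothesised population mean (gpm)
sigma = 2000.0   # population standard deviation (gpm)
n = 40           # sample size
x_bar = 39500.0  # sample mean (gpm)
alpha = 0.05     # significance level

std_error = sigma / sqrt(n)

# Two-tailed test: alpha is split equally between the two tails.
z_crit = NormalDist().inv_cdf(1 - alpha / 2)   # about 1.96

lower_limit = mu_0 - z_crit * std_error
upper_limit = mu_0 + z_crit * std_error
accept_h0 = lower_limit <= x_bar <= upper_limit

print(f"standard error    = {std_error:.3f}")
print(f"acceptance region = [{lower_limit:.2f}, {upper_limit:.2f}]")
print(f"accept H0         = {accept_h0}")
```

Because the sample mean falls between the two limits, the null hypothesis is not rejected at the 0.05 level.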
A chi-square test (Snedecor and Cochran, 1983) can be used to test if
the variance of a population is equal to a specified value. This test can
be either a two-sided test or a one-sided test. The two-sided version
tests against the alternative that the true variance is either less than or
greater than the specified value. The one-sided version only tests in one
direction. The choice of a two-sided or one-sided test is determined by
the problem. For example, if we are testing a new process, we may only
be concerned if its variability is greater than the variability of the
current process.
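For this test the statistic is (n - 1)s²/σ0², where s² is the sample variance and σ0² is the specified variance. It is compared with a chi-square critical value on n - 1 degrees of freedom. The sketch below is a one-sided (upper-tail) illustration only. The data, the specified variance of 4.0, and the function name are hypothetical, and the critical value is taken from a standard chi-square table, since the Python standard library does not provide chi-square quantiles.

```python
from statistics import variance

def chi_square_variance_statistic(sample, sigma0_squared):
    """Return the statistic (n - 1) * s^2 / sigma0^2 and its degrees of freedom."""
    n = len(sample)
    return (n - 1) * variance(sample) / sigma0_squared, n - 1

# Hypothetical measurements from a new process.
sample = [12.1, 9.4, 15.2, 8.3, 13.9, 10.6, 16.4, 7.8, 14.5, 11.8]

# Hypothetical variance of the current process (the specified value).
sigma0_squared = 4.0

stat, df = chi_square_variance_statistic(sample, sigma0_squared)

# Upper 5% critical value of chi-square with 9 degrees of freedom,
# read from a standard table.
CRITICAL_VALUE_9_DF = 16.919

# One-sided test: is the new variability greater than the old?
reject_h0 = stat > CRITICAL_VALUE_9_DF
print(f"statistic = {stat:.2f}, df = {df}, reject H0 = {reject_h0}")
```

A two-sided version would compare the statistic with both a lower and an upper critical value, each cutting off half of the significance level.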
Q.8. What is correlation & regression analysis? What are its types &
applications, or the concept of parametric & non-parametric?
Ans: The goal of a correlation analysis is to see whether two
measurement variables covary, and to quantify the strength of the
relationship between the variables, whereas regression expresses the
relationship in the form of an equation.
For example, in students taking a Maths and English test, we could use
correlation to determine whether students who are good at Maths tend
to be good at English as well, and regression to determine whether the
marks in English can be predicted for given marks in Maths.
We can use the correlation coefficient, such as the Pearson Product
Moment Correlation Coefficient, to test if there is a linear relationship
between the variables. To quantify the strength of the relationship, we
can calculate the correlation coefficient (r). Its numerical value ranges
from +1.0 to -1.0. r > 0 indicates positive linear relationship, r < 0
indicates negative linear relationship while r = 0 indicates no linear
relationship.
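The Pearson coefficient can be computed directly from its definition: the sum of products of deviations divided by the square root of the product of the two sums of squared deviations. The marks below are invented for illustration, and `pearson_r` is an assumed helper name.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson product-moment correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical marks of six students in Maths and English.
maths = [35, 48, 52, 61, 70, 84]
english = [40, 45, 58, 60, 66, 79]

r = pearson_r(maths, english)
print(f"r = {r:.3f}")
```

A value of r close to +1 for such data would suggest that students who are good at Maths tend to be good at English as well. It says nothing about whether one causes the other.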
In regression analysis, the problem of interest is the nature of the
relationship itself between the dependent variable (response) and the
(explanatory) independent variable.
The analysis consists of choosing and fitting an appropriate model, done
by the method of least squares, with a view to exploiting the
relationship between the variables to help estimate the expected
response for a given value of the independent variable. For example, if
we are interested in the effect of age on height, then by fitting a
regression line, we can predict the height for a given age.
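For a straight-line model, the method of least squares gives the slope as the sum of products of deviations divided by the sum of squared deviations of the independent variable. The line passes through the point of means. The sketch below uses invented ages and heights, and `fit_line` is an assumed helper name.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical ages (years) and heights (cm) of five children.
ages = [2, 4, 6, 8, 10]
heights = [86, 102, 115, 128, 139]

slope, intercept = fit_line(ages, heights)

# Predicted height for a child aged 7, using the fitted line.
predicted_height_at_7 = intercept + slope * 7
print(f"height = {intercept:.1f} + {slope:.1f} * age")
print(f"predicted height at age 7: {predicted_height_at_7:.1f} cm")
```

Predictions of this kind are only reasonable within the range of ages used to fit the line.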
1. Most of the variables show some kind of relationship. For instance,
there is relationship between price and supply, income and expenditure
etc. With the help of correlation analysis we can measure in one figure
the degree of relationship.
2. Once we know that two variables are closely related, we can estimate
the value of one variable given the value of another. This is known with
the help of regression.
3. Correlation analysis contributes to the understanding of economic
behavior, aids in locating the critically important variables on which
others depend.
4. Progressive development in the methods of science and philosophy
has been characterized by increase in the knowledge of relationship. In
nature also one finds multiplicity of interrelated forces.
5. The effect of correlation is to reduce the range of uncertainty. The
prediction based on correlation analysis is likely to be more reliable and
nearer to reality.
Correlation
X:   10   15   20   25   30
Y:   10   13   18   17   21   29
Thus, from the above example it is clear that the ratio of change
between the two variables is not the same. If we plot these values
on a graph, they would not fall on a straight line.
C. Number of Variables
According to the number of variables, correlation is said to be of the
following three types:
(i) Simple Correlation.
(ii) Partial Correlation.
(iii) Multiple Correlation.
(i) Simple Correlation:
In simple correlation, we study the relationship between two variables.
Of these two variables, one is principal and the other is secondary. For
instance, income and expenditure, price and demand, etc. Here
income and price are principal variables while expenditure and demand
are secondary variables.
(ii) Partial Correlation:
If in a given problem more than two variables are involved, and of these
variables we study the relationship between only two, keeping
the other variables constant, the correlation is said to be partial. It is so
because the effect of the other variables is assumed to be constant.
(iii) Multiple Correlation:
Under multiple correlation, the relationship between three or more
variables is studied jointly. For instance, the relationship between rainfall,
use of fertilizer and manure, and the per hectare productivity of a maize crop.
