Unit 1
Definition of statistics:
1. Yule and Kendall define statistics as follows:
By statistics we mean quantitative data affected to a marked extent by a
multiplicity of causes.
2. Dr. A. L. Bowley defined statistics as:
Statistics are numerical statements of facts in any department of enquiry,
placed in relation to each other.
3. Connor defines statistics as:
Statistics are measurements, enumerations or estimates of natural or
social phenomena, systematically arranged so as to exhibit their
inter-relations.
Importance of Statistics:
The methods and techniques available in statistics help people to solve
various problems presented in statistical data. Hence statistics has wide
application and scope in the fields of commerce, economics, physics, chemistry,
botany, zoology, psychology etc.
Statistics in Business:
Statistics is most commonly used in business. It helps a company decide
whether to start a new business. Existing companies can compare their
performance with that of other companies through statistical analysis, and
can also project their future with regression and correlation analysis.
Statistics in Economics:
The problems in economics cannot be studied without the use of
statistics. The laws of economics always refer to statistics in order to prove
their accuracy. The wider application of economics is not possible
without the knowledge of statistics.
Statistics in Astronomy:
Astronomers were the first to record the movements of
heavenly bodies and to study eclipses and other astronomical issues on the
basis of statistics.
Statistics in Education:
Statistics is widely used in education. Research has become a common
feature in all branches of activity. Statistics is necessary for the formulation of
policies to start new courses, for the consideration of facilities available for new
courses etc. There are crores of people engaged in research work to test past
knowledge and evolve new knowledge, and this is possible only through
statistics.
Statistics in Mathematics:
Statistics can be considered an important member of the mathematics
family. Statistics involves the quantitative study of many topics such as averages,
interpolation, extrapolation, correlation, regression, analysis of time series,
index numbers etc. To make these studies, the application of mathematics is
unavoidable. Thus, we find that there is a close relationship between
mathematics and statistics.
Statistics in Management:
The important managerial activities like planning, directing and
controlling are properly executed with the help of statistical data and
statistical analysis. Statistical techniques can also be used for the
payment of wages to the employees of the company.
Statistics in Banking and Finance:
Banking and financial activities use statistics most commonly. Banks
apply statistical techniques in calculating interest. Stock exchanges and
financial institutions like the Industrial Development Bank of India and the State
Financial Corporation of India also use statistics to project the future and to solve
various statistical problems.
Limitations of Statistics:
The following are the important limitations of statistics:
1. Statistics studies only quantitative phenomena:
Statistics studies only quantitative phenomena. It does not deal with
phenomena which cannot be expressed in quantitative values. Hence
qualitative phenomena like intelligence, honesty, studies etc. cannot be
dealt with by statistics unless they are expressed in quantitative values.
2. Statistics deals with aggregates and not with individual measurements:
Statistics deals with aggregate facts. It requires a series of figures for
calculating averages and for analysis. An individual measurement has no
recognition; it cannot be taken into consideration for any statistical
analysis.
3. Statistical results are not perfectly accurate:
Statistical methods do not give exact results; the results are
only approximate values. Moreover, the data collected for analysis may
not be accurate either.
4. Data must be uniform in statistics:
Data for statistical analysis must be uniform. For example, data relating
to the income of people in a locality should not be mixed with data
relating to the expenditure of those people. These two phenomena should be
studied separately.
5. Statistics can be misused:
Only experienced and efficient persons can handle statistics in a proper
way. Untrained and inefficient persons may not produce accurate results
with the help of statistical techniques.
Distrust of statistics:
1. Deliberate twisting of facts.
2. Inconsistent definitions.
3. Failure to represent complete data.
4. Inappropriate comparison.
5. Wrong inference drawn.
6. Using misleading basis.
7. Improperly classified data.
8. Data collected by improper persons.
9. Inaccurate measurement.
10. Arithmetical errors.
11. Lack of technical knowledge of statistics.
12. Biased opinion of the investigator.
2. Existent population:
A universe containing persons or concrete objects is known as
existent or real population.
Example: the number of students in the university, the population
of the city, the number of employees.
Hypothetical population:
A hypothetical universe is also known as a theoretical
population. It is one which does not consist of concrete objects.
Example: if we toss a coin an infinite number of times, the outcomes form a
hypothetical population.
Information on a population can be collected in two ways:
1. Census method.
2. Sample method.
1. Census method:
In the census method, or universal coverage, every element of the population is
included in the investigation. Where we make a complete enumeration of
all items in the population, it is known as the census method.
Example: if we study the average expenditure of students of a particular
university, say X, and there are 50000 students studying in that university,
we must study the expenditure of all 50000 students. This method is known as
the census method.
Merits:
a. The data are collected from each and every item of the population.
b. The results are more accurate and reliable.
c. Intensive study is possible.
d. The data collected may be used for various surveys, analyses etc.
Demerits:
a. It requires a large number of enumerators.
b. It is a costly method.
c. It requires more money, labour, time, energy etc.
d. It is not possible when the universe is infinite.
2. Sample method:
In our daily life we have been using sampling without knowing
about it.
Example: a homemaker tests a small quantity of rice to see whether it has
been well cooked, but does not inspect all the rice. Thus, in a
sample enquiry, only a part of the population is studied.
Merits:
a. It saves time when the results are urgently required.
b. It reduces cost since few items are selected for sampling.
c. It has administrative convenience and is more scientific.
d. The degree of accuracy in this method is higher than in the census
method, in which each and every unit is studied.
SAMPLING METHODS
Methods of sampling:
1. Random sampling method:
a. Simple or unrestricted method.
b. Restricted or stratified method.
Stratified sampling
Systematic sampling
Cluster sampling
2. Non-random sampling:
a. Judgement or purposive sampling
b. Quota sampling
c. Convenience sampling.
1. Random sampling method:
A random sample is one where each item in the universe has an equal
chance or known opportunity of being selected, that is, any item in the
population has an equal chance of being included.
Simple random sampling
It is the technique in which the sample is so drawn that each and every
unit in the population has an equal and independent chance of being
included in the sample. The two methods adopted are:
Lottery method
Table of random numbers
Merits:
1. More scientific.
2. More representative.
3. Sampling error can be measured.
Demerits:
1. When the population is large this method cannot be used.
2. If the sample size is small, it does not represent the population.
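The lottery method of simple random sampling described above can be sketched in Python. This is only a minimal illustration, assuming the population is available as a list; the function name `lottery_sample` and the roll numbers are hypothetical, and the standard-library call `random.sample` stands in for drawing slips without replacement.

```python
import random


def lottery_sample(population, sample_size, seed=None):
    """Draw a simple random sample without replacement (lottery method)."""
    rng = random.Random(seed)  # a seed only makes the draw repeatable
    # random.sample gives every unit an equal chance of being included
    return rng.sample(population, sample_size)


if __name__ == "__main__":
    roll_numbers = list(range(1, 101))  # hypothetical population of 100 students
    print(lottery_sample(roll_numbers, 10, seed=1))
```

Each run with a different seed gives a different set of 10 roll numbers, which is the sense in which the selection is left to chance.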
Restricted random sampling
When the population has different segments with respect to
the variable under study, stratified sampling is used. First the population
is divided into sub-groups (strata) and a sample is drawn from each of them.
There are two types of stratified sampling:
Proportional sampling
Non proportional sampling
Merits
1. It ensures greater accuracy.
2. It is easy to administer and sub-divide.
Demerits
1. It requires more money, time and statistical experience.
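Proportional stratified sampling, as described above, can be sketched as follows. The function name, the strata and their sizes are hypothetical; the sketch assumes each stratum is a list of units and allocates the sample in proportion to stratum size.

```python
import random


def proportional_stratified_sample(strata, sample_size, seed=None):
    """Sample each stratum in proportion to its share of the population.

    strata: dict mapping a stratum name to the list of its units.
    """
    rng = random.Random(seed)
    total = sum(len(units) for units in strata.values())
    sample = {}
    for name, units in strata.items():
        # Proportional allocation; rounding may shift the overall
        # sample size by a unit or two for awkward proportions
        k = round(sample_size * len(units) / total)
        sample[name] = rng.sample(units, k)
    return sample


if __name__ == "__main__":
    # Hypothetical university with three faculties of unequal size
    strata = {
        "arts": list(range(0, 600)),
        "science": list(range(600, 900)),
        "commerce": list(range(900, 1000)),
    }
    picked = proportional_stratified_sample(strata, 50, seed=1)
    print({name: len(units) for name, units in picked.items()})
```

In non-proportional sampling the allocation line would be replaced by a fixed number per stratum, regardless of stratum size.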
Systematic sampling:
It is also known as quasi-random sampling. This method is used when a
complete list of the population is available: we arrange the items in
numerical, geographical or alphabetical order and select items from the list
at regular intervals, with the starting item chosen at random.
Merits
1. It is simple and convenient.
2. The time and work are much reduced.
Demerits
1. It may not represent the whole population.
2. There is the element of personal bias of investigators.
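A common way to carry out systematic sampling, assumed here because the notes do not spell out the steps, is to pick a random start within the first k items of the ordered list and then take every k-th item, where k is the population size divided by the sample size. The function name `systematic_sample` is hypothetical.

```python
import random


def systematic_sample(ordered_items, sample_size, seed=None):
    """Take every k-th item of an ordered list after a random start."""
    k = len(ordered_items) // sample_size  # sampling interval
    if k < 1:
        raise ValueError("sample size exceeds population size")
    rng = random.Random(seed)
    start = rng.randrange(k)  # random start among the first k items
    return ordered_items[start::k][:sample_size]


if __name__ == "__main__":
    register = list(range(1, 101))  # hypothetical alphabetical register
    print(systematic_sample(register, 10, seed=1))
```

Only the starting point is random; once it is fixed, the rest of the sample is determined by the interval.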
Cluster sampling:
It is also known as multi-stage sampling. It refers to a sampling
procedure which is carried out in several stages: the whole population is
divided into sampling units and these units are again divided into
sub-units. This process continues until we reach the smallest unit.
Merits:
1. It introduces flexibility in the sampling method.
2. It is helpful in large-scale surveys where other methods would be
time-consuming or expensive.
3. It is valuable in underdeveloped countries.
Demerits:
1. It is less accurate than other models.
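The staged procedure above can be illustrated with two stages only. This is a sketch under assumed names: `two_stage_sample`, the districts and the household numbers are all hypothetical, and further stages would repeat the same step on smaller sub-units.

```python
import random


def two_stage_sample(clusters, n_clusters, n_units, seed=None):
    """Stage 1: pick clusters at random. Stage 2: pick units within each."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {
        name: rng.sample(clusters[name], min(n_units, len(clusters[name])))
        for name in chosen
    }


if __name__ == "__main__":
    # Hypothetical population: 5 districts with 20 households each
    districts = {
        f"district_{i}": list(range(i * 100, i * 100 + 20)) for i in range(5)
    }
    print(two_stage_sample(districts, n_clusters=2, n_units=5, seed=1))
```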
2. Non random sampling:
Judgement sampling:
The investigator has the power to select or reject any item in the
investigation; the choice of the sample items depends on the judgement of
the investigator, who thus has an important role to play in collecting information.
Merits
1. It is a simple method.
2. It is used to obtain a more representative sample.
3. It is helpful in making public policy decisions.
Demerits
1. Due to the individual bias of the investigator, the sample may not be a representative one.
2. It is difficult to get correct sampling errors.
3. The estimates are not accurate.
4. Its results cannot be compared with other sampling studies.
Quota sampling:
This sampling is similar to stratified sampling. It is used in the USA
for investigating public opinion and for consumer research. To collect data,
the universe is divided into quotas according to some given characteristics.
Each enumerator is then told to interview a certain number of persons
who are in the quota. The selection of sample items depends on personal
judgement.
Merits:
1. It saves time and money.
2. It will give quite reliable results.
Demerits:
1. Personal prejudice and individual bias are there.
2. Sampling error cannot be determined.
Convenience sampling:
The other name is chunk sampling. A chunk is a convenient slice of the
population which is commonly referred to as a sample. It is obtained by
selecting convenient population units.
Merits
1. It is suitable when the universe is not clearly defined.
2. It is suitable when the sampling unit is not clear.
3. It is suitable when a complete source list is not available.
Demerits
1. The results may not be representative.
2. They are unsatisfactory.
3. They are biased.
Questionnaire:
It is the medium of communication between the investigator and the respondent.
The success of the investigation depends on the construction of the questionnaire.
Points to be followed while framing a questionnaire:
1. The questionnaire should be brief.
2. The questions should be simple to understand.
3. Questions should be arranged logically.
4. There must be a choice of question types, such as simple alternative
questions, multiple choice questions and specific information questions.
5. Proper words should be used in the questionnaire.
6. Questions of a sensitive and personal nature should be avoided.
7. Necessary instructions should be given to the informants.
8. Questions requiring mathematical calculations should not be asked.
9. Questions must be capable of an objective answer.
10. A questionnaire should look attractive.
11. The questionnaire must be pre-tested before posting it.
12. The accuracy of the questionnaire must be judged.
Precautions required in the use of a questionnaire:
1. The person conducting the survey must introduce himself or herself.
2. The aim and objective of the enquiry should be made known to the informants.
3. The number of questions should be restricted to the minimum. A
reasonable questionnaire may have 20-25 questions.
4. Instructions for filling in the questionnaire should be given.
5. The questionnaire should be made attractive and interesting through proper layout.
6. The questionnaire should be pre-tested to find out its shortcomings, if
any.
7. Questions on personal matters should be asked carefully.
Unit 3: Classification
Classification is the process of arranging the available facts into
homogeneous groups or classes according to their resemblance and similarity.
Objects of classification:
The chief objects of classification are:
1. To condense the mass of data.
2. To present the facts in a simple form.
3. To bring out clearly the points of similarity and dissimilarity.
4. To facilitate comparison.
5. To prepare data for tabulation
6. To eliminate unnecessary data.
7. To facilitate easy.
Rules of classification:
1. Exactness.
2. Mutually exclusive.
3. Stability.
4. Flexibility.
5. Suitability.
6. Homogeneity.
7. Mathematical accuracy.
Types of classification:
There are four types of classification:
1. Geographical classification.
2. Chronological classification.
3. Qualitative classification.
4. Quantitative classification.
1. Geographical classification:
In this method the data are classified by geographical area, such as states,
districts, cities, talukas, regions, zones etc.
Example:
Name of the town      No. of employees
1. Madras             15000
2. Trichy             13000
3. Madurai            11000
4. Coimbatore         8000
5. Kanyakumari        5000
2. Chronological classification:
In this type, data are classified according to the time of occurrence,
such as years, months, weeks, days, hours etc.
4. Quantitative classification:
In this method, the data are classified according to some characteristics
which are capable of quantitative measurement, like age, height, weight,
price, production, sales, income etc.
TABULATION
Tabulation:
Tabulation is a systematic presentation of numeric data in columns
and rows in accordance with salient features and characteristics.
Croxton and Cowden state that
Either for one's own use or for the use of others, the data must be
presented in a suitable form.
Definition:
Tabulation is the process of systematic and scientific presentation
of data in a compact form for further analysis. Tabulation is the orderly
arrangement of data in columns and rows.
The main objectives of tabulation are:
1. To clarify the object of investigation.
2. To simplify the complex data.
3. To clarify the characteristics of the data.
4. To present facts in the minimum of space.
5. To facilitate comparison.
6. To detect errors and omissions in the data.
7. To facilitate statistical processing.
8. To help reference.
Rules of tabulation:
Preparing a good statistical table is an art. The following parts must be
present in all tables:
1. Table number.
2. Title.
3. Head note.
4. Caption.
5. Stubs.
6. Body of the table.
7. Foot note.
8. Source note.
1. Table number: Each table must be numbered in order to
identify it.
2. Title: Each table must have a title. The title should be clear, brief and
concise. It should convey the content and purpose of the table.
3. Head note: It is a statement given below the title and enclosed in
brackets. For example, the unit of measurement is written as a
head note, such as "in millions" or "in crores".
4. Captions: These are the headings for the vertical columns. They
must be brief and self-explanatory. The main heading and sub-headings
must be written in small letters.
5. Stubs: These are the headings or designations for the horizontal rows.
Stubs are wider than the columns.
6. Body of the table: It contains the numerical information. It is the
most important part of the table. The body is generally arranged
from left to right in rows and from top to bottom in
columns.
7. Foot note: If any explanation or elaboration regarding any item is
necessary, foot notes should be given.
8. Source note: It refers to the source from which the information has
been taken. It helps the reader to check the figures and gather
additional information.
Requirement (requisites) for good tabulation:
The following are the requirements for a good tabulation:
1. The table must be simple. It should clearly convey the purpose for which
it is prepared.
2. Columns and rows should be neither too narrow nor too wide.
3. A brief description should be given for the particulars of columns and
rows. If more details are required, they should be given as footnotes.
4. The table should clearly show the units of measurement.
5. The table should be an optimum one. It should appeal to the readers.
6. Figures which are closely related should be kept together in rows and
columns.
Types of tables:
Tables may be classified into four types:
1. Simple table.
2. Complex table.
3. General purpose table.
4. Special purpose table.
1. Simple table:
A simple table may be defined as a table showing only one
characteristic. It is also called a one-way table. A model of a one-way
table or simple table is shown below.
Marks obtained by students of a class
Marks in Statistics    No. of Students
Below 20               5
20-40                  8
40-60                  13
60-80                  12
80-100                 7
Total                  45
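A one-way table like the one above is built by counting how many raw marks fall in each class interval. The sketch below assumes class width 20 and treats the upper limit of each class as exclusive; the function name `classify_marks` and the sample marks are hypothetical, since the notes give only the finished table.

```python
def classify_marks(marks, width=20, top=100):
    """Count the marks falling in each class interval of the given width."""
    n_classes = top // width
    labels = [f"Below {width}"] + [
        f"{low}-{low + width}" for low in range(width, top, width)
    ]
    counts = [0] * n_classes
    for mark in marks:
        # Upper limits are exclusive; a mark equal to `top` is placed
        # in the last class (an assumption of this sketch)
        index = min(mark // width, n_classes - 1)
        counts[index] += 1
    return list(zip(labels, counts))


if __name__ == "__main__":
    raw_marks = [5, 19, 20, 39, 40, 75, 99, 100]  # hypothetical raw data
    for label, count in classify_marks(raw_marks):
        print(f"{label:<10}{count}")
```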
2. Complex table:
A complex table may be defined as a table showing two or
more characteristics. If the table shows two characteristics, it is
called a two-way table or double tabulation.
If the table shows three characteristics it is called triple tabulation;
if the table shows four or more characteristics it is called manifold
tabulation.
Marks scored by students in two classes
Marks       No. of Students       Total
            Class A    Class B
Below 20    5          10         15
20-40       8          15         23
40-60       13         20         33
60-80       12         18         30
80-100      7          7          14
Total       45         70         115
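In a two-way table the row totals and the column totals must add up to the same grand total, and recomputing them is an easy way to detect arithmetical errors. A minimal sketch using the Class A and Class B figures from the table above (the function name `table_totals` is hypothetical):

```python
# Class A and Class B frequencies for each marks class, as in the table
rows = {
    "Below 20": (5, 10),
    "20-40": (8, 15),
    "40-60": (13, 20),
    "60-80": (12, 18),
    "80-100": (7, 7),
}


def table_totals(rows):
    """Return row totals, column totals and the grand total of a table."""
    row_totals = {label: sum(values) for label, values in rows.items()}
    column_totals = [sum(column) for column in zip(*rows.values())]
    return row_totals, column_totals, sum(column_totals)


if __name__ == "__main__":
    row_totals, column_totals, grand_total = table_totals(rows)
    # The grand total from the columns must equal the sum of the row totals
    assert grand_total == sum(row_totals.values())
    print(row_totals, column_totals, grand_total)
```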
3. General purpose table:
General purpose tables are otherwise called reference tables
or repository tables. They provide information for general use.
Example: tables of this type are published by the statistical
departments of the government, like census tables and tables about
industrial and agricultural production in various periods etc.
4. Special purpose table:
Special purpose tables are otherwise called summary tables or
derivative tables. They are derived from the
information in the general purpose tables.
Merits of range:
1. It is simple to compute and understand.
2. It gives a rough but quick answer.
Demerits of range:
1. Only two extreme values are taken into account.
2. It is affected by extreme values.
3. It cannot be applied to open end cases.
4. It is not suitable for mathematical treatment.
Uses of range:
1. It facilitates statistical quality control.
2. It facilitates weather forecasting.
Quartile Deviation:
Quartiles may be defined as those values which divide the total frequencies
into four equal parts. They are termed the lower quartile $Q_1$, the median $Q_2$ and
the upper quartile $Q_3$.
Quartile deviation = $\frac{Q_3 - Q_1}{2}$
Quartile Coefficient = $\frac{Q_3 - Q_1}{Q_3 + Q_1}$
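As a small worked illustration, the sketch below takes the quartile deviation as half the difference between the upper and lower quartiles, and the quartile coefficient as (Q3 - Q1) / (Q3 + Q1). The function name `quartile_measures` and the data are hypothetical; quartiles are located with the standard-library `statistics.quantiles`, whose default interpolation rule is only one of several conventions in use.

```python
import statistics


def quartile_measures(values):
    """Return (quartile deviation, quartile coefficient) of the data."""
    # With n=4, statistics.quantiles returns the three cut points Q1, Q2, Q3
    q1, _q2, q3 = statistics.quantiles(values, n=4)
    deviation = (q3 - q1) / 2
    coefficient = (q3 - q1) / (q3 + q1)
    return deviation, coefficient


if __name__ == "__main__":
    data = [1, 2, 3, 4, 5, 6, 7]  # here Q1 = 2 and Q3 = 6
    print(quartile_measures(data))  # (2.0, 0.5)
```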
Measures of Skewness:
Measures of skewness indicate not only the extent of skewness but also
its direction, i.e. the manner in which the deviations are distributed.
Absolute measures of Skewness:
(i) Karl Pearson's Coefficient of Skewness:
$Sk = \frac{3(\text{Mean} - \text{Median})}{\text{Standard deviation}}$
(ii) Bowley's Coefficient of Skewness:
$Sk = \frac{Q_3 + Q_1 - 2\,\text{Median}}{Q_3 - Q_1}$
It is simple to calculate.
It is difficult to calculate.
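The two coefficients can be compared on the same data. The sketch assumes the usual forms: Karl Pearson's coefficient as 3(Mean - Median) divided by the standard deviation, and Bowley's coefficient as (Q3 + Q1 - 2 Median) / (Q3 - Q1). Function names and data are hypothetical, and the population standard deviation is used as an assumption.

```python
import statistics


def pearson_skewness(values):
    """Karl Pearson's coefficient: 3 (mean - median) / standard deviation."""
    mean = statistics.mean(values)
    median = statistics.median(values)
    sd = statistics.pstdev(values)  # population standard deviation (assumed)
    return 3 * (mean - median) / sd


def bowley_skewness(values):
    """Bowley's coefficient: (Q3 + Q1 - 2 median) / (Q3 - Q1)."""
    q1, q2, q3 = statistics.quantiles(values, n=4)  # q2 is the median
    return (q3 + q1 - 2 * q2) / (q3 - q1)


if __name__ == "__main__":
    data = [2, 3, 4, 5, 6, 8, 14]  # a long right tail: positive skewness
    print(pearson_skewness(data), bowley_skewness(data))
```

Both coefficients come out positive for this data, agreeing on the direction of skewness even though their magnitudes differ.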
LORENZ CURVE
Definition:
Lorenz curve is a graphical method of studying dispersion. It is a percentage
cumulative curve in which the percentage of cumulative values of one variable
is combined with the percentage of cumulative values of the other variable.
Procedures for drawing the Lorenz curve:
Suppose X and Y are two variables considered for drawing the Lorenz
curve. The following steps are adopted for drawing the Lorenz curve.
1. Write down the cumulative values of X.
2. Express in percentage these cumulative values.
3. Write down the cumulative values of Y.
4. Express in percentage these cumulative values.
5. On graph paper, mark the percentages 0, 10, 20, ..., 100 on both axes.
6. Join the points (0, 0) and (100, 100) to draw the diagonal.
7. Plot the points (x, y), where x and y are the pairs of cumulative percentage
values of X and Y.
8. Draw a smooth curve joining the plotted points.
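Steps 1 to 7 above amount to computing cumulative percentages for both variables and pairing them. A minimal sketch (function names and figures are hypothetical; the plotting itself is left out):

```python
from itertools import accumulate


def cumulative_percentages(values):
    """Cumulate the values and express each running total as a percentage."""
    total = sum(values)
    return [100 * running / total for running in accumulate(values)]


def lorenz_points(x_values, y_values):
    """Pair the cumulative percentages of X and Y into points to plot."""
    xs = cumulative_percentages(x_values)
    ys = cumulative_percentages(y_values)
    # The curve starts at the origin, like the diagonal it is compared with
    return [(0.0, 0.0)] + list(zip(xs, ys))


if __name__ == "__main__":
    persons = [10, 20, 30, 40]  # hypothetical number of persons per group
    income = [5, 15, 30, 50]    # hypothetical income of each group
    print(lorenz_points(persons, income))
```

The further the plotted points fall from the diagonal joining (0, 0) and (100, 100), the greater the inequality in the distribution.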
Uses of Lorenz curve:
1. It is used to show the dispersion
2. It is a device used to show the measurement of economic inequalities in
distribution of income and wealth.
Measurement of mortality:
1. Crude death rate (C.D.R)
2. Specific death rate (S.D.R)
3. Standardised death rate (STDR)
1. Crude death rate:
The simplest way of measuring the mortality is to relate the total
number of deaths during the period to the total population at the middle
of the period.
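Crude death rate = $\frac{D}{P} \times 1000$
where D is the total number of deaths during the period and P is the total
population at the middle of the period.
2. Specific death rate:
Specific death rate = $\frac{D_x}{P_x} \times 1000$
where $D_x$ is the number of deaths in age group x and $P_x$ is the
population of that age group.
3. Standardised death rate:
a. Direct Method:
Standardised death rate = $\frac{\sum m_x P_x^{s}}{\sum P_x^{s}}$
where $m_x$ is the specific death rate of age group x in the given population
and $P_x^{s}$ is the standard population in that age group.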
b.
Indirect Method:
m xs Pxs
Pxa
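The crude, specific and directly standardised death rates can be computed as below. The sketch assumes rates are expressed per 1000 population, that deaths and population are given by age group, and that the direct method weights the specific death rates by a standard population; all function names and figures are hypothetical, and the indirect method is not covered.

```python
def crude_death_rate(deaths, population):
    """Total deaths per 1000 of the total mid-period population."""
    return 1000 * sum(deaths) / sum(population)


def specific_death_rates(deaths, population):
    """Deaths per 1000 population within each age group."""
    return [1000 * d / p for d, p in zip(deaths, population)]


def standardised_death_rate(deaths, population, standard_population):
    """Direct method: specific rates weighted by a standard population."""
    rates = specific_death_rates(deaths, population)
    weighted = sum(m * ps for m, ps in zip(rates, standard_population))
    return weighted / sum(standard_population)


if __name__ == "__main__":
    deaths = [10, 20, 70]             # hypothetical deaths in three age groups
    population = [1000, 2000, 2000]   # hypothetical mid-period population
    standard = [2000, 2000, 1000]     # hypothetical standard population
    print(crude_death_rate(deaths, population))
    print(specific_death_rates(deaths, population))
    print(standardised_death_rate(deaths, population, standard))
```

The standardised rate differs from the crude rate here only because the standard population has a younger age structure than the given one.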
Fertility:
The word fertility is used in relation to the actual production of children
or occurrence of births, especially live births.
Measurement of Fertility:
1. Crude Birth Rate (C.B.R)
2. General Fertility Rate (G.F.R)
3. Total Fertility Rate (T.F.R)
1. Crude Birth Rate:
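Crude birth rate shows the rate at which population increases through
births. It is simple to calculate and has the same defects as crude death rate.
Furthermore, the denominator is the total population, although only the female
population, and that too within a certain age interval in the child-bearing
period, can give birth.
Crude Birth Rate = $\frac{B}{P} \times 1000$
where B is the total number of live births during the period and P is the
total population.
2. General Fertility Rate:
General Fertility Rate = $\frac{B}{\sum {}^{f}P_x} \times 1000$
where ${}^{f}P_x$ is the female population in age group x of the
child-bearing period.
Age-specific fertility rate: $i_x = \frac{B_x}{{}^{f}P_x} \times 1000$
where $B_x$ is the number of births to women in age group x.
3. Total Fertility Rate: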
Total fertility rate shows the rate at which a newborn female
would, on an average, add to the total population if she remained alive and
experienced the age-specific fertility rates throughout the child-bearing
period. TFR is only a hypothetical measure because all newborn females cannot
be expected to remain alive till the end of the child-bearing period.
Total fertility rate = $5 \sum i_x$
where $i_x$ is the age-specific fertility rate of age group x, and the factor
5 assumes five-year age groups.
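The fertility measures can be sketched in the same way. The code assumes rates per 1000, births and female population given by five-year age groups of the child-bearing period, and total fertility rate taken as five times the sum of the age-specific fertility rates; function names and figures are hypothetical.

```python
def crude_birth_rate(total_births, total_population):
    """Live births per 1000 of the total population."""
    return 1000 * total_births / total_population


def general_fertility_rate(total_births, females_by_age):
    """Live births per 1000 women in the child-bearing ages."""
    return 1000 * total_births / sum(females_by_age)


def age_specific_fertility_rates(births_by_age, females_by_age):
    """Births per 1000 women within each age group."""
    return [1000 * b / f for b, f in zip(births_by_age, females_by_age)]


def total_fertility_rate(births_by_age, females_by_age, width=5):
    """Width of the age groups times the sum of age-specific rates."""
    rates = age_specific_fertility_rates(births_by_age, females_by_age)
    return width * sum(rates)


if __name__ == "__main__":
    births = [100, 200, 100]      # hypothetical births in three age groups
    females = [1000, 1000, 1000]  # hypothetical women in those age groups
    print(crude_birth_rate(sum(births), 20000))
    print(general_fertility_rate(sum(births), females))
    print(total_fertility_rate(births, females))  # per 1000 women
```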
Population Growth:
Population growth is analysed by demographers in terms of four factors;
fertility, mortality, immigration and emigration.
Measurement of Population Growth:
1. Vital index.
2. Gross Reproduction Rate (G.R.R)
3. Net Reproduction Rate (N.R.R)
1. Vital index:
The way of measuring the growth of population using the birth-death
ratio is called the vital index.
Vital index = $\frac{\text{Total births}}{\text{Total deaths}} \times 100$
2. Gross Reproduction Rate:
Gross Reproduction Rate measures the rate at which a new born
female would, on an average, add to the total female population, if she
remained alive and experienced the age-specific fertility rates till the end
of the child-bearing period. Like the total fertility rate, GRR is also a
hypothetical figure because it does not take into account the mortality
experienced during the period. In the measurement of population growth,
it is appropriate to consider only the female births.
G.R.R =
or
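G.R.R = $5 \sum {}^{f}i_x$, where ${}^{f}i_x = \frac{{}^{f}B_x}{{}^{f}P_x} \times 1000$,
${}^{f}B_x$ is the number of female births to women in age group x and
${}^{f}P_x$ is the female population in that age group.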
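A short sketch of the two growth measures, assuming the vital index is the birth-death ratio expressed per 100 deaths and the gross reproduction rate is five times the sum of the female age-specific fertility rates (per 1000 women, five-year age groups). Function names and figures are hypothetical.

```python
def vital_index(total_births, total_deaths):
    """Births per 100 deaths; above 100 means births exceed deaths."""
    return 100 * total_births / total_deaths


def gross_reproduction_rate(female_births_by_age, females_by_age, width=5):
    """Width of the age groups times the sum of female-birth rates."""
    rates = [
        1000 * b / f for b, f in zip(female_births_by_age, females_by_age)
    ]
    return width * sum(rates)


if __name__ == "__main__":
    print(vital_index(400, 100))
    female_births = [50, 100, 50]  # hypothetical female births by age group
    females = [1000, 1000, 1000]   # hypothetical women in those age groups
    print(gross_reproduction_rate(female_births, females))  # per 1000 women
```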