
Origin and Development of Statistics

Statistics, in a sense, is as old as the human society itself. Its origin can be traced to the
old days when it was regarded as the ‘science of statecraft’ and was the by-product of
administrative activities of the State.
The word Statistics has been derived from the Latin word ‘Status’, the Italian word
‘Statista’ and the German word ‘Statistik’, each of which means a political state.
In ancient times, the government used to collect information regarding the
population and the property or wealth of the country. The former enabled the government
to have an idea of the manpower of the country (to safeguard the nation against
external aggression, if any), and the latter provided a basis for introducing new taxes
and levies.
In India, an efficient system of collecting official and administrative statistics existed
even more than 2000 years ago, particularly during the reign of Chandragupta
Maurya (324-300 BC). From Kautilya’s Arthashastra, it is known that even before 300
BC, there existed a very good system of collecting vital statistics (data relating to
population). During Akbar’s reign (1556-1605 AD), Raja Todar Mal, the then land and
revenue minister, maintained the records of land and agricultural statistics. In the
Ain-i-Akbari, written by Abul Fazl (in 1596-97), one of the nine gems of Akbar, we find
detailed accounts of the administrative and statistical surveys conducted during Akbar’s
reign.
In Germany, the systematic collection of official statistics originated towards the end of
the 18th century. Information regarding population and output (industrial and
agricultural) was collected in order to have an idea of the relative strength of different
areas.
In England, statistics were the outcome of the Napoleonic Wars. The wars necessitated the
systematic collection of numerical data to enable the government to assess revenue
and expenditure with greater precision and then to levy new taxes in order to meet the
cost of war.
The origin of vital statistics was in the 17th century. Capt. John Graunt (1620-74), known
as the father of vital statistics, was the first man to study the statistics of births and
deaths. The computation of mortality tables and the calculation of expectation of life at
different ages led to the idea of life insurance, and the first life insurance institution
was founded in London in 1698. The theoretical development of so-called modern
statistics came during the 17th century with the introduction of the theory of probability and
the theory of games and chance, the chief contributors being the mathematicians and
gamblers of France, Germany and England. Francis Galton (1822-1911), with his works
on regression, pioneered the use of statistical methods in the field of biometry. Karl
Pearson (1857-1936) is the pioneer of correlation analysis. His discovery of the chi-square test,
the first and the most important of the tests of significance, won for statistics a place as a
science.
Sir Ronald A. Fisher (1890-1962), known as the father of statistics, placed
statistics on a very sound footing by applying it to various diversified fields such as
genetics, education, agriculture etc.

Definitions of Statistics:
Statistics has been defined differently by different statisticians from time to time. There
are two reasons for this variety of definitions.
1. In ancient times, statistics was confined only to the affairs of the state. But
in modern times, the field of utility of statistics has widened considerably.
Hence, a number of old definitions which were confined to a very narrow
field of enquiry were replaced by new definitions which are much more
comprehensive and exhaustive.
2. Statistics has been defined in two ways. Some statisticians have defined it as
statistical data, i.e., numerical statements of facts, while others define it as
statistical methods, i.e., the complete body of principles and techniques used in
collecting and analyzing such data.
Some Important Definitions
Statistics as statistical data:
Webster defines statistics as “classified facts representing the conditions of people in a
State – especially those facts which can be stated in numbers or in any other tabular or
classified arrangements.” This definition confines statistics only to the data pertaining
to a State.
Bowley defines statistics as “numerical statements of facts in any department of
enquiry placed in relation to each other.”
An exhaustive definition is given by Prof. Horace Secrist, “By statistics, we mean
aggregate of facts affected to a marked extent by multiplicity of causes numerically
expressed, enumerated or estimated according to reasonable standards of accuracy
collected in a systematic manner for a predetermined purpose and placed in relation to
each other.”
Statistics as Statistical Method:
Bowley defines statistics in the following three ways:
1. Statistics may be called the science of counting.
2. Statistics may be rightly called the science of averages.
3. Statistics is the science of the measurement of social organism.
But none of Bowley’s definitions is adequate. Firstly, statistics is not merely confined to
the collection of data, as other aspects like presentation, analysis and interpretation are
also covered by it. Secondly, averages are only a part of the statistical tools used in the
analysis of data, others being dispersion, skewness, correlation etc. Finally, his third
definition restricts the application of statistics to sociology alone, whereas in modern days
statistics is used in almost all fields.
According to Boddington, “Statistics is the science of estimates and probabilities.”
This definition constitutes only a part of statistical methods.

According to King, “Statistics is the method of judging collective, natural or social
phenomena from the results obtained from the analysis or enumeration or collection
of estimates.”
Lovitt defines statistics as the “Science which deals with the collection, classification
and tabulation of numerical facts as the basis for explanation, description and
comparison of phenomena.”
The best definition is given by Croxton and Cowden. According to them, “Statistics is
defined as the science which deals with the collection, analysis and interpretation of
numerical data.”
Functions of Statistics
1. It presents the facts in a definite form: Statistics presents facts in a precise
and definite form by expressing them in numerical or quantitative terms. For
example, the statement ‘the number of students who passed the statistics paper in NUALS
in the year 2002-03 was higher than that in 2001-02’ will not give a clear idea of
the situation. However, the statement ‘the number of students who passed the
statistics paper in NUALS in the year 2002-03 was 54, as compared to 50 in
2001-02’ conveys definite information.
2. It simplifies a mass of figures: Statistics helps in condensing a mass of data into
a few significant figures. Hence, statistical methods present meaningful
overall information about the mass of data.
3. It helps in formulating and testing of hypotheses: Statistical methods are
extremely helpful in formulating and testing hypotheses to develop new
theories. For example, whether students have benefited from extra coaching can be
tested by appropriate statistical tools.
4. It helps in prediction: Statistical methods provide helpful means of forecasting
future events. For e.g., a cement manufacturer can predict how much cement he
should produce in 2010 based on the demand for it in the current year.
5. It helps in formulation of suitable policies: Statistics provide the basic
material for framing suitable policies for the Government and other agencies.
For e.g., the data regarding population helps in determining the future needs
such as food, clothing etc.
6. It facilitates comparison: Statistical methods make it possible to compare figures of the
same kind. For example, if we know the average marks of students of 2
semesters for a particular subject, we can compare the averages and
conclude which semester’s students performed better in that subject.
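As a rough illustration of functions 2 and 6, the short Python sketch below condenses two sets of marks into a few summary figures and compares their averages. The marks are made-up figures used only for illustration, and the helper name summarize is an assumption of this sketch, not something taken from the notes.

```python
from statistics import mean

# Hypothetical marks in the same subject for two semesters (illustrative only).
semester_1 = [52, 61, 47, 70, 58]
semester_2 = [55, 66, 49, 73, 62]

def summarize(marks):
    """Condense a list of marks into a few significant figures."""
    return {
        "count": len(marks),
        "average": mean(marks),
        "lowest": min(marks),
        "highest": max(marks),
    }

summary_1 = summarize(semester_1)
summary_2 = summarize(semester_2)

# Compare like with like: the average marks of the two semesters.
if summary_1["average"] > summary_2["average"]:
    better = "semester 1"
elif summary_2["average"] > summary_1["average"]:
    better = "semester 2"
else:
    better = "neither (equal averages)"

print(summary_1)
print(summary_2)
print("Higher average:", better)
```

The comparison is meaningful only because both lists measure the same kind of figure (marks in the same subject).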

Limitations of Statistics

Statistics is not suited to the study of qualitative phenomena. Statistics, being a
science dealing with a set of numerical data, is applicable to the study of only
those subjects of enquiry which are capable of quantitative measurement.
Qualitative phenomena like honesty, intelligence, culture etc., which cannot be
expressed numerically, are not capable of direct statistical analysis. However, statistical
techniques may be applied indirectly by first reducing the qualitative expressions to
precise quantitative forms.
E.g., the intelligence of a group of candidates can be studied on the basis of their scores
in a certain test.
Statistics is liable to be misused. The most important limitation of statistics is that it
may be misused by inexperienced and untrained persons. The use of statistical tools by
them leads to fallacious conclusions. One of the greatest shortcomings of statistics is that
it can be manipulated and moulded in any manner to support one’s way of argument
and reasoning. According to the statistician King, “Statistics are like clay of
which one can make God or Devil as one pleases”. If statistical conclusions are based
on incomplete information, one may arrive at a wrong conclusion. E.g., the argument
that drinking beer is bad for longevity because 99% of the persons who take beer die
before the age of 60 is statistically defective because it does not convey what
percentage of persons who do not take beer die before the age of 60.
Statistics does not study individuals. Statistics deals with an aggregate of objects and
does not give any specific recognition to the individual items of a series. E.g., the
individual figures of agricultural production of any country for a particular year are meaningless
unless similar figures of other countries, or of the same
country for different years, are given to facilitate comparison.
To conclude, unless the data are properly collected and critically interpreted, there is
every possibility of reaching wrong conclusions. Statistics is only a tool, that is, a method
of approach. Tools, if properly used, do wonders; if misused, they prove disastrous.
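The beer argument in the limitations above can be made concrete with a small Python sketch. The counts below are invented purely for illustration; the point is only that a rate for one group says little until it is set beside the rate for the comparison group.

```python
# Hypothetical counts (invented for illustration, not real data).
drinkers_total = 1000
drinkers_died_before_60 = 990
non_drinkers_total = 1000
non_drinkers_died_before_60 = 990

def rate(part, whole):
    """Return part/whole as a percentage."""
    return 100.0 * part / whole

drinker_rate = rate(drinkers_died_before_60, drinkers_total)
non_drinker_rate = rate(non_drinkers_died_before_60, non_drinkers_total)

# Only the difference between the two groups says anything about beer.
difference = drinker_rate - non_drinker_rate

print(f"Beer drinkers dying before 60:     {drinker_rate:.1f}%")
print(f"Non-drinkers dying before 60:      {non_drinker_rate:.1f}%")
print(f"Difference (percentage points):    {difference:.1f}")
```

With these assumed counts the two rates are identical, so the 99% figure on its own supports no conclusion about beer.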
Importance of Statistics in Different fields
Importance and scope of statistics
In modern times, statistics is viewed not as a mere device for collecting numerical data
but as a means of developing sound techniques for their handling and analysis for
drawing valid inferences from them. As such it is not confined to the affairs of the state
but is intruding constantly into various diversified spheres of life – social, economic
and political. It is now finding wide applications in almost all sciences – social as well
as physical – such as biology, psychology, education, economics, business management
etc. It is hardly possible to enumerate even a single department of human activity
where statistics does not creep in. It is rather indispensable in all phases of human
endeavour.
1. Statistics and Planning
Statistics is indispensable to planning. In the modern age, which is termed the
age of planning, governments almost all over the world (particularly those of the
budding economies) are resorting to planning for economic development. The
success of planning must be based soundly on the correct analysis of complex
statistical data.

2. Statistics and Economics


Statistical data and statistical methods are of immense help in the proper
understanding of the economic problems and formulation of economic policies.
For e.g. What to produce, how to produce and for whom to produce – these are
the questions that need a lot of statistical data to arrive at correct decisions.
Statistical data and methods are the tools of an economist’s laboratory. Statistics
is the very foundation stone in the theory of exchange. How the national income
is to be calculated and how it is to be distributed cannot be answered without
statistics.
In recent years, “econometrics”, which comprises the application of statistical
methods to theoretical economic models, has been widely used in economic
research (Economics + Mathematics + Statistics = Econometrics).
Statistical methods help not only in framing economic policies but also in
evaluating their effect. As Alfred Marshall, the renowned economist, observed,
“Statistics is the straw out of which I, like every other economist, have to make
bricks.”

3. Statistics and Business


Statistics is an indispensable tool of production control also. Business
executives are relying more and more on statistical techniques for studying the
needs and desires of the consumers and for many other purposes. The success of
a business more or less depends upon the accuracy and precision of a statistical
forecast.
Suppose a businessman wants to manufacture readymade garments. Before
starting the production process, he must have an overall idea as to how
many garments have to be manufactured, how much raw material and labour
are needed for that, and what the quantity, shape, colour, size etc. should be. Thus the
formulation of a production plan in advance is a must, and it cannot be done
without having quantitative facts about the details mentioned above. That is
why most industrial and commercial enterprises employ trained
and efficient statisticians.
4. Statistics and Industry
In industry, statistics is very widely used in quality control. In production
engineering, to find out whether the product conforms to specifications or not,
statistical tools like inspection plans and control charts are used.
In inspection plans, we have to resort to some kind of sampling, which is a very
important aspect of statistics.

5. Statistics and State
Since ancient times the ruling kings and chiefs have relied heavily on statistics
in framing suitable military and fiscal policies. Most of the statistics, such as those
of crimes, military strength, population, taxes etc., that were collected by them
were a by-product of administrative action. The concept of the State has changed from
that of simply maintaining law and order to that of a welfare state. Statistical
data and methods are of great help in promoting human welfare. The state
collects statistics on several problems. These statistics help in formulating
suitable policies. For example, the transport department cannot solve its problems
unless it knows how many buses are operating at present, what the total
requirement is, etc.

6. Statistics and natural science
Statistical techniques have proved to be extremely useful in the study of all
natural sciences like biology, zoology, medicine etc. In diagnosing diseases, a
doctor has to rely heavily on actual data like body temperature, pulse rate
etc. Similarly, in judging the efficacy of a particular drug against a
certain disease, experiments have to be conducted, and its success or failure
would depend upon the number of people who are cured after using the drug.

7. Statistics and research
Statistics is indispensable in research. Most of the advancements in knowledge
have taken place because of experiments conducted with the help of statistical
methods.
8. Statistics and other uses
Statistics are useful to bankers, brokers, insurance companies, social workers,
labour unions, politicians etc.
E.g., politicians and their supporters are immensely interested in knowing their
prospects of winning an election. By sampling a few voters prior to the
election, the percentage of the votes the candidate will receive in the election
can be estimated. This estimated percentage could be used to decide whether a
greater effort is required to assure success. Similarly, the premium rates of
life insurance companies are based upon a very careful study of the expectancy of
life.
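The election example can be sketched in Python as follows. This is only a minimal illustration: the poll counts are hypothetical, the function name estimate_vote_share is an assumption of this sketch, and the margin of error uses the usual normal approximation, which presumes a simple random sample of voters.

```python
import math

def estimate_vote_share(votes_for, sample_size, z=1.96):
    """Estimate a candidate's vote share from a sample of voters.

    Returns the sample proportion and an approximate 95% margin of error
    (normal approximation; assumes a simple random sample).
    """
    p = votes_for / sample_size
    margin = z * math.sqrt(p * (1 - p) / sample_size)
    return p, margin

# Hypothetical poll: 220 of 400 sampled voters favour the candidate.
share, margin = estimate_vote_share(220, 400)
print(f"Estimated share: {share:.1%} +/- {margin:.1%}")
```

If the lower end of the interval (share minus margin) falls below one half, the candidate may reasonably conclude that more effort is needed.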
Statistical Survey
A survey is a process of collecting data from existing population units with no
particular control over the factors that may affect the population characteristics of
interest in the study. For example, in a study of the salary of workers in a factory, the salary
may be affected by a number of factors such as educational level, nature of job etc. As
we collect info about workers’ salaries, we have no control over these factors; they happen
to be the existing attributes of the workers. A statistical survey may be either a general
purpose survey or a specific purpose survey (also known as special purpose survey). In
general purpose survey, we may obtain data which are useful for several purposes; for
e.g., population census. Such survey provides info not only about the total population,
but also about its divisions into males and females, literates and illiterates, employed
and unemployed etc. A special purpose survey is that in which the data obtained are
useful in analyzing a particular problem only.

A statistical survey passes through several stages before completion, starting from
planning the survey and ending with writing the final report. These stages can be
summarized under two broad heads: Planning the survey and executing the survey.

Planning the survey


Proper planning of a survey is very important because the quality of survey
research depends on the preparations made before the survey is conducted. The matters
which require careful consideration at the planning stage are:
1. Statement of the problem / Purpose of the survey:
Purpose of the survey should be clearly set out at the beginning. It will
necessitate a clear statement of the problem indicating what we are interested in
determining. The object of an enquiry may be either to collect specific
information relating to a particular problem or adequate data to verify a given
proposition or to test a hypothesis.
2. Scope of the survey:
Once the purpose of the survey has been clearly stated, the next step is to decide
about the scope of the survey, i.e., its coverage with regard to the type of
information, the subject matter, and the geographical area. For e.g.; an enquiry
relating to the socio-economic conditions of industrial workers may be
undertaken with the help of data relating to age, family details, income,
expenditure etc. Likewise, an enquiry may relate to India as a whole or a
particular state or an industrial town.
Three factors exert great influence on the scope of the survey:

a. Object of the enquiry

b. Availability of time

c. Availability of resources

The investigation should be carried out within a reasonable period of time;
otherwise the information collected may be outdated. For e.g., if a commission
is set up to recommend DA (Dearness Allowance) on the basis of the rise in
price, and the commission takes more than 3 years to submit the report, there is
every possibility of its findings being outdated.
3. Unit of data collection
Before collecting the data, the statistical unit must be clearly defined for the
purpose of investigation (Statistical unit is the unit in terms of which the
investigator selects the attributes for the enumeration, analysis and
interpretation); for e.g., in a population census, the statistical unit is a person.
However, the problem of defining the unit is not as simple as it appears to be.
For example, if we want to study the size of sugar mills, there are different
criteria for measuring the size of a sugar mill, such as capital employed, number of
persons employed, total production etc. The investigator has to select one of these for
classification and then proceed to collect the necessary info.
While fixing the statistical unit for an enquiry, it is useful to keep in view the
following points:

1. Units must suit the purpose of the study.


2. It should be simple to understand.
3. It should be specific.
4. It should be stable in character.
4. Sources of data collection
The sources of info may be either primary or secondary. Primary data are original
in character and are also called first-hand information, whereas secondary data
are collected from published or unpublished sources. Data which are primary
in the hands of one person become secondary for another person. For
example, suppose an investigator wants to collect some info regarding the smoking habits
of students in NUALS. If the investigator approaches the students directly and
collects the info, it constitutes primary data for the investigator. Suppose
similar data have already been collected by the Students Council of NUALS and the
investigator approaches the Students Council members and collects the info;
such data constitute secondary data for the investigator.
Primary sources of collecting data: Questionnaires; Interview; Schedule
method; Observation; Correspondence.
5. Technique of Data collection
There are two techniques of data collection – census method and sample
method. A census is a complete enumeration of each and every unit of the
universe. In sample method, only a part of the universe is studied and the
conclusions about the entire universe are drawn on that basis. The choice
between the census and sample methods depends upon the availability of
resources, time factor, degree of accuracy desired and nature and scope of the
enquiry.
6. Frame
Frame refers to a listing of all units in the population under study. For example, if we
want to find out the number of workers in small-scale industries in Delhi, we
must have a complete list of names and addresses of all the small-scale industries.
This list of names and addresses is called a frame. To a considerable extent, the
whole structure of the enquiry is determined by the frame. Frames may be
inaccurate, incomplete, subject to duplication, inadequate or out of date. It
is therefore essential at the outset of the survey to carry out a careful
investigation of the frame.
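A first check of a frame for duplication and incompleteness might look like the Python sketch below. The unit names and addresses are invented, and check_frame is a hypothetical helper written for this illustration; a real investigation of a frame would also have to look for missing and out-of-date units, which no purely mechanical check can find.

```python
from collections import Counter

# Hypothetical frame: (unit name, address) records; all entries are invented.
frame = [
    ("Alpha Textiles", "12 Mill Road"),
    ("Beta Plastics", "4 Market Street"),
    ("Alpha Textiles", "12 Mill Road"),   # duplicate entry
    ("Gamma Foods", ""),                  # incomplete entry (no address)
]

def check_frame(records):
    """Return records listed more than once and records with a blank field."""
    counts = Counter(records)
    duplicates = [rec for rec, n in counts.items() if n > 1]
    incomplete = [rec for rec in records
                  if not all(field.strip() for field in rec)]
    return duplicates, incomplete

duplicates, incomplete = check_frame(frame)
print("Duplicates:", duplicates)
print("Incomplete:", incomplete)
```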
7. Degree of Accuracy Desired

The investigator has to decide about the degree of accuracy that he wants to
attain. It may be pointed out that absolute accuracy is not possible in
statistical work because a) statistics is based on estimates; b) tools of
measurement are not always perfect; c) there may be unintentional bias on the
part of the investigator, enumerator or the informant.

8. Miscellaneous considerations:
Considerations should be given to various other matters such as whether the
enquiry is:
a. Official, semi-official or non-official;
b. Confidential or non-confidential;
c. Regular or ad hoc.
Executing the Survey
After a plan of data collection has been prepared, the next step is to execute the survey.
The various phases of the work subsequent to the planning stage are:
1. Setting up an administrative organization:
The administrative organisation required for an enquiry will depend very much
on the nature and scope of the enquiry. When the enquiry covers a large area,
supervision from a central office is likely to be difficult and in such cases, it is
best to establish regional offices.
2. Design of forms:
Careful attention should be given to the designing of the various forms that will be
used in the course of the enquiry, especially the questionnaire.
3. Selection, Training and the Supervision of the Field Investigators:
In most surveys, the data are collected through enumerators who
work on a part-time or full-time basis. The success of the survey depends upon the
field investigators. So, it is essential that they are properly selected, thoroughly
trained and closely supervised.
The enumerators should be honest, intelligent, hard-working and able to create
a friendly atmosphere and make the respondent feel at ease. They must speak the
language of the respondent, ask the questions properly and intelligently, and
record the responses accurately and completely.
After having selected the enumerators, the next step is to give them proper
training. The enumerators should know the purpose of the survey, the manner in
which the data are to be collected and how the interview should be conducted. They
should know the definitions of the terms used in the questionnaire or schedule;
for example, for a question on the nature of family: nuclear family (not exceeding 5
members), medium family (6-10 members) and joint family (more than one family living
together). It is also necessary to watch carefully the work of the enumerators.
The supervision should be carried out by superior staff (better paid, better qualified
and more experienced).

4. Control over the quality of field work and field edit
The field check should be carried out by the supervisors, and it should be conducted in
such a manner that the investigators do not have prior knowledge of the work going to
be checked.
After the work of collecting data is completed, the questionnaires or schedules are handed
over by the enumerators to the supervisor. While in the field, the supervisor should
scrutinize these to check for omissions, inconsistencies etc. This editing is highly useful
because (1) unless the questionnaires are edited on the spot, the need for further
information to correct some of the wrong entries made by the enumerators may only be
discovered when the enumerators have moved to another area;
(2) if the errors are discovered at this stage, the enumerators can be instructed not to
make such errors in the future.
5. Processing of Data
After the data have been collected, the efforts shift from the field to the office.
The data are to be given a thorough check, coded, transferred to cards and
tabulated. The process of coding involves translating responses into numerical
terms in order to facilitate the analysis. For example, the sex of the respondent may
be coded as male = 1, female = 2. After the material is edited and coded, it is ready
for analysis, which can be performed either by hand or by machine.
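The coding and tabulation steps can be sketched in Python as below. The code book follows the male = 1, female = 2 example in the text; the raw responses and the function name code_responses are assumptions made for this illustration, as is the choice to record unrecognised answers as None.

```python
from collections import Counter

# Code book following the text: male -> 1, female -> 2.
SEX_CODES = {"male": 1, "female": 2}

def code_responses(responses, code_book):
    """Translate text responses into numeric codes.

    Answers are trimmed and lower-cased before lookup; anything not in
    the code book is coded as None so it can be checked by hand.
    """
    return [code_book.get(r.strip().lower()) for r in responses]

# Hypothetical raw responses as they might be recorded in the field.
raw = ["Male", "female", "Female ", "male", "not stated"]

coded = code_responses(raw, SEX_CODES)
tally = Counter(coded)      # simple tabulation of the coded responses

print(coded)
print(tally)
```

Keeping unrecognised answers visible, rather than silently dropping them, is what allows the thorough check mentioned above.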
6. Preparation of Report
After the data have been collected and analyzed, it is usually necessary to
embody the results of the survey in the form of a report. The preparation of
report therefore constitutes the final step in the execution. The following aspects
of the survey should be highlighted in the report.
a. Statement of the purpose of the survey
A general indication of the purpose of the survey should be given in the
report.
b. Description of the coverage
An exact description of the geographical region and the branch of economic or
social activity covered by the survey should be given in the report.
c. Collection of Information should be reported
The method of collecting data should be briefly explained and the copy of
questionnaire or schedule which is used for survey should be attached in the
final report.
d. Numerical Result
A general indication should be given about the methods followed in the
derivation of numerical results.
e. Miscellaneous Consideration

It is also important to touch upon such aspects as the period to which the data refer,
time taken for the field survey, the reference of the available reports,
journals, publications etc.

Collection of Data
Data may be obtained from either a primary source or a secondary source. Primary
data means data collected by the investigator himself. Such data are original in
character. Secondary data, by contrast, are data which are not originally collected but
rather obtained from published or unpublished sources. Data which are primary in the
hands of one become secondary in the hands of another. For example, suppose an
investigator wants to collect data about the smoking habits of students in NUALS. If
the investigator collects the data himself or through his agents, adopting any suitable
method, the data would constitute primary data for him. On the other hand, if the
student council has already made a similar survey and the investigator or his agent
obtains the data from its office, such data would constitute secondary data for him.
Advantages of Secondary Data
1. It is highly convenient to use info which someone else has compiled. There
is no need for printing data collection forms, appointing enumerators,
editing and tabulating the results.
2. Secondary data are much quicker to obtain than the primary data.
3. Secondary data may be available on some subjects where it would be
impossible to collect primary data.
The choice between primary and secondary data depends on:
1. Nature and scope of the enquiry
2. Availability of financial resources
3. Availability of time
4. Degree of accuracy desired
5. Collecting agency
Methods of Collecting Primary Data
1. Direct Personal Interview: Under this method, there is a face to face contact
with the persons from whom the info is to be obtained (informants). The
interviewer asks them questions pertaining to the survey and collects the desired
information.
Merits of Direct Personal Interview
a. Response is more encouraging because most people are willing to supply
info when approached personally.

b. The info obtained by this method is more accurate because the interviewer can
clear the doubts of the informants about certain questions.
c. It is also possible to collect supplementary info about the informant’s personal
characteristics and environment.
d. The questions about which the informant is likely to be sensitive can be
carefully sandwiched between other questions by the interviewer.
e. The language of communication can be adjusted to the status and educational
level of the person interviewed.
Limitations
a. It may be very costly where the number of persons to be interviewed is very
large and they are spread over a wide area.
b. The interviewers have to be thoroughly trained and supervised; otherwise they
may not be able to obtain the desired info.
c. More time is required for collecting info by this method as compared to other
methods because interviews can be held only at the convenience of the
informants.
Indirect Oral Interviews
Under this method, the investigator contacts third parties (known as witnesses)
capable of supplying the necessary information. This method is generally adopted in
those cases where the information to be obtained is complex in nature and the
informants are not willing to respond if approached directly.
The correctness of information obtained depends upon:
1. The type of person whose evidence is being recorded: If the people do not know
the full facts of the problem under investigation, it will not be possible to arrive
at correct conclusions.
2. The ability of the interviewers to draw out the info from the witness by means
of appropriate questions and cross-examinations.
3. The honesty of the interviewers who collect the info
Information through Correspondents
Under this method, the investigator appoints local agents or correspondents in
different places to collect information. These correspondents collect the information
and transmit it to the central office where the data are processed. Newspaper agencies
usually adopt this method.
Mail Questionnaire Method
Under this method, a list of questions pertaining to the survey is prepared and
sent to various informants by post. A request is made to the informants through a
covering letter to fill up the questionnaire and send it back within a specified time.

The main advantages are:
1. This method can be easily adopted where the field of investigation is very vast and the
informants are spread over a wide geographical area.
2. On questions of a personal nature, this method is generally superior to other methods.
The major limitations are:
1. This method can be adopted only where the informants are literate.
2. It involves some uncertainty about the response. Cooperation on the part of informants
may be difficult to presume.
3. The information supplied by the informants may not be correct, and it may be difficult
to verify its accuracy.
The success of this method depends upon the skill with which the questionnaire is drafted
and the extent to which the willing cooperation of the informants is secured.
To make this method work effectively, the following suggestions are made:
1. The questionnaire should be so framed that it does not become an undue burden on the
respondents.
2. A self-addressed stamped envelope should be attached.
3. The sample should be large.
4. A gift coupon may be attached along with the questionnaire.
Schedule method
Under this method, the enumerators contact the informants, get replies to the questions
contained in the schedule and record them in their own handwriting.
The essential difference between the mailed questionnaire method and the schedule method
is that in the former the questionnaire is sent to the informants by post and
is filled in by the informants themselves, whereas in the latter the enumerators carry the schedule
personally to the informants and fill in the questionnaire/schedule themselves.
Merits
It can be adopted in the case of illiterates.
There is very little non-response, as the enumerators go personally to the field.
The information collected is more reliable.
Demerits
Compared to other methods, this method is very costly because the enumerators are
generally paid persons.
The success of this method depends upon the training imparted to the enumerators.
Census and Sampling

Under the census method or complete enumeration survey method, data are
collected from each and every unit of the population or universe. For example, if an
investigator wants to calculate the average wage of workers in a particular factory, he
should collect the data relating to the wages of each and every worker in the factory.

Merits of Census Method


1. Data are obtained from each and every unit of the population.
2. The results obtained are likely to be more representative, accurate and reliable.
3. It is an appropriate method of obtaining information on certain things such as the age
group of workers, educational level, etc.
Sampling
Sampling is simply the process of learning about the population on the basis of
the sample drawn from it. Thus, in the sampling technique, instead of every unit of the
universe, only a part of the universe is studied and the conclusions are drawn on that
basis for the entire universe.
For example, a housewife examines only 2 or 3 grains of boiling rice to know
whether the entire pot of rice is ready or not.
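The contrast between a census and a sample can be sketched in a few lines of Python. The wage figures below are randomly generated and purely hypothetical, as are the population size of 500 and the sample size of 50; the sketch only illustrates that the sample mean is an estimate of the census mean obtained from far fewer units.

```python
import random
from statistics import mean

rng = random.Random(4)                      # fixed seed so the run is repeatable

# Hypothetical daily wages of 500 workers (the whole universe).
wages = [rng.randint(300, 900) for _ in range(500)]

census_mean = mean(wages)                   # census: every unit is measured
sample = rng.sample(wages, 50)              # sampling: only 50 units are measured
sample_mean = mean(sample)                  # estimate of the census mean

print(census_mean, sample_mean)
```

The two means will generally differ a little; that difference is the price paid for studying only a part of the universe.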
Essentials of sampling
For the sample results to have any meaning, it is necessary that a sample should
possess the following essentials:
1. Representativeness: A sample should be so selected that it truly represents the
universe. To ensure representativeness, a random method of selection should be used.
2. Adequacy: The size of the sample should be adequate; otherwise it may not
represent the characteristics of the universe.
3. Independence: All items of the sample should be selected independently of one
another, and all items of the universe should have the same chance of being
selected in the sample.
4. Homogeneity: There should be no basic difference in the nature of the
units of the universe and those of the sample.

Methods of Sampling
Probability: Simple, Stratified, Systematic, Cluster
Non-Probability: Judgement, Convenient, Quota

Various methods of sampling can be grouped under two broad heads:

a) Probability sampling (random sampling): simple random, stratified, systematic and
cluster sampling
b) Non-probability sampling (non-random sampling): judgement, convenient and quota sampling
Probability sampling methods are those in which every item in the universe has a known
chance or probability of being chosen for the sample. This implies that the selection of
sample items is independent of the person making the study.
Non-probability sampling methods are those which do not provide every item in the
universe with a known chance of being included in the sample.
Different Methods of Probability sampling are:
1. Simple Random Sampling
2. Stratified Random Sampling
3. Systematic Sampling
4. Cluster Sampling
Simple Random Sampling
Simple random sampling refers to that sampling technique in which each and
every unit of the population has an equal opportunity of being selected in the sample.
Two methods are used to select the sample:
1. Lottery method
2. Random Number Table method

Lottery Method: In the lottery method, all items of the universe are numbered or named
on separate slips of paper of identical size, shape, colour, etc. These slips are then
folded and mixed up in a bowl. A blindfold selection is then made of the number of
slips required to constitute the desired sample size.
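The lottery method can be imitated in a short Python sketch. The function name lottery_sample and the universe of 60 numbered units are assumptions made for illustration only; shuffling the list stands in for mixing the slips in the bowl, and the draw is made without replacement.

```python
import random

def lottery_sample(units, sample_size, seed=None):
    """Draw sample_size units without replacement, like slips from a bowl."""
    rng = random.Random(seed)     # a fixed seed makes the draw repeatable
    slips = list(units)           # one slip per unit of the universe
    rng.shuffle(slips)            # mix the slips in the bowl
    return slips[:sample_size]    # blindfold draw of the required number of slips

population = list(range(1, 61))   # hypothetical universe of 60 numbered units
sample = lottery_sample(population, 10, seed=1)
print(sample)
```

Every unit has the same chance of ending up among the first ten slips, which is the defining property of simple random sampling.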
Merits:
1. Since the selection of items in the sample depends entirely on chance, there is
no possibility of personal bias affecting the results.
2. As the size of the sample increases, it becomes increasingly representative of the
population.
Demerits:
1. The use of simple random sampling necessitates a completely catalogued
universe from which to draw the sample. But it is often difficult for the
investigator to have an up-to-date list of all the items of the population to be
sampled.

2. From the point of view of field survey, it has been claimed that cases selected
by random sampling tend to be too widely dispersed geographically and that the
time and cost of collecting data become too large.

Stratified Sampling
Under this method, the universe is subdivided into different groups (strata) and a
sample is then chosen independently from each group by either the lottery method or the
random number table method. Stratification is based on some common characteristic of the
data. For example, if we want to collect data regarding the consumption pattern of
people in India, the country is divided into different states. The states are then divided
into districts, the districts into zones, the zones into wards, and so on. From each
part, a sample may be taken at random.
The next step is to decide the sample size within each stratum. Usually proportionate
stratified sampling is used. It means that the number of items drawn from each stratum
is proportional to the size of the stratum. For example, suppose the population is divided
into three groups, A, B and C, consisting of 300, 600 and 900 people respectively, and a
total sample of 600 is to be selected from these three groups.
Based on the proportionate stratified sampling technique, with a total population of
300 + 600 + 900 = 1800:
A = (300 x 600)/1800 = 100
B = (600 x 600)/1800 = 200
C = (900 x 600)/1800 = 300
So 100 units are selected from Group A, 200 from Group B and 300 from Group C.
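The proportionate allocation worked out above can be reproduced with a small Python sketch. The function name proportional_allocation is an assumption made for illustration; the stratum sizes and the total sample of 600 are the figures from the example.

```python
def proportional_allocation(stratum_sizes, total_sample):
    """Return the sample size for each stratum, proportional to stratum size."""
    population = sum(stratum_sizes.values())        # 300 + 600 + 900 = 1800
    return {
        name: total_sample * size // population     # (stratum size x sample size) / population
        for name, size in stratum_sizes.items()
    }

strata = {"A": 300, "B": 600, "C": 900}
allocation = proportional_allocation(strata, 600)
print(allocation)   # {'A': 100, 'B': 200, 'C': 300}
```

Integer division is used here because the example divides exactly; when it does not, the shares are rounded down and may need a small adjustment to add up to the total sample size.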
Merits
1. Since the population is first divided into various strata and a sample is then
drawn from each stratum, there is little possibility of any essential group of the
population being completely excluded. (More representativeness)
2. Each stratum is so framed that it consists of uniform or homogeneous items. So,
greater accuracy is achieved in the selection of samples. (Greater accuracy)
3. As compared to simple random sampling, stratified samples can be more
geographically concentrated, i.e., units from the different strata may be selected
in such a way that all of them are localised in one geographical area.
The main disadvantage of this method is that, if proper stratification of the
population is not done, the sample may be biased.
Systematic Sampling
A systematic sample is formed by selecting one unit at random and then selecting additional units
at evenly spaced intervals until the sample has been formed. This method is popularly
used in those cases where a complete list of the population from which the sample is to
be drawn is available.
The list may be prepared in alphabetical, geographical, numerical or some other
order. The items are serially numbered. The first item is selected at random, by the lottery
method, from among the first k items. Subsequent items are selected by taking every kth item
from the list. Here k refers to the sampling interval or sampling ratio, i.e., the ratio of
the population size to the size of the sample:
k = N/n
where k is the sampling interval, N is the size of the universe and n is the sample size.
The merits of this method are that it is simple and convenient to adopt. The time and work
involved in sampling by this method are relatively less.
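A minimal Python sketch of systematic sampling with interval k = N/n follows. The function name systematic_sample and the list of 100 units are assumptions for illustration; the sketch also assumes that n is not larger than N and takes k as the whole-number part of N/n.

```python
import random

def systematic_sample(units, sample_size, seed=None):
    """Pick a random start among the first k units, then take every k-th unit."""
    rng = random.Random(seed)
    k = len(units) // sample_size            # sampling interval k = N / n
    start = rng.randrange(k)                 # random position among the first k items
    return units[start::k][:sample_size]     # every k-th item from the start onwards

population = list(range(1, 101))             # hypothetical serially numbered list, N = 100
sample = systematic_sample(population, 10, seed=2)
print(sample)
```

With N = 100 and n = 10 the interval is 10, so the selected serial numbers always differ by exactly 10; only the starting point is left to chance.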
Cluster or Multi-Stage Sampling
Under this method, a random selection is made of primary, intermediate and final
units from a given population or stratum.
The sampling process is carried out in several stages.
At the first stage, units are sampled by some suitable method such as simple random
sampling.
Then a sample of second stage units is selected from each of the selected first stage
units, again by some suitable method, which may be the same as or different from the
method employed for the first stage units. Further stages may be added as required. For
example, suppose we want to take a sample of 500 households from the state of UP.
At the first stage, the state may be divided into a number of districts and a few districts
selected at random.
At the second stage, each selected district may be subdivided into a number of villages
and a sample of villages may be taken at random.
At the third stage, a number of households may be selected from each of the villages
selected at the second stage.
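The three stages described above can be sketched in Python as nested random draws. Everything in the sketch is hypothetical: the function name multistage_sample, the small frame of 5 districts with 4 villages each and 20 households per village, and the numbers drawn at each stage. Simple random sampling is assumed at every stage, although a different method could be used at each one.

```python
import random

def multistage_sample(state, n_districts, n_villages, n_households, seed=None):
    """Three-stage sample: districts, then villages, then households."""
    rng = random.Random(seed)
    chosen = []
    for district in rng.sample(sorted(state), n_districts):           # first stage
        villages = state[district]
        for village in rng.sample(sorted(villages), n_villages):      # second stage
            households = villages[village]
            for household in rng.sample(households, n_households):    # third stage
                chosen.append((district, village, household))
    return chosen

# Hypothetical frame: 5 districts, 4 villages per district, 20 households per village.
state = {
    f"D{d}": {f"D{d}-V{v}": [f"D{d}-V{v}-H{h}" for h in range(20)] for v in range(4)}
    for d in range(5)
}

sample = multistage_sample(state, n_districts=2, n_villages=2, n_households=5, seed=3)
print(len(sample))   # 2 districts x 2 villages x 5 households = 20
```

Only the villages of the selected districts, and only the households of the selected villages, are ever looked at, so a complete list of households for the whole state is not needed.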
The advantages of this method are:
1. It introduces flexibility in the sampling method.
2. Sub-division into second stage units needs to be carried out only for those first stage
units which are included in the sample.

Non-Probability Sampling
1. Judgement Sampling: In this method, the choice of the sample items
depends exclusively on the judgement of the investigator.
In other words, the investigator exercises his judgement in the choice and
includes those items in the sample which he thinks are most typical of the
universe with regard to the characteristics under investigation.

For example, if a sample of ten students is to be selected from a class of 60 for analysing the
spending habits of the students, the investigator would select the 10 students who in his
opinion are representative of the class.
Limitations
This method is not scientific because the choice of population units to be sampled may be
affected by the personal prejudice or bias of the investigator. For example, if an
investigator holds the view that the wages of workers in a certain establishment are
very low and if he adopts judgement sampling method, he may include only those
workers whose wages are low and thereby establish his point of view which may be far
from the truth.
Convenient Sampling
A convenient sample is obtained by selecting convenient population units. Such a sample
is also called a ‘chunk’. A chunk refers to that fraction of the
population being investigated which is selected neither by probability nor by judgement
sampling, but by convenience.
A sample obtained from a readily available list, such as a telephone directory, is a convenient
sample. For example, if a person has to submit a project report on labour-management
relations in the textile industry and he takes a textile mill close to his office and interviews
some people there, he is following the convenient sampling method.
Convenient sampling is often used for making a pilot study or pre-testing a
questionnaire.
Quota Sampling
It is a type of judgement sampling and a commonly used sampling technique in the
non-probability category. In a quota sample, quotas are set up according to some specified
characteristics. Each interviewer is then asked to interview a certain number of persons,
which constitutes his quota. Within the quota, the selection of sample items depends on
personal judgement. For example, in a radio listening survey, the interviewer may be
asked to interview people living in a certain area. Quotas may consist of housewives,
farmers, children, etc. Within these quotas, the interviewer is free to select the sample.
Quota sampling and stratified sampling are similar. In both methods, the
universe is divided into different parts and a sample is selected from each part. The
only difference is that in stratified random sampling, the sample within each stratum is
selected at random, whereas in quota sampling, the sample within each quota is not selected
at random.
Merits of Sampling
1. Less time consuming: Since a sample survey studies only a part of the population,
considerable time and labour are saved when a sample survey is carried out.
Time is saved not only in collecting the data, but also in processing them.

2. Less cost: The total financial burden of a sample survey is generally less than
that of complete enumeration. This is because, in sampling, we
study only a part of the population and the total expense of collecting data is
less than that required in the census method.

3. More detailed information: Since sampling techniques save time and labour, it is
possible to collect more detailed information in a sample survey.

4. The sampling method is the only method that can be used in certain cases. For
example, if an investigator is interested in testing the breaking strength of chalks
manufactured in a factory, under the census method all the chalks would be broken
in the process of testing.

Limitations of Sampling
1. A sample survey must be carefully planned and executed. Otherwise, the results
obtained may be inaccurate and misleading.
2. Sometimes the sampling plan may be so complicated that it requires more
time, labour and money than a complete count. This is particularly so when the size of the
sample is a large proportion of the total population.
Classification and Tabulation of Data
Classification of Data
The process of arranging things in groups or classes according to their common
characteristics is called classification of data. According to Secrist, “Classification is
the process of arranging data into sequences and groups according to their common
characteristics or separating them into different but related parts.”
Requisites of a Good Classification
The main characteristics of a good classification are:
1. It should be exhaustive: Classification must be exhaustive in the sense that
each and every item in the data must belong to one of the classes.
2. It should be unambiguous: Classification is meant for removing ambiguity. It
is necessary that various classes should be so defined that there is no room for
doubt or confusion.
3. It should be mutually exclusive: Each item of the given data should fit only in
one class. In other words, the classes must not overlap.
4. It should be homogeneous: The items included in each class must be
homogeneous. If they are not, the class should be further divided into sub-groups.
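The first and third requisites can be checked mechanically, as the following Python sketch suggests. The function names classify and check_classification, the marks and the class limits are all hypothetical; each class is assumed to include its lower limit and exclude its upper limit.

```python
def classify(value, classes):
    """Return the labels of every class whose interval [low, high) contains value."""
    return [label for label, (low, high) in classes.items() if low <= value < high]

def check_classification(data, classes):
    """Find items that fall in no class and items that fall in more than one class."""
    unplaced = [x for x in data if len(classify(x, classes)) == 0]      # not exhaustive
    overlapping = [x for x in data if len(classify(x, classes)) > 1]    # not mutually exclusive
    return unplaced, overlapping

marks = [12, 35, 48, 50, 67, 99]                    # hypothetical marks out of 100
classes = {"0-49": (0, 50), "50-100": (50, 101)}    # hypothetical class limits

unplaced, overlapping = check_classification(marks, classes)
print(unplaced, overlapping)   # [] []
```

Two empty lists mean that every item belongs to exactly one class, so the classification is both exhaustive and mutually exclusive for these data.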
Purpose of Classification of data
1. It condenses the mass of data and ignores the unnecessary details, thereby
making the data easier to study.

2. It facilitates comparison between data.
3. It helps in studying the relationship between several characteristics.
4. It facilitates further statistical treatments.
5. It helps in preparing the data for tabulation.
6. It presents facts in a simple form.
7. It brings out clearly the points of similarity and dissimilarity.
Types of Classification
1. Quantitative Classification: When the basis of classification is
differences in quantity, the classification is called quantitative classification. In
other words, quantitative classification is made according to numerical size. It
is based on characteristics which are capable of quantitative measurement, such as the
height, weight or marks obtained of individuals. Here, height, weight, etc. are variables
and the number of persons in each class is the frequency.
2. Temporal / Chronological Classification: When the basis of
classification is differences in time, the classification is called
temporal or chronological classification. For example, the students who got first
division during the last three years are classified year-wise.
3. Spatial / Geographical Classification: When the basis of classification is
geographical location or place, the classification is called spatial
or geographical classification. For example, the crime rate in different states.
4. Qualitative Classification: When the basis of classification is
characteristics or attributes such as social status, it is called qualitative
classification. For example, educated and uneducated persons, married and
unmarried persons.
Qualitative classification is of two types:
1. Simple classification
2. Manifold classification
If the data are classified into only two categories according to the presence or absence
of a single attribute, the classification is known as simple, twofold or
dichotomous classification. For example, the population of India may be divided into males
and females. Manifold classification is a classification in which two or more attributes
are involved. For example, when the male and female populations are each further subdivided
into literates and illiterates, there are two attributes under study, sex and literacy.
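The difference between simple and manifold classification can be seen by counting the same records in two ways. The eight records in this Python sketch are invented for illustration only.

```python
from collections import Counter

# Hypothetical records: (sex, literacy) for eight persons.
persons = [
    ("male", "literate"), ("female", "literate"), ("male", "illiterate"),
    ("female", "illiterate"), ("male", "literate"), ("female", "literate"),
    ("female", "literate"), ("male", "illiterate"),
]

simple = Counter(sex for sex, _ in persons)   # one attribute: two classes
manifold = Counter(persons)                   # two attributes: four classes

print(simple)
print(manifold)
```

The simple classification gives two classes (males and females), while the manifold classification splits each of them again by literacy, giving four classes.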
Tabulation of Data
The last stage in the compilation of data is tabulation. After the data have been
collected and classified, it is essential to put them in the form of tables. Tabulation is a
scientific process used for setting out the collected data in an understandable form.

According to Prof. Cuttle, tabulation is “the logical listing of related quantitative data in vertical
columns and horizontal rows of numbers with sufficient explanatory and qualifying
words, phrases and statements in the form of title, headings and explanatory notes to
make clear the full meaning, context and origin of the data.”
Objectives of Tabulation
1. To simplify the complex data: In the process of tabulation, the unnecessary
details are avoided. All tabular data are presented in such a manner that they
become more meaningful and can be easily understood by a common man.
2. To clarify the objective of investigation: The purpose of tabulation is to
arrange the data in an easily accessible form, so that they provide the answers to the
questions with which the investigation is concerned.
3. To facilitate comparison: It facilitates comparison of data shown in rows and
columns. Sometimes, comparable figures are placed in columns or rows.
4. To depict trend and pattern of data: Tabulation of data shows the trend of the
information under study. It reveals patterns within the figures which cannot be
understood in a descriptive form of presentation.
5. To help reference for future studies: Data arranged in tables with titles and
table numbers can be easily identified and made use of as source reference for
future use and studies.
6. To facilitate statistical analysis: It is only after classification and tabulation
that the statistical data becomes fit for analysis and interpretation. Various
statistical measures such as averages, dispersion, correlation etc can be
calculated from the data which is systematically classified and tabulated.
Difference between Classification and Tabulation
The basic points of difference between classification and tabulation, although
the two are closely related, are as given below:
1. Classification of data is a process of statistical analysis while tabulation is a
process of presentation.
2. Classification is the basis for tabulation because the data is classified first
and then tabulated.
3. In classification, the data is divided into various groups and sub-groups
based on their similarities and dissimilarities, while tabulation is a process
of arranging the classified data in rows and columns with suitable heads and
sub-heads.
Essential Parts of a Statistical Table
1. Table Number: A table should be numbered for identification, especially, when
there are a large number of tables in a study. The number may be put at the
centre above the title.

2. Title of the Table: Every table should have a title. It should be clear, brief and
self-explanatory. The title should be set in bold type so as to give it prominence.
3. Stub / Row Heading: Each row of the table must have a heading. The headings
of the rows are called stubs. Stubs clarify the figures in the rows. As far as
possible, the items should be condensed so that they can be included in a single
row.
4. Caption / Column Heading: A table has many columns, and the headings
of the columns are called captions or column headings. They should be well-defined
and brief.
5. Body of the Table: It is the most vital part of a table. It contains the numerical
values. It should be made as comprehensive as possible. The actual data should
be arranged in such a manner that any figure may be readily located.
6. Unit of Measurement: The unit of measurement should be stated along with
the title, if this is uniform throughout. If different units have been adopted, then
they should be stated along the stub or caption.
7. Source Notes: A note at the bottom of the table should always be given to
indicate the primary source as well as the secondary source from where the data
has been taken, particularly when there is more than one source.
8. Footnotes and References: These are always placed at the bottom of the table. A footnote
is a statement explaining some specific item which cannot be
understood by the reader from the title, captions and stubs.
