
MB0034-Unit 1-An Introduction to Research

Meaning and Definition of Research

Research simply means a search for facts: answers to questions and solutions to
problems. It is a purposive investigation and an organized inquiry. It seeks to explain
unexplained phenomena, to clarify doubtful facts and to correct misconceived facts.

The search for facts may be made through either:

• Arbitrary (or unscientific) Method: This method seeks answers to questions through
imagination, opinion, blind belief or impression. E.g. it was once believed that the
earth was flat, and that a big snake swallows the sun or the moon, causing a solar or
lunar eclipse. The method is subjective; its findings vary from person to person
depending on each person’s impression or imagination. It is vague and inaccurate. Or
• Scientific Method: This is a systematic, rational approach to seeking facts. It
eliminates the drawbacks of the arbitrary method. It is objective and precise, and
arrives at conclusions on the basis of verifiable evidence.

Therefore, the search for facts should be made by the scientific method rather than by the
arbitrary method. Only then can we obtain verifiable and accurate facts. Hence research is a
systematic and logical study of an issue, problem or phenomenon through the scientific
method.

Young defines research as “a scientific undertaking which, by means of logical and
systematic techniques, aims to:

a) Discover new facts or verify and test old facts,

b) Analyze their sequences, interrelationships and causal explanations,

c) Develop new scientific tools, concepts and theories which would facilitate
reliable and valid study of human behaviour.”

Kerlinger defines research as a “systematic, controlled, empirical and critical
investigation of hypothetical propositions about the presumed relations among natural
phenomena.”

Objectives:

After studying this lesson the students should be able to understand:

• Research and scientific method
• Characteristics of Research
• Purpose of research
• Different types of Research
• Research Approaches
• Significance of research in Social and Business Sciences

1.1.1 Research and Scientific Method

Research is a scientific endeavour. It involves the scientific method. “The scientific method
is a systematic step-by-step procedure following the logical processes of reasoning”.
The scientific method is a means for gaining knowledge of the universe. It does not belong to
any particular body of knowledge; it is universal. It does not refer to a specific field of
subject matter, but rather to a procedure or mode of investigation.

The scientific method is based on certain “articles of faith.” These are:

• Reliance on Empirical Evidence: Truth is established on the basis of evidence.
A conclusion is admitted only when it is based on evidence. The answer to a
question is not decided by intuition or imagination. Relevant data are collected
through observation or experimentation. The validity and the reliability of the data are
checked carefully and the data are analyzed thoroughly, using appropriate
methods of analysis.
• Use of Relevant Concepts: We experience a vast number of facts through our
senses. Facts are things which actually exist. In order to deal with them, we use
concepts with specific meanings. They are symbols representing the meaning that
we hold. We use them in our thinking and communication. Otherwise, clarity and
correct understanding cannot be achieved.
• Commitment to Objectivity: Objectivity is the hallmark of the scientific
method. It means forming judgements upon facts, unbiased by personal
impressions. The conclusion should not vary from person to person. It should be
the same for all persons.
• Ethical Neutrality: Science does not pass moral judgment on facts. It does not
say that they are good or bad. According to Schrödinger “Science never imposes
anything, science states. Science aims at nothing but making true and adequate
statements about its object.”
• Generalization: In formulating a generalization, we should avoid the danger of
committing the particularistic fallacy, which arises through an inclination to
generalize on insufficient or incomplete and unrelated data. This can be avoided
by the accumulation of a large body of data and by the employment of
comparisons and control groups.
• Verifiability: The conclusions arrived at by a scientist should be verifiable. He
must make known to others how he arrives at his conclusions. He should thus
expose his own methods and conclusions to critical scrutiny. When his conclusion
is tested by others under the same conditions, then it is accepted as correct.
• Logical reasoning process: The scientific method involves the logical process of
reasoning. This reasoning process is used for drawing inferences from the findings
of a study or for arriving at conclusions.

Characteristics of Research

• It is a systematic and critical investigation into a phenomenon.
• It is a purposive investigation aiming at describing, interpreting and explaining a
phenomenon.
• It adopts the scientific method.
• It is objective and logical, applying every possible test to validate the measuring tools
and the conclusions reached.
• It is based upon observable experience or empirical evidence.
• Research is directed towards finding answers to pertinent questions and solutions
to problems.
• It emphasizes the development of generalizations, principles or theories.
• The purpose of research is not only to arrive at an answer but also to stand up to the
test of criticism.

Purpose of Research

The objectives or purposes of research are varied. They are:

• Research extends knowledge of human beings, social life and environment. It
searches for answers to various types of questions (the What, Where, When, How
and Why of various phenomena) and thereby enlightens us.
• Research brings to light information that might never be discovered fully during
the ordinary course of life.
• Research establishes generalizations and general laws and contributes to theory
building in various fields of knowledge.
• Research verifies and tests existing facts and theory, and this helps improve our
knowledge and ability to handle situations and events.
• General laws developed through research may enable us to make reliable
predictions of events yet to happen.
• Research aims to analyze inter-relationships between variables and to derive
causal explanations, and thus enables us to have a better understanding of the
world in which we live.
• Applied research aims at finding solutions to problems: socio-economic
problems, health problems, human relations problems in organizations and so on.
• Research also aims at developing new tools, concepts and theories for a better
study of unknown phenomena.
• Research aids planning and thus contributes to national development.

Types of Research

Although any typology of research is inevitably arbitrary, research may be classified
crudely according to its major intent or its methods. According to the intent, research
may be classified as:

Pure Research

It is undertaken for the sake of knowledge without any intention to apply it in practice,
e.g., Einstein’s theory of relativity, Newton’s contributions, Galileo’s contribution, etc. It
is also known as basic or fundamental research. It is undertaken out of intellectual
curiosity or inquisitiveness. It is not necessarily problem-oriented. It aims at extension of
knowledge. It may lead to either discovery of a new theory or refinement of an existing
theory. It lays the foundation for applied research. It offers solutions to many practical
problems. It helps to find the critical factors in a practical problem. It develops many
alternative solutions and thus enables us to choose the best solution.

Applied Research

It is carried on to find a solution to a real-life problem requiring an action or policy
decision. It is thus problem-oriented and action-directed. It seeks an immediate and
practical result, e.g., marketing research carried on for developing a new market or for
studying the post-purchase experience of customers. Though the immediate purpose of
applied research is to find solutions to a practical problem, it may incidentally contribute
to the development of theoretical knowledge by leading to the discovery of new facts, the
testing of theory or conceptual clarity. It can put theory to the test. It may aid in
conceptual clarification. It may integrate previously existing theories.

Exploratory Research

It is also known as formulative research. It is a preliminary study of an unfamiliar problem
about which the researcher has little or no knowledge. It is ill-structured and much less
focused on pre-determined objectives. It usually takes the form of a pilot study. The
purpose of this research may be to generate new ideas, to increase the researcher’s
familiarity with the problem, to make a precise formulation of the problem, to gather
information for clarifying concepts, or to determine whether it is feasible to attempt the
study. Katz conceptualizes two levels of exploratory studies. “At the first level is the
discovery of the significant variable in the situations; at the second, the discovery of
relationships between variables.”

Descriptive Study

It is a fact-finding investigation with adequate interpretation. It is the simplest type of
research. It is more specific than exploratory research. It aims at identifying the
various characteristics of a community, institution or problem under study and also
aims at a classification of the range of elements comprising the subject matter of study. It
contributes to the development of a young science and is useful in verifying focal concepts
through empirical observation. It can highlight important methodological aspects of data
collection and interpretation. The information obtained may be useful for prediction
about areas of social life outside the boundaries of the research. Descriptive studies are
valuable in providing the facts needed for planning social action programmes.

Diagnostic Study

It is similar to a descriptive study but with a different focus. It is directed towards
discovering what is happening, why it is happening and what can be done about it. It aims
at identifying the causes of a problem and the possible solutions for it. It may also be
concerned with discovering and testing whether certain variables are associated. This
type of research requires prior knowledge of the problem, its thorough formulation, a
clear-cut definition of the given population, adequate methods for collecting accurate
information, precise measurement of variables, statistical analysis and tests of
significance.

Evaluation Studies

It is a type of applied research. It is made for assessing the effectiveness of social or
economic programmes implemented or for assessing the impact of developmental
projects on the development of the project area. It is thus directed to assess or appraise
the quality and quantity of an activity and its performance, and to specify its attributes
and conditions required for its success. It is concerned with causal relationships and is
more actively guided by hypothesis. It is concerned also with change over time.

Action Research

It is a type of evaluation study. It is a concurrent evaluation study of an action
programme launched for solving a problem or for improving an existing situation. It includes
six major steps: diagnosis, sharing of diagnostic information, planning and developing a
change programme, initiation of organizational change, implementation of participation
and communication process, and post-experimental evaluation.

According to the methods of study, research may be classified as:


1. Experimental Research: It is designed to assess the effects of particular variables
on a phenomenon by keeping the other variables constant or controlled. It aims at
determining whether and in what manner variables are related to each other.
2. Analytical Study: It is a system of procedures and techniques of analysis applied
to quantitative data. It may consist of a system of mathematical models or
statistical techniques applicable to numerical data. Hence it is also known as the
Statistical Method. It aims at testing hypothesis and specifying and interpreting
relationships.
3. Historical Research: It is a study of past records and other information sources
with a view to reconstructing the origin and development of an institution or a
movement or a system and discovering the trends in the past. It is descriptive in
nature. It is a difficult task; it must often depend upon inference and logical
analysis of recorded data and indirect evidence rather than upon direct
observation.

4. Survey: It is a fact-finding study. It is a method of research involving the
collection of data directly from a population or a sample thereof at a particular time.
Its purpose is to provide information, to explain phenomena and to make comparisons.
Surveys concerned with cause-and-effect relationships can be useful for making
predictions.
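
The logic of experimental research described in item 1 above (vary one variable while the others are kept constant) can be illustrated with a minimal Python sketch. The response function `simulated_sales`, its coefficients and the baseline values are hypothetical and invented purely for illustration; in a real experiment the outcomes would be observed, not computed.

```python
# Hypothetical response model: sales depend on price and advertising.
# The function and its coefficients are invented for illustration only.
def simulated_sales(price, advertising):
    return 1000 - 40 * price + 5 * advertising


def vary_one_factor(model, baseline, factor, levels):
    """Vary a single factor over `levels`, holding the others at baseline."""
    results = []
    for level in levels:
        settings = dict(baseline)   # copy, so the baseline itself is unchanged
        settings[factor] = level    # change only the factor under study
        results.append((level, model(**settings)))
    return results


# Assumed baseline conditions: these are the "controlled" variables.
baseline = {"price": 10, "advertising": 20}

# Only price is varied; advertising stays at its baseline value.
effect_of_price = vary_one_factor(simulated_sales, baseline, "price", [8, 10, 12])
for level, outcome in effect_of_price:
    print(f"price={level}: sales={outcome}")
```

Because advertising is fixed throughout, any difference between the printed outcomes can be attributed to price alone, which is the point of keeping the other variables constant or controlled.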

Research Approaches

There are two main approaches to research, namely quantitative approach and qualitative
approach. The quantitative approach involves the collection of quantitative data, which
are put to rigorous quantitative analysis in a formal and rigid manner. This approach
further includes experimental, inferential, and simulation approaches to research.
Meanwhile, the qualitative approach uses the method of subjective assessment of
opinions, behaviour and attitudes. Research in such a situation is a function of the
researcher’s impressions and insights. The results generated by this type of research are
either in non-quantitative form or in a form which cannot be put to rigorous quantitative analysis.
Usually, this approach uses techniques like depth interviews, focus group interviews, and
projective techniques.

Significance of Research in Social and Business Sciences

According to a famous saying of Hudson Maxim, “All progress is born of inquiry. Doubt is often
better than overconfidence, for it leads to inquiry, and inquiry leads to invention”. This
brings out the significance of research: increased amounts of it make progress
possible. Research encourages scientific and inductive thinking, besides promoting the
development of logical habits of thinking and organization.

The role of research in applied economics, in the context of an economy or a business, is
greatly increasing in modern times. The increasingly complex nature of government and
business has raised the use of research in solving operational problems. Research
assumes a significant role in the formulation of economic policy, for both government and
business. It provides the basis for almost all government policies of an economic system.
Government budget formulation, for example, depends particularly on the analysis of the
needs and desires of the people and the availability of revenues, which requires research.
Research helps to formulate alternative policies, in addition to examining the
consequences of these alternatives. Thus, research also facilitates the decision making of
policy-makers, although decision making itself is not a part of research. In the process,
research also helps in the proper allocation of a country’s scarce resources.

Research is also necessary for collecting information on the social and economic structure
of an economy in order to understand the process of change occurring in the country.
Collecting statistical information is not a routine task; it involves various research
problems. Therefore, a large staff of research technicians or experts is engaged by the
government these days to undertake this work. Thus, research as a tool of government
economic policy formulation involves three distinct stages of operation, which are as follows:

• Investigation of the economic structure through continual compilation of facts;
• Diagnosis of events that are taking place and the analysis of the forces underlying
them; and
• The prognosis, i.e., the prediction of future developments

Research also assumes a significant role in solving various operational and planning
problems associated with business and industry. In several ways, operations research,
market research, and motivational research are vital and their results assist in taking
business decisions. Market research refers to the investigation of the structure and
development of a market for the formulation of efficient policies relating to purchases,
production and sales. Operational research relates to the application of logical,
mathematical, and analytical techniques to find solutions to business problems such as
cost minimization, profit maximization, or other optimization problems. Motivational
research helps to determine why people behave in the manner they do with respect to
market characteristics. More specifically, it is concerned with analyzing the
motivations underlying consumer behaviour. All these types of research are very useful to
those in business and industry who are responsible for business decision making.

Research is equally important to social scientists for analyzing social relationships and
seeking explanations for various social problems. It gives the intellectual satisfaction of
knowing things for the sake of knowledge. It also possesses practical utility, enabling the
social scientist to gain knowledge so as to be able to do something better or in a more
efficient manner. Thus, research in the social sciences is concerned with both knowledge for
its own sake and knowledge for what it can contribute to solving practical problems.

Summary
Research simply means a search for facts. The search for facts may be made through
either the arbitrary (or unscientific) method or the scientific method. Young defines research as
“a scientific undertaking which, by means of logical and systematic techniques, aims to:
discover new facts or verify and test old facts, analyze their sequences,
interrelationships and causal explanations, develop new scientific tools, concepts and
theories which would facilitate reliable and valid study of human behaviour.” Kerlinger
defines research as a “systematic, controlled, empirical and critical investigation of
hypothetical propositions about the presumed relations among natural phenomena.”

The scientific method is based on certain “articles of faith.” These are:

1. Reliance on empirical evidence
2. Use of relevant concepts
3. Commitment to objectivity
4. Ethical neutrality
5. Generalization
6. Verifiability
7. Logical reasoning process

Research is directed towards finding answers to pertinent questions and solutions to
problems. It emphasizes the development of generalizations, principles or theories. The
purpose of research is not only to arrive at an answer but also to stand up to the test of
criticism. The purpose of research is to extend knowledge of human beings. Research
establishes generalizations and general laws and contributes to theory building in various
fields of knowledge. Research verifies and tests existing facts and theory, and this helps
improve our knowledge and ability to handle situations and events. General laws
developed through research may enable us to make reliable predictions of events yet to
happen. Research aims to analyze inter-relationships between variables and to derive
causal explanations, and thus enables us to have a better understanding of the world in
which we live.

Applied research aims at finding solutions to problems: socio-economic problems,
health problems, human relations problems in organizations and so on. Research also
aims at developing new tools, concepts and theories for a better study of unknown
phenomena. Research aids planning and thus contributes to national development.

Pure Research is undertaken for the sake of knowledge without any intention to apply it in
practice. Applied Research is carried on to find a solution to a real-life problem requiring
an action or policy decision. It is thus problem-oriented and action-directed. Exploratory
Research is also known as formulative research. It is a preliminary study of an unfamiliar
problem about which the researcher has little or no knowledge. A Descriptive Study is a
fact-finding investigation with adequate interpretation. A Diagnostic Study is similar to a
descriptive study but with a different focus. An Evaluation Study is a type of applied
research. Action Research is a type of evaluation study.

The role of research in applied economics, in the context of an economy or a business, is
greatly increasing in modern times. Research also assumes a significant role in solving
various operational and planning problems associated with business and industry. Research
is equally important to social scientists for analyzing social relationships and seeking
explanations for various social problems.

Copyright © 2009 SMU

Powered by Sikkim Manipal University

MB0034-Unit 2-Selection and Formulation of a Research Problem

Meaning of Research Problem

Research really begins when the researcher experiences some difficulty, i.e., a problem
demanding a solution within the subject area of his discipline. This general area of
interest, however, defines only the range of subject-matter within which the researcher
would see and pose a specific problem for research. Personal values play an important
role in the selection of a topic for research. Social conditions do often shape the
preference of investigators in a subtle and imperceptible way.

The formulation of the topic into a research problem is, strictly speaking, the first step in a
scientific enquiry. A problem in simple words is some difficulty experienced by the
researcher in a theoretical or practical situation. Solving this difficulty is the task of
research.

R.L. Ackoff’s analysis affords considerable guidance in identifying a problem for research.
He visualizes five components of a problem.

1. Research-consumer: There must be an individual or a group which experiences
some difficulty.
2. Research-consumer’s Objectives: The research-consumer must have some objectives
that he wants to achieve.
3. Alternative Means to Meet the Objectives: The research-consumer must have
available alternative means for achieving the objectives he desires.
4. Doubt in Regard to Selection of Alternatives: The existence of alternative courses
of action is not enough; in order to experience a problem, the research-consumer
must have some doubt as to which alternative to select.
5. There must be One or More Environments to which the Difficulty or Problem
Pertains: A change in environment may produce or remove a problem. A
research-consumer may have doubts as to which will be the most efficient means
in one environment but would have no such doubt in another.

Objectives:

After studying this unit you should be able to understand:

• The meaning of Research Problem
• Choosing the problem
• Review of Literature
• Criteria for formulating the problem
• Objective of Formulating the Problem
• Techniques involved in Formulating the Problem
• Criteria of Good Research Problem

Choosing the Problem

The selection of a problem is the first step in research. The term problem means a
question or issue to be examined. The selection of a problem for research is not an easy
task; it is itself a problem. It is least amenable to formal methodological treatment. Vision,
an imaginative insight, plays an important role in this process. One who has a critical,
curious and imaginative mind and is sensitive to practical problems can easily identify
problems for study.

The sources from which one may be able to identify research problems or develop
problem awareness are:

• Review of literature
• Academic experience
• Daily experience
• Exposure to field situations
• Consultations
• Brainstorming
• Research
• Intuition

Review of literature

Frequently, an exploratory study is concerned with an area of subject matter in which
explicit hypotheses have not yet been formulated. The researcher’s task then is to review
the available material with an eye on the possibilities of developing hypotheses from it. In
some areas of the subject matter, hypotheses may have been stated by previous research
workers. The researcher has to take stock of these various hypotheses with a view to
evaluating their usefulness for further research and to consider whether they suggest any
new hypotheses. Sociological journals, economic reviews, the bulletin of abstracts of
current social sciences research, directories of doctoral dissertations accepted by
universities, etc., afford a rich store of valuable clues. In addition to these general sources,
some governmental agencies and voluntary organizations publish listings of summaries
of research in their special fields of service. Professional organizations, research groups
and voluntary organizations are a constant source of information about unpublished
works in their special fields.

Formulating the problem

The selection of one appropriate researchable problem out of the identified problems
requires evaluation of those alternatives against certain criteria, which may be grouped
into:

Internal Criteria

Internal Criteria consist of:

1) Researcher’s interest: The problem should interest the researcher and be a
challenge to him. Without interest and curiosity, he may not develop sustained
perseverance. Even a small difficulty may become an excuse for discontinuing the
study. Interest in a problem depends upon the researcher’s educational background,
experience, outlook and sensitivity.

2) Researcher’s competence: A mere interest in a problem will not do. The
researcher must be competent to plan and carry out a study of the problem. He must
have the ability to grasp and deal with it. He must possess adequate knowledge of the
subject-matter, relevant methodology and statistical procedures.

3) Researcher’s own resources: In the case of research to be done by a researcher
on his own, consideration of his own financial resources is pertinent. If the study is
beyond his means, he will not be able to complete the work unless he gets some external
financial support. Time is an even more important resource than finance. Research is a
time-consuming process; hence time should be properly utilized.

External Criteria

1) Research-ability of the problem: The problem should be researchable, i.e.,
amenable to finding answers to the questions involved in it through the scientific
method. To be researchable, a question must be one for which observation or other
data collection in the real world can provide the answer.

2) Importance and urgency: Problems requiring investigation are unlimited, but
available research efforts are very much limited. Therefore, in selecting problems for
research, their relative importance and significance should be considered. An
important and urgent problem should be given priority over an unimportant one.

3) Novelty of the problem: The problem must have novelty. There is no use in
wasting one’s time and energy on a problem already studied thoroughly by others.
This does not mean that replication is always needless. In the social sciences, in some
cases, it is appropriate to replicate (repeat) a study in order to verify the validity of its
findings in a different situation.

4) Feasibility: A problem may be a new one and also important, but if research on
it is not feasible, it cannot be selected. Hence feasibility is a very important
consideration.

5) Facilities: Research requires certain facilities such as a well-equipped library
facility, suitable and competent guidance, data analysis facility, etc. Hence the
availability of the facilities relevant to the problem must be considered.

6) Usefulness and social relevance: Above all, the study of the problem should
make significant contribution to the concerned body of knowledge or to the solution
of some significant practical problem. It should be socially relevant. This
consideration is particularly important in the case of higher level academic research
and sponsored research.

7) Research personnel: Research undertaken by professors and by research
organizations requires the services of investigators and research officers. But in India
and other developing countries, research has not yet become a prospective profession.
Hence talented persons are not attracted to research projects.

Each identified problem must be evaluated in terms of the above internal and external
criteria and the most appropriate one may be selected by a research scholar.
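
One way to make this evaluation explicit is a simple weighted scoring sheet, sketched below in Python. The criterion names follow the internal and external criteria listed above; the weights, the 1-5 ratings, the cut-off for researchability and feasibility, and the candidate problems "A", "B" and "C" are all hypothetical assumptions for illustration, not part of the original text.

```python
# Weights are hypothetical; a researcher would choose them to suit the study.
WEIGHTS = {
    "interest": 2, "competence": 2, "own_resources": 1,        # internal criteria
    "researchability": 3, "importance": 2, "novelty": 1,       # external criteria
    "feasibility": 3, "facilities": 1, "usefulness": 2, "personnel": 1,
}

# A problem that is not researchable or not feasible cannot be selected at all,
# so these two criteria also act as a filter (the cut-off value is assumed).
REQUIRED = ("researchability", "feasibility")
MIN_REQUIRED = 2


def weighted_score(ratings):
    """Sum of weight * rating over all criteria (ratings on a 1-5 scale)."""
    missing = set(WEIGHTS) - set(ratings)
    if missing:
        raise ValueError(f"missing ratings for: {sorted(missing)}")
    return sum(WEIGHTS[name] * ratings[name] for name in WEIGHTS)


def rank_problems(candidates):
    """Drop ineligible problems, then sort the rest by descending score."""
    eligible = {
        title: ratings for title, ratings in candidates.items()
        if all(ratings[name] >= MIN_REQUIRED for name in REQUIRED)
    }
    return sorted(eligible, key=lambda t: weighted_score(eligible[t]), reverse=True)


# Hypothetical candidate problems with made-up ratings.
candidates = {
    "A": dict(interest=4, competence=4, own_resources=3, researchability=5,
              importance=4, novelty=3, feasibility=4, facilities=3,
              usefulness=4, personnel=2),
    "B": dict(interest=5, competence=3, own_resources=2, researchability=3,
              importance=5, novelty=4, feasibility=3, facilities=3,
              usefulness=5, personnel=2),
    "C": dict(interest=5, competence=4, own_resources=4, researchability=4,
              importance=5, novelty=5, feasibility=1, facilities=4,
              usefulness=5, personnel=3),
}

ranking = rank_problems(candidates)
print(ranking)
```

Such a sheet does not replace the researcher's judgement; it only records the ratings so that the comparison between identified problems is transparent and can be revisited.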

Objective of Formulating the Problem

A problem well put is half-solved. The primary task of research is collection of relevant
data and the analysis of data for finding answers to the research questions. The proper
performance of this task depends upon the identification of exact data and information
required for the study. The formulation serves this purpose. The clear and accurate
statement of the problem, the development of the conceptual model, the definition of the
objectives of the study, the setting of investigative questions, the formulation of
hypotheses to be tested, the operational definition of concepts and the delimitation of
the study determine the exact data needs of the study. Once the exact data requirement is
known, the researcher can plan and execute the other steps without any waste of time and
energy. Thus formulation gives a direction and a specific focus to the research effort. It
helps to delimit the field of enquiry by singling out the pertinent facts from a vast ocean
of facts and thus saves the researcher from becoming lost in a welter of irrelevancies. It
prevents a blind search and indiscriminate gathering of data which may later prove
irrelevant to the problem under study. It helps in determining the methods to be adopted
for sampling and collection of data.

Techniques involved in Formulating the Problem

The problem selected for research may initially be a vague topic. The question to be
studied or the problem to be solved may not be known. Hence the selected problem
should be defined and formulated. This is a difficult process. It requires intensive reading
of a few selected articles or chapters in books in order to understand the nature of the
problem selected.

The process of defining a problem includes:

1. Developing a title: The title should be carefully worded. It should indicate the core
of the study, reflect the real intention of the researcher, and show what the
focus is, e.g., “Financing small-scale industries by commercial banks.” This shows
that the focus is on commercial banks and not on small-scale industries. On the
other hand, if the title is “The Financial Problem of Small-scale industries”, the
focus is on small-scale industries.

2. Building a conceptual model: On the basis of our theoretical knowledge of
the phenomenon under study, the nature of the phenomenon, its properties /
elements and their inter-relations should be identified and structured into a
framework. This conceptual model gives an exact idea of the research problem
and shows its various properties and variables to be studied. It serves as a basis
for the formulation of the objectives of the study and the hypotheses to be tested.
In order to work out a conceptual model we must make a careful and critical study
of the available literature on the subject-matter of the selected research problem.
It is for this reason that a researcher is expected to select a problem for research in his
field of specialization. Without adequate background knowledge, a researcher
cannot grasp and comprehend the nature of the research problem.

3. Defining the Objectives of the Study: The objectives refer to the questions to
be answered through the study. They indicate what we are trying to get through
the study. The objectives are derived from the conceptual model. They state
which elements in the conceptual model (which levels of which kinds of cases,
which properties, and which connections among properties) are to be
investigated, but it is the conceptual model that defines, describes, and states the
assumptions underlying these elements. The objectives may aim at description,
explanation or analysis of causal relationships between variables, and indicate the
expected results or outcome of the study. The objectives may be specified in the
form of either statements or questions.

Criteria of a Good Research Problem

Horton and Hunt have given the following characteristics of scientific research:

1. Verifiable evidence: That is factual observations which other observers can see
and check.
2. Accuracy: That is describing what really exists. It means truth or correctness of a
statement or describing things exactly as they are and avoiding jumping to
unwarranted conclusions either by exaggeration or fantasizing.
3. Precision: That is making it as exact as necessary, or giving exact number or
measurement. This avoids colourful literature and vague meanings.
4. Systematization: That is attempting to find all the relevant data, or collecting
data in a systematic and organized way so that the conclusions drawn are reliable.
Data based on casual recollections are generally incomplete and give unreliable
judgments and conclusions.
5. Objectivity: That is being free from all biases and vested interests. It means
observation is unaffected by the observer’s values, beliefs and preferences to the
extent possible and he is able to see and accept facts as they are, not as he might
wish them to be.
6. Recording: That is jotting down complete details as quickly as possible. Since
human memory is fallible, all data collected are recorded.
7. Controlling conditions: That is controlling all variables except one and then
attempting to examine what happens when that variable is varied. This is the basic
technique in all scientific experimentation – allowing one variable to vary while
holding all other variables constant.
8. Training investigators: That is imparting necessary knowledge to investigators
to make them understand what to look for, how to interpret it, and how to avoid
inaccurate data collection.

Summary

Research really begins when the researcher experiences some difficulty, i.e., a problem
demanding a solution within the subject area of his discipline. The formulation of the topic
into a research problem is, strictly speaking, the first step in a scientific enquiry. The
selection of one appropriate researchable problem out of the identified problems requires
evaluation of those alternatives against certain criteria, which may be grouped into
internal criteria and external criteria. A problem well put is half-solved. The primary task
of research is collection of relevant data and the analysis of data for finding answers to
the research questions. The problem selected for research may initially be a vague topic.
The process of defining a problem includes:

• Developing title
• Building a conceptual model
• Define the Objective of the Study

Horton and Hunt have given the following characteristics of scientific research:

• Verifiable evidence
• Accuracy
• Precision
• Systematization
• Objectivity
• Recording
• Controlling conditions
• Training investigators

Copyright © 2009 SMU

Powered by Sikkim Manipal University

MB0034- Unit 3 Hypothesis


Unit 3 Hypothesis

Introduction

A hypothesis is an assumption about relations between variables. It is a tentative
explanation of the research problem or a guess about the research outcome. Before
starting the research, the researcher has a rather general, diffused, even confused notion
of the problem. It may take a long time for the researcher to say what questions he has
been seeking answers to. Hence, an adequate statement of the research problem is
very important. What is a good problem statement? It is an interrogative statement that
asks: what relationship exists between two or more variables? It then further asks
questions like: Is A related to B or not? How are A and B related to C? Is A related to B
under conditions X and Y? A proposed statement about the relationship between A
and B is called a hypothesis.

Objectives:

After studying this lesson you should be able to understand:

• Meaning and Examples of Hypothesis
• Criteria for Constructing a Hypothesis
• Nature of Hypothesis
• The Need for having a Hypothesis
• Characteristics of a Good Hypothesis
• Types of Hypothesis
• Null Hypothesis and Alternative Hypothesis
• Concepts of Hypothesis Testing
• The Level of Significance
• Decision Rule of Testing Hypothesis
• Type I and Type II Errors
• Two Tailed and One Tailed Test
• Procedures for Testing Hypothesis
• Testing of Hypothesis

Meaning and Examples of Hypothesis

According to Theodorson and Theodorson, “a hypothesis is a tentative statement
asserting a relationship between certain facts”. Kerlinger describes it as “a conjectural
statement of the relationship between two or more variables”. Black and Champion have
described it as “a tentative statement about something, the validity of which is usually
unknown”. This statement is intended to be tested empirically and is either verified or
rejected. If the statement is not sufficiently established, it is not considered a scientific
law. In other words, a hypothesis carries clear implications for testing the stated
relationship, i.e., it contains variables that are measurable and specifies how they are
related. A statement that lacks variables or that does not explain how the variables are
related to each other is no hypothesis in the scientific sense.

Criteria for Hypothesis Construction

A hypothesis is never formulated in the form of a question. The standards to be met in
formulating a hypothesis are:

• It should be empirically testable, whether it is right or wrong.
• It should be specific and precise.
• The statements in the hypothesis should not be contradictory.
• It should specify the variables between which the relationship is to be established.
• It should describe one issue only.

Nature of Hypothesis

A scientifically justified hypothesis must meet the following criteria:

• It must accurately reflect the relevant sociological fact.
• It must not be in contradiction with approved relevant statements of other
scientific disciplines.
• It must consider the experience of other researchers.

The Need for having Working Hypothesis

• A hypothesis gives a definite point to the investigation, and it guides the direction
of the study.
• A hypothesis specifies the sources of data, which shall be studied, and in what
context they shall be studied.
• It determines the data needs.
• A hypothesis suggests which type of research is likely to be most appropriate.
• It determines the most appropriate technique of analysis.
• A hypothesis contributes to the development of theory

Characteristics of Good Hypothesis

1. Conceptual Clarity
2. Specificity
3. Testability
4. Availability of Techniques
5. Theoretical relevance
6. Consistency
7. Objectivity
8. Simplicity

Types of Hypothesis

There are many kinds of hypothesis that the researcher may have to work with. One
type of hypothesis asserts that something is the case in a given instance; that a
particular object, person or situation has particular characteristics. Another type of
hypothesis deals with the frequency of occurrence or of association among
variables; this type of hypothesis may state that X is associated with Y in a certain
proportion of cases (e.g., that urbanism tends to be accompanied by mental disease), or
that something is greater or lesser than some other thing in a specific setting. Yet
another type of hypothesis asserts that a particular characteristic is one of the
factors which determine another characteristic, i.e., X is the producer of Y.
Hypotheses of this type are called causal hypotheses.

Null Hypothesis and Alternative Hypothesis

In the context of statistical analysis, we often talk about null and alternative hypotheses.
If we are to compare method A with method B about its superiority and if we
proceed on the assumption that both methods are equally good, then this
assumption is termed the null hypothesis. As against this, if we think that
method A is superior, that is the alternative hypothesis. Symbolically:

Null hypothesis = H0 and Alternative hypothesis = Ha

Suppose we want to test the hypothesis that the population mean (µ) is equal to the
hypothesized mean (µH0) = 100. Then we would say that the null hypothesis is
that the population mean is equal to the hypothesized mean 100, and symbolically
we can express it as: H0: µ = µH0 = 100.
If our sample results do not support this null hypothesis, we should conclude
that something else is true. What we conclude on rejecting the null hypothesis is
known as the alternative hypothesis. If we accept H0, then we are rejecting Ha, and if
we reject H0, then we are accepting Ha. For H0: µ = µH0 = 100, we may consider
three possible alternative hypotheses as follows:

Alternative hypothesis    To be read as follows
Ha: µ ≠ µH0               The alternative hypothesis is that the population mean is not
                          equal to 100, i.e., it may be more or less than 100.
Ha: µ > µH0               The alternative hypothesis is that the population mean is
                          greater than 100.
Ha: µ < µH0               The alternative hypothesis is that the population mean is less
                          than 100.

The null hypothesis and the alternative hypothesis are chosen before the sample is
drawn (the researcher must avoid the error of deriving hypotheses from the data he
collects and testing the hypotheses with the same data). In the choice of null
hypothesis, the following considerations are usually kept in view:

• The alternative hypothesis is usually the one which one wishes to prove and the null
hypothesis is the one which one wishes to disprove. Thus a null hypothesis represents the
hypothesis we are trying to reject, and the alternative hypothesis represents all other
possibilities.
• If the rejection of a certain hypothesis when it is actually true involves great risk,
it is taken as the null hypothesis, because then the probability of rejecting it when it is
true is α (the level of significance), which is chosen very small.
• The null hypothesis should always be a specific hypothesis, i.e., it should not state “about”
or “approximately” a certain value.
• Generally, in hypothesis testing we proceed on the basis of the null hypothesis,
keeping the alternative hypothesis in view. Why so? The answer is that on the
assumption that the null hypothesis is true, one can assign probabilities to
different possible sample results, but this cannot be done if we proceed with the
alternative hypothesis. Hence the use of the null hypothesis (at times also known as
the statistical hypothesis) is quite frequent.

Concepts of Hypothesis Testing

Basic concepts in the context of testing of hypothesis need to be explained.

The Level of Significance

This is a very important concept in the context of hypothesis testing. It is always some
percentage (usually 5%) which should be chosen with great care, thought and reason. If
we take the significance level at 5%, this implies that H0 will be rejected when
the sampling result (i.e., observed evidence) has a less than 0.05 probability of occurring
if H0 is true. In other words, the 5% level of significance means that the researcher is willing
to take as much as a 5% risk of rejecting the null hypothesis when it (H0) happens to be true.
Thus the significance level is the maximum value of the probability of rejecting H0 when
it is true, and is usually determined in advance, before testing the hypothesis.

Decision Rule of Test of Hypothesis:

Given a hypothesis H0 and an alternative hypothesis Ha, we make a rule, known as the
decision rule, according to which we accept H0 (i.e., reject Ha) or reject H0 (i.e., accept Ha).
For instance, if H0 is that a certain lot is good (there are very few defective items in it)
against Ha that the lot is not good (there are many defective items in it), then we must
decide the number of items to be tested and the criterion for accepting or rejecting the
hypothesis. We might test 10 items in the lot and plan our decision saying that if there are
none or only 1 defective item among the 10, we will accept H0; otherwise we will reject
H0 (or accept Ha). This sort of basis is known as a decision rule.
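
The consequences of the rule "test 10 items, accept H0 if at most 1 is defective" can be explored numerically. The sketch below is only an illustration: it assumes that items are defective independently with a common defect rate (a binomial model, which the text does not state), and the defect rates in the loop are hypothetical. The function name prob_accept_lot is likewise invented for this example.

```python
from math import comb

def prob_accept_lot(defect_rate, n_tested=10, max_defectives=1):
    """Probability of at most `max_defectives` defectives among `n_tested`
    items, assuming an independent (binomial) model for defects."""
    return sum(
        comb(n_tested, k) * defect_rate**k * (1 - defect_rate) ** (n_tested - k)
        for k in range(max_defectives + 1)
    )

# Hypothetical defect rates, chosen only for illustration.
for rate in (0.01, 0.05, 0.20, 0.40):
    print(f"defect rate {rate:.2f}: P(accept H0) = {prob_accept_lot(rate):.3f}")
```

Under these assumptions, the probability of accepting the lot falls as the true defect rate rises, which is what a sensible decision rule should do.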

Type I & Type II Errors

In the context of testing of hypothesis there are basically two types of errors that
researchers can make. We may reject H0 when H0 is true, and we may accept H0 when it is not
true. The former is known as a Type I error and the latter as a Type II error. In other words,
a Type I error means rejection of a hypothesis which should have been accepted, and a Type II
error means acceptance of a hypothesis which should have been rejected. The probability of a
Type I error is denoted by α (alpha), also called the level of significance of the test; and
that of a Type II error is denoted by β (beta).

                 Decision
                 Accept H0                  Reject H0
H0 (true)        Correct decision           Type I error (α error)
H0 (false)       Type II error (β error)    Correct decision

The probability of Type I error is usually determined in advance and is understood as the
level of significance of testing the hypothesis. If type I error is fixed at 5%, it means there
are about 5 chances in 100 that we will reject H0 when H0 is true. We can control type I
error just by fixing it at a lower level. For instance, if we fix it at 1%, we will say that the
maximum probability of committing type I error would only be 0.01.

But with a fixed sample size n, when we try to reduce type I error, the probability of
committing type II error increases. Both types of errors cannot be reduced
simultaneously. There is a trade-off. In business situations, decision-makers decide the
appropriate level of type I error by examining the costs or penalties attached to both types
of errors. If type I error involves the time and trouble of reworking a batch of chemicals that
should have been accepted, whereas type II error means taking a chance that an entire
group of users of this chemical compound will be poisoned, then in such a situation one
should prefer a type I error to a type II error. As a result one must set a very high level
for type I error in one’s testing technique for a given hypothesis. Hence, in testing of
hypothesis, one must make every possible effort to strike an adequate balance between
Type I and Type II errors.
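
The trade-off between the two errors at a fixed sample size can be seen in a small calculation. This is a minimal sketch under stated assumptions: a one-tailed z-test of H0: µ = 100 against Ha: µ > 100, with a known standard deviation of 15, a sample of 25 and a true mean of 105. All of these numbers, and the function name type_ii_error, are hypothetical and are not taken from the text.

```python
from statistics import NormalDist

def type_ii_error(alpha, mu0=100.0, mu_true=105.0, sigma=15.0, n=25):
    """Beta for a one-tailed z-test (Ha: mu > mu0) with known sigma.
    All default values are hypothetical illustration values."""
    standard_error = sigma / n ** 0.5
    # Sample means above this cut-off lead to rejection of H0.
    critical_mean = mu0 + NormalDist().inv_cdf(1 - alpha) * standard_error
    # Beta = probability that the sample mean falls below the cut-off
    # when the true mean is mu_true (so H0 is wrongly accepted).
    return NormalDist(mu_true, standard_error).cdf(critical_mean)

for alpha in (0.10, 0.05, 0.01):
    print(f"alpha = {alpha:.2f} -> beta = {type_ii_error(alpha):.3f}")
```

With n held fixed, each reduction in α moves the cut-off further from the hypothesized mean, and β grows accordingly.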

Two Tailed Test & One Tailed Test

In the context of hypothesis testing these two terms are quite important and must be
clearly understood. A two-tailed test rejects the null hypothesis if, say, the sample mean
is significantly higher or lower than the hypothesized value of the mean of the population.
Such a test is appropriate when we have H0: µ = µH0 and Ha: µ ≠ µH0, which may mean µ > µH0 or
µ < µH0. If the significance level is 5% and a two-tailed test is to be applied, the probability of
the rejection area will be 0.05 (equally split between both tails of the curve as 0.025 each) and
that of the acceptance region will be 0.95. If we take µ = 100 and our sample mean deviates
significantly from µ, we shall reject the null hypothesis. But there are
situations when only a one-tailed test is considered appropriate. A one-tailed test would be
used when we are to test, say, whether the population mean is either lower than or higher
than some hypothesized value.
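
The difference between the two kinds of test shows up in the critical value of the test statistic. The following sketch assumes a test statistic that follows the standard normal distribution; the helper name critical_z is invented for this illustration.

```python
from statistics import NormalDist

def critical_z(alpha, two_tailed):
    """Critical value of a standard normal test statistic.
    A two-tailed test puts alpha/2 in each tail; a one-tailed test
    puts the whole of alpha in a single tail."""
    tail_area = alpha / 2 if two_tailed else alpha
    return NormalDist().inv_cdf(1 - tail_area)

print(f"two-tailed, 5% level: reject H0 if |z| > {critical_z(0.05, True):.3f}")
print(f"one-tailed, 5% level: reject H0 if z > {critical_z(0.05, False):.3f}")
```

At the 5% level this gives roughly 1.96 for the two-tailed test and 1.645 for the one-tailed test, the familiar values from the normal table.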

Procedure for Testing Hypothesis

To test a hypothesis means to tell (on the basis of the data the researcher has collected)
whether or not the hypothesis seems to be valid. In hypothesis testing the main question
is whether to accept the null hypothesis or not to accept it. The procedure for
hypothesis testing refers to all those steps that we undertake for making a choice between
the two actions, i.e., rejection and acceptance of a null hypothesis. The various steps
involved in hypothesis testing are stated below:

Making a Formal Statement

The first step consists in making a formal statement of the null hypothesis (H0) and also of the
alternative hypothesis (Ha). This means that the hypotheses should be clearly stated, considering
the nature of the research problem. For instance, Mr. Mohan of the Civil Engineering
Department wants to test the load bearing capacity of an old bridge, which must be more
than 10 tons; in that case he can state his hypotheses as under:

Null hypothesis H0: µ = 10 tons

Alternative hypothesis Ha: µ > 10 tons


Take another example. The average score in an aptitude test administered at the national
level is 80. To evaluate a state’s education system, 100 of the state’s students were
selected on a random basis, and their average score was 75. The state wants to know if there is a
significant difference between the local scores and the national scores. In such a
situation the hypotheses may be stated as under:

Null hypothesis H0: µ = 80

Alternative hypothesis Ha: µ ≠ 80
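
The aptitude-test example can be carried through to a decision once a population standard deviation is supplied. The text gives the national mean (80), the sample mean (75) and the sample size (100) but no standard deviation, so the value of 20 used below is purely a hypothetical assumption, as is the function name two_tailed_z_test.

```python
from math import sqrt
from statistics import NormalDist

def two_tailed_z_test(sample_mean, mu0, sigma, n):
    """z statistic and two-tailed probability for H0: mu = mu0,
    assuming the population standard deviation sigma is known."""
    z = (sample_mean - mu0) / (sigma / sqrt(n))
    p_value = 2 * NormalDist().cdf(-abs(z))
    return z, p_value

ASSUMED_SIGMA = 20.0  # hypothetical: not given in the example

z, p_value = two_tailed_z_test(sample_mean=75, mu0=80, sigma=ASSUMED_SIGMA, n=100)
print(f"z = {z:.2f}, two-tailed probability = {p_value:.4f}")
print("reject H0 at the 5% level" if p_value <= 0.05 else "accept H0 at the 5% level")
```

With this assumed standard deviation the state's mean would differ significantly from the national mean at the 5% level; a different standard deviation could change the conclusion.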

The formulation of hypotheses is an important step which must be accomplished with due
care in accordance with the object and nature of the problem under consideration. It also
indicates whether we should use a one-tailed test or a two-tailed test. If Ha is of the type
“greater than” (or of the type “less than”), we use a one-tailed test, but when Ha is of the
type “whether greater or smaller” then we use a two-tailed test.

Selecting a Significance Level

The hypothesis is tested at a pre-determined level of significance, which
should be specified in advance. Generally, in practice, either the 5% level or the 1% level
is adopted for the purpose. The factors that affect the level of significance are:

• The magnitude of the difference between sample means;
• The size of the sample;
• The variability of measurements within samples;
• Whether the hypothesis is directional or non-directional (a directional
hypothesis is one which predicts the direction of the difference between, say,
means).

In brief, the level of significance must be adequate in the context of the
purpose and nature of enquiry.

Deciding the Distribution to Use

After deciding the level of significance, the next step in hypothesis testing is to determine
the appropriate sampling distribution. The choice generally remains between the normal
distribution and the t distribution. The rules for selecting the correct distribution are similar
to those which we have stated earlier in the context of estimation.

Selecting a Random Sample & Computing an Appropriate Value

Another step is to select a random sample (or samples) and compute an appropriate value from the
sample data concerning the test statistic, utilizing the relevant distribution. In other words,
draw a sample to furnish empirical data.

Calculation of the Probability

One has then to calculate the probability that the sample result would diverge as widely
as it has from expectations, if the null hypothesis were in fact true.

Comparing the Probability

Yet another step consists in comparing the probability thus calculated with the specified
value for α, the significance level. If the calculated probability is equal to or smaller than the α
value in case of a one-tailed test (and α/2 in case of a two-tailed test), then reject the null
hypothesis (i.e., accept the alternative hypothesis), but if the probability is greater, then
accept the null hypothesis. In case we reject H0 we run a risk (at most equal to the level of
significance) of committing an error of type I, but if we accept H0, then we run some risk of
committing an error of type II.
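
The comparison step can be written as a small rule. This is only a sketch of the rule as worded here, in which the calculated probability is a single-tail probability compared with α for a one-tailed test and with α/2 for a two-tailed test. The function name decide and the probability 0.03 are hypothetical.

```python
def decide(tail_probability, alpha=0.05, two_tailed=False):
    """Apply the comparison rule: reject H0 when the calculated
    (single-tail) probability does not exceed alpha for a one-tailed
    test, or alpha/2 for a two-tailed test."""
    threshold = alpha / 2 if two_tailed else alpha
    return "reject H0" if tail_probability <= threshold else "accept H0"

# The same hypothetical tail probability can lead to different decisions.
print(decide(0.03))                   # one-tailed: 0.03 <= 0.05
print(decide(0.03, two_tailed=True))  # two-tailed: 0.03 > 0.025
```

The example shows why the choice between a one-tailed and a two-tailed test has to be made before the data are examined.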

Flow Diagram for Testing Hypothesis

[Flow diagram not reproduced. Its final branches are labelled “committing type I error” and
“committing type II error”.]

Testing of Hypothesis
Hypothesis testing determines the validity of the assumption (technically described as the
null hypothesis) with a view to choosing between the conflicting hypotheses about the
value of a population parameter. Hypothesis testing helps to decide, on the basis of sample
data, whether a hypothesis about the population is likely to be true or false. Statisticians
have developed several tests of hypothesis (also known as tests of significance), which can
be classified as:

• Parametric tests or standard tests of hypothesis;
• Non-parametric tests or distribution-free tests of hypothesis.

Parametric tests usually assume certain properties of the parent population from which
we draw samples. Assumptions such as that the observations come from a normal population,
that the sample size is large, and assumptions about population parameters like the mean,
variance, etc., must hold good before parametric tests can be used. But there are situations
when the researcher cannot or does not want to make such assumptions. In such situations we
use statistical methods for testing hypotheses which are called non-parametric tests, because
such tests do not depend on any assumption about the parameters of the parent population.
Besides, most non-parametric tests assume only nominal or ordinal data, whereas parametric
tests require measurement equivalent to at least an interval scale. As a result, non-parametric
tests need more observations than parametric tests to achieve the same size of Type I and
Type II errors.

Important Parametric Tests

The important parametric tests are:

• z-test
• t-test
• χ²-test
• F-test

All these tests are based on the assumption of normality i.e., the source of data is
considered to be normally distributed. In some cases the population may not be normally
distributed, yet the test will be applicable on account of the fact that we mostly deal with
samples and the sampling distributions closely approach normal distributions.

Z-test is based on the normal probability distribution and is used for judging the
significance of several statistical measures, particularly the mean. The relevant test
statistic is worked out and compared with its probable value (to be read from the table
showing area under the normal curve) at a specified level of significance for judging the
significance of the measure concerned. This is the most frequently used test in research
studies. This test is used even when the binomial distribution or t-distribution is applicable, on
the presumption that such a distribution tends to approximate the normal distribution as ‘n’
becomes larger. Z-test is generally used for comparing the mean of a sample to some
hypothesized mean for the population in case of a large sample, or when the population
variance is known. Z-test is also used for judging the significance of the difference between
the means of two independent samples in case of large samples, or when the population
variance is known. Z-test is also used for comparing the sample proportion to a theoretical value
of the population proportion, or for judging the difference in proportions of two independent
samples, when n happens to be large. Besides, this test may be used for judging the
significance of the median, mode, coefficient of correlation and several other measures.
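
One of the uses mentioned above, comparing a sample proportion with a theoretical population proportion, can be sketched as follows. The figures (60 successes in a sample of 100, tested against a proportion of 0.5) are hypothetical, the function name proportion_z_test is invented for this example, and the normal approximation is assumed to be adequate for a sample of this size.

```python
from math import sqrt
from statistics import NormalDist

def proportion_z_test(successes, n, p0):
    """z statistic and two-tailed probability for H0: p = p0,
    using the large-sample normal approximation."""
    p_hat = successes / n
    standard_error = sqrt(p0 * (1 - p0) / n)  # computed under H0
    z = (p_hat - p0) / standard_error
    p_value = 2 * NormalDist().cdf(-abs(z))
    return z, p_value

# Hypothetical data: 60 successes out of 100, tested against p0 = 0.5.
z, p_value = proportion_z_test(successes=60, n=100, p0=0.5)
print(f"z = {z:.2f}, two-tailed probability = {p_value:.4f}")
```

The standard error is computed from the hypothesized proportion rather than the sample proportion, because the probabilities are worked out on the assumption that H0 is true.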

T-test is based on the t-distribution and is considered an appropriate test for judging the
significance of a sample mean or for judging the significance of the difference between the
means of two samples in case of small samples when the population variance is not known (in
which case we use the variance of the sample as an estimate of the population variance). In case
two samples are related, we use the paired t-test (difference test) for judging the significance
of the mean of the differences between the two related samples. It can also be used for
judging the significance of the coefficients of simple and partial correlations. The relevant
test statistic, t, is calculated from the sample data and then compared with its probable
value based on the t-distribution at a specified level of significance for the relevant degrees of
freedom, for accepting or rejecting the null hypothesis. It may be noted that the t-test applies
only in case of small samples when the population variance is unknown.
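
A one-sample t statistic can be computed directly from a small sample. The five observations and the hypothesized mean of 12 below are hypothetical, as is the function name one_sample_t. The critical value 2.776 is the two-tailed 5% value for 4 degrees of freedom as read from a standard t table, since the Python standard library does not provide the t-distribution.

```python
from math import sqrt
from statistics import mean, stdev

def one_sample_t(sample, mu0):
    """t statistic and degrees of freedom for H0: mu = mu0, using the
    sample standard deviation as the estimate of the population value."""
    n = len(sample)
    t = (mean(sample) - mu0) / (stdev(sample) / sqrt(n))
    return t, n - 1

sample = [12, 14, 11, 13, 15]  # hypothetical small sample
t, df = one_sample_t(sample, mu0=12)

T_CRITICAL = 2.776  # two-tailed 5% value for 4 d.f., read from a t table
print(f"t = {t:.3f} with {df} degrees of freedom")
print("reject H0" if abs(t) > T_CRITICAL else "accept H0")
```

For other sample sizes the critical value has to be looked up again, because it depends on the degrees of freedom.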

χ²-test is based on the chi-square distribution and, as a parametric test, is used for
comparing a sample variance to a theoretical population variance.
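
For the comparison of a sample variance with a theoretical population variance, the usual test statistic is (n − 1)s²/σ0², which follows the chi-square distribution with n − 1 degrees of freedom when the population is normal. The sample and the theoretical variance of 2 in this sketch are hypothetical, and the function name chi_square_variance_stat is invented for the example.

```python
from statistics import variance

def chi_square_variance_stat(sample, sigma0_squared):
    """Chi-square statistic (n - 1) * s^2 / sigma0^2 and its degrees
    of freedom, for comparing a sample variance with a theoretical one."""
    df = len(sample) - 1
    return df * variance(sample) / sigma0_squared, df

sample = [12, 14, 11, 13, 15]  # hypothetical sample
chi_sq, df = chi_square_variance_stat(sample, sigma0_squared=2.0)
print(f"chi-square = {chi_sq:.2f} with {df} degrees of freedom")
```

The computed value is then compared with the tabulated chi-square value for the same degrees of freedom at the chosen level of significance.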

F-test is based on the F-distribution and is used to compare the variances of two
independent samples. This test is also used in the context of analysis of variance (ANOVA) for
judging the significance of more than two sample means at one and the same time. It is
also used for judging the significance of multiple correlation coefficients. The test statistic, F,
is calculated and compared with its probable value for accepting or rejecting H0.
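
The variance-ratio form of the F statistic can be sketched in a few lines. Both samples below are hypothetical, and the function name f_ratio is invented for the example. Placing the larger sample variance in the numerator is a common textbook convention assumed here, not something stated in the text.

```python
from statistics import variance

def f_ratio(sample_a, sample_b):
    """F statistic as the ratio of two sample variances, with the larger
    variance in the numerator (a common convention, assumed here).
    Returns (F, numerator d.f., denominator d.f.)."""
    var_a, var_b = variance(sample_a), variance(sample_b)
    if var_a >= var_b:
        return var_a / var_b, len(sample_a) - 1, len(sample_b) - 1
    return var_b / var_a, len(sample_b) - 1, len(sample_a) - 1

# Hypothetical samples, for illustration only.
sample_a = [12, 14, 11, 13, 15]
sample_b = [10, 10, 11, 12, 12]
f_value, df_num, df_den = f_ratio(sample_a, sample_b)
print(f"F = {f_value:.2f} with ({df_num}, {df_den}) degrees of freedom")
```

The computed F is compared with the tabulated F value for the two degrees of freedom at the chosen level of significance.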

Summary

A hypothesis is an assumption about relations between variables. It is a tentative
explanation of the research problem or a guess about the research outcome. Before
starting the research, the researcher has a rather general, diffused, even confused notion
of the problem. A hypothesis gives a definite point to the investigation, and it guides the
direction of the study. A hypothesis specifies the sources of data, which shall be studied,
and in what context they shall be studied. In the context of hypothesis testing, the terms
two-tailed test and one-tailed test are quite important and must be clearly understood. A
two-tailed test rejects the null hypothesis if, say, the sample mean is significantly higher or
lower than the hypothesized value of the mean of the population.

The hypothesis is tested at a pre-determined level of significance, which
should be specified in advance. Generally, in practice, either the 5% level or the 1% level is
adopted for the purpose. After deciding the level of significance, the next step in hypothesis
testing is to determine the appropriate sampling distribution. Hypothesis testing determines the
validity of the assumption (technically described as the null hypothesis) with a view to
choosing between the conflicting hypotheses about the value of a
population parameter. Z-test is based on the normal probability distribution and is used
for judging the significance of several statistical measures, particularly the mean. The
relevant test statistic is worked out and compared with its probable value (to be read from
the table showing area under the normal curve) at a specified level of significance for judging
the significance of the measure concerned. This is the most frequently used test in research
studies. T-test is based on the t-distribution and is considered an appropriate test for judging
the significance of a sample mean or for judging the significance of the difference between the
means of two samples in case of small samples when the population variance is not known (in
which case we use the variance of the sample as an estimate of the population variance).
χ²-test is based on the chi-square distribution and, as a parametric test, is used for comparing a
sample variance to a theoretical population variance. F-test is based on the
F-distribution and is used to compare the variances of two independent samples.

MB0034- Unit 4 Research Design


Unit 4 Research Design

Meaning of Research Design

The research designer understandably cannot hold all his decisions in his head. Even if he
could, he would have difficulty in understanding how these are inter-related. Therefore,
he records his decisions on paper or record disc by using relevant symbols or concepts.
Such a symbolic construction may be called the research design or model. A research
design is a logical and systematic plan prepared for directing a research study. It specifies
the objectives of the study, the methodology and techniques to be adopted for achieving
the objectives. It constitutes the blue print for the collection, measurement and analysis of
data. It is the plan, structure and strategy of investigation conceived so as to obtain
answers to research questions. The plan is the overall scheme or program of research. A
research design is the program that guides the investigator in the process of collecting,
analyzing and interpreting observations. It provides a systematic plan of procedure for the
researcher to follow. Selltiz, Jahoda, Deutsch and Cook state: “A research design is
the arrangement of conditions for collection and analysis of data in a manner that aims to
combine relevance to the research purpose with economy in procedure.”

Objectives:

After studying this lesson you should be able to understand:

• Needs of Research Design
• Characteristics of a Good Research Design
• Components of Research Design
• Experimental and Non-experimental Hypothesis Testing Research
• Different Research Designs
• Research Design for Studies in Commerce and Management
• Research Design in Case of Exploratory Research Studies
• Research Design in case of Descriptive and Diagnostic Research Studies
• Research Design in case of Hypothesis testing Research Studies
• Principles of Experimental Designs
• Important Experimental Designs
• Formal Experimental Designs

Needs of Research Design

The need for methodologically designed research arises for the following reasons:

a- In many a research inquiry, the researcher has no idea as to how accurate the
results of his study ought to be in order to be useful. Where such is the case, the
researcher has to determine how much inaccuracy may be tolerated. In quite a few
cases he may be in a position to know how much inaccuracy his method of research
will produce. In either case he should design his research if he wants to assure himself
of useful results.

b- In many research projects, the time consumed in trying to ascertain what the data
mean after they have been collected is much greater than the time taken to design a
research which yields data whose meaning is known as they are collected.

c- The idealized design is concerned with specifying the optimum research procedure
that could be followed were there no practical restrictions.

Characteristics of a Good Research Design

1. It is a series of guide posts to keep one going in the right direction.
2. It reduces wastage of time and cost.
3. It encourages co-ordination and effective organization.
4. It is a tentative plan which undergoes modifications, as circumstances demand,
when the study progresses, new aspects, new conditions and new relationships
come to light and insight into the study deepens.
5. It has to be geared to the availability of data and the cooperation of the
informants.
6. It has also to be kept within the manageable limits

Components of Research Design

It is important to be familiar with the important concepts relating to research design.
They are:

1. Dependent and Independent variables: A magnitude that varies is known as a
variable. The concept may assume different quantitative values, like height, weight,
income, etc. Qualitative variables are not quantifiable in the strictest sense of
objectivity. However, qualitative phenomena may also be quantified in terms of
the presence or absence of the attribute considered. Phenomena that assume different
values quantitatively, even in decimal points, are known as ‘continuous variables’. But
all variables need not be continuous. Variables that can be expressed only in integer
values are called ‘non-continuous variables’. In statistical terms, they are also known
as ‘discrete variables’. For example, age is a continuous variable, whereas the number
of children is a non-continuous variable. When changes in one variable depend upon
the changes in one or more other variables, it is known as a dependent or endogenous
variable, and the variables that cause the changes in the dependent variable are known
as the independent or explanatory or exogenous variables. For example, if demand
depends upon price, then demand is a dependent variable, while price is the
independent variable. And if more variables determine demand, like income and
the prices of substitute commodities, then demand also depends upon them in addition to
its own price. Then demand is a dependent variable which is determined by the
independent variables like own price, income and price of substitutes.

2. Extraneous variable: The independent variables which are not directly related
to the purpose of the study but affect the dependent variable are known as extraneous
variables. For instance, assume that a researcher wants to test the hypothesis that
there is a relationship between children’s school performance and their self-concepts, in
which case the latter is an independent variable and the former, the dependent
variable. In this context, intelligence may also influence school performance.
However, since it is not directly related to the purpose of the study undertaken by the
researcher, it would be known as an extraneous variable. The influence caused by the
extraneous variable on the dependent variable is technically called
‘experimental error’. Therefore, a research study should always be framed in such a
manner that the effect on the dependent variable is attributed entirely to the
independent variable, and not to any extraneous variable or variables.

3. Control: One of the most important features of a good research design is to
minimize the effect of extraneous variables. Technically, the term control is used when
a researcher designs the study in such a manner that it minimizes the effects of
extraneous independent variables. The term control is used in experimental research
to reflect the restraint in experimental conditions.

4. Confounded relationship: The relationship between dependent and independent
variables is said to be confounded by an extraneous variable when the dependent
variable is not free from its effects.

• Research hypothesis: When a prediction or a hypothesized relationship is tested
by adopting scientific methods, it is known as a research hypothesis. The research
hypothesis is a predictive statement which relates a dependent variable and an
independent variable. Generally, a research hypothesis must contain at least one
dependent variable and one independent variable. Relationships that are merely
assumed, and predictive statements that are not to be objectively verified, are
not classified as research hypotheses.
• Experimental and control groups: When a group is exposed to usual conditions
in experimental hypothesis-testing research, it is known as a ‘control group’. On
the other hand, when the group is exposed to a certain new or special condition, it is
known as an ‘experimental group’. In the student example discussed under
experimental hypothesis-testing research, Group A can be called a control group
and Group B an experimental one. If both Groups A and B are exposed to some
special feature, then both groups may be called ‘experimental groups’. A research
design may include only the experimental group or both the experimental and
control groups together.
• Treatments: Treatments refer to the different conditions to which the
experimental and control groups are subjected. In the student example, the
two treatments are the parents with regular earnings and those with no regular
earnings. Likewise, if a research study attempts to examine through an experiment
the comparative impact of three different types of fertilizer on the
yield of a rice crop, then the three types of fertilizer would be treated as the three
treatments.
• Experiment: An experiment refers to the process of verifying the truth of a
statistical hypothesis relating to a given research problem. For instance, an
experiment may be conducted to examine the yield of a newly developed variety of
rice crop. Further, experiments may be categorized into two types, namely,
absolute experiments and comparative experiments. If a researcher wishes to
determine the impact of a chemical fertilizer on the yield of a particular variety of
rice crop, then it is known as an absolute experiment. If the researcher
wishes to determine the impact of a chemical fertilizer as compared to the impact of
a bio-fertilizer, then the experiment is known as a comparative experiment.
• Experimental unit: Experimental units refer to the predetermined plots,
characteristics or blocks to which the different treatments are applied. It is
worth mentioning here that such experimental units must be selected with great
caution.

Experimental and Non-Experimental Hypothesis Testing Research


When the objective of a research study is to test a research hypothesis, it is known as
hypothesis-testing research. Such research may be of an experimental or a
non-experimental design. Research in which the independent variable is manipulated is
known as ‘experimental hypothesis-testing research’, whereas research in which the
independent variable is not manipulated is termed ‘non-experimental hypothesis-testing
research’. For example, assume that a researcher wants to examine whether family income
influences the school attendance of a group of students by calculating the coefficient of
correlation between the two variables. Such a study is non-experimental
hypothesis-testing research, because the independent variable, family income, is not
manipulated. Again, assume that the researcher randomly selects 150 students from a
group of students who pay their school fees regularly and then classifies them into two
sub-groups by randomly including 75 in Group A, whose parents have regular earnings,
and 75 in Group B, whose parents do not have regular earnings. At the end of the
study, the researcher conducts a test on each group in order to examine the effects of
the parents’ regular earnings on the school attendance of the students. Such a study is an
example of experimental hypothesis-testing research, because in this particular study the
independent variable, regular earnings of the parents, has been manipulated.
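
The coefficient of correlation mentioned in the non-experimental example can be illustrated with a minimal Python sketch. The function name `pearson_r` and the income and attendance figures are hypothetical and serve only to show the calculation; they are not data from any actual study.

```python
from math import sqrt


def pearson_r(x, y):
    """Return the Pearson coefficient of correlation between two equal-length lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sums of squares of deviations from the means
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical observations for six students (illustration only)
family_income = [12, 15, 18, 22, 25, 30]      # e.g. thousands per month
attendance_pct = [70, 72, 78, 80, 85, 90]     # percentage of school days attended

r = pearson_r(family_income, attendance_pct)
print("coefficient of correlation:", round(r, 3))
```

A value of r close to +1 or -1 indicates a strong linear association. Because family income is only observed and not manipulated, such a coefficient describes association and does not by itself establish causation.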

Different Research Designs

There are a number of crucial research choices, and various writers advance different
classification schemes, some of which are:

1. Experimental, historical and inferential designs (American Marketing
Association).
2. Exploratory, descriptive and causal designs (Selltiz, Jahoda, Deutsch and Cook).
3. Experimental and ex post facto (Kerlinger).
4. Historical method, and case and clinical studies (Goode and Scates).
5. Sample surveys, field studies, experiments in field settings, and laboratory
experiments (Festinger and Katz).
6. Exploratory, descriptive and experimental studies (Boyd and Westfall).
7. Exploratory, descriptive and causal (Green and Tull).
8. Experimental and ‘quasi-experimental’ designs (Nachmias and Nachmias).
9. True experimental, quasi-experimental and non-experimental designs (Smith).
10. Experimental, pre-experimental, quasi-experimental designs and survey research
(Kidder and Judd).

These different categorizations exist because ‘research design’ is a complex concept. In
fact, there are different perspectives from which any given study can be viewed. They
are:

1. The degree of formulation of the problem (the study may be exploratory or
formalized)
2. The topical scope, that is, the breadth and depth of the study (a case or a statistical study)
3. The research environment: field setting or laboratory (survey, laboratory
experiment)
4. The time dimension (one-time or longitudinal)
5. The mode of data collection (observational or survey)
6. The manipulation of the variables under study (experimental or ex post facto)
7. The nature of the relationship among variables (descriptive or causal)

Research Design for Studies in Commerce and Management

The various research designs are:

Research design in case of exploratory research studies

Exploratory research studies are also termed formulative research studies. The main
purpose of such studies is that of formulating a problem for more precise investigation
or of developing a working hypothesis from an operational point of view. The major
emphasis in such studies is on the discovery of ideas and insights. As such, the research
design appropriate for such studies must be flexible enough to provide opportunity for
considering different aspects of the problem under study. Inbuilt flexibility in research
design is needed because the research problem, broadly defined initially, is transformed
into one with more precise meaning in exploratory studies, which may necessitate
changes in the research procedure for gathering relevant data. Generally, the following
three methods are discussed in the context of research design for such studies:

1. The survey of the concerning literature happens to be the simplest and most fruitful
method of formulating the research problem precisely or developing hypotheses.
Hypotheses stated by earlier workers may be reviewed and their usefulness
evaluated as a basis for further research. It may also be considered whether the
already stated hypotheses suggest new hypotheses. In this way the researcher
should review and build upon the work already done by others, but in cases where
hypotheses have not yet been formulated, his task is to review the available
material to derive the relevant hypotheses from it. Besides, a bibliographical
survey of studies already made in one’s area of interest may as well be made by
the researcher for precisely formulating the problem. He should also make an
attempt to apply concepts and theories developed in different research contexts to
the area in which he is himself working. Sometimes the works of creative writers
also provide fertile ground for hypothesis formulation and as such may be looked
into by the researcher.
2. Experience survey means the survey of people who have had practical experience
with the problem to be studied. The object of such a survey is to obtain insight
into the relationships between variables and new ideas relating to the research
problem. For such a survey, people who are competent and can contribute new
ideas may be carefully selected as respondents to ensure a representation of
different types of experience. The respondents so selected may then be
interviewed by the investigator. The researcher must prepare an interview
schedule for the systematic questioning of informants. But the interview must
ensure flexibility in the sense that the respondents should be allowed to raise
issues and questions which the investigator has not previously considered.
Generally, the experience-collecting interview is likely to be long and may last
for a few hours. Hence, it is often considered desirable to send a copy of the
questions to be discussed to the respondents well in advance. This will also give
an opportunity to the respondents for doing some advance thinking over the
various issues involved so that, at the time of interview, they may be able to
contribute effectively. Thus, an experience survey may enable the researcher to
define the problem more concisely and help in the formulation of the research
hypothesis. This survey may as well provide information about the practical
possibilities for doing different types of research.
3. Analyses of ‘insight-stimulating’ examples are also a fruitful method for
suggesting hypothesis for research. It is particularly suitable in areas where there
is little experience to serve as a guide. This method consists of the intensive study
of selected instances of the phenomenon in which one is interested. For this
purpose the existing records, if any, may be examined, unstructured
interviewing may take place, or some other approach may be adopted. The attitude of
the investigator, the intensity of the study and the ability of the researcher to draw
together diverse information into a unified interpretation are the main features
which make this method an appropriate procedure for evoking insights. Now,
what sorts of examples are to be selected and studied? There is no clear cut
answer to it. Experience indicates that for particular problems certain types of
instances are more appropriate than others. One can mention a few examples of
‘insight-stimulating’ cases such as the reactions of strangers, the reactions of
marginal individuals, the study of individuals who are in transition from one stage
to another, the reactions of individuals from different social strata and the like. In
general, cases that provide sharp contrasts or have striking features are considered
relatively more useful while adopting this method of hypothesis formulation.
Thus, in an exploratory or formulative research study which merely leads to
insights or hypothesis, whatever method or research design outlined above is
adopted, the only thing essential is that it must continue to remain flexible so that
many different facets of a problem may be considered as and when they arise and
come to the notice of the researcher.

Research design in case of descriptive and diagnostic research studies

Descriptive research studies are those studies which are concerned with describing the
characteristics of a particular individual, or of a group, whereas diagnostic research
studies determine the frequency with which something occurs or its association with
something else. Studies concerning whether certain variables are associated are
examples of diagnostic research studies. As against this, studies concerned with specific
predictions, or with narration of facts and characteristics concerning an individual, group
or situation, are all examples of descriptive research studies. Most social research
comes under this category. From the point of view of the research design, the descriptive
as well as diagnostic studies share common requirements and as such we may group
together these two types of research studies. In descriptive as well as in diagnostic
studies, the researcher must be able to define clearly, what he wants to measure and must
find adequate methods for measuring it along with a clear cut definition of population he
wants to study. Since the aim is to obtain complete and accurate information in the said
studies, the procedure to be used must be carefully planned. The research design must
make enough provision for protection against bias and must maximize reliability. With
due concern for the economical completion of the research study, the design in such
studies must be rigid and not flexible and must focus attention on the following:

1. Formulating the objective of the study
2. Designing the methods of data collection
3. Selecting the sample
4. Collecting the data
5. Processing and analyzing the data
6. Reporting the findings.

In a descriptive / diagnostic study the first step is to specify the objectives with sufficient
precision to ensure that the data collected are relevant. If this is not done carefully, the
study may not provide the desired information. Then comes the question of selecting the
methods by which the data are to be obtained. While designing data-collection procedure,
adequate safeguards against bias and unreliability must be ensured. Whichever method is
selected, questions must be well examined and be made unambiguous; interviewers must
be instructed not to express their own opinion; observers must be trained so that they
uniformly record a given item of behaviour.

More often than not, a sample has to be designed. Usually, one or more forms of
probability sampling, often described as random sampling, are used. To obtain
data free from errors introduced by those responsible for collecting them, it is necessary
to supervise closely the staff of field workers as they collect and record information.
Checks may be set up to ensure that the data-collecting staff perform their duty honestly
and without prejudice. The data collected must be processed and analyzed. This includes
steps like coding the interview replies, observations, etc., tabulating the data, and
performing several statistical computations.

Last of all comes the question of reporting the findings. This is the task of
communicating the findings to others and the researcher must do it in an efficient
manner.

Research Design in case of Hypothesis-Testing Research Studies

Hypothesis-testing research studies (generally known as experimental studies) are those
where the researcher tests hypotheses of causal relationships between variables. Such
studies require procedures that will not only reduce bias and increase reliability, but will
permit drawing inferences about causality. Usually, experiments meet these requirements.
Hence, when we talk of research design in such studies, we often mean the design of
experiments.

Principles of Experimental Designs

Professor Fisher has enumerated three principles of experimental designs:

1. The principle of replication: The experiment should be repeated more than once.
Thus, each treatment is applied in many experimental units instead of one. By doing
so, the statistical accuracy of the experiments is increased. For example, suppose we
are to examine the effect of two varieties of rice. For this purpose we may divide the
field into two parts and grow one variety in one part and the other variety in the other
part. We can compare the yield of the two parts and draw conclusion on that basis.
But if we are to apply the principle of replication to this experiment, then we first
divide the field into several parts, grow one variety in half of these parts and the other
variety in the remaining parts. We can collect the data yield of the two varieties and
draw conclusion by comparing the same. The result so obtained will be more reliable
in comparison to the conclusion we draw without applying the principle of
replication. The entire experiment can even be repeated several times for better
results. Conceptually, replication does not present any difficulty, but computationally
it does. However, it should be remembered that replication is introduced in order to
increase the precision of a study; that is to say, to increase the accuracy with which
the main effects and interactions can be estimated.

2. The principle of randomization: It provides protection, when we conduct an
experiment, against the effect of extraneous factors by randomization. In other words,
this principle indicates that we should design or plan the experiment in such a way
that the variations caused by extraneous factors can all be combined under the general
heading of “chance”. For instance, if we grow one variety of rice, say, in the first half
of the parts of a field and the other variety is grown in the other half, then it is just
possible that the soil fertility may be different in the first half in comparison to the
other half. If this is so, our results would not be realistic. In such a situation, we may
assign the variety of rice to be grown in different parts of the field on the basis of
some random sampling technique i.e., we may apply randomization principle and
protect ourselves against the effects of extraneous factors. As such, through the
application of the principle of randomization, we can have a better estimate of the
experimental error.

3. The principle of local control: It is another important principle of experimental
designs. Under it the extraneous factor, the known source of variability, is made to
vary deliberately over as wide a range as necessary, and this needs to be done in such
a way that the variability it causes can be measured and hence eliminated from the
experimental error. This means that we should plan the experiment in a manner that
we can perform a two-way analysis of variance, in which the total variability of the
data is divided into three components attributed to treatments, the extraneous factor
and experimental error. In other words, according to the principle of local control, we
first divide the field into several homogeneous parts, known as blocks, and then each
such block is divided into parts equal to the number of treatments. Then the
treatments are randomly assigned to these parts of a block. In general, blocks are the
levels at which we hold an extraneous factor fixed, so that we can measure its
contribution to the variability of the data by means of a two-way analysis of variance.
In brief, through the principle of local control we can eliminate the variability due to
extraneous factors from the experimental error.
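
The split of total variability into treatments, the extraneous factor (blocks) and experimental error, described under the principle of local control, can be sketched in Python as follows. The function name `two_way_sums_of_squares` and the yield figures are hypothetical, and the sketch assumes exactly one observation for each block-treatment combination.

```python
def two_way_sums_of_squares(table):
    """Split total variability for a blocks-by-treatments table.

    table[i][j] is the single observation for block i under treatment j.
    """
    n_blocks = len(table)
    n_treat = len(table[0])
    grand_mean = sum(sum(row) for row in table) / (n_blocks * n_treat)
    block_means = [sum(row) / n_treat for row in table]
    treat_means = [
        sum(table[i][j] for i in range(n_blocks)) / n_blocks
        for j in range(n_treat)
    ]
    ss_total = sum((y - grand_mean) ** 2 for row in table for y in row)
    ss_blocks = n_treat * sum((m - grand_mean) ** 2 for m in block_means)
    ss_treat = n_blocks * sum((m - grand_mean) ** 2 for m in treat_means)
    # What remains after removing block and treatment variability is the error
    ss_error = ss_total - ss_blocks - ss_treat
    return {
        "total": ss_total,
        "blocks": ss_blocks,
        "treatments": ss_treat,
        "error": ss_error,
    }


# Hypothetical rice yields: 4 blocks (rows) x 3 treatments (columns)
yields = [
    [20, 24, 28],
    [22, 25, 31],
    [18, 23, 27],
    [21, 26, 29],
]
parts = two_way_sums_of_squares(yields)
for name, value in parts.items():
    print(name, round(value, 2))
```

Because the block component is measured separately, it no longer inflates the error term, which is the sense in which local control "eliminates" the variability due to the extraneous factor.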

Important Experimental Designs

Experimental design refers to the framework or structure of an experiment and as such
there are several experimental designs. We can classify experimental designs into two
broad categories, viz., informal experimental designs and formal experimental designs.
Informal experimental designs are those designs that normally use a less sophisticated
form of analysis based on differences in magnitudes, whereas formal experimental
designs offer relatively more control and use precise statistical procedures for analysis.

Informal experimental designs:

• Before and after without control design: In such a design, a single test group or area
is selected and the dependent variable is measured before the introduction of the
treatment. The treatment is then introduced and the dependent variable is
measured again after the treatment has been introduced. The effect of the
treatment would be equal to the level of the phenomenon after the treatment
minus the level of the phenomenon before the treatment.
• After only with control design: In this design, two groups or areas (test and
control area) are selected and the treatment is introduced into the test area only.
The dependent variable is then measured in both the areas at the same time.
Treatment impact is assessed by subtracting the value of the dependent variable in
the control area from its value in the test area.
• Before and after with control design: In this design two areas are selected and the
dependent variable is measured in both the areas for an identical time-period
before the treatment. The treatment is then introduced into the test area only, and
the dependent variable is measured in both for an identical time-period after the
introduction of the treatment. The treatment effect is determined by subtracting
the change in the dependent variable in the control area from the change in the
dependent variable in test area.
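
The three treatment-effect calculations above can be written out as a short Python sketch. The function names and the sales figures are hypothetical and are used only to make the arithmetic concrete.

```python
def before_after_without_control(before, after):
    """Effect = level after the treatment minus level before the treatment."""
    return after - before


def after_only_with_control(test_after, control_after):
    """Effect = value in the test area minus value in the control area."""
    return test_after - control_after


def before_after_with_control(test_before, test_after, control_before, control_after):
    """Effect = change in the test area minus change in the control area."""
    return (test_after - test_before) - (control_after - control_before)


# Hypothetical weekly sales (units) before and after a promotion
test_before, test_after = 100, 130
control_before, control_after = 100, 110

print(before_after_without_control(test_before, test_after))
print(after_only_with_control(test_after, control_after))
print(before_after_with_control(test_before, test_after, control_before, control_after))
```

With these figures the first design attributes the whole rise of 30 units to the treatment, while the two designs with a control area both give 20, because the 10-unit rise in the control area is treated as change that would have occurred anyway.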

Formal Experimental Designs

1. Completely randomized design (CR design): It involves only two principles, viz.,
the principle of replication and the principle of randomization. It is generally used
when experimental areas happen to be homogeneous. Technically, when all the
variations due to uncontrolled extraneous factors are included under the heading
of chance variation, we refer to the design of the experiment as a CR design.
2. Randomized block design (RB design): It is an improvement over the CR
design. In the RB design the principle of local control can be applied
along with the other two principles.
3. Latin square design (LS design): It is used in agricultural research. The
treatments in an LS design are so allocated among the plots that no treatment
occurs more than once in any row or column.
4. Factorial design: It is used in experiments where the effects of varying more than
one factor are to be determined. Such designs are especially important in several
economic and social phenomena where usually a large number of factors affect a
particular problem.
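
The way treatments are allocated to plots under the CR, RB and LS designs can be sketched in Python. All function names and treatment labels below are hypothetical. The Latin square is built from a simple cyclic pattern whose rows and columns are then shuffled; this is one convenient construction and is not claimed to pick uniformly among all possible Latin squares.

```python
import random


def completely_randomized_layout(treatments, replications, rng):
    """CR design: every treatment is replicated and spread over all plots at random."""
    plots = [t for t in treatments for _ in range(replications)]
    rng.shuffle(plots)
    return plots


def randomized_block_layout(treatments, n_blocks, rng):
    """RB design: each block gets every treatment once, in its own random order."""
    layout = []
    for _ in range(n_blocks):
        order = list(treatments)
        rng.shuffle(order)
        layout.append(order)
    return layout


def latin_square(treatments, rng):
    """LS design: cyclic square with rows and columns shuffled."""
    n = len(treatments)
    square = [[treatments[(r + c) % n] for c in range(n)] for r in range(n)]
    rng.shuffle(square)                      # permute the rows
    col_order = list(range(n))
    rng.shuffle(col_order)                   # permute the columns
    return [[row[c] for c in col_order] for row in square]


def is_latin_square(square):
    """Check that no treatment occurs more than once in any row or column."""
    n = len(square)
    symbols = set(square[0])
    rows_ok = all(len(row) == n and set(row) == symbols for row in square)
    cols_ok = all({square[r][c] for r in range(n)} == symbols for c in range(n))
    return len(symbols) == n and rows_ok and cols_ok


rng = random.Random(0)  # fixed seed so the illustration is repeatable
print(completely_randomized_layout(["A", "B"], 3, rng))
for block in randomized_block_layout(["A", "B", "C"], 4, rng):
    print(block)
for row in latin_square(["A", "B", "C", "D"], rng):
    print(row)
```

Shuffling whole rows and whole columns keeps every treatment exactly once in each row and column, so the result still satisfies the defining property of the LS design.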

Summary

A research design is a logical and systematic plan prepared for directing a research study.
In many research projects, the time consumed in trying to ascertain what the data mean
after they have been collected is much greater than the time taken to design a research
which yields data whose meaning is known as they are collected. Research design is a
series of guide posts to keep one going in the right direction. It is a tentative plan which
undergoes modifications, as circumstances demand, when the study progresses, new
aspects, new conditions and new relationships come to light and insight into the study
deepens. Exploratory research studies are also termed as formulative research studies.
The main purpose of such studies is that of formulating a problem for more precise
investigation or of developing the working hypothesis from an operational point of view.
Descriptive research studies are those studies which are concerned with describing the
characteristics of a particular individual, or of a group, whereas diagnostic research
studies determine the frequency with which something occurs or its association with
something else.

Copyright © 2009 SMU

Powered by Sikkim Manipal University

MB0034- Unit 5-Case Study Method


Unit 5-Case Study Method

Meaning of Case Study


Case study is a method of exploring and analyzing the life of a social unit or entity, be it a
person, a family, an institution or a community. The aim of case study method is to locate
or identify the factors that account for the behaviour patterns of a given unit, and its
relationship with the environment. The case data are always gathered with a view to
tracing the natural history of the social unit, and its relationship with the social factors
and forces operative and involved in its surrounding milieu. In short, the social
researcher tries, by means of the case study method, to understand the complex of factors
that are working within a social unit as an integrated totality. Looked at from another
angle, the case study serves the purpose similar to the clue-providing function of expert
opinion. It is most appropriate when one is trying to find clues and ideas for further
research.

The major credit for introducing the case study method into social investigation goes to
Frédéric Le Play. Herbert Spencer was the first social philosopher who used case study
in comparative studies of different cultures. William Healy used case study in his study
of juvenile delinquency. Anthropologists and ethnologists have liberally utilized case
study in the systematic description of primitive cultures. Historians have used this
method for portraying some historical character or particular historical period and
describing the developments within a national community.

Objectives:

After studying this lesson you should be able to understand:

Assumptions of Case Study Method

Advantages of Case Study Method

Disadvantages of Case Study Method

Making Case Study Effective

Case Study as a Method of Business Research

Assumptions of Case Study Method

• Case study would depend upon wit, commonsense and imagination of the person
doing the case study. The investigator makes up his procedure as he goes along.
• If the life history has been written in the first person, it must be as complete and
coherent as possible.
• Life histories should have been written for knowledgeable persons.
• It is advisable to supplement case data by observational, statistical and historical
data since these provide standards for assessing the reliability and consistency of
the case material.
• Efforts should be made to ascertain the reliability of life history data through
examining the internal consistency of the material.
• A judicious combination of techniques of data collection is a prerequisite for
securing data that are culturally meaningful and scientifically significant.

Advantages of Case Study Method

Case study is of particular value when a complex set of variables may be at work in
generating observed results and intensive study is needed to unravel the complexities. For
example, an in-depth study of a firm’s top salespeople and a comparison with its worst
salespeople might reveal characteristics common to stellar performers. Here again, the
exploratory investigation is best served by an active curiosity and willingness to deviate
from the initial plan when findings suggest that new courses of inquiry might prove more
productive. It is easy to see how the exploratory research objectives of generating insights
and hypotheses would be well served by use of this technique.

Disadvantages of Case Study Method

Blumer points out that, taken independently, the case documents hardly fulfil the criteria
of reliability, adequacy and representativeness, but to exclude them from any scientific
study of human life would be a blunder inasmuch as these documents are necessary and
significant both for theory building and practice.

Making Case Study Effective

Let us discuss the criteria for evaluating the adequacy of the case history or life history
which is of central importance for case study. John Dollard has proposed seven criteria
for evaluating such adequacy as follows:

i) The subject must be viewed as a specimen in a cultural series. That is, the case drawn
out from its total context for the purposes of study must be considered a member of the
particular cultural group or community. The scrutiny of the life histories of persons must
be done with a view to identifying the community values, standards and shared way of
life.

ii) The organic motors of action must be socially relevant. That is, the actions of the
individual cases must be viewed as a series of reactions to social stimuli or situations. In
other words, the social meaning of behaviour must be taken into consideration.

iii) The strategic role of the family group in transmitting the culture must be recognized.
That is, in case of an individual being the member of a family, the role of family in
shaping his behaviour must never be overlooked.

iv) The specific method of elaboration of organic material onto social behaviour must be
clearly shown. That is, case histories that portray in detail how a basically biological
organism, the man, gradually blossoms forth into a social person are especially fruitful.

v) The continuous related character of experience from childhood through adulthood must
be stressed. In other words, the life history must be a configuration depicting the
inter-relationships between the person’s various experiences.

vi) Social situation must be carefully and continuously specified as a factor. One of the
important criteria for the life history is that a person’s life must be shown as unfolding
itself in the context of and partly owing to specific social situations.

vii) The life history material itself must be organised according to some conceptual
framework; this in turn would facilitate generalizations at a higher level.

Case Study as a Method of Business Research

In-depth analysis of selected cases is of particular value to business research when a
complex set of variables may be at work in generating observed results and intensive
study is needed to unravel the complexities. For instance, an in-depth study of a firm’s
top salespeople and a comparison with its worst salespeople might reveal characteristics
common to stellar performers. The exploratory investigator is best served by an active
curiosity and a willingness to deviate from the initial plan when the findings suggest that
new courses of enquiry might prove more productive.

Summary

Case study is a method of exploring and analyzing the life of a social unit or entity, be it a
person, a family, an institution or a community. Case study would depend upon wit,
commonsense and imagination of the person doing the case study. The investigator
makes up his procedure as he goes along. Efforts should be made to ascertain the
reliability of life history data through examining the internal consistency of the material.
A judicious combination of techniques of data collection is a prerequisite for securing
data that are culturally meaningful and scientifically significant. Case study is of particular
value when a complex set of variables may be at work in generating observed results and
intensive study is needed to unravel the complexities. The case documents hardly fulfil
the criteria of reliability, adequacy and representativeness, but to exclude them from any
scientific study of human life would be a blunder inasmuch as these documents are
necessary and significant both for theory building and practice. In-depth analysis of
selected cases is of particular value to business research when a complex set of variables
may be at work in generating observed results and intensive study is needed to unravel
the complexities.

MB0034- Unit 6-Sampling


Unit 6-Sampling

Meaning of Sampling

A part of the population is known as a sample. The method of selecting a portion of the
‘universe’ for study, with a view to drawing conclusions about the ‘universe’ or
‘population’, is known as sampling. A statistical sample ideally purports to be a miniature
model or replica of the collectivity or the population constituted of all the items that the
study should principally encompass, that is, the items which potentially hold promise of
affording information relevant to the purpose of a given research.

Sampling helps in saving time and cost. It also helps in checking the accuracy of the
results. On the other hand, it demands the exercise of great care and caution; otherwise
the results obtained may be incorrect or misleading.

Objectives

After studying this lesson you should be able to understand:

• Advantages of sampling
• Sampling procedure
• Characteristics of good sample
• Methods of Sampling
• Probability or Random Sampling
• Non-probability or Non Random Sampling

Advantage of Sample Survey

Sampling has the following advantages:

• The size of the population: If the population to be studied is quite large,
sampling is warranted. However, the size is a relative matter. Whether a
population is large or small depends upon the nature of the study, the purpose for
which it is undertaken, and the time and other resources available for it.
• Amount of funds budgeted for the study: Sampling is opted for when the amount
of money budgeted is smaller than the anticipated cost of a census survey.
• Facilities: The extent of facilities available – staff, access to computer facility and
accessibility to population elements – is another factor to be considered in
deciding to sample or not. When the availability of these facilities is limited,
sampling is preferable.
• Time: The time limit within which the study should be completed is another
important factor to be considered in deciding the question of sample survey. This,
in fact, is a primary reason for using sampling by academic and marketing researchers.

Sampling Procedure

The decision process of sampling is a complicated one. The researcher has to first identify
the limiting factor or factors and must judiciously balance the conflicting factors. The
various criteria governing the choice of the sampling technique are:

1. Purpose of the Survey: What does the researcher aim at? If he intends to
generalize the findings based on the sample survey to the population, then an
appropriate probability sampling method must be selected. The choice of a
particular type of probability sampling depends on the geographical area of the
survey and the size and the nature of the population under study.
2. Measurability: The application of statistical inference theory requires
computation of the sampling error from the sample itself. Only probability
samples allow such computation. Hence, where the research objective requires
statistical inference, the sample should be drawn by applying simple random
sampling method or stratified random sampling method, depending on whether
the population is homogenous or heterogeneous.
3. Degree of Precision: Should the results of the survey be very precise, or could
even rough results serve the purpose? The desired level of precision is one of the
criteria of sampling method selection. Where a high degree of precision of results
is desired, probability sampling should be used. Where even crude results would
serve the purpose (e.g., marketing surveys, readership surveys, etc.) any
convenient non-random sampling like quota sampling would be enough.
4. Information about Population: How much information is available about the
population to be studied? Where no list of population and no information about its
nature are available, it is difficult to apply a probability sampling method. Then
exploratory study with non-probability sampling may be made to gain a better
idea of population. After gaining sufficient knowledge about the population
through the exploratory study, appropriate probability sampling design may be
adopted.
5. The Nature of the Population: In terms of the variables to be studied, is the
population homogenous or heterogeneous? In the case of a homogenous
population, even a simple random sampling will give a representative sample. If
the population is heterogeneous, stratified random sampling is appropriate.
6. Geographical Area of the Study and the Size of the Population: If the area
covered by a survey is very large and the size of the population is quite large,
multi-stage cluster sampling would be appropriate. But if the area and the size of
the population are small, single stage probability sampling methods could be
used.
7. Financial resources: If the available finance is limited, it may become necessary
to choose a less costly sampling plan like multistage cluster sampling or even
quota sampling as a compromise. However, if the objectives of the study and the
desired level of precision cannot be attained within the stipulated budget, there is
no alternative but to give up the proposed survey. Where finance is not a
constraint, a researcher can choose the most appropriate method of sampling that
fits the research objective and the nature of the population.
8. Time Limitation: The time limit within which the research project should be
completed restricts the choice of a sampling method. Then, as a compromise, it
may become necessary to choose less time consuming methods like simple
random sampling instead of stratified sampling/sampling with probability
proportional to size; multi-stage cluster sampling instead of single-stage sampling
of elements. Of course, the precision has to be sacrificed to some extent.
9. Economy: It should be another criterion in choosing the sampling method. It
means achieving the desired level of precision at minimum cost. A sample is
economical if the precision per unit cost is high or the cost per unit of variance is
low.

The above criteria frequently conflict and the researcher must balance and blend them to
obtain a good sampling plan. The chosen plan thus represents an adaptation of
the sampling theory to the available facilities and resources. That is, it represents a
compromise between idealism and feasibility. One should use simple workable methods
instead of unduly elaborate and complicated techniques.

Characteristics of a Good Sample

The characteristics of a good sample are described below:

• Representativeness: A sample must be representative of the population.
Probability sampling techniques yield representative samples.
• Accuracy: Accuracy is defined as the degree to which bias is absent from the
sample. An accurate sample is one which exactly represents the population.
• Precision: The sample must yield precise estimates. Precision is measured by the
standard error.
• Size: A good sample must be adequate in size in order to be reliable.
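
The link between precision and sample size can be illustrated with a small sketch. It
assumes the usual estimate of the standard error of a sample mean (sample standard
deviation divided by the square root of the sample size) and ignores any finite
population correction; the readings are hypothetical.

```python
import math
import statistics

def standard_error_of_mean(values):
    """Estimated standard error of the sample mean: s / sqrt(n)."""
    n = len(values)
    s = statistics.stdev(values)   # sample standard deviation (n - 1 divisor)
    return s / math.sqrt(n)

small = [4, 8, 6, 5, 7]            # hypothetical sample of 5 readings
large = small * 4                  # same spread, four times as many readings

se_small = standard_error_of_mean(small)
se_large = standard_error_of_mean(large)
print(se_small, se_large)          # the larger sample gives the smaller standard error
```

A smaller standard error means a more precise estimate, which is why an adequate sample
size is listed among the characteristics of a good sample.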

Methods of Sampling

Sampling techniques or methods may be classified into two generic types:

Probability or Random Sampling

Probability sampling is based on the theory of probability. It is also known as random
sampling. It provides a known nonzero chance of selection for each population element.
It is used when generalization is the objective of study, and a greater degree of accuracy
of estimation of population parameters is required. The cost and time required are high;
hence the benefit derived from it should justify the costs.

The following are the types of probability sampling:

i. Simple Random Sampling: This sampling technique gives each element an
equal and independent chance of being selected. An equal chance means equal
probability of selection. An independent chance means that the draw of one element
will not affect the chances of other elements being selected. The procedure of
drawing a simple random sample consists of enumeration of all elements in the
population:

1. Preparation of a list of all elements, giving them numbers in serial order
1, 2, 3, and so on, and
2. Drawing sample numbers by using (a) lottery method, (b) a table of
random numbers or (c) a computer.
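
The two steps above, using the computer option, might be sketched as follows. This is a
minimal illustration, not a prescribed procedure: the function name, the list of
students and the seed are hypothetical, and the draw is made without replacement.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Draw n distinct elements, each with an equal chance of selection."""
    rng = random.Random(seed)      # the seed is only there for reproducibility
    # Step 1: list the elements and number them serially 1, 2, 3, ...
    frame = dict(enumerate(population, start=1))
    # Step 2: draw n serial numbers without replacement
    chosen_numbers = rng.sample(sorted(frame), n)
    return [frame[number] for number in chosen_numbers]

students = [f"student_{i:03d}" for i in range(1, 101)]   # hypothetical frame of 100
sample = simple_random_sample(students, 10, seed=42)
print(sample)
```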

Suitability: This type of sampling is suited for a small homogeneous population.

Advantages: It is one of the easiest methods, all the elements in the population have
an equal chance of being selected, it is simple to understand, and it does not require
prior knowledge of the true composition of the population.

Disadvantages: It is often impractical because of non-availability of a population list
or difficulty in enumerating the population, it does not ensure proportionate
representation, and it may be expensive in time and money. The amount of sampling
error associated with any sample drawn can easily be computed, but it is greater
than that in other probability samples of the same size, because simple random
sampling is less precise than other methods.

ii. Stratified Random Sampling: This is an improved type of random or
probability sampling. In this method, the population is sub-divided into
homogenous groups or strata, and from each stratum, a random sample is drawn. E.g.,
university students may be divided on the basis of discipline, and each discipline
group may again be divided into juniors and seniors. Stratification is necessary for
increasing a sample’s statistical efficiency, providing adequate data for analyzing
the various sub-populations and applying different methods to different strata.
Stratified random sampling is appropriate for a large heterogeneous population.
The stratification process involves three major decisions: the stratification base or
bases, the number of strata and the strata sample sizes.

Stratified random sampling may be classified into:

a) Proportionate stratified sampling: This sampling involves drawing a
sample from each stratum in proportion to the latter’s share in the total
population. It gives proper representation to each stratum and its statistical
efficiency is generally higher. This method is therefore very popular. E.g., suppose
the Management Faculty of a University consists of the following specialization
groups:

Specialization stream    No. of students    Proportion of each stream
Production                      40                    0.4
Finance                         20                    0.2
Marketing                       30                    0.3
Rural development               10                    0.1
Total                          100                    1.0

The researcher wants to draw an overall sample of 30. Then the strata sample
sizes would be:

Strata                   Calculation        Sample size
Production               30 x 0.4               12
Finance                  30 x 0.2                6
Marketing                30 x 0.3                9
Rural development        30 x 0.1                3
Total                                           30
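
The arithmetic of the Management Faculty example can be reproduced with a short sketch.
The function name is hypothetical, and the rounding step is an assumption: with other
figures the rounded stratum sizes may not add up exactly to the overall sample size and
would need a manual adjustment.

```python
def proportionate_allocation(strata_sizes, total_sample):
    """Allocate the sample to strata in proportion to their share of the population."""
    population = sum(strata_sizes.values())
    return {
        name: round(total_sample * size / population)
        for name, size in strata_sizes.items()
    }

faculty = {"Production": 40, "Finance": 20, "Marketing": 30, "Rural development": 10}
allocation = proportionate_allocation(faculty, 30)
print(allocation)
# {'Production': 12, 'Finance': 6, 'Marketing': 9, 'Rural development': 3}
```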

Advantages: Stratified random sampling enhances the representativeness of
the sample, gives higher statistical efficiency, is easy to carry out, and gives a
self-weighting sample.

Disadvantages: It requires prior knowledge of the composition and the
distribution of the population, it is very expensive in time and money, and
identification of the strata may lead to classification errors.

b) Disproportionate stratified random sampling: This method does not
give proportionate representation to strata. It necessarily involves giving
over-representation to some strata and under-representation to others. The desirability
of disproportionate sampling is usually determined by three factors, viz., (a) the
sizes of strata, (b) internal variances among strata, and (c) sampling costs.

Suitability: This method is used when the population contains some small
but important subgroups, when certain groups are quite heterogeneous, while
others are homogeneous and when it is expected that there will be appreciable
differences in the response rates of the subgroups in the population.

Advantages: It is less time consuming and facilitates giving appropriate
weighting to particular groups which are small but more important.

Disadvantages: It does not give each stratum proportionate representation,
requires prior knowledge of the composition of the population, is subject to
classification errors, and its practical feasibility is doubtful.
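
One way in which stratum sizes and internal variances can jointly decide a
disproportionate allocation is the rule usually called Neyman allocation, which the text
above does not name and which is used here only as an illustration. It assumes equal
sampling costs across strata and gives each stratum a share proportional to its size
multiplied by its standard deviation. The strata and figures are hypothetical.

```python
def neyman_allocation(strata, total_sample):
    """strata maps name -> (stratum size N_h, within-stratum standard deviation S_h)."""
    weights = {name: size * sd for name, (size, sd) in strata.items()}
    total_weight = sum(weights.values())
    # Share of the sample is proportional to N_h * S_h (sampling costs assumed equal)
    return {name: total_sample * w / total_weight for name, w in weights.items()}

strata = {"small_firms": (800, 2.0), "large_firms": (200, 12.0)}   # hypothetical
allocation = neyman_allocation(strata, 100)
print(allocation)   # {'small_firms': 40.0, 'large_firms': 60.0}
```

A proportionate design would have taken 80 small firms and 20 large firms; the small but
highly variable stratum is deliberately over-represented here.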

iii. Systematic Random Sampling: This method of sampling is an alternative to
random selection. It consists of taking every kth item in the population after a random
start with an item from 1 to k. It is also known as the fixed interval method, e.g., the
1st, 11th, 21st items and so on. Strictly speaking, this method of sampling is not a
probability sampling. It possesses characteristics of randomness and some
non-probability traits.

Suitability: Systematic selection can be applied to various populations such as
students in a class, houses in a street, entries in a telephone directory, etc.

Advantages: It is simpler than random sampling, easy to use, easy to instruct,
requires less time, is cheaper, is easier to check, spreads the sample evenly over the
population, and is statistically more efficient.

Disadvantages: It ignores all elements between two kth elements selected, the
selections are fixed once the random start is chosen so that not every combination of
elements has a chance of being selected, and this method sometimes gives a biased sample.
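
A minimal sketch of the fixed interval method follows. The function name, the list of
houses and the seed are hypothetical; the population is assumed to be already arranged
in a list.

```python
import random

def systematic_sample(population, k, seed=None):
    """Take every k-th element after a random start between 1 and k."""
    rng = random.Random(seed)
    start = rng.randint(1, k)      # random start, counted from 1
    # Positions start, start + k, start + 2k, ... converted to 0-based indexes
    return population[start - 1::k]

houses = list(range(1, 101))       # hypothetical street of 100 numbered houses
sample = systematic_sample(houses, 10, seed=1)
print(sample)
```

Only the starting position is random; every later selection follows from it, which is
what makes the method quick to carry out in the field.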

Cluster Sampling

It means random selection of sampling units consisting of population elements. Each
such sampling unit is a cluster of population elements. Then from each selected sampling
unit, a sample of population elements is drawn by either simple random selection or
stratified random selection. Where the population elements are scattered over a wide area
and a list of population elements is not readily available, the use of simple or stratified
random sampling method would be too expensive and time-consuming. In such cases
cluster sampling is usually adopted. The cluster sampling process involves identifying
clusters, examining the nature of clusters, and determining the number of stages.

Suitability: The application of cluster sampling is extensive in farm management
surveys, socio-economic surveys, rural credit surveys, demographic studies, ecological
studies, public opinion polls, large scale surveys of political and social behaviour,
attitude surveys and so on.

Advantages: This method is easier and more convenient, costs much less, promotes the
convenience of field work as it can be done in compact places, does not require more
time, allows units of study to be readily substituted for other units, and is more flexible.

Disadvantages: The cluster sizes may vary and this variation could increase the bias of
the resulting sample. The sampling error in this method of sampling is greater and the
adjacent units of study tend to have more similar characteristics than do units distantly
apart.
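
The two steps of cluster sampling described above (random selection of clusters, then
random selection of elements within each selected cluster) might be sketched as follows.
The villages, households and sample sizes are hypothetical, and simple random selection
is assumed at both steps.

```python
import random

def two_stage_cluster_sample(clusters, n_clusters, n_per_cluster, seed=None):
    """clusters maps a cluster name to the list of its population elements."""
    rng = random.Random(seed)
    # First, randomly select whole clusters (the sampling units)
    chosen = rng.sample(sorted(clusters), n_clusters)
    # Then draw a simple random sample of elements within each selected cluster
    return {
        name: rng.sample(clusters[name], min(n_per_cluster, len(clusters[name])))
        for name in chosen
    }

villages = {
    f"village_{v}": [f"v{v}_household_{h}" for h in range(1, 21)]
    for v in range(1, 9)
}                                   # hypothetical: 8 villages of 20 households each
sample = two_stage_cluster_sample(villages, 3, 5, seed=7)
print(sample)
```

Only the three selected villages need to be visited, which is the source of the cost
and convenience advantages claimed for the method.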

Area sampling

This is an important form of cluster sampling. In larger field surveys, clusters consisting
of specific geographical areas like districts, taluks, villages or blocks in a city are
randomly drawn. As the geographical areas are selected as sampling units in such cases,
their sampling is called area sampling. It is not a separate method of sampling, but forms
part of cluster sampling.

Multi-stage and sub-sampling


In the multi-stage sampling method, sampling is carried out in two or more stages. The
population is regarded as being composed of a number of first stage sampling units, each
of which is made up of a number of second stage units, and so forth.
That is, at each stage, a sampling unit is a cluster of the sampling units of the subsequent
stage. First, a sample of the first stage sampling units is drawn; then from each of the
selected first stage sampling units, a sample of the second stage sampling units is drawn.
The procedure continues down to the final sampling units or population elements.
An appropriate random sampling method is adopted at each stage. It is appropriate where the
population is scattered over a wider geographical area and no frame or list is available for
sampling. It is also useful when a survey has to be made within a limited time and cost
budget. The major disadvantage is that the procedure of estimating the sampling error and
the cost advantage is complicated.

Sub-sampling is a part of the multi-stage sampling process. In multi-stage sampling, the
sampling in second and subsequent stage frames is called sub-sampling. Sub-sampling
balances the two conflicting effects of clustering, i.e., cost and sampling errors.
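
Because each sampling unit is itself a cluster of the next stage's units, multi-stage
sampling lends itself to a recursive sketch. The nested frame of districts, villages and
households below is hypothetical, and simple random selection is assumed at every stage.

```python
import random

def multistage_sample(unit, sizes, rng):
    """Sample a nested frame stage by stage.

    unit  : dict of sub-units (a cluster) or list of final population elements
    sizes : number of units to draw at each stage, first stage first
    """
    if not sizes:
        return unit
    take = min(sizes[0], len(unit))
    if isinstance(unit, dict):
        chosen = rng.sample(sorted(unit), take)
        # Sub-sampling: repeat the procedure inside each selected unit
        return {name: multistage_sample(unit[name], sizes[1:], rng) for name in chosen}
    return rng.sample(unit, take)   # final stage: draw population elements

frame = {
    f"district_{d}": {
        f"village_{d}_{v}": [f"household_{d}_{v}_{h}" for h in range(1, 11)]
        for v in range(1, 6)
    }
    for d in range(1, 5)
}                                   # hypothetical: 4 districts x 5 villages x 10 households
sample = multistage_sample(frame, [2, 2, 3], random.Random(3))
print(sample)
```

No complete list of households is needed; lists are required only inside the districts
and villages that happen to be selected.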

Random Sampling with Probability Proportional to Size

The procedure of selecting clusters with probability proportional to size (PPS) is widely
used. If one primary cluster has twice as large a population as another, it is given twice
the chance of being selected. If the same number of persons is then selected from each of
the selected clusters, the overall probability of selection of any person will be the same.
Thus PPS is a better method for securing a representative sample of population elements in
multi-stage cluster sampling.

Advantages: Clusters of various sizes get proportionate representation; PPS leads to
greater precision than would a simple random sample of clusters and a constant sampling
fraction at the second stage; and equal-sized samples from each selected primary cluster
are convenient for field work.

Disadvantages: PPS cannot be used if the sizes of the primary sampling clusters are not
known.
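
The claim that every person ends up with the same overall chance of selection can be
checked with exact fractions. The sketch below is a simplification: it considers a single
draw of one cluster, and the cluster sizes and the number of persons taken per cluster
are hypothetical.

```python
import random
from fractions import Fraction

cluster_sizes = {"A": 200, "B": 100, "C": 50, "D": 50}   # hypothetical populations
persons_per_cluster = 10          # same number taken from whichever cluster is drawn
total = sum(cluster_sizes.values())

overall = {}
for name, size in cluster_sizes.items():
    p_cluster = Fraction(size, total)                # chance the cluster is drawn
    p_person = Fraction(persons_per_cluster, size)   # chance of a person inside it
    overall[name] = p_cluster * p_person             # the size cancels out

print(overall)                    # every value equals Fraction(1, 40)

def draw_cluster_pps(sizes, rng):
    """Draw one cluster with probability proportional to its size."""
    names = list(sizes)
    return rng.choices(names, weights=[sizes[n] for n in names], k=1)[0]

print(draw_cluster_pps(cluster_sizes, random.Random(0)))
```

Cluster A is four times as likely to be drawn as cluster C, but a person in A is only a
quarter as likely to be picked once the cluster is drawn, so the two effects cancel.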

Double Sampling and Multiphase Sampling

Double sampling refers to the sub-selection of the final sample from a pre-selected larger
sample that provided information for improving the final selection. When the procedure
is extended to more than two phases of selection, it is then called multi-phase sampling.
This is also known as sequential sampling, as sub-sampling is done from a main sample
in phases. Double sampling or multiphase sampling is a compromise solution for a
dilemma posed by undesirable extremes. “The statistics based on the sample of ‘n’ can be
improved by using ancillary information from a wide base: but this is too costly to obtain
from the entire population of N elements. Instead, information is obtained from a larger
preliminary sample nL which includes the final sample n.”

Replicated or Interpenetrating Sampling

It involves selection of a certain number of sub-samples rather than one full sample from
a population. All the sub-samples should be drawn using the same sampling technique
and each is a self-contained and adequate sample of the population. Replicated sampling
can be used with any basic sampling technique: simple or stratified, single or multi-stage
or single or multiphase sampling. It provides a simple means of calculating the sampling
error. It is practical. The replicated samples can throw light on variable non-sampling
errors. Its disadvantage is that it limits the amount of stratification that can be employed.
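
How replicated sub-samples give a simple estimate of the sampling error can be sketched
as follows. The sketch assumes each sub-sample is an independent simple random sample of
the same size and takes the spread of the sub-sample means as the basis of the standard
error; the income figures are hypothetical.

```python
import math
import random
import statistics

def replicated_estimate(population, n_replicates, replicate_size, seed=None):
    """Return the mean of the replicate means and its estimated standard error."""
    rng = random.Random(seed)
    means = [
        statistics.mean(rng.sample(population, replicate_size))
        for _ in range(n_replicates)
    ]
    estimate = statistics.mean(means)
    # The replicates are treated as independent, so the standard error of their
    # average is the standard deviation of the means divided by sqrt(k)
    std_error = statistics.stdev(means) / math.sqrt(n_replicates)
    return estimate, std_error

incomes = [100 + (i * 37) % 400 for i in range(1000)]   # hypothetical population
estimate, std_error = replicated_estimate(incomes, 10, 30, seed=5)
print(estimate, std_error)
```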

Non-probability or Non Random Sampling

Non-probability sampling or non-random sampling is not based on the theory of
probability. This sampling does not provide a chance of selection to each population
element.

Advantages: The only merits of this type of sampling are simplicity, convenience and
low cost.

Disadvantages: It does not ensure a selection chance to each population unit. The sample
selected may not be a representative one. The selection probability is unknown. It suffers
from sampling bias which will distort results.

This type of sampling is used when there is no other feasible alternative due to
non-availability of a list of the population, when the study does not aim at generalizing
the findings to the population, when the costs required for probability sampling would be
too large, and when probability sampling requires more time than the time limit for
completing the study permits. It may be classified into:

Convenience or Accidental Sampling

It means selecting sample units in a ‘hit and miss’ fashion, e.g., interviewing people
whom we happen to meet. This sampling also means selecting whatever sampling units
are conveniently available, e.g., a teacher may select students in his class. This method is
also known as accidental sampling because the respondents whom the researcher meets
accidentally are included in the sample.

Suitability: Though this type of sampling has no status, it may be used for simple
purposes such as testing ideas or gaining ideas or rough impression about a subject of
interest.

Advantage: It is the cheapest and simplest method, it does not require a list of the
population and it does not require any statistical expertise.

Disadvantage: It is highly biased because of the researcher’s subjectivity, it is the least
reliable sampling method and the findings cannot be generalized.

Purposive (or judgment) sampling

This method means deliberate selection of sample units that conform to some pre-
determined criteria. This is also known as judgment sampling. This involves selection of
cases which we judge as the most appropriate ones for the given study. It is based on the
judgement of the researcher or some expert. It does not aim at securing a cross section of
a population. The chance that a particular case be selected for the sample depends on the
subjective judgement of the researcher.

Suitability: This is used when what is important is the typicality and specific relevance
of the sampling units to the study and not their overall representativeness to the
population.

Advantage: It is less costly and more convenient and guarantees inclusion of relevant
elements in the sample.

Disadvantage: It is less efficient for generalizing, does not ensure representativeness,
requires more extensive prior information and does not lend itself to the use of
inferential statistics.

Quota sampling

This is a form of convenience sampling involving selection of quota groups of accessible
sampling units by traits such as sex, age, social class, etc. It is a method of stratified
sampling in which the selection within strata is non-random. It is this non-random
element that constitutes its greatest weakness.

Suitability: It is used in studies like marketing surveys, opinion polls, and readership
surveys which do not aim at precision, but to get quickly some crude results.

Advantage: It is less costly, takes less time, needs no list of the population, and field
work can easily be organized.

Disadvantage: It is impossible to estimate the sampling error, strict control of field work
is difficult, and it is subject to a higher degree of classification error.
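
The way quota groups are filled from whoever is accessible can be sketched as below. The
respondents, the trait and the quotas are hypothetical; note that no random selection
takes place anywhere in the procedure.

```python
def quota_sample(respondents, quotas, trait):
    """Accept respondents in the order they are met until every quota is full."""
    remaining = dict(quotas)
    selected = []
    for person in respondents:
        group = person[trait]
        if remaining.get(group, 0) > 0:     # this group's quota is not yet filled
            selected.append(person)
            remaining[group] -= 1
        if not any(remaining.values()):     # all quotas filled, stop interviewing
            break
    return selected

passers_by = [
    {"name": "P1", "sex": "F"}, {"name": "P2", "sex": "M"}, {"name": "P3", "sex": "M"},
    {"name": "P4", "sex": "M"}, {"name": "P5", "sex": "F"}, {"name": "P6", "sex": "F"},
]
sample = quota_sample(passers_by, {"F": 2, "M": 2}, "sex")
print([p["name"] for p in sample])   # ['P1', 'P2', 'P3', 'P5']
```

Who gets in depends entirely on the order in which people happen to be met, which is why
the sampling error of such a sample cannot be estimated.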

Snow-ball sampling

This is the colourful name for a technique of building up a list or a sample of a special
population by using an initial set of its members as informants. This sampling technique
may also be used in socio-metric studies.

Suitability: It is very useful in studying social groups, informal groups in a formal
organization, and diffusion of information among professionals of various kinds.

Advantage: It is useful for smaller populations for which no frames are readily available.

Disadvantage: The disadvantage is that it does not allow the use of probability statistical
methods. It is difficult to apply when the population is large. It does not ensure the
inclusion of all the elements in the list.
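
Building up a sample from an initial set of informants might be sketched as follows. The
names and the referral map are hypothetical, and contacts are followed in the order in
which they are named.

```python
from collections import deque

def snowball_sample(referrals, seeds, max_size):
    """referrals maps each member to the members he or she names as contacts."""
    selected = []
    seen = set(seeds)
    queue = deque(seeds)                    # start from the initial informants
    while queue and len(selected) < max_size:
        person = queue.popleft()
        selected.append(person)
        for contact in referrals.get(person, []):
            if contact not in seen:         # each member is approached only once
                seen.add(contact)
                queue.append(contact)
    return selected

referrals = {
    "Asha": ["Ben", "Chitra"],
    "Ben": ["Dev"],
    "Chitra": ["Dev", "Esha"],
    "Dev": [],
    "Esha": ["Farid"],
}
sample = snowball_sample(referrals, ["Asha"], 5)
print(sample)   # ['Asha', 'Ben', 'Chitra', 'Dev', 'Esha']
```

Farid is named by Esha but never reached because the sample size is capped, and anyone
whom no informant names is never reached at all.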

Summary
A statistical sample ideally purports to be a miniature model or replica of the
collectivity or the population. Sampling helps in time and cost saving. If the
population to be studied is quite large, sampling is warranted. However, the
size is a relative matter. The decision regarding census or sampling depends
upon the budget of the study. Sampling is opted when the amount of money
budgeted is smaller than the anticipated cost of census survey. The extent of
facilities available – staff, access to computer facility and accessibility to
population elements - is another factor to be considered in deciding to
sample or not. In the case of a homogenous population, even a simple
random sampling will give a representative sample. If the population is
heterogeneous, stratified random sampling is appropriate. Probability
sampling is based on the theory of probability. It is also known as random
sampling. It provides a known non-zero chance of selection for each
population element.

Simple random sampling technique gives each element an equal and independent
chance of being selected. An equal chance means equal probability of selection.

Stratified random sampling is an improved type of random or probability sampling. In
this method, the population is sub-divided into homogenous groups or strata, and from
each stratum, a random sample is drawn.

Proportionate stratified sampling involves drawing a sample from each stratum in
proportion to the latter’s share in the total population.

Disproportionate stratified random sampling does not give proportionate
representation to strata.

Systematic random sampling method is an alternative to random selection. It consists of
taking every kth item in the population after a random start with an item from 1 to k. It
is also known as the fixed interval method.

Cluster sampling means random selection of sampling units consisting of population
elements.

In Area sampling, used in larger field surveys, clusters consisting of specific geographical
areas like districts, taluks, villages or blocks in a city are randomly drawn.

Multi-stage sampling is carried out in two or more stages. The population is regarded as
being composed of a number of first stage sampling units, each of which is made up of a
number of second stage units, and so forth. That is, at each stage, a
sampling unit is a cluster of the sampling units of the subsequent stage.

Double sampling and multiphase sampling refer to the sub-selection of the final sample
from a pre-selected larger sample that provided information for improving the final
selection.

Replicated or interpenetrating sampling involves selection of a certain number of
sub-samples rather than one full sample from a population.

Non-probability or non random sampling is not based on the theory of probability.
This sampling does not provide a chance of selection to each population element.

Purposive (or judgment) sampling method means deliberate selection of sample units
that conform to some pre-determined criteria. This is also known as judgment sampling.

Quota sampling is a form of convenience sampling involving selection of quota groups of
accessible sampling units by traits such as sex, age, social class, etc. It is a method of
stratified sampling in which the selection within strata is non-random.

Snow-ball sampling is the colourful name for a technique of building up a list or a
sample of a special population by using an initial set of its members as informants.


MB0034- Unit 7-Sources of Data


Unit 7-Sources of Data

Meaning and Importance of Data

The search for answers to research questions is called collection of data. Data are facts,
and other relevant materials, past and present, serving as bases for study and analyses.
The data needed for a social science research may be broadly classified into (a) Data
pertaining to human beings, (b) Data relating to organization and (c) Data pertaining to
territorial areas.

Objectives

After studying this lesson you should be able to understand:

• Primary sources of data
• Advantages of primary data
• Disadvantages of primary data
• Methods of collecting primary data
• Secondary sources of data
• Features of secondary data
• Use of Secondary data
• Advantages of secondary data
• Disadvantages of secondary data
• Evaluation of secondary data

Personal data or data related to human beings consist of:

1. Demographic and socio-economic characteristics of individuals: Age, sex, race,
social class, religion, marital status, education, occupation, income, family size,
location of the household, life style, etc.
2. Behavioral variables: Attitudes, opinions, awareness, knowledge, practice,
intentions, etc.

Organizational data consist of data relating to an organization’s origin, ownership,
objectives, resources, functions, performance and growth.

Territorial data are related to geo-physical characteristics, resource endowment,
population, occupational pattern, infrastructure, degree of development, etc. of
spatial divisions like villages, cities, taluks, districts, states and the nation.

The data serve as the bases or raw materials for analysis. Without an analysis of factual
data, no specific inferences can be drawn on the questions under study. Inferences based
on imagination or guess work cannot provide correct answers to research questions. The
relevance, adequacy and reliability of data determine the quality of the findings of a
study.

Data form the basis for testing the hypothesis formulated in a study. Data also provide the
facts and figures required for constructing measurement scales and tables, which are
analyzed with statistical techniques. Inferences on the results of statistical analysis and
tests of significance provide the answers to research questions. Thus, the scientific
process of measurements, analysis, testing and inferences depends on the availability of
relevant data and their accuracy. Hence the importance of data for any research study.

The sources of data may be classified into (a) primary sources and (b) secondary sources.

Primary Sources of Data

Primary sources are original sources from which the researcher directly collects data that
have not been previously collected, e.g., collection of data directly by the researcher on
brand awareness, brand preference, brand loyalty and other aspects of consumer
behaviour from a sample of consumers by interviewing them. Primary data are first hand
information collected through various methods such as observation, interviewing, mailing,
etc.

Advantages of Primary Data

• It is an original source of data.
• It is possible to capture the changes occurring in the course of time.
• It is flexible to the advantage of the researcher.
• Extensive research study is based on primary data.

Disadvantages of Primary Data

1. Primary data is expensive to obtain.
2. It is time consuming.
3. It requires extensive research personnel who are skilled.
4. It is difficult to administer.

Methods of Collecting Primary Data

Primary data are directly collected by the researcher from their original sources.
In this case, the researcher can collect the required data precisely according to his
research needs; he can collect them when he wants them and in the form he needs
them. But the collection of primary data is costly and time consuming. Yet, for
several types of social science research the required data are not available from
secondary sources and they have to be directly gathered from the primary sources.

Where the available data are inappropriate, inadequate or obsolete,
primary data have to be gathered. Such cases include socio-economic surveys, social
anthropological studies of rural communities and tribal communities, sociological
studies of social problems and social institutions, marketing research, leadership
studies, opinion polls, attitudinal surveys, readership, radio listening and T.V.
viewing surveys, knowledge-awareness-practice (KAP) studies, farm
management studies, business management studies, etc.

There are various methods of data collection. A ‘method’ is different from a
‘tool’: while a method refers to the way or mode of gathering data, a tool is an
instrument used for the method. For example, a schedule is used for
interviewing. The important methods are
(a) observation, (b) interviewing, (c) mail survey, (d) experimentation,
(e) simulation and (f) projective technique. Each of these methods is discussed in
detail in the later chapters.

Secondary Sources of Data

These are sources containing data which have been collected and compiled for
another purpose. The secondary sources consist of readily available compendia and
already compiled statistical statements and reports whose data may be used by
researchers for their studies, e.g., census reports, annual reports and financial
statements of companies, statistical statements, reports of Government
Departments, annual reports on currency and finance published by the Reserve
Bank of India, statistical statements relating to Co-operatives and Regional
Banks published by the NABARD, reports of the National Sample Survey
Organization, reports of trade associations, publications of international
organizations such as UNO, IMF, World Bank, ILO, WHO, etc., trade and
financial journals, newspapers, etc.

Secondary sources consist of not only published records and reports, but also
unpublished records. The latter category includes various records and registers
maintained by the firms and organizations, e.g., accounting and financial records,
personnel records, register of members, minutes of meetings, inventory records
etc.

Features of Secondary Sources

Though secondary sources are diverse and consist of all sorts of materials, they
have certain common characteristics.

First, they are readymade and readily available, and do not require the trouble of
constructing tools and administering them.

Second, they consist of data over whose collection and classification the researcher
has no original control. Both the form and the content of secondary sources
are shaped by others. Clearly, this is a feature which can limit the research value
of secondary sources.

Finally, secondary sources are not limited in time and space. That is, the
researcher using them need not have been present when and where they were
gathered.

Use of Secondary Data


Secondary data may be used in three ways by a researcher. First, some specific
information from secondary sources may be used for reference purposes. For
example, general statistical information on the number of co-operative credit
societies in the country, their coverage of villages, their capital structure, volume
of business, etc., may be taken from published reports and quoted as background
information in a study on the evaluation of the performance of co-operative credit
societies in a selected district/state.

Second, secondary data may be used as bench marks against which the findings of
research may be tested, e.g., the findings of a local or regional survey may be
compared with the national averages; the performance indicators of a particular
bank may be tested against the corresponding indicators of the banking industry
as a whole; and so on.
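
The benchmark use described above can be sketched in a few lines of Python. This is
only a minimal illustration: the function name `compare_to_benchmark`, the indicator
names and all the figures are hypothetical and are not taken from any real survey or
report.

```python
def compare_to_benchmark(local, benchmark):
    """Compare local findings with secondary (benchmark) figures.

    Returns {indicator: (absolute difference, percentage difference)}.
    Indicators with no benchmark figure are skipped.
    """
    result = {}
    for name, local_value in local.items():
        if name not in benchmark:
            continue  # no secondary figure available for this indicator
        base = benchmark[name]
        diff = local_value - base
        # The percentage difference is undefined when the benchmark is zero.
        pct = diff / base * 100 if base != 0 else None
        result[name] = (diff, pct)
    return result


# Made-up figures, for illustration only.
local_survey = {"literacy_rate": 60.0, "avg_household_size": 5.5}
national_avg = {"literacy_rate": 75.0, "avg_household_size": 5.0}

for name, (diff, pct) in compare_to_benchmark(local_survey, national_avg).items():
    print(f"{name}: {diff:+.1f} ({pct:+.1f}% relative to the benchmark)")
```

A large gap between a local finding and its benchmark does not by itself show that
either figure is wrong; it only signals that definitions, units and time periods
should be checked before the two are compared.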

Finally, secondary data may be used as the sole source of information for a
research project. Such studies as securities market behaviour, financial analysis
of companies, trends in credit allocation in commercial banks, sociological studies
on crimes, historical studies, and the like, depend primarily on secondary data.
Year books, statistical reports of government departments, reports of public
organizations such as the Bureau of Public Enterprises, Census Reports, etc., serve as
major data sources for such research studies.

Advantages of Secondary Data

Secondary sources have some advantages:

1. Secondary data, if available, can be secured quickly and cheaply. Once
the source documents and reports are located, collection of data is just a
matter of desk work. Even the tediousness of copying the data from the
source can now be avoided, thanks to photocopying facilities.
2. A wider geographical area and a longer reference period may be covered
without much cost. Thus, the use of secondary data extends the
researcher’s space and time reach.
3. The use of secondary data broadens the data base from which scientific
generalizations can be made.
4. Secondary data can supply the environmental and cultural settings required
for the study.
5. The use of secondary data enables a researcher to verify the findings based
on primary data. It readily meets the need for additional empirical support.
The researcher need not wait until additional primary data can be
collected.

Disadvantages of Secondary Data


The use of secondary data has its own limitations.

1. The most important limitation is that the available data may not meet our
specific needs. The definitions adopted by those who collected the data
may be different; units of measure may not match; and time periods may
also be different.
2. The available data may not be as accurate as desired. To assess their
accuracy we need to know how the data were collected.
3. Secondary data may not be up to date and may already be obsolete when they
appear in print, because of the time lag in producing them. For example,
population census data are published two or three years after
compilation, and no new figures will be available for another ten years.
4. Finally, information about the whereabouts of sources may not be
available to all social scientists. Even if the location of the source is
known, the accessibility depends primarily on proximity. For example,
most of the unpublished official records and compilations are located in
the capital city, and they are not within easy reach of researchers based
in far-off places.

Evaluation of Secondary Data

When a researcher wants to use secondary data for his research, he should
evaluate them before deciding to use them.

1. Data Pertinence

The first consideration in evaluation is to examine the pertinence of the available
secondary data to the research problem under study. The following questions should be
considered.

• What are the definitions and classifications employed? Are they consistent?
• What are the measurements of variables used? What is the degree to which they
conform to the requirements of our research?
• What is the coverage of the secondary data in terms of topic and time? Does this
coverage fit the needs of our research?

On the basis of the above considerations, the pertinence of the secondary data to the
research on hand should be determined. A researcher who is imaginative and flexible may
even be able to redefine his research problem so as to make use of otherwise unusable
available data.
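
The three pertinence questions above (measurement, coverage of topic, coverage of time)
can be turned into a rough screening check. The sketch below is only an illustration
under stated assumptions: the `DataSpec` class, its fields and the example values are
hypothetical, and a real evaluation would also compare definitions and classifications,
which cannot be reduced to a simple equality test.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataSpec:
    """Hypothetical description of a research requirement or of a secondary source."""
    unit: str          # unit of measure, e.g. "rupees per month"
    start_year: int    # first year covered
    end_year: int      # last year covered
    topics: frozenset  # topics (variables) covered


def pertinence_issues(needed, available):
    """List the ways in which the available data fail to fit the research needs."""
    issues = []
    if needed.unit != available.unit:
        issues.append(f"unit mismatch: need {needed.unit!r}, source uses {available.unit!r}")
    if available.start_year > needed.start_year or available.end_year < needed.end_year:
        issues.append("time coverage does not span the required period")
    missing = needed.topics - available.topics
    if missing:
        issues.append("topics not covered: " + ", ".join(sorted(missing)))
    return issues


# Made-up example: the study needs 2001-2005, the source covers only 2003-2005.
needed = DataSpec("rupees per month", 2001, 2005, frozenset({"income", "savings"}))
source = DataSpec("rupees per year", 2003, 2005, frozenset({"income"}))

for issue in pertinence_issues(needed, source):
    print(issue)
```

An empty list means only that no obvious mismatch was found; the researcher must still
judge whether the source is usable.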

2. Data Quality

If the researcher is convinced that the available secondary data are pertinent to his
needs, the next step is to examine the quality of the data. The quality of data refers to
their accuracy, reliability and completeness. The accuracy and reliability of the
available secondary data depend on the organization which collected them and the purpose
for which they were collected. What is the authority and prestige of the organization? Is
it well recognized? Is it noted for reliability? Is it capable of collecting reliable
data? Does it use trained and well qualified investigators? The answers to these questions
determine the degree of confidence we can have in the data and their accuracy. It is
important to go to the original source of the secondary data rather than to use an
intermediate source which has quoted from the original. Only then can the researcher
review the cautionary and other comments that were made in the original source.

3. Data Completeness

Completeness refers to the actual coverage of the published data. This depends on the
methodology and sampling design adopted by the original organization. Is the
methodology sound? Is the sample size small or large? Is the sampling method
appropriate? Answers to these questions may indicate the appropriateness and adequacy
of the data for the problem under study. The question of possible bias should also be
examined. Did the purpose for which the original organization collected the data have
a particular orientation? Was the study made to promote the organization’s own
interest? How was the study conducted? These are important clues. The researcher must
be on guard when the source does not report the methodology and sampling design, because
it is then not possible to determine the adequacy of the secondary data for the
researcher’s study.

Summary

Data are facts and other relevant materials, past and present, serving as bases for study
and analysis. The data needed for social science research may be broadly classified into
(a) data pertaining to human beings, (b) data relating to organizations and (c) data
pertaining to territorial areas. Personal data, or data related to human beings, consist
of:

Demographic and socio-economic characteristics of individuals: age, sex, race, social
class, religion, marital status, education, occupation, income, family size, location of
the household, life style, etc.

Behavioural variables: attitudes, opinions, awareness, knowledge, practice, intentions,
etc.

Organizational data consist of data relating to an organization’s origin, ownership,
objectives, resources, functions, performance and growth. Territorial data are related to
geophysical characteristics, resource endowment, population, occupational pattern,
infrastructure, degree of development, etc. of spatial divisions like villages, cities,
taluks, districts, states and the nation. Data form the basis for testing the hypotheses
formulated in a study. Data also provide the facts and figures required for constructing
measurement scales and tables. The sources of data may be classified into (a) primary
sources and (b) secondary sources. Primary data are first-hand information collected
through various methods such as observation, interviewing, mailing, etc. The secondary
sources consist of readily available compendia and already compiled statistical
statements and reports. Secondary sources are not limited in time and space; that is, the
researcher using them need not have been present when and where they were gathered.
Secondary data, if available, can be secured quickly and cheaply.

A wider geographical area and a longer reference period may be covered without much cost.
Thus, the use of secondary data extends the researcher’s space and time reach. The use of
secondary data broadens the data base from which scientific generalizations can be made.
The use of secondary data has its own limitations. The most important limitation is that
the available data may not meet our specific needs. Secondary data may not be up to date
and may already be obsolete when they appear in print, because of the time lag in
producing them. Primary data are directly collected by the researcher from their original
sources. There are various methods of data collection. A ‘Method’ is different from a
‘Tool’: while a method refers to the way or mode of gathering data, a tool is an
instrument used for the method. For example, a schedule is a tool used for interviewing.
The important methods are (a) observation, (b) interviewing, (c) mail survey,
(d) experimentation, (e) simulation and (f) projective technique.

Copyright © 2009 SMU

Powered by Sikkim Manipal University

MB0034-Unit 8-Observation

Unit 8 Observation

Meaning of Observation

Observation means viewing or seeing. Observation may be defined as a systematic
viewing of a specific phenomenon in its proper setting for the specific purpose of
gathering data for a particular study. Observation is a classical method of scientific
study.

Objectives:

After studying this lesson you should be able to understand:

• General characteristics of observation method
• Process of observation
• Types of observation
• Participant Observation
• Non-participant observation
• Direct observation
• Indirect observation
• Controlled observation
• Uncontrolled observation
• Prerequisites of observation
• Advantages of observation
• Limitations of observation
• Use of observation in business research

General Characteristics of Observation Method

Observation as a method of data collection has certain characteristics.

1. It is both a physical and a mental activity: The observing eye catches many
things that are present. But attention is focused on data that are pertinent to the
given study.
2. Observation is selective: A researcher does not observe anything and everything,
but selects the range of things to be observed on the basis of the nature, scope and
objectives of his study. For example, suppose a researcher desires to study the
causes of city road accidents and has formulated a tentative hypothesis that
accidents are caused by violation of traffic rules and over-speeding. When he
observes the movements of vehicles on the road, many things are before his eyes:
the type, make, size and colour of the vehicles, the persons sitting in them, their
hair style, etc. All such things which are not relevant to his study are ignored and
only over-speeding and traffic violations are keenly observed by him.
3. Observation is purposive and not casual: It is made for the specific purpose of
noting things relevant to the study. It captures the natural social context in which
a person’s behaviour occurs. It grasps the significant events and occurrences that
affect the social relations of the participants.
4. Observation should be exact and be based on standardized tools of research,
such as observation schedules, sociometric scales, etc., and precision
instruments, if any.

Process of Observation

The use of observation method requires proper planning.

• First, the researcher should carefully examine the relevance of the observation
method to the data needs of the selected study.
• Second, he must identify the specific investigative questions which call for use of
the observation method. These determine the data to be collected.
• Third, he must decide the observation content, viz., the specific conditions, events
and activities that have to be observed for the required data. The observation
content should include the relevant variables.
• Fourth, for each variable chosen, the operational definition should be specified.
• Fifth, the observation setting, the subjects to be observed, the timing and mode of
observation, the recording procedure, the recording instruments to be used, and other
details of the task should be determined.
• Last, observers should be selected and trained. The persons selected must
have sufficient powers of concentration, a strong memory and an unobtrusive
nature. Selected persons should be given both theoretical and practical
training.

Types of Observations

Observations may be classified in different ways. With reference to the investigator’s
role, observation may be classified into (a) participant observation and
(b) non-participant observation. In terms of mode of observation, it may be classified
into (c) direct observation and (d) indirect observation. With reference to the rigour of
the system adopted, observation is classified into (e) controlled observation and
(f) uncontrolled observation.

Participant Observation

In this type of observation, the observer is a part of the phenomenon or group which is
observed and he acts as both an observer and a participant. An example is a study of
tribal customs by an anthropologist who takes part in tribal activities like folk dance.
The persons who are observed should not be aware of the researcher’s purpose. Only then
will their behaviour be ‘natural’. The concealment of the research objective and the
researcher’s identity is justified on the ground that it makes it possible to study
certain aspects of the group’s culture which are not revealed to outsiders.

Advantages: The advantages of participant observation are:

• The observer can understand the emotional reactions of the observed group, and
get a deeper insight into their experiences.
• The observer will be able to record the context which gives meaning to the observed
behaviour and the statements heard.

Disadvantages: Participant observation suffers from some demerits.

1. The participant observer narrows his range of observation. For example, if there is
a hierarchy of power in the group/community under study, he comes to occupy
one position within it, and thus other avenues of information are closed to him.
2. To the extent that the participant observer participates emotionally, objectivity
is lost.
3. Another limitation of this method is the dual demand made on the observer.
Recording can interfere with participation, and participation can interfere with
observation. Recording on the spot is often not possible and has to be postponed until
the observer is alone. Such a time lag results in some inaccuracy in recording.

Non-participant observation

In this method, the observer stands apart and does not participate in the phenomenon
observed. Naturally, there is no emotional involvement on the part of the observer. This
method calls for skill in recording observations in an unnoticed manner.

Direct observation

This means observation of an event personally by the observer when it takes place. This
method is flexible and allows the observer to see and record subtle aspects of events and
behaviour as they occur. He is also free to shift places and change the focus of the
observation. A limitation of this method is that the observer’s perception may not
be able to cover all relevant events when they move quickly, resulting in
incomplete observation.

Indirect observation

This does not involve the physical presence of the observer, and the recording is done by
mechanical, photographic or electronic devices, e.g. recording customer and employee
movements by a special motion picture camera mounted in a department of a large store.
This method is less flexible than direct observation, but it is less biasing and less
erratic in recording accuracy. It also provides a permanent record for an analysis of
different aspects of the event.

Controlled observation

This involves standardization of observational techniques and the exercise of maximum
control over extrinsic and intrinsic variables by adopting an experimental design and
systematically recording observations. Controlled observation is carried out either in the
laboratory or in the field. It is typified by clear and explicit decisions on what, how and
when to observe.

Uncontrolled observation

This does not involve control over extrinsic and intrinsic variables. It is primarily used
for descriptive research. Participant observation is a typical uncontrolled observation.

Prerequisites of Effective Observation

The prerequisites of observation are:

• Observations must be made under conditions which permit accurate results.
The observer must be at a vantage point to see clearly the objects to be observed.
The distance and the light must be satisfactory. The mechanical devices used must
be in good working condition and operated by skilled persons.
• Observation must cover a sufficient number of representative samples of the
cases.
• Recording should be accurate and complete.
• The accuracy and completeness of recorded results must be checked. A certain
number of cases can be observed again by another observer or another set of
mechanical devices, as the case may be. If it is feasible, two separate observers
and sets of instruments may be used in all or some of the original observations.
The results could then be compared to determine their accuracy and completeness.
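
The comparison of two observers’ records mentioned above can be quantified. The sketch
below computes simple percent agreement and, as an added assumption not named in this
text, Cohen’s kappa, a standard statistic that corrects agreement for chance. The
category labels and the two sets of records are made up for illustration.

```python
from collections import Counter


def percent_agreement(records_a, records_b):
    """Share of cases on which the two observers recorded the same category."""
    if len(records_a) != len(records_b) or not records_a:
        raise ValueError("both observers must record the same non-empty set of cases")
    matches = sum(a == b for a, b in zip(records_a, records_b))
    return matches / len(records_a)


def cohens_kappa(records_a, records_b):
    """Agreement corrected for chance: (p_o - p_e) / (1 - p_e)."""
    n = len(records_a)
    p_observed = percent_agreement(records_a, records_b)
    counts_a, counts_b = Counter(records_a), Counter(records_b)
    categories = set(counts_a) | set(counts_b)
    # Agreement expected if each observer assigned categories independently.
    p_expected = sum(counts_a[c] * counts_b[c] for c in categories) / (n * n)
    if p_expected == 1:
        return 1.0  # both observers used a single, identical category throughout
    return (p_observed - p_expected) / (1 - p_expected)


# Made-up records of ten vehicles, coded by two observers.
observer_1 = ["speeding", "ok", "ok", "violation", "ok",
              "speeding", "ok", "ok", "violation", "ok"]
observer_2 = ["speeding", "ok", "ok", "ok", "ok",
              "speeding", "ok", "violation", "violation", "ok"]

print(f"percent agreement: {percent_agreement(observer_1, observer_2):.2f}")
print(f"Cohen's kappa:     {cohens_kappa(observer_1, observer_2):.2f}")
```

Percent agreement is the easier figure to explain, but it looks high whenever one
category dominates; kappa is lower in such cases because it discounts the agreement that
would occur by chance.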

Advantages of observation

Observation has certain advantages:

1. The main virtue of observation is its directness: it makes it possible to study
behaviour as it occurs. The researcher need not ask people about their behaviour
and interactions; he can simply watch what they do and say.
2. Data collected by observation may describe the observed phenomena as they
occur in their natural settings. Other methods introduce elements of artificiality
into the researched situation; for instance, in an interview the respondent may not
behave in a natural way. There is no such artificiality in observational studies,
especially when the observed persons are not aware of being observed.
3. Observation is more suitable for studying subjects who are unable to articulate
meaningfully, e.g. studies of children, tribal people, animals, birds, etc.
4. Observation improves the opportunities for analyzing the contextual background
of behaviour. Furthermore, verbal reports can be validated and compared with
behaviour through observation. The validity of what men of position and authority
say can be verified by observing what they actually do.
5. Observation makes it possible to capture the whole event as it occurs. For
example, only observation can provide an insight into all the aspects of the process
of negotiation between union and management representatives.
6. Observation is less demanding of the subjects and has less biasing effect on their
conduct than questioning.
7. It is easier to conduct disguised observation studies than disguised questioning.
8. Mechanical devices may be used for recording data in order to secure more
accurate data and to make continuous observations over longer periods.

Limitations of Observation

Observation cannot be used indiscriminately for all purposes. It has its own limitations:

1. Observation is of no use for studying past events or activities. One has to depend
upon documents or narrations by people for studying such things.
2. Observation is not suitable for studying opinions and attitudes. However, an
observation of related behaviour affords a good clue to attitudes. E.g., an observation
of the seating pattern of high caste and class persons in a general meeting in a village
may be useful for forming an index of attitude.
3. Observation poses difficulties in obtaining a representative sample. For
interviewing and mailing methods, the selection of a random sample can be
readily ensured. But observing people of all types does not make the sample a
random one.
4. Observation cannot be used as and when the researcher finds it convenient to use
it. He has to wait for the event to occur. For example, an observation of the folk dance
of a tribal community is possible only when it is performed.
5. A major limitation of this method is that the observer normally must be at the
scene of the event when it takes place. Yet it may not be possible to predict where
and when the event will occur, e.g., a road accident or a communal clash.
6. Observation is a slow and expensive process, requiring human observers and/or
costly surveillance equipment.

Use of Observation in Business Research

Observation is suitable for a variety of research purposes. It may be used for studying
(a) the behaviour of human beings in purchasing goods and services: life style, customs
and manners, interpersonal relations, group dynamics, crowd behaviour, leadership styles,
managerial style, and other behaviours and actions; (b) the behaviour of other living
creatures like birds, animals, etc.; (c) physical characteristics of inanimate things like
stores, factories, residences, etc.; (d) flow of traffic and parking problems; and
(e) movement of materials and products through a plant.

Summary

Observation means viewing or seeing. Observation may be defined as a systematic
viewing of a specific phenomenon in its proper setting for the specific purpose of
gathering data for a particular study. Observation is a classical method of scientific
study. Observation as a method of data collection has certain characteristics.
Observations may be classified in different ways. With reference to the investigator’s
role, observation may be classified into (a) participant observation and
(b) non-participant observation. In terms of mode of observation, it may be classified
into (c) direct observation and (d) indirect observation. With reference to the rigour of
the system adopted, observation is classified into (e) controlled observation and
(f) uncontrolled observation. Indirect observation does not involve the physical presence
of the observer, and the recording is done by mechanical, photographic or electronic
devices, e.g. recording customer and employee movements by a special motion picture
camera mounted in a department of a large store. Controlled observation involves
standardization of observational techniques and the exercise of maximum control over
extrinsic and intrinsic variables by adopting an experimental design and systematically
recording observations. Uncontrolled observation does not involve control over extrinsic
and intrinsic variables. It is primarily used for descriptive research. Participant
observation is a typical uncontrolled observation.

Observation has certain advantages, but it cannot be used indiscriminately for
all purposes. It has its own limitations. Observation is suitable for a variety of
research purposes, such as studying the behaviour of human beings in purchasing goods and
services: life style, customs and manners, interpersonal relations, group dynamics, crowd
behaviour, leadership styles, managerial style, and other behaviours and actions.

MB0034-Unit 9-Schedule and Questionnaire

Unit 9 Schedule and Questionnaire

Meaning of Schedule and Questionnaire

The mail survey is another method of collecting primary data. This method involves
sending questionnaires to the respondents with a request to complete them and return
them by post. This can be used in the case of educated respondents only. The mail
questionnaire should be simple so that the respondents can easily understand the
questions and answer them. It should preferably contain mostly closed-ended and
multiple-choice questions so that it can be completed within a few minutes.

The distinctive feature of the mail survey is that the questionnaire is self-administered
by the respondents and the responses are recorded by them, and not by the
investigator as in the case of the personal interview method. It does not involve
face-to-face conversation between the investigator and the respondent. Communication is
carried out only in writing, and this requires more cooperation from the respondents than
verbal communication does.

Objectives

After studying this lesson you should be able to understand:

• Types of questionnaire
• Structured or standard questionnaire
• Unstructured questionnaire
• Processes of data collection
• Alternate method of sending questionnaires
• Importance of questionnaire
• Advantages of questionnaire
• Disadvantages of Questionnaire
• Distinction between schedule and questionnaire

Types of Questionnaires

Questionnaires may be classified as:

Structured/ standardized questionnaire

Structured questionnaires are those in which there are definite, concrete and preordained
questions with additional questions limited to those necessary to clarify inadequate
answers or to elicit more detailed responses. The questions are presented with exactly the
same wording and in the same order to all the respondents.

Unstructured questionnaire

In unstructured questionnaires the respondent is given the opportunity to answer in his
own terms and in his own frame of reference.

Process of Data Collection

The researcher should prepare a mailing list of the selected respondents by collecting the
addresses from the telephone directory of the association or organization to which they
belong.

A covering letter should accompany a copy of the questionnaire. Exhibit 7.1 is a copy of
a covering letter used by the author in a research study on ‘corporate planning’. The
covering letter must explain to the respondent the purpose of the study and the
importance of his cooperation to the success of the project. Anonymity may be assured.

Alternative Modes of Sending Questionnaires

There are some alternative methods of distributing questionnaires to the respondents.
They are: (1) personal delivery, (2) attaching the questionnaire to a product,
(3) advertising the questionnaire in a newspaper or magazine, and
(4) news-stand inserts.

Personal Delivery

The researcher or his assistant may deliver the questionnaires to the potential respondents
with a request to complete them at their convenience. After a day or two he can collect
the completed questionnaires from them. Often referred to as the self-administered
questionnaire method, it combines the advantages of the personal interview and the mail
survey. Alternatively, the questionnaires may be delivered in person and the completed
questionnaires may be returned by mail by the respondents.

Attaching Questionnaire to a Product

A firm test marketing a product may attach a questionnaire to a product and request the
buyer to complete it and mail it back to the firm. The respondent is usually rewarded by a
gift or a discount coupon.

Advertising the Questionnaires

The questionnaire, with the instructions for completion, may be advertised on a page of
a magazine or in a section of a newspaper. The potential respondent completes it, tears it
out and mails it to the advertiser. For example, the committee on banks’ customer services
used this method for collecting information from the customers of
commercial banks in India. This method may be useful for large-scale studies on topics of
common interest.

News-Stand Inserts

This method involves inserting the covering letter, questionnaire and self-addressed
reply-paid envelope into a random sample of news-stand copies of a newspaper or
magazine.

Improving the Response Rate in a Mail survey

The response rate in mail surveys is generally very low, more so in developing countries
like India. Certain techniques have to be adopted to increase the response rate. They are:

1. Quality Printing: The questionnaire may be neatly printed on quality light-coloured
paper, so as to attract the attention of the respondent.
2. Covering Letter: The covering letter should be couched in a pleasant style so as
to attract and hold the interest of the respondent. It must anticipate objections and
answer them briefly. It is desirable to address the respondent by name.
3. Advance Information: Advance information can be provided to potential
respondents by a telephone call, by advance notice in the newsletter of the
concerned organization or by a letter. Such preliminary contact with potential
respondents is more successful than follow-up efforts.
4. Incentives: Money, stamps for collection and other incentives are also used to
induce respondents to complete and return the mail questionnaire.
5. Follow-up contacts: In the case of respondents belonging to an organization,
they may be approached through someone in that organization known to the
researcher.
6. Larger sample size: A larger sample may be drawn than the estimated sample
size. For example, if the required sample size is 1000, a sample of 1500 may be
drawn. This may help the researcher to secure an effective sample size closer to
the required size.
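
The arithmetic behind drawing a larger sample can be sketched as follows. Drawing 1500
to obtain 1000 completed questionnaires implies an expected response rate of about
two-thirds; that rate is an inference made here for illustration and is not stated in
the text. The function name `sample_to_draw` is likewise hypothetical.

```python
import math
from fractions import Fraction


def sample_to_draw(required_size, expected_response_rate):
    """Number of questionnaires to mail so that the expected returns reach required_size.

    expected_response_rate may be a Fraction or a decimal such as 0.4; it is
    converted through its string form so that the division is exact.
    """
    rate = Fraction(str(expected_response_rate))
    if not 0 < rate <= 1:
        raise ValueError("expected_response_rate must be greater than 0 and at most 1")
    return math.ceil(Fraction(required_size) / rate)


# The example from the text: 1000 required, two-thirds assumed to respond.
print(sample_to_draw(1000, Fraction(2, 3)))  # 1500

# A lower assumed response rate calls for a much larger mailing.
print(sample_to_draw(1000, 0.4))  # 2500
```

Drawing a larger sample raises the number of returns but does not remove non-response
bias: those who reply may still differ from those who do not.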

Importance of Questionnaire

The significance of the questionnaire method is that it affords great facilities in
collecting data from large, diverse and widely scattered groups of people. It is used in
gathering objective, quantitative data as well as for securing information of a
qualitative nature. In some studies, the questionnaire is the sole research tool utilised,
but it is more often used in conjunction with other methods of investigation. In the
questionnaire technique, great reliance is placed on the respondent’s verbal report for
data on the stimuli or experiences to which he is exposed, as well as for data on his
behaviour.

Advantages of Questionnaires

The advantages of mail surveys are:

• They are less costly than personal interviews, as the cost of mailing is the same
throughout the country, irrespective of distance.
• They can cover extensive geographical areas.
• Mailing is useful in contacting persons such as senior business executives who are
difficult to reach in any other way.
• The respondents can complete the questionnaires at their convenience.
• Mail surveys, being more impersonal, provide more anonymity than personal
interviews.
• Mail surveys are totally free from the interviewer’s bias, as there is no personal
contact between the respondents and the investigator.
• Certain personal and economic data may be given accurately in an unsigned mail
questionnaire.

Disadvantages of Questionnaires

The disadvantages of mail surveys are:

1. The scope for mail surveys is very limited in a country like India where the
percentage of literacy is very low.
2. The response rate of mail surveys is low. Hence, the resulting sample will not be a
representative one.

Distinction between schedules and questionnaires

Questionnaires are mailed to the respondent whereas schedules are carried by the
investigator himself. A questionnaire can be filled in by the respondent only if he is
literate and able to understand the language in which it is written. This
problem can be overcome in the case of a schedule, since the investigator himself carries
the schedule and records the respondent’s answers. A questionnaire is filled in by
the respondent himself whereas the schedule is filled in by the investigator.

Summary

The mail survey is another method of collecting primary data. This method involves
sending questionnaires to the respondents with a request to complete them and return
them by post. The distinctive feature of the mail survey is that the questionnaire is self-
administered by the respondents themselves and the responses are recorded by them, and
not by the investigator as in the case of personal interview method. There are some
alternative methods of distributing questionnaires to the respondents. They are: (1)
personal delivery, (2) attaching the questionnaire to a product, (3) advertising the
questionnaire in a newspaper or a magazine, and (4) news-stand inserts.
The response rate in mail surveys is generally very low, more so in developing countries
like India. Certain techniques have to be adopted to increase the response rate. Mail
surveys are less costly than personal interviews, as the cost of mailing is the same
throughout the country, irrespective of distance. They can cover extensive geographical
areas. Mailing is useful in contacting persons such as senior business executives who are
difficult to reach in any other way. The respondents can complete the questionnaires at
their convenience.

Mail surveys, being more impersonal, provide more anonymity than personal interviews.
Mail surveys are totally free from the interviewer’s bias, as there is no personal contact
between the respondents and the investigator. Certain personal and economic data may be
given accurately in an unsigned mail questionnaire. The scope for mail surveys is very
limited in a country like India where the percentage of literacy is very low. The response
rate of mail surveys is low. Hence, the resulting sample will not be a representative one.
The significance of questionnaire method is that it affords great facilities in collecting
data from large, diverse, and widely scattered groups of people. Questionnaires are
mailed to the respondent whereas schedules are carried by the investigator himself. A
questionnaire is filled by the respondent himself whereas the schedule is filled by the
investigator.

Copyright © 2009 SMU

Powered by Sikkim Manipal University

MB0034-Unit 10-Interviewing

Unit 10 Interviewing

Meaning of Interview

Interviewing is one of the prominent methods of data collection. It may be defined as a
two-way systematic conversation between an investigator and an informant, initiated for
obtaining information relevant to a specific study. It involves not only conversation, but
also learning from the respondent’s gestures, facial expressions and pauses, and his
environment. Interviewing requires face-to-face contact or contact over the telephone and
calls for interviewing skills. It is done by using a structured schedule or an unstructured
guide.

Interviewing may be used either as a main method or as a supplementary one in studies of
persons. Interviewing is the only suitable method for gathering information from illiterate
or less educated respondents. It is useful for collecting a wide range of data, from factual
demographic data to highly personal and intimate information relating to a person’s
opinions, attitudes, values, beliefs, past experience and future intentions. When qualitative
information is required, or probing is necessary to draw out a full response, interviewing is
required. Where the area covered by the survey is compact, or when a sufficient
number of qualified interviewers are available, personal interview is feasible.

Interview is often superior to other data-gathering methods. People are usually more
willing to talk than to write. Once rapport is established, even confidential information
may be obtained. It permits probing into the context and reasons for answers to questions.

Interview can add flesh to statistical information. It enables the investigator to grasp the
behavioural context of the data furnished by the respondents.

Objectives

After studying this lesson you should be able to understand:

• Types of interviews
• Structured Directive interview
• Unstructured non-directive interview
• Focused interview
• Clinical interview
• Depth interview
• Approaches to the interview
• Qualities of interview
• Merits of interview method
• Demerits of interview method
• Interview techniques in business research
• Interview Problems
• Methods and Aims of controlling non-response
• Telephone Interviewing
• Group Interviews

Types of Interviews

The interview may be classified into: (a) structured or directive interview, (b)
unstructured or non-directive interview, (c) focused interview, (d) clinical interview and
(e) depth interview.

Structured Directive Interview

This is an interview made with a detailed standardized schedule. The same questions are
put to all the respondents and in the same order. Each question is asked in the same way
in each interview, promoting measurement reliability. This type of interview is used for
large-scale formalized surveys.

Advantages: This interview has certain advantages. First, data from one interview to the
next are easily comparable. Second, recording and coding data do not pose any
problem, and greater precision is achieved. Lastly, attention is not diverted to extraneous,
irrelevant and time-consuming conversation.

Limitations: However, this type of interview suffers from some limitations. First, it tends
to lose the spontaneity of natural conversation. Second, the way in which the interview is
structured may be such that the respondent’s views are minimized and the investigator’s
own biases regarding the problem under study are inadvertently introduced. Lastly, the
scope for exploration is limited.

Unstructured or Non-Directive Interview

This is the least structured one. The interviewer encourages the respondent to talk freely
about a given topic with a minimum of prompting or guidance. In this type of interview, a
detailed pre-planned schedule is not used. Only a broad interview guide is used. The
interviewer avoids channelling the direction of the interview. Instead he develops a very
permissive atmosphere. Questions are not standardized or ordered in a particular way.

This interviewing is more useful in case studies than in surveys. It is particularly
useful in exploratory research where the lines of investigation are not clearly defined. It
is also useful for gathering information on sensitive topics such as divorce, social
discrimination, class conflict, generation gap, drug-addiction etc. It provides opportunity
to explore the various aspects of the problem in an unrestricted manner.

Advantages: This type of interview has certain special advantages. It can closely
approximate the spontaneity of a natural conversation. It is less prone to interviewer’s
bias. It provides greater opportunity to explore the problem in an unrestricted manner.

Limitations: Though the unstructured interview is a potent research instrument, it is not
free from limitations. One of its major limitations is that the data obtained from one
interview are not comparable to the data from the next. Hence, it is not suitable for
surveys. Time may be wasted in unproductive conversations. By not focusing on one or
another facet of a problem, the investigator may run the risk of being led up a blind alley. As
there is no particular order or sequence in this interview, the classification of responses
and coding may require more time. This type of informal interviewing calls for greater
skill than the formal survey interview.

Focused Interview

This is a semi-structured interview where the investigator attempts to focus the discussion
on the actual effects of a given experience to which the respondents have been exposed. It
takes place with respondents known to have been involved in a particular experience, e.g.,
seeing a particular film, viewing a particular program on TV, or being involved in a train/bus
accident. The situation is analysed prior to the interview. An interview guide
specifying topics relating to the research hypothesis is used. The interview is focused on the
subjective experiences of the respondent, i.e., his attitudes and emotional responses
regarding the situation under study. The focused interview permits the interviewer to
obtain details of personal reactions, specific emotions and the like.

Merits: This type of interview is free from the inflexibility of formal methods, yet gives
the interview a set form and ensures adequate coverage of all the relevant topics. The
respondent is asked for certain information, yet he has plenty of opportunity to present
his views. The interviewer is also free to choose the sequence of questions and determine
the extent of probing.

Clinical Interview

This is similar to the focused interview but with a subtle difference. While the focused
interview is concerned with the effects of a specific experience, the clinical interview is
concerned with broad underlying feelings or motivations or with the course of the
individual’s life experiences.

The ‘personal history’ interview used in social case work, prison administration,
psychiatric clinics and in individual life history research is the most common type of
clinical interview. The specific aspects of the individual’s life history to be covered by
the interview are determined with reference to the purpose of the study and the
respondent is encouraged to talk freely about them.

Depth Interview

This is an intensive and searching interview aiming at studying the respondent’s opinions,
emotions or convictions on the basis of an interview guide. This requires much more
training in interpersonal skills than the structured interview. It deliberately aims to elicit
unconscious as well as extremely personal feelings and emotions.

This is generally a lengthy procedure designed to encourage free expression of affectively
charged information. It requires probing. The interviewer should totally avoid advising or
showing disagreement. Of course, he should use encouraging expressions like “uh-huh”
or “I see” to motivate the respondent to continue the narration. Sometimes the interviewer
has to face the problem of affections, i.e., the respondent may hold back from expressing
affective feelings. The interviewer should handle such situations with great care.

Approaches to Interview

Interviewing as a method of data collection has certain features. They are:

The Participants: The interviewer and the respondent are strangers. Hence, the
investigator has to get himself introduced to the respondent in an appropriate manner.

The Relationship between the Participants is a Transitory one: It has fixed
beginning and termination points. The interview proper is a fleeting, momentary
experience for them.

Interview is not a mere casual conversational exchange: Interview is a conversation
with a specific purpose, viz., obtaining information relevant to a study.

Interview is a mode of obtaining verbal answers to questions put verbally: The
interaction between the interviewer and the respondent need not necessarily be on a
face-to-face basis, because an interview can also be conducted over the telephone. Although
an interview is usually a conversation between two persons, it need not be limited to a single
respondent. It can also be conducted with a group of persons, such as family members, or
a group of children or a group of customers, depending on the requirements of the study.

Interview is an interactional process: The interaction between the interviewer and
the respondent depends upon how they perceive each other.

The respondent reacts to the interviewer’s appearance, behaviour, gestures, facial
expression and intonation, his perception of the thrust of the questions and his own
personal needs. As far as possible, the interviewer should try to be close to the
socio-economic level of the respondents. Moreover, he should realize that his respondents are
under no obligation to respond.

One should, therefore, be tactful and be alert to such reactions of the respondents as
lame excuses, suspicion, reluctance or indifference, and deal with them suitably. One
should also not argue or dispute. One should rather maintain an impartial and objective
attitude. Information furnished by the respondent in the interview is recorded by the
investigator. This poses a problem of seeing that recording does not interfere with the
tempo of conversation.

Interviewing is not a standardized process like that of a chemical technician: It is
rather a flexible psychological process. The implication of this feature is that the
interviewer cannot apply an unvarying standardized technique, because he is dealing with
respondents with varying motives and diverse perceptions. The extent of his success as an
interviewer is very largely dependent upon his insight and skill in dealing with varying
socio-psychological situations.

Qualities of Interviews

The requirements or conditions necessary for a successful interview are:

Data availability: The needed information should be available with the respondent. He
should be able to conceptualize it in terms of the study, and be capable of communicating
it.

Role perception: The respondent should understand his role and know what is required
of him. He should know what is a relevant answer and how complete it should be. He can learn
much of this from the interviewer’s introduction, explanations and questioning procedure.

The interviewer should also know his role: He should establish a permissive
atmosphere and encourage frank and free conversation. He should not affect the
interview situation through subjective attitude and argumentation.

Respondent’s motivation: The respondent should be willing to respond and give
accurate answers. This depends partly on the interviewer’s approach and skill. The
interviewer has an interest in the interview for the purpose of his research, but the
respondent has no personal interest in it. Therefore, the interviewer should establish a
friendly relationship with the respondent, and create in him an interest in the
subject-matter of the study. The
interviewer should try to reduce the effect of demotivating factors like the desire to get on
with other activities, embarrassment at ignorance, dislike of the interview content,
suspicion about the interviewer, and fear of consequences. He should also try to build up
the effect of motivating factors like curiosity, loneliness, politeness, sense of duty, respect
for the research agency and liking for the interviewer.

The above requirements are a reminder that the interview is an interaction process. The
investigator should keep this in mind and take care to see that his appearance and
behaviour do not distort the interview situation.

Merits of Interview Method

There are several real advantages to personal interviewing.

• First, the greatest value of this method is the depth and detail of information that
can be secured. When used with well-conceived schedules, an interview can
obtain a great deal of information. It far exceeds the mail survey in the amount and
quality of data that can be secured.
• Second, the interviewer can do more to improve the percentage of responses and
the quality of information received than is possible with any other method. He can note the conditions
of the interview situation, and adopt appropriate approaches to overcome such
problems as the respondent’s unwillingness, incorrect understanding of question,
suspicion, etc.
• Third, the interviewer can gather other supplemental information like economic
level, living conditions etc. through observation of the respondent’s environment.
• Fourth, the interviewer can use special scoring devices, visual materials and the
like in order to improve the quality of interviewing.
• Fifth, the accuracy and dependability of the answers given by the respondent can
be checked by observation and probing.
• Last, interview is flexible and adaptable to individual situations. Even more,
control can be exercised over the interview situation.

Demerits of Interview Method

Interviewing is not free from limitations.

• First, its greatest drawback is that it is costly in both money and time.
• Second, the interview results are often adversely affected by the interviewer’s mode
of asking questions and interactions, and incorrect recording, and also by the
respondent’s faulty perception, faulty memory, inability to articulate, etc.
• Third, certain types of personal and financial information may be refused in
face-to-face interviews. Such information might be supplied more willingly on mail
questionnaires, especially if they are to be unsigned.
• Fourth, interview poses the problem of recording information obtained from the
respondents. No foolproof system is available. Note-taking is invariably
distracting to both the respondent and the interviewer and affects the thread of the
conversation.
• Last, interview calls for highly skilled interviewers. The availability of such persons is
limited and the training of interviewers is often a long and costly process.

Interviewing Techniques in Business Research

The interview process consists of the following stages:

• Preparation
• Introduction
• Developing rapport
• Carrying the interview forward
• Recording the interview
• Closing the interview

Preparation

Interviewing requires some preplanning and preparation. The interviewer should
keep the copies of the interview schedule/guide (as the case may be) ready to use. He should
have the list of names and addresses of respondents, and he should regroup them into
contiguous groups in terms of location in order to save time and cost in travelling. The
interviewer should find out the general daily routine of the respondents in order to
determine the suitable timings for interview. Above all, he should mentally prepare
himself for the interview. He should think about how he should approach a respondent,
what mode of introduction he could adopt, what situations he may have to face and how
he could deal with them. The interviewer may come across such situations as the
respondent’s avoidance, reluctance, suspicion, diffidence, inadequate responses,
distortion, etc. The investigator should plan the strategies for dealing with them. If such
preplanning is not done, he will be caught unawares and fail to deal appropriately when he
actually faces any such situation. It is possible to plan in advance and still keep the plan and
the mind flexible and expectant of new developments.
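
The regrouping of the address list by location can be sketched in a few lines of Python. This is only an illustration under assumed data: the `(name, locality)` pairs, the ward names and the function name `group_by_locality` are hypothetical and do not come from the text.

```python
from collections import defaultdict


def group_by_locality(respondents):
    """Group respondent names by locality so each area can be covered in one trip.

    `respondents` is a list of (name, locality) pairs; both fields are
    hypothetical stand-ins for whatever the address list records.
    """
    groups = defaultdict(list)
    for name, locality in respondents:
        groups[locality].append(name)
    # Sort localities and names so the visiting list is stable and easy to read.
    return {locality: sorted(names) for locality, names in sorted(groups.items())}


address_list = [
    ("Respondent A", "North Ward"),
    ("Respondent B", "East Ward"),
    ("Respondent C", "North Ward"),
]
visit_plan = group_by_locality(address_list)
for locality, names in visit_plan.items():
    print(locality, "->", ", ".join(names))
```

Grouping by a locality label is the simplest reading of "contiguous groups"; a real field plan might instead order the groups by travel distance.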

Introduction

The investigator is a stranger to the respondents. Therefore, he should be properly
introduced to each of the respondents. What is the proper mode of introduction? There is
no one appropriate universal mode of introduction. The mode varies according to the type of
respondents. When making a study of an organization or institution, the head of the
organization should be approached first and his cooperation secured before contacting the
sample inmates/employees. When studying a community or a cultural group, it is
essential to approach the leader first and to enlist his cooperation. For a survey of urban
households, the research organization’s letter of introduction and the interviewer’s
identity card can be shown. In these days of fear of opening the door to a stranger,
residents’ cooperation can be easily secured if the interviewer attempts to get himself
introduced through a person known to them, say a popular person in the area, e.g., a social
worker. For interviewing rural respondents, the interviewer should never attempt to
approach them along with someone from the revenue department, for they would
immediately hide themselves, presuming that they are being contacted for collection of
land revenue or subscription to some government bond. He should also not approach
them through a local political leader, because persons who do not belong to his party will
not cooperate with the interviewer. It is rather desirable to approach the rural respondents
through the local teacher or social worker.

After getting himself introduced to the respondent in the most appropriate manner, the
interviewer can follow a sequence of procedures as under, in order to motivate the
respondent to permit the interview:

1. With a smile, greet the respondent in accordance with his cultural pattern.
2. Identify the respondent by name.
3. Describe the method by which the respondent was selected.
4. Mention the name of the organization conducting the research.
5. Assure the anonymity or confidential nature of the interview.
6. Explain the usefulness of the study.
7. Emphasize the value of the respondent’s cooperation, making such statements as
“You are among the few in a position to supply the information”. “Your response
is invaluable.” “I have come to learn from your experience and knowledge”.

Developing Rapport

Before starting the research interview, the interviewer should establish a friendly
relationship with the respondent. This is described as “rapport”. It means establishing a
relationship of confidence and understanding between the interviewer and the respondent.
It is a skill which depends primarily on the interviewer’s commonsense, experience,
sensitivity, and keen observation.

Start the conversation with a general topic of interest such as weather, current news,
sports event, or the like, perceiving the probable interest of the respondent from his context. Such
initial conversation may create a friendly atmosphere and a warm interpersonal
relationship and mutual understanding. However, the interviewer should “guard against
the over rapport” as cautioned by Herbert Hyman. Too much identification and too much
courtesy result in tailoring replies to the image of a “nice interviewer.” The interviewer
should use his discretion in striking a happy medium.

Carrying the Interview Forward

After establishing rapport, the technical task of asking questions from the interview
schedule starts. This task requires care, self-restraint, alertness and ability to listen with
understanding, respect and curiosity. In carrying on this task of gathering information
from the respondent by putting questions to him, the following guidelines may be
followed:

1. Start the interview. Carry it on in an informal and natural conversational style.
2. Ask all the applicable questions listed in the schedule, in the same order as they
appear there, without any elucidation or change in the wording. Do not take answers
for granted.
3. If interview guide is used, the interviewer may tailor his questions to each
respondent, covering of course, the areas to be investigated.
4. Know the objectives of each question so as to make sure that the answers
adequately satisfy the question objectives.
5. If a question is not understood, repeat it slowly with proper emphasis and
appropriate explanation, when necessary.
6. Take all answers naturally, never showing disapproval or surprise. When the
respondent does not meet with interruptions, denial, contradiction and other
harassment, he may feel free and may not try to withhold information. He will be
motivated to communicate when the atmosphere is permissive and the listener
is non-judgmental and genuinely absorbed in the revelations.
7. Listen quietly with patience and humility. Give not only undivided attention, but
also personal warmth. At the same time, be alert and analytic to incomplete,
non-specific and inconsistent answers, but avoid interrupting the flow of information.
If necessary, jot down unobtrusively the points which need elaboration or
verification for later and timelier probing. The appropriate technique for this
probing is to ask for further clarification in such a polite manner as “I am not sure
I understood fully. Is this … what you meant?”
8. Neither argue nor dispute.
9. Show genuine concern and interest in the ideas expressed by the respondent; at
the same time, maintain an impartial and objective attitude.
10. Do not reveal your own opinion or reaction. Even when you are asked for your
views, laugh off the request, saying “Well, your opinions are more important than
mine.”
11. At times the interview “runs dry” and needs re-stimulation. Then use such
expressions as “Uh-huh”, “That’s interesting”, “I see” or “Can you tell me more
about that?” and the like.
12. When the interviewee fails to supply his reactions to related past experiences,
re-present the stimulus situation, introducing appropriate questions which will aid
in revealing the past, such as “Under what circumstances did such and such a
phenomenon occur?” or “How did you feel about it?” and the like.
13. At times, the conversation may go off the track. Be alert to discover drifting, and steer
the conversation back to the track by some such remark as, “You know, I was very
much interested in what you said a moment ago. Could you tell me more about
it?”
14. When the conversation turns to some intimate subjects, and particularly when it
deals with crises in the life of the individual, emotional blockage may occur. Then
drop the subject for the time being and pursue another line of conversation for a
while so that a less direct approach to the subject can be made later.
15. When there is a pause in the flow of information, do not hurry the interview. Take
it as a matter of course with an interested look or a sympathetic half-smile. If the
silence is too prolonged, introduce a stimulus saying “You mentioned that…
What happened then?”

Additional Sittings

In the case of qualitative interviews involving longer duration, one single sitting will not
do, as it would cause interview weariness. Hence, it is desirable to have two or more
sittings with the consent of the respondent.

Recording the Interview

It is essential to record responses as they take place. If the note-taking is done after the
interview, a good deal of relevant information may be lost. Noting should be made in
the schedule under the respective question. It should be complete and verbatim. The
responses should not be summarized or paraphrased. How can complete recording be
made without interrupting the free flow of conversation? Electronic transcription through
devices like a tape recorder can achieve this. It has obvious advantages over note-taking
during the interview. But it also has certain disadvantages. Some respondents may object
to or fear “going on record”. Consequently the risk of a lower response rate will rise,
especially for sensitive topics.

If the interviewer knows short-hand, he can use it with advantage. Otherwise, he can
write rapidly by abbreviating words and using only key words and the like. However, even
the fast writer may fail to record all that is said at conversational speed. At such times, it
is useful to interrupt by some such comment as “that seems to be a very important point,
would you mind repeating it, so that I can get your words exactly.” The respondent is
usually flattered by this attention and the rapport is not disturbed.

The interviewer should also record all his probes and other comments on the schedule, in
brackets to set them off from responses. With the pre-coded structured questions, the
interviewer’s task is easy. He has simply to ring the appropriate code or tick the
appropriate box, as the case may be. He should not make mistakes by carelessly ringing
or ticking a wrong item.

Closing the Interview

After the interview is over, take leave of the respondent, thanking him with a friendly
smile. In the case of a qualitative interview of longer duration, select the occasion for
departure more carefully. Assembling the papers for putting them in the folder at the time
of asking the final question sets the stage for a final handshake, a thank-you and a good-
bye. If the respondent desires to know the result of the survey, note down his name and
address so that a summary of the result could be posted to him when ready.

Editing

At the close of the interview, the interviewer must edit the schedule to check that he has
asked all the questions and recorded all the answers and that there is no inconsistency
between answers. Abbreviations in recording must be replaced by full words. He must
ensure that everything is legible. It is desirable to record a brief sketch of his impressions
of the interview and observational notes on the respondent’s living environment, his
attitude to the survey, difficulties, if any, faced in securing his cooperation and the
interviewer’s assessment of the validity of the respondent’s answers.
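
Where schedules are later entered into a computer, the completeness and consistency checks described above can be partly automated. The following is a minimal sketch, not a prescribed procedure: the question names (`age`, `years_employed`, `occupation`), the rule and the function `edit_schedule` are all hypothetical examples.

```python
def edit_schedule(answers, required_questions, consistency_rules):
    """Return a list of problems found in one completed schedule.

    answers            -- dict mapping a question name to the recorded answer
    required_questions -- question names that must have an answer
    consistency_rules  -- list of (description, rule) pairs; each rule takes
                          the answers dict and returns True when consistent
    """
    problems = []
    # Completeness: every applicable question must have a recorded answer.
    for question in required_questions:
        value = answers.get(question)
        if value is None or str(value).strip() == "":
            problems.append(f"missing answer: {question}")
    # Consistency: answers must not contradict one another.
    for description, rule in consistency_rules:
        try:
            consistent = rule(answers)
        except (KeyError, TypeError):
            continue  # an answer the rule needs is missing; already reported above
        if not consistent:
            problems.append(f"inconsistent: {description}")
    return problems


# Hypothetical schedule with one blank answer and one contradiction.
answers = {"age": 17, "years_employed": 25, "occupation": ""}
required = ["age", "years_employed", "occupation"]
rules = [
    ("years employed cannot exceed age", lambda a: a["years_employed"] <= a["age"]),
]
for problem in edit_schedule(answers, required, rules):
    print(problem)
```

Such a check only flags problems; deciding how to resolve them, and recording impressions of the interview, remains the interviewer's task.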

Interview Problems

In personal interviewing, the researcher must deal with three major problems: inadequate
response, non-response and interviewer’s bias.

Inadequate response

Kahn and Cannell distinguish five principal symptoms of inadequate response. They are:

• partial response, in which the respondent gives a relevant but incomplete answer
• non-response, when the respondent remains silent or refuses to answer the
question
• irrelevant response, in which the respondent’s answer is not relevant to the
question asked
• inaccurate response, when the reply is biased or distorted, and
• verbalized response problem, which arises on account of the respondent’s failure to
understand a question or lack of information necessary for answering it.

Interviewer’s Bias

The interviewer is an important cause of response bias. He may resort to cheating by
‘cooking up’ data without actually interviewing. The interviewer can influence the
responses by inappropriate suggestions, word emphasis, tone of voice and question
rephrasing. His own attitudes and expectations about what a particular category of
respondents may say or think may bias the data. The respondent’s perception of the
interviewer’s characteristics (education, apparent social status, etc.) may also bias his
answers. Another source of response bias arises from the interviewer’s perception of the
situation: if he regards the assignment as impossible or sees the results of the survey as
possible threats to personal interests or beliefs, he is likely to introduce bias.

As interviewers are human beings, such biasing factors can never be overcome
completely, but their effects can be reduced by careful selection and training of
interviewers, proper motivation and supervision, standardization of interview procedures
(use of standard wording in survey questions, standard instructions on probing procedure
and so on) and standardization of interviewer behaviour. There is need for more research
on ways to minimize bias in the interview.

Non-response

Non-response refers to failure to obtain responses from some sample respondents. There
are many sources of non-response: non-availability, refusal, incapacity and
inaccessibility.

Non-availability

Some respondents may not be available at home at the time of call. This depends upon
the nature of the respondent and the time of calls. For example, employed persons may
not be available during working hours. Farmers may not be available at home during
cultivation season. Selection of appropriate timing for calls could solve this problem.
Evenings and weekends may be favourable interviewing hours for such respondents. If
someone is available at home, then the respondent’s hours of availability can be ascertained and
the next visit can be planned accordingly.

Refusal

Some persons may refuse to furnish information because they are ill-disposed, or
approached at the wrong hour and so on. Although a hard core of refusals remains,
another try or perhaps another approach may find some of them cooperative.

Incapacity

Incapacity or inability may refer to illness which prevents a response during the entire survey period.
It may also arise on account of a language barrier.

Inaccessibility

Some respondents may be inaccessible. Some may not be found due to migration and
other reasons. Non-responses reduce the effective sample size and its representativeness.

Methods and Aims of control of non-response

Kish suggests the following methods to reduce either the percentage of non-response or
its effects:

1. Improved procedures for collecting data are the most obvious remedy for non-
response. Improvements advocated are (a) guarantees of anonymity, (b)
motivation of the respondent to co-operate (c) arousing the respondents’ interest
with clever opening remarks and questions, (d) advance notice to the respondents.
2. Call-backs are the most effective way of reducing not-at-homes in personal
interviews, as are repeated mailings to no-returns in mail surveys.
3. Substitution for the non-response is often suggested as a remedy. Usually this is a
mistake because the substitutes resemble the respondents rather than the
non-respondents. Nevertheless, beneficial substitution methods can sometimes be
designed with reference to important characteristics of the population. For
example, in a farm management study, the farm size is an important variable and
if the sampling is based on farm size, substitution for a respondent with a
particular size holding by another with the holding of the same size is possible.

Attempts to reduce the percentage or effects of non-response aim at reducing the bias
caused by differences between non-respondents and respondents. The non-response bias
should not be confused with the reduction of sample size due to non-response. The latter
effect can be easily overcome, either by anticipating the size of non-response in designing
the sample size or by compensating for it with a supplement. These adjustments increase
the size of the response and the sampling precision, but they do not reduce the
non-response percentage or bias.
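The adjustment of sample size for anticipated non-response can be illustrated with a small calculation. The sketch below is only an illustration: the function name, the target of 400 completed interviews and the expected response rate of 80 per cent are all assumed figures, not taken from the text.

```python
import math

def initial_sample_size(target_responses, expected_response_percent):
    """Number of respondents to approach so that, at the expected
    response rate, about target_responses interviews are completed."""
    if not 0 < expected_response_percent <= 100:
        raise ValueError("response rate must be between 0 and 100 per cent")
    return math.ceil(target_responses * 100 / expected_response_percent)

# Hypothetical figures: 400 completed interviews wanted, 80% expected to respond.
approached = initial_sample_size(400, 80)
print(approached)
```

As the paragraph above stresses, enlarging the sample in this way restores the number of responses and the sampling precision only; it does nothing about the bias that arises if non-respondents differ from respondents.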

Telephone Interviewing
Telephone interviewing is a non-personal method of data collection. It may be used as a
major method or supplementary method.

It will be useful in the following situations:

1. When the universe is composed of those persons whose names are listed in
telephone directories, e.g. business houses, business executives, doctors, other
professionals.
2. When the study requires responses to five or six simple questions, e.g. a radio or
television programme survey.
3. When the survey must be conducted in a very short period of time, provided the
units of study are listed in telephone directory.
4. When the subject is interesting or important to respondents, e.g. a survey relating
to trade conducted by a trade association or a chamber of commerce, a survey
relating to a profession conducted by the concerned professional association.
5. When the respondents are widely scattered.

Advantages: The advantages of telephone interview are:

1. The survey can be completed at very low cost, because telephone survey does not
involve travel time and cost and all calls can be made from a single location.
2. Information can be collected in a short period of time. Five to ten interviews can be
conducted per hour.
3. Quality of response is good, because interviewer bias is reduced as there is no
face-to-face contact between the interviewer and the respondent.
4. This method of interviewing is less demanding upon the interviewer.
5. It does not involve field work.
6. Individuals who could not be reached or who might not care to be interviewed
personally can be contacted easily.

Disadvantages: Telephone interview has several limitations:

1. It is limited to persons with listed telephones. The sample will be distorted if the
universe includes persons who are not on the phone. In several countries like India, only
a few persons have phone facility, and that too in urban areas only. Telephone facility is
very rare in rural areas. Hence, the method is not useful for studying the general
population.
2. There is a limit to the length of interview. Usually, a call cannot last over five
minutes. Only five or six simple questions can be asked. Hence, telephone cannot
be used for a longer questionnaire.
3. The type of information to be collected is limited to what can be given in simple,
short answers of a few words. Hence, telephone is not suitable for complex
surveys, and there is no possibility of obtaining detailed information.
4. If the questions cover personal matters, most respondents will not cooperate with
the interviewer.
5. The respondent’s characteristics and environment cannot be observed.
6. It is not possible to use visual aids like charts, maps, illustrations or complex
scales.
7. It is rather difficult to establish rapport between the respondent and the
interviewer.
8. The respondent has no means of verifying the identity of the interviewer, so
suspicions are difficult to overcome.

Group Interviews

A group interview may be defined as a method of collecting primary data in which a
number of individuals with a common interest interact with each other. Unlike a personal
interview, the flow of information in a group interview is multi-dimensional. The group
may consist of about six to eight individuals with a common interest. The interviewer acts
as the discussion leader. Free discussion is encouraged on some aspect of the subject under
study. The discussion leader stimulates the group members to interact with each other.

The desired information may be obtained through a self-administered questionnaire or
interview, with the discussion serving as a guide to ensure consideration of the areas of
concern. In particular, the interviewer looks for evidence of common elements of
attitudes, beliefs, intentions and opinions among individuals in the group. At the same
time, he must be aware that a single comment by a member can provide important
insight.

Samples for group interview can be obtained through schools, clubs and other organized
groups. The group interview technique can be employed by researchers in studying
people’s reactions on public amenities, public health projects, welfare schemes etc. It is a
popular method in marketing research to evaluate new product or service concepts,
brand names, packages, promotional strategies and attitudes. When an organization
needs a great variety of information in as much detail as possible at a relatively low cost
and in a short period of time, the group interview technique is more useful. It can be used
to generate primary data in the exploratory phase of a project.

Advantages: The advantages of this technique are:

1. The respondents comment freely and in detail.
2. The method is highly flexible. The flexibility helps the researcher work with new
concepts or topics which have not been previously investigated.
3. Visual aids can be used.
4. A group can be interviewed in the time required for one personal interview.
5. The client can watch the interview unobserved.
6. Respondents are more articulate in a group than in individual interviews.
7. The technique eliminates the physical limitations inherent in individual
interviews.

Disadvantages: This method is not free from drawbacks.

1. It is difficult to get a representative sample.
2. There is the possibility of the group being dominated by one individual.
3. The respondents may answer to please the interviewer or the other members in the
group.

Nevertheless, the advantages of this technique outweigh the disadvantages and the
technique is found to be useful for surveys on topics of common interest.

Summary

Interviewing is one of the prominent methods of data collection. The interview may be
classified into: (a) structured or directive interview, (b) unstructured or non-directive
interview, (c) focused interview, (d) clinical interview and (e) depth interview. A
structured interview is conducted with a detailed standardized schedule. The same
questions are put to all the respondents and in the same order. The non-directive method
is the least structured one. The interviewer encourages the respondent to talk freely about
a given topic with a minimum of prompting or guidance. In the focused type of interview,
a detailed pre-planned schedule is not used. It is a semi-structured interview where the
investigator attempts to focus the discussion on the actual effects of a given experience to
which the respondents have been exposed. The clinical interview is similar to the focused
interview but with a subtle difference. While the focused interview is concerned with the
effects of a specific experience, the clinical interview is concerned with broad underlying
feelings or motivations or with the course of the individual’s life experiences. The depth
interview is an intensive and searching interview aiming at studying the respondent’s
opinions, emotions or convictions on the basis of an interview guide. It requires much
more training in inter-personal skills than the structured interview, and it deliberately
aims to elicit unconscious as well as extremely personal feelings and emotions.

Interviewing as a method of data collection has certain features:

1. Certain requirements or conditions are necessary for a successful interview.
2. There are several real advantages to personal interviewing.
3. Interviewing is not free from limitations.

In personal interviewing, the researcher must deal with three major problems:
inadequate response, non-response and interviewer’s bias. Telephone interviewing is a
non-personal method of data collection. It may be used as a major method or a
supplementary method. A group interview may be defined as a method of collecting
primary data in which a number of individuals with a common interest interact with each
other. In a group interview the flow of information is multi-dimensional. The group may
consist of about six to eight individuals with a common interest. The interviewer acts as
the discussion leader. The quality of data collected depends ultimately upon the
capabilities of interviewers. Hence, careful selection and proper training of interviewers
is essential.


MB0034- Unit 11-Processing Data


Unit 11-Processing Data

Meaning of Data Processing

Data in the real world often come in such large quantity and in such a variety of formats
that a meaningful interpretation cannot be achieved straightaway. Social science
research, to be very specific, draws conclusions using both primary and secondary data.
To arrive at a meaningful interpretation of the research hypothesis, the researcher has to
prepare his data for this purpose. This preparation involves the identification of data
structures, the coding of data and the grouping of data for preliminary research
interpretation. This data preparation for research analysis is termed processing of data.
Further selection of tools for analysis would to a large extent depend on the results of
this data processing.

Data processing is an intermediary stage of work between data collection and data
interpretation. The data gathered in the form of questionnaires/interview schedules/field
notes/data sheets mostly take the form of a large volume of research variables. The
research variables recognized are the result of the preliminary research plan, which also
sets out the data processing methods beforehand. Processing of data requires advance
planning, and this planning may cover such aspects as identification of variables,
hypothetical relationships among the variables and the tentative research hypothesis.

The various steps in processing of data may be stated as:

- Identifying the data structures

- Editing the data

- Coding and classifying the data

- Transcription of data

- Tabulation of data.

Objectives:

After studying this lesson you should be able to understand:

• Checking for analysis
• Editing
• Coding
• Classification
• Transcription of data
• Tabulation
• Construction of Frequency Table
• Components of a table
• Principles of table construction
• Frequency distribution and class intervals
• Graphs, charts and diagrams
• Types of graphs and general rules
• Quantitative and qualitative analysis
• Measures of central tendency
• Dispersion
• Correlation analysis
• Coefficient of determination

Checking for Analysis

In the data preparation step, the data are prepared in a data format which allows the
analyst to use modern analysis software such as SAS or SPSS. The major criterion in this
is to define the data structure. A data structure is a dynamic collection of related variables
and can be conveniently represented as a graph where nodes are labelled by variables.
The data structure also defines the stages of the preliminary relationships between
variables/groups that have been pre-planned by the researcher. Most data structures can
be graphically presented to give clarity as to the framed research hypothesis. A sample
structure could be a linear structure, in which one variable leads to the other and finally
to the resultant end variable.
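A linear data structure of this kind can be written down as a small graph. The following is a minimal sketch, assuming three hypothetical variables (advertising exposure, brand awareness, purchase intention) that are not taken from the text; each node is labelled by a variable and points to the variable it is presumed to lead to.

```python
# Hypothetical variables; each key is a node, each value lists the
# variables that node is presumed to lead to.
data_structure = {
    "advertising_exposure": ["brand_awareness"],
    "brand_awareness": ["purchase_intention"],
    "purchase_intention": [],  # resultant end variable
}

def linear_path(structure, start):
    """Follow a linear structure from the start variable to the end variable."""
    path = [start]
    while structure[path[-1]]:
        successors = structure[path[-1]]
        if len(successors) != 1 or successors[0] in path:
            raise ValueError("structure is not linear at " + path[-1])
        path.append(successors[0])
    return path

print(" -> ".join(linear_path(data_structure, "advertising_exposure")))
```

Writing the structure down explicitly, even in this simple form, makes it easier to see which variables must be present in the data set before analysis begins.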

The identification of the nodal points and the relationships among the nodes could
sometimes be a more complex task than estimated. When the task is complex, involving
several types of instruments being used for the same research question, the
procedures for drawing the data structure would involve a series of steps. In several
intermediate steps, the heterogeneous data structure of the individual data sets can be
harmonized to a common standard and the separate data sets are then integrated into a
single data set. However, the clear definition of such data structures would help in the
further processing of data.
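The harmonization of heterogeneous data sets into a single data set may be pictured as follows. This is a sketch under stated assumptions: the two small data sets, their variable names and the common gender standard (1 = male, 2 = female) are hypothetical and serve only to show the idea.

```python
# Two hypothetical data sets for the same research question, collected with
# different instruments and hence with different variable names and codings.
questionnaire = [{"id": "Q1", "sex": "M", "age_years": 34}]
interview = [{"resp": "I1", "gender": "male", "age": "41"}]

# Assumed common standard for gender codes.
GENDER_STANDARD = {"M": 1, "male": 1, "F": 2, "female": 2}

def harmonize(record, id_key, gender_key, age_key, source):
    """Map one record onto the common variable names and codes."""
    return {
        "respondent_id": record[id_key],
        "gender": GENDER_STANDARD[record[gender_key]],
        "age": int(record[age_key]),
        "source": source,
    }

# Integrate the separately harmonized data sets into a single data set.
combined = (
    [harmonize(r, "id", "sex", "age_years", "questionnaire") for r in questionnaire]
    + [harmonize(r, "resp", "gender", "age", "interview") for r in interview]
)
for row in combined:
    print(row)
```

Keeping a "source" column in the integrated data set lets the researcher trace any record back to the instrument from which it came.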

Editing

The next step in the processing of data is editing of the data instruments. Editing is a
process of checking to detect and correct errors and omissions. Data editing happens at
two stages, the first at the time of recording of the data and the second at the time of
analysis of data.

Data Editing at the Time of Recording of Data

Document editing and testing of the data at the time of data recording is done keeping
the following questions in mind.

• Do the filters agree or are the data inconsistent?
• Have ‘missing values’ been set to values which are the same for all research
questions?
• Have variable descriptions been specified?
• Have labels for variable names and value labels been defined and written?

All editing and cleaning steps are documented, so that, the redefinition of variables or
later analytical modification requirements could be easily incorporated into the data sets.

Data Editing at the Time of Analysis of Data

Data editing is also a requisite before the analysis of data is carried out. This ensures that
the data are complete in all respects before being subjected to further analysis. Some of
the usual checklist questions that a researcher can use for editing data sets before
analysis would be:

1. Is the coding frame complete?
2. Is the documentary material sufficient for the methodological description of the
study?
3. Is the storage medium readable and reliable?
4. Has the correct data set been framed?
5. Is the number of cases correct?
6. Are there differences between questionnaire, coding frame and data?
7. Are there undefined and so-called “wild codes”?
8. Does the first counting of the data agree with the original documents of the
researcher?
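Two of the checklist questions above, the number of cases and the presence of “wild codes”, lend themselves to a mechanical check. The sketch below assumes a hypothetical coding frame and four hypothetical records; the code 9 for a missing value is likewise an assumption.

```python
# Hypothetical coding frame: the set of defined codes for each variable.
coding_frame = {"gender": {1, 2, 9}, "owns_vehicle": {1, 2, 9}}  # 9 = missing
expected_cases = 4

records = [
    {"gender": 1, "owns_vehicle": 2},
    {"gender": 2, "owns_vehicle": 1},
    {"gender": 3, "owns_vehicle": 1},  # 3 is not defined in the coding frame
    {"gender": 1, "owns_vehicle": 9},
]

def find_wild_codes(records, coding_frame):
    """Return (case number, variable, value) for every undefined code."""
    wild = []
    for case_no, record in enumerate(records, start=1):
        for variable, allowed in coding_frame.items():
            value = record.get(variable)
            if value not in allowed:
                wild.append((case_no, variable, value))
    return wild

print("number of cases correct:", len(records) == expected_cases)
print("wild codes:", find_wild_codes(records, coding_frame))
```

Each wild code found in this way should be traced back to the original instrument and corrected, or marked as missing if the original answer cannot be recovered.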

The editing step checks for the completeness, accuracy and uniformity of the data as
created by the researcher.

Completeness: The first step of editing is to check whether there is an answer to all the
questions/variables set out in the data set. If there is any omission, the researcher
may sometimes be able to deduce the correct answer from other related data on the
same instrument. If this is possible, the data set has to be rewritten on the basis of the new
information. For example, the approximate family income can be inferred from other
answers to probes such as occupation of family members, sources of income, and
approximate spending, saving and borrowing habits of family members. If the
information is vital and has been found to be incomplete, then the researcher can
contact the respondent personally again and solicit the requisite data. If none of these
steps can be taken, the data must be marked as “missing”.

Accuracy: Apart from checking for omissions, the accuracy of each recorded answer
should be checked. A random check process can be applied to trace the errors at this step.
Consistency in response can also be checked at this step. Cross verification of a few
related responses would help in checking for consistency in responses. The reliability of
the data set would heavily depend on this step of error correction. While clear
inconsistencies should be rectified in the data sets, fake responses should be dropped from
the data sets.

Uniformity: In editing data sets, another keen lookout should be for any lack of
uniformity in the interpretation of questions and instructions by the data recorders. For
instance, the responses towards a specific feeling could have been queried from a positive
as well as a negative angle. While interpreting the answers, care should be taken to
record the answer as a “positive question” response or as a “negative question” response.
Uniformity checks look for consistency in coding throughout the questionnaire/interview
schedule response/data set.

The final point in the editing of data set is to maintain a log of all corrections that have
been carried out at this stage. The documentation of these corrections helps the researcher
to retain the original data set.

Coding

The edited data are then subject to codification and classification. Coding process assigns
numerals or other symbols to the several responses of the data set. It is therefore a pre-
requisite to prepare a coding scheme for the data set. The recording of the data is done on
the basis of this coding scheme.

The responses collected in a data sheet vary: sometimes the response could be a
choice among multiple options, sometimes it could be in terms of values
and sometimes it could be alphanumeric. If some codification is done to the responses
at the recording stage itself, it would be useful in the data analysis.
When codification is done, it is imperative to keep a log of the codes allotted to the
observations. This code sheet will help in the identification of variables/observations and
the basis for such codification.

The first items to be coded in a primary data set are the individual observations
themselves. This response sheet coding benefits the research in that the verification and
editing of recordings and further contact with respondents can be achieved without any
difficulty. The codification can be made at the time of distribution of the primary data
sheets itself. The codes can be alphanumeric to keep track of where and to whom each
sheet has been sent. For instance, if the respondents are members of the public in
different localities, the sheets that are distributed in a specific locality may carry a unique
part code which is alphabetic. To this alphabetic code, a numeric code can be attached to
distinguish the person to whom the primary instrument was distributed. This also helps
the researcher to keep track of who the respondents are and who the probable
respondents are from whom primary data sheets are yet to be collected. Even at a later
stage, any specific queries on a specific response sheet can be clarified.
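The alphanumeric sheet code described above can be sketched in a few lines. The locality letters, the three-digit serial number and the helper name `sheet_code` are assumptions made for illustration only.

```python
def sheet_code(locality_code, serial):
    """Alphabetic locality part followed by a numeric respondent part."""
    return f"{locality_code}{serial:03d}"

# Hypothetical distribution: three sheets in locality A, two in locality B.
distributed = {sheet_code("A", n) for n in range(1, 4)} | {
    sheet_code("B", n) for n in range(1, 3)
}
returned = {"A001", "A003", "B002"}

# Sheets that are yet to be collected from probable respondents.
pending = sorted(distributed - returned)
print(pending)
```

Because every sheet carries its own code, a query on a particular response can be traced to the locality and the person concerned even at a later stage.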

The variables or observations in the primary instrument would also need codification,
especially when they are categorized. The categorization could be on a scale, i.e., most
preferable to not preferable, or it could be very specific, such as Gender classified as Male
and Female. Certain classifications can lead to open-ended categories, such as an
education classification of Illiterate, Graduate, Professional and “Others, please specify”.
In such instances, the codification needs to be carefully done to include all possible
responses under “Others, please specify”. If the preparation of an exhaustive list is not
feasible, then it will be better to create a separate variable for the “Others, please specify”
category and record all responses as such.

Numeric Coding: Coding need not necessarily be numeric. It can also be alphabetic.
Coding has to be numeric, however, when the variable is subject to further parametric
analysis.

Alphabetic Coding: A mere tabulation or frequency count or graphical representation of
the variable may be given in an alphabetic coding.

Zero Coding: A coding of zero has to be assigned carefully to a variable. In many
instances, when manual analysis is done, a code of 0 would imply a “no response” from
the respondents. Hence, if a value of 0 is to be given to specific responses in the data
sheet, it should not lead to the same interpretation of ‘non response’. For instance, where
there is a tendency to give a code of 0 to a ‘no’, a coding other than 0 should be
given in the data sheet. An illustration of the coding process of some of the demographic
variables is given in the following table.

= Could be treated as a separate variable/observation and the actual response could be
recorded. The new variable could be termed as “other occupation”.
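The coding rules discussed in this section can be brought together in a small coding scheme. The sketch below is not the table referred to above; the categories, the numeric codes and the missing-value code 9 are all hypothetical and chosen only to show numeric coding, a “no” that is not coded 0, and a separate “other occupation” variable.

```python
MISSING = 9  # hypothetical "no response" code, deliberately not 0

# Hypothetical coding scheme for three demographic variables.
coding_scheme = {
    "gender": {"Male": 1, "Female": 2},
    "owns_vehicle": {"Yes": 1, "No": 2},  # "No" is coded 2, not 0
    "occupation": {"Farmer": 1, "Employee": 2, "Business": 3, "Others": 4},
}

def code_response(variable, answer):
    """Return the numeric code for an answer, or MISSING for no response."""
    if answer is None or answer == "":
        return MISSING
    return coding_scheme[variable][answer]

def code_occupation(answer):
    """Return (occupation code, text for the 'other occupation' variable)."""
    if answer in coding_scheme["occupation"]:
        return coding_scheme["occupation"][answer], ""
    if not answer:
        return MISSING, ""
    # An unlisted occupation is coded as "Others" and the actual response kept.
    return coding_scheme["occupation"]["Others"], answer

print(code_response("owns_vehicle", "No"))
print(code_response("gender", None))
print(code_occupation("Potter"))
```

A copy of such a scheme, handed to whoever enters the data, helps to ensure that the data are recorded in the way the researcher intends to view them.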

The coding sheet needs to be prepared carefully, if the data recording is not done by the
researcher, but is outsourced to a data entry firm or individual. In order to enter the data
in the same perspective, as the researcher would like to view it, the data coding sheet is to
be prepared first and a copy of the data coding sheet should be given to the outsourcer to
help in the data entry procedure. Sometimes, the researcher might not be able to code the
data from the primary instrument itself. He may need to classify the responses and then
code them. For this purpose, classification of data is also necessary at the data entry
stage.

Classification

When open ended responses have been received, classification is necessary to code the
responses. For instance, the income of the respondent could be an open-ended question.
From all responses, a suitable classification can be arrived at. A classification method
should meet certain requirements or should be guided by certain rules.

First, classification should be linked to the theory and the aim of the particular study. The
objectives of the study will determine the dimensions chosen for coding. The
categorization should meet the information required to test the hypothesis or investigate
the questions.

Second, the scheme of classification should be exhaustive. That is, there must be a
category for every response. For example, the classification of marital status into three
categories, viz., “married”, “single” and “divorced”, is not exhaustive, because responses
like “widower” or “separated” cannot be fitted into the scheme. Here, an open-ended
question will be the best mode of getting the responses. From the responses collected, the
researcher can fit a meaningful and theoretically supportive classification. The inclusion
of the classification “Others” tends to absorb the scattered but few responses from the
data sheets. But the “others” category has to be carefully used by the researcher, since it
tends to defeat the very purpose of classification, which is designed
to distinguish between observations in terms of the properties under study. The
classification “others” will be very useful when a minority of respondents in the data set
give varying answers. For instance, the newspaper reading habits of respondents may be
surveyed. Of 100 respondents, 95 could be easily classified into 5 large reading groups,
while 5 respondents could each have given a unique answer. These answers, rather than
being separately considered, could be clubbed under the “others” heading for meaningful
interpretation of respondents and reading habits.

Third, the categories must also be mutually exclusive, so that each case is classified only
once. This requirement is violated when some of the categories overlap or different
dimensions are mixed up.
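The second and third rules can be tested on a proposed scheme before coding begins. The following sketch assumes hypothetical monthly income classes, with the lower limit included and the upper limit excluded; it reports responses that fit no category (the scheme is not exhaustive) and responses that fit more than one (the categories overlap).

```python
# Hypothetical income classes as (label, lower limit, upper limit);
# the lower limit is included in the class, the upper limit is not.
good_classes = [
    ("Low", 0, 10000),
    ("Middle", 10000, 30000),
    ("High", 30000, float("inf")),
]
overlapping_classes = [("Low", 0, 10000), ("Middle", 8000, 30000)]

def matching_classes(value, classes):
    """Labels of all classes into which the value falls."""
    return [label for label, lower, upper in classes if lower <= value < upper]

def check_scheme(values, classes):
    """Return (values fitting no class, values fitting more than one class)."""
    unclassified = [v for v in values if len(matching_classes(v, classes)) == 0]
    ambiguous = [v for v in values if len(matching_classes(v, classes)) > 1]
    return unclassified, ambiguous

incomes = [4500, 9000, 10000, 45000]
print(check_scheme(incomes, good_classes))
print(check_scheme(incomes, overlapping_classes))
```

An empty result on both counts indicates that, for the responses checked, every case is classified once and only once.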

The number of categories for a specific question/observation at the coding stage
should be the maximum permissible, since reducing the categories at the analysis level
would be easier than splitting an already classified group of responses. However, the
number of categories is limited by the number of cases and the anticipated statistical
analyses that are to be used on the observation.

Transcription of Data

When the number of observations collected by the researcher is not very large, the simple
inferences which can be drawn from the observations can be transferred to a data sheet,
which is a summary of all responses on all observations from a research instrument. The
main aim of transcription is to minimize the shuffling between several responses
and several observations. Suppose a research instrument contains 120 responses and the
observations have been collected from 200 respondents; a simple summary of one response
from all 200 observations would require shuffling of 200 pages. The process is quite
tedious if several summary tables are to be prepared from the instrument. The
transcription process helps in the presentation of all responses and observations on data
sheets, which can help the researcher to arrive at preliminary conclusions as to the nature
of the sample collected etc. Transcription is hence an intermediary process between data
coding and data tabulation.

Methods of Transcription

The researcher may adopt a manual or computerized transcription. Long work sheets,
sorting cards or sorting strips could be used by the researcher to manually transcribe the
responses. The computerized transcription could be done using a package such as a
spreadsheet, text files or a database.

The main requisite for a transcription process is the preparation of the data sheets where
observations are the rows of the data sheet and the responses/variables are the columns.
Each variable should be given a label so that long questions can be
covered under the label names. The label names are thus the links to specific questions in
the research instrument. For instance, opinion on consumer satisfaction could be
identified through a number of statements (say 10); the data sheet does not contain the
details of the statement, but gives a link to the question in the research instrument through
variable labels. In this instance the variable names could be given as CS1, CS2, CS3,
CS4, CS5, CS6, CS7, CS8, CS9 and CS10. The label CS indicates consumer
satisfaction and the numbers 1 to 10 indicate the statement measuring consumer
satisfaction. Once the labelling process has been done for all the responses in the research
instrument, the transcription of the response is done.
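A data sheet of this form, with observations as rows and labelled variables as columns, may be sketched as below. The labels CS1 to CS10 follow the text; the label descriptions, the respondent codes and the recorded values are placeholders invented for illustration.

```python
# The statement wording stays in the research instrument; the data sheet
# carries only the label. The descriptions here are placeholders.
variable_labels = {
    f"CS{i}": f"Consumer satisfaction, statement {i}" for i in range(1, 11)
}

# One row per observation, one column per labelled variable
# (hypothetical respondent codes and responses).
data_sheet = [
    {"respondent": "A001", **{f"CS{i}": 3 for i in range(1, 11)}},
    {"respondent": "A002", **{f"CS{i}": 4 for i in range(1, 11)}},
]

columns = list(variable_labels)
print(columns)
print(variable_labels["CS7"])
```

The dictionary of labels plays the part of the link between each column of the data sheet and the corresponding question in the instrument.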

Manual Transcription

When the sample size is manageable, the researcher need not use any computerization
process to analyze the data. The researcher could prefer a manual transcription and
analysis of responses. Manual transcription would be chosen when the number of
responses in a research instrument is small, say 10 responses, and the number of
observations collected is within 100. A transcription sheet of 100×50 rows/columns
(assuming each response has 5 options) can be easily managed by a researcher manually.
If, on the other hand, the variables in the research instrument are more than 40 and each
variable has 5 options, this leads to a worksheet of 100×200 size or more, which might not
be easily managed by the researcher manually. In the second instance, if the number of
observations is less than 30, then the worksheet could still be attempted manually. In all
other instances, it is advisable to use a computerized transcription process.
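The worksheet sizes quoted above follow from a simple multiplication, which the short sketch below reproduces. The function name is hypothetical, and the second call uses exactly 40 variables to match the 100×200 figure.

```python
def worksheet_size(observations, variables, options_per_variable):
    """Rows are observations; columns are one per option of every variable."""
    return observations, variables * options_per_variable

print(worksheet_size(100, 10, 5))  # (100, 50)
print(worksheet_size(100, 40, 5))  # (100, 200)
```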

Long Worksheets

Long worksheets require quality paper, preferably chart sheets, thick enough to last
several usages. These worksheets normally are ruled both horizontally and vertically,
allowing responses to be written in the boxes. If one sheet is not sufficient, the researcher
may use multiple ruled sheets to accommodate all the observations. Headings of responses,
which are variable names, and their coding (options) are filled in the first two rows. The
first column contains the code of observations. For each variable, the responses from
the research instrument are then transferred to the worksheet by ticking the specific
option that the respondent has chosen. If the variable cannot be coded into categories,
requisite length for recording the actual response of the respondent should be provided for
in the work sheet.

The worksheet can then be used for preparing the summary tables or can be subjected to
further analysis of data. The original research instruments can now be kept aside as safe
documents. Copies of the data sheets can also be kept for future reference. As has been
discussed under the editing section, the transcribed data have to be subjected to testing to
ensure error-free transcription of data.

Transcription can be made as and when the edited instrument is ready for processing.
Once all schedules/questionnaires have been transcribed, the frequency tables can be
constructed straight from the worksheet. Other methods of manual transcription include
adoption of sorting strips or cards.

In olden days, data entry and processing were made through mechanical and
semi-automatic devices such as the key punch using punch cards. The arrival of computers
has changed the data processing methodology altogether.

Tabulation

The transcription of data can be used to summarize and arrange the data in compact form
for further analysis. The process is called tabulation. Thus, tabulation is a process of
summarizing raw data and displaying them in compact statistical tables for further analysis.
It involves counting the number of cases falling into each of the categories identified by
the researcher.

Tabulation can be done manually or through the computer. The choice depends upon the
size and type of study, cost considerations, time pressures and the availability of software
packages. Manual tabulation is suitable for small and simple studies.

Manual Tabulation

When data are transcribed in a classified form as per the planned scheme of
classification, category-wise totals can be extracted from the respective columns of the
work sheets. A simple frequency table counting the number of “Yes” and “No” responses
can be made easily by counting the “Y” response column and the “N” response column in
the manual worksheet prepared earlier. This is a one-way frequency table, and such tables
are readily inferred from the totals of each column in the work sheet. Sometimes the
researcher has to cross tabulate two variables, for instance, the age group of vehicle
owners. This requires a two-way classification and cannot be inferred straight from the
column totals, although preparing it does not call for any technical knowledge or skill. If
one wants to prepare a table showing the distribution of respondents by age, a tally sheet
showing the age groups horizontally is prepared. Tally marks are then made for the
respective group, i.e., ‘vehicle owners’, from each line of response in the worksheet. After
every four tallies, the fifth tally is cut across the previous four. This represents a group of
five items. This arrangement facilitates easy counting of each one of the class groups. An
illustration of this tally sheet is presented below.

Although manual tabulation is simple and easy to construct, it can be tedious, slow and
error-prone as responses increase.
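The tallying procedure can be imitated in a few lines of code. This is a sketch only and not a reproduction of the tally sheet referred to in the text: the worksheet lines and age groups are hypothetical, and a group of five is rendered as four strokes followed by a slash.

```python
from collections import Counter

def tally_marks(count):
    """Four strokes and a slash stand for a completed group of five."""
    groups, rest = divmod(count, 5)
    return " ".join(["||||/"] * groups + (["|" * rest] if rest else []))

# Hypothetical worksheet lines: (age group, owns a vehicle? Y/N).
worksheet = (
    [("Below 30", "Y")] * 7
    + [("30-45", "Y")] * 5
    + [("Above 45", "Y")] * 2
    + [("30-45", "N")] * 3
)

# Tally only the vehicle owners, one mark per worksheet line.
owners_by_age = Counter(age for age, owns in worksheet if owns == "Y")
for age_group in ("Below 30", "30-45", "Above 45"):
    count = owners_by_age[age_group]
    print(f"{age_group:<10}{tally_marks(count):<14}{count}")
```

Counting completed groups of five and then adding the remaining strokes gives the class frequency, exactly as on a hand-made tally sheet.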

Computerized tabulation is easy with the help of software packages. The input
requirement will be the column and row variables. The software package then computes
the number of records in each cell of the row-column categories. The most popular
package is the Statistical Package for the Social Sciences (SPSS). It is an integrated set of
programs suitable for analysis of social science data. This package contains programs for
a wide range of operations and analysis such as handling missing data, recoding variable
information, simple descriptive analysis, cross tabulation, multivariate analysis and
non-parametric analysis.
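What a package does when given a row variable and a column variable can be shown with a plain Python stand-in. The sketch below does not use SPSS or imitate its commands; the function `cross_tabulate`, the variable names and the four records are assumptions made for illustration.

```python
from collections import Counter

def cross_tabulate(records, row_var, col_var):
    """Count the records falling in each (row category, column category) cell."""
    return Counter((r[row_var], r[col_var]) for r in records)

# Hypothetical records with two coded variables.
records = [
    {"age_group": "Below 30", "owns_vehicle": "Yes"},
    {"age_group": "Below 30", "owns_vehicle": "No"},
    {"age_group": "30-45", "owns_vehicle": "Yes"},
    {"age_group": "30-45", "owns_vehicle": "Yes"},
]

table = cross_tabulate(records, "age_group", "owns_vehicle")
rows = sorted({r for r, _ in table})
cols = sorted({c for _, c in table})

# Print the two-way table, one line per row category.
print("age_group".ljust(12) + "".join(c.ljust(6) for c in cols))
for r in rows:
    print(r.ljust(12) + "".join(str(table[(r, c)]).ljust(6) for c in cols))
```

The only inputs are the names of the row and column variables; the cell counts follow mechanically, which is why computerized tabulation avoids the tedium and the errors of manual tallying.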

Construction of Frequency Table

Frequency tables provide a “shorthand” summary of data. The importance of presenting
statistical data in tabular form needs no emphasis. Tables facilitate comprehending
masses of data at a glance; they conserve space and reduce explanations and descriptions
to a minimum. They give a visual picture of relationships between variables and
categories. They facilitate summation of items and the detection of errors and omissions
and provide a basis for computations.

It is important to make a distinction between general purpose tables and special purpose
tables. General purpose tables are primary or reference tables designed to include
large amounts of source data in a convenient and accessible form. Special purpose
tables are analytical or derivative ones that demonstrate significant relationships in the data
or the results of statistical analysis. Tables in government reports on population, vital
statistics, agriculture, industries etc., are of the general purpose type. They represent
extensive repositories of statistical information. Special purpose tables are found in
monographs, research reports and articles and are used as instruments of analysis. In
research, we are primarily concerned with special purpose tables.

Components of a Table

The major components of a table are:

A Heading:

(a) Table Number

(b) Title of the Table

(c) Designation of units

B Body

1. Stub-head: Heading of all rows or blocks of stub items.
2. Body-head: Headings of all columns or main captions and their sub-captions.
3. Field/body: The cells in rows and columns.

C Notations:

• Footnotes, wherever applicable.
• Source, wherever applicable.

Principles of Table Construction

There are certain generally accepted principles or rules relating to construction of tables.
They are:

1. Every table should have a title. The title should represent a succinct description of
the contents of the table. It should be clear and concise. It should be placed above
the body of the table.
2. A number facilitating easy reference should identify every table. The number can
be centred above the title. The table numbers should run in consecutive serial
order. Alternatively, tables in chapter 1 may be numbered as 1.1, 1.2, 1.3…, in chapter 2
as 2.1, 2.2, 2.3… and so on.
3. The captions (or column headings) should be clear and brief.
4. The units of measurement under each heading must always be indicated.
5. Any explanatory footnotes concerning the table itself are placed directly beneath
the table and, in order to obviate any possible confusion with the textual footnotes,
such reference symbols as the asterisk (*), dagger (†) and the like may be used.
6. If the data in a series of tables have been obtained from different sources, it is
ordinarily advisable to indicate the specific sources in a place just below the table.
7. Usually lines separate columns from one another. Lines are always drawn at the
top and bottom of the table and below the captions.
8. The columns may be numbered to facilitate reference.
9. All column figures should be properly aligned. Decimal points and “plus” or
“minus” signs should be in perfect alignment.
10. Columns and rows that are to be compared with one another should be brought
close together.
11. Totals of rows should be placed at the extreme right column and totals of columns
at the bottom.
12. In order to emphasize the relative significance of certain categories, different
kinds of type, spacing and identifications can be used.
13. The arrangement of the categories in a table may be chronological, geographical,
alphabetical or according to magnitude. Numerical categories are usually arranged
in descending order of magnitude.
14. Miscellaneous and exceptional items are generally placed in the last row of the
table.
15. Usually the larger number of items is listed vertically. This means that a table’s
length is more than its width.
16. Abbreviations should be avoided whenever possible and ditto marks should not be
used in a table.
17. The table should be made as logical, clear, accurate and simple as possible.

Text references should identify tables by number, rather than by such expressions as “the
table above” or “the following table”. Tables should not exceed the page size; larger
tables may be reduced by photostating. Tables that are too wide for the page may be turned
sideways, with the top facing the left margin or binding of the script. Where should tables
be placed in a research report or thesis? Some writers place both special purpose and
general purpose tables in an appendix and refer to them in the text by number. This practice
has the disadvantage of inconveniencing the reader who wants to study the tabulated data as
the text is read. A more appropriate procedure is to place special purpose tables in the
text and primary tables, if needed at all, in an appendix.

Frequency Distribution and Class Intervals

Variables that are classified according to magnitude or size are often arranged in the form
of a frequency table. In constructing this table, it is necessary to determine the number of
class intervals to be used and the size of the class intervals.

A distinction is usually made between continuous and discrete variables. A continuous
variable has an unlimited number of possible values between the lowest and highest with
no gaps or breaks. Examples of continuous variables are age, weight, temperature etc. A
discrete variable can have a series of specified values with no possibility of values
between these points. Each value of a discrete variable is distinct and separate. Examples
of discrete variables are gender of persons (male/female), occupation (salaried, business,
profession) and car size (800cc, 1000cc, 1200cc).

In practice, all variables are treated as discrete units, the continuous variables being stated
in some discrete unit size according to the needs of a particular situation. For example,
length is described in discrete units of millimetres or a tenth of an inch.

Class Intervals: Ordinarily, the number of class intervals should be not less than 5 and not
more than 15, depending on the nature of the data and the number of cases being studied.
After noting the highest and lowest values and the features of the data, the number of
intervals can be easily determined.

For many types of data, it is desirable to have class intervals of uniform size. The
intervals should be neither too small nor too large. Whenever possible, the intervals
should represent common and convenient numerical divisions such as 5 or 10, rather than
odd divisions such as 3 or 7. Class intervals must be clearly designated in a frequency
table in such a way as to obviate any possibility of misinterpretation or confusion. For
example, to present the age groups of a population, the use of intervals of 1-20, 20-50, and
50 and above would be confusing, because the ages 20 and 50 would each fall in two classes.
This may instead be presented as 1-20, 21-50, and above 50.

Every class interval has a mid point. For example, the midpoint of an interval 1-20 is 10.5
and the midpoint of class interval 1-25 would be 13. Once class intervals are determined,
it is routine work to count the number of cases that fall in each interval.
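
The counting step and the midpoint rule can be sketched as follows. This is a minimal illustration, assuming inclusive class limits and using hypothetical ages; the helper names `midpoint` and `frequency_distribution` are introduced here only for the example.

```python
def midpoint(lower, upper):
    """Midpoint of a class interval with the given limits."""
    return (lower + upper) / 2

def frequency_distribution(values, classes):
    """Count the values falling in each (lower, upper) class, limits inclusive.
    An upper limit of None marks an open-ended top class such as "above 50"."""
    counts = {c: 0 for c in classes}
    for v in values:
        for lower, upper in classes:
            if v >= lower and (upper is None or v <= upper):
                counts[(lower, upper)] += 1
                break
    return counts

ages = [4, 20, 21, 35, 50, 51, 67, 19, 48, 72]   # hypothetical data
classes = [(1, 20), (21, 50), (51, None)]        # 1-20, 21-50, above 50
table = frequency_distribution(ages, classes)

print(midpoint(1, 20), midpoint(1, 25))
print(table)
```

Because the classes do not overlap, every age is counted exactly once, which is the point of writing 1-20, 21-50 rather than 1-20, 20-50.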

One-Way Tables: One-way frequency tables present the distribution of cases on only a
single dimension or variable. For example, the distributions of respondents by gender, by
religion, by socio-economic status and the like are shown in one-way tables. Table 10.1
illustrates one-way tables. One-way tables are rarely used since the result of a frequency
distribution can be described in simple sentences. For instance, the gender distribution of
a sample study may be described as “The sample comprises 58% males and 42%
females.”

Two-Way Tables: Distributions in terms of two variables and the relationship
between the two variables are shown in a two-way table. The categories of one variable are
presented one below another on the left margin of the table, and those of the other variable
at the upper part of the table, one by the side of another. The cells represent particular
combinations of both variables. To compare the distributions of cases, raw numbers are
converted into percentages based on the number of cases in each category. Table 10.2
illustrates two-way tables.

Table 10.2

Another method of constructing a two-way table is to state the percentage within
brackets beside each count rather than in a separate column. Here, special care has to be
taken as to how the percentages are calculated, either on a horizontal representation of
data or on a vertical representation of data. Sometimes, the table heading itself indicates
the method of representation used in the two-way table.
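
The difference between horizontal (row-wise) and vertical (column-wise) percentages can be sketched as below. The table `counts` is hypothetical and the function names are chosen only for this illustration.

```python
# Hypothetical two-way table: rows are age groups, columns are vehicle ownership.
counts = {
    "Below 30": {"Owner": 30, "Non-owner": 20},
    "30 and above": {"Owner": 45, "Non-owner": 5},
}

def row_percentages(table):
    """Percentages computed horizontally: each row sums to 100."""
    result = {}
    for row, cells in table.items():
        total = sum(cells.values())
        result[row] = {col: 100 * n / total for col, n in cells.items()}
    return result

def column_percentages(table):
    """Percentages computed vertically: each column sums to 100."""
    col_totals = {}
    for cells in table.values():
        for col, n in cells.items():
            col_totals[col] = col_totals.get(col, 0) + n
    return {row: {col: 100 * n / col_totals[col] for col, n in cells.items()}
            for row, cells in table.items()}

# Print each count with its row percentage in brackets.
by_row = row_percentages(counts)
for row, cells in counts.items():
    print(row, {col: f"{n} ({by_row[row][col]:.0f}%)" for col, n in cells.items()})
```

The same cell (20 non-owners below 30) is 40% of its row but 80% of its column, which is why the basis of the percentages has to be stated.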

Graphs, Charts & Diagrams

In presenting the data of frequency distributions and statistical computations, it is often
desirable to use appropriate forms of graphic presentation. In addition to tabular forms,
graphic presentation involves the use of graphs, charts and other pictorial devices such as
diagrams. These forms and devices reduce large masses of statistical data to a form that
can be quickly understood at a glance. The meaning of figures in tabular form may be
difficult for the mind to grasp or retain. “Properly constructed graphs and charts relieve
the mind of burdensome details by portraying facts concisely, logically and simply.”
By emphasizing new and significant relationships, they are also useful in discovering
new facts and in developing hypotheses.

The device of graphic presentation is particularly useful when the prospective readers are
non-technical people or the general public. It is useful even to technical people for
dramatizing certain points about data, for important points can be more effectively
captured in pictures than in tables. However, graphic forms are not substitutes for tables,
but are additional tools for the researcher to emphasize the research findings.

Graphic presentation must be planned with utmost care and diligence. Graphic forms
used should be simple, clear and accurate and also be appropriate to the data. In planning
this work, the following questions must be considered.

(a) What is the purpose of the diagram?

(b) What facts are to be emphasized?

(c) What is the educational level of the audience?

(d) How much time is available for the preparation of the diagram?

(e) What kind of chart will portray the data most clearly and accurately?

Types of Graphs and General Rules

The most commonly used graphic forms may be grouped into the following categories:

a) Line Graphs or Charts


b) Bar Charts

c) Segmental presentations.

d) Scatter plots

e) Bubble charts

f) Stock plots

g) Pictographs

h) Chernoff Faces

The general rules to be followed in graphic representations are:

1. The chart should have a title placed directly above the chart.
2. The title should be clear, concise and simple and should describe the nature of the
data presented.
3. Numerical data upon which the chart is based should be presented in an
accompanying table.
4. The horizontal line measures time or independent variable and the vertical line the
measured variable.
5. Measurements proceed from left to right on the horizontal line and from bottom to
top on the vertical.
6. Each curve or bar on the chart should be labelled.
7. If there is more than one curve or bar, they should be clearly differentiated from
one another by distinct patterns or colours.
8. The zero point should always be represented and the scale intervals should be
equal.
9. Graphic forms should be used sparingly. Too many forms detract from rather than
illuminate the presentation.
10. Graphic forms should follow and not precede the related textual discussion.

Line Graphs

The line graph is useful for showing changes in data relationships over a period of time. In
this graph, figures are plotted in relation to two intersecting lines or axes. The horizontal
line is called the abscissa or X-axis and the vertical, the ordinate or Y-axis. The point at
which the two axes intersect is zero for both the X and Y axes. This point ‘O’ is the origin of
coordinates. The two lines divide the plane into four sections known as
quadrants, which are numbered anti-clockwise. Measurements to the right of and above ‘O’ are
positive (plus) and measurements to the left of and below ‘O’ are negative (minus). These are
the features of a rectangular coordinate type of graph. Any point in the plane is plotted
in terms of its distances along the two axes, reading from the origin ‘O’. Scale
intervals on both axes should be equal. If a part of the scale is omitted, a set of parallel
jagged lines should be used to indicate the break in the scale. The time dimension or
independent variable is represented by the X-axis and the other variable by the Y-axis.

Quantitative and Qualitative Analysis

Measures of Central Tendency

Analysis of data involves understanding the characteristics of the data. The following
are the important characteristics of statistical data:

• Central tendency
• Dispersion
• Skewness
• Kurtosis

In a data distribution, the individual items may tend to cluster around a central
position or an average value. For instance, in a distribution of marks, individual students
may score anywhere between zero and a hundred, yet many students may
score marks that are close to the average mark, say 50. Such a tendency of the data to
concentrate around the central position of the distribution is called central tendency. Central
tendency of the data is measured by statistical averages. Averages are classified into two
groups.

1. Mathematical averages
2. Positional averages

Arithmetic mean, geometric mean and harmonic mean are mathematical averages.
Median and mode are positional averages. These statistical measures describe
how the individual values in a distribution concentrate around a central value. If the
values of a distribution lie close to the average value, we conclude that the
distribution has central tendency.

Arithmetic Mean

Arithmetic mean is the most commonly used statistical average. It is the value obtained
by dividing the sum of the items by the number of items in a series. Symbolically,

if x1, x2, x3, …, xn are the values of a series, then the arithmetic mean of the series is obtained by

(x1 + x2 + x3 + … + xn) / n. If we put (x1 + x2 + x3 + … + xn) = Σ X,

then arithmetic mean = Σ X/n

When frequencies are also given with the values, each value is first multiplied by its
corresponding frequency. The sum of these products is then divided by the
total frequency. Thus in a discrete series, the arithmetic mean is calculated by the
following formula.

Arithmetic mean = Σ fx/ Σ f

Where, Σ fx = sum of the values multiplied by their corresponding frequencies

Σ f = sum of the frequencies

If x1, x2, x3, …, xn are the values of a series, and f1, f2, f3, …, fn are their corresponding
frequencies,

the arithmetic mean is calculated by (f1 x1 + f2 x2 + f3 x3 + … + fn xn) / (f1 + f2 + f3 + … + fn), or

Arithmetic mean = Σ fx / Σ f

Individual series

• Find arithmetic mean of the following data.

58 67 60 84 93 98 100

Arithmetic mean = Σ X/n

Where Σ X = the sum of the items

n = the number of items in the series.

Σ X = 58 + 67 + 60 + 84 + 93 + 98 + 100 = 560

n = 7

Arithmetic mean = 560/7 = 80

• Find arithmetic mean for the following distribution

2.0 1.8 2.0 2.0 1.9 2.0 1.8 2.3 2.5 2.3

1.9 2.2 2.0 2.3

Arithmetic mean = Σ X/n

Where Σ X = the sum of the items

n = the number of items in the series.

Σ X = 2.0 + 1.8 + 2.0 + 2.0 + 1.9 + 2.0 + 1.8 + 2.3 + 2.5 + 2.3 +
1.9 + 2.2 + 2.0 + 2.3 = 29

n = 14

Arithmetic mean = 29/14 = 2.07
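
The two individual-series calculations above can be checked with a short sketch. The function name `arithmetic_mean` is chosen here for illustration; it simply applies Σ X/n.

```python
def arithmetic_mean(values):
    """Sum of the items divided by the number of items."""
    return sum(values) / len(values)

marks = [58, 67, 60, 84, 93, 98, 100]
readings = [2.0, 1.8, 2.0, 2.0, 1.9, 2.0, 1.8, 2.3, 2.5, 2.3, 1.9, 2.2, 2.0, 2.3]

print(arithmetic_mean(marks))               # 560 / 7
print(round(arithmetic_mean(readings), 2))  # 29 / 14, rounded to two decimals
```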

Discrete series

• Calculate the arithmetic mean of the daily wages of the following 50 workers.

Daily wage : 15 18 20 25 30 35 40 42 45

Number of workers : 2 3 5 10 12 10 5 2 1

Arithmetic mean using the direct formula

Arithmetic mean = Σ fx/ Σ f

Where, Σ fx = (15 × 2) + (18 × 3) + … + (45 × 1) = 1473

Σ f = 50

Arithmetic mean = 1473 / 50

= 29.46

Continuous Series

• Find arithmetic mean for the following distribution.

Marks : 10-20 20-30 30-40 40-50 50-60 60-70 70-80 80-90

No. of students : 6 12 18 20 20 14 8 2

Arithmetic mean = Σ fx/ Σ f

Where x is the midpoint of each class (15, 25, …, 85), Σ fx = 4700

Σ f = 100

Arithmetic mean = 4700 / 100

= 47
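
For a continuous series, x in Σ fx/ Σ f is the midpoint of each class. The following is a small sketch of that calculation for the marks distribution above; the function name `grouped_mean` is an illustrative choice.

```python
def grouped_mean(classes, frequencies):
    """Arithmetic mean of a continuous series: sum(f * midpoint) / sum(f)."""
    midpoints = [(lower + upper) / 2 for lower, upper in classes]
    sum_fx = sum(f * x for f, x in zip(frequencies, midpoints))
    return sum_fx / sum(frequencies)

classes = [(10, 20), (20, 30), (30, 40), (40, 50),
           (50, 60), (60, 70), (70, 80), (80, 90)]
students = [6, 12, 18, 20, 20, 14, 8, 2]

print(grouped_mean(classes, students))   # 4700 / 100
```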

Geometric Mean

Geometric mean is defined as the nth root of the product of the n items of a series. If
there are two items in the data, we take the square root; if there are three items we
take the cube root, and so on.

Symbolically,

GM = (x1 × x2 × x3 × … × xn)^(1/n)

Where x1, x2, …, xn are the items of the given series. To simplify calculations,
logarithms are used.

Accordingly,

GM = Anti log of (Σ log x /n)

In discrete series

GM = Anti log of (Σ f . log x / Σ f)

Illustration

GM = Anti log of (Σ log x /n)

= Anti log of (19.9986 / 10)

= Anti log of 1.99986

= 99.967

Geometric mean for discrete series

Calculate the geometric mean of the data given below:

Class                 No. of families    Income

Landlords                    1            1000

Cultivators                 50              80

Landless labourers          25              40

Money lenders                2             750

School teachers              3             100

Shop keepers                 4             150

Carpenters                   3             120

Weavers                      5              60

GM = Anti log of (Σ f . log x / Σ f)

= Anti log of (173.7907 / 93)

= Anti log of 1.86871

= 73.91
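
The logarithmic method can be sketched as below. One assumption is made loudly here: the landlords' income is taken as 1000, since that is the value that reproduces the stated total Σ f . log x = 173.7907; the function name `geometric_mean` is illustrative.

```python
import math

def geometric_mean(values, frequencies=None):
    """Antilog of the frequency-weighted mean of log10(x)."""
    if frequencies is None:
        frequencies = [1] * len(values)   # individual series: every f is 1
    sum_f_log_x = sum(f * math.log10(x) for x, f in zip(values, frequencies))
    return 10 ** (sum_f_log_x / sum(frequencies))

# ASSUMPTION: landlords' income = 1000 (chosen to match the stated log total).
incomes = [1000, 80, 40, 750, 100, 150, 120, 60]
families = [1, 50, 25, 2, 3, 4, 3, 5]

gm = geometric_mean(incomes, families)
print(round(gm, 2))
```

Working with full-precision logarithms gives a result that agrees with the four-figure log-table answer to two decimal places.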

Harmonic Mean

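Harmonic mean is the reciprocal of the arithmetic mean of the reciprocals of the items.

In individual series

HM = N / Σ (1/x)

In discrete series

HM = N / Σ f (1/x)

In continuous series

HM = N / Σ f (1/m)

N = Total frequency (the number of items in an individual series)

m = Mid-value of each class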


Illustration

For individual series

1. Find harmonic mean of the following data

5 10 3 7 125 58 47 80 45 26

HM = N / Σ (1/x)

HM = 10 / 0.8959

= 11.16

Harmonic mean for discrete series

Compute harmonic mean for the following data

Marks : 10 20 25 30 40 50

Frequency : 20 10 15 25 10 20

HM = N / Σ f (1/x)

HM = 100 / 4.5833

= 21.82

Harmonic mean for continuous series

1. Calculate harmonic mean for the given data.

Class : 10-20 20-30 30-40 40-50 50-60 60-70

Frequency : 5 7 3 15 12 8

HM = N / Σ f (1/m), where m is the mid-value of each class (15, 25, 35, 45, 55, 65)

HM = 50 / 1.3736 = 36.40
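
The three harmonic mean illustrations can be reproduced with one small function. This is a sketch that carries the reciprocal sum at full precision, so its results can differ slightly from hand calculations that round the reciprocal sum early; the name `harmonic_mean` is illustrative.

```python
def harmonic_mean(values, frequencies=None):
    """N divided by the sum of f * (1 / x); f defaults to 1 for an individual series."""
    if frequencies is None:
        frequencies = [1] * len(values)
    reciprocal_sum = sum(f / x for x, f in zip(values, frequencies))
    return sum(frequencies) / reciprocal_sum

individual = harmonic_mean([5, 10, 3, 7, 125, 58, 47, 80, 45, 26])
discrete = harmonic_mean([10, 20, 25, 30, 40, 50], [20, 10, 15, 25, 10, 20])

# Continuous series: the class mid-values stand in for x.
midpoints = [15, 25, 35, 45, 55, 65]
continuous = harmonic_mean(midpoints, [5, 7, 3, 15, 12, 8])

print(round(individual, 2), round(discrete, 2), round(continuous, 2))
```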

Median

Median is the middlemost item of a given series. In an individual series, we arrange
the given data in ascending or descending order and take the
middlemost item as the median. When two values occur in the middle, we take the
average of these two values as the median. Since the median is the central value of an
ordered distribution, there is an equal number of values to the left and to the right of the
median.

Individual series

Median = (N + 1) / 2 th item

Illustration

1. Find the median of the following scores.


97 50 95 51 90 60 85 64 81 65 80 70 75

First we arrange the series according to ascending order.

50 51 60 64 65 70 75 80 81 85 90 95 97

Median = (N+ 1) / 2 th item

= (13+ 1) / 2 th item

= (14 / 2) th item

= (7) th item

= 75

Median for distribution with even number of items

2. Find the median of the following data.

95 51 91 60 90 64 85 69 80 70 78 75

First we arrange the series according to ascending order.

51 60 64 69 70 75 78 80 85 90 91 95

Median = (N+ 1) / 2 th item

= (12+ 1) / 2 th item

= (13 / 2) th item

= (6.5) th item

= (6th item + 7th item) / 2

= (75 + 78) / 2

= 153/2

= 76.5
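
The (N + 1) / 2 rule, including the averaging step for an even number of items, can be sketched as follows. The function name `median_individual` is an illustrative choice.

```python
def median_individual(values):
    """Median as the (N + 1) / 2 th item of the ordered series.
    A fractional position such as 6.5 means: average the 6th and 7th items."""
    ordered = sorted(values)
    position = (len(ordered) + 1) / 2      # 1-based position
    lower = ordered[int(position) - 1]
    if position.is_integer():
        return lower
    upper = ordered[int(position)]
    return (lower + upper) / 2

odd_scores = [97, 50, 95, 51, 90, 60, 85, 64, 81, 65, 80, 70, 75]
even_scores = [95, 51, 91, 60, 90, 64, 85, 69, 80, 70, 78, 75]

print(median_individual(odd_scores))
print(median_individual(even_scores))
```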

Median for Discrete Series


To find the median of a discrete series, we first cumulate the frequencies and then
locate the median at the size of the (N + 1) / 2 th item in the cumulative frequency
column. N is the total frequency.

Steps

1. Arrange the values of the data in ascending order of magnitude.
2. Find the cumulative frequencies.
3. Apply the formula (N + 1) / 2 th item.
4. Look at the cumulative frequency column and find the value of the
variable corresponding to the above.

Find median for the following data.

Income : 100 150 80 200 250 180

Number of persons : 24 26 16 20 6 30

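First, arrange the data in ascending order and cumulate the frequencies.

Income : 80 100 150 180 200 250

Number of persons : 16 24 26 30 20 6

Cumulative frequency : 16 40 66 96 116 122

Median = (N + 1) / 2 th item

= (122 + 1) / 2 th item

= (123) / 2 th item

= (61.5) th item

The 61.5th item falls within the cumulative frequency 66, whose value, 150, is taken as
the median.

Therefore Median = 150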
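
The steps for a discrete series can be sketched as below. This is a simplified illustration: it returns the first value whose cumulative frequency reaches the (N + 1) / 2 th position and does not average two neighbouring values when the position falls exactly between them. The name `median_discrete` is illustrative.

```python
from itertools import accumulate

def median_discrete(values, frequencies):
    """Value whose cumulative frequency first reaches the (N + 1) / 2 th item."""
    pairs = sorted(zip(values, frequencies))           # step 1: ascending order
    position = (sum(frequencies) + 1) / 2              # step 3: (N + 1) / 2
    cumulative = accumulate(f for _, f in pairs)       # step 2: cumulative frequencies
    for (value, _), cum_f in zip(pairs, cumulative):   # step 4: locate the value
        if cum_f >= position:
            return value

income = [100, 150, 80, 200, 250, 180]
persons = [24, 26, 16, 20, 6, 30]

print(median_discrete(income, persons))
```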

Median for Continuous Series


To find the median of a continuous series (grouped data with class intervals), we first
cumulate the frequencies. Locate the median class as the class containing the (N / 2) th
item, then apply the interpolation formula to obtain the median.

Median = L1 + ((N/2 – m) / f) × C

L1 = Lower limit of the median class

N/2 = Total frequency / 2

m = Cumulative frequency of the class preceding the median class

f = Frequency of the median class

C = Class interval

Find median of the following data.

Class : 12-14 15-17 18-20 21-23 24-26

Frequency : 1 3 8 2 6

Median = L1 + ((N/2 – m) / f) × C

L1 = 18

N/2 = 10

m = 4

f = 8

C = 2

= 18 + ((10 – 4) / 8) × 2

= 18 + (6/8) × 2

= 18 + (12/8)

= 18 + 1.5

= 19.5
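
The interpolation formula translates directly into a function. The sketch below takes the quantities exactly as they are listed in the worked example (L1, N, m, f and C); the function and parameter names are illustrative.

```python
def median_continuous(lower_limit, total_frequency, cum_freq_before,
                      class_frequency, class_interval):
    """Median = L1 + ((N/2 - m) / f) * C for the median class."""
    half_n = total_frequency / 2
    return lower_limit + ((half_n - cum_freq_before) / class_frequency) * class_interval

# Values from the worked example: L1 = 18, N = 20, m = 4, f = 8, C = 2.
print(median_continuous(18, 20, 4, 8, 2))
```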

Merits of Median

1. Median is easy to calculate and simple to understand.
2. When the data is very large, median is the most convenient measure of
central tendency.
3. Median is useful for finding the average of data with open-ended classes.
4. The median distributes the values of the data equally to either side of the
median.
5. Median is not influenced by the extreme values present in the data.
6. The value of the median can be graphically determined.

Demerits of Median

• To calculate median, data should be arranged in ascending order. This
is tedious when the number of items in a series is large.
• Since the value of median is determined by observation, it is not a true
representative of all the values.
• Median is not amenable to further algebraic treatment.
• The value of median is affected by sampling fluctuation.

Mode

Mode is the most frequently repeated value of a distribution. When no item repeats more
often than the others, or when two items repeat an equal number of times, the mode is ill
defined. In such cases, the mode is calculated by the formula (3 median – 2 mean).

Mode is a widely used measure of central tendency in business. We speak of the modal wage,
which is the wage earned by most of the workers. The modal shoe size is the size most in
demand.

Merits of Mode

• Mode is the most typical and most frequently occurring value of the distribution.
• It is not affected by extreme values.
• Mode can be determined even for series with open-ended classes.
• Mode can be graphically determined.

Demerits of Mode
1. It is difficult to calculate mode when no item repeats more often than the
others.
2. Mode is not capable of further algebraic treatment.
3. Mode is not based on all the items of the series.
4. Mode is not rigidly defined. There are several formulae for calculating mode.

Mode for Individual Series

1. Calculation of mode for the following data.

7 10 8 5 8 6 8 9

Since the item 8 repeats more often than any other item, mode = 8

Calculation of mode when mode is ill defined.

2. Calculation of mode for the following data.

15 25 14 18 21 16 19 20

Since no item repeats more often than the others, the mode is ill defined.

Mode = (3 median – 2 mean)

Mean = 148 / 8 = 18.5

Median = (18 + 19) / 2

= 18.5

Mode = (3 × 18.5) – (2 × 18.5)

= 55.5 – 37 = 18.5
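
Both cases, a clear mode and an ill-defined mode, can be sketched in one function. The fallback to (3 median – 2 mean) follows the rule stated in this section; the function name `mode_individual` is an illustrative choice.

```python
from collections import Counter
from statistics import mean, median

def mode_individual(values):
    """Most frequent value; if no single value repeats most often,
    fall back to the empirical formula 3 * median - 2 * mean."""
    counts = Counter(values).most_common()
    top_count = counts[0][1]
    leaders = [v for v, c in counts if c == top_count]
    if top_count > 1 and len(leaders) == 1:
        return leaders[0]
    return 3 * median(values) - 2 * mean(values)

print(mode_individual([7, 10, 8, 5, 8, 6, 8, 9]))
print(mode_individual([15, 25, 14, 18, 21, 16, 19, 20]))
```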

Mode for Discrete data Series

In discrete series the item with highest frequency is taken as mode.

3. Find mode for the following data.


Since 65 is the highest frequency, the corresponding size is taken as the mode.

Mode = 31

Calculation of Mode Using Grouping Table and Analysis Table

To make Grouping Table

1. Group the frequencies in twos.
2. Group the frequencies in twos, leaving out the first frequency.
3. Group the frequencies in threes.
4. Group the frequencies in threes, leaving out the first frequency.
5. Group the frequencies in threes, leaving out the first and second frequencies.

To make Analysis Table

1. The analysis table is made on the basis of the grouping table.
2. Circle the highest value of each column.
3. Assign marks to the classes that constitute the highest value of each column.
4. Count the number of marks.
5. The class with the highest number of marks is selected as the modal class.
6. Apply the interpolation formula and find the mode.

Mode = L1 + ((f1 – f0) / (2f1 – f0 – f2)) × C

L1 = Lower limit of the modal class

f1 = frequency of the modal class

f0 = frequency of the class preceding the modal class

f2 = frequency of the class succeeding the modal class

C = class interval

Illustration

Find mode for the following data using grouping table and analysis table.

Steps

1. In column I, the frequencies are grouped in twos.
2. In column II, the frequencies are grouped in twos, leaving out the first frequency.
3. In column III, the frequencies are grouped in threes.
4. In column IV, the frequencies are grouped in threes, leaving out the first frequency.
5. In column V, the frequencies are grouped in threes, leaving out the first and second
frequencies.

Since the highest mark is 5 and is obtained by the class 40-60,

the modal class = 40-60

Mode is calculated by the formula

Mode = L1 + ((f1 – f0) / (2f1 – f0 – f2)) × C

L1 = Lower limit of the modal class = 40

f1 = frequency of the modal class = 27

f0 = frequency of the class preceding the modal class = 15

f2 = frequency of the class succeeding the modal class = 13

C = class interval = 20

Mode = 40 + ((27 – 15) / (2 × 27 – 15 – 13)) × 20

= 40 + (12 / (54 – 28)) × 20

= 40 + (12 / 26) × 20

= 40 + (.4615) × 20

= 40 + 9.23

= 49.23
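
Once the modal class has been identified, the interpolation step is a one-line calculation. The sketch below uses the values from the illustration (L1 = 40, f1 = 27, f0 = 15, f2 = 13, C = 20); the function name `mode_grouped` is illustrative.

```python
def mode_grouped(lower_limit, f1, f0, f2, class_interval):
    """Mode = L1 + ((f1 - f0) / (2*f1 - f0 - f2)) * C for the modal class."""
    return lower_limit + ((f1 - f0) / (2 * f1 - f0 - f2)) * class_interval

print(round(mode_grouped(40, 27, 15, 13, 20), 2))
```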

Dispersion

Dispersion is the tendency of the individual values in a distribution to spread away from
the average. Many economic variables like income, wages etc., vary widely from the
mean. A measure of dispersion is a statistical measure that indicates the degree of
variation of the items from the average.

Objectives of Measuring Dispersion

The study of dispersion is needed:

1. To test the reliability of the average
2. To control the variability of the data
3. To enable comparison of two or more distributions with regard to their
variability
4. To facilitate the use of other statistical measures.

Measures of dispersion point out how far the average value is representative of the
individual items. If the dispersion value is small, the average tends to closely represent
the individual values and it is reliable. When dispersion is large, the average is not a
typical representative value.

Measures of dispersion are useful in controlling the causes of variation. In industrial
production, efficient operation requires control of quality variation.

Measures of variation enable comparison of two or more series with regard to their
variability. A high degree of variation would mean little consistency and a low degree of
variation would mean high consistency.

Properties of a Good Measure of Dispersion

A good measure of dispersion should have the following properties:

1. It should be simple to understand.
2. It should be easy to calculate.
3. It should be rigidly defined.
4. It should be based on all the values of a distribution.
5. It should be amenable to further statistical and algebraic treatment.
6. It should have sampling stability.
7. It should not be unduly affected by extreme values.

Measures of Dispersion

1. Range
2. Quartile deviation
3. Mean deviation
4. Standard deviation
5. Lorenz curve

Range, Quartile deviation, Mean deviation and Standard deviation are
mathematical measures of dispersion. Lorenz curve is a graphical measure of
dispersion.

Measures of dispersion can be absolute or relative. An absolute measure of
dispersion is expressed in the same unit as the original data. When two sets of
data are expressed in different units, relative measures of dispersion are used for
comparison. A relative measure of dispersion is the ratio of an absolute measure to
an appropriate average.

The following are the important relative measures of dispersion.

1. Coefficient of range
2. Coefficient of Quartile deviation
3. Coefficient of Mean deviation
4. Coefficient of Standard deviation

Range

Range is the difference between the lowest and the highest value.

Symbolically, range = highest value – lowest value

Range = H–L

H = highest value

L = lowest value

The relative measure corresponding to range is the co-efficient of range. It is obtained
by the following formula.

Coefficient of range = (H – L) / (H + L)

1. Calculate the range of the following distribution, giving the income of 10
workers. Also calculate the co-efficient of range.

25 37 40 23 58 75 89 20 81 95

Range = H – L

H = highest value = 95

L = lowest value = 20

Range = 95 – 20 = 75

Coefficient of range = (H – L) / (H + L)

= (95 – 20) / (95 + 20)

= 75 / 115

= .6522
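
The range and its co-efficient follow directly from the highest and lowest values, as this short sketch shows. The function names are illustrative.

```python
def value_range(values):
    """Range = highest value - lowest value."""
    return max(values) - min(values)

def coefficient_of_range(values):
    """Coefficient of range = (H - L) / (H + L)."""
    high, low = max(values), min(values)
    return (high - low) / (high + low)

incomes = [25, 37, 40, 23, 58, 75, 89, 20, 81, 95]
print(value_range(incomes), round(coefficient_of_range(incomes), 4))
```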

Range is simple to understand and easy to calculate. But it is not based on all
items of the distribution. It is subject to fluctuations from sample to sample.
Range cannot be calculated for open-ended series.

Quartile Deviation

Quartile deviation is defined as half the inter-quartile range (the semi-interquartile
range). It is based on the first and the third quartiles of a distribution. When a
distribution is divided into four equal parts, we obtain three quartiles, Q1, Q2 and Q3.

The first quartile Q1 is the point of the distribution where 25% of the items of the
distribution lie below Q1, and 75% of the items of the distribution lie above Q1.
Q2 is the median of the distribution, where 50% of the items of the distribution lie
below Q2, and 50% of the items of the distribution lie above Q2. The third quartile
Q3 is the point of the distribution where 75% of the items of the distribution lie below
Q3, and 25% of the items of the distribution lie above Q3.

Quartile deviation is based on the difference between the third and first quartiles,
which is the inter-quartile range.

Symbolically, inter-quartile range = Q3 – Q1

Quartile Deviation = (Q3 – Q1) / 2

Co-efficient of Quartile Deviation = (Q3 – Q1) / (Q3 + Q1)

Merits of Quartile Deviation

1. Quartile Deviation is superior to range as a rough measure of dispersion.
2. It has a special merit in measuring dispersion in open-ended series.
3. Quartile Deviation is not affected by extreme values.

Demerits of Quartile Deviation

1. Quartile Deviation ignores the first 25% of the distribution below Q1 and
25% of the distribution above Q3.
2. Quartile Deviation is not amenable to further mathematical treatment.
3. Quartile Deviation is very much affected by sampling fluctuations.

Problems

Individual Series

1. Find the Quartile Deviation and its co-efficient.

20 28 40 12 30 15 50

First, arrange the data in ascending order.

12 15 20 28 30 40 50

Q1 = Size of (N+1) / 4 th item

= Size of (7+1) / 4 th item

= Size of (8 / 4) th item

= 2nd item

= 15

Q3 = Size of 3(N+1) / 4 th item

= Size of 3 × (7+1) / 4 th item

= Size of 3 × 8 / 4 th item

= (3 × 2) th item

= 6th item

= 40

Quartile Deviation = (Q3 – Q1) / 2

= (40 – 15) / 2

= 12.5

Co-efficient of Quartile Deviation = (Q3 – Q1) / (Q3 + Q1)

= (40 – 15) / (40 + 15)

= 25/55

= .4545
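
The positional rule for quartiles in an individual series can be sketched as below, using the ordered series 12 15 20 28 30 40 50. Linear interpolation for fractional positions is an assumption added here (a common convention, not something this example needs, since both positions are whole numbers); the sketch also assumes a series of at least three items. The name `quartile_item` is illustrative.

```python
def quartile_item(sorted_values, k):
    """k-th quartile as the size of the k(N + 1)/4 th item (1-based position).
    ASSUMPTION: fractional positions are interpolated linearly."""
    position = k * (len(sorted_values) + 1) / 4
    index = int(position)
    fraction = position - index
    value = sorted_values[index - 1]
    if fraction and index < len(sorted_values):
        value += fraction * (sorted_values[index] - value)
    return value

ordered = [12, 15, 20, 28, 30, 40, 50]
q1 = quartile_item(ordered, 1)
q3 = quartile_item(ordered, 3)
quartile_deviation = (q3 - q1) / 2
coefficient = (q3 - q1) / (q3 + q1)

print(q1, q3, quartile_deviation, round(coefficient, 4))
```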

Discrete Series

2. Find the quartile deviation and its co-efficient for the following data.

Income : 110 120 130 140 150 160 170 180 190 200

Frequency : 50 45 40 35 30 25 20 15 10 5

Cumulative frequency : 50 95 135 170 200 225 245 260 270 275

Q1 = Size of (N+1) / 4 th item

= Size of (275+1) / 4 th item

= Size of (276 / 4) th item

= Size of the 69th item, which falls within the cumulative frequency 95

= 120

Q3 = Size of 3(N+1) / 4 th item

= Size of 3 × (275+1) / 4 th item

= Size of 3 × 69 th item

= Size of the 207th item, which falls within the cumulative frequency 225

= 160

Quartile Deviation = (160 – 120) / 2

= 40/2

= 20

Co-efficient of Quartile Deviation = (Q3 – Q1) / (Q3 + Q1)

= (160 – 120) / (160 + 120)

= 40/280

= .1429
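
For a discrete series, each quartile is read off the cumulative frequency column. The sketch below assumes the values are already in ascending order; the helper name `value_at_position` is illustrative.

```python
from itertools import accumulate

def value_at_position(values, frequencies, position):
    """First value whose cumulative frequency reaches the given item position.
    Values are assumed to be listed in ascending order."""
    for value, cum_f in zip(values, accumulate(frequencies)):
        if cum_f >= position:
            return value

incomes = [110, 120, 130, 140, 150, 160, 170, 180, 190, 200]
frequency = [50, 45, 40, 35, 30, 25, 20, 15, 10, 5]
n = sum(frequency)

q1 = value_at_position(incomes, frequency, (n + 1) / 4)        # 69th item
q3 = value_at_position(incomes, frequency, 3 * (n + 1) / 4)    # 207th item
quartile_deviation = (q3 - q1) / 2
coefficient = (q3 - q1) / (q3 + q1)

print(q1, q3, quartile_deviation, round(coefficient, 4))
```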

Continuous Series

Find the quartile deviation for the following series.

Marks : 0-20 20-40 40-60 60-80 80-100

Frequency : 10 30 36 30 14

Cumulative frequency : 10 40 76 106 120

Q1 lies in the class containing the (N / 4) th item

= the (120 / 4) th item

= the 30th item, which falls within the cumulative frequency 40

= the class 20-40

Q1 can be obtained by applying the interpolation formula

Q1 = L1 + ((N/4 – m) / f) × C

= 20 + ((30 – 10) / 30) × 20

= 20 + (20/30) × 20

= 20 + 400/30

= 20 + 13.33

= 33.33

Q3 lies in the class containing the 3(N / 4) th item

= the 3 × 30 = 90th item, which falls within the cumulative frequency 106

= the class 60-80

Q3 can be obtained by applying the interpolation formula

Q3 = L1 + ((3N/4 – m) / f) × C

= 60 + ((90 – 76) / 30) × 20

= 60 + (14/30) × 20

= 60 + 280/30

= 60 + 9.33

= 69.33

Quartile Deviation = (Q3 – Q1) / 2

= (69.33 – 33.33) / 2

= 36/2

= 18

Co-efficient of Quartile Deviation = (Q3 – Q1) / (Q3 + Q1)

= (69.33 – 33.33) / (69.33 + 33.33)

= 36 / 102.66

= .3507
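
The two interpolations for a continuous series can be carried out by one function that first locates the quartile class from the cumulative frequencies. This is a sketch; the function name `quartile_continuous` is illustrative.

```python
def quartile_continuous(classes, frequencies, k):
    """k-th quartile by interpolation: L1 + ((k*N/4 - m) / f) * C,
    where m is the cumulative frequency before the quartile class."""
    target = k * sum(frequencies) / 4
    cum_before = 0
    for (lower, upper), f in zip(classes, frequencies):
        if cum_before + f >= target:
            return lower + ((target - cum_before) / f) * (upper - lower)
        cum_before += f

classes = [(0, 20), (20, 40), (40, 60), (60, 80), (80, 100)]
frequency = [10, 30, 36, 30, 14]

q1 = quartile_continuous(classes, frequency, 1)
q3 = quartile_continuous(classes, frequency, 3)
quartile_deviation = (q3 - q1) / 2
coefficient = (q3 - q1) / (q3 + q1)

print(round(q1, 2), round(q3, 2), round(quartile_deviation, 2))
```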

Mean Deviation

Range and quartile deviation do not measure the scatter of the items around an average.
Mean deviation and standard deviation, however, measure dispersion in this
sense.

Mean deviation is the average of the deviations of the items in a distribution from
an appropriate average. Thus, we calculate mean deviation from the mean, median or
mode. Theoretically, mean deviation from the median has an advantage because the sum
of the deviations of items from the median is the minimum when signs are ignored.
However, in practice, mean deviation from the mean is frequently used. That is why it
is commonly called mean deviation.

Formula for calculating mean deviation = Σ|D| / N

Where

Σ|D| = sum of the deviations of the items from mean, median or mode, ignoring signs

N = number of items

|D| is read as "mod D": each deviation is taken without its sign.

Steps

1. Calculate the mean, median or mode of the series
2. Find the deviation of each item from the mean, median or mode, ignoring signs
3. Sum the deviations and obtain Σ|D|
4. Take the average of the deviations, Σ|D|/N, which is the mean deviation.

The co-efficient of mean deviation is the relative measure of mean deviation. It is
obtained by dividing the mean deviation by the particular average used for
measuring it.

If mean deviation is obtained from median, the co-efficient of mean deviation is
obtained by dividing mean deviation by median.

The co-efficient of mean deviation = mean deviation / median

If mean deviation is obtained from mean, the co-efficient of mean deviation is
obtained by dividing mean deviation by mean.

The co-efficient of mean deviation = mean deviation / mean

If mean deviation is obtained from mode, the co-efficient of mean deviation is
obtained by dividing mean deviation by mode.

The co-efficient of mean deviation = mean deviation / mode

Problems

Calculate mean deviation from mean for the following data

Daily wages : 15 18 20 25 30 35 40 42 45

Frequency : 2 3 5 10 12 10 5 2 1

Mean = Σfx/N

= 1473/50

= 29.46

Mean deviation = Σf|D|/N

= 310.40/50

= 6.208

The co-efficient of mean deviation = mean deviation / mean

= 6.208 / 29.46

= .2107
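
The wage example can be verified with a few lines of Python. This is a minimal sketch rather than a prescribed method; the function name mean_deviation is an assumption made for illustration.

```python
def mean_deviation(values, freqs, centre):
    """Average absolute deviation of a frequency distribution about `centre`."""
    n = sum(freqs)
    return sum(f * abs(x - centre) for x, f in zip(values, freqs)) / n


wages = [15, 18, 20, 25, 30, 35, 40, 42, 45]
freqs = [2, 3, 5, 10, 12, 10, 5, 2, 1]

n = sum(freqs)
mean = sum(f * x for x, f in zip(wages, freqs)) / n   # Σfx / N
md = mean_deviation(wages, freqs, mean)               # Σf|D| / N
coefficient = md / mean                               # relative measure

print(round(mean, 2), round(md, 3), round(coefficient, 4))
```

Passing the median or the mode as centre gives the mean deviation from those averages instead.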

Continuous series
The procedure remains the same. The only difference is that we have to obtain the
midpoints of the various classes and take the deviations of these midpoints. The
deviations are multiplied by their corresponding frequencies. The products are
then added, and their average is the mean deviation.

Calculate mean deviation for the following data.

Class : 5-10 10-15 15-20 20-25 25-30 30-35 35-40 40-45

Frequency : 6 5 15 10 5 4 3 2

Arithmetic mean = A + Σfd / N, where A = 22.5 is the assumed mean and d is the
deviation of each midpoint from A

= 22.5 + (–65)/50

= 22.5 – 1.3

= 21.2

Mean deviation from mean = Σf|D|/N

= 362.4/50

= 7.248

The co-efficient of mean deviation = mean deviation / mean

= 7.248 / 21.2

= .3419

Mean deviation from median

To find median

Median = L1 + ((N/2 – m) / f) X C

= 15 + ((25 – 11) / 15) X 5

= 15 + (14/15) X 5

= 15 + 70/15

= 15 + 4.67

= 19.67

Mean deviation from median = Σf|D|/N

= 359.33/50

= 7.187

The co-efficient of mean deviation = mean deviation / median

= 7.187 / 19.67

= .3654

Mean deviation from mode

To find mode (the modal class is 15-20)

Mode = L1 + ((f1 – f0) / (2f1 – f0 – f2)) X C

= 15 + ((15 – 5) / (2 X 15 – 5 – 10)) X 5

= 15 + (10 / (30 – 5 – 10)) X 5

= 15 + (10 / 15) X 5

= 15 + 3.33

= 18.33

Mean deviation from mode = Σf|D|/N

= 356.67/50

= 7.13

The co-efficient of mean deviation = mean deviation / mode

= 7.13 / 18.33

= .389
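
The three calculations for the continuous series (from mean, median and mode) can be reproduced with the sketch below. It is an illustration under stated assumptions: equal class widths, a single modal class, and variable names chosen here for readability rather than taken from the text.

```python
from itertools import accumulate

lower = [5, 10, 15, 20, 25, 30, 35, 40]   # lower class limits
width = 5
freqs = [6, 5, 15, 10, 5, 4, 3, 2]
mid = [l + width / 2 for l in lower]      # class midpoints
n = sum(freqs)

# Arithmetic mean from the midpoints
mean = sum(f * x for x, f in zip(mid, freqs)) / n

# Median: L1 + ((N/2 - m) / f) X C
cumulative = list(accumulate(freqs))
k = next(i for i, cf in enumerate(cumulative) if cf >= n / 2)
m = cumulative[k - 1] if k > 0 else 0
median = lower[k] + (n / 2 - m) / freqs[k] * width

# Mode: L1 + ((f1 - f0) / (2 f1 - f0 - f2)) X C
j = freqs.index(max(freqs))
f1 = freqs[j]
f0 = freqs[j - 1] if j > 0 else 0
f2 = freqs[j + 1] if j + 1 < len(freqs) else 0
mode = lower[j] + (f1 - f0) / (2 * f1 - f0 - f2) * width


def mean_deviation(centre):
    """Σf|D| / N with deviations of the midpoints taken from `centre`."""
    return sum(f * abs(x - centre) for x, f in zip(mid, freqs)) / n


for name, centre in (("mean", mean), ("median", median), ("mode", mode)):
    md = mean_deviation(centre)
    print(name, round(centre, 2), round(md, 3), round(md / centre, 4))
```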

Merits of Mean Deviation

1. Mean deviation is simple to understand and easy to calculate.
2. It is based on each and every item of the distribution.
3. It is less affected by the values of extreme items compared to standard
deviation.
4. Since deviations are taken from a central value, the formation of different
distributions can easily be compared.

Demerits of Mean Deviation

1. Algebraic signs are ignored while taking the deviations of the items.
2. Mean deviation gives the best result when it is calculated from median.
But median is not a satisfactory measure when variability is very high.
3. Various methods give different results.
4. It is not capable of further mathematical treatment.
5. It is rarely used for sociological studies.

Standard deviation

Standard deviation is the most important measure of dispersion. It
satisfies most of the properties of a good measure of dispersion. It was
introduced by Karl Pearson in 1893. Standard deviation is defined as
the square root of the mean of the squared deviations from the
arithmetic mean. Standard deviation is denoted by the Greek letter σ.

Mean deviation and standard deviation are both calculated from the
deviations of each and every item. Standard deviation is different from
mean deviation in two respects. First, algebraic signs are ignored in
calculating mean deviation, whereas they are taken into account in
calculating standard deviation. Second, mean deviation can be found
from mean, median or mode, whereas standard deviation is found
only from mean.

Standard deviation can be computed by two methods

1. Taking deviations from the actual mean
2. Taking deviations from an assumed mean.

Formula for finding standard deviation is σ = √( Σ(x – x̄)² / N )

Steps

1. Calculate the actual mean of the series, x̄ = Σx / N
2. Take the deviation of each item from the mean, (x – x̄)
3. Find the square of each deviation from the actual mean, (x – x̄)²
4. Sum the squares of the deviations, Σ(x – x̄)²
5. Find the average of the squares of the deviations, Σ(x – x̄)² / N
6. Take the square root of the average of the squared deviations

Problems

1. Calculate the standard deviation of the following data

49 50 65 58 42 60 51 48 68 59

Standard deviation from actual mean

Arithmetic mean = Σx/N

= 550/10

= 55

S.D = √( Σ(x – x̄)² / N )

= √( 614/10 )

= √61.4

= 7.836

Standard deviation from assumed mean

Assumed mean A = 50, and d = x – A

S.D = √( Σd²/N – (Σd/N)² )

= √( 864/10 – (50/10)² )

= √( 86.4 – 5² )

= √( 86.4 – 25 )

= √61.4

= 7.836
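
Both routes to the standard deviation can be checked with a short Python sketch. The variable names are illustrative; statistics.pstdev from the Python standard library is used only as an independent cross-check of the population standard deviation.

```python
from math import sqrt
from statistics import pstdev

data = [49, 50, 65, 58, 42, 60, 51, 48, 68, 59]
n = len(data)

# Method 1: deviations from the actual mean
mean = sum(data) / n
sd_actual = sqrt(sum((x - mean) ** 2 for x in data) / n)

# Method 2: deviations from an assumed mean A
assumed_mean = 50
d = [x - assumed_mean for x in data]
sd_assumed = sqrt(sum(v * v for v in d) / n - (sum(d) / n) ** 2)

print(round(sd_actual, 3), round(sd_assumed, 3), round(pstdev(data), 3))
# all three agree: 7.836
```

The correction term (Σd/N)² is what makes the result independent of the assumed mean chosen.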

Discrete Series

Standard deviation can be obtained by three methods.

1. Direct method
2. Short cut method
3. Step deviation

Direct method

Under this method formula is

S.D = √( Σfx²/N – (Σfx/N)² )

Calculate standard deviation for the following frequency distribution.

Marks : 20 30 40 50 60 70

Frequency : 8 12 20 10 6 4

S.D = √( Σfx²/N – (Σfx/N)² )

= √( 112200/60 – (2460/60)² )

= √( 1870 – 41² )

= √( 1870 – 1681 )

= √189

= 13.748
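
The direct method can be sketched in Python as follows. The step-deviation variant at the end is added for comparison only: the text names that method without working it, so the assumed mean (40) and the common factor (10) used here are illustrative choices, not values from the text.

```python
from math import sqrt

marks = [20, 30, 40, 50, 60, 70]
freqs = [8, 12, 20, 10, 6, 4]
n = sum(freqs)

# Direct method: sqrt( Σfx²/N - (Σfx/N)² )
sum_fx = sum(f * x for x, f in zip(marks, freqs))
sum_fx2 = sum(f * x * x for x, f in zip(marks, freqs))
sd_direct = sqrt(sum_fx2 / n - (sum_fx / n) ** 2)

# Step deviation: d' = (x - A) / C, then multiply the result back by C
A, C = 40, 10   # assumed mean and common factor (illustrative choices)
steps = [(x - A) / C for x in marks]
sum_fd = sum(f * s for s, f in zip(steps, freqs))
sum_fd2 = sum(f * s * s for s, f in zip(steps, freqs))
sd_step = C * sqrt(sum_fd2 / n - (sum_fd / n) ** 2)

print(sum_fx, sum_fx2)                        # 2460 112200
print(round(sd_direct, 2), round(sd_step, 2)) # both about 13.75
```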

Correlation Analysis

Economic and business variables are related. For instance, demand and supply of
a commodity are related to its price. Demand for a commodity increases as its
price falls and decreases as its price rises. We say demand and price are
inversely related or negatively correlated. But sellers supply more of a
commodity when its price rises, and supply of the commodity decreases when its
price falls. We say supply and price are directly related or positively correlated.
Thus, correlation indicates the relationship between two such variables in which
a change in the value of one variable is accompanied by a change in the value of
the other variable.

According to L.R. Connor, “if two or more quantities vary in sympathy so that
movements in the one tend to be accompanied by corresponding movements in
the other(s) they are said to be correlated”.

W.I. King defined “Correlation means that between two series or groups of data,
there exists some causal connection”.

The definitions make it clear that the term correlation refers to the study of
relationship between two or more variables. Correlation is a statistical device,
which studies the relationship between two variables. If two variables are said to
be correlated, a change in the value of one variable is accompanied by a
corresponding change in the value of the other variable. Heights and weights of a
group of people, ages of husbands and wives etc., are examples of bivariate data
that change together.

Correlation and Causation

Although the term correlation is used in the sense of mutual dependence of two
or more variables, it is not always necessary that they have a cause and effect
relation. Even a high degree of correlation between two variables does not
necessarily indicate a cause and effect relationship between them. Correlation
between two variables can be due to the following reasons:

1. Cause and effect relationship: Heat and temperature are cause and effect
variables. Heat is the cause of temperature. The higher the heat, the higher
the temperature.
2. Both the correlated variables are being affected by a third variable. For
instance, price of rice and price of sugar are affected by rainfall. Here
there may not be any cause and effect relation between price of rice and
price of sugar.
3. Related variables may be mutually affecting each other, so that neither of
them can be singled out as the cause or the effect. Demand may be the
result of price, but there are also cases when price rises due to increased
demand.
4. The correlation may be due to chance. For instance, a small sample may
show a correlation between wages and productivity in which higher wages
go with lower productivity. In real life it need not be true. Such
correlation is due to chance.
5. There might be a situation of nonsense or spurious correlation between
two variables. For instance, the number of divorces and television exports
may appear to be correlated, although there cannot be any real relationship
between divorces and exports of television.

The above points make it clear that correlation is only a statistical relationship and
it does not necessarily signify a cause and effect relationship between the
variables.

Types of Correlation Analysis

Correlation can be:

• Positive or negative
• Linear or non-linear
• Simple, multiple or partial

Positive and Negative Correlation

When values of two variables move in the same direction, correlation is
said to be positive. When prices rise, supply increases and when prices
fall supply decreases. In this case, an increase in the value of one
variable on an average results in an increase in the value of the other
variable, or a decrease in the value of one variable on an average results
in a decrease in the value of the other variable.

If on the other hand, values of two variables move in the opposite
direction, correlation is said to be negative. When prices rise, demand
decreases and when prices fall demand increases. In this case, an
increase in the value of one variable on an average results in a
decrease in the value of the other variable.

Linear and Non-Linear Correlation

When the change in one variable leads to a constant ratio of change in
the other variable, correlation is said to be linear. In the case of linear
correlation, points of correlation plotted on a graph will give a straight
line. Correlation is said to be non-linear when the change in one
variable is not accompanied by a constant ratio of change in the other
variable. In the case of non-linear correlation, points of correlation plotted
on a graph do not give a straight line. It is called curvilinear correlation
because the graph of such correlation results in a curve.

Simple, Partial and Multiple Correlations

Simple correlation studies the relationship between two variables only. For
instance, correlation between price and demand is simple as only two
variables are studied in this case. Multiple correlation studies the
relationship of one variable with many variables. For instance,
correlation of agricultural production with rainfall, fertilizer use and
seed quality is a multiple correlation. Partial correlation studies the
relationship of a variable with one of the many variables with which it
is related, the influence of the other variables being held constant. For
instance, seed quality, temperature and rainfall are three variables
which determine the yield of a crop. In this case, the correlation between
yield and rainfall, with seed quality and temperature held constant, is a
partial correlation.

Utility of Correlation

Study of correlation is of immense practical use in business and economics.

o Correlation analysis enables us to measure the magnitude of relationship
existing between variables under study.
o Once we establish correlation, we can estimate the value of one variable
on the basis of the other. This is done with the help of regression
equations.
o The correlation study is useful for formulation of economic policies. In
economics, we are interested in estimating the important dependent variables
on the basis of independent variables.
o Correlation study helps us to make relatively more dependable forecasts.

Methods of Studying Correlation

Following methods are used in the study of correlation:

o Scatter diagram
o Karl Pearson method of Correlation
o Spearman’s Rank correlation method
o Concurrent Deviation method

Scatter Diagram

This is a graphical method of studying correlation between two
variables. In a scatter diagram, one variable is measured on the x-axis
and the other is measured on the y-axis of the graph. Each pair of
values is plotted on the graph by means of dot marks. If the plotted points
do not show any trend, the two variables are not correlated. If the trend
shows an upward rising movement, correlation is positive. If the trend is
downward sloping, correlation is negative.

Karl Pearson’s Co-Efficient of Correlation

Karl Pearson’s Co-Efficient of Correlation is a mathematical method for
measuring correlation. Karl Pearson developed the correlation from the
covariance between two sets of variables. Karl Pearson’s Co-Efficient of
Correlation is denoted by symbol r. The formula for obtaining Karl
Pearson’s Co-Efficient of Correlation is:

r = Covariance between x and y / (SDx X SDy)

Direct method

Covariance between x and y = Σxy/N – (Σx/N) X (Σy/N)

SDx = standard deviation of x series = √( Σx²/N – (Σx/N)² )

SDy = standard deviation of y series = √( Σy²/N – (Σy/N)² )

Shortcut Method using Assumed Mean

If short cut method is used using assumed mean, the formula for
obtaining Karl Pearson’s Co-Efficient of Correlation is the same, with
x and y replaced by their deviations dx and dy from the assumed means:

Covariance between x and y = Σdxdy/N – (Σdx/N) X (Σdy/N)

SDx = √( Σdx²/N – (Σdx/N)² )

SDy = √( Σdy²/N – (Σdy/N)² )

Steps in calculating Karl Pearson’s Correlation Coefficient using
Shortcut Method

o Assume means of x and y series
o Take deviations of x and y series from assumed mean and get Σdx and
Σdy
o Square the dx and dy and find the sum of squares and get Σdx² and Σdy².
o Multiply the corresponding deviations of x and y series and total the
products to get Σdxdy.

Shortcut Method using Arithmetic Mean

If the deviations are taken from the arithmetic mean, Σdx = 0 and Σdy = 0,
and the formula for obtaining Karl Pearson’s Co-Efficient of Correlation
becomes:

r = Σdxdy / √( Σdx² X Σdy² )

Interpreting Co-Efficient of Correlation


The Co-Efficient of Correlation measures the correlation between two
variables. The value of Co-Efficient of Correlation always lies between
+1 and –1. It can be interpreted in the following ways.

If the value of Co-Efficient of Correlation r is 1, it is interpreted as
perfect positive correlation.

If the value of Co-Efficient of Correlation r is –1, it is interpreted as
perfect negative correlation.

If the value of Co-Efficient of Correlation r is 0 < r < 0.5, it is
interpreted as poor positive correlation.

If the value of Co-Efficient of Correlation r is 0.5 ≤ r < 1, it is
interpreted as good positive correlation.

If the value of Co-Efficient of Correlation r is 0 > r > –0.5, it is
interpreted as poor negative correlation.

If the value of Co-Efficient of Correlation r is –0.5 ≥ r > –1, it is
interpreted as good negative correlation.

If the value of Co-Efficient of Correlation r is 0, it is interpreted as zero
correlation.

Probable Error

Probable Error of the correlation coefficient is estimated to find out the extent
to which the value of r is dependable. If Probable Error is added to and subtracted
from the correlation coefficient, it gives the limits within which we can
reasonably expect the value of correlation to vary.

If the coefficient of correlation is less than the Probable Error, it will not be
significant. If the coefficient of correlation r is more than six times the
Probable Error, correlation is definitely significant. If r is 0.5 or more and
the Probable Error is small, the correlation is generally considered
significant. Probable Error is estimated by the following formula

PE = 0.6745 X (1 – r²) / √N

Coefficient of Determination

Besides probable error, another important method of interpreting the coefficient
of correlation is the Coefficient of Determination. Coefficient of Determination is
the square of correlation, or r². For instance, suppose the coefficient of
correlation between price and supply is 0.8. We calculate the coefficient of
determination as r², which is 0.8² or 0.64. It means that 64% of the variation in
supply is on account of changes in price.
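
The two interpretation aids can be sketched in a few lines of Python. The value r = 0.8 is the one used in the text; the sample size N = 25 and the function name probable_error are assumptions introduced purely for illustration.

```python
from math import sqrt


def probable_error(r, n):
    """PE = 0.6745 X (1 - r²) / √N."""
    return 0.6745 * (1 - r ** 2) / sqrt(n)


r = 0.8    # coefficient of correlation from the price and supply example
n = 25     # hypothetical number of pairs of observations

pe = probable_error(r, n)
limits = (r - pe, r + pe)                 # range within which r may be expected to vary
definitely_significant = abs(r) > 6 * pe  # the "six times PE" rule
not_significant = abs(r) < pe
determination = r ** 2                    # share of variation explained

print(round(pe, 4), definitely_significant, round(determination, 2))
```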
Spearman’s Rank Correlation Method

Charles Edward Spearman, a British psychologist, devised a method for measuring
correlation between two variables based on ranks given to the observations. This
method is adopted when the variables are not capable of quantitative
measurement, like intelligence, beauty etc. In such cases, it is impossible to
assign numerical values for changes taking place in such variables, and rank
correlation is useful.

Spearman’s rank correlation coefficient is given by

rk = 1 – 6ΣD² / (n (n² – 1))

Where D is the difference between the ranks of a pair and n is the number of
pairs correlated.
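
A minimal Python sketch of the rank correlation formula follows. The ranks given by the two judges are hypothetical data invented for illustration, and the sketch assumes there are no tied ranks.

```python
def spearman_rank_correlation(rank_x, rank_y):
    """rk = 1 - 6ΣD² / (n(n² - 1)), assuming no tied ranks."""
    n = len(rank_x)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rank_x, rank_y))
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))


# Hypothetical ranks given by two judges to five contestants
judge_a = [1, 2, 3, 4, 5]
judge_b = [2, 1, 4, 3, 5]

rk = spearman_rank_correlation(judge_a, judge_b)
print(round(rk, 2))   # ΣD² = 4, so rk = 1 - 24/120 = 0.8
```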

Concurrent Deviation Method

In this method, correlation is calculated between the directions of deviations and
not their magnitudes. As such, only the direction of deviations is taken into
account in the calculation of this coefficient and their magnitude is ignored.

The formula for the calculation of coefficient of concurrent deviations is given
below:

rc = ± √( ± (2C – n) / n )

where C is the number of concurrent deviations and n is the number of pairs of
deviations compared. The sign of (2C – n) is used both inside and outside the
square root, so that rc is negative when (2C – n) is negative.

Steps in the Calculation of Concurrent Deviation

o Find out the direction of change of the x-variable. When a successive figure in
the series increases, the direction is marked as +, and when a successive figure
in the series decreases, the direction of change is marked as –. It is denoted as
dx.
o Find out the direction of change of the y-variable in the same way. It is
denoted as dy.
o Multiply dx and dy and determine the value of C. C is the number of
positive products of dxdy (– X – or + X +).
o Use the formula rc = ± √( ± (2C – n) / n ) to obtain the value of rc.
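
The steps above can be sketched in Python. The two series are hypothetical figures invented for illustration, and the function names are assumptions; the sign of (2C – n) is carried outside the square root so that a majority of opposite movements gives a negative coefficient.

```python
from math import sqrt


def directions(series):
    """+1, -1 or 0 for each successive change in the series."""
    return [(b > a) - (b < a) for a, b in zip(series, series[1:])]


def concurrent_deviation(x, y):
    dx, dy = directions(x), directions(y)
    n = len(dx)                                        # pairs of deviations
    c = sum(1 for a, b in zip(dx, dy) if a * b > 0)    # concurrent deviations
    value = (2 * c - n) / n
    # Keep the sign of (2C - n) both inside and outside the root
    return sqrt(value) if value >= 0 else -sqrt(-value)


# Hypothetical series
x = [10, 12, 15, 14, 18, 20]
y = [5, 7, 6, 8, 9, 11]

rc = concurrent_deviation(x, y)
print(round(rc, 3))   # C = 3, n = 5, so rc = √(1/5)
```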

Problems

1. Calculate Karl Pearson’s co-efficient of correlation for the following data.

X : 43 44 46 40 44 42 45 42 38 40 42 57
Y : 29 31 19 18 19 27 27 29 41 30 26 10

Shortcut method using assumed mean

Taking 40 as the assumed mean of x and 21 as the assumed mean of y:

N = 12

Σdx = 43

Σdy = 54

Σdx² = 407

Σdy² = 944

Σdxdy = –115

Covariance between x and y = Σdxdy/N – (Σdx/N) X (Σdy/N)

= –115/12 – (43/12) X (54/12)

= –9.583 – 16.125

= –25.708

SDx = √( Σdx²/N – (Σdx/N)² ) = √( 33.917 – 12.840 ) = √21.077 = 4.591

SDy = √( Σdy²/N – (Σdy/N)² ) = √( 78.667 – 20.25 ) = √58.417 = 7.643

r = –25.708 / (4.591 X 7.643)

= –0.733

Interpretation: There is good negative correlation between the x and y variables.
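
The coefficient for this problem can be recomputed by machine as a check on the hand-calculated sums. The sketch below is illustrative; the function name pearson_r and the choice of 40 and 21 as assumed means are assumptions made here, and any other assumed means give the same r.

```python
from math import sqrt


def pearson_r(x, y, assumed_x=0.0, assumed_y=0.0):
    """Karl Pearson's r from deviations about the given assumed means."""
    n = len(x)
    dx = [v - assumed_x for v in x]
    dy = [v - assumed_y for v in y]
    covariance = sum(a * b for a, b in zip(dx, dy)) / n - (sum(dx) / n) * (sum(dy) / n)
    sd_x = sqrt(sum(a * a for a in dx) / n - (sum(dx) / n) ** 2)
    sd_y = sqrt(sum(b * b for b in dy) / n - (sum(dy) / n) ** 2)
    return covariance / (sd_x * sd_y)


x = [43, 44, 46, 40, 44, 42, 45, 42, 38, 40, 42, 57]
y = [29, 31, 19, 18, 19, 27, 27, 29, 41, 30, 26, 10]

r_direct = pearson_r(x, y)                                   # direct method
r_shortcut = pearson_r(x, y, assumed_x=40, assumed_y=21)     # shortcut method

# The sum of products of deviations determines the sign of r
sum_dxdy = sum((a - 40) * (b - 21) for a, b in zip(x, y))
print(sum_dxdy, round(r_direct, 3), round(r_shortcut, 3))
```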

Summary
Data processing is an intermediary stage of work between data collection and
data interpretation. The various steps in processing of data may be stated as:

- Identifying the data structures

- Editing the data

- Coding and classifying the data

- Transcription of data

- Tabulation of data.

The identification of the nodal points and the relationships among the nodes can
sometimes be a more complex task than estimated. When the task is complex,
involving several types of instruments being collected for the same research
question, the procedure for drawing the data structure involves a series of
steps. Data editing happens at two stages, once at the time of recording the data
and again at the time of analysis of data. All editing and cleaning steps are
documented, so that the redefinition of variables or later analytical modification
requirements can be easily incorporated into the data sets. The editing step
checks for the completeness, accuracy and uniformity of the data set created by
the researcher. The edited data are then subjected to codification and
classification. The coding process assigns numerals or other symbols to the
several responses of the data set. It is therefore a pre-requisite to prepare a
coding scheme for the data set. The recording of the data is done on the basis of
this coding scheme.

o Numeric Coding: Coding need not necessarily be numeric. It can also be
alphabetic. Coding has to be compulsorily numeric, when the variable is
subject to further parametric analysis.
o Alphabetic Coding: A mere tabulation or frequency count or graphical
representation of the variable may be given an alphabetic coding.
o Zero Coding: A coding of zero has to be assigned carefully to a variable.
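
As an illustration of a coding scheme, the sketch below codes a handful of survey responses numerically and tabulates them. Everything in it is hypothetical: the response categories, the codes and the use of 9 (rather than zero) for a missing answer are assumptions chosen only to show the idea.

```python
from collections import Counter

# Hypothetical coding scheme for a five-point agreement question
coding_scheme = {
    "Strongly disagree": 1,
    "Disagree": 2,
    "Neutral": 3,
    "Agree": 4,
    "Strongly agree": 5,
}
MISSING = 9   # hypothetical code for "no response"; zero is deliberately avoided

responses = ["Agree", "Neutral", "", "Agree", "Strongly agree", "Disagree"]

# Recording the data on the basis of the coding scheme
coded = [coding_scheme.get(answer, MISSING) for answer in responses]

# A simple frequency table of the coded variable
frequency_table = Counter(coded)
for code in sorted(frequency_table):
    print(code, frequency_table[code])
```

Keeping the missing-value code outside the range of valid answers prevents it from being averaged in by mistake during later numeric analysis.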

The transcription of data can be used to summarize and arrange the data in
compact form for further analysis. Computerized tabulation is easy with the help
of software packages. Frequency tables provide a “shorthand” summary of data.
The importance of presenting statistical data in tabular form needs no emphasis.
The major components of a table are:

A. Heading:

- Table Number

- Title of the Table

- Designation of units

B. Body:

- Stub-head: Heading of all rows or blocks of sub items

- Body-head: Headings of all columns or main captions and their sub-captions.

- Field/body: The cells in rows and columns.

C. Notations:

- Footnotes, wherever applicable.

- Source, wherever applicable.

Variables that are classified according to magnitude or size are often arranged in
the form of a frequency table. In constructing this table, it is necessary to
determine the number of class intervals to be used and the size of the class
intervals. The most commonly used graphic forms may be grouped into the
following categories:

- Line Graphs or Charts

- Bar Charts

- Segmental presentations.

- Scatter plots

- Bubble charts

- Stock plots

- Pictographs

- Chernoff Faces

Copyright © 2009 SMU

MB0034-Unit 12-Research Report Writing
Unit 12 Research Report Writing

Meaning of Research Report

Research report is a means for communicating research experience to others. A research
report is a formal statement of the research process and its results. It narrates the problem
studied, the methods used for studying it and the findings and conclusions of the study.

Objectives:

After learning this lesson you should be able to understand:

• Purpose of Research Report
• Characteristics of Research Report
• Functions of Research Report
• Types of Research Report
• Contents of Reports
• Styles of Reporting
• Steps in Drafting Reports
• Editing the Final Draft
• Evaluating the Final Drafts

Purpose of Research Report

The purpose of the research report is to communicate to interested persons the
methodology and the results of the study in such a manner as to enable them to
understand the research process and to determine its validity. The aim is not to convince
but to convey what was done, why it was done and what its outcome was.

Characteristics of Research Report

Research report is a narrative and authoritative document on the outcome of a research
effort. It presents highly specific information for a clearly designated audience. It is
a simple, readable and accurate form of communication.

Functions of Research Report


It serves as a means for presenting the problem studied, methods and techniques used for
collecting and analyzing data, findings and conclusions and recommendations. It serves
as a basic reference material for future use.

• It is a means for judging the quality of the research project.
• It is a means for evaluating the researcher’s competency.
• It provides systematic knowledge on the problems and issues analyzed.

Types of Research Report

Research reports can be classified as:

• Technical reports
• Popular reports
• Summary reports
• Research abstract
• Research article

These differ in terms of the degree of formality, physical form, scope, style and size.

Technical Reports

A technical report is a comprehensive, full report of the research process and its
outcome. It covers all the aspects of the research process: a description of the
problem studied, the objectives of the study, methods and techniques used, a detailed
account of sampling, field and other research procedures, sources of data, tools for data
collection, methods of data processing and analysis, detailed findings and conclusions
and suggestions.

Popular Reports

In a popular report the reader is less interested in the methodological details, but more
interested in the findings of the study. Complicated statistics are avoided and pictorial
devices are used. After a brief introduction to the problem and the objectives of the study,
an abstract of the findings of the study, conclusions and recommendations are presented.
More headlines, underlining, pictures and graphs may be used. Sentences and paragraphs
should be short.

Interim Report

When there is a time lag between data collection and presentation of the result, the study
may lose significance and usefulness. An interim report in such case can narrate what has
been done so far and what was its outcome. It presents a summary of the findings of that
part of analysis which has been completed.
Summary Reports

Summary report is meant for a lay audience, i.e., the general public. It is written in
non-technical, simple language with pictorial charts, and contains just the objectives,
findings and their implications. It is a short report of two to three pages.

Research Abstract

Research abstract is a short summary of a technical report. It is prepared by a doctoral
student on the eve of submitting his thesis. It contains a brief presentation of the
statement of the problem, the objectives of the study, methods and techniques used and
an overview of the report. A brief summary of the results of the study may also be included.

Research Article

Research article is designed for publication in a professional journal. A research article
must be clearly written in concise, unambiguous language. It must be logically organized,
progressing from a statement of the problem and the purpose of the study, through the
analysis of evidence, to the conclusions and implications.

Contents of the Research Report

The outline of a research report is given below:

I. Prefatory Items

• Title page
• Declaration
• Certificates
• Preface/ acknowledgements
• Table of contents
• List of tables
• List of graphs/ figures/ charts
• Abstract or synopsis

II. Body of the Report

• Introduction
• Theoretical background of the topic
• Statement of the problem
• Review of literature
• The scope of the study
• The objectives of the study
• Hypothesis to be tested
• Definition of the concepts
• Models if any
• Design of the study
• Methodology
• Method of data collection
• Sources of data
• Sampling plan
• Data collection instruments
• Field work
• Data processing and analysis plan
• Overview of the report
• Limitation of the study
• Results: findings and discussions
• Summary, conclusions and recommendations

III. Reference Material

• Bibliography
• Appendix
• Copies of data collection instruments
• Technical details on sampling plan
• Complex tables
• Glossary of new terms used.

Styles of Reporting

Communicate to a Specific Audience

The first step is to know the audience, its background, and its objectives. Most effective
presentations seem like conversations or memos to a particular person as opposed to an
amorphous group. Audience identification affects presentation decisions such as selecting
the material to be included and the level of presentation. Excessive detail or material
presented at too low a level can be boring. The audience can become irritated when
material perceived as relevant is excluded or the material is presented at too high a level.
In an oral presentation, the presenter can ask the audience whether they already know
some of the material.

Frequently, a presentation must be addressed to two or more different audiences. There
are ways to deal with such a problem. In a written presentation, an executive summary at
the outset can provide an overview of the conclusions for the benefit of those in the
audience who are not interested in details. The presentation must respect the audience’s
time constraints. An appendix can be used to reach some people selectively, without
distracting the others. Sometimes the introduction to a chapter or a section can convey the
nature of the contents, which certain audiences may bypass. In an oral presentation, the
presence of multiple audiences should be recognized.

Structure the Presentation

Each piece of presentation should fit into the whole, just as individual pieces fit into a
jigsaw puzzle. The audience should not be muttering. The solution to this is to provide a
well-defined structure. The structure should include an introduction, a body, and a
summary. Further, each of the major sections should be structured similarly. The precept
is to tell the audience what you are going to say, say it and then tell them what you said.
Sometimes you want to withhold the conclusion to create interest.

The introduction should play several roles. First, it should arouse audience interest. A
second function is to identify the presentation’s central idea or objective. Third, it should
provide a road map to the rest of the presentation so that the audience can picture its
organisation and flow.

It is better to divide the body of the presentation into two to five parts. The audience will
be able to absorb only so much information. If that information can be aggregated into
chunks, it will be easier to assimilate. Sometimes the points to be made cannot be
combined easily or naturally. In that case, it is necessary to use a longer list. One way to
structure the presentation is by the research questions. Another method that is often
useful when presenting the research proposal is to base it on the research process. The
most useful presentations will include a statement of implications and recommendations
relevant to the research purpose. However, when researcher lacks information about the
total situation because the research study addresses only a limited aspect of it, the ability
to generate recommendations may be limited.

The purpose of the presentation summary is to identify and underline the important points
of the presentations and to provide some repetition of their content. The summary should
support the presentation communication objectives by helping the audience to retain the
key parts of the content. The audience should feel that there is a natural flow from one
section to another.

Create Audience Interest

The audience should be motivated to read or listen to the presentation’s major parts and
to the individual elements of each section. The audience should know why the
presentation is relevant to them and why each section was included. A section that cannot
hold interest should be excluded or relegated to an appendix.

The research purpose and objectives are good vehicles to provide motivation. The
research purpose should specify decisions to be made and should relate to the research
questions. A presentation that focuses on those research questions and their associated
hypothesis will naturally be tied to relevant decisions and hold audience interest. In
contrast, a presentation that attempts to report on all the questions that were included in
the survey and in the cross-tabulations often will be long, uninteresting and of little value.

As the analysis proceeds and the presentation is being prepared, the researcher should be
on the lookout for results that are exceptionally persuasive, relevant, interesting, or
unusual. Sometimes, the deviant respondent with strange answers can provide the most
insight, if his or her responses are pursued and not discarded.

Be Specific and Visual

Avoid talking or writing in the abstract. If different members of the audience have
different or vague understandings of important concepts, there is a potential problem.
Terms that are ambiguous or not well known should be defined and illustrated or else
omitted. The most interesting presentations usually use specific stories, anecdotes,
studies, or incidents to make points.

Address Validity and Reliability Issues

The presentation should help the audience avoid misinterpreting the results. The wording
of the questions, the order in which they are asked, and the sampling design are among
the design dimensions that can lead to biased results and misinterpretations. The
presentation should not include an exhaustive description of all the design considerations.
Nobody is interested in a textbook discussion of the advantages of telephone over mail
surveys, or how you locate homes in an area sampling design.

The presentation should include some indication of the reliability of the results. At the
minimum, it always should be clear what sample size was involved. The key results
should be supported by more precise information in the form of interval estimates or a
hypothesis test. The hypothesis test indicates, given the sample size, the probability
that the results were merely an accident of sampling. If that probability is not low, the
results may well not be repeated in another sample. Do not imply more precision than is
warranted.
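
The reliability indicators mentioned above can be illustrated with a short calculation. The following is only a minimal sketch in Python, assuming the key result is a simple proportion and that the sample is large enough for the normal approximation to hold. The function names and the survey figures (120 of 200 respondents) are hypothetical and serve purely as an illustration.

```python
from math import sqrt
from statistics import NormalDist


def proportion_interval(successes, n, confidence=0.95):
    """Normal-approximation interval estimate for a sample proportion."""
    p_hat = successes / n
    # Critical value of the standard normal for the chosen confidence level.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    margin = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


def proportion_p_value(successes, n, p_null=0.5):
    """Two-sided z-test: probability of a result at least this far from
    p_null arising from sampling alone, if p_null were the true proportion."""
    p_hat = successes / n
    std_error = sqrt(p_null * (1 - p_null) / n)
    z = (p_hat - p_null) / std_error
    return 2 * (1 - NormalDist().cdf(abs(z)))


# Hypothetical survey: 120 of 200 respondents prefer a new package design.
low, high = proportion_interval(120, 200)
p_value = proportion_p_value(120, 200)
print(f"n = 200, share = {120 / 200:.2f}, 95% interval = ({low:.3f}, {high:.3f})")
print(f"p-value against a 50/50 split = {p_value:.4f}")
```

Reporting the sample size, the interval, and the p-value together lets the audience judge how much weight the result can bear. A small p-value suggests the result is unlikely to be an accident of sampling, and the width of the interval shows how much precision the sample actually supports.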

Steps in Drafting the Research Report

Along with the related skill of working with and motivating people, the ability to
communicate effectively is undoubtedly the most important attribute a manager can have.
Effective communication between research users and research professionals is extremely
important to the research process. The formal presentation usually plays a key role in the
communication effort. Generally, presentations are made twice during the research
process. First, there is the research proposal presentation. Second, there is the
presentation of the research results.

Guidelines for successful presentations

In general, a presenter should:

• Communicate to a specific audience.
• Structure the presentation.
• Create audience interest.
• Be specific and visual.
• Address validity and reliability issues.

Editing the Final Draft

A research report requires clear organisation. Each chapter may be divided into two or
more sections with appropriate headings and in each section margin headings and
paragraph headings may be used to indicate subject shifts. Physical presentation is
another aspect of organisation. A page should not be fully filled in from top to bottom.
Wider margins should be provided on both sides and on top and bottom as well.

A centred section heading is placed in the centre of the page and is usually in solid font
size. It is separated from other textual material by two or three line spaces.

A marginal heading is used for a subdivision in each section. It starts from the left
margin without leaving any space.

A paragraph heading is used to head an important aspect of the subject matter discussed in
a subdivision. There is some space between the margin and this heading.

The presentation should be free from spelling and grammar errors. A writer who is not
strong in grammar should have the manuscript corrected by a language expert.

Follow the rules of punctuation.

Use present tense for presenting the findings of the study and for stating generalizations.

Do not use masculine nouns and pronouns when the content refers to both genders.
Do not abbreviate words in the text; spell them out in full. A footnote citation is indicated
by placing an index number, i.e., a superscript numeral, at the point of reference.
The reference style should have a clear format and be used consistently.

Evaluating the Final Draft

The general guidelines discussed so far are applicable to both written and oral
presentations. However, it is important to generate a research report that will be
interesting to read. Most researchers are not trained in effective report writing. In their
enthusiasm for research, they often overlook the need for a good writing style. In writing
a report, long sentences should be reconsidered and the critical main points should stand
out.

Here are some hints for effective report writing.

• Use main headings and subheadings to communicate the content of the material
discussed.
• Use the present tense as much as possible to communicate information.
• Whether the presentation is written or oral, use active voice construction to make
it lively and interesting; passive voice is wordy and dull.
• Use computer-generated tables and graphs for effective presentations.
• Use informative headings.
• Use double-sided presentation if possible. For example, tables or graphs could be
presented on the left side of an open report and their descriptions on the right side.

Summary

A research report is a means of communicating research experience to others. Its purpose
is to communicate to interested persons the methodology and the results of the study in
such a manner as to enable them to understand the research process and to determine its
validity. A research report is a narrative and authoritative document on the outcome of a
research effort. It presents highly specific information for a clearly designated
audience. It serves as a means for presenting the problem studied, the methods and
techniques used for collecting and analyzing data, and the findings, conclusions and
recommendations. It serves as basic reference material for future use. It is a means for
judging the quality of a research project and for evaluating the researcher’s competency.
It provides systematic knowledge on the problems and issues analyzed. A technical report
is a comprehensive, full report of the research process and its outcome. It covers all the
aspects of the research process. In a popular report the reader is less interested in the
methodological details and more interested in the findings of the study. An interim report
narrates what has been done so far and what its outcome was. It presents a summary of the
findings of that part of the analysis which has been completed. A summary report is meant
for a lay audience, i.e., the general public. It is written in non-technical, simple
language with pictorial charts, and contains just the objectives, findings and their
implications. It is a short report of two to three pages. A research abstract is a short
summary of a technical report. It is prepared by a doctoral student on the eve of
submitting his or her thesis. A research article is designed for publication in a
professional journal. It must be clearly written in concise and unambiguous language.

––––––––––––––––––

Copyright © 2009 SMU

Powered by Sikkim Manipal University
