
Student Name : Pankaj Vohra

Roll No. 530910143


MBA – SEM 1
Statistics – MB0027
Set 1

1. What do you mean by sample survey? What are the different
sampling methods? Briefly describe them.

Answer: Sample: It is a finite subset of a population drawn from it
to estimate the characteristics of the population. Sampling is a tool
which enables us to draw conclusions about the characteristics of
the population.

A sample survey can also be described as the technique used to
study a population with the help of a sample. The population is the
totality of all objects about which the study is proposed. A sample
is only a portion of this population, selected using certain
statistical principles called sampling designs.

A sample is selected on the basis of the following laws:

1) Law of statistical regularity: The group chosen tends to
possess the characteristics of the larger population.
2) Principle of inertia of large numbers: Other things being equal,
as the sample size increases the results tend to be more reliable
and accurate.
3) Principle of persistence of small numbers: If the population
possesses a markedly distinct character, it will be reflected in the
sample too.
4) Principle of validity: A sample is valid only if it enables tests
and estimation of population parameters.
5) Principle of optimization: To obtain the desired level of
efficiency at minimum cost.

Different Types of Sampling

i. Probability Sampling
ii. Non-Probability Sampling

1) Probability Sampling. Different ways of assigning probability
are:

i. Each unit has the same chance of being selected.

ii. Sampling units have varying probabilities.

iii. Units have probability proportional to their size.

Some of the important sampling designs are:

i. Simple Random Sampling: Sample units are drawn so that each
and every unit in the population has an equal and independent
chance of being included in the sample. If the sample unit is
replaced before the next unit is drawn, it is known as Simple Random
Sampling with replacement [SRSWR]. Here the probability of drawing a
unit is 1/N. If the sample unit is not replaced before the next unit
is drawn, it is called Simple Random Sampling without replacement
[SRSWOR]. Here the probability of drawing any particular sample of n
units is 1/NCn, where NCn is the number of ways of choosing n units
out of N. N is the population size and n is the sample size. A simple
random sample can be selected by

a) Lottery Method

b) The use of table of random numbers.
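
A minimal Python sketch of the two schemes, assuming a hypothetical
population of N = 20 numbered units (the names `srswr` and `srswor`
and the seed are illustrative only). The pseudo-random generator plays
the role of the lottery or the table of random numbers.

```python
import random

def srswr(population, n, seed=None):
    """Simple random sample WITH replacement: a unit may be drawn more than once."""
    rng = random.Random(seed)
    # Each draw picks any of the N units with probability 1/N.
    return [rng.choice(population) for _ in range(n)]

def srswor(population, n, seed=None):
    """Simple random sample WITHOUT replacement: a unit is drawn at most once."""
    rng = random.Random(seed)
    # random.sample returns n distinct positions of the population.
    return rng.sample(population, n)

units = list(range(1, 21))            # hypothetical population, N = 20
with_repl = srswr(units, 5, seed=1)
without_repl = srswor(units, 5, seed=1)
print(with_repl, without_repl)
```

With replacement the same unit can turn up twice in `with_repl`;
`without_repl` always holds five different units.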

ii. Stratified Random Sampling

It is used when the population is heterogeneous with respect to the
characteristic under study or the population distribution is highly
skewed.

We subdivide the population into several groups or strata such that

i) Units within each stratum are more homogeneous,
ii) Units in different strata are heterogeneous, and
iii) Strata do not overlap; in other words, every unit of the
population belongs to one and only one stratum.

The criteria used for stratification are geographical, sociological,
age, sex, income etc. The population of size N is divided into K
relatively homogeneous strata of sizes N1, N2, ..., Nk such that
N1 + N2 + ... + Nk = N. Then we draw a simple random sample from each
stratum, either proportional to the size of the stratum or with an
equal number of units from each stratum.
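
Proportional allocation can be sketched in Python as follows. The
strata ("urban", "rural"), their sizes and the total sample size of 50
are made-up values, and the function names are illustrative. Each
stratum receives n * Ni / N units, rounded to the nearest whole number.

```python
import random

def proportional_allocation(strata_sizes, n):
    """Sample size per stratum: n_i = n * N_i / N, rounded.

    Rounding can make the allocations add up to slightly more or
    less than n when the shares are not whole numbers.
    """
    total = sum(strata_sizes.values())
    return {name: round(n * size / total) for name, size in strata_sizes.items()}

def stratified_sample(strata, n, seed=None):
    """Draw a simple random sample (without replacement) from every stratum."""
    rng = random.Random(seed)
    sizes = {name: len(units) for name, units in strata.items()}
    alloc = proportional_allocation(sizes, n)
    return {name: rng.sample(units, alloc[name]) for name, units in strata.items()}

# Hypothetical population of N = 1000 split into two non-overlapping strata.
strata = {"urban": list(range(600)), "rural": list(range(600, 1000))}
alloc = proportional_allocation({k: len(v) for k, v in strata.items()}, 50)
sample = stratified_sample(strata, 50, seed=0)
print(alloc)
```

For equal allocation one would instead draw the same number of units
from every stratum.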

Merits

a. The sample is more representative.

b. It provides more efficient estimates.

c. It is administratively more convenient.

d. It can be applied in situations where different degrees of
accuracy are desired for different segments of the population.

Demerits

a. Many times the stratification is not effective.

b. Appropriate sample sizes are not drawn from each stratum.

Systematic Sampling

This design is recommended if we have a complete list of sampling
units arranged in some systematic order such as geographical,
chronological or alphabetical order.
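
One common way of drawing a systematic sample is to take every k-th
unit of the list after a random start. A minimal Python sketch,
assuming a hypothetical frame of 100 numbered units and a sample of
10 (so k = 10); the function name is illustrative:

```python
import random

def systematic_sample(frame, n, seed=None):
    """Pick a random start within the first interval, then every k-th unit."""
    k = len(frame) // n                       # sampling interval
    start = random.Random(seed).randrange(k)  # random start in 0 .. k-1
    return frame[start::k][:n]

frame = list(range(1, 101))   # hypothetical ordered list of sampling units
sample = systematic_sample(frame, 10, seed=3)
print(sample)
```

Because the units are equally spaced, a periodic pattern in the list
with the same spacing would be picked up in every selected unit.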

Merits

a. Very easy to operate and easy to check

b. It saves time and labour

c. More efficient than Simple Random Sampling if we have an
up-to-date frame

Demerits

a. In many cases an up-to-date list is not available

b. It gives biased results if a periodic feature exists in the data.

Cluster Sampling

The total population is divided into recognizable sub-divisions,
known as clusters, such that within each cluster units are more
heterogeneous and between clusters they are homogeneous. The
units are selected from each cluster by suitable sampling
techniques.

Multi-stage Sampling

The total population is divided into several stages. The sampling
process is carried out through several stages.

Merits

a. Greater flexibility in the sampling method

b. Existing divisions can be used.

Demerits

a. Estimates are less accurate

b. The investigator should have knowledge of the entire population
that will be sampled.

Non-Probability Sampling

A predetermined number of sample units is selected purposely so
that they represent the true characteristics of the population,
depending upon the object of enquiry and other considerations.

Demerits

a. It is highly subjective in nature.

b. The selection of sample units depends entirely upon the
personal convenience, biases, prejudices and beliefs of the
investigator.

Judgment Sampling

The investigator's experience and knowledge about the population
help in selecting the sample units, as the choice of sample depends
exclusively on the judgment of the investigator. It is the most
suitable method when the population is small.

Merits

a. Most useful for a small population

b. Most useful for studying some unknown traits of a population
some of whose characteristics are known

c. Useful for solving day-to-day problems

Demerits

a. It is not a scientific method

b. It carries the risk of the investigator's bias being introduced.

Convenience Sampling
The sample is also called a "chunk", which refers to the fraction of
the population being investigated that is selected neither by
probability nor by judgment. The sample units are selected according
to the convenience of the investigator.

Quota Sampling

It is a type of judgment sampling. Under this design quotas are set
up according to some specified characteristic such as age group,
income group etc. From each group a specified number of units is
sampled according to the quota allotted to the group.

2. What is the difference between correlation and regression? What
do you understand by rank correlation? When do we use rank
correlation and when do we use the Pearsonian correlation
coefficient? Fit a linear regression line to the following data:

X:  12  15  18  20  27  34  28  48
Y: 123 150 158 170 180 184 176 130

Answer- Difference between Correlation and Regression.

1. Correlation: When two or more variables move in sympathy
with each other, they are said to be correlated. If both variables
move in the same direction they are said to be positively
correlated. If the variables move in opposite directions they
are said to be negatively correlated. If they move haphazardly
then there is no correlation between them.
Regression: Regression is defined as "the measure of the
average relationship between two or more variables in terms of
the original units of the data."

2. Correlation analysis deals with
1) Measuring the relationship between variables.
2) Testing the relationship for its significance.
3) Giving a confidence interval for the population correlation
measure.
Regression analysis is used to estimate the value of dependent
variables from the values of independent variables.

3. Correlation analysis: To study the relationship between the
two variables x and y. Regression analysis: To predict the
average y for a given x. In regression an attempt is made to
quantify the dependence of one variable on the other.

4. Correlation quantifies the degree to which two variables are
related. Correlation does not find a best-fit line, while regression
does. You simply compute a correlation coefficient (r)
that tells you how much one variable tends to change when the
other one does.

5. With correlation you don't have to think about cause and
effect. You simply quantify how well two variables relate to each
other. With regression, you do have to think about cause and
effect, as the regression line is determined as the best way to
predict Y from X.

6. With correlation, it doesn't matter which of the two variables
you call "X" and which you call "Y". You'll get the same
correlation coefficient if you swap the two. With linear
regression, the decision of which variable you call "X" and
which you call "Y" matters a lot, as you'll get a different best-fit
line if you swap the two. The line that best predicts Y from X is
not the same as the line that predicts X from Y.

7. Correlation is almost always used when you measure both
variables. It is rarely appropriate when one variable is something
you experimentally manipulate. With linear regression, the X
variable is often something you experimentally manipulate (time,
concentration...) and the Y variable is something you measure.

8. Correlation measures the STRENGTH of the linear association
between paired variables, say X and Y. Regression, on the other
hand, tells us the form of the linear association that best
predicts Y from the values of X.

9. Correlation is calculated whenever:
a. Both X and Y are measured in each subject, and it quantifies
how much they are linearly associated.
b. In particular, Pearson's product moment correlation
coefficient is used when the assumption that both X and Y
are sampled from normally distributed populations is
satisfied.
c. Spearman's rank order correlation coefficient
is used if the assumption of normality is not satisfied.
d. Correlation is not used when the variables are manipulated,
for example, in experiments.
Linear regression is used whenever:
a. At least one of the independent variables (Xi's) is to
predict the dependent variable Y. Note: Some of the Xi's may be
dummy variables, i.e. Xi = 0 or 1, which are used to code
some nominal variables.
b. One manipulates the X variable, e.g. in an experiment.
10. Linear regression is not symmetric in terms of X and Y. That is,
interchanging X and Y will give a different regression model (i.e.
X in terms of Y) from the original Y in terms of X.
On the other hand, if you interchange variables X and Y in the
calculation of the correlation coefficient you will get the same
value of this correlation coefficient.

11. The "best" linear regression model is obtained by selecting the
variables (X's) with at least strong correlation to Y, i.e. >= 0.80
or <= -0.80.

12. The same underlying distribution is assumed for all variables in
linear regression. Thus, linear regression will underestimate the
correlation of the independent and dependent variables when they
(X's and Y) come from different underlying distributions.

Spearman’s Rank Correlation Coefficient

Spearman's rank correlation coefficient, named after Charles
Spearman and denoted by the Greek letter ρ (rho) or by rs, is a
nonparametric measure of correlation.

Karl Pearson's correlation coefficient assumes that

i) Samples are drawn from a normal population.

ii) The variables under study are affected by a large number
of independent causes so as to form a normal distribution.

When we do not know the shape of the population distribution, or when
the data are of qualitative type, Spearman's rank correlation
coefficient is used to measure the relationship.

It is defined as

r = 1 - 6ΣD² / (n(n² - 1))

where D is the difference between the ranks assigned to the two
variables and n is the number of pairs of observations. The value of
r lies between -1 and +1 and its interpretation is the same as
that of Karl Pearson's correlation coefficient.
If tied ranks exist, the classic Pearson's correlation coefficient
between the ranks has to be used.

One has to assign the same rank to each of the equal values. It is
the average of their positions in the ascending order of the values.
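
The ranking rule and the standard no-ties formula
r = 1 - 6ΣD² / (n(n² - 1)) can be sketched in Python as below, applied
to the X and Y data of this question, which contain no tied values.
The function names are illustrative.

```python
def average_ranks(values):
    """Rank in ascending order; equal values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1          # positions are counted from 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman_no_ties(x, y):
    """r = 1 - 6*sum(D^2) / (n*(n^2 - 1)); appropriate when no ranks are tied."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * d2 / (n * (n * n - 1))

X = [12, 15, 18, 20, 27, 34, 28, 48]
Y = [123, 150, 158, 170, 180, 184, 176, 130]
r_s = spearman_no_ties(X, Y)   # sum of D^2 is 44, so r_s is about 0.476
print(round(r_s, 3))
```

The last pair (X = 48, Y = 130) has the largest X but the second
smallest Y, which accounts for most of ΣD² and pulls the coefficient
well below +1.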

X:  12  15  18  20  27  34  28  48
Y: 123 150 158 170 180 184 176 130

Linear regression line for the above data

Number of observations (n): 8
Slope (b): 0.16701
Y-intercept (a): 154.66
Regression equation: Y = 154.66 + 0.17X
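
These figures can be checked with a short least-squares sketch in
Python. It uses the usual formulas b = Sxy / Sxx and
a = mean(Y) - b * mean(X); the function name `fit_line` is
illustrative.

```python
def fit_line(x, y):
    """Least-squares line of y on x; returns (intercept a, slope b)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of products and squares of deviations from the means.
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b

X = [12, 15, 18, 20, 27, 34, 28, 48]
Y = [123, 150, 158, 170, 180, 184, 176, 130]
a, b = fit_line(X, Y)
print(f"Y = {a:.2f} + {b:.5f} X")   # slope about 0.167, intercept about 154.66
```

Here mean(X) = 25.25, mean(Y) = 158.875, Sxy = 161.25 and Sxx = 965.5,
which give the slope and intercept quoted above.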

____________________________________________________________________

3. What do you mean by business forecasting? What are the
different methods of business forecasting? Describe the
effectiveness of time-series analysis as a mode of business
forecasting. Describe the method of moving averages.

Answer- Business Forecasting

Business forecasting is the analysis of past and present economic
conditions with the object of drawing inferences about probable
future business conditions. The process of making definite estimates
of the future course of events is referred to as forecasting, and the
figure or statement obtained from the process is known as a
'forecast'. The future course of events is rarely known. There are
two aspects of scientific business forecasting:

i. Analysis of past economic conditions

ii. Analysis of present economic conditions

Main methods of business forecasting:-

1. Business Barometers

Business indices are indicators of future conditions, so they are
also known as "Business Barometers" or "Economic Barometers", which
can help in forecasting and decision making. They consist of
gross national product, wholesale prices, consumer prices, industrial
production, stock prices, bank deposits etc. These quantities may be
converted into relatives on a certain base. The relatives so obtained
may be weighted and their average computed. The index thus
arrived at is the business barometer.
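
The construction just described (relatives on a base, then a weighted
average) can be sketched in Python. All figures and weights below are
made up for illustration, as are the function names.

```python
def relatives(current, base):
    """Each quantity expressed as a percentage of its base-period value."""
    return {name: 100 * current[name] / base[name] for name in current}

def weighted_index(rel, weights):
    """Weighted arithmetic mean of the relatives."""
    total_weight = sum(weights.values())
    return sum(rel[name] * weights[name] for name in rel) / total_weight

# Hypothetical base-period and current-period figures.
base = {"wholesale_prices": 200, "industrial_production": 50}
current = {"wholesale_prices": 220, "industrial_production": 60}
weights = {"wholesale_prices": 3, "industrial_production": 1}

rel = relatives(current, base)          # relatives of 110 and 120
index = weighted_index(rel, weights)    # (110*3 + 120*1) / 4
print(index)
```

An index above 100 indicates that, on the chosen weights, the series
taken together stand above their base-period level.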

The business barometers are of three types:

i. Barometers relating to general business activities.

ii. Business barometers for a specific business or industry.

iii. Business barometers concerning an individual business firm.

2. Time Series Analysis

Forecasting through time series analysis is possible only when
business data for various years are available which reflect a
definite trend and seasonal variation.
3. Extrapolation

Extrapolation is the simplest method of business forecasting. By
extrapolation, a businessman finds out the possible trend of demand
for his goods and also their future price trends. The accuracy
of extrapolation depends on two factors:

i. Knowledge about the fluctuations of the figures,

ii. Knowledge about the course of events relating to the problem
under consideration.

Thus there are two assumptions on which extrapolation is based:

i.) There are no sudden jumps in figures from one period to another,

ii.) There is regularity in fluctuations and the rise and fall is
uniform.

4. Regression Analysis

It is the means by which we select from among the many possible
relationships between variables in a complex economy those which
will be useful for forecasting. A regression relationship may involve
one predicted or dependent variable and one independent variable
(simple regression), or it may involve relationships between the
variable to be forecast and several independent variables (multiple
regression). Statistical techniques to estimate the regression
equations are often fairly complex and time-consuming, but there
are many computer programs now available that estimate simple
and multiple regressions quickly.

5. Modern Econometric Methods

The term econometrics refers to the application of mathematical
economic theory and statistical procedures to economic data in
order to verify economic theorems. Models take the form of a set of
simultaneous equations. The values of the constants in such
equations are supplied by a study of statistical time series, and a
large number of equations may be necessary to produce an adequate
model.

6. Exponential Smoothing Method

This method is regarded as the best method of business forecasting
as compared to other methods. Exponential smoothing is a special
kind of weighted average and is found extremely useful in
short-term forecasting of inventories and sales.
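
A minimal Python sketch of simple exponential smoothing, in which each
smoothed value is a weighted average of the latest observation and the
previous smoothed value. The sales figures and the smoothing constant
alpha = 0.5 are assumptions chosen for illustration, and starting the
series at the first observation is one common convention among
several.

```python
def exponential_smoothing(series, alpha):
    """S_1 = y_1; S_t = alpha * y_t + (1 - alpha) * S_(t-1), with 0 < alpha <= 1."""
    smoothed = [series[0]]
    for y in series[1:]:
        smoothed.append(alpha * y + (1 - alpha) * smoothed[-1])
    return smoothed

sales = [100, 110, 105, 120]           # hypothetical sales figures
smoothed = exponential_smoothing(sales, alpha=0.5)
print(smoothed)                        # [100, 105.0, 105.0, 112.5]
```

The last smoothed value is commonly taken as the forecast for the next
period; a larger alpha gives more weight to recent observations.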
7. Choice of a Method of Forecasting

The selection of an appropriate method depends on many factors:
the context of the forecast, the relevance and availability of
historical data, the degree of accuracy desired, the time period for
which forecasts are required, the cost benefit of the forecast to the
company, and the time available for making the analysis.

Effectiveness of Time Series Analysis:

Time series analysis is also used for the purpose of business
forecasting. Forecasting through time series analysis is
possible only when business data for various years are available
which reflect a definite trend and seasonal variation. By time series
analysis the long-term (secular) trend and the seasonal and cyclical
variations are ascertained, analyzed and separated from the data of
various years.

Merits:

i) It is an easy method of forecasting.

ii) By this method a comparative study of variations can be made.

iii) Reliable forecasting results are obtained, as this method is
based on a mathematical model.

The following are the possible uses of the time series:

i. A comparative study of the behavior of the variable over different
periods of time can be done. The variable may be export figures,
quantity of industrial production etc.

ii. Forecasting can be done using the time series. By studying the
variations and other behavior of the variables over a sufficiently
long period of time, it may be possible to forecast the future
behavior of the variables. However, such a forecast has meaning
only if the period of forecast is a normal period. For example,
various five-year plans of the Government of India are
formulated by studying the time series and forecasting.

iii. Study of the time series helps in analysing the past behavior of
the variables. This helps in identifying the various forces that
affect its behavior.

Method of Moving Averages

This method is used for smoothing a time series, that is, for
smoothing out the fluctuations in the data.

A) When the period of moving average is odd: To determine the trend
by this method, we use the following steps:

i.) Obtain the time series.

ii.) Select a period of moving average such as 3 years, 5 years etc.

iii.) Compute moving totals according to the length of the period of
moving average. If the length of the period is 3, i.e., a 3-yearly
moving average is to be calculated, compute the moving totals as
follows:

a + b + c, b + c + d, c + d + e, d + e + f, ...

Place the moving totals at the centre of the time span from
which they are computed.

iv.) Compute moving averages by dividing the moving totals in step
(iii) by the length of the period of moving average and place them
at the centre of the time span from which the moving totals are
computed. These moving averages are also called the trend
values.

By plotting these trend values (if desired) one can obtain the
trend curve, with the help of which we can determine whether the
trend is increasing or decreasing.

If needed, one can also compute short-term fluctuations by
subtracting the trend values from the actual values.
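
The steps for an odd period can be sketched in Python as follows. The
six yearly values are made up, and `None` marks the years at either
end for which no trend value can be computed.

```python
def moving_average(series, period):
    """Moving averages for an odd period, placed at the centre of each time span."""
    if period % 2 == 0:
        raise ValueError("this sketch handles odd periods only")
    half = period // 2
    trend = [None] * len(series)
    for i in range(half, len(series) - half):
        moving_total = sum(series[i - half:i + half + 1])
        trend[i] = moving_total / period
    return trend

values = [2, 4, 6, 8, 10, 12]          # hypothetical yearly figures
trend = moving_average(values, 3)
# Short-term fluctuation = actual value - trend value, where a trend value exists.
fluctuations = [None if t is None else v - t for v, t in zip(values, trend)]
print(trend)                           # [None, 4.0, 6.0, 8.0, 10.0, None]
```

Because the made-up series rises by a constant amount each year, the
trend values coincide with the actual values and the short-term
fluctuations are all zero.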

B) When the period of moving average is even: When the period of
moving average is even (4 years etc.) we compute the moving
averages by using the following steps:

i.) Obtain the time series.

ii.) Select the length of the period of moving average. Let the
length of the moving average period be 4 years.

iii.) Compute 4-yearly moving totals and place them at the centre of
the time span. The 4-yearly moving totals are computed as
follows:

a + b + c + d, b + c + d + e, c + d + e + f, ...

iv.) Compute the 4-yearly moving averages and place them at the
centre of the time span. Note that this placement is inconvenient,
because a moving average so placed does not coincide with an
original time period.

v.) Take a two-period moving average of the moving averages and place
it at the middle of the two periods. This process is called centring
of moving averages.
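
A minimal Python sketch of centring for an even period, using eight
made-up yearly values and a 4-yearly moving average. `None` marks the
years without a trend value.

```python
def centred_moving_average(series, period):
    """Centred moving averages for an even period (e.g. 4-yearly)."""
    if period % 2 != 0:
        raise ValueError("this sketch handles even periods only")
    # Steps iii and iv: moving averages, each falling between two periods.
    raw = [sum(series[i:i + period]) / period
           for i in range(len(series) - period + 1)]
    # Step v: average adjacent moving averages so they align with a period.
    centred = [(raw[i] + raw[i + 1]) / 2 for i in range(len(raw) - 1)]
    half = period // 2
    trend = [None] * len(series)
    for i, value in enumerate(centred):
        trend[i + half] = value
    return trend

values = [1, 2, 3, 4, 5, 6, 7, 8]      # hypothetical yearly figures
trend = centred_moving_average(values, 4)
print(trend)   # [None, None, 3.0, 4.0, 5.0, 6.0, None, None]
```

The first 4-yearly average (2.5) falls between years 2 and 3 and the
second (3.5) between years 3 and 4; their mean, 3.0, is placed against
year 3.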

Merits of method of moving averages:

i.) This method is simple.

ii.) This method is objective in the sense that anybody working on a
problem with this method will get the same results.

iii.) This method is used for determining seasonal, cyclic and
irregular variations besides the trend values.

iv.) This method is flexible enough to add more figures to the data
because the entire calculations are not changed.

v.) If the period of moving averages coincides with the period of
cyclic fluctuations in the data, such fluctuations are automatically
eliminated.

Limitations:

i.) There is no functional relationship between the values and the
time. Thus, this method is not helpful in forecasting and predicting
the values on the basis of time.

ii.) There are no trend values for some years at the beginning and
some at the end. For example, for a 5-yearly moving average there
will be no trend values for the first two years and the last two
years.

iii.) In the case of a non-linear trend the values obtained by this
method are biased in one or the other direction.

iv.) The selection of the period of moving average is a difficult
task. Therefore great care has to be taken in selecting the period,
particularly when there is no business cycle during that time.
____________________________________________________________________

4. What is the definition of Statistics? What are the different
characteristics of Statistics? What are the different functions of
Statistics? What are the limitations of Statistics?

Answer: Definition of "Statistics"

Different authors provide different definitions of statistics.

1. Boddington: "Statistics is the science of estimates and
probabilities."

2. Croxton and Cowden: "Statistics is the science of collection,
presentation, analysis and interpretation of numerical data."

3. Prof. Horace Secrist: "Statistics deals with aggregates of facts,
affected to a marked extent by multiplicity of causes, numerically
expressed, enumerated or estimated according to a reasonable
standard of accuracy, collected in a systematic manner for a
predetermined purpose and placed in relation to each other."

Characteristics of Statistics

1. Statistics deals with aggregates of facts: A single figure
cannot be analyzed. Thus, the fact "Mr Lee is 170 cm tall"
cannot be statistically analysed. On the other hand, if we know
the heights of 60 students of a class, we can comment upon the
average height, variations etc.
2. Statistics are affected to a marked extent by multiplicity
of causes: The statistics of yield of paddy is the result of factors
such as fertility of soil, amount of rainfall, quality of seed used,
quality and quantity of fertilizer used, etc.
3. Statistics are numerically expressed: Only numerical facts
can be statistically analyzed. Therefore, statements such as 'price
decreases with increasing production' cannot be called statistics.
4. Statistics are enumerated or estimated according to
reasonable standards of accuracy: The facts should be
enumerated (collected from the field) or estimated (computed)
with required degree of accuracy. The degree of accuracy differs
from purpose to purpose.
5. Statistics are collected in a systematic manner: The facts
should be collected according to planned and scientific methods.
Otherwise, they are likely to be wrong and misleading.
6. Statistics are collected for a pre-determined purpose:
There must be a definite purpose for collecting facts, e.g.
studying the movement of the wholesale price of a commodity.
7. Statistics are placed in relation to each other: The facts
must be placed in such a way that a comparative and analytical
study becomes possible. Thus, only related facts which are
arranged in logical order can be called statistics.

Functions of Statistics

• It simplifies mass data.
• It makes comparison easier.
• It brings out trends and tendencies in the data.
• It brings out hidden relations between variables.
• It makes the decision-making process easier.

Limitations of Statistics

1. Statistics does not deal with qualitative data. It deals only with
quantitative data.
2. Statistics does not deal with individual facts: Statistical methods
can be applied only to aggregates of facts.
3. Statistical inferences (conclusions) are not exact: Statistical
inferences are true only on an average. They are probabilistic
statements.
4. Statistics can be misused and misinterpreted: Increasing misuse
of Statistics has led to increasing distrust in statistics.
5. Common men cannot handle Statistics properly: Only statisticians
can handle statistics properly.

____________________________________________________________________

5. What are the different stages of planning a statistical survey?
Describe the various methods for collecting data in a
statistical survey.

Answer:
The planning stage consists of the following sequence of activities.

1. The nature of the problem to be investigated should be clearly
defined in an unambiguous manner.
2. The objectives of the investigation should be stated at the outset.
Objectives could be to obtain certain estimates, to establish a
theory, to verify an existing statement, to find relationships
between characteristics etc.
3. The scope of the investigation has to be made clear. It refers to
the area to be covered, identification of units to be studied, nature
of characteristics to be observed, accuracy of measurements,
analytical methods, time, cost and other resources required.
4. Whether to use data collected from primary or secondary sources
should be determined in advance.
5. The organization of the investigation is the final step in the
process. It encompasses the determination of the number of
investigators required, their training, supervision work needed,
funds required etc.

Methods of Collecting Data

1) Primary data collection:

i. Direct personal observation
ii. Indirect oral interview
iii. Information through agencies
iv. Information through mailed questionnaires
v. Information through schedules filled by investigators

2) In direct personal observation the investigator collects data by
having direct contact with the units of investigation.

3) Indirect oral interview is used when the area to be covered is
large. The data are collected from a third party, a witness or the
head of an institution.

4) Through local agencies and correspondents.

5) Through questionnaires. Generally adopted by research workers
and other official and non-official agencies.

6) Through schedules filled by the investigator through personal
contact.

7) Secondary data may be collected either by census or sampling
methods.

8) Pilot survey: It is a small trial survey undertaken before the main
survey. It gives a measure of the efficiency of the questionnaire.
____________________________________________________________________

6. What are the functions of classification? What are the
requisites of a good classification? What is a table? Describe the
usefulness of a table as a mode of presentation of data.

Answer: Functions of Classification

a. It reduces the bulk of the data.

b. It simplifies the data and makes the data more comprehensible.

c. It facilitates comparison of characteristics.

d. It renders the data ready for any statistical analysis.

Requisites of a good classification

i. Unambiguous: It should not lead to any confusion.

ii. Exhaustive: Every unit should be allotted to one and only
one class.

iii. Mutually exclusive: There should not be any overlapping.

iv. Flexibility: It should be capable of being adjusted to
changing situations.

v. Suitability: It should be suitable to the objectives of the survey.

vi. Stability: It should remain stable throughout the investigation.

vii. Homogeneity: Similar units are placed in the same class.

viii. Revealing: It should bring out essential features of the
collected data.

Table

It is a logical listing of related data in rows and columns.

Objectives of tabulation are:

i. To simplify complex data

ii. To highlight important characteristics

iii. To present data in minimum space

iv. To facilitate comparison

v. To bring out trends and tendencies

vi. To facilitate further analysis

Usefulness of a table in mode of presentation of data

Parts of a Table.

i. Table number
ii. Title
iii. Captions
iv. Stubs
v. Body of the table
vi. Ruling and spacing
vii. Head note
viii. Source note

Types of Table

a. By purpose of investigation: two types.

i. General purpose table, also known as reference table. Such
tables are formed without a specific objective, but can be
used for any specific purpose. They contain a large mass of
data. Example: Census.

ii. Specific purpose table, also known as text table or summary
table. Such tables deal with specific problems. They are smaller in
size and they highlight relationships between characteristics.
Example: Cost of living indices.

b. By the nature of the presented figures: two types.

i. Primary table: It contains data in the form in which they
were originally collected.

ii. Derived table: It presents figures like totals,
averages, ratios etc. derived from the original data.

c. By construction: three types.

i. Simple table: Presents only one characteristic.

ii. Complex table: Presents two or more characteristics.

iii. Cross-classified table: Entries are classified in both
directions.
