
QUANTITATIVE TECHNIQUES IN BUSINESS

QTB

INTRODUCTION TO QUANTITATIVE TECHNIQUES IN BUSINESS


SESSION 1: INTRODUCTION TO QUANTITATIVE TECHNIQUES IN BUSINESS


A. Lesson Objective
This lesson will enable students to:
1. Explain what is meant by Quantitative Techniques in Business.
2. Explain why Quantitative Techniques in Business is studied.
3. Identify a research problem and write an effective problem statement.
4. Understand core concepts including constants, variables, research questions, hypotheses and data.

B. Lesson Outline
1) What is QTB?
2) Why to study QTB?
3) Some core concepts in Quantitative Techniques in Business
   a. Research Problems & Problem statement
   b. Constant and Variables
      i. Types of variables
         1. With respect to relationship
            a. Independent variable
            b. Dependent variable
            c. Mediating variable
            d. Moderating variable
         2. With respect to data
            a. Categorical variable (Nominal, Ordinal)
            b. Numerical variable (Discrete, Continuous)
   c. Research Questions
      i. Types of Research questions
         1. Descriptive research Questions
         2. Differential research Questions
         3. Associational research Questions
         4. Complex Research Questions
   d. Research Hypothesis
      i. Types of Hypothesis
         1. Null Hypothesis
         2. Alternative Hypothesis
   e. Data
      i. Types of data
         1. Cross-sectional data
         2. Time series data
      ii. Data Matrix

QUANTITATIVE TECHNIQUES IN BUSINESS
Business is all about decision making across managerial functions, including marketing, management, finance, human resources, production and procurement, with the objective of increasing profit and maximizing market share. We live in a world of uncertainty, and there is no way to eliminate completely the risk of wrong decisions in business. As good managers, we should therefore resolve our problems so intelligently that the risk arising from uncertainty is minimized in our business decisions. To this end we use different techniques to gather, sort, analyze and interpret data that help us improve our business decisions. Since these data are quantitative in nature, the techniques are called quantitative techniques in business.

Examples
- The marketing department needs updated information about target markets, competitors, consumer buying behavior and the market situation in order to launch a new product.
- The HR department needs data on current employees and the growth rate of the company in order to predict and plan future human resource needs.
- The finance department needs statistical data on the cost of production and on sales in order to make financial forecasts, break-even analyses and investment decisions.

Why to study Quantitative Techniques in Business?
QTB differs from other related courses offered to students because it covers the whole sphere of issues related to managerial decisions, irrespective of the area in which managers operate. Other statistical courses are taught theoretically, while QTB emphasizes deriving information for solving practical problems. Furthermore, this course is essential because it enables us to:
- Gather, sort, analyze and interpret data
- Have updated, accurate and relevant information about different environmental factors
- Understand and compare the different types of situations we confront in our business activities
- Predict and forecast the future needs of the business
- Develop effective policies and business-related strategies
- Make effective decisions that help to achieve business goals efficiently
In addition, all research, whether academic or applied, is based on quantitative techniques, and thesis writing, which is essential for attaining a degree, is based on quantitative techniques.

SOME BASIC CONCEPTS/TERMS OF QUANTITATIVE RESEARCH
Before proceeding to the quantitative techniques used in business, it is essential to understand some basic concepts related to these techniques. These concepts are as follows:

a) Research Problems
Any problem that needs to be solved with the help of data collected through research is called a research problem. According to Kerlinger, in order to solve a problem, one must know what the problem is. Understanding and defining the problem faced by managers is critical to solving it, because a problem that is well defined and understood is said to be half solved. Defining the problem is the first and most important step in the problem-solving process. It serves as the foundation of a research study: if it is well formulated, a good study can be expected to follow. The way you formulate a research problem determines almost every step that follows in the research study.


A research problem identifies the phenomenon to be studied.

Problem statement
A problem statement is a clear and concise description of the business issue that managers face and need to solve. A research problem is typically stated as a question about the relationship between two or more variables. A good problem statement clearly defines:
1. What the problem actually is
2. Who the stakeholders of the problem are
3. What the scope and limitations of the problem are (rationally justified)

Examples
- What is the best strategy to promote a particular product? (Marketing)
- What is the main reason for employee turnover? (HRM)
- Which is the best option for investing the money? (Finance)

b) Constant and variables
A problem statement comprises a relationship between two or more variables.

Constant
If a concept has only one value and does not change in a particular situation, it is called a constant.
Example
- If all participants of a study are female, then Gender is a constant.
- If all participants of a study have the same age (e.g. 25 years), then Age is a constant.

Variables
A variable is defined as a characteristic of the participants or the situation in a given study that has different values. A variable must vary, that is, have different values in the study.


Vary + able = Change + able
Example
- If the participants of a study include both males and females, then Gender is a variable.
- If the participants of a study are of different ages, then Age is a variable.

Types of Variables
In quantitative research, variables are defined operationally and are commonly divided into types on the following bases:
a. On the basis of relationship
b. On the basis of data

a. On the basis of relationship
Variables are divided into four types on the basis of relationship.
i. Independent variable: A variable that is not influenced in a specific situation but causes change in other variables, such as advertising, which causes change in the sales of a product. An independent variable is also called an explanatory or manipulated variable.
ii. Dependent variable: A variable that is influenced by another variable (the independent variable) in a specific situation. In the example above, sales is influenced by advertising and is therefore the dependent variable. A dependent variable is also called an outcome or response variable.
iii. Mediating variable: A variable that forms a link between the independent and dependent variables, working as a bridge between them. In the example of advertising and sales, advertising does not affect sales directly; rather, advertising creates awareness and image, which in turn increase sales. Here awareness and image are the two mediating variables.
iv. Moderating variable: A variable that changes the strength (or direction) of the relationship between the independent and dependent variables. For example, a competitor's product, price, placement or packaging moderates the relationship between advertising and sales.


b. On the basis of data
Variables are divided into two broad types on the basis of data.
i. Categorical variable: A variable whose values are not numerical in nature, for example Gender (male, female), Religion (Islam, Christianity, Judaism, etc.), Motivation level (high, medium, low).
Types of categorical variable:
1. Nominal variable: A categorical variable whose values are not ordered, for example Gender (male, female).
2. Ordinal variable: A categorical variable whose values are ordered, for example Education (matric, inter, graduation).
ii. Numerical variable: A variable whose values are numerical in nature, for example Number of employees (23, 45, 69, 100), Collar size (14, 14.5, 15, 15.5), Height (5.7, 5.8, 5.3).
Types of numerical variable:
1. Discrete variable: A numerical variable that can take only separate, countable values, with no possible values in between, for example Number of employees (23, 45, 69, 100), Collar size (14.5, 15, 15.5).
2. Continuous variable: A numerical variable that can take any value within a range, for example Speed (40.1, 45.7, 67.5 km/h), Height (5.7, 5.8, 5.3 feet).


c) Research Questions
A research problem needs to be translated into one or more research questions. A research question is an interrogative statement that asks about the tentative relationship among variables and clarifies what the researcher wants to answer.
Examples
- What is the impact of advertising on the sales of a new product in the market?
- What is the annual turnover of employees in higher educational institutions of Pakistan?
- Does investing in the stock market yield a higher return on investment than investing in real estate?

Types of Research Questions
On the basis of the nature of the problem, research questions are divided into three types:
1. Descriptive research question: A question that is answered by summarizing data about a single variable.
Example: What is the annual turnover of employees in higher educational institutions of Pakistan?
2. Associational research question: A question that is answered by determining the strength and direction of the relationship between two or more variables.
Example: What is the impact of advertising on the sales of a new product in the market?


3. Difference research question: A question that is answered by comparing and contrasting two groups on the same variable.
Example: Does investing in the stock market yield a higher return on investment than investing in real estate?

Figure: Schematic diagram showing how the purpose and type of research question correspond to the general type of statistic used in a study.

d) Research Hypothesis
Research hypotheses are predictive statements about the relationship between two variables.

Types of Hypothesis
There are two types of hypothesis:
1) Null hypothesis: A statement that denies the existence of the predicted relationship or difference between two variables.
Example: H0 = There is no relationship between advertising and sales.
2) Alternative hypothesis: A statement that asserts the existence of the predicted relationship or difference between two variables.
Example: H1 = There is a relationship between advertising and sales.

Differences between Research Questions and Hypothesis
Research question          Hypothesis
Interrogative statement    Simple statement
Non-predictive             Predictive
Non-directional            Directional

e) Data
A set of raw facts and figures related to a specific problem is called data.
Example:
Age: 16, 18, 20, 21, 23
Nationality: Pakistani, Indian, American

Types of data
Data is divided on two bases:
1. Nature of data: By nature, data can be of two types.
   i. Quantitative data: Data that consist of numbers. For example, data about age consist of values like 16, 18, 20, 21, 23 (years).
   ii. Qualitative data: Data that consist of words rather than numbers. For example, data about nationality consist of values like Pakistani, Indian and American.
2. Time frame: By time frame, data can be of two types.
   i. Cross-sectional data: Data collected from different units at the same point in time.
   ii. Time series data: Data collected from the same units at different points in time.


Data matrix

A data matrix is a tabular arrangement of data in rows and columns. In this arrangement:
- Rows represent the cases
- Columns represent the variables
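The row-and-column idea can be illustrated with a small sketch in Python. The cases and the helper names (`column`, `print_matrix`) are hypothetical and only reuse the age and nationality values from the data example above; the course itself works in SPSS, not Python.

```python
# Hypothetical data matrix: each dict is one case (row),
# each key is one variable (column).
data_matrix = [
    {"id": 1, "age": 16, "nationality": "Pakistani"},
    {"id": 2, "age": 18, "nationality": "Indian"},
    {"id": 3, "age": 20, "nationality": "American"},
]


def column(matrix, variable):
    """Return the values of one variable across all cases (one column)."""
    return [case[variable] for case in matrix]


def print_matrix(matrix):
    """Print the matrix with variables as column headers and cases as rows."""
    variables = list(matrix[0].keys())
    print("\t".join(variables))
    for case in matrix:
        print("\t".join(str(case[v]) for v in variables))


n_cases = len(data_matrix)         # number of rows
n_variables = len(data_matrix[0])  # number of columns
ages = column(data_matrix, "age")

print_matrix(data_matrix)
```

Reading along a row gives everything known about one case; reading down a column gives all observed values of one variable, which is the form in which statistics are later computed.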

Survey
A survey is a quantitative research strategy that involves the structured collection of data from a pre-determined sample. It involves the following methods:
1. Questionnaire
2. Structured interview
3. Structured observation

Survey Design
1. Objectives of Survey
The first step of survey design is to define clearly why we are going to conduct the survey.
Example: The basic aim of this survey is to collect updated, accurate and relevant data in order to answer a research problem.
2. Survey Design:


After setting the objectives of the survey, we develop the plan (design) of the survey by deciding:
- Whom to survey (sample selection)
- Where to survey (site selection)
- How to survey (method)
- What to survey (questions for the required information)

3. Pilot Test
This is the process of checking the accuracy of the wording, the sequence of the questions and the respondents' ability to understand them, by conducting the survey with one or two respondents as a trial in order to refine the questionnaire.
4. Fieldwork / conducting the survey
This is the process of actually collecting data from the target sample. It can be done in the following ways:
- Self-administered survey
- Postal survey
- Online survey
5. Data Preparation
After completing your survey and getting to know the SPSS interface, the next step is to prepare the data for analysis. This process involves four steps:
   1. Coding the questionnaire
   2. Defining the variables in SPSS Variable View
   3. Entering the data in SPSS Data View
   4. Checking the data for errors
6. Data Analysis
This is the process of summarizing, organizing and transforming data with the goal of highlighting useful information and suggesting conclusions in order to support good decision making.


Data can be analyzed in two ways:
- Descriptive
- Inferential
7. Interpretation
Interpretation is the process of making sense of results by explaining them and assigning meaning to them.
8. Report Writing

References
- Morgan, G. A., Leech, N. L., Gloeckner, G. W., & Barrett, K. C. (2007). SPSS for Introductory Statistics: Use and Interpretation (3rd ed.). Mahwah, NJ: Lawrence Erlbaum Associates.
- Jarrett, D. (2007). Using SPSS (6th ed.). Middlesex University.
- Pallant, J. SPSS Survival Manual: A Step by Step Guide to Data Analysis using SPSS for Windows (3rd ed.). McGraw Hill Open University Press.


ACTIVITY


Class Activity Session 1


1. Write down three examples of each type of variable.

Types of variables on the basis of relationship
Independent variable:  1.          2.          3.
Dependent variable:    1.          2.          3.
Mediating variable:    1.          2.          3.
Moderating variable:   1.          2.          3.

Types of variables on the basis of data type
Categorical variable:  1.          2.          3.
Nominal variable:      1.          2.          3.
Ordinal variable:      1.          2.          3.
Numerical variable:    1.          2.          3.
Discrete variable:     1.          2.          3.
Continuous variable:   1.          2.          3.

2. Categorize the following variables according to their types: gender, marital status, nationality, qualification, motivation level, ethnicity, income, collar size, colors of cars.
3. For the given research problem faced by managers, answer the following queries.


SITUATION: The HR manager of ABC Company is facing a high rate of employee turnover, which is affecting organizational performance.
a. Develop a problem statement
b. Identify the variables and their types
c. Develop a research question
d. Develop a hypothesis
e. Decide on the design of the survey
f. Decide which type of data will be collected


INTRODUCTION TO SPSS AND DATA PREPARATION


A. Lesson Objective
After attending this session, students will be able to:
1. Understand what SPSS is
2. Run the SPSS software
3. Code qualitative data
4. Define variables using Variable View in SPSS
5. Enter data using Data View in SPSS

B. Lesson Outline
1) Introduction to SPSS
2) How to run SPSS
3) SPSS Interface
4) Data Preparation (Processing)
5) Data Analysis


Introduction to SPSS
SPSS stands for Statistical Package for the Social Sciences. It is software used mainly for the analysis of quantitative data.

1. How to open SPSS
2. SPSS Interface

SPSS has a user-friendly interface similar to MS Excel, with two sheets in a row-and-column format. It comprises:
1. Title bar (at the top, showing the title of the file)
2. Menu bar (below the title bar, showing the menu list)
3. Tool bar (showing different tools)
4. List of attributes of variables (header row)
5. Serial number (leftmost column)
6. Working area (cells made up of rows and columns)
7. Scroll bars (at the right and bottom edges)
8. View tabs (Variable View / Data View)

8.1 Variable View
Variable View is used to define the variables in terms of their attributes. Rows indicate variables; columns indicate attributes of the variables. You can add or delete variables and modify the attributes of variables, including the following:
8.1.1 Name of the variable (short, without spaces)
8.1.2 Type (Numeric, String, etc.)
8.1.3 Width (8, 10, etc.)
8.1.4 Decimals (2, 3, 5, etc. for continuous variables)
8.1.5 Label (full name of the variable)
8.1.6 Values (answer categories with codes)
8.1.7 Missing (blank, multiple, wrong answers)
8.1.8 Columns (6, 8, 10, etc.)
8.1.9 Align (left, right, centre)
8.1.10 Measure (nominal, ordinal, scale)

8.2 Data View


Data View is used to enter the data of each case (row-wise) against each variable (column-wise) according to the coding scheme, in the form of a data matrix.
- Rows are cases. Each row represents a case or an observation. For example, each individual respondent to a questionnaire is a case.
- Columns are variables. Each column represents a variable or characteristic that is being measured. For example, each item on a questionnaire is a variable.
- Cells contain values. Each cell contains a single value of a variable for a case. The cell is where the case and the variable intersect. Cells contain only data values.

3. Data Preparation (Processing)

After completing your survey (sample attached as Annexure 1) and getting to know the SPSS interface, the next step in the quantitative research process is to prepare the data for analysis. This process involves four steps:
3.1 Coding the questionnaire
3.2 Defining the variables in SPSS Variable View
3.3 Entering the data in SPSS Data View
3.4 Checking the data for errors

3.1 Coding the questionnaire
After assigning ID numbers to the completed questionnaires, the researcher should begin the coding process. Coding is the process of assigning numbers to the values or levels of each variable. Before starting, keep the following coding rules in mind to avoid coding mistakes.

Rules of Coding
- All data should be numeric (e.g. Male = 1 and Female = 2).


- Each variable must occupy the same column (one column per variable).
- All values (codes) for a variable must be mutually exclusive. Questions should be phrased so that a person would logically choose only one of the provided options, and all possible options should be provided. A final category of "other" may be provided where all possible options cannot be listed, but such categories are not very useful for statistical purposes.
- Each variable should be coded to give maximum information. Do not collapse categories or values when setting up the codes; code and enter the data in as detailed a form as is available. Thus enter actual test scores, GPAs, etc. as specifically as possible, and use categories only when exact values are not available.
- For each participant, there must be a code or value for each variable. These codes should be numbers, except for variables for which the data are missing. It is recommended to use blanks for missing data, as SPSS is designed to handle blanks as missing values. Alternatively, you can assign unusually high codes (e.g. 98 or 99) to blank, multiple or wrong answers. In that case you must tell SPSS (while defining the variables) that these codes stand for missing values; otherwise SPSS will treat them as actual data.
- Apply any coding rule consistently for all participants. For example, if you have decided to code male = 1 and female = 0, then this scheme must be used for all cases. You cannot use different coding schemes for different cases of the same variable.
- For a variable that is ordered, use high numbers (codes) for positive values (strongly agree = 5) and low numbers for negative values (strongly disagree = 1).
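The coding rules above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the codebooks, the helper names (`code_answer`, `mean_excluding_missing`) and the sample answers are hypothetical, and 99 is used as the missing-value code as suggested in the rules. It also shows why a missing-value code must be declared: left in the data, it is averaged as if it were a real answer.

```python
MISSING = 99  # hypothetical missing-value code; must be declared as missing

# Hypothetical codebooks: one numeric code per answer category.
GENDER_CODES = {"male": 1, "female": 2}
AGREEMENT_CODES = {
    "strongly disagree": 1,
    "disagree": 2,
    "undecided": 3,
    "agree": 4,
    "strongly agree": 5,  # high code for the most positive answer
}


def code_answer(answer, codebook):
    """Return the numeric code for an answer.

    Blank, multiple, or unlisted answers all receive the MISSING code.
    """
    if answer is None:
        return MISSING
    return codebook.get(answer.strip().lower(), MISSING)


def mean_excluding_missing(codes):
    """Average only the valid codes, ignoring the MISSING code."""
    valid = [c for c in codes if c != MISSING]
    return sum(valid) / len(valid) if valid else None


# Five hypothetical respondents: two usable answers, one blank,
# one multiple answer, one more usable answer.
raw = ["Agree", "Strongly agree", "", "agree / disagree", "Disagree"]
coded = [code_answer(a, AGREEMENT_CODES) for a in raw]

naive_mean = sum(coded) / len(coded)        # treats 99 as real data
valid_mean = mean_excluding_missing(coded)  # treats 99 as missing

print(coded)
print(naive_mean, valid_mean)
```

The naive mean lies far outside the 1 to 5 answer range, which is the distortion the rule about declaring missing-value codes is meant to prevent.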

3.2 Defining variables in SPSS Variable View
The next step is to define the variables in SPSS. For this purpose, create and save a blank SPSS data file into which you will enter the data. Click on the Variable View tab. You will find the following window.


In this window, the numbers in the leftmost column show the serial numbers 1, 2, 3, 4, ... (row-wise) and the header row shows the variable attributes (column-wise). Remember that each question is treated as a variable. Define each variable in terms of the following attributes by clicking in the blank boxes under them.

3.2.1 Name of the variable
Always keep variable names as short as possible and without spaces. For example, type recommen in the cell next to number 1 under Name and press Enter. The cursor will move to the next cell, which is Type. Note that each variable name must be unique (duplicates are not allowed) and the first character must be a letter or one of the characters @, #, or $.

3.2.2 Type
Enter the type of the variable, which can be:
- Numeric: A variable whose values are numbers. Values are displayed in standard numeric format. The Data Editor accepts numeric values in standard format or in scientific notation. The type can be specified further by selecting other variable types, including comma, dot, scientific notation, date, dollar or a custom currency.
- String: A variable whose values are not numeric and are therefore not used in calculations. The values can contain any characters up to the defined length. Uppercase and lowercase letters are considered distinct. This type is also known as an alphanumeric variable.
Preferably the numeric type should be used, by giving dummy codes (male = 1 and female = 0) to string variables.

3.2.3 Width
Width indicates the number of digits you can place in one value (code). It is recommended to set the width to 8 for better output.

3.2.4 Decimals
Decimals indicates the number of decimal places you need in a code or value. Preferably it should be no more than 2. It is mainly used for continuous variables.

3.2.5 Label


In the Label column, write the full name or phrase of the variable so that you can remember which question was given this variable name (recommen is labeled "I recommend course"). A label can be up to 40 characters including spaces, but it is recommended to keep it to 20 characters so that printouts of the results remain readable.

3.2.6 Values (answer categories with codes)
In the Values column, numeric codes are assigned to the answer categories (e.g. 5 = strongly agree). Click the cell showing None, then click the three-dot button. In the Value Labels window, enter a Value (5, 4, 3, 2, 1, etc.) and a Label (strongly agree, agree, undecided, disagree, strongly disagree), click Add each time, and finally click OK.

3.2.7 Missing
This column is used to assign codes for missing values. Missing values are defined to accommodate errors made by respondents in filling in questionnaires. Respondents can make three types of mistake:
- Blank answer: the respondent did not attempt a question or a series of questions.
- Multiple answer: the respondent marked two options rather than only one.
- Wrong answer: the respondent gave an answer of their own rather than marking one of the given options.
You can assign missing-value codes (large and distinctive values, e.g. 98, 99) by clicking the cell showing None in the Missing column, clicking the three-dot button and entering up to three values under the discrete missing values option. You can also assign a single global missing value for all types of error. Remember: if you do not define the missing values, SPSS will use those codes in the analysis as if they were normal values.

3.2.8 Columns
This option defines the width of the columns in Data View, so that they can accommodate the number of digits in the values of a variable. Preferably it should be 8, to accommodate the 8-digit numbers allowed by the Width option.

3.2.9 Align
The Align option defines the alignment (left, right, center) of the data in Data View. Numbers are preferably right-aligned in SPSS, so select Right in the Align dropdown box.

3.2.10 Measure
The Measure option defines the level of measurement of the variable. SPSS provides only three choices for level of measurement: nominal, ordinal or scale.


- Nominal: A variable can be treated as nominal if its categories are just different names and are not ordered (low to high). Label such variables nominal in the SPSS Variable View. (Remember that nominal variables with only two categories are called dichotomous but are marked Nominal in SPSS.)
- Ordinal: A variable can be treated as ordinal if its categories or values vary from low to high (i.e. are ordered) and there are only three or four such values (e.g. good, better, best, or strongly disagree, disagree, agree, strongly agree); we recommend that you label such a variable ordinal. Also, if there are five or more ordered levels or values of a variable and you suspect that the frequency distribution of the variable is substantially non-normal, label the variable ordinal.
- Scale: A variable can be treated as scale when its values represent ordered categories with a meaningful metric, so that distance comparisons between values are appropriate. Examples of scale variables include age in years and income in thousands of dollars. Furthermore, if a variable has five or more ordered categories or values and you have no reason to suspect that the distribution is non-normal, label the variable scale in the Measure column of the SPSS Variable View. If the variable is essentially continuous (i.e. measured to one or more decimal places, or the average of several items), it is likely to be at least approximately normally distributed, so call it scale. (Remember that SPSS marks both interval and ratio measures as Scale.)
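The rules of thumb above can be summarized as a small decision function. This is a rough sketch in Python, not an SPSS feature: the function name `suggest_measure` and its parameters are hypothetical, and the judgment about whether a distribution is "substantially non-normal" is left to the researcher and simply passed in as a flag.

```python
def suggest_measure(ordered, n_levels, suspect_non_normal=False, continuous=False):
    """Suggest an SPSS Measure setting from the rules of thumb in the text.

    ordered            -- True if the categories or values run from low to high
    n_levels           -- number of distinct categories or values
    suspect_non_normal -- researcher's judgment about the distribution
    continuous         -- True if measured to decimals or averaged over items
    """
    if continuous:
        return "scale"    # essentially continuous: treat as scale
    if n_levels == 2:
        return "nominal"  # dichotomous variables are marked Nominal
    if not ordered:
        return "nominal"  # categories are just different names
    if n_levels < 5:
        return "ordinal"  # only three or four ordered values
    # Five or more ordered levels: depends on the suspected distribution.
    return "ordinal" if suspect_non_normal else "scale"


print(suggest_measure(ordered=False, n_levels=4))   # e.g. religion
print(suggest_measure(ordered=True, n_levels=4))    # e.g. 4-point agreement
print(suggest_measure(ordered=True, n_levels=5))    # e.g. 5-point agreement
print(suggest_measure(ordered=True, n_levels=50, continuous=True))  # e.g. height
```

The function only encodes the guidance given here; borderline variables still call for a look at the actual frequency distribution.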

Table 1 Measurement levels
[Table 1 is not legible in the source; only its caption and the column label "Traditional Term" survive.]

Table 2 Characteristics and Examples of Our Four Levels of Measurement

                 Nominal           Dichotomous       Ordinal                    Normal (Scale)
Characteristics  3+ levels         2 levels          3+ levels                  5+ levels
                 Not ordered       Ordered or not    Ordered levels             Ordered levels
                 True categories                     Unequal intervals          Approximately normally
                 Names, labels                       between levels             distributed
                                                     Not normally distributed   Equal intervals
                                                                                between levels
Examples         Ethnicity         Gender            Competence scale           SAT math
                 Religion          Math grades       Mother's education         Math achievement
                 Curriculum type   (high vs. low)                               Height
                 Hair color

3.3 Data Entry in SPSS Data View
After defining all the variables one by one in the Variable View of SPSS, the next step is to enter the data in the Data View. Click on the Data View tab in SPSS and you will see the following window.

Here the numbers in the leftmost column show the case numbers (1, 2, 3, ...) row-wise, and the top row shows the variables defined in Variable View (recommend, work hard, college, etc.) column-wise. Click on the cell under recommend for case 1, enter the answer code from the completed questionnaire (e.g. 3) and press the right arrow key. Enter 5 under work hard, press the right arrow key, and continue entering the data codes until the last variable. The data of the first case for each variable are now entered. Repeat the same practice until the data of every case for every variable are entered. Enter the missing-value code (e.g. 99) wherever you find a blank, multiple or wrong answer by a respondent. The data file will look like the following.


4. Data Analysis
Data analysis is the process of organizing, summarizing, presenting and interpreting data and drawing conclusions from them, with the goal of highlighting useful information and supporting decision making. In quantitative research, data analysis is performed objectively using statistical techniques. Statistics is a branch of applied mathematics concerned with the collection and interpretation of quantitative data in order to draw conclusions and test (accept or reject) hypotheses. There are two levels/types of statistics:
1. Descriptive statistics
2. Inferential statistics
Descriptive statistics will be covered in the next class.

ACTIVITY


Class Activity Session 2


Exercise: Please code the following sample questionnaire, define variables, enter data in SPSS


DATA ANALYSIS
Descriptive Statistics


A. Lesson Objectives
After studying this session you will be able to:
1. Produce simple graphical and numerical summaries of data
2. Measure the location (average) of the data
3. Measure the dispersion (spread) of the data
4. Check the data normality
5. Use data transformation techniques
   5.1 Count, Reverse, Revise
   5.2 Compute a new variable

B. Lesson Outline
1. Descriptive statistics
   1.1 Summarizing Numerical Data
       1.1.1 Five Figure Summaries
       1.1.2 Frequency Distribution
             1.1.2.1 Tables
             1.1.2.2 Graphs
2. Measures of Central Tendency
   2.1 Mean
   2.2 Median
   2.3 Mode
3. Measures of Variability
   3.1 Standard Deviation
   3.2 Range
   3.3 Interquartile range
   3.4 Variance
4. Normality of data
   4.1 Skewness
   4.2 Kurtosis


Descriptive Statistics
Descriptive statistics are used to understand and describe the data. They answer descriptive research questions. They involve:
- Summarizing the data
- Measures of central tendency
- Measures of dispersion
- Checking data normality
- Data file management
- Recoding and transforming variables

1. Summarizing the data
A data matrix contains too much information to be taken in at a glance, which makes it difficult to understand and get a feel for the data. A set of data can be understood only if it is summarized in some appropriate way. The summarizing technique depends on whether the data are categorical or numerical. We will see how each type of data is summarized in turn.

1.1 Summarizing categorical data
A categorical variable is usually summarized by frequencies and their percentages. This is called a frequency distribution. It can be presented in two ways: as a table of frequencies and percentages, or as a graph. Let us look at frequency distributions in detail.

1.1.1 Frequency Distribution
A frequency distribution is a tally (IIII) or count of the number of times each score (category) of a single variable is marked by respondents. Frequencies can be summarized further by expressing them as percentages of the total, using the following formula:
Percentage = (frequency / total) × 100
Example
The frequency distribution of final grades in a class of 50 students might be 7 As, 20 Bs, 18 Cs and 5 Ds. Note that in this frequency distribution most students have Bs or Cs (grades in the middle) and similarly small numbers have As and Ds (high and low grades).
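As a worked illustration of the percentage formula, the grade example above can be tallied in Python. This is a minimal sketch: the list `grades` simply rebuilds the 50 grades from the example, and the helper name `frequency_distribution` is hypothetical.

```python
from collections import Counter

# Rebuild the 50 final grades from the example: 7 As, 20 Bs, 18 Cs, 5 Ds.
grades = ["A"] * 7 + ["B"] * 20 + ["C"] * 18 + ["D"] * 5


def frequency_distribution(values):
    """Return {category: (frequency, percentage)} for a categorical variable."""
    counts = Counter(values)        # tally of each category
    total = sum(counts.values())
    return {
        category: (freq, freq * 100 / total)  # percentage = frequency / total x 100
        for category, freq in sorted(counts.items())
    }


table = frequency_distribution(grades)
for category, (freq, pct) in table.items():
    print(f"{category}: {freq:>3}  {pct:5.1f}%")
```

For these data the percentages come out as 14% As, 40% Bs, 36% Cs and 10% Ds, which makes the concentration of grades in the middle categories easy to see.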


When there are a small number of scores for the low and high values and most scores are for the middle values, the distribution is said to be approximately normally distributed.

To get a frequency distribution table: Analyze => Descriptive Statistics => Frequencies => move religion to the variable box => OK (make sure that the Display frequency tables box is checked).

Fig.1. Frequency table for religion in hsbdata

                            Frequency   Percent   Valid Percent   Cumulative Percent
Valid     protestant            30        40.0        44.8              44.8
          catholic              23        30.7        34.3              79.1
          no religion           14        18.7        20.9             100.0
          Total                 67        89.3       100.0
Missing   other religion         4         5.3
          blank                  4         5.3
          Total                  8        10.7
Total                           75       100.0

Interpretation: In this example, the Frequency column shows the number of students who marked each type of religion (e.g., 30 said protestant and 4 left it blank). Notice that there is a total of 67 for the three responses considered Valid, a total of 8 for the two types of responses considered Missing, and an overall total of 75. The Percent column indicates that 40.0% are protestant, 30.7% are catholic, 18.7% are not religious, 5.3% had one of several other religions, and 5.3% left the question blank. The Valid Percent column excludes the eight missing cases and is often the column that you would use. Given this data set, it would be accurate to say that, of those not coded as missing, 44.8% were protestant, 34.3% were catholic and 20.9% were not religious.

Frequency distribution graphs
With nominal data, you should not use a graphic that connects adjacent categories, because there is no necessary ordering of the categories or levels. Thus, it is better to make a bar graph or chart of the frequency distribution of variables like religion, ethnic group, or other nominal variables; the points that happen to be adjacent in your frequency distribution are not necessarily adjacent.

Bar Charts
Bar graphs are usually used to display categorical (qualitative) data. The bars in a bar graph are usually separated, and the height of each bar shows the frequency of that category.
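The Percent, Valid Percent and Cumulative Percent columns of the religion frequency table (Fig.1) can be reproduced from the raw counts. The sketch below is an illustration only; the variable names are assumptions, and SPSS produces these columns itself.

```python
# Counts taken from Fig.1 (religion in hsbdata).
valid = {"protestant": 30, "catholic": 23, "no religion": 14}
missing = {"other religion": 4, "blank": 4}

n_valid = sum(valid.values())                # cases with a valid answer
n_total = n_valid + sum(missing.values())    # all cases, valid and missing

rows = []
cumulative = 0.0
for category, freq in valid.items():
    percent = 100 * freq / n_total           # share of all cases
    valid_percent = 100 * freq / n_valid     # share of non-missing cases
    cumulative += valid_percent              # running total of valid percent
    rows.append((category, freq, round(percent, 1),
                 round(valid_percent, 1), round(cumulative, 1)))

for row in rows:
    print(row)
```

Percent divides by all 75 cases, while Valid Percent divides by the 67 valid cases only, which is why the two columns differ.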

To get a bar chart, select Graphs => Legacy Dialogs => Interactive => Bar chart => move the variable to the box => OK.

Fig.2. Bar chart for the nominal variable religion

1.1-Summarizing Numerical Data
Simple numerical summaries of a numerical variable can be obtained through the following.

1.1.1. Five Figure Summary
The data can be summarized by quoting five figures once the data is sorted into ascending numerical order. Rank the values from 1 (the smallest value) to n (the largest value; n denotes the total number of observations). The five figures are:

1. Minimum value (Min): the smallest value, with rank 1
2. Maximum value (Max): the largest value, with rank n
3. Median (M/Q2): the middle value, with rank (n+1)/2

The median divides the data into two halves, each with the same number of observations. Each of these halves may, in turn, be divided into two by quartiles, so that the data is split into four quarters. These are known as:

4. Lower quartile (Q1): the middle value of the first half
5. Upper quartile (Q3): the middle value of the second half
[Figure: sorted data split at the median into a lower half and an upper half, marking the minimum, lower quartile, median, upper quartile and maximum]

Example 1: Department A absenteeism data. Consider the absenteeism data for a department in an organization.

Department A: 20 employees

Step 1 - Ascending order
0 0 0 0 1 1 1 2 2 2 3 3 3 5 5 5 8 10 15 95

Step 2 - Ranking
Rank    1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20
Value   0  0  0  0  1  1  1  2  2  2  3  3  3  5  5  5  8 10 15 95

Step 3 - Deriving summary elements
The minimum value, at rank 1, is 0. The maximum value, at rank 20 (n), is 95.


The median is at rank (n+1)/2 = (20+1)/2 = 21/2 = 10.5. Since the 10th value is 2 and the 11th value is 3, Median = (2+3)/2 = 2.5

The lower half consists of the values ranked from 1 to 10. The middle rank is therefore (1+10)/2 = 5.5. The 5th value is 1 and the 6th value is also 1, so Lower quartile = (1+1)/2 = 1

Similarly, the upper half consists of the values ranked from 11 to 20. The middle of these ranks is (11+20)/2 = 31/2 = 15.5. The 15th and 16th values are both 5, so Upper quartile = (5+5)/2 = 5

Table 3: The five-figure summary

Summary           Value
Minimum           0
Lower quartile    1
Median            2.5
Upper quartile    5
Maximum           95
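The three steps of Example 1 (sort, rank, take middle values) can be sketched in Python as follows. The function names are illustrative. The handling of an odd number of observations, where the median itself is left out of both halves, is an assumption, since the notes only work through an even-sized data set.

```python
def middle_value(sorted_values):
    """Value at rank (n + 1) / 2; averages the two neighbours when n is even."""
    n = len(sorted_values)
    mid = n // 2
    if n % 2 == 1:
        return sorted_values[mid]
    return (sorted_values[mid - 1] + sorted_values[mid]) / 2


def five_figure_summary(values):
    """Return (Min, Q1, Median, Q3, Max) using the median-of-halves method."""
    data = sorted(values)                 # Step 1: ascending order
    n = len(data)
    half = n // 2
    lower_half = data[:half]              # ranks 1 .. n/2
    upper_half = data[n - half:]          # assumption: median excluded if n is odd
    return (data[0], middle_value(lower_half), middle_value(data),
            middle_value(upper_half), data[-1])


# Department A absenteeism data from Example 1.
department_a = [0, 0, 0, 0, 1, 1, 1, 2, 2, 2,
                3, 3, 3, 5, 5, 5, 8, 10, 15, 95]
summary = five_figure_summary(department_a)
print(summary)
```

Statistical packages use several slightly different quartile definitions, so software output can differ a little from this hand method on other data sets.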

Exercise: Summarize the following absenteeism data of Department B using a five-figure summary.

Department B: 30 employees
2 2 2 2 2 2 3 8 8 8 8 10 10 12 15 3 3 4 4 5 5 7 8

1.1.1.Boxplot
A boxplot is a quick method of summarizing and graphically representing ordinal and scale data for examining one or more sets of data. It is also called a box and whisker plot. It is useful to:
- Summarize the data by getting the five-figure summary
- Check the data for errors
- Examine and compare frequency distributions
- Check assumptions for inferential statistics (check normality of data)

Boxplot for one set of data
Graphs => Boxplot => in the Boxplot window select Simple and Summaries of separate variables => click Define => select the variable and move it into the Boxes Represent box => click OK

Case Processing Summary

                                        Cases
                           Valid           Missing          Total
                           N   Percent     N   Percent      N   Percent
math achievement test      75  100.0%      0   .0%          75  100.0%

[Figure: boxplot of math achievement test, annotated from top to bottom: upper whisker = Max, upper quartile = Q3, median = Q2, lower quartile = Q1, lower whisker = Min]

Interpretation
The case processing summary table shows the valid N = 75, with no missing values, for the total sample of 75 for the variable math achievement. The plot shows a boxplot for math achievement. The box represents the middle 50% of the cases; the line inside it is the median (M = 13), the lower end of the box shows the lower quartile (Q1 = 7.67), and the upper end of the box shows the upper quartile (Q3 = 17.00). The whiskers indicate the expected range (25.33) of scores from the minimum (Min = -1.67) to the maximum (Max = 23.67). Scores outside of this range are considered unusually high or low; such scores are called outliers. There are no outliers in this case.

Boxplot for two sets of data

To draw a boxplot for two or more data sets, click on Graphs => Legacy Dialogs => Interactive => Boxplot => move gender to the x-axis and move SAT math to the y-axis => OK

Fig.5. Box and whisker plot for ordinal or normal data

Interpretation
Fig. 5 shows two boxplots, one for males and one for females. The box represents the middle 50% of the cases (i.e. those between the 25th and 75th percentiles). The whiskers indicate the expected range of scores. Scores outside of this range are considered unusually high or low. Such scores, called outliers, are shown above or below the whiskers with circles or asterisks (for very extreme scores) and the SPSS data view line number for that participant. Note that there are no outliers for the 34 males, but there is a low (#6) and a high (#54) female outlier. (Note that this number will not be the participant's ID unless you specify that SPSS should report this by ID number or the ID numbers correspond exactly to the line numbers.)

Histograms
A histogram is a form of bar graph used with a numerical (scale) variable, preferably of continuous nature. The intervals are shown on the X-axis and the number of scores in each interval is represented by the height of a rectangle located above the interval. Unlike the bar graph, in a histogram there is no space between the bars. The data is continuous, so the lower limit of any one interval is also the upper limit of the previous interval. It is useful to:
- Summarize the data
- Examine and compare frequency distributions
- Check normality of data

To draw a histogram, select Graphs => Legacy Dialogs => Interactive => Histogram => move the variable to the box => OK

Fig.3. Histogram of SAT- math score


Interpretation
In Fig. 3 the frequencies (number of students) shown by the bars are for a range of points (in this case SPSS selected a range of 50: 250-299, 300-349, 350-399, etc). Notice that the largest number of students (about 20) had scores in the middle two bars of the range (450-499 and 500-549). Similarly small numbers of students have very low and very high scores. The bars in the histogram form a distribution (pattern or curve) that is similar to the normal, bell-shaped curve. Thus, the frequency distribution of the SAT math scores is said to be approximately normal.

Fig. 4 shows the frequency distribution for the competence scale. Notice that the bars form a pattern very different from the normal curve line. This distribution can be said to be not normally distributed. As we see later in the chapter, the distribution is negatively skewed. That is, the extreme scores, or the tail of the curve, are on the low end or left side. Note how much this differs from the SAT math score frequency distribution. As you will see in the Levels of Measurement section, we call the competence scale variable ordinal.

1.1-Scatter plot
A scatter plot is a plot or graph of two variables that shows how each individual's score on one variable is associated with his or her score on the other variable. Each dot or circle on the plot represents a particular individual's score on the two variables, with one variable represented on the X axis and the other on the Y axis. The measurement for both variables is continuous (measurement data). It is useful to:
- Gain insight into the relationship between two scale variables
- Check the assumption of linearity for correlation and regression statistics
- Locate the outliers that are far from the regression line

To draw a scatter plot, click on Graphs => Legacy Dialogs => Interactive => Scatterplot => move scholastic aptitude to the x-axis and move competence scale to the y-axis => OK

Interpretation
The output shows a scatter plot for two scale variables, i.e. scholastic aptitude test and competence scale. The overall pattern of the dots follows an upward diagonal straight regression line, showing a positive association between the two variables. The points fit the line pretty well (r² = 0.08) and very few values are dispersed far from the regression line, so it seems that there is a strong relationship between scholastic aptitude test and competence scale.

2. Measures of Central Tendency (Average/ Location)

Central tendency of a data set refers to a measure of the "middle, central or average" value of the data set, that is, the single value that can represent the whole data set. It is also called a measure of location. It includes the following.

Mean is the arithmetic average of numerical data. It is an appropriate measure of central tendency when there is less fluctuation in the data and the values are more consistent, with no outliers. It is the most common measure of central tendency. It is calculated by dividing the sum of the values ($\sum X$) by the number of values (n). Its formula is

$\bar{X} = \sum X / n$

Median is the middle value of the numerical data. It is an appropriate measure of central tendency for ordinal data, or when the raw data is less consistent, with more fluctuations and outliers. It is the midpoint of a distribution, such that the same number of scores are above the median as below it. It can be calculated by:
- Arranging the data in ascending order
- Ranking them
- Locating the value at the middle rank using the formula

Median = the value at rank (n+1)/2

Mode is the most common category in the data. It is a measure of central tendency for any kind of data, but it is most appropriate for categorical data, preferably of qualitative nature. It generally provides the least precise information about central tendency for ordinal or scale data. Remember that sometimes data can have more than one mode.

Mode = the most frequent value

The mean, median and mode have the same value if the data is normally distributed (symmetrical), but they take different values if the data is skewed. The suitability of the measures of central tendency is given in the table below.

To find the mean, median, and mode, click on Analyze => Descriptive Statistics => Frequencies => move scholastic aptitude to the variables box => click on the Statistics tab => check Mean, Median, Mode => click OK

Fig.6. Mean, Median, Mode of SAT math score

N         Valid      75
          Missing    0
Mean                 490.53
Median               490.00
Mode                 500
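To illustrate how an outlier pulls the mean away from the median, the three measures can be computed for the Department A absenteeism data from Example 1. This is a small sketch using Python's standard `statistics` module, not the SPSS procedure; the variable names are illustrative.

```python
import statistics

# Department A absenteeism data (sorted), from Example 1.
department_a = [0, 0, 0, 0, 1, 1, 1, 2, 2, 2,
                3, 3, 3, 5, 5, 5, 8, 10, 15, 95]

mean = statistics.mean(department_a)        # sum of values / number of values
median = statistics.median(department_a)    # value at rank (n + 1) / 2
modes = statistics.multimode(department_a)  # list, since data can have several modes

print(f"mean = {mean:.2f}, median = {median}, mode(s) = {modes}")
```

The single value of 95 lifts the mean well above the median, which is why the median is the better summary for this data set.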

3. Measures of Variability
A measure of variability is a quantitative measure of the degree of variation or dispersion of the values of a variable in a data set. It provides information about the degree to which individual scores are clustered about or deviate from the average value in a distribution. A measure of statistical dispersion is zero if all the data are identical, and increases as the data becomes more diverse. It cannot be less than zero.

Standard deviation
Standard deviation is the most commonly used measure of variability. It is, roughly, the average distance of the values from the mean of the data and thus shows how much variation there is in the data from the "average" (mean). The formula for standard deviation is as follows

$S = \sqrt{\dfrac{\sum (x - \bar{x})^2}{n - 1}}$

It can be calculated using the following steps.

Example: Suppose we wish to find the standard deviation of the data set consisting of the values 3, 7, 7, and 19.

Step 1: find the arithmetic mean (average) of 3, 7, 7, and 19:
$\bar{x} = (3 + 7 + 7 + 19)/4 = 36/4 = 9$

Step 2: find the deviation of each number from the mean, by subtracting the mean from each value ($x - \bar{x}$):
3 - 9 = -6, 7 - 9 = -2, 7 - 9 = -2, 19 - 9 = 10

Step 3: square each of the deviations to obtain $(x - \bar{x})^2$, which amplifies large deviations and makes negative values positive:
36, 4, 4, 100

Step 4: find the average of those squared deviations by adding them up and dividing by n-1 (to get the variance $s^2$):
$s^2 = (36 + 4 + 4 + 100)/(4 - 1) = 144/3 = 48$

Step 5: take the non-negative square root of the quotient (converting squared units back to regular units):
$S = \sqrt{48} = 6.93$

So, the standard deviation of the set is 6.93
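The five steps can be mirrored line by line in a short Python sketch. The variable names are illustrative; the result can be checked against `statistics.stdev`, which also divides by n - 1.

```python
import math

values = [3, 7, 7, 19]
n = len(values)

mean = sum(values) / n                        # Step 1: arithmetic mean
deviations = [x - mean for x in values]       # Step 2: deviation from the mean
squared = [d ** 2 for d in deviations]        # Step 3: squared deviations
variance = sum(squared) / (n - 1)             # Step 4: sample variance (divide by n - 1)
std_dev = math.sqrt(variance)                 # Step 5: back to the original units

print(f"mean = {mean}, variance = {variance}, standard deviation = {std_dev:.2f}")
```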

Interpretation of standard deviation


Standard deviation is calculated in order to measure the dispersion of the data around its mean ($\bar{x}$ = 9). The standard deviation (s = 6.93) shows that the average distance of the values from the mean is about 6.93, so most of the values fall within the range 9 ± 6.93 ($\bar{x} \pm s$), that is from 2.07 to 15.93.

A zero standard deviation means that the data values are all clustered at one point, i.e. the mean. A low standard deviation indicates that the data points tend to be very close to the mean, whereas a high standard deviation indicates that the data are spread out over a large range of values. For data with a symmetric and approximately normal distribution it can be shown that:
- About two-thirds of the data will lie within one standard deviation on either side of the mean, that is within ($\bar{x} \pm S$)
- About 95% of the data will lie within two standard deviations on either side of the mean, that is within ($\bar{x} \pm 2S$)
- Nearly all the data will lie within three standard deviations on either side of the mean, that is within ($\bar{x} \pm 3S$)

These facts help you interpret the standard deviation for an approximately normal variable. Remember that when the distribution is skewed the standard deviation may be a less helpful measure of spread, as its value can be strongly affected by outliers.

Other measures of Variability


Besides standard deviation there are also some other measures of variability that are as follows

Range - The range is the difference between the highest and lowest score in a distribution. It is the simplest measure of variability to compute and understand, but it is not often used as the sole measure of variability due to its instability. Because it is based solely on the most extreme scores in the distribution and does not fully reflect the pattern of variation within a distribution, the range is a very limited measure of variability.

Range = Max - Min

Interquartile Range (IQR) - The interquartile range is the range of the middle 50% of a distribution. Because any outliers in our distribution must be on the ends of the distribution, the range as a measure of dispersion can be strongly influenced by outliers. One solution to this problem is to eliminate the ends of the distribution and measure the range of scores in the middle. Thus, with the interquartile range we eliminate the bottom 25% and top 25% of the distribution, and then measure the distance between the extremes of the middle 50% of the distribution that remains.

IQR = Q3 - Q1

Variance - The variance is a measure based on the deviations of individual scores from the mean. As noted in the definition of the mean, however, simply summing the deviations will result in a value of 0. To get around this problem the variance is based on squared deviations of scores about the mean. When the deviations are squared, the rank order and relative distance of scores in the distribution are preserved while negative values are eliminated. Then, to control for the number of subjects in the distribution, the sum of the squared deviations, $\sum (X - \bar{X})^2$, is divided by N (population) or by N - 1 (sample). The result is the average of the squared deviations and it is called the variance.
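A short sketch of these three measures, using data already seen in this session. The quartiles for Department A are taken from Table 3 rather than recomputed, and the variable names are illustrative.

```python
import statistics

# Department A absenteeism data from Example 1.
department_a = [0, 0, 0, 0, 1, 1, 1, 2, 2, 2,
                3, 3, 3, 5, 5, 5, 8, 10, 15, 95]

data_range = max(department_a) - min(department_a)   # Range = Max - Min
q1, q3 = 1, 5                                        # quartiles quoted from Table 3
iqr = q3 - q1                                        # IQR = Q3 - Q1

# Variance of the standard deviation example (3, 7, 7, 19).
sample = [3, 7, 7, 19]
population_variance = statistics.pvariance(sample)   # divides by N
sample_variance = statistics.variance(sample)        # divides by N - 1

print(f"range = {data_range}, IQR = {iqr}")
print(f"variance: population = {population_variance}, sample = {sample_variance}")
```

The range of 95 is driven entirely by one extreme value, while the IQR of 4 describes the spread of the middle half of the department.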

To get the measures of variability

Analyze => Descriptive Statistics => Descriptives => move SATmath => Options => check Std. Deviation, Variance, Range, IQR => Continue => OK

Descriptive Statistics for the Scholastic Aptitude Test math (SATM)

Descriptive Statistics
                                  N     Range   Std. Deviation   Variance
scholastic aptitude test math     75    480     94.553           8.940E3
Valid N (listwise)                75


Table 4 Selection of Appropriate Descriptive Statistics and Plots

4. Checking assumptions for parametric tests
Every inferential statistical test has assumptions. These statistical assumptions explain when it is and isn't reasonable to perform a specific statistical test. If these assumptions are not met, the value that SPSS calculates, which tells the researcher whether or not the results are statistically significant, will not be completely accurate and may even lead the researcher to draw the wrong conclusions about the results. Assumptions are checked for parametric tests as well as non-parametric tests. These involve:
- Assumption of large sample size (non-parametric tests, e.g. Chi-square)
- Normality of the data (parametric tests, e.g. correlation and regression)
- Linearity of the data (parametric tests, e.g. correlation and regression)

Here we will discuss the normal curve. The others will be discussed while studying the corresponding tests.

The Normal Curve

The frequency distributions of many of the variables used in the behavioral sciences are distributed approximately as a normal curve when N is large. Examples of such variables that approximately fit a normal curve are height, weight, intelligence, and many personality variables. Notice that for each of these examples, most people would fall toward the middle of the curve, with fewer people at the extremes. If the average height of men in the United States were 5'10", then this height would be in the middle of the curve. The heights of men who are taller than 5'10" would be to the right of the middle on the curve, and those of men who are shorter than 5'10" would be to the left of the middle on the curve, with only a few men 7' or 5' tall.

3.2 Properties of Normal Curve
1. The mean, median and mode are equal.
2. It has one hump and this hump is in the middle of the distribution.
3. The curve is symmetric. If you fold the normal curve in half, the right side would fit perfectly with the left side; that is, it is not skewed.
4. The range is infinite.
5. The curve is neither too peaked nor too flat and its tails are neither too short nor too long.

3.2 How to check the normality
Normality of data can be checked by using

1. Histograms
a. Draw a histogram for the data
b. Double click on the histogram in the output window to get into the chart editor window
c. Click on the normal curve button in the tool bar and check the shape of the curve
d. If it fulfils the characteristics mentioned above and the shape of the curve is like the normal curve, then the data is normal; otherwise it is non-normal

2. Boxplots
Boxplots can be useful for identifying variables with extreme scores, which can make the distribution skewed (non-normal). Also, if there are few outliers, if the whiskers are approximately the same length, and if the line in the box is approximately in the middle of the box, then we can assume that the variable is approximately normally distributed. Thus, math achievement is near normal, motivation is approximately normal, but competence is quite skewed in the HSB data file.

4.4 Non normally shaped Distributions
If the data is not normally distributed then it can have

1. Skewness
If one tail of the frequency distribution is longer than the other, and if the mean and median are different, the curve is skewed. A perfectly normal curve has a skewness of zero (0.0). If the longer tail is on the left, the distribution is called negatively skewed, and if the longer tail is on the right, it is called positively skewed. If the value of skewness lies between -1 and +1, the data is considered approximately normal.

2. Kurtosis
If a frequency distribution is more peaked than the normal curve, it is said to have positive kurtosis and is called leptokurtic. Conversely, if a frequency distribution is relatively flat, it has negative kurtosis and is called platykurtic.

Both skewness and kurtosis can be measured using the Frequencies command in the Analyze menu. Skewness is necessary to measure, but kurtosis has less effect on the results of the test.
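The notes do not give formulas for skewness and kurtosis, so the sketch below assumes the simple moment-based definitions: skewness = m3 / m2^1.5 and excess kurtosis = m4 / m2^2 - 3, where mk is the k-th central moment. SPSS reports sample-adjusted versions of these statistics, so its values will differ somewhat, especially for small samples. The function names are illustrative.

```python
def central_moment(values, k):
    """k-th central moment: average of (x - mean) ** k over all values."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** k for x in values) / n


def skewness(values):
    """Moment-based skewness; 0 for a perfectly symmetric distribution."""
    return central_moment(values, 3) / central_moment(values, 2) ** 1.5


def excess_kurtosis(values):
    """Moment-based kurtosis minus 3, so that a normal curve scores 0."""
    return central_moment(values, 4) / central_moment(values, 2) ** 2 - 3


symmetric = [1, 2, 3, 4, 5]
department_a = [0, 0, 0, 0, 1, 1, 1, 2, 2, 2,
                3, 3, 3, 5, 5, 5, 8, 10, 15, 95]

print(f"symmetric data: skewness = {skewness(symmetric):.2f}")
print(f"Department A:   skewness = {skewness(department_a):.2f}")
```

The symmetric data has zero skewness, while the Department A absenteeism data is strongly positively skewed because of its long right tail.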


Class Activity Session 3


Knowledge test
1. What is organization of data?
2. Define Classification
3. Give 4 bases of classification of data
4. Define Tabulation
5. Note three desirable characteristics of a good statistical table
6. Define Frequency Distribution
7. Define Class Limits
8. Define Class Boundaries
9. Define Class Mid Points
10. Define Class Frequency

Skill Test: The following are the numbers of vehicles available to different branches of a multinational bank. Make a frequency distribution taking class interval size 1

2,4,6,1,3,3,5,7,8,6,4,7,6,4,4,2,1,5,0,1,5,9,9,10,3,6,4,2,5,7,9,6,1,2,10,4,8,9,2,3,1,0,4,10,1,1,2,2,2,3,4,4,4,6,6,5,5,5,4,5,8,4,3,3,2,1,8,6,9,10

1. Make a frequency distribution taking class interval size 2
2. Calculate the location of this data (mean, median and mode)
3. Calculate the dispersion of this data using range, inter-quartile range, upper-quartile range

Q4. The class mid points in a frequency distribution of the ages of a group of persons are: 25, 32, 39, 46, 53 and 60. Find
a) The size of the class interval
b) The class boundaries


Q5. A computer company received a rush order for as many home computers as could be shipped during a 6 week period. Company record provides the following daily shipments:

22 77 79 83 65 50

65 73 60 33 75 66

65 30 63 41 55 65

67 62 45 49 75 59

55 54 51 28 39 25

50 48 68 55 87 35

65 65 79 61 45 33

Group these daily shipment figures into a frequency distribution having the suitable number of classes.

Q6. In degree colleges of a city no teacher is less than 30 years or more than 60 years in age.

Less than          60    55    50    45    40    35    30    25
Total Frequency    980   925   810   675   535   380   220   75

Find the frequencies in the class intervals 25-30, 30-35, . . . .

Q7. Make a frequency distribution taking the classes as 1.19-1.23, 1.24-1.28, etc. from the following data

1.35,1.46,1.64,1.50,1.32,1.45,1.24,1.49,1.47,1.59,1.41,1.48,1.36,1.48,1.51,1.45,1.26,1.38,1.76,1.63,1.19,1.56,1.65,1.54,1.61,1.73,1.60,1.50,1.45,1.76,1.67,1.35,1.55,1.68,1.46,1.40,1.32,1.47,1.64,1.45.

Also make the class boundaries

Q8. Tabulate the following marks in a frequency distribution taking 10 as the class interval and 45 as the lowest limit.

109,74,49,103,95,90,118,52,88,101,96,72,56,64,110,97,59,52,96,82,65,85,105,116,91,83,99,52,76,84,89,77,104.

Q9. The following figures relate to the bonus paid to 40 factory workers

Bonus (Rs.)
76,70,54,70,104,58,88,94,89,57,86,62,58,73,103,90,84,90,88,59,84,63,65,72,101,56,87,92,60,87,83,69,57,71,102,57,83,93,61,86.

i. Prepare a frequency distribution taking the class width as 7, by the inclusive method
ii. Prepare another frequency distribution taking the class width as 10, by the exclusive method.

Q10. In an experiment measuring the percent shrinkage on dyeing, 40 plastic clay test specimens gave the following results:

19.3 17.1 18.4 19.4 21.8 17.5 19.5 13.9 16.3 18.8
17.4 20.5 18.8 17.8 18.5 18.6 17.5 17.3 16.8 18.2
16.9 19.1 15.8 22.3 16.1 16.5 18.5 17.5 19.0 19.5
16.9 20.4 23.4 17.4 18.2 20.5 17.9 18.7 14.9 18.8

Group these values into a frequency distribution taking 1.00 as the size of the class interval, e.g. 13.5-14.4, 14.5-15.4 etc., and determine the class boundaries.


Checking Data Reliability & Data Transformation

Descriptive Statistics


Session Objectives
After attending this session the students will be able to:
- Check the quality (goodness) of data
- Perform factor analysis
- Transform data and create new variables

Session outline
1. Quality of data
   1.1 Reliability and Validity
   1.2 Factor analysis
2. Data File Management
   2.1 Count the Data
   2.2 Recode Variables
   2.3 Compute a new variable


Goodness of data
Reliability and Validity Factor Analysis

Data File Management


Data file management involves different methods for transforming data into the form needed to answer the research questions. Data file management can be quite time consuming, especially if you have a lot of questions/items that you combine to compute the summated or composite variables that you want to use in later analysis. You will learn three useful data transformation techniques: Count, Recode, and Compute a new variable that is the average of several initial variables. From these operations we will produce new variables.

Problem 5.1: Count Math Courses Taken
Sometimes you want to know how many items the participants have taken, bought, done, agreed with and so forth. For this purpose you can use the Count option in the Transform menu.

Example: How many math courses (algebra1, algebra2, geometry, trigonometry and calculus) did each of the 75 participants take in high school? Label your new variable.

There are five different math courses, scored taken = 1 and not taken = 0, and we want to count how many courses were taken by each respondent. For this:
1. Go to the Transform menu
2. Select the Count Values within Cases option to get the Count window
3. Type mathcrt in Target Variable. This is the SPSS name for your new variable
4. Type math courses taken in the Target Label box
5. Select all the math courses and move them over to the Numeric Variables box. Your Count window should look like window 1
6. Click on Define Values to get window 2
7. Type 1 (the code for math course taken) in the Value box, click on Add and then Continue
8. Click on OK
9. Check in variable view that a new variable mathcrt is there, and in data view that it has been added along with its data values

Problem 5.2: Recode and Relabel
Recode is the command used for adding new and improved variables to the file by
1. Revising variables that have a large number of answer categories with low frequencies in each category, so that the group sizes will be large enough to perform statistical analysis
2. Reversing the categories of a negatively worded question to make it positive, in order to compute a new variable

First we will learn to use the Recode option to revise father's and mother's education in the HSB data file so that those with no postsecondary education have a value of 1, those with some postsecondary education have a value of 2, and those with a bachelor's degree or more have a value of 3. Label the new variables and values.
- Click on Transform => Recode => Into Different Variables and you should get Fig. 5.4.
- Click on mother's education and then the arrow button. Click on father's education and the arrow to move them to the Numeric Variable => Output Variable box.
- Highlight faed in the Numeric Variable box so that it turns blue.
- Click on the Output Variable Name box and type faedr.
- Click on the Label box and type father's education revised.
- Click on Change. You should now see faed => faedr in the Numeric Variable => Output Variable box.
- Repeat these procedures with maed in the Numeric Variable => Output Variable box. Highlight maed. Click on Output Variable Name and type maedr. Click Label and type mother's education revised. Click Change.
- Click on Old and New Values to get the Old and New Values window.
- Click on Range and type 2 in the first box and 3 in the second box.
- Click on Value (part of New Value on the right) and type 1. Then click on Add.
- Repeat these steps to change old values 4 through 7 to a new value of 2, and then Range: 8 through 10 to Value: 3.
- If the window shows these three recodes, click on Continue.
- Finally, click on OK.

Check in variable view and data view that two new variables with the names faedr and maedr have been added. Define the new variables' attributes in variable view as per the variable definition procedure.

Now we will learn to use the Recode option to reverse the pleasure items (item06 and item10) in the HSB data file so that these negatively worded items are reversed. Label the new variables and values. Follow these steps:

- Click on Transform => Recode => Into Different Variables.
- Click on Reset to clear the window of old information as a precaution.
- Select item06 and item10 and click on the arrow button.
- Highlight item06 so that it turns blue.
- Click on Output Variable Name and type item06r.
- Click on Label and type item06 reversed. Then click on Change.
- Highlight item10 so that it turns blue.
- Click on Output Variable Name and type item10r.
- Click on Label and type item10 reversed. Click on Change.
- Click on Old and New Values.
- Click on the Value box (under Old Value) and type 4.
- Click on the Value box for the New Value and type 1.
- Click Add to tell the computer to change values of 4 to 1.
- Repeat the last three steps to recode the values 3 to 2, 2 to 3, and 1 to 4.
- Click on Continue and then OK.

Check in variable view and data view that two new variables with the names item06r and item10r have been added.

Compute Variables
The Compute option in the Transform menu is used to compute one variable from a number of variables derived from a questionnaire (since we usually ask a number of questions to measure one variable).

Compute Pleasure Scale Score
Here we will learn how to compute the average pleasure scale from item02, item06r, item10r and item14. Name the new computed variable pleasure and label its highest and lowest values.
- Click on Transform => Compute.
- In the Target Variable box, type pleasure.
- Click on Type & Label and give it the label pleasure scale.
- Click on Continue to return to the Compute window.
- In the Numeric Expression box type (item02+item06r+item10r+item14)/4 and be sure that what you typed is exactly like this.
- Finally, click on OK.

Now provide Value Labels for the pleasure scale using commands similar to those you used for father's education revised. Type 1, then very low, and click Add. Type 4, then very high, and click Add.

Check your data file to see if pleasure scale has been added as a new variable in both variable and data views.

Class Activity Session 4


GRAPHIC PRESENTATION OF DATA
EXERCISE 2

SHORT QUESTIONS:

1. Define Diagram
2. Define Graph
3. Define Pictogram
4. Define Simple Bar Chart
5. When do you prefer to draw a diagram?
6. When do you prefer to draw a graph?
7. Define Pie Chart
8. Define Histogram
9. Define Historigram
10. Differentiate between Histogram and Historigram

NUMERICAL QUESTIONS:

Q1. Draw a simple bar chart to represent the following sets of data.

a) The following table shows disability in a sample population:

Type of Disability     No. of Persons
Blind                  13
Deaf & Dumb            26
Crippled               41
Other Handicapped      33

b) The top 5 car dealers of Lahore, ranked by the number of cars sold in the last month, are listed below:

Car Dealers            Cars Sold
Bari Motors            30
Siddiqui Motors        24
Atlantic Motors        21
Ravi Motors            18
Drive Line             15

Q2. The following figures give the annual average prices of beef and mutton in Pakistan (price in Rs. per kg):

Year      Beef    Mutton
1991      36      60
1992      40      64
1993      45      80
1994      50      90
1995      55      100
1996      64      120

Show the prices of beef and mutton by a multiple bar diagram.

Q3. Given the population of four cities, represent this information by multiple bar diagrams (population in 10,000):

City      1951    1961    1971
A         94      126     196
B         87      95      144
C         42      54      69
D         30      42      52

Q4. The table below shows the quantity in hundred tons of three commodities A, B, and C produced by a certain firm during the year 1981 to 1986:

Year      A       B       C
1991      18      85      52
1992      24      76      60
1993      28      80      62
1994      31      95      74

a) Construct a component bar chart to illustrate this data.
b) For each year, express the figure for each commodity as a percentage of the annual total and hence construct a percentage bar chart.

Q5. The distribution of students in a particular department of the Punjab University during 1985-1990 is given below:

Year      Male    Female   Total
1985      140     30       170
1986      120     60       180
1987      130     70       200
1988      164     51       215
1989      102     88       190
1990      105     90       195

a) Draw a component bar chart.
b) Draw a percentage component bar chart.

Q6. Draw a percentage sub-divided rectangle diagram for the following data:

Item          Family A    Family B
Food          240         350
Clothing      120         130
Housing       140         200
Fuel          80          100
Education     100         120
Misc.         120         100
Total         800         1000

Q7. Draw a pie diagram for the following data:

Items           Expenditures (Rs.)
Food            95
Clothing        32
Rent            50
Medical Care    23
Others          40

Q8. In a certain office establishment, 200 employees were asked to express their opinion on how they feel the office chief is performing his duties. The responses are classified as follows:

Opinion                No. of Employees
Disapprove Strongly    94
Disapprove             52
Approve                43
Approve Strongly       11

Draw a pie chart for the data.

Q9. Compare the budgets of two families by pie charts:

Items         Family A    Family B
Food          48          180
Clothing      8           42
House Rent    8           48
Education     6           18
Light         6           30
Misc.         4           42

Q10. Draw (i) a histogram, (ii) a frequency curve and (iii) a frequency polygon on the same graph for the following distribution:

Daily Wages (Rs.)    No. of Workers
10 - 20              2
20 - 30              5
30 - 40              10
40 - 50              15
50 - 60              18
60 - 70              12
70 - 80              7
80 - 90              5
90 - 100             1

Q11. Draw a histogram, a frequency curve and a frequency polygon on separate graphs for the following frequency distribution:

Mid Values (x)    Frequency
32                3
37                17
42                28
47                47
52                54
57                31
62                14
67                4

Q12. Draw a histogram for the following distribution:

Class      Frequency
25 - 29    5
30 - 34    15
35 - 44    40
45 - 49    30
50 - 59    50
60 - 74    15

Q13. Draw the Lorenz curve for the following distribution of the average monthly income of shopkeepers. Also state your conclusion.

Average Monthly Income (Rs.)    No. of Shopkeepers
3,000 - 5,000                   4
5,000 - 7,000                   12
7,000 - 9,000                   20
9,000 - 11,000                  25
11,000 - 15,000                 8
15,000 - 19,000                 4
20,000 - 30,000                 2

Q14. Compare the seasonal sales of the fan industries of Gujrat and Gujranwala by Lorenz curves and find in which city there is more inequality in sales.

Seasonal amount of sales      No. of Companies
in Lac (Rs.)                  Gujrat    Gujranwala
5                             3         5
10                            8         10
12                            10        15
18                            6         3
20                            4         2

Q15. Draw the histogram for the following data:

Year      Price (Rs/kg)
1995      10
1996      13
1997      18
1998      16
1999      15
2000      24
2001      22
2002      22
2003      20

Mid-Term Project Discussion & Lab Practice Session


Lab Practice Session


The students will be given a two-hour session in the lab to revise what they have learnt in the pre-mid session. The objectives of this session are to provide students an opportunity to:
1. Revise the whole course that they have learnt throughout the pre-mid session.
2. Get hands-on practice in dealing with quantitative data using descriptive statistics in SPSS.
3. Share the problems that they confront during revision and get solutions.
4. Clarify any ambiguity regarding the understanding or application of any concept in QTB.

Pre-Mid Project Discussion
The students will be given a one-hour session to discuss the final draft of their mid-term projects. The objectives of this session are to provide students an opportunity to:
1. Share the problems that they confront during revision and get solutions.
2. Clarify any ambiguity regarding the understanding or application of any concept in QTB.
3. Get productive feedback on what they have done regarding their projects.

The Drafts will be Checked on the following criteria

The drafts will be checked to see whether the following components are covered:
a. The topic and models (of secondary data) are appropriately selected.
b. An introduction explaining the background and objectives of your work.
c. The justification of the topic selection.
d. A description of the data: definitions of the variables, conclusions about data quality, and so on.
e. A justification of the methods you have chosen to analyze the data.
f. Description of data using descriptive analysis and prediction of relations among variables.
g. Length: 1000 words

Mid-Term Paper


INFERENTIAL STATISTICS

Lesson Objectives
After studying this session you will be able to:
1. Understand and infer results from data in order to answer associational and differential research questions using different parametric and non-parametric tests.
2. Understand, implement and interpret the chi-square, phi and Cramer's V statistics.
3. Understand, implement and interpret the correlation statistics.
4. Understand, implement and interpret the regression statistics.
5. Understand, implement and interpret the t-test statistics.

Lesson Outline
1. Non-parametric tests
   1. Chi-square / Fisher's exact
   2. Phi and Cramer's V
   3. Kendall's tau-b
   4. Eta
2. Parametric tests
   1. Correlation
      1. Pearson correlation
      2. Spearman correlation
   2. Regression
      1. Simple regression
      2. Multiple regression
   3. T-test
      1. One-sample t-test
      2. Independent-sample t-test
      3. Paired-sample t-test

INFERENTIAL STATISTICS
Inferential statistics are used to make inferences (conclusions) about a population from a sample, based on the statistical relationships or differences between two or more variables. They rely on statistical tests and assume that sampling is random, so that the results can be generalized or used to make predictions about the future.

Why we use inferential statistics
Inferential statistics are used:
1. To test a hypothesis, either to check the relationship between two or more variables or to compare two groups and measure the differences between them.
2. To generalize the results from a sample to a population.
3. To make predictions about the future.
4. To draw conclusions.

You don't need to understand the underlying calculus, but you do need to know which inferential statistic is appropriate to use and how to interpret it.

Some basic concepts about inferential statistics

1. Statistical significance (the p value)
A statistical significance test is a test of a null hypothesis Ho, which is a hypothesis that we attempt to reject or nullify, i.e.
Ho = There is no relationship/difference between variable 1 and variable 2.
When we apply any inferential statistic, it gives us a significance value (called the p value). If the p value is less than 5%, the test result is said to be significant at the 5% level. The term significant means that the test signifies, or points to, evidence against the truth of the null hypothesis. Comparing p with 5% is a standard method often used by researchers, but it is better to report and interpret the actual value of p.

Interpretation
If the p value is greater than 0.05, Ho is accepted (more precisely, not rejected) and H1 is rejected. The data do not provide sufficient evidence of a relationship/difference between the variables/groups.
If the p value is less than or equal to 0.05, Ho is rejected and H1 is accepted. There is evidence of a relationship/difference between the variables/groups.
A higher p value means that the relationship is less significant, and a smaller p value means that the relationship is more highly significant.

2. Confidence Interval

A confidence interval is a range of values constructed for a variable of interest so that this range has a specified probability of including the true value of the variable. The specified probability is called the confidence level, and the end points of the confidence interval are called the confidence limits. It is one of the alternatives to null hypothesis significance testing (NHST). These intervals provide more information than NHST and may provide more practical information.

For example, suppose one knew that an increase in reading scores of five points, obtained on a particular instrument, would lead to a functional increase in reading performance. Two different methods of instruction were compared. The result showed that students who used the new method scored statistically significantly higher than those who used the other method. According to NHST, we would reject the null hypothesis of no difference between methods and conclude that the new method is better. If we apply confidence intervals to this same study, we can determine an interval that contains the population mean difference 95% of the time. If the lower bound of that interval is greater than five points, we can conclude that using this method of instruction would lead to a practical or functional increase in reading levels. If, however, the confidence interval ranged from, say, 1 to 11, the result would be statistically significant, but the mean difference in the population could be as little as 1 point or as big as 11 points. Given these results, we could not be confident that there would be a practical increase in reading using the new method.

3. The effect size (weak, moderate or strong)
Effect size is the strength of the relationship between the independent variable and the dependent variable, and/or the magnitude of the difference between levels of the independent variable with respect to the dependent variable. A statistically significant outcome does not give information about the strength or size of the outcome. Therefore, it is important to know the size of the effect. Statisticians have proposed many effect size measures, which fall mainly into two families: the r family and the d family.

Interpreting Effect Sizes

The effect sizes discussed here always lie between -1.0 and +1.0, so their absolute value lies between 0 and 1.0. According to Cohen (1988), we can interpret the absolute value of the effect size (r/d) as follows:

Value              Effect size              Relationship
0                  No effect                No relationship
>0 to 0.33         Small effect             Weak relationship
>0.33 to 0.70      Medium/typical effect    Moderate relationship
>0.70 to <1        Large effect             Strong relationship
1                  Maximum effect           Perfect relationship
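The interpretation table can be written as a small lookup. The following Python sketch uses the cut-offs from the table above; the function name effect_size_label is hypothetical, and the sign of the statistic is ignored because it only indicates direction.

```python
def effect_size_label(value):
    """Classify an effect size using the cut-offs from the table above."""
    size = abs(value)  # the sign only gives the direction of the relationship
    if size > 1:
        raise ValueError("this table only covers values between -1 and +1")
    if size == 0:
        return "no effect"
    if size <= 0.33:
        return "small effect"
    if size <= 0.70:
        return "medium/typical effect"
    if size < 1:
        return "large effect"
    return "maximum effect"


print(effect_size_label(-0.412))  # medium/typical effect
print(effect_size_label(0.788))   # large effect
```

The two printed values are the phi and Pearson r statistics reported later in this session, so the labels can be compared with the interpretations given there.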

Steps in interpreting inferential statistics


1. State why the test is applied.
2. State the variables to which the test is applied.
3. State whether the null hypothesis is rejected or accepted with respect to the p value. As discussed above, if the significance (p) value is less than 0.05 then Ho is rejected and H1 is accepted; conversely, if the significance value is greater than 0.05 then Ho is accepted (not rejected) and H1 is rejected.
4. State the direction of the effect.
   1. For an associational research question, indicate whether the association or relationship is positive or negative.
   2. For a differential research question, state which group performed better.
5. Conclude the results.

Types of tests used in Inferential Statistics


Inferential statistics include a wide variety of tests. These tests can be classified into two broad categories:
1. Non-parametric tests
2. Parametric tests

Both types of tests are discussed in detail below.

1. Non-parametric tests


Non-parametric tests are statistical tests that are used:
1. When the level of measurement is nominal or ordinal, e.g. the chi-square test or Kendall's tau-b.
2. When the assumption of a normal distribution in the population is not met, e.g. the Spearman correlation.

Non-parametric tests include:
1. Chi-square test
2. Kendall's tau-b
3. Eta
4. Spearman correlation (discussed in the correlation section)

Let's look at these tests in detail.

Chi-Squared Test

The chi-squared test is the most commonly used non-parametric test for checking the association between two nominal variables in order to accept or reject the null hypothesis.

Hypotheses for the chi-square test
Ho = There is no association between gender and geometry in h.s.
H1 = There is an association between gender and geometry in h.s.

It is used to:
1. Check the association between two nominal variables.
2. Compare two or more groups if they are categorical in nature.

Assumptions and conditions for the chi-squared test
1. The data of the variables must be independent. Each subject is assessed only once.
2. Both the variables are nominal.
3. All the expected counts are greater than 1 for chi-square.
4. At least 80% of the expected frequencies should be greater than or equal to 5.

Checking the assumptions for the chi-squared test
The assumptions for the chi-squared test are checked through cross tabulation of the categorical variables. The crosstab can be produced as follows:
1. Click the Analyze menu.
2. Select the Descriptive Statistics option.
3. Select the Crosstabs option in the sub menu.
4. Put geometry in h.s. in the Rows section and gender in the Columns section.
5. Check Chi-square, Phi and Cramer's V in the Statistics tab.
6. Check Observed, Expected and Total in the Cells tab.
7. Click Continue and then OK to get the following crosstabs in the output window.
geometry in h.s. * gender Crosstabulation

                                            gender
geometry in h.s.                     male      female    Total
not taken     Count                  10        29        39
              Expected Count         17.7      21.3      39.0
              % of Total             13.3%     38.7%     52.0%
taken         Count                  24        12        36
              Expected Count         16.3      19.7      36.0
              % of Total             32.0%     16.0%     48.0%
Total         Count                  34        41        75
              Expected Count         34.0      41.0      75.0
              % of Total             45.3%     54.7%     100.0%

1. Check that all the expected counts are greater than one (excluding the Total column and the Total row).
2. Check that at least 80% of the expected counts are 5 or more. You can calculate the percentage using the following formula:
   (Number of cells with expected counts of at least 5 / Total number of cells) x 100
3. If the assumptions are fulfilled, use the significance value of the Pearson chi-square in the Chi-Square Tests table.
4. If the assumptions for chi-square are not fulfilled, use the significance value of Fisher's exact test.
5. To check the strength of the relationship (effect size), use the value of Phi for a 2x2 crosstab and the value of Cramer's V for larger crosstabs such as 3x3. Remember that Phi and Cramer's V have the same magnitude for 2x3 and 3x2 crosstabs.

Case Processing Summary

                                              Cases
                               Valid             Missing           Total
                               N     Percent     N     Percent     N     Percent
geometry in h.s. * gender      75    100.0%      0     .0%         75    100.0%

Chi-Square Tests

                                 Value        df    Asymp. Sig.    Exact Sig.    Exact Sig.
                                                    (2-sided)      (2-sided)     (1-sided)
Pearson Chi-Square               12.714 (a)   1     .000
Continuity Correction (b)        11.112       1     .001
Likelihood Ratio                 13.086       1     .000
Fisher's Exact Test                                                .000          .000
Linear-by-Linear Association     12.544       1     .000
N of Valid Cases (b)             75

a. 0 cells (.0%) have expected count less than 5. The minimum expected count is 16.32.
b. Computed only for a 2x2 table

Symmetric Measures

                                      Value     Approx. Sig.
Nominal by Nominal    Phi             -.412     .000
                      Cramer's V      .412      .000
N of Valid Cases                      75

Interpretation:
To check the association between gender and geometry in h.s., a chi-square test is conducted. The Case Processing Summary table indicates that no participant has a missing value. The assumptions are checked through the crosstabs.

The Crosstabulation table includes the Counts, the Expected Counts and the percentages of the total. The result shows that 24 males took geometry, which is 71% of the 34 male students. On the other hand, 12 of 41 females took geometry; that is only 29% of the females. It looks like a higher percentage of males than females took geometry. The Chi-Square Tests table tells us whether we can be confident that this apparent difference is not due to chance.

Note, in the Crosstabulation table, that the Expected Count of male students who didn't take geometry is 17.7 and the observed or actual Count is 10. Thus, there are 7.7 fewer males who didn't take geometry than would be expected by chance, given the Totals shown in the table. The discrepancies between observed and expected counts have the same size in the other three cells of the table. The question answered by the chi-square test is whether these discrepancies between observed and expected counts are bigger than one might expect by chance.

The Chi-Square Tests table is used to determine if there is a statistically significant relationship between two dichotomous or nominal variables. It tells you whether the relationship is statistically significant but does not indicate the strength of the relationship, as phi or a correlation does. In the output, we use the Pearson Chi-Square or (for small samples) Fisher's exact test to interpret the results of the test. They are statistically significant (p < .001), which indicates that we can be quite certain that males and females differ in whether they take geometry. Phi is -.412 and, like the chi-square, it is statistically significant. Phi is also a measure of effect size for an associational statistic and, in this case, the effect size is moderate according to Cohen (1988).
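The figures in the SPSS output can be checked by hand. The following Python sketch recomputes the expected counts, the Pearson chi-square, its p value and phi from the four observed counts in the crosstab. It is only an illustration using the standard library: the function name chi_square_2x2 is hypothetical, and the p value formula used here is valid only for one degree of freedom (a 2x2 table).

```python
import math

# Observed counts from the crosstab: rows = geometry (not taken, taken),
# columns = gender (male, female).
observed = [[10, 29],
            [24, 12]]


def chi_square_2x2(table):
    """Return expected counts, Pearson chi-square, p value and phi for a 2x2 table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    # Expected count = row total x column total / grand total.
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    chi2 = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )
    # For 1 degree of freedom, P(chi-square > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    (a, b), (c, d) = table
    phi = (a * d - b * c) / math.sqrt(
        row_totals[0] * row_totals[1] * col_totals[0] * col_totals[1]
    )
    return expected, chi2, p_value, phi


expected, chi2, p_value, phi = chi_square_2x2(observed)
cells = [e for row in expected for e in row]
share_at_least_5 = 100 * sum(e >= 5 for e in cells) / len(cells)

print(round(min(cells), 2))   # minimum expected count
print(share_at_least_5)       # percentage of cells with expected count >= 5
print(round(chi2, 3), round(phi, 3), p_value < 0.001)
```

The results agree with the Chi-Square Tests and Symmetric Measures tables: the minimum expected count is 16.32, the chi-square is 12.714 and phi is -.412.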
KENDALL'S TAU-B

If the variables are ordered (i.e. ordinal), you have several other choices. We will use Kendall's tau-b in this problem.
Example: What is the relationship or association between father's education and mother's education?
1. Analyze => Descriptive Statistics => Crosstabs.
2. Click on Reset to clear the previous entries.
3. Put mother's education revised in the Rows box and father's education revised in the Columns box.
4. Click on Cells and ask that the Observed and Expected cell counts and Total percentages be printed in the table.
5. Click on Continue and then Statistics. Request the following statistics: Kendall's tau-b coefficient under Ordinal, and Phi and Cramer's V under Nominal (for comparison purposes). Do not check Chi-Square.
6. Click on Continue.
7. Click on OK.

Case Processing Summary

                                                        Cases
                                         Valid             Missing           Total
                                         N     Percent     N     Percent     N     Percent
mother education revised *
father education revised                 73    97.3%       2     2.7%        75    100.0%

mother education revised * father education revised Crosstabulation

                                                  father education revised
mother education revised                     1         2         3         Total
1             Count                          43        8         2         53
              Expected Count                 35.6      13.1      4.4       53.0
              % of Total                     58.9%     11.0%     2.7%      72.6%
2             Count                          6         10        2         18
              Expected Count                 12.1      4.4       1.5       18.0
              % of Total                     8.2%      13.7%     2.7%      24.7%
3             Count                          0         0         2         2
              Expected Count                 1.3       .5        .2        2.0
              % of Total                     .0%       .0%       2.7%      2.7%
Total         Count                          49        18        6         73
              Expected Count                 49.0      18.0      6.0       73.0
              % of Total                     67.1%     24.7%     8.2%      100.0%

Symmetric Measures

                                          Value    Asymp. Std.    Approx.    Approx.
                                                   Error (a)      T (b)      Sig.
Ordinal by Ordinal    Kendall's tau-b     .494     .108           3.846      .000
N of Valid Cases                          73

a. Not assuming the null hypothesis.
b. Using the asymptotic standard error assuming the null hypothesis.


Interpretation:

To investigate the relationship between father's education and mother's education, Kendall's tau-b was used. The analysis indicated a significant positive association between father's education and mother's education, tau = .494, p < .001. This means that more highly educated fathers were married to more highly educated mothers and less educated fathers were married to less educated mothers. This tau is considered to be a large effect size (Cohen, 1988).
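To see where the tau-b value comes from, the statistic can be recomputed from the nine cell counts of the crosstab. The following Python sketch counts concordant and discordant pairs and corrects for ties. The function name kendall_tau_b is hypothetical, and the cell counts are those read from the Count rows of the crosstab above.

```python
import math

# Observed counts: rows = mother's education revised (1, 2, 3),
# columns = father's education revised (1, 2, 3).
table = [[43, 8, 2],
         [6, 10, 2],
         [0, 0, 2]]


def kendall_tau_b(table):
    """Kendall's tau-b computed from a crosstab of two ordinal variables."""
    n_rows, n_cols = len(table), len(table[0])
    concordant = discordant = 0
    for i in range(n_rows):
        for j in range(n_cols):
            for k in range(i + 1, n_rows):      # cells in lower rows only
                for m in range(n_cols):
                    if m > j:                   # higher on both variables
                        concordant += table[i][j] * table[k][m]
                    elif m < j:                 # higher on one, lower on the other
                        discordant += table[i][j] * table[k][m]
    n = sum(sum(row) for row in table)
    total_pairs = n * (n - 1) // 2
    # Pairs tied on the row variable and on the column variable.
    row_ties = sum(r * (r - 1) // 2 for r in (sum(row) for row in table))
    col_ties = sum(c * (c - 1) // 2 for c in (sum(col) for col in zip(*table)))
    return (concordant - discordant) / math.sqrt(
        (total_pairs - row_ties) * (total_pairs - col_ties)
    )


tau_b = kendall_tau_b(table)
print(round(tau_b, 3))
```

The result matches the value of .494 in the Symmetric Measures table. The positive sign reflects that most pairs of couples are concordant: the couple with the more educated father also tends to have the more educated mother.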
ETA

If one variable is nominal and the other is scale, then Eta is the appropriate statistic for checking the relationship between the two variables. Eta is calculated for both variables. First decide which variable is the dependent variable, and then consider the Eta value of that variable.
Example: What is the association between gender and number of math courses taken? How strong is it?
1. Analyze => Descriptive Statistics => Crosstabs.
2. Click on Reset to clear the previous entries.
3. Put math courses taken in the Rows box and gender in the Columns box.
4. Click on Statistics and select Eta.
5. Click Continue.
6. Click OK to get the following results.

Case Processing Summary

                                               Cases
                                Valid             Missing           Total
                                N     Percent     N     Percent     N     Percent
math courses taken * gender     75    100.0%      0     .0%         75    100.0%

math courses taken * gender Crosstabulation

                                             gender
math courses taken                    male      female    Total
              Count                   4         12        16
              Expected Count          7.3       8.7       16.0
              Count                   3         13        16
              Expected Count          7.3       8.7       16.0
              Count                   9         6         15
              Expected Count          6.8       8.2       15.0
              Count                   6         2         8
              Expected Count          3.6       4.4       8.0
              Count                   7         5         12
              Expected Count          5.4       6.6       12.0
              Count                   5         3         8
              Expected Count          3.6       4.4       8.0
Total         Count                   34        41        75
              Expected Count          34.0      41.0      75.0

Directional Measures

                                                                 Value
Nominal by Interval    Eta    math courses taken Dependent       .328
                              gender Dependent                   .419

Interpretation Eta was used to investigate the strength of the association between gender and number of math courses taken (eta=.33). This is a weak to medium effect size (Cohen, 1988). Males were more likely to take several or all the math courses than females.
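The eta value with math courses taken as the dependent variable can be reproduced from the crosstab counts. The following Python sketch computes eta as the square root of the between-group sum of squares divided by the total sum of squares. Two things are assumptions here: the function names expand and eta are hypothetical, and the six rows of the crosstab are taken to stand for 0 to 5 courses, because the category values are not shown in the output. Any other equally spaced coding (for example 1 to 6) gives the same eta.

```python
import math

# Counts per category of "math courses taken", read from the crosstab.
# The category values 0..5 are an assumption (they are not shown in the output).
male = [4, 3, 9, 6, 7, 5]
female = [12, 13, 6, 2, 5, 3]


def expand(counts):
    """Turn category counts into a list of individual scores."""
    return [value for value, count in enumerate(counts) for _ in range(count)]


def eta(groups):
    """Eta = sqrt(SS between groups / SS total) for the dependent variable."""
    all_scores = [x for g in groups for x in g]
    grand_mean = sum(all_scores) / len(all_scores)
    ss_total = sum((x - grand_mean) ** 2 for x in all_scores)
    ss_between = sum(
        len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups
    )
    return math.sqrt(ss_between / ss_total)


male_scores, female_scores = expand(male), expand(female)
value = eta([male_scores, female_scores])

print(round(sum(male_scores) / len(male_scores), 2))      # mean for males
print(round(sum(female_scores) / len(female_scores), 2))  # mean for females
print(round(value, 3))
```

The computed eta agrees with the .328 shown in the Directional Measures table, and the higher mean for males is consistent with the interpretation above.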


Class Activity Session 6


Please show all work and explain your answers.
1. Popcorn sales in movie theaters break down as 40% plain popcorn and 60% buttered popcorn. While 65% of the plain popcorn is purchased by adults, 80% of the buttered popcorn is purchased by children. If a child purchases popcorn, what is the probability that it is buttered popcorn?
   (Guidelines: develop a joint-probability table. Note that the problem is asking you to compute a conditional probability.)

2. A process follows the binomial distribution with n = 7 and p = .4. Find
   a. P(x = 3)
   b. P(x > 5)
   c. P(x 2)

3. Scores on an endurance test for cardiac patients are normally distributed with mean = 200 and standard deviation = 30.
   a. What is the probability a patient will score above 206?
   b. What percentage of patients score below 155?
   c. What score does a patient at the 25th percentile receive?

4. A calculus instructor uses computer aided instruction and allows students to take the midterm exam as many times as needed until a passing grade is obtained. Following is a record of the number of students in a class of 20 who took the test each number of times.

   Students    Number of tests
   7           1
   6           2
   4           3
   2           4
   1           5

   a. Use the relative frequency approach to construct a probability distribution.
   b. Show that it satisfies the required conditions for being a probability distribution.
   c. Find the expected value of the number of tests taken.

5. For the payoff table below, the decision maker will use P(s1) = .15, P(s2) = .5, and P(s3) = .35.

              s1         s2        s3
   d1         -5000      1000      10,000
   d2         -15,000    -2000     40,000

   a. What alternative would be chosen according to expected value?
   b. For a lottery having a payoff of 40,000 with probability p and -15,000 with probability (1-p), the decision maker expressed the following indifference probabilities.

      Payoff      Probability
      10,000      .85
      1000        .60
      -2000       .53
      -5000       .50

      Let U(40,000) = 10 and U(-15,000) = 0 and find the utility value for each payoff.
   c. What alternative would be chosen according to expected utility?


CORRELATION & REGRESSION


Inferential Statistics


Correlation
Correlation is a statistical process that determines the mutual (reciprocal) relationship between two (or more) variables which are thought to be mutually related, in such a way that systematic changes in the value of one variable are accompanied by systematic changes in the other and vice versa. It is used to determine:
1. The existence of a mutual relationship, which is indicated by the significance (p) value.
2. The direction of the relationship, which is indicated by the sign (+, -) of the test value.
3. The strength of the relationship, which is indicated by the magnitude of the test value.

Correlation Coefficient (r)

The correlation coefficient measures the strength of the linear relationship between two numerical variables. The value of the correlation coefficient can vary from -1.0 (a perfect negative correlation or association) through 0.0 (no correlation) to +1.0 (a perfect positive correlation). Note that +1 and -1 are equally high or strong, but they lead to different interpretations. A high positive correlation between anxiety and grades would mean that students with higher anxiety tended to have higher grades, those with lower anxiety had lower grades, and those in between had grades that were neither especially high nor especially low. A high negative correlation would mean that students with high anxiety tended to have low grades; also, high grades would be associated with low anxiety. With a zero correlation there are no consistent associations: a student with high anxiety might have low, high or medium grades.

There are two types of correlation:
1. Pearson correlation
2. Spearman correlation

1. Pearson Correlation
The Pearson correlation is used when you have two variables that are normal/scale, and the Spearman correlation is used when one or both are ordinal. An assumption of the Pearson correlation is that the variables are related in a linear (straight line) way, so we will first examine the scatter plots to see if that assumption is reasonable. Second, the Pearson correlation and the Spearman correlation will be computed.

Assumptions and conditions for Pearson
1. The two variables have a linear relationship.
2. Scores on one variable are normally distributed for each value of the other variable and vice versa.
3. Outliers (i.e. extreme scores) can have a big effect on the correlation.

Checking the assumptions for Pearson Correlation
The assumptions for the correlation test are checked through the normal curve (normality assumption) and the scatter plot (linearity assumption).

Normality assumption
1. Click on the Analyze menu.
2. Select the Descriptive Statistics option.
3. Select the Frequencies option in the sub menu.
4. Put math achievement and satmath in the Variables box.
5. Check Skewness in the Statistics tab and Histogram in the Charts tab.
6. Click Continue and then OK.
7. You will get skewness values showing that the variables are approximately normally distributed. Further check the normality of the data through the normal curve in the histograms using the chart editor.

Statistics

                              math achievement test    scholastic aptitude test - math
N          Valid              75                       75
           Missing            0                        0
Skewness                      .044                     .128
Std. Error of Skewness        .277                     .277

Linearity assumption
1. Click on the Graphs menu.
2. Select Legacy Dialogs, Interactive and then Scatterplot.
3. Put math achievement on the y-axis and satmath on the x-axis.
4. Click OK to get the scatter plot in the output window.
5. Double click on the scatter plot to open the chart editor.
6. Click on the Add Fit Line at Total button in the toolbar to get the linear line, with R square linear = 0.62, and close the window.
7. Repeat the previous step for the quadratic line to get R square = 0.621, click Apply and close the window.
8. Calculate the difference between the two R square values (0.621 - 0.62 = 0.001).
9. If the difference is less than 0.05, the relation is treated as linear (here 0.001 < 0.05), hence apply the Pearson correlation.

How to apply Pearson Correlation

1. Select Analyze, then Correlate and then Bivariate.
2. Put math achievement and satmath in the Variables box.
3. Ensure that Pearson, Two-tailed, and Flag significant correlations are checked.
4. Click OK to get the following results in the output window.

Correlations

                                                             math achievement    scholastic aptitude
                                                             test                test - math
math achievement test              Pearson Correlation       1                   .788**
                                   Sig. (2-tailed)                               .000
                                   N                         75                  75
scholastic aptitude test - math    Pearson Correlation       .788**              1
                                   Sig. (2-tailed)           .000
                                   N                         75                  75

**. Correlation is significant at the 0.01 level (2-tailed).

Interpretation

To investigate whether there was a statistically significant association between scholastic aptitude test and math achievement, a correlation was computed. Both variables were approximately normal and there is a linear relationship between them, so the assumptions for Pearson's correlation are fulfilled. Thus, Pearson's r was calculated, r = 0.79, p < .001, indicating that there is a highly significant relationship between the variables. The positive sign of the Pearson test value shows that the relationship is positive, which means that students who have high scores in the math achievement test tend to have high scores in the scholastic aptitude test and vice versa. Using Cohen's (1988) guidelines, the effect size is large, indicating that there is a strong relationship between math achievement and scholastic aptitude test.
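The Pearson coefficient itself is simple to compute by hand. The following Python sketch shows the calculation on a few made-up scores; the data values and the function name pearson_r are hypothetical and are not taken from the HSB data file.

```python
import math


def pearson_r(x, y):
    """Pearson correlation: co-variation of x and y divided by their joint spread."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical scores for five students (not the HSB data).
sat_math = [400, 450, 500, 550, 600]
math_achievement = [8, 10, 11, 15, 16]

r = pearson_r(sat_math, math_achievement)
print(round(r, 3))       # strength and direction of the linear relationship
print(round(r ** 2, 3))  # share of variance in one variable predictable from the other
```

A positive r means that higher scores on one variable go with higher scores on the other; squaring r gives the proportion of shared variance.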
Spearman Correlation:

If the assumptions for Pearson correlation are not fulfilled, consider the Spearman correlation, which assumes only that the relationship between the two variables is monotonic (it need not be linear).

Example: What is the association between mother's education and math achievement?

1. Analyze → Correlate → Bivariate.
2. Move math achievement and mother's education to the Variables box.
3. Ensure that the Spearman and Pearson boxes are checked.
4. Make sure that Two-tailed (under Test of Significance) and Flag significant correlations are checked.
5. Click on Options, check Means and standard deviations, and click on Exclude cases listwise.
6. Click on Continue and then on OK.

Correlations

                                                                   mother's education   math achievement test
Spearman's rho   mother's education      Correlation Coefficient   1.000                .315**
                                         Sig. (2-tailed)           .                    .006
                 math achievement test   Correlation Coefficient   .315**               1.000
                                         Sig. (2-tailed)           .006                 .

Interpretation

To investigate whether there was a statistically significant association between mother's education and math achievement, a correlation was computed. Mother's education was skewed (skewness = 1.13), which violated the assumption of normality. Thus, the Spearman rho statistic was calculated, rs(73) = .32, p = .006. The direction of the correlation was positive, which means that students who have highly educated mothers tend to have higher math achievement test scores, and vice versa. Using Cohen's (1988) guidelines, the effect size is medium for studies in this area. The r² indicates that approximately 10% of the variance in math achievement test scores can be predicted from mother's education.
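
Spearman's rho is the Pearson correlation computed on ranks instead of raw scores, which is why it only needs a monotonic relationship. The sketch below illustrates this under stated assumptions: the helper names (average_ranks, pearson_r, spearman_rho) and the small data set are hypothetical, and tied values receive the average of their ranks.

```python
import math


def average_ranks(values):
    """Rank values from 1 to n; tied values get the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson_r(x, y):
    """Pearson correlation of two equal-length sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def spearman_rho(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))


# Hypothetical data (assumption): a monotonic but non-linear relationship
mother_education = [2, 3, 3, 5, 6, 8, 10]
math_achievement = [4, 5, 7, 8, 12, 20, 24]

rho = spearman_rho(mother_education, math_achievement)
print(f"rho = {rho:.2f}")

# Shared variance implied by the coefficient reported in the output above
reported_rho = 0.315
print(f"r squared = {reported_rho ** 2:.3f}")  # about .10, i.e. roughly 10%
```
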

REGRESSION ANALYSIS

Regression analysis is used to measure the relationship between two or more variables. One variable is called the dependent (response or outcome) variable, and the others are called independent (explanatory or predictor) variables. It is used to determine how much the dependent variable changes for a one-unit change in an independent variable.

Regression Equation
It is the equation representing the relation between selected values of one variable (x: the independent variable) and observed values of the other (y: the dependent variable); it permits the prediction of the most probable values of y. The standard forms of this equation for two variables and for more than two variables, respectively, are as follows:

Y = a + bx

Y = a + bx1 + cx2 + dx3 + ex4

Y = dependent variable
a = constant
b, c, d, e = slope coefficients
x1, x2, x3, x4 = independent variables

Types of Regression
There are two types of regression analysis:
1. Simple regression
2. Multiple regression

Simple Regression

Simple regression is used to check the contribution of a single independent variable to the dependent variable.

Assumptions and conditions of simple regression
1. The dependent variable should be scale.
2. The relationship between the variables should be linear.
3. The data should be independent.

Example: Can we predict math achievement from grades in high school?

Commands
1. Analyze → Regression → Linear.
2. Highlight math achievement. Click the arrow to move it into the Dependent box.
3. Highlight grades in high school and click on the arrow to move it into the Independent(s) box.
4. Click on OK.

Variables Entered/Removed

Model   Variables Entered   Variables Removed   Method
1       grades in h.s.(a)   .                   Enter

a. All requested variables entered.

Model Summary

Model   R         R Square   Adjusted R Square   Std. Error of the Estimate
1       .504(a)   .254       .244                5.80018

a. Predictors: (Constant), grades in h.s.

ANOVA(b)

Model 1      Sum of Squares   df   Mean Square   F        Sig.
Regression   836.606          1    836.606       24.868   .000(a)
Residual     2455.875         73   33.642
Total        3292.481         74

a. Predictors: (Constant), grades in h.s.
b. Dependent Variable: math achievement test

Coefficients(a)

                 Unstandardized Coefficients   Standardized Coefficients
Model 1          B       Std. Error            Beta                        t       Sig.
(Constant)       .397    2.530                                             .157    .876
grades in h.s.   2.142   .430                  .504                        4.987   .000

a. Dependent Variable: math achievement test

Regression equation: Y = 0.40 + 2.14X

Interpretation

Simple regression was conducted to investigate how well grades in high school predict math achievement scores. The results were statistically significant, F(1, 73) = 24.87, p < .001. The identified equation for this relationship was math achievement = .40 + 2.14 × (grades in high school). The adjusted R² value was .244. This indicates that 24% of the variance in math achievement was explained by grades in high school. According to Cohen (1988), this is a large effect.
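
To see where the constant, the slope, R Square and F in the output come from, the least-squares calculations can be written out directly. This is a minimal sketch under stated assumptions: fit_simple_regression is a hypothetical helper name and the seven data pairs are made up; only the sums of squares in the last lines are taken from the ANOVA table above.

```python
def fit_simple_regression(x, y):
    """Least-squares estimates for Y = a + bX, with R squared and F."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx                # slope: change in Y per unit change in X
    a = mean_y - b * mean_x      # constant (intercept)
    ss_total = sum((yi - mean_y) ** 2 for yi in y)
    ss_residual = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    ss_regression = ss_total - ss_residual
    r_squared = ss_regression / ss_total
    # One predictor: regression df = 1, residual df = n - 2
    f_value = (ss_regression / 1) / (ss_residual / (n - 2))
    return {"a": a, "b": b, "r_squared": r_squared, "f": f_value}


# Hypothetical data (assumption): grades in high school and math achievement
grades = [2, 3, 4, 5, 6, 7, 8]
math_achievement = [5, 9, 8, 12, 14, 13, 18]
model = fit_simple_regression(grades, math_achievement)
print(f"Y = {model['a']:.2f} + {model['b']:.2f}X")

# The same ratios applied to the sums of squares in the ANOVA table above
reported_r_squared = 836.606 / 3292.481     # regression SS / total SS
reported_f = (836.606 / 1) / (2455.875 / 73)  # MS regression / MS residual
print(f"R squared = {reported_r_squared:.3f}, F = {reported_f:.2f}")
```

The last two lines reproduce the R Square (.254) and F (24.87) shown in the SPSS tables from the sums of squares alone.
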

Multiple Regression
Multiple regression is used to check the contribution of the independent variables to the dependent variable when there is more than one independent variable.

Assumptions and conditions of multiple regression
1. The dependent variable should be scale.

Example: How well can you predict math achievement from a combination of four variables: grades in high school, father's education, mother's education, and gender?

Commands
1. Analyze → Regression → Linear.
2. Highlight math achievement. Click the arrow to move it into the Dependent box.
3. Highlight grades in high school, father's education, mother's education, and gender, and click on the arrow to move them into the Independent(s) box.
4. Under Method, be sure that Enter is selected.
5. Click on Continue and then OK to get the following results in the output window.


Descriptive Statistics

                        Mean      Std. Deviation   N
math achievement test   12.6621   6.49659          73
grades in h.s.          5.70      1.552            73
father's education      4.73      2.830            73
mother's education      4.14      2.263            73
gender                  .55       .501             73

Model Summary

Model   R      R Square   Adjusted R Square   Std. Error of the Estimate
1       .616   .379       .343                5.26585

ANOVA

Model 1      Sum of Squares   df   Mean Square   F        Sig.
Regression   1153.222         4    288.305       10.397   .000
Residual     1885.583         68   27.729
Total        3038.804         72

Coefficients(a)

                     Unstandardized Coefficients   Standardized Coefficients
Model 1              B        Std. Error           Beta                        t        Sig.
(Constant)           1.047    2.526                                            .415     .680
grades in h.s.       1.946    .427                 .465                        4.560    .000
father's education   .191     .313                 .083                        .610     .544
mother's education   .406     .375                 .141                        1.084    .282
gender               -3.759   1.321                -.290                       -2.846   .006

a. Dependent Variable: math achievement test

Regression equation: Y = 1.047 + 1.95X1 + 0.19X2 + 0.41X3 - 3.76X4
where X1 = grades in high school, X2 = father's education, X3 = mother's education and X4 = gender.

Interpretation

Simultaneous multiple regression was conducted to investigate the best predictors of math achievement test scores. The means, standard deviations, and intercorrelations can be found in the output tables. The combination of variables used to predict math achievement (grades in high school, father's education, mother's education, and gender) was statistically significant, F(4, 68) = 10.40, p < .001. The beta coefficients are presented in the last table. Note that high grades and male gender significantly predict math achievement when all four variables are included. The adjusted R² value was .343. This indicates that 34% of the variance in math achievement was explained by the model. According to Cohen (1988), this is a large effect.
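
The fitted equation and the summary figures in the output can be checked with a few lines of arithmetic. The sketch below uses the unstandardized coefficients, sums of squares and sample size from the tables above; the function names are hypothetical, and gender is assumed to be coded 0 for males and 1 for females, as in the grouping used for the independent sample t-test in this handout.

```python
def predict_math_achievement(grades, father_educ, mother_educ, gender):
    """Predicted score from the unstandardized coefficients in the output.

    gender is assumed to be coded 0 = male, 1 = female.
    """
    return (1.047 + 1.946 * grades + 0.191 * father_educ
            + 0.406 * mother_educ - 3.759 * gender)


def adjusted_r_squared(r_squared, n, k):
    """Adjust R squared for n cases and k predictors."""
    return 1 - (1 - r_squared) * (n - 1) / (n - k - 1)


# Figures taken from the ANOVA table (n = 73 cases, k = 4 predictors)
ss_regression, ss_residual, ss_total = 1153.222, 1885.583, 3038.804
n, k = 73, 4

r_squared = ss_regression / ss_total
f_value = (ss_regression / k) / (ss_residual / (n - k - 1))
adj_r_squared = adjusted_r_squared(r_squared, n, k)
print(f"R squared = {r_squared:.3f}, adjusted = {adj_r_squared:.3f}, "
      f"F = {f_value:.2f}")

# Prediction at the sample means of the four predictors; a least-squares
# line passes through the means, so this should be close to the mean score
at_means = predict_math_achievement(5.70, 4.73, 4.14, 0.55)
print(f"Predicted score at the predictor means: {at_means:.2f}")
```

The prediction at the predictor means comes out close to the mean math achievement score in the Descriptive Statistics table (12.66), with the small gap due to rounding of the printed values.
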


Class Activity Session 7


Correlation
Some studies are interested in whether two variables are related to each other. Is there a relationship between birth order and IQ scores? Is there a relationship between socioeconomic status (SES) and health? The CORRELATION COEFFICIENT is a statistic that shows the strength of the relationship between the two variables. The correlation coefficient falls between -1.00 and +1.00. The statistic shows both the STRENGTH of the relationship between the variables, and the DIRECTION of the relationship. The numerical value indicates the strength of the relationship. The sign in front of the numerical value indicates the direction of the relationship. Let us consider each of these in more detail.

THE NUMERICAL VALUE:


Correlation coefficient values that are close to zero (e.g., -.13, +.08) suggest that there is little or no linear relationship between the two variables. The closer the correlation is to -1 or +1 (e.g., -.97, +.83), the stronger the relationship between the two variables. Thus, we might expect that there would be no relationship between the height of college students and their SAT scores, and we would be correct: the correlation coefficient is very close to zero. However, we might expect the correlation between adult height and weight to be stronger, and again we would be correct.

THE SIGN:
The sign of the correlation coefficient tells us whether the two variables are directly related or inversely related. Do the two variables increase and decrease in the same direction? The more time a student spends studying, the better their grade; the less time spent studying, the lower the grade. Notice how both study time and grade vary in the same direction. As studying increases, grades increase, and when studying decreases, grades decline. Grade and study time would be POSITIVELY correlated. The term POSITIVE does not necessarily mean it's a good thing (when is getting a poor grade a "good" thing!). It simply means that there is a direct relationship: the variables are varying (changing) in the same direction.

Do the two variables vary in opposing directions? As the number of children in a family increases, the IQ scores of the children decrease. Thus, family size and children's IQ scores vary in opposite directions. As family size increases, IQ scores decline; as family size decreases, IQ scores increase. IQ and family size are NEGATIVELY correlated (inversely related).

Try the following exercise to see if you understand the concept of correlation. It is best if you have read both this section and the research method section on correlational studies before completing the exercise.

EXERCISE


T-TEST STATISTIC
Inferential Statistics


T-TEST Statistics
The t test is used to compare two groups in order to answer differential research questions. Its value indicates whether the groups differ, based on a comparison of their means.

Hypotheses for the t-test
H0: There is no difference between variable 1 and variable 2.
H1: There is a difference between variable 1 and variable 2.

Types of t-test
There are three types of t-test:
1. One sample t-test
2. Independent sample t-test
3. Paired sample t-test

1. ONE SAMPLE T-TEST


The one sample t-test is used to determine whether there is a difference between the population mean (the test value) and the sample mean (X).

Assumptions and conditions of the one sample t-test
1. The dependent variable should be normally distributed within the population.
2. The data are independent (the scores of one participant do not depend on the scores of another; participants are independent of one another).

Example: Is the mean SAT-Math score in the modified HSB data set significantly different from the presumed population mean of 500?

Commands
1. Analyze → Compare Means → One-Sample T Test.
2. Move scholastic aptitude test - math to the Test Variable(s) box.
3. Type 500 in the Test Value box.
4. Click on OK.

One-Sample Statistics

                                  N    Mean     Std. Deviation   Std. Error Mean
scholastic aptitude test - math   75   490.53   94.553           10.918

One-Sample Test (Test Value = 500)

                                                                                   95% Confidence Interval of the Difference
                                  t       df   Sig. (2-tailed)   Mean Difference   Lower     Upper
scholastic aptitude test - math   -.867   74   .389              -9.467            -31.22    12.29

Interpretation:

To investigate the difference between the population and the sample, a one-sample t-test was conducted. The One-Sample Statistics table provides basic descriptive statistics for the variable under consideration. The mean SAT-Math score for the students in the sample is compared to the hypothesized population mean, displayed as the Test Value in the One-Sample Test table. The bottom line of this table gives the t value, df, and the two-tailed Sig. (p) value. Note that p = .389, so we can say that the sample mean (490.53) is not significantly different from the population mean of 500. The table also provides the difference (-9.47) between the sample and population means and the 95% Confidence Interval. The difference between the sample and the population mean is likely to be between -31.22 and +12.29 points. Notice that this range includes the value of zero, so it is possible that there is no difference. Thus, the difference is not statistically significant.
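
The t value and confidence interval in the One-Sample Test table follow directly from the descriptive statistics. A minimal sketch, assuming the standard library only: one_sample_t is a hypothetical helper name, and the critical value 1.9925 (two-tailed, 5%, df = 74) is an assumption read from a t table rather than computed.

```python
import math


def one_sample_t(sample_mean, sample_sd, n, test_value):
    """Return (t, df, standard error) for a one-sample t-test."""
    standard_error = sample_sd / math.sqrt(n)
    t = (sample_mean - test_value) / standard_error
    return t, n - 1, standard_error


# Values from the One-Sample Statistics table
t, df, se = one_sample_t(sample_mean=490.53, sample_sd=94.553, n=75,
                         test_value=500)

# Assumption: two-tailed 5% critical value of t for df = 74, from a t table
T_CRITICAL = 1.9925
mean_difference = 490.53 - 500
ci_lower = mean_difference - T_CRITICAL * se
ci_upper = mean_difference + T_CRITICAL * se

print(f"t({df}) = {t:.3f}, standard error = {se:.3f}")
print(f"95% CI of the difference: {ci_lower:.2f} to {ci_upper:.2f}")
# The interval contains zero, so the difference is not significant
print("Significant at .05:", abs(t) > T_CRITICAL)
```
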

2. INDEPENDENT SAMPLE T-TEST


The independent sample t-test is used to compare two independent groups (e.g., males and females) on the same dependent variable.

Assumptions and conditions of the independent t-test
1. The variances of the dependent variable for the two categories of the independent variable should be equal to each other.
2. The dependent variable should be scale.
3. The data on the dependent variable should be independent.

Example: Do male and female students differ significantly in regard to their average math achievement scores?

Commands
1. Analyze → Compare Means → Independent-Samples T Test.
2. Move math achievement scores to the Test Variable(s) box.
3. Move gender to the Grouping Variable box.
4. Click on Define Groups.
5. Type 0 for males in the Group 1 box and 1 for females in the Group 2 box.
6. Click on Continue.
7. Click on OK.


Interpretation

The first table, Group Statistics, shows descriptive statistics for the two groups (males and females) separately. Note that the means within each of the three pairs look somewhat different. This might be due to chance, so we check the t test in the next table.

The second table, Independent Samples Test, provides two statistical tests. The left two columns of numbers are Levene's test for the assumption that the variances of the two groups are equal. This is not the t test; it only assesses an assumption! If this F test is not significant (as in the case of math achievement and grades in high school), the assumption is not violated, and one uses the Equal variances assumed line for the t test and related statistics. However, if Levene's F is statistically significant (Sig. < .05), as is true for visualization, then the variances are significantly different and the assumption of equal variances is violated. In that case, the Equal variances not assumed line is used, and SPSS adjusts t, df, and Sig. Thus, for visualization, the appropriate values are t = 2.39, degrees of freedom (df) = 57.15, p = .020. This t is statistically significant so, based on examining the means, we can say that boys have higher visualization scores than girls. We used visualization to provide an example where the assumption of equal variances was violated (Levene's test was significant). Note that for grades in high school, the t is not statistically significant (p = .369), so we conclude that there is no evidence of a systematic difference between boys and girls on grades. On the other hand, the difference in math achievement is statistically significant because p < .05; males have the higher mean.
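
The two lines of the Independent Samples Test table correspond to two versions of the t statistic. The sketch below shows, under stated assumptions, how the pooled-variance form ("Equal variances assumed") and the Welch form ("Equal variances not assumed", with its fractional df) are computed from group summary statistics. The function names and the group means, standard deviations and sizes are hypothetical; they are not the HSB values.

```python
import math


def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """t and df when the two group variances are assumed equal."""
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2)
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / se, n1 + n2 - 2


def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """t and adjusted (Welch-Satterthwaite) df without equal variances."""
    v1 = sd1 ** 2 / n1
    v2 = sd2 ** 2 / n2
    t = (mean1 - mean2) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df


# Hypothetical group statistics (assumption): clearly unequal spreads
males = dict(mean1=6.4, sd1=4.5, n1=34)
females = dict(mean2=4.3, sd2=2.7, n2=41)

t_equal, df_equal = pooled_t(**males, **females)
t_welch, df_welch = welch_t(**males, **females)
print(f"Equal variances assumed:     t = {t_equal:.2f}, df = {df_equal}")
print(f"Equal variances not assumed: t = {t_welch:.2f}, df = {df_welch:.2f}")
```

When the two groups have equal sizes and equal standard deviations, both forms give the same t and df; the adjustment only matters when the spreads differ, which is what Levene's test checks.
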


The 95% Confidence Interval of the Difference is shown in the two right-hand columns of the output. The confidence interval tells us that if we repeated the study 100 times, about 95 of the intervals computed in this way would contain the true (population) difference; for math achievement the interval is between 1.05 points and 6.97 points. Note that if the Upper and Lower bounds have the same sign (either + and + or - and -), we know that the difference is statistically significant, because this means that the null finding of zero difference lies outside of the confidence interval. On the other hand, if zero lies between the upper and lower limits, there could be no difference, as is the case for grades in h.s. The lower limit of the confidence interval on math achievement tells us that the difference between males and females could be as small as 1.05 points out of 25, which is the maximum possible score.

Effect size measures for t tests are not provided in the printout but can be estimated relatively easily. For math achievement, the difference between the means (4.01) would be divided by about 6.4, an estimate of the pooled (weighted average) standard deviation. Thus, d would be approximately .6, which is, according to Cohen (1988), a medium to large sized effect. Because you need means and standard deviations to compute the effect size, you should include a table with means and standard deviations in your results section for a full interpretation of t tests.
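
The effect size arithmetic and the "same sign" rule for confidence intervals can be sketched in a few lines. The mean difference (4.01), the pooled standard deviation estimate (6.4) and the interval bounds come from the text above; the function names are hypothetical, and the group standard deviations passed to pooled_sd are made-up values used only to show the weighting.

```python
import math


def pooled_sd(sd1, n1, sd2, n2):
    """Weighted average (pooled) standard deviation of two groups."""
    return math.sqrt(((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2)
                     / (n1 + n2 - 2))


def cohens_d(mean_difference, sd):
    """Effect size d: mean difference expressed in standard deviation units."""
    return mean_difference / sd


def interval_excludes_zero(lower, upper):
    """True when both bounds share a sign, i.e. zero lies outside."""
    return lower > 0 or upper < 0


# Values quoted in the text for math achievement
d = cohens_d(4.01, 6.4)
print(f"d = {d:.2f}")

# Hypothetical group SDs and sizes (assumption) to illustrate the pooling
print(f"pooled SD = {pooled_sd(6.0, 34, 6.7, 41):.2f}")

# Math achievement interval versus an interval that spans zero
print(interval_excludes_zero(1.05, 6.97))    # both bounds positive
print(interval_excludes_zero(-31.22, 12.29))  # bounds of opposite sign
```
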

3. PAIRED SAMPLE T-TEST

The paired sample t-test is used to compare two paired groups (e.g., mothers and fathers) on the same dependent variable.

Assumptions and conditions of the paired sample t-test
1. The independent variable is dichotomous and its levels (or groups) are paired, or matched, in some way (husband-wife, pre-post, etc.).
2. The dependent variable is normally distributed in the two conditions.

Example: Do students' fathers or mothers have more education?

Commands
1. Analyze → Compare Means → Paired-Samples T Test.
2. Click on both of the variables, father's education and mother's education, and move them simultaneously to the Paired Variables box.
3. Click on OK.


Paired Samples Statistics

                              Mean   N    Std. Deviation   Std. Error Mean
Pair 1   father's education   4.73   73   2.830            .331
         mother's education   4.14   73   2.263            .265

Paired Samples Correlations

                                                   N    Correlation   Sig.
Pair 1   father's education & mother's education   73   .681          .000

Interpretation

The first table shows the descriptive statistics used to compare mothers' and fathers' education levels. The second table, Paired Samples Correlations, provides the correlation between the two paired scores. The correlation (r = .68) between mothers' and fathers' education indicates that highly educated men tend to marry highly educated women, and vice versa. It doesn't tell you whether men or women have more education. That is what t in the third table tells you. The last table shows the Paired Samples t Test. The Sig. for the comparison of the average education level of the students' mothers and fathers was p = .019. Thus, the difference in educational level is statistically significant, and we can tell from the means in the first table that fathers have more education; however, the effect size is small (d = .28), which is computed by dividing the mean of the paired differences (.59) by the standard deviation (2.1) of the paired differences. Also, we can tell from the confidence interval that the difference in the means could be as small as .10 of a point or as large as 1.08 points on the 2 to 10 scale.
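
A paired t-test works on the differences within each pair, so it reduces to a one-sample test on those differences. The following is a minimal sketch, assuming the standard library only; paired_t is a hypothetical helper name and the four pairs of education scores are made up. The last lines apply the same formulas to the mean difference (.59), standard deviation (2.1) and N (73) quoted in the interpretation above.

```python
import math
import statistics


def paired_t(first, second):
    """Return (t, df, d) for paired observations."""
    differences = [a - b for a, b in zip(first, second)]
    n = len(differences)
    mean_diff = statistics.mean(differences)
    sd_diff = statistics.stdev(differences)  # sample SD of the differences
    t = mean_diff / (sd_diff / math.sqrt(n))
    d = mean_diff / sd_diff                  # effect size for paired data
    return t, n - 1, d


# Hypothetical paired scores (assumption): fathers' and mothers' education
fathers = [5, 7, 9, 6]
mothers = [4, 5, 8, 6]
t, df, d = paired_t(fathers, mothers)
print(f"t({df}) = {t:.2f}, d = {d:.2f}")

# Same arithmetic with the figures quoted in the interpretation above
reported_d = 0.59 / 2.1
reported_t = 0.59 / (2.1 / math.sqrt(73))
print(f"d = {reported_d:.2f}, t = {reported_t:.2f}")
```
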


Class Activity Session 8


Inferential Statistics
Inferential Statistics allow researchers to draw conclusions (inferences) from the data. There are several types of inferential statistics. The choice of statistic depends on the nature of the study. Covering the different procedures used is beyond the scope of this course. However, understanding why they are used is important. A researcher asks two groups of children to complete a personality test. The researcher then wants to know whether the males scored differently than the females on certain measures of personality. We will create a fictitious personality trait "Z." Here are the scores for the girls and the boys:

        Girls   Boys
        23      37
        40      56
        37      18
        41      41
        41      42
        33      38
        28      50
        25      22
        24      33
        13      47
        28      25
        44      46
Mean    31.4    37.9
SD      9.03    11.14

The mean score for the "Z" trait in boys was higher than the mean score for "Z" in the girls. But notice how within the two groups there was considerable fluctuation. By "chance" alone we might have obtained these different values. Thus, in order to conclude that "Z" shows a gender difference, we need to rule out that these differences were just a fluke. This is where inferential statistics come into play.

An important concept in inferential statistics is STATISTICAL SIGNIFICANCE. When an inferential statistic reveals a statistically significant result, the differences between the groups were unlikely to be due to chance. Thus, we can rule out chance with a certain degree of confidence. When the results of the inferential statistic are not statistically significant, chance could still be a reason why we obtained the observations that we did.

In the example above we would use an inferential statistic called a T-TEST. The t-test is used when we are comparing TWO groups. In this instance the t-test does not yield a statistically significant difference. In other words, the differences between the scores for the boys and the scores for the girls are not large enough for us to rule out chance as a possible explanation. We would have to conclude, then, that there is no evidence of a gender difference for our hypothetical "Z" trait.

Inferential statistics do not tell you whether your study is accurate or whether your findings are important. Statistics cannot make up for an ill-conceived study or theory. They simply assess whether we can rule out the first "extraneous" variable of all research, CHANCE.
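
The t-test described above can be run on the twelve girls' and twelve boys' scores listed in this activity. This is a minimal sketch, assuming an equal-variance independent t-test and the standard library only; independent_t is a hypothetical helper name, and the critical value 2.074 (two-tailed, 5%, df = 22) is an assumption taken from a t table.

```python
import math
import statistics

# Scores for the fictitious "Z" trait, as listed in the activity
girls = [23, 40, 37, 41, 41, 33, 28, 25, 24, 13, 28, 44]
boys = [37, 56, 18, 41, 42, 38, 50, 22, 33, 47, 25, 46]


def independent_t(a, b):
    """Equal-variance independent-samples t statistic and its df."""
    na, nb = len(a), len(b)
    pooled_var = (((na - 1) * statistics.variance(a)
                   + (nb - 1) * statistics.variance(b)) / (na + nb - 2))
    se = math.sqrt(pooled_var * (1 / na + 1 / nb))
    return (statistics.mean(a) - statistics.mean(b)) / se, na + nb - 2


# The means and SDs quoted in the activity; the quoted SDs match the
# population formula (divide by n), which statistics.pstdev implements
print(f"Girls: mean = {statistics.mean(girls):.1f}, "
      f"SD = {statistics.pstdev(girls):.2f}")
print(f"Boys:  mean = {statistics.mean(boys):.1f}, "
      f"SD = {statistics.pstdev(boys):.2f}")

t, df = independent_t(boys, girls)

# Assumption: two-tailed 5% critical value of t for df = 22, from a t table
T_CRITICAL = 2.074
significant = abs(t) > T_CRITICAL
print(f"t({df}) = {t:.2f}, significant at .05: {significant}")
```

The t value falls below the critical value, which agrees with the statement above that the difference between boys and girls is not statistically significant.
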


Final-Term Project Discussion & Lab Practice Session


Lab Practice Session


The students will be given a two-hour lab session to revise what they have learnt in the post-mid session. The objectives of this session are to provide students an opportunity to:
- Revise the whole course that they have learnt throughout the post-mid session
- Have hands-on practice in dealing with quantitative data using SPSS
- Share the problems that they confront during revision and get solutions
- Clarify any ambiguity regarding the understanding or application of any concept in QTB

Final Project Discussion


The students will be given a one-hour session to discuss the final draft of their final projects. The objectives of this session are to provide students an opportunity to:
- Share the problems that they confront during revision and get solutions
- Clarify any ambiguity regarding the understanding or application of any concept in QTB
- Get productive feedback on what they have done regarding their projects

The Drafts will be checked on the following criteria


The drafts will be checked to see whether the following components are covered:
a. Whether the survey is appropriately designed to collect the primary data
b. Whether the following components are appropriately discussed in the report:
   - Introduction: an explanation of the background and objectives of your work
   - Justification of the topic selection
   - Description of the data: definitions of the variables, conclusions about data quality, and so on
   - Justification of the methods you have chosen to analyze the data
   - Analysis and results: descriptive as well as inferential, with results
   - Conclusion: a discussion and interpretation of your results and a summary of what you have achieved

Length:

1500 to 2000 words

PRESENTATION on Final Term Project


Presentation
Students will be evaluated on the basis of the following criteria:
- Timing of presentation
- Clarity of concepts
- Structure of the presentation
- Quality of overheads, handouts, etc.
- Application of theory to practice
- Ability to answer questions effectively
