
14-1

Business Research Methods

Data Preparation

14-2

Data Preparation Process


Validation -> Editing -> Coding -> Data Entry -> Data Cleaning -> Tabulation & Analysis

14-3

Questionnaire Checking
A questionnaire returned from the field may be unacceptable for several reasons:
- Parts of the questionnaire are incomplete.
- The pattern of responses indicates that the respondent did not understand or follow the instructions.
- The responses show little variance.
- One or more pages are missing.
- The questionnaire is received after the pre-established cutoff date.
- The questionnaire is answered by someone who does not qualify for participation.

Validation and editing help prepare the data for data entry.

14-4

Validating
Validating is the process of ascertaining whether the interviews were conducted in compliance with the specified norms.
- It helps detect fraud or an interviewer's failure to follow the specified instructions.
- The questionnaire has a separate place to record the respondent's name, address, telephone number, and other demographic details, along with the date of the interview.
- These details are the basis for validation, which confirms that the interview was really conducted.

Editing

14-5

Editing is the process of checking for mistakes made by the interviewer or the respondent in filling out the questionnaire.
- It is a manual process that is generally done twice: first by the service firm that conducted the interviews, and second by the market research firm.
- The first check is generally done by the field supervisors in the field itself.

Problems to be checked in editing include:
- Whether the interviewer has followed the skip pattern
- Whether responses to open-ended questions have been properly obtained

Editing

14-6

During editing, some responses are found to be illegible, incomplete, inconsistent, or ambiguous. These are called unsatisfactory responses.

Treatment of Unsatisfactory Responses
- Returning to the Field: The questionnaires with unsatisfactory responses may be returned to the field, where the interviewers recontact the respondents.
- Assigning Missing Values: If returning the questionnaires to the field is not feasible, the editor may assign missing values to unsatisfactory responses.
- Discarding Unsatisfactory Respondents: In this approach, the respondents with unsatisfactory responses are simply discarded.

14-7

Coding
Coding is the process of assigning a symbol, usually a number, to each possible response to each question. Coding is necessary for efficient data analysis.

The categories into which responses are grouped for coding should be:
- Appropriate: If income is an important variable, a wide (coarse) income classification may not be appropriate.
- Exhaustive: The categories should list all possible alternatives.
- Mutually exclusive: A response should not fit into more than one category.
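The exhaustive and mutually exclusive requirements can be illustrated with a small sketch. The income bounds below are hypothetical and are not taken from the slides; the point is only that half-open brackets with an open-ended top category assign every non-negative income to exactly one code.

```python
import bisect

# Hypothetical bracket boundaries (illustrative only, not from the source).
# Code 1: [0, 20000), code 2: [20000, 40000),
# code 3: [40000, 60000), code 4: 60000 and above.
BOUNDS = [20000, 40000, 60000]


def income_code(income):
    """Return the single category code (1..4) that an income falls into."""
    if income < 0:
        raise ValueError("income must be non-negative")
    # bisect_right counts how many boundaries are <= income, so a value
    # sitting exactly on a boundary goes to the higher bracket only.
    return bisect.bisect_right(BOUNDS, income) + 1


if __name__ == "__main__":
    for value in (0, 19999, 20000, 60000):
        print(value, income_code(value))
```

Because a boundary value such as 20000 belongs to one bracket only, no response fits two categories, and the open-ended top bracket keeps the scheme exhaustive.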

14-8

Coding
- Coding closed-ended questions is easy, since there is a definite number of predetermined responses.
- Closed-ended questions are generally precoded, so the intermediate step of framing the codes before data entry can be avoided.
- Coding the data from open-ended questions is much more difficult, because the responses are unlimited and varied.

14-9

Coding
Guidelines for coding unstructured questions:

- Category codes should be mutually exclusive and collectively exhaustive.
- Only a few (10% or less) of the responses should fall into the "other" category.
- Category codes should be assigned for critical issues even if no one has mentioned them.
- Data should be coded to retain as much detail as possible.
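The 10% guideline for the "other" category is easy to check once responses have been coded. The following is a minimal sketch; the code 9 for "other" and the sample data are assumptions made for illustration.

```python
from collections import Counter

OTHER_CODE = 9  # hypothetical code used for the "other" category


def other_share(codes, other_code=OTHER_CODE):
    """Return the fraction of coded responses that fell into 'other'."""
    if not codes:
        raise ValueError("no coded responses")
    return Counter(codes)[other_code] / len(codes)


def other_within_limit(codes, limit=0.10, other_code=OTHER_CODE):
    """True when the 'other' share is at or below the guideline limit."""
    return other_share(codes, other_code) <= limit


if __name__ == "__main__":
    # Illustrative coded responses to one open-ended question.
    sample = [1, 1, 2, 3, 2, 1, 9, 2, 3, 1, 1, 2, 3, 1, 2, 1, 2, 3, 1, 1]
    print(other_share(sample), other_within_limit(sample))
```

If the share exceeds the limit, that suggests the category scheme needs more specific codes.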

14-10

Content Analysis for open ended questions


Content analysis is a qualitative technique used to analyze the text that respondents provide in answer to open-ended questions.
- It systematically and objectively derives categories of responses that represent homogeneous thoughts and opinions.
- It identifies responses that are particularly relevant to the survey.
- It requires the researcher to name categories through a detailed examination of the data (as against pre-coding).
- It is an iterative interpretation process: first reading the responses, then rereading them to establish meaningful categories.
- The number and meaning of the categories are further refined so that they are most representative of the respondents' text.
- Each response is classified into as many categories as necessary to capture the full picture.
- Responses that are out of context for the question are not coded.

Codebook

14-11

A codebook contains the coding rules and the necessary information about each variable in the survey. It generally contains the following information (example values in parentheses):
- Question number (3)
- Variable number (4)
- Variable name (Brand)
- Instructions for coding (1 = Amul, 2 = Cadbury, 3 = Nestle)
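A codebook entry like the one above maps naturally onto a lookup table. The sketch below uses the Brand example from the slide; the dictionary layout and the `encode` helper are illustrative assumptions, not a prescribed format.

```python
# One codebook entry per variable, keyed by variable name.
# Values come from the slide's example; the structure is an assumption.
CODEBOOK = {
    "Brand": {
        "question_number": 3,
        "variable_number": 4,
        "codes": {"Amul": 1, "Cadbury": 2, "Nestle": 3},
    }
}


def encode(variable, response):
    """Translate a raw response into its numeric code using the codebook."""
    codes = CODEBOOK[variable]["codes"]
    if response not in codes:
        # An unlisted response means the categories are not exhaustive
        # or the answer needs the editor's attention.
        raise ValueError(f"no code for {response!r} in variable {variable!r}")
    return codes[response]


if __name__ == "__main__":
    print(encode("Brand", "Cadbury"))
```

Raising an error on an unlisted response, rather than silently assigning a code, keeps coding decisions visible to the editor.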

Coding "Don't Know" Responses

14-12

- "Don't know" (DK) is included among the possible answers.
- Respondents choose it either because they genuinely do not know or because they do not want to answer.
- A considerable number of DK responses may be generated for some questions.
- The researcher can either ignore them or allocate their frequency to the other responses in the ratio in which those responses occur.

Example: "How many chocolates do you eat in a typical week?"
- Observed: 300 (<20) : 200 (>20) : 50 (DK)
- After allocating the 50 DK responses in the ratio 300:200 (30 and 20 respectively): 330 (<20) : 220 (>20)
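The proportional allocation in the chocolate example can be reproduced in a few lines. This is a sketch of that arithmetic only; the function name and the dictionary keys are illustrative.

```python
def reallocate_dk(counts, dk_key="DK"):
    """Spread the DK frequency over the other responses in proportion
    to how often those responses occur."""
    dk = counts.get(dk_key, 0)
    substantive = {k: v for k, v in counts.items() if k != dk_key}
    total = sum(substantive.values())
    if total == 0:
        raise ValueError("no substantive responses to allocate DK to")
    # Each category receives dk * (its share of the substantive total).
    return {k: v + dk * v / total for k, v in substantive.items()}


if __name__ == "__main__":
    observed = {"<20": 300, ">20": 200, "DK": 50}
    print(reallocate_dk(observed))  # {'<20': 330.0, '>20': 220.0}
```

The 50 DK responses are split 30 and 20, matching the 300:200 ratio of the substantive answers.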

Data Entry or Transcribing

14-13

- Data entry involves transferring coded data from questionnaires or coding sheets into computers through keypunching.
- Data collected through CATI or CAPI are entered directly into the computer.
- Besides keypunching, data can be transferred using optical scanning, mark sense forms, or computerised sensory analysis.
- Optical scanners can read responses on questionnaires: they read small darkened circles and process the marked answers. They are used to grade answer sheets in competitive exams and to transcribe UPC data at supermarket checkout counters.
- Mark sense forms require responses to be recorded with a special pencil in a predesignated area coded for that response. The data can then be read by a machine.
- Computerised sensory analysis automates the data collection process. Questions appear on a computerised gridpad, and responses are recorded directly into the computer using a sensing device.

14-14

Data Cleaning

Data cleaning is undertaken after data entry and includes:
- Consistency checks
- Treatment of missing values

Compared with the preliminary consistency checks during editing, checking at this stage is more thorough and extensive because it is done by computer.

14-15

Data Cleaning: Consistency Checks


Consistency checks identify data that are out of range, logically inconsistent, or have extreme values. Computer packages such as SPSS, SAS, Excel, and Minitab can be programmed to identify out-of-range values for each variable and print out the respondent code, variable code, variable name, record number, column number, and out-of-range value. Extreme values should be closely examined.
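The slides name statistical packages for this step; the same idea can be sketched in plain Python. The variable names, valid ranges, and record layout below are hypothetical and chosen only to show what an out-of-range report might contain.

```python
# Hypothetical valid ranges per variable (inclusive), e.g. a 1-5 scale.
VALID_RANGES = {"satisfaction": (1, 5), "age": (18, 99)}


def out_of_range(records, valid_ranges):
    """Return (respondent code, variable name, record number, value)
    for every value that lies outside its valid range."""
    problems = []
    for record_number, record in enumerate(records, start=1):
        for variable, (low, high) in valid_ranges.items():
            value = record.get(variable)
            if value is None:
                continue  # missing values are treated separately
            if not low <= value <= high:
                problems.append(
                    (record["respondent"], variable, record_number, value)
                )
    return problems


if __name__ == "__main__":
    data = [
        {"respondent": "R001", "satisfaction": 4, "age": 34},
        {"respondent": "R002", "satisfaction": 7, "age": 29},
        {"respondent": "R003", "satisfaction": 2, "age": 150},
    ]
    for problem in out_of_range(data, VALID_RANGES):
        print(problem)
```

Each flagged value can then be traced back to the original questionnaire through the respondent code and record number.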

14-16

Data Cleaning: Treatment of Missing Responses


- Substitute a Neutral Value: A neutral value, typically the mean response to the variable, is substituted for the missing responses.
- Substitute an Imputed Response: The respondent's pattern of responses to other questions is used to impute or calculate a suitable response to the missing questions.
- Casewise Deletion: Cases, or respondents, with any missing responses are discarded from the analysis.
- Pairwise Deletion: Instead of discarding all cases with any missing values, the researcher uses only the cases or respondents with complete responses for each calculation.
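Three of these treatments can be contrasted on a tiny data set. The sketch below assumes missing responses are stored as `None` and that each row holds one respondent's answers; the function names and data are illustrative. Imputation from other questions is omitted because it depends on the model chosen.

```python
from statistics import mean


def mean_substitute(values):
    """Replace missing responses with the mean of the observed ones."""
    observed = [v for v in values if v is not None]
    neutral = mean(observed)
    return [neutral if v is None else v for v in values]


def casewise_delete(rows):
    """Keep only respondents with no missing responses at all."""
    return [row for row in rows if all(v is not None for v in row)]


def pairwise_complete(rows, i, j):
    """For a calculation on variables i and j, keep respondents who
    answered both, even if other variables are missing."""
    return [
        (row[i], row[j])
        for row in rows
        if row[i] is not None and row[j] is not None
    ]


if __name__ == "__main__":
    rows = [(4, 5, 3), (2, None, 4), (5, 4, None), (3, 3, 2)]
    print(len(casewise_delete(rows)))          # 2 respondents remain
    print(len(pairwise_complete(rows, 0, 1)))  # 3 respondents for this pair
    print(len(pairwise_complete(rows, 0, 2)))  # 3 respondents for this pair
```

Pairwise deletion retains more data per calculation than casewise deletion, at the cost of different calculations resting on different sets of respondents.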

After data cleaning, the computer data file is deemed clean and ready for analysis.
