
Hallmark Business School www.hbs.ac.in

UNIT IV: Data Preparation and Analysis
Data Preparation: Data preparation includes editing, coding, and data entry. It is the activity that ensures the accuracy of the data and their conversion from raw form to reduced and classified forms that are more appropriate for analysis. Preparing a descriptive statistical summary is another preliminary step leading to an understanding of the collected data.
Editing, Coding, Data Entry: Editing detects errors and omissions, corrects them when possible, and certifies that maximum data quality standards are achieved. Types of editing: field editing and central editing. Coding involves assigning numbers or other symbols to answers so that the responses can be grouped into a limited number of categories. In coding, categories are the partitions of a data set of a given variable (e.g., if the variable is gender, the partitions are male and female). Categorization is the process of using rules to partition a body of data. Both closed- and open-response questions must be coded.
A codebook, or coding scheme, contains each variable in the study and specifies the application of coding rules to the variable. It is used by the researcher or research staff to promote more accurate and more efficient data entry or data analysis. It is also the definitive source for locating the positions of variables in the data file during analysis.
Coding rules: Four rules guide the precoding and postcoding and categorization of a data set. The categories within a single variable should be: 1) Appropriate to the research problem and purpose; 2) Exhaustive; 3) Mutually exclusive; 4) Derived from one classification dimension.
Content analysis follows a systematic process for coding and drawing inferences from texts. It starts by determining which units of data will be analyzed. Content analysis types: 1) Syntactical units can be words, phrases, sentences, or paragraphs; words are the smallest and most reliable data units to analyze; 2) Referential units are described by words, phrases, and sentences; they may be objects, events, persons, and so forth, to which a verbal or textual expression refers; 3) Propositional units are assertions about an object, event, person, and so on; 4) Thematic units are topics contained within (and across) texts; they represent higher-level abstractions inferred from the text and its context.
Missing data are information from a participant or case that is not available for one or more variables of interest. In survey studies, missing data typically occur when participants accidentally skip, refuse to answer, or do not know the answer to an item on the questionnaire.
Data entry converts information gathered by secondary or primary methods to a medium for viewing and manipulation. Keyboarding remains a mainstay for researchers who need to create a data file immediately and store it in a minimal space on a variety of media.
Validity of data: In general, validity is an indication of how sound your research is. More specifically, validity applies to both the design and the methods of your research. Validity in data collection means that your findings truly represent the phenomenon you are claiming to measure. Valid claims are solid claims.
Qualitative vs Quantitative data analyses: Read Exhibit 7-2.
Bivariate and Multivariate statistical techniques: Bivariate studies differ from univariate studies because they allow the researcher to analyze the relationship between two variables (often denoted as X, Y) in order to test simple hypotheses of association and causality. For example, if you wanted to know whether there is a relationship between the number of students in an engineering classroom (independent variable) and their grades in that subject (dependent variable), you would use bivariate analysis, since it measures two elements based on the observation of data. Four steps to conducting bivariate analysis: 1) Define the nature of the relationship; 2) Identify the type and direction of the relationship; 3) Determine if the relationship is statistically significant; 4) Identify the strength of the relationship.
Multivariate studies are similar to bivariate studies, but multivariate studies have more than one dependent variable. For example, if an advertiser wanted to examine the effectiveness of three different banner ads on a popular website, the advertiser could measure the ads' click rate for both men and women. Researchers could then use multivariate statistical analysis to examine the relationships between all of the variables. Multivariate analytical techniques represent a variety of mathematical models used to measure and quantify outcomes, taking into account important factors that can influence this relationship. The most popular is multiple regression analysis, which helps one understand how the typical value of the dependent variable changes when any one of the independent variables is varied while the other independent variables are held fixed. Other techniques include factor analysis, path analysis, and multivariate analysis of variance (MANOVA).
Factor analysis: It is a statistical tool that measures the impact of a few unobserved variables, called factors, on a large number of observed variables. It is used as a data reduction method. It may be used to uncover and establish the cause and effect relationship between variables or to confirm a hypothesis. It is often used to determine a linear relationship between variables before subjecting them to further analysis. Principal Factor Analysis is also called Common Factor Analysis, and it aims to identify the minimum number of factors that can account for the correlation between a given set of variables. Other types of Factor Analysis include Image factoring, Alpha factoring, Principal Component Analysis, and so on.
Discriminant analysis: It is a statistical tool whose objective is to assess the adequacy of a classification, given the group memberships, or to assign objects to one group among a number of groups. For any kind of Discriminant Analysis, some group assignments should be known beforehand. Discriminant Analysis is quite close to being a graphical version of MANOVA and is often used to complement the findings of Cluster Analysis and Principal Components Analysis. When Discriminant Analysis is used to separate two groups, it is called Discriminant Function Analysis (DFA); when there are more than two groups, the Canonical Variates Analysis (CVA) method is used.
Discriminant Analysis has various benefits as a statistical tool and is quite similar to regression analysis. It can be used to determine which predictor variables are related to the dependent variable and to predict the value of the dependent variable given certain values of the predictor variables. Discriminant Analysis is also widely used by marketers to create perceptual maps, and it has some benefits over other methods that use perceived distances: tests of significance can be used to check for dissimilarities among products, and the distance between two products is not affected by the other products included in the study. Discriminant Analysis is often used in combination with cluster analysis. Say the loans department of a bank wants to find out the creditworthiness of applicants before disbursing loans. It may use Discriminant Analysis to find out whether an applicant is a good credit risk or not.
Cluster analysis: It is a statistical tool used to classify objects into groups, such that the objects belonging to one group are much more similar to each other and rather different from objects belonging to other groups. It is generally used for exploratory data analysis and serves as a method of discovery by solving classification issues. 1) Hierarchical cluster analysis methods: In agglomerative methods, all objects start in separate clusters; similar objects are gradually combined, and this process is repeated until all objects are in a single cluster. Finally, the optimum number of clusters is chosen from among all the options. In divisive methods, all objects start in the same cluster and the reverse of the agglomerative method is used. 2) Non-hierarchical cluster analysis methods (also known as k-means clustering methods): These are generally used when large data sets are involved. Further, they provide the flexibility of moving a subject from one cluster to another.
The main benefit of Cluster Analysis is that it allows us to group similar data together. This helps us identify patterns between data elements. It reveals associations between data objects and helps to outline structure that might not have been apparent previously but that gives sense and meaning to the data once discovered. Once a clear structure emerges, decision making becomes easier.
Multiple regression and correlation: Multiple regression, described above, predicts a continuous dependent variable from two or more independent variables. When the dependent variable is categorical, logistic regression is used instead. Logistic regression measures the relationship between a categorical dependent variable and one or more independent variables (usually continuous) by estimating the probability of each category of the dependent variable. A categorical variable is a variable that can take values falling in a limited number of categories instead of being continuous. Logistic regression uses regression to predict the outcome of a categorical dependent variable on the basis of predictor variables. The probable outcomes of a single trial are modeled as a function of the explanatory variable using a logistic function. Logistic modeling is done on categorical data, which may be of various types, including binary and nominal. For example, a variable might be binary and have two possible categories, yes and no; or it may be nominal, such as hair color, which may be black, brown, red, gold, or grey.
Another objective of logistic regression is to check whether the probability of getting a particular value of the dependent variable is related to the independent variable. Multiple logistic regression is used when there is more than one independent variable under study. For example, logistic regression would help identify factors like product quality, service quality, brand image, reward programs, etc., that affect customers' loyalty and willingness to recommend a retail store's products to others. The results would help improve the store's performance on these parameters and increase customer loyalty.
Multidimensional scaling: Multidimensional scaling (MDS) is a means of visualizing the level of similarity of individual cases of a dataset. It refers to a set of related ordination techniques used in information visualization, in particular to display the information contained in a distance matrix. Steps: 1) Formulate the problem; 2) Obtain input data; 3) Run the MDS statistical program; 4) Decide the number of dimensions; 5) Map the results and define the dimensions; 6) Test the results for reliability and validity; 7) Report the results comprehensively. For example, in marketing, MDS is a statistical technique for taking the preferences and perceptions of respondents and representing them on a visual grid, called a perceptual map. By mapping multiple attributes and multiple brands at the same time, a greater understanding of the marketplace and of consumers' perceptions can be achieved than with a basic two-attribute perceptual map.
Application of statistical software for data analysis: The following statistical software packages and their features are used for data analysis. 1) SAS/STAT: SAS/STAT software is designed for both specialized and enterprise-wide analytical needs. It relies more on coding and less on menu-driven analysis. SAS/STAT software provides a comprehensive set of tools that can meet the data analysis needs of the entire organization. Features: ANOVA; mixed models (linear mixed, non-linear mixed, and general linear models); regression; categorical data analysis; Bayesian analysis; multivariate analysis; survival analysis; psychometric analysis; cluster analysis; nonparametric analysis; survey data analysis; multiple imputation for missing values. 2) SPSS: It is more menu driven and needs less coding. Features: analysing variables separately; comparing multiple variables; association between variables. 3) R: It is entirely code driven and gives access to the latest methods of data analysis. Almost any data analysis method can be carried out in R. Features: creating unique and beautiful data visualizations; getting better results faster; drawing on the talents of statisticians worldwide, who make method libraries available for free use.
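As one illustration of running a technique from this unit in code-driven software, the non-hierarchical (k-means) clustering described under cluster analysis can be sketched in a few lines of Python. This is only a minimal sketch under stated assumptions, not the implementation used by SAS, SPSS, or R: the function name k_means, the customer data, and the choice to seed the centroids with the first k points are all hypothetical and made up for the example.

```python
from math import dist  # Euclidean distance, available in Python 3.8+


def k_means(points, k, max_iter=100):
    """Group numeric tuples into k clusters; returns (labels, centroids)."""
    # Assumption: seed the centroids with the first k points so the run is
    # deterministic. The final grouping can depend on this choice.
    centroids = [tuple(p) for p in points[:k]]
    labels = [None] * len(points)
    for _ in range(max_iter):
        # Assignment step: each object joins the cluster whose centroid is
        # nearest. An object may move to another cluster on a later pass.
        new_labels = [
            min(range(k), key=lambda c: dist(p, centroids[c])) for p in points
        ]
        if new_labels == labels:  # nothing moved, so the clusters are stable
            break
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = tuple(
                    sum(axis) / len(members) for axis in zip(*members)
                )
    return labels, centroids


# Hypothetical data: (visits per month, average spend) for six customers.
customers = [(1, 2), (9, 8), (2, 1), (1, 1), (8, 9), (9, 9)]
labels, centroids = k_means(customers, k=2)
print(labels)     # cluster number assigned to each customer
print(centroids)  # centre of each cluster
```

With these seeds the three low-activity customers fall in one cluster and the three high-activity customers in the other. Different seeds can lead to a different grouping, which is why the result should be checked from more than one starting point.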
