
Mixture Modeling

Chongming Yang
Research Support Center
FHSS College

Mixture of Distributions


Classification Techniques
Latent Class Analysis (categorical indicators)
Latent Profile Analysis (continuous indicators)
Finite Mixture Modeling (multivariate normal variables)

Integrate Classification Models into Other Models

Mixture Factor Analysis
Mixture Regressions
Mixture Structural Equation Modeling
Growth Mixture Modeling
Multilevel Mixture Modeling

Disadvantages of Multi-step Practice

Multi-step practice
Run classification model
Save membership variable
Model membership variable and other variables

Disadvantages
Biases in parameter estimates
Biases in standard errors
Misleading significance tests and confidence intervals

Latent Class Analysis (LCA)

Setting
Latent trait assumed to be categorical
Trait measured with multiple categorical indicators
Examples: drug addiction, schizophrenia

Aim
Identify heterogeneous classes/groups
Estimate class probabilities
Identify good indicators of classes
Relate covariates to classes

Graphic LCA Model

Categorical indicators u: u1, u2, u3, ..., ur
Categorical latent variable C: C = 1, 2, ..., or K

Probabilistic Model
Assumption: the indicators u are conditionally independent given C, so that their interdependence is explained by C (as in a factor analysis model)

An item probability: P(u_j \mid c = k)

Joint probability of all indicators:

P(u_1, u_2, u_3, \ldots, u_r) = \sum_{k=1}^{K} P(c = k)\, P(u_1 \mid c = k)\, P(u_2 \mid c = k) \cdots P(u_r \mid c = k)
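As a hypothetical worked example (the numbers are invented for illustration, not from any dataset), suppose K = 2 classes with P(c = 1) = 0.6 and P(c = 2) = 0.4, and two binary indicators with P(u_1 = 1 \mid c = 1) = 0.8, P(u_2 = 1 \mid c = 1) = 0.7, P(u_1 = 1 \mid c = 2) = 0.2, and P(u_2 = 1 \mid c = 2) = 0.3. For a respondent endorsing both items:

P(u_1 = 1, u_2 = 1) = 0.6 \times 0.8 \times 0.7 + 0.4 \times 0.2 \times 0.3 = 0.336 + 0.024 = 0.360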

LCA Parameters
Class means (logits): number of classes − 1
Item probabilities (thresholds): number of item categories − 1 per item, within each class

Class Means (Logit)

Probability scale (logistic regression without any covariates x):

P(c = k) = \frac{e^{\alpha_k}}{\sum_{j=1}^{K} e^{\alpha_j}}

Logit scale: the class means are the \alpha_k
The mean of the last (highest-numbered) class is fixed: \alpha_K = 0
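As a hypothetical illustration of the conversion (values made up), with K = 2 and an estimated class mean \alpha_1 = 0.85 (\alpha_2 = 0 by the fixing above):

P(c = 1) = \frac{e^{0.85}}{e^{0.85} + e^{0}} \approx \frac{2.34}{3.34} \approx 0.70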

Latent Class Analysis with Covariates

Covariates x are related to class probability through multinomial logistic regression:

P(c_i = k \mid x_i) = \frac{e^{\alpha_{c_k} + \gamma_{c_k} x_i}}{\sum_{j=1}^{K} e^{\alpha_{c_j} + \gamma_{c_j} x_i}}
Posterior Probability
(membership/classification of cases)

P(c = k \mid u_1, u_2, \ldots, u_r) = \frac{P(c = k)\, P(u_1 \mid c = k)\, P(u_2 \mid c = k) \cdots P(u_r \mid c = k)}{P(u_1, u_2, \ldots, u_r)}
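Continuing the hypothetical numbers used earlier, the posterior probability that the respondent who endorsed both items belongs to class 1 is

P(c = 1 \mid u_1 = 1, u_2 = 1) = \frac{0.6 \times 0.8 \times 0.7}{0.360} = \frac{0.336}{0.360} \approx 0.93

so that case would be assigned to class 1.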

Estimation
Maximum likelihood estimation via the Expectation-Maximization (EM) algorithm
E (expectation) step: compute each case's posterior class probabilities, given the current class and item parameters
M (maximization) step: re-estimate the class and item parameters from those posterior probabilities
Iterate the E and M steps until the likelihood of the parameters is maximized
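For binary indicators, the standard EM updates take the following form (a sketch of the generic LCA recipe, not the specific implementation of any particular program), with \pi_k the class probabilities, \rho_{jk} the item probabilities, and p_{ik} the posterior probabilities:

E-step: p_{ik} = \frac{\hat\pi_k \prod_{j} \hat\rho_{jk}^{\,u_{ij}} (1 - \hat\rho_{jk})^{1 - u_{ij}}}{\sum_{m=1}^{K} \hat\pi_m \prod_{j} \hat\rho_{jm}^{\,u_{ij}} (1 - \hat\rho_{jm})^{1 - u_{ij}}}

M-step: \hat\pi_k = \frac{1}{N} \sum_{i} p_{ik}, \qquad \hat\rho_{jk} = \frac{\sum_{i} p_{ik}\, u_{ij}}{\sum_{i} p_{ik}}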

Test against Data

O = observed frequency of each response pattern
E = model-estimated (expected) frequency of each response pattern

Pearson chi-square:
\chi^2 = \sum \frac{(O - E)^2}{E}

Chi-square based on the likelihood ratio:
\chi^2_{LR} = 2 \sum O \log(O / E)

Determine Number of Classes
Substantive theory (parsimonious,
interpretable)
Predictive validity
Auxiliary variables / covariates
Statistical information and tests
Bayesian Information Criterion (BIC)
Entropy
Testing K against K-1 Classes
Vuong-Lo-Mendell-Rubin likelihood-ratio test
Bootstrapped likelihood ratio test

Bayesian Information Criterion (BIC)

BIC = -2\log(L) + h\ln(N)

L = likelihood
h = number of parameters
N = sample size
Choose the model with the smallest BIC
A BIC difference greater than 4 is considered appreciable
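As a hypothetical calculation (numbers invented for illustration): if \log(L) = -1200, h = 13, and N = 500, then

BIC = -2(-1200) + 13\ln(500) = 2400 + 13 \times 6.21 \approx 2480.8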

Quality of Classification
Entropy

Summarizes how sharply individuals are classified, based on their posterior class probabilities (roughly, how close each individual's highest class probability is to 1)
A value close to 1 indicates good classification
There is no clear cutoff for acceptance or rejection
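A commonly reported version (the relative entropy printed by Mplus) is computed from the posterior probabilities p_{ik} as

E_K = 1 - \frac{\sum_{i=1}^{N} \sum_{k=1}^{K} \left(-p_{ik} \ln p_{ik}\right)}{N \ln K}

which equals 1 when every case is classified with certainty.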

Testing K against K-1 Classes

Bootstrapped likelihood ratio test
LRT = 2[logL(model 1) − logL(model 2)], where model 2 (K-1 classes) is nested in model 1 (K classes)
Bootstrap steps:
1. Estimate the LRT comparing the two models
2. Use bootstrapped samples to obtain the distribution of the LRT
3. Compare the observed LRT with this distribution to get a p value

Testing K against K-1 Classes

Vuong-Lo-Mendell-Rubin likelihood-ratio test
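In Mplus, both tests can be requested in the OUTPUT command: TECH11 gives the Vuong-Lo-Mendell-Rubin likelihood-ratio test and TECH14 gives the bootstrapped likelihood ratio test, each comparing the K-class model with the K-1 class model:

OUTPUT:    TECH11 TECH14;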

Determine Quality of Indicators
Good indicators
Item response probability is close to 0 or 1 in each class

Bad indicators
Item response probability is high in more than one class, like a cross-loading item in factor analysis
Item response probability is low in all classes, like a low-loading item in factor analysis

LCA Examples
LCA
LCA with covariates
Class predicts a categorical outcome
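A minimal Mplus input for the basic LCA might look like the sketch below; the file name, variable names, and number of classes are hypothetical placeholders rather than the examples above. Class-enumeration output and saving class membership are shown on the following slides.

TITLE:     LCA with 3 classes (sketch; names are hypothetical)
DATA:      FILE = drugs.dat;
VARIABLE:  NAMES = id u1-u6 x;
           USEVARIABLES = u1-u6;
           CATEGORICAL = u1-u6;
           CLASSES = c (3);
           IDVARIABLE = id;
ANALYSIS:  TYPE = MIXTURE;
           STARTS = 100 20;    ! random starts to avoid local maxima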

Save Membership Variable

VARIABLE:  IDVARIABLE = id;
SAVEDATA:  FILE = cmmber.txt;
           SAVE = CPROBABILITIES;
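With SAVE = CPROBABILITIES, the saved file contains each case's ID, its posterior probability for every class, and its most likely class membership, which can then be merged with other data (subject to the multi-step caveats noted earlier).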

Latent Profile Analysis

Covariances of the continuous indicators are conditional on class K and fixed at zero (conditional independence)
Variances of the continuous indicators are constrained to be equal across classes and minimized within classes
Mean differences are maximized across classes

Finite Mixture Modeling
(multivariate normal variables)

Finite = finite number of subgroups/classes
Variables are normally distributed within each class
Means differ across classes
Variances are the same across classes
Covariances can differ across classes without restrictions, or be held equal with restrictions (see the sketch below)
Latent profile analysis can be seen as a special case with covariances fixed at zero
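A minimal sketch of how a class-specific covariance might be specified in Mplus (the file name, variable names, and number of classes are hypothetical):

DATA:      FILE = fmm.dat;
VARIABLE:  NAMES = y1-y4;
           CLASSES = c (2);
ANALYSIS:  TYPE = MIXTURE;
MODEL:     %OVERALL%
           y1 WITH y2;    ! covariance estimated in the overall model
           %c#1%
           y1 WITH y2;    ! mentioned again here so it can differ in class 1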

Mixture Factor Analysis

Allows one to examine the measurement properties of items in heterogeneous subgroups/classes
Measurement invariance is not required, given the assumed heterogeneity
Factor structure can change across classes
See Mplus outputs
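A minimal factor mixture sketch in Mplus (item names, class count, and the parameters freed across classes are hypothetical), allowing loadings and thresholds to differ between classes:

DATA:      FILE = fma.dat;
VARIABLE:  NAMES = u1-u7;
           CATEGORICAL = u1-u7;
           CLASSES = c (2);
ANALYSIS:  TYPE = MIXTURE;
MODEL:     %OVERALL%
           f BY u1-u7;
           %c#1%
           f BY u2-u7;       ! class-specific loadings (u1 anchors the metric)
           [u1$1-u7$1];      ! class-specific item thresholds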

Factor Mixture Analysis

Parental Control
Parents let you make your own decisions about the time you must be home on weekend nights
Parents let you make your own decisions about the people you hang around with
Parents let you make your own decisions about what you wear
Parents let you make your own decisions about which television programs you watch
Parents let you make your own decisions about what time you go to bed on week nights
Parents let you make your own decisions about what you eat
Parental Acceptance
Feel people in your family understand you
Feel you want to leave home

Feel you and your family have fun together


Feel that your family pays attention to you
Feel your parents care about you
Feel close to your mother
Feel close to your father

Two dimensions of Parenting

Mixture SEM
See mixture growth modeling

Mixture Modeling with Known Classes
Identify hidden classes within known
groups
Under nonrandomized experiments
Impose equality constraints on
covariates to identify similar classes
from known groups
Compare classes that differ in covariates
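One common way to set this up in Mplus is with the KNOWNCLASS option; the sketch below is hypothetical (grouping variable g, item names, file name, and class counts are placeholders):

DATA:      FILE = known.dat;
VARIABLE:  NAMES = g u1-u4;
           CATEGORICAL = u1-u4;
           CLASSES = cg (2) c (2);
           KNOWNCLASS = cg (g = 0  g = 1);
ANALYSIS:  TYPE = MIXTURE;
MODEL:     %OVERALL%
           c ON cg;    ! hidden class membership may differ across the known groups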
