The Stata Journal (2013)
13, Number 3, pp. 625–639
lclogit: A Stata command for fitting
latent-class conditional logit models via the
expectation-maximization algorithm
Daniele Pacifico
Italian Department of the Treasury
cielabdaniele.pacifico@tesoro.it

Hong il Yoo
Durham University
Durham, UK
@durham.ac.uk
Abstract. In this article, we describe lclogit, a Stata command for fitting a discrete-mixture or latent-class logit model via the expectation-maximization algorithm.
Keywords: st0312, lclogit, lclogitpr, lclogitcov, lclogitml, latent-class model, expectation-maximization algorithm, mixed logit
1 Introduction
Mixed logit or random parameter logit is used in many empirical applications to capture more realistic substitution patterns than traditional conditional logit. The random parameters are usually assumed to follow a normal distribution, and the resulting model is fit through simulated maximum likelihood, as in Hole's (2007) Stata command mixlogit. Several recent studies, however, note potential gains from specifying a discrete instead of normal mixing distribution, including the ability to approximate the true parameter distribution more flexibly at lower computational costs.¹
Pacifico (2012) implements the expectation-maximization (EM) algorithm for fitting a discrete-mixture logit model, also known as a latent-class logit (LCL) model, in Stata. As Bhat (1997) and Train (2008) emphasize, the EM algorithm is an attractive alternative to the usual (quasi-)Newton methods in the present context because it guarantees numerical stability and convergence to a local maximum even when the number of latent classes is large. In contrast, the usual optimization procedures often fail to achieve convergence because inversion of the (approximate) Hessian becomes numerically difficult.
With this contribution, we aim to generalize Pacifico's (2012) code with a Stata command that introduces a series of important functionalities and improves performance in terms of run time and stability.
1. For example, see Hess et al. (2011), Shen (2009), and Greene and Hensher (2003).
© 2013 StataCorp LP st0312
2 EM algorithm for LCL
This section recapitulates the EM algorithm for fitting an LCL model.² Suppose that each of $N$ agents faces, for notational simplicity, $J$ alternatives in each of $T$ choice scenarios.³ Let $y_{njt}$ denote a binary variable that equals 1 if agent $n$ chooses alternative $j$ in scenario $t$ and equals 0 otherwise. Each alternative is described by alternative-specific characteristics $x_{njt}$ and each agent by agent-specific characteristics, including a constant, $z_n$.
LCL assumes that there are $C$ distinct sets (or classes) of taste parameters, $\beta = (\beta_1, \beta_2, \ldots, \beta_C)$. If agent $n$ is in class $c$, the probability of observing his or her sequence of choices is a product of conditional logit formulas:
$$P_n(\beta_c) = \prod_{t=1}^{T} \prod_{j=1}^{J} \left\{ \frac{\exp(\beta_c x_{njt})}{\sum_{k=1}^{J} \exp(\beta_c x_{nkt})} \right\}^{y_{njt}} \qquad (1)$$
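Equation (1) can be illustrated with a minimal Python sketch. This is not lclogit's code; the function name `sequence_probability` and the nested-list data layout are assumptions made only for this example. The product is accumulated in logs to avoid underflow when T is large.

```python
import math

def sequence_probability(beta_c, x, y):
    """P_n(beta_c) from equation (1) for a single agent n.

    beta_c : list of K taste coefficients for class c
    x      : x[t][j] is the list of K attributes of alternative j in scenario t
    y      : y[t][j] is 1 if alternative j was chosen in scenario t, else 0
    (The layout and names are hypothetical, chosen for illustration.)
    """
    log_p = 0.0
    for x_t, y_t in zip(x, y):
        # Utility index beta_c * x_njt for every alternative in the scenario.
        v = [sum(b * a for b, a in zip(beta_c, x_j)) for x_j in x_t]
        # Log of the logit denominator, computed stably.
        v_max = max(v)
        log_denom = v_max + math.log(sum(math.exp(u - v_max) for u in v))
        # Only the chosen alternative (y_njt = 1) contributes to the product.
        log_p += sum(y_j * (u - log_denom) for u, y_j in zip(v, y_t))
    return math.exp(log_p)

# Toy agent: T = 2 scenarios, J = 2 alternatives, K = 1 attribute.
x_demo = [[[1.0], [0.0]], [[0.0], [1.0]]]
y_demo = [[1, 0], [0, 1]]
p_zero = sequence_probability([0.0], x_demo, y_demo)
```

With all coefficients at zero, every alternative is equally likely, so the toy agent's sequence probability is 0.5 × 0.5.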
Because the class membership status is unknown, the researcher needs to specify the unconditional likelihood of agent $n$'s choices, which equals the weighted average of (1) over classes. The weight for class $c$, $\pi_{cn}(\theta)$, is the population share of that class and is usually modeled as fractional multinomial logit,
$$\pi_{cn}(\theta) = \frac{\exp(\theta_c z_n)}{1 + \sum_{l=1}^{C-1} \exp(\theta_l z_n)} \qquad (2)$$
where $\theta = (\theta_1, \theta_2, \ldots, \theta_{C-1})$ are class membership model parameters; note that $\theta_C$ has been normalized to 0 for identification.
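As a rough illustration of the class-share formula (2), the Python sketch below computes the shares for one agent. The name `class_shares` and the list-based inputs are assumptions for this example only. The normalization of the last class is implemented by appending a zero index, which contributes the 1 in the denominator.

```python
import math

def class_shares(theta, z_n):
    """Class shares pi_cn(theta), c = 1..C, for one agent.

    theta : list of C-1 coefficient vectors (theta_C is normalized to 0)
    z_n   : agent characteristics, including the constant
    (Names and layout are hypothetical, for illustration only.)
    """
    index = [sum(t * z for t, z in zip(theta_c, z_n)) for theta_c in theta]
    index.append(0.0)  # class C: theta_C * z_n = 0, so exp(.) = 1
    m = max(index)     # subtract the maximum for numerical stability
    expv = [math.exp(v - m) for v in index]
    total = sum(expv)
    return [e / total for e in expv]

# Constant-only membership model with C = 2 and theta_1 = ln(2).
shares_demo = class_shares([[math.log(2.0)]], [1.0])
```

In the toy call, class 1 has twice the odds of class 2, so the shares are 2/3 and 1/3.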
The sample log likelihood is then obtained by summing each agent's log unconditional likelihood:
$$\ln L(\beta, \theta) = \sum_{n=1}^{N} \ln \sum_{c=1}^{C} \pi_{cn}(\theta) P_n(\beta_c) \qquad (3)$$
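The sample log likelihood (3) can be sketched in Python as follows. This is an illustrative sketch, not the command's implementation; `sample_log_likelihood` and its inputs are hypothetical names. It takes log sequence probabilities as input and uses a log-sum-exp step, assuming that working in logs is preferable when the sequence probabilities are very small.

```python
import math

def sample_log_likelihood(shares, log_p):
    """ln L from equation (3).

    shares : shares[n][c] = pi_cn(theta)
    log_p  : log_p[n][c]  = ln P_n(beta_c)
    (Hypothetical layout, for illustration only.)
    """
    total = 0.0
    for pi_n, lp_n in zip(shares, log_p):
        # ln(pi_cn * P_n(beta_c)) for each class with a positive share.
        terms = [math.log(pi) + lp for pi, lp in zip(pi_n, lp_n) if pi > 0.0]
        m = max(terms)
        # ln sum_c exp(term_c), computed without leaving the log scale.
        total += m + math.log(sum(math.exp(t - m) for t in terms))
    return total

# One agent, two classes: 0.5 * 0.2 + 0.5 * 0.4 = 0.3.
ll_demo = sample_log_likelihood([[0.5, 0.5]], [[math.log(0.2), math.log(0.4)]])
```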
Bhat (1997) and Train (2008) note numerical difficulties associated with maximizing (3) directly. They show that $\beta$ and $\theta$ can be more conveniently estimated via a well-known EM algorithm for likelihood maximization in the presence of incomplete data, treating each agent's class membership status as the missing information. Let superscript $s$ denote the estimates obtained at the $s$th iteration of this algorithm. Then at iteration $s+1$, the estimates are updated as
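$$\beta^{s+1} = \operatorname{argmax}_{\beta} \sum_{n=1}^{N} \sum_{c=1}^{C} \eta_{cn}(\beta^s, \theta^s) \ln P_n(\beta_c)$$
$$\theta^{s+1} = \operatorname{argmax}_{\theta} \sum_{n=1}^{N} \sum_{c=1}^{C} \eta_{cn}(\beta^s, \theta^s) \ln \pi_{cn}(\theta)$$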
2. Further details are available in Bhat (1997) and Train (2008).
3. lclogit is also applicable when the number of scenarios varies across agents, and the number of alternatives varies both across agents and over scenarios.
where $\eta_{cn}(\beta^s, \theta^s)$ is the posterior probability that agent $n$ is in class $c$ evaluated at the $s$th estimates:
$$\eta_{cn}(\beta^s, \theta^s) = \frac{\pi_{cn}(\theta^s) P_n(\beta^s_c)}{\sum_{l=1}^{C} \pi_{ln}(\theta^s) P_n(\beta^s_l)} \qquad (4)$$
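The posterior probabilities in (4) are an application of Bayes' rule, which the short Python sketch below makes explicit. The function name `posterior_probabilities` and its arguments are hypothetical; the sketch assumes the class shares and sequence probabilities for one agent have already been computed.

```python
def posterior_probabilities(shares_n, p_n):
    """eta_cn from equation (4) for one agent.

    shares_n : shares_n[c] = pi_cn(theta^s), the prior class shares
    p_n      : p_n[c]      = P_n(beta^s_c), the sequence probability in class c
    (Hypothetical names, for illustration only.)
    """
    weighted = [pi * p for pi, p in zip(shares_n, p_n)]
    total = sum(weighted)  # the agent's unconditional likelihood
    return [w / total for w in weighted]

# Equal prior shares; the choices are three times as likely under class 2.
eta_demo = posterior_probabilities([0.5, 0.5], [0.2, 0.6])
```

In the toy call, the posterior moves from (0.5, 0.5) to (0.25, 0.75) because the observed choices are better explained by class 2.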
The updating procedure can be implemented easily in Stata, exploiting clogit and fmlogit routines as follows.⁴ $\beta^{s+1}$ is computed by fitting a conditional logit model (clogit) $C$ times, each time using $\eta_{cn}(\beta^s, \theta^s)$ for a particular $c$ to weight observations on each $n$. $\theta^{s+1}$ is obtained by fitting a fractional multinomial logit model (fmlogit) that takes $\eta_{1n}(\beta^s, \theta^s), \eta_{2n}(\beta^s, \theta^s), \ldots, \eta_{Cn}(\beta^s, \theta^s)$ as dependent variables. When $z_n$ only includes the constant term so that each class share is the same for all agents, that is, when $\pi_{cn}(\theta) = \pi_c(\theta)$, each class share can be directly updated by using the following analytical solution without fitting the fractional multinomial logit model:
$$\pi_c(\theta^{s+1}) = \frac{\sum_{n=1}^{N} \eta_{cn}(\beta^s, \theta^s)}{\sum_{l=1}^{C} \sum_{n=1}^{N} \eta_{ln}(\beta^s, \theta^s)} \qquad (5)$$
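The analytical update (5) amounts to averaging the posterior probabilities over agents. A minimal Python sketch follows, with the hypothetical name `update_class_shares` and a list-of-lists input assumed for illustration.

```python
def update_class_shares(eta):
    """Class shares pi_c(theta^{s+1}) from equation (5).

    eta : eta[n][c] is the posterior probability that agent n is in class c
    (Hypothetical layout, for illustration only.)
    """
    n_classes = len(eta[0])
    # Numerator of (5): sum of posteriors over agents, class by class.
    column_sums = [sum(row[c] for row in eta) for c in range(n_classes)]
    # Denominator of (5): the same sums added up over all classes.
    total = sum(column_sums)
    return [s / total for s in column_sums]

# Three agents, two classes.
pi_demo = update_class_shares([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
```

Because each agent's posteriors sum to 1, the denominator equals N, so the toy call returns shares of 2/3 and 1/3.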
With a suitable selection of starting values, the updating procedure can be repeated until changes in the estimates and improvement in the log likelihood between iterations are small enough.
An often-highlighted feature of LCL is its ability to accommodate unobserved interpersonal taste variation without restricting the shape of the underlying taste distribution. Hess et al. (2011) have recently emphasized that LCL also provides a convenient means to account for observed interpersonal heterogeneity in correlations among tastes for different attributes. For example, let $\beta_q$ and $\beta_h$ denote taste coefficients on the $q$th and $h$th attributes, respectively. Each coefficient may take one of $C$ distinct values and is a random parameter from the researcher's perspective. Their covariance is given by
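$$\operatorname{cov}_n(\beta_q, \beta_h) = \sum_{c=1}^{C} \pi_{cn}(\theta) \beta_{c,q} \beta_{c,h} - \left\{ \sum_{c=1}^{C} \pi_{cn}(\theta) \beta_{c,q} \right\} \left\{ \sum_{c=1}^{C} \pi_{cn}(\theta) \beta_{c,h} \right\}$$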
where $\beta_{c,q}$ is the value of $\beta_q$ when agent $n$ is in class $c$, and $\beta_{c,h}$ is defined similarly. As long as $z_n$ in (2) includes a nonconstant variable, this covariance will vary across agents with different observed characteristics through the variation in $\pi_{cn}(\theta)$.
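The covariance formula can be checked numerically with a short Python sketch. The function `taste_covariance` and the toy coefficient values are assumptions for illustration; they are not estimates from any model.

```python
def taste_covariance(shares_n, beta, q, h):
    """cov_n(beta_q, beta_h) for one agent.

    shares_n : shares_n[c] = pi_cn(theta)
    beta     : beta[c][k] is the coefficient on attribute k in class c
    q, h     : zero-based attribute positions
    (Hypothetical names, for illustration only.)
    """
    mean_q = sum(pi * b[q] for pi, b in zip(shares_n, beta))
    mean_h = sum(pi * b[h] for pi, b in zip(shares_n, beta))
    cross = sum(pi * b[q] * b[h] for pi, b in zip(shares_n, beta))
    return cross - mean_q * mean_h

# Two classes with made-up coefficients on two attributes.
beta_demo = [[1.0, 2.0], [3.0, 6.0]]
cov_even = taste_covariance([0.5, 0.5], beta_demo, 0, 1)
cov_skew = taste_covariance([0.9, 0.1], beta_demo, 0, 1)
```

The two calls use the same class-specific coefficients but different class shares, which mimics two agents with different observed characteristics: the covariance differs between them only through the shares.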
3 The lclogit command
lclogit is a Stata command that implements the EM iterative scheme outlined in the previous section. This command generalizes Pacifico's (2012) step-by-step procedure and introduces an improved internal loop along with other important functionalities. The overall effect is to make the estimation process more convenient, significantly faster, and more stable numerically. For example, the internal code of lclogit executes fewer algebraic operations per iteration to update the estimates; uses the standard generate command to perform tasks that were previously executed with slightly slower egen functions; and, when possible, works with log probabilities instead of probabilities. All of these changes substantially reduce the estimation run time, especially in the presence of a large number of parameters and observations. If we take the 8-class model fit by Pacifico (2012) as an example, lclogit produces the same results as the step-by-step procedure while taking less than one-half of the run time.
4. fmlogit is a user-written program. See footnote 5 for a further description.
The data setup for lclogit is identical to that required by clogit.
3.1 Syntax
The generic syntax for lclogit is
lclogit depvar [indepvars] [if] [in], group(varname) id(varname) nclasses(#) [membership(varlist) convergence(#) iterate(#) seed(#) constraints(Class# numlist: Class# numlist: ...) nolog]
3.2 Options
group(varname) specifies a numeric identifier variable for the choice scenarios. group() is required.
id(varname) specifies a numeric identifier variable for the choice makers or agents. With cross-section data, users should specify the same variable for both the group() and the id() options. id() is required.
nclasses(#) specifies the number of latent classes used in the estimation. A minimum of two latent classes is required. nclasses() is required.
membership(varlist) specifies independent variables to enter the fractional multinomial logit model of class membership, that is, the variables included in the vector $z_n$ of (2). These variables must be constant within the same agent as identified by id().⁵ When this option is not specified, the class shares are updated algebraically following (5).
convergence(#) specifies the tolerance for the log likelihood. When the proportional increase in the log likelihood over the last five iterations is less than the specified criterion, lclogit declares convergence. The default is convergence(0.00001).
5. Pacifico (2012) specified an ml program with the method lf to fit the class membership model. lclogit uses another user-written program from Buis (2008), fmlogit, which performs the same estimation with the significantly faster and more accurate d2 method. lclogit is downloaded with a modified version of the prediction command of fmlogit, fmlogit_pr, because we had to modify this command to obtain double-precision class shares.
iterate(#) specifies the maximum number of iterations. If convergence is not achieved after the selected number of iterations, lclogit stops the recursion and notes this fact before displaying the estimation results. The default is iterate(150).
seed(#) sets the seed for pseudouniform random numbers. The default is the creturn value c(seed).
The starting values for taste parameters are obtained by splitting the sample into nclasses() different subsamples and fitting a clogit model for each of them. During this process, a pseudouniform random number is generated for each agent to assign the agent into a particular subsample.⁶ As for the starting values for the class shares, lclogit uses equal shares, that is, 1/nclasses().
constraints(Class# numlist: Class# numlist: ...) specifies the constraints that are imposed on the taste parameters of the designated classes, that is, $\beta_c$ in (1). For instance, suppose that x1 and x2 are alternative-specific characteristics included in indepvars for lclogit and that the user wishes to restrict the coefficient on x1 to 0 for Class1 and Class4 and the coefficient on x2 to 2 for Class4. Then the relevant series of commands would look like this:
    constraint 1 x1 = 0
    constraint 2 x2 = 2
    lclogit depvar indepvars, group(varname) id(varname) ///
        nclasses(8) constraints(Class1 1: Class4 1 2)
nolog suppresses the display of the iteration log.
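The stopping rule described under convergence() can be sketched in Python as below. The text states only that the proportional increase in the log likelihood over the last five iterations is compared with the tolerance; the exact formula used here (the absolute change relative to the value five iterations earlier) and the name `has_converged` are assumptions made for illustration.

```python
def has_converged(loglik_history, tolerance=0.00001, window=5):
    """Return True when the proportional change in the log likelihood over
    the last `window` iterations falls below `tolerance`.

    loglik_history : log-likelihood values, one per completed iteration
    (The formula is an assumed reading of the convergence() description and
    presumes a nonzero log likelihood.)
    """
    if len(loglik_history) <= window:
        return False  # not enough iterations to look back over the window
    old = loglik_history[-1 - window]
    new = loglik_history[-1]
    return abs((new - old) / old) < tolerance

# Made-up log-likelihood paths.
still_improving = [-1000.0, -900.0, -850.0, -849.0, -848.9, -848.89]
flat = [-848.89] * 6
```

An iteration cap such as the one set by iterate() would sit around this check in the outer loop, so that the recursion stops even if the criterion is never met.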
4 Postestimation command: lclogitpr
lclogitpr predicts the probabilities of choosing each alternative in a choice situation (choice probabilities hereafter), the class shares or prior probabilities of class membership, and the posterior probabilities of class membership. The predicted probabilities are stored in a variable named stubname#, where # refers to the relevant class number; the only exception is the unconditional choice probability, which is stored in a variable named stubname.
4.1 Syntax
The syntax for lclogitpr is
lclogitpr stubname [if] [in] [, class(numlist) pr0 pr up cp]
6. More specifically, the unit interval is divided into nclasses() equal parts, and if the agent's pseudorandom draw is in the $c$th part, the agent is allocated to the subsample whose clogit results serve as the initial estimates of class $c$'s taste parameters. Note that lclogit behaves like asmprobit in that the current seed, as at the beginning of the command's execution, is restored once all necessary pseudorandom draws have been made.
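The allocation rule in footnote 6 can be sketched in Python as follows. This is an illustration under stated assumptions, not lclogit's code: the function name `assign_start_subsamples` is hypothetical, and a private random generator stands in for Stata's seed handling so that the global random state is left untouched, loosely mirroring the seed restoration described in the footnote.

```python
import random

def assign_start_subsamples(agent_ids, n_classes, seed):
    """Allocate each agent to one of n_classes subsamples.

    One uniform draw on [0, 1) is made per agent; the unit interval is cut
    into n_classes equal parts, and a draw in the c-th part sends the agent
    to subsample c (numbered from 1).
    (Hypothetical helper, for illustration only.)
    """
    rng = random.Random(seed)  # private generator; global state is unaffected
    assignment = {}
    for agent in agent_ids:
        u = rng.random()
        assignment[agent] = int(u * n_classes) + 1
    return assignment

# Ten made-up agent identifiers split into three starting subsamples.
start_demo = assign_start_subsamples(range(1, 11), 3, seed=12345)
```

A conditional logit model would then be fit on each subsample to obtain that class's starting values, with the class shares started at 1/nclasses().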