lclogit latent-class logit model tutorial

The Stata Journal (2013) 13, Number 3, pp. 625–639

lclogit: A Stata command for fitting latent-class conditional logit models via the expectation-maximization algorithm

Daniele Pacifico
Italian Department of the Treasury
Rome, Italy
daniele.pacifico@tesoro.it

Hong il Yoo
Durham University
Durham, UK
@durham.ac.uk

Abstract. In this article, we describe lclogit, a Stata command for fitting a discrete-mixture or latent-class logit model via the expectation-maximization algorithm.

Keywords: st0312, lclogit, lclogitpr, lclogitcov, lclogitml, latent-class model, expectation-maximization algorithm, mixed logit

1 Introduction

Mixed logit or random parameter logit is used in many empirical applications to capture more realistic substitution patterns than traditional conditional logit. The random parameters are usually assumed to follow a normal distribution, and the resulting model is fit through simulated maximum likelihood, as in Hole's (2007) Stata command mixlogit. Several recent studies, however, note potential gains from specifying a discrete instead of normal mixing distribution, including the ability to approximate the true parameter distribution more flexibly at lower computational costs.¹

Pacifico (2012) implements the expectation-maximization (EM) algorithm for fitting a discrete-mixture logit model, also known as a latent-class logit (LCL) model, in Stata. As Bhat (1997) and Train (2008) emphasize, the EM algorithm is an attractive alternative to the usual (quasi-)Newton methods in the present context because it guarantees numerical stability and convergence to a local maximum even when the number of latent classes is large. In contrast, the usual optimization procedures often fail to achieve convergence because inversion of the (approximate) Hessian becomes numerically difficult.

With this contribution, we aim to generalize Pacifico's (2012) code with a Stata command that introduces a series of important functionalities and improves performance in terms of run time and stability.

1. For example, see Hess et al. (2011), Shen (2009), and Greene and Hensher (2003).

© 2013 StataCorp LP st0312

2 EM algorithm for LCL

This section recapitulates the EM algorithm for fitting an LCL model.² Suppose that each of $N$ agents faces, for notational simplicity, $J$ alternatives in each of $T$ choice scenarios.³ Let $y_{njt}$ denote a binary variable that equals 1 if agent $n$ chooses alternative $j$ in scenario $t$ and equals 0 otherwise. Each alternative is described by alternative-specific characteristics $x_{njt}$ and each agent by agent-specific characteristics, including a constant, $z_n$.

LCL assumes that there are $C$ distinct sets (or classes) of taste parameters, $\beta = (\beta_1, \beta_2, \ldots, \beta_C)$. If agent $n$ is in class $c$, the probability of observing his or her sequence of choices is a product of conditional logit formulas:

$$P_n(\beta_c) = \prod_{t=1}^{T} \prod_{j=1}^{J} \left( \frac{\exp(\beta_c x_{njt})}{\sum_{k=1}^{J} \exp(\beta_c x_{nkt})} \right)^{y_{njt}} \tag{1}$$
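Equation (1) can be evaluated directly for a single agent. The following is a minimal Python sketch rather than anything taken from lclogit itself; the function name `sequence_probability`, the nested-list data layout, and the toy data are assumptions made for illustration. It accumulates log probabilities and exponentiates at the end, which avoids underflow when $T$ is large.

```python
import math

def sequence_probability(beta_c, x, y):
    """P_n(beta_c) in (1) for one agent (illustrative helper, not part of lclogit).

    x[t][j] is the attribute list of alternative j in scenario t;
    y[t][j] is 1 if alternative j was chosen in scenario t, else 0.
    """
    log_p = 0.0
    for x_t, y_t in zip(x, y):
        # index beta_c * x_njt for every alternative in this scenario
        v = [sum(b * a for b, a in zip(beta_c, x_tj)) for x_tj in x_t]
        v_max = max(v)  # subtract the maximum for numerical stability
        log_denom = v_max + math.log(sum(math.exp(u - v_max) for u in v))
        # only the chosen alternative (y = 1) contributes to the product
        log_p += sum(y_tj * (v_tj - log_denom) for v_tj, y_tj in zip(v, y_t))
    return math.exp(log_p)

# Hypothetical data: one agent, T = 2 scenarios, J = 2 alternatives, 1 attribute
x = [[[1.0], [0.0]], [[0.0], [1.0]]]
y = [[1, 0], [1, 0]]
p = sequence_probability([0.5], x, y)
```

With a zero coefficient every alternative is equally likely, so the same toy data give a sequence probability of 0.25.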

Because the class membership status is unknown, the researcher needs to specify the unconditional likelihood of agent $n$'s choices, which equals the weighted average of (1) over classes. The weight for class $c$, $\pi_{cn}(\theta)$, is the population share of that class and is usually modeled as fractional multinomial logit,

$$\pi_{cn}(\theta) = \frac{\exp(\theta_c z_n)}{1 + \sum_{l=1}^{C-1} \exp(\theta_l z_n)} \tag{2}$$

where $\theta = (\theta_1, \theta_2, \ldots, \theta_{C-1})$ are class membership model parameters; note that $\theta_C$ has been normalized to 0 for identification.
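The class shares in (2) amount to a softmax in which the last class has its index fixed at zero. A minimal Python sketch under that reading follows; `class_shares` and the example coefficients are hypothetical names and values chosen for illustration only.

```python
import math

def class_shares(theta, z_n):
    """Class shares pi_cn(theta) in (2) for one agent (illustrative helper).

    theta holds C-1 coefficient lists; class C is the base class with
    theta_C = 0, so its index is 0 and its exponential is 1.
    """
    idx = [sum(t * z for t, z in zip(theta_c, z_n)) for theta_c in theta]
    idx.append(0.0)  # normalized class C
    m = max(idx)     # subtract the maximum for numerical stability
    w = [math.exp(i - m) for i in idx]
    total = sum(w)
    return [w_c / total for w_c in w]

# Hypothetical example: C = 3 classes, z_n = (constant, one covariate)
shares = class_shares([[0.2, 0.5], [-0.1, 0.0]], [1.0, 2.0])
```

The shares sum to 1 by construction, and the ratio of any class share to the base class share equals the exponential of that class's index.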

The sample log likelihood is then obtained by summing each agent's log unconditional likelihood:

$$\ln L(\beta, \theta) = \sum_{n=1}^{N} \ln \sum_{c=1}^{C} \pi_{cn}(\theta) P_n(\beta_c) \tag{3}$$

Bhat (1997) and Train (2008) note numerical difficulties associated with maximizing (3) directly. They show that $\beta$ and $\theta$ can be more conveniently estimated via a well-known EM algorithm for likelihood maximization in the presence of incomplete data, treating each agent's class membership status as the missing information. Let superscript $s$ denote the estimates obtained at the $s$th iteration of this algorithm. Then at iteration $s+1$, the estimates are updated as

$$\beta^{s+1} = \operatorname{argmax}_{\beta} \sum_{n=1}^{N} \sum_{c=1}^{C} \eta_{cn}(\beta^s, \theta^s) \ln P_n(\beta_c)$$

$$\theta^{s+1} = \operatorname{argmax}_{\theta} \sum_{n=1}^{N} \sum_{c=1}^{C} \eta_{cn}(\beta^s, \theta^s) \ln \pi_{cn}(\theta)$$

2. Further details are available in Bhat (1997) and Train (2008).

3. lclogit is also applicable when the number of scenarios varies across agents, and the number of alternatives varies both across agents and over scenarios.

where $\eta_{cn}(\beta^s, \theta^s)$ is the posterior probability that agent $n$ is in class $c$ evaluated at the $s$th estimates:

$$\eta_{cn}(\beta^s, \theta^s) = \frac{\pi_{cn}(\theta^s) P_n(\beta_c^s)}{\sum_{l=1}^{C} \pi_{ln}(\theta^s) P_n(\beta_l^s)} \tag{4}$$
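The posterior in (4) is Bayes' rule applied to the class shares and the class-specific sequence probabilities. A minimal Python sketch for one agent is given below; the function name `posterior` and the numbers are illustrative assumptions.

```python
def posterior(pi_n, p_n):
    """Posterior class probabilities eta_cn in (4) for one agent.

    pi_n[c] is the prior share of class c; p_n[c] is P_n(beta_c).
    """
    joint = [share * prob for share, prob in zip(pi_n, p_n)]
    total = sum(joint)  # the agent's unconditional likelihood
    return [j / total for j in joint]

# Hypothetical agent: equal priors, choices three times as likely under class 2
eta = posterior([0.5, 0.5], [0.2, 0.6])  # approximately [0.25, 0.75]
```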

The updating procedure can be implemented easily in Stata, exploiting clogit and fmlogit routines as follows.⁴ $\beta^{s+1}$ is computed by fitting a conditional logit model (clogit) $C$ times, each time using $\eta_{cn}(\beta^s, \theta^s)$ for a particular $c$ to weight observations on each $n$. $\theta^{s+1}$ is obtained by fitting a fractional multinomial logit model (fmlogit) that takes $\eta_{1n}(\beta^s, \theta^s), \eta_{2n}(\beta^s, \theta^s), \ldots, \eta_{Cn}(\beta^s, \theta^s)$ as dependent variables. When $z_n$ only includes the constant term so that each class share is the same for all agents, that is, when $\pi_{cn}(\theta) = \pi_c(\theta)$, each class share can be directly updated by using the following analytical solution without fitting the fractional multinomial logit model:

$$\pi_c(\theta^{s+1}) = \frac{\sum_{n=1}^{N} \eta_{cn}(\beta^s, \theta^s)}{\sum_{l=1}^{C} \sum_{n=1}^{N} \eta_{ln}(\beta^s, \theta^s)} \tag{5}$$

With a suitable selection of starting values, the updating procedure can be repeated until changes in the estimates and improvement in the log likelihood between iterations are small enough.
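To make the iteration concrete, the sketch below runs the E-step in (4) and the analytical share update in (5) on a toy problem. It is a simplification under stated assumptions, not the lclogit implementation: the taste parameters are held fixed, so the sequence probabilities `p[n][c]` are treated as given and only the class shares are updated, and the stopping rule is a plain absolute change in the log likelihood. The function name `em_class_shares` and the data are hypothetical.

```python
import math

def em_class_shares(p, n_iter=50, tol=1e-10):
    """EM for constant class shares, holding P_n(beta_c) fixed (illustrative).

    p[n][c] is agent n's sequence probability under class c.
    Returns the final shares and the log-likelihood path.
    """
    n_classes = len(p[0])
    shares = [1.0 / n_classes] * n_classes  # equal starting shares
    path = []
    for _ in range(n_iter):
        # E-step: posterior class probabilities, as in (4)
        mix = [sum(s * p_nc for s, p_nc in zip(shares, p_n)) for p_n in p]
        eta = [[s * p_nc / m for s, p_nc in zip(shares, p_n)]
               for p_n, m in zip(p, mix)]
        path.append(sum(math.log(m) for m in mix))  # log likelihood, as in (3)
        # M-step: analytical share update, as in (5)
        col = [sum(e[c] for e in eta) for c in range(n_classes)]
        shares = [c_sum / sum(col) for c_sum in col]
        # assumed stopping rule: small absolute gain in the log likelihood
        if len(path) > 1 and path[-1] - path[-2] < tol:
            break
    return shares, path

# Hypothetical sequence probabilities for N = 3 agents and C = 2 classes
p = [[0.9, 0.1], [0.8, 0.3], [0.2, 0.7]]
shares, path = em_class_shares(p)
```

The recorded log-likelihood path never decreases from one iteration to the next, which is the monotonicity property that makes EM attractive in this setting.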

An often-highlighted feature of LCL is its ability to accommodate unobserved interpersonal taste variation without restricting the shape of the underlying taste distribution. Hess et al. (2011) have recently emphasized that LCL also provides a convenient means to account for observed interpersonal heterogeneity in correlations among tastes for different attributes. For example, let $\beta_q$ and $\beta_h$ denote taste coefficients on the $q$th and $h$th attributes, respectively. Each coefficient may take one of $C$ distinct values and is a random parameter from the researcher's perspective. Their covariance is given by

$$\operatorname{cov}_n(\beta_q, \beta_h) = \sum_{c=1}^{C} \pi_{cn}(\theta) \beta_{c,q} \beta_{c,h} - \left( \sum_{c=1}^{C} \pi_{cn}(\theta) \beta_{c,q} \right) \left( \sum_{c=1}^{C} \pi_{cn}(\theta) \beta_{c,h} \right) \tag{6}$$

where $\beta_{c,q}$ is the value of $\beta_q$ when agent $n$ is in class $c$, and $\beta_{c,h}$ is defined similarly. As long as $z_n$ in (2) includes a nonconstant variable, this covariance will vary across agents with different observed characteristics through the variation in $\pi_{cn}(\theta)$.
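The covariance in (6) is the usual "mean of the product minus product of the means" for a discrete distribution whose weights are the agent's class shares. A minimal Python sketch follows; `taste_covariance` and the numbers are illustrative assumptions.

```python
def taste_covariance(shares, beta_q, beta_h):
    """cov_n(beta_q, beta_h) in (6) for one agent (illustrative helper).

    shares[c] is pi_cn(theta); beta_q[c] and beta_h[c] are the class-c
    coefficients on attributes q and h.
    """
    mean_q = sum(s * b for s, b in zip(shares, beta_q))
    mean_h = sum(s * b for s, b in zip(shares, beta_h))
    cross = sum(s * bq * bh for s, bq, bh in zip(shares, beta_q, beta_h))
    return cross - mean_q * mean_h

# Hypothetical two-class example with equal shares
cov = taste_covariance([0.5, 0.5], [1.0, 3.0], [2.0, 6.0])
```

Here the class means are 2 and 4 and the mean product is 10, so the covariance is 2. Passing the same coefficient list twice returns the variance of that coefficient.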

3 The lclogit command

lclogit is a Stata command that implements the EM iterative scheme outlined in the previous section. This command generalizes Pacifico's (2012) step-by-step procedure and introduces an improved internal loop along with other important functionalities. The overall effect is to make the estimation process more convenient, significantly faster, and more stable numerically.

4. fmlogit is a user-written program. See footnote 5 for a further description.

For example, the internal code of lclogit executes fewer algebraic operations per iteration to update the estimates; uses the standard generate command to perform tasks that were previously executed with slightly slower egen functions; and, when possible, works with log probabilities instead of probabilities. All of these changes substantially reduce the estimation run time, especially in the presence of a large number of parameters and observations. If we take the 8-class model fit by Pacifico (2012) as an example, lclogit produces the same results as the step-by-step procedure while taking less than one-half of the run time.

The data setup for lclogit is identical to that required by clogit.

3.1 Syntax

The generic syntax for lclogit is

    lclogit depvar indepvars, group(varname) id(varname) nclasses(#)
        [membership(varlist) convergence(#) iterate(#) seed(#)
        constraints(Class# numlist: Class# numlist: ...) nolog]

3.2 Options

group(varname) specifies a numeric identifier variable for the choice scenarios. group() is required.

id(varname) specifies a numeric identifier variable for the choice makers or agents. With cross-section data, users should specify the same variable for both the group() and the id() options. id() is required.

nclasses(#) specifies the number of latent classes used in the estimation. A minimum of two latent classes is required. nclasses() is required.

membership(varlist) specifies independent variables to enter the fractional multinomial logit model of class membership, that is, the variables included in the vector $z_n$ of (2). These variables must be constant within the same agent as identified by id().⁵ When this option is not specified, the class shares are updated algebraically following (5).

convergence(#) specifies the tolerance for the log likelihood. When the proportional increase in the log likelihood over the last five iterations is less than the specified criterion, lclogit declares convergence. The default is convergence(0.00001).

5. Pacifico (2012) specified an ml program with the method lf to fit the class membership model. lclogit uses another user-written program from Buis (2008), fmlogit, which performs the same estimation with the significantly faster and more accurate d2 method. lclogit is downloaded with a modified version of the prediction command of fmlogit, fmlogit_pr, because we had to modify this command to obtain double-precision class shares.

iterate(#) specifies the maximum number of iterations. If convergence is not achieved after the selected number of iterations, lclogit stops the recursion and notes this fact before displaying the estimation results. The default is iterate(150).

seed(#) sets the seed for pseudouniform random numbers. The default is the creturn value c(seed).

The starting values for taste parameters are obtained by splitting the sample into nclasses() different subsamples and fitting a clogit model for each of them. During this process, a pseudouniform random number is generated for each agent to assign the agent into a particular subsample.⁶ As for the starting values for the class shares, lclogit uses equal shares, that is, 1/nclasses().

constraints(Class# numlist: Class# numlist: ...) specifies the constraints that are imposed on the taste parameters of the designated classes, that is, $\beta_c$ in (1). For instance, suppose that x1 and x2 are alternative-specific characteristics included in indepvars for lclogit and that the user wishes to restrict the coefficient on x1 to 0 for Class1 and Class4 and the coefficient on x2 to 2 for Class4. Then the relevant series of commands would look like this:

    constraint 1 x1 = 0
    constraint 2 x2 = 2
    lclogit depvar indepvars, group(varname) id(varname) ///
        nclasses(8) constraints(Class1 1: Class4 1 2)

nolog suppresses the display of the iteration log.

4 Postestimation command: lclogitpr

lclogitpr predicts the probabilities of choosing each alternative in a choice situation (choice probabilities hereafter), the class shares or prior probabilities of class membership, and the posterior probabilities of class membership. The predicted probabilities are stored in a variable named stubname#, where # refers to the relevant class number; the only exception is the unconditional choice probability, which is stored in a variable named stubname.

4.1 Syntax

The syntax for lclogitpr is

    lclogitpr stubname [if] [in] [, class(numlist) pr0 pr up cp]

6. More specifically, the unit interval is divided into nclasses() equal parts, and if the agent's pseudorandom draw is in the $c$th part, the agent is allocated to the subsample whose clogit results serve as the initial estimates of class $c$'s taste parameters. Note that lclogit is identical to asmprobit in that the current seed, as at the beginning of the command's execution, is restored once all necessary pseudorandom draws have been made.
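The allocation rule described in footnote 6 can be sketched in a few lines of Python. This is only an illustration of the rule: the function name `assign_subsamples` is hypothetical, and Python's random number generator is not Stata's, so the draws will not match those of lclogit. A local generator is used so that the caller's global random state is left untouched, which loosely mirrors the seed-restoring behavior described above.

```python
import random

def assign_subsamples(agent_ids, n_classes, seed):
    """Allocate each agent to one of n_classes starting-value subsamples.

    The unit interval is cut into n_classes equal parts; a uniform draw
    that falls in the c-th part sends the agent to subsample c (1-based).
    """
    rng = random.Random(seed)  # local generator; global state is not changed
    assignment = {}
    for agent in agent_ids:
        u = rng.random()  # pseudouniform draw in [0, 1)
        # int(u * n_classes) is the 0-based part; min() guards the upper edge
        assignment[agent] = min(int(u * n_classes), n_classes - 1) + 1
    return assignment

# Hypothetical example: 10 agents split across 3 subsamples
allocation = assign_subsamples(list(range(1, 11)), n_classes=3, seed=12345)
```

Calling the function again with the same seed reproduces the same allocation, which is the role played by the seed() option.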
