Logit models of discrete choice

In this series of posts I discuss a set of methods commonly used by a wide range of modelers, including regulators, market researchers, town planners, and ecologists, to model the behavior of agents making discrete choices between mutually exclusive (that is, not complementary) options. In the context of this chapter, these are choices to buy or not buy some good from a range of alternatives, but similar methods can answer a wide range of other questions. The fitted models can be used for many interesting purposes, including estimating demand and supply curves in differentiated goods markets, examining how customers may substitute between choices when their choice set changes, price optimization, designing products, and estimating the welfare impacts of new products or corporate mergers. Over the coming set of posts I’ll discuss most of these applications.

I’ll illustrate two main models for the “demand side” of a discrete choice problem. The first is the logit model. It is easy to implement but is best seen as a learning tool, because it has some unfortunate inadequacies. The second is the random coefficients logit model, which models the choices of a heterogeneous set of decision-makers. For both models, I describe how to fit them using either choice-level data or aggregate-level data.
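To make the first of these concrete, here is a minimal sketch of how a logit model turns product characteristics into choice probabilities. It is an illustration only, not the implementation used later in the series: the function name, the characteristics matrix `X`, and the taste weights `beta` are all made up for the example, and the last row of `X` stands in for an outside option (not buying) with utility normalized to zero.

```python
import numpy as np

def logit_choice_probabilities(X, beta):
    """Choice probabilities under a plain logit model.

    X    : (J, P) array, one row of characteristics per alternative (hypothetical data)
    beta : (P,) array of taste weights shared by every decision-maker
    """
    utilities = X @ beta
    # Subtracting the maximum leaves the probabilities unchanged
    # but avoids overflow in exp().
    utilities = utilities - utilities.max()
    exp_u = np.exp(utilities)
    return exp_u / exp_u.sum()

# Three alternatives described by (quality, price); the last row is the outside option.
X = np.array([[1.0, 2.0],
              [0.5, 1.0],
              [0.0, 0.0]])
beta = np.array([0.8, -0.5])  # assumed values: likes quality, dislikes price

probs = logit_choice_probabilities(X, beta)
print(probs)
```

Every decision-maker shares the same `beta` here, which is exactly the restriction the random coefficients logit model relaxes.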

There are notorious endogeneity problems in dealing with supply and demand models. If we see a fairly expensive product selling like hotcakes, should we conclude that consumers are not sensitive to prices, or that perhaps the product is not as similar to its cheaper substitutes as we might infer by looking at the (observable) characteristics of the product alone? And if this unobserved quality enters consumers’ decisions, then surely it must enter producers’ decisions to set prices. For both the logit and random coefficients logit models described in these posts, I also describe how to implement two “supply side” models. The first overcomes endogeneity problems using instrumental variables, and makes fewer assumptions about producers’ price-setting behavior; the second allows for more realistic policy simulations by modeling suppliers’ choices explicitly.

I ignore some important models that are commonly taught in a discrete choice course. The first is the so-called nested logit model. This model sits between the logit and random coefficients logit models, and most of its desirable properties are subsumed by the latter. The other is the multinomial probit model (MNP), a powerful choice model. Currently, fitting this model with Bayesian methods is extremely expensive computationally, and so the marginal benefits of using Bayesian methods are not as large as with the logit-based models in this chapter.

In the course of describing the aggregate random coefficients logit model, I introduce the concept of using simulated likelihoods. Using this approach we can fit a wide range of likelihood-based models where there might be no algebraic expression for the likelihood contribution of a data point. These methods are an invaluable tool for the applied economist; they are widely used in macroeconomics and industrial organization.
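As a rough sketch of the idea, the snippet below approximates random coefficients logit choice probabilities by averaging plain logit probabilities over simulated draws of the taste weights, then plugs those averages into a log likelihood. Everything here is an assumption made for illustration: the function names, the independent normal distribution for the coefficients, a single choice set shared by all decision-makers, and the toy data. It is not the implementation developed in the later posts.

```python
import numpy as np

def simulated_choice_probabilities(X, beta_mean, beta_sd, n_draws=2000, seed=0):
    """Monte Carlo approximation to random coefficients logit probabilities.

    Assumes (for illustration) independent normal coefficients:
    beta ~ Normal(beta_mean, beta_sd), drawn once per simulated decision-maker.
    """
    rng = np.random.default_rng(seed)
    draws = rng.standard_normal((n_draws, X.shape[1]))
    betas = beta_mean + beta_sd * draws            # (n_draws, P)
    utilities = betas @ X.T                        # (n_draws, J)
    utilities -= utilities.max(axis=1, keepdims=True)  # numerical stability
    exp_u = np.exp(utilities)
    probs = exp_u / exp_u.sum(axis=1, keepdims=True)
    # Averaging over draws replaces the integral that has no closed form.
    return probs.mean(axis=0)

def simulated_log_likelihood(choices, X, beta_mean, beta_sd, **kwargs):
    """Log likelihood of observed choices (integer indices into the rows of X)."""
    probs = simulated_choice_probabilities(X, beta_mean, beta_sd, **kwargs)
    return np.log(probs[choices]).sum()

# Toy data: three alternatives described by (quality, price); last row is the outside option.
X = np.array([[1.0, 2.0],
              [0.5, 1.0],
              [0.0, 0.0]])
beta_mean = np.array([0.8, -0.5])
beta_sd = np.array([0.5, 0.2])
choices = np.array([0, 2, 2, 1])  # hypothetical observed choices

sim_probs = simulated_choice_probabilities(X, beta_mean, beta_sd)
loglik = simulated_log_likelihood(choices, X, beta_mean, beta_sd, n_draws=500, seed=1)
```

Setting `beta_sd` to zero collapses every draw onto `beta_mean`, so the simulated probabilities reduce to those of the plain logit model, which makes a handy sanity check.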

The sections are as follows (if there is no link it’s not ready yet):

A simple model of choice
The choice-level logit choice model
Choosing priors for the logit choice model
Putting it all together
The aggregate choice count model and share model (placeholder)
Dealing with endogenous price-setting: IVs, latent variables, behavioural price-setting (placeholder)
Random coefficients logit (old)
Ranked random coefficients logit (old)
Aggregate random coefficients logit (placeholder)
Simulating a new product (placeholder)
Simulating a merger (placeholder)

Jim Savage

3/18/2019