2021-10-29
IO: role of market structure on equilibrium outcomes.
Dynamics: study the endogenous evolution of market structure.
Bonus motivation: AI literature studies essentially the same set of problems with similar tools (Igami 2020)
Some examples in empirical IO
But also in other applied micro fields:
Which health insurance to pick given there are switching costs? (Handel 2013)
Addiction (Becker and Murphy 1988)
In some cases, we can reduce a dynamic problem to a static one:
E.g., Investment decision
Dynamic problem, as gains are realized after costs
“Static” solution: invest if \(\mathbb E (NPV ) > TC\)
Action today (\(a_t=0\) or \(1\)) does not affect the amount of future payoffs (NPV)
But many cases where it’s hard to evaluate dynamic questions in a static/reduced-form setting.
“A dynamic model can do anything a static model can.”
So-called New Empirical IO (summary in Bresnahan (1989))
Advantages
We can address intertemporal trade-offs
We can examine transitions and not only steady states
We are able to address policy questions that cannot be addressed with reduced-form methods
Disadvantages
We typically need more assumptions
Identification in dynamic models is less transparent
It is often computationally intensive (i.e., slow or infeasible)
Typical steps
Formally, a discrete-time MDP consists of the following objects
A discrete time index \(t \in \lbrace 0,1,2,...,T \rbrace\), for \(T \leq \infty\)
A state space \(\mathcal S\)
An action space \(\mathcal A\)
A family of transition probabilities \(\lbrace \Pr_{t}(s_{t+1}|s_t,a_t) \rbrace\)
A discount factor, \(\beta\)
A family of single-period reward functions \(\lbrace u_t(s_t,a_t) \rbrace\)
In words
The state space \(\mathcal S\) contains all the information the agent needs to evaluate current payoffs and the distribution of future states
The (conditional) action space \(\mathcal A (s_t)\) contains all the actions available in state \(s_t\)
The transition probabilities \(\lbrace \Pr_{t}(s_{t+1}|s_t,a_t) \rbrace\) define the probabilities of future states \(s_{t+1}\) conditional on the current state \(s_t\) and action \(a_t\)
The discount factor \(\beta\) together with the static reward functions \(\lbrace u_t(s_t,a_t) \rbrace\) determines the objective function \[ \mathbb E_{\boldsymbol s'} \Bigg[ \sum_{t=0}^{T} \beta^{t} u_{t}\left(s_t, a_{t}\right) \Bigg] \]
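To fix ideas, here is a minimal sketch (in Python, with purely illustrative names and numbers) of the objects that define a finite discrete-time MDP:

```python
import numpy as np

rng = np.random.default_rng(0)
n_states, n_actions = 5, 2              # |S| and |A| (illustrative sizes)
beta = 0.95                             # discount factor

# u[s, a]: single-period reward from taking action a in state s
u = rng.normal(size=(n_states, n_actions))

# T[a][s, s'] = Pr(s' | s, a): one transition matrix per action, rows sum to one
T = rng.dirichlet(np.ones(n_states), size=(n_actions, n_states))

# Objective: choose a decision rule to maximize E[ sum_t beta^t * u(s_t, a_t) ],
# where the expectation is over the transitions induced by the decision rule.
```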
Brief parenthesis on notation
I have seen states denoted as
I will try to stick to \(s\) all the time
I have seen decisions denoted as
I will try to stick to \(a\) all the time
The objective is to pick the decision rule (or policy function) \(P = \boldsymbol a^* = \lbrace a_0^*, ..., a_T ^ * \rbrace\) that solves \[ \max_{\boldsymbol a} \ \mathbb E_{\boldsymbol s'} \Bigg[ \sum_{t=0}^{T} \beta^{t} u_{t} \left(s_{t}, a_{t} \right) \Bigg] \] where the expectation is taken over the transition probabilities generated by the decision rule \(\boldsymbol a\).
In many applications, we assume stationarity
The transition probabilities and utility functions do not directly depend on \(t\)
Uncomfortable assumption?
Do you think there is some reason (a variable) why today’s probabilities should differ from tomorrow’s? Then that variable should be part of the state
In the finite-horizon case (\(T < \infty\)), stationarity does not help much
In infinite-horizon problems, stationarity helps a lot
Now the difference between \(t\) and \(T\) is always the same, i.e. \(\infty\)
\(\sum_{t=0}^{\infty} \beta^{t} u(s_t, a_{t})\) does not depend on \(t\), conditional on \(s_t\)
The future looks the same whether the agent is in state \(s_t\) at time \(t\) or in state \(s_{t+\tau} = s_t\) at time \(t + \tau\)
Consider a stationary infinite-horizon problem
The only variable which affects the agent’s view about the future is the current value of the state, \(s_t\)
We can rewrite the agent’s problem as \[ V_0(s_0) = \max_{\boldsymbol a} \ \mathbb E_{\boldsymbol s'} \Bigg[ \sum_{t=0}^{\infty} \beta^{t} u\left(s_t, a_{t}\right) \Bigg] \] where
\[ \begin{align} V(s_0) &= \max_{\boldsymbol a} \ \mathbb E_{\boldsymbol s'} \Bigg[ \sum_{t=0}^{\infty} \beta^{t} u(s_t, a_{t}) \Bigg] = \newline &= \max_{\boldsymbol a} \ \mathbb E_{\boldsymbol s'} \Bigg[ {\color{red}{u(s_{0}, a_{0})}} + \sum_{{\color{red}{t=1}}}^{\infty} \beta^{t} u(s_t, a_{t}) \Bigg] = \newline &= \max_{\boldsymbol a} \ \Bigg\lbrace u(s_{0}, a_{0}) + {\color{red}{\mathbb E_{\boldsymbol s'}}} \Bigg[ \sum_{t=1}^{\infty} \beta^{t} u(s_t, a_{t}) \Bigg] \Bigg\rbrace = \newline &= \max_{\boldsymbol a} \ \Bigg\lbrace u(s_{0}, a_{0}) + {\color{red}{\beta}} \ \mathbb E_{\boldsymbol s'} \Bigg[ \sum_{t=1}^{\infty} \beta^{{\color{red}{t-1}}} u(s_t, a_{t}) \Bigg] \Bigg\rbrace = \newline &= \max_{{\color{red}{a_0}}} \ \Bigg\lbrace u(s_{0}, a_{0}) + \beta \ {\color{red}{\max_{\boldsymbol a}}}\ \mathbb E_{\boldsymbol s'} \Bigg[ \sum_{t=1}^{\infty} \beta^{t-1} u(s_t, a_{t}) \Bigg] \Bigg\rbrace = \newline &= \max_{a_0} \ \Bigg\lbrace u(s_{0}, a_{0}) + \beta \ {\color{red}{\int V(s_1) \Pr(s_1 | s_0, a_0)}} \Bigg\rbrace \end{align} \]
We have now a recursive formulation of the value function: the Bellman Equation \[ {\color{red}{V(s_0)}} = \max_{a_0} \ \Bigg\lbrace u(s_{0}, a_{0}) + \beta \ \int {\color{red}{V(s_1)}} \Pr(s_1 | s_0, a_0) \Bigg\rbrace \] Intuition
The decision rule that satisfies the Bellman Equation is called the policy function \[ a(s_0) = \arg \max_{a_0} \ \Bigg\lbrace u(s_{0}, a_{0}) + \beta \ \int V(s_1) \Pr(s_1 | s_0, a_0) \Bigg\rbrace \]
Under regularity conditions
It is possible to show that \[ T(W)(s) = \max_{a \in \mathcal A(s)} \ \Bigg\lbrace u(s, a) + \beta \ \int W(s') \Pr(s' | s, a) \Bigg\rbrace \] is a contraction mapping of modulus \(\beta\).
How do we actually do it in practice?
So what’s going to be new here?
Rust (1987): An Empirical Model of Harold Zurcher
Harold Zurcher (HZ) is the city bus superintendent in Madison, WI
As bus engines get older, the probability of malfunctions increases
HZ decides when to replace old bus engines with new ones
Tradeoff
Do we care about Harold Zurcher?
Units of observation
Observables: for each bus, he sees
Variation
What would you do otherwise?
Problem
Outcome
Assumptions of the structural model
HZ static utility function (for a single bus) \[ u\left(s_t, a_{t} ; \theta\right)= \begin{cases}-c\left(s_t ; \theta\right) & \text { if } a_{t}=0 \text { (not replace) } \newline -R-c(0 ; \theta) & \text { if } a_{t}=1 \text { (replace) }\end{cases} \] where
HZ objective function is to maximize the expected present discounted sum of future utilities \[ V(s_t ; \theta) = \max_{\boldsymbol a} \mathbb E_{s_{t+1}} \left[\sum_{\tau=t}^{\infty} \beta^{\tau-t} u\left(s_{\tau}, a_{\tau} ; \theta\right) \ \Bigg| \ s_t, \boldsymbol a ; \theta\right] \] where
Notes
This (sequential) representation of HZ’s problem is very cumbersome to work with.
We can rewrite \(V (s_t; \theta)\) with the following Bellman equation \[ V\left(s_t ; \theta\right) = \max_{a_{t}} \Bigg\lbrace u\left(s_t, a_{t} ; \theta\right)+\beta \mathbb E_{s_{t+1}} \Big[V\left(s_{t+1} ; \theta\right) \Big| s_t, a_{t} ; \theta\Big] \Bigg\rbrace \] Basically we are dividing the infinite sum (in the sequential form) into a present component and a future component.
Notes:
Suppose for a moment that \(s_t\) follows a second-order Markov process \[ s_{t+1}=f\left(s_t, {\color{red}{s_{t-1}}}, \varepsilon ; \theta\right) \] Now \(s_t\) alone is not sufficient to describe the current \(V\); the state would have to include \(s_{t-1}\) as well
Which variables should be state variables? I.e. should be included in the state space?
General rule for first-order Markov processes: variables need to affect current payoffs and/or the transition probabilities of the other state variables
What do you do otherwise? Integrate them out! Examples
Note: you can always get the non-expected value function if you know the probability of raining or the transition probabilities by month
Along with this value function comes a corresponding policy (or choice) function mapping the state \(s_t\) into HZ’s optimal replacement choice \(a_t\) \[ P \left(s_t ; \theta\right) = \arg \max_{a_{t}} \Bigg\lbrace u\left(s_t, a_{t} ; \theta\right) + \beta \mathbb E_{s_{t+1}} \Big[ V \left(s_{t+1} ; \theta\right) \Big| s_t, a_{t} ; \theta\Big] \Bigg\rbrace \] Given \(\frac{\partial c}{\partial s}>0\), the policy function has the form \[ P \left(s_t ; \theta\right) = \begin{cases}1 & \text { if } s_t \geq \gamma(\theta) \newline 0 & \text { if } s_t<\gamma(\theta)\end{cases} \] where \(\gamma(\theta)\) is the replacement mileage.
How would this compare with the optimal replacement mileage if HZ was myopic?
Why do we want to solve for the value and policy functions?
We have the Bellman Equation \[ V\left(s_t ; \theta\right) = \max_{a_{t}} \Bigg\lbrace u\left(s_t, a_{t} ; \theta\right)+\beta \mathbb E_{s_{t+1}} \Big[V\left(s_{t+1} ; \theta\right) \ \Big| \ s_t, a_{t} ; \theta\Big] \Bigg\rbrace \] Which we can compactly write as \[ V\left(s_t ; \theta\right) = T \Big( V\left(s_{t+1} ; \theta\right) \Big) \] Blackwell’s Theorem: under regularity conditions, \(T\) is a contraction mapping with modulus \(\beta\).
Contraction Mapping Theorem: \(T\) has a fixed point and we can find it by iterating \(T\) from any starting value \(V^{(0)}\).
What does Blackwell’s Theorem allow us to do?
This process is called value function iteration
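As a sketch, value function iteration on a discretized state space looks as follows (the primitives `u`, `T`, `beta` are placeholders, not Zurcher's):

```python
import numpy as np

def value_function_iteration(u, T, beta, tol=1e-10, max_iter=100_000):
    """Iterate the Bellman operator T(V)(s) = max_a { u(s,a) + beta * sum_s' Pr(s'|s,a) V(s') }.

    u : (n_states, n_actions) per-period payoffs
    T : (n_actions, n_states, n_states) transition matrices, T[a][s, s'] = Pr(s'|s,a)
    """
    n_states, n_actions = u.shape
    V = np.zeros(n_states)
    for _ in range(max_iter):
        # continuation value for each (s, a): sum_s' Pr(s'|s,a) V(s')
        EV = np.stack([T[a] @ V for a in range(n_actions)], axis=1)
        V_new = (u + beta * EV).max(axis=1)
        if np.max(np.abs(V_new - V)) < tol:      # contraction of modulus beta => convergence
            break
        V = V_new
    policy = (u + beta * EV).argmax(axis=1)      # decision rule a(s)
    return V_new, policy
```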
Ideal Estimation Routine
Issue: model easily rejected by the data
The policy function takes the form: replace iff \(s_t \geq \gamma(\theta)\)
Can’t explain the coexistence of e.g. “a bus without replacement at 22K miles” and “another bus being replaced at 17K miles” in the data
We need some unobservables in the model to explain why observed choices do not exactly match predicted choices
How can we explain different replacement actions at different mileages in the data?
But where? Two options
Rust uses the following utility specification: \[ u\left(s_t, a_{t}, {\color{red}{\epsilon_{t}}} ; \theta\right) = u\left(s_t, a_{t} ; \theta\right) + {\color{red}{\epsilon_{a_{t} t}}} = \begin{cases} - c\left(s_t ; \theta\right) + {\color{red}{\epsilon_{0 t}}} & \text { if } \ a_{t}=0 \newline \newline -R-c(0 ; \theta) + {\color{red}{\epsilon_{1 t}}} & \text { if } \ a_{t}=1 \end{cases} \]
Can we still solve the model? Can we estimate it?
The Bellman Equation becomes \[ V \Big( {\color{red}{ \lbrace s_\tau \rbrace_{\tau=1}^t , \lbrace \epsilon_\tau \rbrace_{\tau=1}^t }} ; \theta \Big) = \max_{a_{t}} \Bigg\lbrace u\left(s_t, a_{t} ; \theta\right) + {\color{red}{\epsilon_{a_t t}}} + \beta \mathbb E_{s_{t+1}, {\color{red}{\epsilon_{t+1}}}} \Big[V\Big( {\color{red}{ \lbrace s_\tau \rbrace_{\tau=1}^{t+1} , \lbrace \epsilon_\tau \rbrace_{\tau=1}^{t+1} }} ; \theta\Big) \ \Big| \ {\color{red}{ \lbrace s_\tau \rbrace_{\tau=1}^t , \lbrace \epsilon_\tau \rbrace_{\tau=1}^t }}, a_{t} ; \theta\Big] \Bigg\rbrace \] Issues
Rust makes 4 assumptions to make the problem tractable:
A1: first-order Markov process of \(\epsilon\) \[ \Pr \Big(s_{t+1}, \epsilon_{t+1} \Big| s_{1}, ..., s_t, \epsilon_{1}, ..., \epsilon_{t}, a_{t} ; \theta\Big) = \Pr \Big(s_{t+1}, \epsilon_{t+1} \Big| s_t, \epsilon_{t}, a_{t} ; \theta \Big) \]
What it buys
What it still allows:
What are we assuming away
The Bellman Equation becomes \[ V\left(s_t, {\color{red}{\epsilon_{t}}} ; \theta\right) = \max_{a_{t}} \Bigg\lbrace u\left(s_t, a_{t} ; \theta\right) + {\color{red}{\epsilon_{a_{t} t}}} + \beta \mathbb E_{s_{t+1}, {\color{red}{\epsilon_{t+1}}}} \Big[V(s_{t+1}, {\color{red}{\epsilon_{t+1}}} ; \theta) \ \Big| \ s_t, a_{t}, {\color{red}{\epsilon_{t}}} ; \theta \Big] \Bigg\rbrace \]
Open issues
Curse of dimensionality in the state space: (\(s_t, \epsilon_{0t}, \epsilon_{1t}\))
Curse of dimensionality in the expected value: \(\mathbb E_{s_{t+1}, \epsilon_{0,t+1}, \epsilon_{1,t+1}}\)
\[ \mathbb E_{s_{t+1}, \epsilon_{t+1}} \Big[V (s_{t+1}, \epsilon_{t+1} ; \theta) \ \Big| \ s_t, a_{t}, \epsilon_{t} ; \theta \Big] \]
Initial conditions
A2: conditional independence of \(\epsilon_t | s_t\) from \(\epsilon_{t-1}\) and \(s_{t-1}\) \[ \Pr \Big(s_{t+1}, \epsilon_{t+1} \Big| s_t, \epsilon_{t}, a_{t} ; \theta \Big) = \Pr \Big( \epsilon_{t+1} \Big| s_{t+1} ; \theta \Big) \Pr \Big( s_{t+1} \Big| s_t, a_{t} ; \theta \Big) \]
What it buys
What it still allows:
What are we assuming away
Any type of persistent heterogeneity
Does it matter? Easily yes
There are tons of applications where the unobservables are either fixed or correlated over time
The Bellman Equation is \[ V\left(s_t, {\color{red}{\epsilon_{t}}} ; \theta\right) = \max_{a_{t}} \Bigg\lbrace u\left(s_t, a_{t} ; \theta\right) + {\color{red}{\epsilon_{a_{t} t}}} + \beta \mathbb E_{s_{t+1}, {\color{red}{\epsilon_{t+1}}}} \Big[V (s_{t+1}, {\color{red}{\epsilon_{t+1}}} ; \theta) \Big| s_t, a_{t} ; \theta \Big] \Bigg\rbrace \]
Remember: if \(\epsilon\) does not affect the future, it shouldn’t be in the state space!
How? Integrate it out.
Rust: define the alternative-specific value function \[ \begin{align} &\bar V_0 \left(s_t ; \theta\right) = u\left(s_t, 0 ; \theta\right) + \beta \mathbb E_{s_{t+1}, {\color{red}{\epsilon_{t+1}}}} \Big[V\left(s_{t+1}, {\color{red}{\epsilon_{t+1}} }; \theta\right) | s_t, a_{t}=0 ; \theta\Big] \newline &\bar V_1 \left(s_t ; \theta\right) = u\left(s_t, 1 ; \theta\right) + \beta \mathbb E_{s_{t+1}, {\color{red}{\epsilon_{t+1}}}} \Big[V\left(s_{t+1}, {\color{red}{\epsilon_{t+1}}} ; \theta\right) | s_t, a_{t}=1 ; \theta\Big] \end{align} \]
\(\bar V_0 (s_t)\) is the present discounted value of not replacing, net of \(\epsilon_{0t}\)
The state of \(\bar V_a\) does not include \(\epsilon_{t}\)!
What is the relationship with the value function? \[ V\left(s_t, \epsilon_{t} ; \theta\right) = \max_{a_{t}} \Bigg\lbrace \begin{array}{l} \bar V_0 \left(s_t ; \theta\right)+\epsilon_{0 t} \ ; \newline \bar V_1 \left(s_t ; \theta\right)+\epsilon_{1 t} \end{array} \Bigg\rbrace \]
We have a 1-to-1 mapping between \(V\left(s_t, \epsilon_{t} ; \theta\right)\) and \(\bar V_a \left(s_t ; \theta\right)\) !
Can we solve for \(\bar V\)?
Yes! They have a recursive formulation \[ \begin{aligned} & \bar V_0 \left(s_t ; \theta\right) = u\left(s_t, 0 ; \theta\right) + \beta \mathbb E_{s_{t+1}, {\color{red}{\epsilon_{t+1}}}} \Bigg[ \max_{a_{t+1}} \Bigg\lbrace \begin{array}{l} \bar V_0 \left(s_{t+1} ; \theta\right) + {\color{red}{\epsilon_{0 t+1}}} \ ; \newline \bar V_1 \left(s_{t+1} ; \theta\right) + {\color{red}{\epsilon_{1 t+1}}} \end{array} \Bigg\rbrace \ \Bigg| \ s_t, a_{t}=0 ; \theta \Bigg] \newline & \bar V_1 \left(s_t ; \theta\right) = u\left(s_t, 1 ; \theta\right) + \beta \mathbb E_{s_{t+1}, {\color{red}{\epsilon_{t+1}}}} \Bigg[ \max_{a_{t+1}} \Bigg\lbrace \begin{array}{l} \bar V_0 \left(s_{t+1} ; \theta\right) + {\color{red}{\epsilon_{0 t+1}}} \ ; \newline \bar V_1 \left(s_{t+1} ; \theta\right) + {\color{red}{\epsilon_{1 t+1}}} \end{array} \Bigg\rbrace \ \Bigg| \ s_t, a_{t}=1 ; \theta \Bigg] \newline \end{aligned} \]
We can also split the expectation in the alternative-specific value function \[ \begin{aligned} & \bar V_0 \left(s_t ; \theta\right) = u\left(s_t, 0 ; \theta\right) + \beta \mathbb E_{s_{t+1}} \Bigg[ \mathbb E_{{\color{red}{\epsilon_{t+1}}}} \Bigg[ \max_{a_{t+1}} \Bigg\lbrace \begin{array}{l} \bar V_0 \left(s_{t+1} ; \theta\right) + {\color{red}{\epsilon_{0 t+1}}} \ ; \newline \bar V_1 \left(s_{t+1} ; \theta\right) + {\color{red}{\epsilon_{1 t+1}}} \end{array} \Bigg\rbrace \ \Bigg| \ s_t \Bigg] \ \Bigg| \ s_t, a_{t}=0 ; \theta \Bigg] \newline & \bar V_1 \left(s_t ; \theta\right) = u\left(s_t, 1 ; \theta\right) + \beta \mathbb E_{s_{t+1}} \Bigg[ \mathbb E_{{\color{red}{\epsilon_{t+1}}}} \Bigg[ \max_{a_{t+1}} \Bigg\lbrace \begin{array}{l} \bar V_0 \left(s_{t+1} ; \theta\right) + {\color{red}{\epsilon_{0 t+1}}} \ ; \newline \bar V_1 \left(s_{t+1} ; \theta\right) + {\color{red}{\epsilon_{1 t+1}}} \end{array} \Bigg\rbrace \ \Bigg| \ s_t \Bigg] \ \Bigg| \ s_t, a_{t}=1 ; \theta \Bigg] \newline \end{aligned} \] This allows us to concentrate on one single term \[ \mathbb E_{{\color{red}{\epsilon_{t+1}}}} \Bigg[ \max_{a_{t+1}} \Bigg\lbrace \begin{array}{l} \bar V_0 \left(s_{t+1} ; \theta\right) + {\color{red}{\epsilon_{0 t+1}}} \ ; \newline \bar V_1 \left(s_{t+1} ; \theta\right) + {\color{red}{\epsilon_{1 t+1}}} \end{array} \Bigg\rbrace \ \Bigg| \ s_t \Bigg] \] Open issues
A3: independence of \(\epsilon_t\) from \(s_t\) \[ \Pr \Big( \epsilon_{t+1} \Big| s_{t+1} ; \theta \Big) \Pr \Big( s_{t+1} \Big| s_t, a_{t} ; \theta \Big) = \Pr \big( \epsilon_{t+1} \big| \theta \big) \Pr \Big( s_{t+1} \Big| s_t, a_{t} ; \theta \Big) \]
What it buys
What are we assuming away
Open Issues
A4: \(\epsilon\) is type 1 extreme value distributed (logit)
What it buys
What are we assuming away
Different substitution patterns
Relevant? Maybe, if there are at least three options (here binary choice)
Logit magic 🧙🪄 \[ \mathbb E_{\epsilon} \Bigg[ \max_n \bigg( \Big\lbrace \delta_n + \epsilon_n \Big\rbrace_{n=1}^N \bigg) \Bigg] = 0.5772 + \ln \bigg( \sum_{n=1}^N e^{\delta_n} \bigg) \]
where \(0.5772\) is Euler’s constant
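A quick simulation check of this identity (the values of \(\delta\) are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
delta = np.array([1.0, 0.3, -0.5])                 # arbitrary mean utilities
eps = rng.gumbel(size=(1_000_000, delta.size))     # type 1 extreme value draws
lhs = (delta + eps).max(axis=1).mean()             # simulated E[max_n {delta_n + eps_n}]
rhs = 0.5772 + np.log(np.exp(delta).sum())         # Euler's constant + log-sum-exp
print(lhs, rhs)                                    # the two numbers should be very close
```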
The Bellman equation becomes \[ \begin{aligned} & \bar V_0 \left(s_t ; \theta\right) = u\left(s_t, 0 ; \theta\right) + \beta \mathbb E_{s_{t+1}} \Bigg[ 0.5772 + \ln \Bigg( \sum_{a' \in \lbrace 0, 1 \rbrace} e^{\bar V_{a'} (s_{t+1} ; \theta)} \Bigg) \ \Bigg| \ s_t, a_{t}=0 ; \theta \Bigg] \newline & \bar V_1 \left(s_t ; \theta\right) = u\left(s_t, 1 ; \theta\right) + \beta \mathbb E_{s_{t+1}} \Bigg[ 0.5772 + \ln \Bigg( \sum_{a' \in \lbrace 0, 1 \rbrace} e^{\bar V_{a'} (s_{t+1} ; \theta)} \Bigg) \ \Bigg| \ s_t, a_{t}=1 ; \theta \Bigg] \newline \end{aligned} \]
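A minimal sketch of iterating on these two equations on a discretized mileage grid; the cost function, replacement cost, and transition matrices below are illustrative stand-ins, not Rust's specification or estimates.

```python
import numpy as np

def solve_vbar(theta_c, R, beta, T0, T1, tol=1e-10):
    """Fixed point of the alternative-specific value functions (Vbar_0, Vbar_1)."""
    n_s = T0.shape[0]
    s = np.arange(n_s)                    # discretized mileage grid
    u0 = -theta_c * s                     # -c(s; theta): keep the old engine (linear cost, illustrative)
    u1 = np.full(n_s, -R)                 # -R - c(0; theta), normalizing c(0) = 0
    gamma = 0.5772
    Vbar = np.zeros((n_s, 2))
    while True:
        # E_eps[max_a {Vbar_a + eps_a}] = gamma + log(sum_a exp(Vbar_a)), computed stably
        m = Vbar.max(axis=1)
        logsum = gamma + m + np.log(np.exp(Vbar[:, 0] - m) + np.exp(Vbar[:, 1] - m))
        Vbar_new = np.column_stack([u0 + beta * T0 @ logsum,   # a = 0: keep
                                    u1 + beta * T1 @ logsum])  # a = 1: replace
        if np.max(np.abs(Vbar_new - Vbar)) < tol:
            return Vbar_new
        Vbar = Vbar_new

# Illustrative primitives: mileage drifts up one bin w.p. 0.7; replacement resets it
n_s = 90
T0 = np.diag(np.full(n_s - 1, 0.7), k=1) + np.diag(np.append(np.full(n_s - 1, 0.3), 1.0))
T1 = np.tile(T0[0], (n_s, 1))             # after replacing, mileage restarts from zero
Vbar = solve_vbar(theta_c=0.2, R=10.0, beta=0.99, T0=T0, T1=T1)
```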
So far we have analyzed how the 4 assumptions help in solving the model.
Maximum Likelihood
\[ \mathcal L = \Pr \Big(s_{1}, ... , s_T, a_{0}, ... , a_{T} \ \Big| \ s_{0} ; \theta\Big) \]
What is the impact of the 4 assumptions on the likelihood function?
A1: First-order Markov process of \(\epsilon\)
\[ \begin{align} \mathcal L(\theta) &= \Pr \Big(s_{1}, ... , s_T, a_{0}, ... , a_{T} \Big| s_{0} ; \theta\Big)\newline &= \prod_{t=1}^T \Pr \Big(a_{t} , s_{t+1} \Big| s_t ; \theta\Big) \end{align} \]
A2: independence of \(\epsilon_t\) from \(\epsilon_{t-1}\) and \(s_{t-1}\), conditional on \(s_t\)
We can decompose the joint distribution of \(a_t\) and \(s_{t+1}\) into a product of conditionals \[ \begin{align} \mathcal L(\theta) &= \prod_{t=1}^T \Pr \Big(a_{t} , s_{t+1} \Big| s_t ; \theta\Big) = \newline &= \prod_{t=1}^T \Pr \big(a_t \big| s_t ; \theta\big) \Pr \Big(s_{t+1} \Big| s_t, a_t ; \theta\Big) \end{align} \]
\(\Pr \big(s_{t+1} \big| s_t, a_t ; \theta\big)\) can be estimated from the data
for \(\Pr \big(a_t \big| s_t ; \theta\big)\) we need the two remaining assumptions
A3: Independence of \(\epsilon_t\) from \(s_t\)
\[ \begin{align} \Pr \big(a_t=1 \big| s_t ; \theta \big) &= \Pr \Big( \bar V_1 (s_t ; \theta) + \epsilon_{1 t} \geq \bar V_0 (s_t ; \theta) + \epsilon_{0 t} \ \Big| \ s_t ; \theta \Big) = \newline &= \Pr \Big( \bar V_1 (s_t ; \theta) + \epsilon_{1 t} \geq \bar V_0 (s_t ; \theta) + \epsilon_{0 t} \ \Big| \ \theta \Big) \end{align} \]
A4: Logit distribution of \(\epsilon\)
\[ \begin{align} \Pr \big(a_t=1 \big| s_t ; \theta \big) &= \Pr \Big( \bar V_1 (s_t ; \theta) + \epsilon_{1 t} \geq \bar V_0 (s_t ; \theta) + \epsilon_{0 t} \ \Big| \ \theta \Big) = \newline &= \frac{e^{\bar V_1 (s_t ; \theta)}}{e^{\bar V_0 (s_t ; \theta)} + e^{\bar V_1 (s_t ; \theta)}} \end{align} \]
The final form of the likelihood function for one bus is \[ \mathcal L(\theta) = \prod_{t=1}^T \Pr\big(a_t \big| s_t ; \theta \big) \Pr \Big(s_{t+1} \ \Big| \ s_t, a_t ; \theta\Big) \] where
Since we have many buses, \(j\), the likelihood of the data is \[ \mathcal L(\theta) = \prod_{j} \mathcal L_j (\theta) = \prod_{j} \prod_{t=1}^T \Pr\big(a_{jt} \big| s_{jt} ; \theta \big) \Pr \Big(s_{j,t+1} \ \Big| \ s_{jt}, a_{jt} ; \theta\Big) \] And, as usual, we prefer to work with log-likelihoods \[ \log \mathcal L(\theta) = \sum_{j} \sum_{t=1}^T \Bigg( \log \Pr\big(a_{jt} \big| s_{jt} ; \theta \big) + \log\Pr \Big(s_{j,t+1} \ \Big| \ s_{jt}, a_{jt} ; \theta\Big) \Bigg) \]
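As a sketch, the choice part of this log-likelihood can be evaluated as below, assuming an inner solver like the `solve_vbar` sketch above and already-discretized data arrays; the transition part \(\Pr(s_{j,t+1} \mid s_{jt}, a_{jt})\) enters additively in logs and can be estimated directly from observed mileage increments.

```python
import numpy as np

def choice_loglik(theta_c, R, beta, T0, T1, s_obs, a_obs, solve_vbar):
    """sum_t log Pr(a_t | s_t; theta): the part of the log-likelihood that depends
    on (theta_c, R) through the model solution (the NFXP inner loop)."""
    Vbar = solve_vbar(theta_c, R, beta, T0, T1)          # alternative-specific values on the grid
    p1 = 1.0 / (1.0 + np.exp(Vbar[:, 0] - Vbar[:, 1]))   # logit Pr(replace | s)
    p_obs = np.where(a_obs == 1, p1[s_obs], 1.0 - p1[s_obs])
    return np.log(p_obs).sum()

# Outer loop: maximize choice_loglik (plus the transition part) over theta = (theta_c, R),
# e.g. by passing its negative to scipy.optimize.minimize.
```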
Now we have all the pieces to estimate \(\theta\)!
Procedure
What do dynamics add?
Main limitation of Rust (1987): value function iteration
Solutions
We’ll cover Hotz and Miller (1993) since it is at the core of the estimation of dynamic games.
Setting: Harold Zurcher problem
Problem: computationally intense to do value function iteration
Can we solve the model without solving a fixed point problem?
How did we estimate the model in Rust? Two main equations
Solve the Bellman equation of the alternative-specific value function \[ {\color{green}{\bar V(s; \theta)}} = \tilde f( {\color{green}{\bar V(s; \theta)}}) \]
Compute the expected policy function \[ {\color{blue}{P( \cdot | s; \theta)}} = \tilde g( {\color{green}{\bar V(s; \theta)}} ; \theta) \]
Maximize the likelihood function
\[ \mathcal L(\theta) = \prod_{j} \prod_{t=1}^T {\color{blue}{ \Pr\big(a_{jt} \big| s_{jt} ; \theta \big)}} \Pr \Big(s_{j,t+1} \ \Big| \ s_{jt}, a_{jt} ; \theta\Big) \]
Can we remove step 1?
Idea 1: it would be great if we could start from something like \[ {\color{blue}{P(\cdot|s; \theta)}} = T( {\color{blue}{P(\cdot|s; \theta)}} ; \theta) \]
Idea 2: we could replace the RHS element with a consistent estimate \[ {\color{blue}{P(\cdot|s; \theta)}} = T( {\color{red}{\hat P(\cdot|s; \theta)}} ; \theta) \] And this could give us an estimating equation!
Unclear? No problem, let’s go slowly step by step
Bellman equation \[ {\color{green}{\bar V_a \left(s_t ; \theta\right)}} = u\left(s_t, a ; \theta\right) + \beta \mathbb E_{s_{t+1}, \epsilon_{t+1}} \Bigg[ \max_{a'} \Big\lbrace {\color{green}{\bar V_{a'}}} \left(s_{t+1}; \theta\right) + \epsilon_{a',t+1} \Big\rbrace \ \Big| \ s_t, a_t=a ; \theta \Bigg] \]
Expected policy function
\[ {\color{blue}{\Pr \big(a_t=a \big| s_t ; \theta \big)}} = \Pr \Big( {\color{green}{\bar V_a (s_t ; \theta)}} + \epsilon_{a t} \geq {\color{green}{\bar V_{a'} (s_t ; \theta)}} + \epsilon_{a' t} , \ \forall a' \ \Big| \ \theta \Big) \]
Expected decision before the shocks \(\epsilon_t\) are realized
How do we get from the two equations \[ \begin{aligned} {\color{green}{\bar V(s; \theta)}} &= \tilde f( {\color{green}{\bar V(s; \theta)}}) \newline {\color{blue}{P(\cdot|s; \theta)}} &= \tilde g( {\color{green}{\bar V(s; \theta)}} ; \theta) \end{aligned} \] To one? \[ {\color{blue}{P(\cdot|s; \theta)}} = T ({\color{blue}{P(\cdot|s; \theta)}}; \theta) \] If we could express \(\bar V\) in terms of \(P\), … \[ \begin{aligned} {\color{green}{\bar V(s; \theta)}} & = \tilde h( {\color{blue}{P(\cdot|s; \theta)}}) \newline {\color{blue}{P(\cdot|s; \theta)}} &= \tilde g( {\color{green}{\bar V(s; \theta)}} ; \theta) \end{aligned} \]
… we could then substitute the first equation into the second …
But, easier to work with a different representation of the value function.
Recall Rust value function (not the alternative-specific \(\bar V\)) \[ V\left(s_t, \epsilon_t ; \theta\right) = \max_{a_{t}} \Bigg\lbrace u \left( s_t, a_{t} ; \theta \right) + \epsilon_{a_{t} t} + \beta \mathbb E_{s_{t+1}, \epsilon_{t+1}} \Big[V\left(s_{t+1}, \epsilon_{t+1} ; \theta\right) \Big| s_t, a_{t} ; \theta\Big] \Bigg\rbrace \] We can express it in terms of expected value function
\[ V\left(s_t ; \theta\right) = \mathbb E_{\epsilon_t} \Bigg[ \max_{a_{t}} \Bigg\lbrace u\left(s_t, a_t ; \theta\right) + \epsilon_{a_{t} t}+ \beta \mathbb E_{s_{t+1}} \Big[V\left(s_{t+1}; \theta\right) \Big| s_t, a_{t} ; \theta\Big] \Bigg\rbrace \Bigg] \]
Value of being in state \(s_t\) without knowing the realization of the shock \(\epsilon_t\)
Analogous to the relationship between policy function and expected policy function
Note
Recall the alternative-specific value function of Rust
\[ \begin{align} {\color{green}{\bar V_a \left( s_t ; \theta\right)}} &= u\left(s_t, a ; \theta\right) + \beta \mathbb E_{s_{t+1}, \epsilon_{t+1}} \Bigg[ \max_{a'} \Big\lbrace {\color{green}{\bar V_{a'} \left(s_{t+1}; \theta\right)}} + \epsilon_{a',t+1} \Big\rbrace \ \Big| \ s_t, a_t=a ; \theta \Bigg] \newline &=u\left(s_t, a ; \theta\right)+\beta \mathbb E_{s_{t+1}, \epsilon_{t+1}} \Big[ {\color{orange}{V \left( s_{t+1}, \epsilon_{t+1} ; \theta \right)}} \Big| s_t, a_t=a ; \theta \Big] \newline &= u \left( s_t, a ; \theta \right) + \beta \mathbb E_{s_{t+1}} \Big[ {\color{red}{V \left( s_{t+1} ; \theta \right)}} \Big| s_t, a_t=a; \theta \Big] \end{align} \]
Relationship with the value function
\[ {\color{orange}{V \left(s_t, \epsilon_{t} ; \theta \right)}} = \max_{a_{t}} \Big\lbrace {\color{green}{ \bar V_0 \left( s_t ; \theta \right)}} + \epsilon_{0t}, {\color{green}{\bar V_1 \left( s_t ; \theta \right)}} + \epsilon_{1t} \Big\rbrace \]
Relationship with the expected value function \[ {\color{red}{V\left(s_t ; \theta\right)}} = \mathbb E_{\epsilon_t} \Big[ {\color{orange}{V\left(s_t, \epsilon_{t} ; \theta\right)}} \ \Big| \ s_t \Big] \]
We switched from alternative-specific value function \({\color{green}{\bar V (s_t ; \theta)}}\) to expected value function \({\color{red}{V(s_t ; \theta)}}\)
Go from this representation \[ \begin{align} {\color{red}{V(s ; \theta)}} & = f( {\color{red}{V(s ; \theta)}}) \newline {\color{blue}{P(\cdot | s ; \theta)}} & = g( {\color{red}{V(s ; \theta)}}; \theta) \end{align} \] To this \[ \begin{align} {\color{red}{V(s ; \theta)}} & = h( {\color{blue}{P(\cdot|s ; \theta)}} ; \theta) \newline {\color{blue}{P(\cdot|s ; \theta)}} & = g({\color{red}{V(s ; \theta)}}; \theta) \end{align} \] I.e. we want to express the expected value function (EV) in terms of the expected policy function (EP).
Note : the \(f\), \(g\) and \(h\) functions are different functions now.
First, let’s get rid of one operator: the max operator \[ V\left(s_t ; \theta\right) = \sum_a \Pr \Big(a_t=a | s_t ; \theta \Big) * \left[\begin{array}{c} u\left(s_t, a ; \theta\right) + \mathbb E_{\epsilon_t} \Big[\epsilon_{at}\Big| a_t=a, s_t\Big] \newline \qquad + \beta \mathbb E_{s_{t+1}} \Big[V\left(s_{t+1} ; \theta\right) \Big| s_t, a_t=a ; \theta\Big] \end{array}\right] \]
We are just substituting the \(\max\) with the policy \(\Pr\left(a_t=a| s_t ; \theta\right)\)
Important: we got rid of the \(\max\) operator
But we are still taking expectations over \(\epsilon_t\) and over \(s_{t+1}\)
Now we get rid of another operator: the expectation over \(s_{t+1}\) \[ \mathbb E_{s_{t+1}} \Big[V\left(s_{t+1} ; \theta\right) \Big| s_t, a_t=a ; \theta\Big] \qquad \to \qquad \sum_{s_{t+1}} V\left(s_{t+1} ; \theta\right) \Pr \Big(s_{t+1} \Big| s_t, a_t=a ; \theta \Big) \] where
so that the expected value function becomes \[ V\left(s_t ; \theta\right) = \sum_a \Pr \Big(a_t=a | s_t ; \theta \Big) * \left[\begin{array}{c} u\left(s_t, a ; \theta\right) + \mathbb E_{\epsilon_t} \Big[\epsilon_{at}\Big| a_t=a, s_t\Big] \newline + \beta \sum_{s_{t+1}} V\left(s_{t+1} ; \theta\right) \Pr \Big(s_{t+1} \Big| s_t, a_t=a ; \theta \Big) \end{array}\right] \]
The previous equation was defined at the state level \(s_t\)
If we stack them, we can write them as \[ V\left(s ; \theta\right) = \sum_a \Pr \Big(a \ \Big| \ s ; \theta \Big) .* \Bigg[ u\left(s, a ; \theta\right) + \mathbb E_{\epsilon} \Big[\epsilon_{a} \ \Big| \ a, s \Big] + \beta \ T(a ; \theta) \ V(s ; \theta) \Bigg] \] where
Now we have a system of \(k\) equations in \(k\) unknowns that we can solve.
Tearing down notation to the bare minimum, we have \[ V = \sum_a P_a .* \bigg[ u_a + \mathbb E [\epsilon_a ] + \beta \ T_a \ V \bigg] \] which we can rewrite as \[ V - \beta \ \left( \sum_a P_a .* T_a \right) V = \sum_a P_a .* \bigg[ u_a + \mathbb E [\epsilon_a ] \bigg] \]
and finally we can solve for \(V\) through the famous Hotz and Miller inversion \[ V = \left[I - \beta \ \sum_a P_a .* T_a \right]^{-1} \ * \ \left( \sum_a P_a \ .* \ \bigg[ u_a + \mathbb E [\epsilon_a] \bigg] \right) \] Solved? No. We still need to do something about \(\mathbb E [\epsilon_a]\).
What is \(\mathbb E [\epsilon_a]\)?
Let’s consider for example the expected value of the shock, conditional on investment \[ \begin{aligned} \mathbb E \Big[ \epsilon_{1 t} \ \Big| \ a_t = 1, \cdot \Big] &= \mathbb E \Big[ \epsilon_{1 t} \ \Big| \ \bar V_1 \left( s_t ; \theta \right) + \epsilon_{1 t} > \bar V_0 \left( s_t ; \theta \right) + \epsilon_{0 t} \Big] \newline & = \mathbb E \Big[ \epsilon_{1 t} \ \Big| \ \bar V_1 \left( s_t ; \theta \right) - \bar V_0 \left( s_t ; \theta \right) > \epsilon_{0 t} - \epsilon_{1 t} \Big] \end{aligned} \] which, with logit magic 🧙🪄, is \[ \mathbb E\left[\epsilon_{1 t} | a_{t}=1, s_t\right] = 0.5772 - \ln \left(P\left(s_t ; \theta\right)\right) \]
We again got rid of another \(\max\) operator!
Now we can substitute it back and we have an equation which is just a function of primitives \[ \begin{aligned} V(\cdot ; \theta) =& \Big[I-(1-P(\cdot ; \theta)) \beta T(0 ; \theta)-P(\cdot ; \theta) \beta T(1 ; \theta)\Big]^{-1} \newline * & \left[ \begin{array}{c} (1-P(\cdot ; \theta))\Big[u(\cdot, 0 ; \theta)+0.5772-\ln (1-P(\cdot ; \theta))\Big] \newline + P(\cdot ; \theta)\Big[u(\cdot, 1 ; \theta) + 0.5772 - \ln (P(\cdot ; \theta))\Big] \end{array} \right] \end{aligned} \]
Or more compactly \[ V = \left[I - \beta \ \sum_a P_a .* T_a \right]^{-1} \ * \ \left( \sum_a P_a \ .* \ \bigg[ u_a + 0.5772 - \ln(P_a) \bigg] \right) \]
What is the first equation? \[ V = \left[I - \beta \ \sum_a P_a .* T_a \right]^{-1} \ * \ \left( \sum_a P_a \ .* \ \bigg[ u_a + 0.5772 - \ln(P_a) \bigg] \right) \] Expected static payoff: \(\sum_a P_a \ .* \ \bigg[ u_a + 0.5772 - \ln(P_a) \bigg]\)
Unconditional transition probabilities: \(\sum_a P_a .* T_a\)
We got our first equation \[ {\color{red}{V}} = \left[I - \beta \ \sum_a {\color{blue}{P_a}} .* T_a \right]^{-1} \ * \ \left( \sum_a {\color{blue}{P_a}} \ .* \ \bigg[ u_a + 0.5772 - \ln({\color{blue}{P_a}}) \bigg] \right) \]
I.e. \[ \begin{align} {\color{red}{V(s ; \theta)}} & = h( {\color{blue}{P(s ; \theta)}} ; \theta) \newline \end{align} \]
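A minimal sketch of this \(h(\cdot)\) mapping in matrix form; the inputs (CCPs, payoffs, transition matrices) are stacked over the \(k\) grid points and are placeholders:

```python
import numpy as np

def hotz_miller_inversion(P, u, T, beta, gamma=0.5772):
    """V = [I - beta * sum_a P_a .* T_a]^{-1} * (sum_a P_a .* [u_a + gamma - ln P_a]).

    P : (n_states, n_actions) conditional choice probabilities P(a | s)
    u : (n_states, n_actions) per-period payoffs u(s, a)
    T : (n_actions, n_states, n_states) transition matrices Pr(s' | s, a)
    """
    n_states, n_actions = P.shape
    M = np.zeros((n_states, n_states))
    b = np.zeros(n_states)
    for a in range(n_actions):
        M += beta * P[:, a, None] * T[a]                       # beta * P_a .* T_a (row-wise scaling)
        b += P[:, a] * (u[:, a] + gamma - np.log(P[:, a]))     # expected static payoff under the logit
    return np.linalg.solve(np.eye(n_states) - M, b)            # expected value function V(s)
```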
What about the second equation \({\color{blue}{P(\cdot|s ; \theta)}} = g({\color{red}{V(s ; \theta)}}; \theta)\)?
In general, the expected probability of investment is \[ P(a=1; \theta)= \Pr \left[\begin{array}{c} u(\cdot, 1 ; \theta)+\epsilon_{1 t}+\beta \mathbb E \Big[V(\cdot ; \theta) \Big| \cdot, a_{t}=1 ; \theta \Big]> \newline \qquad u(\cdot, 0 ; \theta) + \epsilon_{0 t}+\beta \mathbb E \Big[V(\cdot ; \theta) \Big| \cdot, a_{t}=0 ; \theta \Big] \end{array}\right] \]
With the logit assumption, simplifies to \[ {\color{blue}{P(a=1 ; \theta)}} = \frac{\exp \Big(u(\cdot, 1 ; \theta)+\beta T(1 ; \theta) V(\cdot ; \theta) \Big)}{\sum_{a'} \exp \Big(u(\cdot, a' ; \theta)+\beta T(a' ; \theta) V(\cdot ; \theta) \Big)} = \frac{\exp (u_1 +\beta T_1 {\color{red}{V}} )}{\sum_{a'} \exp (u_{a'} +\beta T_{a'} {\color{red}{V}} )} \]
Now we have also the second equation! \[ \begin{align} {\color{blue}{P(s ; \theta)}} & = g({\color{red}{V(s ; \theta)}}; \theta) \end{align} \]
Idea 2: Replace \({\color{blue}{P} (\cdot)}\) on the RHS with a consistent estimator \({\color{Turquoise}{\hat P (\cdot)}}\) \[ {\color{cyan}{\bar P(\cdot ; \theta)}} = g(h({\color{Turquoise}{\hat P(\cdot)}} ; \theta); \theta) \]
\({\color{cyan}{\bar P(\cdot ; \theta_0)}}\) will converge to the true \({\color{blue}{P(\cdot ; \theta_0)}}\), because \({\color{Turquoise}{\hat P (\cdot)}}\) is converging to \({\color{blue}{P(\cdot ; \theta_0)}}\) asymptotically.
How to compute \({\color{Turquoise}{\hat P(\cdot)}}\)?
From the data, you observe states and decisions
You can compute frequency of decisions given states
Assumption: you have enough data
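The frequency estimator just mentioned, as a short sketch (assuming the state has already been discretized into `n_states` bins):

```python
import numpy as np

def ccp_frequency(s_obs, a_obs, n_states, n_actions):
    """Frequency estimator of P(a | s): share of action a among observed visits to state s."""
    counts = np.zeros((n_states, n_actions))
    np.add.at(counts, (s_obs, a_obs), 1.0)            # tally (state, action) pairs
    counts += 1e-6                                    # guard against empty cells and log(0) later
    return counts / counts.sum(axis=1, keepdims=True)
```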
Steps so far
Estimate the conditional choice probabilities \({\color{Turquoise}{\hat P}}\) from the data
Solve for the expected value function with the inversion step \[ {\color{orange}{\hat V}} = \left[I - \beta \ \sum_a {\color{Turquoise}{\hat P_a}} .* T_a \right]^{-1} \ * \ \left( \sum_a {\color{Turquoise}{\hat P_a}} \ .* \ \bigg[ u_a + 0.5772 - \ln({\color{Turquoise}{\hat P_a}}) \bigg] \right) \]
Compute the predicted CCP, given \({\color{orange}{\hat V}}\) \[ {\color{cyan}{\bar P(a=1 ; \theta)}} = \frac{\exp (u_1 +\beta T_1 {\color{orange}{\hat V}} )}{\sum_{a'} \exp (u_{a'} +\beta T_{a'} {\color{orange}{\hat V}} )} \]
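And a sketch of this last step, mapping an inverted expected value function \(\hat V\) into predicted CCPs via the logit formula (inputs are placeholders):

```python
import numpy as np

def predicted_ccp(V_hat, u, T, beta):
    """Logit CCPs implied by the expected value function: bar P(a | s; theta)."""
    n_states, n_actions = u.shape
    # choice-specific values v_a(s) = u(s, a) + beta * sum_s' Pr(s'|s,a) * V_hat(s')
    v = np.column_stack([u[:, a] + beta * T[a] @ V_hat for a in range(n_actions)])
    v -= v.max(axis=1, keepdims=True)                 # stabilize the softmax
    expv = np.exp(v)
    return expv / expv.sum(axis=1, keepdims=True)
```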
What now? Use the estimated CCP to build an objective function.
We have (at least) 2 options
\[ \mathbb E \Big[a_t - \bar P(s_t, \theta) \ \Big| \ s_t \Big] = 0 \quad \text{ at } \quad \theta = \theta_0 \]
We will follow the second approach
The likelihood function for one bus is \[ \mathcal{L}(\theta) = \prod_{t=1}^{T}\left(\hat{\operatorname{Pr}}\left(a=1 \mid s_{t}; \theta\right) \mathbb{1}\left(a_{t}=1\right)+\left(1-\hat{\operatorname{Pr}}\left(a=1 \mid s_{t}; \theta\right)\right) \mathbb{1}\left(a_{t}=0\right)\right) \] where \(\hat \Pr\big(a_{t} \big| s_{t} ; \theta \big)\) is a function of the first-stage CCP estimates \({\color{Turquoise}{\hat P}}\) and the parameters \(\theta\) (i.e., it is the predicted CCP \({\color{cyan}{\bar P}}\))
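In code, evaluating this pseudo-likelihood is just a lookup into the predicted CCP matrix (a sketch, reusing the hypothetical `predicted_ccp` output from above):

```python
import numpy as np

def pseudo_loglik(P_bar, s_obs, a_obs):
    """Pseudo log-likelihood of observed choices given predicted CCPs bar P(a | s; theta)."""
    return np.log(P_bar[s_obs, a_obs]).sum()

# Outer loop: for each candidate theta, recompute bar P (via the inversion and the logit
# formula) and maximize pseudo_loglik over theta; hat P from the data stays fixed.
```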
Why pseudo-likelihood? We have plugged in something that is not a primitive but a consistent estimate of an equilibrium object, \(\hat P\)
Now a few comments on Hotz and Miller (1993)
There is still 1 computational bottleneck in HM: the inversion step \[ V = \left[I - \beta \ \sum_a P_a .* T_a \right]^{-1} \ * \ \left( \sum_a P_a \ .* \ \bigg[ u_a + 0.5772 - \ln(P_a) \bigg] \right) \] The \(\left[I - \beta \ \sum_a P_a .* T_a \right]\) matrix has dimension \(k \times k\)
Hotz and Miller (1993) inversion gets us a recursive equation in probability space
\[ \bar P(\cdot ; \theta) = g(h(\hat P(\cdot) ; \theta); \theta) \]
Idea
Crucial assumption
For both Hotz et al. (1994) and Rust (1987), we need to discretize the state space
Hotz et al. (1994) cannot handle unobserved heterogeneity or “unobserved state variables” that are persistent over time.
Example
Suppose there are 2 bus types \(\tau\): high and low quality
We don’t know the share of types in the data
With Rust
Parametrize the effect of the difference in qualities
Parametrize the proportion of high quality buses
Solve the value function by type \(V(s_t, \tau ; \theta)\)
Integrate over types when computing choice probabilities \[ P(a|s) = \sum_{\tau} P(a|s,\tau) \Pr(\tau) = \Pr(a|s, \tau=0) * \Pr(\tau=0) + \Pr(a|s, \tau=1) * \Pr(\tau=1) \]
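A one-line sketch of this integration over (discrete) types, with hypothetical array shapes:

```python
import numpy as np

def mixed_ccp(P_by_type, pi):
    """P(a | s) = sum_tau P(a | s, tau) * Pr(tau).

    P_by_type : (n_types, n_states, n_actions) type-specific CCPs
    pi        : (n_types,) type probabilities
    """
    return np.tensordot(pi, P_by_type, axes=1)        # (n_states, n_actions)
```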
What is the problem with Hotz et al. (1994)?
The unobserved heterogeneity generates persistence in choices
Likelihood of decisions must be integrated over types \[ \mathcal L (\theta) = \sum_{\tau} \Pr(\tau) \prod_{t=1}^{T} \Pr (a_{jt}| s_{jt}, \tau) \]
Hotz & Miller needs consistent estimates of \(P(a \mid s, \tau)\)
Difficult when \(\tau\) is not observed!
Work on identification
Abbring, Jaap H, and Øystein Daljord. 2020. “Identifying the Discount Factor in Dynamic Discrete Choice Models.” Quantitative Economics 11 (2): 471–501.
Aguirregabiria, Victor, and Pedro Mira. 2002. “Swapping the Nested Fixed Point Algorithm: A Class of Estimators for Discrete Markov Decision Models.” Econometrica 70 (4): 1519–43.
Aguirregabiria, Victor, and Junichi Suzuki. 2014. “Identification and Counterfactuals in Dynamic Models of Market Entry and Exit.” Quantitative Marketing and Economics 12 (3): 267–304.
Becker, Gary S, and Kevin M Murphy. 1988. “A Theory of Rational Addiction.” Journal of Political Economy 96 (4): 675–700.
Berry, Steven T. 1992. “Estimation of a Model of Entry in the Airline Industry.” Econometrica: Journal of the Econometric Society, 889–917.
Bresnahan, Timothy F. 1989. “Empirical Studies of Industries with Market Power.” Handbook of Industrial Organization 2: 1011–57.
Crawford, Gregory S, and Matthew Shum. 2005. “Uncertainty and Learning in Pharmaceutical Demand.” Econometrica 73 (4): 1137–73.
Erdem, Tülin, Susumu Imai, and Michael P Keane. 2003. “Brand and Quantity Choice Dynamics Under Price Uncertainty.” Quantitative Marketing and Economics 1 (1): 5–64.
Erdem, Tülin, and Michael P Keane. 1996. “Decision-Making Under Uncertainty: Capturing Dynamic Brand Choice Processes in Turbulent Consumer Goods Markets.” Marketing Science 15 (1): 1–20.
Golosov, Mikhail, Aleh Tsyvinski, Ivan Werning, Peter Diamond, and Kenneth L Judd. 2006. “New Dynamic Public Finance: A User’s Guide [with Comments and Discussion].” NBER Macroeconomics Annual 21: 317–87.
Gowrisankaran, Gautam, and Marc Rysman. 2012. “Dynamics of Consumer Demand for New Durable Goods.” Journal of Political Economy 120 (6): 1173–1219.
Handel, Benjamin R. 2013. “Adverse Selection and Inertia in Health Insurance Markets: When Nudging Hurts.” American Economic Review 103 (7): 2643–82.
Hendel, Igal, and Aviv Nevo. 2006. “Measuring the Implications of Sales and Consumer Inventory Behavior.” Econometrica 74 (6): 1637–73.
Hotz, V Joseph, and Robert A Miller. 1993. “Conditional Choice Probabilities and the Estimation of Dynamic Models.” The Review of Economic Studies 60 (3): 497–529.
Hotz, V Joseph, Robert A Miller, Seth Sanders, and Jeffrey Smith. 1994. “A Simulation Estimator for Dynamic Models of Discrete Choice.” The Review of Economic Studies 61 (2): 265–89.
Igami, Mitsuru. 2020. “Artificial Intelligence as Structural Estimation: Deep Blue, Bonanza, and AlphaGo.” The Econometrics Journal 23 (3): S1–24.
Imai, Susumu, Neelam Jain, and Andrew Ching. 2009. “Bayesian Estimation of Dynamic Discrete Choice Models.” Econometrica 77 (6): 1865–99.
Kalouptsidi, Myrto, Yuichi Kitamura, Lucas Lima, and Eduardo A Souza-Rodrigues. 2020. “Partial Identification and Inference for Dynamic Models and Counterfactuals.” National Bureau of Economic Research.
Kalouptsidi, Myrto, Paul T Scott, and Eduardo Souza-Rodrigues. 2017. “On the Non-Identification of Counterfactuals in Dynamic Discrete Games.” International Journal of Industrial Organization 50: 362–71.
Keane, Michael P, and Kenneth I Wolpin. 1997. “The Career Decisions of Young Men.” Journal of Political Economy 105 (3): 473–522.
Magnac, Thierry, and David Thesmar. 2002. “Identifying Dynamic Discrete Decision Processes.” Econometrica 70 (2): 801–16.
Pakes, Ariel. 1986. “Patents as Options: Some Estimates of the Value of Holding European Patent Stocks.” Econometrica 54 (4): 755–84.
Rust, John. 1987. “Optimal Replacement of GMC Bus Engines: An Empirical Model of Harold Zurcher.” Econometrica: Journal of the Econometric Society, 999–1033.
———. 1988. “Maximum Likelihood Estimation of Discrete Control Processes.” SIAM Journal on Control and Optimization 26 (5): 1006–24.
———. 1994. “Structural Estimation of Markov Decision Processes.” Handbook of Econometrics 4: 3081–3143.
Su, Che-Lin, and Kenneth L Judd. 2012. “Constrained Optimization Approaches to Estimation of Structural Models.” Econometrica 80 (5): 2213–30.