2021-10-29
Setting: agents making strategic decisions (the new element) in dynamic environments.
Lit review: the forthcoming IO Handbook chapter by Aguirregabiria, Collard-Wexler, and Ryan (2021)
Typically in IO we study agents in strategic environments; things get complicated when the environment is also dynamic.
We will cover first the estimation and then the computation of dynamic games
Last: bridge between Structural IO and Artificial Intelligence
Stylized version of Ericson and Pakes (1995) (no entry/exit)
\(J\) firms (products) indexed by \(j \in \lbrace 1, ..., J \rbrace\)
Time \(t\) is discrete, horizon is infinite
States \(s_{jt} \in \lbrace 1, ..., \bar s \rbrace\): quality of product \(j\) in period \(t\)
Actions \(a_{jt} \in \mathbb R^+\): investment decision of firm \(j\) in period \(t\)
Static payoffs \[ \pi_j (s_{jt}, \boldsymbol s_{-jt}, a_{jt}; \theta^\pi) \] where
Note: if we micro-found \(\pi(\cdot)\), e.g. with some demand and supply model, we have 2 strategic decisions: prices (static) and investment (dynamic).
State transitions \[ \boldsymbol s_{t+1} = f(\boldsymbol s_t, \boldsymbol a_t, \boldsymbol \epsilon_t; \theta^f) \] where
Objective function: firms maximize expected discounted future profits \[ \max_{\boldsymbol a} \ \mathbb E_t \left[ \sum_{\tau=0}^\infty \beta^{\tau} \pi_{j, t+\tau} (\theta^\pi) \right] \]
The value function of firm \(j\) at time \(t\) in state \(\boldsymbol s_{t}\), under a set of strategy functions \(\boldsymbol P\) (one for each firm) is \[ V^{\boldsymbol P_{-j}}_{j} (\mathbf{s}_{t}) = \max_{a_{jt} \in \mathcal{A}_j \left(\mathbf{s}_{t}\right)} \Bigg\lbrace \pi_{j}^{\boldsymbol P_{-j}} (a_{jt}, \mathbf{s}_{t} ; \theta^\pi ) + \beta \mathbb E_{\boldsymbol s_{t+1}} \Big[ V_{j}^{\boldsymbol P_{-j}} \left(\mathbf{s}_{t+1}\right) \ \Big| \ a_{jt}, \boldsymbol s_{t} ; \theta^f \Big] \Bigg\rbrace \] where
\(\pi_{j}^{\boldsymbol P_{-j}} (a_{jt}, \mathbf{s}_{t} ; \theta^\pi )\) are the static profits of firm \(j\) given action \(a_{jt}\) and policy functions \(\boldsymbol P_{-j}\) for all firms apart from \(j\)
The expectation \(\mathbb E\) is taken with respect to the conditional transition probabilities \(f^{\boldsymbol P_{-j}} (\mathbf{s}_{t+1} | \mathbf{s}_{t}, a_{jt} ; \theta^f)\)
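Holding rivals' policies \(\boldsymbol P_{-j}\) fixed, firm \(j\)'s problem is a single-agent dynamic program, so this Bellman equation can be solved by value function iteration. A minimal sketch, where the payoff matrix `pi_j` and transition array `trans` are hypothetical stand-ins with rival behavior already folded in:

```python
import numpy as np

rng = np.random.default_rng(0)
n_states, n_actions, beta = 20, 4, 0.95

# Illustrative primitives: pi_j[s, a] ~ pi_j^{P_-j}(a, s) and
# trans[s, a, s'] ~ f^{P_-j}(s' | s, a), both random placeholders.
pi_j = rng.normal(size=(n_states, n_actions))
trans = rng.dirichlet(np.ones(n_states), size=(n_states, n_actions))

V = np.zeros(n_states)
for _ in range(2_000):
    # Q[s, a] = pi(a, s) + beta * E[V(s') | s, a]
    Q = pi_j + beta * trans @ V
    V_new = Q.max(axis=1)          # maximize over own action
    if np.max(np.abs(V_new - V)) < 1e-10:
        break
    V = V_new
```

Since the Bellman operator is a contraction with modulus \(\beta\), the iteration converges to the unique best-response value function given the fixed rival policies.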
Equilibrium notion: Markov Perfect Equilibrium (Maskin and Tirole 1988)
What is it basically?
We want to estimate 2 sets of parameters:
Generally 2 approaches
Bajari, Benkard, and Levin (2007) plan
Important: parametric assumptions in the estimation of the value/policy functions could contradict the model
First step: from transitions \(f(\hat \theta^f)\) and CCPs \(\boldsymbol{\hat P}\) to values
We can use transitions and CCPs to simulate histories (of length \(\tilde T\))
Given a parameter value \(\tilde \theta^\pi\), we can compute static payoffs: \(\pi_{j}^{\boldsymbol {\hat{P}_{-j}}} \left( \tilde a_{j\tau}, \boldsymbol{\tilde s}_{\tau} ; \tilde \theta^\pi \right)\)
Simulated history + static payoffs = simulated value function \[ {V}_{j}^{\boldsymbol {\hat{P}}} \left(\boldsymbol{s}_{t} ; \tilde \theta^\pi \right) = \sum_{\tau=0}^{\tilde T} \beta^{\tau} \pi_{j}^{\boldsymbol {\hat{P}_{-j}}} \left( \tilde a_{j\tau}, \boldsymbol{\tilde s}_{\tau} ; \tilde \theta^\pi \right) \]
We can average over many, e.g. \(R\), simulated value functions to get an expected value function \[ {V}_{j}^{\boldsymbol {\hat{P}}, R} \left( \boldsymbol{s}_{t} ; \tilde \theta^\pi \right) = \frac{1}{R} \sum_{r=1}^{R}\Bigg( \sum_{\tau=0}^{\tilde T} \beta^{\tau} \pi_{j}^{\boldsymbol {\hat{P}_{-j}}} \left(\tilde a^{(r)}_{j\tau}, \boldsymbol{\tilde s}^{(r)}_{\tau} ; \tilde \theta^\pi \right) \Bigg) \]
For \(r = 1, ..., R\) simulations do:
Then average all the value functions together to obtain an expected value function \(V_{j}^{\boldsymbol {\hat{P}}, R} \left(\boldsymbol{s}_{t} ; \tilde \theta^\pi \right)\)
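The forward-simulation step above can be sketched as follows. Everything here is an illustrative placeholder (uniform CCPs, random transitions, a linear payoff), not part of any estimation routine:

```python
import numpy as np

rng = np.random.default_rng(0)

n_states, n_actions = 5, 3
beta, T_sim, R = 0.95, 200, 100

# Hypothetical first-stage estimates: CCPs P_hat(a|s) and transitions f(s'|s,a)
P_hat = np.full((n_states, n_actions), 1.0 / n_actions)
trans = rng.dirichlet(np.ones(n_states), size=(n_states, n_actions))

def pi(s, a, theta):
    """Static payoff pi(s, a; theta) -- illustrative functional form."""
    return theta[0] * s - theta[1] * a

def simulate_value(s0, theta):
    """One simulated history of length T_sim, discounted and summed."""
    v, s = 0.0, s0
    for tau in range(T_sim):
        a = rng.choice(n_actions, p=P_hat[s])    # draw action from the CCPs
        v += beta**tau * pi(s, a, theta)         # accumulate discounted payoff
        s = rng.choice(n_states, p=trans[s, a])  # draw the next state
    return v

def expected_value(s0, theta):
    """Average over R simulated paths: V^{P_hat, R}(s0; theta)."""
    return np.mean([simulate_value(s0, theta) for _ in range(R)])
```

The `simulate_value` calls are independent, so (as noted above) the simulations are trivially parallelizable.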
Note: an advantage of simulation: it can be parallelized
What have we done so far?
How do we pick the \(\theta^\pi\) that best rationalizes the data?
BBL idea
Note: it’s an inequality statement
Idea
If the observed policy \({\color{green}{\boldsymbol{\hat P}}}\) is optimal,
All other policies \({\color{red}{\boldsymbol{\tilde P}}}\)
… at the true parameters \(\theta^f\)
… should give a lower expected value \[ V_{j}^{{\color{red}{\boldsymbol{\tilde P}}}, R} \left( \boldsymbol{s}_{t} ; \tilde \theta^\pi \right) \leq V_{j}^{{\color{green}{\boldsymbol{\hat P}}}, R} \left( \boldsymbol{s}_{t} ; \tilde \theta^\pi \right) \]
So which are the true parameters?
Those for which any deviation from the observed policy \({\color{green}{\boldsymbol{\hat P}}}\) yields a lower value
Objective function to minimize: violations under alternative policies \({\color{red}{\boldsymbol{\tilde P}}}\) \[ \min_{\tilde \theta^\pi} \sum_{\boldsymbol s_{t}} \sum_{{\color{red}{\boldsymbol{\tilde P}}}} \Bigg[\min \bigg\lbrace V_{j}^{{\color{green}{\boldsymbol{\hat P}}}, R} \left( \boldsymbol{s}_{t} ; \tilde \theta^\pi \right) - V_{j}^{{\color{red}{\boldsymbol{\tilde P}}}, R} \left( \boldsymbol{s}_{t} ; \tilde \theta^\pi \right) \ , \ 0 \bigg\rbrace \Bigg]^{2} \]
Estimator: \(\theta^\pi\) that minimizes the average (squared) magnitude of violations for any alternative policy \({\color{red}{\boldsymbol{\tilde P}}}\) \[ \hat{\theta}^\pi= \arg \min_{\tilde \theta^\pi} \sum_{\boldsymbol s_{t}} \sum_{{\color{red}{\boldsymbol{\tilde P}}}} \Bigg[\min \bigg\lbrace V_{j}^{{\color{green}{\boldsymbol{\hat P}}}, R} \left( \boldsymbol{s}_{t} ; \tilde \theta^\pi \right) - V_{j}^{{\color{red}{\boldsymbol{\tilde P}}}, R} \left( \boldsymbol{s}_{t} ; \tilde \theta^\pi \right) \ , \ 0 \bigg\rbrace \Bigg]^{2} \]
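This objective can be sketched in a few lines; `V_hat` and `V_alt` stand for simulated value functions under the observed and the alternative policies (hypothetical inputs, indexed by state):

```python
import numpy as np

def bbl_objective(V_hat, V_alt):
    """Sum over states and alternative policies of min{V_hat - V_alt, 0}^2."""
    total = 0.0
    for V_p in V_alt:  # one entry per alternative policy
        diff = np.asarray(V_hat) - np.asarray(V_p)
        total += np.sum(np.minimum(diff, 0.0) ** 2)  # only violations count
    return total

# If the observed policy dominates everywhere, there are no violations:
print(bbl_objective([5.0, 4.0], [[4.0, 3.5], [4.5, 3.0]]))  # 0.0
# A policy that beats the observed one in some state contributes a penalty:
print(bbl_objective([5.0, 4.0], [[6.0, 3.5]]))              # 1.0
```

The estimator then searches over \(\tilde\theta^\pi\) to make this quantity as small as possible.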
We have seen that there are competing methods.
What are the advantages of Bajari, Benkard, and Levin (2007) over those?
Ericson and Pakes (1995) and companion paper Pakes and McGuire (1994) for the computation
\(J\) firms indexed by \(j \in \lbrace 1, ..., J \rbrace\)
Time \(t\) is discrete, horizon is infinite
State \(s_{jt}\): quality of firm \(j\) in period \(t\)
Per period profits \[ \pi (s_{jt}, \boldsymbol s_{-jt} ; \theta^\pi) \] where
We can micro-found profits with some demand and supply functions
Investment: firms can invest a dollar amount \(x\) to increase their future quality
Continuous decision variable (\(\neq\) Rust)
Probability that investment is successful \[ \Pr \big(i_{jt} = 1 \ \big| \ a_{jt} = x \big) = \frac{\alpha x}{1 + \alpha x} \]
Higher investment, higher success probability
\(\alpha\) parametrizes the returns on investment
Quality depreciation
Law of motion \[ s_{j,t+1} = s_{jt} + i_{jt} - \delta \]
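The success probability and the law of motion can be sketched together; \(\alpha = 1\) and the unit depreciation are illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(0)
alpha, delta = 1.0, 1  # illustrative: returns on investment, unit depreciation

def success_prob(x):
    """Pr(i = 1 | investment x) = alpha x / (1 + alpha x)."""
    return alpha * x / (1.0 + alpha * x)

def next_quality(s, x):
    """Law of motion s' = s + i - delta, with i ~ Bernoulli(success_prob(x))."""
    i = rng.random() < success_prob(x)
    return s + int(i) - delta
```

The success probability is increasing and concave in \(x\) and approaches 1 as \(x \to \infty\), so the marginal return to investment vanishes.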
Note that in Ericson and Pakes (1995) we have two separate decision variables
Does not have to be the case!
Example: Besanko et al. (2010)
Firms maximize the expected flow of discounted profits \[ \max_{\boldsymbol a} \ \mathbb E_t \left[ \sum_{\tau=0}^\infty \beta^{\tau} \pi_{j, t+\tau} (\theta^\pi) \right] \]
Equilibrium notion: Markov Perfect Equilibrium (Maskin and Tirole 1988)
One important extension is exit.
The Bellman equation of incumbent \(j\) at time \(t\) is \[ V^{\boldsymbol P_{-j}}_{j} (\mathbf{s}_{t}) = \max_{d^{exit}_{jt} \in \lbrace 0, 1 \rbrace} \Bigg\lbrace \begin{array}{c} \beta \phi^{exit} \ , \newline \max_{a_{jt} \in \mathcal{A}_j \left(\mathbf{s}_{t}\right)} \Big\lbrace \pi_{j}^{\boldsymbol P_{-j}} (a_{jt}, \mathbf{s}_{t} ; \theta^\pi ) + \beta \mathbb E_{\boldsymbol s_{t+1}} \Big[ V_{j}^{\boldsymbol P_{-j}} \left(\mathbf{s}_{t+1}\right) \ \Big| \ a_{jt}, \boldsymbol s_{t} ; \theta^f \Big] \Big\rbrace \end{array} \Bigg\rbrace \] where
We can also incorporate endogenous entry.
Value function \[ V_{j}^{\boldsymbol P_{-j}} (e, \boldsymbol s_{-jt} ; \theta) = \max_{d^{entry} \in \lbrace 0,1 \rbrace } \Bigg\lbrace \begin{array}{c} 0 \ ; \newline - \phi^{entry} + \beta \mathbb E_{\boldsymbol s_{t+1}} \Big[ V_{j}^{\boldsymbol P_{-j}} (\bar s, \boldsymbol s_{-j, t+1} ; \theta) \ \Big| \ \boldsymbol s_{t} ; \theta^f \Big] \end{array} \Bigg\rbrace \] where
Do we observe potential entrants?
Doraszelski and Satterthwaite (2010): an MPE might not exist in the Ericson and Pakes (1995) model.
Solution
Markov Perfect Bayesian Nash Equilibrium (MPBNE)
Solving the model is very similar to Rust
Where do things get complicated / tricky? Policy function update
Imagine a stylized exit game with 2 firms
Computationally
Issues: value function iteration might not converge, and there can be multiple equilibria.
How to find them?
Can we assume them away?
What are the computational bottlenecks? \[ V^{\boldsymbol P_{-j}}_{j} ({\color{red}{\mathbf{s}_{t}}}) = \max_{a_{jt} \in \mathcal{A}_j \left(\mathbf{s}_{t}\right)} \Bigg\lbrace \pi_{j}^{\boldsymbol P_{-j}} (a_{jt}, \mathbf{s}_{t} ; \theta^\pi ) + \beta \mathbb E_{{\color{red}{\mathbf{s}_{t+1}}}} \Big[ V_{j}^{\boldsymbol P_{-j}} \left(\mathbf{s}_{t+1}\right) \ \Big| \ a_{jt}, \boldsymbol s_{t} ; \theta^f \Big] \Bigg\rbrace \]
Note: the bottlenecks are not additive but multiplicative: we have to solve the expectation at each point in the state space. Improving on either of the two helps a lot.
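A back-of-the-envelope illustration of this multiplicative structure: with \(J\) firms and \(\bar s\) quality levels, one Bellman update visits \(\bar s^J\) states and, at each one, takes an expectation over up to \(\bar s^J\) successor states for every action. The numbers below are purely illustrative:

```python
def bellman_update_cost(s_bar, J, n_actions):
    """Rough operation count for one Bellman update over the full state space."""
    n_states = s_bar ** J                   # points in the joint state space
    return n_states * n_actions * n_states  # an expectation at every point

print(bellman_update_cost(10, 2, 5))  # 50_000
print(bellman_update_cost(10, 5, 5))  # 50_000_000_000: the curse of dimensionality
```

Going from 2 to 5 firms multiplies the cost by a factor of one million, which is why the approximation methods below attack the state space and the expectation rather than the constant factors.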
Two and a half classes of solutions:
Note: useful also to get good starting values for a full solution method!
Weintraub, Benkard, and Van Roy (2008): what if firms had no idea about the state of other firms?
The value function becomes \[ V_{j} ({\color{red}{s_{t}}}) = \max_{a_{jt} \in \mathcal{A}_j \left({\color{red}{s_{t}}}\right)} \Bigg\lbrace {\color{red}{\mathbb E_{\boldsymbol s_t}}} \Big[ \pi_{j} (a_{jt}, \mathbf{s}_{t} ; \theta^\pi ) \Big| P \Big] + \beta \mathbb E_{{\color{red}{s_{t+1}}}} \Big[ V_{j} \left({\color{red}{s_{t+1}}}\right) \ \Big| \ a_{jt}, {\color{red}{s_{t}}} ; \theta^f \Big] \Bigg\rbrace \]
Doraszelski and Judd (2019): what if, instead of moving simultaneously, firms moved one at a time, in random order?
The value function becomes \[ V^{\boldsymbol P_{-j}}_{j} (\mathbf{s}_{t}, {\color{red}{n=j}}) = \max_{a_{jt} \in \mathcal{A}_j \left(\mathbf{s}_{t}\right)} \Bigg\lbrace {\color{red}{\frac{1}{J}}}\pi_{j}^{\boldsymbol P_{-j}} (a_{jt}, \mathbf{s}_{t} ; \theta^\pi ) + {\color{red}{\sqrt[J]{\beta}}} \mathbb E_{{\color{red}{n, s_{j, t+1}}}} \Big[ V_{j}^{\boldsymbol P_{-j}} \left(\mathbf{s}_{t+1}, {\color{red}{n}} \right) \ \Big| \ a_{jt}, \boldsymbol s_{t} ; \theta^f \Big] \Bigg\rbrace \]
Computational gain
Doraszelski and Judd (2012): what’s the advantage of continuous time?
With continuous time, the value function becomes \[ V^{\boldsymbol P_{-j}}_{j} (\mathbf{s}_{t}) = \max_{a_{jt} \in \mathcal{A}_j \left(\mathbf{s}_{t}\right)} \Bigg\lbrace \frac{1}{\lambda(a_{jt}) - \log(\beta)} \Bigg( \pi_{j}^{\boldsymbol P_{-j}} (a_{jt}, \mathbf{s}_{t} ; \theta^\pi ) + \lambda(a_{jt}) \mathbb E_{\boldsymbol s_{t+1}} \Big[ V_{j}^{\boldsymbol P_{-j}} \left(\mathbf{s}_{t+1}\right) \ \Big| \ a_{jt}, \boldsymbol s_{t} ; \theta^f \Big] \Bigg) \Bigg\rbrace \]
Computational gain
Which method is best?
I compare them in Courthoud (2020)
Some applications of these methods include
There is one method to approximate the equilibrium in dynamic games that is a bit different from the others: Pakes and McGuire (2001)
Experience-Based Equilibrium
Players start with an alternative-specific value function
Until convergence, do:
Compute optimal action, given \(\bar V_{j, a}^{(t)} (\boldsymbol s ; \theta)\) \[ a^* = \arg \max_a \bar V_{j, a}^{(t)} (\boldsymbol s ; \theta) \]
Observe the realized payoff \(\pi_{j, a^*}(\boldsymbol s ; \theta)\) and the realized next state \(\boldsymbol {s'}(\boldsymbol s, a^*; \theta)\)
Update the alternative-specific value function of the chosen action \(a^*\) \[ \bar V_{j, a^*}^{(t+1)} (\boldsymbol s ; \theta) = (1-\alpha_{\boldsymbol s, t}) \bar V_{j, a^*}^{(t)} (\boldsymbol s ; \theta) + \alpha_{\boldsymbol s, t} \Big[\pi_{j, a^*}(\boldsymbol s ; \theta) + \beta \max_a \bar V_{j, a}^{(t)} (\boldsymbol s' ; \theta) \Big] \] where
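The loop above can be sketched as follows for a single firm; the payoff and transition functions are illustrative stand-ins, and the \(1/\text{visits}\) learning rate is one common choice for \(\alpha_{\boldsymbol s, t}\):

```python
import numpy as np

rng = np.random.default_rng(0)
n_states, n_actions, beta = 5, 3, 0.95

V_bar = np.zeros((n_states, n_actions))  # starting values matter!
visits = np.zeros(n_states)              # visit counts per state

def pi(s, a):
    return 1.0 * s - 0.5 * a             # illustrative payoff

def draw_next_state(s, a):
    return int(rng.integers(n_states))   # illustrative transition

s = 0
for t in range(10_000):
    a_star = int(np.argmax(V_bar[s]))    # optimal action at current values
    s_next = draw_next_state(s, a_star)  # observe the realized next state
    visits[s] += 1
    lr = 1.0 / visits[s]                 # state-specific learning rate
    target = pi(s, a_star) + beta * V_bar[s_next].max()
    V_bar[s, a_star] = (1 - lr) * V_bar[s, a_star] + lr * target
    s = s_next
```

In the multi-agent version each firm runs such an update on its own \(\bar V_j\), and rivals' realized actions shape the visited states and realized payoffs.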
Where is the strategic interaction?
Importance of starting values
Convergence by design
Computer Science reinforcement learning literature (AI): Q-learning
Differences
Abbring, Jaap H, and Jeffrey R Campbell. 2010. “Last-in First-Out Oligopoly Dynamics.” Econometrica 78 (5): 1491–1527.
Aguirregabiria, Victor, Allan Collard-Wexler, and Stephen P Ryan. 2021. “Dynamic Games in Empirical Industrial Organization.” National Bureau of Economic Research.
Aguirregabiria, Victor, and Pedro Mira. 2007. “Sequential Estimation of Dynamic Discrete Games.” Econometrica 75 (1): 1–53.
Arcidiacono, Peter, Patrick Bayer, Jason R Blevins, and Paul B Ellickson. 2016. “Estimation of Dynamic Discrete Choice Models in Continuous Time with an Application to Retail Competition.” The Review of Economic Studies 83 (3): 889–931.
Arcidiacono, Peter, and Robert A Miller. 2011. “Conditional Choice Probability Estimation of Dynamic Discrete Choice Models with Unobserved Heterogeneity.” Econometrica 79 (6): 1823–67.
Asker, John, Chaim Fershtman, Jihye Jeon, and Ariel Pakes. 2020. “A Computational Framework for Analyzing Dynamic Auctions: The Market Impact of Information Sharing.” The RAND Journal of Economics 51 (3): 805–39.
Bajari, Patrick, C Lanier Benkard, and Jonathan Levin. 2007. “Estimating Dynamic Models of Imperfect Competition.” Econometrica 75 (5): 1331–70.
Barwick, Panle Jia, and Parag A Pathak. 2015. “The Costs of Free Entry: An Empirical Study of Real Estate Agents in Greater Boston.” The RAND Journal of Economics 46 (1): 103–45.
Berry, Steven T, and Giovanni Compiani. 2021. “Empirical Models of Industry Dynamics with Endogenous Market Structure.” Annual Review of Economics 13.
Besanko, David, Ulrich Doraszelski, Yaroslav Kryukov, and Mark Satterthwaite. 2010. “Learning-by-Doing, Organizational Forgetting, and Industry Dynamics.” Econometrica 78 (2): 453–508.
Borkovsky, Ron N, Ulrich Doraszelski, and Yaroslav Kryukov. 2010. “A User’s Guide to Solving Dynamic Stochastic Games Using the Homotopy Method.” Operations Research 58 (4-part-2): 1116–32.
Calvano, Emilio, Giacomo Calzolari, Vincenzo Denicolo, and Sergio Pastorello. 2020. “Artificial Intelligence, Algorithmic Pricing, and Collusion.” American Economic Review 110 (10): 3267–97.
Caoui, El Hadi. 2019. “Estimating the Costs of Standardization: Evidence from the Movie Industry.” R&R at Review of Economic Studies.
Collard-Wexler, Allan. 2013. “Demand Fluctuations in the Ready-Mix Concrete Industry.” Econometrica 81 (3): 1003–37.
Courthoud, Matteo. 2020. “Approximation Methods for Large Dynamic Stochastic Games.” Working Paper.
Doraszelski, Ulrich. 2003. “An R&D Race with Knowledge Accumulation.” RAND Journal of Economics, 20–42.
Doraszelski, Ulrich, and Kenneth L Judd. 2012. “Avoiding the Curse of Dimensionality in Dynamic Stochastic Games.” Quantitative Economics 3 (1): 53–93.
———. 2019. “Dynamic Stochastic Games with Random Moves.” Quantitative Marketing and Economics 17 (1): 59–79.
Doraszelski, Ulrich, Gregory Lewis, and Ariel Pakes. 2018. “Just Starting Out: Learning and Equilibrium in a New Market.” American Economic Review 108 (3): 565–615.
Doraszelski, Ulrich, and Mark Satterthwaite. 2010. “Computable Markov-Perfect Industry Dynamics.” The RAND Journal of Economics 41 (2): 215–43.
Egesdal, Michael, Zhenyu Lai, and Che-Lin Su. 2015. “Estimating Dynamic Discrete-Choice Games of Incomplete Information.” Quantitative Economics 6 (3): 567–97.
Eibelshäuser, Steffen, and David Poensgen. 2019. “Markov Quantal Response Equilibrium and a Homotopy Method for Computing and Selecting Markov Perfect Equilibria of Dynamic Stochastic Games.” Working Paper.
Ericson, Richard, and Ariel Pakes. 1995. “Markov-Perfect Industry Dynamics: A Framework for Empirical Work.” The Review of Economic Studies 62 (1): 53–82.
Esteban, Susanna, and Matthew Shum. 2007. “Durable-Goods Oligopoly with Secondary Markets: The Case of Automobiles.” The RAND Journal of Economics 38 (2): 332–54.
Farias, Vivek, Denis Saure, and Gabriel Y Weintraub. 2012. “An Approximate Dynamic Programming Approach to Solving Dynamic Oligopoly Models.” The RAND Journal of Economics 43 (2): 253–82.
Fershtman, Chaim, and Ariel Pakes. 2012. “Dynamic Games with Asymmetric Information: A Framework for Empirical Work.” The Quarterly Journal of Economics 127 (4): 1611–61.
Goettler, Ronald L, and Brett R Gordon. 2011. “Does AMD Spur Intel to Innovate More?” Journal of Political Economy 119 (6): 1141–1200.
Hotz, V Joseph, and Robert A Miller. 1993. “Conditional Choice Probabilities and the Estimation of Dynamic Models.” The Review of Economic Studies 60 (3): 497–529.
Huang, Ling, and Martin D Smith. 2014. “The Dynamic Efficiency Costs of Common-Pool Resource Exploitation.” American Economic Review 104 (12): 4071–4103.
Ifrach, Bar, and Gabriel Y Weintraub. 2017. “A Framework for Dynamic Oligopoly in Concentrated Industries.” The Review of Economic Studies 84 (3): 1106–50.
Igami, Mitsuru. 2017. “Estimating the Innovator’s Dilemma: Structural Analysis of Creative Destruction in the Hard Disk Drive Industry, 1981–1998.” Journal of Political Economy 125 (3): 798–847.
Iskhakov, Fedor, John Rust, and Bertel Schjerning. 2016. “Recursive Lexicographical Search: Finding All Markov Perfect Equilibria of Finite State Directional Dynamic Games.” The Review of Economic Studies 83 (2): 658–703.
Jeon, Jihye. 2020. “Learning and Investment Under Demand Uncertainty in Container Shipping.” The RAND Journal of Economics.
Kasahara, Hiroyuki, and Katsumi Shimotsu. 2009. “Nonparametric Identification of Finite Mixture Models of Dynamic Discrete Choices.” Econometrica 77 (1): 135–75.
Maskin, Eric, and Jean Tirole. 1988. “A Theory of Dynamic Oligopoly, II: Price Competition, Kinked Demand Curves, and Edgeworth Cycles.” Econometrica: Journal of the Econometric Society, 571–99.
Pakes, Ariel, and Paul McGuire. 1994. “Computing Markov-Perfect Nash Equilibria: Numerical Implications of a Dynamic Differentiated Product Model.” RAND Journal of Economics 25 (4): 555–89.
———. 2001. “Stochastic Algorithms, Symmetric Markov Perfect Equilibrium, and the ‘Curse’ of Dimensionality.” Econometrica 69 (5): 1261–81.
Pakes, Ariel, Michael Ostrovsky, and Steven Berry. 2007. “Simple Estimators for the Parameters of Discrete Dynamic Games (with Entry/Exit Examples).” The RAND Journal of Economics 38 (2): 373–99.
Pesendorfer, Martin, and Philipp Schmidt-Dengler. 2008. “Asymptotic Least Squares Estimators for Dynamic Games.” The Review of Economic Studies 75 (3): 901–28.
———. 2010. “Sequential Estimation of Dynamic Discrete Games: A Comment.” Econometrica 78 (2): 833–42.
Rust, John. 1994. “Structural Estimation of Markov Decision Processes.” Handbook of Econometrics 4: 3081–3143.
Ryan, Stephen P. 2012. “The Costs of Environmental Regulation in a Concentrated Industry.” Econometrica 80 (3): 1019–61.
Su, Che-Lin, and Kenneth L Judd. 2012. “Constrained Optimization Approaches to Estimation of Structural Models.” Econometrica 80 (5): 2213–30.
Sweeting, Andrew. 2013. “Dynamic Product Positioning in Differentiated Product Markets: The Effect of Fees for Musical Performance Rights on the Commercial Radio Industry.” Econometrica 81 (5): 1763–803.
Vreugdenhil, Nicholas. 2020. “Booms, Busts, and Mismatch in Capital Markets: Evidence from the Offshore Oil and Gas Industry.” R&R at Journal of Political Economy.
Weintraub, Gabriel Y, C Lanier Benkard, and Benjamin Van Roy. 2008. “Markov Perfect Industry Dynamics with Many Firms.” Econometrica 76 (6): 1375–1411.
Xu, Daniel Yi, and Yanyou Chen. 2020. “A Structural Empirical Model of R&D, Firm Heterogeneity, and Industry Evolution.” Journal of Industrial Economics.