Computational Economics (ECO309)
2025-04-17
The imperialism of Dynamic Programming
— Recursive Macroeconomic Theory (Ljungqvist & Sargent)
I spent the Fall quarter (of 1950) at RAND. My first task was to find a name for multistage decision processes. An interesting question is, “Where did the name, dynamic programming, come from?” The 1950s were not good years for mathematical research. We had a very interesting gentleman in Washington named Wilson. He was Secretary of Defense, and he actually had a pathological fear and hatred of the word “research”. I’m not using the term lightly; I’m using it precisely. His face would suffuse, he would turn red, and he would get violent if people used the term research in his presence. You can imagine how he felt, then, about the term mathematical. The RAND Corporation was employed by the Air Force, and the Air Force had Wilson as its boss, essentially. Hence, I felt I had to do something to shield Wilson and the Air Force from the fact that I was really doing mathematics inside the RAND Corporation. What title, what name, could I choose? In the first place I was interested in planning, in decision making, in thinking. But planning, is not a good word for various reasons. I decided therefore to use the word “programming”. I wanted to get across the idea that this was dynamic, this was multistage, this was time-varying. I thought, let’s kill two birds with one stone. Let’s take a word that has an absolutely precise meaning, namely dynamic, in the classical physical sense. It also has a very interesting property as an adjective, and that is it’s impossible to use the word dynamic in a pejorative sense. Try thinking of some combination that will possibly give it a pejorative meaning. It’s impossible. Thus, I thought dynamic programming was a good name. It was something not even a Congressman could object to. So I used it as an umbrella for my activities.
— Richard Bellman, Eye of the Hurricane: An Autobiography (1984, page 159)
| | Discrete States | Continuous States |
|---|---|---|
| Discrete Time | Discrete Markov Chain | Continuous Markov Chain |
| Continuous Time | Markov Jump Process | Markov Process |
Consider: \(\mu_{t+1}' =\mu_t' P\)
We have \(\mu_{i,t+1} = \sum_{k=1}^n \mu_{k,t} P_{k, i}\)
And: \(\sum_i\mu_{i,t+1} = \sum_i \sum_k \mu_{k,t} P_{k,i} = \sum_k \mu_{k,t} \left( \sum_i P_{k,i} \right) = \sum_k \mu_{k,t}\), since each row of \(P\) sums to one.
Postmultiplication by a stochastic matrix preserves total mass.
Interpretation: \(P_{ij}\) is the fraction of the mass initially in state \(i\) which ends up in \(j\)
\[\underbrace{ \begin{pmatrix} ? & ? & ? \end{pmatrix} }_{\mu_{t+1}'} = \underbrace{ \begin{pmatrix} 0.5 & 0.3 & 0.2 \end{pmatrix} }_{\mu_t'} \begin{pmatrix} 0.4 & 0.6 & 0.0 \\\\ 0.2 & 0.5 & 0.3 \\\\ 0 & 0 & 1.0 \end{pmatrix}\]
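As a quick check, the same update in Julia (numbers taken from the example above):

```julia
μ_t = [0.5, 0.3, 0.2]        # current distribution
P = [0.4 0.6 0.0;
     0.2 0.5 0.3;
     0.0 0.0 1.0]            # stochastic matrix: rows sum to 1
μ_next = (μ_t' * P)'         # postmultiplication: μ_{t+1}' = μ_t' P
# μ_next ≈ [0.26, 0.45, 0.29]; mass is preserved: sum(μ_next) ≈ 1
```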
Graphical representation: the chain can be drawn as a directed graph, with one node per state and an arrow from \(i\) to \(j\) whenever \(P_{ij}>0\).
It is easy to show that for any \(k\), \(P^k\) is a stochastic matrix (the product of two stochastic matrices is itself stochastic: its rows still sum to one).
\(P^k_{ij}\) denotes the probability of ending in \(j\), after \(k\) periods, starting from \(i\)
Given an initial distribution \(\mu_0\in \mathbb{R}^n_+\), the distribution after \(t\) periods is \(\mu_t' = \mu_0' P^t\)
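Continuing the small example in Julia, the \(t\)-period distribution is just a matrix power:

```julia
μ0 = [0.5, 0.3, 0.2]
t = 10
μt = (μ0' * P^t)'                # μ_t' = μ_0' P^t, with P as above
@assert P^t * ones(3) ≈ ones(3)  # P^t is itself a stochastic matrix
```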
We now study some basic properties of Markov chains.
Two states \(s_i\) and \(s_j\) are connected if \(P_{ij}>0\)
We call \(\mathcal{I}(P)=(\delta_{P_{ij}>0})_{ij}\) the incidence matrix, where \(\delta\) denotes the indicator function.
Two states \(i\) and \(j\) communicate with each other if there are \(k\) and \(l\) such that: \((P^k)_ {i,j}>0\) and \((P^l)_ {j,i}>0\)
A stochastic matrix \(P\) is irreducible if all states communicate
Irreducible: all states can be reached with positive probability from any initial state.
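As an illustration, irreducibility can be tested numerically from the incidence matrix: state \(j\) is reachable from \(i\) in at most \(n-1\) steps iff \(\left((I+\mathcal{I}(P))^{n-1}\right)_{ij}>0\). A minimal sketch (the helper name is ours, not from the lecture):

```julia
using LinearAlgebra: I

# all states communicate iff every entry of (I + 𝓘(P))^(n-1) is positive
function is_irreducible(P)
    n = size(P, 1)
    inc = Int.(P .> 0)       # incidence matrix 𝓘(P)
    all((I + inc)^(n - 1) .> 0)
end
```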
\(\mu\) is a stationary distribution if \(\mu' = \mu' P\)
Theorem: there always exists such a distribution
Theorem: if \(P\) is irreducible and aperiodic, the stationary distribution \(\mu^{\star}\) is unique, and \(\mu_0' P^t \to \mu^{\star\prime}\) for any initial distribution \(\mu_0\)
We then say the Markov chain is ergodic
\(\mu^{\star}\) is the ergodic distribution
How do we compute the stationary distribution?
```julia
using LinearAlgebra: I   # identity matrix; the \ operator is built into Base

# define a stochastic matrix (rows sum to 1)
P = [ 0.9  0.1  0.0 ;
      0.05 0.9  0.05 ;
      0.0  0.9  0.1 ]

# stationarity requires (P' - I) μ = 0; replace the last (redundant)
# equation with the normalization sum(μ) = 1
M = P' - I
M[end, :] .= 1.0

# define the right-hand side: zeros, except 1 for the normalization row
R = zeros(3)
R[end] = 1

# solve the linear system
μ = M \ R

# check that we have a solution (allowing for floating-point error)
@assert sum(μ) ≈ 1
@assert all(abs.(μ'P - μ') .< 1e-10)
```
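Because this chain is irreducible and aperiodic, ergodicity can also be checked directly: every row of \(P^k\) converges to \(\mu^{\star\prime}\).

```julia
# every row of P^k approaches the ergodic distribution μ★'
@assert all(abs.(P^200 .- μ') .< 1e-8)
```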
Consider the following problems:
Monopoly pricing:
\[\max_{q} \pi(q) - c(q)\]
Shopping problem
\[\max_{\substack{c_1, c_2 \\ p_1 c_1 + p_2 c_2 \leq B}} U(c_1,c_2)\]
Consumption Savings
\[\max_{\substack{c() \\ w_{t+1}=(w_t-c(w_t))(1+r) + y_{t+1}}} E_0 \sum_t \beta^t U(c(w_t))\]
| Problem | objective | action | state | transition | type |
|---|---|---|---|---|---|
| monopoly pricing | profit | choose quantity to produce | | | optimization |
| shopping problem | utility | choose consumption composition | budget \(B\) | | comparative statics |
| consumption/savings | expected welfare | save or consume | available income | evolution of wealth | dynamic optimization |
The value of following a policy \(x()\) from state \(s\) can be split at any date \(T_0\):
\[V(s; x()) = E_0 \left[ \sum_{t=0}^{T_0-1} \delta^t r(s_t, x_t) \right] + \delta^{T_0} E_0 \left[ \sum_{t=T_0}^{\infty} \delta^{t-T_0} r(s_t, x_t) \right]\]
\[ = E_0 \left[ \sum_{t=0}^{T_0-1} \delta^t r(s_t, x_t) + \delta^{T_0} V(s_{T_0}; x()) \right]\]
Look at the \(\alpha-\beta\) model: \[V_t = \max \sum_{k=0}^{\infty} \delta_k U(c_{t+k})\] where \(\delta_0 = 1\), \(\delta_1=\alpha\) and, more generally, \(\delta_k=\alpha\beta^{k-1}\) for \(k\geq 1\)
Makes the problem time-inconsistent: seen from date \(t\), the discount ratio between dates \(t+1\) and \(t+2\) is \(\delta_2/\delta_1 = \beta\), but once date \(t+1\) arrives that same tradeoff is discounted at \(\delta_1/\delta_0 = \alpha\); when \(\alpha \neq \beta\), plans made at \(t\) are revised at \(t+1\).
When \(T<\infty\) and the action space is discrete, the problem can be represented by a tree (and solved by backward induction), or recursively:
\[V(s; x()) = r(s, x(s)) + \delta \sum_{s'} p(s'|s,x(s)) V(s'; x()) \]
Using Blackwell's sufficient conditions, we can prove that the Bellman operator is a contraction mapping.
This justifies the Value Function Iteration algorithm: starting from a guess \(V_0\), apply the Bellman operator repeatedly until successive iterates are close.
The policy rule is recovered from \(V\) as the argmax in the Bellman step (see the sketch below).
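Here is a minimal VFI sketch for a generic discrete problem; the array names `R` (rewards) and `T` (transition probabilities) are our own assumptions, not objects defined in the lecture:

```julia
# Value Function Iteration (sketch). Assumed inputs:
#   R[s, x]     : reward r(s, x)
#   T[s, x, s′] : transition probability π(s′ | s, x)
function vfi(R, T, δ; tol=1e-8, maxit=1_000)
    nS, nX = size(R)
    V = zeros(nS)
    policy = ones(Int, nS)
    for _ in 1:maxit
        V_new = similar(V)
        for s in 1:nS
            # Bellman step: value of each action, then maximize
            q = [R[s, x] + δ * sum(T[s, x, s′] * V[s′] for s′ in 1:nS) for x in 1:nX]
            V_new[s], policy[s] = findmax(q)
        end
        maximum(abs.(V_new .- V)) < tol && return V_new, policy
        V = V_new
    end
    return V, policy
end
```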
Assume that \(X\) is finite.
Given \(x_n\), the goal is to find \(V_n(s)\) solving \[\forall s,\ V_n(s) = r(s, x_n(s)) + \delta \sum_{s'} \pi(s'| s, x_n(s)) V_n(s')\]
Three approaches:
1. solve the linear system directly;
2. iterate on the evaluation equation until convergence;
3. take only a fixed number of evaluation iterations (modified policy iteration).
For 2 and 3 it is useful to construct a linear operator \(M_n\) such that \(V_{n+1} = R_n + \delta M_n V_n\), where \(R_n(s) = r(s, x_n(s))\) and \((M_n)_{s,s'} = \pi(s'|s,x_n(s))\); see the sketch below.
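A sketch of approaches 1 and 2 in Julia, assuming \(R_n\) is the reward vector and \(M_n\) the transition matrix under policy \(x_n\) (these names are ours):

```julia
using LinearAlgebra: I

# 1. direct solve of V = R_n + δ M_n V  ⟺  (I - δ M_n) V = R_n
evaluate_direct(R_n, M_n, δ) = (I - δ * M_n) \ R_n

# 2. fixed-point iteration; capping maxit gives approach 3
function evaluate_iterative(R_n, M_n, δ; tol=1e-10, maxit=10_000)
    V = zero(R_n)
    for _ in 1:maxit
        V_new = R_n + δ * M_n * V
        maximum(abs.(V_new - V)) < tol && return V_new
        V = V_new
    end
    return V
end
```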
What is the value of being in a given state?
If unemployed, facing current wage offer \(w\), the agent consumes \(\underline{c}\) and decides whether to accept the offer or keep searching:
\[V^U(w) = U(\underline{c}) + \max_{a} \begin{cases} \beta V^E(w) & \text{if } a(w) = \text{accept} \\ \beta E_{w'}\left[ V^U(w^{\prime}) \right] & \text{if } a(w) = \text{reject} \end{cases}\]
If employed at wage \(w\), where \(\lambda\) is the per-period probability of losing the job: \[V^E(w) = U(w) + (1-\lambda) \beta V^E(w) + \lambda \beta E_{w'}\left[ V^U(w^{\prime}) \right] \]
We can thus represent the value of each employment status as two functions \(V^U\) and \(V^E\) of the state \(w\), and solve for them jointly, as in the sketch below.
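A minimal sketch of solving jointly for \(V^U\) and \(V^E\) on a wage grid by fixed-point iteration; the utility function, parameter values, and the uniform offer distribution are all illustrative assumptions, not given in the lecture:

```julia
function solve_mccall(w_grid, p, β, λ, c_min; iters=1_000)
    U(c) = log(c)                    # assumed utility function
    V_U = zeros(length(w_grid))
    V_E = zeros(length(w_grid))
    for _ in 1:iters
        EV_U = sum(p .* V_U)         # E_{w'}[V^U(w')]
        V_E = U.(w_grid) .+ (1 - λ) * β .* V_E .+ λ * β * EV_U
        V_U = U(c_min) .+ max.(β .* V_E, β * EV_U)   # accept vs keep searching
    end
    return V_U, V_E
end

# illustrative parameters (assumptions, not from the lecture)
w_grid = range(0.5, 2.0, length=100)
p = fill(1 / length(w_grid), length(w_grid))   # uniform offer distribution
V_U, V_E = solve_mccall(w_grid, p, 0.95, 0.05, 0.5)
```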