Computational Economics (ECO309)
2025-04-03
Optimization is everywhere in economics:
root finding: \(\text{find $x$ in $X$ such that $f(x)=0$}\)
minimization/maximization \(\min_{x\in X} f(x)\) or \(\max_{x\in X} f(x)\)
often a minimization problem can be reformulated as a root-finding problem
\[x_0 = \operatorname{argmin}_{x\in X} f(x) \overbrace{\iff}^{??} f^{\prime} (x_0) = 0\]
The question mark is warranted: with \(f(x)=x^3\), \(f^{\prime}(0)=0\) although \(0\) is not a minimizer, so the equivalence requires extra conditions (for instance convexity of \(f\) and \(x_0\) interior to \(X\)).
In principle, there can be many roots (resp. maxima) within the optimization set.
Algorithms that find them all are called “global”.
We will deal only with local algorithms, and consider local convergence properties.
The full mathematical treatment will typically assume that \(f\) is smooth (\(\mathcal{C}^1\) or \(\mathcal{C}^2\) depending on the algorithm).
In practice we often don’t know whether these properties hold
So: fingers crossed
Here is the surface representing the objective that a deep neural network training algorithm tries to minimize.
And yet, neural networks do great things!
Minimize \(f(x)\) for \(x \in [a,b]\)
Choose \(\Phi \in [0,0.5]\)
Algorithm:
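A minimal Julia sketch of such a bracketing search, assuming \(f\) is unimodal on \([a,b]\) (the function name, default \(\Phi\) and tolerance below are illustrative choices):

# Shrink the bracket [a, b] around the minimizer of a unimodal f.
# Two interior points are placed at a + Φ(b-a) and b - Φ(b-a), with Φ ∈ (0, 0.5).
function bracket_minimize(f, a, b; Φ=0.382, tol=1e-8, maxit=1_000)
    for _ in 1:maxit
        x1 = a + Φ * (b - a)          # left trial point
        x2 = b - Φ * (b - a)          # right trial point
        if f(x1) < f(x2)
            b = x2                    # the minimizer lies in [a, x2]
        else
            a = x1                    # the minimizer lies in [x1, b]
        end
        b - a < tol && break
    end
    return (a + b) / 2                # midpoint of the final bracket
end

bracket_minimize(x -> (x - 1)^2, -2.0, 3.0)   # ≈ 1.0

With \(\Phi = (3-\sqrt{5})/2 \approx 0.382\) this is the golden-section rule, which lets one of the two function evaluations be reused across iterations.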
Minimize \(f(x)\) for \(x \in R^n\) given initial guess \(x_0 \in R^n\)
Many intuitions from the 1d case still apply; a basic gradient-descent sketch is given below.
Some specific problems:
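A minimal gradient-descent sketch for this setting, assuming the gradient \(\nabla f\) is supplied; the step size \(\lambda\), the stopping rule and the example are illustrative:

# Gradient descent: x_{n+1} = x_n - λ ∇f(x_n), starting from the initial guess x0
function gradient_descent(∇f, x0; λ=0.1, tol=1e-8, maxit=10_000)
    x = copy(x0)
    for _ in 1:maxit
        g = ∇f(x)
        maximum(abs, g) < tol && break   # stop when the gradient is small (sup norm)
        x = x .- λ .* g
    end
    return x
end

# Example: f(x) = (x₁ - 1)² + 2 x₂², with gradient [2(x₁ - 1), 4 x₂]
gradient_descent(x -> [2 * (x[1] - 1), 4 * x[2]], [0.0, 1.0])   # ≈ [1.0, 0.0]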
Function \(f: R^p \rightarrow R^q\)
Jacobian: \(J(x)\) or \(f^{\prime}_x(x)\), the \(q\times p\) matrix such that: \[J(x)_{ij} = \frac{\partial f(x)_i}{\partial x_j}\]
Gradient: \(\nabla f(x) = J(x)\) when \(q=1\)
Hessian: denoted by \(H(x)\) or \(f^{\prime\prime}_{xx}(x)\) when \(q=1\): \[H(x)_{jk} = \frac{\partial^2 f(x)}{\partial x_j\partial x_k}\]
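These objects can be computed by automatic differentiation, for example with the ForwardDiff package; a sketch, where the functions h and u are illustrative:

import ForwardDiff

h(x) = [x[1]^2 + x[2], x[1] * x[2]]   # h: R² → R², so p = q = 2
u(x) = x[1]^2 + 3 * x[2]^2            # scalar-valued case (q = 1)

x = [1.0, 2.0]

ForwardDiff.jacobian(h, x)   # 2×2 matrix with entry (i, j) equal to ∂hᵢ/∂xⱼ
ForwardDiff.gradient(u, x)   # ∇u(x), the Jacobian written as a vector when q = 1
ForwardDiff.hessian(u, x)    # matrix with entry (j, k) equal to ∂²u/∂xⱼ∂xₖ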
In what follows, \(|x|\) denotes the supremum norm, but most of the results also hold with other norms.
X = A \ Y   # solves the linear system A*X = Y for X
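In Julia, the backslash operator solves the linear system \(A X = Y\); a small usage sketch with illustrative data:

A = [2.0 1.0; 1.0 3.0]
Y = [1.0, 2.0]

X = A \ Y      # solves A * X = Y (here X ≈ [0.2, 0.6])
A * X ≈ Y      # true

# A Newton-type update uses the same operation, e.g. x_next = x - J(x) \ f(x)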
Least-squares minimization: \(\min_x \sum_i f(x)_i^2\)
replace \(\left(J(x_n)^{\prime}J(x_n)\right)^{-1}\) by \(\left(J(x_n)^{\prime}J(x_n) +\mu I\right)^{-1}\)
uses only gradient (first-order) information, like Gauss-Newton
equivalent to Gauss-Newton close to the solution (\(\mu\) small)
equivalent to gradient descent far from the solution (\(\mu\) large)
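A one-step sketch of this damped Gauss-Newton update, assuming the residual function and its Jacobian are supplied and \(\mu\) is held fixed (practical implementations adjust \(\mu\) between iterations):

using LinearAlgebra   # for the identity I

# One damped step for min_x Σᵢ f(x)ᵢ²: solve (J'J + μI) Δ = -J'f(x), then x ← x + Δ
function lm_step(f, J, x; μ=1e-3)
    Jx, fx = J(x), f(x)
    Δ = (Jx' * Jx + μ * I) \ (-(Jx' * fx))
    return x + Δ
end

# Illustrative residuals r and their (constant) Jacobian Jr
r(x) = [x[1] - 1, 2 * (x[2] + 0.5)]
Jr(x) = [1.0 0.0; 0.0 2.0]
lm_step(r, Jr, [0.0, 0.0])   # moves toward the least-squares solution [1.0, -0.5]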
Consider the optimization problem: \[\max U(x_1, x_2)\]
under the constraint \(p_1 x_1 + p_2 x_2 \leq B\)
where \(U(.)\), \(p_1\), \(p_2\) and \(B\) are given.
How do you find a solution by hand?
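One standard route by hand, assuming \(U\) is differentiable and increasing so that the budget constraint binds, is to form the Lagrangian and take first-order conditions:
\[\mathcal{L}(x_1, x_2, \lambda) = U(x_1, x_2) + \lambda \left(B - p_1 x_1 - p_2 x_2\right)\]
\[\frac{\partial U}{\partial x_1} = \lambda p_1, \qquad \frac{\partial U}{\partial x_2} = \lambda p_2, \qquad p_1 x_1 + p_2 x_2 = B\]
Eliminating \(\lambda\) gives the familiar condition that the marginal rate of substitution equals the price ratio, \(\frac{\partial U / \partial x_1}{\partial U / \partial x_2} = \frac{p_1}{p_2}\), which together with the budget constraint pins down \((x_1, x_2)\).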
General formulation for vector-valued functions \[f(x)\geq 0 \perp g(x)\geq 0\] means \[\forall i, f_i(x)\geq 0 \perp g_i(x)\geq 0\]
There are robust (commercial) solvers for NCP problems, such as PATH and Knitro.
How do we solve it numerically?
julia> import NLsolve

julia> f(x) = [x[1] - x[2] - 1, x[1] + x[2]]
f (generic function with 1 method)
julia> NLsolve.nlsolve(f, [0., 0.0])
Results of Nonlinear Solver Algorithm
* Algorithm: Trust-region with dogleg and autoscaling
* Starting Point: [0.0, 0.0]
* Zero: [0.5000000000009869, -0.5000000000009869]
* Inf-norm of residuals: 0.000000
* Iterations: 1
* Convergence: true
* |x - x'| < 0.0e+00: false
* |f(x)| < 1.0e-08: true
* Function Calls (f): 2
* Jacobian Calls (df/dx): 2