57. Lagrangian for LQ Control#
!pip install quantecon
Show code cell output
Requirement already satisfied: quantecon in /opt/conda/envs/quantecon/lib/python3.11/site-packages (0.7.2)
Requirement already satisfied: numba>=0.49.0 in /opt/conda/envs/quantecon/lib/python3.11/site-packages (from quantecon) (0.59.1)
Requirement already satisfied: numpy>=1.17.0 in /opt/conda/envs/quantecon/lib/python3.11/site-packages (from quantecon) (1.26.4)
Requirement already satisfied: requests in /opt/conda/envs/quantecon/lib/python3.11/site-packages (from quantecon) (2.32.2)
Requirement already satisfied: scipy>=1.5.0 in /opt/conda/envs/quantecon/lib/python3.11/site-packages (from quantecon) (1.13.1)
Requirement already satisfied: sympy in /opt/conda/envs/quantecon/lib/python3.11/site-packages (from quantecon) (1.12)
Requirement already satisfied: llvmlite<0.43,>=0.42.0dev0 in /opt/conda/envs/quantecon/lib/python3.11/site-packages (from numba>=0.49.0->quantecon) (0.42.0)
Requirement already satisfied: charset-normalizer<4,>=2 in /opt/conda/envs/quantecon/lib/python3.11/site-packages (from requests->quantecon) (2.0.4)
Requirement already satisfied: idna<4,>=2.5 in /opt/conda/envs/quantecon/lib/python3.11/site-packages (from requests->quantecon) (3.7)
Requirement already satisfied: urllib3<3,>=1.21.1 in /opt/conda/envs/quantecon/lib/python3.11/site-packages (from requests->quantecon) (2.2.2)
Requirement already satisfied: certifi>=2017.4.17 in /opt/conda/envs/quantecon/lib/python3.11/site-packages (from requests->quantecon) (2024.6.2)
Requirement already satisfied: mpmath>=0.19 in /opt/conda/envs/quantecon/lib/python3.11/site-packages (from sympy->quantecon) (1.3.0)
WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv
import numpy as np
from quantecon import LQ
from scipy.linalg import schur
57.1. Overview#
This is a sequel to this lecture linear quadratic dynamic programming
It can also be regarded as presenting invariant subspace techniques that extend ones that we encountered earlier in this lecture stability in linear rational expectations models
We present a Lagrangian formulation of an infinite horizon linear quadratic undiscounted dynamic programming problem.
Such a problem is also sometimes called an optimal linear regulator problem.
A Lagrangian formulation
carries insights about connections between stability and optimality
is the basis for fast algorithms for solving Riccati equations
opens the way to constructing solutions of dynamic systems that don’t come directly from an intertemporal optimization problem
A key tool in this lecture is the concept of an \(n \times n\) symplectic matrix.
A symplectic matrix has eigenvalues that occur in reciprocal pairs, meaning that if \(\lambda_i \in (-1,1)\) is an eigenvalue, then so is \(\lambda_i^{-1}\).
This reciprocal pairs property of the eigenvalues of a matrix is a tell-tale sign that the matrix describes the joint dynamics of a system of equations describing the states and costates that constitute first-order necessary conditions for solving an undiscounted linear-quadratic infinite-horizon optimization problem.
The symplectic matrix that will interest us describes the first-order dynamics of state and co-state vectors of an optimally controlled system.
In focusing on eigenvalues and eigenvectors of this matrix, we capitalize on an analysis of invariant subspaces.
These invariant subspace formulations of LQ dynamic programming problems provide a bridge between recursive (i.e., dynamic programming) formulations and classical formulations of linear control and linear filtering problems that make use of related matrix decompositions (see for example this lecture and this lecture).
While most of this lecture focuses on undiscounted problems, later sections describe handy ways of transforming discounted problems to undiscounted ones.
The techniques in this lecture will prove useful when we study Stackelberg and Ramsey problem in this lecture.
57.2. Undiscounted LQ DP Problem#
The problem is to choose a sequence of controls \(\{u_t\}_{t=0}^\infty\) to maximize the criterion
subject to \(x_{t+1}=Ax_t+Bu_t\), where \(x_0\) is a given initial state vector.
Here \(x_t\) is an \((n\times 1)\) vector of state variables, \(u_t\) is a \((k\times 1)\) vector of controls, \(R\) is a positive semidefinite symmetric matrix, \(Q\) is a positive definite symmetric matrix, \(A\) is an \((n\times n)\) matrix, and \(B\) is an \((n\times k)\) matrix.
The optimal value function turns out to be quadratic, \(V(x)= - x'Px\), where \(P\) is a positive semidefinite symmetric matrix.
Using the transition law to eliminate next period’s state, the Bellman equation becomes
The first-order necessary conditions for the maximum problem on the right side of equation (57.1) are
Note
We use the following rules for differentiating quadratic and bilinear matrix forms: \({\partial x' A x \over \partial x} = (A + A') x; {\partial y' B z \over \partial y} = B z, {\partial y' B z \over \partial z} = B' y\).
which implies that an optimal decision rule for \(u\) is
or
where
Substituting \(u = - (Q+B'PB)^{-1}B'PAx\) into the right side of equation (57.1) and rearranging gives
Equation (57.2) is called an algebraic matrix Riccati equation.
There are multiple solutions of equation (57.2).
But only one of them is positive definite.
The positive define solution is associated with the maximum of our problem.
It expresses the matrix \(P\) as an implicit function of the matrices \(R,Q,A,B\).
Notice that the gradient of the value function is
We shall use fact (57.3) later.
57.3. Lagrangian#
For the undiscounted optimal linear regulator problem, form the Lagrangian
where \(2 \mu_{t+1}\) is a vector of Lagrange multipliers on the time \(t\) transition law \(x_{t+1} = A x_t + B u_t\).
(We put the \(2\) in front of \(\mu_{t+1}\) to make things match up nicely with equation (57.3).)
First-order conditions for maximization with respect to \(\{u_t,x_{t+1}\}_{t=0}^\infty\) are
Define \(\mu_0\) to be a vector of shadow prices of \(x_0\) and apply an envelope condition to (57.4) to deduce that
which is a time \(t=0 \) counterpart to the second equation of system (57.5).
An important fact is that
where \(P\) is a positive define matrix that solves the algebraic Riccati equation (57.2).
Thus, from equations (57.3) and (57.6), \(- 2 \mu_{t}\) is the gradient of the value function with respect to \(x_t\).
The Lagrange multiplier vector \(\mu_{t}\) is often called the costate vector that corresponds to the state vector \(x_t\).
It is useful to proceed with the following steps:
solve the first equation of (57.5) for \(u_t\) in terms of \(\mu_{t+1}\).
substitute the result into the law of motion \(x_{t+1} = A x_t + B u_t\).
arrange the resulting equation and the second equation of (57.5) into the form
where
When \(L\) is of full rank (i.e., when \(A\) is of full rank), we can write system (57.7) as
where
57.4. State-Costate Dynamics#
We seek to solve the difference equation system (57.8) for a sequence \(\{x_t\}_{t=0}^\infty\) that satisfies
an initial condition for \(x_0\)
a terminal condition \(\lim_{t \rightarrow +\infty} x_t =0\)
This terminal condition reflects our desire for a stable solution, one that does not diverge as \(t \rightarrow \infty\).
We inherit our wish for stability of the \(\{x_t\}\) sequence from a desire to maximize
which requires that \(x_t' R x_t\) converge to zero as \(t \rightarrow + \infty\).
57.5. Reciprocal Pairs Property#
To proceed, we study properties of the \((2n \times 2n)\) matrix \(M\) defined in (57.9).
It helps to introduce a \((2n \times 2n)\) matrix
The rank of \(J\) is \(2n\).
Definition: A matrix \(M\) is called symplectic if
Salient properties of symplectic matrices that are readily verified include:
If \(M\) is symplectic, then \(M^2\) is symplectic
The determinant of a symplectic, then \(\textrm{det}(M) = 1\)
It can be verified directly that \(M\) in equation (57.9) is symplectic.
It follows from equation (57.10) and from the fact \(J^{-1} = J^\prime = -J\) that for any symplectic matrix \(M\),
Equation (57.11) states that \(M^\prime\) is related to the inverse of \(M\) by a similarity transformation.
For square matrices, recall that
similar matrices share eigenvalues
eigenvalues of the inverse of a matrix are inverses of eigenvalues of the matrix
a matrix and its transpose share eigenvalues
It then follows from equation (57.11) that the eigenvalues of \(M\) occur in reciprocal pairs: if \(\lambda\) is an eigenvalue of \(M\), so is \(\lambda^{-1}\).
Write equation (57.8) as
where \(y_t = \begin{pmatrix}x_t\cr \mu_t\cr\end{pmatrix}\).
Consider a triangularization of \(M\)
where
each block on the right side is \((n\times n)\)
\(V\) is nonsingular
all eigenvalues of \(W_{22}\) exceed \(1\) in modulus
all eigenvalues of \(W_{11}\) are less than \(1\) in modulus
57.6. Schur decomposition#
The Schur decomposition and the eigenvalue decomposition are two decompositions of the form (57.13).
Write equation (57.12) as
A solution of equation (57.14) for arbitrary initial condition \(y_0\) is evidently
where \(W_{12,t} = W_{12}\) for \(t=1\) and for \(t \geq 2\) obeys the recursion
and where \(W^t_{ii}\) is \(W_{ii}\) raised to the \(t\)th power.
Write equation (57.15) as
where \(y^\ast_t = V^{-1} y_t\), and in particular where
and where \(V^{ij}\) denotes the \((i,j)\) piece of the partitioned \(V^{-1}\) matrix.
Because \(W_{22}\) is an unstable matrix, \(y^\ast_t\) will diverge unless \(y^\ast_{20} = 0\).
Let \(V^{ij}\) denote the \((i,j)\) piece of the partitioned \(V^{-1}\) matrix.
To attain stability, we must impose \(y^\ast_{20} =0\), which from equation (57.16) implies
or
This equation replicates itself over time in the sense that it implies
But notice that because \((V^{21}\ V^{22})\) is the second row block of the inverse of \(V,\) it follows that
which implies
Therefore,
So we can write
and
However, we know that \(\mu_t = P x_t\), where \(P\) occurs in the matrix that solves the Riccati equation.
Thus, the preceding argument establishes that
Remarkably, formula (57.17) provides us with a computationally efficient way of computing the positive definite matrix \(P\) that solves the algebraic Riccati equation (57.2) that emerges from dynamic programming.
This same method can be applied to compute the solution of any system of the form (57.8) if a solution exists, even if eigenvalues of \(M\) fail to occur in reciprocal pairs.
The method will typically work so long as the eigenvalues of \(M\) split half inside and half outside the unit circle.
Systems in which eigenvalues (properly adjusted for discounting) fail to occur in reciprocal pairs arise when the system being solved is an equilibrium of a model in which there are distortions that prevent there being any optimum problem that the equilibrium solves. See [Ljungqvist and Sargent, 2018], ch 12.
57.7. Application#
Here we demonstrate the computation with an example which is the deterministic version of an example borrowed from this quantecon lecture.
# Model parameters
r = 0.05
c_bar = 2
μ = 1
# Formulate as an LQ problem
Q = np.array([[1]])
R = np.zeros((2, 2))
A = [[1 + r, -c_bar + μ],
[0, 1]]
B = [[-1],
[0]]
# Construct an LQ instance
lq = LQ(Q, R, A, B)
Given matrices \(A\), \(B\), \(Q\), \(R\), we can then compute \(L\), \(N\), and \(M=L^{-1}N\).
def construct_LNM(A, B, Q, R):
n, k = lq.n, lq.k
# construct L and N
L = np.zeros((2*n, 2*n))
L[:n, :n] = np.eye(n)
L[:n, n:] = B @ np.linalg.inv(Q) @ B.T
L[n:, n:] = A.T
N = np.zeros((2*n, 2*n))
N[:n, :n] = A
N[n:, :n] = -R
N[n:, n:] = np.eye(n)
# compute M
M = np.linalg.inv(L) @ N
return L, N, M
L, N, M = construct_LNM(lq.A, lq.B, lq.Q, lq.R)
M
array([[ 1.05 , -1. , -0.95238095, 0. ],
[ 0. , 1. , 0. , 0. ],
[ 0. , 0. , 0.95238095, 0. ],
[ 0. , 0. , 0.95238095, 1. ]])
Let’s verify that \(M\) is symplectic.
n = lq.n
J = np.zeros((2*n, 2*n))
J[n:, :n] = np.eye(n)
J[:n, n:] = -np.eye(n)
M @ J @ M.T - J
array([[0., 0., 0., 0.],
[0., 0., 0., 0.],
[0., 0., 0., 0.],
[0., 0., 0., 0.]])
We can compute the eigenvalues of \(M\) using np.linalg.eigvals
, arranged in ascending order.
eigvals = sorted(np.linalg.eigvals(M))
eigvals
[0.9523809523809523, 1.0, 1.0, 1.05]
When we apply Schur decomposition such that \(M=V W V^{-1}\), we want
the upper left block of \(W\), \(W_{11}\), to have all of its eigenvalues less than 1 in modulus, and
the lower right block \(W_{22}\) to have eigenvalues that exceed 1 in modulus.
To get what we want, let’s define a sorting function that tells scipy.schur
to sort the corresponding eigenvalues with modulus smaller than 1 to the upper left.
stable_eigvals = eigvals[:n]
def sort_fun(x):
"Sort the eigenvalues with modules smaller than 1 to the top-left."
if x in stable_eigvals:
stable_eigvals.pop(stable_eigvals.index(x))
return True
else:
return False
W, V, _ = schur(M, sort=sort_fun)
W
array([[ 1. , -0.02316402, -1.00085948, -0.95000594],
[ 0. , 0.95238095, -0.00237501, -0.95325452],
[ 0. , 0. , 1.05 , 0.02432222],
[ 0. , 0. , 0. , 1. ]])
V
array([[ 0.99875234, 0.00121459, -0.04992284, 0. ],
[ 0.04993762, -0.02429188, 0.99845688, 0. ],
[ 0. , 0.04992284, 0.00121459, 0.99875234],
[ 0. , -0.99845688, -0.02429188, 0.04993762]])
We can check the modulus of eigenvalues of \(W_{11}\) and \(W_{22}\).
Since they are both triangular matrices, eigenvalues are the diagonal elements.
# W11
np.diag(W[:n, :n])
array([1. , 0.95238095])
# W22
np.diag(W[n:, n:])
array([1.05, 1. ])
The following functions wrap \(M\) matrix construction, Schur decomposition, and stability-imposing computation of \(P\).
def stable_solution(M, verbose=True):
"""
Given a system of linear difference equations
y' = |a b| y
x' = |c d| x
which is potentially unstable, find the solution
by imposing stability.
Parameter
---------
M : np.ndarray(float)
The matrix represents the linear difference equations system.
"""
n = M.shape[0] // 2
stable_eigvals = list(sorted(np.linalg.eigvals(M))[:n])
def sort_fun(x):
"Sort the eigenvalues with modules smaller than 1 to the top-left."
if x in stable_eigvals:
stable_eigvals.pop(stable_eigvals.index(x))
return True
else:
return False
W, V, _ = schur(M, sort=sort_fun)
if verbose:
print('eigenvalues:\n')
print(' W11: {}'.format(np.diag(W[:n, :n])))
print(' W22: {}'.format(np.diag(W[n:, n:])))
# compute V21 V11^{-1}
P = V[n:, :n] @ np.linalg.inv(V[:n, :n])
return W, V, P
def stationary_P(lq, verbose=True):
"""
Computes the matrix :math:`P` that represent the value function
V(x) = x' P x
in the infinite horizon case. Computation is via imposing stability
on the solution path and using Schur decomposition.
Parameters
----------
lq : qe.LQ
QuantEcon class for analyzing linear quadratic optimal control
problems of infinite horizon form.
Returns
-------
P : array_like(float)
P matrix in the value function representation.
"""
Q = lq.Q
R = lq.R
A = lq.A * lq.beta ** (1/2)
B = lq.B * lq.beta ** (1/2)
n, k = lq.n, lq.k
L, N, M = construct_LNM(A, B, Q, R)
W, V, P = stable_solution(M, verbose=verbose)
return P
# compute P
stationary_P(lq)
eigenvalues:
W11: [1. 0.95238095]
W22: [1.05 1. ]
array([[ 0.1025, -2.05 ],
[-2.05 , 41. ]])
Note that the matrix \(P\) computed in this way is close to what we get from the routine in quantecon that solves an algebraic Riccati equation by iterating to convergence on a Riccati difference equation.
The small difference comes from computational errors and will decrease as we increase the maximum number of iterations or decrease the tolerance for convergence.
lq.stationary_values()
(array([[ 0.1025, -2.05 ],
[-2.05 , 41.01 ]]),
array([[-0.09761905, 1.95238095]]),
0)
Using a Schur decomposition is much more efficient.
%%timeit
stationary_P(lq, verbose=False)
87.5 μs ± 579 ns per loop (mean ± std. dev. of 7 runs, 10,000 loops each)
%%timeit
lq.stationary_values()
1.39 ms ± 1.85 μs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
57.8. Other Applications#
The preceding approach to imposing stability on a system of potentially unstable linear difference equations is not limited to linear quadratic dynamic optimization problems.
For example, the same method is used in our Stability in Linear Rational Expectations Models lecture.
Let’s try to solve the model described in that lecture by applying the stable_solution
function defined in this lecture above.
def construct_H(ρ, λ, δ):
"contruct matrix H given parameters."
H = np.empty((2, 2))
H[0, :] = ρ,δ
H[1, :] = - (1 - λ) / λ, 1 / λ
return H
H = construct_H(ρ=.9, λ=.5, δ=0)
W, V, P = stable_solution(H)
P
eigenvalues:
W11: [0.9]
W22: [2.]
array([[0.90909091]])
57.9. Discounted Problems#
57.9.1. Transforming States and Controls to Eliminate Discounting#
A pair of useful transformations allows us to convert a discounted problem into an undiscounted one.
Thus, suppose that we have a discounted problem with objective
and that the state transition equation is again \(x_{t +1 }=Ax_t+Bu_t\).
Define the transformed state and control variables
\(\hat x_t = \beta^{\frac{t}{2}} x_t \)
\(\hat u_t = \beta^{\frac{t}{2}} u_t\)
and the transformed transition equation matrices
\(\hat A = \beta^{\frac{1}{2}} A\)
\(\hat B = \beta^{\frac{1}{2}} B \)
so that the adjusted state and control variables obey the transition law
Then a discounted optimal control problem defined by \(A, B, R, Q, \beta\) having optimal policy characterized by \(P, F\) is associated with an equivalent undiscounted problem defined by \(\hat A, \hat B, Q, R\) having optimal policy characterized by \(\hat F, \hat P\) that satisfy the following equations:
and
It follows immediately from the definitions of \(\hat A, \hat B\) that \(\hat F = F\) and \(\hat P = P\).
By exploiting these transformations, we can solve a discounted problem by solving an associated undiscounted problem.
In particular, we can first transform a discounted LQ problem to an undiscounted one and then solve that discounted optimal regulator problem using the Lagrangian and invariant subspace methods described above.
For example, when \(\beta=\frac{1}{1+r}\), we can solve for \(P\) with \(\hat{A}=\beta^{1/2} A\) and \(\hat{B}=\beta^{1/2} B\).
These settings are adopted by default in the function stationary_P
defined above.
β = 1 / (1 + r)
lq.beta = β
stationary_P(lq)
eigenvalues:
W11: [0.97590007 0.97590007]
W22: [1.02469508 1.02469508]
array([[ 0.0525, -1.05 ],
[-1.05 , 21. ]])
We can verify that the solution agrees with one that comes from applying the routine LQ.stationary_values
in the quantecon package.
lq.stationary_values()
(array([[ 0.0525, -1.05 ],
[-1.05 , 21. ]]),
array([[-0.05, 1. ]]),
0.0)
57.9.2. Lagrangian for Discounted Problem#
For several purposes, it is useful explicitly briefly to describe a Lagrangian for a discounted problem.
Thus, for the discounted optimal linear regulator problem, form the Lagrangian
where \(2 \mu_{t+1}\) is a vector of Lagrange multipliers on the state vector \(x_{t+1}\).
First-order conditions for maximization with respect to \(\{u_t,x_{t+1}\}_{t=0}^\infty\) are
Define \(2 \mu_0\) to be the vector of shadow prices of \(x_0\) and apply an envelope condition to (57.18) to deduce that
which is a time \(t=0 \) counterpart to the second equation of system (57.19).
Proceeding as we did above with the undiscounted system (57.5), we can rearrange the first-order conditions into the system
which in the special case that \(\beta = 1\) agrees with equation (57.5), as expected.
By staring at system (57.20), we can infer identities that shed light on the structure of optimal linear regulator problems, some of which will be useful in this lecture when we apply and extend the methods of this lecture to study Stackelberg and Ramsey problems.
First, note that the first block of equation system (57.20) asserts that when \(\mu_{t+1} = P x_{t+1}\), then
which can be rearranged to sbe
This expression for the optimal closed loop dynamics of the state must agree with an alternative expression that we had derived with dynamic programming, namely,
But using
it follows that
Thus, our two expressions for the closed loop dynamics agree if and only if
Matrix equation (57.22) can be verified by applying a partitioned inverse formula.
Note
Just use the formula \((a - b d^{-1} c)^{-1} = a^{-1} + a^{-1} b (d - c a^{-1} b)^{-1} c a^{-1}\) for appropriate choices of the matrices \(a, b, c, d\).
Next, note that for any fixed \(F\) for which eigenvalues of \(A- BF\) are less than \(\frac{1}{\beta}\) in modulus, the value function associated with using this rule forever is \(- x_0 \tilde P x_0\) where \(\tilde P\) obeys the following matrix equation:
Evidently, \(\tilde P = P \) only when \(F \) obeys formula (57.21).
Next, note that the second equation of system (57.20) implies the “forward looking” equation for the Lagrange multiplier
whose solution is
where
where we must require that \(F\) obeys equation (57.21).
Equations (57.23) and (57.24) provide different perspectives on the optimal value function.