Optimization problem
minimize $ f_0(x) $
subject to $ f_i(x) \leq b_i, \quad i = 1, \dots, m. $
optimal solution $ x^* $
for any $ z $ with $ f_1(z) \leq b_1, \dots, f_m(z) \leq b_m $, we have $ f_0 (z) \geq f_0(x^*) $.
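A small numerical sketch of this standard form (assuming SciPy is available; the particular $ f_0 $ and $ f_1 $ below are made up for illustration):

```python
# Hypothetical example: minimize f0(x) = (x1 - 1)^2 + (x2 - 2)^2
# subject to f1(x) = x1 + x2 <= 1  (so b1 = 1).
import numpy as np
from scipy.optimize import minimize

f0 = lambda x: (x[0] - 1.0) ** 2 + (x[1] - 2.0) ** 2
# SciPy expects inequality constraints as g(x) >= 0, so f1(x) <= b1 becomes b1 - f1(x) >= 0.
cons = [{"type": "ineq", "fun": lambda x: 1.0 - (x[0] + x[1])}]

res = minimize(f0, x0=np.zeros(2), constraints=cons)
print(res.x, res.fun)  # numerical estimate of x* and f0(x*)
```

Any other feasible $ z $ should then satisfy $ f_0(z) \geq f_0(x^*) $, up to solver tolerance.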
example: data fitting (the variables $ x $ are model parameters; the objective measures the misfit between the model and observed data)
linear program
the objective and constraint functions $ f_0, \dots, f_m $ are linear, $$ f_i (\alpha x + \beta y) = \alpha f_i (x) + \beta f_i (y) $$ for all $ x, y \in R^n $ and all $ \alpha, \beta \in R $.
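For instance, a small linear program can be solved with SciPy's `linprog` (a sketch with made-up data $ c $, $ A $, $ b $):

```python
# Hypothetical LP: minimize c^T x subject to A x <= b and x >= 0.
import numpy as np
from scipy.optimize import linprog

c = np.array([-1.0, -2.0])                 # linear objective f0(x) = c^T x
A = np.array([[-1.0, 1.0], [1.0, 1.0]])    # rows are the linear constraint functions f_i
b = np.array([1.0, 4.0])                   # right-hand sides b_i

res = linprog(c, A_ub=A, b_ub=b, bounds=[(0, None), (0, None)])
print(res.x, res.fun)                      # optimal point and optimal value
```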
nonlinear program
the optimization problem is not linear, i.e., at least one of $ f_0, \dots, f_m $ is not a linear function.
convex optimization problems
the objective and constraint functions are convex, $$ f_i (\alpha x + \beta y) \leq \alpha f_i (x) + \beta f_i (y) $$ for all $ x, y \in R^n $ and all $ \alpha, \beta \in R $ with $ \alpha + \beta = 1 $, $ \alpha \geq 0 $, $ \beta \geq 0 $.
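A quick numerical sanity check (not a proof) of this inequality for the convex function $ f(x) = \lVert Ax - b \rVert_2^2 $, using random data and random $ \alpha + \beta = 1 $, $ \alpha, \beta \geq 0 $:

```python
# Sample random points and verify f(alpha*x + beta*y) <= alpha*f(x) + beta*f(y).
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 3))
b = rng.standard_normal(5)
f = lambda x: np.sum((A @ x - b) ** 2)   # f(x) = ||Ax - b||_2^2

for _ in range(1000):
    x, y = rng.standard_normal(3), rng.standard_normal(3)
    alpha = rng.uniform()                # alpha in [0, 1], beta = 1 - alpha
    beta = 1.0 - alpha
    assert f(alpha * x + beta * y) <= alpha * f(x) + beta * f(y) + 1e-9
print("convexity inequality held at all sampled points")
```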
convex optimization generalizes least-squares and linear programming: both are special cases of convex optimization problems.
general optimization problems are very difficult to solve; exceptions that can be solved efficiently and reliably: least-squares problems, linear programs, and convex optimization problems.
solving least-squares problems: minimize $ f(x) = \lVert Ax - b \rVert_2^2 $
analytical solution: $ x^* = (A^T A)^{-1} A^T b $
$$
\begin{align*}
f(x) &= \lVert Ax - b \rVert_2^2 = (Ax - b)^T (Ax - b)\\
&= (Ax)^T Ax - b^T Ax - (Ax)^T b + b^T b\\
&= x^T A^T Ax - b^T Ax - x^T A^T b + b^T b
\end{align*}
$$ Since $ b^T A x $ is a scalar (the product of matrices of sizes $ 1 \times k $, $ k \times n $, and $ n \times 1 $), it equals its own transpose:
$$
b^T A x = (b^T A x)^T = x^T A^T b
$$
Thus,
$$
f(x) = x^T A^T Ax - 2 b^T Ax + b^T b
$$ Take the gradient of $ f(x) $ w.r.t $ x $ $$
\nabla f(x) = 2 A^T A x - 2 A^T b
$$ To find the minimum, set the gradient to zero:
$$
2 A^T A x - 2 A^T b = 0 \Rightarrow \boxed{ A^T A x = A^T b }
$$ These are the normal equations; when $ A^T A $ is invertible (i.e., $ A $ has full column rank), the minimizer is $ x^* = (A^T A)^{-1} A^T b $.
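In code, the normal equations can be solved directly and checked against a library least-squares routine (a NumPy sketch with random illustrative data):

```python
import numpy as np

rng = np.random.default_rng(1)
k, n = 20, 5
A = rng.standard_normal((k, n))          # A in R^{k x n}, full column rank with high probability
b = rng.standard_normal(k)

x_normal = np.linalg.solve(A.T @ A, A.T @ b)      # solve A^T A x = A^T b
x_lstsq, *_ = np.linalg.lstsq(A, b, rcond=None)   # SVD-based least-squares solver

print(np.allclose(x_normal, x_lstsq))             # both give the same minimizer
```

In practice, `lstsq` (SVD/QR based) is numerically preferable to explicitly forming $ A^T A $, but the minimizer is the same.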
reliable and efficient algorithms and software exist $ \rightarrow $ least-squares is widely used.
computational time $ \propto n^2 k \quad (A \in R^{k \times n}) $.
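A rough, machine-dependent timing sketch of this scaling: for fixed $ k $, forming $ A^T A $ costs about $ n^2 k $ flops and dominates when $ k \gg n $, so doubling $ n $ should roughly quadruple the runtime.

```python
import time
import numpy as np

rng = np.random.default_rng(2)
k = 20000
for n in (200, 400, 800):
    A = rng.standard_normal((k, n))
    b = rng.standard_normal(k)
    t0 = time.perf_counter()
    x = np.linalg.solve(A.T @ A, A.T @ b)   # forming A^T A dominates for k >> n
    print(n, round(time.perf_counter() - t0, 3), "seconds")
```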