# Method of Lagrange multipliers
by kipid
This is an explanation of the method of Lagrange multipliers, which is used to find the conditions for optimization under constraints. The English will be revised from time to time. Please report errata and incorrect parts.
Several sections are initially hidden. Click the "▼ Show/Hide" button to see a hidden section, and hide the sections you have already read to save your computer's resources. The many equations rendered by MathJax can slow down the page, although equations outside your view are rendered lazily as you scroll.
## PH
  • 2023-03-16: To SEE.
  • 2014-06-12: docuK upgrade. css/js through CDN.
  • 2014-06-11: docuK upgrade.
## Finding minimum/maximum point of $f(x)$ (or $f(\vec{x} = x^i \vec{e}_i)$)
Many optimization problems follow the same procedure: (1) express the problem in the form of mathematical functions, and then (2) find the minima/maxima of these functions under the given conditions. To find the minimum/maximum point of $f(x)$ where $x$ goes from $a$ to $b$, i.e. $a \leq x \leq b$, we simply compare the boundary points ($x=a, b$) and the stationary points inside this region $a \leq x \leq b$. In other words, we compare $f(a)$, $f(b)$, and the values of $f$ at the points where
$$\frac{d f(x)} {d x} = 0 .$$
No other points need to be compared, unless $f(x)$ is not differentiable somewhere inside $[a, b] (\equiv a \leq x \leq b)$.
For functions of multiple variables, e.g. $f(\vec{x}) \equiv f(x_1, x_2, \cdots)$, stationary means that
$$\frac{\partial f(\vec{x})} {\partial x_k} = 0$$
for all variables $x_k$. In this case, we normally compare the stationary points inside the region of interest and the stationary points on the boundary of the region. We can express the boundary of the region by introducing a constraint $g(\vec{x}) = c$: on the boundary, the variables must satisfy $g(\vec{x}) = c$.
## Optimization with constraints: Method of Lagrange multipliers
In many cases, the variables are subject to constraints besides the boundary, so we cannot change the variables fully arbitrarily. The variables must change while satisfying the constraints $g_l (\vec{x}) =$const. So an infinitesimal change of the variables must satisfy
$$\sum_{k} \frac{\partial g_l(\vec{x})} {\partial x_k} \delta x_k = 0 .$$
Regard this as
$$\frac{d g_l(\vec{x})} {d \tau} = \sum_{k} \frac{\partial g_l(\vec{x})} {\partial x_k} \frac{d x_k} {d \tau} = 0$$
for any allowed movement $\{\delta x_k \equiv \frac{d x_k} {d \tau} \}$ along the constraint surface.
To find the minima/maxima of $f(\{x_i\})$ subject to the constraints, we have to find the stationary (or extreme) points, where
$$\sum_{k} \frac{\partial f(\{x_i\})} {\partial x_k} \delta x_k = 0$$
for any infinitesimal deviation $\{\delta x_k\}$ satisfying the constraint condition $\sum_{k} \frac{\partial g_l} {\partial x_k} \delta x_k = 0$ for every $l$.
// This is the condition for stationary points, not for global minima or maxima. A stationary point can be a local minimum, a local maximum, or a saddle point.
### Multiple variables and single constraint
For the sake of simplicity, let's first think about the case with a single constraint. With this single constraint, one of the variables becomes dependent on the others. In other words, one variable is determined by the others. Selecting $l$ where $\partial_{x_l} \big[ g(\{x_i\}) \big] \neq 0$, the constraint condition
$$\frac{\partial g(\{x_i\})} {\partial x_l} \delta x_l + \sum_{k \neq l} \frac{\partial g(\{x_i\})} {\partial x_k} \delta x_k = 0$$
gives
$$\delta x_l = - \frac{\sum_{k \neq l} \partial_{x_k} \big[ g(\{x_i\}) \big] \delta x_k} {\partial_{x_l} \big[ g(\{x_i\}) \big]} .$$
Only $\delta x_l$ is determined in this way; the other $\delta x_k$'s can be fully arbitrary. Then the stationarity condition becomes
$$\begin{split} &\frac{\partial f(\{x_i\})} {\partial x_l} \delta x_l + \sum_{k \neq l} \frac{\partial f(\{x_i\})} {\partial x_k} \delta x_k = 0 . \\ &\frac{\partial f(\{x_i\})} {\partial x_l} \bigg[ - \frac{\sum_{k \neq l} \partial_{x_k} \big[ g(\{x_i\}) \big] \delta x_k} {\partial_{x_l} \big[ g(\{x_i\}) \big]} \bigg] + \sum_{k \neq l} \frac{\partial f(\{x_i\})} {\partial x_k} \delta x_k = 0 . \end{split}$$
Introducing $\lambda$ as
$$\lambda \equiv \frac{\partial_{x_l} \big[ f(\{x_i\}) \big]} {\partial_{x_l} \big[ g(\{x_i\}) \big]} \quad \textrm{or} \quad \frac{\partial f(\{x_i\})} {\partial x_l} - \lambda \frac{\partial g(\{x_i\})} {\partial x_l} = 0 ,$$
we get
$$\begin{split} &- \lambda \sum_{k \neq l} \frac{\partial g(\{x_i\})} {\partial x_k} \delta x_k + \sum_{k \neq l} \frac{\partial f(\{x_i\})} {\partial x_k} \delta x_k = 0 . \\ & \sum_{k \neq l} \bigg[ \frac{\partial f(\{x_i\})} {\partial x_k} - \lambda \frac{\partial g(\{x_i\})} {\partial x_k} \bigg] \delta x_k = 0 . \end{split}$$
As the $\delta x_k$'s can be fully arbitrary,
$$\frac{\partial f(\{x_i\})} {\partial x_k} - \lambda \frac{\partial g(\{x_i\})} {\partial x_k} = 0 \quad \textrm{for any } k \neq l .$$
Combining this with the definition of $\lambda$, which is the same equation for $k = l$,
$$\frac{\partial f(\{x_i\})} {\partial x_k} - \lambda \frac{\partial g(\{x_i\})} {\partial x_k} = 0 \quad \textrm{for all } k ,$$
which is the primitive version of the method of Lagrange multipliers. This is just the condition for stationarity. To find the actual minima/maxima, we have to compare the values at these stationary points and at the boundaries.
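The single-constraint condition can be checked numerically on a small example. The sketch below is only an illustration: the functions $f(x, y) = x y$ and $g(x, y) = x + y = 1$, the candidate point $(1/2, 1/2)$, and the helper `partial` are assumptions made for this example and do not come from the derivation above.

```python
# Numerical check of  df/dx_k - lambda * dg/dx_k = 0  for a single constraint.
# Illustrative problem (an assumption): f(x, y) = x * y subject to g(x, y) = x + y = 1.

def f(x, y):
    return x * y

def g(x, y):
    return x + y

def partial(func, point, index, step=1e-6):
    """Central finite-difference estimate of d func / d x_index at the given point."""
    plus = list(point)
    minus = list(point)
    plus[index] += step
    minus[index] -= step
    return (func(*plus) - func(*minus)) / (2 * step)

# Candidate stationary point on the constraint surface x + y = 1.
point = (0.5, 0.5)

# Choose l = 0 (the x variable), where dg/dx_l != 0, and define lambda there.
lam = partial(f, point, 0) / partial(g, point, 0)

# The remaining component (k = 1) must then vanish as well.
residual = partial(f, point, 1) - lam * partial(g, point, 1)

print(f"lambda = {lam:.6f}, residual for k != l = {residual:.2e}")
```

At a point on the constraint that is not stationary, e.g. $(0.2, 0.8)$, the same residual does not vanish, which is how the condition singles out the stationary points.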
There is another simple approach. Introducing $\lambda$ as before, consider the stationarity condition minus $\lambda$ times the constraint condition,
$$\sum_{k} \bigg[ \frac{\partial f(\{x_i\})} {\partial x_k} - \lambda \frac{\partial g(\{x_i\})} {\partial x_k} \bigg] \delta x_k = 0 .$$
As the $l$-th term vanishes automatically by the definition of $\lambda$, and only $\delta x_l$ is determined by the others, we have to satisfy
$$\sum_{k \neq l} \bigg[ \frac{\partial f(\{x_i\})} {\partial x_k} - \lambda \frac{\partial g(\{x_i\})} {\partial x_k} \bigg] \delta x_k = 0$$
for fully arbitrary $\delta x_k$'s. Then the result follows directly.
### Multiple constraints
When there are $n$ constraints on the variables, i.e. $g_j (\{x_i\}) =$const for $j=1,2,\cdots,n$,
$$\sum_{k} \frac{\partial f(\{x_i\})} {\partial x_k} \delta x_k = 0 \quad \textrm{at extreme or stationary points}$$
for any $\delta x_k$'s satisfying
$$\sum_{k} \frac{\partial g_j (\{x_i\})} {\partial x_k} \delta x_k = 0 \quad \textrm{for } j = 1,2,\cdots,n .$$
In this case, $n$ of the variables become dependent on, i.e. determined by, the others. Labeling these dependent variables by $l$,
$$\sum_{l\textrm{'s}} \frac{\partial g_j (\{x_i\})} {\partial x_l} \delta x_l + \sum_{k \neq l\textrm{'s}} \frac{\partial g_j (\{x_i\})} {\partial x_k} \delta x_k = 0 \quad \textrm{for } j = 1,2,\cdots,n .$$
Selecting the $l$'s so that the $n \times n$ matrix $A_{jl} \equiv \partial g_j / \partial x_l$ is invertible and introducing its inverse, i.e.
$$\sum_{j=1}^{n} A^{-1}_{pj} \frac{\partial g_j (\{x_i\})} {\partial x_l} = \delta_{pl} \quad \textrm{with } p = \textrm{one of } l\textrm{'s} ,$$
then
$$\begin{split} &\sum_{j=1}^{n} A^{-1}_{pj} \bigg[ \sum_{l\textrm{'s}} \frac{\partial g_j (\{x_i\})} {\partial x_l} \delta x_l + \sum_{k \neq l\textrm{'s}} \frac{\partial g_j (\{x_i\})} {\partial x_k} \delta x_k \bigg] = 0 . \\ & \sum_{l\textrm{'s}} \delta_{pl} \delta x_l + \sum_{j=1}^{n} \sum_{k \neq l\textrm{'s}} A^{-1}_{pj} \frac{\partial g_j (\{x_i\})} {\partial x_k} \delta x_k = 0 . \\ &\delta x_l = - \sum_{j=1}^{n} \sum_{k \neq l\textrm{'s}} A^{-1}_{lj} \frac{\partial g_j (\{x_i\})} {\partial x_k} \delta x_k , \end{split}$$
where the free index $p$ has been renamed $l$ in the last line. Putting this into the stationarity condition,
$$\begin{split} &\sum_{l\textrm{'s}} \frac{\partial f(\{x_i\})} {\partial x_l} \delta x_l + \sum_{k \neq l\textrm{'s}} \frac{\partial f(\{x_i\})} {\partial x_k} \delta x_k = 0 . \\ &\sum_{l\textrm{'s}} \frac{\partial f(\{x_i\})} {\partial x_l} \bigg[ - \sum_{j=1}^{n} \sum_{k \neq l\textrm{'s}} A^{-1}_{lj} \frac{\partial g_j (\{x_i\})} {\partial x_k} \delta x_k \bigg] + \sum_{k \neq l\textrm{'s}} \frac{\partial f(\{x_i\})} {\partial x_k} \delta x_k = 0 . \\ & \sum_{k \neq l\textrm{'s}} \bigg[ \frac{\partial f(\{x_i\})} {\partial x_k} - \sum_{l\textrm{'s}} \frac{\partial f(\{x_i\})} {\partial x_l} \sum_{j=1}^{n} A^{-1}_{lj} \frac{\partial g_j (\{x_i\})} {\partial x_k} \bigg] \delta x_k = 0 . \end{split}$$
Defining $\lambda_j \equiv \sum_{l\textrm{'s}} \frac{\partial f(\{x_i\})} {\partial x_l} A^{-1}_{lj}$,
$$\sum_{k \neq l\textrm{'s}} \bigg[ \frac{\partial f(\{x_i\})} {\partial x_k} - \sum_{j=1}^{n} \lambda_{j} \frac{\partial g_j (\{x_i\})} {\partial x_k} \bigg] \delta x_k = 0 .$$
As these $\delta x_k$'s are fully arbitrary,
$$\frac{\partial f(\{x_i\})} {\partial x_k} - \sum_{j=1}^{n} \lambda_{j} \frac{\partial g_j (\{x_i\})} {\partial x_k} = 0 \quad \textrm{for } k \neq l\textrm{'s} .$$
For $p = l$'s,
$$\begin{split} &\frac{\partial f(\{x_i\})} {\partial x_p} - \sum_{j=1}^{n} \lambda_{j} \frac{\partial g_j (\{x_i\})} {\partial x_p} \quad \textrm{for }p = l\textrm{'s} \\ &= \frac{\partial f(\{x_i\})} {\partial x_p} - \sum_{j=1}^{n} \sum_{l\textrm{'s}} \frac{\partial f(\{x_i\})} {\partial x_l} A^{-1}_{lj} \frac{\partial g_j (\{x_i\})} {\partial x_p} \\ &= \frac{\partial f(\{x_i\})} {\partial x_p} - \sum_{l\textrm{'s}} \frac{\partial f(\{x_i\})} {\partial x_l} \delta_{lp} \\ &= \frac{\partial f(\{x_i\})} {\partial x_p} - \frac{\partial f(\{x_i\})} {\partial x_p} = 0 . \end{split}$$
Combining the above two results, we obtain
$$\frac{\partial f(\{x_i\})} {\partial x_k} - \sum_{j=1}^{n} \lambda_{j} \frac{\partial g_j (\{x_i\})} {\partial x_k} = 0 \quad \textrm{for all } k .$$
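The construction of the multipliers through the inverse matrix $A^{-1}$ can be followed step by step in code. This is a minimal sketch under stated assumptions: the problem $f = x^2 + y^2 + z^2$ with $g_1 = x + y + z = 1$ and $g_2 = x - y = 0$, the candidate point $(1/3, 1/3, 1/3)$, and the choice of $x$ and $y$ as the dependent variables are all picked for illustration only.

```python
# Multipliers from lambda_j = sum_l (df/dx_l) * Ainv[l][j], with A[j][l] = dg_j/dx_l.
# Illustrative problem (an assumption): f = x^2 + y^2 + z^2,
# g_1 = x + y + z = 1 and g_2 = x - y = 0, candidate point (1/3, 1/3, 1/3).

point = (1.0 / 3.0, 1.0 / 3.0, 1.0 / 3.0)

# Gradient of f at the candidate point.
grad_f = [2.0 * value for value in point]

# grad_g[j][k] = d g_j / d x_k (constant for these linear constraints).
grad_g = [
    [1.0, 1.0, 1.0],   # g_1 = x + y + z
    [1.0, -1.0, 0.0],  # g_2 = x - y
]

# Dependent variables (the "l's"): x and y. The free variable is z.
dependent = [0, 1]

# A[j][l] restricted to the dependent variables, and its 2x2 inverse.
a = [[grad_g[j][l] for l in dependent] for j in range(2)]
det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
a_inv = [
    [a[1][1] / det, -a[0][1] / det],
    [-a[1][0] / det, a[0][0] / det],
]

# lambda_j = sum over dependent l of (df/dx_l) * Ainv[l][j].
lambdas = [
    sum(grad_f[dependent[l]] * a_inv[l][j] for l in range(2))
    for j in range(2)
]

# Residual of  df/dx_k - sum_j lambda_j dg_j/dx_k  for every k (dependent and free).
residuals = [
    grad_f[k] - sum(lambdas[j] * grad_g[j][k] for j in range(2))
    for k in range(3)
]

print("lambdas  :", lambdas)
print("residuals:", residuals)
```

The residuals for the dependent variables vanish by construction, as in the check for $p = l$'s above; the residual for the free variable $z$ vanishes only because the candidate point is actually stationary.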
Here again we can start from the assumption that we properly select the $\lambda_j$'s to satisfy
$$\frac{\partial f(\{x_i\})} {\partial x_p} - \sum_{j=1}^{n} \lambda_{j} \frac{\partial g_j (\{x_i\})} {\partial x_p} = 0 \quad \textrm{for }p = l\textrm{'s} .$$
Then the same result follows directly, though this is not a strict proof.
### Functional variables
Sometimes we have to optimize a quantity with functional variables. When $S$ is given by
$$S \big( \{ h_i (\vec{x}') \} \big) = \int d \vec{x} ~ f \big( \{ h_i (\vec{x}) \} \big) ,$$
where $f$ may depend on the derivatives of the $h_i (\vec{x})$'s as well, let's find the shapes of the functions $h_i (\vec{x})$ which make $S$ stationary or minimum/maximum. At an extreme, any deviation of the functional variables, subject to the constraints, gives
$$\delta S \big( \{ h_i (\vec{x}') \} \big) = S \big( \{ h_i (\vec{x}') + \delta h_i (\vec{x}') \} \big) - S \big( \{ h_i (\vec{x}') \} \big) = 0 .$$
Expanding to first order in the deviation,
$$\begin{split} \delta S \big( \{ h_i (\vec{x}') \} \big) &= \int d \vec{x} \Big[ f \big( \{ h_i (\vec{x}) + \delta h_i (\vec{x}) \} \big) - f \big( \{ h_i (\vec{x}) \} \big) \Big] \\ &= \int d \vec{x} \sum_k \Big[ \frac{\partial f} {\partial h_k (\vec{x})} \delta h_k (\vec{x}) + \sum_{\mu} \frac{\partial f} {\partial \big( \partial_{\mu} h_k (\vec{x}) \big) } \partial_{\mu} \big( \delta h_k (\vec{x}) \big) \\ &~~~~~~~~~~~~~~~~~ + \sum_{\mu, \nu} \frac{\partial f} {\partial \big( \partial_{\mu, \nu} h_k (\vec{x}) \big)} \partial_{\mu, \nu} \big( \delta h_k (\vec{x}) \big) + \cdots \Big] \end{split}$$
where
$$\begin{split} &\partial_{\mu} h_k (\vec{x}) \equiv \frac{\partial h_k (\vec{x}) } {\partial x^{\mu}}, \\ &\partial_{\mu, \nu} h_k (\vec{x}) \equiv \frac{\partial^2 h_k (\vec{x}) } {\partial x^{\mu} \partial x^{\nu} } = \frac{\partial^2 h_k (\vec{x}) } {\partial x^{\nu} \partial x^{\mu} }, \\ &\cdots . \end{split}$$
and $\sum_{\mu, \nu}$ goes over all possible combinations without double counting.
We can regard this as
$$\delta S \big( \{ h_i (\vec{x}) \} \big) = \frac{d S \big( \{ h_i (\vec{x}) + \epsilon \delta h_i (\vec{x}) \} \big)} {d \epsilon} \bigg|_{\epsilon=0}$$
where $\delta h_i (\vec{x})$ is an arbitrary function. Now integrate by parts, with boundary conditions (periodic, vanishing at the boundary or at infinity, and so on) which make the integrated-out (surface) terms vanish.
$$\begin{split} &\delta S \big( \{ h_i (\vec{x}') \} \big) = \sum_{k} \Bigg[ \int d \vec{x} \bigg[ \frac{\partial f} {\partial h_k (\vec{x})} \delta h_k (\vec{x}) \bigg] \\ &+ \sum_{\mu} \int \frac{d \vec{x}} {d x^{\mu}} \bigg[ \frac{\partial f} {\partial \big( \partial_{\mu} h_k (\vec{x}) \big) } \delta h_k (\vec{x}) \bigg] \bigg|_{\partial_{\mu} V} - \int d \vec{x} \bigg[ \sum_{\mu} \frac{\partial} {\partial x^{\mu}} \Big[ \frac{\partial f} {\partial \big( \partial_{\mu} h_k (\vec{x}) \big) } \Big] \delta h_k (\vec{x}) \bigg] \\ &+ \sum_{\mu, \nu} \int \frac{d \vec{x}} {d x^{\mu}} \bigg[ \frac{\partial f} {\partial \big( \partial_{\mu, \nu} h_k (\vec{x}) \big)} \partial_{\nu} \big( \delta h_k (\vec{x}) \big) \bigg] \bigg|_{\partial_{\mu} V} - \int d \vec{x} \bigg[ \sum_{\mu, \nu} \frac{\partial} {\partial x^{\mu}} \Big[ \frac{\partial f} {\partial \big( \partial_{\mu, \nu} h_k (\vec{x}) \big)} \Big] \partial_{\nu} \big( \delta h_k (\vec{x}) \big) \bigg] \\ &+ \cdots \Bigg] \\ &= \sum_{k} \Bigg[ \int d \vec{x} \bigg[ \frac{\partial f} {\partial h_k (\vec{x})} \delta h_k (\vec{x}) \bigg] \\ &+ \sum_{\mu} \int \frac{d \vec{x}} {d x^{\mu}} \bigg[ \frac{\partial f} {\partial \big( \partial_{\mu} h_k (\vec{x}) \big) } \delta h_k (\vec{x}) \bigg] \bigg|_{\partial_{\mu} V} - \int d \vec{x} \bigg[ \sum_{\mu} \frac{\partial} {\partial x^{\mu}} \Big[ \frac{\partial f} {\partial \big( \partial_{\mu} h_k (\vec{x}) \big) } \Big] \delta h_k (\vec{x}) \bigg] \\ &+ \sum_{\mu, \nu} \int \frac{d \vec{x}} {d x^{\mu}} \bigg[ \frac{\partial f} {\partial \big( \partial_{\mu, \nu} h_k (\vec{x}) \big)} \partial_{\nu} \big( \delta h_k (\vec{x}) \big) \bigg] \bigg|_{\partial_{\mu} V} - \sum_{\mu, \nu} \int \frac{d \vec{x}} {d x^{\nu}} \bigg[ \frac{\partial} {\partial x^{\mu}} \Big[ \frac{\partial f} {\partial \big( \partial_{\mu, \nu} h_k (\vec{x}) \big)} \Big] \delta h_k (\vec{x}) \bigg] \bigg|_{\partial_{\nu} V} \\ &~~~~ + \int d \vec{x} \bigg[ \sum_{\mu, \nu} \frac{\partial^2} {\partial x^{\nu} \partial x^{\mu}} \Big[ \frac{\partial f} {\partial \big( \partial_{\mu, \nu} h_k (\vec{x}) \big)} \Big] \delta h_k (\vec{x}) \bigg] \\ &+ \cdots \Bigg] \end{split}$$
where $\int d \vec{x} / d x^{\mu}$ means integration over all variables except $x^{\mu}$. When the integrated-out terms vanish because of the boundary conditions, the result becomes
$$\begin{split} &\delta S \big(\{h_i (\vec{x}') \} \big) \\ &= \sum_{k} \int d \vec{x} \bigg[\frac{\partial f} {\partial h_k (\vec{x})} - \sum_{\mu} \frac{\partial} {\partial x^{\mu}} \Big[\frac{\partial f} {\partial \big(\partial_{\mu} h_k (\vec{x}) \big) } \Big] + \sum_{\mu, \nu} \frac{\partial^2} {\partial x^{\nu} \partial x^{\mu}} \Big[\frac{\partial f} {\partial \big(\partial_{\mu, \nu} h_k (\vec{x}) \big)} \Big] + \cdots \bigg] \delta h_k (\vec{x}) . \end{split}$$
When the functions $h_k (\vec{x})$ take their extreme form, they should satisfy
$$\frac{\partial f} {\partial h_k (\vec{x})} - \sum_{\mu} \frac{\partial} {\partial x^{\mu}} \Big[\frac{\partial f} {\partial \big(\partial_{\mu} h_k (\vec{x}) \big) } \Big] + \sum_{\mu, \nu} \frac{\partial^2} {\partial x^{\nu} \partial x^{\mu}} \Big[\frac{\partial f} {\partial \big(\partial_{\mu, \nu} h_k (\vec{x}) \big)} \Big] + \cdots = 0 .$$
Additionally, there can be some constraints. Treating the integration as a summation, we can follow a similar procedure as before. First consider constraints with full integration, i.e. integration over all the variables. Let's call these type1 constraints.
$$\int d \vec{x} ~ g_j \big(\{h_i (\vec{x}) \} \big) = \textrm{const} .$$
Then
$$\begin{split} &\sum_{k} \int d \vec{x} \bigg[\frac{\partial g_j} {\partial h_k (\vec{x})} - \sum_{\mu} \frac{\partial} {\partial x^{\mu}} \Big[\frac{\partial g_j} {\partial \big(\partial_{\mu} h_k (\vec{x}) \big) } \Big] + \sum_{\mu, \nu} \frac{\partial^2} {\partial x^{\nu} \partial x^{\mu}} \Big[\frac{\partial g_j} {\partial \big(\partial_{\mu, \nu} h_k (\vec{x}) \big)} \Big] + \cdots \bigg] \delta h_k (\vec{x}) \\ &= 0 \end{split}$$
with appropriate boundary conditions. Taking the points $\vec{x}_p$ properly so that we can find the inverse,
$$\begin{split} &\int_{\vec{x}_p \textrm{'s}} d \vec{x} \bigg[\frac{\partial g_j} {\partial h_l (\vec{x})} - \sum_{\mu} \frac{\partial} {\partial x^{\mu}} \Big[\frac{\partial g_j} {\partial \big(\partial_{\mu} h_l (\vec{x}) \big) } \Big] + \sum_{\mu, \nu} \frac{\partial^2} {\partial x^{\nu} \partial x^{\mu}} \Big[\frac{\partial g_j} {\partial \big(\partial_{\mu, \nu} h_l (\vec{x}) \big)} \Big] + \cdots \bigg] \delta h_l (\vec{x}) \\ &+ \int_{V - \vec{x}_p \textrm{'s}} d \vec{x} \bigg[\frac{\partial g_j} {\partial h_l (\vec{x})} - \sum_{\mu} \frac{\partial} {\partial x^{\mu}} \Big[\frac{\partial g_j} {\partial \big(\partial_{\mu} h_l (\vec{x}) \big) } \Big] + \sum_{\mu, \nu} \frac{\partial^2} {\partial x^{\nu} \partial x^{\mu}} \Big[\frac{\partial g_j} {\partial \big(\partial_{\mu, \nu} h_l (\vec{x}) \big)} \Big] + \cdots \bigg] \delta h_l (\vec{x}) \\ &+ \sum_{k \neq l} \int d \vec{x} \bigg[\frac{\partial g_j} {\partial h_k (\vec{x})} - \sum_{\mu} \frac{\partial} {\partial x^{\mu}} \Big[\frac{\partial g_j} {\partial \big(\partial_{\mu} h_k (\vec{x}) \big) } \Big] + \sum_{\mu, \nu} \frac{\partial^2} {\partial x^{\nu} \partial x^{\mu}} \Big[\frac{\partial g_j} {\partial \big(\partial_{\mu, \nu} h_k (\vec{x}) \big)} \Big] + \cdots \bigg] \delta h_k (\vec{x}) \\ &= 0 . \end{split}$$
So, finding the inverse, we can make only $\delta h_l (\vec{x})$ at the $\vec{x}_p \textrm{'s}$ determined by the others, while all the other deviations remain fully arbitrary.
This is what treating the integration as a summation means. Then similar approaches give
$$\begin{split} &\frac{\partial f} {\partial h_k (\vec{x})} - \sum_{\mu} \frac{\partial} {\partial x^{\mu}} \Big[\frac{\partial f} {\partial \big(\partial_{\mu} h_k (\vec{x}) \big) } \Big] + \sum_{\mu, \nu} \frac{\partial^2} {\partial x^{\nu} \partial x^{\mu}} \Big[\frac{\partial f} {\partial \big(\partial_{\mu, \nu} h_k (\vec{x}) \big)} \Big] + \cdots \\ &= \sum_{l \in \textrm{type1}} \lambda_l \bigg[\frac{\partial g_l} {\partial h_k (\vec{x})} - \sum_{\mu} \frac{\partial} {\partial x^{\mu}} \Big[\frac{\partial g_l} {\partial \big(\partial_{\mu} h_k (\vec{x}) \big) } \Big] + \sum_{\mu, \nu} \frac{\partial^2} {\partial x^{\nu} \partial x^{\mu}} \Big[\frac{\partial g_l} {\partial \big(\partial_{\mu, \nu} h_k (\vec{x}) \big)} \Big] + \cdots \bigg] . \end{split}$$
If there is a partially integrated constraint (let's call this a type2 constraint), i.e. one integrated over all variables except $x^{\kappa}$,
$$\int \frac{d \vec{x}} {d x^{\kappa}} g \big(\{h_i ( \vec{x} ) \} \big) = \textrm{const} ,$$
we can introduce an arbitrary function $\lambda(x^{\kappa})$, making
$$\int d x^{\kappa} \lambda(x^{\kappa}) \bigg[\Big[\int \frac{d \vec{x}} {d x^{\kappa}} g \big(\{h_i ( \vec{x} ) \} \big) \Big] - \textrm{const} \bigg] = 0 .$$
For these type2 constraints, the Lagrange multipliers depend on the variable that is not integrated out. Then
$$\begin{split} &\frac{\partial f} {\partial h_k (\vec{x})} - \sum_{\mu} \frac{\partial} {\partial x^{\mu}} \Big[\frac{\partial f} {\partial \big(\partial_{\mu} h_k (\vec{x}) \big) } \Big] + \sum_{\mu, \nu} \frac{\partial^2} {\partial x^{\nu} \partial x^{\mu}} \Big[\frac{\partial f} {\partial \big(\partial_{\mu, \nu} h_k (\vec{x}) \big)} \Big] + \cdots \\ &= \sum_{l \in \textrm{type1}} \lambda_l \bigg[\frac{\partial g_l} {\partial h_k (\vec{x})} - \sum_{\mu} \frac{\partial} {\partial x^{\mu}} \Big[\frac{\partial g_l} {\partial \big(\partial_{\mu} h_k (\vec{x}) \big) } \Big] + \sum_{\mu, \nu} \frac{\partial^2} {\partial x^{\nu} \partial x^{\mu}} \Big[\frac{\partial g_l} {\partial \big(\partial_{\mu, \nu} h_k (\vec{x}) \big)} \Big] + \cdots \bigg] \\ &+ \sum_{l \in \textrm{type2}} \bigg[\lambda_l (x^{\kappa}) \frac{\partial g_l} {\partial h_k (\vec{x})} - \sum_{\mu} \frac{\partial} {\partial x^{\mu}} \Big[\lambda_l (x^{\kappa}) \frac{\partial g_l} {\partial \big(\partial_{\mu} h_k (\vec{x}) \big) } \Big] + \sum_{\mu, \nu} \frac{\partial^2} {\partial x^{\nu} \partial x^{\mu}} \Big[\lambda_l (x^{\kappa}) \frac{\partial g_l} {\partial \big(\partial_{\mu, \nu} h_k (\vec{x}) \big)} \Big] + \cdots \bigg] . \end{split}$$
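A discretized example may make the type1 case more concrete. The sketch below is an illustration under assumptions that are not part of the derivation: a single function $h(x)$ on $[0, 1]$, the integrand $f = -h \ln h$ (no derivative terms), and one fully integrated constraint $\int dx ~ h = 1$ with $g = h$. The stationarity condition then reads $-\ln h - 1 = \lambda$, so $h$ must be constant, and the normalization fixes $h = 1$ and $\lambda = -1$. No type2 constraint is involved here.

```python
import math

# Discretized check of  df/dh = lambda * dg/dh  for a fully integrated (type1) constraint.
# Illustrative problem (an assumption): S = -∫ h ln h dx on [0, 1], with ∫ h dx = 1.

n = 200
dx = 1.0 / n
xs = [(i + 0.5) * dx for i in range(n)]  # midpoints of the grid cells

def functional(h):
    """S = -∫ h ln h dx, approximated by a midpoint sum."""
    return -sum(value * math.log(value) for value in h) * dx

def constraint(h):
    """∫ h dx, approximated by a midpoint sum."""
    return sum(h) * dx

# Candidate stationary function: h(x) = 1 everywhere.
uniform = [1.0] * n

# Pointwise ratio (df/dh) / (dg/dh) = (-ln h - 1) / 1; it must be the same constant lambda at every x.
ratios = [(-math.log(value) - 1.0) / 1.0 for value in uniform]
lam = ratios[0]
is_constant = all(abs(ratio - lam) < 1e-12 for ratio in ratios)

# A deviation that keeps the constraint: eps * cos(2 pi x) integrates to zero over [0, 1].
eps = 0.1
perturbed = [1.0 + eps * math.cos(2.0 * math.pi * x) for x in xs]

print("lambda =", lam, "constant along x:", is_constant)
print("constraint after deviation:", constraint(perturbed))
print("S(uniform) =", functional(uniform), " S(perturbed) =", functional(perturbed))
```

The deviation keeps $\int dx ~ h$ unchanged but lowers $S$, which is consistent with the uniform function being a constrained maximum in this particular example.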
#### Multiple integrations
When $S$, which has to be optimized, is given by
$$S = \int \int \cdots \int d \vec{x}_1 d \vec{x}_2 \cdots d \vec{x}_n ~ f \Big(\{h_i (\vec{x}_1 )\} , \{h_i (\vec{x}_2 )\} , \cdots \{h_i (\vec{x}_n )\} \Big) ,$$
i.e. when $S$ is given by multiple integrations, how should we deal with this problem? How does $S$ change under small variations of $h_k (\vec{x})$? The change is
$$\delta S = \int \int \cdots \int d \vec{x}_1 d \vec{x}_2 \cdots d \vec{x}_n ~ \Big[\frac{\partial f}{\partial h_k (\vec{x}_1)} \delta h_k (\vec{x}_1) + \frac{\partial f}{\partial h_k (\vec{x}_2)} \delta h_k (\vec{x}_2) + \cdots + \frac{\partial f}{\partial h_k (\vec{x}_n)} \delta h_k (\vec{x}_n) \Big] .$$
Here the variables $\vec{x}_i$ used in the integrations are dummy variables, so we can rename them freely. For the sake of simplicity, see an example. When $S$ is given by
$$S = \int \int d \vec{x}_1 d \vec{x}_2 ~ h (\vec{x}_1)^2 h (\vec{x}_2)^3 ,$$
$$\begin{split} \delta S &= \int \int d \vec{x}_1 d \vec{x}_2 ~ \Big[2 h (\vec{x}_1) h (\vec{x}_2)^3 \delta h (\vec{x}_1) + h (\vec{x}_1)^2 3 h (\vec{x}_2)^2 \delta h (\vec{x}_2) \Big] \\ &= \int d \vec{x}_1 ~ \delta h (\vec{x}_1) \int d \vec{x}_2 ~ \Big[2 h (\vec{x}_1) h (\vec{x}_2)^3 + h (\vec{x}_2)^2 3 h (\vec{x}_1)^2 \Big] \\ &= \int d \vec{x} ~ \delta h (\vec{x}) \Big[\int d \vec{x}_2 ~ 2 h (\vec{x}) h (\vec{x}_2)^3 + \int d \vec{x}_1 ~ h (\vec{x}_1)^2 3 h (\vec{x})^2 \Big] \end{split}$$
because the infinitesimal variation $\delta h$ is the same function in both terms; only the name of its argument differs.
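The example with $S = \int \int d \vec{x}_1 d \vec{x}_2 ~ h (\vec{x}_1)^2 h (\vec{x}_2)^3$ can be verified on a grid. The following sketch assumes, purely for illustration, a one-dimensional domain $[0, 1]$, the test function $h(x) = 1 + x$, and a midpoint discretization; it compares the bracket multiplying $\delta h (\vec{x})$ in the last line with a finite-difference estimate of the change of $S$.

```python
# Finite-difference check of the variation of S = (∫ h^2 dx) * (∫ h^3 dx).
# Assumptions for illustration: domain [0, 1], h(x) = 1 + x, midpoint grid.

n = 100
dx = 1.0 / n
xs = [(i + 0.5) * dx for i in range(n)]
h = [1.0 + x for x in xs]

def integral_of_power(values, power):
    """∫ h^power dx as a midpoint sum."""
    return sum(value ** power for value in values) * dx

def action(values):
    """S = ∫∫ dx1 dx2 h(x1)^2 h(x2)^3, which factorizes into two single integrals."""
    return integral_of_power(values, 2) * integral_of_power(values, 3)

# Analytic bracket at the grid point m: 2 h(x) ∫ h^3 + 3 h(x)^2 ∫ h^2.
m = n // 2
analytic = 2.0 * h[m] * integral_of_power(h, 3) + 3.0 * h[m] ** 2 * integral_of_power(h, 2)

# Numerical estimate: vary h only in cell m and divide by the cell width dx.
eps = 1e-4
h_plus = list(h)
h_minus = list(h)
h_plus[m] += eps
h_minus[m] -= eps
numeric = (action(h_plus) - action(h_minus)) / (2.0 * eps * dx)

print("analytic:", analytic)
print("numeric :", numeric)
```

Both contributions, from the $h^2$ factor and from the $h^3$ factor, are needed for the two numbers to agree, which is the point of renaming the dummy variables.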
Therefore in general,
$$\begin{split} \delta S = \int d \vec{x} ~ \delta &h_k (\vec{x}) \Bigg[\int \cdots \int d \vec{x}_2 \cdots d \vec{x}_n ~ \frac{\partial f \Big(\{h_i (\vec{x})\} , \{h_i (\vec{x}_2 )\} , \cdots \{h_i (\vec{x}_n )\} \Big)} {\partial h_k (\vec{x})} \\ &+ \int \cdots \int d \vec{x}_1 d \vec{x}_3 \cdots d \vec{x}_n ~ \frac{\partial f \Big(\{h_i (\vec{x}_1)\} , \{h_i (\vec{x} )\} , \cdots \{h_i (\vec{x}_n )\} \Big)} {\partial h_k (\vec{x})} \\ &+ \cdots + \int \cdots \int d \vec{x}_1 \cdots d \vec{x}_{n-1} ~ \frac{\partial f \Big(\{h_i (\vec{x}_1)\} , \{h_i (\vec{x}_2 )\} , \cdots \{h_i (\vec{x} )\} \Big)} {\partial h_k (\vec{x})} \Bigg] , \end{split}$$
or, more compactly,
$$\begin{split} \delta S = \int d \vec{x} ~ \delta h_k (\vec{x}) \Bigg[ &\int \prod_{j \neq 1} d \vec{x}_j \frac{\partial f} {\partial h_k (\vec{x}_1)} \bigg|_{\vec{x}_1 = \vec{x}} + \int \prod_{j \neq 2} d \vec{x}_j \frac{\partial f} {\partial h_k (\vec{x}_2)} \bigg|_{\vec{x}_2 = \vec{x}} \\ &+ \cdots + \int \prod_{j \neq n} d \vec{x}_j \frac{\partial f} {\partial h_k (\vec{x}_n)} \bigg|_{\vec{x}_n = \vec{x}} \Bigg] . \end{split}$$
Continuing with similar procedures, we can optimize quantities which are given by multiple integrations.
### Determinant, or double derivative
From the second derivative
$$\frac{d^2 f}{d \tau^2}$$
along the allowed movements, we can determine whether the point is a minimum, a maximum, or just a stationary point.
## Method of Lagrange multipliers with complex variables
### Complex variables
Complex variables are quite common in physics, and sometimes using complex variables makes problems much simpler. Let's consider the optimization problem of minimizing (or maximizing) $f(\{z_k\})$ subject to $g_l (\{z_k\})=$const for all $l$, where $f(\{z_k\})$ and the $g_l (\{z_k\})$'s are real functions composed of complex variables $z_k$. Expressing each $z_k$ with two real numbers,
$$z_k = a_k + i b_k ,$$
$f$ and the $g_l$'s can be expressed with these real numbers only.
$$\begin{split} &f(\{z_k\}) = f(\{a_k, b_k\}) , \\ &g_l (\{z_k\}) = g_l (\{a_k, b_k\}) . \end{split}$$
Then this problem becomes equivalent to one with multiple real variables, whose number is twice the number of complex variables. So for all variables,
$$\frac{\partial f}{\partial a_k} - \sum_l \lambda_{l} \frac{\partial g_{l}}{\partial a_k} = 0 \qquad \textrm{and} \qquad \frac{\partial f} {\partial b_k} - \sum_l \lambda_{l} \frac{\partial g_{l}} {\partial b_k} = 0$$
should be satisfied at extreme (minimum, maximum, possibly saddle, and any other stationary) points. Expressing $f(\{a_k, b_k\})$ as
$$f(\{a_k, b_k\}) = f(\{\frac{z_k+z_k^*}{2}, \frac{z_k-z_k^*}{2i}\}) ,$$
we can define
$$\frac{\partial f}{\partial z_k} = \frac{1}{2} \frac{\partial f}{\partial a_k} + \frac{1}{2i} \frac{\partial f}{\partial b_k} \qquad \textrm{and} \qquad \frac{\partial f}{\partial z_k^*} = \frac{1}{2} \frac{\partial f}{\partial a_k} - \frac{1}{2i} \frac{\partial f}{\partial b_k} .$$
Also, we can easily find that
$$\frac{\partial f} {\partial a_k} = \frac{\partial f} {\partial z_k} + \frac{\partial f} {\partial z_k^*} \qquad \textrm{and} \qquad \frac{\partial f} {\partial b_k} = i \frac{\partial f} {\partial z_k} - i \frac{\partial f} {\partial z_k^*} ,$$
which is consistent with the above definitions.
As the function $f$ and the constraints $g_l$ are always real, and therefore the derivatives $\frac{\partial f}{\partial a_k}$, $\frac{\partial f}{\partial b_k}$, $\frac{\partial g_l}{\partial a_k}$, and $\frac{\partial g_l}{\partial b_k}$ are always real too, the above two real equations can be combined into one complex equation
$$\begin{split} &\frac{\partial f} {\partial z_k^*} - \sum_l \lambda_l \frac{\partial g_l} {\partial z_k^*} \\ &= \frac{1}{2} \frac{\partial f} {\partial a_k} - \frac{1}{2i} \frac{\partial f} {\partial b_k} - \sum_l \lambda_l \bigg(\frac{1}{2} \frac{\partial g_l} {\partial a_k} - \frac{1}{2i} \frac{\partial g_l} {\partial b_k} \bigg) \\ &= \frac{1}{2} \bigg(\frac{\partial f} {\partial a_k} - \sum_l \lambda_l \frac{\partial g_l} {\partial a_k} \bigg) - \frac{1}{2i} \bigg(\frac{\partial f} {\partial b_k} - \sum_l \lambda_l \frac{\partial g_l} {\partial b_k} \bigg) = 0 . \end{split}$$
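The combined complex condition can be tried out on a small case. The sketch below is an illustration under assumptions not taken from the text: one complex variable, $f = |z - 1|^2$, the real constraint $g = |z|^2 = 4$, and a helper `d_conj` that estimates $\partial / \partial z^*$ by finite differences using the definition $\frac{1}{2} \frac{\partial}{\partial a} - \frac{1}{2i} \frac{\partial}{\partial b}$. At the stationary points $z = \pm 2$ the ratio of the two derivatives is a real number $\lambda$, while at another point on the constraint, $z = 2i$, it is not.

```python
# Check of  df/dz* = lambda * dg/dz*  with a real multiplier lambda.
# Illustrative problem (an assumption): f = |z - 1|^2 subject to g = |z|^2 = 4.

def f(z):
    return abs(z - 1.0) ** 2

def g(z):
    return abs(z) ** 2

def d_conj(func, z, step=1e-6):
    """Finite-difference estimate of d func / d z* = (1/2) d/da - (1/(2i)) d/db, with z = a + i b."""
    d_a = (func(z + step) - func(z - step)) / (2.0 * step)
    d_b = (func(z + 1j * step) - func(z - 1j * step)) / (2.0 * step)
    return 0.5 * d_a - d_b / 2j

def multiplier_ratio(z):
    """Ratio (df/dz*) / (dg/dz*); it is a real number only at stationary points."""
    return d_conj(f, z) / d_conj(g, z)

ratio_at_plus_2 = multiplier_ratio(2.0 + 0.0j)    # stationary point (minimum of f on the circle)
ratio_at_minus_2 = multiplier_ratio(-2.0 + 0.0j)  # stationary point (maximum of f on the circle)
ratio_at_2i = multiplier_ratio(2.0j)              # on the circle, but not stationary

print("z =  2 :", ratio_at_plus_2)
print("z = -2 :", ratio_at_minus_2)
print("z = 2i :", ratio_at_2i)
```

For this example the derivatives are $\partial f / \partial z^* = z - 1$ and $\partial g / \partial z^* = z$, so the ratios can also be read off by hand: $1/2$, $3/2$, and $1 + i/2$.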
Then the extreme points can be found with
$$\frac{\partial f} {\partial z_k^*} = \sum_l \lambda_l \frac{\partial g_l} {\partial z_k^*} \qquad \textrm{or} \qquad \frac{\partial f} {\partial z_k} = \sum_l \lambda_l \frac{\partial g_l} {\partial z_k}$$
with real Lagrange multipliers $\lambda_l$. The condition that $f$ and the $g_l$'s are always real is crucial in this step. Now suppose instead that $g_l$ can be complex, i.e. $g_l$ is a complex constraint which confines both its real part and its imaginary part:
$$g_l = g_l^{(R)} + i g_l^{(I)} = \textrm{const} .$$
// You might think that we can simply take complex $\lambda_l$'s. But this does not give the correct answer. As a simple check, let's think about the case with a single complex constraint on real variables $x_i$: $$\begin{split} &\frac{\partial g(\{x_i\})} {\partial x_l} \delta x_l + \sum_{k \neq l} \frac{\partial g(\{x_i\})} {\partial x_k} \delta x_k = 0 . \\ &\delta x_l = - \frac{\sum_{k \neq l} \partial_{x_k} \big[ g(\{x_i\}) \big] \delta x_k} {\partial_{x_l} \big[ g(\{x_i\}) \big]} . \end{split}$$ Can we make only $\delta x_l$ determined by the others, so that the other $\delta x_k$'s can be fully arbitrary? The answer is NO, because the variables must be purely real. One complex constraint actually corresponds to two real constraints, so two variables must be determined by, or dependent on, the others. The $\delta x_k$'s other than the single $\delta x_l$ cannot all be arbitrary, since the combination $$- \frac{\sum_{k \neq l} \partial_{x_k} \big[ g(\{x_i\}) \big] \delta x_k} {\partial_{x_l} \big[ g(\{x_i\}) \big]}$$ must give a real quantity $\delta x_l$. This is the additional constraint confining the $\delta x_k$'s.
Therefore one complex constraint becomes two real constraints.
$$\begin{split} &g_l^{(R)} = \frac{1}{2} \big[g_l + g_l^* \big] , \\ &g_l^{(I)} = \frac{1}{2i} \big[g_l - g_l^* \big] . \end{split}$$
Then
$$\begin{split} \frac{\partial f} {\partial x_k} &= \sum_l \bigg[\lambda_l^{(R)} \frac{\partial g_l^{(R)}} {\partial x_k} + \lambda_l^{(I)} \frac{\partial g_l^{(I)}} {\partial x_k} \bigg] \\ &= \sum_l \bigg[\frac{\lambda_l^{(R)}} {2} \Big[\frac{\partial g_l} {\partial x_k} + \frac{\partial g_l^*} {\partial x_k} \Big] + \frac{\lambda_l^{(I)}} {2i} \Big[\frac{\partial g_l} {\partial x_k} - \frac{\partial g_l^*} {\partial x_k} \Big] \bigg] \\ &= \sum_l \bigg[\Big[\frac{\lambda_l^{(R)}} {2} + \frac{\lambda_l^{(I)}} {2i} \Big] \frac{\partial g_l} {\partial x_k} + \Big[\frac{\lambda_l^{(R)}} {2} - \frac{\lambda_l^{(I)}} {2i} \Big] \frac{\partial g_l^*} {\partial x_k} \bigg] . \end{split}$$
As $\lambda_l^{(R)}$ and $\lambda_l^{(I)}$ are purely real, introducing the complex Lagrange multipliers $\lambda_l \equiv \frac{\lambda_l^{(R)}} {2} + \frac{\lambda_l^{(I)}} {2i}$ gives
$$\frac{\partial f} {\partial x_k} = \sum_l \bigg[\lambda_l \frac{\partial g_l} {\partial x_k} + \lambda_l^* \frac{\partial g_l^*} {\partial x_k} \bigg] .$$
Then the compact complex form is
$$\frac{\partial f} {\partial z_k^*} = \sum_l \bigg[\lambda_l \frac{\partial g_l} {\partial z_k^*} + \lambda_l^* \frac{\partial g_l^*} {\partial z_k^*} \bigg]$$
or
$$\frac{\partial f} {\partial z_k} = \sum_l \bigg[\lambda_l \frac{\partial g_l} {\partial z_k} + \lambda_l^* \frac{\partial g_l^*} {\partial z_k} \bigg] .$$
Each equation is the complex conjugate of the other, i.e. they are equivalent.
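A case with a genuinely complex constraint can be worked through by hand and then checked. The sketch below assumes, for illustration only, two complex variables with $f = |z_1|^2 + |z_2|^2$ and the single complex constraint $g = z_1 + i z_2 = 1$. Here $\partial f / \partial z_k^* = z_k$, $\partial g / \partial z_k^* = 0$, $\partial g^* / \partial z_1^* = 1$, and $\partial g^* / \partial z_2^* = -i$, so the condition gives $z_1 = \lambda^*$ and $z_2 = -i \lambda^*$, and the constraint then fixes $\lambda = 1/2$.

```python
# Check of  df/dz_k* = lambda * dg/dz_k* + lambda^* * dg^*/dz_k*  for a complex constraint.
# Illustrative problem (an assumption): f = |z1|^2 + |z2|^2 subject to g = z1 + i z2 = 1.

def f(z1, z2):
    return abs(z1) ** 2 + abs(z2) ** 2

def g(z1, z2):
    return z1 + 1j * z2

# Candidate solution and multiplier obtained by hand from the condition above.
lam = 0.5 + 0.0j
z = (0.5 + 0.0j, -0.5j)

# Analytic Wirtinger derivatives at the candidate point.
df_dconj = (z[0], z[1])        # d f / d z_k*   = z_k
dg_dconj = (0.0, 0.0)          # d g / d z_k*   = 0, since g depends on z_k only
dgstar_dconj = (1.0, -1j)      # d g^* / d z_k*, with g^* = z1* - i z2*

residuals = [
    df_dconj[k] - (lam * dg_dconj[k] + lam.conjugate() * dgstar_dconj[k])
    for k in range(2)
]

# Deviations that keep the constraint: (z1 + t, z2 + i t) leaves g unchanged for any complex t.
trial_shifts = [0.1 + 0.0j, 0.1j, -0.2 + 0.3j]
shifted_values = [f(z[0] + t, z[1] + 1j * t) for t in trial_shifts]
constraint_kept = all(abs(g(z[0] + t, z[1] + 1j * t) - 1.0) < 1e-12 for t in trial_shifts)

print("residuals:", residuals)
print("f at candidate:", f(*z), " f at shifted points:", shifted_values)
```

All of the allowed shifts increase $f$ above its value $1/2$ at the candidate point, which is consistent with this stationary point being the constrained minimum in this example.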
How about the case when $f$ can be complex? In this case, finding a minimum/maximum is meaningless. Still, how can we find the stationarity condition for $f$? To be stationary, both the real part and the imaginary part of $f$ should be stationary. So
$$\frac{\partial \big(\Re (f) \big)} {\partial z_k^*} = \sum_l \bigg[\lambda_l \frac{\partial g_l} {\partial z_k^*} + \lambda_l^* \frac{\partial g_l^*} {\partial z_k^*} \bigg]$$
and
$$\frac{\partial \big(\Im (f) \big)} {\partial z_k^*} = \sum_l \bigg[\kappa_l \frac{\partial g_l} {\partial z_k^*} + \kappa_l^* \frac{\partial g_l^*} {\partial z_k^*} \bigg] .$$
There are two sets of equations. Combining the above two equations,
$$\frac{\partial f} {\partial z_k^*} = \sum_l \bigg[(\lambda_l +i \kappa_l) \frac{\partial g_l} {\partial z_k^*} + (\lambda_l^* +i \kappa_l^*) \frac{\partial g_l^*} {\partial z_k^*} \bigg]$$
and
$$\frac{\partial f^*} {\partial z_k^*} = \sum_l \bigg[(\lambda_l -i \kappa_l) \frac{\partial g_l} {\partial z_k^*} + (\lambda_l^* -i \kappa_l^*) \frac{\partial g_l^*} {\partial z_k^*} \bigg] .$$
Defining new complex multipliers $\alpha_l \equiv \lambda_l + i \kappa_l$ and $\beta_l \equiv \lambda_l^* + i \kappa_l^*$,
$$\frac{\partial f} {\partial z_k^*} = \sum_l \bigg[\alpha_l \frac{\partial g_l} {\partial z_k^*} + \beta_l \frac{\partial g_l^*} {\partial z_k^*} \bigg]$$
and
$$\frac{\partial f^*} {\partial z_k^*} = \sum_l \bigg[\beta_l^* \frac{\partial g_l} {\partial z_k^*} + \alpha_l^* \frac{\partial g_l^*} {\partial z_k^*} \bigg] .$$
These two complex equations are different, i.e. one equation does not follow from the other, so both equations must be satisfied. Another form is
$$\frac{\partial f} {\partial z_k^*} = \sum_l \bigg[\alpha_l \frac{\partial g_l} {\partial z_k^*} + \beta_l \frac{\partial g_l^*} {\partial z_k^*} \bigg]$$
and
$$\frac{\partial f} {\partial z_k} = \sum_l \bigg[\alpha_l \frac{\partial g_l} {\partial z_k} + \beta_l \frac{\partial g_l^*} {\partial z_k} \bigg]$$
with complex Lagrange multipliers $\alpha_l$ and $\beta_l$.
This complex expression gives a simpler form of the result. However, this is not the only form of the answer. Sometimes a representation in separate real variables, especially the polar representation of complex variables, tells us more, and a transformation of variables can give a better perspective on the answer. If we transform the variables from $\{x_k\}$ to $\{u_l\}$,
$$\frac{\partial f(\{x_i\})} {\partial x_k} - \sum_{j=1}^{n} \lambda_{j} \frac{\partial g_j (\{x_i\})} {\partial x_k} = 0$$
becomes
$$\begin{split} &\sum_{k} \frac{\partial x_k} {\partial u_l} \bigg[\frac{\partial f(\{x_i\})} {\partial x_k} - \sum_{j=1}^{n} \lambda_{j} \frac{\partial g_j (\{x_i\})} {\partial x_k} \bigg] = 0 , \\ &\frac{\partial f(\{u_i\})} {\partial u_l} - \sum_{j=1}^{n} \lambda_{j} \frac{\partial g_j (\{u_i\})} {\partial u_l} = 0 \end{split}$$
with the same Lagrange multipliers. As an example, when we use the polar representation for complex variables, i.e.
$$z_k = \xi_k e^{i \theta_k} ,$$
we have
$$\frac{\partial} {\partial \xi_k} = e^{i \theta_k} \frac{\partial} {\partial z_k} + e^{-i \theta_k} \frac{\partial} {\partial z_k^*}$$
and
$$\begin{split} \frac{\partial} {\partial \theta_k} &= i \xi_k e^{i \theta_k} \frac{\partial} {\partial z_k} - i \xi_k e^{-i \theta_k} \frac{\partial} {\partial z_k^*} \\ &= i z_k \frac{\partial} {\partial z_k} - i z_k^* \frac{\partial} {\partial z_k^*} . \end{split}$$
Then
$$\frac{\partial} {\partial z_k} = \frac{e^{-i \theta_k}} {2} \bigg[\frac{\partial} {\partial \xi_k} + \frac{1} {i \xi_k} \frac{\partial} {\partial \theta_k} \bigg]$$
and
$$\frac{\partial} {\partial z_k^*} = \frac{e^{i \theta_k}} {2} \bigg[\frac{\partial} {\partial \xi_k} - \frac{1} {i \xi_k} \frac{\partial} {\partial \theta_k} \bigg] .$$
Therefore \frac{\partial f} {\partial \xi_k} = \sum_l \bigg[\lambda_l \frac{\partial g_l} {\partial \xi_k} + \lambda_l^* \frac{\partial g_l^*} {\partial \xi_k} \bigg] \quad \textrm{and} \quad \frac{\partial f} {\partial \theta_k} = \sum_l \bigg[\lambda_l \frac{\partial g_l} {\partial \theta_k} + \lambda_l^* \frac{\partial g_l^*} {\partial \theta_k} \bigg] . This sometimes gives us more information or better perspectives about the answer. ### Functional complex variables When $S$ is expressed with functional complex variables, similar approaches are possible. S = \int d \vec{x} ~ f \big(\{z_k (\vec{x}) \} \big) . z_k (\vec{x}) = a_k (\vec{x}) + i b_k (\vec{x}) where $a_k (\vec{x})$ and $b_k (\vec{x})$ are pure real function. f \big(\{z_k (\vec{x}) \} \big) = f \big(\{a_k (\vec{x}) , b_k (\vec{x}) \} \big) . Since \begin{split} &\frac{\partial f} {\partial a_k (\vec{x})} = \frac{\partial f} {\partial z_k (\vec{x})} + \frac{\partial f} {\partial z_k^* (\vec{x})} , \\ &\frac{\partial f} {\partial b_k (\vec{x})} = i \frac{\partial f} {\partial z_k (\vec{x})} - i \frac{\partial f} {\partial z_k^* (\vec{x})} , \\ &\frac{\partial f} {\partial \big(\partial_{\mu} a_k (\vec{x}) \big)} = \frac{\partial f} {\partial \big(\partial_{\mu} z_k (\vec{x}) \big)} + \frac{\partial f} {\partial \big(\partial_{\mu} z_k^* (\vec{x}) \big)} , \\ &\frac{\partial f} {\partial \big(\partial_{\mu} b_k (\vec{x}) \big)} = i \frac{\partial f} {\partial \big(\partial_{\mu} z_k (\vec{x}) \big)} - i \frac{\partial f} {\partial \big(\partial_{\mu} z_k^* (\vec{x}) \big)} , \textrm{ and so on,} \end{split} then \begin{split} &\frac{\partial f} {\partial z_k (\vec{x})} = \frac{1}{2} \bigg[\frac{\partial f} {\partial a_k (\vec{x})} + \frac{1}{i} \frac{\partial f} {\partial b_k (\vec{x})} \bigg] , \\ &\frac{\partial f} {\partial z_k^* (\vec{x})} = \frac{1}{2} \bigg[\frac{\partial f} {\partial a_k (\vec{x})} - \frac{1}{i} \frac{\partial f} {\partial b_k (\vec{x})} \bigg] , \\ &\frac{\partial f} {\partial 
\big(\partial_{\mu} z_k (\vec{x}) \big)} = \frac{1}{2} \bigg[\frac{\partial f} {\partial \big(\partial_{\mu} a_k (\vec{x}) \big)} + \frac{1}{i} \frac{\partial f} {\partial \big(\partial_{\mu} b_k (\vec{x}) \big)} \bigg] , \\ &\frac{\partial f} {\partial \big(\partial_{\mu} z_k^* (\vec{x}) \big)} = \frac{1}{2} \bigg[\frac{\partial f} {\partial \big(\partial_{\mu} a_k (\vec{x}) \big)} - \frac{1}{i} \frac{\partial f} {\partial \big(\partial_{\mu} b_k (\vec{x}) \big)} \bigg] , \textrm{ and so on.} \end{split} Therefore \begin{split} &\frac{\partial f} {\partial z_k^* (\vec{x})} - \sum_{\mu} \frac{\partial} {\partial x^{\mu}} \Big[\frac{\partial f} {\partial \big(\partial_{\mu} z_k^* (\vec{x}) \big) } \Big] + \sum_{\mu, \nu} \frac{\partial^2} {\partial x^{\nu} \partial x^{\mu}} \Big[\frac{\partial f} {\partial \big(\partial_{\mu, \nu} z_k^* (\vec{x}) \big)} \Big] + \cdots \\ &= \sum_{l \in \textrm{type1}} \lambda_l \bigg[\frac{\partial g_l} {\partial z_k^* (\vec{x})} - \sum_{\mu} \frac{\partial} {\partial x^{\mu}} \Big[\frac{\partial g_l} {\partial \big(\partial_{\mu} z_k^* (\vec{x}) \big) } \Big] + \sum_{\mu, \nu} \frac{\partial^2} {\partial x^{\nu} \partial x^{\mu}} \Big[\frac{\partial g_l} {\partial \big(\partial_{\mu, \nu} z_k^* (\vec{x}) \big)} \Big] + \cdots \bigg] \\ &+ \sum_{l \in \textrm{type2}} \bigg[\lambda_l (x^{\kappa}) \frac{\partial g_l} {\partial z_k^* (\vec{x})} - \sum_{\mu} \frac{\partial} {\partial x^{\mu}} \Big[\lambda_l (x^{\kappa}) \frac{\partial g_l} {\partial \big(\partial_{\mu} z_k^* (\vec{x}) \big) } \Big] + \sum_{\mu, \nu} \frac{\partial^2} {\partial x^{\nu} \partial x^{\mu}} \Big[\lambda_l (x^{\kappa}) \frac{\partial g_l} {\partial \big(\partial_{\mu, \nu} z_k^* (\vec{x}) \big)} \Big] + \cdots \bigg] \end{split}
or the corresponding equation with derivatives with respect to $z_k (\vec{x})$.
When constraints are given by complex form, \begin{split} &\frac{\partial f} {\partial z_k^* (\vec{x})} - \sum_{\mu} \frac{\partial} {\partial x^{\mu}} \Big[\frac{\partial f} {\partial \big(\partial_{\mu} z_k^* (\vec{x}) \big) } \Big] + \sum_{\mu, \nu} \frac{\partial^2} {\partial x^{\nu} \partial x^{\mu}} \Big[\frac{\partial f} {\partial \big(\partial_{\mu, \nu} z_k^* (\vec{x}) \big)} \Big] + \cdots \\ &= \sum_{l \in \textrm{type1}} \lambda_l \bigg[\frac{\partial g_l} {\partial z_k^* (\vec{x})} - \sum_{\mu} \frac{\partial} {\partial x^{\mu}} \Big[\frac{\partial g_l} {\partial \big(\partial_{\mu} z_k^* (\vec{x}) \big) } \Big] + \sum_{\mu, \nu} \frac{\partial^2} {\partial x^{\nu} \partial x^{\mu}} \Big[\frac{\partial g_l} {\partial \big(\partial_{\mu, \nu} z_k^* (\vec{x}) \big)} \Big] + \cdots \bigg] \\ &+ \sum_{l \in \textrm{type1}} \lambda_l^* \bigg[\frac{\partial g_l^*} {\partial z_k^* (\vec{x})} - \sum_{\mu} \frac{\partial} {\partial x^{\mu}} \Big[\frac{\partial g_l^*} {\partial \big(\partial_{\mu} z_k^* (\vec{x}) \big) } \Big] + \sum_{\mu, \nu} \frac{\partial^2} {\partial x^{\nu} \partial x^{\mu}} \Big[\frac{\partial g_l^*} {\partial \big(\partial_{\mu, \nu} z_k^* (\vec{x}) \big)} \Big] + \cdots \bigg] \\ &+ \sum_{l \in \textrm{type2}} \bigg[\lambda_l (x^{\kappa}) \frac{\partial g_l} {\partial z_k^* (\vec{x})} - \sum_{\mu} \frac{\partial} {\partial x^{\mu}} \Big[\lambda_l (x^{\kappa}) \frac{\partial g_l} {\partial \big(\partial_{\mu} z_k^* (\vec{x}) \big) } \Big] + \sum_{\mu, \nu} \frac{\partial^2} {\partial x^{\nu} \partial x^{\mu}} \Big[\lambda_l (x^{\kappa}) \frac{\partial g_l} {\partial \big(\partial_{\mu, \nu} z_k^* (\vec{x}) \big)} \Big] + \cdots \bigg] \\ &+ \sum_{l \in \textrm{type2}} \bigg[\lambda_l^* (x^{\kappa}) \frac{\partial g_l^*} {\partial z_k^* (\vec{x})} - \sum_{\mu} \frac{\partial} {\partial x^{\mu}} \Big[\lambda_l^* (x^{\kappa}) \frac{\partial g_l^*} {\partial \big(\partial_{\mu} z_k^* (\vec{x}) \big) } \Big] + \sum_{\mu, \nu} 
\frac{\partial^2} {\partial x^{\nu} \partial x^{\mu}} \Big[\lambda_l^* (x^{\kappa}) \frac{\partial g_l^*} {\partial \big(\partial_{\mu, \nu} z_k^* (\vec{x}) \big)} \Big] + \cdots \bigg] \end{split} or derivatives by $z_k (\vec{x})$. When additionally $f$ is complex function, \begin{split} &\frac{\partial f} {\partial z_k^* (\vec{x})} - \sum_{\mu} \frac{\partial} {\partial x^{\mu}} \Big[\frac{\partial f} {\partial \big(\partial_{\mu} z_k^* (\vec{x}) \big) } \Big] + \sum_{\mu, \nu} \frac{\partial^2} {\partial x^{\nu} \partial x^{\mu}} \Big[\frac{\partial f} {\partial \big(\partial_{\mu, \nu} z_k^* (\vec{x}) \big)} \Big] + \cdots \\ &= \sum_{l \in \textrm{type1}} \alpha_l \bigg[\frac{\partial g_l} {\partial z_k^* (\vec{x})} - \sum_{\mu} \frac{\partial} {\partial x^{\mu}} \Big[\frac{\partial g_l} {\partial \big(\partial_{\mu} z_k^* (\vec{x}) \big) } \Big] + \sum_{\mu, \nu} \frac{\partial^2} {\partial x^{\nu} \partial x^{\mu}} \Big[\frac{\partial g_l} {\partial \big(\partial_{\mu, \nu} z_k^* (\vec{x}) \big)} \Big] + \cdots \bigg] \\ &+ \sum_{l \in \textrm{type1}} \beta_l \bigg[\frac{\partial g_l^*} {\partial z_k^* (\vec{x})} - \sum_{\mu} \frac{\partial} {\partial x^{\mu}} \Big[\frac{\partial g_l^*} {\partial \big(\partial_{\mu} z_k^* (\vec{x}) \big) } \Big] + \sum_{\mu, \nu} \frac{\partial^2} {\partial x^{\nu} \partial x^{\mu}} \Big[\frac{\partial g_l^*} {\partial \big(\partial_{\mu, \nu} z_k^* (\vec{x}) \big)} \Big] + \cdots \bigg] \\ &+ \sum_{l \in \textrm{type2}} \bigg[\alpha_l (x^{\kappa}) \frac{\partial g_l} {\partial z_k^* (\vec{x})} - \sum_{\mu} \frac{\partial} {\partial x^{\mu}} \Big[\alpha_l (x^{\kappa}) \frac{\partial g_l} {\partial \big(\partial_{\mu} z_k^* (\vec{x}) \big) } \Big] + \sum_{\mu, \nu} \frac{\partial^2} {\partial x^{\nu} \partial x^{\mu}} \Big[\alpha_l (x^{\kappa}) \frac{\partial g_l} {\partial \big(\partial_{\mu, \nu} z_k^* (\vec{x}) \big)} \Big] + \cdots \bigg] \\ &+ \sum_{l \in \textrm{type2}} \bigg[\beta_l (x^{\kappa}) \frac{\partial g_l^*} 
{\partial z_k^* (\vec{x})} - \sum_{\mu} \frac{\partial} {\partial x^{\mu}} \Big[\beta_l (x^{\kappa}) \frac{\partial g_l^*} {\partial \big(\partial_{\mu} z_k^* (\vec{x}) \big) } \Big] + \sum_{\mu, \nu} \frac{\partial^2} {\partial x^{\nu} \partial x^{\mu}} \Big[\beta_l (x^{\kappa}) \frac{\partial g_l^*} {\partial \big(\partial_{\mu, \nu} z_k^* (\vec{x}) \big)} \Big] + \cdots \bigg] \end{split} and \begin{split} &\frac{\partial f} {\partial z_k (\vec{x})} - \sum_{\mu} \frac{\partial} {\partial x^{\mu}} \Big[\frac{\partial f} {\partial \big(\partial_{\mu} z_k (\vec{x}) \big) } \Big] + \sum_{\mu, \nu} \frac{\partial^2} {\partial x^{\nu} \partial x^{\mu}} \Big[\frac{\partial f} {\partial \big(\partial_{\mu, \nu} z_k (\vec{x}) \big)} \Big] + \cdots \\ &= \sum_{l \in \textrm{type1}} \alpha_l \bigg[\frac{\partial g_l} {\partial z_k (\vec{x})} - \sum_{\mu} \frac{\partial} {\partial x^{\mu}} \Big[\frac{\partial g_l} {\partial \big(\partial_{\mu} z_k (\vec{x}) \big) } \Big] + \sum_{\mu, \nu} \frac{\partial^2} {\partial x^{\nu} \partial x^{\mu}} \Big[\frac{\partial g_l} {\partial \big(\partial_{\mu, \nu} z_k (\vec{x}) \big)} \Big] + \cdots \bigg] \\ &+ \sum_{l \in \textrm{type1}} \beta_l \bigg[\frac{\partial g_l^*} {\partial z_k (\vec{x})} - \sum_{\mu} \frac{\partial} {\partial x^{\mu}} \Big[\frac{\partial g_l^*} {\partial \big(\partial_{\mu} z_k (\vec{x}) \big) } \Big] + \sum_{\mu, \nu} \frac{\partial^2} {\partial x^{\nu} \partial x^{\mu}} \Big[\frac{\partial g_l^*} {\partial \big(\partial_{\mu, \nu} z_k (\vec{x}) \big)} \Big] + \cdots \bigg] \\ &+ \sum_{l \in \textrm{type2}} \bigg[\alpha_l (x^{\kappa}) \frac{\partial g_l} {\partial z_k (\vec{x})} - \sum_{\mu} \frac{\partial} {\partial x^{\mu}} \Big[\alpha_l (x^{\kappa}) \frac{\partial g_l} {\partial \big(\partial_{\mu} z_k (\vec{x}) \big) } \Big] + \sum_{\mu, \nu} \frac{\partial^2} {\partial x^{\nu} \partial x^{\mu}} \Big[\alpha_l (x^{\kappa}) \frac{\partial g_l} {\partial \big(\partial_{\mu, \nu} z_k (\vec{x}) \big)} \Big] + 
\cdots \bigg] \\ &+ \sum_{l \in \textrm{type2}} \bigg[\beta_l (x^{\kappa}) \frac{\partial g_l^*} {\partial z_k (\vec{x})} - \sum_{\mu} \frac{\partial} {\partial x^{\mu}} \Big[\beta_l (x^{\kappa}) \frac{\partial g_l^*} {\partial \big(\partial_{\mu} z_k (\vec{x}) \big) } \Big] + \sum_{\mu, \nu} \frac{\partial^2} {\partial x^{\nu} \partial x^{\mu}} \Big[\beta_l (x^{\kappa}) \frac{\partial g_l^*} {\partial \big(\partial_{\mu, \nu} z_k (\vec{x}) \big)} \Big] + \cdots \bigg] . \end{split}
##[.hiden] Approaching method to minimum/maximum from initial guess
Although we have found the conditions that hold at extreme points, we cannot solve for those points directly except in some simple cases. So here I introduce a technique for approaching the minimum or maximum point from an initial guess.
// This method does not guarantee that the point approaches a stationary point.
### Multiple variables and multiple constraints
From the given condition of extrema, we can guess the minimum or maximum point. But this does not always give the exact minimum/maximum point. So we introduce $\tau$, which I call pseudo time, along which the point or state approaches the minimum/maximum. Considering \frac{d f} {d \tau} = \sum_k \frac{\partial f} {\partial x_k} \frac{d x_k} {d \tau} , let's minimize/maximize the rate of change $d f / d \tau$ under the constraints \sum_k \frac{\partial g_l} {\partial x_k} \frac{d x_k} {d \tau} = 0 which means the point moves inside the given constraints, and \sum_k \big(\frac{d x_k} {d \tau} \big)^2 = \textrm{const} which means the overall rate of change of the variables is limited. As we minimize/maximize the rate of change $df/d\tau$, the point will eventually arrive at a neighboring local minimum/maximum. Here we can fix some variables, i.e. $d x_l / d \tau = 0$ for some $l$'s, because the point still approaches a point which has a smaller/larger $f(\{x_k\})$ value than the current point. Then using the method of Lagrange multipliers, \frac{\partial f} {\partial x_k} = \sum_l \lambda_l \frac{\partial g_l} {\partial x_k} + 2 \lambda \frac{d x_k} {d \tau} . As the value $df/d\tau$ always has a minimum and a maximum, this stationary condition gives a correct answer. So \frac{d x_k} {d \tau} = \frac{1}{2 \lambda} \bigg[\frac{\partial f} {\partial x_k} - \sum_l \lambda_l \frac{\partial g_l} {\partial x_k} \bigg] . To approach a minimum (maximum), $\lambda$ should be negative (positive). And as $\lambda$ is introduced only to satisfy the condition $\sum_k (d x_k / d \tau)^2 =$const, it can be any arbitrary nonzero number when we do not care about keeping this constant fixed throughout the pseudo time.
Therefore to find the minimum/maximum point, we use \frac{d x_k} {d \tau} = \mp \alpha(\tau) \bigg[\frac{\partial f} {\partial x_k} - \sum_l \lambda_l \frac{\partial g_l} {\partial x_k} \bigg] where $\alpha(\tau)$ is an arbitrary positive number which will be decided in any convenient way. Then the change becomes \frac{d f} {d \tau} = \mp \alpha(\tau) \sum_k \bigg[\Big(\frac{\partial f} {\partial x_k} \Big)^2 - \sum_l \lambda_l \frac{\partial g_l} {\partial x_k} \frac{\partial f} {\partial x_k} \bigg] . Here $\lambda_l$'s must be chosen to satisfy the constraints \sum_k \frac{\partial g_j} {\partial x_k} \frac{d x_k} {d \tau} = 0 . As $\alpha(\tau) \neq 0$, \sum_k \bigg[\frac{\partial f} {\partial x_k} \frac{\partial g_j} {\partial x_k} - \sum_l \lambda_l \frac{\partial g_l} {\partial x_k} \frac{\partial g_j} {\partial x_k} \bigg] = 0 . Introducing an inverse of symmetric matrix B_{lj} (= B_{jl}) \equiv \sum_k \frac{\partial g_l} {\partial x_k} \frac{\partial g_j} {\partial x_k} , i.e. \sum_{k,j} \frac{\partial g_l} {\partial x_k} \frac{\partial g_j} {\partial x_k} B^{-1}_{ji} = \delta_{li} , then \begin{split} &\sum_{k,j} \frac{\partial f} {\partial x_k} \frac{\partial g_j} {\partial x_k} B^{-1}_{ji} = \sum_l \lambda_l \sum_{k,j} \Big[\frac{\partial g_l} {\partial x_k} \frac{\partial g_j} {\partial x_k} \Big] B^{-1}_{ji} \\ &\sum_{k,j} \frac{\partial f} {\partial x_k} \frac{\partial g_j} {\partial x_k} B^{-1}_{ji} = \lambda_i . \end{split}
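The update rule above can be sketched numerically as follows. The example problem is an assumption chosen so that the exact answer is known: minimize $f = \sum_k c_k x_k^2$ under the single constraint $g = \sum_k x_k = 1$, for which the minimum is $x_k = (1/c_k) / \sum_j (1/c_j)$. The pseudo time is discretized with plain explicit Euler steps, and the step size, the number of steps and all function names are illustrative choices, not part of the original derivation.

```python
# Minimal sketch: dx_k/dtau = -alpha [df/dx_k - lambda dg/dx_k],
# with lambda = (grad f . grad g) B^{-1} and B = grad g . grad g (a 1x1 matrix here).
# Assumed example: f = sum_k c_k x_k^2, g = sum_k x_k = 1.

c = [1.0, 2.0, 4.0]          # assumed positive weights


def grad_f(x):
    return [2.0 * ck * xk for ck, xk in zip(c, x)]


def grad_g(x):
    return [1.0 for _ in x]


def step(x, alpha):
    df, dg = grad_f(x), grad_g(x)
    B = sum(gk * gk for gk in dg)                       # B_{11}
    lam = sum(fk * gk for fk, gk in zip(df, dg)) / B    # multiplier keeping dg/dtau = 0
    return [xk - alpha * (fk - lam * gk) for xk, fk, gk in zip(x, df, dg)]


x = [1.0, 0.0, 0.0]          # initial guess that already satisfies g = 1
for _ in range(500):
    x = step(x, 0.05)

inv = [1.0 / ck for ck in c]
x_exact = [v / sum(inv) for v in inv]   # known minimum of this example
print(x, x_exact)
```

Because the constraint in this example is linear, the Euler steps keep $g = 1$ up to rounding. For a nonlinear constraint, a finite step would drift away from the constraint surface, so some correction back onto it would be needed in practice.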
This approach actually depends on the coordinates (the variables chosen). So you should find appropriate coordinates to make the point approach the minimum/maximum quickly. In other words, even if the initial point is the same, it can approach the minimum point along different paths depending on which coordinates, or variables, you choose. The best way is to separate the free variables from the ones which show up in the constraints. Then the free variables can simply be moved by \frac{d x_k} {d\tau} = \mp \beta(\tau) \frac{\partial f} {\partial x_k} without any undetermined Lagrange multipliers.
### Complex variables and complex constraints
For complex variables, since \frac{\partial f} {\partial z_k^*} = \frac{1}{2} \bigg[\frac{\partial f}{\partial a_k} + i \frac{\partial f}{\partial b_k} \bigg] \quad \textrm{and} \quad \frac{\partial f} {\partial z_k} = \frac{1}{2} \bigg[\frac{\partial f}{\partial a_k} - i \frac{\partial f}{\partial b_k} \bigg] where $z_k = a_k + i b_k$,
\begin{split} \frac{d z_k} {d \tau} &= \frac{d a_k} {d \tau} + i \frac{d b_k} {d \tau} \\ &= \mp \alpha(\tau) \Bigg[\bigg[\frac{\partial f} {\partial a_k} - \sum_l \lambda_l \frac{\partial g_l} {\partial a_k} \bigg] + i \bigg[\frac{\partial f} {\partial b_k} - \sum_l \lambda_l \frac{\partial g_l} {\partial b_k} \bigg] \Bigg] \\ &= \mp \alpha(\tau) \Bigg[\bigg[\frac{\partial f} {\partial a_k} + i \frac{\partial f} {\partial b_k} \bigg] - \sum_l \lambda_l \bigg[\frac{\partial g_l} {\partial a_k} + i \frac{\partial g_l} {\partial b_k} \bigg] \Bigg] \\ &= \mp 2 \alpha(\tau) \bigg[\frac{\partial f} {\partial z_k^*} - \sum_{l} \lambda_l \frac{\partial g_l} {\partial z_k^*} \bigg] \end{split} or \begin{split} \frac{d z_k^*} {d \tau} &= \frac{d a_k} {d \tau} - i \frac{d b_k} {d \tau} \\ &= \mp 2 \alpha(\tau) \bigg[\frac{\partial f} {\partial z_k} - \sum_{l} \lambda_l \frac{\partial g_l} {\partial z_k} \bigg] . \end{split}
As the rate of change $df/d\tau$ becomes \frac{df} {d\tau} = \sum_k \bigg[\frac{\partial f} {\partial z_k} \frac{d z_k} {d\tau} + \frac{\partial f} {\partial z_k^*} \frac{d z_k^*} {d\tau} \bigg] , \begin{split} \frac{df} {d\tau} &= \mp 2 \alpha(\tau) \sum_k \bigg[\frac{\partial f} {\partial z_k} \frac{\partial f} {\partial z_k^*} - \frac{\partial f} {\partial z_k} \sum_l \lambda_l \frac{\partial g_l} {\partial z_k^*} + \frac{\partial f} {\partial z_k^*} \frac{\partial f} {\partial z_k} - \frac{\partial f} {\partial z_k^*} \sum_l \lambda_l \frac{\partial g_l} {\partial z_k} \bigg] \\ &= \mp 2 \alpha(\tau) \sum_k \bigg[2 \frac{\partial f} {\partial z_k} \frac{\partial f} {\partial z_k^*} - \sum_l \lambda_l \Big(\frac{\partial g_l} {\partial z_k^*} \frac{\partial f} {\partial z_k} + \frac{\partial g_l} {\partial z_k} \frac{\partial f} {\partial z_k^*} \Big) \bigg] . \end{split} And the constraints are \begin{split} \frac{d g_j}{d\tau} &= \sum_k \bigg[\frac{\partial g_j} {\partial z_k} \frac{d z_k} {d \tau} + \frac{\partial g_j} {\partial z_k^*} \frac{d z_k^*} {d \tau} \bigg] = 0 \\ &= \mp 2 \alpha(\tau) \sum_k \bigg[\frac{\partial f} {\partial z_k^*} \frac{\partial g_j} {\partial z_k} - \sum_l \lambda_l \frac{\partial g_l} {\partial z_k^*} \frac{\partial g_j} {\partial z_k} + \frac{\partial f} {\partial z_k} \frac{\partial g_j} {\partial z_k^*} - \sum_l \lambda_l \frac{\partial g_l} {\partial z_k} \frac{\partial g_j} {\partial z_k^*} \bigg] \\ &= \mp 2 \alpha(\tau) \sum_k \bigg[\frac{\partial f} {\partial z_k^*} \frac{\partial g_j} {\partial z_k} + \frac{\partial f} {\partial z_k} \frac{\partial g_j} {\partial z_k^*} - \sum_l \lambda_l \Big(\frac{\partial g_l} {\partial z_k^*} \frac{\partial g_j} {\partial z_k} + \frac{\partial g_l} {\partial z_k} \frac{\partial g_j} {\partial z_k^*} \Big) \bigg] . 
\end{split} Then \sum_k \bigg[\frac{\partial f} {\partial z_k^*} \frac{\partial g_j} {\partial z_k} + \frac{\partial f} {\partial z_k} \frac{\partial g_j} {\partial z_k^*} - \sum_l \lambda_l \Big(\frac{\partial g_l} {\partial z_k^*} \frac{\partial g_j} {\partial z_k} + \frac{\partial g_l} {\partial z_k} \frac{\partial g_j} {\partial z_k^*} \Big) \bigg] = 0 . Introducing an inverse of symmetric matrix B_{lj} (= B_{jl}) \equiv \sum_k \Big(\frac{\partial g_l} {\partial z_k^*} \frac{\partial g_j} {\partial z_k} + \frac{\partial g_l} {\partial z_k} \frac{\partial g_j} {\partial z_k^*} \Big) , i.e. \sum_{k,j} \Big(\frac{\partial g_l} {\partial z_k^*} \frac{\partial g_j} {\partial z_k} + \frac{\partial g_l} {\partial z_k} \frac{\partial g_j} {\partial z_k^*} \Big) B^{-1}_{ji} = \delta_{li} , then \begin{split} &\sum_{k,j} \bigg[\frac{\partial f} {\partial z_k^*} \frac{\partial g_j} {\partial z_k} + \frac{\partial f} {\partial z_k} \frac{\partial g_j} {\partial z_k^*} \bigg] B^{-1}_{ji} = \sum_{l} \lambda_l \sum_{k,j} \Big(\frac{\partial g_l} {\partial z_k^*} \frac{\partial g_j} {\partial z_k} + \frac{\partial g_l} {\partial z_k} \frac{\partial g_j} {\partial z_k^*} \Big) B^{-1}_{ji} \\ &\sum_{k,j} \bigg[\frac{\partial f} {\partial z_k^*} \frac{\partial g_j} {\partial z_k} + \frac{\partial f} {\partial z_k} \frac{\partial g_j} {\partial z_k^*} \bigg] B^{-1}_{ji} = \sum_{l} \lambda_l \delta_{li} \\ &\sum_{k,j} \bigg[\frac{\partial f} {\partial z_k^*} \frac{\partial g_j} {\partial z_k} + \frac{\partial f} {\partial z_k} \frac{\partial g_j} {\partial z_k^*} \bigg] B^{-1}_{ji} = \lambda_i . \\ \end{split}
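The complex form of the update rule and of the multiplier can be sketched in the same way. The example problem is again an assumption chosen for its known answer: minimize the real function $f = \sum_k c_k z_k z_k^*$ under the real constraint $g = \sum_k (z_k + z_k^*)/2 = 1$, so that $\partial f / \partial z_k^* = c_k z_k$ and $\partial g / \partial z_k^* = 1/2$. Since $f$ and $g$ are real, the derivatives with respect to $z_k$ are the complex conjugates of those with respect to $z_k^*$. All names, the step size and the number of steps are illustrative.

```python
# Minimal sketch: dz_k/dtau = -2 alpha [df/dz_k^* - lambda dg/dz_k^*],
# with lambda from the symmetric matrix B built from dg/dz_k and dg/dz_k^*.
# Assumed example: f = sum_k c_k |z_k|^2, g = sum_k Re(z_k) = 1.

c = [1.0, 2.0, 4.0]          # assumed positive weights


def df_dzc(z):
    return [ck * zk for ck, zk in zip(c, z)]        # df/dz_k^* = c_k z_k


def dg_dzc(z):
    return [0.5 + 0.0j for _ in z]                  # dg/dz_k^* = 1/2


def step(z, alpha):
    fz, gz = df_dzc(z), dg_dzc(z)
    # For real f and g: d/dz_k is the complex conjugate of d/dz_k^*.
    B = sum(2.0 * abs(gk) ** 2 for gk in gz)
    num = sum(2.0 * (fk * gk.conjugate()).real for fk, gk in zip(fz, gz))
    lam = num / B
    return [zk - 2.0 * alpha * (fk - lam * gk) for zk, fk, gk in zip(z, fz, gz)]


z = [1.0 + 1.0j, 0.5j, -0.5j]    # initial guess with sum of real parts equal to 1
for _ in range(500):
    z = step(z, 0.05)

inv = [1.0 / ck for ck in c]
z_exact = [v / sum(inv) for v in inv]   # known minimum: real, imaginary parts vanish
print(z, z_exact)
```

The imaginary parts are unconstrained in this example and simply decay to zero, while the real parts follow the same path as they would in the real-variable formulation with $x_k = a_k$.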
When constraints are given by complex form, \frac{d x_k} {d \tau} = \mp \alpha(\tau) \bigg[\frac{\partial f} {\partial x_k} - \sum_l \Big(\lambda_l \frac{\partial g_l} {\partial x_k} + \lambda_l^* \frac{\partial g_l^*} {\partial x_k} \Big) \bigg] . And \frac{d z_k} {d \tau} = \mp 2 \alpha(\tau) \bigg[\frac{\partial f} {\partial z_k^*} - \sum_l \Big(\lambda_l \frac{\partial g_l} {\partial z_k^*} + \lambda_l^* \frac{\partial g_l^*} {\partial z_k^*} \Big) \bigg] or \frac{d z_k^*} {d \tau} = \mp 2 \alpha(\tau) \bigg[\frac{\partial f} {\partial z_k} - \sum_l \Big(\lambda_l \frac{\partial g_l} {\partial z_k} + \lambda_l^* \frac{\partial g_l^*} {\partial z_k} \Big) \bigg] .
The rate of change $df/d\tau$ is then \begin{split} &\frac{df} {d\tau} = \sum_k \bigg[\frac{\partial f} {\partial z_k} \frac{d z_k} {d\tau} + \frac{\partial f} {\partial z_k^*} \frac{d z_k^*} {d\tau} \bigg] \\ &= \mp 2 \alpha(\tau) \sum_k \bigg[2 \frac{\partial f} {\partial z_k} \frac{\partial f} {\partial z_k^*} - \sum_l \bigg(\Big(\lambda_l \frac{\partial g_l} {\partial z_k^*} + \lambda_l^* \frac{\partial g_l^*} {\partial z_k^*} \Big) \frac{\partial f} {\partial z_k} + \Big(\lambda_l \frac{\partial g_l} {\partial z_k} + \lambda_l^* \frac{\partial g_l^*} {\partial z_k} \Big) \frac{\partial f} {\partial z_k^*} \bigg) \bigg] \\ &= \mp 2 \alpha(\tau) \sum_k \bigg[2 \frac{\partial f} {\partial z_k} \frac{\partial f} {\partial z_k^*} - \sum_l \bigg(\lambda_l \Big(\frac{\partial g_l} {\partial z_k^*} \frac{\partial f} {\partial z_k} + \frac{\partial g_l} {\partial z_k} \frac{\partial f} {\partial z_k^*} \Big) + \lambda_l^* \Big(\frac{\partial g_l^*} {\partial z_k^*} \frac{\partial f} {\partial z_k} + \frac{\partial g_l^*} {\partial z_k} \frac{\partial f} {\partial z_k^*} \Big) \bigg) \bigg] \\ &= \mp 4 \alpha(\tau) \sum_k \bigg[\frac{\partial f} {\partial z_k} \frac{\partial f} {\partial z_k^*} - \sum_l \Re \bigg(\lambda_l \Big(\frac{\partial g_l} {\partial z_k^*} \frac{\partial f} {\partial z_k} + \frac{\partial g_l} {\partial z_k} \frac{\partial f} {\partial z_k^*} \Big) \bigg) \bigg] \end{split} And the constraints are \begin{split} \frac{d g_j}{d\tau} &= \sum_k \bigg[\frac{\partial g_j} {\partial z_k} \frac{d z_k} {d \tau} + \frac{\partial g_j} {\partial z_k^*} \frac{d z_k^*} {d \tau} \bigg] = 0 \\ &= \mp 2 \alpha(\tau) \sum_k \bigg[\frac{\partial f} {\partial z_k^*} \frac{\partial g_j} {\partial z_k} - \sum_l \Big(\lambda_l \frac{\partial g_l} {\partial z_k^*} + \lambda_l^* \frac{\partial g_l^*} {\partial z_k^*} \Big) \frac{\partial g_j} {\partial z_k} \\ &~~~~~~~~~~~~~~~~~~~~~ + \frac{\partial f} {\partial z_k} \frac{\partial g_j} {\partial z_k^*} - \sum_l 
\Big(\lambda_l \frac{\partial g_l} {\partial z_k} + \lambda_l^* \frac{\partial g_l^*} {\partial z_k} \Big) \frac{\partial g_j} {\partial z_k^*} \bigg] \\ &= \mp 2 \alpha(\tau) \sum_k \bigg[\frac{\partial f} {\partial z_k^*} \frac{\partial g_j} {\partial z_k} + \frac{\partial f} {\partial z_k} \frac{\partial g_j} {\partial z_k^*} \\ &~~~~~~~~~~~~~~~~~~~~~ - \sum_l \bigg(\lambda_l \Big(\frac{\partial g_l} {\partial z_k^*} \frac{\partial g_j} {\partial z_k} + \frac{\partial g_l} {\partial z_k} \frac{\partial g_j} {\partial z_k^*} \Big) + \lambda_l^* \Big(\frac{\partial g_l^*} {\partial z_k^*} \frac{\partial g_j} {\partial z_k} + \frac{\partial g_l^*} {\partial z_k} \frac{\partial g_j} {\partial z_k^*} \Big) \bigg) \bigg] . \end{split} Defining short-handed notation such as \sum_{\tilde{l}} \lambda_{\tilde{l}} g_{\tilde{l}} \equiv \sum_l \Big[\lambda_l g_l + \lambda_l^* g_l^* \Big] , \begin{split} \sum_k \bigg[\frac{\partial f} {\partial z_k^*} \frac{\partial g_{\tilde{j}}} {\partial z_k} + \frac{\partial f} {\partial z_k} \frac{\partial g_{\tilde{j}}} {\partial z_k^*} - \sum_{\tilde{l}} \lambda_{\tilde{l}} \Big(\frac{\partial g_{\tilde{l}}} {\partial z_k^*} \frac{\partial g_{\tilde{j}}} {\partial z_k} + \frac{\partial g_{\tilde{l}}} {\partial z_k} \frac{\partial g_{\tilde{j}}} {\partial z_k^*} \Big) \bigg] = 0 . \end{split} Introducing an inverse of symmetric complex matrix C_{\tilde{l}\tilde{j}} (= C_{\tilde{j}\tilde{l}} = C_{\tilde{j}^*\tilde{l}^*}^*) \equiv \sum_k \Big(\frac{\partial g_{\tilde{l}}} {\partial z_k^*} \frac{\partial g_{\tilde{j}}} {\partial z_k} + \frac{\partial g_{\tilde{l}}} {\partial z_k} \frac{\partial g_{\tilde{j}}} {\partial z_k^*} \Big) , i.e. 
\sum_{k, \tilde{j}} \Big(\frac{\partial g_{\tilde{l}}} {\partial z_k^*} \frac{\partial g_{\tilde{j}}} {\partial z_k} + \frac{\partial g_{\tilde{l}}} {\partial z_k} \frac{\partial g_{\tilde{j}}} {\partial z_k^*} \Big) C^{-1}_{\tilde{j}\tilde{i}} = \delta_{\tilde{l}\tilde{i}} , then \begin{split} &\sum_{k, \tilde{j}} \Big(\frac{\partial f} {\partial z_k^*} \frac{\partial g_{\tilde{j}}} {\partial z_k} + \frac{\partial f} {\partial z_k} \frac{\partial g_{\tilde{j}}} {\partial z_k^*} \Big) C^{-1}_{\tilde{j}\tilde{i}} = \sum_{\tilde{l}, k, \tilde{j}} \lambda_{\tilde{l}} \Big(\frac{\partial g_{\tilde{l}}} {\partial z_k^*} \frac{\partial g_{\tilde{j}}} {\partial z_k} + \frac{\partial g_{\tilde{l}}} {\partial z_k} \frac{\partial g_{\tilde{j}}} {\partial z_k^*} \Big) C^{-1}_{\tilde{j}\tilde{i}} \\ &\sum_{k, \tilde{j}} \Big(\frac{\partial f} {\partial z_k^*} \frac{\partial g_{\tilde{j}}} {\partial z_k} + \frac{\partial f} {\partial z_k} \frac{\partial g_{\tilde{j}}} {\partial z_k^*} \Big) C^{-1}_{\tilde{j}\tilde{i}} = \lambda_{\tilde{i}} . \end{split}
### Functional variables
When functional variables come into the problem, consider S = \int d \vec{x} ~ f \big(\{h_k (\vec{x}, \tau) \} \big) . Then \begin{split} \frac{d S} {d \tau} &= \int d \vec{x} \sum_k \bigg[\frac{\partial f} {\partial h_k} \frac{d h_k} {d \tau} + \sum_{\mu} \frac{\partial f} {\partial \big(\partial_{\mu} h_k \big)} \partial_{\mu} \big(\frac{d h_k} {d \tau} \big) + \cdots \bigg] \\ &= \sum_k \int d \vec{x} \bigg[\frac{\partial f} {\partial h_k} - \sum_{\mu} \partial_{\mu} \Big[\frac{\partial f} {\partial \big(\partial_{\mu} h_k \big)} \Big] + \cdots \bigg] \frac{d h_k} {d \tau} . \end{split} with appropriate boundary conditions, so that the boundary terms from the integration by parts vanish. 
So to minimize this with constraints \begin{split} &\int d \vec{x} \sum_k \bigg[\frac{\partial g_l} {\partial h_k} \frac{d h_k} {d \tau} + \sum_{\mu} \frac{\partial g_l} {\partial \big(\partial_{\mu} h_k \big)} \partial_{\mu} \big(\frac{d h_k} {d \tau} \big) + \cdots \bigg] = 0 \\ &= \sum_k \int d \vec{x} \bigg[\frac{\partial g_l} {\partial h_k} - \sum_{\mu} \partial_{\mu} \Big[\frac{\partial g_l} {\partial \big(\partial_{\mu} h_k \big)} \Big] + \cdots \bigg] \frac{d h_k} {d \tau} \end{split} and \int d \vec{x} \sum_k \bigg(\frac{d h_k} {d \tau} \bigg)^2 = \textrm{const} , \begin{split} &\frac{\partial f} {\partial h_k} - \sum_{\mu} \partial_{\mu} \Big[\frac{\partial f} {\partial \big(\partial_{\mu} h_k \big)} \Big] + \cdots \\ &= \sum_l \lambda_l \bigg[\frac{\partial g_l} {\partial h_k} - \sum_{\mu} \partial_{\mu} \Big[\frac{\partial g_l} {\partial \big(\partial_{\mu} h_k \big)} \Big] + \cdots \bigg] + 2 \lambda \frac{d h_k} {d \tau} . \end{split}
Then to find the minimum/maximum, \frac{d h_k} {d \tau} = \mp \alpha (\tau) \bigg[\frac{\partial f} {\partial h_k} - \sum_{\mu} \partial_{\mu} \Big[\frac{\partial f} {\partial \big(\partial_{\mu} h_k \big)} \Big] + \cdots - \sum_l \lambda_l \Big(\frac{\partial g_l} {\partial h_k} - \sum_{\mu} \partial_{\mu} \Big[\frac{\partial g_l} {\partial \big(\partial_{\mu} h_k \big)} \Big] + \cdots \Big) \bigg] where $\alpha(\tau)$ is an arbitrary positive number.
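A discretized version of this functional flow can be sketched as follows, with the multiplier fixed at each step by requiring that the constraint is preserved. The example is an assumption chosen because its answer is known in closed form: minimize $S = \int_0^1 \frac{1}{2} (\partial_x h)^2 dx$ with $h(0) = h(1) = 0$ under the constraint $\int_0^1 h ~ dx = A$, so that the bracket for $f$ is $- \partial_x^2 h$ and the bracket for $g = h$ is $1$. On a uniform grid the stationary profile is the parabola $h \propto x (1 - x)$. The grid size, step size, iteration count and all names are illustrative choices.

```python
# Minimal sketch: dh/dtau = -alpha [ -h'' - lambda ] on a uniform grid,
# with lambda chosen so that the discrete integral of h stays constant.
# Assumed example: f = (1/2) (h')^2, g = h, h(0) = h(1) = 0.
import math

N = 20                                   # number of grid intervals
dx = 1.0 / N
xs = [i * dx for i in range(N + 1)]
h = [math.sin(math.pi * x) for x in xs]  # initial guess
h[0] = h[N] = 0.0                        # enforce the boundary conditions exactly
A = dx * sum(h)                          # value of the constraint integral


def euler_lagrange(h):
    """Discrete df/dh - d/dx [df/d(h')] = -h'' at the interior grid points."""
    return [-(h[i + 1] - 2.0 * h[i] + h[i - 1]) / dx ** 2 for i in range(1, N)]


alpha = 0.4 * dx ** 2                    # explicit Euler needs alpha < dx^2 / 2 here
for _ in range(5000):
    E = euler_lagrange(h)
    lam = sum(E) / len(E)                # lambda = [int E * 1 dx] / [int 1 * 1 dx]
    for i in range(1, N):
        h[i] -= alpha * (E[i - 1] - lam)

scale = A / (dx * sum(x * (1.0 - x) for x in xs))
h_exact = [scale * x * (1.0 - x) for x in xs]
err = max(abs(a - b) for a, b in zip(h, h_exact))
print(err)
```

The explicit Euler step is the simplest possible discretization of the pseudo time and is only stable for small $\alpha$. An implicit or adaptive scheme would be a natural replacement when the grid is fine.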
Then the change is \frac{d S} {d \tau} = \mp \alpha (\tau) \int d \vec{x} \sum_k \bigg[\frac{\partial f} {\partial h_k} - \cdots \bigg] \bigg[\frac{\partial f} {\partial h_k} - \cdots - \sum_l \lambda_l \Big(\frac{\partial g_l} {\partial h_k} - \cdots \Big) \bigg] . The constraints become \begin{split} &\mp \alpha (\tau) \int d \vec{x} \sum_k \bigg[\frac{\partial g_j} {\partial h_k} - \cdots \bigg] \bigg[\frac{\partial f} {\partial h_k} - \cdots - \sum_l \lambda_l \Big(\frac{\partial g_l} {\partial h_k} - \cdots \Big) \bigg] = 0 \end{split} Introducing an inverse of symmetric matrix D_{lj} (= D_{jl}) \equiv \int d \vec{x} \sum_k \Big(\frac{\partial g_l} {\partial h_k} - \cdots \Big) \Big(\frac{\partial g_j} {\partial h_k} - \cdots \Big) , i.e. \sum_{j} \int d \vec{x} \sum_k \Big(\frac{\partial g_l} {\partial h_k} - \cdots \Big) \Big(\frac{\partial g_j} {\partial h_k} - \cdots \Big) D^{-1}_{ji} = \delta_{li} , then \begin{split} &\sum_{k, j} \int d \vec{x} \Big(\frac{\partial f} {\partial h_k} - \cdots \Big) \Big(\frac{\partial g_j} {\partial h_k} - \cdots \Big) D^{-1}_{ji} = \sum_{l, k, j} \int d \vec{x} \lambda_{l} \Big(\frac{\partial g_l} {\partial h_k} - \cdots \Big) \Big(\frac{\partial g_j} {\partial h_k} - \cdots \Big) D^{-1}_{ji} \\ &\sum_{k, j} \int d \vec{x} \Big(\frac{\partial f} {\partial h_k} - \cdots \Big) \Big(\frac{\partial g_j} {\partial h_k} - \cdots \Big) D^{-1}_{ji} = \lambda_{i} . \end{split}
### Functional complex variables
When function is in complex form, \frac{d z_k} {d \tau} = \mp \alpha (\tau) \bigg[\frac{\partial f} {\partial z_k^*} - \sum_{\mu} \partial_{\mu} \Big[\frac{\partial f} {\partial \big(\partial_{\mu} z_k^* \big)} \Big] + \cdots - \sum_l \lambda_l \Big(\frac{\partial g_l} {\partial z_k^*} - \sum_{\mu} \partial_{\mu} \Big[\frac{\partial g_l} {\partial \big(\partial_{\mu} z_k^* \big)} \Big] + \cdots \Big) \bigg] or \frac{d z_k^*} {d \tau} = \mp \alpha (\tau) \bigg[\frac{\partial f} {\partial z_k} - \sum_{\mu} \partial_{\mu} \Big[\frac{\partial f} {\partial \big(\partial_{\mu} z_k \big)} \Big] + \cdots - \sum_l \lambda_l \Big(\frac{\partial g_l} {\partial z_k} - \sum_{\mu} \partial_{\mu} \Big[\frac{\partial g_l} {\partial \big(\partial_{\mu} z_k \big)} \Big] + \cdots \Big) \bigg] When additionally the constraints are given in complex form, \begin{split} \frac{d h_k} {d \tau} = &\mp \alpha (\tau) \bigg[\frac{\partial f} {\partial h_k} - \sum_{\mu} \partial_{\mu} \Big[\frac{\partial f} {\partial \big(\partial_{\mu} h_k \big)} \Big] + \cdots \\ &- \sum_l \bigg(\lambda_l \Big(\frac{\partial g_l} {\partial h_k} - \sum_{\mu} \partial_{\mu} \Big[\frac{\partial g_l} {\partial \big(\partial_{\mu} h_k \big)} \Big] + \cdots \Big) + \lambda_l^* \Big(\frac{\partial g_l^*} {\partial h_k} - \sum_{\mu} \partial_{\mu} \Big[\frac{\partial g_l^*} {\partial \big(\partial_{\mu} h_k \big)} \Big] + \cdots \Big) \bigg) \bigg] . 
\end{split} And \begin{split} \frac{d z_k} {d \tau} = &\mp \alpha (\tau) \bigg[\frac{\partial f} {\partial z_k^*} - \sum_{\mu} \partial_{\mu} \Big[\frac{\partial f} {\partial \big(\partial_{\mu} z_k^* \big)} \Big] + \cdots \\ &- \sum_l \bigg(\lambda_l \Big(\frac{\partial g_l} {\partial z_k^*} - \sum_{\mu} \partial_{\mu} \Big[\frac{\partial g_l} {\partial \big(\partial_{\mu} z_k^* \big)} \Big] + \cdots \Big) + \lambda_l^* \Big(\frac{\partial g_l^*} {\partial z_k^*} - \sum_{\mu} \partial_{\mu} \Big[\frac{\partial g_l^*} {\partial \big(\partial_{\mu} z_k^* \big)} \Big] + \cdots \Big) \bigg) \bigg] \end{split} or \begin{split} \frac{d z_k^*} {d \tau} = &\mp \alpha (\tau) \bigg[\frac{\partial f} {\partial z_k} - \sum_{\mu} \partial_{\mu} \Big[\frac{\partial f} {\partial \big(\partial_{\mu} z_k \big)} \Big] + \cdots \\ &- \sum_l \bigg(\lambda_l \Big(\frac{\partial g_l} {\partial z_k} - \sum_{\mu} \partial_{\mu} \Big[\frac{\partial g_l} {\partial \big(\partial_{\mu} z_k \big)} \Big] + \cdots \Big) + \lambda_l^* \Big(\frac{\partial g_l^*} {\partial z_k} - \sum_{\mu} \partial_{\mu} \Big[\frac{\partial g_l^*} {\partial \big(\partial_{\mu} z_k \big)} \Big] + \cdots \Big) \bigg) \bigg] . \end{split}
##[.hiden.no-sec-N#sec-Index-Def] Index and Definition
  • Partial differentiation: When a function $f$ is written in terms of multiple variables, the partial derivative with respect to one variable is the formal derivative with respect to the explicit appearance of that variable only, even when the variables are mutually dependent.

    For example, f(z, z^*, a, b, \xi, \theta) = z^2 z^* \cos(ab) \xi^{a} \frac{1}{\theta} where $z$ and $z^*$ are complex, and the others are real, satisfying \begin{split} &z = a + i b = \xi e^{i \theta}\\ &z^* = a - i b = \xi e^{-i \theta} . \end{split} The partial derivatives are \begin{split} &\frac{\partial f} {\partial z} = 2 z z^* \cos(ab) \xi^{a} \frac{1}{\theta} \\ & \frac{\partial f} {\partial z^*} = z^2 \cos(ab) \xi^{a} \frac{1}{\theta} \\ & \frac{\partial f} {\partial a} = z^2 z^* \big(-b \cdot \sin(ab) \big) \xi^{a} \frac{1}{\theta} + z^2 z^* \cos(ab) \big(\ln \xi \cdot \xi^{a} \big) \frac{1}{\theta} \\ & \frac{\partial f} {\partial b} = z^2 z^* \big(-a \cdot \sin(ab) \big) \xi^{a} \frac{1}{\theta} \\ & \frac{\partial f} {\partial \xi} = z^2 z^* \cos(ab) \big(a \xi^{a-1} \big) \frac{1}{\theta} \\ & \frac{\partial f} {\partial \theta} = z^2 z^* \cos(ab) \xi^{a} \big(- \frac{1}{\theta^2} \big) . \end{split} We can easily check that the infinitesimal change of $f$, which comes from an infinitesimal variation of the variables, is given by \begin{split} \delta f = \frac{\partial f} {\partial z} \delta z + \frac{\partial f} {\partial z^*} \delta z^* + \frac{\partial f} {\partial a} \delta a + \frac{\partial f} {\partial b} \delta b + \frac{\partial f} {\partial \xi} \delta \xi + \frac{\partial f} {\partial \theta} \delta \theta \end{split} where \begin{split} &\delta z = \delta a + i \delta b = e^{i \theta} \delta \xi + i \xi e^{i \theta} \delta \theta \\ & \delta z^* = \delta a - i \delta b = e^{-i \theta} \delta \xi - i \xi e^{-i \theta} \delta \theta . \end{split} Although functions are usually expressed with mutually independent variables, this concept of partial derivative is quite important. It is crucial especially when a function is expressed with complex variables. In this case, even the total derivative with respect to a complex variable may not be well defined.

    Though, for convenience, we usually omit the argument list when writing partial derivatives, we should always keep in mind which variables the function is expressed with. Otherwise great misunderstanding can result. Usually \begin{split} &\frac{\partial f} {\partial z} \textrm{ means } \frac{\partial f (z, z^*)} {\partial z} , \\ & \frac{\partial f} {\partial a} \textrm{ means } \frac{\partial f (a, b)} {\partial a} , \\ & \frac{\partial f} {\partial \theta} \textrm{ means } \frac{\partial f (\xi, \theta)} {\partial \theta} , \end{split} and so on.
  • Shorthand notation for multiple integration: \int d \vec{x} \equiv \int \int \int \cdots \bigg[d x^{0} d x^{1} d x^{2} \cdots \bigg] = \int \int \int \cdots \bigg[\prod_{i} d x^{i} \bigg] . For 4 variables $\{x^{0}, x^{1}, x^{2}, x^{3}\}$, \int d \vec{x} \equiv \int \int \int \int \bigg[d x^{0} d x^{1} d x^{2} d x^{3} \bigg] = \int \int \int \int \bigg[\prod_{i=0}^{3} d x^{i} \bigg] . With some of the integrations left out: \int \frac{d \vec{x}} {d x^{0}} \equiv \int \int \int \bigg[d x^{1} d x^{2} d x^{3} \bigg] = \int \int \int \bigg[\prod_{i \neq 0} d x^{i} \bigg] , \int \frac{d \vec{x}} {d x^{0} d x^{3}} \equiv \int \int \bigg[d x^{1} d x^{2} \bigg] = \int \int \bigg[\prod_{i \neq 0, 3} d x^{i} \bigg] , and so on.
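The partial-differentiation example in the list above can be checked numerically. The sketch below treats $f$ as a formal function of the six arguments $(z, z^*, a, b, \xi, \theta)$, uses the six partial derivatives exactly as written in the example, and compares the resulting first-order $\delta f$ with the actual change of $f$ under a small variation of $a$ and $b$. The sample point, the size of the variation and all function names are assumptions made for this check.

```python
# Minimal sketch: check delta f = sum over all six formal partial derivatives.
# f(z, z*, a, b, xi, theta) = z^2 z* cos(ab) xi^a / theta with
# z = a + i b = xi e^{i theta}. Names and sample values are hypothetical.
import cmath
import math


def dependent(a, b):
    """All six mutually dependent variables computed from real a and b."""
    z = complex(a, b)
    return z, z.conjugate(), a, b, abs(z), cmath.phase(z)


def f(z, zc, a, b, xi, th):
    return z ** 2 * zc * math.cos(a * b) * xi ** a / th


def partials(z, zc, a, b, xi, th):
    cos_ab, sin_ab = math.cos(a * b), math.sin(a * b)
    return (
        2 * z * zc * cos_ab * xi ** a / th,                   # df/dz
        z ** 2 * cos_ab * xi ** a / th,                       # df/dz*
        z ** 2 * zc * (-b * sin_ab) * xi ** a / th
        + z ** 2 * zc * cos_ab * math.log(xi) * xi ** a / th,  # df/da
        z ** 2 * zc * (-a * sin_ab) * xi ** a / th,           # df/db
        z ** 2 * zc * cos_ab * a * xi ** (a - 1) / th,        # df/dxi
        z ** 2 * zc * cos_ab * xi ** a * (-1 / th ** 2),      # df/dtheta
    )


old = dependent(1.2, 0.5)
new = dependent(1.2 + 1e-7, 0.5 + 2e-7)          # small variation of a and b
delta_actual = f(*new) - f(*old)
delta_linear = sum(p * (n - o) for p, n, o in zip(partials(*old), new, old))
print(delta_actual, delta_linear)
```

The two values agree up to second-order terms in the variation, which illustrates that every explicit appearance of a variable contributes its own formal partial derivative, even though only two of the six variables are independent.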
## RRA
  1. kipid's blog - 최적화, 라그랑지 승수법 (Optimization with the Method of Lagrange multipliers); This is the Korean version, but its content is not exactly the same as the current document. The current English version is more detailed and more complete.
  2. Wiki - Lagrange multiplier (라그랑주 승수법)