====== Variational Groundstate Search ======

Here we introduce the variational groundstate search for matrix product states. The goal is to find the MPS representation of the groundstate of a one-dimensional lattice with a given Hamiltonian $\hat H$. Furthermore, we limit the dimension $m$ of the matrices in the MPS to a maximal value, so we are looking for the best MPS representation of the groundstate for a given matrix dimension $m$. The main idea is to use the variational principle: we look for the MPS $\lvert\psi\rangle$ that minimizes
\begin{align} E = \min_{\lvert\psi\rangle}\frac{\langle\psi\lvert\hat H\lvert\psi\rangle}{\langle\psi\lvert\psi\rangle} \end{align}
This problem can be expressed in the Lagrange formalism: we minimize $\langle\psi\lvert\hat H\lvert\psi\rangle$ under the constraint that the norm $\langle\psi\lvert\psi\rangle$ stays constant. Introducing the groundstate energy $E$ as the Lagrange multiplier, we get the Lagrange function:
\begin{align} L=\langle\psi\lvert\hat H\lvert\psi\rangle - E(\langle\psi\lvert\psi\rangle-1) \end{align}
The usual way to continue is to differentiate the Lagrange function with respect to all $p$ parameters of the trial wavefunction and set the derivatives to zero. This leads to $p+1$ coupled equations that have to be solved consistently. An MPS consists of $d\cdot L$ matrices with $m^2$ coefficients each, which makes $p=d\cdot L\cdot m^2$ parameters in total, so this global optimization only works for very small dimensions $m$. Therefore we follow the route of the DMRG: we optimize just one matrix of the MPS at a time and sweep through the system, optimizing the matrices one after the other.

===== DMRG Optimization =====

As stated above, the idea is not to optimize all matrices at the same time, but to optimize just the matrix at site $l$ in the MPS and then shift to the next matrix and continue.

==== Local representations ====

In order to use the local DMRG optimization, we have to represent the expectation value $\langle\psi\lvert\hat H\lvert\psi\rangle$ in a compact form with respect to the matrix at site $l$:
\begin{align} \langle\psi\lvert\hat H\lvert\psi\rangle = & \sum_{\boldsymbol{\sigma, \sigma^\prime}}\langle\boldsymbol\sigma\lvert \underbrace{B^{\sigma_L*}\cdots B^{\sigma_{l+1}*}M^{\sigma_l*}A^{\sigma_{l-1}*}\cdots A^{\sigma_1*}}_{B^{\sigma_L*}\cdots A^{\sigma_1*}} \hat H \underbrace{A^{\sigma_1^{\prime}}\cdots A^{\sigma_{l-1}^{\prime}}M^{\sigma_l^{\prime}}B^{\sigma_{l+1}^{\prime}}\cdots B^{\sigma_L^{\prime}}}_{A^{\sigma_1^{\prime}}\cdots B^{\sigma_L^{\prime}}}\lvert\boldsymbol{\sigma^\prime}\rangle\newline = & \sum_{\boldsymbol{\sigma, \sigma^\prime}}\sum_{\boldsymbol{\tilde\sigma, \tilde\sigma^\prime}}\langle\boldsymbol\sigma\lvert B^{\sigma_L*}\cdots A^{\sigma_1*} \lvert\boldsymbol{\tilde\sigma}\rangle W^{\tilde\sigma_1,\tilde\sigma_1^\prime}\cdots W^{\tilde\sigma_L,\tilde\sigma_L^\prime}\langle\boldsymbol{\tilde\sigma^\prime}\lvert A^{\sigma_1^{\prime}}\cdots B^{\sigma_L^{\prime}}\lvert\boldsymbol{\sigma^\prime}\rangle\newline = & \sum_{\boldsymbol{\sigma, \sigma^\prime}}\sum_{\boldsymbol{\tilde\sigma, \tilde\sigma^\prime}}\underbrace{\langle\boldsymbol\sigma\lvert\boldsymbol{\tilde\sigma}\rangle}_{\delta_{\sigma, \tilde\sigma}} \sum_{a_i, a_i^\prime, b_i} B^{\sigma_L*}_{a_{L-1}, 1}\cdots A^{\sigma_1*}_{1, a_1} W^{\tilde\sigma_1,\tilde\sigma_1^\prime}_{1, b_1}\cdots W^{\tilde\sigma_L,\tilde\sigma_L^\prime}_{b_{L-1}, 1} A^{\sigma_1^{\prime}}_{1, a_1^\prime}\cdots B^{\sigma_L^{\prime}}_{a_{L-1}^\prime, 1}\underbrace{\langle\boldsymbol{\tilde\sigma^\prime}\lvert\boldsymbol{\sigma^\prime}\rangle}_{\delta_{\sigma^\prime, \tilde\sigma^\prime}}\newline = & \sum_{a_{l-1},a_l}\sum_{a_{l-1}^{\prime}, a_{l}^\prime}\sum_{b_{l-1}, b_l}L^{a_{l-1}}_{a_{l-1}^{\prime}, b_{l-1}} \left(\sum_{\sigma_{l}, \sigma^\prime_{l}}M^{\sigma_{l}*}_{a_{l-1}, a_{l}}W^{\sigma_{l}, \sigma_{l}^\prime}_{b_{l-1}, b_{l}}M^{\sigma^{\prime}_{l}}_{a_{l-1}^\prime, a_{l}^\prime}\right)R^{a_{l}}_{a_{l}^{\prime}, b_{l}} \end{align}
In the second line we inserted the MPO representation of the Hamiltonian, $\hat H=\sum_{\boldsymbol{\tilde\sigma}, \boldsymbol{\tilde\sigma^\prime}}W^{\tilde\sigma_1,\tilde\sigma_1^\prime}\cdots W^{\tilde\sigma_L,\tilde\sigma_L^\prime}\lvert\boldsymbol{\tilde\sigma}\rangle\langle\boldsymbol{\tilde\sigma^\prime}\rvert$. Here $L^{a_{l-1}}_{a_{l-1}^{\prime}, b_{l-1}}$ represents the contraction of everything to the left of the matrix at site $l$ and $R^{a_{l}}_{a_{l}^{\prime}, b_{l}}$ the contraction of everything to the right. They are explicitly given by:
\begin{align} L^{a_{l-1}}_{a_{l-1}^{\prime}, b_{l-1}}= & \sum_{\{a_i, a_i^\prime, b_i;\; i<l-1\}}\left(\sum_{\sigma_{1}, \sigma^\prime_{1}}A^{\sigma_{1}*}_{1, a_1}W^{\sigma_{1}, \sigma_{1}^\prime}_{1, b_1}A^{\sigma^{\prime}_{1}}_{1, a_1^\prime}\right)\cdots\left(\sum_{\sigma_{l-1}, \sigma^\prime_{l-1}}A^{\sigma_{l-1}*}_{a_{l-2}, a_{l-1}}W^{\sigma_{l-1}, \sigma_{l-1}^\prime}_{b_{l-2}, b_{l-1}}A^{\sigma^{\prime}_{l-1}}_{a_{l-2}^\prime, a_{l-1}^\prime}\right)\newline R^{a_{l}}_{a_{l}^{\prime}, b_{l}}= & \sum_{\{a_i, a_i^\prime, b_i;\; i>l\}}\left(\sum_{\sigma_{l+1}, \sigma^\prime_{l+1}}B^{\sigma_{l+1}*}_{a_l, a_{l+1}}W^{\sigma_{l+1}, \sigma_{l+1}^\prime}_{b_l, b_{l+1}}B^{\sigma^{\prime}_{l+1}}_{a_l^{\prime}, a_{l+1}^\prime}\right)\cdots\left(\sum_{\sigma_{L}, \sigma^\prime_{L}}B^{\sigma_{L}*}_{a_{L-1}, 1}W^{\sigma_{L}, \sigma_{L}^\prime}_{b_{L-1}, 1}B^{\sigma^{\prime}_{L}}_{a_{L-1}^\prime, 1}\right) \end{align}
The scalar product $\langle\psi\lvert\psi\rangle$ can also be given in local form:
\begin{align} \langle\psi\lvert\psi\rangle = &\sum_{\boldsymbol{\sigma}} B^{\sigma_L\dagger}\cdots B^{\sigma_{l+1}\dagger}M^{\sigma_l\dagger}\underbrace{A^{\sigma_{l-1}\dagger}\cdots A^{\sigma_1\dagger}A^{\sigma_1}\cdots A^{\sigma_{l-1}}}_{1}M^{\sigma_l}B^{\sigma_{l+1}}\cdots B^{\sigma_L}\newline = &\sum_{\sigma_{l}, \ldots, \sigma_L}\text{Tr}\left(M^{\sigma_l\dagger}M^{\sigma_l}\underbrace{B^{\sigma_{l+1}}\cdots B^{\sigma_L}B^{\sigma_L\dagger}\cdots B^{\sigma_{l+1}\dagger}}_{1}\right)\newline = &\sum_{\sigma_l}\text{Tr}\left(M^{\sigma_l\dagger}M^{\sigma_l}\right) \end{align}
where we used the left-normalization of the $A$ matrices and the right-normalization of the $B$ matrices.
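These local forms translate directly into tensor contractions. The following is a minimal sketch in Python/NumPy; the index conventions (''M[s, a, b]'' for $M^{\sigma_l}_{a_{l-1}, a_l}$, ''W[s, t, b, c]'' for $W^{\sigma_l, \sigma_l^\prime}_{b_{l-1}, b_l}$, ''Lenv[a, a', b]'' and ''Renv[a, a', b]'' for the $L$- and $R$-tensors) are our own choice and not fixed by the text:

<code python>
import numpy as np

def local_energy(Lenv, W, Renv, M):
    """<psi|H|psi> in the local representation around site l:
    contract L, M*, W, M and R over all shared indices."""
    return np.einsum('xyu,sxa,stuv,tyc,acv->',
                     Lenv, np.conj(M), W, M, Renv)

def local_norm(M):
    """<psi|psi> = sum_sigma Tr(M^dagger M) in the mixed-canonical form."""
    return np.einsum('sab,sab->', np.conj(M), M)
</code>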
==== Effective calculation of L and R ====

After the optimization of a single matrix we want to continue with the next one. In order to set up the new $L$ and $R$, we can use the fact that they can be updated from the previous ones:
\begin{align} L^{a_{l-1}}_{a_{l-1}^{\prime}, b_{l-1}}= &\sum_{a_{l-2}, a_{l-2}^\prime, b_{l-2}}L^{a_{l-2}}_{a_{l-2}^{\prime}, b_{l-2}}\left(\sum_{\sigma_{l-1}, \sigma^\prime_{l-1}}A^{\sigma_{l-1}*}_{a_{l-2}, a_{l-1}}W^{\sigma_{l-1}, \sigma_{l-1}^\prime}_{b_{l-2}, b_{l-1}}A^{\sigma^{\prime}_{l-1}}_{a_{l-2}^\prime, a_{l-1}^\prime}\right)\newline = &\sum_{\sigma_{l-1}, \sigma^\prime_{l-1}} \sum_{a_{l-2}, a_{l-2}^\prime}A^{\sigma_{l-1}*}_{a_{l-2}, a_{l-1}}\left(\sum_{b_{l-2}}L^{a_{l-2}}_{a_{l-2}^{\prime}, b_{l-2}}W^{\sigma_{l-1}, \sigma_{l-1}^\prime}_{b_{l-2}, b_{l-1}}\right)A^{\sigma^{\prime}_{l-1}}_{a_{l-2}^\prime, a_{l-1}^\prime}\newline = &\sum_{\sigma_{l-1}, \sigma^\prime_{l-1}} \sum_{a_{l-2}, a_{l-2}^\prime}A^{\sigma_{l-1}*}_{a_{l-2}, a_{l-1}}\left(L^{a_{l-2}}W^{\sigma_{l-1}, \sigma_{l-1}^\prime}\right)_{a_{l-2}^{\prime}, b_{l-1}}A^{\sigma^{\prime}_{l-1}}_{a_{l-2}^\prime, a_{l-1}^\prime}\newline = &\sum_{\sigma_{l-1}, \sigma^\prime_{l-1}} \sum_{a_{l-2}}A^{\sigma_{l-1}*}_{a_{l-2}, a_{l-1}}\left(\sum_{a_{l-2}^\prime}\left(\left(L^{a_{l-2}}W^{\sigma_{l-1}, \sigma_{l-1}^\prime}\right)^t\right)_{b_{l-1}, a_{l-2}^{\prime}}A^{\sigma^{\prime}_{l-1}}_{a_{l-2}^\prime, a_{l-1}^\prime}\right)\newline = &\sum_{\sigma_{l-1}, \sigma^\prime_{l-1}} \sum_{a_{l-2}}A^{\sigma_{l-1}*}_{a_{l-2}, a_{l-1}}\left(\left(L^{a_{l-2}}W^{\sigma_{l-1}, \sigma_{l-1}^\prime}\right)^tA^{\sigma^{\prime}_{l-1}}\right)_{b_{l-1}, a_{l-1}^\prime} \end{align}
Please note that $L^{a_{0}}$ is a scalar with value 1. There is an analogous formulation for $R^{a_l}_{a_l^\prime, b_l}$:
\begin{align} R^{a_{l}}_{a_{l}^{\prime}, b_{l}}= \sum_{\sigma_{l+1},\sigma^\prime_{l+1}}\sum_{a_{l+1}}B^{\sigma_{l+1}*}_{a_{l}, a_{l+1}}\left(W^{\sigma_{l+1}, \sigma_{l+1}^\prime}\left(B^{\sigma^{\prime}_{l+1}}R^{a_{l+1}}\right)^t\right)_{b_{l}, a_{l}^\prime} \end{align}
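Both updates are plain three-tensor contractions. A minimal sketch in Python/NumPy, using the same assumed index conventions as above (''A[s, a, b]'', ''B[s, a, b]'', ''W[s, t, b, c]'', environments as rank-3 arrays):

<code python>
import numpy as np

def update_left(Lenv, A, W):
    """Grow the left block by one site: contract L^{a_{l-2}} with
    A^{sigma_{l-1}*}, W^{sigma_{l-1}, sigma'_{l-1}} and A^{sigma'_{l-1}}."""
    return np.einsum('xyu,sxa,stuv,tyc->acv', Lenv, np.conj(A), W, A)

def update_right(Renv, B, W):
    """Grow the right block by one site, the mirror image of update_left."""
    return np.einsum('sxa,stuv,tyc,acv->xyu', np.conj(B), W, B, Renv)

# Both recursions start from trivial boundaries, matching L^{a_0} = 1:
# Lenv = np.ones((1, 1, 1)); Renv = np.ones((1, 1, 1))
</code>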
==== Optimization of a single matrix ====

We have shown how to set up local forms of the expectation value and the scalar product with respect to a single matrix at site $l$. Now we want to use the variational principle. Therefore we differentiate the Lagrange function
\begin{align} L=\langle\psi\lvert\hat H\lvert\psi\rangle - E(\langle\psi\lvert\psi\rangle-1) \end{align}
with respect to all coefficients $M^{\sigma_l*}_{a_{l-1}, a_l}$ (treating $M$ and $M^*$ as independent variables) and set these derivatives to zero. By this we get:
\begin{align} 0\stackrel{!}{=} & \frac{\partial L}{\partial M^{\sigma_l*}_{a_{l-1}, a_l}} = \frac{\partial}{\partial M^{\sigma_l*}_{a_{l-1}, a_l}}\left(\langle\psi\lvert\hat H\lvert\psi\rangle\right)-E\,\frac{\partial}{\partial M^{\sigma_l*}_{a_{l-1}, a_l}}\left(\langle\psi\vert\psi\rangle\right)\newline = & \sum_{\sigma_l^\prime}\sum_{a_{l-1}^\prime, a_l^\prime}\sum_{b_{l-1}, b_l} L_{a_{l-1}^\prime, b_{l-1}}^{a_{l-1}} W_{b_{l-1}, b_l}^{\sigma_l, \sigma_l^\prime} R_{a_l^\prime, b_l}^{a_l}M_{a_{l-1}^\prime, a_l^\prime}^{\sigma_l^\prime} - E\, M^{\sigma_l}_{a_{l-1}, a_l} \end{align}
Now we introduce $H_{\text{Eff}}$ by:
\begin{align} \left(H_{\text{Eff}}\right)_{(\sigma_l a_{l-1}a_l),(\sigma_l^{\prime}a_{l-1}^{\prime}a_l^{\prime})}=\sum_{b_{l-1}, b_l}L_{a_{l-1}^\prime, b_{l-1}}^{a_{l-1}} W_{b_{l-1}, b_l}^{\sigma_l, \sigma_l^\prime} R_{a_l^\prime, b_l}^{a_l} \end{align}
and $\nu_{(\sigma_l a_{l-1}a_l)}=M_{a_{l-1}, a_l}^{\sigma_l}$. So we get a local eigenvalue problem with eigenvalue $\lambda=E$:
\begin{align} H_{\text{Eff}}\nu - \lambda\nu = 0 \end{align}
This eigenvalue problem can be solved exactly (or approximately by a Lanczos method), and the smallest eigenvalue gives the optimized value for the overall groundstate energy. The related eigenvector $\nu_0$, reshaped back into matrix form, gives the new optimized matrix $M^{\sigma_l}$.

===== Iterative Optimization =====

In order to optimize a given starting state \(\lvert \psi \rangle\) iteratively towards the groundstate we proceed as follows (a compact sketch of a single sweep-step follows after the list):

  * **Initialization** At first we fill the matrices of the starting state with random values. Then we right-normalize the state and thereby create all the \(R\)-matrices. Then we create the Hamiltonian MPO. The sweep starts at site \(l=1\), where the MPS has the form: \begin{equation*} M^{\sigma_1}B^{\sigma_{2}}\cdots B^{\sigma_L} \end{equation*}
  * **Right sweep-step** At first we set up \(H_{\text{Eff}}\) at site \(l\) and solve the related local eigenvalue problem. Then we update the matrix \(M^{\sigma_l}\) with the (reshaped) eigenvector of the smallest eigenvalue (1). In order to extend the left-normalized block we apply an SVD to the new matrix \(\tilde M^{\sigma_l}\) and save the \(U\) as the new \(A^{\sigma_l}\), while \(S V^{\dagger}\) is multiplied into the matrix \(B^{\sigma_{l+1}}\), which shrinks the right block (2): \begin{align*} A^{\sigma_1}\cdots A^{\sigma_{l-1}} M^{\sigma_{l}} B^{\sigma_{l+1}}\cdots B^{\sigma_{L}} &\xrightarrow{(1)}A^{\sigma_1}\cdots A^{\sigma_{l-1}} \tilde{M}^{\sigma_{l}} B^{\sigma_{l+1}}\cdots B^{\sigma_{L}}\newline &\xrightarrow{(2)}A^{\sigma_1}\cdots A^{\sigma_{l}} M^{\sigma_{l+1}} B^{\sigma_{l+2}}\cdots B^{\sigma_{L}} \end{align*} From the singular values (the diagonal elements of \(S\)) we obtain the eigenvalues of the reduced density matrix, \(\rho=S^2\), and can calculate the von Neumann entanglement entropy \(S_{\text{vN}}(\lvert\psi\rangle)=-\text{Tr}\,\rho \log_2 \rho\). Then we proceed with the next site (\(l\rightarrow l+1\)).
  * **Complete right sweep** Repeating the right sweep-step until we have reached \(l=L\) gives one complete right sweep. The MPS is now left-normalized: \begin{equation*} A^{\sigma_{1}}\cdots A^{\sigma_{L-1}}M^{\sigma_L} \end{equation*}
  * **Complete left sweep** Analogously to the complete right sweep we can also sweep to the left side. We start at site \(l=L\) and end at site \(l=1\).
  * **Sweeping** Repeating complete right and left sweeps is called sweeping. We stop the sweeping once a certain criterion is fulfilled (e.g. convergence of the energy or a maximum number of sweeps).
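The following is a minimal sketch of the local optimization and one right sweep-step in Python/NumPy, reusing the index conventions and the ''update_left'' contraction from the sketches above. Building \(H_{\text{Eff}}\) as a dense matrix and diagonalizing it exactly is only feasible for small dimensions; for larger \(m\) one would switch to an iterative (Lanczos-type) solver. All function and variable names are illustrative:

<code python>
import numpy as np

def optimize_site(Lenv, W, Renv, M):
    """Build H_eff at site l and return its lowest eigenpair.
    The row index of H_eff groups (sigma_l, a_{l-1}, a_l), the column
    index the primed counterparts, as in the definition above."""
    d, Dl, Dr = M.shape
    Heff = np.einsum('xyu,stuv,acv->sxatyc', Lenv, W, Renv)
    Heff = Heff.reshape(d * Dl * Dr, d * Dl * Dr)
    lam, vec = np.linalg.eigh(Heff)       # exact solve of the local problem
    # smallest eigenvalue = current estimate of the groundstate energy;
    # its eigenvector, reshaped, is the optimized M^{sigma_l}
    return lam[0], vec[:, 0].reshape(d, Dl, Dr)

def right_sweep_step(Lenv, W, Renv, M, B_next):
    """Optimize site l, then shift the center one site to the right."""
    E, Mtilde = optimize_site(Lenv, W, Renv, M)
    d, Dl, Dr = Mtilde.shape
    # SVD with (sigma_l, a_{l-1}) as row index and a_l as column index
    U, S, Vh = np.linalg.svd(Mtilde.reshape(d * Dl, Dr), full_matrices=False)
    A = U.reshape(d, Dl, -1)                            # new A^{sigma_l}
    M_next = np.einsum('a,ab,sbc->sac', S, Vh, B_next)  # absorb S V^dagger into B^{sigma_{l+1}}
    rho = S**2                                          # reduced density-matrix eigenvalues
    S_vN = -np.sum(rho * np.log2(rho + 1e-30))          # von Neumann entanglement entropy
    Lenv = np.einsum('xyu,sxa,stuv,tyc->acv', Lenv, np.conj(A), W, A)  # update_left
    return E, A, M_next, Lenv, S_vN
</code>

Repeating ''right_sweep_step'' from \(l=1\) to \(l=L-1\) (and its mirror image for the left sweep) while monitoring the returned energy implements the sweeping described above.

©: This site was created by Piet Dargel and Thomas Köhler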