====== Variational Groundstate Search ======

Here we introduce the variational groundstate search for matrix product states. The goal is to find the MPS representation of the groundstate of a one-dimensional lattice with a given Hamiltonian $\hat H$. Furthermore, we limit the dimension of the matrices in the MPS to a maximal value $m$; we therefore look for the best MPS representation of the groundstate for a given matrix dimension $m$.

The main idea is to use the variational principle: we want to find the MPS $\lvert\psi\rangle$ that minimizes

\begin{align}
E = \min_{\psi}\frac{\langle\psi\lvert\hat H\lvert\psi\rangle}{\langle\psi\lvert\psi\rangle}
\end{align}

This problem can be expressed using the Lagrange formalism: we minimize $\langle\psi\lvert\hat H\lvert\psi\rangle$ under the constraint that the norm $\langle\psi\lvert\psi\rangle$ is constant. Introducing the groundstate energy $E$ as the Lagrange multiplier, we get the Lagrange function

\begin{align}
L=\langle\psi\lvert\hat H\lvert\psi\rangle - E(\langle\psi\lvert\psi\rangle-1)
\end{align}

The usual way to continue is to differentiate the Lagrange function with respect to all $p$ parameters of the trial wavefunction and set the derivatives to zero. This leads to $p+1$ coupled equations that have to be solved self-consistently. An MPS consists of $d\cdot n$ matrices with $m^2$ coefficients each, which makes $p=d\cdot n\cdot m^2$ parameters in total; the global optimization therefore only works for very small dimensions $m$.

Therefore we follow the route of the DMRG: we optimize only one matrix of the MPS at a time and sweep through the system, optimizing the matrices one after another.
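The variational principle used above guarantees that the Rayleigh quotient of any trial state is an upper bound on the groundstate energy, and that the bound is attained by the groundstate itself. As a minimal numerical illustration (the small random Hermitian matrix standing in for $\hat H$, and all names, are our own example, not part of the method):

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((6, 6))
H = (A + A.T) / 2                    # a small Hermitian "Hamiltonian"

def rayleigh(v):
    """E[v] = <v|H|v> / <v|v>, the quantity minimized by the groundstate."""
    return (v @ H @ v) / (v @ v)

evals, evecs = np.linalg.eigh(H)
trial = rng.standard_normal(6)       # an arbitrary, unoptimized trial state

# Any trial state bounds the groundstate energy evals[0] from above;
# the exact groundstate evecs[:, 0] attains the minimum.
upper_bound = rayleigh(trial)
E0 = rayleigh(evecs[:, 0])
```

The DMRG strategy below minimizes the same quotient, but only over the coefficients of one MPS matrix at a time.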
===== DMRG Optimization =====

As stated before, the idea is not to optimize all matrices at the same time, but to optimize only the matrix at site $l$ of the MPS, then shift to the next matrix and continue.

==== Local representations ====

In order to use the local DMRG optimization, we have to represent the expectation value $\langle\psi\lvert\hat H\lvert\psi\rangle$ in a compact form with respect to the matrix at site $l$.

\begin{align}
\langle\psi\lvert\hat H\lvert\psi\rangle = & \sum_{\boldsymbol{\sigma, \sigma^\prime}}\langle\boldsymbol\sigma\lvert \underbrace{B^{\sigma_L*}\cdots B^{\sigma_{l+1}*}M^{\sigma_l*}A^{\sigma_{l-1}*}\cdots A^{\sigma_1*}}_{B^{\sigma_L*}\cdots A^{\sigma_1*}} \hat H \underbrace{A^{\sigma_1^{\prime}}\cdots A^{\sigma_{l-1}^{\prime}}M^{\sigma_l^{\prime}}B^{\sigma_{l+1}^{\prime}}\cdots B^{\sigma_L^{\prime}}}_{A^{\sigma_1^{\prime}}\cdots B^{\sigma_L^{\prime}}}\lvert\boldsymbol{\sigma^\prime}\rangle\newline
= & \sum_{\boldsymbol{\sigma, \sigma^\prime}}\sum_{\boldsymbol{\tilde\sigma, \tilde\sigma^\prime}}\langle\boldsymbol\sigma\lvert B^{\sigma_L*}\cdots A^{\sigma_1*} \lvert\boldsymbol{\tilde\sigma}\rangle W^{\tilde\sigma_1,\tilde\sigma_1^\prime}\cdots W^{\tilde\sigma_L,\tilde\sigma_L^\prime}\langle\boldsymbol{\tilde\sigma^\prime}\lvert A^{\sigma_1^{\prime}}\cdots B^{\sigma_L^{\prime}}\lvert\boldsymbol{\sigma^\prime}\rangle\newline
= & \sum_{\boldsymbol{\sigma, \sigma^\prime}}\sum_{\boldsymbol{\tilde\sigma, \tilde\sigma^\prime}}\underbrace{\langle\boldsymbol\sigma\lvert\boldsymbol{\tilde\sigma}\rangle}_{\delta_{\sigma, \tilde\sigma}} \sum_{a_i, a_i^\prime, b_i} B^{\sigma_L*}_{a_{L-1}, 1}\cdots A^{\sigma_1*}_{1, a_1} W^{\tilde\sigma_1,\tilde\sigma_1^\prime}_{1, b_1}\cdots W^{\tilde\sigma_L,\tilde\sigma_L^\prime}_{b_{L-1}, 1} A^{\sigma_1^{\prime}}_{1, a_1^\prime}\cdots B^{\sigma_L^{\prime}}_{a_{L-1}^\prime, 1}\underbrace{\langle\boldsymbol{\tilde\sigma^\prime}\lvert\boldsymbol{\sigma^\prime}\rangle}_{\delta_{\sigma^\prime, \tilde\sigma^\prime}}\newline
= & \sum_{a_{l-1},a_l}\sum_{a_{l-1}^{\prime}, a_{l}^\prime}\sum_{b_{l-1}, b_l}L^{a_{l-1}}_{a_{l-1}^{\prime}, b_{l-1}} \left(\sum_{\sigma_{l}, \sigma^\prime_{l}}M^{\sigma_{l}*}_{a_{l-1}, a_{l}}W^{\sigma_{l}, \sigma_{l}^\prime}_{b_{l-1}, b_{l}}M^{\sigma^{\prime}_{l}}_{a_{l-1}^\prime, a_{l}^\prime}\right)R^{a_{l}}_{a_{l}^{\prime}, b_{l}}
\end{align}

Here, $L^{a_{l-1}}_{a_{l-1}^{\prime}, b_{l-1}}$ represents the contracted block to the left of the matrix at site $l$ and $R^{a_{l}}_{a_{l}^{\prime}, b_{l}}$ the block to its right. They are explicitly given by:

\begin{align}
L^{a_{l-1}}_{a_{l-1}^{\prime}, b_{l-1}}=\sum_{\{a_i, a_i^\prime, b_i;\ i<l-1\}}\left(\sum_{\sigma_1, \sigma^\prime_1}A^{\sigma_1*}_{1, a_1}W^{\sigma_1, \sigma_1^\prime}_{1, b_1}A^{\sigma^{\prime}_1}_{1, a_1^\prime}\right)\cdots\left(\sum_{\sigma_{l-1}, \sigma^\prime_{l-1}}A^{\sigma_{l-1}*}_{a_{l-2}, a_{l-1}}W^{\sigma_{l-1}, \sigma_{l-1}^\prime}_{b_{l-2}, b_{l-1}}A^{\sigma^{\prime}_{l-1}}_{a_{l-2}^\prime, a_{l-1}^\prime}\right)
\end{align}

\begin{align}
R^{a_{l}}_{a_{l}^{\prime}, b_{l}}=\sum_{\{a_i, a_i^\prime, b_i;\ i>l\}}\left(\sum_{\sigma_{l+1}, \sigma^\prime_{l+1}}B^{\sigma_{l+1}*}_{a_l, a_{l+1}}W^{\sigma_{l+1}, \sigma_{l+1}^\prime}_{b_l, b_{l+1}}B^{\sigma^{\prime}_{l+1}}_{a_l^{\prime}, a_{l+1}^\prime}\right)\cdots\left(\sum_{\sigma_{L}, \sigma^\prime_{L}}B^{\sigma_{L}*}_{a_{L-1}, 1}W^{\sigma_{L}, \sigma_{L}^\prime}_{b_{L-1}, 1}B^{\sigma^{\prime}_{L}}_{a_{L-1}^\prime, 1}\right)
\end{align}

The scalar product $\langle\psi\lvert\psi\rangle$ can also be given in local form:

\begin{align}
\langle\psi\lvert\psi\rangle = & \sum_{\boldsymbol{\sigma}} B^{\sigma_L\dagger}\cdots B^{\sigma_{l+1}\dagger}M^{\sigma_l\dagger}\underbrace{A^{\sigma_{l-1}\dagger}\cdots A^{\sigma_1\dagger}A^{\sigma_1}\cdots A^{\sigma_{l-1}}}_{1}M^{\sigma_l}B^{\sigma_{l+1}}\cdots B^{\sigma_L}\newline
= & \sum_{\sigma_{l}, \ldots, \sigma_L}\text{Tr}\left(M^{\sigma_l\dagger}M^{\sigma_l}\underbrace{B^{\sigma_{l+1}}\cdots B^{\sigma_L}B^{\sigma_L\dagger}\cdots B^{\sigma_{l+1}\dagger}}_{1}\right)\newline
= & \sum_{\sigma_l}\text{Tr}\left(M^{\sigma_l\dagger}M^{\sigma_l}\right)
\end{align}

==== Effective calculation of L and R ====

After the optimization of a single matrix we want to continue with the next one. The new $L$ and $R$ do not have to be rebuilt from scratch; they can be updated from the previous ones by:

\begin{align}
L^{a_{l-1}}_{a_{l-1}^{\prime}, b_{l-1}}= & \sum_{a_{l-2}, a_{l-2}^\prime, b_{l-2}}L^{a_{l-2}}_{a_{l-2}^{\prime}, b_{l-2}}\left(\sum_{\sigma_{l-1}, \sigma^\prime_{l-1}}A^{\sigma_{l-1}*}_{a_{l-2}, a_{l-1}}W^{\sigma_{l-1}, \sigma_{l-1}^\prime}_{b_{l-2}, b_{l-1}}A^{\sigma^{\prime}_{l-1}}_{a_{l-2}^\prime, a_{l-1}^\prime}\right)\newline
= & \sum_{\sigma_{l-1}, \sigma^\prime_{l-1}} \sum_{a_{l-2}, a_{l-2}^\prime}A^{\sigma_{l-1}*}_{a_{l-2}, a_{l-1}}\left(\sum_{b_{l-2}}L^{a_{l-2}}_{a_{l-2}^{\prime}, b_{l-2}}W^{\sigma_{l-1}, \sigma_{l-1}^\prime}_{b_{l-2}, b_{l-1}}\right)A^{\sigma^{\prime}_{l-1}}_{a_{l-2}^\prime, a_{l-1}^\prime}\newline
= & \sum_{\sigma_{l-1}, \sigma^\prime_{l-1}} \sum_{a_{l-2}, a_{l-2}^\prime}A^{\sigma_{l-1}*}_{a_{l-2}, a_{l-1}}\left(L^{a_{l-2}}W^{\sigma_{l-1}, \sigma_{l-1}^\prime}\right)_{a_{l-2}^{\prime}, b_{l-1}}A^{\sigma^{\prime}_{l-1}}_{a_{l-2}^\prime, a_{l-1}^\prime}\newline
= & \sum_{\sigma_{l-1}, \sigma^\prime_{l-1}} \sum_{a_{l-2}}A^{\sigma_{l-1}*}_{a_{l-2}, a_{l-1}}\left(\sum_{a_{l-2}^\prime}\left(\left(L^{a_{l-2}}W^{\sigma_{l-1}, \sigma_{l-1}^\prime}\right)^t\right)_{b_{l-1}, a_{l-2}^{\prime}}A^{\sigma^{\prime}_{l-1}}_{a_{l-2}^\prime, a_{l-1}^\prime}\right)\newline
= & \sum_{\sigma_{l-1}, \sigma^\prime_{l-1}} \sum_{a_{l-2}}A^{\sigma_{l-1}*}_{a_{l-2}, a_{l-1}}\left(\left(L^{a_{l-2}}W^{\sigma_{l-1}, \sigma_{l-1}^\prime}\right)^tA^{\sigma^{\prime}_{l-1}}\right)_{b_{l-1}, a_{l-1}^\prime}
\end{align}

Please note that $L^{a_{0}}$ is a scalar with value 1. There is an analogous formula for $R^{a_l}_{a_l^\prime, b_l}$:

\begin{align}
R^{a_{l}}_{a_{l}^{\prime}, b_{l}}= \sum_{\sigma_{l+1},\sigma^\prime_{l+1}}\sum_{a_{l+1}}B^{\sigma_{l+1}*}_{a_{l}, a_{l+1}}\left(W^{\sigma_{l+1}, \sigma_{l+1}^\prime}\left(B^{\sigma^{\prime}_{l+1}}R^{a_{l+1}}\right)^t\right)_{b_{l}, a_{l}^\prime}
\end{align}

==== Optimization of a single matrix ====

We have shown how to set up local forms for the expectation value and the scalar product with respect to a single matrix at site $l$. Now we apply the variational principle. We therefore have to differentiate the Lagrange function
\begin{align}
L=\langle\psi\lvert\hat H\lvert\psi\rangle - E(\langle\psi\lvert\psi\rangle-1)
\end{align}

with respect to all coefficients $M^{\sigma_l}_{a_{l-1}, a_l}$ and set these derivatives to zero.
By this we get (denoting the Lagrange multiplier by $\lambda$ and differentiating with respect to the conjugate coefficients):
\begin{align}
0\stackrel{!}{=} & \frac{\partial L}{\partial M^{\sigma_l*}_{a_{l-1}, a_l}} = \frac{\partial}{\partial M^{\sigma_l*}_{a_{l-1}, a_l}}\left(\langle\psi\lvert\hat H\lvert\psi\rangle\right)-\lambda\frac{\partial}{\partial M^{\sigma_l*}_{a_{l-1}, a_l}}\left(\langle\psi\lvert\psi\rangle\right)\newline
= & \sum_{\sigma_l^\prime}\sum_{a_{l-1}^\prime, a_l^\prime}\sum_{b_{l-1}, b_l} L_{a_{l-1}^\prime, b_{l-1}}^{a_{l-1}} W_{b_{l-1}, b_l}^{\sigma_l, \sigma_l^\prime} R_{a_l^\prime, b_l}^{a_l}M_{a_{l-1}^\prime, a_l^\prime}^{\sigma_l^\prime} - \lambda M^{\sigma_l}_{a_{l-1}, a_l}
\end{align}

Now we introduce $H_{\text{Eff}}$ by
\begin{align}
H_{\text{Eff},(\sigma_l a_{l-1} a_l),(\sigma_l^{\prime} a_{l-1}^{\prime} a_l^{\prime})}=\sum_{b_{l-1}, b_l}L_{a_{l-1}^\prime, b_{l-1}}^{a_{l-1}} W_{b_{l-1}, b_l}^{\sigma_l, \sigma_l^\prime} R_{a_l^\prime, b_l}^{a_l}
\end{align}
and $\nu_{\sigma_l a_{l-1} a_l}=M_{a_{l-1}, a_l}^{\sigma_l}$. This yields a local eigenvalue problem:
\begin{align}
H_{\text{Eff}}\nu - \lambda\nu = 0
\end{align}
This eigenvalue problem can be solved exactly (or iteratively, e.g. with the Lanczos algorithm); the smallest eigenvalue gives the current estimate of the overall groundstate energy, and the corresponding eigenvector $\nu_0$, reshaped, gives the new optimized matrix $M^{\sigma_l}$.

===== Iterative Optimization =====

In order to iteratively optimize a given starting state $\lvert \psi \rangle$ towards the groundstate, we proceed as follows:

* **Initialization**: At first we fill the matrices of the starting state with random values. Then we right-normalize the state and thereby create all the $R$-matrices. Next we create the Hamiltonian MPO.
The sweep will start at site $l=1$, where the MPS has the form:
\begin{equation*}
M^{\sigma_1}B^{\sigma_{2}}\cdots B^{\sigma_L}
\end{equation*}
* **Right sweep step**: First we set up $H_{\text{Eff}}$ at site $l$ and solve the related local eigenvalue problem. Then we replace the matrix $M^{\sigma_l}$ by the (reshaped) eigenvector of the smallest eigenvalue (1). In order to extend the left-normalized block we apply an SVD to the new matrix $\tilde M^{\sigma_l}$ and keep $U$ as the new $A^{\sigma_l}$, while $S V^{\dagger}$ is multiplied into the matrix $B^{\sigma_{l+1}}$, which shrinks the right block (2):
\begin{align*}
A^{\sigma_1}\cdots A^{\sigma_{l-1}} M^{\sigma_{l}} B^{\sigma_{l+1}}\cdots B^{\sigma_{L}}
&\xrightarrow{(1)}A^{\sigma_1}\cdots A^{\sigma_{l-1}} \tilde{M}^{\sigma_{l}} B^{\sigma_{l+1}}\cdots B^{\sigma_{L}}\newline
&\xrightarrow{(2)}A^{\sigma_1}\cdots A^{\sigma_{l}} M^{\sigma_{l+1}} B^{\sigma_{l+2}}\cdots B^{\sigma_{L}}
\end{align*}
From the singular values (the diagonal elements of $S$) we can set up the reduced density matrix spectrum $\rho=S^2$ and calculate the von Neumann entanglement entropy $S_{\text{vN}}(\lvert\psi\rangle)=-\text{Tr}\,\rho \log_2 \rho$. Then we proceed with the next site ($l\rightarrow l+1$).
* **Complete right sweep**: Repeating the right sweep step until we reach $l=L$ gives one complete right sweep. The MPS is then left-normalized:
\begin{equation*}
A^{\sigma_{1}}\cdots A^{\sigma_{L-1}}M^{\sigma_L}
\end{equation*}
* **Complete left sweep**: Analogously we can also sweep to the left side, starting at site $l=L$ and ending at site $l=1$.
* **Sweeping**: Repeating complete right and left sweeps is called sweeping. We stop the sweeping once a certain criterion is fulfilled (e.g. convergence of the energy, or a maximal number of sweeps).

©: This site was created by Piet Dargel and Thomas Köhler
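The iterative optimization above can be sketched compactly in code. The following is a minimal single-site DMRG sketch in Python (numpy/scipy) under stated assumptions: the Hamiltonian is taken to be the spin-1/2 Heisenberg chain (the page does not fix a model), the index conventions (environments stored as `env[a, b, a']`, site tensors as `M[sigma, a_{l-1}, a_l]`) and all function names are our own illustration, and truncation is trivial here because the test system is small enough for the bond dimension to be exact.

```python
import numpy as np
from scipy.sparse.linalg import LinearOperator, eigsh

def heisenberg_mpo(n):
    """MPO tensors W[b_{l-1}, b_l, sigma_l, sigma_l'] for the S=1/2 Heisenberg chain (J=1)."""
    Sp = np.array([[0., 1.], [0., 0.]]); Sm = Sp.T
    Sz = np.diag([0.5, -0.5]); Id = np.eye(2)
    W = np.zeros((5, 5, 2, 2))
    W[0, 0] = Id; W[1, 0] = Sp; W[2, 0] = Sm; W[3, 0] = Sz
    W[4, 1] = 0.5 * Sm; W[4, 2] = 0.5 * Sp; W[4, 3] = Sz; W[4, 4] = Id
    return [W[4:5] if l == 0 else (W[:, 0:1] if l == n - 1 else W) for l in range(n)]

def right_normalize(mps):
    """Bring M[l] (shape (d, a_{l-1}, a_l)) into right-canonical B-form via QR."""
    for l in range(len(mps) - 1, 0, -1):
        d, a, b = mps[l].shape
        mat = mps[l].transpose(1, 0, 2).reshape(a, d * b)
        q, r = np.linalg.qr(mat.conj().T)          # mat = r^dagger q^dagger
        mps[l] = q.conj().T.reshape(-1, d, b).transpose(1, 0, 2)
        mps[l - 1] = np.einsum('sab,bk->sak', mps[l - 1], r.conj().T)

def update_L(Lenv, A, W):
    """Absorb one site into the left environment L[a, b, a'] (bra, MPO, ket)."""
    return np.einsum('abx,sac,bdst,txy->cdy', Lenv, A.conj(), W, A)

def update_R(Renv, B, W):
    """Absorb one site into the right environment R[a, b, a']."""
    return np.einsum('sac,bdst,cdy,txy->abx', B.conj(), W, Renv, B)

def solve_site(Lenv, W, Renv, v0):
    """Smallest eigenpair of H_Eff, applied matrix-free via einsum."""
    d, mL, mR = W.shape[2], Lenv.shape[2], Renv.shape[2]
    size = d * mL * mR
    def mv(x):
        v = x.reshape(d, mL, mR)
        return np.einsum('abx,bcst,dcy,txy->sad', Lenv, W, Renv, v).ravel()
    if size < 5:  # eigsh needs k < size; fall back to dense diagonalization
        H = np.einsum('abx,bcst,dcy->sadtxy', Lenv, W, Renv).reshape(size, size)
        e, vec = np.linalg.eigh(H)
    else:
        e, vec = eigsh(LinearOperator((size, size), matvec=mv),
                       k=1, which='SA', v0=v0.ravel())
    return e[0], vec[:, 0].reshape(d, mL, mR)

def dmrg(mpo, mps, sweeps=10):
    n = len(mps)
    right_normalize(mps)
    Renvs = [None] * (n + 1); Renvs[n] = np.ones((1, 1, 1))
    for l in range(n - 1, 0, -1):                   # build all R from the B-tensors
        Renvs[l] = update_R(Renvs[l + 1], mps[l], mpo[l])
    Lenvs = [None] * n; Lenvs[0] = np.ones((1, 1, 1))
    E = SvN = 0.0
    for _ in range(sweeps):
        for l in range(n - 1):                      # right sweep
            E, M = solve_site(Lenvs[l], mpo[l], Renvs[l + 1], mps[l])
            d, a, b = M.shape
            u, s, vh = np.linalg.svd(M.reshape(d * a, b), full_matrices=False)
            mps[l] = u.reshape(d, a, -1)            # new A-tensor
            mps[l + 1] = np.einsum('kb,sbc->skc', np.diag(s) @ vh, mps[l + 1])
            Lenvs[l + 1] = update_L(Lenvs[l], mps[l], mpo[l])
            if l == n // 2 - 1:                     # entropy at the middle bond
                p = s**2 / np.sum(s**2)
                SvN = -np.sum(p * np.log2(p + 1e-30))
        for l in range(n - 1, 0, -1):               # left sweep
            E, M = solve_site(Lenvs[l], mpo[l], Renvs[l + 1], mps[l])
            d, a, b = M.shape
            u, s, vh = np.linalg.svd(M.transpose(1, 0, 2).reshape(a, d * b),
                                     full_matrices=False)
            mps[l] = vh.reshape(-1, d, b).transpose(1, 0, 2)   # new B-tensor
            mps[l - 1] = np.einsum('sca,ak->sck', mps[l - 1], u @ np.diag(s))
            Renvs[l] = update_R(Renvs[l + 1], mps[l], mpo[l])
    return E, SvN

# Usage: for n=4 sites the bond dimension m=8 is exact, so the DMRG energy
# should agree with exact diagonalization.
np.random.seed(0)
n, d, m = 4, 2, 8
dims = [min(m, d**min(i, n - i)) for i in range(n + 1)]
mps = [np.random.randn(d, dims[i], dims[i + 1]) for i in range(n)]
E0, SvN = dmrg(heisenberg_mpo(n), mps)
```

The matrix-free `LinearOperator` mirrors the structure of $H_{\text{Eff}}$ above: the environment tensors are never combined into a dense matrix, which is what keeps the local eigenvalue problem cheap for large $m$.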