Treecode and Fast Multipole Methods

In the following we want to describe the structure of treecodes and Fast Multipole Methods. We omit technical details of the involved expansions. The aim is to give a high level overview without too much technical detail. ## Introduction Given a set of points $p_i$, $i=1,\dots,n$ inside the unit box $[0, 1]^2$. We want to compute all the interactions $ f(p_i) = \sum_{j = 1}^n c_j g(p_i, p_j) $ for each point $p_i$, where the $c_j$ are charges associated with the points $p_j$. Here, we use the function $g(x, y) = \frac{1}{2\pi}\log|x-y|$ for $x\neq y$ and $g(x, x) = 0$. Computing the interactions for all points $p_i$ is of complexity $O(N^2)$ if naively computed since we have to compute all pairwise interactions $g(p_i, p_j)$. In this description we will make use of [[Octrees and Quadtrees|quadtrees]]. See the corresponding description for definitions and notation. In particular, we require the notions of [[Octrees and Quadtrees#Near field and Interaction list| near fields and interaction lists]]. ## Basic idea Consider the figure below showing a number of points and a target point $p_i$ within one of the green target boxes. We split up the computation of $f(p_i)$ in the following way. $ \begin{align}f(p_i) &= \sum_{p_j \text{ in far field}} c_j g(p_i, p_j)+ \sum_{p_j\text { in interaction list}} c_jg(p_i, p_j)\nonumber\\ &+ \sum_{p_j\text { in near field}} c_jg(p_i, p_j)+ \sum_{p_j\text { in green target boxes}} c_jg(p_i, p_j). \end{align} $ ![[Structure of the Fast Multipole Method 2024-11-23 23.09.39.excalidraw.png]] We call the points in the green boxes the targets. The red boxes denote the near-field and the blue boxes are part of the interaction list. For precise definitions see [[Octrees and Quadtrees]]. We assume the following basic principles. - We never directly compute interactions of the targets with the far-field points. - The complexity of computing the field terms arising from the points $p_j$ in the interaction list is constant with respect to the number of points $p_j$ in the interaction list and only depends on the desired accuracy $\epsilon$. The first assumption is intuitive. The second one implies that we can replace the field generated by the points in the interaction list with some kind of equivalent field that is easy to evaluate and approximates the original field up to an accuracy $\epsilon$ at the target points. For now we focus on the first point and describe how to compute all the interactions hierarchically by only ever working with at most points that are as far away as the interaction list. ## Domain partitioning We are recursively partitioning the domain into a quadtree by splitting up each box into four descendents. Level 0 is the original domain without split. On Level 1 we have four boxes. We denote a Box $i$ on level $j$ by $B_i^{(j)}$. Note that we have $4^{j}$ boxes on level $j$. Each box on level 1 only has direct (near field) neighbours. Things get more interesting on Level 2. Consider the green box. Its near field neighbours are shown red. The interaction list is shown blue. It consists of all children of the neighbours of the parent box of $B_i^{(j)}$ that are not in the near field of $B_i^{(j)}$. Consider the green box in level 2 of the figure. Its parent is the green box on level 1. To get the blue boxes we take the children of the red boxes on level 1. All children that are not direct neighbours of $B_i^{(j)}$ are part of the interaction list. If we go further down to level 3 the interaction list of the green box is larger. It has $27$ boxes. This is the largest size an interaction list of a single box can have, no matter what level we are. The white boxes on level 3 are in the far field. We do not consider these at all on this level when computing the field for the green box. ![[Structure of the Fast Multipole Method 2024-11-24 10.42.32.excalidraw.png]] ## A simple treecode top-down algorithm We compute all the interactions with the following top-down pass. We start on level 0. We can't do much here. So we recurse down to level 1. Again, we only have near-field neighbours and can't do much either. So we recurse down to level 2. We can now sum up the contributions from the blue boxes and obtain for each target point $p_t\in B_i^{(2)}$ the value $ \hat{f}^{(2)}(p_t) = \sum_{j \text{ in interaction list}} c_j g(p_t, p_j). $ We do this computation for all boxes on level 2 and their respective interaction lists. We now recurse down to level 3 and consider the green box there. Again, we sum up the contributions from the interaction list. But we also need the previously computed values $\hat{f}^{(2)}(p_t)$ for each target point $p_t$ in the green box from the previous level. We hence obtain $ \hat{f}^{(3)}(p_t) = \hat{f}^{(2)}(p_t) + \sum_{j \text{ in interaction list}} c_j g(p_t, p_j). $ The value $\hat{f}^{(2)}$ contains all the contributions from the boxes who are now in the far-field of the green box. We can just continue further down until we are at the deepest level $\mathcal{L}^*$ of the tree. At the deepest level we need to add the contributions from the near-field and the self interactions of the points in the green box. We obtain the final value $ f(p_t) = \hat{f}^{(\mathcal{L}^*-1)}(p_t) + \sum_{p_j \text{ in interaction list}} c_j g(p_t, p_j) + \sum_{p_j \text{ in near field }} c_j g(p_t, p_j) + \sum_{p_j \text{ in green box}} c_j g(p_t, p_j). $ This is called a treecode algorithm and goes back to original ideas by [[#^238aa5| Barnes and Hut]]. ## Complexity of the treecode algorithm The purpose of this tree-based reordering of the operations in the FMM is revealed through the second assumption, namely that the interactions with boxes in the interaction list can be computed up to an accuracy $\epsilon$ with a complexity that is independent of the number of source points in the interaction list. We are not yet concerned with how this is possible. We just assume that it is. We furthermore assume that the number of levels in the tree grows logarithmically with the overall number $n$ of particles so that at the deepest level we always have at most $M_{max}$ points in each box. This requires that $\mathcal{L}^* \sim \log n$. On each level we have to compute the interactions of each of the $n$ points with their respective interaction list. But under the assumption that this computation has constant complexity with respect to $n$ we obtain an overall complexity of $\mathcal{O}(n\log n)$ since the number of levels behaves like $\log n$ and on each level we have $n$ points for which to compute the influence from the interaction list. On the deepest level we have to include for each point the evaluation of the near field contribution. But this is a multiple of $M_{max}$ and independent of $n$. Hence, the complexity of this additional step is $\mathcal{O}(n)$, leading to the following overall result. **The complexity of the treecode algorithm is $\mathcal{O}(n\log n)$. ## A simple approximate summation for the interaction lists For the Laplace kernel $g(x, y) = \frac{1}{2\pi}\log|x-y|$ it is possible to derive a simple summation procedure that allows the computation of the interaction list contributions in constant complexity. Consider a disk of radius $R$ and complex points $z_i$ within the disk. For an evaluation point $z$ in the complex plane with $|z|> R$ the following approximation holds. $ \phi(z):=\sum c_i \log(z-z_i) = \log z (\sum c_i) - \frac{\sum c_i z_i}{z} + \mathcal{O}\left(\frac{R}{z}\right)^2. $ Note that this is the complex logarithm. The desired real logarithm of the absolute distances of the points is obtained by taking real parts since $ \log|z-z_i| = \text{Re}(\log(z-z_i)). $ The first term $\sum c_i$ in the expansion is the net charge. The factor $\sum c_i z_i$ in the second term is called the dipole moment (see also [[#^a9a137| Greengard, 1990]]) For each level and each box we now sum up the net charges and suitably shifted dipole moments centred in the boxes. This costs $\mathcal{O}(n)$ for each level since each point on each level contributes to a single box. Once this is computed we use the above expansion to approximately evaluate the influence of each box in the interaction list on the target points. This is of constant complexity for each box in the near field. ## From Treecode to the Fast Multipole Method (FMM) The FMM goes back to work by [[#^fefc3a| Greengard and Rohklin]]| and is similar to treecode methods but uses additional approximations to reduce the overall complexity from $\mathcal{O}(n \log n)$ to $\mathcal{O}(n)$. A good comparison of treecode and FMM is also given by [[#^a9a137| Greengard, 1990]]. The reason that treecode has a complexity of $\mathcal{O}(n\log n)$ is that in each level we have to do operations for all $n$ particles. Since there are of the order $\log n$ levels we obtain the above complexity result. But what happens if instead of the particles the complexity would only depend on the number of boxes in each level? On the leave level we have $\mathcal{O}(n)$ boxes each with at most $M_{max}$ particles. On the next higher level we have on the order of $\frac{n}{4}$ boxes and so on. So if the number of operations on each level is proportional to the number of boxes and not the number of particles in the level, the total work $W$ will scale like $ W\leq C(n + \frac{n}{4} + \frac{n}{4^2} + \frac{n}{4^3} + \dots )\leq \frac{4C}{3}n, $ where $C$ is a proportionality constant for the work in each box that depends not on $n$. Hence, the overall complexity reduced from $\mathcal{O}(n\log n)$ to $\mathcal{O}(n)$. So how do we achieve this? The key idea is that we not only have an approximation of the outgoing (or exterior) field in each box but also an approximation of the incoming field. ![[Structure of the Fast Multipole Method 2024-11-24 15.50.53.excalidraw.png|100%]] Instead of simply evaluating the field from the interaction list at each target point we will approximate the outgoing field from the boxes in the interaction list through local field expansion in the target box. For the local field for a box $B_i^{(j)}$ we introduce a representation of the form $ \hat{f}_{loc}^{(B_i^{(j)})}(x) = \sum_{\ell}^p \hat{c}_{\ell}h_\ell(x),~x\in B_i^{(j)} $ The number of local terms $p$ depends on the desired accuracy but crucially not on the number of particles in the box. Furthermore, in generalisation of the treecode approximation with net charges and dipole moments we introduce an outgoing approximation for each box of the form $ \hat{f}_{out}^{(B_i^{(j)})}(x) = \sum_{\ell}^p \hat{q}_{\ell}g_\ell(x),~x\in B_i^{(j)}. $ This outgoing approximation is valid for $x$ in the exterior of a disk of radius $R$ around the box $B_i^{(j)}$. Traditionally, the outgoing approximation is called a multipole expansion. The FMM now proceeds in the following passes - Bottom-Up Pass (from the leaf up to level 2): 1) Compute for each leaf box the outgoing approximation $\hat{f}_{out}^{(B_i^{(\mathcal{L}^*)})}(x)$ as approximation of the field generated through the particles $p_i$ in this box. This is called the *Particle-To-Multipole* operation (P2M) and costs $\mathcal{O}(n)$ since each particle needs to be incorporated into the outgoing field of the associated leaf box. 2) For each box approximate the outgoing field for the parent box via summing up the outgoing fields of the child boxes. This is called the *Multipole-To-Multipole* (M2M) operation. The details depend on the type of expansion. But the cost is the same as the number of boxes on each level. Hence, as before since the number of boxes is geometrically decreasing, the overall work is $\mathcal{O}(n)$. * Top-Down Pass (from level 2 up to the leaf) 1) Sum the outgoing field contributions from the interaction list into the local field contributions for each box. This is the Multipole-2-Local operation (M2L). It costs a total of $\mathcal{O}(n)$ as it only depends on the number of boxes in each level and the work per box is bounded by the size of the interaction list. 2) Sum down to the child level the parent contributions of the local field into the local field of the children. This again costs $\mathcal{O}(n)$. 3) Finally, on the leave level evaluate the local fields of each box at the associate particle positions in the Local-2-Multipole (L2M) operation and sum the contributions of particles in the near-field directly (P2P), again a $\mathcal{O}(n)$ operation. We see that the FMM is similar to treecode methods with the two differences being - Instead of directly summing up contributions from the interaction list into the particles they are summed onto local field expansions in each box. - Multipole field approximations and local field approximations are built up bottom-up/top-down to achieve linear complexity for this step. ## References - Barnes, J. & Hut, P. A hierarchical O(N log N) force-calculation algorithm. _Nature_ **324**, 446–449 (1986). ^238aa5 - Greengard, L. The Numerical Solution of the N -Body Problem. _Comput. Phys._ **4**, 142–152 (1990). ^a9a137 - Greengard, L. & Rokhlin, V. A fast algorithm for particle simulations. _J. Comput. Phys._ **73**, 325–348 (1987) ^fefc3a