Crystallographic Lattice Boltzmann Method

Current approaches to Direct Numerical Simulation (DNS) are computationally quite expensive for most realistic scientific and engineering applications of Fluid Dynamics such as automobiles or atmospheric flows. The Lattice Boltzmann Method (LBM), with its simplified kinetic descriptions, has emerged as an important tool for simulating hydrodynamics. In a heterogeneous computing environment, it is often preferred due to its flexibility and better parallel scaling. However, direct simulation of realistic applications, without the use of turbulence models, remains a distant dream even with highly efficient methods such as LBM. In LBM, a fictitious lattice with suitable isotropy in the velocity space is considered to recover Navier-Stokes hydrodynamics in macroscopic limit. The same lattice is mapped onto a cartesian grid for spatial discretization of the kinetic equation. In this paper, we present an inverted argument of the LBM, by making spatial discretization as the central theme. We argue that the optimal spatial discretization for LBM is a Body Centered Cubic (BCC) arrangement of grid points. We illustrate an order-of-magnitude gain in efficiency for LBM and thus a significant progress towards feasibility of DNS for realistic flows.

Thus improving the accuracy of LBM, while keeping its parallel efficiency and large time-stepping intact, is an important challenge. The only working methodology for improving the accuracy of LBM is to refine the grid near the solid body or in zones of extreme flow variations 25,26 . A fundamental problem with such approaches is that the local accuracy of the method remains unchanged and optimization is done only with respect to the global distribution of grid points. For example, for decaying turbulence in a periodic geometry, local grid refinements cannot improve the accuracy of LBM.
Before describing our new method, we briefly summarize basic LBM algorithm. In any M-dimensional model (with M being 1, 2 or 3) 20,27 , typically denoted by DMQN, one defines a set of N discrete velocities c i , =  i N 1, 2, . To each discrete velocity a weight w i > 0 is assigned such that ∑ = w 1 i i . The central quantity of interest in the method is the discrete population f i (x, t), at the location x and time t. In D-dimensions, macroscopic quantities such as mass density ρ, momentum density ρu and energy density E ≡ (ρu 2 + Dρθ)/2, with reduced temperature θ are, 0 2 These constraints on weights ensure that For such models, the higher order moments are ρ ρ θ δ ρ θ = + = .

Computational Method: Crystallographic Lattice Boltzmann Model
In this section, we introduce our new grid structure which drastically improves the performance of LBM without compromising its on-lattice streaming and other computational efficiencies. We begin with the observation that the introduction of the stair-case geometry, as done in LBM, is similar to the generation of a Wigner-Seitz cell for a given lattice structure 29 . This is pictorially shown in Fig. 1. With this point of view, it becomes apparent that,

Shells
Discrete Velocities (c i ) Weight (w i ) SC (± 1, 0, 0), (0, ± 1, 0), (0, 0, ± 1) 2 27 FCC (± 1, ± 1, 0), (0, ± 1, ± 1), (± 1, ± 1, 0), out of all possible space-filling arrangements, the Simple Cubic (SC) Lattice for space discretization used by LBM is not very efficient for representing local curvatures. Intuitively, one would expect an octahedron to be a better object in resolving local curvatures. Thus, a better choice for spatial grid distribution is a Body-Centered Cubic (BCC) arrangement of grid points for which the Wigner-Seitz cell is a truncated octahedron 30 . To illustrate this point, we consider a rhombic grid in 2D. Figures 2 and 3 show that the discrete approximation of a circle is more accurate on the BCC grid than on an SC grid: the total number of boundary points are almost double. For any given lattice, the links on the grid in Fig. 3 act as discrete velocities in LBM. It is also clear from Fig. 3 that, unlike previous attempts to design body-fitted grids, with this lattice, we achieve better space coverage and also a unification of the discretization of space and velocities.
Based on this insight, we invert the argument of LBM and make spatial discretization the central point and decide the velocity space based on this discretization. We generalize the D3Q27 model on a BCC lattice by changing velocity vectors of type {± 1, ± 1, ± 1} to {± 1/2, ± 1/2, ± 1/2}. Hereafter, this model will be referred as the RD3Q27 model. The set of discrete velocities and corresponding weights are listed in Table 2. For the RD3Q27 lattice, the reference temperature changes to θ 0 = 1/5 (see Eq. (6)). This procedure also removes some of the known artifacts associated with the velocity space discretization used in LBM. Finally, we demonstrate that these modifications improve the simulation efficiency by more than an order-of-magnitude. An alternate view point of the method is provided by entropic construction of LBM. In this formulation, one begins with a given H-function [31][32][33] and construct equilibrium as the minimum of the H-function under constraints. Typically, one constructs an isothermal equilibrium, where energy conservation is ignored. It can be shown that the true entropic equilibrium is well approximated by Eq. (5) at least till order O(Ma 2 ), where Ma is the Mach number. In dimensions higher than one, it is also possible to create the energy conserving equilibrium which we will use later. On the RD3Q27 lattice,   This ensures the conservation of density (ρ), momentum density (ρu α ) and energy density (ρu 2 + 3ρθ) during collision. The conserved quantities in terms of the equilibrium distribution function can be written as: At θ = θ 0 , the explicit expression for the equilibrium distribution function is: The corresponding equilibrium stress tensor and heat flux vector are given by, eq 0 eq 0 2 The better representation of extended hydrodynamics by present model can be seen by writing discrete moment chain. This moment chain is very similar to Grad's equation of the form 12,36,37 :

Shells
Discrete The closure relations for the higher order moments R αβγκ and N αβγ written in terms of other moments are given in Supplementary Appendix A. For the present model, Q xxx , Q yyy , Q zzz are independent variables and artificial closures as shown in the previous section (Eq. 3) do not exist. This happens because, in the diagonal direction |c ix | is 1/2 rather than 1. Thus one can expect that RD3Q27 should show enhanced accuracy for finite Knudsen flows as compared to D3Q27 38 . This set-up provides a good indication of the convergence of the discrete velocity model towards the Boltzmann equation. Finally, following standard route of discretization along characteristic directions using trapezoidal rule 20 , we can write fully discrete version of the kinetic Eq. 4 as where β is the discrete relaxation time. Notice this is similar to D3Q27 model, if we choose Δ t = Δ x, the advection happens from one lattice point to another and thus the method does not require any spatial interpolations. We recall that the BCC grid in 3D has an alternate interpretation in terms of replica of simple cubic grids displaced from each other by Δ x/2 in each direction where Δ x is the grid spacing.

Results
In this section, we present the results for standard canonical problems using the current RD3Q27 model. We first consider the planar Couette flow, a test problem for investigating the convergence of discrete velocity models towards the solution of the continuum Boltzmann-BGK equation 28 . In this set-up, the fluid is enclosed between two parallel plates separated by a distance L. The bottom plate at y = − L/2 moves with a velocity U 1 and top plate at y = L/2 moves with a velocity U 2 . For such boundary value problems relevant to micro-flows, one uses the discrete version of diffused wall approximation in LBM 12 . For this set-up, it is known that the D3Q27 model has poor accuracy at finite Knudsen numbers (defined as τ θ = L Kn 3 / 0 ); it works reasonably well only for Kn < 0.1. It can be seen in Fig. 4, the results of the RD3Q27 and D3Q27 models are similar at low Kn (< 0.1). Further, at high Kn, the shear stress ⁎ P ( ) xy predicted by RD3Q27 is in close agreement with the continuum Boltzmann-BGK solution.
The solution for the dimensionless shear stress obtained by solving the moment chain for the RD3Q27 model is, xy xy xy 6 15 where ∞ P xy is the shear stress at Kn → ∞ of the Boltzmann-BGK model and is given by It is evident that removing the artifact on the third moment has led to better agreement with the Boltzmann BGK values. Thus, unlike D3Q27 model, accuracy for the flows at finite Knudsen number has improved drastically. However unlike higher order LBM, kinetic boundary layer is not captured by the method.
To demonstrate the efficiency of this new framework for turbulence, we chose the setup of decaying turbulence starting with the Kida vortex initial condition 4,23 . In this flow, starting with a smooth initial condition, the enstrophy (Ω) and the maximum vorticity show very rapid growth. Thus it is not surprising that, for this set-up, the standard D3Q27 lattice Boltzmann model fails to make accurate predictions unless the number of grid points per direction is more than twice those for an equivalent spectral simulation.
In Fig. 5, the performance of our new BCC lattice based RD3Q27 velocity model is contrasted with the pseudo-spectral and the D3Q27 model for Re = 10,000. In the present model, Re is defined as, ρU 0 /ν, where in the PS method Re = 1/ν as U 0 = 1. It is evident from the results shown in Fig. 5 that our new method has drastically improved the performance of standard LBM. This gain can be understood in terms of the spectral representation of a bandwidth-limited function (i.e. whose Fourier coefficients are zero above a cut-off wave-number). Essentially, for a fully resolved three-dimensional simulation in a periodic geometry, a reasonable approximation is to assume that the functions to be modeled are isotropic and the resolution is bandwidth limited. This implies that, in Fourier space, there is no preferred direction and, hence, the efficient distribution of grid points is equivalent to the sphere-packing problem 30,39 . Thus the most efficient way to distribute points in the Fourier space is to arrange them as a FCC lattice. Consequently the best sampling lattice in real space is the BCC lattice, which is the dual of the FCC lattice 30,39 in the Fourier space. Thus the numerical gain in simulation accuracy is associated with the efficient choice of the lattice.
It must be noted that the parallel and serial efficiency of this new model is very similar to the standard D3Q27 model. Thus for the first time, LB simulations are able to surpass higher order models such as the spectral method. Here we emphasize that this new route to increase accuracy is not an alternative to building higher order LBM for better accuracy 23 . Indeed, by constructing higher order LB on BCC, the accuracy can be further improved.
Although RD3Q27 uses a grid of size 2N 3 as compared to standard case of N 3 , the performance gain is almost an order-of-magnitude. This happens because, for the same accuracy of simulation, the usual D3Q27 model requires a grid size of (MN) 3 , where M ranges from 2-2.5. This implies that the saving in the number of grid points is in the range of (MN) 3 /2(N) 3 , which is 4-7.8 times. Hence the saving in time is (MN) 4 /2(N) 4 i.e., at least 8 times (on a coarser grid, the time steps are larger).
To demonstrate this gain in the capability of LBM, let us consider the classical problem of flow past a sphere [40][41][42][43] . The simulations were performed for two different values of Re with Ma = 0.05 for D3Q27 and RD3Q27 models. The domain size, in terms of diameter D (measured in lattice unit), of 16D × 8D × 8D was used for Re = 50 and 20D × 10D × 10D for Re = 200. The simulations were performed on an Intel Xeon based system (CPU Model: E5-2670) with 16 cores. Figure 6 shows the percentage error (ε) in the mean value of it is evident that RD3Q27 performs an order-of-magnitude better than D3Q27. This gain in performance can be seen more conveniently in Fig. 7, where percentage error (ε) is plotted against CPU-time(t in seconds). In this plot, one can see that for the same error, present method requires an order-of-magnitude less time when compared to D3Q27 LBM.
As the result for Re = 200 suggests that the computing requirement for present model is substantially lower than LBM, we performed this simulation of flow past sphere for a wide range of Reynolds numbers. To the best of our knowledge, none of the existing methods can predict the drag behavior over the wide range of Reynolds Numbers for which experimental data is available. In Figs 8 and 9 the drag coefficient C D , predicted by present model is contrasted with the experimental result in Achenbach et al. 40 and Bakic et al. 44 with Re = UD/ν where D  Table 3. Figure 10 shows the variation of drag coefficient C D with time at Re = 3.18 × 10 5 . We have implemented a version of the diffuse bounce-back rule as described in Krithivasan et al. 45 , for the solid-fluid interface at the body. For inlet-outlet, we have used Grad's closure approximation as described in Chikatamarla et al. 46 . To highlight the reliability and robustness of our approach, the angular distribution of pressure coefficient 2 on the surface of the sphere for Re = 10 5 and Re = 1.62 × 10 5 are presented in Fig. 11 and these are compared with the experimental data. It is evident from the plot that the quality of C P prediction  remains very good even with a sphere size of D = 160 (lattice points). Even for this size of grid, the error in mean value of C D is only around 8%. Along with this paper, we have also provided a supplementary video 1 online where the temporal evolution of the azimuthal component of vorticity ω φ , C P and C D are shown. Figure 12 shows a snapshot of the azimuthal component of vorticity. This also shows that our model captures the flow separation clearly. We believe that the reason for such good accuracy without a well-resolved boundary layer is largely due to the fact that the drag for flow past sphere is largely dominated by form drag, which does not crucially depend on boundary-layer resolution. Thus, with moderate resolution requirement, our approach is able to predict quantities of engineering interest quite accurately. This improvement in LBM is remarkable as typical LBM simulations  Table 3. Comparison of C D values obtained using the present RD3Q27 model with experimental data 40 .  for high Reynolds Numbers for any setup require the use of turbulence models 6,47-49 but present RD3Q27 model does not require any explicit model for turbulence.

Outlook
A new formulation of LBM, based on the crystallographic viewpoint has been presented. It has been shown that LBM based on the BCC lattice spatial discretization has efficiency that is an order-of-magnitude higher than traditional LBM. The gain obtained by using the BCC lattice has been demonstrated by various representative simulations such as Kida flow and flow past a sphere. To conclude, this new formulation is capable of solving problems of practical interest without using explicit turbulence modeling. Thus, the current approach of formulating LBM on a BCC grid raises the prospect of direct simulations of turbulence without using explicit turbulence models. Finally, we emphasize that this new route to increase accuracy is not an alternate to building higher order LBM for better accuracy. Indeed by constructing higher order LBM on BCC, accuracy can be further improved 50 . Similarly, role of collision models (multi-relaxation, entropic) 32,51 needs to be investigated for this lattice.