High accuracy capillary network representation in digital rock reveals permeability scaling functions

Permeability is the key parameter for quantifying fluid flow in porous rocks. Knowledge of the spatial distribution of the connected pore space allows, in principle, to predict the permeability of a rock sample. However, limitations in feature resolution and approximations at microscopic scales have so far precluded systematic upscaling of permeability predictions. Here, we report fluid flow simulations in pore-scale network representations designed to overcome such limitations. We present a novel capillary network representation with an enhanced level of spatial detail at microscale. We find that ﻿the network-based flow simulations predict experimental permeabilities measured at lab scale in the same rock sample without the need for calibration or correction. By applying the method to a broader class of representative geological samples, with permeability values covering two orders of magnitude, we obtain scaling relationships that reveal how mesoscale permeability emerges from microscopic capillary diameter and fluid velocity distributions.


Capillary network extraction algorithm
A commonly used formal definition of the centerline, which is agnostic to the extraction algorithm, requires these four properties to be obeyed: 1. Connected: There must have at least one path between any two voxels of the centerline. 2. Centered: Any voxel of the centerline must be centered with respect to the object's boundary.
3. Thin: Centerline should be only 1-voxel thick. 4. Insensitive to boundary noise: Small surface details should not produce large twists or numerous small branches in the centerline.
In the following, we outline a novel adaptation of the Dijkstra shortest path algorithm to extract the centerline, or CNM, from a rock tomography image cube. The algorithm uses a penalization function and gradient vector for building a centerline with minimal centerline property violation, while detecting and connecting image voxels that form cycles in the image.
The main procedure of our algorithm is detailed in Algorithm 1. A graph G(V, A) is given to this algorithm representing the rock. The set V denotes the voxels v and A denotes the set of directed arcs (vout, vin) that leaves a vertex vout and arrives at neighbor voxel vin. On top of G, in line 1, every voxel v in V has its distance to the closest boundary of the rock computed and the weight of all arcs arriving at v set accordingly to its distance. Then, in line 2, the source voxels located at the rock's face are identified. These voxels will be used by the shortest path algorithm in the next steps. Subsequently, an empty set with all centerlines and an empty set of voxels visited by a centerline are initialized in lines 3 and 4, respectively. The loop that starts at line 5 iterates over the source voxels of the facet and, if a voxel was not yet visited by any other centerline of a previous iteration, the adaptation of the Dijkstra's algorithm is performed in line 7. The centerline in represented by a subgraph of G, namely Dcenterline. Next, in line 8, the voxels in centerline Dcenterline, denoted by V(Dcenterline), are marked as visited.
In line 9, the computed centerline is stored in the set all_centerlines. Throughout the rest of this section, a detailed explanation of the methods in lines 1, 2, and 7 will be provided. If voxel ∉ visited_voxels: 7 Dcenterline = Compute-Centerlines(G, distances, source = voxel, targets= source_voxels, local_maxima) 8 visited_voxels = visited_voxels U V( Dcenterline ) 9 all_centerlines = all_centerlines U Dcenterline 10 return all_centerlines In the first method, the algorithm establishes the distance map of the object by computing for each pore space voxel its closest distance to the boundary. The distance is determined by the recursive function dmin(v) that calculates for a given voxel v the distance from v to the closest boundary voxel.
The boundaries voxels vb are set as the base case of the recursion; dmin(vb) = 0. For all other voxels v, their distance from the closest boundary voxel is defined as is the set of v's neighbor voxels and E(v,t) is the Euclidian distance between v and t. In the definition of neighborhood applied here, voxels v and t are neighbors if they share a Face, Edge or Vertex, leading to Euclidian distances of 1, √2 or √3, respectively.
The algorithm uses a priority list as its main data structure for iteratively computing dmin(v) for every voxel v. First, boundary voxels vb are added to the list with their priority set to 0. Then, in the main loop, the voxel t with lowest priority (voxel index is used as tiebreaker) is picked from the list and the distance to its neighbor voxel v (that were not yet picked from the list) is set as follow: The process finishes once the priority list is empty. In addition, voxels v for which dmin(v) is greater or equal to dmin with regards to all adjacent voxels are annotated as local maxima leading to the creation of labeled, local voxel clusters. The main differentiator of our algorithm with regards to published works [42, SR1, SR2] is that the source and targets voxels of the centerline to be used in the Dijkstra shortest path algorithm are placed in the face of the cube that represents the image.

Algorithm 2: Compute-Distance-Map (Graph: G (V, A))
priority_list.add(v,0) 6 While priority_list not empty: For vn ∈ neighbors(v) -finished: # all neighbors except those already set as finished 10 If For identifying those source voxels, a depth-first search (DFS) algorithm [SR3] is executed for every voxel residing in the faces of the cube. The DFS algorithm will run multiple times, once for each cluster of connected voxels in the face. We visualize the outcome of DFS application in Supplementary Figure  S1, in which the voxels with the highest values of dmin(v) in a cluster (in black) are selected and highlighted (in red). Once the process is completed for all faces, one of the selected voxels is set as source voxel and the others are considered target voxels for application of the Dijkstra shortest path algorithm [SR4]. The output of this algorithm is an acyclic connected graph (tree) consisting of pathways that connect the source voxel to target voxels. The centerline is then built by taking the subset of these pathways that connects source voxel to the target voxels.
In the following, we briefly outline the Dijkstra shortest path algorithm for computing the pathways and the penalization function for minimizing a centerline's property violations. The algorithm relies on two auxiliary data structures (see Algorithm 4): one that stores the cost of shortest path from the source voxel s to each voxel v; cmin(v); and another one that stores the predecessor voxel of each voxel v; pred(v), associated with its respective shortest path to the source voxel. The cost function cmin(v) in the original Dijkstra shortest path algorithm is accumulative, that is, it represents the cost of the entire path starting from the source voxel to the target voxel v. To compute the centerline, the Dijkstra shortest path algorithm version adapted here only accumulates the penalty cost from the previous step. As a result, the algorithm connects voxel v to the neighbor that better suits the definition of the centerline by being "most centered" rather than the "shortest" path.
Similar to the original version of the algorithm, we use a priority list to iteratively compute the path from voxel s to each target voxel v that best fits the centerline definition. In each iteration, after voxel t is picked up from the priority list, the cost of the path of each of its neighbor voxel v that were not yet picked up from the priority list is updated as follow: • cmin(v) = penalty(t,v) and included into the list, if visited for the first time; or • cmin(v) = min {1 + penalty(pred(t), t) + penalty(t,v) + (1/ dmin (v)) * 1E3, cmin(v)}, if already visited.
In the first case, in which a known path from source to v does not exist, the path cost to traverse the graph from source to voxel v and the predecessor voxel of v are updated (pred(v) = t). In the second case, in which a path from the source to v is known, the path cost and the predecessor voxel are updated if, according to the penalty function, the connection between voxel t and v better represents the centerline than the existing connection between pred(v) and v.
The penalty function is vital for the algorithm to heuristically produce pathways matching the centerline definition as close as possible. For avoiding image boundaries, the penalty function takes into account a normalized gradient vector computed for every voxel based on distance transformation. The gradient vectors indicate which direction keeps the centerline away from the boundaries. The penalty of going from t to v depends on the angle α formed by the line that connects voxel t to v and the gradient vector at voxel t; penalty(t,v) = 0.5 + ( sin 2 (α)+1 ) / dmin (v). However, inside a cluster of voxels annotated as local maximum, we use an alternative penalty function that sets a straight pathway.
Finally, we note that typical 3D rock tomography images contain centerlines with cycles. However, the centerline produced by the algorithm described above is acyclic. Therefore, we detect during the centerline algorithm potential pairs of voxels forming centerline cycles. A pair of neighbor voxels t and v is set to form a potential cycle when the voxel v is removed from the priority list has its neighbor t that was already removed from the queue and satisfies the following conditions: (1) sin 2 (α) < 0.1, where α is the angle between the line from t to v and the gradient vector of v and, (2), the resultant cycle forms an LMpath [SR2]. finished.add(v) 10 For v ∈ neighbors(v) -finished: # all neighbors except those already set as finished 11 If v ∉ cmin: For v ∈ neighbors(v) ∩ finished: If sin(α) < 0.1and LMpath(t, v, pred, local_maxima): 26 end_points = end_points U { t, v} 27 centerlines = build_centerlines(end_points, pred) 28 return centerlines

Distribution of capillary diameters and microscopic flow speeds
We have investigated the capillary diameter distributions of two representative samples in the set. Supplementary Figures S3(a) and S3(c) show the Parker (B) and Bentheimer (K) samples, respectively. Both Parker and Bentheimer exhibit a peak at around 10-15 µm in their capillary diameter distributions. However, Bentheimer distribution is broader with higher mean diameter of 30 µm, as compared to 16 µm for Parker.
To analyze the flow behavior, we plot the volume-weighted flow speed distribution for Parker and Bentheimer samples in Supplementary Figures S3(b) and S3(d), respectively. Both distributions are bimodal, however, with varying peak positions and relative weights. While the first peak occurs at negligible flow speeds, of the order of pm/s, the second peak exhibits flow speeds of the order of µm/s. In Parker, there is a larger fraction of the fluid volume at rest, leading to a small volumeweighted average flow speed of 2 µm/s. In contrast, Bentheimer has a much higher volume-weighted average flow speed of 23 µm/s. The comparison of flow speed distributions reveals that the connection between porosity and permeability is not straightforward: the near-zero-velocity peak represents the volume fraction of the porous medium that, despite being fully connected, does not contribute significantly to permeability. In Supplementary Figure S4, we plot visual maps of the functional dependence of flow speed on capillary diameter obtained for four representative rock samples. All capillary diameter distributions are unimodal, while the flow speed distributions are bimodal. A weak correlation exists between the positions of the peaks, such that the peak at lower flow speed is more prominent for lower diameters and the peak at higher flow speeds is more prominent for higher diameters. Nevertheless, for any given diameter we observe both slower and faster flow speed components in the graphs.
Supplementary Figure S4: Visualizing flow speeds versus capillary diameters for representative rock samples. 2D-false-color plots depicting (a) the least porous, (b) the least permeable, (c) the most porous and (d) the most permeable sample of this study, respectively.