Approach of automatic 3D geological mapping: the case of the Kovdor phoscorite-carbonatite complex, NW Russia

We have developed an approach for automatic 3D geological mapping based on conversion of chemical composition of rocks to mineral composition by logical computation. It allows to calculate mineral composition based on bulk rock chemistry, interpolate the mineral composition in the same way as chemical composition, and, finally, build a 3D geological model. The approach was developed for the Kovdor phoscorite-carbonatite complex containing the Kovdor baddeleyite-apatite-magnetite deposit. We used 4 bulk rock chemistry analyses – Femagn, P2O5, CO2 and SiO2. We used four techniques for prediction of rock types – calculation of normative mineral compositions (norms), multiple regression, artificial neural network and developed by logical evaluation. The two latter became the best. As a result, we distinguished 14 types of phoscorites (forsterite-apatite-magnetite-carbonate rock), carbonatite and host rocks. The results show good convergence with our petrographical studies of the deposit, and recent manually built maps. The proposed approach can be used as a tool of a deposit genesis reconstruction and preliminary geometallurgical modelling.

Methodology of spatial analysis has been developed significantly in recent years [1][2][3][4] . However, geological maps highly depend on subjective factors such as theoretical basis of authors, accepted concept of geological structure genesis, etc.
Most of three-dimensional geological modelling methods are based on cross-sections built manually using borehole data, and/or serious expert solutions accepted during the modelling process, e.g. refs 5 and 6. The method based on potential-field interpolation and geological rules 7 seems to be one of the best methods of 3D geological modelling that is less affected by subjective factors. However, development of a priori geological rules (age relations and rock sequences) often presents some difficulty, especially in case of complicated deposits such as complex magmatic, metasomatic, hydrothermal, etc. Such deposits usually have continuous transitions and uncertain age relations between different rock types, and questionable genesis. Another nearly data-driven method for geological modelling is recently presented by Hajsadeghi and colleagues 8 .
A new method of 3D geological mapping has appeared as a result of our attempts to minimize the influence of the controversial aspects on geologic modelling of the Kovdor phoscorite-carbonatite complex hosting the Kovdor baddeleyite-apatite-magnetite deposit (Murmansk region, NW Russia, Fig. 1).
Besides, there is a problem: ore classification for geological and geometallurgical mapping must be based on the rock modal composition; but mineralogical studies of numerous samples are too expensive and time-consuming (therefore, only chemical composition of ores has been analyzed using a dense network of sampling points). This problem is important for mineral engineering, processing mineralogy and geometallurgy, because mineral processing is based on the properties of ore minerals. It is usually solved by indirect mineralogical characterization and element-to-mineral conversion, which are the cheapest and fastest methods [9][10][11][12] . The method of indirect mineralogical characterization enabled us to distinguish 16 ore types using contents of the main minerals (forsterite, apatite, magnetite and calcite) calculated from bulk rock chemistry analyses of Fe magn , P 2 O 5 , CO 2 and SiO 2 . Thus, the proposed approach can be used as a tool of preliminary geometallurgical 3D-modelling.

Geological setting
The Kovdor apatite-magnetite-baddeleyite deposit is situated in SW part of the Kovdor alkaline-ultrabasic massif with carbonatites. The deposit is a concentric-zonal pipe of barren (outer zone) and ore-bearing phoscorites (axled zone) filled with veins of calcite and dolomite carbonatites. Ore-bearing phoscorites are represented by continuous series of rocks consisting of apatite, magnetite, calcite and forsterite. Please see our previous publications for details [13][14][15] .
There are two points of view on the genesis of the Kovdor deposit: plutonic and hydrothermal-metasomatic. Plutonic hypothesis interprets different phoscorites and ore types as different intrusive phases; the main process of ore genesis was crystallization from magmatic melt 16,17 . According to the hydrothermal-metasomatic hypothesis, phoscorites are the product of hydrothermal-metasomatic alteration of initially intrusive body and crystallization from postmagmatic solutions and emanations; the main process of ore genesis was a replacement [18][19][20] . Discussion of the Kovdor phoscorite and carbonatite genesis is not finished 13,14,21 ; therefore, the geological modelling method with minimum a priori genetic notions is especially required for the Kovdor deposit.

Results
A general methodology of the 3D geological mapping. The method of automatic 3D geological mapping involves the following actions: 1. Collect representative samples from a deposit. Specify chemical composition of rock samples and percentage of rock-forming minerals. 2. Develop an appropriate petrographic classification of rocks based on major minerals modal composition, corresponding to generally accepted classifications (e.g. IUGC classification 22 for igneous rocks) and/or geometallurgical requirements. Define a rock type for each sample of "learning set" in accordance with the classification. 3. Define relationships between chemical and mineral composition of rocks, in one way or another. We propose mineral norm (normative mineral contents) calculation, statistical methods (artificial neural network, multiple regression, etc.) and logical computation of rock type boundaries (see the next section). Derive an algorithm defining a rock type by its chemical composition in accordance with the relationships specified above. 4. Build a set of block models showing a spatial distribution of the elements concentrations and merge these into one table. All essential conditions for geostatistical analysis and interpolation shall be maintained. General faults are taken from the historical database or revealed from interpolation patterns. Secondary (intraformation) faults are ignored. 5. Evaluate a rock type for each block in accordance with the algorithm specified above, and build a 3D block model with the evaluated rock type. It will be a final result. 6. Testing the results. Implication for the Kovdor phoscorite-carbonatite complex. Results will be presented in accordance with the actions listed in the section A general methodology of the 3D geological mapping.
1. Sampling of the deposit rocks and ores are described in the section Materials and Methods. Since a sample size for chemical analysis is three orders bigger than that for petrographic analyses (16 m and 3 cm, correspondingly), we selected petrographic samples (thin sections) representatively reflecting the corresponding core interval. To analyze the relationship between mineral and chemical compositions, we excluded petrographic samples taken from the ends of core intervals, and from inhomogeneous intervals. So, we excluded 276 petrographic samples, and finally analyzed 274 samples. Besides, these excluded samples were used for testing further results of rock type 'prediction' by their chemical composition. Statistical parameters of components used for automatic 3D geological mapping (SiO 2 , Fe magn , P 2 O 5 , CO 2 ) are shown in Table 1 and then determine rock type in accordance with the classification by mineral composition (Fig. 2).
Multiple regression. We performed multiple regression prediction (a general linear model) of calcite, magnetite, apatite and forsterite percentage by the bulk rock chemistry with centered log-ratio transformation of the bulk chemistry to avoid spurious correlation in closure numerical systems 23,24 . The method of multiple regression is forward stepwise, intercept is included in the model; tolerance is 0.0001. Coefficients of unknown parameters (β) and linear multiple correlation R of the regression are shown in Table 2. Only calcite prediction showed a reasonable coefficient of multiple correlation, and prediction of the other minerals was rather bad.
Artificial neural network. We have built some neural networks of different architectures (multilayer perceptron and radial basis function network) for revelation of relationship of bulk rock chemistry and rock type, i.e. it has been a classification problem, not a regression problem as in paragraph 3.2. It has been a supervised learning: train set included 70% of samples, testing set 15% and validation set 15%. The best model is radial basis function network with 4 input neurons, 26 hidden neurons, 14 output neurons, with Gaussian hidden activation and Identity function for output activation, error function is the sum of squares. A training performance of the network is 74.05%, test performance is 71.79% and validation performance is 74.36%. The trained neural network is included in Supplementary Materials 1.  Fig. 3B); but the host rock is overlapping forsterite-rich phoscorites in SiO 2 plot. So, to separate the host rock, we introduced the second discrimination parameter: P 2 O 5 < 2 mass % (Fig. 3D). All rocks associated with host rocks in terms of their real mineral content were also characterized as host rocks in terms of their SiO 2 and P 2 O 5 content.  Fig. 2).
One can see that calcite-bearing phoscorites (with index "C") are well separated from low-calcite ones (without index "C") by separation line at CO 2 = 9.4 mass %. Only three samples without index "C" (MF and Magnetitite) lie above the line. Phoscorites with index "M" (magnetite ores) are well separated by the line at Fe magn = 14.2 mass % -only one magnetite-bearing sample (MAF) lies in "non-M" area, and one magnetite-free sample (AF) lies in "M area" (Fig. 3A). Two apatite-bearing outliers (CMAF and CMA) essentially below the line separating "A area" and "non-A area" (P 2 O 5 = 4 mass %), and two forsterite-free outliers (MA) in "non-F area" (separation line is SiO 2 = 7 mass %) are probably caused by unrepresentative interval characterization by thin section; though we excluded 50% of initial sample set.
Separation of mineral indexes by chemical composition of rocks allowed us to create a logical scheme for automatic determination of a rock type. Logical evaluation sequence for rock type determination by chemical composition is shown on the following flowchart (Fig. 4).
Beside 16 rock types (14 types of phoscorites, carbonatite and host rock), "unrecognized" type was delineated. "Unrecognized" value appears if the set of mineral indexes becomes "_ _ _ _" or "C _ _ _". We assume the blocks with this value probably represent either dolomite-rich rocks (dolomite carbonatite or magnetite-serpentine-dolomite ore) or artifact. Anyway, unrecognized blocks are less than 0.1% of final geological 3D block model.
The set of logical formulas in MS Excel format is included in Supplementary Materials 2. 3D block models of spatial distribution of SiO 2 , Fe magn , P 2 O 5 , CO 2 were built. The parameters were interpolated by ordinary kriging. As a result, we generated 4 block models with 249437 blocks each. Their horizontal (−110 m) sections are shown in Fig. 5.
The block models were integrated in a single file, and then each of developed predicting models (i.e. mineral norms, multiple regression, neural network and logical computation) was applied to each block. As a result, we  built four 3D models of spatial distribution of the rock types, i.e. geological model. Horizontal and transverse vertical sections of the block models are shown in Fig. 6. The automatically built maps do not show vein complexes; however, some of them can play a significant role in understanding the deposit structure. So, we showed a zone of dolomite carbonatites manually. This zone was delineated based on our more than twenty-years' observations 15 , and the borehole data collected from 2008 to 2012. The borehole intervals with dolomite carbonatite projected on horizontal plane are shown in Fig. 7. Direction of linear dolomite veins conforms to the regional North-East fault intersecting the Kovdor massif. The north-eastern end of the fault controls the alkaline-ultrabasic massif Maliy Kovdor. Please note the general concordance of the central oval structure consisting of calcite-rich phoscorites with the dolomite carbonatite zone (see Fig. 6).    Fig. 2). Y-axis is chemical composition of rocks; bold numbers are values for logical evaluation of a rock type (Fig. 4).  A testing set consisted of 276 samples (drillcore intervals and corresponding thin sections), which were excluded from 'learning' of the classification system because of their inhomogeneity. Quality of prediction was defined as percent of coincidences of mineral indexes (C, M, A, F) of rock type between predicted from chemical composition and manually measured by thin sections. The coincidences are shown in Table 3.

Discussion
Advantages and limitations of the approach. In Figs 6 and 8, we compared the automatic maps with manually built maps 13,15 . One can see a good general similarity between maps created by neural network classification (Fig. 6C), by logical evaluation of rock types (Fig. 6D)  ( Fig. 8), on other hand. The map shown in (Fig. 8A) is based on routine documentation of the Kovdor "Zhelezny" (Iron) quarry, and exploration borehole data; it reflects actual data as of January 1, 2002. The map shown in (Fig. 8B) is based on the borehole data collected during 1940-2011, and our more than twenty-years' observations over the quarry operation. So, these two maps are considered to be the most recent and representative. The most evident distinction between the automatic map and the recent "manual" maps is the absence of carbonatite vein stockwork. In fact, the method does not generate any geological bodies smaller than the average sampling interval (16 m in the Kovdor deposit case). Average distance between borehole profiles seems to be a "resolution threshold" for linear object (i.e. faults); in our case, it is 50 m. So, there is no abundant carbonatite stockwork pattern except for the largest bodies, and no complicated fault network is presented on N.I. Krasnova's map 16,17 . We examined the modelling procedure to find a potential error source. The primary error source (excluding, certainly, sampling and analytical problems) is classification of rock types by their chemical composition. It appears that we cannot use the method if there are no precise classification boundaries between different types of rocks. But this problem can result from inadequate petrologic classification, i.e. absence of the precise classification boundaries can be a reason for petrologic classification revision. In case of the Kovdor deposit, only 9 of 274 samples were outliers.
One can see spherical "bodies" on peripheries of horizontal and vertical sections (see Figs 5 and 6). Obviously, they appear due to sampling heterogeneity resulting in interpolation errors, i.e. if the distance between samples was more than 150 m (search radius), a sphere with radius 150 m was generated. It means that errors in the automatic 3D geological model can be mainly caused by interpolation inaccuracy. So, the method application requires a sufficiently close and homogenous sampling network, as well as accurate geostatistical studies ensuring reduction of unavoidable sampling incompleteness.
The second possible source of errors is underestimation of faults. Usually information about faults location is available from historical data. If not, interpolation pattern of rock bulk-chemistry can provide this information: very close convergence of interpolation isolines can indicate faults rather accurately. Comparison with other geological maps created by different geologists shows that there are no problems with fault underestimation in the Kovdor deposit case (Figs 6 and 8).    (Fig. 7). The 3D models were generated by Micromine 2016, commercial license (http://www.micromine.com).
Scientific REPORtS | 7: 6893 | DOI:10.1038/s41598-017-06972-9 A comparison of prediction quality over the test sample (Table 3) shows that the norms (mineral normative composition) calculation well predicts only apatite content and, to a lesser degree, magnetite content. This is due to the fact that Fe magn and P 2 O 5 are associated only with magnetite and apatite, respectively. Norms calculation works bad for forsterite because, besides forsterite, there are other silicates in the deposits -phlogopite, diopside, aegirine, nepheline etc. Causes of bad prediction of calcite are more complex. A main cause, we believe, is that the phoscorite-carbonatie complex and surroundings are cut by numerous thin carbonatite veinlets, and probability of presence of the veinlet in a drillcore interval is higher than in thin-section. A second probable cause is that besides calcite there are dolomite and other carbonates (magnesite, siderite, ancylite, etc.) in the complex, which have other ideal CO 2 content.
Prediction by a multiple regression is worst. We suppose that it is because of the closure problem of chemical and mineral composition of rock (i.e. the sum of components is 100), and linear statistical methods do not work properly, although we have used centered logratio transformation of bulk-chemistry. Probably, this type of transformation is not suited for multiple regression because the centered logratio transformation forms singular numerical system, i.e. sum of all variables, as well as determinants of covariance and correlation matrices are necessarily zero 24 .
Predictions by artificial neural network and by logical evaluation are the best (Table 3 and Fig. 6). Quality of prediction of magnetite and apatite contents are nearly equal. Logical evaluation is better for the calcite prediction, while artificial neural network -for the forsterite prediction.
In general, four used techniques belong to three prediction approaches. Mineral norm calculations are based on a priori model. Multiple regression is a linear statistical method. Logical evaluation of rock type is a search of optimal border of rock types (we go through the samples one by one and shift the delineations until they fit at best), i.e. it is a sort of supervised learning or 'pattern recognition' (besides mentioned methods, there are random forests, support vector machines, kernel estimators, decision tree learning, maximum entropy classifier and others methods in the field of supervised learning 25,26 ). Artificial neural network and logical evaluation of rock type are supervised classification approaches well established in the field of machine learning sciences. The   comparison of prediction quality (Table 3 and Fig. 6) suggests that supervised learning approach is the best for chemistry-to-mineral conversion and automatic 3D mapping.
Applications of the approach. A real spatial structure is required for comparison of the geological body form with modeled structure of explosion pipes, magmatic eruptive columns, percolation media etc. Estimation of the real and model forms proximity produces constraints for genesis reconstruction. Automatic geological modelling of the Kovdor deposit enables to avoid any impacts resulted from a priori genetic conception and subjective factors. The model is based on actual data, consequently, it will be close to real spatial structure of the phoscorite pipe. It allows us to compare qualitatively and quantitatively natural and experimental objects. For example, one can see an analogy between a vertical section of the deposit (Fig. 6C,D, bottom) and a typical structure of rapidly decompressed glasses and rocks: there is a "funnel" formed by calcite-rich phoscorites and surrounded by low-calcite phoscorites. It is similar to the structures developing on the glass decompression front 27 and bowl-shaped structures in shock decompressed rhyolite 28 . Location of carbonatite body (i.e. usually porous body with minimal density, from −400 m to −800 m on the vertical sections) is similar to the location of bubbles at the depth of 8 cm in shock decompressed glasses 27 . Besides, diagonal orientation of the carbonatite body resembles fractures (i.e. volumes with minimum density) in rapid decompressed pumice 29 . General subvertical dip of ore bodies and concentric differentiation of south part of the deposit is similar to rapidly decompressed, partly crystallized ore melt 30,31 . Vortex-like structures are also typical for decompressed solid-gas columns, e.g. ref. 32. The series of analogies can be expanded. In future, we plan to perform detailed qualitative and quantitative comparison of the phoscorites bodies' forms and model objects.
Another application of the proposed approach is geometallulgical modelling. We have obtained 17 types of rocks, 12 of which contain considerable quantities of major ore minerals (apatite and magnetite) in different proportions: CMA, CM, CA, CMF, CAF, CMAF, Magnetitite, MAF, MA, Apatitite, MF and AF. In these rock types variations of ore minerals content can be too high to enable accurate feed ore quality control at the plant. For example, the content of magnetite and apatite in the MA ore (Fig. 3) may vary within 7-95 modal %; herewith the content of the main non-ore minerals here (calcite and forsterite) is less than 10 modal %. However, if necessary, these variations can be significantly reduced. For example, the MA ore can be divided into the necessary number of sub-types by recalculation of P 2 O 5 into apatite, and Fe magn into magnetite. The same detalization procedure can be easily used for almost all other types of ore, including carbonatites (which can be divided into "ferrous" and "phosphorous" sub-types). Some difficulties may appear only during detalization of the CMAF ore. Therefore, we can conclude that the produced 3D geological model of the Kovdor deposit is a preliminary geometallurgical model that can be developed in more details based on available data as well as comparison of processing products with in situ ore. The detailed geometallurgical modelling is in our further plans.

Conclusions
(1) The approach of automatic 3D geologic modelling was introduced. The approach enables to use borehole core bulk-analyses only, without any a priori genetic concept or geologic rules (e.g. age sequence). The approach requires a small number of variables (4 in the Kovdor deposit case), but a rather close and homogenous sampling network. Rocks of phoscorites-carbonatites series are well distinguished by their chemical composition according to the classification tetrahedron 14 .
(2) The best prediction technique in the framework of the suggested approach is supervised learning. Calculation of normative mineral composition (norms) and linear statistical methods (multiple regression) are likely unsuitable. (3) Automatically built 3D geological models (built by rock type prediction by neural network and logical evaluation) demonstrate good convergence with the most recent and representative maps: the map created by geological service of JSC "Kovdorskiy GOK" 15 and the map based on borehole data 13 . (4) The approach does not allow generating vein stockwork patterns or other geological details smaller than the average length of a sampling interval (16 m in the Kovdor deposit case). (5) The proposed approach can be a tool of preliminary geometallulgical modelling.

Materials and Methods
The studies were based on the results of rocks chemical analyses implemented during exploration of deep horizons at the Kovdor deposit (in 2008-2012). The total length of analyzed core samples is 30213 m taken from 108 exploratory boreholes, number of sample intervals is 1846, and average length of the intervals is about 16 m. Analyses for SiO 2 , TiO 2 , MgO, CaO, K 2 O, Na 2 O, Al 2 O 3 , FeO total , Fe magn , P 2 O 5 , ZrO 2 , S total , and CO 2 were carried out. Analyses were performed in analytical laboratory of JSC "KGILC" (Apatity, Murmansk region, Russia) with a conventional wet chemistry method (different types of titration and gravimetric analyses). Fe magn means metallic iron content in mineral fraction separable by magnetic separation.
Percentage of rock-forming minerals (carbonates, magnetite, apatite, forsterite) was calculated for 550 samples representing all varieties of phoscorites, carbonatites and main types of host rocks, with optical microscope in thin sections. Precise diagnostics of minerals was carried out using LEO-1450 scanning electron microscope featuring the Röntek energy-dispersive microanalyzer. Statistical analysis was carried out with the program STATIATICA 8 (StatSoft), geostatistical investigations -Micromine 16, logical computations -the Gnumeric Spreadsheet 1.10.16 (The GNOME Project, http://www.gnumeric.org).
To avoid the closure problem associated with this kind of data (the chemical variables are not independent because they are confined within a closed composition), we used centered log-ratio transformation 23 of the data for statistical analyses (multiple regression and artificial neural network).
Interpolation was performed by ordinary kriging. For weight equalization, drillcore interval samples were preliminarily composited on the basis of the Fe magn grade they contain; minimal length is 5 m and maximal is 10 m. Data were interpolated in the empty block model created by JSC "RJC" (St. Petersburg, Russia) for Kovdor deposit. Its characteristics are the following: block size -15 × 15 × 15 m; Easting (Y) minimum is −4000 m,  Table 4. Parameters of bulk-rock chemistry interpolation by ordinary kriging of the Kovdor phoscoritecarbonatite complex. * In case of lognormal distribution of raw data, i.e. SiO 2 and CO 2 .
maximum is −1000 m; Northing (X) minimum is −2000 m, maximum is 1000 m; Elevation (Z) minimum −1605.5 m, maximum is −347.5 m; totally 4703979 blocks. After the variables were interpolated into the basic block model, empty blocks were deleted. The main axis of kriging ellipsoid is vertical in all interpolation models, the search figure is spherical; the sphere is divided into 4 sectors; maximal points per sector is 5; negative kriging weights are nulled. To avoid a problem of kriging smoothing and edge oscillation, interpolation was performed in three runs with an increasing search radius (25, 50 and 150 m) and decreasing minimal interpolation points (4, 3 and 1, respectively). Distributions of Fe magn and P 2 O 5 are normal, and CO 2 and SiO 2 are lognormal, so we use natural logarithms of CO 2 and SiO 2 contents for kriging with back transformation after interpolation. Interpolation parameters of used components are listed in Table 4.