Regionalization of Habitat Suitability of Masson’s Pine based on geographic information system and Fuzzy Matter-Element Model

Pine needles have been widely used in the development of anti-hypertensive and anti-hyperlipidemic agents and health food. However, the widespread distribution of this tree poses great obstacles to the quality control and efficacy evaluation. To facilitate the effective and rational exploitation of Masson’s pine (Pinus massoniana Lamb), as well as ensure effective development of Masson’s pine needles as a medicinal agent, we investigated the spatial distribution of habitat suitability and evaluated the optimal ranges of ecological factors of P. massoniana with 280 samples collected from 12 provinces in China through the evaluation of four constituents known to be effective medicinally. The results of habitat suitability evaluation were also verified by Root Mean Square Error (RMSE). Finally, five ecological factors were chosen in the establishment of a habitat suitability evaluation system. The most suitable areas for P. massoniana growth were mainly concentrated in the middle and lower reaches of the Yangtze River basin, such as Sichuan, Guizhou, and Jiangxi provinces, while the best quality needles were from Guizhou, Sichuan, and the junction area of Chongqing, Hunan, and Hubei provinces. This information revealed that suitable areas for effective constituent accumulation of Masson’s pine needles accounted for only 7.41% of its distribution area.

Scientific Reports | 6:34716 | DOI: 10.1038/srep34716
P. massoniana is widely distributed in Jiangsu, Anhui, Henan, the Hanjiang river basin in Shaanxi, the middle and lower reaches of the Yangtze River, southern Fujian, Guangdong, Taiwan's northern mountains and west coast, the eastern slopes of Daxiangling in central Sichuan, Guiyang and Bijie in Guizhou, and Funing in Yunnan. Previous studies were limited to chemistry, pharmacology, and pharmacodynamics, and areas suitable for both the growth and the high medicinal quality of P. massoniana have not been evaluated. The vast differences among the climate, topography, and soils of different regions 15 are likely to affect the quality of Masson's pine needles.
The habitat suitability and potential distribution areas of a species are predicted by comparing the environment of its known distribution areas with that of the target habitat, using a mathematical induction method and a simulation of the requirements of its niche. In recent years, the maximum entropy model has been widely and successfully used to predict suitable areas for crops and the potential habitat of medicinal plants [16][17][18] . However, the maximum entropy model focuses on predicting species distribution and was not designed to predict the quality of the target species in different geographic environments. We addressed this by using the Fuzzy matter-element model to quantify the ambiguity and a geographic information system (GIS) to supply the spatial environmental data. The relationship between plants and their environment could thus be quantitatively investigated by combining maximum entropy model analysis, Fuzzy functions, and GIS-supported prediction of the spatial structure of plant attributes.
This study analyzed growth districts, quality division, and comprehensive regionalization of P. massoniana based on the climate characteristics of its primary producing areas, combined with contents of potentially effective components detected by HPLC. We also obtained spatial distribution of P. massoniana habitat suitability and the suitable range of each ecological factor. These data provided a theoretical basis for the protection and sustainable harvest of wild P. massoniana, as well as cultivation planning.

Results
Results prediction by maximum entropy model. Five ecological factors that affect the growth of P. massoniana, including four climatic and one topographical, were selected out of 55 candidates based on the maximum entropy model for the 280 samples, specifically precipitation in April and June, average atmospheric temperature for February and August, and altitude. The results showed that soil had little effect on P. massoniana growth, while climatic factors had a significant effect. AUC values of the training data set and test data set were both higher than 0.9, suggesting good simulations.
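The AUC criterion used above can be illustrated with a minimal sketch. It computes AUC as the probability that a randomly chosen presence site receives a higher model score than a randomly chosen background site, which is the usual rank-based reading of AUC. The function name and the score values are hypothetical and are not taken from the study.

```python
def auc_from_scores(presence_scores, background_scores):
    """Rank-based AUC: share of (presence, background) pairs in which the
    presence site has the higher score; ties count as half a win."""
    wins = 0.0
    for p in presence_scores:
        for b in background_scores:
            if p > b:
                wins += 1.0
            elif p == b:
                wins += 0.5
    return wins / (len(presence_scores) * len(background_scores))


# Hypothetical model scores, for illustration only.
presence = [0.9, 0.8, 0.7, 0.6]
background = [0.5, 0.65, 0.2, 0.1]
auc = auc_from_scores(presence, background)
print(auc)  # 15 of 16 pairs are ranked correctly -> 0.9375
```

A value near 1 means presence sites are almost always ranked above background sites, whereas 0.5 corresponds to random ranking.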
The optimal ranges of ecological factors according to each response curve were 98.8-112.8 mm for precipitation during April and 149.2-194.2 mm during June, 5-8 °C for average atmospheric temperature during February and 22-24 °C during August, and an altitude of 800-1100 meters.
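As a small illustration of how these ranges might be applied, the sketch below checks a candidate site against the five reported optimal ranges and lists the factors that fall outside them. The dictionary keys, the function name, and the example site are assumptions made for this sketch; only the numeric ranges come from the text.

```python
# Optimal ranges reported in the text (lower bound, upper bound).
OPTIMAL_RANGES = {
    "precip_april_mm": (98.8, 112.8),
    "precip_june_mm": (149.2, 194.2),
    "temp_february_c": (5.0, 8.0),
    "temp_august_c": (22.0, 24.0),
    "altitude_m": (800.0, 1100.0),
}


def factors_outside_optimum(site):
    """Return the names of the factors whose value lies outside its optimal range."""
    return [
        name
        for name, (low, high) in OPTIMAL_RANGES.items()
        if not (low <= site[name] <= high)
    ]


# Hypothetical site, for illustration only.
example_site = {
    "precip_april_mm": 105.0,
    "precip_june_mm": 160.0,
    "temp_february_c": 6.5,
    "temp_august_c": 23.0,
    "altitude_m": 1300.0,
}
outside = factors_outside_optimum(example_site)
print(outside)  # ['altitude_m']
```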
The results of the ecological suitability analysis showed that areas suitable for P. massoniana growth were mainly concentrated in the middle and lower reaches of the Yangtze River basin, especially Sichuan, Guizhou, and Jiangxi provinces, as shown in Fig. 1.
Establishment of Fuzzy matter-element model. Contents of four effective compounds or groups of compounds, each considered to have equal importance because they have significant pharmacological effects, were used as an aggregative indicator to analyze the evaluated environmental factors. Ranges of suitability values for each factor were solved by threshold calculation using MATLAB. Each evaluated environmental factor was standardized by threshold values and membership function, as shown in Table 1.
Results of weighting of evaluation factors and distribution divisions allowing for the aggregative indicator. Contributions of the evaluated ecological factors, as shown in Table 2, revealed that climatic and topographic factors contributed 87% and 13%, respectively, suggesting that climatic factors had greater effects on the aggregative indicators in P. massoniana. Among the four climatic factors, the contribution of temperature (monthly average temperature for February and August combined) was 69.7%, much higher than that of precipitation (April and June combined), which was 17.3%, suggesting that temperature is the most important factor in the accumulation of the aggregative indicators in P. massoniana.
The results of the quality suitability analysis, as shown in Fig. 2, revealed areas for high quality Masson's pine needles are mainly concentrated in Guizhou, Sichuan, and the intersection of Chongqing, Hunan, and Hubei provinces. Production suitability division was produced by overlaying layers of growth suitability, quality suitability, and land-cover types for the utilization and cultivation of P. massoniana, as shown in Fig. 3. The results showed the area and distribution of highly suitable, marginally suitable, and unsuitable habitat for P. massoniana production, and that the area of suitable habitat allowing for the aggregative indicator (sum of suitable and marginally suitable habitat) accounts for only 7.41% of the area where P. massoniana is distributed in China, as shown in Table 3.

Result of model accuracy test.
The RMSE value of the model for habitat suitability analysis was 0.1329, indicating feasibility of the model.
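For reference, RMSE is the square root of the mean squared difference between predicted and observed values. The sketch below shows the calculation; the function name and the sample values are hypothetical and are not the study's data.

```python
import math


def rmse(predicted, observed):
    """Root mean square error between two equally long sequences of numbers."""
    if len(predicted) != len(observed) or len(predicted) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical suitability values on a 0-1 scale, for illustration only.
predicted_suitability = [0.8, 0.6, 0.9, 0.4]
observed_suitability = [0.7, 0.7, 0.8, 0.5]
error = rmse(predicted_suitability, observed_suitability)
print(round(error, 4))  # every prediction is off by 0.1, so RMSE is 0.1
```

On a 0-1 suitability scale, a smaller RMSE indicates closer agreement between predicted and observed values.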

Discussion
P. massoniana is widely distributed in sub-tropical areas of China. Its range extends from Henan and the southern part of Shandong in the north to Guangdong, Guangxi, and Taiwan in the south. It is found from the eastern coastal areas of China to central Sichuan and Guizhou, and is abundant in the middle and lower reaches of the Yangtze River 19 .
In this study, the Maxent model, the Fuzzy matter-element model, and GIS technology were combined to evaluate the ecological suitability, quality division, and production division of P. massoniana 18,22 . The AUC value of the maximum entropy model was higher than 0.9 and the RMSE value of the Fuzzy matter-element model was 0.1329, indicating that the model was appropriate for use. The results revealed that the area of suitable habitat (sum of suitable and marginally suitable habitat) accounts for only 7.41% of the area where P. massoniana is currently distributed in China, with areas in Guizhou and Sichuan provinces being some of the most suitable habitats. The prediction results also suggested that the area suitable for both P. massoniana growth and high-quality needle production was not large. Therefore, with the development of Masson's pine needles for medical use, along with the use of the wood for other economically valuable products, ensuring the sustainable utilization of P. massoniana appears to be problematic. Recent studies on habitat suitability analysis and quality prediction under different geographical conditions for medicinal plants mostly use a single method, such as a mathematical method, an induction model, or the maximum entropy model. This study was the first to combine several methods to investigate the spatial distribution of habitat suitability, avoiding the inherent problems of using a single method, and the results can provide a reference for the selection of suitable areas for P. massoniana cultivation.
It has been suggested that for the sustainable utilization of P. massoniana, nature reserves in suitable habitats should be established [23][24][25] . Moreover, cultivation is an essential method to enlarge the population and increase yield by selecting high quality breeds and improving cultivation methods 26 .
On-site sampling was conducted at 140 sites, and 280 fresh pine needle samples were collected; the sampling distribution is shown in Fig. 4. Geographic information including altitude, longitude, and latitude was recorded by GPS. Contents of shikimic acid, procyanidins, total flavonoids, and total lignans were quantitatively determined by HPLC, as shown in Supplementary Table S1. ... vegetative covers. The most influential factors were chosen out of 55 ecological factors using a maximum entropy model for the establishment of the habitat suitability evaluation system.
Fuzzy matter-element model. The membership functions were selected based on the specific relationship between the contents of constituents and the ecological factors, the characteristics of the subject and of the functions, and an initial curve fitting between the contents of constituents and the ecological factors. The values in the functions were then worked out with MATLAB. In this way, the relationship between the content of constituents and the ecological factors was established. A suitability value ranging from 0 to 1 was then defined for each ecological factor based on the function.
In a Fuzzy set, 0 means that an element does not belong to the set at all, and 1 means that it belongs to the set completely. Here, 0 means the ecological factor is completely unsuitable for growth, 1 means it is completely suitable, and a value between 0 and 1 represents partial suitability [29][30][31][32] . The functions and parameter estimates for the ecological factors were selected by curve fitting from five candidate functions, numbered (1)(2)(3)(4)(5). The ecological factors were standardized with the K-t function (i.e. function 1) according to the curve fitting 33 .
Here, a, b, c, d, α, β and K denote the corresponding parameters of the K-t function.
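To illustrate how a membership function maps an ecological factor onto a suitability value between 0 and 1, the sketch below uses a generic trapezoidal membership function. This is not the K-t function selected in the study, whose exact form is not reproduced here; the trapezoid is a common textbook choice swapped in for illustration. The plateau (800-1100 m) reuses the altitude range reported in the Results, while the outer limits of 400 m and 1500 m are hypothetical.

```python
def trapezoidal_membership(x, a, b, c, d):
    """Generic trapezoidal fuzzy membership (illustrative, not the K-t function).

    Returns 0 outside [a, d], 1 on the plateau [b, c], and a linear ramp
    in between. Requires a <= b <= c <= d.
    """
    if not (a <= b <= c <= d):
        raise ValueError("parameters must satisfy a <= b <= c <= d")
    if b <= x <= c:
        return 1.0
    if x <= a or x >= d:
        return 0.0
    if x < b:
        return (x - a) / (b - a)  # rising edge between a and b
    return (d - x) / (d - c)      # falling edge between c and d


# Hypothetical altitude membership: plateau at 800-1100 m, zero below 400 m
# and above 1500 m (the outer limits are assumptions for this sketch).
for altitude in (300, 600, 900, 1300, 1600):
    print(altitude, trapezoidal_membership(altitude, 400, 800, 1100, 1500))
```

With these placeholder limits, 600 m and 1300 m each receive a suitability of 0.5, and any altitude on the plateau receives 1.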
Weighting of evaluation factors. Habitat suitability can be evaluated by many approaches, e.g. the analytic hierarchy process 34 and logistic regression 35 . Compared with these methods, the maximum entropy model requires a smaller sample size 36 . Moreover, although the accuracy of model output depends on the quantity of observation data, the maximum entropy model is capable of yielding a correct solution based on fewer observation data 37 . There is little published research on the habitat suitability of P. massoniana, so expert knowledge on this subject is insufficient. Therefore, the weighting coefficients were determined with the maximum entropy model as an objective weighting method.
Model accuracy. RMSE was used to test the results of the habitat suitability evaluation 30,38-41 .
Output divisions. Based on the weight of each evaluation factor, the layers of the evaluation factors were combined by a weighted mean using grid computation in ArcGIS; the grid size was set to the maximum value of the input grids. A spatial distribution map of habitat suitability with a grid size of 1 km × 1 km was generated by this computation. The map was then overlaid with the coniferous forest cover type in order to eliminate unsuitable land-cover types, including farmland, rivers, lakes, and urban areas, and to yield the final spatial distribution map of habitat suitability 22 .
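The weighted-mean overlay and land-cover masking described above can be sketched in plain Python on small grids. This is only an illustration of the arithmetic, not the ArcGIS workflow used in the study. The layer names, grid values, weights, and forest mask are all hypothetical placeholders and do not correspond to Table 2.

```python
def weighted_overlay(layers, weights, forest_mask):
    """Cell-by-cell weighted mean of suitability layers.

    layers:      dict mapping factor name -> 2D list of suitability values (0-1)
    weights:     dict mapping factor name -> weight
    forest_mask: 2D list of bool, True where the cell is coniferous forest
    Cells outside the mask are returned as None (excluded land cover).
    """
    total_weight = sum(weights.values())
    result = []
    for r, mask_row in enumerate(forest_mask):
        out_row = []
        for c, is_forest in enumerate(mask_row):
            if not is_forest:
                out_row.append(None)
                continue
            score = sum(weights[name] * layers[name][r][c] for name in weights)
            out_row.append(score / total_weight)
        result.append(out_row)
    return result


# Hypothetical 2 x 2 grids and placeholder weights, for illustration only.
layers = {
    "temperature": [[1.0, 0.5], [0.0, 1.0]],
    "precipitation": [[0.5, 0.5], [1.0, 0.0]],
}
weights = {"temperature": 0.75, "precipitation": 0.25}
forest_mask = [[True, True], [False, True]]

suitability = weighted_overlay(layers, weights, forest_mask)
print(suitability)  # [[0.875, 0.5], [None, 0.75]]
```

Dividing by the total weight keeps the result on the 0-1 scale even if the supplied weights do not sum exactly to 1.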