Machine learning and topological data analysis identify unique features of human papillae in 3D scans

The tongue surface houses a range of papillae that are integral to the mechanics and chemistry of taste and textural sensation. Although gustatory function of papillae is well investigated, the uniqueness of papillae within and across individuals remains elusive. Here, we present the first machine learning framework on 3D microscopic scans of human papillae (\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$n=2092$$\end{document}n=2092), uncovering the uniqueness of geometric and topological features of papillae. The finer differences in shapes of papillae are investigated computationally based on a number of features derived from discrete differential geometry and computational topology. Interpretable machine learning techniques show that persistent homology features of the papillae shape are the most effective in predicting the biological variables. Models trained on these features with small volumes of data samples predict the type of papillae with an accuracy of 85%. The papillae type classification models can map the spatial arrangement of filiform and fungiform papillae on a surface. Remarkably, the papillae are found to be distinctive across individuals and an individual can be identified with an accuracy of 48% among the 15 participants from a single papillae. Collectively, this is the first evidence demonstrating that tongue papillae can serve as a unique identifier, and inspires a new research direction for food preferences and oral diagnostics.


Introduction
The tongue is a highly sophisticated, heterogeneous anatomical structure and its operation is fundamental to speech, friction regulation and oral processing of food.The surface of the tongue is covered with tiny projections known as papillae which enable perception of taste, texture and oral mechanics.Of these numerous anatomical projections, fungiform papillae are considered as phenotypic markers of chemosensation of taste as they house the taste buds 1 , whereas filiform papillae that are devoid of taste buds are considered to be regulators of mechanoreception 2 for textural perception.Women are believed to have more fungiform papillae and are classed more frequently as supertasters 3 .On the other hand, increased number of papillae have been found to be associated with enhanced fatty perception 4,5 .In addition to taste perception, papillae on the tongue are responsible for mechano-sensing.Mechano-sensing refers to our ability to sense the texture, friction, lubrication and touch on the tongue surface, and is carried out mainly by numerous filiform papillae that act as fine strain-amplified sensors on the tongue surface.These sensory functions are critical for manipulation and transport of food and liquids in the mouth 2,6 .Such textural properties also influence our psychological reaction to food.For example, feelings such as satiety and therefore hunger are influenced by perception of friction and lubrication 7,8 .It has recently been shown that our preference for certain food such as chocolates is driven by surface lubrication that can be measured by artificial tongue-like surfaces 9 .Besides food preferences, there is burgeoning interest in understanding the complex morphology of the tongue due to its involvement in various age-related oral conditions [10][11][12] , mucosal degeneration and systemic diseases [13][14][15][16] .Certain medical conditions 17 and inter-individual differences are known to be associated specifically with the morphology of the papillae and the tongue.Understanding the finer details in morphology, differences in papillae structures can thus lead to fabricating novel bio-inspired artificial surfaces in biomedical engineering, food engineering and therapeutics 18,19 .
The intricate geometry of the tongue at a microscopic scale can be appreciated in 3D scans (see Figure 1).These images are obtained via surface reconstruction of 3D optical scans of a silicone-polymer mask of a human tongue.Fungiform papillae (Figure 1(b)) are larger, sparsely distributed over the surface, and have a simple hemisphere-like shape.The average diameter of a fungiform papilla is about 878µm 18 , and they are clearly visible in larger images (Figure 1(a)).The filiform papillae show a more intricate crown shape (Figure 1(c)).They are smaller (about 355µm in diameter) and substantially more numerous.A square centimeter of human tongue surface is estimated to contain between 100 and 200 filiform papillae 18  Although there has been significant research on the importance of papillae density, our understanding of the papillae shapes and surface properties of the tongue suffers from the difficulty of extracting and analysing geometry of papillae at microscopic scales.Previous studies have thus focused on manually localising papillae from 2D images 20 , primarily focusing on fungiform papillae 21 .Other works on biological surface data have used conformal geometry and computational topology at larger scales.Examples of such techniques include shape registration 22 , segmentation and topological data analysis [23][24][25][26][27][28] .Machine learning has recently emerged as a powerful technique for diagnosis where large volumes of medical data or images are available 29 .These approaches have largely focused on computing global functions such as a medical diagnosis from an image.However, to date there is no machine learning model that has classified microscopic tongue papillae based on 3D tongue scans.
Herein, we present the first study of the 3D shapes of filiform and fungiform papillae in humans, with an emphasis on the variations in the microscopic geometry seen in Figure 1.We develop a machine learning based framework applied to custom designed topological and geometric properties -called features -to understand one fundamental issue: What separates one type of papillae from another?We also ask the questions whether papillae are unique across and within individuals based on finer geometric details.Instead of applying machine learning as a black box application, we use statistics and explainable machine learning 30,31 to differentiate one type of papillae from another and identify the most distinctive features.
We follow the process of Topological Data analysis, where implicit shapes in data are extracted as topological features that form the basis of machine learning models.However, in addition to topological features, we also make use of geometric features computed from discrete curvatures to understand the uniqueness of tongue papillae.These features together are seen to have a high accuracy of 85% in correctly identifying the papilla type (filiform or fungiform) in a small segment of a surface.As a result, we can now map the papillae arrangement for the first time -including filiform papillae that are critical for developing biorelevant tribological surface and unravelling mechano-sensing -as seen in Figure 6.
Unprecedented analysis from our model reveals differences in papillae shapes across gender, age and individuals.We find that given a papilla, the age group and gender of the participant can be predicted to moderate accuracy, and even the exact individual from among 15 participants can be identified with approximately 48% accuracy, showing the first evidence for papillae to act as a unique identifier.This study demonstrating the uniqueness of papillae geometry at microscopic length scales using discrete differential geometry and computational topology stands to benefit future development of 3D tongue models for enabling rational food design diagnosis of oral medical conditions.

Results
Our analytic framework processes the data, computes the features, and then applies machine learning driven analysis.We briefly explain the data processing and feature extraction.Then we proceed with a machine learning driven analysis of the feature set, prediction of gender, age and papillae type that reveals insights about papillae.
The data is obtained as 3D digital scans.The process starts with taking masks of the dorsal area of tongue of participants on silicone polymers.These masks are scanned using a 3D scanner, which yields a set of 3D points.These points are then passed through a surface reconstruction algorithm 32 implemented in Meshlab 33 , which yields a mesh and a corresponding surface (see Figure 1).This process was developed by Andablo et al 18 .
From this mesh data, we extract segments that are candidates for papillae.The extraction process is as follows.Around a point P on the surface, select the set B of points within a radius r + δ , where r = max(r fungiform , r filiform ) µm and δ = 100µm, which we find to work well in practice.A plane fit to B based on the RANSAC algorithm 34 represents our best approximation of the plane of the segment base.The local maximum m in the segment is defined as the point furthest away from the plane.This point is assumed to be the peak of a papilla, if present.Finally, we cut a region of radius r around m representing a candidate mesh for a papilla.Figure 2(a) and Figure 2(b) show such extracted segments for a fungiform and filiform papilla, while Figure 2(c) shows general surface area without any papilla.These three kinds of elements are the basis of our study.
A total of 2092 segments extracted from scans of 15 participants were labeled manually as Fungiform, Filiform or None.In the statistical workflow, a random subset of the segments (called the training set) is used to develop statistical models, while the remaining (the test set) -whose labels are unknown to the model -are used to test the accuracy of the models in a task of correctly predicting the label class (called classification).All accuracies reported in this paper are accuracy on the test set of unseen data.The analysis and machine learning are carried out on a large set of features (Table S1).In past work 18 , baseline features height and radii have been found to be distinctive between papillae types.Our more comprehensive segment dataset and computational models improve upon these baseline features to attain high accuracy automated detection of papillae type and other tasks.

Features and feature visualisation
Features can be considered at different scales.At the global scale, a topological invariant of the entire papilla may be a distinctive feature.At the local scale of the neighborhood of a point on the surface, geometric properties -in particular, curvature of points in the neighborhood -best characterise the local shape of the surface.Local properties can be aggregated over the entire papilla to obtain a global feature.We describe below the significance of topological and geometric quantities in this context.Topological features.In this work, topological properties are computed via persistent homology.In this approach, each vertex (for us, a point on the reconstructed surface) is treated as the center of a growing ball, and the union of these balls is observed for changing topology.One way to interpret computational persistent homology is that it monitors topological features of different dimensions as they are born and die with the growth of the balls.Connected components in 0-dimension, loops in 1-dimension, and higher dimensional spheres in higher dimeensions.For a comprehensive introduction see the text by Edelsbrunner and Harer 35 .Figures 2(d -i) show the persistent topological components for the three types of segments, where the scale is measured in µm.Figures 2(d -f) show the persistent diagram view, where each component manifests as a point indexed by its birth and death time.The difference in distribution of the points across plots suggests that there are variations in topological features for different segments.Figures 2(g -i) show an alternative view of the same data, called the barcode viewwhere each bar shows the life duration of a topological component.From these sets of bars we can derive statistical features based on the distribution of bar lengths and more sophisticated methods.The feature which we have used in this work are based on persistent entropy, persistent images, persistence landscapes and amplitudes (please refer to the Methods section for detailed definition of each of the features and Table S1).
The distribution of bars at different lengths for H 0 (connected components) are shown in Figure 2(j,k) as the kernel density estimates.Fungiform bar lengths in Plot 2(j) have higher density for shorter bars of length between 0 and 10 as compared to Filiform and None (around 0.01), and then again in the mid range between 17 and 25, where all densities achieve their maximum.There are considerably fewer longer bars for Fungiform as compared to Filiform and None, which dominate the longer bar end of the spectrum.In plot (k) with densities H 1 , we note that the density of short bars (lengths between 0 and 10) are higher for Fungifrom (0.07), followed by Filiform (0.065) and None (0.06).Thus there seems to be one predominant region of major difference, while H 0 shows greater variation across types.Geometric (curvature) features.Curvature is locally defined at each point and is a complete descriptor of a surface.Positive curvature occurs where the surface matches a region of a sphere, for example at the top of a fungiform papilla.Sharp peaks are characterised by high positive curvature, while gentle tops, such as at the top of the fungiform papillae, have lower positive curvature.Negative curvatures are observed in saddle shaped neighborhoods, for example, around the base of papillae.

3/25
In digital discrete data, where manifolds are piecewise linear (triangulated) meshes, as in our case, curvature is computed at each vertex of the mesh as the angle deficit of the manifold (see Methods section for details).For our analysis, we compute curvatures on a sample of points in the segment.The geometric features of a segment include quantities such as the maximum and minimum of Gaussian curvatures, percentage of points with positive and negative Gaussian curvature, and other aggregated quantities (See Table S1).
The distribution of curvatures of the segments in Figure 2(a-c) are shown in Figure 2(l).For all types of papillae, most points are seen to be concentrated around small values of curvature close to zero.In particular, fungiform papillae have more points of near zero curvature, as can be expected from fungiforms having mostly flat or gently curving surfaces.In contrast, filliform and even generic surface areas are seen to have greater fraction of sharper curvature points.
Feature Visualisation.The correlation matrix of features is shown in Figure S4 in the Supplementary material.This set of features were selected after removing features with correlation higher than 0.65.The features remaining on this matrix show little correlation with each other, implying that they capture mutually distinct information, and thus they are informative in our analysis.Correlations by papillae type are shown in Figure S3.PCA-based embedding of the data (Supplementary Figure S2) shows overlap between classes.However, a non-linear method called Uniform Manifold Approximation and Projection for Dimension Reduction (UMAP) 36 does cluster the data in ways that show clear separation between classes (Figure S5), implying implicit distinction between the classes.Next, we examine these features in order to quantify more closely their usefulness.

Feature analysis and feature importance
Various features may have different levels of importance in the distinction between papillae.The importance of a feature is a fundamental question in the field of explainable machine learning, and is usually determined by its contribution to a classification model.It is a somewhat complex measure that is difficult to derive by looking at the feature in isolation.For our purposes, we use the technique called permutation feature importance 37 , and compute the contribution of these features to a class of standard classifiers called Kernel SVMs.The permutation feature importance method evaluates a feature f by nullifying f of the test data and observing the drop classification accuracy of the model.A large drop in accuracy implies f is an important attribute for the classifier model.The effect of nullifying f is achieved by permuting the values of f among the test data points.
Figure 3, shows three most important features in determining each of the four labels of interest to us: the papillae type, the gender, the age and the participant id.The main observation here is that certain topological features are seen to be consistently important in these tasks (Figures 3(a-d)).Topological features overall are also found to contribute more to prediction accuracy than other features (Figure 3(e)).
Type prediction features.The KDE plots of the most important features for the papillae type classification task are presented in Figure S7, and the box plots and the aggregated distributions are shown in Supplementary Figure S1.The three distinctive features are seen to have very different distributions for the different types of segments, which explains their effectiveness in classification.
Gender prediction features.We have two topological and one curvature feature at the top three for gender prediction task, whose box plots and aggregated distributions can be found in Figure 4. Persistent entropy (0) (Figure 4(a)), Maximum Gaussian curvature (Figure 4(b)) and Short bars (1) (Figure 4(c)) are all important features for determining gender.Figure 4(a), shows that the female participants tend to have a higher median value of the max Gaussian curvature (which holds for both Fungiform and Filiform) as compared to male participants, which could be linked to female papillae being 'sharper', or 'pointier'.
Age prediction features.Topological features also dominate the age-prediction task, as seen in Figure 3.The box plots and aggregated distributions of the top three features are presented in Figure 5 -Persistent entropy (0)(Figure 5(a)), Amplitude(Image,0) (Figure 5(b)) and Maximum Gaussian (Figure 5(c)) are the most important features for the age classification task.The baseline features (Height, Radius) are not amongst the most essential for this task, suggesting that their characteristics do not differ much for the two age groups in this study.An interesting observation is that height is more important than radius.The distributions can be seen in Figure 5. Similar to the gender-prediction task, the Maximum Gaussian curvature feature (Figure 5(b)) is one of the most important.The median for the younger age group is 0.269 (n = 840) and for the older is 0.166 (n = 640), implying some difference between the two groups, with the younger group having 'pointier' papillae.This holds both for Fungiform and Filiform.

Predicting gender, age and participant from papillae structure
Having understood the differences in papillae structure based on gender and age, we ask if one can easily predict gender, age and the participant given a papilla.Specifically, we ask if the papillae and the features identified above contain sufficient information to allow simple statistical methods to carry out accurate prediction.

Gender prediction
In this task we predict the biological gender of the participants.The classification performance is presented in Table 1.The models trained on topological features result in accuracy of 65%, outperforming the curvature features by 5% and baseline features by 14%.Using all the features together marginally improves accuracy to 67%.In plot (d) we see the relative importance of all features from each kind in each task.The curvature followed by topological features are the most important for papillae type classification; the topological are the most important for the Gender classification task; the topological are even more important for the Age classification task.We note the growing relative importance of topological features from 0.34 to 0.69 and the diminishing importance of the baseline features from 0.22 to 0.04, from left to right.The curvature features are the most important for the Type task with 0.44 and maintain consistent medium importance across the Gender, Age and Participant prediction task with 0.28, 0.25 and 0.27, respectively.

Age prediction
The participants are split into two groups depending on their age.The cut-off is 29 to achieve a close to equal split.The classification statistics are shown in Table 1.The results follow similar pattern to the gender prediction task.The topological features on their own achieve classification accuracy of 0.73, closely followed by curvature with 0.67.The baseline features are behind by almost 0.10, with a score of 0.58.Combining the features once again improves accuracy to 0.75.Results for Leave One Group Out test, where the age and gender of an unseen participant is predicted based on data from the others is shown in Supplementary Table S4.

Participant identity prediction
In this task we predict the participant from their papillae.The balanced accuracy of the topological features (39%) are almost double that of curvature features (22%).This is illustrated by the most important features as well, as all three of them are topological.Unlike in the previous two tasks for gender and age, here combining all the features brings a significant improvement in the balanced accuracy score to 48%, suggesting that both the local and global information can contribute to predicting the identity of the participant.Note that while accuracies around 40% to 50% as seen here are not good on binary classification tasks, in this case the task is distinction among 15 participants.A baseline rate of random prediction in this case will produce an accuracy of only 6%.The features thus distinguish participants to a high degree of distinctiveness.The topological features outperform the curvature and baseline features across all three tasks, and adding all features together does not improve the accuracy significantly for the age and gender tasks (only 0.02 increase).However, this is not the case for the participant prediction task, where the performance improves with 0.09.These results suggest that the topological information is a good indicator of age and gender.

Papillae detection and type classification.
The final result is the accuracy of the classification task for 3-class classification (fungiform vs. filiform vs. none) based on the features (Table S1).The classification statistics are shown in Table 2.The accuracy of the topological features is better than the baseline and the curvature features, and combining all features together provides the best accuracy.We achieve balanced accuracy of 0.72 for the topological, 0.67 for the curvature and 0.62 for the baseline.Combining all the features increases the performance to 0.85.with random split and using Leave-One-Group-Out (LOGO), where the test data are taken from a single participant and training is carried out on samples from all other participants.The models used are Support vector machines (SVM) and Logistic regression (LR).The standard deviation for the baseline and topological features is larger for LOGO, suggesting that there is higher variation between participants for these feature sets.This is not the case for the curvature features, which appear to be more similar and stable across participants.However, when all the features are combined, the balanced accuracy is improved and the standard deviation is relatively low.

Application of classification model
The machine learning model developed can be used for accurate papillae detection and positioning on segments from a single person's tongue.Figure 6 shows the method accurately positions the fungiform form (in blue) and filiform (in yellow) on a 8/25 tongue segment from one participant.This automated approach can thus efficiently and accurately construct maps or tongue prints from given tongue masks.

Discussion
We have presented here the first study of the 3D shapes of human papillae based on high resolution scans.Our study is based on a novel framework combining geometry, topology and machine learning.Past research 38,39 has focused on fungiform papillae in 2D images.In contrast, our microscale 3D reconstruction based approach can detect filliform papillae and non-papillated areas of the tongue, which are hard to distinguish with the naked eye and 2D images.Recent research has shown that the human perception of food is governed not only by the chemical sensation of taste, but also heavily by the mechanosensation, i.e. texture perceived by filliform papillae, for example, in the perception of soft textured delicacies such as chocolates 9 .Of more importance, the framework proposed here can be extended beyond the tongue papillae to the general study of shape and arrangement of microscale surface elements such as finger-like projections that are omnipresent in biology.
To capture the intricate biological shape information, we have developed a pool of geometric and topological features.While 3D geometric and topological transformations have previously been used to process biological scan information 40,41 , we employ a unique approach and treat them as statistical data that are fed to a machine learning system.In this approach, curvature statistics are used for aggregated local information, while persistent homology is used for global characteristics.Based on the subject of study, other features may be used.In our analysis topological features turn out to be more informative in prediction.Recent research 42 has suggested that persistent homology can capture local shape information as well as global properties.Our results on tongue papillae are consistent with this idea.
The analytics are based on machine learning models.The models themselves are built to predict the relevant variables of type, age, gender and participant, but our objective was to gain a better understanding of variations across classes and features.We thus used permutation feature importance to evaluate how each feature contributes to each model.From a pure accuracy point of view, large neural network models 43 trained on big datasets are considered the most successful current paradigm 44 .However, our objective in this study was to develop an interpretable framework for investigation of biological surface features, operating on relatively few samples from few participants.We have thus used simpler models that can be trained with smaller quantities of data.The accuracy of the results with simple models gives us confidence in our conclusion of feature importances and in the feasibility of highly accurate machine learning models in future research.

9/25
The tasks for prediction of age group and gender suffer from the small number of participants.Machine learning models for these tasks achieve balanced accuracies of approximately 74% and 67% respectively.Note that for such binary prediction, a random prediction model achieves 50% accuracy.The results suggest that geometric and topological features do vary to an extent across these variables, but more data will be needed to confirm the result and the nature of variation.The higher max Gaussian curvature appears as an important feature for female participants and the younger age group, suggesting more sharply curved or pointy shapes in these demographics.In past research, women and younger people have been noted to have higher density of fungiform papillae, which has been attributed to variations in taste perception, and women have been observed to be supertasters more frequently 3,45,46 .The curvature variation implies a difference in papillae shapes that could be contributing to the sensory differences as well.Fungiform papillae density has been noted 47 to drop above an age of 65.In our study the participants were within the relatively young range of 22 − 37.The shape features show some variations to reach a classification accuracy of 74% between age groups 22 − 28 and 29 − 37.The Leave One Group Out test on age and gender (Supplementary Table S4) shows lower accuracy and greater variability.Certain individuals seem harder to model in this task.Further investigation with more participants will be required to gain greater insight into this issue.
The papillae type detection results are more accurate at 85% and based on a large number of papillae, which gives us confidence that the model is truly accurate.To confirm that the models generalise to unseen participants, we carry out the Leave One Group Out test, and find that the accuracy holds up even on samples from a completely unseen participant, which confirms that the models can be used to classify and localise papillae on new tongue impressions.The papillae type model can thus be used to automatically identify filiform and fungiform papillae on scans of new tongue impressions.
The individual participant model shows 48% balanced accuracy and 51% raw accuracy.This score is not impressive in a binary classification task, but our participant prediction task is a multi-class one, with 15 possible classes.A papilla could have belonged to any one of the 15 classes, and a random predictor would have an accuracy of only 6.66%.Considering the sample sizes from different participants, (Table S2) a predictor that always predicts the largest class can achieve an accuracy of 11%.In comparison, the model achieves between 4 to 8 times the accuracy of these baselines based on the distinctiveness in the data of a single papilla.This distinctiveness may have multiple contributing factors -these can be true inter-individual variations as well as variations in experimental conditions in collecting the masks.The exact cause of this difference will require further study.Note that while the age, gender and participant identification tasks suggest unique individual characteristics, the success of the type identification task suggest a complementary conclusion of significant similarity within types and across individuals.Larger studies can potentially address some of these issues using larger models and more complex features, such as persistent homology of curvature functions.
The framework and discriminative models presented here enable deeper study of the papillae structure and their variations and arrangements.The model for localising and classifying papillae (as seen in Figure 6) enables the study of the overall tongue surface, or tongue prints.Such arrangements of papillae are known to influence the surface properties of the tongue and its perception abilities 18 .Our data and past research have shown that the distribution of papillae vary across individuals.A detailed study of this variation across various demographic parameters could reveal insights into preferences, cultures and medical conditions.Arrangements identified by our models could be used to build generative models that can fuel such insights and can create more realistic surfaces for use in food engineering and development of oral diagnostics.Ultimately, this study offers a new dimension showing papillae as an unique identifier for the first time in the literature which needs further validation using this developed method for a larger dataset of participants.

Collection of Human Tongue Silicone Impressions.
The data in this study has been obtained from 3D optical scans of masks of real human tongues from 15 healthy participants performed using an Alicona InfiniteFocus (IF), details of the data collection has been described in a previous publication 18 .Negative impressions of the upper surface of the tongue were collected from (n = 15 subjects, the mean age in years is 29.1, SD=3.7), 6 male and 9 female.More detailed information can be found in Table S3.

Experimental protocol.
The study adhered to all relevant guidelines and regulations.Signed informed consent was obtained from all participants before undertaking the experimental protocol.The ethics declaration is included at the end of this section.

Dataset generation for papillae
Each participant's point cloud was split into two smaller parts of approximate size 13mm by 9mm in order to reduce the size of the point cloud.On each part, The Screened Poisson surface reconstruction 32 in Meshlab 33 is applied.Then, a number of circular segments of radius r + δ , where r is set to match max(r f ungi f orm , r f ili f orm )µm and δ = 100, were extracted according to 10/25 our algorithm for extracting candidates for papillae locations described below.Based on previous work 18 and our experiments in detecting papillae, we find that r f ungi f orm = 439, r f ili f orm = 177.5 work well for automated detection.These segments have been manually labelled into one of three classes: fungiform papillae, filiform papillae or None (neither a fungiform nor a filiform).The final dataset consists of 414 fungiform, 1489 filiform and 190 None, resulting in 2092 tongue segments in total.The number of segments per participants can be found in Table S2.

Finding Candidates for papillae locations
The pipeline for segment extraction works as follows.First, we pick a random point P on the surface.Then, we select a radius r + δ of points around P, where we set r to match max(r f ungi f orm , r f ili f orm )µm.Note that the distance from P is computed as the 3D Euclidean distance in the ambient space.If the set of points contain disconnected components, then components not containing P are discareded from the computation.We set δ = 100 to fully cover any papilla in the region.After that, we fit a plane based on the RANSAC algorithm 34 and identify the point M furthest away from the plane, which will be a local maxima.We identify it to be the centre of the segment.Finally, we cut a region of radius r around m as a candidate segment.This process is applied repeatedly to identify multiple papilla segments.In future iterations, any maximum within a previously processed segment is ignored.Number of iterations and samples in our experiments were limited by the need for manual labelling.In applying our model for mapping papillae (e.g. as in Figure 6) the process can be continued until no new papillae is found.

UMAP for visualisation
UMAP 48 represents data by fitting it to non-linear manifolds and thus can capture complex information.We have used the supervised version of the method for visualisation.The supervised method explicitly tries to separate known classes by embedding their connectivity graphs.We use it to test presence of distinction between classes.

Baseline, curvature and topological features
Three sets of features are extracted from each of the selected segments -baseline, curvature and topological.

Baseline features
We use geometric measurements for baseline feature identification, which comprises of two quantitative shape characteristics of the papillae: height and radius.From the data presented in Table 1 in 18 , based on Tukey's test for statistical significance of the means and standard deviation, the diameter (and the radius, respectively) and height are different between fungiform and filiform.Therefore, they can serve as features for distinguishing between the three classes.
We note that defining the height and radius automatically is a challenging task due to the irregular nature of these structures and to our knowledge no unambiguous definitions exist in the literature to date to identify these features accurately.Human participants do this manually by observing the continuity of the papillae from the base to the tip.
We compute height and radius as follows.The point m, identified as the local maximum for the segment, as the centre of the structure.Then we define the radius r as the radius value of the sphere, centered at M, which contains 90% of the points in the segment.We compute this iteratively, by first guessing the value of the radius i as a small value (100µm), and count the number of points in the neighbourhood of radius i (we use KDTree with FLANN 49 for nearest neighbor search).We then increase i by 10, until the number of points contained in the neighbourhood exceeds 90% of all points.The value of i at the stopping condition is our candidate for radius value, r.The computation of the height, h, is dependent on the value of r.It works as follows: we first cut a region around the centre of radius r, we then fit a plane using the RANSAC algorithm 34 and find the maximum distance from the plane to the local maximum point M.This value is our height value, h.All the computations have been performed using the Python libraries open3d and numpy.The algorithm is mimicking the manual procedure which a tongue expert would use to compute these values.An illustration of the procedure can be found in Figure S8.

Curvature features
For each x ∈ H, where H is the surface generated by the Poisson surface reconstruction process, we compute the discrete curvature as defined by Meyer et al. 50.The definition in the discrete case on the triangular mesh is via the vertex's angular deficit k H (v i ) = 2π − ∑ j∈N(i) θ i j , where N(i) are the triangles incident on vertex i and θ i j is the angle at vertex i in triangle j.The way that Gaussian and mean curvatures are computed uses averaging Voronoi cells and the mixed Finite-Element/Finite-Volume method 50 .We use the existing implementation from the Python version of Meshlab, called pymeshlab 51 .
We use the maximum and minimum of the Gaussian and mean curvature as features, the ratio of positively curved points to the number of all points in the mesh (k positiveratio ), and we introduced a new feature called curvature ratio (k ratio ).Let x be the number of points of positive curvature, and y be the number of points of negative curvature.Therefore, we define the curvature ratio k ratio to be k ratio = y x if y ≤ x and k ratio = x y if x ≤ y.The signs of the mean and the Gaussian curvature provide plenty of information about the local behavior of the surface 52 .We computed the discrete Gaussian and mean curvature for all meshes and calculated the number of vertices of positive and negative curvature (after the Poisson surface reconstruction filter).

11/25
The ratio of positively curved points to the number of all points in the mesh is defined as k positiveratio = x x+y .The full list and intuitive interpretations are provided in the Supplementary material, Table S4.

Topological features
We subsample the 3D point clouds to 1000 points each and compute the Vietoris-Rips complex, using the Euclidean distance as a filtration.Persistent homology 35 of the 3D point cloud was computed using the giotto-tda library 53 and ripser 54 .We then generate 12 features which are one number summary of the diagram, providing different topological information.For more details on persistent homology, please refer to the Supplementary material.
Short bars are the number of intervals of length between 0 and 10.We compute them both in homology dimension 0 and 1.This features has been found to capture the local geometry of an object 42 Persistent entropy 55,56 is the measure of the entropy of the points in a persistent diagram.Concretely, let D = {(b i , d i )} i∈I be a persistent diagram with non-infinite death times, i.e., d i < ∞.Then, the persistence entropy of D is defined as We compute persistent entropy in dimension 0 and 1, and denote it by Persistent entropy (0) and Persistent entropy (1).
Persistence landscapes: The parameter k is called a layer.In this work we consider the case when k = 1.
Persistence image: diagrams are converted to sums of Dirac deltas.The convolution with Gaussian kernel is performed, where the computation is done over a grid with rectangular shape.The locations of the points are evenly sampled from the values of the filtration, turning it into a raster image, which is then flattened into a vector.
Amplitude can be defined as the distance from the persistent diagram to the empty diagram, which contains only the diagonal points.Here we use 2 kernels (persistence landscapes 57 and persistence image 58 ) and the amplitude of the kernel is computed using the L2 norm, and 2 metrics (Wasserstein and Bottleneck).For the computation, we use the default parameters in giotto-tda.
We here denote Persistence image amplitude by Amplitude (Image, 0) Amplitude (Image, 1) for the computation of the amplitude with the persistent image kernel (which is the is the L2 norm of that vector) in homology dimension 0 and 1, respectively.Similarly, Amplitude (Landscape, 0) and Amplitude (Landscape, 1) is the Persistence Landscape amplitude in homology dimension 0 and 1.
The Wasserstein amplitude of order p is the Lp norm of the vector of point distances to the diagonal, which is We denote them by Amplitude (Wasserstein, 0), Amplitude (Wasserstein, 1) and Amplitude (Bottleneck, 0), Amplitude (Bottleneck, 1) respectively, corresponding to the different homology dimensions.

Machine learning and statistics
Classification models.The experiments use classes of simple models -Support vector machines (SVMs) and Logistic regression models.The implementations from scikit-learn 59 were used without modification and with the default hyperparameters.The SVMs were used with a radial basis kernel (RBF).Details of these techniques can be found in any introductory book on machine learning.We use 20% of the data for testing and the other 80% for training using a random split.The procedure is repeated 50 times.
Performance Metrics for machine learning .Accuracy represents the proportion of correct predictions made by the model out of the total number of predictions.To adjust for the varying number of samples across classes, we compute the balanced accuracy.It calculates the average of the correct classification proportions for both positive and negative observations.Feature Importance.The plots are based on classification by the best balanced accuracy split of the data, and 30 permutations of the features for that split.The black line represents the standard deviation of the feature importance over the 30 runs.Here we see PCA plots for all the features.In the first row on the left, we see the labels for the 3 classes.The second plot in the first rows shows the gender.In the third plot we see the different ages of the participants.In the fourth plot we see the PCA plot for the participants.We can see that participant 9 and participant 5 are clustered very nicely.The remaining participant's features also exhibit some sort of clustering though not as clear as the others, which is an interesting observation.We filtered down to this set of features by removing other features that had high correlation (more than 0.65).The features remaining on this matrix show little correlation, implying that they capture different properties of papillae shape.Thus they are the basis of our analysis.However, once we filter by papillae type, some higher correlation is visible.For example, when we look at the correlations for the features of fungiform only, there is almost perfect correlation between Positive mean and Positive Gaussian.This was hidden in the overall correlation matrix.(d-f) are for gender and (g-i) are for the age task.The clusters are more well defined when we look from the top to the bottom (from curvature to all features), which is also demonstrated by the performance at the classification tasks -the topological features outperform the curvature, and the combination of all features together achieves the best performance.In plot (a), the distribution of the Minimum Gaussian for Fungiform follows a very different pattern from Filiform and None -it is mostly flat and evenly distributed on the interval (−1, 0), while Filiform and None are densely concentrated around 0. In plot (b), the distribution of Filiform and None are very similar, with Fungiform having higher value of Radius, which is as expected from previous work 18 .In plot(c), the papillae Types Filiform and None follow a similar pattern to (a).Fungiform has higher value of Persistent entropy (0).In plot (d), even though Maximum Gaussian is not amongst the top three most important features, the value for Maximum Gaussian is higher for Filiform and None compared to Fungiform.This is as expected due to the sharper shape of filiform with more pronounced drop and steeper sides.S4.Balanced accuracies for age and gender tasks.The performance when we use Leave-one-group-out approach.The results are worse than before due to a small number of participants having too low accuracies.It will be a question for future work to investigate this.The results on the right are when these small number of participants have been removed.It can be that they are outliers and their topological features do.However, it is difficult to make conclusions given the small sample size.It could be that some people are topological outliers, and their features are not similar at all to the other people in the same age category From an arbitrary point P, all mesh vertices within a radius r + δ are taken.Then a Best fit Plane is found using the RANSAC algorithm.The candidate point for the peak of a papilla (if present) is found as m -the point furthest from the plane.This distance is taken to be the height, and m is assumed to be the centre of the papilla.

Figure 1 .
Figure 1.3D representation of a small portion of the dorsal part of the human tongue.Plot (a) shows the 3D mesh of tongue surface obtained from masks taken on a real human tongue.The color bar shows the z-coordinate of the points on the surface representing the height.In plots (b) and (c) we see regions of the tongue with (b) Single Fungiform papilla and (c) Multiple filiform papillae.We note the distinctive shapes of papillae in plots (b) and (c), i.e, the dome-shaped Fungiform papilla in (b) and the crown-like shaped Filiform papillae in (c).Impressions of human tongue was collected at University of Leeds (Ethics DREC ref: 120318/AS/245, University of Leeds) 18 from healthy adults (n = 15 subjects, 9 females, age 18-55 years)

Figure 2 .
Figure 2. Papillae identification and topological feature characterization Plots (a-c) show how the candidates for papillae from one Participant (Participant id 3) as meshes, using the library open3d.They are representatives from the 3 classes (a) Funigform, (b) Filiform and (c) None -no papilla.Plots(d-f) show their respective topological representations of (a-c) in the form of persistent diagrams measuring two main topological features: H 0 -the connected components and H 1 -the equivalent loops.Plots (g-i) show the equivalent representation of the persistent diagram in the form of a barcode, where the bars in red correspond to the connected components and the bars in blue -to the loops.Each bar represents a persistent generator, which is an interval where its left end point corresponds to the first filtration level where this topological feature appears, and its right end point is the filtration level where it disappears.Plots (j) and (k) show the kernel density estimate (KDE) using Gaussian kernels -plots representing the distribution of the lengths of bars from the barcode (for a,b and c.).Plot (l) reveals the curvature distribution across the different labels.

Figure 3 .
Figure 3. Feature importance across the classification tasksThe plots (a-d) represent the three most important features in the individual classification tasks.In particular, in plot (a) we see the papillae type task feature importance, in plot (b) Gender task features ordered by importance, in plot (c) the Age task features and in plot (d) the participant task features ordered by importance.The x-axis represents the accuracy drop when the feature of interest is permuted, and the black line represents the standard deviation over 30 runs.In plot (d) we see the relative importance of all features from each kind in each task.The curvature followed by topological features are the most important for papillae type classification; the topological are the most important for the Gender classification task; the topological are even more important for the Age classification task.We note the growing relative importance of topological features from 0.34 to 0.69 and the diminishing importance of the baseline features from 0.22 to 0.04, from left to right.The curvature features are the most important for the Type task with 0.44 and maintain consistent medium importance across the Gender, Age and Participant prediction task with 0.28, 0.25 and 0.27, respectively.

Figure 6 .
Figure 6.Automatic identification of tongue papillae Illustration of the result of our tool for positioning papillae on the surface of the human tongue.Here our tool has detected the positions of fungiform (in blue) and filiform (in yellow) on the tongue surface.It has found 14 fungiform and 40 filiform papillae.As a red dot we see the centre of the papillae, which is determined as the local maxima for the structure with the highest distance from a fitted plane, using the RANSAC algorithm.

p. 2 2
Here we use p = 2. Similarly, the Bottleneck amplitude, A B , is defined by letting p to ∞ in the definition of the Wasserstein amplitude.In other words, it is a fraction of the longest bar A B = √ sup i∈I (d i − b i ).

17 / 25 Figure S2.
Figure S2.Here we see PCA plots for all the features.In the first row on the left, we see the labels for the 3 classes.The second plot in the first rows shows the gender.In the third plot we see the different ages of the participants.In the fourth plot we see the PCA plot for the participants.We can see that participant 9 and participant 5 are clustered very nicely.The remaining participant's features also exhibit some sort of clustering though not as clear as the others, which is an interesting observation.

Figure S5 .
Figure S5.UMAP plots for three classification tasks.(a-c) are for papillae type,(d-f) are for gender and (g-i) are for the age task.The clusters are more well defined when we look from the top to the bottom (from curvature to all features), which is also demonstrated by the performance at the classification tasks -the topological features outperform the curvature, and the combination of all features together achieves the best performance.

Figure S6 .
Figure S6.Comparison between the papillae type for the most important gender and age features .

Figure S7 .
Figure S7.KDE plots of most important features for type classification task.In plot (a), the distribution of the Minimum Gaussian for Fungiform follows a very different pattern from Filiform and None -it is mostly flat and evenly distributed on the interval (−1, 0), while Filiform and None are densely concentrated around 0. In plot (b), the distribution of Filiform and None are very similar, with Fungiform having higher value of Radius, which is as expected from previous work18 .In plot(c), the papillae Types Filiform and None follow a similar pattern to (a).Fungiform has higher value of Persistent entropy (0).In plot (d), even though Maximum Gaussian is not amongst the top three most important features, the value for Maximum Gaussian is higher for Filiform and None compared to Fungiform.This is as expected due to the sharper shape of filiform with more pronounced drop and steeper sides.

Figure S8 .
Figure S8.Schematic figure: Profile view of processing a segment with a fungiform papilla.From an arbitrary point P, all mesh vertices within a radius r + δ are taken.Then a Best fit Plane is found using the RANSAC algorithm.The candidate point for the peak of a papilla (if present) is found as m -the point furthest from the plane.This distance is taken to be the height, and m is assumed to be the centre of the papilla.

Table 1 .
Important features for gender classificationThe most important features for gender classification and its aggregate distribution.Both the aggregate and individual distributions show that the females have lower number of short bars than males.Balanced accuracies for age, gender and participant prediction tasks

Table 2 .
Important features for age classificationThe most important features for age classification and its aggregate distribution.Comparison of classification results for the classification task for 3-class classification (fungiform vs. filiform vs. none)

Table S1 .
Signed informed consent was obtained from all participants before undertaking the experimental protocol.Ethical approval for this study was granted by the University of Leeds ethics committee DREC ref: 120318/AS/245, as well as the University of Edinburgh (Reference number 2019/71645).Description of non-correlated baseline, curvature and topological features.

Table S2 .
Number of segments per participant.

Table S3 .
Demographics of the participants.