Using complex networks towards information retrieval and diagnostics in multidimensional imaging

We present a fresh, broad yet simple approach to information retrieval in general, and diagnostics in particular, by applying the theory of complex networks to multidimensional, dynamic images. We demonstrate a successful use of our method with time series generated from high-content thermal imaging videos of patients suffering from aqueous deficient dry eye (ADDE) disease. Remarkably, network analyses of thermal imaging time series of contact lens users and of patients who have undergone Laser-Assisted in situ Keratomileusis (LASIK) surgery exhibit pronounced similarity with results obtained from ADDE patients. We also propose a general framework for transforming multidimensional images into networks for futuristic biometry. Our approach is general and scalable to other fluctuation-based devices, where network parameters derived from fluctuations act as effective discriminators and diagnostic markers.


Mean value of pixels versus Principal Component Analysis
We conducted a principal component analysis (PCA) of the cropped portion of the images obtained by videographing the eyes, for the purpose of dimensionality reduction. Let $Z_{ij}(t)$ denote the pixel matrix of the cropped portion of the image at time $t$. In the main text, we chose $Z(t)$, the mean pixel value of the cropped region, computed for each frame extracted from the video. To contrast with $Z(t)$, here we study the time series of $\lambda_k(t)$, where $k$ indexes the principal components obtained from the matrix $Z_{ij}(t)$ at time $t$.
We verify that the principal components do not vary significantly over time, as observed in Fig. S1. The first principal component, PC1, alone retains about 55% of the total variance of the data, and the first three principal components collectively retain about 82%. To contrast with the mean pixel value $Z(t)$, we create a network from the time series of the first principal component, i.e. the leading eigenvalue $\lambda_1(t)$ obtained from the matrix $Z_{ij}(t)$ at time $t$. The procedure for creating the network from the time series of PC1 is identical to that for creating the network from $Z(t)$, as detailed in the main text.
Comparing Fig. 2 of the main text with Fig. S2 of the SI shows essentially no difference in the nature of the classification inferred from the cumulative distributions of edge betweenness centrality. The former was obtained using $Z(t)$ and the latter using the first principal component. Thus, calculating the spatial mean of the pixels of the cropped portion gives essentially the same results and is far less computationally expensive than PCA.
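The two reductions compared above can be sketched as follows. This is only a minimal illustration, not the analysis pipeline used in this work: the function names, the synthetic `frames` array, and the choice to treat each frame as a data matrix with rows as observations and columns as variables are all assumptions made for the example.

```python
import numpy as np

def mean_pixel_series(frames):
    """Spatial mean of each cropped frame: one scalar per time step."""
    return frames.reshape(frames.shape[0], -1).mean(axis=1)

def leading_eigenvalue_series(frames):
    """Leading PCA eigenvalue of each frame and the variance fraction it retains.

    Assumption: each frame Z_ij(t) is treated as a data matrix whose rows are
    observations and whose columns are variables.
    """
    lam1, frac = [], []
    for Z in frames:
        centred = Z - Z.mean(axis=0, keepdims=True)
        cov = centred.T @ centred / (Z.shape[0] - 1)
        eigvals = np.linalg.eigvalsh(cov)  # eigenvalues in ascending order
        lam1.append(eigvals[-1])
        frac.append(eigvals[-1] / eigvals.sum())
    return np.array(lam1), np.array(frac)

# Synthetic stand-in for a cropped thermal video: 50 frames of 16 x 16 pixels.
rng = np.random.default_rng(0)
frames = rng.normal(loc=30.0, scale=0.5, size=(50, 16, 16))

z_mean = mean_pixel_series(frames)              # analogue of Z(t)
lam1, frac = leading_eigenvalue_series(frames)  # analogue of lambda_1(t)
```

The mean needs a single pass over the pixels of each frame, whereas the PCA route builds and diagonalises a covariance matrix per frame, which is the source of the cost difference noted above.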
As evident from Figs. S3 and S4, the networks cannot be differentiated by mere visualisation; further topological analyses are necessary. Distributions of topological metrics have been studied for both PC1 and $Z(t)$ and are observed to exhibit similar behaviour. To avoid redundancy, we present results only for $Z(t)$ in Figs. S5, S6, S7 and S8. Below, we examine standard node metrics such as degree, closeness centrality and betweenness centrality. Similarly, for edges, we study edge proximity and edge betweenness centrality. Edge betweenness centrality clearly serves as the best classifier between healthy individuals and patients with dry eye disease. This is true for both PC1 and $Z(t)$, as reflected in Fig. S2 and Fig. 2 of the main text, respectively.

Degree of nodes
The degree of a node is the total number of edges incident on that node. Since our networks are directed, the degree of each node is the sum of its in-degree and out-degree. Cumulative degree distributions for the control and ADDE groups are shown in Fig. S5.
Supplementary Figure S5: Cumulative degree distribution for networks mapped from pooled thermal imaging time series for healthy eye and dry eye groups.
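As a small illustration of the degree definition above, the sketch below computes the total degree of each node from a directed edge list and a cumulative distribution of the resulting values. The toy edge list and function names are hypothetical, and the convention used for the cumulative distribution, $P(K \geq k)$, is an assumption; the convention used for the figures may differ.

```python
from collections import Counter

def total_degree(edges):
    """Total degree (in-degree + out-degree) of every node in a directed edge list."""
    in_deg, out_deg = Counter(), Counter()
    for u, v in edges:
        out_deg[u] += 1
        in_deg[v] += 1
    nodes = set(in_deg) | set(out_deg)
    return {n: in_deg[n] + out_deg[n] for n in nodes}

def cumulative_distribution(values):
    """Return (k, P(K >= k)) pairs for each distinct value k (assumed convention)."""
    n = len(values)
    return [(k, sum(1 for v in values if v >= k) / n) for k in sorted(set(values))]

# Hypothetical toy directed network, not data from this study.
edges = [(0, 1), (1, 2), (2, 0), (0, 2)]
deg = total_degree(edges)
ccdf = cumulative_distribution(list(deg.values()))
```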

Closeness of nodes
Closeness centrality, $C_i$, of any node $i$ is the inverse of the sum of its shortest distances $d(i, j)$, calculated using directed shortest paths, to every other node $j$ in the network. Thus,

$$C_i = \frac{1}{\sum_{j=1,\, j \neq i}^{N} d(i, j)},$$

where $N$ is the total number of nodes in the network. Cumulative distributions of $C_i$ for the control and ADDE groups are plotted in Fig. S6.
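The definition of closeness given above, the inverse of the summed directed shortest-path distances from a node, can be sketched with a breadth-first search. This is a minimal example under stated assumptions: edges are unweighted, every node appears as a key of the hypothetical adjacency dictionary `adj`, nodes that cannot be reached are simply left out of the sum, and a node that reaches no other node is assigned a closeness of 0. How such cases were treated in the actual analysis is not specified here.

```python
from collections import deque

def closeness(adj):
    """C_i = 1 / sum_j d(i, j), with d(i, j) the directed BFS distance from i to j.

    Assumptions: unweighted edges; unreachable nodes are skipped;
    a node with no reachable neighbours gets closeness 0.
    """
    result = {}
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj.get(u, ()):
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total = sum(dist.values())  # the source itself contributes 0
        result[source] = 1.0 / total if total > 0 else 0.0
    return result

# Hypothetical toy network: a directed 3-cycle 0 -> 1 -> 2 -> 0.
adj = {0: [1], 1: [2], 2: [0]}
c = closeness(adj)
```

In the 3-cycle every node reaches the other two at distances 1 and 2, so each node has the same closeness, 1/3.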