Learning physical properties of liquid crystals with deep convolutional neural networks

Machine learning algorithms have been available since the 1990s, but it is much more recently that they have come into use also in the physical sciences. While these algorithms have already proven to be useful in uncovering new properties of materials and in simplifying experimental protocols, their usage in liquid crystals research is still limited. This is surprising because optical imaging techniques are often applied in this line of research, and it is precisely with images that machine learning algorithms have achieved major breakthroughs in recent years. Here we use convolutional neural networks to probe several properties of liquid crystals directly from their optical images and without using manual feature engineering. By optimizing simple architectures, we find that convolutional neural networks can predict physical properties of liquid crystals with exceptional accuracy. We show that these deep neural networks identify liquid crystal phases and predict the order parameter of simulated nematic liquid crystals almost perfectly. We also show that convolutional neural networks identify the pitch length of simulated samples of cholesteric liquid crystals and the sample temperature of an experimental liquid crystal with very high precision.

The idea of having a machine capable of imitating intelligent human behavior broadly defines the field of artificial intelligence. Quoting McCarthy, who coined the term in 1956, we may define artificial intelligence as "the science and engineering of making intelligent machines, especially intelligent computer programs" 1 . In this context, machine learning can be understood as a subfield of artificial intelligence and represents a technique for realizing artificial intelligence. The first use of the term machine learning is usually attributed to Samuel 2 in a 1959 article, where he verified the possibility of programming a computer to learn how to play the game of checkers. It was also around the same time that Rosenblatt proposed the perceptron algorithm 3 , often considered to be the first artificial neural network, for pattern and shape recognition. In spite of important developments such as the Vapnik-Chervonenkis theory 4 , it was only after the beginning of the 21st century that machine learning, and particularly the class of deep learning methods 5-7 , started to be used more widely in areas such as game playing 8,9 , natural language processing 10 , history of art 11 , speech recognition 12 , medical diagnosis 13-19 , and computer vision 7,20,21 .
Despite the great improvement in several applications of machine learning algorithms, extracting meaningful information from images, that is, replicating what the human visual system can do, proved to be a more challenging task 21,22 . Convolutional neural networks are considered to be the state-of-the-art tool for analyzing image data, and they also have the great advantage of not requiring manual feature extraction from images. In particular, these deep neural networks use a hierarchical cascade of convolutions and non-linear functions that automatically learn representations and low-level features directly from input images 23 . This is one of the reasons these deep convolutional neural networks are very good at identifying objects in images.
Indeed, an acclaimed example of success of deep learning algorithms is documented in the ImageNet Large Scale Visual Recognition Challenge 24 , an annual competition among computer algorithms for large-scale image classification and object detection. The introduction of a deep neural network model (AlexNet) by Krizhevsky et al. 25 in 2012 is considered the major breakthrough in the competition not only because the top-5 error rate was […]

Results
Convolutional neural networks. Before we start describing our results, we briefly introduce the conceptual framework underlying convolutional neural networks 5-7 . These networks are a particular type of artificial neural network where the basic unit is a neuron or a node. The design of artificial neural networks is inspired by concepts from biological neural networks and consists of layers of neurons that are fully connected to each other via weight values. Each neuron receives input values from the previous layer, calculates the weighted sum of these values, adds a bias term, evaluates a non-linear function (activation function), and outputs the function value to the next layer. This process loosely mimics the behavior of biological neurons that fire under enough stimuli. The process of training an artificial neural network consists of adjusting weights and bias terms so that input signals yield the required output values provided by the training set. The updates of weights and biases are based on a loss function that quantifies the output error and on a process known as backpropagation. During this process, a gradient descent algorithm is used to iteratively update weights and biases to minimize the loss function, and each complete pass over the training data is called an epoch.
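The neuron and training loop described above can be illustrated with a minimal sketch. It is not the code used in this work: it assumes a single neuron with a sigmoid activation trained by plain gradient descent on the binary cross-entropy, and the toy data, the learning rate, and all function names are hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def neuron_output(inputs, weights, bias):
    """Weighted sum of the inputs plus a bias, passed through a sigmoid."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return sigmoid(z)

def binary_cross_entropy(y, target):
    """Loss that quantifies the output error for a 0/1 target."""
    return -(target * math.log(y) + (1 - target) * math.log(1 - y))

def gradient_step(inputs, target, weights, bias, learning_rate=0.1):
    """One gradient descent update of the weights and the bias."""
    y = neuron_output(inputs, weights, bias)
    error = y - target  # derivative of the loss with respect to the weighted sum
    new_weights = [w - learning_rate * error * x for w, x in zip(weights, inputs)]
    new_bias = bias - learning_rate * error
    return new_weights, new_bias

# Toy training set (hypothetical values): two inputs per example, 0/1 targets.
data = [([0.0, 1.0], 1), ([1.0, 0.0], 0)]
weights, bias = [0.0, 0.0], 0.0
losses = []
for epoch in range(50):  # one epoch = one complete pass over the training data
    for x, t in data:
        weights, bias = gradient_step(x, t, weights, bias)
    epoch_loss = sum(
        binary_cross_entropy(neuron_output(x, weights, bias), t) for x, t in data
    ) / len(data)
    losses.append(epoch_loss)
```

In a multilayer network the same chain-rule derivative is propagated backwards through every layer, which is what backpropagation refers to.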
The main difference between standard neural networks and convolutional neural networks is the existence of convolutional layers. Unlike fully connected layers, neurons in convolutional layers receive inputs from small and spatially contiguous regions of the previous layer. These windowed inputs are further multiplied by filters that share weights across the entire input data. Thus, convolutional networks preserve the spatial structure and optimize filter weights that are responsible for extracting and detecting low-level features in different locations of the input data (usually images). To formally define a convolutional layer, we need to specify the size of the spatial windows (filter size) and the step between adjacent windows (stride), which sets how much they overlap. For instance, a filter size of 2 × 2 and a stride of 1 mean that the filter operates over windows with dimensions of 2 × 2 pixels (if the input is an image) that move in unit steps over the input data. In addition to convolutional layers, convolutional networks usually have pooling or downsampling layers. A pooling layer operates similarly to a convolutional layer, but instead of calculating weighted sums, it outputs a simple statistic for each region, such as the maximum (max pooling) or the average value (average pooling). These pooling layers summarize the presence of features, help make feature representations invariant to small translations in the input data, and reduce data dimensions (and the number of parameters), which in turn improves the computational efficiency of the network.
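To make the roles of filter size, stride, and pooling concrete, the following sketch applies one 2 × 2 filter and one 2 × 2 max-pooling operation to a small image. It is an illustration only: it assumes no padding and a pooling operation that drops incomplete border windows, and the image, the filter values, and the function names are hypothetical. As in most deep learning libraries, the "convolution" is computed without flipping the filter.

```python
def conv2d(image, kernel, stride=1):
    """Slide a square kernel over a 2D image (no padding) and return the weighted sums."""
    k = len(kernel)
    n_rows = (len(image) - k) // stride + 1
    n_cols = (len(image[0]) - k) // stride + 1
    output = []
    for r in range(n_rows):
        row = []
        for c in range(n_cols):
            total = 0.0
            for i in range(k):
                for j in range(k):
                    total += kernel[i][j] * image[r * stride + i][c * stride + j]
            row.append(total)
        output.append(row)
    return output

def max_pool(feature_map, size=2):
    """Non-overlapping max pooling; incomplete border windows are dropped."""
    n_rows = len(feature_map) // size
    n_cols = len(feature_map[0]) // size
    return [
        [
            max(feature_map[r * size + i][c * size + j]
                for i in range(size) for j in range(size))
            for c in range(n_cols)
        ]
        for r in range(n_rows)
    ]

# Hypothetical 5 x 5 image and 2 x 2 filter.
image = [
    [1, 2, 0, 1, 3],
    [0, 1, 2, 3, 1],
    [1, 0, 1, 2, 2],
    [2, 1, 0, 1, 0],
    [0, 3, 1, 0, 1],
]
kernel = [[1, 0], [0, -1]]

features = conv2d(image, kernel)  # 4 x 4 feature map: (5 - 2) / 1 + 1 = 4
pooled = max_pool(features)       # 2 x 2 map after 2 x 2 max pooling
```

With a filter of size f and stride s, an input of side n yields an output of side (n − f)/s + 1 (rounded down), so a 100 × 100 image convolved with a 2 × 2 filter at unit stride gives a 99 × 99 feature map.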
In our applications, the input data of the convolutional neural networks are texture images of liquid crystals, and the outputs are particular properties of these materials (phase state, average order parameter, pitch length, and sample temperature). The data set used here is basically the same as in ref. 44 and comprises texture images obtained from simulated nematic and cholesteric liquid crystal samples as well as experimental textures obtained from E7 liquid crystal samples (a material commonly used in liquid crystal research). In Methods, we provide further details about the procedures involved in building this data set.
There is a myriad of possibilities for design choices of convolutional neural network architectures (number of layers, number and size of filters, strides, and so on). These choices are mostly empirical, dependent on the type of input data, and often inspired by other architectures that proved to be successful at particular tasks. However, some design patterns are common to several network architectures 45 . These include parsimony, symmetry, incremental feature construction, and downsampling strategy as we go deeper into the network 45 . Our particular design choices have been guided by these principles but are also based on trial-and-error procedures as well as cross-validation over a few network parameters.
Predicting the phase of liquid crystals. In a first application, we use a convolutional neural network to detect whether a nematic liquid crystal is in the nematic or isotropic phase. To do so, we numerically generate textures from a model (see Methods) presenting a nematic-to-isotropic transition at a critical temperature $T_c$. Textures with temperature below $T_c$ are labeled as "nematic" and those with temperature above $T_c$ are labeled as "isotropic". This classification task is visually straightforward when textures are obtained at temperatures far from the critical temperature but becomes challenging for textures around the critical temperature 44 . Figure 1A illustrates the network architecture initially used for this task. In this network, input images (100 × 100 pixels) pass through two blocks of 2 × 2 convolutions and 2 × 2 max-pooling layers, followed by two fully connected layers (with 32 and 16 nodes, respectively) and an output layer. We use rectified linear unit (ReLU) activation functions in all convolutional and fully connected layers, while the output layer uses a sigmoid activation function (corresponding to logistic regression).
Figure 1. (A) This network comprises two blocks of convolutional (red) and max-pooling (green) layers followed by two fully connected layers (yellow) and an output layer. An input image of size 100 × 100 pixels is convolved with five 2 × 2 filters (with unit strides), yielding five 99 × 99 feature maps (C1 in red) that are passed through rectified linear unit (ReLU) activation functions. These feature maps are then passed to 2 × 2 max-pooling operations that coarse grain the representation to five 49 × 49 feature maps (S1 in green). Next, these feature maps pass to the same configuration of convolution and max-pooling blocks, yielding five 24 × 24 feature maps (S2 in green) that are flattened and passed through two fully connected layers with 32 and 16 nodes. Finally, the phase classification (nematic or isotropic) takes place in the output layer via a sigmoid activation function (corresponding to logistic regression). (B) Training and validation scores (fraction of correct classifications) as a function of the number of epochs used during the training stage. We separate 15% of data as test set, and the remainder is divided into training (80%) and validation (20%) sets (all obtained in a stratified manner). (C) Confusion matrix obtained when applying the trained network to the test set (never exposed to the network during training). (D) Accuracy of the network over the test set as a function of the number of convolution (and max-pooling) blocks $n_b$ in the architecture (panel A corresponds to $n_b = 2$). The circles are average values over ten realizations of the training procedures, and the error bars are 95% confidence intervals.
We separate 15% of data for final evaluation (test set) of the model and use the remainder as training (80%) and validation (20%) sets. The network parameters are optimized using the Adam algorithm 46 (learning rate of 0.001), and the loss function is the binary cross-entropy (commonly used in binary classification). To avoid overfitting, we apply an early stopping regularization procedure (with patience set to 10 epochs) and an L2 weight regularization (hyperparameter λ = 0.001) over all convolutional and fully connected layers. Figure 1B depicts the training and validation scores (fraction of correct classifications) as a function of the number of epochs, where we note that this network achieves ideal accuracy with just a few training epochs. Figure 1C shows the confusion matrix obtained by applying the trained network to the 15% of data never exposed to the algorithm. These results indicate that our network also achieves perfect accuracy in identifying the liquid crystal phase (nematic or isotropic) in the test set. We also test if variations of the network architecture shown in Fig. 1A are capable of classifying phases with similar performance. To do so, we consider network variations where only the number of convolution blocks $n_b$ (followed by max-pooling layers) changes from 1 to 5 (the network of Fig. 1A corresponds to $n_b = 2$). We thus train ten realizations of each of these networks by using the same procedure described for the architecture of Fig. 1A. After training, we estimate the average accuracy of the classification task in the test set as a function of $n_b$. The results of Fig. 1D indicate that networks with $n_b = 1$, 2, or 3 convolution blocks are equally good at classifying liquid crystal phases, with accuracy very close to the ideal value. Thus, it would be preferable to use $n_b = 3$ when deploying this model in a more practical application, since the number of fitting parameters diminishes when the number of convolution blocks increases, which in turn facilitates the training procedures. We further observe in Fig. 1D that the classification performance decreases substantially when increasing the number of convolution blocks beyond $n_b = 3$, reaching an accuracy of ~0.7 for $n_b = 5$.
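The statement that the number of fitting parameters diminishes as convolution blocks are added can be checked with a short bookkeeping sketch. It assumes unpadded convolutions with unit stride, five 2 × 2 filters in every block, and pooling that discards incomplete border windows; these choices are consistent with the feature-map sizes quoted for Fig. 1A but are assumptions, and the function name is hypothetical.

```python
def trace_architecture(n_blocks, input_side=100, n_filters=5, filter_size=2,
                       pool_size=2, dense_nodes=(32, 16), n_outputs=1):
    """Return (side of the last feature maps, total number of trainable parameters)."""
    side, channels, n_params = input_side, 1, 0
    for _ in range(n_blocks):
        # Each filter has filter_size^2 weights per input channel plus one bias.
        n_params += (filter_size * filter_size * channels + 1) * n_filters
        side = side - filter_size + 1  # unpadded convolution with unit stride
        channels = n_filters
        side = side // pool_size       # max pooling drops incomplete windows
    n_inputs = side * side * channels  # size of the flattened feature maps
    for nodes in dense_nodes + (n_outputs,):
        n_params += (n_inputs + 1) * nodes  # weights plus one bias per node
        n_inputs = nodes
    return side, n_params

# Feature-map side and parameter count for each number of blocks.
summary = {n_b: trace_architecture(n_b) for n_b in range(1, 6)}
```

Under these assumptions the count falls from 384,762 parameters for one block to 1,662 for five blocks, almost entirely because the flattened input of the first fully connected layer shrinks with each pooling step.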
Predicting the order parameter of liquid crystals. In another application, we propose to predict the order parameter p of simulated liquid crystals directly from their textures. The order parameter describes the orientational order of a sample; it is considered one of the most important physical properties of the nematic phase since other anisotropic properties are determined from p. Figure 2A shows the dependence of p on the temperature $T_r$, where we observe that p decreases with $T_r$. This liquid crystal undergoes a transition from nematic to isotropic phase when the temperature exceeds the critical value $T_c = 1.1075$ 44 . Unlike the phase classification, we now have a regression problem where the network output is a continuous number representing the order parameter p. For this regression task, we consider essentially the same network architectures used for classifying liquid crystal phases. We only replace the sigmoid activation function of the output layer by a linear activation function. Figure 2B depicts the architecture with four convolutional (and max-pooling) blocks ($n_b = 4$).
We train this network by optimizing the mean square error (loss function) and following the same procedures used for the phase classification. Figure 2C shows that the coefficients of determination (between actual and predicted values) for training and validation sets approach 1 after only a few training epochs. We also find a coefficient of determination of ≈0.997 when applying the trained network to the test set. This result demonstrates that our convolutional neural network is remarkably efficient in predicting the order parameter p, outperforming a shallow learning approach based on two image features (permutation entropy and statistical complexity) and the k-nearest neighbors algorithm 44 . Figure 2A shows a comparison between actual and predicted values for the order parameter p, where we visually observe the high degree of accuracy achieved by the network.
We further investigate how the number of convolution (and max-pooling) blocks $n_b$ affects network accuracy. To do so, we train ten realizations of the network for a given value of $n_b$ ∈ {1, 2, 3, 4, 5} and estimate the average value of the coefficient of determination for the test set. Figure 2D shows that these networks display excellent precision for different numbers of convolution blocks, but an optimal performance occurs for $n_b = 4$ (the architecture of Fig. 2B).
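For reference, the two quantities used in this regression task can be written in a few lines. This sketch assumes the usual definition of the coefficient of determination, one minus the ratio between the residual and the total sum of squares; the text does not spell out the exact formula, and the order-parameter values below are hypothetical.

```python
def mean_squared_error(actual, predicted):
    """Average squared difference between actual and predicted values (the loss)."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)

def coefficient_of_determination(actual, predicted):
    """R^2 = 1 - SS_res / SS_tot; equals 1 for perfect predictions."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot

# Hypothetical order parameters and network predictions.
actual = [0.9, 0.7, 0.5, 0.1]
predicted = [0.88, 0.72, 0.49, 0.12]

mse = mean_squared_error(actual, predicted)
r2 = coefficient_of_determination(actual, predicted)
```

A model that always predicts the mean of the actual values has a coefficient of determination of zero, which gives a baseline for reading the values reported here.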
Predicting the pitch length of cholesteric liquid crystals. Cholesteric liquid crystals are materials displaying a periodic helical (chiral) structure. This arrangement might be viewed as if formed by layers in between which the preferential director axis changes periodically, with the period known as pitch length η. The value of η is easily estimated when the helical axis is perpendicular to the viewing direction of an optical microscope but cannot be obtained from standard experimental arrangements (Grandjean textures) used in reflective displays 47 . It can thus be of practical interest to find a simple way of estimating the pitch η directly from textures of cholesteric liquid crystals. To test whether convolutional neural networks are useful for predicting the value of η, we have built a data set of textures associated with different values of η ∈ {15, 17, 19, …, 29} nm from numerical simulations (see Methods). We then apply the same general network architecture used with nematic textures for the task of classifying the values of η. Figure 3A shows a network with $n_b = 4$ convolution (and max-pooling) blocks used with the cholesteric textures. When compared with networks used with nematic textures, the only difference is in the last layer, which now comprises 8 nodes (one for each pitch value) with softmax activation functions (commonly used in multiclass classification tasks). We train this network by following the same procedures used before and by considering the categorical cross-entropy as the loss function. Figure 3B shows that the training and validation scores approach the ideal accuracy after about 10 training epochs. Figure 3C further demonstrates the high accuracy of this network by depicting the confusion matrix estimated from the test set (15% of data never presented to the algorithm). We note that this network perfectly classifies all pitch values. This performance is quite superior to the one obtained by the shallow learning approach reported in ref. 44 , where an accuracy of ≈85% is reported. We also investigate the accuracy of different network architectures by changing the number of convolution blocks $n_b$. The results of Fig. 3D show that the average accuracy is very low for $n_b < 3$, reaches an optimum value for $n_b = 3$ and 4, and decreases when $n_b = 5$.
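The output layer used for the pitch classification can be sketched as follows. The code only illustrates how eight raw output values are turned into class probabilities by a softmax and scored with the categorical cross-entropy; the raw values and the function names are hypothetical, and this is not the code used in this work.

```python
import math

# The eight pitch classes, in nm: 15, 17, ..., 29.
PITCHES_NM = list(range(15, 30, 2))

def softmax(raw_outputs):
    """Turn raw output values into probabilities that sum to one."""
    shift = max(raw_outputs)  # subtracting the maximum avoids overflow
    exps = [math.exp(z - shift) for z in raw_outputs]
    total = sum(exps)
    return [e / total for e in exps]

def categorical_cross_entropy(probabilities, true_index):
    """Loss for one example: minus the log-probability of the true class."""
    return -math.log(probabilities[true_index])

def predict_pitch(raw_outputs):
    """Return the pitch value (in nm) of the most probable class."""
    probabilities = softmax(raw_outputs)
    return PITCHES_NM[probabilities.index(max(probabilities))]

# Hypothetical raw outputs of the eight output nodes for one texture.
raw = [0.1, 0.3, 0.2, 4.0, 0.5, 0.1, 0.0, 0.2]
probabilities = softmax(raw)
loss_if_true_pitch_is_21 = categorical_cross_entropy(probabilities, PITCHES_NM.index(21))
```

The loss is small when the network assigns a high probability to the correct pitch and grows without bound as that probability approaches zero.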
Predicting the sample temperature of E7 liquid crystals. In a last application, we propose to predict the sample temperature from experimental textures of E7 liquid crystals. This material is a multicomponent mixture (cyanobiphenyl and cyanoterphenyl) frequently used for the production of commercial displays. E7 liquid crystals exhibit a nematic to isotropic transition when the temperature exceeds the critical value of ≈58 °C 44 . Figure 4A shows examples of nematic textures obtained at different temperatures. We have initially verified that the general architecture used in all previous applications does not yield good results when dealing with these experimental textures. Because of that, we propose to slightly modify the network architecture by including additional convolutional layers before each max-pooling operation. We also increase the number and size of the convolution filters (there are now eight 4 × 4 filters per convolution block) as well as the size of the max-pooling filters (that are now 3 × 3 pixels). The fully connected layers remain equal to the previous cases, that is, we have two fully connected layers with 32 and 16 nodes, followed by an output layer. ReLU activation functions are used after all convolution operations, and a linear activation function is used in the output layer. Figure 4B illustrates the modified network structure with $n_b = 3$ convolution blocks.
In spite of the modification in the network architecture, the training and regularization procedures remain the same. We also use the mean square error as the loss function for this regression problem. Figure 4C shows the coefficient of determination for the training and validation sets as a function of the training epochs. We observe that both scores approach the ideal value after a few training epochs. Figure 4D shows the relationship between predicted and true temperatures obtained by applying the trained network to the test set. This relationship closely follows the 1:1 line (dashed line) and has a coefficient of determination of ≈0.982, indicating the high precision achieved by our approach. Figure 4A also shows a comparison between the actual temperature values associated with each texture and the network predictions (values within brackets). The accuracy of our network outperforms the shallow learning approach reported in ref. 44 , where a coefficient of determination of ≈0.93 is obtained with the k-nearest neighbors algorithm. We have also investigated the accuracy of our approach with different numbers of convolution blocks $n_b$. Figure 4E shows the average coefficient of determination as a function of $n_b$, where we notice that the optimal accuracy occurs for $n_b = 2$ or $n_b = 3$.

Discussion
While neural networks and other machine learning methods will not entirely replace experimental procedures, these methods are already improving experimental results and overcoming several difficulties in experimental analysis. From the results shown here, we expect that the use of neural networks in experimental situations may experience a large growth in the near future. In fact, there are several basic and applied science scenarios that can benefit from these techniques, which go far beyond their use in liquid crystals research.
Upon correct training and design, neural networks and other machine learning methods have proved useful for identifying phase transitions 44,48 (thus coming in aid to regular thermal characterization such as differential scanning calorimetry measurements) and for pattern formation and pattern identification 49 (thus replacing the need for ever more specialized imaging tools and helping researchers to pick up very small details). Other examples of success of machine learning methods include investigation of biological materials 50 , deep space investigation 51 , and the search for early stage breast cancer by looking for calcification clusters on mammograms 52 . It is likely that, in the foreseeable future, machine learning (particularly convolutional neural networks) will find its use in basically all imaging tools: from optical to electronic microscopy.
Our work has demonstrated the usefulness of deep convolutional neural networks for predicting physical properties of liquid crystals directly from their optical textures. We worked out a series of applications with simulated and experimental textures, in which these networks proved to be quite efficient for predicting phases (nematic or isotropic), order parameters, cholesteric pitches, and sample temperatures of different liquid crystals. Our results thus help reduce the shortage of machine learning research, and in particular of applications of deep learning algorithms, on liquid crystals.

Methods
Implementing neural networks. All convolutional neural networks used here are implemented in the Python language via TensorFlow 53 with the Keras 54 high-level API. We have particularly used the Keras sequential model, where deep neural networks are created by sequentially assembling layers. The general network architectures used in all applications are detailed in Figs. 1A, 2B, 3A, and 4B. Except for the study with E7 textures, all networks are built by stacking $n_b$ blocks of convolutional layers followed by max-pooling layers. Next, the resulting feature maps are flattened and passed to two fully connected layers with 32 and 16 nodes, respectively. A ReLU activation function is used in all convolutional and fully connected layers. The output layer is composed of a single node with a linear activation function in the regression tasks (when predicting the order parameter and sample temperature). The output layer has a single node with a sigmoid activation function in the binary (nematic or isotropic) classification of phases, and eight nodes with softmax activation functions when classifying the cholesteric pitches (η ∈ {15, 17, 19, …, 29} nm). For the E7 textures, the network architecture is modified and comprises $n_b$ blocks, each made of convolutional layers followed by additional convolutional layers and then a max-pooling layer. The resulting feature maps are then passed through the same structure of fully connected layers.
We train these networks by optimizing a loss function with the Adam stochastic optimization algorithm 46 . In all applications, we have fixed the learning rate at 0.001 and the exponential decay rates at $\beta_1 = 0.9$ and $\beta_2 = 0.999$ (commonly used settings 46,53 ). We use the mean square error as the loss function for the regression tasks (predictions of order parameters and temperatures), and binary and categorical cross-entropy for the classification tasks (predictions of phases and pitches, respectively). In all applications, we separate 15% of data for final validation of the model (test set) and divide the remainder of the data into training (80%) and validation (20%) sets. To avoid overfitting, we consider an early stopping regularization procedure that ends the training process when the validation loss function stops improving within a ten-epoch interval (the patience parameter). We have also included a penalty term in the loss function proportional to the sum of the squares of the layer parameters (an L2 norm for regularization, where λ = 0.001 is the constant of proportionality). This regularization procedure also helps in avoiding overfitting and reduces fluctuations in the loss function. In the Supplementary Information we show excerpts of the code used for defining the convolutional neural networks in our work (Codes 1, 2, 3 and 4) and also a minimal example of code used for training these networks (Code 5).
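The data partition and the early stopping rule described above can be sketched in plain Python. This is an illustration under stated assumptions, not the training code of the Supplementary Information: the split below is a simple random one (the split used in this work is stratified), it uses the 15% test fraction quoted in the Results, and the function names and the sequence of validation losses are hypothetical.

```python
import random

def split_indices(n_samples, test_fraction=0.15, validation_fraction=0.20, seed=0):
    """Shuffle sample indices and return (training, validation, test) index lists."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_test = round(test_fraction * n_samples)
    test, remainder = indices[:n_test], indices[n_test:]
    n_validation = round(validation_fraction * len(remainder))
    return remainder[n_validation:], remainder[:n_validation], test

def early_stopping_epoch(validation_losses, patience=10):
    """Return the (0-based) epoch at which training stops, or None if it never does."""
    best_loss = float("inf")
    epochs_without_improvement = 0
    for epoch, loss in enumerate(validation_losses):
        if loss < best_loss:
            best_loss = loss
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch
    return None

# Partition of 14,400 samples (the size of the nematic data set).
train, validation, test = split_indices(14400)

# Hypothetical validation losses: improvement stops after the second epoch.
stop_epoch = early_stopping_epoch([1.0, 0.9] + [0.95] * 12)
```

For 14,400 textures this partition leaves 2,160 images in the test set, 2,448 in the validation set, and 9,792 in the training set.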
Nematic and isotropic textures from Monte Carlo simulations. The textures used for predicting the liquid crystal phase and order parameters are obtained by Monte Carlo simulations of the so-called Lebwohl-Lasher model 55 . This model describes headless spins located over the sites of an $n_x \times n_y \times n_z$ lattice (with $n_x = n_y = 100$ and $n_z = 20$). Unit vectors representing their directions characterize each of these spins. These spins interact with each other via the Lebwohl-Lasher potential $\Phi_{ij} \propto \cos(\vec{u}_i \cdot \vec{u}_j)$, where $\vec{u}_i$ and $\vec{u}_j$ refer to the i-th and j-th spins. We use periodic boundary conditions along the x and y directions, while the first and last layers of spins along the z direction have a fixed direction pointing along the y and x directions, respectively. We simulate this system at a given reduced temperature $T_r$, discarding the initial $10^4$ Monte Carlo steps to avoid transient behaviors. We use another $10^4$ steps for estimating the average order parameter p, and the textures are obtained with the Stokes-Mueller methodology 56 averaged over the last 50 Monte Carlo steps. We run 200 realizations for each value of $T_r$ ∈ {0.10, 0.12, 0.14, …, 1.52}, yielding 14,400 images of size 100 × 100 pixels. This model presents a nematic to isotropic transition at the critical temperature $T_c = 1.1075$ 44 , so that textures with $T_r < T_c$ are labeled as "nematic" and those with $T_r > T_c$ are considered "isotropic". An average order parameter p is also associated with each texture.
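The size of this data set and the labeling rule follow directly from the numbers above, as the short sketch below illustrates. The variable and function names are hypothetical; only the temperature grid, the number of realizations, and the critical temperature are taken from the text.

```python
T_C = 1.1075          # critical reduced temperature quoted in the text
N_REALIZATIONS = 200  # simulated textures per temperature

# Reduced temperatures 0.10, 0.12, ..., 1.52, built from integers to avoid
# accumulating floating-point error.
temperatures = [k / 100 for k in range(10, 153, 2)]

def phase_label(t_r, t_c=T_C):
    """Label a texture by comparing its reduced temperature with the critical one."""
    return "nematic" if t_r < t_c else "isotropic"

labels = [phase_label(t) for t in temperatures]
n_images = len(temperatures) * N_REALIZATIONS
n_nematic = labels.count("nematic") * N_REALIZATIONS
n_isotropic = labels.count("isotropic") * N_REALIZATIONS
```

The grid contains 72 temperatures, which gives the 14,400 images mentioned above. Since 51 of these temperatures lie below the critical value, the two classes are unbalanced (10,200 nematic against 4,200 isotropic textures), one reason a stratified split is a sensible choice.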
Cholesteric textures from simulations. The cholesteric textures used for classifying the pitches are obtained from simulations of the Landau-de Gennes modeling approach 57 (continuum elastic theory). These simulations are carried out via a finite-difference method on a uniform grid of size 200 × 200, and the liquid crystal parameters are chosen to mimic the 5CB liquid crystal 42 . This system is simulated with periodic boundary conditions and for different values of the pitch η ∈ {15, 17, 19, …, 29} nm. The optical textures are estimated via the Jones 2 × 2 method 58 , an approach that is well known to produce textures very similar to experimental results 44,59 . We generate 1,000 textures for each value of η, yielding 8,000 images of size 200 × 200 pixels.
Experimental E7 textures. The experimental textures of E7 liquid crystals are obtained via polarized optical microscope imaging. Each sample consists of a rectangular capillary (300 μm × 4 mm) filled with the E7 mixture. We place these samples under a polarized optical microscope setup coupled with a temperature controller. We take pictures of the textures every 90 s, starting at 40 °C and heating the sample at a constant rate of 0.2 °C per minute up to 58 °C (sample temperatures are thus T ∈ {40, 40.3, 40.6, …, 58} °C). All image files have dimensions 2047 × 1532 pixels (with 24 bits per pixel) and are converted into grayscale via luminance transformation 60 . These images are sliced into 12 non-overlapping parts of size 510 × 511 pixels. We further use an augmentation procedure that adds a copy of each image horizontally and vertically flipped to the data set. We consider 5 different samples, yielding 180 images per temperature value (a total of 10,980 images).
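The slicing and augmentation steps can be sketched as below. The sketch assumes that the 12 parts form a grid of 3 rows by 4 columns and that the augmentation adds one horizontally flipped and one vertically flipped copy of each part; both assumptions are consistent with the part size and the image counts quoted above but are not stated explicitly, and the function names and the toy image are hypothetical.

```python
def slice_into_tiles(image, n_rows=3, n_cols=4):
    """Cut a 2D image (list of rows) into non-overlapping tiles, dropping leftover borders."""
    tile_height = len(image) // n_rows
    tile_width = len(image[0]) // n_cols
    tiles = []
    for r in range(n_rows):
        for c in range(n_cols):
            rows = image[r * tile_height:(r + 1) * tile_height]
            tiles.append([row[c * tile_width:(c + 1) * tile_width] for row in rows])
    return tiles

def augment(tile):
    """Return the tile plus its horizontally and vertically flipped copies."""
    horizontal_flip = [row[::-1] for row in tile]
    vertical_flip = tile[::-1]
    return [tile, horizontal_flip, vertical_flip]

def dataset_size(n_temperatures, n_samples=5, n_tiles=12, copies_per_tile=3):
    """Total number of images after slicing and augmentation."""
    return n_temperatures * n_samples * n_tiles * copies_per_tile

# Temperatures 40.0, 40.3, ..., 58.0 degrees Celsius, built from tenths of a degree.
temperatures = [t / 10 for t in range(400, 581, 3)]

# Small stand-in image (6 rows x 8 columns) with distinct pixel values.
toy_image = [[8 * r + c for c in range(8)] for r in range(6)]
tiles = slice_into_tiles(toy_image)
augmented = [copy for tile in tiles for copy in augment(tile)]
total_images = dataset_size(len(temperatures))
```

For a 2047 × 1532 image, integer division gives tiles of 511 × 510 pixels (2047 // 4 and 1532 // 3), and the 61 temperatures times 5 samples times 36 images per picture reproduce the total of 10,980 images.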

Data availability
All data supporting the findings of this study are available from the corresponding author on reasonable request.