Predicting individual emotion from perception-based non-contact sensor big data

This study proposes a system for estimating individual emotions from indoor environment data collected around human participants. As a first step, we develop wireless sensor nodes that collect indoor environment data related to human perception for monitoring working environments. The developed system collects, as big data, the indoor environment data obtained from these sensor nodes and the emotion data obtained from pulse and skin temperature. The proposed system then estimates individual emotions from the collected indoor environment data. This study also investigates whether sensory data are effective for estimating individual emotions. Indoor environmental data obtained by the developed sensors and emotion data obtained from vital data were logged over a period of 60 days. Emotions were estimated from the indoor environmental data using a machine learning method. The experimental results show that the proposed system achieves an estimation correspondence of about 80% or more by using multiple types of sensors, thereby demonstrating its effectiveness. The finding that emotions can be estimated with high accuracy from environmental data is useful for future research.

reduction of wireless devices [18][19][20][21][22][23][24][25][26][27][28]. Many electric devices are connected to the Internet following the idea of the IoT. The IoT enables physical objects and/or spaces to communicate with each other. It likewise enables us to obtain various types of environmental data, which can be used for big data analysis. The IoT can also be utilized for various types of applications (e.g., smart home, smart building, smart health care, and smart rearing) [25][26][27][28]. According to the idea of Society 5.0, as proposed by the Japanese Government, combining various types of data obtained using the IoT with machine learning and/or big data analysis enables us to solve social issues [29]. Hong et al. developed a system that estimates human actions (invasion/indoor movement) based on array sensor information [30]. In Ref. 30, human actions are estimated using the Support Vector Machine (SVM). Tao et al. developed a system that predicts the amount of wind power generation by using deep learning [31]. It is expected that the idea of Society 5.0 will enable us to estimate emotions without the need for camera sensors or wearable devices. However, it has been difficult to specify a data set for estimating emotions.
This study proposes and builds a customized emotion estimation model for individuals based on collected indoor environment data related to human perception, such as temperature, humidity, and light intensity. As a first step, we develop wireless sensor nodes to be used in monitoring working environments. The developed system collects indoor environment data related to human perception via the Wireless Sensor Network (WSN) and emotions through the system of Ref. 10, and stores both as big data. The proposed system then estimates individual emotions without image data from camera sensors or vital data from wearable sensors. In addition, this study investigates whether sensory data are effective for estimating individual emotions. Indoor environmental data obtained by the developed sensors and emotion data obtained from vital data are logged over a period of 60 days. Emotions are estimated from the indoor environmental data using a machine learning method. The experimental results show the effectiveness of the proposed system.

Methods
Proposed system structure. Figure 1 shows the structure of the proposed system. The proposed system collects and saves indoor environment data, and sensor nodes measure environmental data related to human perception. The sensor nodes send the measured data to the coordinator node. The coordinator node then transfers the data received from the sensor nodes to the data logger. The data logger logs the data from the sensor nodes and sends them to the cloud server. Vital and emotion data obtained using the NEC Emotion Analysis Solution [10] are also saved on the cloud server and are then used as correct answer data for machine learning.
Individual emotions are estimated from the obtained indoor environmental data using a machine learning method. In the data collection phase, indoor environment data are collected from the developed personal and indoor environment sensors. The phase of estimating emotions from vital sensors draws on the reports in [10][11][12][38]. The emotion analysis system proposed in [10][11][12][38] obtains emotions from the fluctuations of the pulse and skin temperature, based on knowledge of the fluctuation analysis of biological signals. The Emotion Analysis System analyzes the balance between the sympathetic and parasympathetic nerves from the measured skin temperature and heart rate. The arousal and valence levels are then determined from the analysis results. Finally, the system classifies the emotion into one of the following four types based on the obtained levels: HAPPY, STRESSED, SAD, and RELAXED.
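The final classification step can be illustrated with a minimal sketch. The text does not state how arousal and valence levels are thresholded or which combination maps to which label, so the quadrant rule below (a common circumplex-style convention), the zero threshold, and the names Emotion and classify_emotion are all assumptions made for illustration, not the rule used by the NEC system.

```python
from enum import Enum


class Emotion(Enum):
    HAPPY = "HAPPY"
    STRESSED = "STRESSED"
    SAD = "SAD"
    RELAXED = "RELAXED"


def classify_emotion(arousal: float, valence: float, threshold: float = 0.0) -> Emotion:
    """Map an (arousal, valence) pair to one of the four emotion labels.

    Assumption: both axes are split at `threshold`, giving four quadrants.
    The source does not specify the actual thresholds or mapping.
    """
    high_arousal = arousal >= threshold
    positive_valence = valence >= threshold
    if high_arousal and positive_valence:
        return Emotion.HAPPY
    if high_arousal:
        return Emotion.STRESSED  # high arousal, negative valence
    if positive_valence:
        return Emotion.RELAXED   # low arousal, positive valence
    return Emotion.SAD           # low arousal, negative valence


print(classify_emotion(0.7, -0.4).value)
```

Under this assumed rule, each measurement window yields exactly one of the four labels, which is the form of correct answer data the later learning step requires.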
In order to investigate which machine learning method is suitable for the proposed system, we pretested the estimation correspondences of three machine learning methods: SVM, K-Nearest Neighbor (KNN), and random forest. Tables 1, 2, and 3 show the setting parameters for SVM, KNN, and random forest, respectively. These parameters were obtained by grid search. Table 4 shows the correspondences of SVM, KNN, and random forest, which were obtained by a leave-one-out cross-validation test. In Table 4, nine types of sensors were used. Using multiple sensor types, the average correspondence is 74.7% for SVM, 82.1% for KNN, and 86.7% for random forest. Since the random forest algorithm achieves the highest correspondence, the proposed system estimates emotions with the random forest algorithm. To create the decision trees of the random forest algorithm, the proposed system makes use not only of the environmental data collected from personal and indoor environment sensors but also of the emotion data from the NEC Emotion Analysis Solution, which are used as the correct answer. Environment sensor data are linked with the individual emotions obtained from the emotion analysis method [10][11][12], which are measured while the participants work in the experiment room. In the development phase, random samples are first selected from the collected data set. Next, a decision tree is created and grown for every sample. Estimation results are obtained from every decision tree. In the emotion estimation phase, measured sensor data are encoded. Thereafter, the prevailing emotion is selected through a majority decision. A decision tree is created for each person. The emotion of each person is then estimated from the decision tree.
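The comparison described above can be sketched with scikit-learn as follows. This is only an illustration under stated assumptions: the data are synthetic (nine features standing in for nine sensor types, four classes for the four emotions), the SVM is left at library defaults because its tuned parameters are not reproduced in this text, and the KNN and random forest settings follow the recoverable table values.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import LeaveOneOut, cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

# Synthetic stand-in for one person's data (NOT the study's data):
# 9 features (one per sensor type) and 4 emotion classes.
X, y = make_classification(
    n_samples=80, n_features=9, n_informative=6, n_redundant=0,
    n_classes=4, n_clusters_per_class=1, random_state=0,
)

models = {
    # Assumption: library defaults, since the tuned SVM values are not shown here.
    "SVM": SVC(),
    "KNN": KNeighborsClassifier(metric="minkowski", p=2),
    "random forest": RandomForestClassifier(
        n_estimators=30, criterion="gini", max_depth=30,
        min_samples_leaf=1, min_samples_split=2, random_state=42,
    ),
}

# Leave-one-out: each sample is held out once and predicted by a model
# trained on all remaining samples; the mean is the fraction predicted correctly.
loo_scores = {}
for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=LeaveOneOut())
    loo_scores[name] = scores.mean()
    print(f"{name}: {loo_scores[name]:.3f}")
```

The scores printed for synthetic data carry no meaning for the study itself; the sketch only shows how the three candidates can be evaluated under one protocol. RandomForestClassifier.predict already aggregates the individual trees, which corresponds to the majority decision step.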
Table 2. KNN parameters (partial).
The distance metric to use for the tree, metric: Minkowski
Power parameter for the Minkowski metric, p: 2

Table 3. Decision tree parameters of the proposed system.
Criteria for measuring the quality of a split, criterion: Gini
Maximum depth of decision tree: 30
Minimum number of samples required to be at a leaf node: 1
Minimum number of samples required to split an internal node: 2
Number of trees in the forest, n_estimators: 30
Randomness of the estimator, random_state: 42

Table 4. Emotion estimation accuracy for SVM, KNN, and random forest methods.

Sensor nodes. Each sensor node is composed of environmental data measuring sensors, a wireless sensor module (XBee), and a one-board microcomputer (Arduino). The operation of sensor nodes was carried out [...] Each sensor node measures the indoor environment data periodically. In this study, we developed personal, indoor environment, and thermography sensors in order to measure environment data related to human perception. The developed personal and indoor sensors are shown in Fig. 2. Personal sensors include temperature and humidity sensors (DHT11), illuminance sensors (TSL2561), blue light intensity sensors (LM393), sound sensors (DFR0034), odor intensity sensors (TGS2450), distance sensors (HC-SR04), and human detection sensors (SE-10). Indoor environment sensors include CO2 concentration sensors (MH-Z16), dust concentration sensors (GP2Y1010AU0F), and atmospheric pressure sensors (BME280). Finally, the point based thermo sensors are infrared array sensors (AMG8833). A point based thermo sensor measures the temperature around the sensor and sends the measured data (in degrees Celsius) as 8 × 8 points. Point based thermo sensors are used for measuring the surface temperature of humans. The server saves the collected data as CSV files. The files include the measured data, the sensor ID, and the sensor data reception time. The details of the construction of the proposed system are described in Ref. 32.
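As an illustration of how a logged point based thermo sensor record might be read back, the sketch below parses one CSV row into an 8 × 8 temperature grid. The column order (reception time, sensor ID, then 64 values in row-major order), the timestamp format, the sensor ID string, and the function name parse_thermo_row are all hypothetical; the actual file schema is described in Ref. 32, not here.

```python
import csv
import io
from datetime import datetime

import numpy as np

# Hypothetical row: reception time, sensor ID, then 64 temperatures in
# degrees Celsius. The values are made up for illustration only.
sample_csv = "2020-01-01T09:00:00,thermo-1," + ",".join(
    f"{20.0 + 0.1 * i:.1f}" for i in range(64)
)


def parse_thermo_row(row: list[str]) -> tuple[datetime, str, np.ndarray]:
    """Split one CSV row into reception time, sensor ID, and an 8x8 grid."""
    received_at = datetime.fromisoformat(row[0])
    sensor_id = row[1]
    values = np.array(row[2:], dtype=float)
    if values.size != 64:
        raise ValueError(f"expected 64 temperature points, got {values.size}")
    return received_at, sensor_id, values.reshape(8, 8)


reader = csv.reader(io.StringIO(sample_csv))
received_at, sensor_id, grid = parse_thermo_row(next(reader))
print(sensor_id, received_at.isoformat(), grid.shape)
```

A summary statistic of the grid (for example its maximum or mean) could then serve as a single input feature, although the text does not say which reduction, if any, was used.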
Data measurement using sensor nodes. Environmental measurement devices were composed of the developed sensors, a one-board microcomputer, and the XBee router. A star topology sensor network was constructed in two experimental rooms. There are three coordinator nodes, seven personal sensor nodes, two indoor environment sensor nodes, and two point based thermo sensor nodes in each room. Ten personal sensor nodes were placed around ten persons. Point based thermo sensor nodes were placed in front of Person 1 and Person 4. This study was performed in accordance with relevant guidelines and regulations. All participants gave written informed consent, and this study was approved by Chiba University. Table 5 lists the sensors equipped on each node. Each sensor node measures the environment every 10 s. The experiment was conducted over a period of 60 days.
Emotion estimation. The proposed system estimates emotions from environmental data. In particular, the proposed system uses neither image data nor vital data as input. Environmental data (i.e., temperature, humidity, illuminance, blue light intensity, loudness, odor intensity, human detection, distance, CO2 concentration, dust concentration, point based thermo sensor, and atmospheric pressure) were logged over a period of 60 days. Of the logged environmental data, 70% were used as training data, and the remaining 30% were used as test data.
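The 70%/30% hold-out procedure can be sketched with scikit-learn as below. The data are synthetic (twelve features standing in for the twelve logged quantities), and the shuffled, stratified split is an assumption: the text gives the proportions but not how the samples were assigned to each part.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in (NOT the study's data): 12 environmental features,
# 4 emotion classes.
X, y = make_classification(
    n_samples=500, n_features=12, n_informative=8, n_redundant=0,
    n_classes=4, n_clusters_per_class=1, random_state=0,
)

# 70% training data, 30% test data.
# Assumption: shuffled and stratified by emotion label.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=42,
)

clf = RandomForestClassifier(n_estimators=30, max_depth=30, random_state=42)
clf.fit(X_train, y_train)

# Estimation correspondence on the held-out 30%.
holdout_accuracy = accuracy_score(y_test, clf.predict(X_test))
print(f"hold-out correspondence: {holdout_accuracy:.3f}")
```

For time-series sensor logs, a shuffled split can place neighboring samples in both parts, so a chronological split is a stricter alternative worth considering.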

Results
We computed the results in Python with the scikit-learn library. The results were obtained within several seconds on an Intel Core i5 CPU.
Estimation correspondence of emotions. Table 6 shows the ratio of each emotion and the estimation correspondence for ten persons, which was obtained by a hold-out test. The ratio of each emotion is defined as the number of times that emotion appeared divided by the total number of observations obtained in the experiment. The emotions were estimated from nine types of sensors. Table 6 indicates that the emotion estimation correspondence exceeded 80% for 8 of the 10 subjects and exceeded 75% for the remaining two.
In order to confirm the absence of a difference between the measured data and the estimated data for a relatively small data size, we calculated the Bayes factor [39] under the hypothesis that these data are different (BF10) and obtained 0.328 using the Bayesian t test in JASP [40]. This value lies within the range of moderate evidence for H0 (below 1/3) [41]; that is, the measured data are not different from the estimated data. Table 7 shows the confusion matrices, which were obtained by a hold-out test. Table 7 shows that the appearance ratio of Happy or Stressed is relatively high, while that of Relaxed or Sad is relatively low. Table 7 also shows that the behavior of each emotion ratio differs from person to person. Figure 3 shows the estimation correspondence as a function of the number of data, which was obtained by a hold-out test. The emotions were estimated from nine types of sensors. This figure indicates that each estimation correspondence becomes saturated as the number of data increases. Table 8 shows the estimation correspondence of emotions, which was obtained by a leave-one-out cross-validation test. It indicates that the estimation correspondence differs according to the types and the number of input sensor data. Using multiple types of sensors improves the estimation correspondence. As presented in Table 8, the estimation correspondence exceeded 80% when multiple types of sensors were used. These results show that the WSN-based big data collection is useful for emotion estimation. Table 9 shows the importance of each sensor type, which was obtained by a hold-out test. Table 9 indicates that the importance of the CO2 concentration was relatively high for estimating human emotions. The importance of the point based thermo sensor was also relatively high.
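Confusion matrices and sensor importances of the kind reported in Tables 7 and 9 can be produced from a fitted random forest as sketched below. The data are synthetic, and the nine feature names are hypothetical placeholders, because the text does not enumerate which nine sensor types were used; the importance shown is scikit-learn's impurity-based feature_importances_, which may differ from the measure used in the study.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split

EMOTIONS = ["HAPPY", "STRESSED", "SAD", "RELAXED"]
# Hypothetical feature names for nine sensor types (illustration only).
SENSORS = [
    "temperature", "humidity", "illuminance", "blue_light", "loudness",
    "odor", "co2", "dust", "thermo",
]

# Synthetic stand-in for one person's data (NOT the study's data).
X, y = make_classification(
    n_samples=400, n_features=len(SENSORS), n_informative=5, n_redundant=0,
    n_classes=len(EMOTIONS), n_clusters_per_class=1, random_state=1,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42,
)
clf = RandomForestClassifier(n_estimators=30, random_state=42).fit(X_train, y_train)

# Rows are measured emotions, columns are estimated emotions.
cm = confusion_matrix(y_test, clf.predict(X_test), labels=[0, 1, 2, 3])
print(cm)

# Impurity-based importances; they sum to 1 across all features.
importances = dict(zip(SENSORS, clf.feature_importances_))
ranking = sorted(importances.items(), key=lambda item: item[1], reverse=True)
for name, value in ranking:
    print(f"{name}: {value:.3f}")
```

Impurity-based importances can be biased toward features with many distinct values, so permutation importance on the test set is a common cross-check.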

Discussions and conclusions
First, we discuss the impact of the emotion estimation correspondence in Table 6. The estimation correspondence of each person was shown to be about 80% or more. This result shows that the developed personal and indoor environment sensors are effective in estimating emotions. Tables 6 and 7 show that the ratio of Happy or Stressed is relatively high, while the ratio of Relaxed or Sad is relatively low. Tables 6 and 7 also show that the behavior of each emotion ratio differs from person to person.
Next, we discuss the impact of the number of sample data. Figure 3 shows that the estimation correspondences become saturated as the number of sample data increases. Although the estimation correspondences are unstable for a small number of sample data, the estimation correspondence of each person becomes stable given a larger number of sample data. Table 6 and Fig. 3 also show that the estimation correspondence exceeds 80% given a large number of sample data, although the ratio of each emotion fluctuates for a small number of sample data and the behavior differs from person to person.
Next, we discuss the number of types of sensors. Table 8 shows that the estimation correspondence differs by the types and the frequency of encoded sensor data. If the proposed system uses only a few sensors, it fails to realize high estimation correspondence. The estimation correspondence increases with the number of sensors, particularly when the number of sensors is within the range of one to four, and is almost saturated when the number of sensors is larger than five. Further, Table 8 shows that increasing the number of types of sensors may cause the estimation correspondence to decrease owing to over-fitting. Therefore, it is important, in terms of estimating emotions, that the proposed system is able to select the types of sensors to be considered. The experimental results indicate that using nine types of sensors achieved the highest estimation correspondence. The results in Fig. 3 and Table 8 show the effectiveness of the big data collection of the proposed system. Furthermore, the importance of each sensor is presented in Table 9. The importance of the CO2 concentration ranks the highest among the sensors, regardless of the person analyzed. This result implies that the CO2 concentration can affect emotion.
Since a point based thermo sensor can obtain the fluctuation of facial temperature, the importance of a point based thermo sensor was also relatively high. Although the importance of other sensor data depends on the persons analyzed, the emotion estimation correspondence of each person was still over 80%.
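The effect of the number of sensor types discussed above can be explored with a simple loop that adds one feature at a time and re-evaluates the model. The sketch uses synthetic data, adds columns in arbitrary order, and swaps the leave-one-out test for 5-fold cross-validation to keep the run short; none of these choices comes from the study.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

# Synthetic stand-in (NOT the study's data): 9 sensor-type features,
# 4 emotion classes.
X, y = make_classification(
    n_samples=300, n_features=9, n_informative=5, n_redundant=0,
    n_classes=4, n_clusters_per_class=1, random_state=0,
)

accuracy_by_count = {}
for k in range(1, X.shape[1] + 1):
    clf = RandomForestClassifier(n_estimators=30, random_state=42)
    # Use only the first k features, as if only k sensor types were installed.
    # Assumption: 5-fold cross-validation instead of leave-one-out.
    accuracy_by_count[k] = cross_val_score(clf, X[:, :k], y, cv=5).mean()
    print(f"{k} sensor type(s): {accuracy_by_count[k]:.3f}")
```

Plotting accuracy_by_count against k gives a curve where saturation, or a drop from uninformative features, can be inspected. An exhaustive search over sensor subsets would be needed to say which combination is best.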
Many studies have reported relationships between emotion and physical data such as odor, sound, lighting, and CO2 concentration [33][34][35]. Bombail reported that odours can also affect animal/human emotions by inducing a stress response [36]. Ayash et al. reported that student emotion and performance in learning environments were affected by illumination intensity and level [37]. Noguchi et al. found a relationship between the emotional state, respiratory rate, tidal volume, minute ventilation, and CO2 concentration [38].
Our personal and indoor sensors can measure multi-modal data, including the odor, sound, lighting, and CO2 data mentioned above, which are related to emotion. Our measured data and emotion predictions are thus indirectly supported by these prior studies.
In conclusion, this study proposed and built a customized emotion estimation model for individuals based on collected indoor environment data related to human perception. As a first step, we developed wireless sensor nodes to be used in monitoring working environments and estimating emotions. The developed system collected indoor environment data related to human perception via the WSN and emotions through the system of Ref. 10. In addition, the developed system integrated the indoor environment data with the emotion data. The proposed system then estimated individual emotions without image data from camera sensors or vital data from wearable sensors. This study also investigated whether sensory data are effective in estimating individual emotions. The experimental results showed that the proposed system achieved about 80% estimation correspondence by using multiple types of sensors, thereby demonstrating the effectiveness of the proposed system.
The finding that emotions can be estimated with high accuracy from environmental data is useful for future research. The obtained results may also contribute to building a less stressful environment. These are the contributions of this study to global innovation. Future work includes increasing the number of research subjects, conducting experiments that take seasonality into account, and creating a general estimation model. We will also examine whether it is possible to control emotions by changing the surrounding environment.