Detrended Multiple Cross-Correlation Coefficient applied to solar radiation, air temperature and relative humidity

Given the importance of generating energy sustainably, and with the Sun acting as a large solar power plant for the Earth, we study the cross-correlations between the main meteorological variables (global solar radiation, air temperature, and relative air humidity) from a global cross-correlation perspective, with a view to capturing solar energy efficiently. This is done first between pairs of these variables, with the Detrended Cross-Correlation Coefficient, $\rho_{DCCA}$, and subsequently with the recently developed Multiple Detrended Cross-Correlation Coefficient, $DMC_x^2$. We use hourly data from three meteorological stations of the Brazilian Institute of Meteorology located in the state of Bahia (Brazil). Initially, with the original data, we set up a color map for each variable to show the time dynamics. Next, $\rho_{DCCA}$ was calculated, yielding a positive value between global solar radiation and air temperature, and a negative value between global solar radiation and relative air humidity, for all time scales. Finally, $DMC_x^2$ was applied for the first time to analyze cross-correlations between three meteorological variables at the same time. Taking global solar radiation as the dependent variable, and assuming that $DMC_x^2 = 1$ (where $DMC_x^2$ ranges from 0 to 1) is the ideal value for the capture of solar energy, our analysis finds some patterns (differences) among these meteorological stations, which have a high intensity of annual solar radiation.

to be considered is the environmental impact caused by the use of silicon in the production chain of photovoltaic cells. However, the current and recurring challenge is to establish innovative manufacturing processes and to use high-performance materials in order to obtain greater efficiency at the collecting surface of photovoltaic cells.
The efficiency of photovoltaic cells depends not only on internal manufacturing factors, but also on external factors. External factors include shading by trees and/or clouds, rain, dust, solar radiation, air temperature, relative air humidity, and wind speed and direction, among others. Since the external factors are not controllable, research has sought to understand their effects. Due to their strong influence on the performance of photovoltaic cells, solar radiation and air temperature have been the most studied external factors [9][10][11][12]. There have also been many studies of the impact of air temperature on photovoltaic cells, but only a few studies address the impact of these environmental factors in regions with a tropical climate, such as the Northeast of Brazil, or examine how the variables are inter-related and interact 13. Thus, attention should not be focused only on energy generation: it is also important to have robust statistical tools in the environmental sciences for analyzing how the external meteorological variables are related, as proposed in this paper with the DCCA multiple cross-correlation coefficient 14.

Data Set
Because global solar radiation is measured only while there is sunlight on the sensor (pyranometer), the data were taken hourly from 10 to 21 h UTC (Coordinated Universal Time). Our data were obtained from meteorological stations managed by the Brazilian Institute of Meteorology (INMET) 15. To study the potential for solar power, we chose three meteorological stations located in the state of Bahia (Brazil); see Fig. 1. These stations are important because they have the best databases (of global solar radiation) 15, and because of the characteristics listed in Table 1.
Our first choice of analysis was to order the data in time in a color map, with the three meteorological variables side by side:
• (a) Global Solar Radiation (kJ/m²);
• (b) Air Temperature (°C);
• (c) Relative Air Humidity (%).
For each station, these variables can be seen in Fig. 2 (Barreiras), Fig. 3 (Cruz das Almas), and Fig. 4 (Salvador). In these figures, a day (with 12 h) starts at 10 h and ends at 21 h (UTC). It can be seen that the maximum global solar radiation is usually concentrated around the peak of sunlight (15 h). Naturally, this value fluctuates with the season of the year and the time of day. We can also note in the color map a direct relation between the variables, but the color map cannot robustly quantify that relation; it provides a visual display of the dynamic changes in the variables. The next step is to describe quantitatively the features of the data for the whole period, for example through the minimum and maximum values. These statistics can be refined further if the seasons are considered; more details can be seen in the results section below.
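The day-by-hour layout behind such a color map can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `to_day_hour_matrix` and `peak_hour_utc` are hypothetical, and the series is synthetic (a bell-shaped daily profile), not INMET data. The only facts taken from the text are the 12 hourly readings per day, from 10 h to 21 h UTC.

```python
import numpy as np

HOURS_PER_DAY = 12   # hourly readings from 10 h to 21 h UTC, as stated in the text
FIRST_HOUR_UTC = 10

def to_day_hour_matrix(series):
    """Reshape a flat hourly series into a (days, 12) matrix, one row per day."""
    values = np.asarray(series, dtype=float)
    n_days = values.size // HOURS_PER_DAY
    # Drop a trailing incomplete day, if any, before reshaping.
    return values[: n_days * HOURS_PER_DAY].reshape(n_days, HOURS_PER_DAY)

def peak_hour_utc(matrix):
    """Hour (UTC) at which the mean daily profile is largest, ignoring NaNs."""
    mean_profile = np.nanmean(matrix, axis=0)
    return FIRST_HOUR_UTC + int(np.nanargmax(mean_profile))

# Synthetic example (assumption): the same bell-shaped profile repeated for 30 days.
hours = np.arange(FIRST_HOUR_UTC, FIRST_HOUR_UTC + HOURS_PER_DAY)
profile = np.exp(-0.5 * ((hours - 15) / 2.5) ** 2)
radiation = np.tile(profile, 30)

matrix = to_day_hour_matrix(radiation)
print(matrix.shape, peak_hour_utc(matrix))
```

The resulting matrix can be passed to any image-plotting routine, with days on one axis and the hour of the day on the other. With this layout, a time scale of n = 12 points corresponds to one day of data.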

Results
Descriptive statistics. As a first result, we compute the mean values for each of the annual seasons in the southern hemisphere. To simplify the climatological calculations and keep them uniform, we choose the meteorological definition of the seasons 16, with:
• Spring from September/01 to November/30;
• Summer from December/01 to February/28;
• Autumn from March/01 to May/31;
• Winter from June/01 to August/31.
Figure 5 presents these mean values (computed at every hour, UTC) for the global solar radiation, air temperature, and relative air humidity in each season (Spring, Summer, Autumn, Winter). The radiation at the sensor has a distribution (apparently normal) characterized by a mode that varies with the season. The maximum global solar radiation occurs around 15 h (UTC), and its intensity depends on the location (see Table 2). The maximum is at 15 h, except at Salvador in the spring and summer, where it is at 16 h. The peak of maximum air temperature and minimum relative air humidity varies more with the season. There is a clear inverse relation between air temperature and relative air humidity, most evident for the Barreiras station, as shown in 17. However, depending on the location (northern hemisphere), such a relation is not always true 18. In this paper we also measured other moments, namely the relative standard deviation (%), the skewness, and the excess kurtosis $K_e$, where $\langle x \rangle$ is the mean and sd is the sample standard deviation; see Table 3 for the results. The highest relative standard deviation is for the global solar radiation (≈35%), except in the Spring and Winter for the Barreiras station, where the relative air humidity has the highest. For the skewness, we note in general values different from 0 (but close to it), indicating that the distributions are slightly asymmetric about the mean, with positive or negative skew.
www.nature.com/scientificreports
There is an excess kurtosis, $K_e \neq 0$, for most of the values found, with $K_e < 0$ (platykurtic distribution) and with $K_e > 0$ (leptokurtic distribution). These results indicate that the meteorological time series are non-stationary. However, these descriptive statistics are well known, and we want to propose something innovative in the study of direct or indirect relations between the main meteorological variables. We aim to do so with $\rho_{x_i,x_j}$ and $DMC_x^2$.
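The seasonal grouping and the moments discussed above can be illustrated with a short Python sketch. It assumes the usual textbook definitions (relative standard deviation as 100·sd/⟨x⟩ with the sample standard deviation, and moment-based skewness and excess kurtosis); the exact expressions used for Table 3 may differ, and the names `season_of` and `describe` are hypothetical.

```python
import numpy as np

# Meteorological seasons for the southern hemisphere, by month, as listed in the text.
SEASON_BY_MONTH = {
    9: "Spring", 10: "Spring", 11: "Spring",
    12: "Summer", 1: "Summer", 2: "Summer",
    3: "Autumn", 4: "Autumn", 5: "Autumn",
    6: "Winter", 7: "Winter", 8: "Winter",
}

def season_of(month):
    """Return the season name for a calendar month (1-12)."""
    return SEASON_BY_MONTH[month]

def describe(x):
    """Relative sd (%), skewness and excess kurtosis (assumed textbook definitions)."""
    x = np.asarray(x, dtype=float)
    x = x[~np.isnan(x)]                  # ignore missing readings
    mean = x.mean()
    sd = x.std(ddof=1)                   # sample standard deviation
    z = (x - mean) / x.std(ddof=0)       # standardized values for the moments
    return {
        "rel_sd_percent": 100.0 * sd / mean,
        "skewness": float(np.mean(z ** 3)),
        "excess_kurtosis": float(np.mean(z ** 4) - 3.0),
    }

# Toy values (assumption), not station data.
stats = describe([1.0, 2.0, 3.0, 4.0, 5.0])
print(season_of(12), stats)
```

A negative `excess_kurtosis` corresponds to the platykurtic case and a positive one to the leptokurtic case.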

DCCA cross-correlation coefficient $\rho_{x_i,x_j}$. Figure 6 presents the values of $\rho_{x_i,x_j}$ for the cross-correlations between global solar radiation × air temperature, global solar radiation × relative air humidity, and air temperature × relative air humidity (these are particular cases of $DMC_x^2$ for a pair of time series). We can see clearly that the variables are related for all time scales n and for all meteorological stations, because $\rho_{x_i,x_j} \neq 0$.

Table 3. Descriptive statistics of the variables: standard deviation (%), skewness, and kurtosis.

For global solar radiation and relative air humidity, it is possible to observe that $\rho_{x_i,x_j} < 0$ for all time scales n and stations. Again, if we take n = 360 as the reference, patterns for each of the stations can be identified. For the DCCA cross-correlation between air temperature and relative air humidity (Fig. 6), $\rho_{x_i,x_j} < 0$, which agrees with 17. However, if we want to study a cross-correlation between three (global solar radiation, air temperature and relative air humidity) or more variables, we must apply a method that generalizes $\rho_{x_i,x_j}$, such as $DMC_x^2$. The results of its application are presented next.

DCCA multiple cross-correlation coefficient $DMC_x^2$. Figure 7 presents $DMC_x^2$ for global solar radiation, air temperature and relative air humidity at the same time. From this figure, it can be seen that the variables are related globally, because they have DCCA multiple cross-correlations that range from weak to very strong. Intuitively, this result depends on the choice of dependent variable {y}, the station, and the time scale involved. Figure 7 shows that $DMC_x^2$ for {Air Temperature; (Global Radiation × Relative Humidity)} behaves similarly for n < 360 at all stations, but differs for n > 360. Up to n ≈ 70, {Air Temperature; (Global Radiation × Relative Humidity)} and {Relative Humidity; (Global Radiation × Air Temperature)} yield approximately the same value of $DMC_x^2$, going from very strong to strong DCCA multiple cross-correlation. For the Salvador station, the $DMC_x^2$ values marked (□) are closer to the other two curves than they are for the Barreiras station, mainly for small time scales n.
Our goal here is to study how the meteorological variables are related. We believe that this can help improve the efficiency of capturing solar energy in photovoltaic cells located in a region with climatic variations. To this end, looking at $DMC_x^2(n)$ and taking global solar radiation as the dependent variable, we obtain the results shown in Fig. 8. Setting $DMC_x^2 = 1$ as the value for maximum efficiency in capturing solar energy at all time scales, both short and long term, we can analyze each location separately, observe the dependence of $DMC_x^2$ on the time scale and location, and in this way define the potential efficiency of each location. In our case study, all stations have $DMC_x^2 > 0.2$, with levels of multiple cross-correlation lying between weak and very strong (see Table 4).

Discussion
Taking into account the global solar radiation, the air temperature and the relative air humidity, we have studied the cross-correlations from a global perspective by using the multiple detrended cross-correlation coefficient, $DMC_x^2$. Initially we provide, for better data visualization, the descriptive statistics of these time series. Then, with their mean values and with the capture and/or generation of solar energy in view, we found the maximum global solar radiation value. This maximum is usually concentrated at 15 h (UTC), but with different air temperature and relative air humidity depending on the season. With the standard deviation, skewness, and kurtosis, we can infer whether or not the probability distribution function approaches the normal distribution. In this paper, it can be seen that, depending on the season, the distributions deviate from the normal, characterizing these time series as non-stationary. However, this classical statistical analysis only takes each variable into consideration separately. To analyze the relations between them in pairs (or more), we must apply a statistical tool that has this capability.
Developed by Zebende 19, the Detrended Cross-Correlation Coefficient, $\rho_{x_i,x_j}$, was constructed in order to analyze the cross-correlations between pairs of non-stationary time series. It is robust when compared, for example, with Pearson's coefficient 20. For the three meteorological stations chosen here, the values of $\rho_{x_i,x_j}$ between the global solar radiation, air temperature, and relative air humidity (in pairs) depend on the geographical location and the time scale. There is a positive relation between global solar radiation and air temperature and an inverse relation between global solar radiation and relative air humidity.
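As a rough illustration of how such a coefficient can be computed, the Python sketch below follows the commonly described DCCA recipe: integrated profiles, overlapping boxes of n + 1 points, and linear detrending inside each box. The function name `rho_dcca`, the box layout and the synthetic series are assumptions made for illustration, not the authors' implementation.

```python
import numpy as np

def rho_dcca(x1, x2, n):
    """Sketch of the DCCA cross-correlation coefficient at time scale n."""
    x1 = np.asarray(x1, dtype=float)
    x2 = np.asarray(x2, dtype=float)
    if x1.size != x2.size:
        raise ValueError("both series must have the same length")
    # Integrated (profile) series.
    p1 = np.cumsum(x1 - x1.mean())
    p2 = np.cumsum(x2 - x2.mean())
    t = np.arange(n + 1)
    f12 = f11 = f22 = 0.0
    # Overlapping boxes, each holding n + 1 consecutive points.
    for i in range(p1.size - n):
        w1 = p1[i : i + n + 1]
        w2 = p2[i : i + n + 1]
        # Residuals around the local linear trend of each box.
        r1 = w1 - np.polyval(np.polyfit(t, w1, 1), t)
        r2 = w2 - np.polyval(np.polyfit(t, w2, 1), t)
        f12 += np.mean(r1 * r2)   # detrended covariance
        f11 += np.mean(r1 * r1)   # detrended variance of series 1
        f22 += np.mean(r2 * r2)   # detrended variance of series 2
    # Ratio of the detrended covariance to the product of the DFA fluctuations.
    return f12 / np.sqrt(f11 * f22)

# Synthetic series (assumption): y shares part of its fluctuations with x.
rng = np.random.default_rng(0)
x = rng.standard_normal(300)
y = 0.7 * x + 0.3 * rng.standard_normal(300)
print(rho_dcca(x, y, 12))
```

By construction the value lies between -1 and 1, and evaluating the function over a range of n gives the scale-dependent curves of the kind discussed for Fig. 6.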
As mentioned in the introduction, the efficiency of a photovoltaic cell depends mainly on internal factors, but external factors, as well as their inter-relations, are also important. Our goal here was to apply the multiple DCCA cross-correlation, $DMC_x^2$, to study globally the relation between three (main) variables involved in solar energy. It is noteworthy that such an application has not yet been performed, and this paper is the first to treat these three variables at the same time with $DMC_x^2$. Thus, assuming $DMC_x^2 = 1$ to be the ideal value for solar energy capture, and with global solar radiation as the dependent variable, we analyzed the meteorological stations of Barreiras, Cruz das Almas, and Salvador, located in the northeast of Brazil, an area with a high intensity of annual solar radiation, and noticed some patterns (differences). Because $DMC_x^2$ is a function of the time scale n, we can determine this multiple coefficient for small, medium, and long time scales. For example, for n = 12 (small time scale), the Salvador station has the best value for capturing solar energy according to $DMC_x^2$, with a very strong value. For n = 360 (one month), all stations have an intermediate value of $DMC_x^2$, that is, between medium and strong. For long time scales, however, Barreiras and Cruz das Almas have the best values for capturing solar energy when compared to Salvador.
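To make the role of the dependent variable concrete, the following Python sketch assumes that $DMC_x^2$ takes the classical multiple-correlation form, with Pearson coefficients replaced by DCCA coefficients at a fixed time scale n; the precise definition should be checked against ref. 14. The function name `dmc2` and the numerical coefficients are hypothetical.

```python
import numpy as np

def dmc2(rho_y_x, rho_x_x):
    """Multiple coefficient r^T R^{-1} r (assumed form, at one time scale n).

    rho_y_x: DCCA coefficients between the dependent variable y and each
             independent variable x_i.
    rho_x_x: symmetric matrix of DCCA coefficients among the independent
             variables, with ones on the diagonal.
    """
    r = np.asarray(rho_y_x, dtype=float)
    R = np.asarray(rho_x_x, dtype=float)
    # Solve R z = r instead of forming the inverse explicitly.
    return float(r @ np.linalg.solve(R, r))

# Hypothetical coefficients at one time scale (not values from the paper):
# y = global solar radiation, x_1 = air temperature, x_2 = relative air humidity.
rho_y_x = [0.8, -0.6]
rho_x_x = [[1.0, -0.5],
           [-0.5, 1.0]]
value = dmc2(rho_y_x, rho_x_x)
print(value)
```

With a single independent variable this expression reduces to the square of the pairwise coefficient, which is consistent with the pairwise results being particular cases of the multiple coefficient.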
Finally, it is worth pointing out that the stations are relatively close to each other, and that a study of other stations around the planet would be very welcome. However, the purpose of this paper was to apply a new method to the analysis of multiple cross-correlations between meteorological variables in a global (innovative) way. In conclusion, as the expression for multiple correlation is quite general, other variables can be employed, such as atmospheric pressure and wind speed, among others, adding even more information to the calculation of $DMC_x^2$.

Methods
To present the DCCA multiple cross-correlation coefficient 14, we employ the DCCA cross-correlation coefficient $\rho_{x_i,x_j}$ 19, which is defined in terms of the fluctuation functions $F_{DFA}(n)$ 21 and $F_{DCCA}(n)$:

$$\rho_{x_i,x_j}(n) = \frac{F^{2}_{DCCA\{x_i,x_j\}}(n)}{F_{DFA\{x_i\}}(n)\,F_{DFA\{x_j\}}(n)} \qquad (1)$$

The DCCA cross-correlation coefficient in Eq. 1 ranges between $-1 \le \rho_{x_i,x_j} \le 1$, and has been applied in several papers, such as 20,23-27, among many others 28. It is possible to generalize the idea behind $\rho_{x_i,x_j}$ to more than two variables, and such a new multiple coefficient is referred to as the DCCA multiple cross-correlation coefficient, denoted by $DMC_x^2$