A symbolic network-based nonlinear theory for dynamical systems observability

When the state of the whole reaction network can be inferred by just measuring the dynamics of a limited set of nodes the system is said to be fully observable. However, as the number of all possible combinations of measured variables and time derivatives spanning the reconstructed state of the system exponentially increases with its dimension, the observability becomes a computationally prohibitive task. Our approach consists in computing the observability coefficients from a symbolic Jacobian matrix whose elements encode the linear, nonlinear polynomial or rational nature of the interaction among the variables. The novelty we introduce in this paper, required for treating large-dimensional systems, is to identify from the symbolic Jacobian matrix the minimal set of variables (together with their time derivatives) candidate to be measured for completing the state space reconstruction. Then symbolic observability coefficients are computed from the symbolic observability matrix. Our results are in agreement with the analytical computations, evidencing the correctness of our approach. Its application to efficiently exploring the dynamics of real world complex systems such as power grids, socioeconomic networks or biological networks is quite promising.

Variables spanning the state space of a dynamical system which is irreducible to a few smaller subsystems are always dependent on each other through linear and nonlinear interactions. Consequently, one may expect to be able to determine an adequate subset of variables together with their well-selected Lie derivatives to get a full observability of the underlying dynamics, that is, for distinguishing all possible states of the network 1,2 . With the emergence of the Science of Complexity, complex networks are more and more often considered in various fields as well exemplified by power grids 3 , socio-economics networks [4][5][6] , or biological systems [7][8][9][10] . To allow a reliable monitoring, dynamical analysis or control of these high-dimensional systems, suitable and systematic techniques are required to identify the subset of variables providing the best (if not the full) observability of their underlying dynamics. A related problem is how to unfold the whole dynamics by completing this subset of variables to reconstruct a space whose dimension is at least equal to the dimension of the original state space.
Dealing with multivariate time series, specially those produced by high-dimensional dynamical networks, is not a trivial problem [11][12][13] . Attempts to estimate network observability using symbolic techniques 14,15 were made to overcome the large computational times associated with the exact analytical calculations. In those approaches, a dimension reduction is performed in real time on a symbolic observability matrix until state estimation is possible from the selected measurements. However, linear and nonlinear interactions among variables are considered on an equal footing while it is strongly required to distinguish them for a reliable assessment of the observability of a system 2,13 . In order to tackle such a challenging task, we propose a methodological approach that will be applied to reaction networks derived from dynamical systems with appropriately large dimension to corroborate our assessments with rigorous analytical calculations, and yet provide a framework making also possible the verification of observability in networked dynamical systems. The chosen reaction networks are models of interesting biological and physical systems: the circadian oscillation in the Drosophila period protein, the Rayleigh-Bénard convection, and the DNA replication. They also represent nonlinear systems with increasing nonlinear complexity, commonly observed in other natural and man-made systems. Therefore, they are an appropriate subset of

Results
On rare occasions, nonlinear systems are fully observable from just a single scalar time series 20 as previously investigated for many chaotic systems [21][22][23] . Since full (global) observability warrants that every distinct point of the original state space x d ∈  can be univocally identified, there is a great interest to target the minimal set of variables to measure for accomplishing such a full observability condition.
As shown in Section Methods, to assess the (local or global) observability through a given measurement vector s, both a subset of m variables-sometimes designated as "sensors" 24 -and the Lie derivatives have to be provided. Our aim is therefore to provide a method that can indeed solve the problem of determining the minimum set of variables to measure for observing a large complex system. In order to avoid testing the rank of the observability matrix via algebraic computation, we propose to use a technique based on a symbolic computation of the observability matrix in which the terms are not explicitely expressed but only their linear, nonlinear polynomial or rational character 2,13 .
The general and systematic procedure developed by Bianco-Martinez and coworkers 2 and compiled in the Methods section, is in fact very time consuming since the computation of the observability coefficients corresponding to all the 5.2 ⋅ 10 6 possible vectors spanning a reconstructed space of a 13-dimensional system would require intense and long computational time (more than 18 days). One optimization strategy is to reduce the number of possible combinations by identifying candidate variables that should be disregarded as members of the measurement set. A lack of observability has its origin in the existence of a singular observability manifold, a domain in the original state space where the determinant Det  of the observability matrix  is zero 25 . Let us note here that there is one very special case in which Det  ≈ 0 and, consequently, practical problems in the state estimation may occur. This usually happens when Det  depends on some parameter(s) which may be arbitrarily small 26 . In general, a linear system may be (rarely and practically) non-observable with a nonzero determinant of the observability matrix. By construction, a null or non-constant determinant Det  is rooted in a null or a nonlinear component in the Jacobian matrix. Our technique relies precisely on tracking those nonlinear terms in the Jacobian matrix and, therefore, taking into account both linear and nonlinear interactions between variables becomes so relevant in assessing observability.
By analogy with what is done for chemical reactions 27 , it is possible to consider any dynamical system as a reaction network, whose associated weighted adjacency matrix is the symbolic Jacobian matrix ∼  . Using the terminology from graph theory, we define the linear out-strength of the node i, σ i ( ) out lin , as the number of times the ith variable appears in linear terms in the governing equations, that is, The larger is σ i ( ) out lin , the higher the probability the ith variable needs not to be measured because it is related to other variables via linear couplings which will not induce nonlinear terms in the determinant of the observability matrix.
The situation in which   = = J J 1 ij ji and i j ( ) ( ) 1 out lin out lin σ σ = = means that variables i and j are exclusively linearly coupled with no other variables involved. Consequently, full observability of the ith variable can only be accessed by measuring the jth variable and vice versa. It is thus necessary (and sufficient) to measure at least one of them because they cannot be simultaneously excluded from the set of measured variables. A second criterion to decide which variable to choose between these two is needed. The idea is built on how a variable candidate to be Scientific REPORTS | (2018) 8:3785 | DOI:10.1038/s41598-018-21967-w non-measured is influenced by the other candidate variables. Let be {x k } the set of variables candidate to be non measured with k ∈ V nm ⊂ {1, 2, ..., m} and V nm the set of integers indexing the non-measured variables. Thus, the in-strength of the ith variable provided by the non-measured ones is defined as Using this correspondence for the symbolic terms, we can assume that the larger is σ i ( ) in nm , the less observable through the ith variable the system is. The rationale is as follows: the more nonlinearly coupled is the ith function f i (x) with the non-measured variables, the larger the degree of the determinant of the observability matrix, and the less observable the dynamics through the measurements is 22 . Therefore, we should preferably remove that variable with the largest non-measured in-strength σ in nm . Therefore, the minimal set of variables to measure for reconstructing a state space with a full observability can be automatically determined from (i) the symbolic Jacobian matrix ∼  of the system under study, (ii) the linear out-strength out lin σ , and (iii) the in-strength in nm σ provided by the non-measured variables. Note that the knowledge of the exact functional dependence of the coupling between variables is not necessary, only its polynomial or fractional nature 2 . From the vector of state variables x, those components x i having the largest σ out lin are discarded as candidates to be sensors after having checked they are present at least once in the equations governing the dynamics of a sensor variable. All remaining possible embeddings s that can be constructed from the final set of variables candidate to become sensors are then tested using a Matlab ® algorithm and ranked according to the corresponding estimated symbolic observability coefficient η s .
In order to validate whether our proposed method is in agreement with algebraic computations, we will consider three dynamical systems of increasing dimension (d = 5, 9 and 13) describing complex systems coming from biology or physics. As it is known that the presence of symmetries in the state space can affect the assesment of observability 22 , we will also consider the case of equivariant dynamical systems obeying f(Γ ⋅ x) = Γ ⋅ f(x), where Γ defines a discrete symmetry like a rotation or an inversion 28 . A 5D rational model for the circadian PER oscillations in Drosophila. In our attempt to consider biological or physically motivated systems, let us start with the model proposed by Goldbeter for the circadian oscillation in the Drosophila period protein 29 . This is a five-dimensional rational model which produces a limit cycle for the parameter values initially reported 29 . This system is interesting in the sense that its complexity already presents a big challenge from the analytical point of view. The corresponding symbolic Jacobian matrix reads as x x x x x ( , , , , ) i i i i i whose observability to span the state space of the original system is estimated. According to the observability coefficient values, the ranking of the variables providing better observability is 3 . This is in a rather good agreement with the analytical determinants , and where complexity exceeds our computational abilities; since the simpler determinant (with singularity of the 10th degree) is obtained for variable x 5 providing the best observability, then variable x 1 is associated with a determinant with a singularity of the 14th degree, and the three variables x 2 , x 3 and x 4 providing the poorest observability are associated with determinants too complicated to be computed with MAPLE ® . The number (125) of all possible combinations of dimension 5 is still sufficiently small for allowing a systematic computation of the corresponding symbolic observability coefficients η s . Prior to carry out those computations, we conducted our a priori analysis to target the candidate variables to be discarded. The linear out-strengths are (1) ( 4) (5) 1 out lin out lin out lin σ σ σ = = = , the two others being null. Among the off-diagonal terms of the symbolic Jacobian matrix which are equal to 1, we have J 45 = J 54 = 1, meaning that at least one of the two variables x 4 and x 5 has to be measured. Therefore, this suggests that the sets with the minimum number of sensors providing full observability comprise at least three variables, either (x 2 , x 3 , x 4 ) or (x 2 , x 3 , x 5 ). In order to have a five dimensional space, these two sets have to be completed with two Lie derivatives. It turns out that only two combinations yielded full observability: ( 1 ) x x x where the derivatives of x 2 and x 4 and of x 2 and x 5 are, respectively, the ones completing each set of sensors. These results are algebraically confirmed by the constant determinants of the observability matrix (or equivalently of the Jacobian matrix of the coordinate transformation between the original state sapce and the reconstructed space 30 ) and equal to k k x x x s2 , thus reflecting a significant lack of observability. This is further supported by the corresponding determinant which is now no longer constant as it depends on variable x 4 . There is thus a singular observabiliy manifold. On the other hand, if the derivative of the second variable is substituted with the derivative of the third one, the Jacobian matrix J of such a transformation Φ x x x 2 3 2 5 2 can be rank deficient with a null determinant for some domain of the original state space.
Finally, we wanted to assess the observability of the couple of variables {x 1 ,x 5 }, identified as sensors of the system (3) in a previous work 16 . Surprisingly enough, we got 0 48 and 0 = for all the possible 5 dimensional vectors constructed with those two variables. These symbolic observability coefficients are also in agreement with the determinants of the Jacobian matrices of the corresponding transformations, whose rational dependence on variables x 2 and x 3 defines a singular observability manifold associated with the transformation x x 1 5 4 Φ , and the other three , characterizing a rank deficient observability matrix . This is therefore a first evidence that our method to reduce the number of sensor variables correctly assesses the observability of this rather complex reaction network.
A 9D system for the Rayleigh-Bénard convection. Let us consider now a nine-dimensional system describing the dynamics of three-dimensional fluid cells with a square plateform in a Rayleigh-Bénard convection 31 . It was obtained by applying a triple second-order Fourier series ansatz to the governing hydrodynamic equations. The equations read aṡ In fact, eight variables are symmetry-related by pairs, namely ( , and (x 7 − x 8 ), and variable x 6 is the single one left invariant under the symmetry (9). Up to four co-existing chaotic attractors were observed in this system 31 . An example of one of those chaotic attractors is shown in Fig. 1 for the (10). And a = 0.5. This system is irreducible in the sense that it cannot be split in lower-dimensional independent systems. This system is an interesting example because it constitutes a highly connected reaction network for which a graphical approach as the one developed by Liu and coworkers 17 leads to only measure one of its nine variables to estimate its states (see the Supplementary Section S1).The variable that least influences the others (or equivalently, the one least "seen" by the rest) is x 6 (σ out (6) = 2) since it only affects nonlinearly the derivative of x 9 . From a symmetry point of view, variable x 6 must be measured to recover the right symmetry property: without this variable, the reconstructed state space would be necessarily associated with an inversion symmetry (a symmetry the original system does not have).
The symbolic Jacobian matrix of system (8) is of the 9 variables which are (2) ( 4) These values indicate that variables x 2 , x 4 , x 5 , x 6 are necessarily to be included in the list of variables to be measured for obtaining a full observability since none of them linearly affect the dynamics of the rest. The other five are candidate variables to be removed from the measurements. Their respective in-strengths coming from the non measured variables are:  ) and (x 3 , x 8 ) thus form two pairs of mutually exclusive variables. Variable x 9 is the single one not involved in an exclusive pair and can be removed from the set of measured variables. To decide which variable from each pair can be safely removed, we check which variables have the largest in-strength from the candidate variables to be non measured. The comparison returns that variables x 7 and x 8 are the ones to be removed since σ The first test to assess the accuracy in selecting the minimal set of variables providing the highest observability consists in systematically investigating those combinations where all the variables are measured except one. The symbolic observability coefficients are reported in Table 1. In all cases providing full observability with just a single variable not being considered, the discarded measure matches one of the candidate variables x 1 , x 3 , x 7 , x 8 and x 9 (marked in bold face in the table) confirming our preselection analysis.
Our systematic computation of the symbolic coefficients allows us to quantify the number of times N i (η) the variable x i is not part of an embedding providing a given observability value η. In particular, the values of N i (η = 1.0) for the variables potentially candidate to be non measured (  Table 2), support our initial choice for not measuring x 7 , x 8 , and x 9 but measuring x 1 and x 3 , since x 7 , x 8 and x 9 seem to be less essential for providing full observability. Consequently, as long as full observability is required, our two network-based criteria correctly identify those variables whose absence from the set of sensors does not affect the full observability of the system. Indeed, when the minimal number of variables, that is, m = 6, is measured, the two possible combinations providing a full observability correspond to a space reconstructed from variables x 1 , x 2 , x 3 , x 4 , x 5 and x 6 (see the Supplementary Table S1. We therefore correctly assessed the best variables to measure for getting full observability with the minimum of variables. Of course, it is also possible to get full observability by measuring more than 6 variables. In that case, we searched for them among the 8 preselected variables using the linear out-strength. From the 354 possible combinations, we obtained 6 combinations using 7 measured variables and 2 with 8 measured variables. Performing a full blind search, with no preselection, from a total number of 1080 combinations with 7 or 8 measured variables, we found 2 and 6 additional combinations providing full observability, respectively. All of them are reported in the Supplementary Table S1. We also investigated the combinations with less than 6 measured variables and providing the largest symbolic observability coefficients whose dependency on m is shown in Fig. 2. For m = 5 (m = 4, 3, 2 and 1) there are 4 (2, 6, 4, and 4, respectively) combinations with η = 0.90 (η = 0.73, 0.36, 0.14, 0.04, respectively). All those combinations are made up of the six preselected variables identified by solely using the symbolic Jacobian matrix and σ out lin and in nm σ to rank them. The non-preselected variables can be involved in reconstructed vectors when a good but not a full observability is desired or when m > 6 (a good observability is considered when η > 0.75 as reported in a previous work 33 ). For instance, this is exemplified in the middle and lower part of Table 2 with a systematic computation of the symbolic observability coefficients η for embeddings built from 5 or 6 measured variables. In this case, we observe that a very good observability (η = 0.90) can be obtained with only 5 variables being measured, namely (x 1 , x 2 , x 3 , x 4 , x 5 ). The fact that variable x 6 is not included, prevents from a full observability and, in particular, its absence induces a lack of symmetry, as previously discussed. Again, by looking at the distribution of N i (0.9) we have: N 1 (0.9) = 5, N 3 (0.9) = 5, N 7 (0.9) = 14, N 8 (0.9) = 14, and N 9 (0.9) = 20. Therefore, for this level of observability (η = 0.9) the last three variables, x 7 ,x 8 , and x 9 can be again chosen to be removed from the set of observations. If we accept a slightly lower observability coefficient (η = 0.80), other possibilities emerge in which variable x 6 is systematicaly removed from the set of measured variables (last part of Table 2). Our two criteria are thus very efficient to detect those variables not really impacting the access to a full observability measure, but discard some possibilities offering a good (but not full until m ≤ 6) observability with severe consequences on the symmetry properties of the reconstructed attractor.
A 13D model for DNA replication. A third and even more challenging case is now considered, a 13-dimensional model for the DNA replication in fission yeast. Fission yeast cells are carrying two mutant genes, wee1 − and cdc25 OP , which initiate mitosis in eukaryotic cells before the end of their DNA replication. A second feature is that DNA synthesis can be restarted without intervening mitoses. Novak and Tyson proposed a model for cell cycle in fission yeast taking into account these two properties in Schizosuccharomyces pombe 34 . The underlying mechanisms are described by the set of thirteen differential equations   This 13-dimensional rational model is characterized by the symbolic Jacobian matrix According to the linear out-strengths σ out lin , we have five candidate variables eligible to be excluded from the observations since having not null σ out lin values: . Variables x 4 , x 9 , and x 10 can therefore be safely removed from the set of measured variables and x 1 and x 8 can not be simultaneously removed. The in-strength in nm σ from the set of potentially non-measured variables is equal to 2 for all the candidate variables except for x 9 which is σ = (9) 0 in nm . Our criteria does not allow us this time to resolve the uncertainty between x 1 and x 8 .
Following the same procedure as with the two previous examples, we performed our a priori analysis by systematically computing the symbolic observability coefficients when a single variable is removed and collecting only those combinations providing either full or null observability (see Table 3). As expected, when one of the variables x 1 , x 4 , x 8 , x 9 or x 10 is not included in the observation set of 12 variables plus one derivative, the observability is full. On the contrary, if one of the variables x 6 , x 7 , x 11 and x 12 is removed, the symbolic observability coefficient drops to zero for any possible choice of the first derivative. These variables are therefore essential and need to be measured. This is due to the fact that these variables have no out-connection other than to themselves as shown in the corresponding columns of the symbolic Jacobian matrix in Eq. (13).
Let us now validate whether it is possible to retrieve a full observability when either the set {x 1 , x 4 , x 9 , x 10 } or {x 4 , x 8 , x 9 , x 10 } are removed from the list of variables to measure. This was performed by systematically computing the symbolic observability coefficients for all the combinations reconstructing a 13-dimensional space without taking into account those two sets of variables. We found that for this DNA model, there are not too many possibilities to reconstruct a space providing full observability of the original state space (see the Supplementary  Table S1). For instance, when removing two of them, x 8 and x 9 the reconstructed state vector is the only one providing full observability. If we discard three variables (x 8 , x 9 , and x 10 ), there are two combinations allowing for a full observabiltiy embedding: ( , , , , , , , , , , 2 3 3 4 5 6 7 10 11 12 13 . A systematic analysis of the symbolic observability coefficients as the number of variables are removed from the set of observations indicate that the coefficient already drops to 0.93 when m = 9 variables are measured (see Fig. 3). As the estimated threshold for an optimal observability is 0.75 33 , it is worthless to investigate sets of size smaller than m = 7.
To actually check the results accounted for the symbolic observability coefficients η s , we computed all the determinants Det  s corresponding to η s = 1 (results are reported in the Supplementary Table S1). In the 14 cases for which the symbolic observability coefficients are equal to one, the determinant Det  s was always nonzero (for the whole state space): our technique always correctly identify reconstructed vectors providing full observability of the original space. As another example, as shown in Fig. 3, full observability is never achieved for m = 9 and the largest symbolic observability coefficient is 0.93, which still provides a good observability. By using the reconstructed space  Table 4) the determinant is zero for x 12 = 1, a singular observability manifold of first order, explaining why the observability coefficient is no longer equal to 1 but close to it. To further show how the observability coefficient decreases when Det  s vanishes for a singularity of higher degree 22 ( , , , , , , , , , 1 2 2 3 3 6 7 10 11 11 12 12 13 providing a slightly smaller observability (η = 0.86) in agreement with the singular observability manifold (15) of second order, defined by Det  s = 0, that is, by (x 11 − 1)(x 12 − 1) = 0. Due to a too large complexity, it was not possible to analytically compute the observability matrix when a single variable is measured.
As detailed in the Supplementary Section S1, Liu and coworkers' graphical technique shows that by measuring the four variables x 6 , x 7 , x 11 , and x 12 it is possible to estimate the states of the system. Clearly, our results are in strong disagreement. At this point, it is relevant to explain why our results are so different from those reported in previous works 16,17 . The first reason for the discrepancy is that Liu's algorithm uses a linear theory, only taking into account whether the ith variable participates or not in the differential equation of variable j and not concerned on how is that dependence. The latter is just equivalent to use a symbolic Jacobian matrix equal to 1 and not the one defined in Eq. (13). Despite there are more than 2100 combinations with 6 or 7 measured variables resulting in full observability, there is none with a single variable. However, with a linear approach, it is still possible to show that the combinations providing full observability correctly identify the variables which must necessarily be used (Table 5), that is, variables x 6 , x 7 , x 11 , and x 12 . Nevertheless, the observability is obviously Non-measured Derivative retained η x 10 x 2  or  x 8 1.00 Table 3. Symbolic observability coefficients when twelve (out of thirteen) variables of the DNA model (12) are measured. The derivative used for reconstructing a 13-dimensional state space is also reported.

Discussion
The observability of a complex system refers to the property of being able to infer its whole state space by measuring the dynamics of a limited set of its variables. Determining the conditions that guarantee the full observability of a system involves testing a number of possibilities that increases exponentially with the dimension of that system and, for each case, it is required to compute the determinant of the observability matrix defining the singular observability manifold, that is, the subset of the state space that cannot be observed from the measurement 25 . It was shown in one of our previous works 2 that for a five-dimensional rational system, the analytical computation of such a determinant may already exceed the capacity of softwares like Maple ® or Mathematica ® .
Therefore, alternative approaches to investigate large complex systems are needed. Those proposed for instance by Sedoglavic 16 or Liu and coworkers 17 remain yet unsatisfactory as discussed by Wang and coworkers 35 , mainly because they do not provide a method to select which Lie derivatives accompany the measured variables and, more importantly, they do not consider a nonlinear observability theory appropriate to deal with nonlinear systems, nonlinearities occuring in the node dynamics or nonlinearly coupled units. Actually, the treatment proposed by Sedoglavic 16 is only probabilistic and tests local observability, not the global one. On the other hand, the graphical approach developed by Liu and co-workers 17 is based on a linear description of the system which can only lead, by definition, to approximated results since, as previously discussed, the lack of observability mainly originates in the location (in the fluence graph) of nonlinear terms. We here investigated the three systems considered by Liu and coworkers (see the Supplementary Section S1) and showed that, in contrast with our results, theirs are not in agreement with the analytical predictions. In our previous work 2 , we investigated the same five-dimensional rational system considered by Sedoglavic 16 . While in the latter reference, the algorithm developed by the author identifies the first variable as the one providing (in fact local) observability, it is only poorly the case when the symbolic algorithm developed in the former one is applied to the possible combinations using this variable, even combined with other variables. And what is even more questionable, it is that when x 1 is combined with one of the four other variables, x 2 , x 3 , x 4 and x 5 , the largest symbolic observability coefficient is still very small, that is, 0.30, 0.18, 0.30, and 0.48, respectively.
We have shown how the efficiency of the algorithm initially proposed by Bianco-Martinez and coworkers 2 is improved by identifying the minimal set of measured variables providing full observability before any search for the corresponding Lie derivatives. The reduced sets of candidate variables capable of fully reconstructing large reaction networks was correctly determined by analyzing the way the variables interact, by only applying two simple criteria on the symbolic Jacobian matrix of the networked system. For the 13 DNA model, there are 5.2 ⋅ 10 6 possible combinations to test (see the Supplementary Section S2 for the details), requiring more than 18 days of computations with a 2.5 GHz Intel Core i5 processor. With our preselection of variables, only 2870 combinations are needed to be tested lowering the computation time to about 4 min, that is, by a factor greater than 1800! These criteria reduce drastically the time spent for searching candidate variables, thus providing the grounds to observe natural and man-made complex systems.
In order to evaluate the reliability of our procedure, we computed a success rate defined as the number of times a symbolic observability coefficient equal to 1 actually corresponds to getting a constant analytical determinant of x 10 x 11 x 12 x 13 η   9  2  2  2  -1  1  1  ---1  2  1  0.93   9  2  2  2  -1  1  2  ---1  1  1  0.93   9  2  2  2  -1  2  1  ---1  1  1  0.93   9  2  2  2  -2  1  1  ---1  1  1  0.93   8  2  2  2  --1  1  ---2  2  1  0.86   8  2  2  2  --1  2  ---2  1 1 0.86 the observability matrix. For the 42 combinations providing η = 1, the success rate is 100%. While we were able to identify and algebraically check all the resulting combinations for the 5D and 9D models, it was impossible for the 13D model due to the large amount of them. In the case of the 9D model, for which we obtained m p = 6 preselected variables, our procedure missed 2 out of 8 combinations with m = 7, and 4 out of 6 with m = 8 (see the Supplementary Table S1). The missed combinations involve at least one variable which was not preselected and, consequently, not considered in our computations. When m = 6, some combinations are associated with a symbolic observability coefficient equal to 0.90: in that case, 36 out of 54 corresponding to this value of the symbolic observability coefficient were made up of the preselected variables. When m = 5 < m p , 100% of the combinations (34) associated with the largest symbolic observability coefficient (0.90) involved the preselected variables. It is important to note that, when m ≤ 6, all combinations providing the largest symbolic observability coefficient (see Fig. 2) are made up of the preselected variables (and are actually found). This means that the preselected variables are indeed the revelant ones for estimating the system states and, that all combinations using these variables are correctly identified. To the best of our knowledge, we have a single case for which the full observability was not detected by our procedure 33 : it corresponds to the rare case for which two nonlinear terms cancel each other in the computation of the determinant of the observability matrix. Finally, as firstly reported by Parlitz and coworkers 36 , the observality of a system could be addressed by using delay coordinates. As shown by Gibson and coworkers 37 , delay coordinates are related to derivatives by a rotation and a rescaling. Consequently, any result valid for derivative coordinates (not affected by rotation and rescaling) holds for delay coordinates. The reduced sets of m measured variables (m < d) are not dependent on the use of delay or derivative coordinates, only on the choice of the complementary coordinates to reconstruct a d-dimensional space. Therefore, the extension of the technique proposed in this work to networks of discrete time systems (discretization of continuous-time systems) and iterated maps seems to be rather straightforward according, for instance, to Sarachik and Kreindler 38 and to Nijmeijer 39 , respectively.  (17) is said to be state observable if and only if the observability matrix has full rank, that is, rank ( s ) = d. Notice that, the full observability of a system is determined by the space spanned not only by the measured variables but also by their appropriate Lie derivatives 1 .
The observability matrix  s corresponds in fact to the Jacobian matrix of the change of coordinates Φ s : x → X where ∈ X d  is the reconstructed state vector from the m measured variables and their adequately chosen d − m Lie derivatives 30 . Let us make explicit the observability matrix x x i j  for the situation where two arbitrary variables x i and x j are measured directly, that is, when s = (h 1 (x),h 2 (x)) = (x i , x j ), and h 1 and h 2 are two measurement functions. In this case, the observability matrix reads as, where the order Lie derivatives k and l are such that k + l = d − 2, that is, there are d − 2 + 1 possibilities for choosing k and l. According to the Takens theorem 41 , it is possible to increase the dimension up to 2d H + 1 where d H is ideally the Haussdorff dimension of the attractor to ensure a global diffeomorphism between the original state space and the reconstructed one, as long as the measurement function is generic. Showing that a measurement function is generic is not a trivial problem which is out of the scope of the present work. There is, therefore, no guarantee that the Takens theorem applies here. Moreover, our aim is to select the minimal set of measurements providing the best observability of the system. When a higher-dimensional reconstructed space is considered, this means that a global diffeomorphism perhaps may be obtained but it also means that the measurements provide information that is non-optimal and from which the analysis is most likely problematic and tricky 42,43 . Consequently, investigating higher-dimensional reconstructed spaces has a rather limited interest in the present context. The fact the system is fully observable from the two measured variables considered in the matrix (20) depends also on the particular choice of the pair (k, l), the numbers of successive derivatives computed from x i and x j , respectively. Therefore, it is crucial to specify how the measured variables and their derivatives are used to reconstruct the state space. An approach-as the ones developed in other works [16][17][18] -missing this necessary condition cannot indeed properly address the problem of full (or even good) observability.
Symbolic observability formalism. The procedure to calculate the symbolic observability coefficients is implemented in four steps as follows: i) Construction of the symbolic Jacobian matrix ( ∼  ). The Jacobian matrix  , composed of elements J ij , of the system (17) is transformed into a symbolic Jacobian matrix ∼  by replacing each linear element J ij by 1, each non-linear polynomial element J ij by, and each rational element J ij by when the j th variable is present in the denominator or by 1 otherwise. This is more or less equivalent to the so-called influence (or fluence) diagram as used by Letellier and Aguirre 23 where linear and nonlinear coupling terms are associated with solid and dashed arrows, respectively, and as used by Liu and coworkers 17 where coupling terms are labelled with arrows (without distinguishing linear from nonlinear couplings). In the present approach, rational terms are distinguished from nonlinear polynomial terms since they strongly reduce the observability 2 . ii) Construction of the symbolic observability matrix (  O s ). Let us consider for simplicity a univariate measurement s = h(x) = x i . For this particular case, the first row of O  s is just defined by the derivative of the measurement function dh, that is, O 1 , and the construction of each block follows the same rules described above for univariate measures. iii) Computation of the symbolic observability coefficients. The determinant of O  s is computed according to the symbolic product rule defined in (21) and expressed as products and addends of the symbolic terms 1, 1 and 1, whose number of occurrences are stored in variables N 1 , N and N, respectively. A special condition is required for rational systems such that, if N = 0 and N ≠ 0 then N = N. The symbolic observability coefficient for the measurement s is then equal to