Self Calibrated Wireless Distributed Environmental Sensory Networks

Recent advances in sensory and communication technologies have made Wireless Distributed Environmental Sensory Networks (WDESNs) technically and economically feasible. WDESNs offer an unprecedented tool for studying many environmental processes in a new way. However, calibrating a WDESN remains a major obstacle to such networks becoming common practice. Here, we present a new, robust and efficient method for aggregating measurements acquired by an uncalibrated WDESN and producing accurate estimates of the observed environmental variable's true levels, thereby rendering the network self-calibrated. The suggested method is novel both in group-decision-making and in environmental sensing, as it offers a valuable tool for aggregating distributed environmental monitoring data. Applying the method to an extensive real-life air-pollution dataset showed markedly more accurate results than the common practice and the state-of-the-art.

Problem Statement. The increasing availability of sensors and communication technologies has both facilitated 1,2 and catalysed 3,4 the development of Wireless Distributed Environmental Sensor Networks (WDESNs) that consist of low-cost Micro Sensing Units (MSUs). WDESNs present an unparalleled means for studying environmental processes such as air-pollution [5][6][7][8], water quality 9,10, smart cities 11,12 and wildlife ecosystems 13,14. These networks may consist of many sensing nodes and may be deployed over large geographical areas, which makes calibrating the nodes a major obstacle to WDESNs becoming common practice.
Here we present a new, robust and efficient method for aggregating measurements acquired by an uncalibrated WDESN and producing accurate estimates of the observed environmental variable's true levels. To accomplish that, we introduce a new group-decision-making method: consensus aggregation of incomplete ratings. The suggested methodology produces accurate results without requiring the MSUs that constitute the WDESN to be calibrated. Thus, after the aggregation process, the proposed methodology renders the network self-calibrated.
Without loss of generality, let us now consider a WDESN with K sensory nodes that measure the same physical phenomenon. The same physical phenomenon is naturally measured when the MSUs are collocated [5][6][7]. Even when the sensors are not collocated, measuring the same phenomenon can be achieved when it is uniform across all measuring points 7. That said, due to the MSUs' inherent limitations, collocating is currently the common practice [5][6][7][8]. MSU k (k = 1, …, K) measures the pollutant's levels at a given frequency, generating a time series, $a^k$. The goal then is to find a consensus time series, r, that agrees the most with all the MSUs' acquired time series, $\{a^1, a^2, \ldots, a^K\}$. The agreement of r with each acquired time series, say $a^k$, is measured by a distance function, $d(a^k, r)$, that fulfills a set of axioms [15][16][17]. Examples for d() are the $L_1$ and $L_2$ norms, and the Kemeny & Snell distance 16. The consensus time series is then the solution of

$$r^{*} = \arg\min_{r} \sum_{k=1}^{K} d(a^{k}, r). \qquad (1)$$

Problem (1) is known as the group-decision-making problem [15][16][17][18][19][20][21]. The group-decision-making problem has been widely studied and has many applications, such as voting 18, jury decisions 19, consumer opinion aggregation 20, and project selection 21. In general terms, the group-decision problem is defined as follows: a group of K entities or individuals (referees) collectively evaluate n objects. In our context, the evaluations are cardinal evaluations that each MSU (referee) assigns to each object (location-time pair) it evaluates. The problem then is to aggregate the referees' evaluations into a consensus evaluation of each and every object. Note that the referee evaluations as well as the consensus evaluation are allowed to contain ties.

Scientific RepoRts | 6:24382 | DOI: 10.1038/srep24382
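As a minimal illustration of problem (1), the sketch below assumes complete time series and takes the L1 norm as d(). Under that choice the objective separates by time point, so the per-time-point median is a minimizer. The readings and function names are hypothetical, and this is the simple baseline aggregation, not the NPCK-based method developed later.

```python
from statistics import median

def l1_distance(a, r):
    """L1 distance between two complete time series of equal length."""
    return sum(abs(x - y) for x, y in zip(a, r))

def l1_consensus(series):
    """Consensus minimizing the summed L1 distance: the median at each time point."""
    return [median(values) for values in zip(*series)]

# Hypothetical readings from K = 3 collocated MSUs at n = 4 time points.
msu_series = [
    [10.0, 12.0, 15.0, 11.0],
    [13.0, 15.0, 18.0, 14.0],  # same shape as the first, offset by +3
    [9.0, 11.0, 20.0, 10.0],   # one outlying reading at the third time point
]

r = l1_consensus(msu_series)
total = sum(l1_distance(a, r) for a in msu_series)
print(r, total)
```

Note that this baseline works on absolute readings, so an uncalibrated offset in one MSU directly shifts the consensus at every time point where that MSU supplies the median.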
For an environmental field campaign carried out over a time window, T, an MSU's time series, $a^k$, is considered complete if the MSU gives a valid measurement for all time periods t ∈ T; otherwise we say that the MSU provides an incomplete time series. Incompleteness may arise when MSUs become faulty or switch locations; the latter scenario was described by Mead et al. 5, Moltchanov et al. 7 and Lerner et al. 8. Regardless of the incompleteness of the MSUs' time series, we require that the consensus time series be complete. All previous group-decision work has considered the specific case of complete evaluations [15][16][17]. Here we introduce and solve the incomplete group-decision-making problem. To this end we (i) introduce a set of natural axioms that must be satisfied by a distance, d(), between incomplete ratings (time series); (ii) prove the existence and uniqueness of a distance, herein called the normalized projected Cook-Kress distance, $d_{NPCK}$, which satisfies these axioms; and (iii) provide an efficient and practical method for finding the optimal rating (time series) r* for problem (1) when using $d_{NPCK}$ as the distance function. While we present the new axiomatic distance in the context of WDESNs, it can be used for data fusion in complete or incomplete group-decision-making applications in general and in distributed sensing applications in particular. In our WDESN context, and specifically when presenting the study below, we use the term time series when referring to a vector of measurements (provided by an MSU, or the consensus time series provided by the aggregation process); when presenting the methodology, we use the term rating to emphasize that the proposed methodology is general and applies to aggregating any set of ratings (i.e., vectors consisting of cardinal evaluations).
Air Quality Wireless Distributed Sensor Networks. Air-Pollution (AP) is known to increase risks for a wide range of diseases, such as respiratory and heart diseases. Recent data indicate that in 2010, 223,000 deaths from lung cancer worldwide resulted from air pollution 22. This number is expected to grow, as studies indicate that in recent years exposure levels have increased worldwide, with a significant rise in rapidly industrialising countries with large populations 23. Studying AP and its impact on health requires accurate exposure assessments. AP-related exposure metrics, typically used in environmental epidemiology studies, are based either on short-term sampling 24 or on pollutant measurements by regulatory standard Air Quality Monitoring (AQM) stations over extended time periods 25. AQM stations provide accurate measurements but suffer from limited deployment due to their bulkiness, high costs, and frequent maintenance and calibration requirements. The limited deployment hampers the AQM network's ability to adequately capture the spatial distribution of air pollutant concentrations, because these concentrations are highly variable. In contrast, intensive sampling campaigns use a large number of AP sensors, deployed at high densities, but are limited to relatively short time periods 22. Consequently, accurate exposure assessment and the study of AP-health associations are still challenging tasks 26.
Since AP-MSUs cost significantly less than AQM stations, MSUs can be spread more densely and thus provide data with higher spatial resolution. However, MSUs are error-prone, may become faulty, have limited coherence over time and are inaccurate when compared to AQM stations [5][6][7][8]. Early studies that evaluated MSUs' capabilities in a controlled lab environment 27,28 stressed the need for a calibration process in order to sustain reliable measurements. Field deployments of such MSUs, measuring ambient O3 levels by metal-oxide sensors 6, and measuring CO, NO and NO2 by electrochemical 5 or metal-oxide 29 probes, have shown that calibration processes applicable to controlled lab environments do not work in the field when the calibrated data is compared to data collected at a collocated standard AQM station 6,7 (even after an initial field calibration has been applied 6). Thus, the field calibration process is a critical hurdle that one must overcome in order to make WDESNs a viable tool for AP exposure assessment. Having said that, the suggested method is applicable to many WDESN applications, even though the examples here focus on AP-WDESNs.

Methods
In this section we present the set of axioms that a distance metric between incomplete ratings must fulfil so that, when this distance is used within problem (1), the obtained consensus rating appropriately minimizes the disagreement of the judges (the MSUs' measurements in our context). Our aim is to obtain a distance that is appropriate for aggregating the measurements obtained by uncalibrated MSUs.
Each sensor presents two types of errors: normal measurement error and calibration error. The former is typically considered to be additive and normally distributed, with zero mean and constant standard deviation over time [30][31][32]. The latter is assumed to be independent of other sensors' errors and roughly stable throughout the measurement collection timeframe. The mean calibration error is assumed to be zero, though no assumption is made on the shape of the distribution. Finally, the calibration error is considered to be additive. In the case of multiplicative error, the algorithm can deal with this in two ways: (1) the algorithm can be applied as is and still obtain meaningful results, to the extent that the multiplicative error is significantly smaller than the readings themselves; (2) otherwise, one can take the logarithm of each measurement and apply the (unchanged) algorithm to this re-scaled data, because this re-scaling effectively transforms a multiplicative error into an additive error (since log(ab) = log(a) + log(b)).
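The logarithmic re-scaling in option (2) can be checked numerically. The sketch below uses hypothetical concentrations and a hypothetical gain of 1.3; it only illustrates the identity log(ab) = log(a) + log(b) and assumes strictly positive readings.

```python
import math

true_levels = [20.0, 35.0, 50.0, 42.0]  # hypothetical true concentrations (must be > 0)
gain = 1.3                               # hypothetical multiplicative calibration error

readings = [gain * x for x in true_levels]

log_readings = [math.log(v) for v in readings]
log_true = [math.log(x) for x in true_levels]

# In log space the error is the same additive offset, log(gain), at every time point.
offsets = [lr - lt for lr, lt in zip(log_readings, log_true)]
print(offsets, math.log(gain))
```

After aggregating in log space, the consensus would be mapped back with the exponential function to recover values in the original units.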
The essence of the proposed method is that, due to the calibration error, the difference between any pair of measurements taken by a given MSU is significantly more reliable than the absolute values of the measurements themselves. As such, these differences among the measurements of the same MSU are the focus of the following definitions, the proposed axioms, and the resulting distance. Specifically, the main aim of our method is to extract as much information as possible from the reliable measurement differences and then, with that information in hand, solve the bias problem as a second step (this will be most evident in the solution procedure described at the end of this section).
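A small numerical check of this observation, with hypothetical values: a constant additive calibration error changes every absolute reading but leaves all pairwise differences of the same MSU untouched.

```python
from itertools import combinations

def pairwise_differences(series):
    """Differences a_i - a_j for every pair i < j of one MSU's readings."""
    return [series[i] - series[j] for i, j in combinations(range(len(series)), 2)]

true_levels = [20.0, 35.0, 50.0, 42.0]  # hypothetical true concentrations
bias = 7.5                               # hypothetical additive calibration error
biased = [x + bias for x in true_levels]

# The bias cancels in every difference, so both lists are identical.
print(pairwise_differences(true_levels))
print(pairwise_differences(biased))
```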
Notation and Definitions. Let us consider two arbitrary incomplete ratings, a and b, in a universe V of n objects, evaluating the objects in $A \subseteq V$ and $B \subseteq V$, respectively. Hereafter, we represent a rating as a vector of the form $a = (a_1, a_2, \ldots, a_n)$, where $a_i$ is the score (a cardinal evaluation) of object i if object i is evaluated in a, and $a_i$ is undefined otherwise. We also assume that the possible scores are contained in some pre-specified interval $[\ell, u]$; this assumption is without loss of generality, as the MSUs have a limited measurement range $R \equiv u - \ell$. Given two arbitrary incomplete ratings a and b, the following concepts are defined.

Definition 1. Given a rating a and a subset S of the object universe V, the projection of a on S, denoted as $a|_S$, is the rating of the objects in S that preserves the scores assigned by a to the objects in S (similarly, the objects in S that were not evaluated by a remain un-evaluated in $a|_S$).
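One possible way to represent these objects in code is sketched below. Using `None` for an undefined score, 0-based object indices, and the helper names `project` and `evaluated` are choices made for this illustration only.

```python
def project(rating, subset):
    """Projection a|_S: keep the scores of objects in S, leave all others unevaluated."""
    return [score if i in subset else None for i, score in enumerate(rating)]

def evaluated(rating):
    """Indices of the objects that a rating actually scores."""
    return {i for i, score in enumerate(rating) if score is not None}

a = [3.0, None, 5.0, 4.0, None]  # hypothetical incomplete rating over n = 5 objects
S = {0, 1, 2}

print(project(a, S))             # object 1 is in S but stays unevaluated
print(evaluated(a))
print(evaluated(project(a, S)))
```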
The following three definitions are natural extensions, for incomplete ratings, of the corresponding definitions given for complete ratings by Cook & Kress 15:

Definition 2. Rating a is said to be adjacent to rating b if $|(a_i - a_j) - (b_i - b_j)| \in \{0, 1\}$ for all $i, j \in A \cap B$. That is, if for every pair of objects their score difference in rating a is either the same as in rating b or differs by exactly one unit.
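The adjacency condition just described can be sketched as follows. The sketch assumes that only objects evaluated by both ratings are compared, that undefined scores are stored as `None`, and that scores are integers so that "exactly one unit" can be tested exactly; the function name is hypothetical.

```python
from itertools import combinations

def is_adjacent(a, b):
    """True if every pairwise score difference in a equals that in b or differs by one unit."""
    common = [i for i in range(len(a)) if a[i] is not None and b[i] is not None]
    for i, j in combinations(common, 2):
        gap = abs((a[i] - a[j]) - (b[i] - b[j]))
        if gap not in (0, 1):
            return False
    return True

a = [3, 4, None, 6]  # hypothetical ratings over four objects
b = [3, 5, 2, 6]
c = [3, 6, 2, 6]

print(is_adjacent(a, b))  # differences over objects 0, 1, 3 differ by at most one unit
print(is_adjacent(a, c))  # the pair (0, 1) differs by two units
```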

Definition 3. Rating a is said to be adjacent of degree k to rating b if a is adjacent to b and $\left|\{\{i, j\} \subseteq A \cap B : |(a_i - a_j) - (b_i - b_j)| = 1\}\right| = k$. That is, if the number of object pairs in the set $A \cap B$ for which their score difference differs by one unit is k.

Axioms. The objective is to design a distance such that, when used within problem (1), the obtained consensus rating minimizes the disagreement of the judges (uncalibrated MSUs in our context). Following is a set of axioms that a distance metric between incomplete ratings must satisfy so that our objective is achieved. Remark: when designing these axioms, we have in mind that (i) given any MSU, the difference between any pair of its measurements is significantly more reliable than the absolute values of the measurements themselves; and (ii) an MSU providing a large number of measurements for a particular location is not necessarily more reliable or accurate than an MSU providing a comparatively smaller number of measurements.

Definition 5. Ratings a and b are opposite ratings on
, and equality holds if and only if

It is important to note that Axioms 2 to 4 for incomplete ratings are natural extensions of Cook & Kress' non-negativity, commutativity and triangular inequality axioms for complete ratings. Indeed, these two sets of axioms are identical when restricted to complete ratings. Similarly, Axioms 5 and 6 are a natural extension of Cook & Kress' proportionality axiom; the only minor difference is that Cook & Kress' axiom fixes the proportionality constant to '1', while our normalization Axiom 6 (as shown later) sets a different proportionality constant. This minor difference is critical in the context of aggregating incomplete ratings. Specifically, normalization guarantees that, when solving problem (1), all of the incomplete ratings are given the same importance regardless of the number of objects that each evaluated; this is critical since larger amounts of data do not necessarily mean higher accuracy.

Normalized
The Normalized Projected Cook-Kress (NPCK) distance is given by: The following sequence of results will allow us to prove that the d NPCK distance is the unique distance satisfying Axioms 1 to 6 simultaneously.
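To make the idea of a normalized, projected distance concrete, here is a hedged sketch. It assumes that the distance sums the absolute disagreement in pairwise score differences over ordered pairs of commonly evaluated objects, and that the normalizer is the value this sum takes for two "opposite" ratings (half the common objects at the upper bound and half at the lower bound, swapped between the ratings). Both the closed form and the convention of returning 0 when fewer than two objects are shared are assumptions of this sketch, so the source's own expression takes precedence.

```python
def npck_distance(a, b, lower, upper):
    """Sketch of a normalized projected distance between incomplete ratings (None = undefined)."""
    common = [i for i in range(len(a)) if a[i] is not None and b[i] is not None]
    m = len(common)
    if m < 2:
        return 0.0  # assumption: no shared pair of objects, so nothing to compare
    # Disagreement in pairwise differences, summed over ordered pairs (i, j).
    raw = sum(abs((a[i] - a[j]) - (b[i] - b[j])) for i in common for j in common)
    # Assumed normalizer: the value of `raw` for two opposite ratings on m objects.
    normalizer = 4 * (upper - lower) * (m // 2) * ((m + 1) // 2)
    return raw / normalizer

# Hypothetical ratings on the interval [0, 10].
opp_a = [10, 10, 0, 0]
opp_b = [0, 0, 10, 10]
a = [2, 4, None, 7]
b = [3, 4, 9, 5]

print(npck_distance(opp_a, opp_b, 0, 10))      # opposite ratings sit at the maximum
print(npck_distance(a, b, 0, 10))
print(npck_distance(a, [x + 1 if x is not None else None for x in a], 0, 10))
```

The last call shows the design intent: a rating and a copy of it shifted by a constant bias are at distance zero, because only pairwise differences enter the computation.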

Lemma 6. Given a set V of n objects and a rating interval
. Moreover, this maximum distance is attained by any two opposite ratings.
Proof Sketch. The lemma can be restated as follows: any pair of opposite (complete) ratings is a global maximizer of problem (4). It can be shown that, when a and b are assigned values so that they are opposite ratings, (i) one obtains a local maximum of the problem (all feasible directions are non-increasing), and (ii) the objective value of such an assignment equals the maximum distance stated in the lemma. Since the above optimization problem is convex, every local maximum is a global maximum and thus the result follows. ◻

It remains to show that any distance d satisfying Axioms 1 to 6 coincides with $d_{NPCK}$; that is, for any two ratings a and b, d(a, b) = $d_{NPCK}$(a, b). We divide our analysis into the following two cases.

Case 1 (complete ratings). For complete ratings, Axiom 1 is a tautology and, as argued above, Axioms 2 to 5 are identical to all of Cook & Kress' axioms except for the proportionality constant. Therefore, for complete ratings, Axioms 2 to 5 uniquely determine $d_{CK}$ except for a proportionality constant. Consequently, since d(a, b) satisfies Axioms 1 to 5, we conclude that, for complete ratings, d(a, b) equals $d_{CK}$(a, b) up to a proportionality constant, and the same holds for $d_{NPCK}$(a, b) (eqs (5) and (6)). In view of eqs (5) and (6), in order to conclude that d(a, b) = $d_{NPCK}$(a, b) for complete ratings, we only need to prove that the two proportionality constants coincide. This result follows since both d(a, b) and $d_{NPCK}$(a, b) attain their extreme values (zero and one) at many rating pairs. Specifically, given any two opposite ratings, say a′ and b′, Axiom 6 stipulates that d(a′, b′) = $d_{NPCK}$(a′, b′) = 1. Similarly, given any rating, say a′, eqs (5) and (6) give d(a′, a′) = $d_{NPCK}$(a′, a′) = 0.

Case 2 (incomplete ratings). Here $d(a, b) = d(a|_{A \cap B}, b|_{A \cap B}) = d_{NPCK}(a|_{A \cap B}, b|_{A \cap B}) = d_{NPCK}(a, b)$. The first and last equalities follow from Axiom 1, while the second equality follows from our analysis of case 1 and the fact that the projected ratings are complete over $A \cap B$.

The incomplete-rating aggregation problem (Eq. (1)) using the NPCK distance is a special case of the separation-deviation problem and can be reformulated as Problem (7). Problem (7) is a special case of the convex dual of the minimum cost network flow problem, and thus it can be solved efficiently 34, with a running time that depends on n, the number of objects (in our context, the number of time points at which measurements were taken), and ε, the desired accuracy.
Finally, recall that our aim when designing the distance function was to extract as much information as possible from the reliable measurement differences (in contrast to the unreliable absolute measurements). Indeed, in Problem (7) the bias of each MSU is completely ignored; specifically, Problem (7) can be interpreted as finding the vector r whose pairwise differences, z, are as close as possible to the given MSUs' pairwise measurement differences. This is precisely what we aimed for, because the MSUs are uncalibrated and thus their pairwise measurement differences are significantly more reliable than the absolute values of the measurements. Now, note that given any optimal solution to Problem (7), say r*, the vector r′ = r* + c, for any given scalar constant c, has exactly the same pairwise differences, z*, and thus is also an optimal solution to Problem (7). As such, the last step of our MSU aggregation method is to calibrate our aggregated/consensus "measurements", r*. In particular, we need to find the best calibration constant, c, to calibrate our consensus measurement vector r* (keeping all of its pairwise differences, z*, fixed). This is achieved by solving Problem (10). We note that Problem (10) is efficiently solvable by a simple binary search procedure over c. Indeed, it can be shown that the objective functions of Problems (7) and (10) can be combined into a single objective function by adding them after multiplying the objective function of Problem (7) by a large constant, so that it is lexicographically more important than that of Problem (10). Moreover, the resulting combined optimization problem would still be a special case of the separation-deviation problem and thus efficiently solvable.
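The calibration step can be sketched under one explicit assumption: that Problem (10) chooses c to minimize the summed absolute deviation between r* + c and every valid MSU reading. Under that assumed objective the optimal shift is simply the median of the residuals, which stands in here for the binary search mentioned above. The data and function names are hypothetical.

```python
from itertools import combinations
from statistics import median

def pairwise_differences(r):
    """Differences r_i - r_j for every pair i < j."""
    return [r[i] - r[j] for i, j in combinations(range(len(r)), 2)]

def calibrate(r_star, msu_series):
    """Shift r_star by the constant c that minimizes the summed absolute residual (assumed objective)."""
    residuals = [a_i - r_i
                 for series in msu_series
                 for a_i, r_i in zip(series, r_star)
                 if a_i is not None]  # skip missing readings
    c = median(residuals)
    return c, [r_i + c for r_i in r_star]

# Hypothetical consensus shape (correct differences, arbitrary level) and MSU readings.
r_star = [0.0, 2.0, 5.0, 1.0]
msu_series = [
    [10.0, 12.0, None, 11.0],
    [13.0, 15.0, 18.0, None],
    [9.0, None, 14.0, 10.0],
]

c, calibrated = calibrate(r_star, msu_series)
print(c, calibrated)
```

Because the shift is a single scalar, the calibrated vector keeps exactly the pairwise differences of r*, which is the property the two-step procedure relies on.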
To illustrate the acquired data, Fig. 1   the same value. The CI is calculated as the standard error multiplied by the critical two-tailed value of z for α = 0.05 35. Note that the consensus time series obtained when using the NPCK distance in problem (1) presents, in most cases, higher $R^2$ and lower CI values than those obtained when using either the $L_1$ or the $L_2$ distance. Specifically, the NPCK has shown higher $R^2$ and lower CI for both NO2 campaigns and for three out of the four O3 campaigns. In addition, when the NPCK does not give the best results, it is close behind, with almost the same score. Therefore, we conclude that the consensus time series obtained when using $d_{NPCK}$ is the best fit for estimating the real AQM time series based on the consensus of all MSUs.
To illustrate the notions above visually, Fig. 2 plots three consensus time series against the AQM time series obtained for the first two campaigns. Each point in the graphs corresponds to a specific time; its x-coordinate is the "measurement" of the consensus time series at that time and its y-coordinate is the measurement taken by the AQM at that time. For the Igud campaign, comparing Fig. 2a,b with Fig. 2c, it is evident that the linear relation between the AQM measurements and the consensus time series is stronger for the NPCK, as the spread of the measurements around the fitted line is smaller; the same result holds for the Tel-Hai campaign (as evident when comparing Fig. 2d,e with Fig. 2f). These plots support the quantitative analysis above.
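The two evaluation quantities can be sketched as follows. The coefficient of determination is computed as the squared Pearson correlation, which is what a simple linear fit with an intercept yields; the standard error is taken as an input because the text does not say which standard error is meant. Function names and sample values are hypothetical.

```python
from statistics import NormalDist

def r_squared(x, y):
    """R^2 of a simple linear fit of y on x (equal to the squared Pearson correlation)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    return sxy * sxy / (sxx * syy)

def ci_half_width(standard_error, alpha=0.05):
    """Standard error times the two-tailed critical z value for the given alpha."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return standard_error * z

consensus = [1.0, 2.0, 3.0, 4.0]  # hypothetical consensus "measurements"
aqm = [3.0, 5.0, 7.0, 9.0]        # hypothetical AQM readings, exactly linear in the consensus

print(r_squared(consensus, aqm))
print(ci_half_width(2.0))
```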
Robustness Analysis. The robustness of the suggested scheme is presented next. For this purpose, two time series (#135 and #136), acquired in the second campaign in conjunction with the data of Fig. 1d,e, were added to the aggregation process. These two time series were acquired using EC ozone MSUs (AQMesh of GeoTech, UK). While EC MSUs have previously been used for ozone measurements, this technology suffers heavily from interferences 5,6 and thus produces measurements that are less accurate than those obtained by using metal-oxide ozone MSUs (the type of MSUs used to obtain the data in Fig. 1d,e). Figure 3 presents the Tel-Hai AQM station's complete time series alongside the incomplete time series acquired from the GT135 and GT136 MSUs. Table 3 depicts the correlation coefficient and the Mean Squared Error (MSE) between the AQM measurements and all sensors that took part in this campaign (see Table 1), including the two added time series (Fig. 3). Note that the last two MSUs added to the process, GT135 and GT136, have a significantly lower correlation and higher MSE than the rest of the MSUs. Figure 4 is analogous to Fig. 2 and plots, against Tel-Hai's AQM, the consensus "measurements" aggregating both metal-oxide and electrochemical MSUs when using the $L_1$, $L_2$ and $d_{NPCK}$ metrics within problem (1). Figure 4 also presents the coefficients of determination, $R^2$, between the three consensuses and the AQM measurements. As with the results obtained when using only metal-oxide MSUs, the correlation coefficient of the consensus measurements obtained with $d_{NPCK}$ is by far the largest. Therefore, we again conclude that the consensus time series obtained when using $d_{NPCK}$ is the best fit for estimating the real AQM time series based on the consensus of all MSUs.

Discussion
This paper introduces a scheme for the aggregation of incomplete ratings into a group consensus decision making. The core of the method is the herein-developed axiomatic Normalized Projected Cook-Kress (NPCK) distance. The NPCK distance is derived from a set of axioms any distance between incomplete ratings should fulfil so the consensus rating aggregates the given ratings. The consensus rating is the rating that minimizes the sum of all distances from the different ratings. The NPCK approach is an extension of Cook and Kress complete rating aggregation problem, making it suitable to many new applications. An efficient algorithm for finding the consensus rating is also provided.
Wireless Distributed Environmental Sensory Networks (WDESN) have become technically and economically feasible. However, WDESNs may consist of many sensors and thus, the calibration process is a major obstacle. The suggested NPCK distance presents a new, robust and efficient method for aggregating measurements acquired by an uncalibrated, inexpensive and error-prone WDESN, and producing accurate estimates of the observed environmental variable's true levels. Given a set of collocated Micro Sensing Units (MSUs), the NPCK incomplete ratings scheme is applied, where each measurement (defined by time and location) is considered as a referee evaluation. These time series can be incomplete as sensors might become faulty or shift locations. Based on a set of collocated measurements (in time and in space) a consensus measurement is derived using the NPCK scheme.
The methods have been applied to a wide set of pollutant measurements (i.e., ozone, nitrogen oxide, nitrogen dioxide and carbon monoxide) acquired by all available MSU technologies (metal oxide and electrochemical). When compared to a standard regulatory Air Quality Monitoring (AQM) station, the suggested methodology has shown markedly more accurate results than the common practice and the state-of-the-art, without requiring the Micro Sensing Units (MSUs) that constitute the WDESN to be calibrated, thus rendering the network self-calibrated. To achieve this, some assumptions on the error behaviour are made (i.e., additive, zero-mean error). While these assumptions are commonly accepted, we have also presented a simple logarithmic data re-scaling technique that enables the method to handle multiplicative errors, generalising the suggested scheme even further.