Modelling the expected probability of correct assignment under uncertainty

When making important decisions such as choosing health insurance or a school, people are often uncertain which attribute levels best suit their true preferences. After choosing, they may realize that their uncertainty resulted in a mismatch: they chose a sub-optimal alternative while another available alternative would have matched their needs better. We study here the overall impact, from a central planner's perspective, of decisions under such uncertainty. We use Voronoi tessellations to locate all individuals and alternatives in an attribute space. We provide an expression for the probability of a correct match, and calculate, analytically and numerically, the average percentage of matches. We examine how the result depends on the level of uncertainty and on location. We find that the overall mismatch is considerable even for low uncertainty—a possible concern for policy makers. We further explore a commonly used practice—allocating service representatives to assist individuals' decisions. We show that within a given budget and uncertainty level, the effective allocation targets individuals who are close to the boundary between several Voronoi cells, but are not right on the boundary.


Proposition 1. The mean probability of correct assignment is

\[ P_\rho = \frac{1}{\operatorname{vol} A} \sum_{j=1}^{J} \int_{D_j} \frac{\operatorname{vol}\big(B(x,\rho)\cap D_j\big)}{\operatorname{vol} B(x,\rho)}\,dx, \qquad (1) \]

where B(x, ρ) is the ball around x of radius ρ (truncated to A when x is close to the boundary of A). If ρ > ρ_max, then P_ρ saturates at

\[ P_\rho = \sum_{j=1}^{J} \frac{(\operatorname{vol} D_j)^2}{(\operatorname{vol} A)^2}. \]

Proof. To see (1), recall that given that x ∈ D_j, the probability P_ρ(x) that we select D_j as the basin of attraction is the relative volume of the intersection of the ball of radius ρ with the cell D_j:

\[ P_\rho(x) = \frac{\operatorname{vol}\big(B(x,\rho)\cap D_j\big)}{\operatorname{vol} B(x,\rho)}. \]

We want to compute the expected value of P_ρ(x) (we average over x):

\[ P_\rho = \frac{1}{\operatorname{vol} A} \sum_{j=1}^{J} \int_{D_j} P_\rho(x)\,dx. \]

To see the saturation value, take ρ to be larger than the diameter of the space of attributes A, so that for each point x the (truncated) ball B(x, ρ) coincides with all of A. Then for each x ∈ D_j, the intersection B(x, ρ) ∩ D_j = D_j, and we can compute P_ρ simply as a conditional expectation, by writing

\[ P_\rho(x) = \frac{\operatorname{vol} D_j}{\operatorname{vol} A} \]

and then

\[ P_\rho = \sum_{j=1}^{J} \frac{\operatorname{vol} D_j}{\operatorname{vol} A}\cdot\frac{\operatorname{vol} D_j}{\operatorname{vol} A} = \sum_{j=1}^{J} \left(\frac{\operatorname{vol} D_j}{\operatorname{vol} A}\right)^2, \]

as claimed. ∎
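As a sanity check, both equation (1) and the saturation value can be estimated by Monte Carlo simulation. The sketch below is illustrative (the cell boundaries are hypothetical, not taken from the paper): it works on a one-dimensional attribute space A = [0, L], draws the true location uniformly, and draws the perceived location uniformly from the ball B(x, ρ) truncated to A.

```python
import random

def p_rho_mc(boundaries, rho, n=200_000, seed=0):
    """Monte Carlo estimate of P_rho on A = [0, L], L = boundaries[-1].

    boundaries = [0, a_2, ..., L] defines the cells D_j = [a_j, a_(j+1)].
    The perceived location is uniform on B(x, rho) truncated to A.
    """
    rng = random.Random(seed)
    L = boundaries[-1]

    def cell(z):
        # index of the Voronoi cell containing z
        return next(j for j in range(len(boundaries) - 1)
                    if z <= boundaries[j + 1])

    hits = 0
    for _ in range(n):
        x = rng.uniform(0.0, L)                    # true location
        lo, hi = max(0.0, x - rho), min(L, x + rho)
        y = rng.uniform(lo, hi)                    # perceived location
        hits += cell(y) == cell(x)
    return hits / n
```

For boundaries [0, 0.3, 0.7, 1.0] and ρ larger than the diameter of A, the estimate approaches the saturation value 0.3² + 0.4² + 0.3² = 0.34, while for small ρ it stays close to 1.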
1.1. The one dimensional case. Equation (1) makes sense in any dimension, but it is only in dimension K = 1, when the space of attributes is an interval A = [0, L], that we know how to extract an exact expression from it for small ρ.
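Equation (1) can also be evaluated by direct numerical quadrature in one dimension, which gives an independent check of the closed-form small-ρ expression derived in this subsection. The sketch below is illustrative (plain midpoint rule, hypothetical boundaries):

```python
def p_rho_quad(boundaries, rho, n=20_000):
    """Evaluate equation (1) on A = [0, L] by the midpoint rule.

    boundaries = [0, a_2, ..., L]; the ball B(x, rho) is truncated to [0, L].
    """
    L = boundaries[-1]
    total = 0.0
    for k in range(n):
        x = (k + 0.5) * L / n
        lo, hi = max(0.0, x - rho), min(L, x + rho)
        # the cell D_j = [a, b] containing x
        j = next(i for i in range(len(boundaries) - 1)
                 if x <= boundaries[i + 1])
        a, b = boundaries[j], boundaries[j + 1]
        overlap = max(0.0, min(hi, b) - max(lo, a))
        total += overlap / (hi - lo)    # vol(B ∩ D_j) / vol(B)
    return total / n
```

With boundaries [0, 0.3, 0.7, 1.0] (J = 3, L = 1) and ρ = 0.05, this returns approximately 0.95, in agreement with the small-ρ expression P_ρ = 1 − (J − 1)ρ/(2L) obtained below.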
Write the cells as D_j = [a_j, a_{j+1}], j = 1, …, J, with a_1 = 0 < a_2 < ⋯ < a_{J+1} = L. For x in the interior region of a cell, at distance greater than ρ from both endpoints, we have B(x, ρ) ⊂ D_j so that D_j ∩ B(x, ρ) = B(x, ρ), and hence

\[ P_\rho(x) = 1. \]

To compute the contribution of the boundary components, note that there are two types, according to whether they coincide with the boundary of the interval, namely the endpoints a_j with j = 1, J + 1, or not (j = 2, …, J).

For components which do not intersect the boundary, namely the endpoints a_j with j = 2, …, J, for all x ∈ [a_j, a_j + ρ] ∪ [a_{j+1} − ρ, a_{j+1}] the "ball" B(x, ρ) = [x − ρ, x + ρ] has length 2ρ, but only part of it lies in D_j; for instance, for x ∈ [a_j, a_j + ρ],

\[ P_\rho(x) = \frac{x + \rho - a_j}{2\rho}. \]

Altogether, we obtain on each side of each interior endpoint

\[ \int_{a_j}^{a_j+\rho} \frac{x + \rho - a_j}{2\rho}\,dx = \frac{3\rho}{4}, \]

that is, a deficit of ρ/4 relative to the interior value ρ.

For components which do intersect the boundary, that is the endpoint 0 of D_1 = [0, a_2] or the endpoint L of D_J = [a_J, L], we have the truncated ball B(x, ρ) = [max(x − ρ, 0), min(x + ρ, L)] ⊂ D_j, so the quotient of lengths is 1, and we get a contribution of

\[ \int_0^{\rho} 1\,dx = \rho, \]

i.e. no deficit from the outer endpoints. Therefore, each of the J − 1 interior endpoints a_2, …, a_J carries a deficit of ρ/4 from each of the two adjoining cells, for a total deficit of (J − 1)ρ/2. Altogether we find

\[ P_\rho = \frac{1}{L}\left( L - (J-1)\frac{\rho}{2} \right) = 1 - \frac{(J-1)\,\rho}{2L}. \qquad (2) \]

1.2. Higher dimensions K ≥ 2. We now pass to the higher dimensional case K ≥ 2. Our goal in this section is to obtain an exact formula for the first variation of P_ρ, that is, for the slope at ρ = 0.
For each Voronoi cell D j , we denote by ∂ int D j the part of the boundary of D j which does not lie on the boundary of the box (the space of attributes) A.
Proposition 3. In dimension K ≥ 2, the mean probability for correct assignment P_ρ for ρ small is

\[ P_\rho = 1 - (1 - c_K)\,\frac{\rho}{\operatorname{vol} A}\,\sum_{j=1}^{J} \operatorname{vol}_{K-1}\big(\partial_{\mathrm{int}} D_j\big) + O(\rho^2), \]

where c_K = 1 − ω_{K−1}/((K + 1) ω_K) is the constant computed in the proof of Proposition 4, and ω_K denotes the volume of the unit ball in R^K.

Proof. We start by using equation (1), splitting the integral over each cell according to the type of the point x:

\[ P_\rho = \frac{1}{\operatorname{vol} A} \sum_{j=1}^{J} \int_{D_j} \frac{\operatorname{vol}\big(B(x,\rho)\cap D_j\big)}{\operatorname{vol} B(x,\rho)}\,dx. \qquad (3) \]

There are two types of points x ∈ D_j: type I, those x ∈ D_j for which the ball B(x, ρ) is entirely contained in D_j, and type II, the rest (Supplementary Figure 2). Note that if x is close to the boundary of A, dist(x, ∂A) < ρ, but far from the interior boundary of the cell, that is dist(x, ∂_int D_j) > ρ, then B(x, ρ) ⊆ D_j is entirely contained in the cell, even though it is only a truncated ball (Supplementary Figure 3). This means that these points are of type I. Thus the type II points are precisely

\[ \mathrm{II}_j = \{\, x \in D_j : \operatorname{dist}(x, \partial_{\mathrm{int}} D_j) < \rho \,\}. \]

For type I points, we have B(x, ρ) ∩ D_j = B(x, ρ), so that the quotient of volumes equals unity:

\[ \frac{\operatorname{vol}\big(B(x,\rho)\cap D_j\big)}{\operatorname{vol} B(x,\rho)} = 1. \]

Thus the type I points contribute

\[ \frac{1}{\operatorname{vol} A} \sum_{j=1}^{J} \Big( \operatorname{vol} D_j - \rho\,\operatorname{vol}_{K-1}\big(\partial_{\mathrm{int}} D_j\big) \Big) + O(\rho^2) = 1 - \frac{\rho}{\operatorname{vol} A} \sum_{j=1}^{J} \operatorname{vol}_{K-1}\big(\partial_{\mathrm{int}} D_j\big) + O(\rho^2). \qquad (4) \]

Supplementary Figure 2. Type I region, type II region (shaded) and the excised points near the boundary (shaded region with stripes).
The type II points are contained in a "strip" of "width" 2ρ around the interior boundary. We excise the contribution of the points which are also ρ-close to ∂A or to more than one interior face (Supplementary Figure 2). The volume of these points is bounded by O(ρ²), since they are at distance ≤ ρ from the intersection of two faces, or of a face with ∂A, which has codimension 2. Since vol(B(x, ρ) ∩ D_j)/vol B(x, ρ) ≤ 1 in any case, the total contribution of such points is O(ρ²), which is negligible. Thus we need only consider points x with dist(x, ∂_int D_j) < ρ for which, in addition, B(x, ρ) is an actual Euclidean ball, not a truncated one.
Proposition 4. For ρ sufficiently small, the contribution of the type II points is

\[ \frac{c_K\,\rho}{\operatorname{vol} A} \sum_{j=1}^{J} \operatorname{vol}_{K-1}\big(\partial_{\mathrm{int}} D_j\big) + O(\rho^2), \qquad (5) \]

where c_K = 1 − ω_{K−1}/((K + 1) ω_K).

Putting together equations (4) and (5) proves Proposition 3.

Proof. Fix an interior face of the cell D_j and choose coordinates so that, locally, the face lies in the hyperplane H = {x_K = 0}, with D_j on the side H⁺ = {x_K > 0} (Supplementary Figure 3). Then for every x ∈ H⁺, we have dist(x, H) = x_K and we assume that 0 ≤ x_K ≤ ρ. We need to compute

\[ \int_0^{\rho} \frac{\operatorname{vol}\big(B(x,\rho)\cap H^+\big)}{\operatorname{vol} B(x,\rho)}\,dx_K. \]

The intersection B(x, ρ) ∩ H⁺ is the ball minus the spherical cap C(x_K) cut off by H. Using polar coordinates in the hyperplane H ≅ R^{K−1}, the cap volume is

\[ \operatorname{vol} C(x_K) = \omega_{K-1} \int_{x_K}^{\rho} (\rho^2 - z^2)^{(K-1)/2}\,dz, \]

where ω_{K−1} is the volume of the unit ball in R^{K−1}. Integrating over x_K and exchanging the order of integration,

\[ \int_0^{\rho} \operatorname{vol} C(x_K)\,dx_K = \omega_{K-1} \int_0^{\rho} z\,(\rho^2 - z^2)^{(K-1)/2}\,dz = \frac{\omega_{K-1}\,\rho^{K+1}}{K+1}, \]

so that

\[ \int_0^{\rho} \operatorname{vol}\big(B(x,\rho)\cap H^+\big)\,dx_K = \omega_K\,\rho^{K+1} - \frac{\omega_{K-1}\,\rho^{K+1}}{K+1}. \]

Dividing by vol B(x, ρ) = ω_K ρ^K, we find that the integral equals

\[ \left( 1 - \frac{\omega_{K-1}}{(K+1)\,\omega_K} \right)\rho, \]

which equals c_K ρ.

We can now complete the proof of Proposition 4: until now, we have fixed the coordinates (x_1, …, x_{K−1}), where the particular face of the cell is {(x_1, …, x_{K−1}, 0)} ⊂ H ∩ ∂D_j; integrating over these coordinates, we obtain the (K − 1)-dimensional volume of that face up to an error of O(ρ²), and summing over all interior faces of the cell D_j and then over the various cells, we obtain

\[ c_K\,\rho \sum_{j=1}^{J} \operatorname{vol}_{K-1}\big(\partial_{\mathrm{int}} D_j\big) + O(\rho^2), \]

which, after dividing by vol A, is (5), as asserted by Proposition 4. ∎
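The constant c_K and the first-order behaviour can be checked numerically. The sketch below is illustrative (the two-cell configuration is a hypothetical example, not one from the paper): c_K is computed from the unit-ball volumes ω_K = π^{K/2}/Γ(K/2 + 1), and a Monte Carlo estimate of P_ρ is run on A = [0, 1]² split into two Voronoi cells by the line x₁ = 1/2, for which Σ_j vol_{K−1}(∂_int D_j) = 2 and the predicted slope is (1 − c₂)·2 = 4/(3π).

```python
import math
import random

def omega(k):
    """Volume of the unit ball in R^k."""
    return math.pi ** (k / 2) / math.gamma(k / 2 + 1)

def c(k):
    """The constant c_K = 1 - omega_(K-1) / ((K+1) * omega_K)."""
    return 1.0 - omega(k - 1) / ((k + 1) * omega(k))

def p_rho_mc_2d(rho, n=300_000, seed=1):
    """Monte Carlo estimate of P_rho on A = [0,1]^2 with two cells
    separated by the line x1 = 1/2; the perceived location is uniform
    on the ball B(x, rho) truncated to A (rejection sampling)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n):
        x1, x2 = rng.random(), rng.random()      # true location
        while True:
            dx = rng.uniform(-rho, rho)
            dy = rng.uniform(-rho, rho)
            if dx * dx + dy * dy > rho * rho:
                continue                         # outside the ball
            y1, y2 = x1 + dx, x2 + dy
            if 0.0 <= y1 <= 1.0 and 0.0 <= y2 <= 1.0:
                break                            # inside A: accept
        hits += (x1 < 0.5) == (y1 < 0.5)         # same cell chosen?
    return hits / n
```

As a consistency check, c(1) = 3/4 reproduces the 3ρ/4 contribution found in the one-dimensional computation, and for small ρ the Monte Carlo estimate tracks 1 − 4ρ/(3π).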

Distance Based Matching Metric
We explore the sensitivity of our results in Figure 2 of the main text to a matching measure based on distance rather than on a binary match/non-match classification. One could argue that a binary classification conveys no information about how much worse the chosen alternative is than the optimal one, and thus makes it harder to evaluate the overall dissatisfaction in the population. A metric that measures the average distance between the possible chosen alternatives and the true location may provide additional information on the individual's level of satisfaction with the chosen alternative.
To construct such a metric, consider an individual at location x within a single Voronoi cell D_j, and denote by p_j the location of alternative j. If there is no uncertainty in the perceived location of that individual (ρ = 0), the individual is assigned to alternative j, at a distance d(ρ = 0, x) = |x − p_j| from the alternative. If the individual mistakenly perceives their position as y, leading to the choice of another alternative i, then the distance between the true location and the chosen alternative is |x − p_i|. Assuming a uniformly distributed error ball of radius ρ around x, we obtain the average distance between the chosen alternative and the true position as:

\[ d(\rho, x) = \frac{1}{\operatorname{vol} B(x,\rho)} \int_{B(x,\rho)} \big| x - p_{i(y)} \big| \,dy, \]

where i(y) denotes the alternative chosen when the perceived location is y. Supplementary Figure 5a shows the effect of uncertainty on the average distance to the chosen alternative, d(ρ, x) − d(ρ = 0, x). By integrating over the attribute space we obtain the average distance between all of the individuals and their chosen alternatives. To measure the elasticity of the overall match with respect to the error ρ, we divide this average by the average distance obtained for ρ = 0:

\[ \bar d(\rho) = \frac{\int_A d(\rho, x)\,dx}{\int_A d(0, x)\,dx}. \]

The metric d̄(ρ) represents the average distance between the true location and the possible chosen alternatives within the error ball, relative to the no-error case. The larger its deviation from 1, the greater the average distance between the individuals and their chosen alternatives.
Supplementary Figure 5a visualizes the effect of uncertainty for the attribute space shown in Figures 1 and 2 of the main text. We plot d(ρ, x) − d(ρ = 0, x) at each point of the attribute space. Just as in the case of the binary metric, most of the effect of the uncertainty lies within a strip of radius ρ around the boundaries. However, unlike the binary metric, where the boundaries are the regions most sensitive to the occurrence of a mismatch, for the distance-based metric the boundaries are the regions least sensitive to a mismatch, since the distances to the alternatives on the two sides of the boundary are of similar magnitude.
Supplementary Figure 5b shows 1/d̄(ρ) vs. ρ for the same attribute space. The value of 1/d̄(ρ) decreases with ρ. Note that, unlike the linear decrease in the binary metric, this decrease for low values of ρ can be fitted by a parabola, d̄(ρ) − 1 ∝ ρ². The effect of the uncertainty is thus second order in ρ.
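This quadratic behaviour can be reproduced with a small one-dimensional sketch (illustrative: the alternative locations below are hypothetical, and the integrals are evaluated by a plain midpoint rule). Alternatives sit at given points, the Voronoi cells are bounded by the midpoints between them, and d̄(ρ) is the ratio of the averaged d(ρ, x) to the averaged d(0, x):

```python
def d_bar(points, rho, L=1.0, n=2_000, m=200):
    """1D distance-based metric: alternatives at `points` on [0, L],
    Voronoi cells bounded by midpoints; returns d_bar(rho)."""
    def chosen(y):
        # location of the alternative nearest to the perceived position y
        return min(points, key=lambda p: abs(p - y))

    num = den = 0.0
    for k in range(n):
        x = (k + 0.5) * L / n                      # true location
        lo, hi = max(0.0, x - rho), min(L, x + rho)
        # d(rho, x): average distance to the chosen alternative over
        # perceived positions y uniform in the truncated error ball
        d_rho = sum(abs(x - chosen(lo + (i + 0.5) * (hi - lo) / m))
                    for i in range(m)) / m
        num += d_rho
        den += abs(x - chosen(x))                  # d(0, x)
    return num / den
```

For two alternatives at 0.25 and 0.75, d̄(0) = 1, and halving ρ divides d̄(ρ) − 1 by roughly four, consistent with d̄(ρ) − 1 ∝ ρ².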
To compare the effect of the uncertainty in the binary and the distance-based cases, consider ρ = 0.15. The mismatch probability in the binary case is 20% (as shown in Figure 2 of the main text), whereas the value of d̄(ρ) is 1.03. That is, although on average 20% of the population is expected to choose an alternative which is not optimal, the average distance to the chosen alternatives is expected to increase by only 3%.