Land and building separation based on Shapley values

The total value apportionment between land and building components remains an international issue both in theory and in practice. There are several concepts and methods of value separation, each leading to approximate estimations and therefore to divergent opinions about their reliability. In this paper, we present an alternative method of value apportionment based on Shapley’s scheme of values, well recognized in the coalitional game theory. The practicality of this method is verified using observed prices of 14,715 residential properties sold during the year 2019 over all the 27 districts in Montreal (Canada). This unique data comes with detailed information about the essential attributes of the land and the building components. The empirical results of the method presented in this work are in line with practical expectations of total and separate values, either taken case-by-case or in aggregation per district. They are indeed encouraging when compared to the results of two other independent methods (i.e., the city evaluations and the OLS predictions) for the same properties. The results are interesting not only regarding the separation of value but also in several other related aspects. For instance, land values are often close to or even higher than the building values. This shows a phenomenon of building depreciation and land value appreciation. Some districts seem to favor the quality of the building, others being influenced by the location and quality of the land. Interestingly, in contrast to what is believed in practice, a good quality parcel of land does not necessarily have a good quality building according to the results.


Introduction
The separate estimation of the land and its building is more challenging in practice than global value estimation. In theory, there are diverging opinions on the separability or inseparability of value between the two components, raising concerns especially about the imputation of revenues to its agents of production and the impact of different systems of taxation. Classically, the inseparability thesis was assumed by Ely (1922), Russell (1945), Ratcliff (1950), Fisher (1958), and Dorau and Hinman (1969), and it is restated in some of the contemporary literature (Mills, 1998; Connellan, 2004; Fischel, 2000; Kitchen, 2003; Hendriks, 2005). The consensus among these authors is that apportionment is not practicable (and is useless) because land and improvements are merged together as in an "omelet". In contrast, several early (George, 1881; Smith, 1886; Marshall, 1922) and contemporary authors (Brueckner, 1986; Rice, 1982) defended separability. As these explanations are essentially theoretical, some responses are expected from those working in the field. Perhaps because of the same theoretical uncertainties, priority is usually given to global value estimation, which is often needed for taxation (Andelson, 2000).
Importance of value separation. The land and the building components (of a property) each have specific characteristics making them types of capital that are more different than alike and thus requiring separate values (Gaffney, 1994;Gihring, 1999). Separate values are relevant everywhere and concern all types of properties, especially those in residential urban areas for which the land market is scarce and classical methods are ineffective. Land and building values coexist and shape the dynamics of real estate markets in a specific way. In developed urban areas, land value progressively catches up to the value of the improvements, and sometimes goes much higher (Bourassa et al., 2011). Many practical situations require separate estimations, for example, city and capital gains taxation, tax incidences (Gloudemans, 2001), challenges by contributors in courts (Keligian, 1994), applicability of the cost method for the portion of improvements (Appraisal Institute, 2013), decisions about option value and amortization (Dye and McMillen, 2007), volatility of land and building values over time (Davis and Palumbo, 2008), management of insurances and mortgages (Guofang et al., 2003), sound land use policies and practices (Dancaescu, 2000), and preparation of land value maps (Ohno, 1985).
Contribution to the issue of price separation. The value scheme of Shapley (1953), combined with fuzzy logic, has been thoroughly examined in coalitional game theory models, yet it is almost absent from real estate analysis and valuation. The main contribution of this work is the application of the Shapley value concept in real estate valuation as an alternative method to separate the total value of a residential property between its land and building components, following a game theory framework. We also contribute in several other ways by extending the convenient Shapley method, notably by integrating the fuzzy logic set of values in the Gaussian function in a way that reflects the capacity of subsets of property attributes. These capacities are optimized under several logical constraints and an error minimization process.
The Shapley method offers an occasion to assess related, important issues of value separation, such as the optimal timing and location to rebuild or change the existing structure or usage (for instance, when land values are significantly higher than the depreciated building values). The method helps investors and planners assess, district by district, whether the quality of land and the quality of improvements go together; some districts may favor the quality of the building, while others are influenced by the quality of the location.
The use of the game theory basis, Shapley's global value and fuzzy logic contributes to reviving interest in the separation issue, especially in regions where the land market is scarce. Direct attention to the value separation difficulties and the subtleties within Shapley's special method makes it more palpable in practice, in contrast to an approach that takes them for granted or leaves them to speculation. The Shapley method tangibly promotes a coherent system of separate and total value estimations for professionals working in both case-by-case and mass evaluation contexts. The following describes the conceptual bases of this method, tested with a practical application in the case of the housing market in Montreal (Canada).

Value separation approaches
Each of the three possible value apportionment theories in Table 1 supports various methods in practice. In fractional apportionment theory (FAT), the deduction of the cost portion from the total price allows one to derive an estimation of the land value. The allocation and abstraction techniques are preferred if there is a shortage of land sales in built-up areas (they establish a typical land-to-building value ratio). The extraction technique estimates land value by deducting the estimated depreciated cost of the building from the total observed price. In rent apportionment theory (RAT), which supports the income method, subdivision, land residual and ground rent techniques are also traditionally used in practice (appropriate for stable income generating properties). The most reliable technique, supported by price apportionment theory (PAT), is the direct comparison when data on land sales is available.
Comparing available approaches. The FAT techniques are approximate and essentially subjective as they proceed ad hoc, almost by rule of thumb and some simple assumptions (Mills, 1998). Techniques in RAT are also approximate, except for the land residual in the income capitalization approach (ICA), which is quite consistent in the case of income generating properties. Originally defined by Ricardo (1817) in an agricultural production context, the land residual bases were later taught with the classical urbanization models, notably by Alonso (1964), Mills (1967), and Muth (1969). The related contemporary option value concept is about the dialectics between the rent of a land parcel (empty or improved) and the type/amount of capital to be optimally realized on top of it (Quigg, 1993; Yavas and Sirmans, 2005; Clapp and Salavei, 2010). The OLS approach, which refers essentially to the hedonic modeling tradition, proposes an interesting solution. However, it is limited by the independence hypothesis between the attributes, the accuracy of their isolated contributions and the thinness of the constant term. Elsewhere, the geometric approach applies to mass appraisals and its performance increases with the size of the database (Özdilek, 2016). In comparison to existing approaches using the basis of the utility maximization theory, the "Shapley value" method rests on the foundations of the coalitional apportionment theory (CAT), as suggested in Table 1. Besides its application in real estate in general, the development of this theory for valuation is particularly appealing. Although we focus on the practicality of the Shapley method, basic conceptual insights are provided in the next section.
The empirical verification of this method in this work involves only the sales comparison approach (SCA), but we believe it has the potential to conveniently address the separation of income and cost attributes with the ICA and the cost summation approach (CSA), respectively (for a description of these approaches, see Appraisal Institute, 2013).
Recent statistical approaches typically fall within either the value prediction or explanation schemes. The predictive methods aim at a reliable estimation of the total value, while the explicative methods have an additional focus on discerning the "marginal contribution" of attributes. The explicative methods usually assume independence among attributes, which is far from being the case in property value formation. When interaction exists, there are various solutions such as finding an appropriate functional form of the attributes, using reliable data, removing or combining specific attributes, or operating a spatio-temporal or structural segmentation of the market (Fox, 1991; Wickens, 1995).
Along with the issue of interactions, research has shown there are several other questions related, for instance, to the inefficiency of the housing market (Anenberg, 2016), the information disparities among the negotiating subjective agents (Genesove and Mayer, 2001), spatio-temporal autocorrelation of prices (Basu and Thibodeau, 1998), the optimal weighting of comparables, and the number of attributes (Vandell, 1991). In seeking reliable estimations, authors usually resort to more robust statistical models like neural networks (Riley et al., 2001), case-based reasoning (Bonissone and Cheetham, 1997), data mining (Soibelman and Gonzalez, 2002), spatial autoregressive models (Wyatt, 1997) or some hybrid approaches (McCluskey and Anand, 1999).
These modeling approaches with advanced statistics provide alternative solutions to a particular difficulty; however, they are generally affected by the interaction among attributes, the complexity of techniques, the generalization of subtle valuation details, and the derivation of artificial results (Kummerow and Galfalvy, 2002; Epley, 1997). Like these models, the Shapley method can also suffer from the imperfections of the market and the quality of information. However, it is flexible enough to take these issues into account, in particular the interactions among the attributes (Su et al., 2019). It also benefits from the capacity of fuzzy measures with a Gaussian function, under several logical constraints and optimization, tested with detailed information from the market.
Shapley in game theory and real estate. The issue of value separation is not only relevant in real estate, but also elsewhere, especially in fields related to value creation, portfolio optimization, compensation, or cost allocation (Moriarity, 1975;Goetzmann and Rouwenhorst, 2005;Greyserman et al., 2006). In these areas, there are promising value separation analyses with reference to the basis of game theory (Neumann and Morgenstern, 1944;Nash, 1951). In the setting of a cooperative game, a fair allocation of value to entities jointly involved in its creation is examined. The Shapley value is a solution in this framework of game theory formally presented by Shapley (1953). Mainly welcomed in economic, financial, and socio-political fields, the Shapley method estimates separate values comparable to a payoff (or bargaining power) for an agent involved in the cooperative game of "value creation". It is well adapted to the problems of resource sharing/partition with separate portions of value creating attributes, sometimes designated as Shapley value, influencing factor or power index (Young, 1988). This scheme of value developed by Shapley thus offers a convenient and practical basis for exploration in property price separation.
The application of the Shapley method is not direct and may involve several preparation steps in the context of real estate analysis and valuation. The method in fact requires the importance of attributes, considered individually and in combination, to be delineated. Originally developed by Zadeh (1965), and generally applied in control systems and decision making, fuzzy set theory offers a flexible way to take into account the importance, logical constraints, and vagueness of interactions between the attributes (Zimmermann, 2001). The integration of fuzzy measures over integrals in the Shapley method is extended by Sugeno (1974) and others (Atanassov, 1986; Grabisch, 1996) in dealing with specific interactions among attributes using nonadditive measures (also known as fuzzy integrals or capacity functions). The fuzzy measures of relations can refer to experience (expert opinion) or any other particular method, such as optimization and behavior learning in databases (Grabisch, 1997).
The specialists in real estate analysis in general and to a much lesser degree in valuation (total and separate values) are, for the most part, unaware of the Shapley method. There have however been several attempts to use the fuzzy logic process in real estate management and investment risks (Sun et al., 2003;Barranco et al., 2004). Dilmore (1993) presented an interesting paper at a conference about the relevance of fuzzy logic in real estate as agents inaccurately handle and acknowledge utility levels from attributes. Bagnoli and Smith (1998) also restated similar observations about the relevance of fuzzy logic in price formation dynamics. The Shapley method presented in this work thus proposes a new perspective in real estate evaluation, involving at least two important challenges described in the following sections: conceptualizing attributes as players participating in a game and empirically verifying its consistency in practice.

Shapley separate values
The following is a conceptual description of the Shapley scheme of value in price comparison method; similar steps of separation can also be followed in cost and income approaches, respectively, using the data on building cost and stabilized future incomes. We first present Shapley's original method, integrating the fuzzy set of measures. Building these fuzzy measures varies depending on the context and the type of subject, but a convenient process is introduced and explained in the property valuation context. Following the original steps of the Shapley method allows us to globally separate the values of the land and the building.
Considering value attributes as players. In the application of the Shapley method, we primarily need to explain why the property parameters can be seen as players in a game of negotiation. We believe that the basis of game theory welcomes such an assumption. There are ample examples, as mentioned above, with other types of capital and contexts in value creation (e.g., portfolio evaluation or cost allocation). It might seem strange to assume, for instance, that the "living area" attribute is an agent acting in the negotiation game of value creation (or price formation). The agent is not the attribute itself, but the decision makers, such as the buyers and sellers, who feel and simulate the exercise of negotiation themselves as representatives in the market.
Having a living area larger or smaller than, for instance, 100 square meters expresses the power of that attribute in the same market of different size properties. We can regard this variation within the same attribute as its power of "vertical influence" on the realized prices (with changing living areas of buildings). When introducing the "bargaining power" of another attribute in the game, like the "presence of a garage" in the property, the economic agents will do the same exercise of comparing the different sizes, while at the same time being under the "horizontal influence" of comparing different types of attributes on the price. Should they include the garage in their basket of utilities rather than other types of possible attributes? The vertical and horizontal types of power relationship modulate the hedonic decisions to hold specific attributes and drop others in the value creation game. Considering the property attributes as players makes even more sense in light of the "invisible hand" of the market, which objectively retains value-determining attributes and clears out those that are individually subjective during the negotiation.
Moreover, depending on the type of property, there are plenty of attributes no longer attractive in the game of negotiation, like the old-style external brick facing which is being progressively replaced by vitrified facades (usually in the case of commercial properties) or the outmoded laundry chutes in old buildings which were popular in the past. Some other parameters might reappear or be newly introduced in the market like sustainable building characteristics in comparison to those in conventional ones.
On a special note, familiarity with building and land parameters (and their potential values) leads to an informational advantage for economic agents during the game of negotiation. The market is less aware of the amount and the type of impacts (positive or negative) of certain attributes, for instance, electric wind turbines newly built in the district, than of the size of the living area. In fact, familiarity with price differential comparisons better reflects the economic impacts (the internalization of benefits) of living area, as this attribute is usually negotiated in previous sales that inform and re-inform new buyers and sellers. However, there is less information on price differentials regarding wind turbines, newly introduced on the market (with the disadvantage of having few comparables, if any).

Global separate values
The utility space $U$ of properties is a subspace of the standard space $\mathbb{R}^N$, for a finite or infinite $N > 0$. As a subspace, $U$ has a number of observable utility attributes $u_i$. We suppose that $i$ belongs to a finite set $\{1, \ldots, M\}$, and that the $u_i$ are characteristics of properties like their size, number of rooms, the presence of a swimming pool, the proximity to urban parks, etc. The utility space $U$ has to be appropriate, with a sufficient number $M$ of observable functions $u_i$, allowing one to distinguish prices $P_i$, $P_j$ for different properties $i$, $j$. Property price $P$ is a nonnegative function $P: U \to \mathbb{R}_+$ and it is a Euclidean measure on utility space $U$, so that $(U, P)$ is a measure space in an appropriate sense. $P(A)$ expresses the total price of properties, identified with all points of the set $A \subset U$.
Considering a finite set of attributes, let the coordinates $u_i$ of the utility space $U$ be split into two groups: attributes related to the building (we will call them $x_i$) and attributes related to the land (respectively $y_j$), and let $\mu$ be a capacity function which corresponds to a degree of importance for each of the building and land attributes. It follows that $\mu(x_i) = u_{x_i}$ and $\mu(y_j) = u_{y_j}$, so $\mu(x_i, y_j) = u_{ij}$. If the $\mu$ function is additive, then $u_{ij} = u_{x_i} + u_{y_j}$ (this is the case of a linearly weighted aggregate function).
Let also $L_y$ be the sum of all the normalized utility parameters of land (with a weight of importance $\omega_1$) and $B_x$ the sum of all the normalized utility parameters of the building (with a weight of importance $\omega_2$). The total price $P$ of a property will be the sum of all the weighted utilities belonging to its building and land components, with respectively $X = \{x_1, x_2, \ldots, x_m\}$ and $Y = \{y_1, y_2, \ldots, y_n\}$:
$$p(X, Y) = \sum_{i=1}^{m} \omega_{x_i} x_i + \sum_{j=1}^{n} \omega_{y_j} y_j$$
where $\omega_{x_i}$ is the importance weight of $x_i$ and $\omega_{y_j}$ the importance weight of $y_j$. This specification expresses the estimation of the total price based on the weighted linearity in the respective importance of attributes. In aggregate and for more clarity, we can approximate $p(X, Y)$ as
$$p(X, Y) \approx \omega_1 L_y + \omega_2 B_x$$
where $\omega_1$ is the average weight of the land, and $\omega_2$ the average weight of the building. This additive and weighted form of land and building utility aggregation provides an estimation of the total price. The reality of price formation is more complex, mainly because of possible interactions among land and building attributes. The economic agents express their multi-criteria utility decisions based on the availability of some attributes taken individually, but also in combination. Moreover, adding an attribute of a property to the model does not necessarily change the price. Taken in isolation, it might provide a nil or weak utility, but it might turn out to be very significant when combined with another attribute. For example, the attribute "altitude" of a piece of land that is far from a river (another spatial attribute) would probably not affect the price, but it will significantly change it when the land is closer to the river (mostly for a scenic view).
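The additive aggregation above, and its blindness to the altitude and river interaction, can be illustrated with a minimal sketch. The attribute names, normalized values and weights below are hypothetical and chosen only for illustration; they are not taken from the Montreal sample.

```python
def additive_price(building, land, w_building, w_land):
    """Weighted additive aggregate: sum_i w_xi * x_i + sum_j w_yj * y_j."""
    b = sum(w_building[name] * value for name, value in building.items())
    l = sum(w_land[name] * value for name, value in land.items())
    return b + l

# Hypothetical importance weights (assumptions, not estimated values).
w_building = {"living_area": 0.5, "garage": 0.1}
w_land = {"altitude": 0.1, "river_proximity": 0.3}

# Hypothetical normalized attribute values in [0, 1].
building = {"living_area": 0.6, "garage": 1.0}

def altitude_premium(river_proximity):
    """Price change when altitude goes from 0.2 to 0.8, river proximity fixed."""
    high = additive_price(building, {"altitude": 0.8, "river_proximity": river_proximity},
                          w_building, w_land)
    low = additive_price(building, {"altitude": 0.2, "river_proximity": river_proximity},
                         w_building, w_land)
    return high - low

premium_near_river = altitude_premium(0.9)
premium_far_from_river = altitude_premium(0.1)
print(premium_near_river, premium_far_from_river)
```

Under the additive form the altitude premium is the same near and far from the river, which is exactly the limitation that motivates a non-additive capacity.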
In general, the price explanation and prediction models do not consider interactions caused by the combination of attributes. Considered as a problem to solve, interactions are either kept constant or an attempt is made to exclude them from the models. However, attributes are naturally in interaction and coalition in forming the total price (or the worth of the "game") for properties. Before the application of the Shapley method, the analyst needs to specify a set of attributes with their values, such as the living area, together with the relevant weights and constraints, if there are any, based on experience or on any other ideally objective means. The normalized and weighted values of property attributes in a way reflect their capacity in determining a property's total price. These capacities of attributes, considered individually and in combination, are non-additive fuzzy measures, i.e., with probabilities varying between 0 and 1. These measures can be assumed to increase randomly or by following a given type of function.
There is in fact no standard method for building fuzzy indicators, which depend on the context and the type of phenomenon in question. The initial fuzzy measures (with abstract relations) become "defuzzified" providing the analyst with more flexibility to specify some conditions, with others being less obvious, but logically and generally following the predictable behavior of attributes and their combinations. Some parts, such as the importance of certain attributes, may benefit from expert opinion or initial results based on the use of hedonic models. The minimization of the squared error criterion can be used in identifying more probable fuzzy measures (Torra, 1999).
Shapley's basic method. In specifying Shapley's method, consider $E = \{x_1, x_2, x_3, \ldots, x_N\}$ and let $\mathcal{P}(E)$ be the set of subsets of $E$. This is represented by $\mathcal{P}(E) = \{X\}$, with $X \subset E$ and $\mathrm{Card}[\mathcal{P}(E)] = 2^N$. The normalized and weighted original values of attributes $X \in \mathcal{P}(E)$ are obtained from the sample. Let $\mu$ be a partition capacity function and a monotone non-additive fuzzy measure $\mu(X)$. Initially, we assume that $\mu(\emptyset) = 0$ and $\mu(E) = 1$, where $E$ is the global set of $\mathcal{P}(E)$. The non-additive capacities between these bounds are generated, in the present research, by a Gaussian function in conformity to the capacity criteria such that $\forall X, T \in \mathcal{P}(E)$, $X \subset T \Rightarrow \mu(X) \le \mu(T)$. The value of Shapley (1953) is a concept of solution verifying certain properties (axioms). The concept of the solution is adapted to the problems of sharing resources (or cost allocation) with the characteristic function $\mu$:
a. Efficiency: $\sum_{i \in E} \phi_i(\mu) = \mu(E)$.
b. Symmetry: if $\mu(X \cup \{i\}) = \mu(X \cup \{j\})$ for all $X \subset E \setminus \{i, j\}$, then $\phi_i(\mu) = \phi_j(\mu)$.
c. Nullity: if $\mu(X \cup \{i\}) = \mu(X)$ for all $X \subset E$, then $\phi_i(\mu) = 0$.
d. Additivity: if $\mu_1$ and $\mu_2$ are two coalitional games with $(\mu_1 + \mu_2)(X) = \mu_1(X) + \mu_2(X)$, then $\phi(\mu_1 + \mu_2) = \phi(\mu_1) + \phi(\mu_2)$.
The separation (sharing) of total value among property attributes is realized under these conditions (axioms). The efficiency condition indicates that the total value available to the property attributes (players) in the grand coalition is distributed among them, following their relative capacity. Under the symmetry condition, equivalent attributes have equal contributions, and under the nullity condition an attribute with zero contribution to all coalitions receives a zero payoff. The additivity condition simply states that the values of two games (markets) sum up to the value computed for the sum of both games.
We use the following alternative formula to estimate the Shapley value, which satisfies these axioms (Osborne and Rubinstein, 1994; Branzei et al., 2008):
$$\phi_i(\mu) = \frac{1}{N!} \sum_{\theta \in \Theta} \left[ \mu\left(P_\theta^i \cup \{i\}\right) - \mu\left(P_\theta^i\right) \right]$$
where $\phi_i(\mu)$ is a weighted sum of the marginal contributions of the attribute $i$, $\Theta$ is the set of all orders of the set of players, and $P_\theta^i$ is the set of players which precede $i$ in the order $\theta$. This expression captures the separate value of each property attribute over all the different sequences in the grand coalition. Based on these separate estimations (individually informative to compare an attribute's marginal production cost to its selling price), it becomes possible to estimate a property's total and separate values for land and building components.
The Shapley separate value is a global contribution (capacity) of an attribute considered alone and in interaction within the same game of total value creation. The interaction indices between each pair of attributes are sufficient in multi-criteria decision-making (MCDM) problems even though it is possible to define them further (Magoc et al., 2011). Several authors have tried extensions of Shapley's index of degree two, which take into account the interactions between pairs of attributes (two attributes considered simultaneously) and all other attributes in the set of combinations (Grabisch, 1997; Roth, 1988). In this work, for the demonstration, Shapley's basic method suffices to obtain the marginal values of attributes that are needed in total price apportionment.
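The order-based formula can be sketched in a few lines of code. This is only an illustration under stated assumptions: the three attributes and the capacity values in `mu` are hypothetical (a toy monotone, non-additive capacity with $\mu(\emptyset) = 0$ and $\mu(E) = 1$), and the grouping of attributes into land and building is ours.

```python
from itertools import permutations
from math import factorial

def shapley_values(players, mu):
    """Average marginal contribution of each player over all orders.

    mu maps a frozenset of players to its capacity.
    """
    phi = {p: 0.0 for p in players}
    for order in permutations(players):
        preceding = frozenset()
        for p in order:
            with_p = preceding | {p}
            phi[p] += mu[with_p] - mu[preceding]
            preceding = with_p
    n_orders = factorial(len(players))
    return {p: total / n_orders for p, total in phi.items()}

# Hypothetical attributes: two for the building, one for the land.
players = ["living_area", "garage", "lot_size"]
fs = frozenset
mu = {
    fs(): 0.0,
    fs({"living_area"}): 0.40,
    fs({"garage"}): 0.10,
    fs({"lot_size"}): 0.30,
    fs({"living_area", "garage"}): 0.55,
    fs({"living_area", "lot_size"}): 0.80,
    fs({"garage", "lot_size"}): 0.45,
    fs(players): 1.0,
}

phi = shapley_values(players, mu)
building_share = phi["living_area"] + phi["garage"]
land_share = phi["lot_size"]
print(phi, building_share, land_share)
```

Because of the efficiency axiom the shares sum to $\mu(E) = 1$, so multiplying an observed price by `land_share` and `building_share` gives one possible apportionment for this toy capacity. Enumerating all orders costs $N!$ evaluations, which is only practical for a small number of attributes.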
Gaussian fuzzy measures. The MCDM problem is extensively investigated for the evaluation and the choice of alternatives offering multiple attributes (Wallenius et al., 2008). Agents take decisions under uncertainty and cognitive bias as they vary in their motivations, use of information and capacity of evaluation (Kahneman et al., 1982). Uncertainties in alternatives involve imprecise decisions in various contexts that can be reliably studied based on fuzzy measures (Zadeh, 1965). To date, various fuzzy measures have been proposed in control systems, cooperative game theory, and combinatorial optimization, varying in relation to factors like the application domain, the availability of data, the mathematics involved and particular needs (Atanassov, 1986). When attributes explaining a phenomenon of interest are nonadditive and in interaction, fuzzy measures can address them adequately (Ishii and Sugeno, 1985). Sugeno (1974) initially proposed a nonadditive monotone fuzzy measure that can take into account the interactions between attributes (Weber, 1988). In that perspective, the Shapley index has been proposed to redistribute (or separate) fuzzy measures (Murofushi and Soneda, 1993).
Expert opinions or hedonic measures on the contribution of parameters might help in building the weights of fuzzy measures containing all the $2^n$ possible combinations of criteria in the whole set $\mathcal{P}(E)$, but they are essentially biased (Grabisch, 1997). The main reason is the imperfect price levels given the subjective decisions of economic agents, but there is a coherence when related to the quality and quantity of attributes, which allows some objective measures to be derived. In our case, we use weighted and normalized correlations of attributes with observed prices per district in Montreal.
These correlations are used as initial values in a Gaussian function that generates capacities of attributes under logical conditions, Shapley's axioms and Sugeno's criteria of nonlinear monotonic increase of contributions (Grabisch and Raufaste, 2008). The choice of that function rests on several arguments; notably, it is known that real estate price formation usually obeys a Gaussian law of behavior (Wyman et al., 2011), responding adequately to the criteria of Sugeno (1974) in building the fuzzy measures of capacities. The total capacity in the Gauss integral is assumed to cover well the functional behavior of decisions on the probable value of combinations, contributing to a further decrease in the bias factor. The general Gaussian function is the following:
$$P(x) = \frac{1}{\sigma \sqrt{2\pi}} \, e^{-\frac{(x - \bar{x})^2}{2\sigma^2}}$$
where $P(x)$ is the probability of the Gaussian function, $\bar{x}$ the average value, and $\sigma$ the standard deviation. Without getting into the details of the proofs, it can be shown that the total probability (the integral of $P(x)$ over the real line) is equal to 1. The fuzzy measure can be a mathematical function of probability, for instance a Gaussian function, under the conditions to be satisfied such as $\mu(\emptyset) = 0$, $\mu(E) = 1$ and $\mu(X) \le \mu(T)$ if $X \subset T$. The condition $\mu(E) = 1$ is not automatically reached in the field of probability and the highest combined value "E" will instead have a maximum value of capacity in the Gaussian function that we use. Mathematically speaking, the maximum weight is to be given to "E" in such a way that $\mu(E) = \mathrm{Max}$ in the Gaussian function.
For the fuzzy measures of capacities, we prefer to use an adjusted version of the Gaussian function. In the equation above, $\bar{x}$ is the average value of the sample; it is a characteristic which changes from sample to sample and determines the location of the Gaussian function on the x-axis. If $\bar{x} = 0$, then the expression above becomes:
$$P(x) = \frac{1}{\sigma \sqrt{2\pi}} \, e^{-\frac{x^2}{2\sigma^2}}$$
The value of the standard deviation $\sigma$ flattens the Gaussian curve, in a more clearly distinguishable way, and independently of the small size of the samples. In other terms, $\bar{x}$ is only a translation of the $x$ values in the capacity measures while $\sigma$ directly influences the results of $\mu(X)$ containing all the combinations $\mathcal{P}(E)$. Ultimately, what is important here is keeping the Gaussian form, which reflects the image of the general tendency of all the samples and not one particular sample. By considering $\bar{x} = 0$ in the equation, we ensure that the system does not depend on diverging behaviors arising from the small size of the samples. The behavior should rather depend on the system. This is the main reason why we kept $\sigma$ from the sample and omitted $\bar{x}$ from the expression above.
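The roles of the mean and the standard deviation described above can be checked numerically with a short sketch. The two standard deviations and the mean of 0.45 are hypothetical values picked for illustration, and the trapezoidal integration is our own choice of check, not a step of the method.

```python
import math

def gaussian(x, sigma, mean=0.0):
    """Gaussian density; mean=0.0 gives the zero-mean form."""
    coef = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    return coef * math.exp(-((x - mean) ** 2) / (2.0 * sigma ** 2))

def area(sigma, mean=0.0, half_width=8.0, steps=20000):
    """Trapezoidal estimate of the integral over mean +/- half_width * sigma."""
    lo = mean - half_width * sigma
    h = 2.0 * half_width * sigma / steps
    total = 0.5 * (gaussian(lo, sigma, mean) + gaussian(lo + steps * h, sigma, mean))
    total += sum(gaussian(lo + k * h, sigma, mean) for k in range(1, steps))
    return total * h

narrow, wide = 0.1, 0.3  # hypothetical sample standard deviations

peak_narrow = gaussian(0.0, narrow)
peak_wide = gaussian(0.0, wide)               # larger sigma gives a flatter curve
peak_shifted = gaussian(0.45, narrow, mean=0.45)  # the mean only shifts the curve

print(peak_narrow, peak_wide, peak_shifted, area(narrow), area(wide))
```

The peak height depends on $\sigma$ alone and is unchanged by the shift, which is consistent with treating the mean as a pure translation.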
The treatment of the raw data and its normalization to build the probabilistic values of $x$ (small values) allow them to be kept within $0 < x < 1$. The probabilistic $P$ values in the Gaussian function will, however, be concentrated around a maximum value, very close to each other with tiny differences that are difficult to detect using the original Gaussian function. The sum of probabilities $\sum_{i=0}^{\infty} P_i$ continuously increases and unnecessarily disrupts the capacities in the fuzzy table. To remedy that situation, as a first reason, we operate a change of variable, replacing $x$ by $X = 1/x$. The second reason for that modification is to keep the monotonic increase in the capacity of combinations. The normalization of the raw data provides individual values of $x$, and the computation of values for their combinations $\mathcal{P}(E)$ is additively and linearly operated. Regarding this choice, one may then ask why we do not take their product or other types of combinations. The following mathematical demonstration responds to that question.
The sum of two criteria $x_1$ and $x_2$ has a probability higher than $P_{x_1} \cdot P_{x_2}$. For the demonstration of this theorem, consider the simplest function of a Gaussian distribution, $P = e^{-x^2}$, with $P_{x_1} = e^{-x_1^2}$, $P_{x_2} = e^{-x_2^2}$, and $P(x_1 + x_2) = e^{-(x_1 + x_2)^2}$; we deduce that $P(x_1 + x_2) \ge P_{x_1} \cdot P_{x_2}$.
Error minimization process. In the following practical example, we considered the most important parameters of land and building in Shapley's method, which approaches a realistic explanation of price differentials. However, individuals taking part in the market subjectively express their utility in deciding on the number, type and importance of these parameters, and the results therefore require objective support. A consistent way to provide it in the field is to compare separate values with those of other approaches such as city and OLS evaluations. The performance of comparable models can rely on a common behavior of the market, i.e. a normal distribution of separate prices (and errors). The most direct basis of comparison is recently sold vacant land within the same usage and market. Such data are rare (and the sales are usually biased) in almost entirely built-up cities. In our approach, we compare the separate results of the Shapley method with those of the OLS and city evaluations. We also use a statistical error minimization test, integrated within the entire process of Shapley's separate value computation. This test is theoretically and practically appropriate, as the weights of attributes are objectively estimated using a key assumption in valuation: the adjusted prices of the comparables (here three) and the observed price (or the estimated market value) of the subject have to converge at the end of the process. In fact, following the process of price adjustments, properties become very similar. Weights of attributes are iteratively changed under the constraint of error minimization, which makes the adjusted prices of similar properties converge.
Observed prices can be adjusted using either a ratio or the marginal contribution of each attribute (provided by an expert or derived from, for instance, an OLS model). To simplify the computations, we use a ratio derived from a Euclidean norm representing a degree of proximity (similarity/dissimilarity) of each comparable to the subject property. Considered in an n-dimensional Euclidean space, the Euclidean norm of a vector is expressed by

$$ \|y\| = \sqrt{y_1^2 + y_2^2 + \cdots + y_n^2} $$

Although there are various types of distance estimation, this expression measures the size of a vector and reduces to an absolute value for n = 1. It assigns to each attribute vector the length (or magnitude) from the origin to point y, a consequence of the Pythagorean theorem. This weighable distance helps to measure the degree of similarity between comparables and the subject property. For each comparable and subject property j, we form the weighable distance

$$ |y_j| = \sqrt{\sum_{i=1}^{n} (x_i \omega_i)^2} $$

where it is assumed that $x_i \omega_j \equiv 0$ for $i \neq j$. We normalize the weighable distance of each property by the sum of all weighable distances; we call this parameter the normalized weighable distance $\|y_j\|$. The adjusted price of property j is calculated by multiplying the observed price of property j, $P_{oj}$, by the subject's normalized weighable distance $\|y_s\|$ and dividing by the normalized weighable distance of property j, $\|y_j\|$. Therefore, the adjusted price of property j can be represented by

$$ P_{Adj} = P_{oj} \, \frac{\|y_s\|}{\|y_j\|} $$

In deriving the weights of the attributes, we use an optimization process that aims at minimizing the distances between the adjusted and the observed prices. The following root-mean-square error (RMSE) formula computes the standard distance between these prices:

$$ \mathrm{RMSE} = \sqrt{\frac{1}{n} \sum \left(\Delta P_{Adj}\right)^2} \qquad (7) $$

where $\Delta P_{Adj}$ represents all the relative differences between $P_{oj}$ and the other three adjusted prices $P_{Adj,j}$, j = 1, 2, 3, with n = 6.
Once the optimal weights of the attributes are found, Eq. (7) should, in theory, converge to zero, as the normalized attributes of the three comparables and the subject are the same.
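The adjustment and error steps described above can be sketched in a few lines of Python. This is a minimal illustration and not the implementation used for the study: the attribute values and prices are hypothetical, the function names are invented for the example, and reading n = 6 as the six pairwise differences among four prices (the subject plus three adjusted comparables), each taken relative to the subject's observed price, is an assumption. Only the four weights are taken from Table 4.

```python
import math
from itertools import combinations

def weighted_norm(attrs, weights):
    """Euclidean norm of an attribute vector after weighting each attribute."""
    return math.sqrt(sum((x * w) ** 2 for x, w in zip(attrs, weights)))

def adjust_prices(prices, attrs, weights, subject=0):
    """Scale each observed price by ||y_s|| / ||y_j|| (normalized distances)."""
    norms = [weighted_norm(a, weights) for a in attrs]
    total = sum(norms)
    normalized = [d / total for d in norms]
    return [p * normalized[subject] / d for p, d in zip(prices, normalized)]

def rmse_pairwise(prices, reference):
    """RMSE of all pairwise price differences, relative to a reference price."""
    diffs = [(a - b) / reference for a, b in combinations(prices, 2)]
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))

# Hypothetical data: index 0 is the subject, indices 1..3 are its comparables.
weights = (0.16, 0.24, 0.40, 0.20)   # L1, L2, B1, B2 weights reported in Table 4
attrs = [(0.50, 0.60, 0.40, 0.70),
         (0.55, 0.60, 0.45, 0.70),
         (0.45, 0.50, 0.40, 0.65),
         (0.60, 0.70, 0.50, 0.80)]
prices = [300_000, 320_000, 275_000, 365_000]

adjusted = adjust_prices(prices, attrs, weights)
error = rmse_pairwise(adjusted, prices[0])   # four prices give n = 6 pairs
print([round(p) for p in adjusted], round(error, 4))
```

The subject's own price is left unchanged by the adjustment, since its distance ratio is 1, and the error is exactly zero only when all four prices coincide.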

A practical example
The practicality of the Shapley method is verified here based on the observed prices of 14,715 single-family properties sold during the year 2007 over all the 27 districts in Montreal. These data are convenient and already operational, as the technical details of gathering, validation, codification, and computation have been handled by the city. The database already contains two comparable bases of separate values, obtained from the OLS method and the city evaluations for the same properties.
Data specification. In this quite heterogeneous market, Table 2 shows that the total prices range between $51,000 and $617,829, with an average of $151,272. The database comes with detailed information about four essential attributes of the properties, distinguished between their land and building components. The X-Y geographical coordinates of each of the properties enabled us to locate them precisely on the map and appreciate the spatial distribution of separate estimations. Among the four important attributes, the structural quality of the buildings has been qualified in different classes by the city appraisers who have visited each property in the past. We build an attribute on the quality of the land with the aid of the geographical information systems (GIS), using the multiple thematic maps that show the distances of each property to important location points by district (city parks, schools, subway stations, river, commercial centers, etc.).
It is possible to consider a higher number of attributes related to the land, building and financial aspects, assuming that the time effect is kept constant (adjusted prices or sales within the same year). For a residential building, the model might include, for instance, the type of exterior covering, the floors, the size of the garage, or the presence of a fireplace or a swimming pool. These attributes might provide more explanations for price variations. In this practical example, we expect that each pair of size and quality attributes for the land and the building will account for most of the price variations. In practice, professionals often manually operate a classical grid method in which the comparable properties found around the subject usually require an average of 3-4 adjustments. The price method works better when comparables are very similar, thus requiring fewer adjustments. The Shapley method can also work with more attributes, but this requires an automated algorithm of computation because of the higher number of possible combinations in the fuzzy table.
The Shapley method we use here directly separates the observed total price of the subject property (randomly taken from the data) between its land and building. A global estimate of the market value for the same subject can also be divided between the two components, assuming that the evaluation correctly integrated the interactions between the attributes. The Shapley method also has the potential to predict the separate values of the subject, based on the information from its neighbors. Using the GIS tool, we were able to identify the three comparables in the data that are the closest to the subject.
Fuzzy measures. Before proceeding to the direct separation and prediction of the separate prices, we build the relative values (or bargaining powers or "X values") for the four attributes in the fuzzy table. There are several options to consider in building these values, with an elaborate discussion around the possible conditions or specific constraints for each attribute, taken in isolation and in combination. For example, it is easy to guess the minimum and maximum values of attributes for an empty set (property without any attributes) and a full set (all the attributes, with their combinations). A land attribute can participate in the game without any quality or building (polluted and vacant land). In this case, the negotiation power of the two building attributes is absent. Other interpretations and situations might exist, but the challenge lies between the extreme values, especially with the combined bargaining power.

Results
Thus far, we have prepared the data and the cumulative capacity of μ(X) values in Table 3, needed for the application of the Shapley method. Considering that all the lands are built (improved) in the data, normalized and weighted correlations of land and building attributes are used as initial values. The Gaussian cumulative distribution of capacities is computed under the scheme of Shapley's axioms and Sugeno's criteria of nonlinear monotonic increase of contributions. Instead of considering the expert's subjective opinion on these contributions, the Gaussian function works with the correlations between property attributes and observed prices.
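To make the role of the capacity table concrete, the following sketch computes Shapley values from a capacity μ defined on every subset of the four attributes, using the classical Shapley formula from coalitional game theory. The capacity here is a toy construction (the squared sum of hypothetical attribute weights), chosen only because it is monotone and non-additive; it is not the Gaussian capacity of Table 3, and the function name `shapley_values` is invented for the example.

```python
from itertools import combinations
from math import factorial

def shapley_values(players, capacity):
    """Average marginal contribution of each player over all coalitions."""
    n = len(players)
    values = {}
    for p in players:
        others = [q for q in players if q != p]
        total = 0.0
        for k in range(n):
            # Weight of a coalition of size k that player p joins.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = frozenset(subset)
                total += weight * (capacity[s | {p}] - capacity[s])
        values[p] = total
    return values

# Toy capacity (an assumption for illustration): mu(S) = (sum of weights in S)^2.
players = ("L1", "L2", "B1", "B2")
w = {"L1": 0.16, "L2": 0.24, "B1": 0.40, "B2": 0.20}
capacity = {
    frozenset(c): sum(w[p] for p in c) ** 2
    for k in range(len(players) + 1)
    for c in combinations(players, k)
}

phi = shapley_values(players, capacity)
print({p: round(v, 3) for p, v in phi.items()})
```

By the efficiency axiom the values sum to μ of the full set minus μ of the empty set, here 1. For this particular toy capacity the pairwise interaction terms are shared equally between the two attributes involved, so each Shapley value coincides with the attribute's weight; a capacity with asymmetric interactions would not have this property.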
Shapley's separate values are computed based on a key assumption we made. Accordingly, when the observed prices of comparables are correctly adjusted by their attributes of various importance, in comparison to those of the subject, they are theoretically supposed to be the same or, at least, very close. To objectively conduct this process, we allowed weights of attributes to change under the optimization process of price convergence that automatically connects all the steps of Shapley's separate value estimations. When these weights change, prices of comparables are adjusted with error minimization. Weighting optimization stops when the total distance between the adjusted prices is minimized.
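One simple way to picture the weight optimization is an exhaustive search over a coarse grid of weight vectors that sum to one, keeping the vector with the lowest error. The sketch below is only an illustration under stated assumptions: the data are hypothetical, the grid search stands in for whatever optimizer was actually used, and the error is taken as the RMSE of the six pairwise gaps between the four adjusted prices, relative to the subject's observed price.

```python
import math
from itertools import combinations, product

def objective(weights, prices, attrs):
    """RMSE of pairwise relative gaps between distance-adjusted prices."""
    norms = [math.sqrt(sum((x * w) ** 2 for x, w in zip(a, weights)))
             for a in attrs]
    # Index 0 is the subject; the common normalizing sum cancels in the ratio.
    adjusted = [p * norms[0] / d for p, d in zip(prices, norms)]
    gaps = [(a - b) / prices[0] for a, b in combinations(adjusted, 2)]
    return math.sqrt(sum(g * g for g in gaps) / len(gaps))

def grid_search(prices, attrs, steps=20):
    """Try every positive weight vector on a simplex grid; keep the best."""
    n_attrs = len(attrs[0])
    best = None
    for combo in product(range(1, steps), repeat=n_attrs - 1):
        last = steps - sum(combo)
        if last < 1:
            continue  # the weights must stay positive and sum to one
        weights = tuple(c / steps for c in combo + (last,))
        score = objective(weights, prices, attrs)
        if best is None or score < best[0]:
            best = (score, weights)
    return best

# Hypothetical subject (index 0) and three comparables.
attrs = [(0.50, 0.60, 0.40, 0.70),
         (0.55, 0.60, 0.45, 0.70),
         (0.45, 0.50, 0.40, 0.65),
         (0.60, 0.70, 0.50, 0.80)]
prices = [300_000, 320_000, 275_000, 365_000]

best_score, best_weights = grid_search(prices, attrs)
print(best_weights, round(best_score, 4))
```

A grid is easy to follow by hand but scales poorly with the number of attributes; a gradient-based or constrained optimizer would be the natural replacement for a larger attribute set.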
As shown in Table 4, the optimal weights of the L1, L2, B1, and B2 attributes are, respectively, 0.16, 0.24, 0.40, and 0.20. Based on these weights, the Shapley method computed an indication of capacity for each attribute of the subject property (S) in the example, from which the separate values can be estimated. Accordingly, for the subject property, the estimated separate value is $259,304 (57%) for the building and $195,616 (43%) for the land.
For the test, we used correlations without optimization, which resulted in an RMSE of 0.039, in comparison to an RMSE of 0.011 for the optimized weights. The adjusted prices with optimization are very close to the observed price of the subject property and, consequently, suggest closer separate values for the land and building components. The RMSE optimization process made it possible to tune Shapley's separate values based on the essential argument of price convergence. Other statistical approaches like the OLS derive marginal contributions of attributes based on price differentials and use these contributions for price adjustments in later steps, from which an opinion on the market value of land and building is reached. In Shapley's method, which integrates the RMSE process, any slight adjustment in the weight of a given attribute simultaneously changes all the interconnected measures in the algorithm of computation.

Do these separate values make sense, and are they in line with practical expectations? Even though they are in line with practice, a direct response cannot be provided with this single example; we thus need to consider a higher number of cases, as in the data per district. Using the same algorithm, we computed separate estimations for all the other 14,714 properties in the data, considering each time their own three comparables. A comparison with the separate results obtained for the same properties by the two other independent methods (i.e., city and OLS) is also more enlightening. Table 5 displays the average results of the method per district, sorted in increasing order of the observed prices. The results are interesting not only regarding the separation but also in several other related aspects. For instance, the land values are close to or even higher than the building values in certain districts; the inverse is usually the case.
This shows a phenomenon of building depreciation and land value appreciation in the districts close to downtown areas such as Verdun and Westmount. Interestingly, those two districts are on opposite sides of the market. In Westmount, as in Verdun, the buildings are mostly old, and this probably reflects the reticence of the market due to the exponential increase of residential property values (and taxes) on lands that have a huge potential in rent. In districts farther from downtown, the land values usually decrease, while the quality of the buildings improves.
The distribution of the bargaining power of the four attributes demands particular attention per district. We can clearly observe where the size and the quality of the land and building dominate. These results imply that we cannot assume a uniform distribution of the so-called "hedonic contribution" on the utility of each attribute. It varies considerably in space, depending on the district in which the attributes are in competition. Some districts seem to favor the quality of the building (with the maximum bargaining power of 0.664 in Anjou), while others are influenced by the size of the land (like 0.385 in Ile-des-Soeurs). Most interestingly, in contrast to what is believed in practice, a good quality parcel of land does not necessarily have a good quality building according to these average results per district.
The average bargaining powers of L1, L2, B1, and B2 are respectively 0.219, 0.155, 0.294, and 0.332. These average values are consistent with the observations in practice. The ratio of land to total value oscillates between 0.195 and 0.631, with an average of 0.374 for all the districts. In comparison, the average ratio of the land is 0.348 in the case of the city evaluations (values between 0.187 and 0.511) and 0.399 according to the OLS model (with values between 0.194 and 0.626).
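The link between the bargaining powers and the land ratio can be checked with simple arithmetic. The sketch below assumes that L1 and L2 are the two land attributes and B1 and B2 the two building attributes, and that the land share of a price is the sum of the land attributes' bargaining powers; under that reading, the reported averages reproduce the average land ratio of 0.374. The helper `split_price` is a hypothetical name, and applying the average shares to the average price of $151,272 is purely illustrative.

```python
# Average bargaining powers reported for all districts.
bargaining = {"L1": 0.219, "L2": 0.155, "B1": 0.294, "B2": 0.332}

# Assumed reading: land share = L1 + L2, building share = B1 + B2.
land_share = bargaining["L1"] + bargaining["L2"]
building_share = bargaining["B1"] + bargaining["B2"]

def split_price(total_price, land_ratio):
    """Split a total price into (land value, building value)."""
    land_value = total_price * land_ratio
    return land_value, total_price - land_value

land_value, building_value = split_price(151_272, land_share)
print(round(land_share, 3), round(building_share, 3))
print(round(land_value), round(building_value))
```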
Figure 1 gathers the behavior of the land and building results in four different districts selected from the data (the price of the land is sorted in increasing order). By observing Fig. 1a-c, we can discuss several striking interpretations. First, the building values in general closely follow the same trend as the land values in the districts of Dorval and St-Léonard. Another observation in these districts is the apparent fluctuation of building values in Dorval, against a more stable trend in the values of both components in St-Léonard. This might signify that higher land values are almost systematically accompanied by a higher quality of the buildings. The situation is completely different in the case of Anjou, where the building values fluctuate indistinctly in comparison to the slightly increasing values of lands, with a large gap between the values of the two components.
In the bottom part of the same figure, the separate values of the land steadily increase and even overtake the values of the building for most of the properties in Westmount (Fig. 1d). Building values in this district are almost insensitive to the increases in land values, meaning that the capacity of the land is far from being optimized. We can extend this interpretation to the concept of "sleeping rents" on which a political debate can be elaborated. In fact, rents being captured "under the land" might signify a loss of value and consequently less income in property taxes for the municipalities in the long run. Recognition of that situation might motivate politicians to take some incentive measures, such as property tax relief, to encourage a better quality of buildings. Figure 1e shows the separate values of the same lands and the buildings, estimated almost by a rule of thumb, based on a typical ratio that professionals use in the city. This hides the reality of the practical dialectic between the two components and shows at the same time the importance of reliably estimating the separate values.
The separate results of the mass valuation on the map show these patterns, which are especially interesting from a spatial economics point of view (Fig. 2). They clearly follow what Alonso (1964) described in his bid rent theory, hypothesizing a decreasing pattern of land values with distance from the center of a city. According to the separate estimations of the Shapley method, the building values are higher around the old high-status markets encircling Mount Royal, while also being present far from downtown areas, especially in the west part of the city (due to the demand for direct access and greater visibility for lands along the border of the St. Lawrence River).

Discussion
Value clearly is a fundamental concept and knowledge about the separate contributions of its attributes is important in multiple contexts, for various economic agents. Ignoring separate values of property attributes and avoiding their interactions not only perpetuates the belief that they are meaningless, but it may also eliminate precious information. The issue of value separation is however important and challenging both in theory and in practice (Özdilek, 2012). There are mainly three types of difficulties that are related to the identification/definition of the land and building attributes, the measure of their separate values and the application of a reliable method. Arguments on the separability/inseparability thesis regarding the contributive values have been quite extensive in previous research. There are also several value separation methods in existence, each having some degree of consistency in practice.
This work contributes to the issue of separation by exploring and applying the Shapley scheme of value as a new method, involving both the conceptual basis of game theory and the use of fuzzy logic in the settlement of the initial conditions. This method provides a robust solution, with a relatively simple and accommodating basis, to the issue of separate and total value estimations. It allows professionals to tangibly (and even manually) break down a total value by clearly understanding the computational steps, otherwise uncertainly applied in practice.
Reference to game theory is quite new in a field where the analysts are used to thinking in terms of the marginal contributions of property attributes. The role of the player is given to the attributes in a strategically cooperative (or not) game of value creation/destruction. This parallel is helpful in understanding the attributes of utility that are in a weaker or stronger position to influence the economic deciders who actually play the game. These attributes are in the position to "claim", as we defend here, their separate part of the contribution in the game (individually or in cooperation). In settling the "rules" and the most probable constraints, the fuzzy measures also provide an interesting solution in taking into account the interactions among the value creating attributes in the game. In that context, we have introduced a Gaussian function that "defuzzifies" the fuzzy measures of probabilities in the set of combinations.
The Shapley method globally provides separate contributions (or payoffs) of property attributes, as shown in this work. The empirical validation of this method rests on the use of an important database from the market of 14,715 sold single-family properties. The results are very coherent and without ambiguity for all the properties within the data. They are in line with practical expectations of values, either taken case-by-case or in aggregation per district. They are indeed encouraging when compared to the results of the city evaluations and the OLS predictions for the same properties and attributes. The results are derived from an objective solution to the difficulty of the separation issue, offering the occasion to appreciate other related issues from the land-building dialectic, which are usually overlooked or simply ignored.
The Shapley method depends on the reliability of the weights of attributes processed as fuzzy measures within the Gaussian function. The initial values, based on the correlations of attributes per district, are iteratively changed and stabilized under the constraint of minimizing the distances (errors) between the adjusted prices of comparables and the observed price of the subject. Professionals will benefit from the reliability of the method but, most importantly, they can confidently defend and demonstrate that those weights are objectively in line with the theoretical and practical expectations in valuation.
Further work. Based on the general framework of the Shapley method that we presented, future work might address several technical aspects to enhance the separate values. For instance, we processed the direct separation of observed prices based on the Shapley method. Because residual interactions among attributes may persist after the application of Shapley's global approach, an extended version can be elaborated (for instance, by integrating the simultaneous interactions of two or more attributes on other ones in the same framework).
Even though we presented it only briefly, the Shapley method also has a coherent framework for the prediction of the separate values. The process of prediction can be submitted to a robust modeling algorithm with forward and backward learning steps of separation, under the constraint of decreasing the errors of estimation. This sounds advantageous especially in the context of mass evaluation, but the method might then become complex for professionals and lose some of its tangible benefits. There is also the possibility of improving several other steps in the method, such as using another probability distribution function and changing the number of attributes (and their weights). In this perspective, spatial autocorrelation methods can be joined to the Shapley method, especially to improve the explained portion of the land value. Also, in future work, more discussion on the political issue of the separate system of tax rates (and its consequences) would be beneficial for public finance management. Finally, using Shapley axioms, more sensitive approaches can focus, for instance, on the optimal timing of improving vacant lands and demolishing depreciated buildings for new constructions.

Data availability
The author confirms that the data supporting the findings of the example in this study are available within the article. The raw dataset is not publicly available due to legal restrictions (it contains individual information that could be accessed and compromise the privacy of properties and their owners).