Legislators’ roll-call voting behavior increasingly corresponds to intervals in the political spectrum

Scaling techniques such as the well-known NOMINATE position political actors in a low-dimensional space to represent the similarity or dissimilarity of their political orientation based on roll-call voting patterns. Starting from the same kind of data, we propose an alternative, discrete representation that replaces positions (points and distances) with niches (boxes and overlap). In the one-dimensional case, this corresponds to replacing the left-to-right ordering of points on the real line with an interval order. As it turns out, this seemingly simplistic one-dimensional model is sufficient to represent the similarity of roll-call votes by U.S. senators in recent years. In a historical context, however, low dimensionality is the exception, which stands in contrast to what scaling techniques suggest.


Intersection graphs
This section includes some additional technical details on interval and boxicity 2 graphs. Proofs of stated theorems can be found in the original work.

Interval graphs
The following theorem provides the technical justification for using the clique-membership matrix to detect interval graphs. Note that interval graphs can be characterized in ways that do not involve the clique-membership matrix M. As stated in the main paper, we only employ Theorem 1 for practical reasons.
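Definition 1 An $n \times n$ matrix $A$ is said to be a Robinson matrix if and only if it is symmetric and
$$a_{ij} \le a_{ik} \ \text{ for } j < k < i, \qquad a_{ij} \ge a_{ik} \ \text{ for } i < j < k.$$
Note that the diagonal elements are not specified. If $A$ can be permuted into a Robinson matrix by a simultaneous permutation of its rows and columns, then $A$ is a pre-Robinson matrix.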
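The definition can be checked mechanically. The following is a minimal sketch, assuming the usual reading of the Robinson property (entries never increase when moving away from the diagonal along a row, diagonal ignored). The function names `is_robinson`, `permute`, and `is_pre_robinson` are hypothetical, and the brute-force search over all permutations is only meant for very small matrices.

```python
from itertools import permutations

def is_robinson(A):
    """True if A is symmetric and, in every row, entries do not increase
    when moving away from the diagonal (the diagonal itself is ignored)."""
    n = len(A)
    for i in range(n):
        for j in range(i + 1, n):
            if A[i][j] != A[j][i]:
                return False
    for i in range(n):
        # Left of the diagonal: a_ij <= a_ik for j < k < i.
        for j in range(i - 1):
            if A[i][j] > A[i][j + 1]:
                return False
        # Right of the diagonal: a_ij >= a_ik for i < j < k.
        for j in range(i + 1, n - 1):
            if A[i][j] < A[i][j + 1]:
                return False
    return True

def permute(A, order):
    """Apply the same permutation to rows and columns of A."""
    return [[A[r][c] for c in order] for r in order]

def is_pre_robinson(A):
    """Brute force: try every simultaneous row/column permutation."""
    return any(is_robinson(permute(A, p)) for p in permutations(range(len(A))))

A = [[2, 1, 0],
     [1, 2, 1],
     [0, 1, 2]]
B = permute(A, [1, 0, 2])  # same matrix with the first two indices swapped
print(is_robinson(A), is_robinson(B), is_pre_robinson(B))
```

Here `A` is already in Robinson form, while `B` is not but can be permuted back into it, so it is pre-Robinson.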
The following two theorems justify using the Fiedler vector to identify interval graphs.
Theorem 2 (Atkins et al. [3]) Let $A$ be a pre-Robinson matrix with a simple Fiedler value and a Fiedler vector with no repeated values. Let $\Pi_1$ (respectively, $\Pi_2$) be the permutation matrix induced by sorting the values in the Fiedler vector in increasing (respectively, decreasing) order. Then $\Pi_1 A \Pi_1^T$ and $\Pi_2 A \Pi_2^T$ are Robinson matrices, and no other permutations of $A$ produce Robinson matrices.
The theorem can be generalized to allow for repeated entries in the Fiedler vector [3].
Theorem 3 (Kendall [4]) Suppose $A$ is a (0, 1)-matrix with the consecutive ones property (C1P). The permutations of the rows of $A$ which produce consecutive ones correspond exactly to the permutations which, when applied simultaneously to rows and columns, put $AA^T$ into Robinson form.
Theorems 2 and 3 imply that the Fiedler vector of $MM^T$ yields a C1P ordering for $M$ if the associated network is an interval graph. Once the rows of $M$ are permuted accordingly, the interval representation of each node can be read off from the location of the ones in the respective column.
An illustration of the outlined steps to recover the interval representation is shown in Figure 1.
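The steps above can be sketched in a few lines of code. This is an illustrative sketch under stated assumptions, not the implementation used for the paper: the clique-membership matrix `M` is taken with rows as maximal cliques and columns as vertices, the Fiedler vector is approximated by a plain power iteration (a swapped-in technique chosen to avoid external dependencies), and the graph of $MM^T$ is assumed to be connected with a simple Fiedler value. All function names are hypothetical.

```python
from math import sqrt

def fiedler_vector(W, iters=2000):
    """Approximate the Fiedler vector of the Laplacian L = D - W by power
    iteration on (c*I - L), kept orthogonal to the all-ones vector.
    Assumes at least two rows and a start vector not orthogonal to the target."""
    n = len(W)
    deg = [sum(W[i][j] for j in range(n) if j != i) for i in range(n)]
    c = 2 * max(deg) + 1                      # shift making c*I - L positive definite
    x = [i - (n - 1) / 2 for i in range(n)]   # sums to zero
    for _ in range(iters):
        y = [(c - deg[i]) * x[i]
             + sum(W[i][j] * x[j] for j in range(n) if j != i)
             for i in range(n)]
        mean = sum(y) / n
        y = [v - mean for v in y]             # re-project against rounding drift
        norm = sqrt(sum(v * v for v in y))
        x = [v / norm for v in y]
    return x

def fiedler_row_order(M):
    """Order the rows (cliques) of M by the Fiedler vector of M M^T."""
    k = len(M)
    W = [[sum(a * b for a, b in zip(M[i], M[j])) for j in range(k)]
         for i in range(k)]
    f = fiedler_vector(W)
    return sorted(range(k), key=lambda i: f[i])

def intervals(M, order):
    """Read off (first, last) clique position per vertex, or None if some
    column does not have consecutive ones under this row order."""
    result = {}
    for v in range(len(M[0])):
        pos = [p for p, r in enumerate(order) if M[r][v] == 1]
        if pos[-1] - pos[0] + 1 != len(pos):
            return None
        result[v] = (pos[0], pos[-1])
    return result

# Path graph on 5 vertices; its maximal cliques are the 4 edges, rows scrambled.
M = [[0, 0, 1, 1, 0],   # clique {2, 3}
     [1, 1, 0, 0, 0],   # clique {0, 1}
     [0, 0, 0, 1, 1],   # clique {3, 4}
     [0, 1, 1, 0, 0]]   # clique {1, 2}
order = fiedler_row_order(M)
print(order, intervals(M, order))
```

In the scrambled input order the ones in some columns are not consecutive, so `intervals(M, [0, 1, 2, 3])` returns `None`, whereas the Fiedler ordering restores the C1P.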

Boxicity 2 graphs
The method used in the main paper to detect boxicity 2 graphs is based on the work by Quest and Wegner [5]. Let $A$ be the $n \times n$ adjacency matrix of a niche overlap network $G = (V, E)$. For each $h \in \{1, \dots, n\}$, define the vertex sets
The proof of sufficiency introduces a way to actually retrieve the box representation in two-dimensional space. Relate the $j$th clique to the line $x = j$ and the sets $J_h$ to the lines $y = n - h$. If $M_{jv} = 1$ and $v \in J_h$, we label the point $(j, n - h)$ with $v$. The box representation of $v$ is then given by the convex hull of all points labeled $v$.
Although the theorem gives a characterization of boxicity 2 graphs, no efficient way is known to determine the necessary permutations of $A$ and $M$. In our analysis, we employ simulated annealing to determine permutation matrices $\Pi_A$ and $\Pi_M$ which yield an upper bound for the Lazarus count of all clique-membership matrices $M^{(h)}$. If permutations are found that yield a Lazarus count of 0, then the graph is guaranteed to have boxicity 2. Note, however, that a nonzero count does not rule out the possibility of the graph having boxicity 2.
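To illustrate the kind of search involved, here is a deliberately simplified sketch of simulated annealing that minimizes the Lazarus count of a single clique-membership matrix over its row orderings. It does not reproduce the joint search over $\Pi_A$ and $\Pi_M$ described above. The swap move, the geometric cooling schedule, and all parameter values (`steps`, `t_start`, `t_end`, `seed`) are assumptions made for this example only.

```python
import math
import random

def lazarus_count(M, order):
    """Zeros between the first and last one of each column, rows taken in `order`."""
    total = 0
    for v in range(len(M[0])):
        pos = [p for p, r in enumerate(order) if M[r][v] == 1]
        if pos:
            total += (pos[-1] - pos[0] + 1) - len(pos)
    return total

def anneal_row_order(M, steps=5000, t_start=2.0, t_end=0.01, seed=0):
    """Return (best_order, best_cost); best_cost is an upper bound on the
    minimum Lazarus count over all row orderings. Needs at least two rows."""
    rng = random.Random(seed)
    order = list(range(len(M)))
    cost = lazarus_count(M, order)
    best_order, best_cost = order[:], cost
    for step in range(steps):
        if best_cost == 0:
            break                                   # cannot improve further
        t = t_start * (t_end / t_start) ** (step / steps)
        i, j = rng.sample(range(len(order)), 2)
        order[i], order[j] = order[j], order[i]     # propose a swap of two rows
        new_cost = lazarus_count(M, order)
        if new_cost <= cost or rng.random() < math.exp((cost - new_cost) / t):
            cost = new_cost                         # accept the move
            if cost < best_cost:
                best_order, best_cost = order[:], cost
        else:
            order[i], order[j] = order[j], order[i] # reject: undo the swap
    return best_order, best_cost

M = [[0, 0, 1, 1, 0],
     [1, 1, 0, 0, 0],
     [0, 0, 0, 1, 1],
     [0, 1, 1, 0, 0]]
initial_cost = lazarus_count(M, [0, 1, 2, 3])
best_order, best_cost = anneal_row_order(M)
print(initial_cost, best_order, best_cost)
```

As in the text, a returned count of 0 certifies that a suitable ordering exists, while a positive count is only an upper bound.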

Stochastic degree sequence model
This section includes two robustness checks for the stochastic degree sequence model (SDSM), which was used to compute the niche overlap networks. Note that, in general, there is no satisfactory way to validate the quality of one-mode projections derived from different binarization techniques. In the absence of such a method, we use the Lazarus count of the resulting networks to compare the one-mode projections. The less the Lazarus count varies, the more stable our results are and the less they depend on the binarization method employed.

Polytope model
The SDSM allows several different link functions to be used to fit a binary model to the roll-call data. In the main paper, the scobit model was used. Figure 2 shows the normalized Lazarus count for the networks computed with the "polytope" model, which is implemented in the R package backbone [6] and is the preferred choice of its authors (personal communication). The figure shows that the Lazarus count does not change notably when the model is changed.

Minority filtering
In the main paper, we excluded all votes from the data where the minority is below 2.5%. This was done in order to be comparable with NOMINATE. From a technical perspective, however, the SDSM should be able to handle these cases without exclusion. Figure 3 shows the normalized Lazarus count for the networks computed without filtering in comparison to the filtered networks. Overall, the results do not vary significantly and neither variant clearly outperforms the other. We note, though, that in the unfiltered case, two Senates are no longer one-dimensional.

Alternative distance measures to interval graphs
In the paper, we use the Lazarus count to assess the structural differences between a network and interval graphs. There are, however, other measures that could be employed; these are discussed in this section.

Graph edit distance
An alternative distance measure can be derived via graph edit distances, i.e., how many edges must be added or deleted in order to turn a graph into an interval graph. There exist at least three feasible instantiations of the interval edit problem: interval graph completion (only edge additions allowed), interval graph deletion (only edge deletions allowed), and interval editing (both edits allowed). All three, however, are NP-hard [7][8][9]. An approximation for the interval completion problem can be obtained via randomized interval supergraphs [10]. Given a graph $G = (V, E)$ and a permutation $\pi$ of its vertices, we define a map $M(G, \pi)$ which associates an interval supergraph $G_\pi = (V, E_\pi)$ to the pair $(G, \pi)$ as follows. Let $u$ be an arbitrary vertex and let $v \in N[u]$ be the vertex with $\pi(v) = \min_{w \in N[u]} \pi(w)$, where $N[u]$ denotes the closed neighborhood of $u$. Associate the interval $[\pi(v), \pi(u)]$ to the vertex $u$. It is easy to verify that the resulting graph $G_\pi$ is an interval supergraph of $G$.
This construction of interval supergraphs can be used to approximate the interval completion problem by minimizing the number of added edges $|E_\pi \setminus E|$ over all permutations $\pi$, using standard search heuristics such as simulated annealing. Figure 4 shows the lowest obtained edit distance for the non-interval niche overlap networks.
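The supergraph construction is short enough to write out. The sketch below assumes vertices are numbered 0 to n-1 and that `pi[u]` gives the position of vertex `u`. The function names are hypothetical, and the exhaustive search over all permutations merely stands in for a search heuristic on a tiny example.

```python
from itertools import permutations

def interval_supergraph(n, edges, pi):
    """Build G_pi: vertex u gets the interval [min of pi over N[u], pi[u]],
    and two vertices are adjacent in G_pi iff their intervals intersect."""
    nbrs = {u: {u} for u in range(n)}          # closed neighborhoods N[u]
    for u, v in edges:
        nbrs[u].add(v)
        nbrs[v].add(u)
    iv = {u: (min(pi[w] for w in nbrs[u]), pi[u]) for u in range(n)}
    e_pi = {(u, v) for u in range(n) for v in range(u + 1, n)
            if iv[u][0] <= iv[v][1] and iv[v][0] <= iv[u][1]}
    return iv, e_pi

def fill_in(n, edges, pi):
    """Number of edges of G_pi that are not edges of G."""
    _, e_pi = interval_supergraph(n, edges, pi)
    return len(e_pi - {tuple(sorted(e)) for e in edges})

# The 4-cycle is the smallest graph that is not an interval graph.
edges = [(0, 1), (1, 2), (2, 3), (0, 3)]
iv, e_pi = interval_supergraph(4, edges, [0, 1, 2, 3])
added = e_pi - set(edges)
best = min(fill_in(4, edges, list(p)) for p in permutations(range(4)))
print(iv, added, best)
```

For the 4-cycle, the identity permutation adds the single chord between vertices 1 and 3, which is also the best achievable since at least one chord is needed.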

Run-length Lazarus count
The Lazarus count is not the only metric to assess "non-consecutiveness" of matrix columns. Recall that the Lazarus count is the total number of zeros between the first and last non-zero entry, summed over all columns. An alternative metric can be defined by counting, for each column, the number of runs of consecutive zeros between the first and last non-zero entry and summing over all columns [11]. We refer to this metric as the run-length Lazarus count since it is equivalent to the Lazarus count of the run-length encoded columns. In many ways, the metric behaves similarly to the Lazarus count. If the network is an interval graph, then the Fiedler vector of the clique-membership matrix induces an ordering for which the run-length Lazarus count is zero. An interesting aspect of the metric, though, is its connection to a specific traveling salesman problem. Let $M$ be the $k \times n$ clique-membership matrix of a network $G = (V, E)$. Add two rows containing only zeros to $M$, one at the top and one at the bottom. Define the $(k + 2) \times (k + 2)$ distance matrix $D$ whose entries $D_{ij}$ correspond to the Hamming distance between rows $i$ and $j$ of the extended matrix. The solution of the traveling salesman problem with distance matrix $D$ gives the permutation which minimizes the run-length Lazarus count. For more technical details see [11].
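The connection to the traveling salesman problem can be made concrete on a small example. The sketch below rests on one reading of the construction: every change between 0 and 1 along a column contributes one unit of Hamming distance, so a path from the top zero row through all rows to the bottom zero row has length 2 * (run-length Lazarus count + number of non-empty columns). The function names are hypothetical, and brute force replaces a TSP solver.

```python
from itertools import permutations

def run_length_lazarus(M, order):
    """Number of zero runs between first and last one, summed over columns.
    Per column this equals (number of blocks of ones) - 1."""
    total = 0
    for v in range(len(M[0])):
        col = [M[r][v] for r in order]
        blocks = sum(1 for i, x in enumerate(col)
                     if x == 1 and (i == 0 or col[i - 1] == 0))
        total += max(blocks - 1, 0)
    return total

def hamming(a, b):
    return sum(x != y for x, y in zip(a, b))

def tour_length(M, order):
    """Length of the tour zero row -> rows in `order` -> zero row; the closing
    edge between the two zero rows has Hamming distance 0."""
    zero = [0] * len(M[0])
    path = [zero] + [M[r] for r in order] + [zero]
    return sum(hamming(path[i], path[i + 1]) for i in range(len(path) - 1))

M = [[0, 0, 1, 1, 0],
     [1, 1, 0, 0, 0],
     [0, 0, 0, 1, 1],
     [0, 1, 1, 0, 0]]
n_cols = len(M[0])                      # every column of M is non-empty here
perms = list(permutations(range(len(M))))
identity_holds = all(
    tour_length(M, p) == 2 * run_length_lazarus(M, p) + 2 * n_cols
    for p in perms)
best_tour = min(perms, key=lambda p: tour_length(M, p))
print(identity_holds, best_tour, run_length_lazarus(M, best_tour))
```

Because tour length and run-length Lazarus count differ only by a constant and a factor of two, the shortest tour also minimizes the run-length Lazarus count.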
Note that the run-length Lazarus count can be minimized efficiently given that exact solvers for the traveling salesman problem can easily handle problems of our size (i.e., instances with ≤ 1000 cities). Figure 5 shows the exact minimum run-length Lazarus counts, normalized by the number of nodes, which were determined with the Concorde TSP solver [12].
In general, there is no established rule on which metric to choose to assess the structural divergence of a graph from being an interval graph. The adequate choice may vary between applications. We opt for the Lazarus count since it appears to be more established in the literature but note that the presented alternatives would have given comparable results.

Niche representations
This section shows all niche representations of overlap networks that were found to be one- or two-dimensional, as well as the Fiedler vector representation used to assess polarization.

Interval representations
The figures below show the interval representations for the six niche overlap networks that were found to be interval graphs.
Figure 11. Interval representation of the 116th Senate.

Boxicity 2 representations
The figures below show the two-dimensional boxes for the five niche overlap networks that were found to be boxicity 2 graphs. The figures are not annotated and structural equivalence classes are contracted (indicated by thickness in the interval representation).
[Figures: boxicity 2 representations; axes labeled 1st Dimension and 2nd Dimension]

Niche point projections
Figure 17 shows the one-dimensional niche point projections based on the Fiedler vector approach. These representations are used in the main paper to compute the distance between parties to assess polarization.

Co-sponsorship networks
In this section, we repeat the analyses of the main paper for bill co-sponsorship data (93rd to 116th Senate). The data was obtained from ProPublica (https://www.propublica.org). The niche overlap networks obtained with the SDSM are shown in Figure 18. Figure 19 shows the normalized Lazarus count for the niche overlap networks derived from co-sponsorship in comparison with those derived from co-voting. None of the networks was found to be one-dimensional. Although we do observe a general tendency toward a declining number of dimensions, it remains significantly higher than for the networks derived from roll-call votes. These results seem to weaken the observations from the main paper; however, they do confirm previous research, which reported that the underlying space of bill co-sponsorship is of higher dimensionality than that of roll-call votes [13].