It's Time to End Gerrymandering IV: Partisan Symmetry

Gerrymandering is the process of drawing legislative districts in order to manipulate election outcomes. So far in this series, we have discussed the inherently disproportional nature of single-winner elections and statistical tests that can be used to detect gerrymandering, and we have given some initial consideration to how those tests might be manipulated. Here we will discuss a concept called partisan symmetry, which is important from both a legal and a mathematical perspective.

As of yet, the Supreme Court has not accepted a legally actionable standard for partisan gerrymandering, although it has indicated that it is open to a standard based on partisan symmetry (link is a pdf). Partisan symmetry means that election outcomes can be disproportionate, but the disproportionality should not favor only one party. Suppose that a given party wins 70% of seats with 60% of the vote. If, in a hypothetical outcome under the same districting plan, 40% of the vote would have gotten that party 30% of seats, then the result can be considered symmetric even though it is not proportional. The hypothetical results can be constructed using a seats-votes curve, where the vote tallies are shifted by a uniform percentage in each district and the results recomputed. We used a similar concept for analyzing the electoral college results in the past three presidential elections in a previous post. The legal argument against partisan asymmetry is that it violates the Fourteenth Amendment's Equal Protection Clause, since the districting plan disadvantages citizens based on their political preferences.
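The uniform shift described above can be sketched in a few lines of Python. This is a minimal illustration and not the code behind this series: the function names and the ten district vote shares are made up, and the statewide vote share is assumed to equal the unweighted mean of the district shares (that is, equal turnout in every district).

```python
def seats_share(district_shares, swing=0.0):
    """Fraction of seats won after adding `swing` to every district's vote share."""
    won = sum(1 for share in district_shares if share + swing > 0.5)
    return won / len(district_shares)


def seats_votes_point(district_shares, target_vote_share):
    """One point on the seats-votes curve.

    Assumption: the statewide vote share is the unweighted mean of the
    district shares, so the uniform swing is the gap between the target
    and that mean.
    """
    mean_share = sum(district_shares) / len(district_shares)
    return seats_share(district_shares, target_vote_share - mean_share)


# Hypothetical two-party vote shares for one party in a ten-district state.
example_shares = [0.40, 0.45, 0.48, 0.55, 0.58, 0.62, 0.65, 0.72, 0.75, 0.80]

seats_at_60 = seats_votes_point(example_shares, 0.60)
seats_at_40 = seats_votes_point(example_shares, 0.40)
print(seats_at_60, seats_at_40)
```

With these made-up districts the party wins 70% of seats at 60% of the vote and 30% of seats at 40% of the vote, which is the symmetric but disproportionate case from the example.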

Mathematically, the asymmetry can be understood through the concept of skew, which is a measure of how lopsided some data are around a particular location. For a probability distribution or a set of data consisting of a full population (as opposed to a sample of data from within that population), skew can be reliably measured about the mean. Additionally, while there are some exceptions, the difference between the mean and the median is a measure of skew that is particularly useful for small data samples coming from a larger, unknown population. This is the idea behind the mean-median difference test proposed by Dr. Sam Wang, a Princeton neuroscientist and gerrymandering expert. The difference between the mean and median of a set of election results also has a straightforward interpretation, since a party wins at least half of the seats if it wins the median district. The difference between the mean and median therefore tells us the vote percentage a party would need in excess of 50% in order to win 50% of the seats. If a party needs to win well over 50% of the votes in order to get 50% of the seats in a state, then the redistricting plan strongly disfavors the voters of that party. The usefulness of this test comes from the fact that it combines the legal concept of partisan asymmetry with the mathematical concept of skew in a way that allows for the definition of a threshold for striking down a partisan gerrymander.
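As a small worked example of this interpretation, the sketch below computes the mean-median difference and the statewide vote share a party would need to carry the median district. The function names and the nine district vote shares are hypothetical, and the statewide share is again assumed to equal the unweighted mean of the district shares under a uniform shift.

```python
from statistics import mean, median


def mean_median_difference(district_shares):
    """Mean minus median of a party's district vote shares.

    A positive value means the median district sits below the party's
    average support, which works against that party.
    """
    return mean(district_shares) - median(district_shares)


def votes_needed_for_median_district(district_shares):
    """Vote share needed (under a uniform shift) to bring the median district to 50%."""
    return 0.5 + mean_median_difference(district_shares)


# Hypothetical state: the party's voters are concentrated in two districts.
packed_shares = [0.35, 0.40, 0.42, 0.44, 0.45, 0.46, 0.48, 0.75, 0.80]

difference = mean_median_difference(packed_shares)
needed = votes_needed_for_median_district(packed_shares)
print(round(difference, 4), round(needed, 4))
```

Here the mean is about 50.6% and the median is 45%, so the party would need roughly 55.6% of the vote before the median district tips its way.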

The notion of asymmetry proposed here (link is pdf) is based on the deficit or excess in seats when a party gets 50% of the votes. It is determined by how far the number of districts in which the party's vote percentage exceeds its statewide level of support departs from half of the seats. If one type of asymmetry exists, then both exist by definition. However, the magnitude of one type of asymmetry could be small while the magnitude of the other could be large. For example, in a closely divided state, if gerrymandering has occurred then the mean and median will be significantly different. On the other hand, in a more partisan state, the mean and median might not be very different, but there would be an excess in seats at 50% of the votes for the redistricting party if gerrymandering had taken place. There is overlap between the two tests, and aggressively gerrymandered states may show significant levels of both types of asymmetry. In general, both types of partisan asymmetry should be tested for.

We have covered how to test for differences in the mean and median previously, and here we will introduce how to test for a deficit in seats at 50% of the vote. This is done using a test for symmetry around the mean district vote percentage. We first take the actual election results and reflect them around the mean percentage. Then we can use standard statistical tests for determining the likelihood that two sets of data were generated by the same distribution. If a result is significantly different from its reflection, then we can say that the result is not partisan symmetric. We are assuming that there is no uncertainty in the mean that we observe from the election results. This assumption is defensible since gerrymanderers seek to draw districts that maximize their election wins at a given statewide level of voter support, but the district boundaries don't increase or decrease the total votes in the state.

This symmetry test works by comparing the empirical distribution function (EDF) of the election results to the EDF of the results reflected about the mean. There are a few tests that can be used to do this. One such test is the two-sample Cramér-von Mises (CvM) test, which is a measure of the total difference between the two EDFs. This is probably the ideal test for this problem in terms of statistical power, but the CvM test is not very common, which is a disadvantage. A similar but more common test is the two-sample Anderson-Darling test, but this test gives higher weight to differences at the extreme ends of the two distributions, while we are most interested in the differences in the middle. The two-sample Kolmogorov-Smirnov (KS) test, on the other hand, is very common (it is implemented in Excel, for example), and examines the maximum difference between the two EDFs under consideration at any point. The KS test statistic, which is often abbreviated as D, has a straightforward interpretation when applied to election symmetry: D/2 multiplied by the total number of seats represents the maximum asymmetry in terms of actual seats. Compared to the CvM test, the KS test will probably be a bit conservative, but since it is simpler and more common, we'll use it for now.
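The reflection step and the KS statistic can be sketched as follows. This is an illustrative implementation in plain Python rather than the code used for this post; the function names and the example vote shares are hypothetical, and the rounding to nine decimals is an added safeguard that is not part of the test itself.

```python
from bisect import bisect_right


def reflect_about_mean(district_shares):
    """Mirror each district's vote share around the mean district vote share."""
    mean_share = sum(district_shares) / len(district_shares)
    return [2 * mean_share - share for share in district_shares]


def ks_statistic(sample_a, sample_b):
    """Two-sample KS statistic D: the largest gap between the two EDFs."""
    a_sorted, b_sorted = sorted(sample_a), sorted(sample_b)
    largest_gap = 0.0
    # The EDFs are step functions, so the gap only changes at observed values.
    for x in a_sorted + b_sorted:
        edf_a = bisect_right(a_sorted, x) / len(a_sorted)
        edf_b = bisect_right(b_sorted, x) / len(b_sorted)
        largest_gap = max(largest_gap, abs(edf_a - edf_b))
    return largest_gap


def seat_asymmetry(district_shares):
    """Return (D, asymmetry in seats), using D / 2 times the number of seats."""
    # Rounding guards against floating-point noise making a value differ
    # from its own mirror image when the results are exactly symmetric.
    original = [round(share, 9) for share in district_shares]
    reflected = [round(share, 9) for share in reflect_about_mean(district_shares)]
    d = ks_statistic(original, reflected)
    return d, d / 2 * len(district_shares)


# Hypothetical nine-district state with one party's voters packed into two districts.
packed_shares = [0.35, 0.40, 0.42, 0.44, 0.45, 0.46, 0.48, 0.75, 0.80]
d_stat, seats = seat_asymmetry(packed_shares)
print(round(d_stat, 4), seats)
```

For this made-up state, seven of the nine districts fall below the mean, so D comes out to 5/9 and the asymmetry to 2.5 seats, which is the gap between seven districts and the 4.5 that a symmetric result would put on each side of the mean.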

Before applying this test to each state one by one, we will apply it collectively to all results using the 2008, 2012, and 2016 presidential election data tallied at the district level by Daily Kos Elections. Below we've plotted the EDF of the Democratic two-party vote share in each district nationwide (blue line), the EDF reflected around the mean district two-party vote share of 53.8% (green line), and the average of the two EDFs, which represents a perfectly symmetric result (red line). To avoid any confusion, note that while this may look like a seats-votes curve, it is something else entirely. The seats-votes curve is computed by considering hypothetical shifts in vote totals and recomputing the seats won, while the EDF is essentially just the district vote percentages sorted in ascending order.

ecdf.png

Since a majority of districts fell below the Democrats' mean level of support, there is a nationwide asymmetry which disadvantages Democratic voters. The asymmetry is equivalent to about 25 seats and has a less than 1% probability of occurring by chance according to the p value computed for the KS test. The asymmetry is very stable throughout the past three elections, remaining steady at a disadvantage to Democrats of roughly 25 seats with a less than 1% probability of occurring by chance. For comparison, a similar analysis using seats-votes curves estimated the asymmetry to be 30 seats.

Year    Asymmetry (seats)    p value
2008    25.00                0.58%
2012    25.50                0.46%
2016    25.00                0.58%
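As a rough cross-check on the table above, a seat asymmetry can be converted back to D and then to an approximate p value. The sketch below uses the asymptotic Kolmogorov series with a common small-sample correction to the effective sample size. It is an assumption that this resembles the computation behind the table, since the source code is not shown here; the function names and the 435-district count passed in the example are also supplied for illustration. Because the reflected sample is derived from the original rather than drawn independently, the result is best read as a guide.

```python
import math


def ks_asymptotic_p_value(d, n_a, n_b):
    """Approximate two-sample KS p value from the Kolmogorov series."""
    effective_n = math.sqrt(n_a * n_b / (n_a + n_b))
    # Small-sample correction to the scaled statistic (an assumption here).
    lam = (effective_n + 0.12 + 0.11 / effective_n) * d
    if lam < 0.3:
        # The series converges slowly here and the p value is essentially 1.
        return 1.0
    total = 0.0
    for k in range(1, 101):
        term = 2 * (-1) ** (k - 1) * math.exp(-2 * (k * lam) ** 2)
        total += term
        if abs(term) < 1e-12:
            break
    return min(max(total, 0.0), 1.0)


def p_value_from_seats(asymmetry_in_seats, total_seats):
    """Invert seats = D / 2 * total_seats, then compute the p value."""
    d = 2 * asymmetry_in_seats / total_seats
    return ks_asymptotic_p_value(d, total_seats, total_seats)


p_25_seats = p_value_from_seats(25.0, 435)
p_25_5_seats = p_value_from_seats(25.5, 435)
print(f"{p_25_seats:.2%} {p_25_5_seats:.2%}")
```

Under these assumptions, asymmetries of 25 and 25.5 seats out of 435 districts give p values of about 0.58% and 0.46%, in line with the tabulated values.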

This is a significant disadvantage, but not all of it can be attributed to gerrymandering. We have not attempted to distinguish between the effects of urban packing, or the tendency of Democratic voters to be tightly clustered in urban areas, and intentional gerrymandering. Additionally, the national result only gives us the net effect, meaning that gerrymandering by Democrats gets cancelled out by the combination of asymmetry arising from urban packing and Republican gerrymandering. Nonetheless, this is quite a significant disadvantage for Democratic voters.

Applying this test state by state yields the following map, which shows the asymmetry in terms of seats averaged over the 2008, 2012, and 2016 elections for each state. States which met the 5% chance threshold were Texas, Alabama, North Carolina, Pennsylvania, and Ohio, while states meeting the 10% chance threshold were Indiana, Tennessee and Louisiana.

This symmetry test is a useful tool for detecting gerrymandering. It is simpler and more precise than the tests we proposed that are based on spread, and it is directly related to the concept of partisan asymmetry. It complements and overlaps with Dr. Wang's mean-median difference test, and could be performed by anyone who can enter data into Excel, which would hopefully mean the standard could be applied without an expert witness's testimony. The statistical power of the test could likely be improved by using the Cramér-von Mises test instead of the KS test, but this would come at the cost of the test's availability in standard software. From examination of the results of this test as well as the tests based on spread, it appears that gerrymandered states with mild skew often have clustered results. The mean-median difference in Dr. Wang's test is considered relative to the sample standard deviation, but this measure tends to overestimate the actual spread of the results, which increases the likelihood of a false negative for the skew test. Some preliminary work has shown that the mean-median difference test can be improved, especially for states which are not close to 50-50 partisan support, by replacing the standard deviation with the median absolute deviation. We will examine the effect of the spread measure used in the skew test more thoroughly in the future.
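To illustrate why the choice of spread measure matters, the sketch below compares the mean-median difference scaled by the sample standard deviation with the same difference scaled by the median absolute deviation (MAD). This is not Dr. Wang's exact test statistic, only the two ratios side by side. The function names and district shares are hypothetical, and the 1.4826 factor, which puts the MAD on the same scale as the standard deviation for normally distributed data, is an assumption added here.

```python
from statistics import mean, median, stdev


def median_absolute_deviation(values):
    """Median of the absolute deviations from the median."""
    center = median(values)
    return median([abs(value - center) for value in values])


def scaled_skew_ratios(district_shares):
    """Mean-median difference divided by two different measures of spread."""
    difference = mean(district_shares) - median(district_shares)
    ratio_sd = difference / stdev(district_shares)
    # 1.4826 rescales the MAD to be comparable to a standard deviation
    # under normality (an assumption made for this illustration).
    ratio_mad = difference / (1.4826 * median_absolute_deviation(district_shares))
    return ratio_sd, ratio_mad


# Hypothetical clustered results: seven similar districts and two packed ones.
clustered_shares = [0.35, 0.40, 0.42, 0.44, 0.45, 0.46, 0.48, 0.75, 0.80]

mad = median_absolute_deviation(clustered_shares)
ratio_sd, ratio_mad = scaled_skew_ratios(clustered_shares)
print(round(mad, 4), round(ratio_sd, 3), round(ratio_mad, 3))
```

In this made-up example the two packed districts inflate the standard deviation while leaving the MAD small, so the same mean-median difference looks much larger relative to the MAD than relative to the standard deviation.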

These tests, like all other proposed gerrymandering tests and standards, cannot eliminate gerrymandering but can only put the brakes on it. Redistricting should be in the hands of independent and transparent commissions, not state legislatures. Send this letter to your state representatives, and demand that they turn over redistricting to such a commission.


The source code for this post will be uploaded to GitHub shortly!