It's Time to End Gerrymandering III: Safe Wins Continued
Gerrymandering is the practice of drawing legislative districts in order to deliberately manipulate election outcomes. So far in this series we have discussed how the winner-take-all voting system used to elect congressional delegations does not typically award a party seats in proportion to its overall level of support, and how this can complicate the analysis of gerrymandering. We also discussed how statistical tests can be used to identify when the redistricting party has engineered artificial safe wins, or wins by comfortable but not blowout margins, to maximize its share of seats.
The first statistical test discussed was introduced by Dr. Sam Wang of Princeton and is based on the concept of skew, defined as the difference between the mean and median vote share. A second test, based on the concept of spread as measured by the ratio of the interquartile range to the standard deviation, was also discussed. These two tests are related to strategies that parties employ to rig the electoral map to increase their number of safe wins. The skew test reveals a strategy called packing, where a party loads its opponents into a few throwaway districts where it expects to lose by large margins, reducing the competition it faces in the remaining districts. The spread test reflects a strategy where a party spreads its own supporters evenly across the state, resulting in win margins that cluster in the safe win region.
In this post, we will continue the discussion of statistical tests and introduce some additional measures that can be used to develop tests based on spread. The goal is to create tests based on well known statistics that detect gerrymandering but are simple enough to avoid the need for special software. This is easier said than done, since there is no one-size-fits-all strategy to gerrymander a state, and a test that is simple may not cover all cases. In addition, with computerized redistricting, once a statistical test is set as the standard for a legally actionable threshold for striking down a district map, parties can attempt to develop strategies that increase their advantage while flying under the radar of the test. We will attempt to address this by identifying gerrymandered voting patterns that would pass the statistical tests under consideration.
This post is probably more technical than what we generally aim for, so if you want to jump to the punch line, here it is: in our opinion, the statistical tests under consideration can put the brakes on partisan gerrymandering, but probably cannot eliminate it entirely. Redistricting is inherently a partisan process, and single winner district voting systems are extremely vulnerable to manipulation. At best we can limit the extent to which a party can draw maps to its own benefit. This is why we have advocated the use of independent redistricting commissions, and you can find a form letter to send to your state representatives asking them to turn the redistricting process over to such a commission here. Beyond that, moving to a multiple winner proportional voting method for electing congressional delegations and state legislatures would lead to a significant improvement in election fairness, since such systems are difficult to manipulate, promote electoral competition, and result in proportional representation. On the other hand, if you want to dive into the numbers, we'll start by discussing a few measures that can be used to detect whether election results are unusually clustered.
Measures of Spread
The mean-median difference proposed by Dr. Wang as a measure of skew to detect gerrymandering involves simple and well known statistics, is directly related to a gerrymandering strategy known as packing, and is effective for states that are close to 50-50 in partisan support. On the other hand, it may not work for more partisan states, since the gerrymandering party does not necessarily need to aggressively pack its opponents into a few throwaway districts. Instead, the redistricting party may simply spread its own supporters evenly across the state to ensure that it wins as many safe districts as possible. In this case, we are looking for unusually clustered results, as determined by some measure of spread. Dr. Wang compared the standard deviation of the redistricting party's wins in a state to nationwide characteristics for this purpose, and in the previous post we used the ratio of the interquartile range, or the range covered by the middle half of the voting results, to the standard deviation.
Before discussing alternate tests of spread, we should mention that there are established tests to determine whether data is likely to differ from a normal distribution due to skew and/or a shape measure called kurtosis. These are the Jarque-Bera test and the D’Agostino-Pearson test. However, these tests are not very useful for gerrymandering because the measure of skew they employ, the moment coefficient of skewness, has very high variability for small sample sizes (aside from being a more complex expression than the mean-median difference). This makes it more difficult for us to say with statistical certainty whether or not the observed skew in an election is the result of deliberate gerrymandering. The same argument applies to the kurtosis, and furthermore the kurtosis is not necessarily what we are looking for in the first place when trying to uncover gerrymandering. Kurtosis is a measure of the weight of the tails of a distribution, but what we are really interested in is a strong peak indicating clustered results rather than a heavy tail. In other words, we want a statistic that is strongly influenced by the data points that are close to the middle, while the kurtosis is most strongly influenced by data points that are far from the middle.
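To make the two measures concrete, here is a minimal sketch using only Python's standard library. The vote shares are hypothetical numbers chosen for illustration, not real election results, and the quartile convention (`method="inclusive"`) is our assumption; other conventions give slightly different values for small samples.

```python
import statistics

def mean_median_difference(shares):
    """Skew measure: mean minus median of the district vote shares."""
    return statistics.fmean(shares) - statistics.median(shares)

def iqr_to_sd_ratio(shares):
    """Spread measure: interquartile range over the sample standard deviation."""
    q1, _, q3 = statistics.quantiles(shares, n=4, method="inclusive")
    return (q3 - q1) / statistics.stdev(shares)

# Hypothetical vote shares (percent) for the redistricting party in 13 districts:
# three packed opposition districts and ten clustered safe wins.
shares = [22, 25, 30, 55, 56, 56, 57, 57, 58, 58, 59, 60, 62]

skew = mean_median_difference(shares)
ratio = iqr_to_sd_ratio(shares)

# For a normal distribution the IQR/SD ratio is about 1.349,
# so a much smaller value points to tightly clustered results.
print(f"mean - median = {skew:.2f} points")
print(f"IQR / SD      = {ratio:.3f}")
```

With the redistricting party's own shares as input, a mean below the median reflects a few lopsided losses, which is the signature of packing; the sign flips if the opposing party's shares are used instead.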
The ratio of interquartile range (IQR) to standard deviation (SD) gives us an idea of the strength of a peak, since a small IQR relative to the SD tells us that there is a subgroup of data points that are tightly clustered relative to the overall spread of the data. A test based on this statistic can be developed using the relationship between the ratio of IQR and SD for a sample and the normal distribution derived here. This is not the only option, however, and in fact the median absolute deviation from the median (MAD) is possibly a better alternative to the IQR. The MAD is computed by calculating the median, then finding the absolute value of the difference between every data point and the median, and then taking the median of those differences. This tells us the median distance of the data points from the median of the data, and is therefore a measure of spread. For a symmetric distribution, the MAD converges to half of the IQR as the number of data points in a sample increases (see here), but the MAD has some advantages over the IQR (link is a pdf) in terms of statistical power for small samples.
The use of the ratio of MAD to SD for detecting gerrymandering follows along the same lines as the IQR/SD ratio: gerrymandering is suspected if the ratio indicates tight clustering of district vote percentages. A third measure of spread worth considering is the average absolute deviation from the median (MAAD), which has been used to detect if data is different from a normal distribution due to a heavy tail. This statistic is thus related to the kurtosis, but it is more powerful than the traditional kurtosis statistic for small sample sizes. The details of statistical tests based on the ratio of SD to MAAD and SD to MAD can be found here. The skew test could potentially be combined with one of the spread tests into an omnibus test similar to the Jarque-Bera or the D'Agostino-Pearson tests, but for now we will just treat the skew and spread tests as distinct. In the next section, we'll see how easy or difficult it is to game these tests by finding potential election outcomes that would result from gerrymandering but would pass the proposed tests.
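The MAD recipe described above translates almost word for word into code. The sketch below uses only the standard library; the five vote shares and the simulated normal sample are illustrative assumptions, included only to check the claim that the MAD approaches half of the IQR for a large symmetric sample.

```python
import random
import statistics

def median_abs_deviation(values):
    """Median of the absolute differences between each point and the median."""
    med = statistics.median(values)
    return statistics.median([abs(v - med) for v in values])

# Small hypothetical example: the median is 55, the absolute deviations
# are 5, 3, 0, 2 and 15, and the median of those deviations is 3.
small = [50, 52, 55, 57, 70]
mad_small = median_abs_deviation(small)

# Large simulated symmetric sample to compare MAD with half of the IQR.
rng = random.Random(0)
sample = [rng.gauss(0.0, 1.0) for _ in range(20_000)]
q1, _, q3 = statistics.quantiles(sample, n=4)
mad_large = median_abs_deviation(sample)
half_iqr = (q3 - q1) / 2

print(mad_small)            # 3
print(mad_large, half_iqr)  # the two values should be close to each other
```

Note how the outlier at 70 barely affects the MAD in the small example, which is the robustness that makes it attractive for small numbers of districts.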
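As a rough illustration of how these ratios behave, the sketch below computes MAD/SD and MAAD/SD and compares them with their large-sample values under a normal distribution (about 0.6745 and sqrt(2/pi), roughly 0.7979). The function names and the vote shares are our own hypothetical choices, and this only computes the statistics; it does not reproduce the significance tests referenced above.

```python
import math
import random
import statistics

NORMAL_MAD_OVER_SD = 0.6745                     # large-sample value for normal data
NORMAL_MAAD_OVER_SD = math.sqrt(2.0 / math.pi)  # about 0.7979

def mad(values):
    """Median absolute deviation from the median."""
    med = statistics.median(values)
    return statistics.median([abs(v - med) for v in values])

def maad(values):
    """Average (mean) absolute deviation from the median."""
    med = statistics.median(values)
    return statistics.fmean([abs(v - med) for v in values])

def spread_ratios(values):
    """Return (MAD/SD, MAAD/SD) for a sample."""
    sd = statistics.stdev(values)
    return mad(values) / sd, maad(values) / sd

# Simulated normal sample: both ratios should sit near the reference values.
rng = random.Random(1)
normal_sample = [rng.gauss(50.0, 10.0) for _ in range(20_000)]
normal_ratios = spread_ratios(normal_sample)

# Hypothetical clustered results (percent): ten safe wins, three packed losses.
clustered = [22, 25, 30, 55, 56, 56, 57, 57, 58, 58, 59, 60, 62]
clustered_ratios = spread_ratios(clustered)

print("normal   :", normal_ratios)
print("clustered:", clustered_ratios)
```

In the clustered example the MAD/SD ratio falls far below 0.6745, because the MAD only sees the tight group around the median while the SD is inflated by the three lopsided districts.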
Gaming the Metrics
As long as redistricting is controlled by partisan entities, they will try to use the process to give themselves an advantage. Any statistical test or constraint that is put in place to attempt to limit how much partisan gerrymandering is acceptable could potentially be skirted using sophisticated computerized redistricting algorithms. Here we will explore this by trying to find election outcomes that could be the product of gerrymandering but would not be flagged by any of the tests we have considered. For simplicity, we will assume that there are no geographical constraints on redistricting and that the redistricting party has perfect foreknowledge of voting outcomes. These are extremely lax and even unrealistic assumptions, but this lets us see what sort of strategies a party might employ to circumvent a given standard.
Suppose the redistricting party targets a given number of safe wins and wants to determine if it has enough statewide support to get those safe wins without getting caught by one of the statistical tests. We can explore this using multi-objective optimization, where we want to find voting results that guarantee a given number of safe wins, minimize the redistricting party's statewide support (as measured by its average district voting percentage), and maximize the p value for a given statistical test. There is an inherent tradeoff between minimizing the statewide support and maximizing a test's p value, because as a party's statewide support decreases, it gets harder and harder for it to guarantee itself the targeted number of safe wins without running afoul of at least one of the statistical tests. The multi-objective optimization algorithm, which in this case is a genetic algorithm implemented in the Python package Platypus, will give us the so-called Pareto front. This is the collection of solutions where a decrease in the average party support can only come at the expense of a decrease in the p value of a given test.
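To illustrate what the Pareto front means here, the following sketch filters a list of candidate outcomes down to the non-dominated ones. This is not the genetic algorithm itself, only the dominance rule that defines the front, and the candidate values are hypothetical numbers rather than results from our analysis.

```python
def pareto_front(candidates):
    """Return the non-dominated candidates, sorted by statewide support.

    Each candidate is a (mean_support, p_value) pair. Lower support and a
    higher p value are both better from the redistricting party's view.
    """
    front = []
    for i, (s_i, p_i) in enumerate(candidates):
        dominated = any(
            s_j <= s_i and p_j >= p_i and (s_j < s_i or p_j > p_i)
            for j, (s_j, p_j) in enumerate(candidates)
            if j != i
        )
        if not dominated:
            front.append((s_i, p_i))
    return sorted(front)

# Hypothetical candidate maps: (average district support, test p value).
candidates = [
    (0.52, 0.01),
    (0.54, 0.08),
    (0.55, 0.05),  # dominated by (0.54, 0.08)
    (0.57, 0.20),
    (0.60, 0.20),  # dominated by (0.57, 0.20)
    (0.60, 0.35),
]

front = pareto_front(candidates)
print(front)
```

Along the resulting front, every step toward lower statewide support costs the party some p value, which is exactly the tradeoff the optimization traces out.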
As a reference point, we will also compute the likelihood of a given outcome using inference from a numerical simulation. We draw voting percentages at random from a uniform distribution and then compute the proportion of outcomes that result in smaller levels of statewide support than the targeted level of support among all the outcomes that resulted in at least as many safe wins as the targeted number. Dr. Wang has proposed a somewhat similar inference technique based on much more realistic assumptions accounting for national trends in political geography, but the inference technique used here is more consistent with the assumptions used for the optimization analysis. Again, the goal here is to try to uncover weaknesses in the statistical tests under the most favorable possible conditions for the redistricting party. Geographical constraints make gaming a particular test more difficult, so we are essentially finding the worst case scenario in this analysis.
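The simulation described above can be sketched in a few lines. This is a simplified version under stated assumptions: vote shares are drawn uniformly between 0 and 1 (the range is our assumption), a safe win is a share of at least 0.55, and the function name, number of draws, and seed are all hypothetical choices.

```python
import random

def inference_p_value(n_districts, n_safe, target_mean,
                      safe_threshold=0.55, n_draws=100_000, seed=0):
    """Estimate P(mean support < target_mean | at least n_safe safe wins)
    when district vote shares are independent uniform draws on [0, 1]."""
    rng = random.Random(seed)
    qualifying = 0      # draws with at least n_safe safe wins
    lower_support = 0   # qualifying draws with mean support below the target
    for _ in range(n_draws):
        shares = [rng.random() for _ in range(n_districts)]
        safe_wins = sum(s >= safe_threshold for s in shares)
        if safe_wins >= n_safe:
            qualifying += 1
            if sum(shares) / n_districts < target_mean:
                lower_support += 1
    if qualifying == 0:
        return float("nan")  # no draw reached the targeted number of safe wins
    return lower_support / qualifying

# Example: 10 safe wins out of 13 districts with about 50% statewide support.
p = inference_p_value(n_districts=13, n_safe=10, target_mean=0.50)
print(f"p inference = {p:.3f}")
```

A small value says that random maps which happen to produce that many safe wins almost always require more statewide support than the party actually has.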
We'll compare the skew test based on the mean-median difference, as well as the spread tests based on the IQR, MAD, and MAAD. All four of these tests are meant to detect a particular strategy indicative of gerrymandering, and while they are not tests of normality per se, they all use a normal distribution as a model for determining what outcomes should be regarded as unlikely. So for good measure, we'll throw in a generic test for normality known as the Anderson-Darling (AD) test. It may not be possible to argue that failure of this test indicates gerrymandering, but it is useful to have it as a reference, since all the other tests detect specific types of departure from normality while the AD test is generic.
Here is the result of this analysis for a state targeting 10 safe wins out of 13 districts. The solutions constructed by the optimization algorithm guarantee that 10 of the 13 districts have at least 55% support for the redistricting party. This is the current scenario in North Carolina, where Republicans have 10 of 13 congressional seats in a state with roughly evenly divided partisan support.
The optimization algorithm is able to find solutions that are highly unlikely to occur by chance according to the inference analysis, and that guarantee 10 of 13 safe wins with little more than 50% statewide support while fooling any individual test. Again, though, we assumed that the redistricting party has complete freedom to create districts with exactly the voting percentages it wants, without any geographic constraints or uncertainty. The usefulness of this analysis is not in the results themselves but rather in uncovering how the tests can be gamed. The skew test can be gamed by moving the mean closer to the median: the redistricting party balances out the packing of its opponents' districts by increasing its win margin in a few of its own districts. The median is harder to move around, but if the median is still larger than the mean, resulting in skew, the party could move most of its win percentages closer to the safe win threshold of 55% while keeping one or two of its districts with large win margins. This is easier to do if the redistricting party is already dominant in its state.
However, these strategies will tend to decrease the MAD or IQR while increasing the SD, which puts them at risk of running afoul of the spread test. The skew test and spread test therefore complement each other, in the sense that trying to beat one of them puts the redistricting party at risk of failing the other. Beating both tests seems to result in outcomes with high kurtosis, in which case the spread test based on the MAAD would be violated. The Anderson-Darling test captures all of these tests to some extent, but since it is generic and not necessarily related to a strategy for gerrymandering, it might not be accepted as a standard.
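A small constructed example shows this tradeoff. The vote shares below are hypothetical and were picked by hand (they are not output from the optimization) so that the mean equals the median, which hides the packing from the skew measure, while the MAD collapses relative to the SD.

```python
import statistics

def mad(values):
    """Median absolute deviation from the median."""
    med = statistics.median(values)
    return statistics.median([abs(v - med) for v in values])

# Hypothetical map (percent): three packed losses, eight narrow safe wins,
# and two blowout wins added to pull the mean back up to the median.
gamed = [34, 35, 36, 55, 55, 55, 56, 56, 56, 57, 57, 88, 88]

safe_wins = sum(s >= 55 for s in gamed)
skew = statistics.fmean(gamed) - statistics.median(gamed)
mad_over_sd = mad(gamed) / statistics.stdev(gamed)

print(f"safe wins     = {safe_wins} of {len(gamed)}")
print(f"mean - median = {skew:.2f}")
print(f"MAD / SD      = {mad_over_sd:.3f}  (about 0.6745 for normal data)")
```

The two blowout districts cancel the skew, but they also inflate the SD while the MAD stays pinned to the narrow cluster of safe wins, so the spread statistic moves in exactly the direction a spread test looks for.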
In the future, we will repeat this analysis with realistic considerations of geographical constraints, but for now it seems that the skew test and a spread test based on IQR or MAD would put meaningful restrictions on how much gerrymandering a party can get away with. Some states would actually fail multiple tests. For example, North Carolina miserably fails every single test except the spread test based on MAAD, as the table below shows. Note that the voting results here are not from actual congressional elections, but rather are presidential election results tallied at the congressional district level for 2008, 2012, and 2016. The post-2010 census districts are used; full details can be found from Daily Kos Elections.
Year | Avg. Dist Rep Vote | p Inference | p Skew | p Spread IQR | p Spread MAAD | p Spread MAD | p AD Normality |
2008 | 49.54% | 0.000 | 0.020 | 0.001 | 0.200 | 0.000 | 0.000 |
2012 | 51.31% | 0.000 | 0.014 | 0.001 | 0.225 | 0.000 | 0.000 |
2016 | 50.77% | 0.000 | 0.032 | 0.001 | 0.391 | 0.000 | 0.001 |
Alabama also fails multiple tests, but it manages to pass the skew test since it is more partisan than NC, which means that it can be gerrymandered without the aggressive packing that would lead to significant skew. On the other hand, it fails the spread tests based on IQR and MAD.
Year | Avg. Dist Rep Vote | p Inference | p Skew | p Spread IQR | p Spread MAAD | p Spread MAD | p AD Normality |
2008 | 60.87% | 0.143 | 0.696 | 0.018 | 0.276 | 0.000 | 0.105 |
2012 | 61.01% | 0.151 | 0.681 | 0.015 | 0.254 | 0.000 | 0.091 |
2016 | 62.60% | 0.249 | 0.617 | 0.002 | 0.189 | 0.000 | 0.091 |
The skew test combined with one of the spread tests (the spread test based on the MAD is probably the frontrunner for now) would place a meaningful restriction on how much gerrymandering a party can get away with. It may also be worthwhile to modify the skew test to account for spread based on the MAD. The difference between the mean and median for the skew test is taken relative to the standard deviation, but it seems that the spread of election results is better reflected by the more robust MAD. In this case, states like Alabama, which appear mildly skewed relative to the standard deviation, would be significantly skewed relative to the MAD. Additionally, robust statistics like the median and MAD will lead to standards that are more difficult to manipulate, since manipulating them requires the ability to engineer precise outcomes in many districts in coordination.
Adopting these types of standards will not eliminate gerrymandering, but will just put the brakes on it. As long as partisan entities are in control of the redistricting process, they will try to use it to their own advantage. More sophisticated strategies for gerrymandering include designing districts that reduce the influence of a party's strongest opposition in ways that might not be apparent by looking at election results alone, meaning that these tests would not detect them. For example, with a strategy sometimes called slice and mix (link is a pdf), a party creates highly polarized districts, where it places its strongest opposition in the same district as its strongest supporters such that its supporters are in the majority. Following this strategy might not lead to a party gaining extra seats, but it can give the party a more subtle advantage, since it would not have to worry about moderating its policy positions or compromising with its strongest opposition. This is why, in addition to setting standards, the redistricting process needs to be turned over to an independent and transparent commission. The 2020 census and redistricting will be here in no time, so let your representatives know that this is an important issue to you.
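One possible way to express this modification is sketched below. It scales the mean-median difference by the SD and, alternatively, by the MAD multiplied by 1.4826 (the usual factor that makes the MAD comparable to the SD for normal data). The scaling choice and the function name are our assumptions, and the seven vote shares are hypothetical, not Alabama's actual results.

```python
import statistics

MAD_TO_SD = 1.4826  # makes the MAD comparable to the SD under normality

def scaled_skews(shares):
    """Return the mean-median difference scaled by SD and by scaled MAD."""
    med = statistics.median(shares)
    diff = statistics.fmean(shares) - med
    sd = statistics.stdev(shares)
    mad = statistics.median([abs(s - med) for s in shares])
    return diff / sd, diff / (MAD_TO_SD * mad)

# Hypothetical 7-district state (percent): one packed opposition district
# and six comfortable wins for the redistricting party.
shares = [27, 62, 63, 64, 65, 66, 74]

skew_vs_sd, skew_vs_mad = scaled_skews(shares)
print(f"skew relative to SD  = {skew_vs_sd:.2f}")
print(f"skew relative to MAD = {skew_vs_mad:.2f}")
```

The packed district inflates the SD, which shrinks the skew measured against it, while the MAD ignores that district and so the same mean-median difference looks much larger. A real test would still need a reference distribution for this statistic, which this sketch does not attempt to provide.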
The code for this post will be uploaded to GitHub shortly!