It's Time to End Gerrymandering II: Safe Wins

Gerrymandering is the term used to describe deliberate manipulation of the electoral process by drawing noncompetitive districts for congressional and state assembly races. In the first post in this series, we discussed the concept of disproportionate outcomes, where the number of districts won by a party is out of proportion with its statewide level of support. Using district-level data from the 2008, 2012, and 2016 presidential elections, we saw that proportional outcomes were very rare. Unfortunately, this is not enough to declare that we have uncovered gerrymandering, because single-winner elections tend not to lead to proportional representation and can even lead to minority rule by chance.

Here we will attempt to separate chance outcomes from deliberate manipulation using statistical tests aimed at uncovering irregularities in the vote percentages of a state's districts. In particular, we are trying to uncover if the architects of the districting plan were trying to carve out safe wins for one party or the other. Safe wins are comfortable enough to limit or remove competition in the district, but gerrymanderers don't want to win by blowout margins either because this is inefficient. Safe wins provide the best tradeoff between eliminating competition and maximizing representation for the gerrymandering party. Note that the notion of safe wins in the context of partisan gerrymandering may be somewhat different than in the context of bipartisan gerrymandering. For bipartisan gerrymandering, two parties collude to protect their own incumbents and maintain the existing balance of power. Therefore both parties are looking to reduce competition as much as possible. For partisan gerrymandering, one party wants to maximize its representation, and therefore must balance between winning by small, efficient margins and winning by safe margins so that they don't run the risk of their gerrymander backfiring. In this post, we will only consider partisan gerrymanders, and will discuss bipartisan gerrymandering in a future post.

The concept that we will use in this post is generally referred to as statistical hypothesis testing, where we compare an observed outcome to outcomes that we deem "usual" under certain assumptions. If the outcomes we observe are unusual in the sense that there is a small likelihood that such outcomes would occur by chance, then we conclude that our initial assumptions were not correct.

This may seem quite vague, so let's consider an analogy. We have a coin, and we want to determine if it is a fair coin, meaning there is an equal chance of getting heads or tails, or if it's loaded, with unequal chances of heads or tails. If we assume that the coin is fair, then we know that it would be unusual if we flipped the coin a given number of times and observed a lot more heads than tails or vice versa. But how exactly do we say what's unusual? Suppose we flipped the coin 10 times and observed 6 heads and 4 tails. How unusual is that? The probability of getting at least 6 heads in 10 flips of a fair coin is about 38%, meaning that if we did many sets of 10 flips, about 38% of those sets would have at least 6 heads. This does not seem that unusual, so we would probably not conclude that the coin is loaded based on this evidence.

On the other hand, we would probably conclude that the coin is loaded if 9 out of the 10 flips came up heads, since the probability of getting at least 9 heads in 10 flips of a fair coin is about 1%. If the coin were fair as we had initially assumed, then there would be only a small chance of observing this outcome, which calls the initial assumption of fairness into question. Our ability to determine what is and is not unusual improves as we observe more information. For example, suppose instead of 10 flips we did 100, and observed 60 heads. This is the same proportion of heads as the earlier case, which we determined was not unusual. This time, however, there is only about a 3% chance of getting at least 60 heads in 100 flips of a fair coin, so this is a much more unusual result. This presents a practical challenge for applying similar tests to detect gerrymandering because some states have a small number of districts, making it more difficult for us to determine if an outcome is or is not unusual.
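The tail probabilities in the coin example can be checked directly from the binomial distribution. The short sketch below does this; the function name `prob_at_least` is ours, chosen for illustration.

```python
from math import comb

def prob_at_least(k, n, p=0.5):
    """Probability of seeing at least k heads in n flips when P(heads) = p."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

# The three cases discussed in the coin analogy
print(f"at least 6 heads in 10 flips:    {prob_at_least(6, 10):.3f}")
print(f"at least 9 heads in 10 flips:    {prob_at_least(9, 10):.3f}")
print(f"at least 60 heads in 100 flips:  {prob_at_least(60, 100):.3f}")
```

The same 60% share of heads becomes far less likely as the number of flips grows, which is why states with only a handful of districts are hard to test.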

Dr. Sam Wang, a neuroscience professor at Princeton, has proposed a statistical test for detecting safe wins. His blog is a fantastic resource for quantitative analysis of gerrymandering, which we highly recommend. His test for unusual safe wins comes in two parts. The first part is based on a concept called skew, and can detect gerrymandering in swing states (here we mean swing state as a state with roughly equal partisan support for each of the major parties). The second part is based on a concept called spread, and can detect gerrymandering in states where the party in control of redistricting is dominant. Below, we'll define what we mean exactly by skew and spread, and how we can quantify these things in election data. We will use Dr. Wang's test for detecting skew, but we will introduce a different test than his for detecting spread. Finally, we will combine these two concepts into a single test based on skew and spread. The more technical details of these tests can be found in the appendix to this post, and in a future post we will also introduce an additional test for gerrymandering (including bipartisan gerrymandering) based on a Gaussian mixture model. If you are knowledgeable in statistics, then please send your comments and criticisms of these tests to us. As of yet, there is no clearly defined and widely accepted criterion for determining a level of gerrymandering that meets a legally actionable threshold. If you are able to help contribute to our efforts to develop such a criterion, then please do so!

If you have decided that you don't really care about the numbers and you just want to see gerrymandering ended for good, no worries. Here is a form letter which you can send to your representatives demanding that the redistricting process be turned over to an independent commission. If you are ready to dig into the numbers, we'll add some qualifiers first. Below we apply three statistical tests to presidential election data reported at the district level in an attempt to uncover irregularities in voting results that may indicate partisan manipulation. This is a bit like throwing a plate of spaghetti at the wall and seeing what sticks. In order to definitively prove or rule out gerrymandering in a particular state, we have to look at all of the available evidence in much more detail than what we've done here. This post is only intended to give readers a bird's-eye view of a particular methodology for uncovering gerrymandering. Future posts will examine individual states in more detail.


In a set of data, which in our case are vote percentages for a party in each district, skew can be defined as asymmetry about the middle of the data. The 'middle' of a set of data could be defined in a number of different ways, and in this case we will compare two different definitions of the middle: the mean and median. The mean is the sum of the data divided by the number of data points, while the median is defined so that half of the data lies above it, and half below. If the data is not skewed, then the mean and the median should be roughly the same. We can therefore state Dr. Wang's test for skew as follows: if we assume that the districts in a swing state are drawn without partisan manipulation, then it would be unusual to observe skewed election results. That means that if we observe skewed election results, as measured by the difference between the mean and median district vote share for a party, we should reject our initial assumption that the districts were drawn without partisan manipulation.
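As a minimal sketch of the quantity behind the skew test, the snippet below computes the mean-median difference for two sets of district vote shares. The numbers are hypothetical and chosen only to illustrate the idea; they are not real election results, and the function name is ours.

```python
from statistics import mean, median

def mean_median_difference(shares):
    """Mean minus median of a party's district vote shares (fractions of 1)."""
    return mean(shares) - median(shares)

# Hypothetical Democratic vote shares in a five-district state.
symmetric = [0.40, 0.45, 0.50, 0.55, 0.60]   # evenly spread around 50%
packed = [0.42, 0.43, 0.44, 0.45, 0.76]      # one heavily packed district

print(f"symmetric: {mean_median_difference(symmetric):+.3f}")
print(f"packed:    {mean_median_difference(packed):+.3f}")
```

Both hypothetical states have the same mean vote share, but in the packed one the median district sits well below the mean, which is the asymmetry the skew test looks for.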

Let's look at a concrete example, using North Carolina. North Carolina is roughly evenly split along party lines, but Republicans have 10 of the 13 congressional seats there. Below, each dot represents the Democratic vote share in each of North Carolina's districts in the 2008, 2012, and 2016 presidential elections. The red tick marks are the mean vote share in each election, while the green tick marks are the median.

These results are massively skewed in favor of Republicans. Democratic voters are packed in three districts, which they win by large margins. The Republican seats are also not very competitive, but North Carolina Republicans are efficiently spread throughout the state's districts. This gives Republicans a lot of safe wins, maximizing the number of Republican seats given their level of statewide support. Just like in the example of the coin flips, we can establish a probability that an election result would be at least as skewed as this in a fair election. The details of how this is done may be found in the appendix, but for North Carolina, these are 1.6%, 1.0%, and 2.3% for 2008, 2012, and 2016 respectively. Especially given that the skew persisted over three elections, we can say with very high confidence that North Carolina is gerrymandered to favor Republicans. The reason that skew can detect gerrymandering is that when the North Carolina Democrats are packed into a few districts, the mean moves to the right but the median doesn't change very much.
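To give a feel for how such a probability can be estimated, here is a rough Monte Carlo sketch. It is not the procedure from the appendix. It assumes, purely for illustration, that fair district vote shares behave like independent draws from a normal distribution with the observed mean and standard deviation, and the 13 vote shares are hypothetical rather than North Carolina's actual results.

```python
import random
from statistics import mean, median, stdev

def skew_p_value(shares, trials=10000, seed=0):
    """One-sided Monte Carlo estimate of P(mean - median >= observed).

    Assumption (ours): without manipulation, shares look like independent
    normal draws with the same mean and standard deviation as the data.
    """
    rng = random.Random(seed)
    n, mu, sigma = len(shares), mean(shares), stdev(shares)
    observed = mu - median(shares)
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        if sum(sample) / n - median(sample) >= observed:
            hits += 1
    return hits / trials

# Hypothetical 13-district state: three packed districts, ten safe losses.
packed_state = [0.78, 0.74, 0.72, 0.44, 0.43, 0.43, 0.42,
                0.42, 0.41, 0.41, 0.40, 0.40, 0.39]
print(f"estimated probability: {skew_p_value(packed_state):.3f}")
```

A small estimated probability means the observed skew would rarely arise under the symmetric null model assumed here.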

We'll apply the skew test to all the available election results, but we have to add a few disclaimers. Skew may be the result of a phenomenon called urban packing, or the tendency of Democratic voters to cluster in urban regions. This can put Democratic voters at a disadvantage, but it is not necessarily the result of partisan manipulation. Therefore, particularly in states dominated by Democrats, we may see skewed results that favor Republicans that are due to urban packing rather than intentional gerrymandering. This is why a test based on spread is needed, which we will discuss in the next section. To reiterate, if a Democrat-dominated state shows skew favoring Republicans, it may be the result of urban packing, and further analysis is needed. Blue states showing such skew may still be gerrymandered by Democrats or even both parties in collusion with one another. The skew test is, however, very reliable for detecting gerrymandering in swing states. The map below shows states which had at least one election year where the skew had less than a 1 in 20 chance of occurring randomly. The colors on the map represent the difference between the Democratic mean and median vote share, where positive numbers show skew favoring Republicans and negative numbers show skew favoring Democrats. If a state did not meet the 1 in 20 chance threshold for irregular skew, then it is filled in white. Each instance of a result exhibiting irregular skew is shown in the table below the map.


State Elections Showing Skewed Win Margins (Difference Between Democratic Mean and Median Vote Share)
Georgia 2008 (0.39%), 2012 (0.40%)
Wisconsin 2012 (0.46%), 2016 (0.51%)
Pennsylvania 2012 (0.30%)
Michigan 2008 (0.52%), 2012 (0.50%)
Florida 2012 (0.24%)
Virginia 2008 (0.38%)
North Carolina 2008 (0.45%), 2012 (0.49%), 2016 (0.42%)
New York 2008 (0.44%), 2012 (0.41%), 2016 (0.32%)
California 2016 (-0.25%)
Massachusetts 2012 (0.44%)
Missouri 2012 (0.44%), 2016 (0.46%)
Indiana 2008 (0.44%)
Ohio 2008 (0.40%), 2012 (0.41%)
Texas 2008 (0.24%), 2012 (0.33%)

There was only one instance where skew favoring Democrats met the 1 in 20 chance threshold, which was California in 2016. New Mexico came close, but since it only has 3 districts, our ability to determine a level of skew that comes from gerrymandering rather than chance is not very good. We can also see that New York and Massachusetts showed skew favoring Republicans, even though these are Democrat-dominated states which had disproportionate wins favoring Democrats. In these cases, the skew test is likely just detecting urban packing in cities like Boston and New York. Wisconsin, Michigan, North Carolina, and Ohio are swing states that failed the skew test more than once, indicating that these states are likely to be gerrymandered.

Skew is not the only thing that we can examine to attempt to detect gerrymandering. Next we will look at how the concept of spread can help us find gerrymandered states.


Dr. Wang's safe wins test for states that are dominated by the party in charge of redistricting relies on the concept of spread. If the party in charge is carving districts to give themselves safe wins, then we would see unusual clustering of wins in the safe win region. Unusual clustering can be quantified by measures of spread, and Dr. Wang has proposed comparing the standard deviation of the results in a state to nationwide results to flag states with unusually narrowly spread vote shares. Here we will propose a different test that does not require us to compare a state's results with nationwide results. This test is similar in spirit to the skew test, which looks for discrepancies in two different measures of the 'middle' of the election data. Here we will compare two different measures of spread: the standard deviation (s) and the interquartile range (IQR), which is the range that the middle half of the data falls in. The first quartile is defined in a similar way to the median, but with 25% of the data falling below it and 75% above. Similarly, the third quartile is defined so that 75% of the data is below it and 25% is above it. The IQR is then the difference between the third and first quartiles. Instead of the difference between these two values, we are looking at the ratio. If the IQR is a lot smaller than the standard deviation, then the results are unusually clustered, which indicates that safe wins are being engineered by the redistricting party. For bell-shaped (normally distributed) data, the IQR is typically about 1.35 times the standard deviation, but we can account for that, and we are looking for an unusually small ratio. The details of the statistical test can be found in the appendix. We can state the test for spread as follows: if we assume that the districts in a state dominated by one party are drawn without partisan manipulation, then it would be unusual to observe clustered election results.
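The raw ingredient of the spread test, the ratio of the IQR to the standard deviation, can be sketched as follows. This is not the spread statistic defined in the appendix, only the underlying ratio. The vote shares are hypothetical, the function name is ours, and the "inclusive" quartile convention is one of several reasonable choices.

```python
from statistics import quantiles, stdev

def iqr_to_sd_ratio(shares):
    """Interquartile range divided by the sample standard deviation."""
    q1, _, q3 = quantiles(shares, n=4, method="inclusive")
    return (q3 - q1) / stdev(shares)

# Hypothetical Democratic vote shares in two seven-district states.
clustered = [0.25, 0.26, 0.36, 0.37, 0.37, 0.38, 0.72]  # four tightly grouped
even = [0.30, 0.35, 0.40, 0.45, 0.50, 0.55, 0.60]       # evenly spaced

print(f"clustered: {iqr_to_sd_ratio(clustered):.2f}")
print(f"even:      {iqr_to_sd_ratio(even):.2f}")
```

In the clustered example, a few extreme districts inflate the standard deviation while the middle half of the districts stays tightly grouped, so the ratio comes out much smaller than in the evenly spaced example.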

We'll use Alabama as an example, which is a Republican-dominated state that was not flagged by the skew test. Alabama has one deep blue district, two deep red districts, and four districts which reliably fall within the safe win region for Republicans. Since there are four districts that are clustered tightly in the safe win region, the IQR (the distance between the magenta ticks) is small compared to the standard deviation. This suggests that Alabama's congressional districts are drawn to maximize the Republican share of seats in Congress.

Urban packing still presents a problem for using the spread test for Democrat-dominated states. Let's take Massachusetts as an example. It was flagged by the skew test as favorable to Republicans, but that result is likely just due to urban packing. Democrats won all the districts in Massachusetts in all three presidential elections, and had a large number of safe wins in the 55%-65% region. Because so many Democratic wins were tightly clustered in this way, the IQR (the distance between the magenta ticks) is small compared to the standard deviation. However, when Democrats are packed in urban districts, the standard deviation of the Democrats' district vote shares increases. This makes it more likely that the spread test will detect unusual clustering. Intuitively it seems that even despite urban packing, Massachusetts may be gerrymandered to give safe wins to Democratic candidates, giving ill-gotten disproportionate power to Democrats. A similar situation exists in Maryland, and it is probably necessary to combine these statistical tests with additional tests or with map-based methods to definitively prove that gerrymandering has taken place.

We can again apply this test to every state for all three elections. The map below shows states which had at least one instance of irregular spread with less than a 1 in 20 chance of occurring randomly, as well as the spread statistic, which is defined in the appendix. If you don't want to read the appendix, a negative value of spread indicates that there is unusual clustering, which favors whichever party is in charge of redistricting. A positive value of the spread indicates that the results are not clustered but rather spread out in an unusual way. This could indicate that the state is highly polarized politically based on geography. New York, for example, has several urban districts which are deep blue, but also has several rural districts with strong Republican support. Since gerrymandering always exploits and amplifies existing political geography, unusually large spread does not rule out gerrymandering. We will need to look at the cases of unusually large spread more carefully, and we will do so in future posts. For now, states with unusual spread are shown on the map, and each instance of unusual spread is in the table below.

State Elections Showing Irregular Spread (Spread Statistic)
Pennsylvania 2008 (-0.54), 2012 (-0.52), 2016 (-0.60)
Oklahoma 2012 (-0.92)
North Carolina 2008 (-1.01), 2012 (-0.99), 2016 (-1.00)
Tennessee 2008 (-1.16), 2012 (-1.09), 2016 (-1.16)
New York 2016 (0.62)
Massachusetts 2016 (-0.89)
Ohio 2008 (-0.66), 2012 (-0.70)
South Carolina 2008 (-0.79), 2012 (-0.81), 2016 (-1.06)
Indiana 2008 (-0.73), 2012 (-0.86)
Alabama 2008 (-0.84), 2012 (-0.86), 2016 (-1.03)
Louisiana 2008 (-0.87), 2012 (-0.91), 2016 (-0.96)

There are a few newcomers such as Tennessee, as well as repeat offenders like North Carolina. Massachusetts is the only true blue state that we've caught, which failed the spread test in 2016. It came fairly close in 2008 and 2012 as well, where the spread in those years had a 1 in 10 probability of occurring by chance. We set the threshold for failing the spread test at 1 in 20, but this threshold is arbitrary. The fact that Massachusetts shows consistent irregular clustering suggests that at the very least, further investigation is warranted.

We can now look at the results of both tests. In the scatter plot below, each dot represents an election result in a state, plotting the skew vs the spread. Skew to the right of zero indicates a skew favoring Republicans, and skew to the left of zero favors Democrats. Negative spread indicates clustering favoring whichever party is in charge of redistricting. Black dots passed all tests, and results which failed one or both tests are colored accordingly. Note that failure is defined as having a smaller than 1 in 20 chance of occurring randomly. That means that there might be skew, for example, that is more extreme in a small state that passes than in a large state that fails. New Mexico is an example of this. It shows large skew favoring Democrats, but since it only has three districts, the test used here could not determine if this is a chance result or the product of gerrymandering. In general, if a state passes these tests, that does not rule out gerrymandering; it just means we might need to look further.

The lower left quadrant of the graph is where we are looking for Democratic gerrymanders, and the lower right quadrant is where we are looking for Republican gerrymanders as well as Democratic gerrymanders in states with urban packing. Republicans seem to be winning the gerrymandering game so far, but they also have more opportunity to gerrymander since they are in charge of more state legislatures. In the next section, we'll combine the skew test and the spread test into a single test for detecting irregular election results.

The Skew and Spread Test (SAST)

We can actually combine the skew and spread tests into a single test. This single test is always at least as good as either test individually, and it will help flag instances where the combination of skew and spread is unusual, but not very unusual when each statistic is viewed individually. Again, the details may be found in the appendix. Here are the states where at least one election was flagged by the SAST, meaning the observed degree of skew and spread had less than a 1 in 20 chance of occurring randomly.

The same information can be visualized on the scatter plot from the previous section. The SAST flags a lot more elections than either test on its own.
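To illustrate how two results that each pass on their own can be flagged when taken together, here is a sketch using Fisher's method, a standard way to combine independent p-values. This is our stand-in for illustration only. The SAST itself is defined in the appendix and may work differently, and Fisher's method assumes the two tests are independent, which skew and spread computed from the same districts may not be. The input values are hypothetical.

```python
from math import log

def fisher_combined_p(p_skew, p_spread):
    """Fisher's method for two independent p-values.

    The statistic -2 * ln(p1 * p2) follows a chi-square distribution with
    4 degrees of freedom, whose upper tail has the closed form used below.
    """
    product = p_skew * p_spread
    if product == 0.0:
        return 0.0
    return product * (1.0 - log(product))

ALPHA = 0.05                      # the 1 in 20 threshold used in this post
p_skew, p_spread = 0.08, 0.08     # hypothetical: each passes individually
combined = fisher_combined_p(p_skew, p_spread)
print(f"combined p-value: {combined:.4f}, flagged: {combined < ALPHA}")
```

Neither hypothetical result alone falls below the 1 in 20 threshold, but seeing both at once is less likely than seeing either one, so the combined value does.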

From this result, it appears that gerrymandering is truly widespread. However, it is possible that the SAST has flagged a few false positives where irregular skew and spread are the result of political geography alone. False negatives are also a possibility, since the SAST is likely to let states with subtle gerrymanders involving only a few districts slip through the cracks. Developing a generic statistical test for gerrymandering that improves upon the SAST will probably require us to directly account for urban packing and political geography in general. We will take a closer look at these things in future posts, but for now the SAST looks like a promising tool. If you are an expert in statistics, we would love to get your feedback on this post and the appendix. Any comments or criticism (constructive or otherwise) are welcome.

If your head is now spinning from all the numbers, not to worry. Tests like these are required to define clear criteria for fighting gerrymandering in court, but that is not the only way to fight back. Petition your state representatives to turn over the redistricting process to an independent commission. Here is a letter you can use, share, or modify as you see fit. Keep in mind that it is unlikely that your state representatives will be willing to simply give up their power to shield themselves from accountability or maximize their party's influence through gerrymandering. Eliminating gerrymandering therefore means that we will all have to demand the same thing, and it needs to be clear to politicians that even a noncommittal position on gerrymandering is unacceptable. While the country is deeply divided on many issues, we should not be divided on this one. Think of it like the golden rule: if you want to protect yourself from being disenfranchised by gerrymandering, the best way to do that is to fight back whenever and wherever gerrymandering occurs, even if the people disenfranchised by gerrymandering in your state vote for the party across the aisle.