Better Inferences at the Congressional District Level

About a week ago I wrote about how to estimate congressional district-level opinion from the Cooperative Congressional Election Study (CCES), a large national opinion poll. I sent the post to Andrew Gelman, a statistician and political scientist, who gave me some great feedback on the methodology I used. In this post, I've revised my opinion model based on that feedback and made a few other enhancements as well. Compared to the previous model, this iteration produces very similar point estimates, but with reduced variance. Here I'll explain the changes and the reasoning behind them.

In the original post, I used a hierarchical Student's t distribution to model district-level regression intercepts, with the standard deviation and degrees of freedom estimated from the data. I did not use a state- or regional-level regression, as is typically done in the literature. There were two main reasons for this. The first is that I tended to get inefficient inferences when I included the state- and regional-level variation, which I blamed on correlations between the district-level predictors and the state/regional regression indicators. However, at Andrew's suggestion I tried using more informative priors for the variance parameters in the model (for example, the standard deviation of the group distribution for the district-level intercepts), and this problem resolved instantly.
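To make the structure concrete, here is a minimal prior simulation of nested region, state, and district intercepts. It is only a sketch under stated assumptions, not the model code from either post: the function names, the half-normal form of the "more informative" priors, the `prior_scale` of 0.5, the fixed `nu` of 4, and the toy index arrays are all hypothetical, and it uses only the Python standard library rather than PyMC3.

```python
import math
import random


def half_normal(rng, scale):
    """Draw a standard deviation from a half-normal prior, |N(0, scale)|."""
    return abs(rng.gauss(0.0, scale))


def student_t(rng, nu, sigma):
    """Draw from a zero-centered Student's t with nu degrees of freedom and scale sigma."""
    z = rng.gauss(0.0, 1.0)
    chi2 = rng.gammavariate(nu / 2.0, 2.0)  # chi-square with nu degrees of freedom
    return sigma * z / math.sqrt(chi2 / nu)


def simulate_intercepts(district_state, state_region, n_regions,
                        prior_scale=0.5, nu=4.0, seed=0):
    """Simulate district intercepts from a nested region -> state -> district prior.

    district_state[d] is the state index of district d;
    state_region[s] is the region index of state s.
    prior_scale and nu are assumed values, not taken from the post.
    """
    rng = random.Random(seed)

    # Informative priors on the group-level standard deviations:
    # a small scale keeps the variance parameters away from extreme values.
    sigma_region = half_normal(rng, prior_scale)
    sigma_state = half_normal(rng, prior_scale)
    sigma_district = half_normal(rng, prior_scale)

    region_effect = [rng.gauss(0.0, sigma_region) for _ in range(n_regions)]
    state_effect = [region_effect[r] + rng.gauss(0.0, sigma_state)
                    for r in state_region]

    # District intercepts vary around their state's effect with heavy t tails.
    return [state_effect[s] + student_t(rng, nu, sigma_district)
            for s in district_state]


# Toy layout: six districts in three states, three states in two regions.
district_state = [0, 0, 1, 2, 2, 2]
state_region = [0, 0, 1]
intercepts = simulate_intercepts(district_state, state_region, n_regions=2)
```

Shrinking `prior_scale` pulls all three standard deviations toward zero, which is one way to see how a tighter prior on the variance parameters constrains the group-level effects.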

Analyzing Opinions in Congressional Districts from Large National Polls

I recently came across an article in Vice by Sean McElwee which made a very compelling case for Democrats moving to the left on issues like immigration and racial justice to mobilize millennial voters, instead of moving to the center on these issues in an attempt to win over white working-class voters. This is a move that aligns with my personal politics, but it seems to run completely contrary to a good deal of conventional punditry on how Democrats should go about winning elections. The author made his case by analyzing several questions from the 2016 Cooperative Congressional Election Study (CCES), a very large national survey that covers a number of topics related to policy and political attitudes.

In this post, I'll attempt a deeper dive into some of the CCES data using Bayesian inference with PyMC3. I'm mainly interested in how much the CCES data can tell us about how opinions vary by congressional district, with a particular focus on college-educated millennials. The results back up the main thesis of the Vice article pretty strongly, so I won't try to rehash much of the commentary. Instead, this post will focus on the computational and modeling issues surrounding this type of analysis.