Scale and the Analysis of Large Text Databases

I gave a talk Friday as part of IIT’s Social Networks and Innovation conference, and here are the slides:

Basically, my talk was an overview of three projects that analyze subsets of my big Twitter data. You can read Framing in Social Media now, and papers from the others are waiting for publishers’ OK to share. In each project, I analyzed tens or hundreds of thousands of tweets.

In the mentioning networks paper, I examined the social network that results from members of Congress mentioning one another on Twitter to see if that network is structurally different from other Congressional networks, such as those built from roll call votes or shared press releases. The structure of the network would determine who has influence, and I determined that influence offline is the best predictor of influence online. Sadly, social media is not a new route to influence.
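To make the idea of a mentioning network concrete, here is a minimal sketch of how one could be assembled from raw tweets. It is not the code behind the paper: the handles, the tweets, the function names, and the choice of mentions received as a rough stand-in for online influence are all assumptions made for illustration.

```python
import re
from collections import Counter

# Twitter handles are letters, digits, and underscores after an "@".
MENTION_RE = re.compile(r"@(\w+)")


def build_mention_edges(tweets, members):
    """Count directed (author -> mentioned) edges among a known set of accounts.

    tweets  : iterable of (author_handle, tweet_text) pairs
    members : set of handles that belong in the network
    """
    members_lower = {m.lower() for m in members}
    edges = Counter()
    for author, text in tweets:
        source = author.lower()
        if source not in members_lower:
            continue
        for handle in MENTION_RE.findall(text):
            target = handle.lower()
            # Keep only mentions of other members; ignore self-mentions.
            if target in members_lower and target != source:
                edges[(source, target)] += 1
    return edges


def in_degree(edges):
    """Total mentions received per account (an assumed, crude influence proxy)."""
    received = Counter()
    for (_, target), count in edges.items():
        received[target] += count
    return received


# Hypothetical handles and tweets, invented for this example.
members = {"RepAlpha", "SenBeta", "RepGamma"}
tweets = [
    ("RepAlpha", "Proud to work with @SenBeta on this bill."),
    ("RepGamma", "Thanks @SenBeta and @RepAlpha for the support!"),
    ("SenBeta", "Great hearing today with @RepAlpha. cc @someone_else"),
]

edges = build_mention_edges(tweets, members)
received = in_degree(edges)
```

The resulting edge counts could then be compared against a network built the same way from another source, such as shared press releases, which is the kind of structural comparison described above.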

The framing paper also analyzes Congress’ tweets, this time to see whether and how members of Congress frame political issues when they don’t have to go through traditional media to get their message out. We found that members do frame when talking directly to the public, especially about issues such as energy policy, gender equality, and healthcare. We created a measure of polarization based on framing efforts that, when combined with DW-Nominate voting data, gives a more nuanced and complete picture of political polarization in the 113th Congress.

The last project analyzes tweets aimed at Congress, authored by citizens. In this project, we investigated whether, and how, citizens use social media to try to influence political outcomes. We found 16 different strategies citizens employ to lobby their representatives and grouped those into five different speech acts. It turns out we do more than shout and moan on social media; we do actually try to understand where our representatives stand and to get them to change their minds about issues that are important to us.

