In studying the #GamerGate discussion(s) on Twitter, I’m using a variety of theories including persuasion, community action, and paranoid style in politics. I could use some help making sense of what I’m seeing, so I’ll be blogging as I go. Please contact me or use the comment functions if you have ideas.
First up, language and social influence. Why this approach? The argument about what #GamerGate is – a discussion about ethics in game journalism or a coordinated harassment effort – could be explored in part by examining the language participants use. I’m especially interested in whether users are persuasive in their language – can they convince readers that the discussion is about what they claim it’s about? [The mainstream press says, “no”, “no”, “no”, you get the idea.] I’m not advocating for that argument. As you can see from the links in this paragraph, it’s been resolved. I’m curious, rather, about whether the language used by tweeters indicates an argument was even happening, and I’m using the presence/absence of persuasion markers to figure that out.
Some Background Research
A number of language features are connected to social influence, especially in online communication:
- lexical diversity
- powerful language
- language intensity
Lexical diversity refers to a class of measures of the range of vocabulary a writer uses. Often, lexical diversity is measured using a type-token ratio (number of unique words [types] divided by total number of words [tokens]). Low linguistic diversity leads to low evaluations from readers and “negatively impacts credibility and influence” [1, p. 598]. A number of social factors have been shown to impact lexical diversity, including anxiety and writing apprehension, both of which produced writing with low lexical diversity.
Powerful language is defined by what it’s missing, namely linguistic features such as tag questions (e.g., “isn’t it?”), hedges (e.g., “sort of”, “kinda”), hesitations (e.g., “um”), fragments, and intensifiers (e.g., “really”). Writers who use powerful language are perceived as more competent and authoritative, and their arguments are judged as more persuasive. Several studies have found that women tend to use a less powerful language style than men [8, 9, 10].
Many researchers use Bowers’ [11, p. 345] definition of language intensity as “the quality of language which indicates the degree to which the speaker’s attitude toward a concept deviates from neutrality.” Intensity is often conveyed through emotionality and is measured using some variety of a scale in which words are labeled according to their intensity. Some popular scales include Jones and Thurstone and Burgoon and Miller. Intense language is associated with persuasion, resistance to persuasion, perceived credibility, and attitude-behavior consistency. Receivers have been shown to tolerate more intense messages from men than from women.
Based on this research about the relationships between linguistic style and social influence, I expect influential tweeters to have high lexical diversity and to use powerful, intense language. By influential, I mean those tweeters who are able to control the message. I’m treating retweets as a decent proxy for persuasion and influence – I assume people RT persuasive arguments and authors. I also expect those most committed to the ethics in gaming argument to use the most intense language – they seem the least likely to be persuaded otherwise, based on mainstream media reports. I’m also wondering whether the “ethics in journalism” argument is failing because its supporters do not use persuasive language; posts like “Actually, it’s about ethics” are not persuasive. So, maybe that argument is losing because (a) there’s a crap-ton of harassment happening that makes it wrong/irrelevant, and (b) even when it’s not harassing, the language used isn’t very persuasive.
Lexical diversity was calculated using a standard type:token ratio (unique words divided by total words).
$$ LD = \frac{w_u}{w_t} \qquad (1) $$
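Here is a minimal sketch of how the type:token ratio in (1) could be computed for a single tweet. It is not the code from my GitHub utilities; the function names `tokenize` and `lexical_diversity` and the simple regex tokenizer are assumptions made for illustration, and a real tokenizer would need to handle URLs, mentions, and emoji more carefully.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into word tokens.

    Assumption: a token is any run of letters, digits, or apostrophes,
    so "#GamerGate" becomes "gamergate" and punctuation is dropped.
    """
    return re.findall(r"[a-z0-9'’]+", text.lower())


def lexical_diversity(text):
    """Type:token ratio, i.e. unique words divided by total words.

    Returns 0.0 for text with no tokens to avoid dividing by zero.
    """
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    return len(set(tokens)) / len(tokens)


if __name__ == "__main__":
    # Five tokens, four unique ("the" repeats).
    print(lexical_diversity("the cat and the dog"))
```

One caveat on this design: type:token ratios are sensitive to text length, so very short tweets will tend to score high simply because there is little room for repetition.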
I created a measure for powerful language using a similar ratio approach. First, I calculated the ratios of common hedges (e.g., “i feel like”, “probably”) and intensifiers (e.g., “really”) to total words, and I then used the inverse of the combined ratio (total hedges and intensifiers divided by total words) to measure the power of the language used.
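A rough sketch of that measure follows. Two things here are assumptions rather than a description of my actual code: the marker lists are hypothetical (only the quoted examples come from this post), and “inverse” is read as the reciprocal of the combined ratio, which is undefined when a tweet contains no hedges or intensifiers, so the sketch returns `None` in that case.

```python
import re

# Hypothetical marker lists for illustration; only the examples quoted
# in the post are used here, and a real analysis would need fuller lists.
HEDGES = ["i feel like", "probably", "sort of", "kinda"]
INTENSIFIERS = ["really"]


def count_markers(text, markers):
    """Count whole-word (or whole-phrase) occurrences of each marker."""
    lowered = text.lower()
    return sum(
        len(re.findall(r"\b" + re.escape(marker) + r"\b", lowered))
        for marker in markers
    )


def powerful_language(text):
    """Reciprocal of (hedges + intensifiers) / total words.

    Assumption: "inverse" means reciprocal. Returns None when the text
    has no words or no powerless markers, since the score is undefined.
    """
    total_words = len(re.findall(r"[a-z0-9'’]+", text.lower()))
    powerless = count_markers(text, HEDGES) + count_markers(text, INTENSIFIERS)
    if total_words == 0 or powerless == 0:
        return None
    return total_words / powerless


if __name__ == "__main__":
    # Eight words, three markers ("i feel like", "probably", "really").
    print(powerful_language("I feel like this is probably really bad"))
```

If “inverse” were instead read as one minus the combined ratio, every tweet would get a defined score between 0 and 1; which reading is better depends on how tweets with no markers should be treated.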
I used a ratio of high intensity markers (e.g., “very”, “strongly”) to low intensity markers (e.g., “poor”, “mildly”) to measure language intensity.
$$ LI = \frac{i_h}{i_l} \qquad (3) $$
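The ratio in (3) could be sketched like this. Again, the marker sets are hypothetical and contain only the examples quoted above; a real analysis would draw them from a published intensity scale. Returning `None` when there are no low-intensity markers is also an assumption, since the ratio is undefined in that case.

```python
import re

# Hypothetical marker sets; only the examples quoted in the post.
HIGH_INTENSITY = {"very", "strongly"}
LOW_INTENSITY = {"poor", "mildly"}


def language_intensity(text):
    """Ratio of high-intensity markers to low-intensity markers.

    Returns None when the text has no low-intensity markers, because
    the ratio would require dividing by zero.
    """
    tokens = re.findall(r"[a-z0-9'’]+", text.lower())
    high = sum(1 for token in tokens if token in HIGH_INTENSITY)
    low = sum(1 for token in tokens if token in LOW_INTENSITY)
    if low == 0:
        return None
    return high / low


if __name__ == "__main__":
    # Three high-intensity markers, one low-intensity marker.
    print(language_intensity("very very strongly opposed, mildly annoyed"))
```

Because single tweets will often contain no low-intensity markers at all, it may make more sense to compute this ratio over all of a user’s tweets rather than tweet by tweet.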
You’re right. This project is also a great excuse for me to learn more Python. If you’ve ever talked to me about code in person, you know how that makes me feel [not awesome]. But, here we are. I’m writing some utilities for automatically analyzing tweets, and the code is available on GitHub. That code assumes you used TwitterGoggles to collect and parse the tweets.
 J. J. Bradac, J. W. Bowers, and J. A. Courtright, “Three Language Variables in Communication Research: Intensity, Immediacy, and Diversity,” Human Communication Research, vol. 5, no. 3, pp. 257–269, 1979.
 J. J. Bradac, C. W. Konsky, and R. A. Davies, “Two Studies of the Effects of Linguistic Diversity Upon Judgments of Communicator Attributes and Message Effectiveness,” Communication Monographs, vol. 43, no. 1, pp. 70–79, Mar. 1976.
 J. J. Bradac, M. R. Hemphill, and C. H. Tardy, “Language Style on Trial: Effects of ‘Powerful’ and ‘Powerless’ Speech Upon Judgments of Victims and Villains,” Western Journal of Speech Communication, vol. 45, no. 4, pp. 327–341, Fall 1981.
 M. Burgoon and G. R. Miller, “Prior Attitude and Language Intensity as Predictors of Message Style and Attitude Change Following Counterattitudinal Advocacy,” Journal of Personality and Social Psychology, vol. 20, no. 2, p. 246, 1971.
 M. Burgoon, S. B. Jones, and D. Stewart, “Toward a Message-Centered Theory of Persuasion: Three Empirical Investigations of Language Intensity,” Human Communication Research, vol. 1, no. 3, pp. 240–256, 1975.