Calculating Geometric Mean in NodeXL
My social networks reading group read De Choudury et al’s WWW’10 paper about inferring social networks from email (citation’s below) this week, and I was inspired by our discussion to calculate geometric means for Twitter mentions. You can read elsewhere about the public officials and social media project, but basically I have a bunch of data from Twitter including how often members of Congress mention one another. DeChoudury and colleagues point out that using a binary decision to evaluate the weight of edges (reciprocated or not) doesn’t make much sense when you’re talking about personal communication. They use an alternative weight they call “geometric mean” that I thought might be useful for Twitter mentions. Their equation for geometric mean is
wsij = sqrt[(wij*wji)]
Where two individuals i and j exchange messages. wij is the number of messages from i to j and wji is the number of messages from j to i. In my data, these are mentions. My data is already in a NodeXL workbook, and I’ve already collapsed duplicate edges to create an “Edge Weight” column. Here’s what my data look like now:
To calculate the geometric mean, I used a helper column “MergedVertices” with the formula
=[@[Vertex 1]]&[@[Vertex 2]].
Then the formula for “Geometric Mean” is:
=SQRT(SUMIF([MergedVertices],[@[Vertex 1]]&[@[Vertex 2]],[Edge Weight])*(SUMIF([MergedVertices],[@[Vertex 2]]&[@[Vertex 1]],[Edge Weight])))
Citation: De Choudhury, M., Mason, W. A., Hofman, J. M., & Watts, D. J. (2010). Inferring relevant social networks from interpersonal communication. Proceedings of the 19th international conference on World wide web – WWW ’10 (p. 301). New York, New York, USA: ACM Press. Retrieved from http://dl.acm.org/citation.