Recently the senior class voted on prom themes, and the form included an additional field asking people to select their friends. I wanted to use this to forecast possible winners of prom king and queen. My theory was that PageRank, an algorithm Google uses to rank the importance of web pages, could identify the “most popular” people and thereby predict who would win prom king and queen.
TL;DR
- tried to survey people about who their friends are
- ran PageRank on the resulting graph to determine the most popular people and predict prom king and queen winners
- failed: not enough participants, people didn’t understand why they were being asked, and PageRank doesn’t seem to correlate with popularity.
- other ideas?
TL
Of the 209 seniors, 104 voted on prom themes. There were three themes, and each person gave each theme a score between 1 and 10. I chose this over a first choice, second choice, third choice system because some people feel strongly about one theme and really don’t want the other options, while others couldn’t care less. Those who don’t care much generally give all the themes about the same score, whereas those who really don’t want a theme can give it a low score relative to the others. I had many 1,10,1 kinds of votes and a number of 6,7,8 kinds of votes. Clearly some people felt more strongly about this than others (I should’ve graphed participation between girls and boys, I’ll do that tomorrow). To determine the prom theme, the scores for each theme were averaged and the theme with the highest average was chosen.
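The averaging step is simple enough to sketch in a few lines of Python. The ballots below are made up for illustration (the real votes aren't reproduced here), and the function names are just ones picked for this example.

```python
from statistics import mean

# Hypothetical ballots: each tuple is one voter's 1-10 score for
# (theme A, theme B, theme C). These numbers are invented for illustration.
ballots = [
    (1, 10, 1),
    (6, 7, 8),
    (2, 9, 3),
    (5, 5, 5),
]

def average_scores(ballots):
    """Return the mean score of each theme across all ballots."""
    # zip(*ballots) regroups the scores by theme instead of by voter.
    return [mean(scores) for scores in zip(*ballots)]

def winning_theme(ballots):
    """Return the index of the theme with the highest mean score."""
    averages = average_scores(ballots)
    return max(range(len(averages)), key=averages.__getitem__)

print(average_scores(ballots))
print(winning_theme(ballots))
```

Note how the 1,10,1 ballot pulls theme B up and the other two down much harder than the 6,7,8 ballot moves anything, which is the point of scoring instead of ranking.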
Although just under half the class voted, the results are probably representative of the class. One theme had a clearly higher average than the other two, so full class participation would be unlikely to change the outcome (and those who didn’t vote presumably didn’t care enough about the theme anyway).
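One rough way to check that claim: work out how strongly the 105 non-voters would have to prefer the runner-up for it to overtake the winner. The sketch below does that arithmetic. The 104 and 105 come from the numbers above, but the 2.0 point lead is a made-up value, since the actual averages aren't given here.

```python
def margin_needed_to_flip(voters, non_voters, voter_margin):
    """Average margin by which non-voters must prefer the runner-up
    for it to overtake the leader once everyone's scores are averaged.

    voter_margin is the leader's average minus the runner-up's average
    among the people who actually voted.
    """
    # The leader is ahead by voters * voter_margin total points, so the
    # non-voters must make up more than that between them.
    return voters * voter_margin / non_voters

# 104 voters and 209 - 104 = 105 non-voters; the 2.0 lead is hypothetical.
print(margin_needed_to_flip(104, 105, 2.0))
```

Because the two groups are almost the same size, the non-voters would have to prefer the runner-up by about as much as the voters preferred the winner.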
At the bottom of the voting form was a field for the voter to select all of their friends. The form required at least 3 friends to be selected. That was meant as a minimum, but most people selected exactly 3 friends and then submitted their vote. On a side note, this says something about how attentively people read when they are directed from Facebook, or it may be that people didn’t have much time to list their friends (though there was a quick search-as-you-type box).
I ran PageRank on this friend graph (initially with the R implementation, and then I coded my own). While I got a ranking of who the “most popular” people were, the results aren’t very conclusive. The prom theme vote holds up with partial participation because it relies on simple stats, but PageRank on a graph requires most of the graph, ideally the whole graph. With only about half of the graph, the results are inconclusive.
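For reference, here is a minimal sketch of PageRank by power iteration on a friend graph. It is an illustration only, not the R implementation or the code used for the results above. The names in `survey`, the damping factor of 0.85, and the choice to spread a non-voter's rank evenly over everyone are all assumptions made for this example.

```python
def pagerank(friends, damping=0.85, iterations=100):
    """Power-iteration PageRank on a directed friend graph.

    friends maps each voter to the list of people they selected.
    People who were selected but never voted have no outgoing edges;
    their rank is spread evenly over everyone (one common convention).
    """
    nodes = set(friends)
    for selected in friends.values():
        nodes.update(selected)
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}

    for _ in range(iterations):
        # Total rank held by people with no outgoing edges (non-voters).
        dangling = sum(rank[node] for node in nodes if not friends.get(node))
        base = (1 - damping) / n + damping * dangling / n
        new_rank = {node: base for node in nodes}
        for voter, selected in friends.items():
            if not selected:
                continue
            # Each voter splits their rank equally among the friends they chose.
            share = damping * rank[voter] / len(selected)
            for friend in selected:
                new_rank[friend] += share
        rank = new_rank
    return rank

# Hypothetical survey: three voters who each picked exactly 3 friends.
# "dee" was selected by everyone but never filled in the form.
survey = {
    "ana": ["ben", "cy", "dee"],
    "ben": ["ana", "cy", "dee"],
    "cy": ["ana", "ben", "dee"],
}
ranks = pagerank(survey)
for name in sorted(ranks, key=ranks.get, reverse=True):
    print(name, round(ranks[name], 3))
```

In this toy graph "dee" comes out on top, yet nothing is known about who she would have selected. With half the class missing, a large share of the edges is unknown in exactly this way.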
Further, PageRank is useful for determining “good quality” web pages because it uses link data, that is, which pages other pages link to. I don’t know if this analogy applies to people, or at least I’m not sure how to interpret the result. When the “links” are friendship lines, what is being determined by finding the highest ranking people with PageRank? Most friendly, most popular, most influential? I’m not sure, and I’d like to know your thoughts on this.