EDBT/ICDT 2009 Joint Conference

Electronic Conference Proceedings

Efficient Identification of Starters and Followers in Social Media

Authors

Abstract

Activity and user engagement in social media such as web logs, wikis, online forums, and social networks have been increasing at unprecedented rates. As with social behavior in many other human activities, user activity in social media indicates that some individuals consistently drive or stimulate ‘discussions’ in the online world. Such individuals are considered ‘starters’ of online discussions, in contrast with ‘followers’, who primarily engage in existing discussions and follow them.

In this paper, we formalize the notions of ‘starters’ and ‘followers’ in social media. Motivated by the sheer volume of available information on online social behavior, we focus on developing random sampling approaches that identify starters and followers with significant efficiency gains. In our experimental section we use BlogScope, our social media warehousing platform under development at the University of Toronto. We demonstrate the scalability and accuracy of our sampling approaches on real data, establishing the practical utility of our techniques in a real social media warehousing environment.

Session

EDBT Research Session 20: Workflow Techniques (Thursday, March 26, 11:00–12:30)