The scientific opportunities afforded by social media are readily apparent when we consider the richness and precision of the data they provide on participation in elections as well as protests, riots, and other spontaneous political events. We are constructing comprehensive datasets of incoming and outgoing social media messages in systematically structured formats that are well suited to machine learning methods. We plan to integrate information on social network connectivity with a wide array of metadata on individuals and their social contacts. By developing new methods to harvest and combine these data sources effectively, we aim to transform the scientific study of social and political attitudes and behavior.

Every time individuals use social media, they leave behind a digital footprint of what was communicated, when it was communicated, and to whom it was communicated. Typically, such precise measurements are available only to laboratory investigators working in artificial settings. To our knowledge, no previous research team has successfully used fine-grained social influence data of this kind to predict consequential behavioral outcomes, such as attendance at a given protest or rally or the casting of a vote in an election. We are also conducting panel surveys, which are essential for drawing causal inferences about the cognitive and motivational processes through which social media use facilitates political participation.

Our overarching goal is to forge an interdisciplinary collaboration that examines the impact of social media on political behavior by iterating through stages of model development, testing, refinement, and validation. First, from social psychology and political science we derive fundamental hypotheses about how, why, and when social media affects citizens’ cognitions and motivations with respect to political participation. Second, we express these hypotheses in empirically testable form as behavioral models (e.g., with quantitative response and predictor variables). Third, drawing on biology and computer science, we adapt sophisticated computational methods of approximate inference and machine learning, originally developed for the analysis of systems biology data, to evaluate our behavioral models using extremely large social media and social network datasets.
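
As a purely illustrative sketch of the second step, the Python fragment below shows one way a hypothesis could be written with a quantitative predictor and a quantitative response. Everything in it is an assumption made for illustration and is not taken from the project: the `Message` record and its field names, the keyword-based definition of exposure, the logistic functional form, and the coefficient values `intercept` and `slope`, which are placeholders rather than estimates.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    """Hypothetical record of one message: who sent what to whom, and when."""
    sender: str
    recipient: str
    timestamp: float  # seconds since an arbitrary reference time
    text: str


def exposure_count(messages, user, keyword):
    """Predictor: number of messages received by `user` that mention `keyword`."""
    needle = keyword.lower()
    return sum(
        1 for m in messages
        if m.recipient == user and needle in m.text.lower()
    )


def participation_probability(exposure, intercept=-2.0, slope=0.5):
    """Response: logistic probability of participating given exposure.

    The coefficients are illustrative placeholders, not fitted values.
    """
    return 1.0 / (1.0 + math.exp(-(intercept + slope * exposure)))


# Toy messages invented for this example only.
messages = [
    Message("alice", "bob", 1.0, "Protest at noon in the square"),
    Message("carol", "bob", 2.0, "Are you going to the protest?"),
    Message("bob", "alice", 3.0, "See you at lunch"),
]

n = exposure_count(messages, "bob", "protest")
p = participation_probability(n)
```

In a real analysis the coefficients would be estimated from observed outcomes, such as attendance records or survey responses, and the test of the hypothesis would be whether the estimated slope differs reliably from zero.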