Governments everywhere would love to be able to foresee and respond to tumultuous events like natural disasters, economic instability, political disruption, or warfare. Thanks to a project recently approved by the US Director of National Intelligence (DNI), head of the Intelligence Community (IC), the idea is not far from reality. This past August, the Intelligence Advanced Research Projects Activity (IARPA) approved funding for a three-year Open Source Indicators (OSI) project. The risky project uses publicly available data harvested from an array of sources such as traffic webcams, Wikipedia edits, blogs, and web search queries in order to supply intelligence agencies with accurate predictions of major events. It does so by relying heavily on past models that have successfully predicted disease outbreaks and consumer behaviour.

The OSI program is only one of many programs undertaken by the different arms of the US government that synthesize mathematics, computer science, economics, and the social sciences into a predictive model for sociopolitical systems. Programs like OSI pay homage to the fictional mathematics professor Hari Seldon, from Isaac Asimov’s Foundation series, who formulates the mathematical laws of “psychohistory.” Psychohistory is only useful for large groups of people: the larger the group, the more accurate the predictions. In the novels, Seldon uses psychohistory to predict the collapse of the Galactic Empire and sets in motion a secretive plan to spark the rise of a second empire. The Galactic Empire and the US have little in common, but the hope is that predictive modelling will prove just as useful in the real world.

The recent explosion in “big data” and social media is what takes OSI from the domain of science fiction into that of cutting-edge research. “Big data” refers to the rapidly growing number of very large datasets now being stored on computers, something made possible only by the very low price of data storage. The falling price of storage and the development of new software tools coincided with the rise of Facebook, Twitter, and other social media services. The widespread use of these services generates a large volume of data recording user activity. That activity can be analyzed for sentiment, allowing researchers to sort it into three general levels: negative, positive, or neutral. Shifts in these levels over time provide strong leads for businesses looking for profitable information.
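To make the idea of sorting posts into three sentiment levels concrete, here is a minimal Python sketch. It is an illustration only: the tiny word lists and the function names `classify_sentiment` and `sentiment_counts` are hypothetical, and simple word counting stands in for the far more sophisticated, often proprietary, methods that commercial tools use.

```python
import re
from collections import Counter

# Hypothetical toy lexicon, assumed for illustration only.
# Real systems rely on much larger vocabularies or trained models.
POSITIVE_WORDS = {"good", "great", "love", "happy", "excellent"}
NEGATIVE_WORDS = {"bad", "terrible", "hate", "angry", "awful"}


def classify_sentiment(text: str) -> str:
    """Label a piece of text as 'positive', 'negative', or 'neutral'."""
    words = re.findall(r"[a-z']+", text.lower())
    # Score = number of positive words minus number of negative words.
    score = sum(w in POSITIVE_WORDS for w in words) - sum(
        w in NEGATIVE_WORDS for w in words
    )
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def sentiment_counts(posts: list[str]) -> Counter:
    """Count how many posts fall into each of the three sentiment levels."""
    return Counter(classify_sentiment(post) for post in posts)


if __name__ == "__main__":
    sample_posts = [
        "I love this great product",
        "What a terrible, awful day",
        "The train leaves at noon",
    ]
    print(sentiment_counts(sample_posts))
```

Running the same count over posts from successive days or weeks would show whether sentiment is drifting in one direction, which is the kind of signal the businesses described here are looking for.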

More so than governments, businesses are eager to mine the datasets generated by social media, and many want to join OSI. Recorded Future, one of many companies applying to the OSI project, uses proprietary software to sift through 300,000 different sources every hour in order to predict the movement of the stock market and even terrorist activity.

Other companies, although not directly related to the OSI program, are developing technology that effectively mines these datasets. For example, Toronto-based Sysomos provides businesses and organizations with tools to monitor and quantify the sentiment of online conversations about their brands on social media outlets. Sysomos uses proprietary natural-language-processing technology to extract sentiment from relevant text online. Klout, another company, attempts to measure and rank the influence individuals have on their peers in social networks. Klout uses your online social footprint, based on how your network engages with the content you create or share, to gauge three different indicators: the number of people you influence, the extent to which you influence those in your network, and how influential the people in your network are. Using these three indicators, Klout derives a ‘klout’ score, on a scale of 1 to 100, for each individual. Though the scores aren’t always perfectly accurate, most companies would nevertheless love to market themselves directly to the most influential people.

Some academics are worried that models of sentiment analysis and influence designed for predicting consumer behaviour on social networks are inapplicable to sociopolitical prediction. Robert Albro, an anthropologist at American University, expresses concerns that consumer behaviour-driven models and their assumptions will eventually influence IARPA. Kalev Leetaru, a computer scientist at the University of Illinois, believes the OSI project is better off attempting to predict trends, like the Arab Spring, instead of discrete events. In his view, the technology used for predicting stock market movements or consumer behaviour is a far cry from predicting a riot next week.

Whether or not the OSI project succeeds, governments and corporations everywhere will keep attempting to predict the future. We can expect further expansion of the data mining market, along with increased capacity for storing and analyzing big data.