With more than 500 million Twitter messages posted daily, social media is exploding on the Internet. To better make sense of this microblogging flood, researchers at Temple University's College of Science and Technology and Binghamton University have been awarded a three-year, $1.08 million National Science Foundation grant. The research could have vast applications for both government and industry. The goal is to enable the targeted monitoring of social media in order to:
- collect and understand users' opinions about products and brands, people, election preferences, or recent world events
- aggregate data about such topics as reviews of products and services
- mine data for early crisis detection and response
- mine data to fight crime
- mine data to enhance national security and combat terrorism.
"From a computer's point of view, the contents of social media microblogs are simply streams of data," says Eduard Dragut, assistant professor in the College of Science and Technology's Department of Computer & Information Sciences and one of the principal investigators. "We have to develop an algorithm that will enable computers to identify specific entities within the text of a microblog message, such as Coca-Cola, President Obama, Temple University or Boko Haram."
Challenges to that task include the massive volume of messages posted daily on such social media platforms as Twitter, Facebook, Pinterest and Instagram; the speed at which they are posted; their free-form language; their lack of context; and the use of multiple languages. Ultimately, the goal is to be able to detect, in near real time, pieces of text that reference specific entities, and then to link such entity references both to other social media platforms and to web pages, including Wikipedia, that mention and define these persons, groups and products.
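To make the detect-and-link task concrete, here is a minimal sketch in Python. It is not the researchers' algorithm, which the article does not describe; it is a naive dictionary lookup, and the `KNOWLEDGE_BASE` table and the `link_entities` function are hypothetical names invented for illustration. It finds known entity names in a message and pairs each with a Wikipedia URL.

```python
import re

# Hypothetical knowledge base (an assumption for illustration only):
# lowercase surface form -> Wikipedia page title.
KNOWLEDGE_BASE = {
    "coca-cola": "Coca-Cola",
    "president obama": "Barack_Obama",
    "temple university": "Temple_University",
    "boko haram": "Boko_Haram",
}

# Try longer names first so they win over shorter overlapping ones.
_NAMES = sorted(KNOWLEDGE_BASE, key=len, reverse=True)

# Match a known name only when it is not embedded in a longer word.
_PATTERN = re.compile(
    r"(?<!\w)(" + "|".join(re.escape(name) for name in _NAMES) + r")(?!\w)",
    re.IGNORECASE,
)


def link_entities(message):
    """Return (matched text, start offset, Wikipedia URL) for each known entity."""
    results = []
    for match in _PATTERN.finditer(message):
        title = KNOWLEDGE_BASE[match.group(1).lower()]
        url = "https://en.wikipedia.org/wiki/" + title
        results.append((match.group(1), match.start(1), url))
    return results


if __name__ == "__main__":
    for hit in link_entities("Grabbing a coca-cola before class at Temple University"):
        print(hit)
```

A fixed lookup table like this breaks down on exactly the challenges listed above: misspellings, slang, ambiguous names, missing context and other languages are all invisible to it, which is why the task calls for new algorithms.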
"One interesting facet of this project is that, despite the fact that microblogs such as Twitter are so short, through the identification and aggregation of specific entities mentioned we can extract a lot of information," says Yuhong Guo, associate professor of CIS and the research project's other lead investigator.
The resulting algorithms and software will be distributed as free, open-source software to universities, industries and government agencies.
Weiyi Meng, professor of computer science at Binghamton University, is also collaborating on the research. The project will support and train at least three graduate students and one undergraduate student at Temple and one graduate student at Binghamton.