Meena Nagarajan

Applied Research

Several results of my research have fed into two currently deployed social intelligence applications built over user-generated content from popular online social media platforms.

BBC SoundIndex : Pulse of The Music Populace - In collaboration with IBM Almaden

The results of the entity extraction work in the music domain allowed us to track and trend online mentions of artists, songs, tracks and albums. To go beyond the volume of entity mentions and get a sense of the popularity of these entities in the online music community, we built a real-time dashboard system that mines user comments, eliminates spam, identifies entities and associated sentiments (paying close attention to translating slang sentiments using a domain dictionary) and generates Top-N lists of popular artists. In doing so, we paid particular attention to the social aspects of the UGC, preserving who wrote each comment, so that the notion of ‘one chart for everyone’ could be replaced with the notion that each group can generate its own chart reflecting what is popular among people like themselves [to appear in a special issue of the VLDB Journal on "Data Management and Mining for Social Networks and Social Media"].
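The pipeline described above (spam filtering, entity spotting, slang-aware sentiment, per-group Top-N charts) can be illustrated with a minimal Python sketch. It is not the deployed system: the slang dictionary, the spam markers, the substring-based entity matching and the function names are all assumptions made for illustration.

```python
from collections import Counter, defaultdict

# Hypothetical slang-to-polarity dictionary; the actual domain dictionary is not shown here.
SLANG_SENTIMENT = {"sick": 1, "dope": 1, "love": 1, "wack": -1, "hate": -1}
# Hypothetical spam markers; a real spam filter would be far more involved.
SPAM_MARKERS = ("http://", "buy now")


def is_spam(text):
    """Flag a comment as spam if it contains any known marker."""
    lowered = text.lower()
    return any(marker in lowered for marker in SPAM_MARKERS)


def comment_polarity(text):
    """Sum dictionary polarities over the words of a comment."""
    words = (w.strip(".,!?") for w in text.lower().split())
    return sum(SLANG_SENTIMENT.get(w, 0) for w in words)


def top_n_by_group(comments, artists, n=2):
    """Build one Top-N artist chart per commenter group.

    Entity spotting is plain substring matching here, a stand-in for
    the entity extraction work mentioned in the text.
    """
    scores = defaultdict(Counter)
    for comment in comments:
        if is_spam(comment["text"]):
            continue
        if comment_polarity(comment["text"]) <= 0:
            continue  # only positive mentions count toward popularity
        lowered = comment["text"].lower()
        for artist in artists:
            if artist.lower() in lowered:
                scores[comment["group"]][artist] += 1
    return {
        group: [artist for artist, _ in counts.most_common(n)]
        for group, counts in scores.items()
    }


SAMPLE_COMMENTS = [
    {"group": "teens", "text": "Rihanna is sick!"},
    {"group": "teens", "text": "love Rihanna"},
    {"group": "teens", "text": "Coldplay is dope"},
    {"group": "adults", "text": "I love Coldplay"},
    {"group": "adults", "text": "Rihanna is wack"},
    {"group": "adults", "text": "buy now http://spam Coldplay love"},
]

if __name__ == "__main__":
    print(top_n_by_group(SAMPLE_COMMENTS, ["Rihanna", "Coldplay"]))
```

Keeping the commenter's group on every comment is what lets each group get its own chart instead of a single global ranking.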

Results from the work were absorbed into a larger Social Intelligence application, BBC’s SoundIndex, that also looks at what music is being requested, traded and sold in digital environments, in addition to mining popularity from comments on artist pages. SoundIndex has been featured in mainstream news and has generated interest in academia and industry alike.

Sample press coverage of the BBC SoundIndex can be found here.

A slide deck accompanying the VLDB 2010 submission can be downloaded from here.

Twitris : Social Signals From Experiential Data On Twitter - In collaboration with Knoesis Center

In Twitris, we use the spatial, temporal and thematic contexts implicit in user-generated tweets to extract summaries of social perceptions behind real-time events. Twitris is powered by our work on extracting spatio-temporal-thematic summaries of micro-blog feeds, where the intuition is to supplement the statistical importance of extracted themes with cues from the content’s spatial and temporal attributes. Consequently, in providing summaries of what people are saying about the healthcare debate, Twitris is able to preserve the local social and cultural perceptions that generated the data, for example, preserving the liberal vs. conservative opinions originating from Oregon vs. Georgia. Extracted social summaries are further used to pull semantically related content from the Web and provide richer contexts to the end user.
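As a rough illustration of supplementing a theme's statistical importance with spatial and temporal cues, the Python sketch below ranks terms within one (region, day) slice and boosts terms that are used in few other regions. The scoring formula, the field names and the sample tweets are assumptions chosen for illustration; they are not the Twitris algorithm.

```python
import math
from collections import Counter, defaultdict


def regional_themes(tweets, region, day, top_k=3):
    """Rank terms for one (region, day) slice of tweets.

    Assumed score: frequency inside the slice, multiplied by
    log(1 + total_regions / regions_using_term), so that terms
    specific to a region outrank terms used everywhere.
    """
    slice_counts = Counter()
    regions_using = defaultdict(set)
    all_regions = set()
    for tweet in tweets:
        all_regions.add(tweet["region"])
        terms = set(tweet["text"].lower().split())
        for term in terms:
            regions_using[term].add(tweet["region"])
        # Temporal and spatial cue: only tweets from this slice add frequency.
        if tweet["region"] == region and tweet["day"] == day:
            slice_counts.update(terms)
    scored = {
        term: count * math.log(1 + len(all_regions) / len(regions_using[term]))
        for term, count in slice_counts.items()
    }
    # Sort by descending score, breaking ties alphabetically.
    return sorted(scored, key=lambda term: (-scored[term], term))[:top_k]


# Hypothetical sample data, invented for this sketch.
SAMPLE_TWEETS = [
    {"region": "Oregon", "day": "day-1", "text": "healthcare reform now"},
    {"region": "Oregon", "day": "day-1", "text": "healthcare reform please"},
    {"region": "Georgia", "day": "day-1", "text": "healthcare costs rising"},
    {"region": "Georgia", "day": "day-1", "text": "healthcare costs hurt"},
]

if __name__ == "__main__":
    for place in ("Oregon", "Georgia"):
        print(place, regional_themes(SAMPLE_TWEETS, place, "day-1", top_k=2))
```

In this toy data the shared term "healthcare" is frequent in both regions, so the region-specific terms rise to the top of each local summary.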

You can try out Twitris here.

A slide deck describing the system can be downloaded from here.