I am a Ph.D. student at the Kno.e.sis Center, working with Professor Amit Sheth. My research interests include emotion identification, sentiment analysis, and the semantic social web.
Research Experience
Understanding and Modeling Emotions with Tweets (2011.11-present)
There is nothing more exciting than embracing the era of big data. We aim to study people's emotions at the scale of millions of data entries. We plan to collect, analyze, and model emotions in social media, and eventually predict people's emotions from the texts they write. Here are some preliminary but interesting discoveries about Twitter users in the Eastern Standard Timezone (US & Canada) between Nov. 10th and Nov. 28th:
The most significant (more than 65%) emotion on Nov. 24th is thankfulness. (of course^_^) diagram link
The most significant emotion after people get up is thankfulness (see the peak between 7am and 9am); the most significant emotion at night is love (see the love/affection peak around 10pm). diagram link
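A minimal sketch of how an hourly "most significant emotion" could be computed, assuming each tweet has already been labeled with an emotion and reduced to an (hour, emotion) pair. The function name `dominant_emotion_by_hour` and the sample data are hypothetical and for illustration only; they are not the project's code or its results.

```python
from collections import Counter, defaultdict


def dominant_emotion_by_hour(labeled_tweets):
    """Return {hour: (emotion, share)} for the most frequent emotion per hour.

    labeled_tweets: iterable of (hour, emotion) pairs (assumed input format).
    """
    counts = defaultdict(Counter)
    for hour, emotion in labeled_tweets:
        counts[hour][emotion] += 1

    result = {}
    for hour, counter in counts.items():
        emotion, n = counter.most_common(1)[0]
        # Share of this hour's tweets that carry the dominant emotion.
        result[hour] = (emotion, n / sum(counter.values()))
    return result


# Toy data, invented for this example.
sample = [
    (8, "thankfulness"), (8, "thankfulness"), (8, "love"),
    (22, "love"), (22, "love"), (22, "anger"), (22, "love"),
]
peaks = dominant_emotion_by_hour(sample)
```

Plotting the share per hour for each emotion would give curves like the ones described above.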
Computers are good at numbers, while humans are good at words. When it comes to sentiment analysis, besides presenting sentiment polarities (numbers between -1 (negative) and 1 (positive)), we extract meaningful sentiment clues (both words and phrases). We think it is more interesting and convincing to tell a user that a movie is "must see" and "rate 5 stars" than to say the overall polarity is 0.85. By applying an optimization model that minimizes inconsistencies among sentiment clues, we extract target-dependent word and phrase clues, as well as their corresponding polarities, from tweets.
Suicide takes away over 98 lives each day in the US [stat]. If we can identify and keep track of people's emotions, we will be able to monitor people's mental status and prevent people from committing suicide. We constructed a hybrid classifier that is able to discover sentence-level emotions from 16 categories, e.g., love, pride, abuse, anger, happiness, guilt, etc. We explored a variety of lexical, syntactic and knowledge-based features. We also proposed an algorithm to automatically extract effective syntactic and lexical patterns from training examples.
Twitris is a Semantic Web application that facilitates understanding of social perceptions by Semantics-based processing of massive amounts of event-centric data. My contributions include:
Designed and implemented a multi-process program to collect tweets for multiple events, collect user profiles and decode user locations to latitudes and longitudes.
Improved website response time by optimizing the database schema, the MySQL database and SQL queries.
Internship in WOO (A Web Of Objects) Team @ Yahoo! (2010.6-2011.4)
The goal of the WOO project is to integrate existing high-quality knowledge bases, such as DBpedia, Yahoo! Movies, etc., and come up with an integrated WOO knowledge base. My job was to disambiguate and integrate movie, actor and director objects in DBpedia and Yahoo! Movies. My contributions include:
Designed and implemented object disambiguation algorithm in Hadoop environment
Used generic features as well as domain/problem-specific features
Automatically collected training data for supervised machine learning
The ultimate goal of HPCO is to better perform knowledge discovery through semantic search and browsing. A focused domain hierarchy is semi-automatically constructed from Wikipedia, and triples are extracted from scientific literature (PubMed). My job was to align predicates in the extracted triples with existing predicates in the domain ontology. The idea is to automatically discover verb relationships, such as synonymy and antonymy, starting from seed verb synonyms/antonyms. My contributions include:
Utilized 100 CPU cores to process a large collection of biomedical paper abstracts over Hadoop DFS
Extracted a restricted set of seed synonym/antonym verbs from WordNet
Constructed probability-enabled synonym/antonym patterns from a POS-tagged training corpus
Applied learned patterns to obtain more synonym/antonym verbs
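The seed-and-pattern idea in the steps above can be sketched roughly as follows. This is a simplified illustration, not the published method: it assumes sentences are already tokenized, ignores POS tags, treats the tokens between two seed verbs as a pattern, and estimates a relation probability per pattern from seed counts. The names `learn_patterns` and `apply_patterns`, the threshold, and the toy data are all hypothetical.

```python
from collections import Counter, defaultdict


def learn_patterns(sentences, seeds):
    """Estimate P(relation | pattern) from seed verb pairs.

    sentences: list of token lists.
    seeds: {(verb_a, verb_b): "syn" or "ant"} (assumed seed format).
    A pattern is the tuple of tokens found between the two seed verbs.
    """
    counts = defaultdict(Counter)
    for tokens in sentences:
        for (a, b), relation in seeds.items():
            if a in tokens and b in tokens:
                i, j = tokens.index(a), tokens.index(b)
                if j - i > 1:  # require at least one token in between
                    counts[tuple(tokens[i + 1:j])][relation] += 1
    return {
        pattern: {r: n / sum(c.values()) for r, n in c.items()}
        for pattern, c in counts.items()
    }


def apply_patterns(sentences, patterns, threshold=0.5):
    """Find new word pairs connected by a learned pattern."""
    found = {}
    for tokens in sentences:
        for pattern, probs in patterns.items():
            k = len(pattern)
            for i in range(len(tokens) - k - 1):
                if tuple(tokens[i + 1:i + 1 + k]) == pattern:
                    relation, p = max(probs.items(), key=lambda kv: kv[1])
                    if p >= threshold:
                        found[(tokens[i], tokens[i + 1 + k])] = (relation, p)
    return found


# Toy data, invented for this example.
train = [
    ["increase", "rather", "than", "decrease"],
    ["raise", "rather", "than", "lower"],
]
seeds = {("increase", "decrease"): "ant", ("raise", "lower"): "ant"}
patterns = learn_patterns(train, seeds)
found = apply_patterns([["promote", "rather", "than", "suppress", "growth"]], patterns)
```

A real pipeline would restrict candidates to verbs using the POS tags and would need far more data for the probabilities to be meaningful.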
Research of Services intelligence in Convergence Network Environment @ BUPT (2005-2008)
Extended OWL-S profile to describe telecom service more precisely
Wenbo Wang, Christopher Thomas, Amit Sheth, Victor Chan. Pattern-Based Synonym and Antonym Extraction. 48th ACM Southeast Conference, ACMSE2010, Oxford, Mississippi, April 15-17, 2010.