Identifying Tweets with Implicit Entity Mentions

Title: Identifying Tweets with Implicit Entity Mentions
Publication Type: Thesis
Year of Publication: 2016
Authors: Adarsh Alex
Academic Department: Department of Computer Science and Engineering
Number of Pages: 56
Date Published: 08/2016
University: Wright State University
City: Dayton
Thesis Type: MS Thesis
Abstract

Social networking sites like Twitter and Facebook have become a significant source of user-generated content in the past decade. Mining this content has proved beneficial for a broad range of applications such as event extraction, document retrieval, and sentiment analysis. Identifying entities is one of the major tasks that supply important information to these applications. Entity identification is typically performed in two steps: Named Entity Recognition (NER) and Entity Linking. State-of-the-art NER solutions focus on recognizing entities that are mentioned explicitly in social media posts. However, entities are also frequently mentioned implicitly. For example, the tweet ‘Didn’t know that its the same actress in Fault in our stars and Divergent.’ contains explicit references to the movies Fault in our stars and Divergent, while it implicitly refers to the actress Shailene Woodley. Spotting and classifying tweets with such implicit entity mentions (i.e., recognizing that the above tweet has an implicit entity of type ACTRESS) is the initial step towards identifying the implicit mention of Shailene Woodley in this tweet. In this thesis, we propose a two-step, semantics-driven approach to the spotting and typing of implicit entity mentions in text. Specifically, we answer two research questions:

  1. How to find tweets that have implicit entity mentions of a given type?
  2. What features help to distinguish tweets with implicit entity mentions from tweets with explicit entity mentions and tweets with no entity mentions at all?

We answer the first question by developing a technique to find semantic cues that indicate the presence of implicit entity mentions in tweets. We answer the second question by exploiting the syntactic features of the tweets, along with semantic features extracted from crowd-sourced knowledge bases like Wikipedia and DBpedia, to determine whether a tweet has an implicit entity mention. We evaluate our approach by creating a gold-standard dataset for two domains, namely movies and books.
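
To make the three-way distinction concrete (implicit mention, explicit mention, no mention), here is a minimal rule-based sketch in Python. It is not the method developed in the thesis, which relies on learned syntactic and semantic features. The cue terms, the entity list, and the function names `tokenize` and `label_tweet` are all hypothetical and were invented for this illustration.

```python
import re

# Hypothetical cue terms and entity names, invented for illustration only.
# The thesis derives its semantic cues and features from knowledge bases.
CUES = {"ACTRESS": {"actress", "heroine"}}
KNOWN_ENTITIES = {"ACTRESS": {"shailene woodley", "emma watson"}}


def tokenize(text):
    """Lowercase a tweet and split it into word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def label_tweet(tweet, entity_type):
    """Return 'explicit', 'implicit', or 'none' for the given entity type."""
    tokens = tokenize(tweet)
    # Pad with spaces so entity names only match on whole-word boundaries.
    padded = " " + " ".join(tokens) + " "
    # Explicit mention: a known entity name of this type appears verbatim.
    if any(f" {name} " in padded for name in KNOWN_ENTITIES[entity_type]):
        return "explicit"
    # Implicit candidate: a type cue appears without any entity name.
    if CUES[entity_type] & set(tokens):
        return "implicit"
    return "none"
```

Under these assumptions, the example tweet from the abstract is labeled as an implicit candidate of type ACTRESS, because it contains the cue "actress" but no actress name. A real system would replace the hand-written sets with cues and features mined from resources such as Wikipedia and DBpedia.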
