Meenakshi Nagarajan (Meena)
Researcher / PhD Student

Research

I am a Ph.D. student with the Knoesis Center in the Department of Computer Science and Engineering at Wright State University. I moved with my advisor Amit Sheth from the University of Georgia, where I pursued an M.S. and Ph.D. in Computer Science. My undergraduate education was a Master of Management Studies degree with a specialization in Computer Science from BITS - Pilani, India.

I am interested in the statistical and natural language processing of data originating from social software. Specifically, my work tries to address some of the challenges that social data (structured, semi-structured and unstructured) brings to information analytics applications. Using a combination of statistical and linguistic techniques backed by (often limited) domain knowledge, my work helps glean the underlying semantics of the data while taking advantage of the social structure from which the data originates.

Recent News:

Primary contributor to recently funded proposals:

  • UIMA Innovation Award: IBM awarded our proposal to the UIMA competition, "UIMA-based Infrastructure for Summarizing Casual, Unstructured Text" based on my work during my summer internship with the Semantic Super Computing group at IBM Almaden in 2007.
  • Microsoft's "Beyond Search - Semantic Computing and Internet Economics" 2008 Award: Microsoft awarded our proposal on "Chatter, Intent, Good Karma and Targeted Advertising in Social Networks" based on my work.

Publications:

  • B. Aleman-Meza, M. Nagarajan, L. Ding, A. Sheth, I. B. Arpinar, A. Joshi, and T. Finin, "Scalable Semantic Analytics on Social Networks for Addressing the Problem of Conflict of Interest Detection," ACM Transactions on the Web, 2(1), February 2008.

Projects

Some past efforts in this direction have been the following:

1. Disambiguation of entities in real-world social network data (WWW 2006 and ACM TWeb 2008)

This was an effort to build scalable algorithms for disambiguating entity mentions in two real-world online sources, DBLP and FOAF, in the context of a conflict of interest application. The rule-based implementation (tested on two graphs with more than 500,000 entities each) takes into account the connections between the entities, the attributes of the entities and previous reconciliation decisions to suggest confidence scores for the disambiguation. While reference reconciliation of entities is not a new problem, using connections between entities in a social network adds evidence that attribute comparison alone does not provide. National security applications that need to merge information about the same person (occurring as two different entities in the data) by using connections with other entities are an example of potential applications that this implementation can be extended to.
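
To make the idea concrete, here is a minimal Python sketch of how attributes, connections and earlier reconciliation decisions could combine into a confidence score. It is an illustration only, not the published implementation: the function names, the equal weights, the Jaccard overlap measure and the sample identifiers are all assumptions.

```python
from typing import Dict, Set

# Hypothetical weights; the actual rules and weights are not given here.
ATTRIBUTE_WEIGHT = 0.5
CONNECTION_WEIGHT = 0.5


def jaccard(x: Set[str], y: Set[str]) -> float:
    """Overlap of two sets; 0.0 when both are empty."""
    union = x | y
    return len(x & y) / len(union) if union else 0.0


def confidence(a: str, b: str,
               attributes: Dict[str, Set[str]],
               neighbors: Dict[str, Set[str]],
               reconciled: Dict[str, str]) -> float:
    """Score in [0, 1] that entities a and b refer to the same person."""
    # Evidence 1: how many attribute values (name, affiliation, ...) agree.
    attr_score = jaccard(attributes[a], attributes[b])

    # Evidence 2: shared connections. Neighbors are first mapped to the
    # canonical id chosen by previous reconciliation decisions, so earlier
    # merges strengthen later ones.
    na = {reconciled.get(n, n) for n in neighbors[a]}
    nb = {reconciled.get(n, n) for n in neighbors[b]}
    conn_score = jaccard(na, nb)

    return ATTRIBUTE_WEIGHT * attr_score + CONNECTION_WEIGHT * conn_score


# Made-up sample data for one DBLP entry and one FOAF entry.
attributes = {"dblp:1": {"a. sheth", "uga"}, "foaf:9": {"a. sheth", "wsu"}}
neighbors = {"dblp:1": {"dblp:2", "dblp:3"}, "foaf:9": {"foaf:7", "dblp:3"}}

without_history = confidence("dblp:1", "foaf:9", attributes, neighbors, {})
with_history = confidence("dblp:1", "foaf:9", attributes, neighbors,
                          {"foaf:7": "dblp:2"})
```

In this toy data, knowing that foaf:7 was already reconciled with dblp:2 makes the two neighborhoods identical, so the score with history is higher than the score without it.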

2. Data mediation applications on semi-structured data (ICWS 2006, JWSR 2007)

Once matches between concepts have been established, using them at run-time to support data mediation applications is a different problem. One of the most important challenges is generating mappings or transformations between concepts beyond the established matches. For example, two Name objects might be semantically equivalent but structurally different: one may encode first, middle and last names while the other includes only the first and last name of a person. Such cases are especially commonplace in a business process where potentially interoperable components cannot function together because of heterogeneities at the structure level. Executing the transformations at run-time using the existing technology infrastructure, without significant overhead, is an equally important consideration. This work addresses both concerns: specifying mappings to ‘socially agreed vocabularies’ or Ontologies, and executing them at run-time to facilitate mediation using existing business process standards (WSDL, BPEL) and tools (Axis 2).
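
The Name example can be sketched in a few lines of Python. This is only an illustration of mapping two structurally different schemas through a shared concept; the work itself targets WSDL, BPEL and Axis 2, and the class names, field names and the dict standing in for an ontology concept are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class FullName:
    """Schema A: first, middle and last names."""
    first: str
    middle: str
    last: str


@dataclass
class ShortName:
    """Schema B: first and last names only."""
    first: str
    last: str


# A plain dict stands in for a shared ontology concept (hypothetical keys).
def full_to_concept(name: FullName) -> dict:
    return {"givenName": name.first,
            "middleName": name.middle,
            "familyName": name.last}


def short_to_concept(name: ShortName) -> dict:
    # Schema B carries no middle name, so the concept slot stays empty.
    return {"givenName": name.first,
            "middleName": None,
            "familyName": name.last}


def concept_to_short(concept: dict) -> ShortName:
    return ShortName(first=concept["givenName"], last=concept["familyName"])


def concept_to_full(concept: dict) -> FullName:
    # A missing middle name becomes an empty string rather than a guess.
    return FullName(first=concept["givenName"],
                    middle=concept.get("middleName") or "",
                    last=concept["familyName"])


# Mediation at run-time: A -> concept -> B, and B -> concept -> A.
short = concept_to_short(full_to_concept(FullName("Ada", "King", "Lovelace")))
full = concept_to_full(short_to_concept(ShortName("Ada", "Lovelace")))
```

Mapping each schema to the shared concept once, rather than to every other schema, keeps the number of mappings linear in the number of schemas.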

3. Socially agreed vocabularies as expectations of word co-occurrences (WWW 2007)

This was joint work with Hewlett Packard research labs in summer 2006. The idea was to use background knowledge found in agreed vocabularies or Ontologies as corroboration for term co-occurrences in text. It is well known that two words in a sentence do not always occur independently of each other. The likelihood that one will find Florida in a document when one spots the term Miami is quite high. Words occurring in a document can be related to each other in many ways. This work exploits the semantic relationships between terms (Place State) explicated in Ontologies to alter the discriminatory position of terms in a document term vector for the task of classification. The goal is to emphasize the usefulness of background knowledge, often not available in the document, to identify terms that are most important or discriminating in a document.
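
A small Python sketch of the Miami and Florida example follows. It is a simplified illustration, not the method from the paper: the relation table, the fixed boost factor and the use of raw term counts are all assumptions made for the example.

```python
from collections import Counter
from typing import Dict, List, Set

# Hypothetical background knowledge: terms related through an ontology.
RELATED: Dict[str, Set[str]] = {
    "miami": {"florida"},
    "florida": {"miami"},
}


def reweight(tokens: List[str],
             related: Dict[str, Set[str]],
             boost: float = 1.5) -> Dict[str, float]:
    """Return a term vector in which corroborated terms are boosted.

    A term is corroborated when at least one of its ontology-related
    terms also occurs in the same document.
    """
    counts = Counter(tokens)
    present = set(counts)
    weights: Dict[str, float] = {}
    for term, frequency in counts.items():
        corroborated = bool(related.get(term, set()) & present)
        weights[term] = frequency * boost if corroborated else float(frequency)
    return weights


vector = reweight(["miami", "beach", "florida", "beach"], RELATED)
alone = reweight(["miami", "beach"], RELATED)
```

Here "miami" and "florida" corroborate each other and are boosted, while "beach" keeps its raw count; when "miami" appears without "florida", its weight is unchanged.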

4. Mining topical popularities from unstructured data in social communities (Submitted to WWW 2008)

This was joint work with the Semantic Super Computing group at IBM Almaden in summer 2007. The UIMA-based implementation (which was primarily my responsibility) mined user comments on music artist pages to correlate the volume and type of comments (positive or negative) with artist popularity over time. Mining comments in such a social network included removing spam and identifying music, artist and track related entities and sentiment polarities. The uniqueness of the approach was a result of the nature of teen-authored content. Beset with demographic-specific diction, slang and broken English, the content was analyzed using a combination of statistical and linguistic techniques. The text analysis techniques developed for this data were also used on call center logs (which present similar linguistic challenges) and found to be effective in identifying popularly expressed customer sentiments about an agent or product.
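
The shape of that pipeline can be sketched in Python as follows. This is a toy illustration, not the UIMA implementation: the slang lexicon, the spam markers and the sample comments are invented, and real spam filtering and entity spotting are far more involved.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

# Hypothetical slang-aware polarity lexicon (+1 positive, -1 negative).
SLANG_POLARITY = {"sick": 1, "dope": 1, "luv": 1, "lame": -1, "sux": -1}
# Hypothetical spam markers; a crude stand-in for a real spam filter.
SPAM_MARKERS = ("http://", "www.", "add me")


def tally(comments: Iterable[Tuple[str, int, str]]) -> Dict[Tuple[str, int], Dict[str, int]]:
    """Count positive and negative comments per (artist, week).

    Each comment is an (artist, week, text) tuple. Spam is skipped, and
    comments with no net polarity are not counted.
    """
    totals = defaultdict(lambda: {"pos": 0, "neg": 0})
    for artist, week, text in comments:
        lowered = text.lower()
        if any(marker in lowered for marker in SPAM_MARKERS):
            continue  # drop spam before any sentiment analysis
        score = sum(SLANG_POLARITY.get(token.strip(".,!?"), 0)
                    for token in lowered.split())
        if score > 0:
            totals[(artist, week)]["pos"] += 1
        elif score < 0:
            totals[(artist, week)]["neg"] += 1
    return dict(totals)


counts = tally([
    ("Artist A", 1, "this track is SICK!!"),
    ("Artist A", 1, "luv it"),
    ("Artist A", 1, "check www.spam.example add me"),
    ("Artist A", 2, "kinda lame"),
])
```

The per-week counts are the kind of signal that could then be compared against a popularity measure over time.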

Publications

See publications section in resume: pdf

Experience

KNO.E.SIS Center - Wright State University : Member and Graduate Research Assistant (Jan 2007 - Present)

Large Scale Distributed Information Systems Lab, LSDIS - University Of Georgia : Member and Graduate Research Assistant (Aug 2003 - Dec 2006)

IBM Almaden Research Center - San Jose, California : Summer Research Internship (June 2007 - Aug 2007)

Hewlett Packard Research Labs - Palo Alto, California : Summer Research Internship (May 2006 - Aug 2006)

Courses

Content coming soon

Contact

Email - meena610ATgmailDOTcom