RDF Query Processing

Our work on a keyword search interface for expressing path queries over RDF takes keywords and extracts semantic associations by transforming the keyword query into an internal SPARQ2L representation. We perform semantic query expansion on the keywords: the set of keywords is expanded to include their synonyms, stemmed forms, and the tokens of compound words. As a result, the user does not need to know how entities are named internally or how they are linked to each other in the dataset.
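The expansion step can be illustrated with a minimal Python sketch. It is not the project's implementation: the small TOY_SYNONYMS table is a hypothetical stand-in for a WordNet lookup, naive_stem is a deliberately rough suffix stripper rather than a real stemmer, and all function names are assumptions made for illustration.

```python
import re

# Hypothetical stand-in for a WordNet lookup; entries are illustrative only.
TOY_SYNONYMS = {
    "council": ["committee", "board"],
    "infiltration": ["penetration"],
}


def tokenize_compound(keyword):
    """Split a compound keyword on whitespace, underscores, hyphens and camelCase."""
    spaced = re.sub(r"(?<=[a-z])(?=[A-Z])", " ", keyword)
    return [t.lower() for t in re.split(r"[\s_\-]+", spaced) if t]


def naive_stem(word):
    """Very rough suffix stripping; a real system would use a proper stemmer."""
    for suffix in ("ations", "ation", "ing", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def expand_keyword(keyword, synonyms=TOY_SYNONYMS):
    """Return the keyword plus its tokens, stems and (toy) synonyms."""
    expanded = {keyword.lower()}
    for token in tokenize_compound(keyword):
        expanded.add(token)
        expanded.add(naive_stem(token))
        expanded.update(synonyms.get(token, []))
    return expanded


if __name__ == "__main__":
    print(sorted(expand_keyword("Council Infiltration")))
```

Each expanded term would then be matched against entity labels in the dataset, so a user-supplied keyword can reach an entity even when the two are spelled or phrased differently.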

We currently use WordNet to perform the expansion, but the system can be customized for a specific domain by switching to a domain-specific ontology. The internal SPARQ2L representation is then used to find the semantic associations: executing the query extracts the relevant paths connecting the two entities. SPARQ2L uses Tarjan's algorithm for the single-source path expression problem, which allows us to find all the paths connecting two entities and represent them in a compact form such as a regular expression.

For example, keywords such as "Council Infiltration" and "Middle East" on the Insider Threat Dataset will return paths connecting the two entities. One possible path is:

Council->located_in->Beirut->capital_of->Lebanon->located_in->Middle East
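To make the idea of path extraction concrete, the following Python sketch enumerates simple paths between two entities in a toy triple set built from the example path. It uses a plain depth-first search, not Tarjan's path expression algorithm, and it does not produce a compact regular expression. It also follows edges only in their stated direction. The function names, the max_hops limit, and the extra "member_of" triple are hypothetical and exist only for illustration.

```python
from collections import defaultdict

# Toy triples taken from the example path, plus one hypothetical distractor.
TRIPLES = [
    ("Council", "located_in", "Beirut"),
    ("Beirut", "capital_of", "Lebanon"),
    ("Lebanon", "located_in", "Middle East"),
    ("Council", "member_of", "Coalition"),  # hypothetical, not from the dataset
]


def build_index(triples):
    """Map each subject to its outgoing (predicate, object) pairs."""
    index = defaultdict(list)
    for subject, predicate, obj in triples:
        index[subject].append((predicate, obj))
    return index


def find_paths(index, source, target, max_hops=4):
    """Enumerate simple directed paths from source to target by depth-first search."""
    paths = []

    def walk(node, visited, trail):
        if node == target:
            paths.append(trail)
            return
        # trail alternates node, predicate, node, ... so hops = (len - 1) // 2
        if (len(trail) - 1) // 2 >= max_hops:
            return
        for predicate, obj in index.get(node, []):
            if obj not in visited:
                walk(obj, visited | {obj}, trail + [predicate, obj])

    walk(source, {source}, [source])
    return paths


def format_path(trail):
    """Render a path in the arrow notation used in the example."""
    return "->".join(trail)


if __name__ == "__main__":
    for trail in find_paths(build_index(TRIPLES), "Council", "Middle East"):
        print(format_path(trail))
```

Exhaustive enumeration like this grows quickly on large graphs, which is one reason a compact path expression is attractive compared with listing every path individually.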

Similar path-related searches appear in some new and upcoming search engines such as Powerset, where features like PowerMouse allow people to find the missing link between two given entities. However, Powerset uses a syntactic technique to extract facts from Wikipedia.

Our future plans for this work include:
1. Extend the SPARQ2L system.
2. Provide a service endpoint to perform query optimization.
3. Include more constraints and further enhance our keyword-to-SPARQ2L query mapping.

This material is based on work partially supported by the National Science Foundation under Award No. 071441 to Wright State University and No. IIS-0325464 to University of Georgia titled “SemDis: Discovering Complex Relationships in the Semantic Web.” Any opinions, findings, conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation. PI: Amit Sheth, Co-PIs: I. Budak Arpinar, Krys Kochut, and John Miller.