METIS: Automating Metabase Creation from Multiple Heterogeneous Sources

Publication Type: Thesis
Year of Publication: 2000
University: University of Georgia
Thesis Type: Master's
Keywords: Data heterogeneity, Domain modeling, Field matching, Fuzzy matching, Object fusion, Information extraction, Information integration, Metadata, Object identification, Semantic heterogeneity, Semi-structured information, Structural heterogeneity, Syntactic heterogeneity

A metabase is an important component of information systems that extract and integrate information from semi-structured (Web) sources in order to answer information requests. A metabase is a single database of information about an entity, in which the data is extracted, integrated, and fused from multiple heterogeneous and autonomously created information sources. A metabase can be virtual, created at run time, or it can be created off-line through extraction from different sources. Creating such a metabase is not an easy task, however, because information for each entity may be modeled, structured, named, and stored differently across sources. Previous work has addressed these issues in different contexts. This thesis seeks to solve the main issues involved in this sort of integration and to automate the process of metabase creation for any domain. It also provides a toolkit that automates the creation of the metabase by letting users declaratively specify rules to standardize and integrate this information.
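
The general idea of declarative standardization followed by fuzzy matching and object fusion can be illustrated with a minimal Python sketch. It is not the METIS toolkit or its rule language: the rule table `FIELD_RULES`, the functions `standardize`, `same_entity`, and `fuse`, the 0.85 similarity threshold, and the sample records are all hypothetical, and the sketch assumes every record carries a title-like field.

```python
from difflib import SequenceMatcher

# Hypothetical declarative rules: canonical field name -> names used by sources.
FIELD_RULES = {
    "title": ["title", "name", "movie_title"],
    "year": ["year", "release_year", "yr"],
    "director": ["director", "directed_by"],
}


def standardize(record, rules=FIELD_RULES):
    """Rename source-specific fields to canonical names and trim the values."""
    out = {}
    for canonical, aliases in rules.items():
        for alias in aliases:
            if alias in record:
                out[canonical] = str(record[alias]).strip()
                break
    return out


def same_entity(a, b, key="title", threshold=0.85):
    """Fuzzy match: treat two records as one entity if their keys are similar."""
    ratio = SequenceMatcher(None, a[key].lower(), b[key].lower()).ratio()
    return ratio >= threshold


def fuse(records, key="title"):
    """Object fusion: merge records judged to describe the same entity."""
    fused = []
    for rec in records:
        for entity in fused:
            if same_entity(entity, rec, key):
                # Keep values already present; fill in only the missing fields.
                for field, value in rec.items():
                    entity.setdefault(field, value)
                break
        else:
            fused.append(dict(rec))
    return fused


# Two hypothetical sources that name and format the same information differently.
source_a = [{"name": "The Matrix", "yr": 1999}]
source_b = [
    {"movie_title": "the matrix ", "directed_by": "Wachowski"},
    {"movie_title": "Toy Story", "release_year": 1995},
]

metabase = fuse([standardize(r) for r in source_a + source_b])
```

Here the two records for the same film are merged into one entry that combines the year from the first source with the director from the second, while the unrelated record stays separate. A real system would need per-domain rules and a more careful conflict-resolution policy than "first value wins".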

