TY - CONF
T1 - Intent Classification of Short-Text on Social Media
T2 - SocialCom 2015: 8th IEEE International Conference on Social Computing and Networking
Y1 - 2015
A1 - Hemant Purohit
A1 - Guozhu Dong
A1 - Valerie Shalin
A1 - Krishnaprasad Thirunarayan
A1 - Amit Sheth
KW - Contrast mining
KW - Crisis Informatics
KW - Declarative Knowledge
KW - Intent Mining
KW - Psycholinguistics
KW - Social Media
AB - Social media platforms facilitate the emergence of citizen communities that discuss real-world events. Their content reflects a variety of intent ranging from social good (e.g., volunteering to help) to commercial interest (e.g., criticizing product features). Hence, mining intent from social data can aid in filtering social media to support organizations, such as an emergency management unit for resource planning. However, effective intent mining is inherently challenging due to ambiguity in interpretation, and sparsity of relevant behaviors in social data. In this paper, we address the problem of multiclass classification of intent with a use-case of social data generated during crisis events. Our novel method exploits a hybrid feature representation created by combining top-down processing using knowledge-guided patterns with bottom-up processing using a bag-of-tokens model. We employ pattern-set creation from a variety of knowledge sources including psycholinguistics to tackle the ambiguity challenge, social behavior about conversations to enrich context, and contrast patterns to tackle the sparsity challenge. Our results show a significant absolute gain up to 7% in the F1 score relative to a baseline using bottom-up processing alone, within the popular multiclass frameworks of One-vs-One and One-vs-All. Intent mining can help design efficient cooperative information systems between citizens and organizations for serving organizational information needs.
JA - SocialCom 2015: 8th IEEE International Conference on Social Computing and Networking
PB - IEEE
CY - Chengdu, China
ER -
TY - JOUR
T1 - Pattern-Aided Regression Modeling and Prediction Model Analysis
JF - IEEE Transactions on Knowledge and Data Engineering
Y1 - 2015
A1 - Guozhu Dong
A1 - Vahid Taslimitehrani
KW - Correlation and regression analysis
KW - Data Mining
KW - error analysis
KW - mining methods and algorithms
KW - model validation and analysis
AB - This paper first introduces pattern aided regression (PXR) models, a new type of regression models designed to represent accurate and interpretable prediction models. This was motivated by two observations: (1) Regression modeling applications often involve complex diverse predictor-response relationships, which occur when the optimal regression models (of given regression model type) fitting two or more distinct logical groups of data are highly different. (2) State-of-the-art regression methods are often unable to adequately model such relationships. This paper defines PXR models using several patterns and local regression models, which respectively serve as logical and behavioral characterizations of distinct predictor-response relationships. The paper also introduces a contrast pattern aided regression (CPXR) method, to build accurate PXR models. In experiments, the PXR models built by CPXR are very accurate in general, often outperforming state-of-the-art regression methods by big margins. Usually using (a) around seven simple patterns and (b) linear local regression models, those PXR models are easy to interpret; in fact, their complexity is just a bit higher than that of (piecewise) linear regression models and is significantly lower than that of traditional ensemble based regression models. CPXR is especially effective for high-dimensional data. The paper also discusses how to use CPXR methodology for analyzing prediction models and correcting their prediction errors.
VL - 27
CP - 9
ER -
TY - Generic
T1 - A New CPXR Based Logistic Regression Method and Clinical Prognostic Modeling Results Using the Method on Traumatic Brain Injury
T2 - IEEE 14th International Conference on BioInformatics and BioEngineering (BIBE)
Y1 - 2014
A1 - Vahid Taslimitehrani
A1 - Guozhu Dong
KW - contrast pattern mining
KW - Logistic regression
KW - Prognostic modeling
KW - Traumatic brain injury
AB - Prognostic modeling is central to medicine, as it is often used to predict patients' outcome and response to treatments and to identify important medical risk factors. Logistic regression is one of the most used approaches for clinical prediction modeling. Traumatic brain injury (TBI) is an important public health issue and a leading cause of death and disability worldwide. In this study, we adapt CPXR (Contrast Pattern Aided Regression, a recently introduced regression method), to develop a new logistic regression method called CPXR(Log), for general binary outcome prediction (including prognostic modeling), and we use the method to carry out prognostic modeling for TBI using admission time data. The models produced by CPXR(Log) achieved AUC as high as 0.93 and specificity as high as 0.97, much better than those reported by previous studies. Our method produced interpretable prediction models for diverse patient groups for TBI, which show that different kinds of patients should be evaluated differently for TBI outcome prediction and the odds ratios of some predictor variables differ significantly from those given by previous studies; such results can be valuable to physicians.
JA - IEEE 14th International Conference on BioInformatics and BioEngineering (BIBE)
PB - IEEE
CY - Boca Raton, Florida
ER -
TY - CONF
T1 - Logical Linked Data Compression
T2 - 10th Extended Semantic Web Conference (ESWC 2013)
Y1 - 2013
A1 - Amit Joshi
A1 - Pascal Hitzler
A1 - Guozhu Dong
AB - Linked data has experienced accelerated growth in recent years. With the continuing proliferation of structured data, demand for RDF compression is becoming increasingly important. In this study, we introduce a novel lossless compression technique for RDF datasets, called Rule Based Compression (RB Compression) that compresses datasets by generating a set of new logical rules from the dataset and removing triples that can be inferred from these rules. Unlike other compression techniques, our approach not only takes advantage of syntactic verbosity and data redundancy but also utilizes semantic associations present in the RDF graph. Depending on the nature of the dataset, our system is able to prune more than 50% of the original triples without affecting data integrity.
JA - 10th Extended Semantic Web Conference (ESWC 2013)
CY - Montpellier, France
ER -
TY - JOUR
T1 - Mining Effective Multi-Segment Sliding Window for Pathogen Incidence Rate Prediction
JF - Data & Knowledge Engineering
Y1 - 2013
A1 - Lei Duan
A1 - Changjie Tang
A1 - Xiasong Li
A1 - Guozhu Dong
A1 - Xianming Wang
A1 - Jie Zuo
A1 - Min Jiang
A1 - Zhongqiao Li
A1 - Yongqing Zhang
KW - Data Mining
KW - Multi-segment sliding window
KW - Pathogen incidence rate prediction
KW - Time series modeling
AB - Pathogen incidence rate prediction, which can be considered as time series modeling, is an important task for infectious disease incidence rate prediction and for public health. This paper investigates applying a genetic computation technique, namely GEP, for pathogen incidence rate prediction. To overcome the shortcomings of traditional sliding windows in GEP based time series modeling, the paper introduces the problem of mining effective sliding window, for discovering optimal sliding windows for building accurate prediction models. To utilize the periodical characteristic of pathogen incidence rates, a multi-segment sliding window consisting of several segments from different periodical intervals is proposed and used. Since the number of such candidate windows is still very large, a heuristic method is designed for enumerating the candidate effective multi-segment sliding windows. Moreover, methods to find the optimal sliding window and then produce a mathematical model based on that window are proposed. A performance study on real-world datasets shows that the techniques are effective and efficient for pathogen incidence rate prediction.
VL - 87
ER -
TY - JOUR
T1 - Survey of Emerging Pattern based Contrast Mining and Applications
Y1 - 2012
A1 - Lei Duan
A1 - Changjie Tang
A1 - Guozhu Dong
A1 - Nin Yang
A1 - Chi Gou
ER -
TY - JOUR
T1 - Use Attribute Behavior Diversity to Build Accurate Decision Tree Committees for Microarray Data
Y1 - 2012
A1 - Qian Han
A1 - Guozhu Dong
ER -
TY - CONF
T1 - Discovering Dynamic Logical Blog Communities Based on Their Distinct Interest Profiles.
T2 - SOTICS 2011
Y1 - 2011
A1 - Neil Fore
A1 - Guozhu Dong
JA - SOTICS 2011
CY - Barcelona, Spain
ER -
TY - CONF
T1 - An Equivalence Class Based Clustering Algorithm for Categorical Data
T2 - International Conference on Advances in Information Mining and Management
Y1 - 2011
A1 - Qingbao Liu
A1 - Wanjun Wang
A1 - Su Deng
A1 - Guozhu Dong
JA - International Conference on Advances in Information Mining and Management
PB - International Conference on Advances in Information Mining and Management
CY - Barcelona, Spain
ER -
TY - CONF
T1 - Overview of Contrast Data Mining as a Field and Preview of an Upcoming Book
T2 - 11th IEEE International Conference on Data Mining Workshops
Y1 - 2011
A1 - Guozhu Dong
A1 - James Bailey
JA - 11th IEEE International Conference on Data Mining Workshops
PB - ICDM 2011
CY - Las Vegas, NV
ER -
TY - Generic
T1 - Analyzing and Tracking Weblog Communities Using Discriminative Collection Representatives
T2 - SBP 2010
Y1 - 2010
A1 - Guozhu Dong
A1 - Ting Sa
KW - Behavioral Modeling
KW - Discriminative Collection Representatives
KW - Prediction
KW - Social Computing
JA - SBP 2010
CY - Bethesda, MD
VL - 6007
ER -
TY - JOUR
T1 - A Clustering Comparison Measure Using Density Profiles and its Application to the Discovery of Alternate Clusterings
JF - Data Mining and Knowledge Discovery
Y1 - 2010
A1 - Eric Bae
A1 - James Bailey
A1 - Guozhu Dong
KW - alternate clustering algorithms
KW - alternate clusterings
KW - cluster analysis
KW - clustering
KW - clustering comparison
KW - clustering similarity
AB - Data clustering is a fundamental and very popular method of data analysis. Its subjective nature, however, means that different clustering algorithms or different parameter settings can produce widely varying and sometimes conflicting results. This has led to the use of clustering comparison measures to quantify the degree of similarity between alternative clusterings. Existing measures, though, can be limited in their ability to assess similarity and sometimes generate unintuitive results. They also cannot be applied to compare clusterings which contain different data points, an activity which is important for scenarios such as data stream analysis. In this paper, we introduce a new clustering similarity measure, known as ADCO, which aims to address some limitations of existing measures, by allowing greater flexibility of comparison via the use of density profiles to characterize a clustering. In particular, it adopts a 'data mining style' philosophy to clustering comparison, whereby two clusterings are considered to be more similar, if they are likely to give rise to similar types of prediction models. Furthermore, we show that this new measure can be applied as a highly effective objective function within a new algorithm, known as MAXIMUS, for generating alternate clusterings.
ER -
TY - JOUR
T1 - Logical Queries over Views: Decidability and Expressiveness
JF - ACM Transactions on Computational Logic
Y1 - 2010
A1 - James Bailey
A1 - Anthony Widjaja
A1 - Guozhu Dong
KW - conjunctive query
KW - containment
KW - database query
KW - database view
KW - decidability
KW - first-order logic
KW - Lowenheim class
KW - monadic logic
KW - ontology reasoning
KW - Satisfiability
KW - unary logic
KW - unary view
AB - We study the problem of deciding the satisfiability of first-order logic queries over views, with our aim being to delimit the boundary between the decidable and the undecidable fragments of this language. Views currently occupy a central place in database research due to their role in applications such as information integration and data warehousing. Our main result is the identification of a decidable class of first-order queries over unary conjunctive views that generalizes the decidability of the classical class of first-order sentences over unary relations known as the Lowenheim class. We then demonstrate how various extensions of this class lead to undecidability and also provide some expressivity results. Besides its theoretical interest, our new decidable class is potentially interesting for use in applications such as deciding implication of complex dependencies, analysis of a restricted class of active database rules, and ontology reasoning.
ER -
TY - JOUR
T1 - Pattern Space Maintenance for Data Updates and Interactive Mining
JF - Computational Intelligence
Y1 - 2010
A1 - Mengling Feng
A1 - Guozhu Dong
A1 - Jinyan Li
A1 - Yap-Peng Tan
A1 - Limsoon Wong
KW - Data mining algorithms
KW - data update and interactive mining
KW - frequent pattern
KW - incremental maintenance
AB - This paper addresses the incremental and decremental maintenance of the frequent pattern space. We conduct an in-depth investigation on how the frequent pattern space evolves under both incremental and decremental updates. Based on the evolution analysis, a new data structure, Generator-Enumeration Tree (GE-tree), is developed to facilitate the maintenance of the frequent pattern space. With the concept of GE-tree, we propose two novel algorithms, Pattern Space Maintainer+ (PSM+) and Pattern Space Maintainer- (PSM-), for the incremental and decremental maintenance of frequent patterns. Experimental results demonstrate that the proposed algorithms, on average, outperform the representative state-of-the-art methods by an order of magnitude.
VL - 26
CP - 3
ER -
TY - JOUR
T1 - Unary First Order Logic Queries Over Views
Y1 - 2010
A1 - Guozhu Dong
A1 - James Bailey
A1 - Anthony Widjaja
ER -
TY - CONF
T1 - A Contrast Pattern Based Clustering Quality Index for Categorical Data
T2 - IEEE International Conference on Data Mining series (ICDM 2009)
Y1 - 2009
A1 - Qingbao Liu
A1 - Guozhu Dong
AB - Since clustering is unsupervised and highly explorative, clustering validation (i.e. assessing the quality of clustering solutions) has been an important and long standing research problem. Existing validity measures have significant shortcomings. This paper proposes a novel Contrast Pattern based Clustering Quality index (CPCQ) for categorical data, by utilizing the quality and diversity of the contrast patterns (CPs) which contrast the clusters in clusterings. High quality CPs can characterize clusters and discriminate them against each other. Experiments show that the CPCQ index (1) can recognize that expert-determined classes are the best clusters for many datasets from the UCI repository; (2) does not give inappropriate preference to larger number of clusters; (3) does not require a user to provide a distance function.
JA - IEEE International Conference on Data Mining series (ICDM 2009)
CY - Miami, Florida
ER -
TY - BOOK
T1 - Emerging Pattern Based Classification
Y1 - 2009
A1 - Guozhu Dong
A1 - Jinyan Li
ER -
TY - JOUR
T1 - Emerging Patterns
Y1 - 2009
A1 - Jinyan Li
A1 - Guozhu Dong
ER -
TY - BOOK
T1 - Evaluation of Inter Laboratory and Cross Platform Concordance of DNA Microarrays through Discriminating Genes and Classifier Transferability
Y1 - 2009
A1 - Shihong Mao
A1 - Charles Wang
A1 - Guozhu Dong
ER -
TY - JOUR
T1 - Incremental Computation of Queries
Y1 - 2009
A1 - Guozhu Dong
A1 - Jianwen Su
KW - Computer Communication Networks
KW - computer imaging
KW - database management
KW - Information Storage and Retrieval
KW - information systems applications
KW - Multimedia Information Systems
KW - pattern recognition and graphics
KW - vision
AB - A view on a database is defined by a query over the database. When the database is updated, the value of the view (namely the answer to the query) will likely change. The computation of the new answer to the query using the old answer is called incremental query computation or incremental view maintenance. Incremental computation is typically performed by identifying the part in the old answer that needs to be removed, and the part in the new answer that needs to be added. Incremental computation is desirable when it is much more efficient than a re-computation of the query. Efficiency can be measured by computation time, storage space, or query language desirability/availability, etc. Incremental computation algorithms could use auxiliary relations (in addition to the query answer), which also need to be incrementally computed. Two query languages can be involved for the incremental query computation problem. One is used for defining the view to be maintained, and the other for describing the incremental computation algorithm. For relational databases, the two languages can be relational algebra, SQL, nested relational algebra, Datalog, SQL embedded in a host programming language, etc.
ER -
TY - CHAP
T1 - Maintenance of Frequent Patterns: A Survey
Y1 - 2009
A1 - Jinyan Li
A1 - Limsoon Wong
A1 - Mengling Feng
A1 - Guozhu Dong
AB - This chapter surveys the maintenance of frequent patterns in transaction datasets. It is written to be accessible to researchers familiar with the field of frequent pattern mining. The frequent pattern maintenance problem is summarized with a study on how the space of frequent patterns evolves in response to data updates. This chapter focuses on incremental and decremental maintenance. Four major types of maintenance algorithms are studied: Apriori-based, partition-based, prefix-tree-based, and concise-representation-based algorithms. The authors study the advantages and limitations of these algorithms from both the theoretical and experimental perspectives. Possible solutions to certain limitations are also proposed. In addition, some potential research opportunities and emerging trends in frequent pattern maintenance are also discussed.
ER -
TY - CHAP
T1 - Mining Conditional Contrast Patterns
Y1 - 2009
A1 - Guozhu Dong
A1 - Guimei Liu
A1 - Limsoon Wong
A1 - Jinyan Li
AB - This chapter considers the problem of 'conditional contrast pattern mining.' It is related to contrast mining, where one considers the mining of patterns/models that contrast two or more datasets, classes, conditions, time periods, and so forth. Roughly speaking, conditional contrasts capture situations where a small change in patterns is associated with a big change in the matching data of the patterns. More precisely, a conditional contrast is a triple (B, F_{1}, F_{2}) of three patterns; B is the condition/context pattern of the conditional contrast, and F_{1} and F_{2} are the contrasting factors of the conditional contrast. Such a conditional contrast is of interest if the difference between F_{1} and F_{2} as itemsets is relatively small, and the difference between the corresponding matching dataset of B∪F_{1} and that of B∪F_{2} is relatively large. It offers insights on 'discriminating' patterns for a given condition B. Conditional contrast mining is related to frequent pattern mining and analysis in general, and to the mining and analysis of closed pattern and minimal generators in particular. It can also be viewed as a new direction for the analysis (and mining) of frequent patterns. After formalizing the concepts of conditional contrast, the chapter will provide some theoretical results on conditional contrast mining. These results (i) relate conditional contrasts with closed patterns and their minimal generators, (ii) provide a concise representation for conditional contrasts, and (iii) establish a so-called dominance-beam property. An efficient algorithm will be proposed based on these results, and experiment results will be reported. Related works will also be discussed.
ER -
TY - JOUR
T1 - Mining Disease State Converters for Medical Intervention of Diseases.
Y1 - 2009
A1 - Changjie Tang
A1 - Lei Duan
A1 - Guozhu Dong
KW - Class membership conversion
KW - Classification
KW - Contrast mining
KW - Disease state conversion
KW - Drug design
AB - In applications such as gene therapy and drug design, a key goal is to convert the disease state of diseased objects from an undesirable state into a desirable one. Such conversions may be achieved by changing the values of some attributes of the objects. For example, in gene therapy one may convert cancerous cells to normal ones by changing some genes' expression level from low to high or from high to low. In this paper, we define the disease state conversion problem as the discovery of disease state converters; a disease state converter is a small set of attribute value changes that may change an object's disease state from undesirable into desirable. We consider two variants of this problem: personalized disease state converter mining mines disease state converters for a given individual patient with a given disease, and universal disease state converter mining mines disease state converters for all samples with a given disease. We propose a DSCMiner algorithm to discover small and highly effective disease state converters. Since real-life medical experiments on living diseased instances are expensive and time consuming, we use classifiers trained from the datasets of given diseases to evaluate the quality of discovered converter sets. The effectiveness of a disease state converter is measured by the percentage of objects that are successfully converted from undesirable state into desirable state as deemed by state-of-the-art classifiers. We use experiments to evaluate the effectiveness of our algorithm and to show its effectiveness. We also discuss possible research directions for extensions and improvements. We note that the disease state conversion problem also has applications in customer retention, criminal rehabilitation, and company turn-around, where the goal is to convert class membership of objects whose class is an undesirable class.
ER -
TY - CONF
T1 - Mining Sequence Classifiers for Early Prediction
T2 - Mining Sequence Classifiers for Early Prediction
Y1 - 2008
A1 - Guozhu Dong
A1 - Zhengzheng Xing
A1 - Philip Yu
A1 - Jian Pei
JA - Mining Sequence Classifiers for Early Prediction
ER -
TY - JOUR
T1 - Semantic Knowledge Facilities for a Web-based Recipe Database System Supporting Personalization
Y1 - 2008
A1 - Qing Li
A1 - Liping Wang
A1 - Guozhu Dong
A1 - Yu Li
ER -
TY - CONF
T1 - Substructure Similarity Search in Chinese Recipes
T2 - Substructure Similarity Search in Chinese Recipes
Y1 - 2008
A1 - Guozhu Dong
A1 - Yu Yang
A1 - Qing Li
A1 - Liping Wang
A1 - Na Li
JA - Substructure Similarity Search in Chinese Recipes
ER -
TY - BOOK
T1 - Advances in Data and Web Management: Proceedings of the Joint International ApWeb/WAIM Conference on Web-Age Information Management
Y1 - 2007
A1 - Xuemin Lin
A1 - Guozhu Dong
A1 - Yu Yang
A1 - Jeffrey Xu Yu
A1 - Wei Wang
ER -
TY - JOUR
T1 - Efficient Computation of Iceberg Cubes by Bounding Aggregate Functions
Y1 - 2007
A1 - Xiuzhen Zhang
A1 - Pauline Lienhua Chou
A1 - Guozhu Dong
ER -
TY - CONF
T1 - Evolution and Maintenance of Frequent Pattern Space When Transactions Are Removed
T2 - Proceedings of PAKDD
Y1 - 2007
A1 - Jinyan Li
A1 - Limsoon Wong
A1 - Yap-Peng Tan
A1 - Mengling Feng
A1 - Guozhu Dong
JA - Proceedings of PAKDD
ER -
TY - JOUR
T1 - Mining Minimal Distinguishing Subsequence Patterns with Gap Constraints
Y1 - 2007
A1 - Guozhu Dong
A1 - Xiaonan Ji
A1 - James Bailey
ER -
TY - BOOK
T1 - Sequence Data Mining
Y1 - 2007
A1 - Jian Pei
A1 - Guozhu Dong
ER -
TY - CONF
T1 - Clustering Similarity Comparison Using Density Profiles and its Application to the Discovery of Alternate Clusterings.
T2 - Clustering Similarity Comparison Using Density Profiles and its Application to the Discovery of Alternate Clusterings.
Y1 - 2006
A1 - Guozhu Dong
A1 - James Bailey
A1 - Eric Kyoo Han Bae
JA - Clustering Similarity Comparison Using Density Profiles and its Application to the Discovery of Alternate Clusterings.
ER -
TY - JOUR
T1 - Regression Cubes with Lossless Compression and Aggregation
Y1 - 2006
A1 - Guozhu Dong
A1 - Jianyong Wang
A1 - Benjamin Wah
A1 - Yixin Chen
A1 - Jiawei Han
A1 - Jian Pei
ER -
TY - CONF
T1 - Masquerader Detection Using OCLEP: One-Class Classification Using Length Statistics of Emerging Patterns
Y1 - 2006
A1 - Lijun Chen
A1 - Guozhu Dong
PB - International Workshop on INformation Processing over Evolving Networks (WINPEN)
ER -
TY - CONF
T1 - Minimum Description Length (MDL) Principle: Generators Are Preferable to Closed Patterns
T2 - Minimum Description Length (MDL) Principle: Generators Are Preferable to Closed Patterns
Y1 - 2006
A1 - Limsoon Wong
A1 - Jinyan Li
A1 - Guozhu Dong
A1 - H. Li
A1 - Jian Pei
JA - Minimum Description Length (MDL) Principle: Generators Are Preferable to Closed Patterns
ER -
TY - CONF
T1 - Succinct and Informative Cluster Descriptions for Document Repositories
T2 - Succinct and Informative Cluster Descriptions for Document Repositories
Y1 - 2006
A1 - Lijun Chen
A1 - Guozhu Dong
JA - Succinct and Informative Cluster Descriptions for Document Repositories
ER -
TY - JOUR
T1 - Stream Cube: An Architecture for Multi-Dimensional Analysis of Data Streams.
Y1 - 2004
A1 - Yixin Chen
A1 - Jian Pei
A1 - Jiawei Han
A1 - Y. Dora Cai
A1 - Guozhu Dong
A1 - Benjamin Wah
A1 - Jianyong Wang
ER -
TY - JOUR
T1 - On the decidability of the termination problem of active database systems
Y1 - 2004
A1 - James Bailey
A1 - K. Ramamohanarao
A1 - Guozhu Dong
ER -
TY - JOUR
T1 - DeEPs: A New Instance-based Discovery and Classification System
Y1 - 2004
A1 - Limsoon Wong
A1 - Kotagiri Ramamohanarao
A1 - Jinyan Li
A1 - Guozhu Dong
ER -
TY - JOUR
T1 - Incremental Recomputation in Local Languages
Y1 - 2003
A1 - Leonid Libkin
A1 - Limsoon Wong
A1 - Guozhu Dong
ER -
TY - JOUR
T1 - Online Mining of Changes from Data Streams: Research Problems and Preliminary Results
Y1 - 2003
A1 - Philip Yu
A1 - Guozhu Dong
A1 - Jiawei Han
A1 - Haixun Wang
A1 - Jian Pei
A1 - Laks V.S. Lakshmanan
ER -
TY - BOOK
T1 - Proceedings of the 4th International Conference on Web-Age Information Management
Y1 - 2003
A1 - Wei Wang
A1 - Guozhu Dong
A1 - Changjie Tang
ER -
TY - JOUR
T1 - Proceedings of The Fourth International Conference on Web-Age Information Management
Y1 - 2003
A1 - Guozhu Dong
A1 - Changjie Tang
A1 - Wei Wang
ER -
TY - JOUR
T1 - Pushing Aggregate Constraints by Divide-and-Approximate
Y1 - 2003
A1 - Jiawei Han
A1 - Yudong Jiang
A1 - Guozhu Dong
A1 - Ke Wang
A1 - Jeffrey Xu Yu
ER -
TY - JOUR
T1 - On Computing Condensed Frequent Pattern Bases
Y1 - 2002
A1 - Wei Zou
A1 - Jiawei Han
A1 - Jian Pei
A1 - Guozhu Dong
ER -
TY - JOUR
T1 - CubeExplorer
Y1 - 2002
A1 - Guozhu Dong
A1 - Jianyong Wang
A1 - Ke Wang
A1 - Jiawei Han
A1 - Jian Pei
ER -
TY - CONF
T1 - MultiDimensional Regression Analysis of Time-Series Data Streams.
T2 - MultiDimensional Regression Analysis of Time-Series Data Streams.
Y1 - 2002
A1 - Jian Pei
A1 - Jianyong Wang
A1 - Benjamin Wah
A1 - Jiawei Han
A1 - Wei Zou
A1 - Guozhu Dong
JA - MultiDimensional Regression Analysis of Time-Series Data Streams.
ER -
TY - CONF
T1 - Multi-Dimensional Regression Analysis of Time-Series Data Streams
T2 - Multi-Dimensional Regression Analysis of Time-Series Data Streams
Y1 - 2002
A1 - Jianyong Wang
A1 - Jiawei Han
A1 - Guozhu Dong
A1 - Benjamin Wah
A1 - Yixin Chen
JA - Multi-Dimensional Regression Analysis of Time-Series Data Streams
ER -
TY - JOUR
T1 - Online analytical processing stream data: is it feasible?
Y1 - 2002
A1 - Jian Pei
A1 - Benjamin Wah
A1 - Guozhu Dong
A1 - Yixin Chen
A1 - Jianyong Wang
A1 - Jiawei Han
ER -
TY - JOUR
T1 - Building behavior knowledge space to make classification decision
Y1 - 2001
A1 - Kotagiri Ramamohanarao
A1 - Guozhu Dong
A1 - Xiuzhen Zhang
ER -
TY - JOUR
T1 - Combining the strength of pattern frequency and distance for classification
Y1 - 2001
A1 - Jinyan Li
A1 - Guozhu Dong
A1 - Kotagiri Ramamohanarao
ER -
TY - JOUR
T1 - Efficient Computation of Iceberg Cubes with Complex Measure
Y1 - 2001
A1 - Jian Pei
A1 - Jiawei Han
A1 - Ke Wang
A1 - Guozhu Dong
ER -
TY - JOUR
T1 - Efficient Mining of Niches and Set Routines
Y1 - 2001
A1 - Kaustubh Deshpande
A1 - Guozhu Dong
ER -
TY - CHAP
T1 - Knowledge discovery in databases
Y1 - 2001
A1 - Guozhu Dong
ER -
TY - JOUR
T1 - Making Use of the Most Expressive Jumping Emerging Patterns for Classification
Y1 - 2001
A1 - Kotagiri Ramamohanarao
A1 - Guozhu Dong
A1 - Jinyan Li
ER -
TY - JOUR
T1 - Mining Multi-Dimensional Constrained Gradients in Data Cubes.
Y1 - 2001
A1 - Joyce Lam
A1 - Ke Wang
A1 - Jian Pei
A1 - Guozhu Dong
A1 - Jiawei Han
ER -
TY - JOUR
T1 - Query processing with an FPGA Coprocessor Board
Y1 - 2001
A1 - Jack Jean
A1 - Baifeng Zhang
A1 - Xinzhong Guo
A1 - Guozhu Dong
A1 - Hwa Zhang
ER -
TY - JOUR
T1 - Emerging Patterns and Classification
Y1 - 2000
A1 - Kotagiri Ramamohanarao
A1 - Jinyan Li
A1 - Guozhu Dong
ER -
TY - CONF
T1 - Exploring Constraints to Efficiently Mine Emerging Patterns from Large High-dimensional Datasets
T2 - Exploring Constraints to Efficiently Mine Emerging Patterns from Large High-dimensional Datasets
Y1 - 2000
A1 - Guozhu Dong
A1 - Kotagiri Ramamohanarao
A1 - Xiuzhen Zhang
JA - Exploring Constraints to Efficiently Mine Emerging Patterns from Large High-dimensional Datasets
ER -
TY - JOUR
T1 - Incremental Maintenance of Recursive Views Using Relational Calculus/SQL
Y1 - 2000
A1 - Guozhu Dong
A1 - Jianwen Su
ER -
TY - CONF
T1 - Information-based Classification by Aggregating Emerging Patterns
T2 - Information-based Classification by Aggregating Emerging Patterns
Y1 - 2000
A1 - Xiuzhen Zhang
A1 - Guozhu Dong
A1 - Kotagiri Ramamohanarao
JA - Information-based Classification by Aggregating Emerging Patterns
ER -
TY - CONF
T1 - Instance-based classification by emerging patterns
Y1 - 2000
A1 - Guozhu Dong
A1 - Jinyan Li
A1 - Kotagiri Ramamohanarao
ER -
TY - JOUR
T1 - Local Properties of Query Languages
Y1 - 2000
A1 - Guozhu Dong
A1 - Limsoon Wong
A1 - Leonid Libkin
ER -
TY - JOUR
T1 - Making Use of the Most Expressive Jumping Emerging Patterns for Classification
Y1 - 2000
A1 - Guozhu Dong
A1 - Jinyan Li
A1 - Kotagiri Ramamohanarao
ER -
TY - CONF
T1 - Instance-based classification by emerging patterns
T2 - Instance-based classification by emerging patterns
Y1 - 2000
A1 - Kotagiri Ramamohanarao
A1 - Guozhu Dong
A1 - Jinyan Li
JA - Instance-based classification by emerging patterns
ER -
TY - CONF
T1 - Optimization techniques for data intensive decision flows
T2 - Optimization techniques for data intensive decision flows
Y1 - 2000
A1 - Richard Hull
A1 - Bharat Kumar
A1 - Guozhu Dong
A1 - Francois Llirbat
A1 - Gang Zhou
A1 - Jianwen Su
JA - Optimization techniques for data intensive decision flows
ER -
TY - JOUR
T1 - Separating Auxiliary Arity Hierarchy of First-Order Incremental Evaluation Using (3+1)-ary Input Relations
Y1 - 2000
A1 - Guozhu Dong
A1 - Louxin Zhang
ER -
TY - JOUR
T1 - The Space of Jumping Emerging Patterns and Its Incremental Maintenance
Y1 - 2000
A1 - Jinyan Li
A1 - Kotagiri Ramamohanarao
A1 - Guozhu Dong
ER -
TY - JOUR
T1 - CAEP: Classification by Aggregating Emerging Patterns
Y1 - 1999
A1 - Limsoon Wong
A1 - Guozhu Dong
A1 - Xiuzhen Zhang
A1 - Jinyan Li
ER -
TY - CONF
T1 - Data Integration by Describing Sources with Constraint Databases
T2 - Data Integration by Describing Sources with Constraint Databases
Y1 - 1999
A1 - Jianwen Su
A1 - Guozhu Dong
A1 - Tzekwan Lau
A1 - Xun Cheng
JA - Data Integration by Describing Sources with Constraint Databases
ER -
TY - CONF
T1 - Decidability of First Order Logic Queries over Views
T2 - Decidability of First Order Logic Queries over Views
Y1 - 1999
A1 - Guozhu Dong
A1 - James Bailey
JA - Decidability of First Order Logic Queries over Views
ER -
TY - CONF
T1 - Declarative Workflows that Support Easy Modification and Dynamic Browsing
T2 - Declarative Workflows that Support Easy Modification and Dynamic Browsing
Y1 - 1999
A1 - Guozhu Dong
A1 - Richard Hull
A1 - Francois Llirbat
A1 - Gang Zhou
A1 - Jianwen Su
A1 - Eric Simon
A1 - Bharat Kumar
JA - Declarative Workflows that Support Easy Modification and Dynamic Browsing
ER -
TY - JOUR
T1 - Discovering Jumping Emerging Patterns and Experiments on Real Datasets
Y1 - 1999
A1 - Xiuzhen Zhang
A1 - Jinyan Li
A1 - Guozhu Dong
ER -
TY - CONF
T1 - Efficient Mining of Emerging Patterns: Discovering Trends and Differences
T2 - Efficient Mining of Emerging Patterns: Discovering Trends and Differences
Y1 - 1999
A1 - Jinyan Li
A1 - Guozhu Dong
JA - Efficient Mining of Emerging Patterns: Discovering Trends and Differences
ER -
TY - CONF
T1 - Efficient Mining of High Confidence Association Rules without Support Thresholds
T2 - Efficient Mining of High Confidence Association Rules without Support Thresholds
Y1 - 1999
A1 - Guozhu Dong
A1 - Jinyan Li
A1 - Qun Sun
A1 - Kotagiri Ramamohanarao
A1 - Xiuzhen Zhang
JA - Efficient Mining of High Confidence Association Rules without Support Thresholds
ER -
TY - CONF
T1 - Efficient Mining of Partial Periodic Patterns in Time Series Database
T2 - Efficient Mining of Partial Periodic Patterns in Time Series Database
Y1 - 1999
A1 - Jiawei Han
A1 - Guozhu Dong
A1 - Yiwen Yin
JA - Efficient Mining of Partial Periodic Patterns in Time Series Database
ER -
TY - CONF
T1 - Efficient support for decision flows in e-commerce applications
T2 - Efficient support for decision flows in e-commerce applications
Y1 - 1999
A1 - Gang Zhou
A1 - Jianwen Su
A1 - Richard Hull
A1 - Bharat Kumar
A1 - Guozhu Dong
A1 - Francois Llirbat
JA - Efficient support for decision flows in e-commerce applications
ER -
TY - JOUR
T1 - A Framework for Optimising Distributed Workflow Executions
Y1 - 1999
A1 - Gang Zhou
A1 - Bharat Kumar
A1 - Guozhu Dong
A1 - Richard Hull
A1 - Jianwen Su
ER -
TY - JOUR
T1 - Incremental Evaluation of Datalog Queries
Y1 - 1999
A1 - Guozhu Dong
A1 - Rodney Topor
ER -
TY - CONF
T1 - Incremental FO(+,<) Maintenance of All-pairs Shortest Paths for Undirected Graphs After Insertions and Deletions
T2 - Incremental FO(+,<) Maintenance of All-pairs Shortest Paths for Undirected Graphs After Insertions and Deletions
Y1 - 1999
A1 - Kotagiri Ramamohanarao
A1 - C. Pang
A1 - Guozhu Dong
JA - Incremental FO(+,<) Maintenance of All-pairs Shortest Paths for Undirected Graphs After Insertions and Deletions
ER -
TY - JOUR
T1 - Incremental Maintenance of Recursive Views: A Survey
Y1 - 1999
A1 - Guozhu Dong
ER -
TY - JOUR
T1 - Maintaining Transitive Closure of Graphs in SQL
Y1 - 1999
A1 - Leonid Libkin
A1 - Jianwen Su
A1 - Guozhu Dong
A1 - Limsoon Wong
ER -
TY - JOUR
T1 - Using CAEP to Predict Translation Initiation Sites from Genomic DNA Sequences
Y1 - 1999
A1 - Xiuzhen Zhang
A1 - Limsoon Wong
A1 - Guozhu Dong
ER -
TY - JOUR
T1 - Bounds for First-Order Incremental Evaluation and Definition of Polynomial Time Database Queries
Y1 - 1998
A1 - Jianwen Su
A1 - Guozhu Dong
ER -
TY - JOUR
T1 - Decidability and Undecidability Results for the Termination Problem of Active Database Rules
Y1 - 1998
A1 - James Bailey
A1 - Kotagiri Ramamohanarao
A1 - Guozhu Dong
ER -
TY - JOUR
T1 - Efficient Incremental View Maintenance in Distributed Databases by Tagging
Y1 - 1998
A1 - Guozhu Dong
A1 - M. Mohania
A1 - X. Wang
A1 - James Bailey
ER -
TY - JOUR
T1 - Interestingness of Discovered Association Rules in terms of Neighborhood-Based Unexpectedness
Y1 - 1998
A1 - Jinyan Li
A1 - Guozhu Dong
ER -
TY - JOUR
T1 - Relational expressive power of constraint query languages
Y1 - 1998
A1 - Michael Benedikt
A1 - Leonid Libkin
A1 - Guozhu Dong
A1 - Limsoon Wong
ER -
TY - JOUR
T1 - Deterministic FOIES are Strictly Weaker
Y1 - 1997
A1 - Jianwen Su
A1 - Guozhu Dong
ER -
TY - JOUR
T1 - First-order maintenance of transitive closure after node-set and edge-set deletions
Y1 - 1997
A1 - Guozhu Dong
A1 - C. Pang
ER -
TY - JOUR
T1 - Local properties of query languages
Y1 - 1997
A1 - Limsoon Wong
A1 - Leonid Libkin
A1 - Guozhu Dong
ER -
TY - JOUR
T1 - Maintaining constrained transitive closure by conjunctive queries
Y1 - 1997
A1 - Kotagiri Ramamohanarao
A1 - Guozhu Dong
ER -
TY - JOUR
T1 - Some Relationships between FOIES and Sigma^1_1 Arity Hierarchies
Y1 - 1997
A1 - Guozhu Dong
A1 - Limsoon Wong
ER -
TY - JOUR
T1 - Structural issues in active rule systems
Y1 - 1997
A1 - Kotagiri Ramamohanarao
A1 - James Bailey
A1 - Guozhu Dong
ER -
TY - JOUR
T1 - Algorithms for adapting materialised views in data warehouses
Y1 - 1996
A1 - M. Mohania
A1 - Guozhu Dong
ER -
TY - JOUR
T1 - Conjunctive query containment with respect to views and constraints
Y1 - 1996
A1 - Guozhu Dong
A1 - Jianwen Su
ER -
TY - JOUR
T1 - Relational Expressive Power of Constraint Query Languages
JF - Journal of the ACM
Y1 - 1996
A1 - Michael Benedikt
A1 - Guozhu Dong
A1 - Leonid Libkin
A1 - Limsoon Wong
AB - The expressive power of first-order query languages with several classes of equality and inequality constraints is studied in this paper. We settle the conjecture that recursive queries such as parity test and transitive closure cannot be expressed in the relational calculus augmented with polynomial inequality constraints over the reals. Furthermore, noting that relational queries exhibit several forms of genericity, we establish a number of collapse results of the following form: The class of generic boolean queries expressible in the relational calculus augmented with a given class of constraints coincides with the class of queries expressible in the relational calculus (with or without an order relation). We prove such results for both the natural and active-domain semantics. As a consequence, the relational calculus augmented with polynomial inequalities expresses the same classes of generic boolean queries under both the natural and active-domain semantics. In the course of proving these results for the active-domain semantics, we establish Ramsey-type theorems saying that any query involving certain kinds of constraints coincides with a constraint free query on databases whose elements come from a certain infinite subset of the domain. To prove the collapse results for the natural semantics, we make use of techniques from nonstandard analysis and from the model theory of ordered structures.
ER -
TY - JOUR
T1 - On decompositions of chain datalog programs into P (left)-linear 1-rule components
Y1 - 1995
A1 - Seymour Ginsburg
A1 - Guozhu Dong
ER -
TY - CONF
T1 - On impossibility of decremental recomputation of recursive queries in relational calculus and SQL
Y1 - 1995
A1 - Guozhu Dong
A1 - Leonid Libkin
A1 - Limsoon Wong
PB - Fifth International Database Programming Languages Workshop
ER -
TY - JOUR
T1 - Incremental and Decremental Evaluation of Transitive Closure by First-Order Queries
Y1 - 1995
A1 - Guozhu Dong
A1 - Jianwen Su
ER -
TY - CONF
T1 - Incremental boundedness and nonrecursive incremental evaluation of datalog queries (extended abstract)
T2 - Incremental boundedness and nonrecursive incremental evaluation of datalog queries (extended abstract)
Y1 - 1995
A1 - Jianwen Su
A1 - Guozhu Dong
JA - Incremental boundedness and nonrecursive incremental evaluation of datalog queries (extended abstract)
ER -
TY - JOUR
T1 - On the index of positive programmed formal languages
Y1 - 1995
A1 - Guozhu Dong
ER -
TY - JOUR
T1 - Nonrecursive Incremental Evaluation of Datalog Queries
Y1 - 1995
A1 - Guozhu Dong
A1 - Jianwen Su
A1 - Rodney Topor
ER -
TY - CONF
T1 - Space bounded FOIES
T2 - Space bounded FOIES
Y1 - 1995
A1 - Guozhu Dong
A1 - Jianwen Su
JA - Space bounded FOIES
ER -
TY - JOUR
T1 - A framework for object migration in object-oriented databases
Y1 - 1994
A1 - Qing Li
A1 - Guozhu Dong
ER -
TY - BOOK
T1 - Discussion Report: Object Migration and Classification
Y1 - 1993
A1 - K. Davis
A1 - A. Heuer
A1 - Guozhu Dong
ER -
TY - JOUR
T1 - First-order incremental evaluation of datalog queries
Y1 - 1993
A1 - Jianwen Su
A1 - Guozhu Dong
ER -
TY - JOUR
T1 - On the monotonicity of (LDL) logic programs with sets
Y1 - 1993
A1 - Guozhu Dong
ER -
TY - JOUR
T1 - Datalog expressiveness of chain queries: Grammar tools and characterisations
Y1 - 1992
A1 - Guozhu Dong
ER -
TY - JOUR
T1 - Incremental evaluation of datalog queries
Y1 - 1992
A1 - Rodney Topor
A1 - Guozhu Dong
ER -
TY - JOUR
T1 - Object migration in object-oriented databases
Y1 - 1992
A1 - Qing Li
A1 - Guozhu Dong
ER -
TY - JOUR
T1 - Representation and translation of queries in heterogeneous databases with semantic discrepancies
Y1 - 1992
A1 - Guozhu Dong
A1 - Kotagiri Ramamohanarao
ER -
TY - JOUR
T1 - On Datalog Linearisation of Chain Queries
Y1 - 1991
A1 - Guozhu Dong
ER -
TY - JOUR
T1 - Localizable constraints for object histories
Y1 - 1991
A1 - Guozhu Dong
A1 - Seymour Ginsburg
ER -
TY - CONF
T1 - Object behaviors and scripts
Y1 - 1991
A1 - Jianwen Su
A1 - Guozhu Dong
PB - 3rd Internat'l Workshop on Database Programming Languages
ER -
TY - JOUR
T1 - On the decomposition of datalog program mappings
Y1 - 1990
A1 - Seymour Ginsburg
A1 - Guozhu Dong
ER -
TY - CONF
T1 - Inference of cyclicity and acyclicity constraints among recursively typed objects with identifiers
Y1 - 1990
A1 - Guozhu Dong
PB - Far-East Workshop on Future Database Systems
ER -
TY - CONF
T1 - On distributed processibility of datalog queries by decomposing databases
T2 - On distributed processibility of datalog queries by decomposing databases
Y1 - 1989
A1 - Guozhu Dong
JA - On distributed processibility of datalog queries by decomposing databases
ER -
TY - JOUR
T1 - On the composition and decomposition of datalog program mappings
Y1 - 1988
A1 - Guozhu Dong
ER -
TY - CONF
T1 - Localizable Constraints for Object Histories
Y1 - 1986
A1 - Guozhu Dong
A1 - Seymour Ginsburg
ER -