%0 Conference Proceedings
%B 25th International World Wide Web Conference (WWW 2016)
%D 2016
%T Tweet Properly: Analyzing Deleted Tweets to Understand and Identify Regrettable Ones
%A Lu Zhou
%A Wenbo Wang
%A Keke Chen
%X Inappropriate tweets can cause severe damages on authors’ reputation or privacy. However, many users do not realize the negative consequences until they publish these tweets. Published tweets have lasting effects that may not be eliminated by simple deletion because other users may have read them or third-party tweet analysis platforms have cached them. Regrettable tweets, i.e., tweets with identifiable regrettable contents, cause the most damage on their authors because other users can easily notice them. In this paper, we study how to identify the regrettable tweets published by normal individual users via the contents and users’ historical deletion patterns. We identify normal individual users based on their publishing, deleting, followers and friends statistics. We manually examine a set of randomly sampled deleted tweets from these users to identify regrettable tweets and understand the corresponding regrettable reasons. By applying content-based features and personalized history-based features, we develop
%I ACM
%C Montreal, Canada
%P 603-612
%8 04/2016
%G eng
%0 Conference Proceedings
%B 2015 IEEE 8th International Conference on Cloud Computing
%D 2015
%T Scalable Euclidean Embedding for Big Data
%A Zohreh Alavi
%A Sagar Sharma
%A Lu Zhou
%A Keke Chen
%K Algorithm design and analysis
%K Approximation algorithms
%K arbitrary metric space
%K Big Data
%K Big data scale
%K Complexity theory
%K data reduction
%K data visualisation
%K data visualization
%K Euclidean embedding algorithms
%K Euclidean space
%K FastMap-MR algorithm
%K LMDS-MR algorithm
%K massive data parallel infrastructure
%K Measurement
%K parallel algorithms
%K parallel processing
%K Scalability
%K scalable Euclidean embedding algorithm
%K visualization technique
%X Euclidean embedding algorithms transform data defined in an arbitrary metric space to the Euclidean space, which is critical to many visualization techniques. At big-data scale, these algorithms need to be scalable to massive data-parallel infrastructures. Designing such scalable algorithms and understanding the factors affecting the algorithms are important research problems for visually analyzing big data. We propose a framework that extends the existing Euclidean embedding algorithms to scalable ones. Specifically, it decomposes an existing algorithm into naturally parallel components and non-parallelizable components. Then, data-parallel implementations such as MapReduce and data reduction techniques are applied to the two categories of components, respectively. We show that this can be done for a collection of embedding algorithms. Extensive experiments are conducted to understand the important factors in these scalable algorithms: scalability, time cost, and the effect of data reduction on result quality. The results on two sample algorithms, FastMap-MR and LMDS-MR, show that with the proposed approach the derived algorithms can preserve result quality well while achieving desirable scalability.
%I IEEE
%C New York City, NY
%P 773-780
%8 07/2015
%G eng
%M 15399748
%R 10.1109/CLOUD.2015.107