Selective Wander Join: Fast Progressive Visualizations for Data Joins
Name:
informatics-06-00014.pdf
Size:
464.9Kb
Format:
PDF
Description:
Final Published Version
Publisher
MDPI
Citation
Procopio, M., Scheidegger, C., Wu, E., & Chang, R. (2019, March). Selective Wander Join: Fast Progressive Visualizations for Data Joins. In Informatics (Vol. 6, No. 1, p. 14). Multidisciplinary Digital Publishing Institute.
Journal
INFORMATICS-BASEL
Rights
© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
Collection Information
This item from the UA Faculty Publications collection is made available by the University of Arizona with support from the University of Arizona Libraries. If you have questions, please contact us at repository@u.library.arizona.edu.
Abstract
Progressive visualization offers a great deal of promise for big data visualization; however, current progressive visualization systems do not allow for continuous interaction. What if users want to see more confident results on a subset of the visualization? This can happen when users are in exploratory analysis mode but want to ask some directed questions of the data as well. In a progressive visualization system, the online aggregation algorithm determines the database sampling rate and resulting convergence rate, not the user. In this paper, we extend a recent method in online aggregation, called Wander Join, that is optimized for queries that join tables, one of the most computationally expensive operations. This extension leverages importance sampling to enable user-driven sampling when data joins are in the query. We applied user interaction techniques that allow the user to view and adjust the convergence rate, providing more transparency and control over the online aggregation process. By leveraging importance sampling, our extension of Wander Join also allows for stratified sampling of groups when there is data distribution skew. We also improve the convergence rate of filtering queries, but with additional overhead costs not needed in the original Wander Join algorithm.
Note
Open access journal
ISSN
2227-9709
Version
Final published version
Sponsors
National Science Foundation (NSF) [1527765, 1564049, 1513651, 1452977]; United States Department of Defense, Defense Advanced Research Projects Agency (DARPA) [FA8750-17-2-0107]
DOI
10.3390/informatics6010014