    • The Effect of Data Transformations on Scalar Field Topological Analysis of High-Order FEM Solutions

      Jallepalli, Ashok; Levine, Joshua A; Kirby, Robert M; Univ Arizona, Dept Comp Sci (IEEE COMPUTER SOC, 2020-01-01)
      High-order finite element methods (HO-FEM) are gaining popularity in the simulation community due to their success in solving complex flow dynamics. There is an increasing need to analyze the data produced as output by these simulations. Simultaneously, topological analysis tools are emerging as powerful methods for investigating simulation data. However, most current approaches to topological analysis have had limited application to HO-FEM simulation data for two reasons. First, current topological tools are designed for linear data (polynomial degree one), but the polynomial degree of the data output by these simulations is typically higher (routinely up to polynomial degree six). Second, the simulation data and quantities derived from it have discontinuities at element boundaries, and these discontinuities do not match the input requirements of the topological tools. One solution to both issues is to transform the high-order data into low-order, continuous inputs for topological analysis. Nevertheless, there has been little work evaluating the possible transformation choices and their downstream effect on the topological analysis. We perform an empirical study to evaluate two commonly used data transformation methodologies along with the recently introduced L-SIAC filter for processing high-order simulation data. Our results show that diverse behaviors are possible. We offer some guidance on how best to construct a pipeline for topological analysis of HO-FEM simulations with the currently available implementations of topological analysis.
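
      A minimal sketch of the kind of transformation evaluated above, under assumed inputs: a 1D mesh whose elements each carry a degree-six polynomial (the mesh, coefficients, and sampling density are all hypothetical). The high-order, element-wise field is resampled onto shared grid points, with averaging across element boundaries, to produce the continuous, piecewise-linear (degree-one) input that current topological tools expect.

      ```python
      import numpy as np

      # Hypothetical setup: K elements on [0, 1], each carrying monomial
      # coefficients of a degree-p polynomial in a local coordinate xi in
      # [-1, 1]. Illustrates one "low-order transformation": sample the
      # (possibly discontinuous) high-order field on a shared grid and
      # average at element boundaries to obtain continuous, piecewise-linear
      # input for a topological analysis tool.
      K, p = 4, 6
      rng = np.random.default_rng(0)
      coeffs = rng.normal(size=(K, p + 1))     # coeffs[k][d] multiplies xi**d
      edges = np.linspace(0.0, 1.0, K + 1)     # element boundaries

      def eval_element(k, x):
          """Evaluate element k's polynomial at global coordinate(s) x."""
          xi = 2.0 * (x - edges[k]) / (edges[k + 1] - edges[k]) - 1.0
          return np.polyval(coeffs[k][::-1], xi)   # polyval wants high degree first

      n = 8                                    # samples per element
      grid, values = [], []
      for k in range(K):
          xs = np.linspace(edges[k], edges[k + 1], n + 1)
          vs = eval_element(k, xs)
          if k > 0:                            # enforce C0 continuity at shared node
              values[-1] = 0.5 * (values[-1] + vs[0])
              xs, vs = xs[1:], vs[1:]
          grid.extend(xs)
          values.extend(vs)

      # (grid, values) is now a continuous piecewise-linear field -- the
      # degree-one input format current topological tools are designed for.
      ```
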
    • Evaluating Cartogram Effectiveness

      Nusrat, Sabrina; Alam, Md. Jawaherul; Kobourov, Stephen; Univ Arizona (IEEE COMPUTER SOC, 2018-02)
      Cartograms are maps in which areas of geographic regions, such as countries and states, appear in proportion to some variable of interest, such as population or income. Cartograms are popular visualizations for geo-referenced data that have been used for over a century to illustrate patterns and trends in the world around us. Despite the popularity of cartograms, and the large number of cartogram types, there are few studies evaluating the effectiveness of cartograms in conveying information. Based on a recent task taxonomy for cartograms, we evaluate four major types of cartograms: contiguous, non-contiguous, rectangular, and Dorling cartograms. We first evaluate the effectiveness of these cartogram types by quantitative performance analysis (time and error). Second, we collect qualitative data with an attitude study and by analyzing subjective preferences. Third, we compare the quantitative and qualitative results with the results of a metrics-based cartogram evaluation. Fourth, we analyze the results of our study in the context of cartography, geography, visual perception, and demography. Finally, we consider implications for design and possible improvements.
    • Event-Based Dynamic Graph Visualisation

      Simonetto, Paolo; Archambault, Daniel; Kobourov, Stephen; Univ Arizona (IEEE COMPUTER SOC, 2020-07)
      Dynamic graph drawing algorithms take as input a series of timeslices that standard, force-directed algorithms can exploit to compute a layout. However, dynamic graphs are often expressed as a series of events in which nodes and edges have real coordinates along the time dimension, not confined to discrete timeslices. Current techniques for dynamic graph drawing impose a set of timeslices on this event-based data in order to draw the dynamic graph, but it is unclear how many timeslices should be selected: too many timeslices slow the computation of the layout, while too few obscure important temporal features, such as causality. To address these limitations, we introduce a novel model for drawing event-based dynamic graphs and the first dynamic graph drawing algorithm, DynNoSlice, capable of drawing dynamic graphs in this model. DynNoSlice is an offline, force-directed algorithm that draws event-based dynamic graphs in the space-time cube (2D+time). We also present a method to extract representative small multiples from the space-time cube. To demonstrate the advantages of our approach, DynNoSlice is compared with state-of-the-art timeslicing methods using a metrics-based experiment. Finally, we present case studies of event-based dynamic data visualised with the new model and algorithm.
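
      A minimal sketch of the event-based representation the model is built on, not the DynNoSlice implementation itself: each node is a polyline of (t, x, y) bends in the space-time cube, so its drawn position is defined at any real-valued time rather than only at discrete timeslices. The NodeTrajectory class and its methods are illustrative names.

      ```python
      import bisect
      from dataclasses import dataclass, field

      @dataclass
      class NodeTrajectory:
          """A node's polyline in the space-time cube: sorted (t, x, y) bends."""
          bends: list = field(default_factory=list)

          def add_bend(self, t, x, y):
              bisect.insort(self.bends, (t, x, y))

          def position_at(self, t):
              """Linearly interpolate the node's 2D position at time t."""
              ts = [b[0] for b in self.bends]
              i = bisect.bisect_left(ts, t)
              if i == 0:
                  return self.bends[0][1:]
              if i == len(self.bends):
                  return self.bends[-1][1:]
              (t0, x0, y0), (t1, x1, y1) = self.bends[i - 1], self.bends[i]
              a = (t - t0) / (t1 - t0)
              return (x0 + a * (x1 - x0), y0 + a * (y1 - y0))

      # A force-directed layout in this model perturbs the bends themselves:
      traj = NodeTrajectory()
      traj.add_bend(0.0, 0.0, 0.0)
      traj.add_bend(2.5, 4.0, 1.0)     # an event at t=2.5 moved the node
      print(traj.position_at(1.0))     # -> (1.6, 0.4)
      ```
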
    • Extracting Inter-Sentence Relations for Associating Biological Context with Events in Biomedical Texts

      Noriega-Atala, Enrique; Hein, Paul D; Thumsi, Shraddha S; Wong, Zechy; Wang, Xia; Hendryx, Sean M; Morrison, Clayton T; Univ Arizona, Sch Informat; Univ Arizona, Dept Comp Sci; Univ Arizona, Dept Linguist; et al. (IEEE COMPUTER SOC, 2020-12-08)
      We present an analysis of the problem of identifying biological context and associating it with biochemical events described in biomedical texts. This constitutes a non-trivial, inter-sentential relation extraction task. We focus on biological context as descriptions of the species, tissue type, and cell type that are associated with biochemical events. We present a new corpus of open access biomedical texts that have been annotated by biology subject matter experts to highlight context-event relations. Using this corpus, we evaluate several classifiers for context-event association along with a detailed analysis of the impact of a variety of linguistic features on classifier performance. We find that gradient tree boosting performs by far the best, achieving an F1 of 0.865 in a cross-validation study.
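
      A hedged sketch of the evaluation setup described above, using scikit-learn's gradient tree boosting and 5-fold cross-validated F1. The synthetic features stand in for the paper's linguistic features (e.g., sentence distance between a context mention and an event), and none of the numbers here reproduce the reported 0.865.

      ```python
      from sklearn.datasets import make_classification
      from sklearn.ensemble import GradientBoostingClassifier
      from sklearn.model_selection import cross_val_score

      # Synthetic stand-in for the annotated corpus: each row is a candidate
      # context-event pair, each label says whether they are truly associated.
      X, y = make_classification(n_samples=2000, n_features=20,
                                 weights=[0.8, 0.2], random_state=0)

      clf = GradientBoostingClassifier(random_state=0)   # gradient tree boosting
      scores = cross_val_score(clf, X, y, cv=5, scoring="f1")
      print(f"mean F1 across folds: {scores.mean():.3f}")
      ```
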
    • Full-Duplex or Half-Duplex: A Bayesian Game for Wireless Networks with Heterogeneous Self-Interference Cancellation Capabilities

      Afifi, Wessam; Abdel-Rahman, Mohammad J.; Krunz, Marwan; MacKenzie, Allen B.; Univ Arizona, Dept Elect & Comp Engn (IEEE COMPUTER SOC, 2018-05)
      Recently, tremendous progress has been made in self-interference cancellation (SIC) techniques that enable a wireless device to transmit and receive data simultaneously on the same frequency channel, a.k.a. in-band full-duplex (FD). Although operating in FD mode significantly improves the throughput of a single wireless link, it doubles the number of concurrent transmissions, which limits the potential for coexistence between multiple FD-enabled links. In this paper, we consider the coexistence problem of concurrent transmissions between multiple FD-enabled links with different SIC capabilities; each link can operate in either FD or half-duplex mode. First, we consider two links and formulate the interactions between them as a Bayesian game. In this game, each link tries to maximize its throughput while minimizing the transmission power cost. We derive a closed-form expression for the Bayesian Nash equilibrium and determine the conditions under which no outage occurs at either link. Then, we study the coexistence problem between more than two links, assuming that each link is only affected by its dominant interfering link. We show that under this assumption, no more than two links will be involved in a single game. Finally, we corroborate our analytical findings via extensive simulations and numerical results.
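
      A small illustrative computation, not the paper's model: two links, each privately typed by SIC quality, choose FD or HD; enumerating pure strategies (type to mode) and checking mutual best responses finds the Bayesian Nash equilibria. All payoff numbers below (rates, interference, power costs) are assumptions for the sketch; the paper instead derives a closed-form equilibrium for its own utility model.

      ```python
      import itertools

      TYPES = {"strong_SIC": 0.9, "weak_SIC": 0.5}   # residual-SIC efficiency
      PRIOR = {"strong_SIC": 0.5, "weak_SIC": 0.5}   # common prior over types
      POWER_COST = {"HD": 0.3, "FD": 0.6}            # FD transmits both directions
      CROSS_INTERF = {"HD": 0.2, "FD": 0.4}          # loss inflicted by other link

      def payoff(my_mode, my_type, other_mode):
          rate = 2.0 * TYPES[my_type] if my_mode == "FD" else 1.0
          return rate - CROSS_INTERF[other_mode] - POWER_COST[my_mode]

      def expected_payoff(my_mode, my_type, other_strategy):
          return sum(PRIOR[t] * payoff(my_mode, my_type, other_strategy[t])
                     for t in TYPES)

      # A pure strategy maps a link's private type to a duplex mode.
      strategies = [dict(zip(TYPES, modes))
                    for modes in itertools.product(["HD", "FD"], repeat=len(TYPES))]

      def is_best_response(s_mine, s_other):
          return all(expected_payoff(s_mine[t], t, s_other) >=
                     max(expected_payoff(m, t, s_other) for m in ("HD", "FD"))
                     for t in TYPES)

      for s1, s2 in itertools.product(strategies, repeat=2):
          if is_best_response(s1, s2) and is_best_response(s2, s1):
              print("BNE:", s1, "|", s2)
      ```

      With these assumed numbers, FD dominates for a strong-SIC type and HD for a weak-SIC type, so the sketch prints a single equilibrium in which each link's mode is dictated by its private SIC capability.
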
    • Learning parameter-advising sets for multiple sequence alignment

      DeBlasio, Dan; Kececioglu, John; Computational Biology Department, Carnegie Mellon University; Department of Computer Science, The University of Arizona (IEEE COMPUTER SOC, 2017)
      While the multiple sequence alignment output by an aligner strongly depends on the parameter values used for the alignment scoring function (such as the choice of gap penalties and substitution scores), most users rely on the single default parameter setting provided by the aligner. A different parameter setting, however, might yield a much higher-quality alignment for the specific set of input sequences. The problem of picking a good choice of parameter values for specific input sequences is called parameter advising. A parameter advisor has two ingredients: (i) a set of parameter choices to select from, and (ii) an estimator that provides an estimate of the accuracy of the alignment computed by the aligner using a parameter choice. The parameter advisor picks the parameter choice from the set whose resulting alignment has highest estimated accuracy. We consider for the first time the problem of learning the optimal set of parameter choices for a parameter advisor that uses a given accuracy estimator. The optimal set is one that maximizes the expected true accuracy of the resulting parameter advisor, averaged over a collection of training data. While we prove that learning an optimal set for an advisor is NP-complete, we show there is a natural approximation algorithm for this problem, and prove a tight bound on its approximation ratio. Experiments with an implementation of this approximation algorithm on biological benchmarks, using various accuracy estimators from the literature, show it finds sets for advisors that are surprisingly close to optimal. Furthermore, the resulting parameter advisors are significantly more accurate in practice than simply aligning with a single default parameter choice.
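
      A sketch of the advising setup on synthetic data, with a simple greedy heuristic standing in for the paper's approximation algorithm (whose exact rule and ratio are not reproduced here): the advisor aligns each benchmark with every choice in its set and keeps the one whose estimated accuracy is highest, while the set itself is grown to maximize average true accuracy over the training benchmarks.

      ```python
      import numpy as np

      # Synthetic stand-ins: est[b, p] is the estimator's accuracy estimate for
      # benchmark b aligned with parameter choice p; true[b, p] is the true
      # accuracy. A real study would use benchmarks with reference alignments.
      rng = np.random.default_rng(1)
      n_bench, n_params = 50, 12
      true = rng.uniform(size=(n_bench, n_params))
      est = np.clip(true + rng.normal(scale=0.1, size=true.shape), 0, 1)

      def advisor_accuracy(param_set):
          """Average true accuracy when the advisor picks, per benchmark, the
          choice in the set with the highest *estimated* accuracy."""
          s = list(param_set)
          picks = np.argmax(est[:, s], axis=1)
          return true[np.arange(n_bench), np.take(s, picks)].mean()

      def greedy_set(cardinality):
          """Greedily grow the advisor set by the choice that most improves
          average true accuracy on the training data."""
          chosen = set()
          while len(chosen) < cardinality:
              best = max(set(range(n_params)) - chosen,
                         key=lambda p: advisor_accuracy(chosen | {p}))
              chosen.add(best)
          return chosen

      for k in (1, 3, 5):
          print(k, round(advisor_accuracy(greedy_set(k)), 3))
      ```
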
    • Lossless Multi-component Image Compression Based on Integer Wavelet Coefficient Prediction using Convolutional Neural Networks

      Ahanonu, Eze; Marcellin, Michael; Bilgin, Ali; Univ Arizona, Dept Elect & Comp Engn; Univ Arizona, Dept Biomed Engn (IEEE COMPUTER SOC, 2020-06-02)
    • Moving Beyond Readability Metrics for Health-Related Text Simplification

      Kauchak, David; Leroy, Gondy; Univ Arizona, Eller Coll Management, Management Informat Syst (IEEE COMPUTER SOC, 2016)
      Limited health literacy is a barrier to understanding health information. Simplifying text can reduce this barrier and possibly other known disparities in health. Unfortunately, few tools exist to simplify text with demonstrated impact on comprehension. By leveraging modern data sources integrated with natural language processing algorithms, we are developing the first semi-automated text simplification tool. We present two main contributions. First, we introduce our evidence-based development strategy for designing effective text simplification software and summarize initial, promising results. Second, we present a new study examining existing readability formulas, which are the most commonly used tools for text simplification in healthcare. We compare syllable count, the proxy for word difficulty used by most readability formulas, with our new metric 'term familiarity' and find that syllable count measures how difficult words 'appear' to be, but not their actual difficulty. In contrast, term familiarity can be used to measure actual difficulty.
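
      A minimal sketch contrasting the two word-difficulty proxies discussed above: a vowel-group syllable counter (the proxy used by readability formulas) versus term familiarity estimated from relative corpus frequency. The heuristic and the toy corpus are stand-ins, not the authors' implementation.

      ```python
      import re
      from collections import Counter

      def syllable_count(word):
          """Rough heuristic: count groups of consecutive vowels."""
          return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

      corpus = "the heart attack was treated the heart was weak".split()
      freq = Counter(corpus)

      def term_familiarity(word, corpus_size=len(corpus)):
          """Higher relative corpus frequency -> more familiar word."""
          return freq[word.lower()] / corpus_size

      for w in ["heart", "myocardial", "infarction"]:
          print(w, syllable_count(w), term_familiarity(w))
      # Syllables measure how difficult a word *appears*: rare medical jargon
      # can score as "easy" by syllables, while a frequency-based familiarity
      # measure flags it as unfamiliar (frequency 0 in this toy corpus).
      ```
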
    • Node-Link or Adjacency Matrices: Old Question, New Insights

      Okoe, Mershack; Jianu, Radu; Kobourov, Stephen; Univ Arizona (IEEE COMPUTER SOC, 2019-10)
      Visualizing network data is useful in domains such as biology, engineering, and the social sciences. We report the results of a study comparing the effectiveness of the two primary techniques for showing network data: node-link diagrams and adjacency matrices. Specifically, an evaluation with a large number of online participants revealed statistically significant differences between the two visualizations. Our work adds to existing research in several ways. First, we explore a broad spectrum of network tasks, many of which had not been previously evaluated. Second, our study uses two large datasets, typical of many real-life networks not explored by previous studies. Third, we leverage crowdsourcing to evaluate many tasks with many participants. This paper is an expanded journal version of a Graph Drawing (GD'17) conference paper. We evaluated a second dataset, added a qualitative feedback section, and expanded the procedure, results, discussion, and limitations sections.
    • Preserving Command Line Workflow for a Package Management System Using ASCII DAG Visualization

      Isaacs, Katherine E; Gamblin, Todd; Univ Arizona, Dept Comp Sci (IEEE COMPUTER SOC, 2019-09)
      Package managers provide ease of access to applications by removing the time-consuming and sometimes completely prohibitive barrier of successfully building, installing, and maintaining the software for a system. A package dependency graph encodes the dependencies among all packages required to build and run the target software. Package management system developers, package maintainers, and users may consult the dependency graph when a simple listing is insufficient for their analyses. However, users working in a remote command line environment must disrupt their workflow to visualize dependency graphs in graphical programs, possibly needing to move files between devices or incur forwarding lag. Such is the case for users of Spack, an open source package management system originally developed to ease the complex builds required by supercomputing environments. To preserve the command line workflow of Spack, we develop an interactive ASCII visualization for its dependency graphs. Through interviews with Spack maintainers, we identify user goals and corresponding visual tasks for dependency graphs. We evaluate the use of our visualization through a command-line-centered study, comparing it to the system's two existing approaches. We observe that despite the limitations of the ASCII representation, our visualization is preferred by participants when approached from a command line interface workflow.
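
      A toy illustration of the general idea, not Spack's algorithm: topologically order a small dependency DAG and print it as indented ASCII directly in the terminal. The package names are hypothetical.

      ```python
      from graphlib import TopologicalSorter   # Python 3.9+

      deps = {                  # package -> set of packages it depends on
          "app":  {"libA", "libB"},
          "libA": {"libC"},
          "libB": {"libC"},
          "libC": set(),
      }

      # static_order() yields dependencies before their dependents, so
      # reversing it prints roots first, the way a user reads a build plan.
      order = list(TopologicalSorter(deps).static_order())
      for pkg in reversed(order):
          print(pkg)
          for d in sorted(deps[pkg]):
              print(" |--", d)
      ```
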
    • Secure Physical Layer Voting

      Ghose, Nirnimesh; Hu, Bocan; Zhang, Yan; Lazos, Loukas; Electrical and Computer Engineering, University of Arizona (IEEE COMPUTER SOC, 2018-03-01)
      Distributed wireless networks often employ voting to perform critical network functions such as fault-tolerant data fusion, cooperative sensing, and reaching consensus. Voting is implemented by sending messages to a fusion center or via direct message exchange between participants. However, the delay overhead of message-based voting can be prohibitive when numerous participants have to share the wireless channel in sequence, making it impractical for time-critical applications. In this paper, we propose a fast PHY-layer voting scheme called PHYVOS, which significantly reduces the delay for collecting and tallying votes. In PHYVOS, wireless devices transmit their votes simultaneously by exploiting the subcarrier orthogonality of OFDM and without explicit messaging. Votes are realized by injecting energy to pre-assigned subcarriers. We show that PHYVOS is secure against adversaries that attempt to manipulate the voting outcome. Security is achieved without employing cryptography-based authentication and message integrity schemes. We analytically evaluate the voting robustness as a function of PHY-layer parameters. We extend PHYVOS to operate in ad hoc groups, without the assistance of a fusion center. We discuss practical implementation challenges related to multi-device frequency and time synchronization and present a prototype implementation of PHYVOS on the USRP platform. We complement the implementation with larger-scale simulations.
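
      A simulation sketch of the core mechanism under assumed parameters: each of N devices owns two pre-assigned OFDM subcarriers ("no" and "yes") and votes by injecting energy into one of them; the tallier thresholds per-subcarrier received energy, collecting all votes in a single slot instead of N sequential messages. The noise model and threshold are illustrative, and none of PHYVOS's security mechanisms are modeled here.

      ```python
      import numpy as np

      rng = np.random.default_rng(7)
      N = 8
      votes = rng.integers(0, 2, size=N)           # 1 = yes, 0 = no

      # Device i owns subcarriers 2i ("no") and 2i+1 ("yes").
      n_subcarriers = 2 * N
      tx_energy = np.zeros(n_subcarriers)
      for i, v in enumerate(votes):
          tx_energy[2 * i + v] = 1.0               # inject energy on the vote's tone

      # Assumed channel: injected energy plus a small Rayleigh noise floor.
      rx_energy = tx_energy + rng.rayleigh(0.05, n_subcarriers)
      detected = rx_energy > 0.5                   # per-subcarrier energy detector

      yes = int(detected[1::2].sum())
      no = int(detected[0::2].sum())
      print(f"tally: yes={yes}, no={no}, ground truth: {int(votes.sum())} yes")
      ```
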
    • The Topology ToolKit

      Tierny, Julien; Favelier, Guillaume; Levine, Joshua A.; Gueunet, Charles; Michaux, Michael; Univ Arizona (IEEE COMPUTER SOC, 2018-01)
      This system paper presents the Topology ToolKit (TTK), a software platform designed for the topological analysis of scalar data in scientific visualization. While topological data analysis has gained in popularity over the last two decades, it has not yet been widely adopted as a standard data analysis tool for end users or developers. TTK aims to address this problem by providing a unified, generic, efficient, and robust implementation of key algorithms for the topological analysis of scalar data, including: critical points, integral lines, persistence diagrams, persistence curves, merge trees, contour trees, Morse-Smale complexes, fiber surfaces, continuous scatterplots, Jacobi sets, Reeb spaces, and more. TTK is easily accessible to end users thanks to a tight integration with ParaView. It is also easily accessible to developers through a variety of bindings (Python, VTK/C++) for fast prototyping, or through direct, dependency-free C++ to ease integration into pre-existing complex systems. While developing TTK, we faced several algorithmic and software engineering challenges, which we document in this paper. In particular, we present an algorithm for the construction of a discrete gradient that complies with the critical points extracted in the piecewise-linear setting. This algorithm guarantees combinatorial consistency across the topological abstractions supported by TTK and, importantly, a unified implementation of topological data simplification for multi-scale exploration and analysis. We also present a cached triangulation data structure that supports time-efficient, generic traversals, self-adjusts its memory usage on demand for input simplicial meshes, and implicitly emulates a triangulation for regular grids with no memory overhead. Finally, we describe an original software architecture that guarantees memory-efficient, direct access to TTK features while still offering researchers powerful and easy bindings and extensions. TTK is open source (BSD license), and its code, online documentation, and video tutorials are available on TTK's website [108].
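
      To make one of the listed abstractions concrete, here is the textbook union-find sweep computing a 0-dimensional persistence diagram of a 1D scalar field. This is a self-contained illustration of the kind of output TTK provides, not TTK's API or code; TTK itself is used through ParaView or its Python/VTK/C++ bindings.

      ```python
      def persistence_pairs(values):
          """0-dimensional persistence pairs of a 1D scalar field
          (assumes distinct values), via a union-find sweep."""
          parent = list(range(len(values)))

          def find(i):
              while parent[i] != i:
                  parent[i] = parent[parent[i]]   # path halving
                  i = parent[i]
              return i

          pairs = []
          # Sweep vertices from lowest to highest value: each local minimum
          # births a component; when two components merge, the younger one
          # (the higher-valued minimum) dies, yielding a (birth, death) pair.
          for i in sorted(range(len(values)), key=values.__getitem__):
              for j in (i - 1, i + 1):
                  if 0 <= j < len(values) and values[j] <= values[i]:
                      ri, rj = find(i), find(j)
                      if ri != rj:
                          young = max(ri, rj, key=values.__getitem__)
                          old = min(ri, rj, key=values.__getitem__)
                          if values[young] < values[i]:   # merge of two minima
                              pairs.append((values[young], values[i]))
                          parent[young] = old
          return pairs

      print(persistence_pairs([0.0, 3.0, 1.0, 4.0, 2.0, 5.0]))
      # -> [(1.0, 3.0), (2.0, 4.0)]; the global minimum 0.0 never dies
      ```
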
    • Visualizing a Moving Target: A Design Study on Task Parallel Programs in the Presence of Evolving Data and Concerns

      Williams, Katy; Bigelow, Alex; Isaacs, Kate; Univ Arizona (IEEE COMPUTER SOC, 2020-01-01)
      Common pitfalls in visualization projects include lack of data availability and the domain users' needs and focus changing too rapidly for the design process to complete. While it is often prudent to avoid such projects, we argue it can be beneficial to engage them in some cases as the visualization process can help refine data collection, solving a "chicken and egg" problem of having the data and tools to analyze it. We found this to be the case in the domain of task parallel computing where such data and tooling is an open area of research. Despite these hurdles, we conducted a design study. Through a tightly-coupled iterative design process, we built Atria, a multi-view execution graph visualization to support performance analysis. Atria simplifies the initial representation of the execution graph by aggregating nodes as related to their line of code. We deployed Atria on multiple platforms, some requiring design alteration. We describe how we adapted the design study methodology to the "moving target" of both the data and the domain experts' concerns and how this movement kept both the visualization and programming project healthy. We reflect on our process and discuss what factors allow the project to be successful in the presence of changing data and user needs.