A Visualization First Perspective on Understanding Program Behavior
Author
Faust, Rebecca Jane
Issue Date
2021
Advisor
Scheidegger, Carlos
Publisher
The University of Arizona.
Rights
Copyright © is held by the author. Digital access to this material is made possible by the University Libraries, University of Arizona. Further transmission, reproduction, presentation (such as public display or performance) of protected items is prohibited except with permission of the author.
Abstract
This dissertation frames program understanding as data analysis. Specifically, we take the perspective that human understanding of a program can be facilitated by human understanding of the data collected during its execution. We leverage existing visualization principles to design tools that simplify the task of collecting and organizing relevant data for program understanding and debugging. These principles also enable the tools to automatically derive appropriate visualizations from the data. While prior work exists both in software visualization and in understanding programs without visualization, limited research exists on directly applying visualization principles to the domain of program understanding and debugging. This dissertation addresses this gap along two primary avenues: (1) using visualization to understand general programs and (2) using visualization to understand specific categories of programs, namely non-linear dimensionality reductions. Along the first avenue, we present two visualization tools, Anteater and Aardvark. Anteater defines a mapping from the data collected in program traces to a visualization design framework, which enables us to apply visualization principles. It defines how trace data maps to common data structures used in visualization, and how to map from those data structures to effective interactive visualizations. Anteater then operationalizes this mapping in a prototype system for visualizing general Python programs. Aardvark extends Anteater’s mapping to support the comparison of multiple executions of a program, using visualizations that apply visualization principles for comparison. Aardvark supports visualizing the effects of change in general Python programs. However, by narrowing the scope to specific classes of programs and specific types of change, we can create more descriptive visualizations of the effect of those changes.
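As a rough illustration of the idea of mapping trace data to a common data structure, the sketch below records the local variables of one Python function as it runs and stores them as a flat table of rows, the kind of structure most plotting tools accept. It is not Anteater's implementation: `collect_trace`, `running_sum`, and the row fields (`step`, `line`, `name`, `value`) are hypothetical names chosen for this example, and the only library facility assumed is the standard `sys.settrace` hook.

```python
import sys


def collect_trace(func, *args):
    """Run func(*args) and return (result, rows).

    Each row is a dict describing one local variable as seen just before
    a line of func executes. Hypothetical helper, for illustration only.
    """
    rows = []
    target = func.__code__
    step = 0

    def tracer(frame, event, arg):
        nonlocal step
        if frame.f_code is not target:
            return None  # ignore every function except the one being studied
        if event == "line":
            # Line number relative to the "def" line of the traced function.
            line = frame.f_lineno - target.co_firstlineno
            for name, value in dict(frame.f_locals).items():
                rows.append({"step": step, "line": line, "name": name, "value": value})
            step += 1
        return tracer

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        result = func(*args)
    finally:
        sys.settrace(previous)  # restore whatever tracer was active before
    return result, rows


def running_sum(values):
    total = 0
    for v in values:
        total += v
    return total


result, rows = collect_trace(running_sum, [1, 2, 3])
```

Filtering `rows` to `name == "total"` gives the history of a single variable across steps, which could then be drawn as a line chart; a table in this one-row-per-observation form is one plausible reading of the "common data structures used in visualization" mentioned above.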
DimReader is an example of this narrowing of scope, in which we restricted the focus to non-linear dimensionality reductions. We augmented these programs with automatic differentiation to simulate changes in the input data and record their effect on the positions of the projected points. After simulating this change, we applied visualization principles to create explanatory visualizations for understanding the behavior of the projection. This dissertation presents two extreme points in the “program understanding as data analysis” design space. Anteater and Aardvark assume very little structure in the program and apply to very general programs. DimReader, on the other hand, requires a particular program structure and, as a result, can employ more specific visualizations and techniques, specifically automatic differentiation. We conclude by asking the natural question: is there a middle ground that combines the explanatory features of DimReader with the generalizability of Anteater and Aardvark? Modern machine learning systems notably make central use of automatic differentiation and form a natural target for future investigations.
Type
text
Electronic Dissertation
Degree Name
Ph.D.
Degree Level
doctoral
Degree Program
Graduate College
Computer Science