Advancing Neural Networks Towards Realistic Settings Using Few-Shot Learning
Author: Hess, Samuel Thomas
Issue Date: 2022
Keywords: Deep Neural Networks; Explainable Artificial Intelligence; Few-Shot Learning; Lifelong Learning; Online Learning
Advisor: Ditzler, Gregory
Publisher: The University of Arizona.
Rights: Copyright © is held by the author. Digital access to this material is made possible by the University Libraries, University of Arizona. Further transmission, reproduction, presentation (such as public display or performance) of protected items is prohibited except with permission of the author.
Abstract:
Neural networks have shown remarkable performance across many tasks, including classification, object detection, and image segmentation. Advances in high-performance computing have enabled neural networks to train on extremely large datasets, resulting in superior performance that often exceeds human performance. In fact, conventional supervised learning neural networks trained with large volumes of labeled data can produce highly accurate models to classify images, videos, and audio signals. Despite this success, the deployment and evaluation of neural networks are limited to the classes and experiences observed during training. Their success also depends on large labeled datasets, which poses a serious challenge when such data is not available for training: these models are not expected to achieve the same success if there are only a few labeled samples per class. To address this sample-size limitation, a rapidly evolving area of research known as few-shot learning has emerged. Specifically, few-shot learning classifies unlabeled data from novel classes with only one or "a few" labeled exemplary samples. Unfortunately, few-shot learning comes with its own challenges, including reduced classification accuracy with respect to supervised counterparts, requirements on the overall size of the training data, limited classifier explainability, and evaluation assumptions that quickly break down in many real-world applications. Against this background, this thesis presents five contributions that expand few-shot performance, explainability, and applicability to novel tasks. Specifically, our contributions are:
(1) A novel few-shot network that improves classification accuracy over prior models by learning to weight features conditioned on the samples. Conventional techniques perform a one-way comparison of an unlabeled query to a labeled support set; the soft weight network instead allows two-way cross-comparisons of both query-to-support and support-to-query, which is shown to improve the performance of a few-shot model (an illustrative sketch follows the abstract).
(2) A new application and novel few-shot network, namely OrderNet, that can accurately learn an ordering of data given a small labeled dataset. Through pairwise subsampling and episodic training, OrderNet was shown to significantly reduce the amount of training data required to achieve a given regression accuracy.
(3) A new approach for eXplainable Artificial Intelligence (XAI), namely ProtoShotXAI, that uses a few-shot architecture to explain black-box neural networks and is the first approach that is directly applicable to the explanation of few-shot neural networks.
(4) A novel similarity metric for a few-shot network that achieves state-of-the-art performance on inductive few-shot tasks. The metric is motivated by a fast approximation of the exponentially distributed features in the final layer of a trained few-shot classifier and by maximum log-likelihood estimation (an illustrative sketch follows the abstract). State-of-the-art 1-shot transductive performance is also achieved on imbalanced data using a simple iterative approach with our similarity metric.
(5) A novel framework for online detection and classification using few-shot classifiers. In contrast to related work, our lifelong learning framework assumes a continuous stream of unlabeled and imbalanced data. Additionally, our approach continuously refines classes as new data becomes available while considering computational storage constraints.
We demonstrate the capabilities of our proposed approach on benchmark data streams and achieve competitive detection performance and state-of-the-art online classification accuracy.
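The following is a minimal, hypothetical sketch of the two-way cross-comparison idea summarized in contribution (1): a small network produces per-feature weights conditioned on a (query, support) pair, and the episode score combines query-to-support and support-to-query comparisons. All names (SoftWeightNet, two_way_score, the toy episode tensors) are illustrative assumptions, not the dissertation's actual architecture or code.

import torch
import torch.nn as nn
import torch.nn.functional as F

class SoftWeightNet(nn.Module):
    """Produces per-feature weights in (0, 1) conditioned on a pair of embeddings."""
    def __init__(self, dim):
        super().__init__()
        self.fc = nn.Sequential(nn.Linear(2 * dim, dim), nn.Sigmoid())

    def forward(self, a, b):
        return self.fc(torch.cat([a, b], dim=-1))

def two_way_score(query, support, weight_net):
    """Compare a query embedding to one support embedding in both directions."""
    w_qs = weight_net(query, support)            # query-to-support weighting
    w_sq = weight_net(support, query)            # support-to-query weighting
    d_qs = (w_qs * (query - support) ** 2).sum(-1)
    d_sq = (w_sq * (support - query) ** 2).sum(-1)
    return -(d_qs + d_sq)                        # higher score = more similar

# Toy 5-way 1-shot episode with random 64-dimensional embeddings.
dim, n_way = 64, 5
support = torch.randn(n_way, dim)                # one labeled embedding per class
query = torch.randn(dim)                         # unlabeled query embedding
weight_net = SoftWeightNet(dim)
scores = torch.stack([two_way_score(query, support[c], weight_net) for c in range(n_way)])
print(F.softmax(scores, dim=0))                  # class posterior over the 5 classes

In practice the embeddings would come from a trained backbone and the weight network would be trained episodically; the symmetric sum of weighted distances above is only one plausible way to combine the two comparison directions.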
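As a second hedged sketch, contribution (4) describes a similarity metric motivated by exponentially distributed final-layer features and maximum log-likelihood estimation. The code below shows one generic way such a score could be computed, fitting a per-class exponential rate from the support set and scoring a query by its log-likelihood; it is an assumption-laden illustration, not the dissertation's exact metric.

import numpy as np

def exponential_log_likelihood(query, support, eps=1e-6):
    """Log-likelihood of non-negative query features under a per-class exponential fit.

    support: (n_shot, dim) non-negative features for one class's labeled samples.
    query:   (dim,) non-negative features for the unlabeled sample.
    """
    rate = 1.0 / (support.mean(axis=0) + eps)           # MLE rate = 1 / feature mean
    return float(np.sum(np.log(rate) - rate * query))   # sum of log exponential-pdf terms

# Toy 3-way 1-shot example with ReLU-like (non-negative) features.
rng = np.random.default_rng(0)
dim, n_way, n_shot = 32, 3, 1
support_set = rng.exponential(scale=1.0, size=(n_way, n_shot, dim))
query = support_set[1, 0] + 0.1 * rng.exponential(size=dim)    # query near class 1
scores = [exponential_log_likelihood(query, support_set[c]) for c in range(n_way)]
print(int(np.argmax(scores)))                                   # predicted class index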
Type: text; Electronic Dissertation
Degree Name: Ph.D.
Degree Level: doctoral
Degree Program: Graduate College; Electrical & Computer Engineering