Optimizing Large Language Models for Edge Devices: A Comparative Study on Reputation Analysis

Author
Rahman, Mohammad Wali Ur

Issue Date
2023

Advisor
Hariri, Salim

Publisher
The University of Arizona.

Rights
Copyright © is held by the author. Digital access to this material is made possible by the University Libraries, University of Arizona. Further transmission, reproduction, presentation (such as public display or performance) of protected items is prohibited except with permission of the author.

Abstract
The widespread adoption of social media platforms has led to an exponential surge in user-generated data, shaping the reputations of companies and public figures on a global scale. Consequently, automated reputation analysis has become indispensable for reputation management. Recent advances in large-scale language models, such as BERT (Bidirectional Encoder Representations from Transformers), have shown remarkable success on Natural Language Processing (NLP) tasks. However, deploying such models on resource-constrained edge devices remains challenging due to their size and computational demands.

In this thesis, we optimize large language models for edge devices with limited resources through a comparative study. Our investigation centers on reputation analysis, a critical NLP task, to assess the feasibility and efficiency of deploying transformer-based models on edge devices. We fine-tune MobileBERT, a compact version of BERT, on the reputation polarity task using the RepLab 2013 dataset. To enable deployment on edge devices, we convert the fine-tuned models to TensorFlow Lite and apply quantization techniques for model optimization.

The experimental results demonstrate the efficacy of our approach. The quantized MobileBERT models perform comparably to the original BERT model, with only a marginal 4.1% drop in accuracy. More importantly, the optimized models are roughly 160x smaller and achieve faster inference times. These results validate the practical applicability of transformer-based reputation analysis on edge devices, making it an attractive solution for real-world deployments.

In conclusion, this thesis showcases the potential of optimizing large language models for resource-constrained edge devices. Through a rigorous comparative study, we demonstrate that transformer-based models can be deployed effectively for reputation analysis on edge devices without compromising performance. This research paves the way for leveraging advanced NLP techniques in real-world applications where efficient edge-device implementation is essential, opening new possibilities for a wide range of edge computing scenarios beyond reputation analysis.
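
As a concrete illustration of the deployment pipeline summarized above, the sketch below shows one common way to convert a fine-tuned model to TensorFlow Lite with post-training dynamic-range quantization. This is a minimal sketch, not the author's exact implementation: the paths "mobilebert_replab_savedmodel" and "mobilebert_replab_quantized.tflite" are hypothetical placeholders, and the converter settings used in the thesis may differ.

    import tensorflow as tf

    # Load a MobileBERT model (assumed here to be fine-tuned on the
    # RepLab 2013 polarity task) from a hypothetical SavedModel path.
    converter = tf.lite.TFLiteConverter.from_saved_model(
        "mobilebert_replab_savedmodel")

    # The default optimization applies dynamic-range quantization,
    # storing the weights as 8-bit integers.
    converter.optimizations = [tf.lite.Optimize.DEFAULT]

    # BERT-style models often use ops outside the built-in TFLite op
    # set, so allow fallback to select TensorFlow ops if needed.
    converter.target_spec.supported_ops = [
        tf.lite.OpsSet.TFLITE_BUILTINS,
        tf.lite.OpsSet.SELECT_TF_OPS,
    ]

    tflite_model = converter.convert()

    # Write the quantized flatbuffer for deployment on the edge device.
    with open("mobilebert_replab_quantized.tflite", "wb") as f:
        f.write(tflite_model)

Dynamic-range quantization keeps activations in floating point at runtime while compressing the weights, which is one standard way to trade a small amount of accuracy for a large reduction in model size and faster inference.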
Type
Electronic Thesis
text

Degree Name
M.S.

Degree Level
masters

Degree Program
Graduate College
Electrical & Computer Engineering
