Mitigating Device-Level Vulnerabilities in Post-CMOS Machine Learning Accelerators
Author
Chowdhury, Md Muhtasim Alam
Issue Date
2025
Keywords
Emerging Switching Devices
Hardware Supply Chain Security
Machine Learning Accelerators
Semiconductor Fabrication
SOT-MRAM
Advisor
Salehi, Soheil
Publisher
The University of Arizona.
Rights
Copyright © is held by the author. Digital access to this material is made possible by the University Libraries, University of Arizona. Further transmission, reproduction, presentation (such as public display or performance) of protected items is prohibited except with permission of the author.
Abstract
Hardware-based acceleration approaches for Machine Learning (ML) workloads have been embracing the significant potential of post-CMOS switching devices to attain reduced footprint and/or energy-efficient execution relative to transistor-based GPU and/or TPU accelerator architectures. Meanwhile, the proliferation of fabless IC manufacturing paradigms has heightened the hardware security concerns inherent in such approaches. Namely, unauthorized access to various stages of the supply chain may expose significant vulnerabilities that cause malfunctions, including subtle adversarial outcomes via the malicious generation of differentially corrupted output. The Spin-Orbit Torque Magnetic Tunnel Junction (SOT-MTJ) is a leading spintronic device for use in ML accelerators, as well as for holding security tokens; its manufacturing-only security exposures are identified and evaluated herein. The experimental results indicate a novel vulnerability profile whereby an adversary without access to the circuit netlist could differentially influence the behavior of the machine learning application. Specifically, ML recognition outputs can be significantly swayed via a global modification of oxide thickness (Tox), resulting in bit-flips of the weights in the crossbar array. This differentially corrupts the recognition of selected digits in the MNIST dataset, creating an opportunity for an adversary. With just 0.05% of bits in the crossbar having a flipped resistance state, digits ‘4’ and ‘5’ show the highest overall error rates and digit ‘9’ exhibits the lowest impact, with recognition accuracy of digits ‘2’, ‘3’, and ‘8’ unaffected by changing the oxide thickness of the SOT-MTJs uniformly from 0.75 nm to 1.2 nm, without modifying the netlist or even having access to the circuit design itself. Exposures and mitigation approaches to such novel and potentially damaging manufacturing-side intrusions are identified, postulated, and quantitatively assessed.
In conclusion, this thesis showcases the potential of SOT-MRAM process variation to trigger stealthy, application-impacting bitflips in ML accelerators. Early-stage protection against physical-level threats ensures post-silicon ML accelerators remain robust and trustworthy in future AI-enabled systems.
Type
text
Electronic Thesis
Degree Name
M.S.
Degree Level
masters
Degree Program
Graduate College
Electrical & Computer Engineering
