In the past decade, deep neural networks (DNNs) have emerged as a highly effective approach to a range of difficult and important problems. They are poised to enable autonomous vehicles, healthcare solutions, assistive technologies, and more. Further, in the past five years, commercial accelerators customized for DNNs have hit the market. These include the Google TPU, the Tesla FSD chip, and several other offerings from major chip vendors and emerging startups. Architecture conferences have also seen a dramatic rise in research that attempts to improve the efficiency of these accelerators.

The course will first review the basics of DNN accelerators, using both commercial and academic examples. It will then consider each of the major approaches that have yielded significant improvements and that are candidates for inclusion in next-generation accelerators. These approaches include: novel hierarchies and dataflows, management of sparsity in weights and activations, elimination of ineffectual computations, analog arithmetic, in-memory computing, spiking neural networks, and compressed data movement during training.
Rajeev Balasubramonian is a Professor in the School of Computing at the University of Utah. He received his B.Tech in Computer Science and Engineering from the Indian Institute of Technology, Bombay in 1998, and his MS (2000) and Ph.D. (2003) degrees from the University of Rochester. His primary research interests include memory systems, security, and application-specific architectures, and his work appears regularly at the top architecture conferences. Prof. Balasubramonian is a recipient of a US National Science Foundation CAREER award, IBM Faculty Partnership awards, HP Innovation Research Program awards, Google Faculty Research awards, an Intel Outstanding Research award, various teaching awards at the University of Utah, and multiple best paper and IEEE Micro Top Picks awards.