Research

TinyTS: The TinyML Model Compiler Framework on the MCU
Tiny Machine Learning (TinyML) is becoming practical on edge computing devices. Low-power microcontroller units (MCUs) are popular targets for deploying TinyML models, but their tight memory budget is a critical bottleneck. We aim to reduce the memory footprint and improve the performance of TinyML models on the MCU. This project is supported by MOST. [NeuralPS 23] [HPCA 24] [TECS 24] [HPCA 25]
Enhanced Matrix Computing ISA Extension on SoC
Matrix computation is heavily used in machine learning, multimedia, and scientific computing. TinyML models have very short inference times, so running them on specialized hardware is not beneficial. Instead, a powerful CPU with an ISA extension that enhances TinyML application performance is preferable. We aim to create a matrix unit to accelerate TinyML applications on the SoC. This project is supported by MOST and a Google Research Grant.
Accelerating Irregular Applications on the GPU
GPUs have been widely applied in many application domains. However, existing GPU architectures handle irregular applications poorly, even when those applications have massive data parallelism. This project targets the acceleration of three applications: 1. Ray tracing in computer graphics. 2. Graph computing. 3. Multi-tenant ML serving. This project is supported by MediaTek and MOST. [HPCA 21] [ASP-DAC 22] [ASP-DAC 23]
Dynamic Tensor Mapping on DNN Accelerator
Deep Neural Network (DNN) models evolve rapidly. New DNN models consist of complex neural network operators, and the values of their hyper-parameters vary from layer to layer. This irregularity makes it harder to map the data of DNN models onto a rigid systolic array. We aim to combine hardware and software approaches to improve utilization and performance across different DNN models. This project is supported by MOST. [DAC 22] [ISOCC 22]
Maximizing IOPS on Solid State Drive (SSD) Systems
This project aims to maximize I/O operations per second (IOPS) within the internal SSD system. It is supported by Phison.