Zhixiong Zhao

I am a second-year master's student at the School of Electrical and Electronic Engineering (EEE), Nanyang Technological University (NTU), advised by Prof. Kim Tae Hyoung. I am currently an intern at HOUMO.AI.

My research interests lie in Efficient AI, particularly hardware-software co-design, model compression, and inference acceleration techniques such as quantization. I strongly believe that “compression is intelligence.”

I am always open to academic exchanges and interdisciplinary collaborations. Please feel free to reach out if you’d like to discuss research or potential synergies!

News

May 01, 2026 Our paper TWLA has been accepted to ICML 2026. 🎉🎉🎉
Apr 08, 2026 Our paper BWLA has been accepted to the ACL 2026 Main Conference. 🎉🎉🎉
Jan 26, 2026 Our paper KBVQ-MoE has been accepted to ICLR 2026. 🎉🎉🎉
Nov 08, 2025 Our paper SpecQuant has been accepted to AAAI 2026. 🎉🎉🎉
Jul 02, 2025 Our paper QUARK has been accepted to ICCAD 2025. 🎉🎉🎉
May 27, 2025 I will join HOUMO.AI as a research intern. 🚀🚀🚀

Publications

Equal Contribution, * Corresponding Author(s)


  1. TWLA: Breaking the Barrier to W1.58A4 Post-Training Quantization for LLMs
    Zhixiong Zhao, Zukang Xu, Zhixuan Chen, Xing Hu, Zhe Jiang, and Dawei Yang*
    In International Conference on Machine Learning, 2026 (Top Conf. in AI) CCF-A
  2. BWLA: Breaking the Barrier of W1AX Post-Training Quantization for LLMs
    Zhixiong Zhao, Zukang Xu, and Dawei Yang*
    In Annual Meeting of the Association for Computational Linguistics, 2026 (Top Conf. in NLP) CCF-A
  3. KBVQ-MoE: KLT-guided SVD with Bias-Corrected Vector Quantization for MoE Large Language Models
    Zukang Xu, Zhixiong Zhao, Xing Hu, Zhixuan Chen, and Dawei Yang*
    In International Conference on Learning Representations, 2026 (Top Conf. in AI) CCF-A
  4. SpecQuant: Spectral Decomposition and Adaptive Truncation for Ultra-Low-Bit LLMs Quantization
    Zhixiong Zhao, Fangxin Liu, Junjie Wang, Chenyang Guan, Zongwu Wang, Li Jiang, and Haibing Guan*
    In AAAI Conference on Artificial Intelligence, 2026 (Top Conf. in AI) CCF-A
  5. QUARK: Quantization-Enabled Circuit Sharing for Transformer Acceleration by Exploiting Common Patterns in Nonlinear Operations
    Zhixiong Zhao, Haomin Li, Fangxin Liu, Yuncheng Lu, Zongwu Wang, Tao Yang, Li Jiang, and Haibing Guan*
    In International Conference on Computer-Aided Design, 2025 (Top Conf. in EDA) CCF-B