About Me

I am a second-year Ph.D. student at the University of Maryland, College Park, advised by Prof. Heng Huang. Previously, I worked as a research assistant at the Natural Language Processing Laboratory of Northeastern University (China), supervised by Prof. Tong Xiao. I received my B.E. degree from the School of Computer Science and Engineering at Northeastern University in 2021.

Research

My research focuses on enhancing LLM reasoning and test-time scaling.

Reasoning and Test-Time Scaling: Parallel-R1 (ICLR'26) introduces the first RL-based parallel-thinking framework for LLMs, moving beyond sequential chain-of-thought. AutoTTS (arXiv'26) opens a new direction for test-time scaling: automatically discovering strategies via agentic search instead of hand-crafted inference heuristics. Parallel-Probe (ICML'26) enables efficient parallel thinking through 2D probing. MoT (ICLR'26) explores mixture-of-thought representations for logical reasoning.

Efficient Training & Inference: Multi-Draft Speculative Decoding (ICLR'25) improves the inference speed–quality trade-off via multi-draft decoding. Asymmetric MMT (ACL Findings'25) studies conflict and synergy in post-training for multilingual machine translation.

Foundation Models: UMST (ICML'22) builds multiscale Transformers over sub-word, word, and phrase units with word-boundary and phrase-level structure. EIT (ACL'24) enhances multi-head self-attention by encouraging consensus across heads via inner- and cross-subspace interactions. PartialFormer (ACL Findings'24) replaces monolithic FFNs with multiple partial FFNs for parameter-efficient Transformers.

News

Older news
  • Check out our new paper on VLM exploration: VOGUE (visual uncertainty guided exploration).
  • Check out our new papers on LLM reasoning: Parallel-R1 and CDE.
  • One paper accepted for publication at EMNLP 2025.
  • Two papers accepted for publication at ACL 2025 Findings.
  • I will join Tencent AI Lab (Seattle) as a research intern this summer.
  • One paper accepted for publication at ICLR 2025.
  • One paper accepted for publication at NeurIPS 2024.
  • One paper accepted for publication at EMNLP 2024 Main.
  • Started my Ph.D. study at the University of Maryland, College Park.
  • Two papers accepted for publication at ACL 2024 (1 Main, 1 Findings).
  • One paper accepted at Findings of EMNLP 2023.
  • Learning Multiscale Transformer Models for Sequence Generation accepted at ICML 2022 (the first ICML paper from NEUNLP).
  • Joined NEUNLP lab as a research assistant.
  • Graduated from Northeastern University with an average GPA of 4.0.

Experience

Selected Publications (* Equal Contribution)

Asymmetric Conflict and Synergy in Post-training for LLM-based Multilingual Machine Translation paper

ACL 2025 Findings

Tong Zheng, Yan Wen, Huiwen Bao, Junfeng Guo, Heng Huang

Towards Optimal Multi-draft Speculative Decoding paper

ICLR 2025

Zhengmian Hu*, Tong Zheng*, Vignesh Viswanathan, Ziyi Chen, Ryan A. Rossi, Yihan Wu, Dinesh Manocha, Heng Huang

A Bayesian Approach to Harnessing the Power of LLMs in Authorship Attribution paper

EMNLP 2024 Main

Zhengmian Hu*, Tong Zheng*, Heng Huang

PartialFormer: Modeling Part Instead of Whole paper

ACL 2024 Findings

Tong Zheng*, Bei Li*, Huiwen Bao*, Weiqiao Shan, Tong Xiao, Jingbo Zhu

EIT: Enhanced Interactive Transformer paper

ACL 2024 Main

Tong Zheng*, Bei Li*, Huiwen Bao*, Tong Xiao, Jingbo Zhu

Learning Multiscale Transformer Models for Sequence Generation paper

ICML 2022

Bei Li*, Tong Zheng*, Yi Jing*, Chengbo Jiao, Tong Xiao, Jingbo Zhu

Manuscript

BrainTGL: Temporal Graph Representation Learning for Brain Network by Exploiting Graph Temporal Information manuscript

Completed in August 2021

Tong Zheng

Selected Honors