CV
Work
-
2025.06 - Present Santa Clara, CA, USA
Research Scientist
NVIDIA NeMo
Project: Multimodal large language models, full-duplex speech-to-speech models
-
2024.05 - 2024.08 Santa Clara, CA, USA
AI Research Intern
NVIDIA NeMo
Project: Speech-text language models with multi-turn mixed-modal chat capabilities
-
2023.05 - 2023.08 Seattle, WA, USA
Research Scientist Intern
Meta AI (Fundamental AI Research, FAIR)
Project: Speech large language models for voice-preserved textless speech-to-speech translation
-
2022.05 - 2022.08 Remote, USA
Education
-
2020.08 - 2025.05 Pittsburgh, PA, USA
Doctor of Philosophy in Electrical and Computer Engineering
Carnegie Mellon University
Advisor: Prof. Shinji Watanabe
Thesis: Towards effective and efficient open speech foundation models
Research areas: Speech foundation models, speech recognition
Open source: Contributor and maintainer of ESPnet
-
2016.08 - 2020.06 Beijing, China
Bachelor of Engineering in Electronic Information Science and Technology
Tsinghua University
GPA: 3.96/4.00, Ranking: 2/262
Advisor: Prof. Liangrui Peng
Thesis: Deep learning-based semi-supervised transfer learning for handwritten text recognition
Awards
- 2025.8.21
ISCA Award for Best Student Paper at INTERSPEECH 2025
International Speech Communication Association (ISCA)
For the first-authored paper: OWSM v4: Improving Open Whisper-Style Speech Models via Data Scaling and Cleaning
- 2024.12.4
IEEE SLT 2024 Best Paper Award
2024 IEEE Spoken Language Technology Workshop
For the paper: Contextualized Automatic Speech Recognition with Dynamic Vocabulary
- 2024.11.14
EMNLP 2024 Best Paper Award
The 2024 Conference on Empirical Methods in Natural Language Processing
For the paper: Towards Robust Speech Representation Learning for Thousands of Languages
- 2023.6.10
ICASSP 2023 Top 3% Paper Recognition
IEEE International Conference on Acoustics, Speech and Signal Processing
For two first-authored papers and one co-authored paper
- 2020.2.20
SPIE Medical Imaging 2020 Best Student Paper Award Finalist
The International Society for Optics and Photonics (SPIE)
For the first-authored paper: Microcalcification localization and cluster detection using unsupervised convolutional autoencoders and structural similarity index
Languages
Chinese: Native speaker
English: Professional working proficiency
Services
Organizer
Conference reviewer
Journal reviewer
Mentor
Teaching
-
2024.08 - 2024.12 Graduate Teaching Assistant
Carnegie Mellon University
Course: 18-781/11-751 Speech Recognition and Understanding
Instructor: Prof. Shinji Watanabe
-
2023.08 - 2023.12 Graduate Teaching Assistant
Carnegie Mellon University
Course: 18-781/11-751 Speech Recognition and Understanding
Instructor: Prof. Shinji Watanabe
-
2022.08 - 2022.12 Graduate Teaching Assistant
Carnegie Mellon University
Course: 18-781/11-751 Speech Recognition and Understanding
Instructor: Prof. Shinji Watanabe
-
2021.08 - 2021.12 Graduate Teaching Assistant
Carnegie Mellon University
Course: 18-781/11-751 Speech Recognition and Understanding
Instructor: Prof. Ian Lane and Prof. Shinji Watanabe