About Me
I am a second-year Ph.D. student in Prof. Mohit Bansal's group (MURGe Lab) at UNC Chapel Hill. Previously, I was a Research Resident under the supervision of Prof. Viet Anh Nguyen at VinAI Research, Vietnam. I received a bachelor's degree in Computer Science from Hanoi University of Science and Technology in 2022.
My research focuses on mechanistic interpretability and inference-time interventions for LLM safety alignment. Additionally, I'm interested in post-training methods for (multimodal) LLMs, including Reinforcement Learning from Human Feedback (RLHF) and Reinforcement Learning with Verifiable Rewards (RLVR).
News
- July 2025: New preprint GrAInS: Gradient-based Attribution for Inference-Time Steering of LLMs and VLMs on using gradient attribution for steering LLMs and VLMs.
- May 2025: MAT-Steer is accepted to ACL 2025! Thanks to Elias for helping present the paper!
- February 2025: New preprint Multi-Attribute Steering of Language Models via Targeted Intervention on steering multiple attributes of LLMs via targeted inference-time intervention.
- October 2024: New preprint LASeR: Learning to Adaptively Select Reward Models with Multi-Armed Bandits on selecting the best-suited reward model for fine-tuning LLMs on a per-task or per-instance basis.
- March 2024: I will be joining Prof. Mohit Bansal's group as a Ph.D. student at UNC Chapel Hill this Fall!
Old news
- May 2024: Our paper Cold-start Recommendation by Personalized Embedding Region Elicitation is accepted to UAI 2024!
- February 2024: New preprint Cost-Adaptive Recourse Recommendation by Adaptive Preference Elicitation on personalized algorithmic recourse with preference elicitation.
- November 2023: New preprint Coverage-Validity-Aware Algorithmic Recourse on algorithmic recourse under distribution shift.
- January 2023: Our paper Distributionally Robust Recourse Action is accepted to ICLR 2023!
- January 2023: Our paper Feasible Recourse Plan via Diverse Interpolation is accepted to AISTATS 2023!
- October 2022: We were awarded an honorable mention for the 2022 INFORMS Undergraduate Operations Research Prize!
- May 2022: One paper accepted to UAI 2022!
- January 2022: One paper accepted to ICLR 2022!
Publications
Multi-Attribute Steering of Language Models via Targeted Intervention
Duy Nguyen, Archiki Prasad, Elias Stengel-Eskin, and Mohit Bansal.
The 63rd Annual Meeting of the Association for Computational Linguistics (ACL), 2025. Paper Code
Cold-start Recommendation by Personalized Embedding Region Elicitation
Hieu Nguyen, Duy Nguyen, Khoa Doan, and Viet Anh Nguyen.
The 40th Conference on Uncertainty in Artificial Intelligence (UAI), 2024. Paper Code
Coverage-Validity-Aware Algorithmic Recourse
Ngoc Bui, Duy Nguyen, Man-Chung Yue, and Viet Anh Nguyen.
Operations Research (OPRE), 2024. Paper Code
Distributionally Robust Recourse Action
Duy Nguyen, Ngoc Bui, and Viet Anh Nguyen.
The Eleventh International Conference on Learning Representations (ICLR), 2023. Paper Code
Feasible Recourse Plan via Diverse Interpolation
Duy Nguyen, Ngoc Bui, and Viet Anh Nguyen.
The 26th International Conference on Artificial Intelligence and Statistics (AISTATS), 2023. Paper Code
Robust Bayesian Recourse
Tuan-Duy H. Nguyen, Ngoc Bui, Duy Nguyen, Man-Chung Yue, and Viet Anh Nguyen.
The 38th Conference on Uncertainty in Artificial Intelligence (UAI), 2022. Paper Code
Counterfactual Plans under Distributional Ambiguity
Ngoc Bui, Duy Nguyen, and Viet Anh Nguyen.
The Tenth International Conference on Learning Representations (ICLR), 2022. Paper Code
Honors and Awards
- October 2022: Honorable Mention - INFORMS Undergraduate Operations Research Prize
- October 2022: Best Thesis Presentation Award - Hanoi University of Science and Technology
- September 2019: Excellence Scholarship for the academic year - Hanoi University of Science and Technology
Experience
- May 2025 - August 2025: Applied Scientist Intern at Amazon Science, USA
- August 2022 - August 2024: Research Resident at VinAI Research, Vietnam