PhD student at Princeton University, focusing on LLMs, especially Language Modeling and Pretraining, LLM Reasoning, and Reinforcement Learning.
Homepage: https://yifzhang.com
[ACL 2025 Findings] Autonomous Data Selection with Zero-shot Generative Classifiers for Mathematical Texts (featured in Hugging Face Daily Papers: https://huggingface.co/papers/2402.07625)
[NeurIPS 2025 Spotlight] TPA: Tensor ProducT ATTenTion Transformer (T6) (https://arxiv.org/abs/2501.06425)
[TMLR] Cumulative Reasoning With Large Language Models (https://arxiv.org/abs/2308.04371)
[ICML 2025] Beyond Bradley-Terry Models: A General Preference Model for Language Model Alignment (https://arxiv.org/abs/2410.02197)
Official implementation of Regularized Policy Gradient (RPG) (https://arxiv.org/abs/2505.17508)
Official Project Page for "Exact Coset Sampling for Quantum Lattice Algorithms" (https://arxiv.org/abs/2509.12341)