We investigate a critical bottleneck in Multimodal Large Language Models (MLLMs): their limited visual perception, which hinders their ability to solve complex geometric reasoning tasks. We find that even state-of-the-art models struggle to accurately perceive basic geometric concepts, preventing effective reasoning.
📝 Our Solution:
We introduce GeoPQA, a benchmark to measure this gap, and propose a two-stage training framework (sketched below):
1️⃣ Perception First: We train the MLLM to accurately identify geometric structures.
2️⃣ Reasoning Second: With a solid visual foundation, we then train it on complex, multi-step reasoning.
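To make the schedule concrete, here is a minimal sketch of the perception-first, reasoning-second training loop. Everything in it (`StubMLLM`, `Example`, the exact-match rewards, `train_stage`) is an illustrative placeholder rather than our released training code; the actual reward design and optimization in the paper may differ.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Example:
    diagram: str    # path to a geometric diagram image
    question: str
    answer: str


class StubMLLM:
    """Placeholder standing in for a multimodal LLM policy.
    A real model would generate text from (image, question) pairs
    and be updated from the reward signal."""

    def generate(self, diagram: str, question: str) -> str:
        return ""           # no-op for this sketch

    def update(self, reward: float) -> None:
        pass                # no-op for this sketch


def exact_match(prediction: str, answer: str) -> float:
    """Simple illustrative reward: 1.0 if the prediction matches the reference."""
    return 1.0 if prediction.strip() == answer.strip() else 0.0


def train_stage(model, data: List[Example],
                reward_fn: Callable[[str, str], float]):
    """One training stage: sample a response per example, score it with the
    stage-specific reward, and update the model."""
    for ex in data:
        pred = model.generate(ex.diagram, ex.question)
        model.update(reward_fn(pred, ex.answer))
    return model


def two_stage_training(model, perception_data, reasoning_data):
    # Stage 1: reward accurate perception of geometric structures
    # (e.g. correctly stating which segments are perpendicular).
    model = train_stage(model, perception_data, exact_match)
    # Stage 2: with perception in place, reward correct final answers
    # on multi-step geometric reasoning problems.
    model = train_stage(model, reasoning_data, exact_match)
    return model


if __name__ == "__main__":
    trained = two_stage_training(StubMLLM(), [], [])
```

The key design point is the schedule itself: reasoning training only begins after the perception stage, so reasoning rewards are not chasing mis-perceived diagrams.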
📈 The results are exciting! Our two-stage approach improves geometric problem-solving accuracy by 9.1% over training directly on reasoning tasks. Our work highlights a key principle: for MLLMs to truly reason, they must first learn to see.
If you find our work useful, please consider citing our paper:
@misc{chen2025geopqabridgingvisualperception,
  title={GeoPQA: Bridging the Visual Perception Gap in MLLMs for Geometric Reasoning},
  author={Guizhen Chen and Weiwen Xu and Hao Zhang and Hou Pong Chan and Deli Zhao and Anh Tuan Luu and Yu Rong},
  year={2025},
  eprint={2509.17437},
  archivePrefix={arXiv},
  primaryClass={cs.CL},
  url={https://arxiv.org/abs/2509.17437},
}