Agents-X

Jan. 2026

PyVision-RL: Forging Open Agentic Vision Models via RL.

We build PyVision-Image and PyVision-Video with reinforcement learning (RL), achieving state-of-the-art results on visual search, multimodal reasoning, agentic reasoning, and spatial reasoning tasks.

Nov. 2025

TIR-Bench: A Comprehensive Benchmark for Agentic Vision.

Used by Qwen 3.6 Plus

After PyVision, we revisited the fundamental question of what kinds of problems truly require agentic vision capabilities. We introduce TIR-Bench, the first comprehensive agentic vision benchmark, consisting of 1,215 questions and covering 13 different tasks.

July 2025

PyVision: Agentic Vision with Dynamic Tooling.

We explore using Python code as visual primitives for image manipulation and reasoning, and present PyVision, a framework that enables agentic vision with dynamic tooling. Your MLLM already possesses agentic vision capabilities!
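
To make the idea of "Python code as visual primitives" concrete, here is a minimal sketch of the kind of small, throwaway tool a model might write for itself during reasoning: crop a region, enlarge it, and measure it. This is an illustration only, not PyVision's implementation. The function names (`crop`, `zoom`, `mean_intensity`) and the list-of-lists image representation are assumptions chosen to keep the example dependency-free; a real pipeline would more likely use an imaging library.

```python
from typing import List

# Assumption: a grayscale image is a list of rows of integer intensities.
Image = List[List[int]]


def crop(image: Image, left: int, top: int, right: int, bottom: int) -> Image:
    """Return the sub-image covering columns [left, right) and rows [top, bottom)."""
    return [row[left:right] for row in image[top:bottom]]


def zoom(image: Image, factor: int) -> Image:
    """Upscale by an integer factor using nearest-neighbour repetition."""
    if factor < 1:
        raise ValueError("factor must be a positive integer")
    return [
        [pixel for pixel in row for _ in range(factor)]
        for row in image
        for _ in range(factor)
    ]


def mean_intensity(image: Image) -> float:
    """Average pixel value, a simple measurement to reason over."""
    pixels = [pixel for row in image for pixel in row]
    return sum(pixels) / len(pixels)


# Toy 4x4 image with a bright patch in the lower-right corner.
toy = [
    [0, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 9, 9],
    [0, 0, 9, 9],
]

patch = crop(toy, left=2, top=2, right=4, bottom=4)  # isolate the region of interest
enlarged = zoom(patch, factor=2)                     # inspect it at higher resolution
brightness = mean_intensity(patch)                   # quantify what was found
```

The point of the sketch is that each step is ordinary code composed on the fly rather than a call into a fixed, predefined toolset, which is what "dynamic tooling" suggests.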