Qwen2.5-VL-32B-Instruct

Name: Qwen2.5-VL-32B-Instruct
Author: Alibaba

Multimodal large model

Release date: 2025-03-24 (updated 2025-03-25 11:36:12)

Parameters

32B (32 billion)

Context length

32K

Chinese support

Supported

Reasoning ability

Qwen2.5-VL-32B-Instruct is a multimodal large model published by Alibaba and released on 2025-03-24. It has 32B parameters and a 32K-token context length, requires about 64 GB of storage, and is distributed under the Apache 2.0 license.
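The storage figure can be sanity-checked with simple arithmetic. The sketch below assumes 32 billion parameters (the "32B" in the model name) stored at 2 bytes each, as in BF16 or FP16; that precision is an assumption, not something this page states, and the helper name is hypothetical.

```python
def estimate_weight_storage_gb(num_params: float, bytes_per_param: int = 2) -> float:
    """Rough size of the raw weights in decimal gigabytes (1 GB = 1e9 bytes).

    bytes_per_param=2 is an assumption (16-bit weights such as BF16/FP16).
    Tokenizer files, configs, and runtime memory are not included.
    """
    return num_params * bytes_per_param / 1e9


if __name__ == "__main__":
    # 32e9 parameters * 2 bytes = 6.4e10 bytes = 64 GB
    print(estimate_weight_storage_gb(32e9))
```

Under the same assumptions, an 8-bit quantized copy (`bytes_per_param=1`) would need roughly half as much space.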

Data sourced primarily from official releases (GitHub, Hugging Face, papers), then benchmark leaderboards, then third-party evaluators.

Model basics

Reasoning traces

Not supported

Thinking modes

Thinking modes not supported

Context length

32K tokens

Max output length

2048 tokens

Model type

Qwen2.5-VL-32B-Instruct
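The context length and max output length listed above imply a rough budget for the prompt. This is a minimal sketch under two assumptions that the listing does not confirm: that "32K" means 32,768 tokens, and that prompt and generated tokens share one context window. Image inputs also consume tokens, which this helper (a hypothetical name) leaves to the caller to count.

```python
CONTEXT_LENGTH = 32 * 1024  # assumption: "32K" is read as 32,768 tokens
MAX_OUTPUT_TOKENS = 2048    # max output length from the listing


def max_prompt_tokens(context_length: int = CONTEXT_LENGTH,
                      max_output: int = MAX_OUTPUT_TOKENS) -> int:
    """Tokens left for the prompt (text plus image tokens) if the full
    output allowance is reserved inside the same context window."""
    if max_output > context_length:
        raise ValueError("max_output cannot exceed context_length")
    return context_length - max_output


if __name__ == "__main__":
    print(max_prompt_tokens())  # 32768 - 2048 = 30720
```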

Open source & experience

Code license

Apache 2.0

Weights license

Apache 2.0 (free for commercial use)

GitHub repo

https://github.com/QwenLM/Qwen2.5-VL

Hugging Face

https://huggingface.co/Qwen/Qwen2.5-VL-32B-Instruct

Official resources

Paper

Qwen2.5-VL-32B: Smarter and Lighter

DataLearnerAI blog

No blog post yet

API details

API speed

3/5

No public API pricing yet.

Benchmark Results

No benchmark data to show.

Publisher

Alibaba

Model Overview

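Qwen2.5-VL-32B-Instruct is a multimodal large model open-sourced by the Qwen (Tongyi Qianwen) team on March 24, 2025, under the Apache 2.0 license. Building on the Qwen2.5-VL series, it was further optimized with reinforcement learning and delivers a step up in multimodal capability at a 32B parameter scale.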

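Core feature upgrades

Output style optimization
The model's responses are closer to human preferences in formatting and level of detail. In complex scenarios in particular, it produces clearly structured, logically rigorous solutions.

Mathematical reasoning
For complex mathematical problems such as multi-variable equations and geometric proofs, algorithmic optimization is reported to raise solution accuracy to an industry-leading level.

Fine-grained visual analysis
In specialized domains such as medical image interpretation and engineering drawing recognition, the model captures content at pixel-level detail and supports reasoning across multiple related images as well as analysis along spatial and temporal dimensions.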

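Performance

On established benchmarks such as MMMU (multimodal understanding) and MathVista (visual mathematical reasoning), the 32B model outperforms similarly sized competitors including Mistral-Small-3.1-24B and Gemma-3-27B-IT, and improves on the previous-generation 72B model, Qwen2-VL-72B-Instruct, by up to 12.7%.

On MM-MT-Bench, an evaluation oriented toward user experience, response quality in open-ended question answering, instruction following, and similar scenarios improves markedly, with subjective scores 19.4% higher than the previous generation. Text-only capability remains among the best at this parameter scale, ranking in the top three on the MT-Bench text benchmark.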

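Application example

Taking a user-provided truck speed-limit scenario as an example, the model demonstrates several capabilities working together:

Visual parsing: correctly identifies the road speed-limit sign (100 km/h)
Spatiotemporal modeling: relates time (12:00-13:00), distance (110 km), and speed to one another
Mathematical derivation: applies the kinematic formula (time = distance / speed) to obtain a travel time of exactly 1 hour 6 minutes
Logical decision: weighs the time and distance constraints, concludes that the truck "cannot arrive on time", and shows the full chain of reasoning

This case illustrates the model's strengths in cross-modal information integration, application of domain knowledge, and explainable output.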
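The arithmetic in the truck example can be reproduced in a few lines. This sketch assumes the truck departs at 12:00, must arrive by 13:00, and drives exactly at the 100 km/h limit for the whole 110 km; that reading of the 12:00-13:00 window, the calendar date, and the function names are illustrative assumptions. It is not the model's own reasoning code.

```python
from datetime import datetime, timedelta


def travel_time(distance_km: float, speed_kmh: float) -> timedelta:
    """time = distance / speed, computed in minutes to keep the value exact here."""
    return timedelta(minutes=distance_km * 60 / speed_kmh)


def arrives_on_time(depart: datetime, deadline: datetime,
                    distance_km: float, speed_limit_kmh: float) -> bool:
    """True if driving at the speed limit reaches the destination by the deadline."""
    return depart + travel_time(distance_km, speed_limit_kmh) <= deadline


if __name__ == "__main__":
    depart = datetime(2025, 1, 1, 12, 0)    # arbitrary date; only the clock time matters
    deadline = datetime(2025, 1, 1, 13, 0)
    print(travel_time(110, 100))            # 1:06:00
    print(arrives_on_time(depart, deadline, 110, 100))  # False
```

The trip needs 66 minutes but the window allows only 60, which matches the "cannot arrive on time" conclusion.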
