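The Berkeley Function Calling Leaderboard is an authoritative leaderboard that measures the tool-use (function-calling) capabilities of large language models.
📣 Data version: 20240421
Data source: official Berkeley website

| Rank | Model | Overall Accuracy | Request Cost ($) | Latency (s) | AST Summary | Exec Summary | Relevance Score | Publisher | License |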
|---|---|---|---|---|---|---|---|---|---|
| 1 | GPT-4-0125-Preview | 84.41 | 5.21 | 1.99 | 88.75 | 71.54 | 70.42 | OpenAI | Proprietary |
| 2 | Claude-3-Opus-20240229 | 84.12 | 10.80 | 5.05 | 86.09 | 70.90 | 80.42 | Anthropic | Proprietary |
| 3 | GPT-4-turbo-2024-04-09 | 81.88 | 5.22 | 2.68 | 86.83 | 71.04 | 62.50 | OpenAI | Proprietary |
| 4 | GPT-4-1106-Preview | 81.76 | 5.03 | 6.34 | 84.75 | 68.26 | 80.42 | OpenAI | Proprietary |
| 5 | Gorilla-OpenFunctions-v2 | 81.71 | 1.70 | 2.65 | 86.16 | 71.52 | 60.83 | Gorilla LLM | Apache 2.0 |
| 6 | GPT-4-0125-Preview | 80.29 | 4.82 | 5.03 | 83.75 | 66.13 | 82.92 | OpenAI | Proprietary |
| 7 | Mistral-Medium-2312 | 79.47 | 1.75 | 2.77 | 81.44 | 62.13 | 88.75 | Mistral AI | Proprietary |
| 8 | GPT-4-turbo-2024-04-09 | 78.76 | 4.79 | 5.68 | 81.70 | 65.13 | 88.75 | OpenAI | Proprietary |
| 9 | Claude-3-Sonnet-20240229 | 77.88 | 2.12 | 2.11 | 85.20 | 70.82 | 50.42 | Anthropic | Proprietary |
| 10 | Functionary-Medium-v2.4 | 77.12 | 1.64 | 2.55 | 82.36 | 62.61 | 74.17 | MeetKai | MIT |
| 11 | Functionary-Small-v2.4 | 76.18 | 1.76 | 2.74 | 80 | 65.32 | 67.92 | MeetKai | MIT |
| 12 | Claude-3-Opus-20240229 | 73.71 | 30.65 | 12.63 | 70.35 | 55.20 | 82.50 | Anthropic | Proprietary |
| 13 | Claude-instant-1.2 | 73 | 0.95 | 1.35 | 76.63 | 64.08 | 54.17 | Anthropic | Proprietary |
| 14 | Claude-3-Haiku-20240307 | 71.65 | 0.18 | 0.99 | 77.36 | 64.26 | 29.58 | Anthropic | Proprietary |
| 15 | Claude-2.1 | 65.12 | 6.64 | 3.72 | 62.59 | 46.39 | 83.33 | Anthropic | Proprietary |
| 16 | Mistral-large-2402 | 65 | 4.94 | 2.84 | 62.09 | 47.35 | 84.17 | Mistral AI | Proprietary |
| 17 | DBRX-Instruct-Preview | 64.59 | 1.25 | 0.63 | 65.31 | 64.10 | 56.25 | Databricks | Databricks Open Model |
| 18 | Mistral-large-2402 | 61.71 | 3.90 | 1.86 | 68.98 | 52.46 | / | Mistral AI | Proprietary |
| 19 | GPT-3.5-Turbo-0125 | 58.94 | 0.42 | 1.26 | 70.52 | 67.80 | 2.08 | OpenAI | Proprietary |
| 20 | Mistral-small-2402 | 58.71 | 0.96 | 1.05 | 64.27 | 48.41 | / | Mistral AI | Proprietary |
| 21 | Hermes-2-Pro-Mistral-7B | 58.41 | 0.15 | 0.39 | 67.99 | 54.26 | 10.83 | NousResearch | Apache 2.0 |
| 22 | Claude-3-Sonnet-20240229 | 58.06 | 3.41 | 3.35 | 44.06 | 38.66 | 81.67 | Anthropic | Proprietary |
| 23 | Gemini-1.0-Pro | 56.94 | 0.19 | 1.06 | 41.94 | 39.90 | 77.50 | Google | Proprietary |
| 24 | Claude-3-Haiku-20240307 | 52.59 | 0.29 | 1.52 | 44.69 | 42.72 | 20.83 | Anthropic | Proprietary |
| 25 | FireFunction-v1 | 51.53 | -1 | 1.24 | 39.94 | 34.28 | 73.33 | Fireworks | Apache 2.0 |
| 26 | Nexusflow-Raven-v2 | 50.94 | -1 | 1.86 | 55.05 | 56.93 | 2.08 | Nexusflow | Apache 2.0 |
| 27 | GPT-4-0613 | 49.71 | 10.48 | 3.54 | 38.53 | 26.04 | 91.67 | OpenAI | Proprietary |
| 28 | Mistral-tiny-2312 | 48.71 | 0.13 | 1.79 | 46.91 | 28.71 | 82.08 | Mistral AI | Proprietary |
| 29 | Gemma-7b-it | 41.47 | 0.03 | 0.09 | 39.05 | 33.15 | 60.42 | Google | gemma-terms-of-use |
| 30 | Deepseek-v1.5 | 39.41 | 0.45 | 1.20 | 36.98 | 29.26 | 56.67 | Deepseek | Deepseek License |
| 31 | Mistral-Small-2402 | 38.18 | 0.70 | 1.09 | 37.66 | 29.25 | 98.33 | Mistral AI | Proprietary |
| 32 | Mistral-small-2402 | 17.65 | 2.02 | 2.93 | 2.53 | 7.26 | 99.58 | Mistral AI | Proprietary |
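⚠️ The data is for reference only; the official source takes precedence. The link next to each model name leads to its DataLearner model detail page.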