AI Model Rankings
Real-time tracking of usage, performance, and benchmark data for mainstream large language models
| # | Model | Provider | P50 latency | P50 throughput |
|---|---|---|---|---|
| 1 | openai/gpt-oss-safeguard-20b | openai | 151 ms | 846.5 t/s |
| 2 | openai/gpt-oss-20b | openai | 94 ms | 721.0 t/s |
| 3 | openai/gpt-oss-120b | openai | 155 ms | 708.0 t/s |
| 4 | qwen/qwen3-32b-04-28 | qwen | 204 ms | 414.0 t/s |
| 5 | inception/mercury-2-20260304 | inception | 493 ms | 326.0 t/s |
| 6 | meta-llama/llama-3.1-8b-instruct | meta-llama | 77 ms | 283.0 t/s |
| 7 | z-ai/glm-4.7-20251222 | z-ai | 449 ms | 265.0 t/s |
| 8 | meta-llama/llama-4-scout-17b-16e-instruct | meta-llama | 184 ms | 236.5 t/s |
| 9 | nvidia/nemotron-3-super-120b-a12b-20230311 | nvidia | 1288 ms | 230.0 t/s |
| 10 | qwen/qwen3-235b-a22b-07-25 | qwen | 168 ms | 212.0 t/s |
| 11 | minimax/minimax-m2.5-20260211 | minimax | 374 ms | 193.5 t/s |
| 12 | moonshotai/kimi-k2-0905 | moonshotai | 117 ms | 171.0 t/s |
| 13 | google/gemini-2.5-flash-image | google | 1226 ms | 170.0 t/s |
| 14 | nvidia/nemotron-3-nano-30b-a3b | nvidia | 495 ms | 165.0 t/s |
| 15 | mistralai/devstral-small-2507 | mistralai | 270 ms | 163.0 t/s |
| 16 | moonshotai/kimi-k2.5-0127 | moonshotai | 349 ms | 162.0 t/s |
| 17 | x-ai/grok-4-fast | x-ai | 2725 ms | 159.5 t/s |
| 18 | meta-llama/llama-3.3-70b-instruct | meta-llama | 185 ms | 158.0 t/s |
| 19 | deepseek/deepseek-r1-0528 | deepseek | 563 ms | 149.0 t/s |
| 20 | arcee-ai/trinity-mini-20251201 | arcee-ai | 358 ms | 139.0 t/s |
| 21 | google/gemini-2.5-flash-lite | google | 564 ms | 123.0 t/s |
| 22 | deepseek/deepseek-chat-v3-0324 | deepseek | 441 ms | 119.0 t/s |
| 23 | x-ai/grok-4.20-20260309 | x-ai | 590 ms | 116.0 t/s |
| 24 | z-ai/glm-5-20260211 | z-ai | 434 ms | 114.0 t/s |
| 25 | qwen/qwen3-30b-a3b-04-28 | qwen | 144 ms | 114.0 t/s |
| 26 | stepfun/step-3.5-flash | stepfun | 253 ms | 113.0 t/s |
| 27 | google/gemini-2.5-flash-lite-preview-09-2025 | google | 576 ms | 113.0 t/s |
| 28 | mistralai/ministral-8b-2512 | mistralai | 262 ms | 110.0 t/s |
| 29 | x-ai/grok-4.1-fast | x-ai | 3008 ms | 110.0 t/s |
| 30 | deepseek/deepseek-chat-v3.1 | deepseek | 313 ms | 107.0 t/s |
| 31 | x-ai/grok-3-mini | x-ai | 352 ms | 107.0 t/s |
| 32 | qwen/qwen3-coder-next-2025-02-03 | qwen | 339 ms | 106.0 t/s |
| 33 | qwen/qwen3.5-35b-a3b-20260224 | qwen | 394 ms | 105.0 t/s |
| 34 | openai/gpt-5-chat-2025-08-07 | openai | 690 ms | 102.0 t/s |
| 35 | qwen/qwen3-next-80b-a3b-instruct-2509 | qwen | 388 ms | 101.0 t/s |
| 36 | mistralai/mistral-nemo | mistralai | 320 ms | 100.0 t/s |
| 37 | openai/gpt-5-nano-2025-08-07 | openai | 4087 ms | 99.0 t/s |
| 38 | openai/gpt-4o | openai | 515 ms | 99.0 t/s |
| 39 | amazon/nova-lite-v1 | amazon | 434 ms | 97.0 t/s |
| 40 | google/gemini-3.1-flash-image-preview-20260226 | google | 14070 ms | 96.0 t/s |
| 41 | qwen/qwen3.5-flash-20260224 | qwen | 511 ms | 95.0 t/s |
| 42 | google/gemma-2-9b-it | google | 252 ms | 95.0 t/s |
| 43 | google/gemini-2.5-pro | google | 2535 ms | 92.0 t/s |
| 44 | mistralai/mistral-small-3.2-24b-instruct-2506 | mistralai | 352 ms | 91.0 t/s |
| 45 | arcee-ai/trinity-large-thinking | arcee-ai | 393 ms | 90.0 t/s |
| 46 | qwen/qwen3.5-9b-20260310 | qwen | 489 ms | 90.0 t/s |
| 47 | openai/gpt-5.1-20251113 | openai | 980 ms | 87.0 t/s |
| 48 | sao10k/l3-lunaris-8b | sao10k | 162 ms | 87.0 t/s |
| 49 | anthropic/claude-4.5-haiku-20251001 | anthropic | 578 ms | 84.0 t/s |
| 50 | google/gemini-2.0-flash-001 | google | 414 ms | 84.0 t/s |
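
The P50 columns report a 50th percentile, which is conventionally the median of the observed samples. The page does not say how its samples are collected or aggregated, so the following is only a minimal sketch of how a P50 value could be derived from per-request measurements. The sample values and the `p50` helper are hypothetical and are not taken from the leaderboard.

```python
from statistics import median

# Hypothetical per-request measurements for a single model.
# These numbers are illustrative only and do not come from the leaderboard.
latencies_ms = [90, 94, 101, 88, 120]
throughputs_tps = [700.0, 721.0, 735.5, 690.0, 760.0]


def p50(samples):
    """Return the 50th percentile (median) of a non-empty list of numbers."""
    return median(samples)


p50_latency = p50(latencies_ms)
p50_throughput = p50(throughputs_tps)

print(f"P50 latency: {p50_latency} ms | P50 throughput: {p50_throughput} t/s")
```

A median is less sensitive to outliers than a mean, which may be why a ranking like this one reports P50 rather than an average: a handful of very slow requests would not shift the figure much.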