Tongyi DeepResearch 30B A3B
Server-rendered model summary page for indexing/share previews. Use the interactive explorer for full filtering and comparison.
Identifiers & provenance
- Primary ID: alibaba/tongyi-deepresearch-30b-a3b
- OpenRouter ID: alibaba/tongyi-deepresearch-30b-a3b
- Canonical slug: alibaba/tongyi-deepresearch-30b-a3b
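The primary ID above is the string you pass as the `model` field when calling this model through OpenRouter's chat-completions endpoint (POST https://openrouter.ai/api/v1/chat/completions with bearer-token auth). A minimal request-body sketch, without sending anything; the parameter values are illustrative, and only parameters in the model's supported_parameters list apply:

```python
import json

# Request body for OpenRouter's chat completions endpoint. The "model" value
# is the primary ID from the identifiers list above; max_tokens, temperature,
# and top_p are all in this model's supported_parameters.
payload = {
    "model": "alibaba/tongyi-deepresearch-30b-a3b",
    "messages": [{"role": "user", "content": "Summarize the latest findings on X."}],
    "max_tokens": 1024,
    "temperature": 0.7,
    "top_p": 0.9,
}
body = json.dumps(payload)  # serialized JSON string to send as the POST body
```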
Source semantics
- Arena rank is a human-preference leaderboard signal, not a universal truth metric.
- OpenRouter usage/popularity reflects adoption/traffic, not benchmark quality.
- Pricing fields may differ by provider and can include extra modes beyond prompt/completion.
Read more on Methodology & data sources.
Description
Tongyi DeepResearch is an agentic large language model developed by Tongyi Lab, with 30 billion total parameters activating only 3 billion per token. It's optimized for long-horizon, deep information-seeking tasks and delivers state-of-the-art performance on benchmarks like Humanity's Last Exam, BrowserComp, BrowserComp-ZH, WebWalkerQA, GAIA, xbench-DeepSearch, and FRAMES. This makes it superior for complex agentic search, reasoning, and multi-step problem-solving compared to prior models.

The model includes a fully automated synthetic data pipeline for scalable pre-training, fine-tuning, and reinforcement learning. It uses large-scale continual pre-training on diverse agentic data to boost reasoning and stay fresh. It also features end-to-end on-policy RL with a customized Group Relative Policy Optimization, including token-level gradients and negative sample filtering for stable training. The model supports ReAct for core ability checks and an IterResearch-based 'Heavy' mode for max performance through test-time scaling. It's ideal for advanced research agents, tool use, and heavy inference workflows.
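The ReAct pattern named in the description alternates model "thoughts" with tool calls until the agent emits a final answer. A minimal, model-agnostic sketch of that loop, with a toy search tool and a scripted stand-in for the model's policy (this is a generic illustration of ReAct, not Tongyi's actual agent code):

```python
# Minimal ReAct-style loop: Thought -> Action -> Observation, repeated until
# the policy chooses the "finish" action. A real deployment would call the
# LLM at each step instead of the scripted policy below.

def search_tool(query: str) -> str:
    """Toy retrieval tool; any callable returning text would do."""
    corpus = {"tongyi deepresearch params": "30B total parameters, ~3B active per token"}
    return corpus.get(query.lower(), "no results")

def react_loop(question: str, policy, tools, max_steps: int = 5) -> str:
    transcript = [f"Question: {question}"]
    for _ in range(max_steps):
        step = policy(transcript)            # model decides the next step
        transcript.append(f"Thought: {step['thought']}")
        if step["action"] == "finish":
            return step["input"]             # final answer
        observation = tools[step["action"]](step["input"])
        transcript.append(f"Action: {step['action']}[{step['input']}]")
        transcript.append(f"Observation: {observation}")
    return "gave up"

def scripted_policy(transcript):
    # Two-step script: search once, then answer with the last observation.
    if not any(line.startswith("Observation:") for line in transcript):
        return {"thought": "I should look this up.",
                "action": "search", "input": "tongyi deepresearch params"}
    last_obs = [line for line in transcript if line.startswith("Observation:")][-1]
    return {"thought": "I have what I need.",
            "action": "finish", "input": last_obs.removeprefix("Observation: ")}

answer = react_loop("How many parameters does the model activate?",
                    scripted_policy, {"search": search_tool})
# answer == "30B total parameters, ~3B active per token"
```

The 'Heavy' mode the description mentions (IterResearch-based test-time scaling) would sit on top of a loop like this, running multiple research rounds rather than a single pass.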
Raw fields snapshot
{
  "id": "alibaba/tongyi-deepresearch-30b-a3b",
  "name": "Tongyi DeepResearch 30B A3B",
  "description": "Tongyi DeepResearch is an agentic large language model developed by Tongyi Lab, with 30 billion total parameters activating only 3 billion per token. It's optimized for long-horizon, deep information-seeking tasks and delivers state-of-the-art performance on benchmarks like Humanity's Last Exam, BrowserComp, BrowserComp-ZH, WebWalkerQA, GAIA, xbench-DeepSearch, and FRAMES. This makes it superior for complex agentic search, reasoning, and multi-step problem-solving compared to prior models.\n\nThe model includes a fully automated synthetic data pipeline for scalable pre-training, fine-tuning, and reinforcement learning. It uses large-scale continual pre-training on diverse agentic data to boost reasoning and stay fresh. It also features end-to-end on-policy RL with a customized Group Relative Policy Optimization, including token-level gradients and negative sample filtering for stable training. The model supports ReAct for core ability checks and an IterResearch-based 'Heavy' mode for max performance through test-time scaling. It's ideal for advanced research agents, tool use, and heavy inference workflows.",
  "created": 1758210804,
  "canonical_slug": "alibaba/tongyi-deepresearch-30b-a3b",
  "hugging_face_id": "Alibaba-NLP/Tongyi-DeepResearch-30B-A3B",
  "source_type": "openrouter_only",
  "context_length": 131072,
  "max_completion_tokens": 131072,
  "is_moderated": false,
  "architecture": {
    "modality": "text->text",
    "input_modalities": [
      "text"
    ],
    "output_modalities": [
      "text"
    ],
    "tokenizer": "Other",
    "instruct_type": null
  },
  "input_modalities": [
    "text"
  ],
  "output_modalities": [
    "text"
  ],
  "modality": "text->text",
  "tokenizer": "Other",
  "instruct_type": null,
  "supported_parameters": [
    "include_reasoning",
    "max_tokens",
    "reasoning",
    "response_format",
    "structured_outputs",
    "temperature",
    "tool_choice",
    "tools",
    "top_p"
  ],
  "default_parameters": {
    "temperature": null,
    "top_p": null,
    "frequency_penalty": null
  },
  "per_request_limits": null,
  "top_provider": {
    "context_length": 131072,
    "max_completion_tokens": 131072,
    "is_moderated": false
  },
  "pricing": {
    "prompt": "0.00000009",
    "completion": "0.00000045",
    "input_cache_read": "0.00000009"
  },
  "PPM": {
    "prompt": 0.09,
    "completion": 0.45,
    "input_cache_read": 0.09
  },
  "openrouter_raw": {
    "id": "alibaba/tongyi-deepresearch-30b-a3b",
    "canonical_slug": "alibaba/tongyi-deepresearch-30b-a3b",
    "hugging_face_id": "Alibaba-NLP/Tongyi-DeepResearch-30B-A3B",
    "name": "Tongyi DeepResearch 30B A3B",
    "created": 1758210804,
    "description": "Tongyi DeepResearch is an agentic large language model developed by Tongyi Lab, with 30 billion total parameters activating only 3 billion per token. It's optimized for long-horizon, deep information-seeking tasks and delivers state-of-the-art performance on benchmarks like Humanity's Last Exam, BrowserComp, BrowserComp-ZH, WebWalkerQA, GAIA, xbench-DeepSearch, and FRAMES. This makes it superior for complex agentic search, reasoning, and multi-step problem-solving compared to prior models.\n\nThe model includes a fully automated synthetic data pipeline for scalable pre-training, fine-tuning, and reinforcement learning. It uses large-scale continual pre-training on diverse agentic data to boost reasoning and stay fresh. It also features end-to-end on-policy RL with a customized Group Relative Policy Optimization, including token-level gradients and negative sample filtering for stable training. The model supports ReAct for core ability checks and an IterResearch-based 'Heavy' mode for max performance through test-time scaling. It's ideal for advanced research agents, tool use, and heavy inference workflows.",
    "context_length": 131072,
    "architecture": {
      "modality": "text->text",
      "input_modalities": [
        "text"
      ],
      "output_modalities": [
        "text"
      ],
      "tokenizer": "Other",
      "instruct_type": null
    },
    "pricing": {
      "prompt": "0.00000009",
      "completion": "0.00000045",
      "input_cache_read": "0.00000009"
    },
    "top_provider": {
      "context_length": 131072,
      "max_completion_tokens": 131072,
      "is_moderated": false
    },
    "per_request_limits": null,
    "supported_parameters": [
      "include_reasoning",
      "max_tokens",
      "reasoning",
      "response_format",
      "structured_outputs",
      "temperature",
      "tool_choice",
      "tools",
      "top_p"
    ],
    "default_parameters": {
      "temperature": null,
      "top_p": null,
      "frequency_penalty": null
    },
    "expiration_date": null
  }
}
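The PPM block in the snapshot is just the per-token pricing strings scaled to USD per million tokens (per-token price times 1,000,000). A quick sanity check using the snapshot's own values, with Decimal to avoid float drift during the multiplication:

```python
from decimal import Decimal

# Per-token prices from the pricing object above, kept as strings
# exactly as the API quotes them.
pricing = {
    "prompt": "0.00000009",
    "completion": "0.00000045",
    "input_cache_read": "0.00000009",
}

# USD per million tokens = per-token price * 1_000_000.
ppm = {k: float(Decimal(v) * 1_000_000) for k, v in pricing.items()}
# ppm == {"prompt": 0.09, "completion": 0.45, "input_cache_read": 0.09}
```

This matches the PPM object in the snapshot, so the two pricing views are consistent.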