ByteDance Seed Team

ByteDance Seed Team Introduction

Last updated: June 2026

Organization type: Corporate AI foundation model research team

Company: ByteDance

Main directions: large language models, multi-modal models, video generation, image generation, voice interaction, AI Agent, AI Infra, AI for Science, robotics and embodied intelligence

Organization Profile

The ByteDance Seed team is ByteDance's core AI research team for artificial general intelligence and foundation models. It was established in 2023. The team's goal is to explore new approaches to general intelligence, keep raising the ceiling of model intelligence, and apply foundation model capabilities to real products and large-scale user scenarios.

As of 2026, Seed is no longer a single large language model team. It has expanded into a comprehensive AI research organization covering foundation models, multi-modal understanding and generation, voice interaction, video generation, 3D generation, AI Agent, AI Infra, AI for Science, robotics and Responsible AI. Its technical work supports multiple ByteDance products and business scenarios, including applications such as Doubao, Coze, and Jimeng.

The Seed team's positioning can be summarized as follows: foundation models are the base, multi-modality and Agent capabilities are the core growth directions, and large-scale product deployment is the proving ground, while the team continues to explore long-term AGI paths, model architectures, training systems and world models.

Research Directions

The Seed team's research is broad, covering not only model algorithms but also training systems and productization.

1. Large language model and Agent

The Seed LLM direction focuses on next-generation large language models. Research topics include pre-training, post-training, reasoning, long context, model memory, interpretability, inference acceleration, tool use and Agent capabilities. In recent years, Seed's general model route has expanded from traditional chat models to Agent models that can carry out complex tasks, such as model systems that support search, code execution, GUI operation, multi-turn task planning and long-chain reasoning.

Representative models include Seed1.8, Seed2.0, Seed-OSS-36B, Seed-Thinking-v1.5, etc.

2. Multimodal understanding and world model

Seed's multi-modal direction emphasizes unified understanding and generation, covering images, video, audio, text, GUIs, 3D space and interaction with virtual and real environments. The goal is not simply to improve visual question answering, but to build a multi-modal foundation model that can understand the world, generate content, operate interfaces and take part in complex tasks.

Representative achievements include Seed1.5-VL, BAGEL, Seed3D 2.0, Depth Anything 3, etc.

3. Image, video and 3D generation

Seed has built a fairly complete product line in visual generation, including image generation and editing, video generation, joint audio-video generation and 3D content generation.

Within this line, the Seedance series focuses on video and joint audio-video generation, the Seedream series focuses on image generation, SeedEdit focuses on image editing through natural language, and Seed3D focuses on high-quality 3D content generation. These models target scenarios such as creative production, short videos, advertising, e-commerce, games, 3D assets and embodied intelligence.

4. Voice and real-time interaction

The Seed Speech direction focuses on speech, audio, music, natural language understanding and multi-modal deep learning. Compared with traditional speech recognition, speech synthesis or half-duplex voice assistants, Seed's new direction emphasizes end-to-end, multi-modal, real-time and natural voice interaction.

Representative models include Seeduplex, Seed LiveInterpret, Seed-Music, etc. Among them, Seeduplex is a full-duplex large voice model for real-time voice dialogue, emphasizing "listening and speaking at the same time", robustness to interference, endpoint detection and a more natural dialogue rhythm.

5. AI infrastructure

The Seed Infrastructures direction provides the underlying system capabilities required for foundation model development, including distributed training, reinforcement learning frameworks, high-performance inference, compilation for heterogeneous hardware, training stability, inference scheduling and hardware co-optimization.

This direction matters a great deal to Seed, because competition in large models depends not only on algorithms but also on whether large-scale training, post-training and online inference can be completed stably, efficiently and at low cost.

6. AI for Science

Seed's AI for Science direction pursues new paradigms for scientific computing, with a focus on biological foundation models, protein and molecular structure prediction, quantum chemistry, molecular dynamics, and materials and drug discovery.

Representative achievements include biomolecular foundation models such as Protenix, as well as research on quantum chemistry and molecular simulation.

7. Robots and Embodied Intelligence

Seed Robotics focuses on general-purpose intelligent robots. Its research includes robot foundation models, vision-language-action models, robot perception, dexterous manipulation, reinforcement learning, robot data engines and interaction with real environments.

Representative achievements include Seed GR-RL, Seed GR-3, ByteDexter, etc. This direction shows that Seed has extended foundation model capabilities from text and multi-modal content generation to real-world operation and embodied intelligence.

8. Responsible AI

The Seed Responsible AI direction focuses on research into AI safety, reliability, sustainability and trust mechanisms. Its research covers not only safety evaluation but also Agent-oriented long-term memory, task-related memory selection and paper search agents.

Representative achievements include PaSa (Paper Search Agent) and TaskMem.

Representative Models and Projects

Seed2.0

Seed2.0 is a new-generation foundation model series launched by Seed in 2026, targeting large-scale production deployment and complex real-world tasks. The series includes three general Agent models (Pro, Lite and Mini) as well as a dedicated Code model.

The key capabilities of Seed2.0 include:

Stronger multi-modal understanding, covering images, videos, audio and text;
Stronger complex instruction following, long-chain reasoning and multi-step task handling;
An optimized Agent workflow that supports planning, execution, reflection and tool orchestration;
Models at different scales to balance performance, cost, latency and deployment density;
An upgrade to Seed2.0 Lite at the end of April 2026 that further enhanced audio input and full-modal understanding.

Seed2.0 marks a further shift in Seed's foundation model route from "general-purpose dialogue and multi-modal understanding" to "Agent foundation models oriented to real task execution".
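The planning, execution, reflection and tool orchestration workflow mentioned above can be pictured with a small loop. The following is only a minimal sketch and does not describe how Seed2.0 is implemented: the tool names, the fixed plan and the functions plan, reflect and run_agent are all hypothetical stand-ins for what would be model calls in a real Agent system.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Hypothetical tool registry; names and behaviour are illustrative only.
TOOLS: Dict[str, Callable[[str], str]] = {
    "search": lambda query: f"results for {query!r}",
    "add": lambda args: str(sum(int(x) for x in args.split())),
}


@dataclass
class Step:
    tool: str       # which tool to call
    argument: str   # plain-text argument passed to the tool


@dataclass
class AgentRun:
    goal: str
    log: List[str] = field(default_factory=list)


def plan(goal: str) -> List[Step]:
    # Planning: a real system would ask the model; here the plan is fixed.
    return [Step("search", goal), Step("add", "2 3")]


def reflect(step: Step, observation: str) -> bool:
    # Reflection: stand-in check that accepts any non-empty observation.
    return bool(observation)


def run_agent(goal: str, max_retries: int = 1) -> AgentRun:
    run = AgentRun(goal)
    for step in plan(goal):
        for _ in range(max_retries + 1):
            # Tool orchestration: look up the tool and execute the step.
            tool = TOOLS.get(step.tool)
            observation = tool(step.argument) if tool else ""
            run.log.append(f"{step.tool}({step.argument}) -> {observation}")
            if reflect(step, observation):
                break  # step accepted, move on to the next one
    return run
```

Calling run_agent("Seed-OSS context length") returns an AgentRun whose log records one line per tool call. The retry loop is where a failed reflection would trigger another attempt at the same step.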

Seed1.8

Seed1.8 is a foundation model for real-world Agent capabilities. It supports text and image input and emphasizes multi-turn interaction, search, code generation and execution, GUI understanding and multi-step task execution.

Compared with traditional chat models, Seed1.8 focuses more on "how the model completes tasks in the environment." It integrates perception, reasoning and action into one model system through a unified Agent interaction interface, and is suited to research tasks, coding tasks, visual interface operation and complex tool-calling scenarios.

Seed-OSS-36B

Seed-OSS-36B is a large language model series released by Seed for the open source community, with 36B parameters. The series is released under the Apache-2.0 license, natively supports a 512K long context, and emphasizes reasoning, Agent capabilities, long-text processing and developer-friendly features.

Seed-OSS-36B includes:

Seed-OSS-36B-Base;
Seed-OSS-36B-Base-woSyn;
Seed-OSS-36B-Instruct.

Among them, the Base-woSyn version limits the influence of synthetic instruction data on the pre-trained model, which makes it better suited to research on post-training, alignment and foundation models; the Instruct version is oriented to instruction following, reasoning, code and Agent tasks.

Seed-Coder

Seed-Coder is Seed's open source code model series. The models have 8B parameters and come in base, instruct and reasoning versions. The focus of this project is not only the model itself, but also a pipeline in which large language models take part in filtering data and constructing code pre-training data.

The release of Seed-Coder shows that, in code intelligence, the Seed team focuses not only on model performance but also on how to use models to build high-quality training data automatically, thereby reducing the cost of manual data curation.

BAGEL

BAGEL is Seed's open source unified multimodal model, supporting both multimodal understanding and generation. It brings interleaved data from multiple sources, such as text, images, videos and web pages, into a unified pre-training framework, and offers capabilities such as image understanding, image generation, image editing, style transfer, future frame prediction, 3D manipulation and world navigation.

BAGEL is an important exploration by Seed in the direction of "unified modeling of understanding and generation". It also reflects Seed's view of the long-term path for multi-modal foundation models: future models should not handle recognition, question answering or generation as isolated capabilities, but should be able to understand, generate and interact within a single architecture.

Seed1.5-VL

Seed1.5-VL is Seed's vision-language foundation model, oriented to general multi-modal understanding and reasoning tasks. It performs well on tasks such as visual reasoning, image question answering, chart understanding, visual grounding, counting, video understanding and GUI Agent tasks.

The significance of Seed1.5-VL is that it advances the multi-modal model from "looking at pictures to answer questions" to more complex visual reasoning, interface understanding and agent control scenarios.

Seedance 2.0

Seedance 2.0 is Seed's new-generation video generation model. It adopts a unified multi-modal architecture for joint audio-video generation and supports multiple input forms such as text, images, audio and video.

Compared with early video generation models, Seedance 2.0 places more emphasis on complex motion, physical consistency, subject consistency, audio-visual synchronization and controllable editing. It accepts multi-modal reference material, such as images, video clips, audio and natural language instructions, and is suited to short video creation, advertising creative work, film and television previews and high-quality content production.

Seedream 5.0 Lite

Seedream 5.0 Lite is Seed's unified multi-modal image generation model, with stronger understanding, reasoning and generation capabilities and newly added online search. It is oriented to professional visual creation, emphasizing text instruction following, style control, composition, consistency of detail and visualization of trending content.

Seeduplex

Seeduplex is a full-duplex large voice model launched by Seed, aiming for more natural real-time voice interaction. It differs from the traditional half-duplex pattern in which "the user finishes speaking and then the system answers": while listening to the user, it can understand the environment, judge whether to interrupt, filter out interference and control the rhythm of its replies.

This direction reflects Seed’s judgment on the next generation of AI interaction: voice assistants should not just be voice chatbots, but should have real-time understanding and interaction capabilities that are closer to real-person conversations.
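The contrast between half-duplex and full-duplex interaction described above can be made concrete with a toy turn-taking rule. This is only an illustrative sketch under strong assumptions, not a description of Seeduplex, which is presented as a large end-to-end model: the Frame labels, the full_duplex_policy function and the end_of_turn_frames threshold are hypothetical, and each audio frame is assumed to be already classified.

```python
from enum import Enum
from typing import Iterable, List


class Frame(Enum):
    # Hypothetical per-frame labels; a real system would infer these from audio.
    USER_SPEECH = "user_speech"
    BACKGROUND_NOISE = "background_noise"
    SILENCE = "silence"


def full_duplex_policy(frames: Iterable[Frame], end_of_turn_frames: int = 2) -> List[str]:
    """Decide for every incoming frame whether the assistant speaks or listens.

    The input is processed continuously, so the assistant keeps "hearing"
    while it is speaking and can yield as soon as the user talks again.
    """
    actions: List[str] = []
    non_speech_run = 0  # consecutive frames without user speech
    for frame in frames:
        if frame is Frame.USER_SPEECH:
            # User speech (including barge-in) resets the endpoint counter.
            non_speech_run = 0
        else:
            # Noise is treated like silence: it never counts as a user turn.
            non_speech_run += 1
        # Endpoint rule: speak only after enough non-speech frames in a row.
        actions.append("speak" if non_speech_run >= end_of_turn_frames else "listen")
    return actions
```

With frames [USER_SPEECH, USER_SPEECH, SILENCE, BACKGROUND_NOISE, SILENCE, USER_SPEECH, SILENCE, SILENCE], the policy listens for the first three frames, speaks for two, yields when the user speaks again, and resumes after the next pause. A half-duplex system would instead ignore input entirely while speaking.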

Seed3D 2.0

Seed3D 2.0 is Seed’s 3D generative model focused on improving geometric accuracy, material quality and downstream usability. It targets scenarios such as high-quality 3D content generation, simulation, industrial manufacturing, game assets, robotics, and embodied intelligence.

Rather than simply generating models that "look like 3D", Seed3D 2.0 places more emphasis on geometry, PBR materials, editability, and simulation usability.

Depth Anything 3

Depth Anything 3 is an open source visual spatial reconstruction model released by Seed. It uses a simpler single-Transformer architecture and a unified depth-ray representation to extend monocular depth estimation to spatial reconstruction from arbitrary viewpoints. This project reflects Seed's accumulated research in visual spatial intelligence and 3D perception.

Long-term research plan: Seed Edge

Seed Edge is a long-term research plan initiated by the Seed team. Its goal is to explore new approaches to general intelligence and to support AI research topics that are highly uncertain and high-risk but potentially groundbreaking, over a longer time horizon.

The plan emphasizes collaboration across modalities and research directions, focusing on the limits of reasoning and perception, model design that integrates software and hardware, next-generation learning paradigms, and new scaling paths. Seed Edge shows that Seed not only focuses on short-term model iteration and product deployment, but is also trying to establish a mechanism for longer-term AGI research.

Open source ecosystem and developer resources

Seed has built a fairly clear external developer ecosystem, which mainly includes:

ByteDance Seed official website;
ByteDance-Seed GitHub organization;
ByteDance-Seed Hugging Face organization;
Open source projects such as Seed-OSS, Seed-Coder, BAGEL, Depth Anything 3, Seed1.5-VL, etc.;
Some model APIs on the Volcano Engine/Ark platform;
Top Seed talent program for researchers and students.

In terms of open source strategy, Seed does not open-source all of its flagship models. Instead it combines "closed source flagship models + open source research models + open source tool chains and data methods". On the one hand, core capabilities such as Seed2.0, Seedance 2.0 and Seedream mainly serve products and platforms; on the other hand, projects such as Seed-OSS, BAGEL, Seed-Coder and Depth Anything 3 are used to attract developers, researchers and open source communities to participate.

Team Characteristics

The characteristics of the Seed team can be summarized as follows:

Complete model lineup: From LLM, VLM, video, image, voice and 3D to robotics, Seed has built multiple foundation model product lines.
Focus on real-world validation: Seed's model capabilities are not limited to papers and leaderboards, but are also validated at scale through products such as Doubao, Coze and Jimeng.
Clear Agent-oriented trend: From Seed1.8 to Seed2.0, Seed is advancing model capabilities from question answering and generation to search, code execution, GUI operation and complex task execution.
Outstanding multi-modal generation capabilities: Seedance, Seedream, SeedEdit, Seed3D and BAGEL together make up Seed's technology matrix in visual generation and unified multi-modal modeling.
Deep investment in infrastructure: Seed not only builds models, but also invests in AI Infra such as distributed training, RL systems, inference optimization and compilation for heterogeneous hardware.
Long-term AGI research and short-term product deployment in parallel: Seed Edge represents long-term exploration, and applications such as Doubao represent product deployment. The two together make up Seed's development path.

Development Timeline

2023: ByteDance established the Seed team and began to systematically invest in basic model research and development.
January 2025: The Seed Edge long-term research plan was launched, focusing on the long-term exploration of general intelligence.
April 2025: Technical details of Seed-Thinking-v1.5 were announced, demonstrating Seed's progress in reasoning models.
May 2025: Seed-Coder, BAGEL, Seed1.5-VL and other models and projects were released in succession, presenting Seed's code, multi-modal and vision-language model capabilities to the public.
August 2025: Seed-OSS-36B was open-sourced, providing a large language model series with 36B parameters and a 512K context.
November 2025: Depth Anything 3 was released, showcasing Seed's progress in spatial reconstruction and vision foundation models.
December 2025: Seed1.8 was released, emphasizing real-world Agent capabilities.
February 2026: Seed2.0 and Seedance 2.0 were released, and Seed's model route shifted further towards complex task execution and unified audio-video generation.
April 2026: Seeduplex, Seed3D 2.0, Seedream 5.0 Lite and other model updates were released, as Seed continued to expand in voice, 3D and image generation.
End of April 2026: Seed2.0 Lite was upgraded to a stronger full-modal understanding model, enhancing audio input, Agent, Coding and GUI capabilities.

Summary

The ByteDance Seed team has grown from an early R&D team for the Doubao large model into a comprehensive AI research organization covering foundation models, multi-modal generation, voice interaction, Agent, AI Infra, AI for Science and robotics.

If Seed's main task in 2023-2024 was to catch up and establish foundation model capabilities, then in 2025-2026 it has entered a stage of systematic expansion: on the one hand, it serves ByteDance's internal and external application scenarios through models such as Seed2.0, Seedance 2.0, Seedream and Seeduplex; on the other hand, it takes part in the open source ecosystem and research community competition through projects such as Seed-OSS, BAGEL, Seed-Coder and Depth Anything 3.

Judging from the development trend, Seed's core strategy is shifting from "releasing a single large model" to "building a multi-modal, executable, interactive and deployable foundation model system." This makes it one of the teams most worth continued attention in the foundation model competition in China and globally.
