This paper introduces Intern-S1, a scientific multimodal foundation model designed to bridge the capability gap between open-source and top-tier proprietary models in complex scientific fields. With a total of 241 billion parameters (28 billion activated), Intern-S1 functions as a “specialized generalist,” utilizing a Mixture-of-Experts (MoE) architecture and extensive training on 5 trillion tokens—half of which are scientific data—to master tasks ranging from general reasoning to professional-level molecular analysis.
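The core idea of a Mixture-of-Experts layer, where only a small fraction of parameters is activated per token, can be illustrated with a minimal sketch. The dimensions, router, and expert shapes below are toy assumptions for illustration, not Intern-S1's actual architecture:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy dimensions (illustrative only; Intern-S1's real sizes are far larger).
D, E, K = 16, 8, 2  # hidden size, number of experts, experts activated per token

# Each expert is a small linear map; the router is a linear gate over experts.
experts = [rng.standard_normal((D, D)) * 0.02 for _ in range(E)]
router = rng.standard_normal((D, E)) * 0.02

def moe_layer(x):
    """Route each token to its top-K experts and mix their outputs."""
    logits = x @ router                         # (tokens, E) routing scores
    topk = np.argsort(logits, axis=-1)[:, -K:]  # indices of the K best experts
    out = np.zeros_like(x)
    for t in range(x.shape[0]):
        scores = logits[t, topk[t]]
        weights = np.exp(scores - scores.max())
        weights /= weights.sum()                # softmax over selected experts
        for w, e in zip(weights, topk[t]):
            out[t] += w * (x[t] @ experts[e])   # only K of E experts run
    return out

tokens = rng.standard_normal((4, D))
y = moe_layer(tokens)
print(y.shape)  # (4, 16)
```

This is how an MoE model can hold 241B parameters while activating only 28B per token: each token touches just K experts, so compute scales with the activated subset rather than the full parameter count.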
Major Discoveries

A central innovation presented in the report is the Mixture-of-Rewards (MoR) framework for reinforcement learning. Unlike traditional methods, which often struggle to balance diverse training objectives, MoR enables the model to simultaneously optimize for over 1,000 different tasks. By synergizing rewards from verified rules, model-based feedback, and environmental interactions, this approach significantly enhances the model’s scalability and adaptability, allowing it to learn professional scientific skills with far greater data efficiency than previous approaches.
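The idea of mixing the three reward sources can be sketched as a per-task dispatcher. This interface, the task names, and the placeholder scorers are assumptions for illustration; the report does not specify MoR's implementation at this level:

```python
from dataclasses import dataclass
from typing import Callable, Dict

@dataclass
class Sample:
    task: str
    response: str
    reference: str

def rule_reward(s: Sample) -> float:
    # Verifiable tasks: exact match against a checkable reference answer.
    return 1.0 if s.response.strip() == s.reference.strip() else 0.0

def model_reward(s: Sample) -> float:
    # Placeholder for a learned reward model's score in [0, 1].
    return 0.5

def env_reward(s: Sample) -> float:
    # Placeholder for outcome-based feedback from an interactive environment.
    return 0.0

# Hypothetical routing table: each task type draws its reward from the
# source best suited to verify it.
REWARD_ROUTER: Dict[str, Callable[[Sample], float]] = {
    "math": rule_reward,      # checkable by rules
    "writing": model_reward,  # judged by a reward model
    "tool_use": env_reward,   # scored by environment outcome
}

def mixture_of_rewards(s: Sample) -> float:
    """Dispatch each sample to the reward source for its task."""
    return REWARD_ROUTER[s.task](s)

print(mixture_of_rewards(Sample("math", "42", "42")))  # 1.0
```

The design point is that no single reward signal covers 1,000+ tasks: rule checks are precise but only exist for verifiable problems, while model-based and environment rewards cover the open-ended remainder.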
Bridging the Gap with “InternBootCamp”

The researchers developed InternBootCamp, a comprehensive post-training environment that facilitates the model’s evolution from a generalist to a scientific expert.
Verifiable Task Scaling: The platform hosts over 1,000 domain-diverse environments, enabling the automated generation and verification of training cases to ensure rigorous learning.
Agent-Driven Curation: By employing agent workflows for data mining, the team successfully increased the purity of scientific data in their pre-training corpus from approximately 2% to over 50%.
Scientific Reasoning: This infrastructure allowed Intern-S1 to achieve state-of-the-art performance in reasoning tasks, proving that open-source models can rival closed-source systems when supported by high-quality, verifiable task environments.
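The generate-and-verify loop behind a verifiable task environment can be sketched as follows. The interface and the arithmetic task are hypothetical stand-ins; the report describes 1,000+ domain-diverse environments but not this exact API:

```python
import random

class ArithmeticEnv:
    """Toy verifiable environment: generates problems and checks answers."""

    def __init__(self, seed=0):
        self.rng = random.Random(seed)

    def generate(self):
        # Automated case generation: fresh instances on demand.
        a, b = self.rng.randint(1, 99), self.rng.randint(1, 99)
        return {"prompt": f"What is {a} + {b}?", "answer": a + b}

    def verify(self, case, response: str) -> float:
        # Automated verification: reward 1.0 only for a correct,
        # parseable answer.
        try:
            return 1.0 if int(response.strip()) == case["answer"] else 0.0
        except ValueError:
            return 0.0

env = ArithmeticEnv()
case = env.generate()
print(env.verify(case, str(case["answer"])))  # 1.0
print(env.verify(case, "not a number"))       # 0.0
```

Because every case is generated and scored programmatically, such environments supply unlimited, rigorously checkable training signal without human labeling, which is what makes reinforcement learning at this task scale practical.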
Key Scientific Capabilities

The study identifies specific domains where Intern-S1 demonstrates exceptional proficiency, often surpassing leading closed-source models.
Molecular Synthesis: Intern-S1 excels in professional tasks such as planning synthesis routes for complex molecules and predicting chemical reaction outcomes, showcasing deep domain expertise.
Multimodal Mastery: The model integrates a dynamic tokenizer and specialized encoders (e.g., for time-series and non-natural visual data), enabling it to natively understand and process diverse scientific modalities like protein sequences and seismic signals.
Competitive General Reasoning: Despite its heavy scientific specialization, the model maintains top-tier performance on general benchmarks (e.g., MMLU-Pro, GPQA), demonstrating that deep domain focus need not compromise general intelligence.
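The first step in handling diverse scientific modalities natively is deciding which encoder an input should be routed to. The heuristics below are illustrative assumptions, not the report's dynamic tokenizer, but they show the shape of such a dispatch:

```python
# Illustrative modality detection: classify an input so it can be sent to
# the matching specialized encoder (routing rules are assumptions).

AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")  # standard one-letter residue codes

def detect_modality(x) -> str:
    """Classify an input for encoder routing."""
    if isinstance(x, (list, tuple)) and all(
        isinstance(v, (int, float)) for v in x
    ):
        return "time_series"       # e.g., seismic signals -> time-series encoder
    if isinstance(x, str) and x and set(x.upper()) <= AMINO_ACIDS:
        return "protein_sequence"  # residue string -> protein encoder
    return "text"                  # default: ordinary language tokenizer

print(detect_modality([0.1, -0.3, 0.7]))           # time_series
print(detect_modality("MKTAYIAKQR"))               # protein_sequence
print(detect_modality("Plan a synthesis route."))  # text
```

A production system would use far more robust detection (and a learned, dynamic tokenizer rather than fixed rules), but the principle is the same: each modality reaches an encoder built for its structure instead of being forced through a generic text tokenizer.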
Overall, the Intern-S1 Technical Report outlines a path toward Artificial General Intelligence (AGI) in the scientific domain. By combining massive-scale scientific pre-training with innovative reinforcement learning strategies, Intern-S1 not only democratizes access to high-level scientific reasoning but also establishes a new benchmark for open-source foundation models.


