arxiv:2603.25040

Intern-S1-Pro: Scientific Multimodal Foundation Model at Trillion Scale

Published on Mar 26 · Submitted by taesiri on Mar 27 · #2 Paper of the day
Abstract

AI-generated summary

Intern-S1-Pro is a one-trillion-parameter scientific multimodal foundation model that enhances general and scientific capabilities through advanced agent functionalities and specialized task mastery across multiple scientific disciplines.

We introduce Intern-S1-Pro, the first one-trillion-parameter scientific multimodal foundation model. Scaling to this unprecedented size, the model delivers a comprehensive enhancement across both general and scientific domains. Beyond stronger reasoning and image-text understanding capabilities, its intelligence is augmented with advanced agent capabilities. Simultaneously, its scientific expertise has been vastly expanded to master over 100 specialized tasks across critical science fields, including chemistry, materials, life sciences, and earth sciences. Achieving this massive scale is made possible by the robust infrastructure support of XTuner and LMDeploy, which facilitates highly efficient Reinforcement Learning (RL) training at the 1-trillion parameter level while ensuring strict precision consistency between training and inference. By seamlessly integrating these advancements, Intern-S1-Pro further fortifies the fusion of general and specialized intelligence, working as a Specializable Generalist, demonstrating its position in the top tier of open-source models for general capabilities, while outperforming proprietary models in the depth of specialized scientific tasks.

Community

Thanks for submitting our paper to Daily Papers. Please add 'Intern Large Models' as the organization. Our GitHub repository is https://github.com/InternLM/Intern-S1, and the model is available at https://huggingface.co/internlm/Intern-S1-Pro.

With pipeline, you only need to specify the task and the model id from the Hub.

from transformers import pipeline

# pipeline wraps tokenization, generation, and decoding in a single call
pipe = pipeline("text-generation", model="distilbert/distilgpt2")

If you want more control, define the tokenizer and model yourself.

from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("distilbert/distilgpt2")
model = AutoModelForCausalLM.from_pretrained("distilbert/distilgpt2")
inputs = tokenizer("Hello, my name is", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))


Sorry for the notation typo in Eq. (2). It is currently written as:

$$\hat{p}_i^{\mathrm{STE}} = \operatorname{sg}\left(m_i\, p_i^{\tau}\right) + \left(p_i^{\tau} - \operatorname{sg}\left(p_i^{\tau}\right)\right)$$

but the intended form is:

$$\hat{p}_i^{\mathrm{STE}} = \operatorname{sg}\left(m_i\, \tilde{p}_i^{\tau}\right) + \left(p_i^{\tau} - \operatorname{sg}\left(p_i^{\tau}\right)\right)$$
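The corrected equation is the standard straight-through-estimator (STE) pattern: the forward pass evaluates the stop-gradient term (here the gated probability), while the gradient flows through the identity residual as if through $p_i^\tau$ alone. A minimal sketch of this behavior, using a toy forward-mode dual number in place of a real autograd framework (all numeric values below are illustrative, not from the paper):

```python
class Dual:
    """Minimal forward-mode dual number: tracks a value and its derivative
    with respect to a single chosen input."""
    def __init__(self, val, der=0.0):
        self.val, self.der = val, der
    def __add__(self, other):
        return Dual(self.val + other.val, self.der + other.der)
    def __sub__(self, other):
        return Dual(self.val - other.val, self.der - other.der)
    def __mul__(self, other):
        # product rule
        return Dual(self.val * other.val,
                    self.der * other.val + self.val * other.der)

def sg(x):
    """Stop-gradient: keep the value, zero out the derivative."""
    return Dual(x.val, 0.0)

# Illustrative numbers: gate m_i = 1, raw prob p_i^tau = 0.7,
# modified prob p~_i^tau = 0.65.
m = Dual(1.0)
p = Dual(0.7, 1.0)       # der = 1: we differentiate w.r.t. p_i^tau
p_tilde = Dual(0.65)
p_hat = sg(m * p_tilde) + (p - sg(p))
print(p_hat.val)  # forward value: m_i * p~_i^tau = 0.65
print(p_hat.der)  # gradient w.r.t. p_i^tau: 1.0 (passes straight through)
```

The forward output is exactly the gated term $m_i\,\tilde{p}_i^{\tau}$, while the derivative with respect to $p_i^{\tau}$ is 1, which is the whole point of the STE: a non-differentiable or detached forward computation paired with an identity gradient path.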


Get this paper in your agent:

hf papers read 2603.25040
Don't have the latest CLI?
curl -LsSf https://hf.co/cli/install.sh | bash

Models citing this paper 2

Datasets citing this paper 0


Spaces citing this paper 0


Collections including this paper 3