First Omni-modal Future Forecasting Benchmark
AI & ML interests: LLM
Papers
HERMES: KV Cache as Hierarchical Memory for Efficient Streaming Video Understanding
FutureOmni: Evaluating Future Forecasting from Omni-Modal Context for Multimodal LLMs
A unified multimodal large language model for end-to-end speaker-attributed, time-stamped transcription.
An Efficient Training Framework for Diffusion Language Models
True Speech-to-Speech Language Model
- OpenMOSS-Team/Embodied_R1-ScienceWorld
  8B • Updated • 1
- OpenMOSS-Team/Embodied_Planner-R1-Alfworld
  8B • Updated • 5
- Unleashing Embodied Task Planning Ability in LLMs via Reinforcement Learning
  Paper • 2506.23127 • Published • 1
- World Modeling Makes a Better Planner: Dual Preference Optimization for Embodied Task Planning
  Paper • 2503.10480 • Published • 55
The MHA2MLA models published in the paper "Towards Economical Inference: Enabling DeepSeek's Multi-Head Latent Attention in Any Transformer-based LLMs". A minimal load sketch follows the list below.
- OpenMOSS-Team/SmolLM-135M-MLA-d_kv_8-refactor
  Text Generation • 0.1B • Updated • 6
- OpenMOSS-Team/SmolLM-135M-MLA-d_kv_32-refactor
  Text Generation • 0.1B • Updated • 25
- OpenMOSS-Team/SmolLM-135M-MLA-d_kv_16-refactor
  Text Generation • 0.1B • Updated • 9
- OpenMOSS-Team/SmolLM-360M-MLA-d_kv_8-refactor
  Text Generation • 0.3B • Updated • 21
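These checkpoints should load through the standard transformers text-generation API. A minimal sketch, assuming the MLA attention ships as custom modeling code on the Hub (hence trust_remote_code=True; not confirmed by the model cards):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "OpenMOSS-Team/SmolLM-135M-MLA-d_kv_8-refactor"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    trust_remote_code=True,  # assumption: MLA layers are custom Hub code
)

# Quick smoke test of text generation with the converted checkpoint.
inputs = tokenizer("The quick brown fox", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```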
- OpenMOSS-Team/moss-moon-003-sft-plugin
  Text Generation • Updated • 13 • 69
- OpenMOSS-Team/moss-moon-003-sft
  Text Generation • Updated • 19 • 127
- OpenMOSS-Team/moss-moon-003-base
  Text Generation • Updated • 140 • 131
- OpenMOSS-Team/moss-moon-003-sft-int4
  Text Generation • Updated • 23 • 40
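The moss-moon-003 family publishes custom modeling code on the Hub, so loading goes through trust_remote_code. A minimal sketch; the <|Human|>/<|MOSS|> turn markers follow MOSS's chat format, but consult the model card for the official meta-instruction and exact prompt template:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "OpenMOSS-Team/moss-moon-003-sft"

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id, trust_remote_code=True
).half().cuda()  # fp16 on GPU; moss-moon-003-sft-int4 trades quality for memory

# Assumed chat format: human turn, end-of-human marker, then the MOSS turn.
prompt = "<|Human|>: Hi! What can you do?<eoh>\n<|MOSS|>:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```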
Synthesizing Multimodal Verifiable Game Data to Boost VLMs' General Reasoning
https://github.com/OpenMOSS/FRoM-W1
Proactive Robot Manipulation in Omni-modal Context
Open source weights of Lorsa modules introduced in "Towards Understanding the Nature of Attention with Low-Rank Sparse Decomposition".
The MHA2MLA models published in the paper "Towards Economical Inference: Enabling DeepSeek's Multi-Head Latent Attention in Any Transformer-based LLMs". A back-of-envelope KV-cache comparison follows the list below.
- Towards Economical Inference: Enabling DeepSeek's Multi-Head Latent Attention in Any Transformer-based LLMs
  Paper • 2502.14837 • Published • 3
- OpenMOSS-Team/Llama-2-7B-MLA-d_kv_16
  Text Generation • 6B • Updated • 4
- OpenMOSS-Team/Llama-2-7B-MLA-d_kv_32
  Text Generation • 6B • Updated • 3
- OpenMOSS-Team/Llama-2-7B-MLA-d_kv_64
  Text Generation • 7B • Updated • 1
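For intuition about the d_kv variants: assuming the suffix denotes the per-layer compressed KV dimension and standard Llama-2-7B shapes, cached values per token compare as in the sketch below. This is idealized; real MLA also keeps a small decoupled RoPE cache per layer, so actual savings are smaller than the raw ratio suggests.

```python
# Assumed (not confirmed by the repo cards) Llama-2-7B geometry.
n_layers, n_heads, d_head = 32, 32, 128

def mha_cache_per_token() -> int:
    # Standard MHA caches one key and one value vector per head per layer.
    return 2 * n_layers * n_heads * d_head

def mla_cache_per_token(d_kv: int) -> int:
    # Idealized MLA caches a single d_kv-dimensional latent per layer.
    return n_layers * d_kv

for d_kv in (16, 32, 64):
    ratio = mha_cache_per_token() / mla_cache_per_token(d_kv)
    print(f"d_kv={d_kv:>2}: {mla_cache_per_token(d_kv):>5} vs "
          f"{mha_cache_per_token()} values/token (~{ratio:.0f}x smaller)")
```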