Li Dong

unilm

AI & ML interests

Language Model Pre-Training

Recent Activity

new activity about 16 hours ago

microsoft/VibeVoice-Realtime-0.5B: add gradio app for this model

liked a model 4 days ago

microsoft/VibeVoice-Realtime-0.5B

liked a dataset 17 days ago

ytz20/LMSYS-Chat-GPT-5-Chat-Response

New activity in microsoft/VibeVoice-Realtime-0.5B about 16 hours ago

add gradio app for this model

👍 3

#4 opened 3 days ago by akhaliq

commented a paper about 2 months ago

Ming-UniVision: Joint Image Understanding and Generation with a Unified Continuous Tokenizer

Paper • 2510.06590 • Published Oct 8 • 72

New activity in microsoft/VibeVoice-1.5B 3 months ago

Add library_name: transformers to metadata

#20 opened 3 months ago by nielsr

run locally on iPhones

❤️ 1

#18 opened 3 months ago by Lxdro

Possibly helpful info' for Windows users wanting to run this locally.

❤️ ➕ 2

#10 opened 3 months ago by ScarabOfficial

colab usage

👀 ❤️ 6

#14 opened 3 months ago by unilm

commented a paper 3 months ago

VibeVoice Technical Report

Paper • 2508.19205 • Published Aug 26 • 126

New activity in microsoft/VibeVoice-1.5B 3 months ago

Local Installation Video and Testing - Step by Step

🔥 8

#4 opened 3 months ago by fahdmirzac

issue in README

❤️ 1

#6 opened 3 months ago by hamidtech

commented 2 papers 4 months ago

On the Generalization of SFT: A Reinforcement Learning Perspective with Reward Rectification

Paper • 2508.05629 • Published Aug 7 • 180

On the Generalization of SFT: A Reinforcement Learning Perspective with Reward Rectification

Paper • 2508.05629 • Published Aug 7 • 180

commented 2 papers 6 months ago

Reinforcement Pre-Training

Paper • 2506.08007 • Published Jun 9 • 262

On-Policy RL with Optimal Reward Baseline

Paper • 2505.23585 • Published May 29 • 14

commented a paper 7 months ago

Reward Reasoning Model

Paper • 2505.14674 • Published May 20 • 38

commented a paper 12 months ago

Multimodal Latent Language Modeling with Next-Token Diffusion

Paper • 2412.08635 • Published Dec 11, 2024 • 48

commented a paper about 1 year ago

Differential Transformer

Paper • 2410.05258 • Published Oct 7, 2024 • 179

commented 2 papers over 1 year ago

Direct Preference Knowledge Distillation for Large Language Models

Paper • 2406.19774 • Published Jun 28, 2024 • 22

You Only Cache Once: Decoder-Decoder Architectures for Language Models

Paper • 2405.05254 • Published May 8, 2024 • 10

New activity in microsoft/beit-base-patch16-224 over 2 years ago

Model architecture?

#2 opened over 2 years ago by raresionut

New activity in microsoft/Multilingual-MiniLM-L12-H384 over 2 years ago

Discrepancy in Parameter Count: A Closer Look at the Model's Size and the Number of Layers

#3 opened over 2 years ago by Karim-Gamal