Yuda Song

IDKiro

AI & ML interests

Image Generation

Recent Activity

upvoted a paper 4 days ago

Tuna-2: Pixel Embeddings Beat Vision Encoders for Multimodal Understanding and Generation

Paper • 2604.24763 • Published 6 days ago • 66

liked a model 8 days ago

deepseek-ai/DeepSeek-V4-Pro

Text Generation • 862B • Updated 5 days ago • 382k • 3.39k

upvoted an article 4 months ago

M2.1: Multilingual and Multi-Task Coding with Strong Generalization

Article • Jan 5

liked a model 4 months ago

MiniMaxAI/MiniMax-M2.1

Text Generation • 229B • Updated Feb 13 • 29k • 1.35k

authored a paper 5 months ago

Towards Scalable Pre-training of Visual Tokenizers for Generation

Paper • 2512.13687 • Published Dec 15, 2025 • 106

upvoted a paper 5 months ago

Towards Scalable Pre-training of Visual Tokenizers for Generation

Paper • 2512.13687 • Published Dec 15, 2025 • 106

liked 3 models 5 months ago

upvoted 4 papers 5 months ago

Qwen3-VL Technical Report

Paper • 2511.21631 • Published Nov 26, 2025 • 162

DeepSeek-V3.2: Pushing the Frontier of Open Large Language Models

Paper • 2512.02556 • Published Dec 2, 2025 • 267

Z-Image: An Efficient Image Generation Foundation Model with Single-Stream Diffusion Transformer

Paper • 2511.22699 • Published Nov 27, 2025 • 245

Back to Basics: Let Denoising Generative Models Denoise

Paper • 2511.13720 • Published Nov 17, 2025 • 70

liked a dataset 6 months ago

facebook/PE-Video

Viewer • Updated Apr 18, 2025 • 118k • 2.31k • 45

liked a model 6 months ago

MiniMaxAI/MiniMax-M2

Text Generation • 229B • Updated Dec 23, 2025 • 85.3k • 1.49k

liked 2 datasets 6 months ago

ling99/OCRBench_v2

Viewer • Updated Feb 24, 2025 • 10k • 3.05k • 18

nvidia/Llama-Nemotron-VLM-Dataset-v1

Viewer • Updated Oct 22, 2025 • 2.86M • 1.2k • 163

upvoted 2 papers 8 months ago

AToken: A Unified Tokenizer for Vision

Paper • 2509.14476 • Published Sep 17, 2025 • 37

DINOv3

Paper • 2508.10104 • Published Aug 13, 2025 • 306

liked a model 9 months ago

Qwen/Qwen-Image

Text-to-Image • Updated Aug 18, 2025 • 196k • 2.48k
