Akshay Nuthanapati

a0308

AI & ML interests

Neural Networks, Large Language Models

Recent Activity

upvoted an article about 1 month ago

Training mRNA Language Models Across 25 Species for $165

upvoted an article 5 months ago

Deriving the PPO Loss from First Principles

new activity 5 months ago

huggingface/InferenceSupport:haykgrigorian/v2mini-eval1

Organizations

None yet

upvoted an article about 1 month ago

Article

Training mRNA Language Models Across 25 Species for $165

OpenMed • Mar 31 • 27

upvoted an article 5 months ago

Article

Deriving the PPO Loss from First Principles

garg-aayush • Dec 25, 2025 • 42

New activity in huggingface/InferenceSupport 5 months ago

haykgrigorian/v2mini-eval1

👍 3

#6764 opened 5 months ago by a0308

upvoted 4 articles 6 months ago

Article

Exploring Quantization Backends in Diffusers

derekl35, marcsun13, sayakpaul • May 21, 2025 • 45

Article

Diffusers welcomes FLUX-2

YiYiXu, dg845, sayakpaul, OzzyGT, dn6, ariG23498, linoyts, multimodalart • Nov 25, 2025 • 190

Article

Continuous batching from first principles

ror, ArthurZ, mcpotato • Nov 25, 2025 • 378

Article

SmolLM - blazingly fast and remarkably powerful

loubnabnl, anton-l, eliebak • Jul 16, 2024 • 455

upvoted 5 articles 7 months ago

Article

Fine-Tuning Your First Large Language Model (LLM) with PyTorch and Hugging Face

dvgodoy • Feb 11, 2025 • 123

Article

KV Cache from scratch in nanoVLM

ariG23498, kashif, lusxvr, andito, pcuenq • Jun 4, 2025 • 119

Article

Proximal Policy Optimization (PPO)

ThomasSimonini • Aug 5, 2022 • 84

Article

KV Caching Explained: Optimizing Transformer Inference Efficiency

not-lain • Jan 30, 2025 • 326

Article

Get your VLM running in 3 simple steps on Intel CPUs

ezelanza, helenai, nikita-savelyev-intel, echarlaix, IlyasMoutawwakil • Oct 15, 2025 • 22

upvoted a paper 7 months ago

MMaDA: Multimodal Large Diffusion Language Models

Paper • 2505.15809 • Published May 21, 2025 • 98

updated a dataset 7 months ago

a0308/boltz-yaml

Updated Oct 20, 2025 • 10

published a dataset 7 months ago

a0308/boltz-yaml

Updated Oct 20, 2025 • 10

upvoted 2 articles 7 months ago

Article

Welcome Gemma 3: Google's all new multimodal, multilingual, long context open LLM

ariG23498, merve, pcuenq, reach-vb • Mar 12, 2025 • 495

Article

Vision Language Models (Better, faster, stronger)

merve, sergiopaniego, ariG23498, pcuenq, andito • May 12, 2025 • 611

upvoted a paper 7 months ago

D-AR: Diffusion via Autoregressive Models

Paper • 2505.23660 • Published May 29, 2025 • 34

upvoted 2 articles 7 months ago

Article

Introducing Würstchen: Fast Diffusion for Image Generation

dome272, babbleberns, kashif, sayakpaul, pcuenq • Sep 13, 2023 • 21

Article

How 🤗 Accelerate runs very large models thanks to PyTorch

sgugger • Sep 27, 2022 • 18
