Open to Work

Thomas Liang PRO

thliang01

AI & ML interests

Efficient ML

Recent Activity

upvoted a collection about 16 hours ago

LLM PlayBooks

All useful playbooks for training LLMs • 6 items • Updated about 18 hours ago • 2

upvoted a collection 1 day ago

🤏 Smol-Data

Tried and tested mixes for strong pretraining. Inspired by https://huggingface.co/blog/codelion/optimal-dataset-mixing • 14 items • Updated 8 days ago • 12

liked a Space 1 day ago

HuggingFaceFW/finephrase

The Synthetic Data Playbook: Generating Trillions of the Finest Tokens

Explore synthetic data experiments in a visual bookshelf

liked a model 9 days ago

mistralai/Ministral-3-14B-Instruct-2512

Updated Jan 15 • 198k • 262

liked a model 10 days ago

mistralai/Ministral-3-14B-Base-2512

Updated Jan 15 • 13k • 54

upvoted an article 16 days ago

Article

GGML and llama.cpp join HF to ensure the long-term progress of Local AI

18 days ago • 479

upvoted 2 papers 28 days ago

MMMU-Pro: A More Robust Multi-discipline Multimodal Understanding Benchmark

Paper • 2409.02813 • Published Sep 4, 2024 • 33

MMT-Bench: A Comprehensive Multimodal Benchmark for Evaluating Large Vision-Language Models Towards Multitask AGI

Paper • 2404.16006 • Published Apr 24, 2024 • 2

upvoted 2 articles 28 days ago

Article

Vision Language Models Explained

Apr 11, 2024 • 522

Article

SmolVLM - small yet mighty Vision Language Model

Nov 26, 2024 • 416

upvoted a changelog about 1 month ago

Hugging Face Changelog

Find All Your Blog Drafts in One Place

Feb 2 • 43

New activity in twinkle-ai/fineweb-zhtw-filtered about 1 month ago

feat(README): Create Fineweb-zhtw-filtered Banner

#1 opened about 1 month ago by

liked a dataset about 1 month ago

twinkle-ai/fineweb-zhtw-filtered

Updated Feb 1 • 17 • 1

liked a model about 1 month ago

lianghsun/Marble-3B

Text Generation • 3B • Updated Jan 25 • 5

updated a model about 1 month ago

twinkle-ai/gemma-3-4B-T1-it-GGUF

Text Generation • 5B • Updated Jan 30 • 2.29k • 3

liked a Space about 1 month ago

T1 4B Chat

A 4B instruction-tuned Gemma 3 model optimized for zh-tw.

upvoted a changelog about 1 month ago

Hugging Face Changelog

Sort Datasets by Size

Jan 23 • 87

liked 2 datasets about 1 month ago

twinkle-ai/finevoices-zhtw

Updated Jan 25 • 16 • 1

HuggingFaceFW/finepdfs

Viewer • Updated Jan 9 • 476M • 35.4k • 821

upvoted an article about 1 month ago

Article

FineWeb-C: A Community-Driven Dataset for Educational Quality Annotations in 122 Languages

Jul 8, 2025 • 35