Open to Collab

Steffen Röcker PRO

sroecker · https://x.com/sroecker

AI & ML interests

Local models

Recent Activity

upvoted a collection 3 days ago

liked a model 6 days ago

poolside/Laguna-XS.2-speculator.dflash

liked a model 7 days ago

mudler/LFM2.5-8B-A1B-APEX-GGUF

upvoted a collection 3 days ago

🧬 Carbon

Carbon 500M, 3B, 8B genomic models and GGUF variants for llama.cpp • 7 items • Updated 3 days ago • 43

upvoted 2 collections 17 days ago

NuExtract3

12 items • Updated 11 days ago • 15

GSQ

GSQ: Highly-Accurate Low-Precision Scalar Quantization for LLMs via Gumbel-Softmax Sampling, https://huggingface.co/papers/2604.18556 • 9 items • Updated 11 days ago • 6

upvoted a collection 20 days ago

Fixed Chat Templates for Qwen 3.5 & 3.6

Rewritten Jinja templates fixing 5 bugs in official Qwen 3.5/3.6. Works in LM Studio, llama.cpp, MLX, vLLM. • 1 item • Updated Apr 30 • 4

upvoted a paper 21 days ago

Achieving Gold-Medal-Level Olympiad Reasoning via Simple and Unified Scaling

Paper • 2605.13301 • Published 24 days ago • 159

upvoted a collection 24 days ago

SpecDrift

Models released as a part of Attention-Drift Paper, trained for deployment on production • 2 items • Updated 27 days ago • 2

upvoted a collection 27 days ago

Gemma 4 Assistant GGUF

Gemma 4 MTP assistant drafters as GGUF (F16/Q8_0/Q5_K_M/Q4_K_M/Q4_K_S). Speculative-decoding heads for the atomic-llama-cpp-turboquant fork. • 4 items • Updated 29 days ago • 11

upvoted 2 collections about 1 month ago

8GB VRAM Local LLMs - Practitioner Tested

11 models on RTX 4060 Ti 8GB. Speed king: Llama 1B (228 tok/s). Quality king: Gemma 4 E4B (6/6). Agentic king: gpt-oss-20b (10/10). • 21 items • Updated 15 days ago • 5

Granite 4.1 Language Models

Efficient language models for multilingual generation, coding, RAG, and AI assistant workflows. • 6 items • Updated Apr 29 • 56

upvoted an article about 1 month ago

Granite 4.1 LLMs: How They’re Built

Article • ibm-granite • Apr 29 • 77

upvoted 9 collections about 1 month ago

APEX Quants (GGUF)

MoE models quantized with the APEX Quantization technique (https://github.com/mudler/apex-quant) • 36 items • Updated 7 days ago • 109

1930 Coder

Fine-tuning the Talkie 13B 1930 model on agentic trajectories • 4 items • Updated about 1 month ago • 4

Qwen3.6

4 items • Updated Apr 22 • 396

Laguna XS.2

Designed for agentic coding and long-horizon work on a local machine. Apache 2.0. • 5 items • Updated 29 days ago • 24

privacy-filter

OpenAI's privacy-filter fine-tuned models • 6 items • Updated about 1 month ago • 10

talkie-13b

talkie-1930-13b is a vintage language model trained on pre-1931 English-language text. See https://github.com/talkie-lm/talkie to run talkie. • 3 items • Updated Apr 21 • 55

DeepSeek V4

8 items • Updated Apr 25 • 8

DeepSeek-V4

4 items • Updated Apr 24 • 672

DR-Venus

5 items • Updated Apr 24 • 17

upvoted a paper about 1 month ago

Pushing the Limits of Large Language Model Quantization via the Linearity Theorem

Paper • 2411.17525 • Published Nov 26, 2024 • 6