River Rider PRO

RiverRider

Space-Bacon

AI & ML interests

Computational semiotics is empirically proven. It takes three to tango 💃🪩🕺

Recent Activity

updated a Space about 16 hours ago

RiverRider/srt-introspect

published a Space about 16 hours ago

RiverRider/srt-introspect

reacted to their post with 👀 1 day ago

updated a Space about 16 hours ago

SRT introspect

🧭

Adaptive-density reasoning traces over a frozen Qwen-2.5-7B

published a Space about 16 hours ago

SRT introspect

🧭

Adaptive-density reasoning traces over a frozen Qwen-2.5-7B

reacted to their post with 👀 1 day ago

posted an update 5 days ago

Post

181

A single forward pass of the frozen Qwen-2.5-7B model plus a lightweight classifier reaches 0.866 ± 0.011 AUC on the full TruthfulQA-MC2 benchmark. No adapters. No fine-tuning. No extra parameters on the backbone.

This is the strongest hidden-state truthfulness detector reported on the benchmark to date.

The same latent features that the SRT-NLA-AV-v1 demo reads out as coherent natural-language verbalizations turn out to be rich enough to support production-grade auditing for honesty versus hallucination. The internal semiotic infrastructure we have been exploring in public is already information-dense enough to solve hard downstream problems with almost trivial overhead.

You can watch the underlying latent geometry in action right here:
https://huggingface.co/spaces/RiverRider/srt-nla-av-v1-demo

Full code, artifacts, and reproduction steps are in the repository:
https://github.com/space-bacon/SRT

Try the Glass Box
https://huggingface.co/spaces/RiverRider/srt-nla-demo
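
For readers unfamiliar with the setup this post describes (a frozen backbone, a small classifier trained on its hidden states, scored by AUC), here is a minimal sketch of the general probing recipe. It is not the SRT code. The features are synthetic stand-ins for hidden states, and `make_synthetic_features`, `train_logistic_probe`, `probe_score`, and `roc_auc` are hypothetical helper names introduced only for illustration. The logistic-regression probe is an assumption; the post does not say which classifier is used.

```python
import math
import random


def roc_auc(labels, scores):
    """AUC as the chance that a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def train_logistic_probe(X, y, epochs=200, lr=0.1):
    """Fit a linear probe with batch gradient descent on the logistic loss."""
    dim = len(X[0])
    w, b = [0.0] * dim, 0.0
    n = len(X)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * dim, 0.0
        for x, target in zip(X, y):
            z = sum(wi * xi for wi, xi in zip(w, x)) + b
            err = 1.0 / (1.0 + math.exp(-z)) - target
            for i in range(dim):
                grad_w[i] += err * x[i]
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def probe_score(w, b, x):
    """Raw logit of the probe; only the ranking matters for AUC."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def make_synthetic_features(n, dim, rng):
    """Hypothetical stand-in for frozen hidden states: label-1 rows are shifted along one axis."""
    X, y = [], []
    for _ in range(n):
        label = rng.randint(0, 1)
        x = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        x[0] += 1.5 * label
        X.append(x)
        y.append(label)
    return X, y


rng = random.Random(0)
X_train, y_train = make_synthetic_features(200, 8, rng)
X_test, y_test = make_synthetic_features(200, 8, rng)

# The "backbone" is never updated here: only the probe's weights are trained.
w, b = train_logistic_probe(X_train, y_train)
auc = roc_auc(y_test, [probe_score(w, b, x) for x in X_test])
print(f"held-out AUC on synthetic features: {auc:.3f}")
```

In a real run, the rows of `X_train` and `X_test` would be hidden-state vectors extracted from the frozen model for each benchmark item, and the reported spread would come from repeating the split several times.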

updated a Space 6 days ago

MindReader-NLA

🧠

Ask a frozen LM what it is thinking, in plain English.

reacted to their post with 🔥 6 days ago

Post

396

🧠 New Space: MindReader-NLA — ask a frozen LM what it's thinking, in plain English.

A trained Activation Verbalizer (~5–13M params, frozen backbone) over Qwen-2.5-7B, Llama-3.2-3B, and Gemma-2-2B. Three demos in one Space:

Playground — sample K verbalizations of the layer-L hidden state and score how well each reproduces the original activation when fed back through the same frozen model (raw + anisotropy-centred cosine FVE).

Live Thought Trace — stream a verbalization per token as the model writes, side-by-side with the generation.

Steer-by-Editing — edit the verbalized thought, project it back into hidden-state space, and watch the continuation change.
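
The Playground description mentions both a raw and an anisotropy-centred cosine score. The sketch below shows one common way such a pair of scores can differ, assuming "centred" means subtracting a mean activation vector estimated from a reference set before taking the cosine. That reading is an assumption, the Space's exact FVE metric may be defined differently, and all vectors and helper names here are made up for illustration.

```python
import math


def cosine(u, v):
    """Plain cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def mean_vector(vectors):
    """Component-wise mean of a list of vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def centred_cosine(u, v, mu):
    """Cosine after removing the shared mean direction mu from both vectors."""
    return cosine([a - m for a, m in zip(u, mu)],
                  [b - m for b, m in zip(v, mu)])


# Hypothetical activations that all share a large common offset (anisotropy).
reference = [
    [10.0, 10.0, 1.0],
    [10.0, 10.0, -1.0],
    [10.0, 11.0, 0.0],
    [10.0, 9.0, 0.0],
]
mu = mean_vector(reference)

original = [10.0, 10.0, 1.0]         # activation being verbalized
reconstruction = [10.0, 10.0, -1.0]  # activation recovered from the verbalization

raw = cosine(original, reconstruction)
centred = centred_cosine(original, reconstruction, mu)
print(f"raw cosine: {raw:.3f}, centred cosine: {centred:.3f}")
```

The raw score stays close to 1 because both vectors are dominated by the shared offset, while the centred score exposes that the informative component points the opposite way. This is why reporting both can be useful.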

Runs on ZeroGPU. Try it: https://huggingface.co/spaces/RiverRider/srt-nla-demo

Paper + code: https://github.com/space-bacon/SRT