Foresight V1 32B - Open-Source Forecasting Model

Lightning Rod Labs | lightningrod.ai

Foresight V1 32B is a forecasting model fine-tuned from Qwen3-32B via outcome-based RL. Despite being 10-100x smaller, it has outperformed frontier models on Brier score, ECE, and profitability.

Our latest model, Foresight V3, can be tested at dashboard.lightningrod.ai.

Lightning Rod Labs takes you from raw data to fine-tuned model. With automated training data generation, fine-tuning, and evaluation, all in one place. No manual labeling required.

3rd Party Benchmarks 🏆

Feb 2026: Foresight V1 32B ranked #1 on Prophet Arena Sports, a benchmark run by SIGMA Lab at UChicago, beating Grok-4, GPT-5.2, Gemini 3 Pro, and Claude Opus 4.5 on live prediction questions.

Jan 2026: Foresight V1 32B is the only non-frontier model in the top 5 on ForecastBench, an independent forecasting benchmark run by the Forecasting Research Institute, where AIs compete on real-world forecasting questions.

Key Results

Evaluated on August 25, 2025 against 251 live Polymarket questions, Foresight-v1 outperformed every frontier model tested on accuracy (Brier Score), calibration (ECE), and profitability.

Further details on our methodology and results are available here.

How It Works

Foresight V1 32B was trained using outcome-based RL. The model was shown only information available at prediction time, forced to commit to a probability, and scored against the realized outcome using the Brier score as the reward signal. Confident wrong predictions were penalized more heavily than uncertain ones, directly incentivizing calibration over overconfidence.

Training data was generated using our Foresight Data platform, which automatically transformed unstructured sources into labeled training datasets — no human annotation required.

The same framework has been applied across domains to create prediction agents and domain expert models, including finance, healthcare, insurance, and sports analytics.

See: LLMs Can Teach Themselves to Better Predict the Future · Outcome-based Reinforcement Learning to Predict the Future · Future-as-Label: Scalable Supervision from Real-World Outcomes

Output Format

Foresight-32B is OpenAI API-compatible. See recommended usage for generating predictions.

About Lighting Rod Labs

Lightning Rod Labs takes you from raw data to fine-tuned model. With automated training data generation, fine-tuning, and evaluation, all in one place. No manual labeling required. Our research is peer-reviewed and published, including in Transactions on Machine Learning Research (TMLR). Our models have been benchmarked live and outperformed the world's best.

A few highlights:

🏆 #1 on ProphetArena Sport, beating GPT-5.2, Gemini 3 Pro, and Grok-4 (Feb 2026)
📊 Top 5 on ForecastBench, outperforming Claude, O3, and Grok-4 (Jan 2026)
🔬 Published in TMLR: 14B model matches o1 accuracy and generates >10% profit in live trading simulations [link]
🏛️ Vetted and awardable for U.S. defense procurement via DARPA ERIS and CDAO Tradewinds marketplaces
📰 Featured in The Atlantic, TIME, and the Forecasting Research Institute

Contact

Interested in generating training data for your own models or building a custom prediction model?

License

apache-2.0

Downloads last month: 85

Safetensors

Model size

33B params

Tensor type

BF16

Model tree for LightningRodLabs/foresight-32B

Base model

Qwen/Qwen3-32B

Finetuned

(505)

this model

Quantizations

2 models

Papers for LightningRodLabs/foresight-32B