Note: This model requires the custom Python script to run properly. You can download it from the Downloads section of this page (scroll down to find it).

Glint-1

⚠️ IMPORTANT NOTICE

  1. This model is experimental. Glint-1 is a 1M parameter research model designed for architectural experimentation.
  2. Performance characteristics: The model exhibits behavioral patterns comparable to ~2M parameter models despite its compact size.
  3. Not production-ready: This release demonstrates functional capability, not optimal performance.

Overview

Glint-1 is an ultra-compact language model developed by CompactAI following our rebrand initiative. This 1M parameter model demonstrates that efficient architectural design can yield behavioral characteristics typically associated with larger models (~2M parameters).

This release includes both Pretrained Weights (base language modeling) and Instruction-Tuned Weights (fine-tuned for conversational tasks).

Model Specifications

  • Architecture: Transformer Decoder
  • Parameters: ~1M
  • Effective Behavior: ~2M parameter equivalent
  • Context Length: 2,048 tokens
  • Vocabulary: Standard
  • Normalization: RMSNorm
  • Activation: SwiGLU
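The normalization and activation choices above can be sketched in a few lines of NumPy. The hidden sizes and shapes here are illustrative only, not the model's actual configuration:

```python
import numpy as np

def rms_norm(x, weight, eps=1e-6):
    # RMSNorm: rescale by the reciprocal root-mean-square; no mean subtraction.
    rms = np.sqrt(np.mean(x * x, axis=-1, keepdims=True) + eps)
    return x / rms * weight

def swiglu(x, w_gate, w_up, w_down):
    # SwiGLU feed-forward: a SiLU-gated linear unit followed by a down-projection.
    gate = x @ w_gate
    silu = gate / (1.0 + np.exp(-gate))  # SiLU(z) = z * sigmoid(z)
    return (silu * (x @ w_up)) @ w_down

rng = np.random.default_rng(0)
d_model, d_ff = 64, 128  # illustrative sizes, not Glint-1's real dimensions
x = rng.standard_normal((4, d_model))
y = rms_norm(x, np.ones(d_model))
out = swiglu(y,
             rng.standard_normal((d_model, d_ff)),
             rng.standard_normal((d_model, d_ff)),
             rng.standard_normal((d_ff, d_model)))
print(out.shape)  # (4, 64)
```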

Benchmarks

Glint-1 has been evaluated on standard language modeling and reasoning benchmarks:

BLiMP Benchmark

Grammaticality minimal pairs across 67 paradigms. Accuracy is the percentage of pairs in which the grammatical sentence receives lower perplexity than the ungrammatical one.
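The scoring rule can be sketched as follows. The `perplexity` argument is a stand-in for the model's actual scoring function; the toy scorer and sentences below are purely illustrative:

```python
def blimp_accuracy(pairs, perplexity):
    """pairs: list of (grammatical, ungrammatical) sentence pairs.
    perplexity: callable mapping a sentence to its model perplexity.
    Returns the fraction of pairs where the grammatical sentence
    scores lower, i.e. is judged more likely by the model."""
    hits = sum(1 for good, bad in pairs if perplexity(good) < perplexity(bad))
    return hits / len(pairs)

# Toy stand-in scorer: shorter strings get lower "perplexity".
toy_ppl = lambda s: float(len(s))
pairs = [("The cat sleeps.", "The cat sleep sleeps."),
         ("Dogs bark.", "Dogs barks bark.")]
print(blimp_accuracy(pairs, toy_ppl))  # 1.0 under this toy scorer
```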


ARC-Easy Benchmark

Multiple-choice science QA (~2.4K questions) using perplexity-based answer selection.

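Perplexity-based answer selection scores each question-plus-answer continuation and picks the lowest. A minimal sketch, again with a hypothetical stand-in scorer:

```python
def select_answer(question, choices, perplexity):
    # Score each "question + answer" string; return the index of the
    # lowest-perplexity (most likely) continuation.
    scores = [perplexity(f"{question} {c}") for c in choices]
    return min(range(len(choices)), key=scores.__getitem__)

toy_ppl = lambda text: float(len(text))  # hypothetical scorer for illustration
idx = select_answer("Which planet is largest?",
                    ["Jupiter", "Mars is the largest planet"],
                    toy_ppl)
print(idx)  # 0: the shorter continuation scores lower under the toy scorer
```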

WikiText-2 Benchmark

Language modeling perplexity on Wikipedia test split. Lower is better.

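Language modeling perplexity is the exponential of the mean per-token negative log-likelihood, which is why lower is better:

```python
import math

def perplexity_from_nlls(token_nlls):
    # Perplexity = exp(mean per-token negative log-likelihood, in nats).
    return math.exp(sum(token_nlls) / len(token_nlls))

# Worked example: if every token gets uniform probability 1/8,
# each NLL is log(8) and the perplexity is exactly 8.
nlls = [math.log(8)] * 10
print(perplexity_from_nlls(nlls))  # 8.0
```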

Training Details

  • Batch Size: 48
  • Learning Rate: 8e-4 (pretrain), 2e-4 (SFT)
  • Warmup: 300 steps
  • Weight Decay: 0.02
  • Max Grad Norm: 1.0
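The 300-step warmup above is typically a linear ramp to the base learning rate. The card does not specify the schedule after warmup, so the cosine decay and total step count below are assumptions for illustration only:

```python
import math

def lr_at(step, base_lr=8e-4, warmup=300, total=10_000):
    # Linear warmup to base_lr over `warmup` steps, then an *assumed*
    # cosine decay to zero over the remaining (assumed) `total` steps.
    if step < warmup:
        return base_lr * (step + 1) / warmup
    progress = (step - warmup) / max(1, total - warmup)
    return 0.5 * base_lr * (1.0 + math.cos(math.pi * progress))

print(lr_at(0))     # small first-step LR during warmup
print(lr_at(299))   # 8e-4: base LR reached at the end of warmup
print(lr_at(9_999)) # near zero at the assumed end of training
```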

Limitations

  • Repetition: May exhibit repetitive generation patterns
  • Knowledge: Limited world knowledge due to parameter constraints
  • Reliability: Not suitable for production applications or critical tasks
  • Purpose: Intended for research, educational purposes, and architectural benchmarking
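One common mitigation for the repetition noted above is a sampling-time repetition penalty. This is not a feature of the release, just a sketch of the standard CTRL-style technique on plain-Python logits:

```python
def apply_repetition_penalty(logits, prev_tokens, penalty=1.2):
    # CTRL-style repetition penalty: for tokens already generated,
    # divide positive logits (or multiply negative ones) by `penalty`,
    # making repeats less likely at the next sampling step.
    out = list(logits)
    for t in set(prev_tokens):
        out[t] = out[t] / penalty if out[t] > 0 else out[t] * penalty
    return out

logits = [2.0, 1.0, -0.5]                      # toy vocabulary of 3 tokens
penalized = apply_repetition_penalty(logits, prev_tokens=[0, 2])
print(penalized)  # token 0 and token 2 are both pushed down; token 1 unchanged
```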

Usage

This model is released for research purposes. While functional, it should not be expected to deliver state-of-the-art performance. The model demonstrates that compact architectures can achieve reasonable behavioral characteristics, making it suitable for:

  • Architectural research
  • Edge deployment experiments
  • Educational purposes
  • Baseline comparisons

Generated by CompactAI for research purposes. Use responsibly.
