SAM-RFI: Radio Frequency Interference Detection with SAM2

Automated RFI detection for radio astronomy using Meta's Segment Anything Model 2 (SAM2), fine-tuned on radio visibility data.

Overview

SAM-RFI adapts SAM2's segmentation capabilities to identify and flag Radio Frequency Interference (RFI) in radio astronomy measurement sets. The models are trained on physics-based synthetic RFI data and can detect a range of interference patterns, including narrowband carriers, broadband interference, and transient events.

Key Features:

🎯 High accuracy: 80-90% IoU on validation data
⚡ Multiple model sizes: Tiny to Large (balance speed vs accuracy)
🔄 Iterative flagging: Progressive deep cleaning with multiple passes
🛠️ Easy integration: Compatible with CASA measurement sets
📦 One-line usage: Auto-download and run with samrfi CLI

Model Sizes

All models are trained with vision encoder + mask decoder fine-tuning on radio astronomy data.

Size	Best Train Loss	Best Val Loss	Learning Rate	Batch Size
tiny	0.0708	0.0724	1e-06	8
small	0.0810	0.0764	1e-06	8
base_plus	0.0708	0.0740	1e-06	8
large	0.0708	0.0770	1e-06	8

Recommended Use Cases

tiny (40M params): Quick testing, low-memory environments (~4GB VRAM)
small (180M params): Balanced performance for general use (~8GB VRAM)
base_plus (330M params): High accuracy for production pipelines (~12GB VRAM)
large (850M params): Best performance, research applications (~16GB VRAM)
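
As a rough aid for choosing a size, the VRAM figures listed above can be turned into a small lookup. This is only a sketch: `pick_model_size` and `VRAM_GUIDE_GB` are hypothetical names that are not part of the samrfi package, and the thresholds are the approximate values from the list, not measured requirements.

```python
# Approximate VRAM guidance copied from the list above: (model size, GB).
# Ordered from largest to smallest so the first match is the biggest model that fits.
VRAM_GUIDE_GB = [("large", 16), ("base_plus", 12), ("small", 8), ("tiny", 4)]


def pick_model_size(available_vram_gb: float) -> str:
    """Return the largest model size whose approximate VRAM need fits (hypothetical helper)."""
    for size, needed_gb in VRAM_GUIDE_GB:
        if available_vram_gb >= needed_gb:
            return size
    raise ValueError(
        f"{available_vram_gb} GB is below the ~4 GB suggested for the tiny model"
    )


print(pick_model_size(10))  # prints "small"
```

With PyTorch installed, the total memory of the first GPU can be read as `torch.cuda.get_device_properties(0).total_memory / 1024**3` and passed to this helper.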

Quick Start

Installation

# Quote the extras so shells such as zsh do not expand the brackets
pip install "samrfi[gpu]"

Single-Pass Prediction

# Use any model size: tiny, small, base_plus, large
samrfi predict \
  --model polarimetic/sam-rfi/large \
  --input observation.ms

Iterative Prediction (Recommended)

For deep cleaning, use 2-3 iterations to progressively find fainter RFI:

samrfi predict \
  --model polarimetic/sam-rfi/large \
  --input observation.ms \
  --iterations 3
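
To illustrate why repeated passes can reveal fainter RFI, here is a generic sketch of an iterative flagging loop. It does not reproduce SAM-RFI's internals, which this card does not describe: the detector is a simple sigma-clipping stand-in (not the SAM2 model), and the names `sigma_clip_detector` and `iterative_flagging` are hypothetical. The idea shown is that once strong RFI is masked, the statistics of the remaining data tighten and weaker interference stands out.

```python
import numpy as np


def sigma_clip_detector(amp, flags, n_sigma=5.0):
    """Stand-in detector (NOT the SAM2 model): flag unflagged samples
    above mean + n_sigma * std of the currently unflagged data."""
    good = amp[~flags]
    threshold = good.mean() + n_sigma * good.std()
    return (amp > threshold) & ~flags


def iterative_flagging(amp, detector, num_iterations=3):
    """Run the detector repeatedly, accumulating flags between passes."""
    flags = np.zeros(amp.shape, dtype=bool)
    new_per_pass = []
    for _ in range(num_iterations):
        new_flags = detector(amp, flags)
        new_per_pass.append(int(new_flags.sum()))
        flags |= new_flags
        if not new_flags.any():  # nothing new found, stop early
            break
    return flags, new_per_pass


# Synthetic waterfall: noise plus one strong and one faint narrowband carrier.
rng = np.random.default_rng(0)
amp = rng.normal(1.0, 0.1, size=(64, 64))
amp[:, 10] = 100.0  # strong RFI, dominates the first-pass statistics
amp[:, 40] = 3.0    # faint RFI, hidden until the strong channel is masked

flags, new_per_pass = iterative_flagging(amp, sigma_clip_detector)
print("new flags per pass:", new_per_pass)
```

In this toy case the first pass catches only the strong channel; the faint channel is picked up on the second pass, after the strong one no longer inflates the threshold.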

Python API

from samrfi.inference import RFIPredictor

# Initialize predictor (auto-downloads model)
predictor = RFIPredictor(
    model_path="polarimetic/sam-rfi/large",
    device="cuda"
)

# Single-pass prediction
flags = predictor.predict_ms("observation.ms")

# Iterative prediction (3 passes)
flags = predictor.predict_iterative("observation.ms", num_iterations=3)

Training Details

Architecture

Base Model: SAM2 (Segment Anything Model 2) from Meta AI
Vision Encoder: Hiera hierarchical transformer (fine-tuned for radio astronomy)
Prompt Encoder: Positional encoding for bounding boxes (frozen)
Mask Decoder: Transformer decoder (fine-tuned for RFI segmentation)

The vision encoder and mask decoder are trained on radio astronomy data, adapting SAM2's visual features to recognize RFI patterns in visibility waterfalls.

Training Data

Type: Physics-based synthetic RFI simulations
RFI Patterns: Narrowband carriers, broadband interference, impulsive events, satellite glint
Dynamic Range: 10^6 to 10^7 (matching real observations)
Samples: 4000-10000 training samples per model

Input Preprocessing

Radio visibility data is converted to 3-channel RGB-like features:

Channel 1: Spatial gradient (edge detection for RFI boundaries)
Channel 2: Log amplitude (intensity, range [-3, 4])
Channel 3: Phase information ([-π, π] → [0, 1])

All channels are normalized with ImageNet statistics for SAM2 compatibility.
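
The three-channel conversion described above might look roughly like the following NumPy sketch. Several details are assumptions not stated in this card: the logarithm is taken as base 10, the gradient channel is the gradient magnitude of the log amplitude scaled by its maximum, and each channel is rescaled to [0, 1] before the standard ImageNet mean and standard deviation are applied. The function name `visibilities_to_features` is hypothetical, and the actual samrfi preprocessing may differ.

```python
import numpy as np

# Standard ImageNet per-channel statistics (for inputs scaled to [0, 1]).
IMAGENET_MEAN = np.array([0.485, 0.456, 0.406])
IMAGENET_STD = np.array([0.229, 0.224, 0.225])


def visibilities_to_features(vis: np.ndarray) -> np.ndarray:
    """Convert a complex (time, freq) waterfall to a (3, time, freq) float array."""
    # Channel 2: log amplitude clipped to [-3, 4], rescaled to [0, 1] (assumption).
    log_amp = np.clip(np.log10(np.maximum(np.abs(vis), 1e-12)), -3.0, 4.0)
    log_amp01 = (log_amp + 3.0) / 7.0

    # Channel 1: gradient magnitude of the log amplitude (assumption),
    # scaled by its maximum so it also lies in [0, 1].
    d_time, d_freq = np.gradient(log_amp)
    grad = np.hypot(d_time, d_freq)
    grad01 = grad / grad.max() if grad.max() > 0 else grad

    # Channel 3: phase mapped from [-pi, pi] to [0, 1].
    phase01 = (np.angle(vis) + np.pi) / (2.0 * np.pi)

    features = np.stack([grad01, log_amp01, phase01])
    return (features - IMAGENET_MEAN[:, None, None]) / IMAGENET_STD[:, None, None]


rng = np.random.default_rng(1)
vis = rng.normal(size=(8, 16)) + 1j * rng.normal(size=(8, 16))
print(visibilities_to_features(vis).shape)  # prints (3, 8, 16)
```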

Hardware

Platform: NAIRR Jetstream-2
GPU: NVIDIA H100 (80 GB HBM3)
Framework: PyTorch 2.0+ with HuggingFace Transformers

Performance

Typical validation metrics:

IoU (Intersection over Union): 80-90%
Precision: 85-95%
Recall: 80-90%
F1 Score: 82-92%

Performance varies by RFI type, severity, and model size. Iterative prediction (2-3 passes) improves detection of faint RFI.
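
For reference, the four metrics above can be computed from a predicted flag mask and a ground-truth mask as in the sketch below. These are the standard pixel-wise definitions; whether the reported numbers were aggregated per pixel or averaged per sample is not stated in this card, and `mask_metrics` is a hypothetical helper name.

```python
import numpy as np


def mask_metrics(pred, truth):
    """Pixel-wise IoU, precision, recall, and F1 for two boolean masks."""
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    tp = np.sum(pred & truth)    # flagged and truly RFI
    fp = np.sum(pred & ~truth)   # flagged but clean
    fn = np.sum(~pred & truth)   # RFI that was missed

    def ratio(num, den):
        # Return 0.0 when the denominator is empty (e.g. no RFI and no flags).
        return float(num) / float(den) if den else 0.0

    return {
        "iou": ratio(tp, tp + fp + fn),
        "precision": ratio(tp, tp + fp),
        "recall": ratio(tp, tp + fn),
        "f1": ratio(2 * tp, 2 * tp + fp + fn),
    }


# Truth covers channels 2-5, the prediction covers channels 3-6 (one-channel offset).
truth = np.zeros((8, 8), dtype=bool)
truth[:, 2:6] = True
pred = np.zeros((8, 8), dtype=bool)
pred[:, 3:7] = True
print(mask_metrics(pred, truth))
# {'iou': 0.6, 'precision': 0.75, 'recall': 0.75, 'f1': 0.75}
```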

Usage Examples

Different Model Sizes

# Fast testing with tiny model
samrfi predict --model polarimetic/sam-rfi/tiny --input obs.ms

# Balanced performance with base_plus
samrfi predict --model polarimetic/sam-rfi/base_plus --input obs.ms --iterations 2

# Best performance with large model
samrfi predict --model polarimetic/sam-rfi/large --input obs.ms --iterations 3

Custom Flagging Strategies

from samrfi.inference import RFIPredictor

# Conservative flagging (fewer false positives)
predictor = RFIPredictor("polarimetic/sam-rfi/large", device="cuda")
flags = predictor.predict_ms("obs.ms")  # Single pass

# Aggressive deep cleaning (more thorough)
flags = predictor.predict_iterative("obs.ms", num_iterations=3)

Integration with CASA

SAM-RFI works directly with CASA measurement sets:

from samrfi.inference import RFIPredictor

# Flag RFI in measurement set
predictor = RFIPredictor("polarimetic/sam-rfi/large", device="cuda")
flags = predictor.predict_ms("observation.ms")

# Flags are automatically written to MS FLAG column
# Continue with CASA calibration pipeline...
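
Because the flags end up in the FLAG column, they can be inspected afterwards with casatools. The sketch below reads the column and reports the flagged fraction per channel. It assumes casatools is installed and that all rows share the same shape so `getcol` can return a single array; `read_flag_column` and `flagged_fraction_per_channel` are hypothetical helper names, not part of samrfi.

```python
import numpy as np


def read_flag_column(ms_path: str) -> np.ndarray:
    """Read the FLAG column of a measurement set (requires casatools)."""
    # Imported lazily so the helper below can be used without CASA installed.
    from casatools import table

    tb = table()
    tb.open(ms_path)
    try:
        # Boolean array, typically shaped (n_corr, n_chan, n_row).
        return tb.getcol("FLAG")
    finally:
        tb.close()


def flagged_fraction_per_channel(flags) -> np.ndarray:
    """Fraction of flagged samples in each channel of a (n_corr, n_chan, n_row) array."""
    flags = np.asarray(flags, dtype=bool)
    return flags.mean(axis=(0, 2))


# Example with a synthetic FLAG array: channel 1 fully flagged, others clean.
demo_flags = np.zeros((2, 4, 10), dtype=bool)
demo_flags[:, 1, :] = True
print(flagged_fraction_per_channel(demo_flags))
```

On a real dataset the call would be `flagged_fraction_per_channel(read_flag_column("observation.ms"))`.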

Limitations

CASA dependency: Requires CASA tools for measurement set I/O
Preprocessing sensitivity: Best results when preprocessing matches training configuration
Data domain: Optimized for VLA-like interferometric data (single-dish and VLBI not extensively tested)
Over-flagging: Iterative prediction with >3 passes may flag clean data

Recommendations

Start with 1-2 iterations on new datasets, validate before using 3+
Validate on known clean data to assess over-flagging
Use a model size appropriate for your hardware (tiny for testing, base_plus or large for production)
Monitor statistics (mean, std, MAD) before/after flagging to balance RFI removal vs data retention
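
The statistics check recommended above could be sketched as follows: compute mean, standard deviation, and MAD of the amplitudes before and after applying the flags, along with the fraction of data retained. `robust_stats` is a hypothetical helper, and the synthetic data and threshold-based flags are for illustration only.

```python
import numpy as np


def robust_stats(amp, flags=None):
    """Mean, std, MAD, and retained fraction of the unflagged amplitudes."""
    values = amp.ravel() if flags is None else amp[~flags]
    median = np.median(values)
    return {
        "mean": float(values.mean()),
        "std": float(values.std()),
        "mad": float(np.median(np.abs(values - median))),
        "retained": values.size / amp.size,
    }


# Synthetic amplitudes: noise around 1.0 with one contaminated channel.
rng = np.random.default_rng(2)
amp = rng.normal(1.0, 0.1, size=(32, 32))
amp[:, 5] = 50.0
flags = amp > 10.0  # stand-in for the flags produced by the predictor

print("before:", robust_stats(amp))
print("after: ", robust_stats(amp, flags))
```

A large drop in mean and std with a nearly unchanged MAD and a high retained fraction suggests that the flags removed outliers without cutting into clean data; a steadily falling retained fraction across passes with little further change in std is a hint of over-flagging.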

Citation

A paper describing SAM-RFI is in preparation. If you use these models, please cite:

@software{samrfi2024,
  author = {polarimetic},
  title = {SAM-RFI: Radio Frequency Interference Detection with SAM2},
  year = {2024},
  url = {https://github.com/preshanth/SAM-RFI},
  note = {Models available at https://huggingface.co/polarimetic/sam-rfi}
}

Acknowledgments

Training compute provided by the [National AI Research Resource (NAIRR)](https://nairrpilot.org/) Pilot via Jetstream-2 at Indiana University.

Repository & Documentation

GitHub: https://github.com/preshanth/SAM-RFI
Documentation: See repository README for installation, training, and advanced usage
Issues: Report bugs or request features on GitHub

License

MIT License - See repository for details.

Generated with SAM-RFI • Models: tiny, small, base_plus, large


Model tree for polarimetric/sam-rfi: fine-tuned from base model facebook/sam2.1-hiera-base-plus