SAM-RFI: Radio Frequency Interference Detection with SAM2

Automated RFI detection for radio astronomy using Meta's Segment Anything Model 2 (SAM2), fine-tuned on radio visibility data.

Overview

SAM-RFI adapts SAM2's segmentation capabilities to identify and flag Radio Frequency Interference (RFI) in radio astronomy measurement sets. The models are trained on physics-based synthetic RFI data and can detect various interference patterns, including narrowband carriers, broadband interference, and transient events.

Key Features:

  • 🎯 High accuracy: 80-90% IoU on validation data
  • ⚡ Multiple model sizes: Tiny to Large (balance speed vs accuracy)
  • 🔄 Iterative flagging: Progressive deep cleaning with multiple passes
  • 🛠️ Easy integration: Compatible with CASA measurement sets
  • 📦 One-line usage: Auto-download and run with samrfi CLI

Model Sizes

All models are trained with vision encoder + mask decoder fine-tuning on radio astronomy data.

Size        Best Train Loss   Best Val Loss   Learning Rate   Batch Size
tiny        0.0708            0.0724          1e-06           8
small       0.0810            0.0764          1e-06           8
base_plus   0.0708            0.0740          1e-06           8
large       0.0708            0.0770          1e-06           8

Recommended Use Cases

  • tiny (40M params): Quick testing, low-memory environments (~4GB VRAM)
  • small (180M params): Balanced performance for general use (~8GB VRAM)
  • base_plus (330M params): High accuracy for production pipelines (~12GB VRAM)
  • large (850M params): Best performance, research applications (~16GB VRAM)
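
The VRAM guidance above can be turned into a small selection helper. This is a minimal sketch, not part of the samrfi package: `MODEL_VRAM_GB` and `pick_model_size` are hypothetical names, and the memory figures are the approximate values quoted in the list.

```python
# Approximate VRAM needs (GB) copied from the list above.
# The table name and helper below are hypothetical, not part of samrfi.
MODEL_VRAM_GB = {"tiny": 4, "small": 8, "base_plus": 12, "large": 16}


def pick_model_size(available_vram_gb: float) -> str:
    """Return the largest model size whose approximate VRAM need fits."""
    fitting = [name for name, need in MODEL_VRAM_GB.items()
               if need <= available_vram_gb]
    if not fitting:
        raise ValueError(
            f"No model fits in {available_vram_gb} GB; "
            f"the smallest needs about {min(MODEL_VRAM_GB.values())} GB"
        )
    # Among the sizes that fit, prefer the one with the largest requirement.
    return max(fitting, key=MODEL_VRAM_GB.get)


print(pick_model_size(10))
```

On a CUDA machine, the total device memory in bytes can be read with `torch.cuda.get_device_properties(0).total_memory` and divided by 1024**3 before calling the helper. Leave some headroom, since the listed figures are approximate.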

Quick Start

Installation

# Quote the extras so shells such as zsh do not expand the brackets
pip install "samrfi[gpu]"

Single-Pass Prediction

# Use any model size: tiny, small, base_plus, large
samrfi predict \
  --model polarimetic/sam-rfi/large \
  --input observation.ms

Iterative Prediction (Recommended)

For deep cleaning, use 2-3 iterations to progressively find fainter RFI:

samrfi predict \
  --model polarimetic/sam-rfi/large \
  --input observation.ms \
  --iterations 3
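
This card does not describe how passes are combined internally. As a rough illustration of why repeated passes can reveal fainter RFI, the sketch below uses a simple mean-plus-k-sigma threshold as a stand-in detector. It is not the SAM2 model, and `threshold_pass` and `iterative_flag` are hypothetical names. The idea is that strong RFI inflates the statistics of the data, so faint RFI only stands out once the strong RFI has been masked.

```python
import numpy as np


def threshold_pass(amplitude, flags, k=3.0):
    """Stand-in detector (NOT SAM-RFI): flag samples above mean + k*std,
    where the statistics come from currently unflagged data only."""
    clean = amplitude[~flags]
    if clean.size == 0:
        return flags
    limit = clean.mean() + k * clean.std()
    return flags | (amplitude > limit)


def iterative_flag(amplitude, num_iterations=3, k=3.0):
    """Run several passes; each pass ignores data flagged by earlier passes."""
    flags = np.zeros(amplitude.shape, dtype=bool)
    history = []  # cumulative number of flagged samples after each pass
    for _ in range(num_iterations):
        flags = threshold_pass(amplitude, flags, k)
        history.append(int(flags.sum()))
    return flags, history


# Toy data: a flat background with one strong and one faint spike.
amplitude = np.ones(100)
amplitude[10] = 100.0  # strong RFI
amplitude[50] = 3.0    # faint RFI, hidden by the inflated std in pass 1
flags, history = iterative_flag(amplitude, num_iterations=3)
print(history)
```

In this toy case the first pass flags only the strong spike, the second pass catches the faint one, and the third adds nothing. When a later pass stops adding flags on real data, further iterations mainly add over-flagging risk.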

Python API

from samrfi.inference import RFIPredictor

# Initialize predictor (auto-downloads model)
predictor = RFIPredictor(
    model_path="polarimetic/sam-rfi/large",
    device="cuda"
)

# Single-pass prediction
flags = predictor.predict_ms("observation.ms")

# Iterative prediction (3 passes)
flags = predictor.predict_iterative("observation.ms", num_iterations=3)

Training Details

Architecture

  • Base Model: SAM2 (Segment Anything Model 2) from Meta AI
  • Vision Encoder: Hiera hierarchical transformer (fine-tuned for radio astronomy)
  • Prompt Encoder: Positional encoding for bounding boxes (frozen)
  • Mask Decoder: Transformer decoder (fine-tuned for RFI segmentation)

The vision encoder and mask decoder are trained on radio astronomy data, adapting SAM2's visual features to recognize RFI patterns in visibility waterfalls.

Training Data

  • Type: Physics-based synthetic RFI simulations
  • RFI Patterns: Narrowband carriers, broadband interference, impulsive events, satellite glint
  • Dynamic Range: 10^6 to 10^7 (matching real observations)
  • Samples: 4000-10000 training samples per model

Input Preprocessing

Radio visibility data is converted to 3-channel RGB-like features:

  1. Channel 1: Spatial gradient (edge detection for RFI boundaries)
  2. Channel 2: Log amplitude (intensity, range [-3, 4])
  3. Channel 3: Phase information ([-π, π] → [0, 1])

All channels are normalized with ImageNet statistics for SAM2 compatibility.
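
The three channels described above can be sketched with NumPy as follows. This is an illustration under stated assumptions, not the samrfi implementation: the function name is hypothetical, the logarithm is assumed to be base 10, the gradient is assumed to be the gradient magnitude of the log amplitude, and each channel is assumed to be scaled to [0, 1] before the standard ImageNet mean and standard deviation are applied.

```python
import numpy as np

# Standard ImageNet per-channel statistics for inputs scaled to [0, 1].
IMAGENET_MEAN = np.array([0.485, 0.456, 0.406])
IMAGENET_STD = np.array([0.229, 0.224, 0.225])


def visibilities_to_features(vis, log_range=(-3.0, 4.0)):
    """Hypothetical helper: complex (time, freq) waterfall -> (3, H, W) array."""
    lo, hi = log_range
    amp = np.abs(vis)

    # Channel 2: log amplitude, clipped to [lo, hi] and rescaled to [0, 1].
    # Assumption: base-10 logarithm; the floor avoids log10(0).
    log_amp = np.log10(np.maximum(amp, 10.0 ** lo))
    log_amp = (np.clip(log_amp, lo, hi) - lo) / (hi - lo)

    # Channel 1: spatial gradient magnitude (assumed edge operator).
    d_time, d_freq = np.gradient(log_amp)
    grad = np.hypot(d_time, d_freq)
    peak = grad.max()
    if peak > 0:
        grad = grad / peak  # scale to [0, 1]

    # Channel 3: phase mapped from [-pi, pi] to [0, 1].
    phase = (np.angle(vis) + np.pi) / (2.0 * np.pi)

    features = np.stack([grad, log_amp, phase])
    # Per-channel ImageNet normalization, broadcast over (H, W).
    return (features - IMAGENET_MEAN[:, None, None]) / IMAGENET_STD[:, None, None]


demo = visibilities_to_features(np.ones((4, 8), dtype=complex))
print(demo.shape)
```

Because results are sensitive to preprocessing, any real pipeline should reuse the package's own preprocessing rather than a re-implementation like this one.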

Hardware

  • Platform: NAIRR Jetstream-2
  • GPU: NVIDIA H100 (80 GB HBM3)
  • Framework: PyTorch 2.0+ with HuggingFace Transformers

Performance

Typical validation metrics:

  • IoU (Intersection over Union): 80-90%
  • Precision: 85-95%
  • Recall: 80-90%
  • F1 Score: 82-92%

Performance varies by RFI type, severity, and model size. Iterative prediction (2-3 passes) improves detection of faint RFI.
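
For reference, the four metrics listed above can be computed from a predicted flag mask and a ground-truth mask as in the sketch below. The function name `mask_metrics` is hypothetical and the code is not taken from the samrfi package; it uses the standard definitions based on true positives, false positives, and false negatives.

```python
import numpy as np


def mask_metrics(pred, truth):
    """Compute IoU, precision, recall, and F1 for two boolean masks."""
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)

    tp = int(np.sum(pred & truth))    # flagged and truly RFI
    fp = int(np.sum(pred & ~truth))   # flagged but clean
    fn = int(np.sum(~pred & truth))   # RFI that was missed

    def ratio(num, den):
        # Return 0.0 when the denominator is empty instead of dividing by zero.
        return num / den if den else 0.0

    return {
        "iou": ratio(tp, tp + fp + fn),
        "precision": ratio(tp, tp + fp),
        "recall": ratio(tp, tp + fn),
        "f1": ratio(2 * tp, 2 * tp + fp + fn),
    }


print(mask_metrics([1, 1, 0, 0], [1, 0, 1, 0]))
```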

Usage Examples

Different Model Sizes

# Fast testing with tiny model
samrfi predict --model polarimetic/sam-rfi/tiny --input obs.ms

# Balanced performance with base_plus
samrfi predict --model polarimetic/sam-rfi/base_plus --input obs.ms --iterations 2

# Best performance with large model
samrfi predict --model polarimetic/sam-rfi/large --input obs.ms --iterations 3

Custom Flagging Strategies

from samrfi.inference import RFIPredictor

# Conservative flagging (fewer false positives)
predictor = RFIPredictor("polarimetic/sam-rfi/large", device="cuda")
flags = predictor.predict_ms("obs.ms")  # Single pass

# Aggressive deep cleaning (more thorough)
flags = predictor.predict_iterative("obs.ms", num_iterations=3)

Integration with CASA

SAM-RFI works directly with CASA measurement sets:

from samrfi.inference import RFIPredictor

# Flag RFI in measurement set
predictor = RFIPredictor("polarimetic/sam-rfi/large", device="cuda")
flags = predictor.predict_ms("observation.ms")

# Flags are automatically written to MS FLAG column
# Continue with CASA calibration pipeline...

Limitations

  • CASA dependency: Requires CASA tools for measurement set I/O
  • Preprocessing sensitivity: Best results when preprocessing matches training configuration
  • Data domain: Optimized for VLA-like interferometric data (single-dish and VLBI not extensively tested)
  • Over-flagging: Iterative prediction with >3 passes may flag clean data

Recommendations

  • Start with 1-2 iterations on new datasets, and validate the results before using 3 or more
  • Validate on known clean data to assess over-flagging
  • Use a model size that fits your hardware (tiny for testing, base_plus or large for production)
  • Monitor statistics (mean, std, MAD) before/after flagging to balance RFI removal vs data retention
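
The last recommendation can be sketched as a small report comparing amplitude statistics before and after flagging. This is a minimal example with hypothetical function names (`robust_stats`, `flagging_report`); it assumes the amplitudes and the boolean flag mask are already available as NumPy arrays of the same shape.

```python
import numpy as np


def robust_stats(values):
    """Mean, standard deviation, and median absolute deviation (MAD)."""
    values = np.asarray(values, dtype=float)
    median = np.median(values)
    return {
        "mean": float(values.mean()),
        "std": float(values.std()),
        "mad": float(np.median(np.abs(values - median))),
    }


def flagging_report(amplitude, flags):
    """Compare statistics of all data against the data kept after flagging."""
    amplitude = np.asarray(amplitude, dtype=float)
    flags = np.asarray(flags, dtype=bool)
    kept = amplitude[~flags]
    if kept.size == 0:
        raise ValueError("All data are flagged; nothing left to summarize")
    return {
        "flagged_fraction": float(flags.mean()),
        "before": robust_stats(amplitude.ravel()),
        "after": robust_stats(kept),
    }


report = flagging_report([1.0, 1.0, 1.0, 1.0, 100.0],
                         [False, False, False, False, True])
print(report)
```

A large drop in mean and std with a small flagged fraction suggests the flags removed genuine outliers. A rising flagged fraction with little further change in the statistics is a sign of over-flagging.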

Citation

A paper describing SAM-RFI is in preparation. If you use these models, please cite:

@software{samrfi2024,
  author = {polarimetic},
  title = {SAM-RFI: Radio Frequency Interference Detection with SAM2},
  year = {2024},
  url = {https://github.com/preshanth/SAM-RFI},
  note = {Models available at https://huggingface.co/polarimetic/sam-rfi}
}

Acknowledgments

Training compute provided by the [National AI Research Resource (NAIRR)](https://nairrpilot.org/) Pilot via Jetstream-2 at Indiana University.

Repository & Documentation

  • GitHub: https://github.com/preshanth/SAM-RFI
  • Documentation: See repository README for installation, training, and advanced usage
  • Issues: Report bugs or request features on GitHub

License

MIT License - See repository for details.


Generated with SAM-RFI • Models: tiny, small, base_plus, large
