LuMamba: Latent Unified Mamba for Electrode Topology-Invariant and Efficient EEG Modeling

LuMamba (Latent Unified Mamba) is an EEG foundation model built on efficient Mamba state-space learning, capable of handling heterogeneous channel topologies. LuMamba addresses varying channel layouts with LUNA channel unification, projecting a given EEG channel layout to a fixed latent topology, and overcomes the quadratic complexity of transformers with FEMBA's efficient bidirectional Mamba encoder.


🔒 License & Usage Policy (Weights)

Weights license: The released model weights are licensed under Creative Commons Attribution–NoDerivatives 4.0 (CC BY-ND 4.0). This section summarizes the practical implications for users. This is not legal advice; please read the full license text.

✅ You may

  • Use and redistribute the unmodified LuMamba weights (including in commercial settings) with proper attribution to the LuMamba authors.
  • Fine-tune / adapt the weights for your internal use (research or production) without redistributing the modified weights.
  • Publish your code, configs, logs, and papers describing experiments with LuMamba (please cite the paper).

🚫 You may not

  • Share, host, or redistribute any modified weights (including LoRA/adapter/delta checkpoints or pruned/quantized variants). Any parameter set that encodes an adaptation is considered a derivative and cannot be shared under CC BY-ND 4.0.
  • Imply endorsement by the LuMamba authors for any derivative or evaluation without our written permission.
  • Use the LuMamba name in a way that suggests your modified model is an official LuMamba release.

🀝 How to contribute improvements (PR-gated releases)

We welcome community improvements via a pull-request (PR) workflow. If you believe your improvements should become an official LuMamba release:

  1. Open a PR in the BioFoundation repository describing the change (architecture/head/training recipe, datasets, preprocessing, compute).
  2. Include reproducibility artifacts: configs, seeds, scripts, environment details, training/validation logs, and the evaluation protocol (e.g., TUAB/TUAR/TUSL) with exact splits.
  3. Provide comprehensive results (AUROC/AUPR/BA, FLOPs, memory) vs. the baselines reported in the LuMamba paper.
  4. After maintainer review, approved changes will be retrained/validated and, if accepted, released by the maintainers as a new official LuMamba checkpoint under CC BY-ND 4.0.

Rationale: CC BY-ND protects users from fragmented, lower-quality “LuMamba variants,” while still enabling internal fine-tuning and a path for the community to upstream improvements through review.


🔎 Model Summary

  • Goal: Efficient and topology-agnostic EEG modeling with linear complexity in sequence length.
  • Core idea: The Channel-Unification Module uses learned queries (Q) with cross-attention to map any set of channels to a fixed latent space. Bidirectional Mamba blocks then operate on that latent sequence.
  • Pre-training data: TUEG, >21,000 hours of raw EEG; downstream subjects removed to avoid leakage.
  • Downstream tasks: TUAB (abnormal), TUAR (artifacts), TUSL (slowing), SEED-V (emotion; unseen 62-ch montage), APAVA (Alzheimer's disease; unseen 16-ch layout), TDBrain (Parkinson's disease; unseen 26-ch layout).

🚀 Model Variants

The model currently exists in a Tiny Variant, with the following parameters:

Variant      | Parameters | FEMBA parameters            | LUNA parameters
LuMamba_tiny | 4.1M       | (num_blocks = 2, exp = 2)   | (num_queries = 6, embed_dim = 64)

Larger model sizes can be attained by increasing the number of bi-Mamba blocks num_blocks (e.g. 8 bi-Mamba blocks yields 15M parameters).
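
As a rough guide to how size grows with depth, the two data points above (2 blocks at 4.1M, 8 blocks at 15M) can be joined by a straight line. This is only a sketch: it assumes every bi-Mamba block adds the same number of parameters and that all other hyperparameters stay fixed, and `estimate_params_m` is a hypothetical helper, not part of the BioFoundation code.

```python
# Two (num_blocks, parameters in millions) points stated in this model card.
TINY = (2, 4.1)
LARGER = (8, 15.0)

def estimate_params_m(num_blocks):
    """Linearly interpolate parameter count (millions) from the two known points.

    Assumes each bi-Mamba block contributes equally and nothing else changes;
    treat the result as an order-of-magnitude guide, not a measured value.
    """
    per_block = (LARGER[1] - TINY[1]) / (LARGER[0] - TINY[0])
    return TINY[1] + per_block * (num_blocks - TINY[0])

for n in (2, 4, 8):
    print(n, "blocks ->", round(estimate_params_m(n), 1), "M parameters (estimate)")
```

Instantiate the model and count its parameters directly for any figure that matters.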


📊 Results (Highlights)

  • TUAB (abnormal vs normal): 80.99% Bal. Acc., 0.883 AUROC, 0.892 AUPR (LuMamba-Tiny, pre-trained with LeJEPA-reconstruction).
  • APAVA (Alzheimer's detection): 0.955 AUROC, 0.970 AUPR (LuMamba-Tiny, pre-trained with LeJEPA-reconstruction).
  • TDBrain (Parkinson's detection): 0.961 AUROC, 0.960 AUPR (LuMamba-Tiny, pre-trained with LeJEPA-reconstruction).

Efficiency: Up to 377× fewer FLOPs than transformer-based baselines and support for up to 500× longer EEG windows, thanks to the efficient FEMBA bi-Mamba encoder.


🧠 Intended Use & Limitations

Intended use. Research on EEG representation learning & classification (abnormality, artifacts, slowing, emotion), especially when montages vary or channel counts are high.

Limitations.

  • Not a medical device. Do not use for clinical decisions without proper validation & regulatory clearance.
  • Unseen topologies: Zero-shot transfer to very different/dense layouts (e.g., SEED-V) can underperform SOTA despite positive scaling; consider augmenting pre-training montage diversity and spatial encodings.
  • Distribution shifts: Performance varies across cohorts, devices, and label protocols; validate locally and consider domain adaptation.

πŸ—οΈ Architecture & Training

LUNA Tokenizer & features. EEG is segmented into patches. Temporal features come from a 1D convolution with GroupNorm and GELU; frequency features (FFT magnitude/phase → MLP) are added; 3D electrode coordinates are encoded via NeRF-style sinusoids → MLP to serve as the positional encoding.
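
The electrode positional encoding can be sketched as follows. This is a minimal illustration of NeRF-style sinusoidal features only, without the MLP that follows them; the number of frequencies, the `2**k * pi` frequency schedule, and the example coordinates are assumptions, not values taken from the LuMamba code.

```python
import math

def sinusoidal_encode(xyz, num_freqs=4):
    """Encode one 3D electrode position with NeRF-style sinusoids.

    For each coordinate p and each frequency 2**k * pi (k = 0..num_freqs-1),
    emit sin and cos, giving 3 * 2 * num_freqs features.
    num_freqs is an illustrative choice, not a value from the model card.
    """
    features = []
    for p in xyz:
        for k in range(num_freqs):
            angle = (2.0 ** k) * math.pi * p
            features.append(math.sin(angle))
            features.append(math.cos(angle))
    return features

# Hypothetical normalized coordinates for two electrodes.
electrodes = {"Cz": (0.0, 0.0, 1.0), "Fp1": (-0.3, 0.9, 0.3)}
encoded = {name: sinusoidal_encode(pos) for name, pos in electrodes.items()}
print(len(encoded["Cz"]))  # 24 features per electrode with num_freqs=4
```

Because the encoding depends only on coordinates, an electrode that never appeared during pre-training still receives a well-defined feature vector.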

LUNA Channel-Unification Module. Q learned queries cross-attend to channel-wise patch features to produce a fixed Q×E latent per patch; FFN + Transformer layers refine the query tokens. Complexity is O(Q·C) (linear in channels).
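
A minimal sketch of the unification step, in plain Python for readability. It assumes single-head attention with identity key/value projections and random stand-ins for the learned queries; the real module uses learned projections plus the FFN and Transformer refinement described above. `unify_channels` is a hypothetical name.

```python
import math
import random

def softmax(scores):
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def unify_channels(queries, channel_feats):
    """Map C channel features (each length E) to Q latent vectors (each length E).

    Single-head cross-attention with identity key/value projections;
    a real module would use learned projections and multiple heads.
    """
    dim = len(queries[0])
    latents = []
    for q in queries:
        # One score per channel, so the total cost is O(Q * C).
        scores = [sum(qi * ki for qi, ki in zip(q, feat)) / math.sqrt(dim)
                  for feat in channel_feats]
        weights = softmax(scores)
        latents.append([sum(w * feat[d] for w, feat in zip(weights, channel_feats))
                        for d in range(dim)])
    return latents

rng = random.Random(0)
Q, E = 6, 8  # Q matches num_queries of the tiny variant; E is shrunk for readability
queries = [[rng.gauss(0, 1) for _ in range(E)] for _ in range(Q)]

for num_channels in (16, 26, 62):  # channel counts of the unseen layouts listed earlier
    feats = [[rng.gauss(0, 1) for _ in range(E)] for _ in range(num_channels)]
    latent = unify_channels(queries, feats)
    print(num_channels, "channels ->", len(latent), "x", len(latent[0]))
```

The output shape is Q×E for every channel count, which is what lets the temporal encoder stay unchanged across montages.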

FEMBA Bi-Mamba Temporal encoder. Mamba blocks process the embeddings in separate forward and backward streams.
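
The two-stream idea can be illustrated with a toy linear recurrence. This sketch is not Mamba itself: it uses a fixed scalar state update instead of Mamba's input-dependent (selective) parameters, and merging the streams by summation is an assumption made for the example.

```python
def linear_scan(xs, decay=0.9, gain=0.1):
    """Run h_t = decay * h_{t-1} + gain * x_t over a sequence in O(L) time."""
    h = 0.0
    out = []
    for x in xs:
        h = decay * h + gain * x
        out.append(h)
    return out

def bidirectional_scan(xs, decay=0.9, gain=0.1):
    """Combine a forward scan with a backward scan over the reversed sequence."""
    fwd = linear_scan(xs, decay, gain)
    bwd = linear_scan(xs[::-1], decay, gain)[::-1]  # re-align to the original order
    # Summing the streams is an assumption; other merges (concat, gating) are possible.
    return [f + b for f, b in zip(fwd, bwd)]

signal = [0.0, 0.0, 1.0, 0.0, 0.0]  # single impulse in the middle
print(linear_scan(signal))         # forward only: nothing before the impulse
print(bidirectional_scan(signal))  # both directions: context on either side
```

Each stream costs one pass over the sequence, so the combined encoder remains linear in sequence length while every position sees both past and future context.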

Pre-training objectives. Masked-patch reconstruction trains the model to recover masked tokens. In parallel, the LeJEPA loss encourages an isotropic Gaussian embedding distribution to minimize downstream prediction risk.
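
The reconstruction term can be sketched as a mean squared error restricted to masked patches. This is an illustration under assumptions: the patch layout, the equal 0.5 weighting, and the function names are hypothetical, and the LeJEPA term is represented by a placeholder number rather than implemented.

```python
def masked_mse(target, prediction, mask):
    """Mean squared error computed only over masked patches.

    target, prediction: lists of patches (each a list of floats).
    mask: list of booleans, True where the patch was masked.
    """
    total, count = 0.0, 0
    for tgt, pred, is_masked in zip(target, prediction, mask):
        if not is_masked:
            continue  # visible patches do not contribute to the loss
        total += sum((t - p) ** 2 for t, p in zip(tgt, pred))
        count += len(tgt)
    return total / count if count else 0.0

def combined_loss(recon, lejepa, weight=0.5):
    """Weighted mixture of the two objectives; weight=0.5 is an assumed balance."""
    return weight * recon + (1.0 - weight) * lejepa

target = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
prediction = [[1.0, 2.0], [2.0, 4.0], [5.0, 8.0]]
mask = [False, True, True]

recon = masked_mse(target, prediction, mask)
print(recon)  # 1.25: squared errors 1 and 4 averaged over 4 masked values
print(combined_loss(recon, lejepa=0.75))  # 0.75 is a placeholder, not a computed LeJEPA loss
```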


🔧 How to Use

LuMamba weights are organized by pre-training configuration:

  • Reconstruction-only → variants pre-trained with masked reconstruction exclusively.
  • LeJEPA-reconstruction → variants pre-trained with a balanced mixture of masked reconstruction and LeJEPA losses. Variants exist for two different LeJEPA hyperparameters: 128 and 300 projection slices.
  • LeJEPA-only → variant pre-trained with LeJEPA exclusively.

All variants are pre-trained on TUEG.

LuMamba experiments are categorized by two Hydra configurations, in BioFoundation/config/experiments:

  • LuMamba_finetune.yaml → configuration for fine-tuning experiments.
  • LuMamba_pretrain.yaml → configuration for pre-training experiments.

🔧 Fine-tuning: General Checklist

  1. Install & read data prep: clone the BioFoundation repo, set up the environment as described there, then open make_datasets/README.md for dataset-specific notes (naming, expected folder layout, and common pitfalls).
  2. Point to weights: set pretrained_safetensors_path: /path/to/LuMamba_*.safetensors in the experiment YAML.
  3. Preprocess data: acquire the fine-tuning dataset and follow the preprocessing protocol (see the guide in /make_datasets/README.md) to generate train/test/val.h5 files.
  4. Update the data module of the LuMamba_finetune.yaml config:
    • TUH datasets (TUAB/TUSL/TUAR) → change _target_ in /data_module: to datasets.tuh_dataset.TUH_Dataset.
    • Other → change /data_module:_target_ to the corresponding dataset.py file in BioFoundation/datasets (e.g., for the TDBrain dataset use _target_: datasets.tdbrain_dataset.TDBrain_Dataset).
    • HDF5 file location → change /data_module:hdf5_file for train, test, and val to the path of the corresponding HDF5 data split file.
  5. Task settings:
    • Task type: override with /task:finetune_task_LUNA for classification and /task:finetune_regression_task_LuMamba for regression tasks.
    • Classification type: set classification_type (bc, mcc) and model.num_classes to match your downstream task. In a regression scenario, mcc is used and model.num_classes describes the number of features in the output.
    • Classifier choice: set /model:classifier_option (mamba for FEMBA classifier, linear for single-layer linear classifier, null for default LUNA classifier).
    • The configuration file includes further #CHANGEME tags and instructions for a working example.
  6. Env vars: export DATA_PATH (dataset root) and CHECKPOINT_DIR (artifacts).
  7. Trainer/optimizer: adjust gpus/devices, batch_size, max_epochs, LR/scheduler if needed.
  8. I/O: set io.base_output_path and confirm io.checkpoint_dirpath exists.
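
Before launching, the checklist items that depend on the local filesystem can be verified with a short script. This is a sketch only: `preflight` is a hypothetical helper that is not part of BioFoundation, and the file paths shown are placeholders to replace with your own.

```python
import os
from pathlib import Path

REQUIRED_ENV = ("DATA_PATH", "CHECKPOINT_DIR")

def preflight(weights_path, hdf5_files, env=None):
    """Return a list of human-readable problems; an empty list means ready."""
    env = os.environ if env is None else env
    problems = []
    for name in REQUIRED_ENV:
        if not env.get(name):
            problems.append(f"environment variable {name} is not set")
    if not Path(weights_path).is_file():
        problems.append(f"weights file not found: {weights_path}")
    for split, path in hdf5_files.items():
        if not Path(path).is_file():
            problems.append(f"{split} HDF5 file not found: {path}")
    return problems

if __name__ == "__main__":
    issues = preflight(
        weights_path="/path/to/LuMamba_*.safetensors",  # placeholder from step 2
        hdf5_files={"train": "train.h5", "val": "val.h5", "test": "test.h5"},
    )
    print("\n".join(issues) if issues else "ready to launch")
```

The script checks only that the files and variables exist; it does not validate the contents of the YAML configuration or the HDF5 splits.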

To launch fine-tuning (Hydra):

python -u run_train.py +experiment=LuMamba_finetune
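
Hydra also accepts `key=value` overrides on the command line, which can be convenient for quick experiments without editing the YAML. The sketch below only assembles such a command; the keys are the ones named in the checklist, but whether each can be overridden at exactly that path depends on how the experiment config is composed, so treat them as assumptions. `build_command` is a hypothetical helper.

```python
import shlex

def build_command(overrides):
    """Assemble the fine-tuning command with Hydra key=value overrides."""
    cmd = ["python", "-u", "run_train.py", "+experiment=LuMamba_finetune"]
    cmd += [f"{key}={value}" for key, value in overrides.items()]
    return cmd

cmd = build_command({
    "model.num_classes": 2,                     # key path assumed from the checklist
    "io.base_output_path": "/path/to/outputs",  # placeholder path
})
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` from the repository root, or the printed line can be pasted into a shell.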

⚖️ Responsible AI, Risks & Biases

  • Clinical safety: research-only; human oversight required.
  • Bias & drift: montage/device/population differences can induce shifts; validate and monitor.
  • Artifacts & rare events: robustness varies; use QC and task-appropriate preprocessing.

🔗 Sources


📜 Citation

If you use LuMamba, please cite:

@misc{broustail2026lumambalatentunifiedmamba,
      title={LuMamba: Latent Unified Mamba for Electrode Topology-Invariant and Efficient EEG Modeling}, 
      author={Danaé Broustail and Anna Tegon and Thorir Mar Ingolfsson and Yawei Li and Luca Benini},
      year={2026},
      eprint={2603.19100},
      archivePrefix={arXiv},
      primaryClass={cs.AI},
      url={https://arxiv.org/abs/2603.19100}, 
}

🛠️ Maintenance & Contact

  • Issues & support: please open a GitHub issue in the BioFoundation repository.

🗒️ Changelog

  • v1.0: Initial release of LuMamba model card with task-specific checkpoints and instructions.