---
license: bsd-3-clause
library_name: braindecode
pipeline_tag: feature-extraction
tags:
- eeg
- biosignal
- pytorch
- neuroscience
- braindecode
- foundation-model
- transformer
---

# CodeBrain

CodeBrain: Scalable Code EEG Pre-Training for Unified Downstream BCI Tasks.

> **Architecture-only repository.** This repo documents the
> `braindecode.models.CodeBrain` class. **No pretrained weights are
> distributed here** — instantiate the model and train it on your own
> data, or fine-tune from a published foundation-model checkpoint
> separately.

## Quick start

```bash
pip install braindecode
```

```python
from braindecode.models import CodeBrain

model = CodeBrain(
    n_chans=22,
    sfreq=200,
    input_window_seconds=4.0,
    n_outputs=2,
)
```

The signal-shape arguments above are example defaults — adjust them
to match your recording.

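Under these example arguments the expected input length follows directly: `n_times = sfreq * input_window_seconds`. A quick, torch-free sanity check of that arithmetic (the `patch_size=200` and `s4_lmax=570` defaults are taken from the parameter list further down this card):

```python
# Derived signal shape for the Quick-start arguments above.
sfreq = 200                  # sampling rate (Hz)
input_window_seconds = 4.0   # window length (s)
n_chans = 22                 # number of EEG channels
patch_size = 200             # CodeBrain default (see parameter list)
s4_lmax = 570                # CodeBrain default (see parameter list)

n_times = int(sfreq * input_window_seconds)  # samples per window
seq_len = n_times // patch_size              # patches per channel
emb_dim = s4_lmax // n_chans                 # patch embedding dim, per the docs

print(n_times, seq_len, emb_dim)  # 800 4 25
```

So each input window has shape `(batch, 22, 800)` and is split into four 200-sample patches per channel.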
## Documentation

- Full API reference (parameters, references, architecture figure):
  <https://braindecode.org/stable/generated/braindecode.models.CodeBrain.html>
- Interactive browser with live instantiation:
  <https://huggingface.co/spaces/braindecode/model-explorer>
- Source on GitHub: <https://github.com/braindecode/braindecode/blob/master/braindecode/models/codebrain.py#L21>

## Architecture description

The block below is the rendered class docstring (parameters,
references, architecture figure where available).

<div class='bd-doc'><main>
<p>CodeBrain: Scalable Code EEG Pre-Training for Unified Downstream BCI Tasks.</p>
<span style="display:inline-block;padding:2px 8px;border-radius:4px;background:#d9534f;color:white;font-size:11px;font-weight:600;margin-right:4px;">Foundation Model</span><span style="display:inline-block;padding:2px 8px;border-radius:4px;background:#56B4E9;color:white;font-size:11px;font-weight:600;margin-right:4px;">Attention/Transformer</span>

.. figure:: https://raw.githubusercontent.com/jingyingma01/CodeBrain/refs/heads/main/assets/intro.png
   :align: center
   :alt: CodeBrain pre-training overview
   :width: 1000px

CodeBrain is a foundation model for EEG that pre-trains on large unlabelled
corpora using a two-stage vector-quantised masking strategy, then fine-tunes
on downstream BCI tasks. It segments EEG signals into fixed-size patches,
embeds them with convolutional and spectral projections, and processes them
through stacked residual blocks that combine a multi-scale convolutional
structured state-space model (``_GConv``) with sliding-window self-attention.
76
+ .. rubric:: Stage 2: EEGSSM Backbone (this implementation)
77
+
78
+ This class implements Stage 2 of CodeBrain — the EEGSSM backbone described
79
+ in Section 3.3 of [codebrain]_. Following :class:`Labram`, CodeBrain
80
+ discretises EEG patches into codebook tokens via VQ-VAE (Stage 1, not
81
+ implemented here), then trains the backbone to predict masked token indices
82
+ via cross-entropy. CodeBrain extends this with a *dual* tokenizer that
83
+ decouples temporal and frequency representations, as stated in the paper:
84
+ *"the TFDual-Tokenizer, which decouples heterogeneous temporal and frequency
85
+ EEG signals into discrete tokens to enhance discriminative power."*

.. rubric:: Macro Components

- **PatchEmbedding**: Splits ``(batch, n_chans, n_times)`` into
  ``(batch, n_chans, seq_len, patch_size)`` patches, projects each patch
  with a 2-D convolutional stack, adds FFT-based spectral embeddings, and
  applies depth-wise convolutional positional encoding.
- **Residual blocks** (``ResidualGroup``): Each block applies RMSNorm,
  a ``_GConv`` SSM layer, and sliding-window multi-head attention, with
  gated activation and separate residual/skip paths.
- **Classification head** (``final_layer``): Flattens the output and maps
  to ``n_outputs`` classes.

.. important::
   **Pre-trained Weights Available**

   This model has pre-trained weights available on the Hugging Face Hub.
   You can load them using:

   .. code:: python

      from braindecode.models import CodeBrain

      # Load pre-trained model from Hugging Face Hub
      model = CodeBrain.from_pretrained("braindecode/codebrain-pretrained")

   To push your own trained model to the Hub:

   .. code:: python

      model.push_to_hub("my-username/my-codebrain")

Parameters
----------
patch_size : int, default=200
    Number of time samples per patch. Input length is trimmed to the
    nearest multiple of ``patch_size``.
res_channels : int, default=200
    Width of the residual stream inside each ``ResidualBlock``.
skip_channels : int, default=200
    Width of the skip-connection stream aggregated across blocks.
out_channels : int, default=200
    Output channels of ``final_conv`` before the classification head.
num_res_layers : int, default=8
    Number of stacked ``ResidualBlock`` modules.
drop_prob : float, default=0.1
    Dropout rate used inside the ``_GConv`` SSM and attention layers.
s4_bidirectional : bool, default=True
    Whether the ``_GConv`` SSM processes the sequence bidirectionally.
s4_layernorm : bool, default=False
    Whether to apply layer normalisation inside the ``_GConv`` SSM.
    Set to ``False`` to match the released pretrained checkpoint.
s4_lmax : int, default=570
    Maximum sequence length for the ``_GConv`` SSM kernel. Also determines
    the patch embedding dimension as ``s4_lmax // n_chans``.
s4_d_state : int, default=64
    State dimension of the ``_GConv`` SSM.
conv_out_chans : int, default=25
    Number of output channels in the patch projection convolutions.
conv_groups : int, default=5
    Number of groups for ``GroupNorm`` in the patch projection.
activation : type[nn.Module], default=nn.ReLU
    Non-linear activation class used in ``init_conv`` and ``final_conv``.

References
----------
.. [codebrain] Yi Ding, Xuyang Chen, Yong Li, Rui Yan, Tao Wang, Le Wu (2025).
   CodeBrain: Scalable Code EEG Pre-Training for Unified Downstream BCI Tasks.
   https://arxiv.org/abs/2506.09110

.. rubric:: Hugging Face Hub integration

When the optional ``huggingface_hub`` package is installed, all models
automatically gain the ability to be pushed to and loaded from the
Hugging Face Hub. Install with::

    pip install braindecode[hub]

**Pushing a model to the Hub:**

.. code:: python

   from braindecode.models import CodeBrain

   # Train your model
   model = CodeBrain(n_chans=22, n_outputs=4, n_times=1000)
   # ... training code ...

   # Push to the Hub
   model.push_to_hub(
       repo_id="username/my-codebrain-model",
       commit_message="Initial model upload",
   )

**Loading a model from the Hub:**

.. code:: python

   from braindecode.models import CodeBrain

   # Load pretrained model
   model = CodeBrain.from_pretrained("username/my-codebrain-model")

   # Load with a different number of outputs (head is rebuilt automatically)
   model = CodeBrain.from_pretrained("username/my-codebrain-model", n_outputs=4)

**Extracting features and replacing the head:**

.. code:: python

   import torch

   x = torch.randn(1, model.n_chans, model.n_times)
   # Extract encoder features (consistent dict across all models)
   out = model(x, return_features=True)
   features = out["features"]

   # Replace the classification head
   model.reset_head(n_outputs=10)

**Saving and restoring full configuration:**

.. code:: python

   import json

   config = model.get_config()  # all __init__ params
   with open("config.json", "w") as f:
       json.dump(config, f)

   model2 = CodeBrain.from_config(config)  # reconstruct (no weights)

All model parameters (both EEG-specific and model-specific, such as
dropout rates, activation functions, and number of filters) are
automatically saved to the Hub and restored when loading.

See :ref:`load-pretrained-models` for a complete tutorial.</main>
</div>

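The docstring above notes that the input length is trimmed to the nearest multiple of `patch_size` before being split into `(batch, n_chans, seq_len, patch_size)` patches. A minimal, torch-free sketch of that bookkeeping (illustrative only, not the library's implementation):

```python
def patch_shape(batch, n_chans, n_times, patch_size=200):
    """Shape after CodeBrain-style patching: trim the time axis to a
    multiple of patch_size, then split it into fixed-size patches."""
    trimmed = (n_times // patch_size) * patch_size  # drop trailing samples
    return (batch, n_chans, trimmed // patch_size, patch_size)

# 810 samples over 22 channels trims to 800 -> 4 patches of 200 samples.
print(patch_shape(8, 22, 810))  # (8, 22, 4, 200)
```
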
## Citation

Please cite both the original paper for this architecture (see the
*References* section above) and braindecode:

```bibtex
@article{aristimunha2025braindecode,
  title   = {Braindecode: a deep learning library for raw electrophysiological data},
  author  = {Aristimunha, Bruno and others},
  journal = {Zenodo},
  year    = {2025},
  doi     = {10.5281/zenodo.17699192},
}
```

## License

BSD-3-Clause for the model code (matching braindecode).
Pretraining-derived weights, if you fine-tune from a checkpoint,
inherit the licence of that checkpoint and its training corpus.