NeuroRVQ: Multi-Scale Biosignal Tokenization for Generative Foundation Models

1 Imperial College London  2 Cogitat
3 National and Kapodistrian University of Athens  4 Archimedes Research Unit
5 Aristotle University of Thessaloniki  6 Northeastern University London

Abstract

Biosignals such as electroencephalography (EEG), electrocardiography (ECG), and electromyography (EMG) encode physiological activity across multiple temporal and spectral scales, yielding representations that are rich but challenging for machine learning. Foundation models trained to predict masked signal tokens have shown promise in learning generalizable biosignal representations, yet their performance depends on the tokenizer's ability to preserve high-frequency dynamics and reconstruct signals with high fidelity. We introduce NeuroRVQ, a modality-adaptive biosignal tokenizer family designed for high-fidelity signal reconstruction. To capture the full frequency spectrum, NeuroRVQ decomposes biosignals into frequency-specific representations via multi-scale temporal convolutions, each encoded into hierarchical RVQ codebooks to preserve high-frequency detail, combined with a novel phase-aware training loss that respects the circular topology of Fourier phase. By tuning the temporal resolution, number and size of temporal kernels and RVQ depth, this design adapts to the spectro-temporal characteristics of each biosignal modality. To validate that tokenizer quality drives downstream performance, we train a simple masked-token foundation model for each modality (NeuroRVQ-FM) using the corresponding NeuroRVQ tokenizer. The NeuroRVQ-FM family achieves competitive or superior downstream performance compared to existing modality-specific foundation models, demonstrating that high-fidelity tokenization is a critical factor for effective biosignal modeling.

Key Contributions

Modality-Adaptive Multi-Scale Tokenizer

A modality-adaptive multi-scale tokenizer architecture that decomposes biosignals into frequency-specific representations via temporal convolutions with varying kernel sizes and encodes each frequency scale into hierarchical Residual Vector Quantization (RVQ) codebooks, enabling high-fidelity signal reconstruction across EEG, ECG and EMG. By tuning the temporal resolution, number and size of temporal kernels, and RVQ depth, this architecture adapts to the spectro-temporal characteristics of each biosignal modality.

Phase-Aware Training Loss

A phase-aware training loss that reconstructs the Fourier spectrum through three complementary components: a log-amplitude loss for Fourier amplitude to emphasize high-frequency content, a temporal-domain regularization loss, and a novel phase loss that respects the circular topology of phase angles by leveraging cosine similarity for directional alignment.

Simple Masked-Token Foundation Model (NeuroRVQ-FM)

A simple masked-token foundation model (NeuroRVQ-FM) for each modality that leverages the corresponding NeuroRVQ tokenizer during pre-training and achieves competitive or superior downstream performance compared to existing modality-specific foundation models, demonstrating that high-fidelity tokenization (rather than model scale or architectural complexity) is a critical ingredient for effective biosignal modeling.

Method

The NeuroRVQ Tokenizer converts raw biosignals into compact and informative neural tokens. The input multi-variate time series is segmented into patches, encoded by the multi-scale temporal encoder at multiple resolutions, combined via a transformer encoder, then discretized into neural tokens through per-scale RVQ codebooks. Tokens are decoded to reconstruct the input patches using the Fourier spectrum.
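
The first step of this pipeline, segmenting a multi-variate time series into patches, can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `segment_into_patches`, the non-overlapping layout, and the patch length of 4 samples are assumptions chosen for readability.

```python
def segment_into_patches(recording, patch_len):
    """Split each channel into consecutive, non-overlapping patches.

    recording: list of channels, each a list of samples.
    Returns patches[channel][patch_index] -> list of patch_len samples.
    Trailing samples that do not fill a whole patch are dropped.
    """
    patches = []
    for channel in recording:
        n_patches = len(channel) // patch_len
        patches.append([channel[i * patch_len:(i + 1) * patch_len]
                        for i in range(n_patches)])
    return patches


# Toy input: 2 channels x 10 samples, hypothetical patch length of 4 samples.
recording = [list(range(10)), list(range(100, 110))]
patches = segment_into_patches(recording, patch_len=4)
print(len(patches), len(patches[0]), patches[1][1])  # 2 2 [104, 105, 106, 107]
```

Each patch is then the unit that the encoder, the quantizer and the decoder operate on.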

Figure 1: The NeuroRVQ tokenizer architecture with multi-scale temporal encoder and hierarchical RVQ codebooks.

1. Multi-Scale Temporal Encoding

Temporal convolutions with varying kernel sizes capture features across multiple frequency resolutions.
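
To make the effect of kernel size concrete, here is a minimal sketch that runs one toy patch through three branches. It assumes fixed moving-average kernels in place of learned convolution weights, and the kernel sizes (3, 9, 27) and the 200 Hz sampling rate are hypothetical. Short kernels retain the fast 40 Hz component, while long kernels mostly keep the slow 2 Hz wave.

```python
import math


def conv1d_same(signal, kernel):
    """Sliding dot product (cross-correlation, as in deep-learning conv layers)
    with zero padding so the output has the same length as the input."""
    half = len(kernel) // 2
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = i + k - half
            if 0 <= j < len(signal):
                acc += w * signal[j]
        out.append(acc)
    return out


def multi_scale_branches(signal, kernel_sizes=(3, 9, 27)):
    """One branch per kernel size; moving averages stand in for learned filters."""
    return {k: conv1d_same(signal, [1.0 / k] * k) for k in kernel_sizes}


def energy(x):
    """Mean squared value of a sequence."""
    return sum(v * v for v in x) / len(x)


# Toy patch: a slow 2 Hz wave plus a weaker 40 Hz wave, 1 s at 200 Hz.
fs = 200
patch = [math.sin(2 * math.pi * 2 * t / fs) + 0.5 * math.sin(2 * math.pi * 40 * t / fs)
         for t in range(fs)]
branches = multi_scale_branches(patch)
for k, out in branches.items():
    print(f"kernel={k:2d}  output energy={energy(out):.3f}")
```

In a trained encoder the filters are learned rather than fixed, but the same principle applies: each kernel size biases its branch toward a different part of the spectrum.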

2. Transformer-Based Context Modeling

Transformer layers model long-range spatio-temporal dependencies across channels and patches, producing rich contextualized embeddings.

3. Hierarchical RVQ Quantization

Per-scale Residual Vector Quantization codebooks discretize the multi-scale embeddings into sequences of neural tokens optimized for reconstruction fidelity.
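
Residual vector quantization can be sketched in a few lines. The example below is a toy illustration under stated assumptions: the codebooks are random rather than learned, their scale halves at each stage, and each one contains the zero vector so that adding a stage can never increase the error. The dimensions (`dim`, `depth`, `size`) are hypothetical and unrelated to the released models.

```python
import random


def nearest(codebook, vec):
    """Index of the codebook entry closest to vec (squared Euclidean distance)."""
    return min(range(len(codebook)),
               key=lambda i: sum((c - v) ** 2 for c, v in zip(codebook[i], vec)))


def rvq_encode(vec, codebooks):
    """Quantize stage by stage; each stage encodes the residual left by the previous one."""
    residual = list(vec)
    tokens = []
    for cb in codebooks:
        idx = nearest(cb, residual)
        tokens.append(idx)
        residual = [r - c for r, c in zip(residual, cb[idx])]
    return tokens


def rvq_decode(tokens, codebooks):
    """Reconstruct by summing the selected entry from every stage."""
    out = [0.0] * len(codebooks[0][0])
    for idx, cb in zip(tokens, codebooks):
        out = [o + c for o, c in zip(out, cb[idx])]
    return out


def mse(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


rng = random.Random(0)
dim, depth, size = 4, 8, 16  # hypothetical toy sizes
codebooks = [
    [[0.0] * dim] + [[rng.gauss(0, 0.5 ** s) for _ in range(dim)] for _ in range(size - 1)]
    for s in range(depth)
]
x = [rng.gauss(0, 1) for _ in range(dim)]
tokens = rvq_encode(x, codebooks)  # one codebook index per stage
errors = [mse(x, rvq_decode(tokens[:d], codebooks[:d])) for d in range(1, depth + 1)]
for d, e in enumerate(errors, start=1):
    print(f"stages used={d}  reconstruction MSE={e:.5f}")
```

The sequence of indices in `tokens` plays the role of the neural tokens for one embedding, and decoding with more stages refines the reconstruction.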

4. Phase-Aware Reconstruction

A decoder reconstructs input patches using the Fourier spectrum, supervised by a phase-aware training loss that jointly captures amplitude and phase information.

The NeuroRVQ Foundation Model operates on the tokenized representation, using masked-token prediction with symmetric masking. By working at the token level, it captures long-range dependencies, learns abstract neural dynamics, and enables efficient pre-training across diverse biosignal datasets. The learned codebooks serve as prediction targets during pre-training, and the resulting representations transfer effectively to a range of downstream BCI tasks.
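
The text does not define symmetric masking. The sketch below assumes one common reading, used by earlier codebook-based EEG models: each training sequence is seen twice, once under a random mask and once under its complement, so every position serves as a prediction target exactly once. The token values, the mask ratio of 0.5 and the placeholder id `-1` are hypothetical.

```python
import random


def symmetric_masks(num_tokens, mask_ratio=0.5, seed=0):
    """Return a random boolean mask and its complement (True = masked)."""
    rng = random.Random(seed)
    masked = set(rng.sample(range(num_tokens), int(num_tokens * mask_ratio)))
    mask = [i in masked for i in range(num_tokens)]
    inverse = [not m for m in mask]
    return mask, inverse


def apply_mask(tokens, mask, mask_id=-1):
    """Hide masked positions behind a placeholder id; the model must predict the originals."""
    return [mask_id if m else t for t, m in zip(tokens, mask)]


tokens = [17, 4, 250, 93, 8, 61, 120, 33]  # hypothetical codebook indices
mask, inverse = symmetric_masks(len(tokens))
view_a = apply_mask(tokens, mask)      # first view: predict targets_a
view_b = apply_mask(tokens, inverse)   # complementary view: predict targets_b
targets_a = [t for t, m in zip(tokens, mask) if m]
targets_b = [t for t, m in zip(tokens, inverse) if m]
print(view_a, targets_a)
print(view_b, targets_b)
```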

Figure 2: The NeuroRVQ foundation model with masked-token prediction and symmetric masking strategy.

Modality-Specific Configuration & Scaling Analysis

To determine the optimal tokenizer configuration for each modality, we conduct a systematic scaling analysis over the number of temporal branches and RVQ codebooks per branch. Two consistent patterns emerge across all modalities: deeper residual quantization yields large reductions in reconstruction error, and adding temporal branches provides complementary gains by decomposing the signal into frequency-specific representations that are each easier to quantize. The optimal balance between these two mechanisms is modality-dependent.
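
One way to read the two axes of this analysis is through the size of the discrete code they produce. The sketch below assumes that every branch emits one index per RVQ codebook for each patch, so a patch is described by branches × codebooks indices. The grid values and the codebook size of 1024 are hypothetical and are not taken from the paper.

```python
import math
from itertools import product


def tokens_per_patch(branches, codebooks_per_branch):
    """Number of codebook indices that describe one patch."""
    return branches * codebooks_per_branch


def bits_per_patch(branches, codebooks_per_branch, codebook_size):
    """Upper bound on the information carried by those indices."""
    return tokens_per_patch(branches, codebooks_per_branch) * math.log2(codebook_size)


# Hypothetical sweep over temporal branches and RVQ codebooks per branch.
grid = list(product([1, 2, 4], [1, 2, 4, 8]))
for b, c in grid:
    print(f"branches={b}  codebooks={c}  tokens={tokens_per_patch(b, c):2d}  "
          f"bits={bits_per_patch(b, c, 1024):.0f}")
```

Under these assumptions, a 4-branch, 8-codebook configuration describes each patch with 32 indices, against a single index for a 1-branch, 1-codebook baseline.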

Figure 3: Scaling analysis — mean squared error (MSE) reconstruction performance and number of trainable parameters for NeuroRVQ with various numbers of temporal branches and RVQ codebooks per branch across EEG, ECG and EMG.
Figure 4: Validation mean squared error (MSE) during NeuroRVQ tokenizer training across all three modalities. ECG exhibits the fastest convergence consistent with its stereotyped morphology, EEG converges to a stable plateau reflecting its richer spectral structure, and EMG converges more gradually consistent with its broadband nature.

Phase-Aware Training Loss

Architecture alone does not account for NeuroRVQ's reconstruction gains — the training loss is equally critical. The tokenizer is trained end-to-end with a composite objective operating in the Fourier domain that combines three complementary components: a log-amplitude loss that compresses dynamic range and emphasizes high-frequency content, a novel phase loss based on cosine similarity that respects the circular topology of Fourier phase angles, and a temporal-domain regularization term.
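
The three components can be sketched as follows. This is an illustration under stated assumptions rather than the authors' formulation: the exact functional forms, the equal weights and the small `eps` are hypothetical, and the spectra are obtained here by applying a naive DFT to time-domain signals. The last lines show why a cosine-based phase term suits circular data: two phases that sit 0.02 rad apart across the ±π cut receive a large naive squared error but a near-zero circular loss.

```python
import cmath
import math


def dft(x):
    """Naive discrete Fourier transform (O(n^2)); adequate for short toy patches."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def phase_loss(phi_true, phi_pred):
    """Mean of 1 - cos(difference): zero when aligned, unaffected by 2*pi wrap-around."""
    return sum(1.0 - math.cos(a - b) for a, b in zip(phi_true, phi_pred)) / len(phi_true)


def log_amplitude_loss(amp_true, amp_pred, eps=1e-8):
    """Squared error between log-amplitudes; the log compresses dynamic range."""
    return sum((math.log(a + eps) - math.log(b + eps)) ** 2
               for a, b in zip(amp_true, amp_pred)) / len(amp_true)


def temporal_loss(x_true, x_pred):
    """Plain mean squared error in the time domain."""
    return sum((a - b) ** 2 for a, b in zip(x_true, x_pred)) / len(x_true)


def composite_loss(x_true, x_pred, w_amp=1.0, w_phase=1.0, w_time=1.0):
    """Weighted sum of the three terms (the weights are hypothetical)."""
    X, Y = dft(x_true), dft(x_pred)
    amp = log_amplitude_loss([abs(c) for c in X], [abs(c) for c in Y])
    ph = phase_loss([cmath.phase(c) for c in X], [cmath.phase(c) for c in Y])
    return w_amp * amp + w_phase * ph + w_time * temporal_loss(x_true, x_pred)


x = [math.sin(2 * math.pi * 3 * t / 32) for t in range(32)]
y = [0.9 * v for v in x]  # slightly attenuated "reconstruction"
print(composite_loss(x, x), composite_loss(x, y))

# Two phases 0.02 rad apart on the circle, on opposite sides of the +/- pi cut.
a, b = math.pi - 0.01, -math.pi + 0.01
naive_mse = (a - b) ** 2
circular = phase_loss([a], [b])
print(f"naive squared error={naive_mse:.2f}  circular loss={circular:.6f}")
```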

NeuroRVQ Phase-Aware Training Loss

Component Ablation Study

Starting from a simple baseline (single branch, single codebook, naive MSE on phase and amplitude), each NeuroRVQ component is added incrementally. The phase-aware loss delivers the largest single improvement, and the full combination achieves up to a 43× reduction in reconstruction error.

Configuration                                 | EEG   | ECG   | EMG
1-Branch, 1CB, MSE Phase + MSE A              | 1.509 | 1.956 | 1.996
4-Branch, 8CB, MSE Phase + MSE A              | 0.946 | 1.417 | 1.651
4-Branch, 8CB, MSE Phase + log(A) + Temporal  | 0.603 | 0.843 | 1.000
4-Branch, 8CB, Phase Loss + MSE A             | 0.167 | 0.287 | 0.903
4-Branch, 8CB, Phase Loss + log(A) + Temporal | 0.035 | 0.115 | 0.447

Validation MSE averaged over the last 10 training epochs. The full NeuroRVQ configuration (row 5) achieves a 43× reduction for EEG, 17× for ECG and 4.5× for EMG relative to the baseline.
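
The reduction factors quoted above follow directly from the table. The short check below recomputes them from the reported validation MSE values; only the variable names are introduced here.

```python
# Validation MSE per configuration, copied from the ablation table (rows 1 to 5).
ablation = [
    ("1-Branch, 1CB, MSE Phase + MSE A",              {"EEG": 1.509, "ECG": 1.956, "EMG": 1.996}),
    ("4-Branch, 8CB, MSE Phase + MSE A",              {"EEG": 0.946, "ECG": 1.417, "EMG": 1.651}),
    ("4-Branch, 8CB, MSE Phase + log(A) + Temporal",  {"EEG": 0.603, "ECG": 0.843, "EMG": 1.000}),
    ("4-Branch, 8CB, Phase Loss + MSE A",             {"EEG": 0.167, "ECG": 0.287, "EMG": 0.903}),
    ("4-Branch, 8CB, Phase Loss + log(A) + Temporal", {"EEG": 0.035, "ECG": 0.115, "EMG": 0.447}),
]

baseline = ablation[0][1]
full = ablation[-1][1]
# Reduction factor of the full configuration relative to the baseline, per modality.
reduction = {m: baseline[m] / full[m] for m in baseline}
for modality, factor in reduction.items():
    print(f"{modality}: {factor:.1f}x lower MSE than the baseline")
# EEG: 43.1x, ECG: 17.0x, EMG: 4.5x
```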

Reconstruction Quality

Per-band analysis of reconstructed EEG signals. NeuroRVQ faithfully preserves waveform morphology across all frequency bands, while existing tokenizers lose high-frequency detail.

Per-band reconstruction of EEG signal from LaBraM and NeuroRVQ codebook-based tokenizers. Green lines denote the input EEG signal, orange the reconstructed EEG signal for LaBraM and red the reconstructed EEG signal for NeuroRVQ.
Reconstruction comparison between RVQ-codebook-based tokenizers, BrainOmni and NeuroRVQ (panels: EEG, EMG, ECG).

Reconstruction Examples

Sample reconstructed signals from the validation set using the NeuroRVQ tokenizer across all three modalities. Blue lines denote the input signal and orange the reconstructed signal.

Reconstructed ECG signals from the validation set.
Reconstructed EEG signals from the validation set.
Reconstructed EMG signals from the validation set.

Supported Modalities & Models

Model Version          | Backbone | Modality | Status
NeuroRVQ-EEG-tokenizer | 76M      | EEG      | ✅ Released
NeuroRVQ-EEG-FM        | 5.9M     | EEG      | ✅ Released
NeuroRVQ-ECG-tokenizer | 76M      | ECG      | ✅ Released
NeuroRVQ-ECG-FM        | 264K     | ECG      | ✅ Released
NeuroRVQ-EMG-tokenizer | 144M     | EMG      | ✅ Released
NeuroRVQ-EMG-FM        | 5.9M     | EMG      | ✅ Released

Downstream Performance

EEG Downstream Performance

NeuroRVQ-EEG achieves the highest mean balanced accuracy (0.749) across five BCI downstream tasks, outperforming the next-best model (NeuroGPT, 0.707) by over 4 percentage points. Results are based on a rigorous subject-independent cross-validation benchmark.

Model        | Motor | ERP   | Memory | Sleep | Eyes  | Mean  | Size
NeuroGPT     | 0.682 | 0.757 | 0.597  | 0.674 | 0.827 | 0.707 | 79.5M
CBraMod      | 0.614 | 0.777 | 0.574  | 0.635 | 0.839 | 0.688 | 4.9M
BIOT0.4430.5000.5100.7633.2M
MIRepNet0.689
BrainOmni0.5850.7230.5180.852
LaBraM       | 0.630 | 0.822 | 0.526  | 0.652 | 0.799 | 0.686 | 5.8M
EEGPT        | 0.313 | 0.668 | 0.520  | 0.634 | 0.797 | 0.587 | 25.7M
NeuroRVQ-EEG | 0.700 | 0.876 | 0.574  | 0.728 | 0.869 | 0.749 | 5.9M

Benchmark from "Assessing the Capabilities of Large Brainwave Foundation Models" (IEEE MLSP 2025).
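
Subject-independent cross-validation means that no subject contributes data to both the training and the test side of a fold. The sketch below shows one simple way to build such folds. The function name, the round-robin assignment of subjects to folds and the toy subject ids are assumptions, and the benchmark's actual splitting protocol may differ.

```python
def subject_independent_folds(subject_ids, n_folds):
    """Return (train_idx, test_idx) pairs in which no subject appears on both sides."""
    subjects = sorted(set(subject_ids))
    folds = []
    for f in range(n_folds):
        held_out = set(subjects[f::n_folds])  # every n_folds-th subject is tested in this fold
        test_idx = [i for i, s in enumerate(subject_ids) if s in held_out]
        train_idx = [i for i, s in enumerate(subject_ids) if s not in held_out]
        folds.append((train_idx, test_idx))
    return folds


# One subject id per recording (hypothetical toy data).
subject_ids = ["s1", "s1", "s2", "s2", "s3", "s3", "s4", "s4", "s5", "s6"]
folds = subject_independent_folds(subject_ids, n_folds=3)
for train_idx, test_idx in folds:
    test_subjects = sorted({subject_ids[i] for i in test_idx})
    print(f"test subjects={test_subjects}  train size={len(train_idx)}")
```

Splitting by recording instead of by subject would let a model see the same person in training and testing, which tends to inflate the reported accuracy.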

ECG Downstream Performance

NeuroRVQ-ECG is competitive on the 5-class PTB-XL setting (balanced accuracy 64.50 vs 65.39 for ECGFounder) and substantially outperforms both baselines on the fine-grained 43-class task, where class imbalance is most severe: its balanced accuracy of 58.33 is more than double that of ECGFounder (28.96) and HuBERT-ECG (20.71).

Model        | 5-class PTB-XL Accuracy | 5-class PTB-XL BAcc | 43-class PTB-XL Accuracy | 43-class PTB-XL BAcc
HuBERT-ECG   | 72.60                   | 60.23               | 62.49                    | 20.71
ECGFounder   | 76.55                   | 65.39               | 65.51                    | 28.96
NeuroRVQ-ECG | 70.19                   | 64.50               | 79.17                    | 58.33
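
The gap between accuracy and balanced accuracy (BAcc) in the table reflects class imbalance. A minimal sketch, assuming the standard definition of balanced accuracy as the mean of per-class recall, shows how a classifier that ignores a rare class can still score high plain accuracy. The labels below are synthetic and are not PTB-XL data.

```python
from collections import defaultdict


def accuracy(y_true, y_pred):
    """Fraction of samples predicted correctly."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall, so every class counts equally regardless of its size."""
    hits = defaultdict(int)
    counts = defaultdict(int)
    for t, p in zip(y_true, y_pred):
        counts[t] += 1
        hits[t] += int(t == p)
    return sum(hits[c] / counts[c] for c in counts) / len(counts)


# Synthetic imbalanced labels: 90 samples of class 0 and 10 of class 1.
y_true = [0] * 90 + [1] * 10
y_pred = [0] * 100  # a degenerate classifier that always predicts the majority class
acc = accuracy(y_true, y_pred)
bacc = balanced_accuracy(y_true, y_pred)
print(acc, bacc)  # 0.9 0.5
```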

EMG Downstream Performance

NeuroRVQ-EMG outperforms PhysioWave and TinyMyo on every reported metric across all four classification tasks, with the widest margins on Discrete Gestures and NinaPro DB5, demonstrating that high-fidelity tokenization transfers effectively to muscular activity decoding.

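Model        | Discrete Gestures BAcc ↑ | Discrete Gestures CLER ↓ | EPN-612 Acc | EPN-612 F1 | NinaPro DB5 Acc | NinaPro DB5 F1 | UCI-EMG Acc | UCI-EMG F1
PhysioWave   | 54.70                    | 64.20                    | 90.30       | 90.35      | 24.91           | 22.95          | 56.52       | 55.76
TinyMyo      | 39.70                    | 64.20                    | 84.68       | 84.68      | 25.26           | 23.29          | 85.99       | 85.66
NeuroRVQ-EMG | 70.80                    | 27.60                    | 94.65       | 94.66      | 41.36           | 38.76          | 89.43       | 89.28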

Research Timeline

The journey of publications that shaped the development of NeuroRVQ.

Journal of Neural Engineering 2024

A Causal Perspective on Brainwave Modeling for Brain-Computer Interfaces

Konstantinos Barmpas, Yannis Panagakis, Georgios Zoumpourlis, Dimitrios A. Adamos, Nikolaos Laskaris, Stefanos Zafeiriou

Introduced a causal reasoning framework for BCI paradigms, providing step-by-step guidelines for designing robust brainwave decoders that generalize beyond controlled laboratory settings.

J. Neural Eng. →

NeurIPS 2024 Workshop

A Causal Perspective in Brainwave Foundation Models

Konstantinos Barmpas, Yannis Panagakis, Dimitrios A. Adamos, Nikolaos Laskaris, Stefanos Zafeiriou

Examined LBM training through causal reasoning, identifying key challenges impacting performance and generalization in BCI applications.

CaLM Workshop →

NeurIPS 2024 Workshop

Position: Addressing Ethical Challenges and Safety Risks in GenAI-Powered BCIs

Konstantinos Barmpas*, Georgios Zoumpourlis*, Yannis Panagakis, Dimitrios A. Adamos, Nikolaos Laskaris, Stefanos Zafeiriou

Identified safety and ethical concerns in GenAI for BCIs, including synthetic neural activity, behaviour profiling, and privacy risks, along with mitigation strategies.

GenAI for Health Workshop →

IEEE MLSP 2025 · ICLR 2025 Workshop

Assessing the Capabilities of Large Brainwave Foundation Models

Na Lee*, Stylianos Bakas*, Konstantinos Barmpas, Yannis Panagakis, Dimitrios A. Adamos, Nikolaos Laskaris, Stefanos Zafeiriou

Proposed a rigorous benchmarking protocol using causal reasoning and subject-independent cross-validation to properly evaluate LBMs across diverse BCI paradigms.

IEEE Xplore →

ICML 2025

Are Large Brainwave Foundation Models Capable Yet? Insights from Fine-tuning

Na Lee*, Konstantinos Barmpas*, Yannis Panagakis, Dimitrios Adamos, Nikolaos Laskaris, Stefanos Zafeiriou

Comprehensively evaluated LBMs through fine-tuning experiments, revealing marginal gains over traditional architectures and pioneering LoRA adaptation for brainwave models.

ICML Proceedings →

Survey 2025

A Comprehensive Review of Biosignal Foundation Models

Na Lee, Konstantinos Barmpas, Alexandros Koliousis, Yannis Panagakis, Dimitrios Adamos, Nikolaos Laskaris, Stefanos Zafeiriou

A structured survey covering EEG, ECG, EMG, EOG, and PPG foundation models — reviewing data processing, architectures, pre-training paradigms, and open challenges.

TechRxiv →

NeurIPS 2025 Workshop

Subject-Aware Contrastive Learning for EEG Foundation Models

Antonis Karantonis*, Konstantinos Barmpas*, Dimitrios A. Adamos, Nikolaos Laskaris, Stefanos Zafeiriou, Yannis Panagakis

Introduced the first subject-aware contrastive EEG foundation model, leveraging intra-subject variability across sessions as a natural supervisory signal.

Learning from Time Series for Health Workshop →

NeurIPS 2025 Workshop

Advancing Brainwave Modeling with a Codebook-Based Foundation Model

Konstantinos Barmpas*, Na Lee*, Yannis Panagakis, Dimitrios A. Adamos, Nikolaos Laskaris, Stefanos Zafeiriou

Introduced LaBraM++, an enhanced LBM with principled signal processing improvements to the tokenizer, achieving 6% improvement over the original architecture.

Foundation Models for the Brain and Body Workshop →

arXiv 2025

NeuroRVQ: Multi-Scale EEG Tokenization for Generative Large Brainwave Models

Konstantinos Barmpas, Na Lee, Alexandros Koliousis, Yannis Panagakis, Dimitrios A. Adamos, Nikolaos Laskaris, Stefanos Zafeiriou

Introduced a state-of-the-art codebook-based EEG tokenizer with multi-scale RVQ and phase-aware training, powering a new generation of Large Brainwave Models.

arXiv →

arXiv 2026

Beyond Accuracy: Robustness, Interpretability and Expressiveness of EEG Foundation Models

Urban Širca, Maryam Alimardani, Stefanos Zafeiriou, Konstantinos Barmpas

First systematic assessment of EEG foundation model robustness under noise and channel dropout, interpretability via AttnLRP attribution maps, and expressiveness through block-wise probing.

arXiv →

arXiv 2026

NeuroRVQ: Multi-Scale Biosignal Tokenization for Generative Foundation Models

Konstantinos Barmpas, Na Lee, Dimitrios Chalatsis, William Raftery, Yannis Panagakis, Dimitrios A. Adamos, Nikolaos Laskaris, Alexandros Koliousis, Dario Farina, Stefanos Zafeiriou

Culmination of all prior insights — a state-of-the-art modality-adaptive biosignal tokenizer with multi-scale RVQ and phase-aware training across EEG, ECG and EMG, powering a new generation of biosignal foundation models.

arXiv →

BibTeX

@misc{neurorvq,
      title={NeuroRVQ: Multi-Scale Biosignal Tokenization for Generative Foundation Models},
      author={Konstantinos Barmpas and Na Lee and Dimitrios Chalatsis and William Raftery and Yannis Panagakis and Dimitrios A. Adamos and Nikolaos Laskaris and Alexandros Koliousis and Dario Farina and Stefanos Zafeiriou},
      year={2026},
      eprint={2510.13068},
      archivePrefix={arXiv},
      primaryClass={cs.LG},
      url={https://arxiv.org/abs/2510.13068},
}