VQ-RL Codebook Pretrained Model (PPG + IMU)
🚧 This repository is currently under preparation. Full model weights and code will be released upon publication.
Overview
This repository will provide a pretrained Vector Quantized Representation Learning (VQ-RL) model for multimodal physiological signals, including photoplethysmography (PPG) and inertial measurement unit (IMU) data.
The model learns discrete token representations using a shared codebook trained on large-scale public datasets. It serves as a pretrained backbone for downstream representation analysis related to quality of life (QoL).
Pretraining Data
The model is pretrained using large-scale publicly available datasets:
- NHANES (IMU signals)
- VitalDB (PPG signals)
- MIMIC-III Waveform Database (PPG signals)
Model Description
The VQ-RL model consists of:
- Encoder: 1D convolution-based temporal feature extractor
- Vector Quantization: shared codebook for tokenization
- Decoder: signal reconstruction module
The model discretizes continuous physiological signals into sequences of codebook indices.
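The quantization step can be illustrated with a minimal sketch. It assumes nearest-neighbor lookup under squared Euclidean distance, which is the usual choice for vector quantization but is not confirmed for this model. The function names (`quantize`, `dequantize`), the 2-D codebook, and the example encoder outputs are all hypothetical, and plain Python is used only to keep the example dependency-free.

```python
from typing import List, Sequence

Vector = Sequence[float]


def squared_distance(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def quantize(latents: Sequence[Vector], codebook: Sequence[Vector]) -> List[int]:
    """Map each encoder output vector to the index of its nearest codebook entry."""
    return [
        min(range(len(codebook)), key=lambda k: squared_distance(z, codebook[k]))
        for z in latents
    ]


def dequantize(tokens: Sequence[int], codebook: Sequence[Vector]) -> List[Vector]:
    """Look up the codebook vector for each token (what a decoder would receive)."""
    return [codebook[t] for t in tokens]


# Hypothetical codebook with 3 entries of dimension 2 (real sizes are not stated).
codebook = [(0.0, 0.0), (1.0, 1.0), (-1.0, 0.5)]
# Hypothetical encoder outputs for four time steps.
latents = [(0.1, -0.1), (0.9, 1.2), (-0.8, 0.4), (0.6, 0.7)]

tokens = quantize(latents, codebook)
print(tokens)  # [0, 1, 2, 1]
```

Here each continuous latent vector is replaced by a single integer, so a signal window becomes a short sequence of codebook indices. A trained model would produce the latents with its convolutional encoder and would likely run the lookup in a tensor library rather than in pure Python.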
Intended Use
This model is designed for:
- Tokenization of physiological time-series data
- Representation learning
- Downstream analysis such as QoL-related modeling
⚠️ This model is NOT intended for direct QoL prediction.
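As one possible illustration of representation analysis on token sequences, the sketch below summarizes a sequence as a normalized histogram over codebook indices. This is an assumed, generic approach and not a method described by the authors; `token_histogram`, the codebook size, and the example tokens are hypothetical. It produces a descriptive feature vector only and makes no prediction.

```python
from collections import Counter
from typing import List, Sequence


def token_histogram(tokens: Sequence[int], codebook_size: int) -> List[float]:
    """Return the relative frequency of each codebook index in a token sequence."""
    if not tokens:
        # An empty sequence has no usage information; return all zeros.
        return [0.0] * codebook_size
    counts = Counter(tokens)
    total = len(tokens)
    return [counts.get(k, 0) / total for k in range(codebook_size)]


# Hypothetical token sequence drawn from a codebook with 4 entries.
tokens = [0, 1, 2, 1]
features = token_histogram(tokens, codebook_size=4)
print(features)  # [0.25, 0.5, 0.25, 0.0]
```

A fixed-length vector like this could be compared across recordings or passed to a separate downstream model, which keeps the tokenizer itself independent of any specific outcome.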
Current Status
⏳ Model weights: Not yet released
⏳ Codebook representations: Not yet released
⏳ Inference code: Not yet released
Release Plan
All resources will be released upon publication of the associated manuscript, including:
- Pretrained model weights
- Learned codebook
- Tokenization pipeline
- Reproducibility scripts
Notes
- Clinical datasets cannot be publicly released due to institutional review board (IRB) restrictions.
- Example data and a reproducibility pipeline will be provided.