zongzhex committed (verified) · Commit b0c3041 · Parent: 355ef9f

Add README.md

Files changed (1): README.md (+178 −3)
---
license: mit
language:
- en
tags:
- biosignals
- sleep
- multimodal
- contrastive-learning
- eeg
- ecg
- polysomnography
library_name: open_clip
---

# SleepLM: Natural-Language Intelligence for Human Sleep
[![Paper](https://img.shields.io/badge/paper-arXiv-red)](#citation)
[![Webpage](https://img.shields.io/badge/website-demo-blue)](https://yang-ai-lab.github.io/SleepLM/)
[![License](https://img.shields.io/badge/license-MIT-green)](LICENSE)
[![Python](https://img.shields.io/badge/python-3.10%2B-brightgreen)](#installation)

SleepLM is, to our knowledge, the first sleep-language foundation model family that enables targeted natural-language generation from multimodal polysomnography (PSG) while also learning a shared signal–text embedding space for retrieval and open-vocabulary sleep understanding. It is trained on the largest paired sleep–text corpus to date, built from five NSRR cohorts totaling 100K+ hours of PSG from 10,000+ individuals.

SleepLM supports controllable, domain-specific generation (brain, cardiac, respiration, somatic) as well as holistic summaries, moving beyond fixed label spaces such as sleep stages and events. The model combines contrastive alignment, captioning, and signal reconstruction to preserve physiological fidelity while learning strong cross-modal semantics. Across a broad benchmark, SleepLM enables sleep-text retrieval, zero-shot and few-shot generalization, and robust transfer to unseen concepts.

---

## 📰 News
- **[2026-02-23]** Code released on GitHub!
- **[2026-02-23]** Project website is live!

---

## ✨ What you can do with this repo

- **Targeted caption generation** for 30-second sleep epochs using modality tokens (brain / cardiac / respiration / somatic).
- **Cross-modal retrieval** by encoding signals and text into a shared embedding space and computing cosine similarity.
- Run an interactive demo in **`demo.ipynb`**.

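The retrieval bullet above boils down to cosine similarity between L2-normalized embeddings. A minimal numpy sketch, where the embeddings, the 8-dim size, and the function name are illustrative placeholders rather than actual model outputs or released API:

```python
import numpy as np

def cosine_retrieval(signal_emb: np.ndarray, text_emb: np.ndarray) -> np.ndarray:
    """Return an [n_signals, n_texts] cosine-similarity matrix.

    signal_emb: [n_signals, d] signal embeddings
    text_emb:   [n_texts, d] text embeddings
    """
    # L2-normalize rows so the dot product equals cosine similarity
    s = signal_emb / np.linalg.norm(signal_emb, axis=1, keepdims=True)
    t = text_emb / np.linalg.norm(text_emb, axis=1, keepdims=True)
    return s @ t.T

# Toy example with random 8-dim embeddings standing in for model outputs
rng = np.random.default_rng(0)
sims = cosine_retrieval(rng.normal(size=(4, 8)), rng.normal(size=(3, 8)))
best_text_per_signal = sims.argmax(axis=1)  # index of best-matching caption per signal
```

Retrieval in either direction is then an `argmax` over rows (signal → text) or columns (text → signal) of the same matrix.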
---

## 🚀 Quickstart

### 1) Install

```bash
git clone https://github.com/yang-ai-lab/SleepLM
cd SleepLM
pip install -r requirements.txt
```

### 2) Download checkpoint

The model checkpoint is hosted on the Hugging Face Hub:

```python
from huggingface_hub import hf_hub_download

checkpoint_path = hf_hub_download(repo_id="yang-ai-lab/SleepLM-Base", filename="model_checkpoint.pt")
```

Or via the CLI:

```bash
huggingface-cli download yang-ai-lab/SleepLM-Base model_checkpoint.pt
```

The checkpoint is cached locally by `huggingface_hub`, and the returned path can be passed directly to `load_checkpoint()` in `demo.ipynb`.

### 3) Prepare your data

Preprocess your PSG recordings into a float32 PyTorch tensor of shape `[N, 10, 1920]` (N epochs × 10 channels × 1920 samples), following the channel order and signal requirements in [Using your own signals](#-using-your-own-signals) below. Save it as a `.pt` file and update the path in `demo.ipynb`.

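Before wiring in real recordings, it can help to sanity-check the tensor layout. A minimal numpy sketch (the helper name and the zero-filled dummy epochs are illustrative; the torch conversion is shown only as a comment):

```python
import numpy as np

N_CHANNELS = 10
SAMPLE_RATE = 64          # Hz
EPOCH_SECONDS = 30
SAMPLES = SAMPLE_RATE * EPOCH_SECONDS  # 1920 samples per channel

def pack_epochs(epoch_list):
    """Stack per-epoch [10, 1920] arrays into a float32 [N, 10, 1920] batch."""
    batch = np.stack(epoch_list).astype(np.float32)
    assert batch.shape[1:] == (N_CHANNELS, SAMPLES), "each epoch must be [10, 1920]"
    return batch

# Two zero-filled epochs stand in for real preprocessed PSG data
epochs = pack_epochs([np.zeros((10, 1920)) for _ in range(2)])

# To save for demo.ipynb (requires torch):
#   import torch
#   torch.save(torch.from_numpy(epochs), "my_epochs.pt")
```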
### 4) Run the demo

Open and run:

- `demo.ipynb`

The notebook includes:
- similarity calculation between signal and text embeddings
- targeted caption generation with per-modality conditioning

---

## 📦 Repository contents

- `demo.ipynb` — interactive inference + visualization
- `requirements.txt` — dependencies

---

## 🧾 Input format

SleepLM expects **30-second epochs** sampled at **64 Hz** (**1920 samples/channel**) with **10 channels** in the order below.

### Channel order

| Index | Channel | Description |
|------:|---------|-------------|
| 0 | ECG | Electrocardiogram |
| 1 | ABD | Abdominal respiratory effort |
| 2 | THX | Thoracic respiratory effort |
| 3 | AF | Airflow |
| 4 | EOG_Left | Left eye movement |
| 5 | EOG_Right | Right eye movement |
| 6 | EEG_C3_A2 | Left central EEG |
| 7 | EEG_C4_A1 | Right central EEG |
| 8 | EMG_Chin | Chin muscle tone |
| 9 | POS | Body position |

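For your own preprocessing code it is handy to pin this order down as a constant. A small hypothetical helper (not part of the released code):

```python
# Channel order expected by SleepLM: index -> channel name
CHANNEL_ORDER = [
    "ECG", "ABD", "THX", "AF",
    "EOG_Left", "EOG_Right",
    "EEG_C3_A2", "EEG_C4_A1",
    "EMG_Chin", "POS",
]

def channel_index(name: str) -> int:
    """Return the row a named channel must occupy in a [10, 1920] epoch."""
    return CHANNEL_ORDER.index(name)
```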
### Body position encoding (POS channel)

```python
POSITION_ENCODING = {
    0: "Right",
    1: "Left",
    2: "Supine",
    3: "Prone",
    4: "Upright",
    -1: "Other/Unknown",  # use for missing data
}
```
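As a usage sketch, the dominant body position in an epoch can be recovered by majority vote over the POS samples (a hypothetical helper; the released pipeline may handle this differently):

```python
from collections import Counter

POSITION_ENCODING = {
    0: "Right", 1: "Left", 2: "Supine",
    3: "Prone", 4: "Upright", -1: "Other/Unknown",
}

def decode_position(pos_samples) -> str:
    """Map a POS channel (one integer code per sample) to its majority label."""
    code, _ = Counter(int(v) for v in pos_samples).most_common(1)[0]
    return POSITION_ENCODING.get(code, "Other/Unknown")

# An epoch mostly supine with a brief roll to the left
decode_position([2] * 1900 + [1] * 20)  # -> "Supine"
```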

---

## 🧪 Using your own signals

You can generate captions for your own sleep recordings by loading **preprocessed** epochs directly in `demo.ipynb`.

**Signal requirements**
- Resample to **64 Hz**
- Normalize each channel (**z-score**)
- If a channel is missing, **zero-pad** it
- POS must follow the integer encoding above
- Each epoch must be exactly **30 seconds** (**1920 samples @ 64 Hz**)
- Pack epochs into a float32 PyTorch tensor of shape `[N, 10, 1920]`

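These requirements can be sketched end to end in numpy. Everything below is illustrative: linear interpolation stands in for your preferred resampler, the function names are made up, and POS is left as raw integer codes rather than z-scored, on the assumption that its categorical encoding must survive intact:

```python
import numpy as np

TARGET_HZ = 64
EPOCH_SECONDS = 30
SAMPLES = TARGET_HZ * EPOCH_SECONDS  # 1920

def resample(signal: np.ndarray, source_hz: float) -> np.ndarray:
    """Linearly interpolate one 30-second channel onto the 64 Hz grid."""
    src_t = np.arange(len(signal)) / source_hz
    dst_t = np.arange(SAMPLES) / TARGET_HZ
    return np.interp(dst_t, src_t, signal)

def zscore(signal: np.ndarray, eps: float = 1e-8) -> np.ndarray:
    """Per-channel z-score; eps guards flat (e.g. zero-padded) channels."""
    return (signal - signal.mean()) / (signal.std() + eps)

def build_epoch(channels: dict) -> np.ndarray:
    """Assemble one [10, 1920] epoch; absent channels stay zero-padded."""
    order = ["ECG", "ABD", "THX", "AF", "EOG_Left", "EOG_Right",
             "EEG_C3_A2", "EEG_C4_A1", "EMG_Chin", "POS"]
    epoch = np.zeros((10, SAMPLES), dtype=np.float32)
    for i, name in enumerate(order):
        if name not in channels:
            continue  # missing channel: leave the zero padding in place
        sig, hz = channels[name]
        sig = resample(np.asarray(sig, dtype=float), hz)
        # POS carries integer position codes, so it is not z-scored here
        epoch[i] = sig if name == "POS" else zscore(sig)
    return epoch

# Example: a 100 Hz ECG trace plus a supine POS channel; all other channels missing
ecg = np.sin(np.linspace(0, 60 * np.pi, 3000))  # 30 s at 100 Hz
epoch = build_epoch({"ECG": (ecg, 100), "POS": (np.full(1920, 2), 64)})
```

Stacking many such epochs with the float32 `[N, 10, 1920]` shape described above gives the input the demo expects.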
---

## 🔁 Reproducibility notes

This repo is intentionally lightweight and focuses on **inference**. If you plan to:
- reproduce paper benchmarks,
- train on NSRR cohorts,
- or evaluate cross-cohort generalization,

note that we plan to open-source our training pipeline upon acceptance of the paper. The training data itself will not be released, since NSRR access requires individual credentialing; to use the same NSRR datasets, [please apply here](https://sleepdata.org/).

---

## 📝 Citation

If you use SleepLM in your research, please cite the paper:

```bibtex
@article{xu2026sleeplm,
  title={SleepLM: Natural-Language Intelligence for Human Sleep},
  author={Xu, Zongzhe and Shuai, Zitao and Mozaffari, Eideen and Aysola, Ravi Shankar and Kumar, Rajesh and Yang, Yuzhe},
  journal={arXiv preprint},
  year={2026}
}
```

---

## 📄 License

This project is licensed under the MIT License; see the [LICENSE](LICENSE) file for details.

---

## 🙏 Acknowledgments

- Data sources and cohort infrastructure: the **National Sleep Research Resource (NSRR)**
- Model architecture inspiration: [OpenCLIP](https://github.com/mlfoundations/open_clip)