Update README.md
README.md
CHANGED
@@ -7,12 +7,13 @@ base_model:
## haoranli-ml/Llama-3-8B-HardClip-64k-Base
[](https://arxiv.org/abs/2602.05258)
### ✨ Overview
-**CoPE** is a plug-and-play
-With a simple yet effective soft clipping strategy, CoPE
1️⃣ **Eliminates severe OOD outliers**, whose periods exceed the pre-training context window and are the primary cause of OOD extrapolation.
@@ -20,7 +21,7 @@ With a simple yet effective soft clipping strategy, CoPE
3️⃣ **Prevents Spectral Leakage** induced by hard frequency truncation, which otherwise leads to long-range oscillatory ringing in the attention scores across relative token distances and introduces spurious correlations.
-
### 📖 Citation
```
## haoranli-ml/Llama-3-8B-HardClip-64k-Base
[](https://arxiv.org/abs/2602.05258)
+[](https://github.com/hrlics/CoPE)
### ✨ Overview
+**CoPE** is a plug-and-play enhancement of RoPE that *softly* clips the unstable low-frequency components, delivering consistent gains both **within the training context** and during **long-context extrapolation**.
+With a simple yet effective soft clipping strategy, CoPE:
1️⃣ **Eliminates severe OOD outliers**, whose periods exceed the pre-training context window and are the primary cause of OOD extrapolation.
3️⃣ **Prevents Spectral Leakage** induced by hard frequency truncation, which otherwise leads to long-range oscillatory ringing in the attention scores across relative token distances and introduces spurious correlations.
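
The "OOD outliers" in the first point can be made concrete by computing the period of each RoPE frequency pair and comparing it with the pre-training context window. The sketch below is only an illustration: the head dimension (128), the RoPE base (500000), and the 8192-token pre-training window are assumed values that this README does not state, and `rope_periods` is a hypothetical helper, not a function from the CoPE repository.

```python
import math


def rope_periods(head_dim: int, base: float) -> list[float]:
    """Period in tokens of each RoPE frequency pair.

    Pair i rotates with angular frequency theta_i = base ** (-2 * i / head_dim),
    so its period is 2 * pi / theta_i.
    """
    return [2 * math.pi * base ** (2 * i / head_dim) for i in range(head_dim // 2)]


# Assumed values, not taken from this README.
HEAD_DIM = 128
BASE = 500_000.0
TRAIN_CTX = 8192

periods = rope_periods(HEAD_DIM, BASE)

# Components whose period is longer than the pre-training window never
# complete a full rotation during pre-training.
ood = [i for i, p in enumerate(periods) if p > TRAIN_CTX]

print(f"{len(ood)} of {len(periods)} frequency pairs have periods above {TRAIN_CTX} tokens")
```

Under these assumptions, the flagged indices form the low-frequency tail of the spectrum, which is the region a clipping strategy would act on.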
+For more details on training and evaluation, please refer to the [official GitHub repository](https://github.com/hrlics/CoPE).
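
To picture the difference between hard frequency truncation and a soft clip, the sketch below builds two per-component weight masks over the RoPE spectrum. It is a hedged illustration only: the raised-cosine ramp, the `width` parameter, and both function names are hypothetical choices made here, and the actual clipping function used by CoPE is defined in the official repository, not in this README. The head dimension, RoPE base, and 8192-token window are likewise assumed values.

```python
import math


def hard_clip_weights(periods, max_period):
    """Hard truncation: keep a component fully (1.0) or drop it (0.0)."""
    return [1.0 if p <= max_period else 0.0 for p in periods]


def soft_clip_weights(periods, max_period, width=8):
    """Hypothetical soft clip: a raised-cosine ramp from 1 toward 0 over the
    last `width` in-window components. Assumes `periods` is sorted ascending."""
    cutoff = sum(1 for p in periods if p <= max_period)  # first out-of-window index
    weights = []
    for i in range(len(periods)):
        if i >= cutoff:
            weights.append(0.0)
        elif i < cutoff - width:
            weights.append(1.0)
        else:
            t = (i - (cutoff - width)) / width  # position along the ramp, in [0, 1)
            weights.append(0.5 * (1 + math.cos(math.pi * t)))
    return weights


# Assumed values, not taken from this README.
HEAD_DIM, BASE, TRAIN_CTX = 128, 500_000.0, 8192
periods = [2 * math.pi * BASE ** (2 * i / HEAD_DIM) for i in range(HEAD_DIM // 2)]

hard = hard_clip_weights(periods, TRAIN_CTX)
soft = soft_clip_weights(periods, TRAIN_CTX)

# Largest jump between neighbouring components in each mask.
hard_step = max(abs(a - b) for a, b in zip(hard, hard[1:]))
soft_step = max(abs(a - b) for a, b in zip(soft, soft[1:]))
print(f"largest step: hard={hard_step:.3f}, soft={soft_step:.3f}")
```

The hard mask has an abrupt edge at the cutoff, while the tapered mask spreads the transition over several components. Abrupt edges in a spectrum are the classic source of ringing, which is the effect the third point describes.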
### 📖 Citation
```