haoranli-ml committed
Commit b12349d · verified · 1 Parent(s): 2a7a649

Create README.md

  1. README.md +33 -0
README.md ADDED
@@ -0,0 +1,33 @@
+ ---
+ language:
+ - en
+ base_model:
+ - meta-llama/Meta-Llama-3-8B
+ ---
+ ## haoranli-ml/Llama-3-8B-HardClip-64k-Base
+
+ [![Paper](https://img.shields.io/badge/CoPE_paper-A42C25?style=for-the-badge&logo=arxiv&logoColor=white)](https://arxiv.org/abs/2602.05258)
+
+
+ ### ✨ Overview
+ **CoPE** is a plug-and-play enhancement of RoPE that *softly* clips the unstable low-frequency components, delivering consistent gains both **within the training context** and during **long-context extrapolation**.
+
+ With a simple yet effective soft clipping strategy, CoPE:
+
+ 1️⃣ **Eliminates severe OOD outliers**: the low-frequency components whose periods exceed the pre-training context window, which are the primary cause of OOD extrapolation failure.
+
+ 2️⃣ **Refines long-range semantic signals** by alleviating the hidden *long-term decay of semantic attention* introduced by RoPE.
+
+ 3️⃣ **Prevents spectral leakage** induced by hard frequency truncation, which otherwise causes long-range oscillatory ringing in the attention scores across relative token distances and introduces spurious correlations.
+
+
+
+ ### 📖 Citation
+ ```
+ @article{li2026cope,
+ title={CoPE: Clipped RoPE as A Scalable Free Lunch for Long Context LLMs},
+ author={Li, Haoran and Ren, Sucheng and Yuille, Alan and Wang, Feng},
+ journal={arXiv preprint arXiv:2602.05258},
+ year={2026}
+ }
+ ```
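
The Overview's first point, that some RoPE components have periods longer than the pre-training context window, can be checked with a few lines of arithmetic. The sketch below is only an illustration and is not the CoPE implementation: it uses the standard RoPE frequency schedule `base ** (-2i / d)`, and the constants (`HEAD_DIM = 128`, `ROPE_BASE = 500000`, `TRAIN_CONTEXT = 8192`) are assumptions taken from the published Meta-Llama-3-8B configuration rather than from this README. The helper names `rope_periods` and `ood_indices` are hypothetical. How CoPE actually tapers these components is described in the paper and is not reproduced here.

```python
import math

# Assumed values from the Meta-Llama-3-8B base config (not stated in this README).
HEAD_DIM = 128          # per-head dimension
ROPE_BASE = 500_000.0   # RoPE base (rope_theta)
TRAIN_CONTEXT = 8192    # pre-training context window, in tokens


def rope_periods(head_dim: int, base: float) -> list[float]:
    """Period in tokens of each RoPE frequency pair.

    Pair i rotates with angular frequency base ** (-2i / head_dim),
    so its period is 2 * pi * base ** (2i / head_dim).
    """
    return [2 * math.pi * base ** (2 * i / head_dim) for i in range(head_dim // 2)]


def ood_indices(periods: list[float], context_len: int) -> list[int]:
    """Indices of pairs that never complete a full rotation within the context."""
    return [i for i, p in enumerate(periods) if p > context_len]


periods = rope_periods(HEAD_DIM, ROPE_BASE)
ood = ood_indices(periods, TRAIN_CONTEXT)

print(f"{len(ood)} of {len(periods)} frequency pairs have a period "
      f"longer than {TRAIN_CONTEXT} tokens")
print(f"first such pair index: {ood[0]}, longest period: {periods[-1]:.0f} tokens")
```

Under these assumed constants, the pairs returned by `ood_indices` are the low-frequency components the Overview refers to: they see only part of a rotation during pre-training, so positions beyond the training window map to rotation angles the model has not encountered.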