haoranli-ml/Llama-3-8B-HardClip-64k-Base


haoranli-ml committed on Feb 6

Commit b12349d · verified · 1 Parent(s): 2a7a649

Create README.md

Files changed (1)

README.md +33 -0

README.md ADDED

	@@ -0,0 +1,33 @@

+---
+language:
+- en
+base_model:
+- meta-llama/Meta-Llama-3-8B
+---
+## haoranli-ml/Llama-3-8B-HardClip-64k-Base
+[![Paper](https://img.shields.io/badge/CoPE_paper-A42C25?style=for-the-badge&logo=arxiv&logoColor=white)](https://arxiv.org/abs/2602.05258)
+### ✨  Overview
+**CoPE** is a plug-and-play enhancement of RoPE that *softly* clips the unstable low-frequency components, delivering consistent gains both **within the training context** and during **long-context extrapolation**.
+With a simple yet effective soft clipping strategy, CoPE:
+1️⃣ **Eliminates severe OOD outliers**, the frequency components whose periods exceed the pre-training context window and which are the primary cause of out-of-distribution (OOD) behavior during extrapolation.
+2️⃣ **Refines Long-range Semantic Signals** by alleviating the secret *long-term decay of semantic attention* introduced by RoPE.
+3️⃣ **Prevents Spectral Leakage** induced by hard frequency truncation, which otherwise leads to long-range oscillatory ringing in the attention scores across relative token distances and introduces spurious correlations.
+### 📖 Citation
+```
+@article{li2026cope,
+  title={CoPE: Clipped RoPE as A Scalable Free Lunch for Long Context LLMs},
+  author={Li, Haoran and Ren, Sucheng and Yuille, Alan and Wang, Feng},
+  journal={arXiv preprint arXiv:2602.05258},
+  year={2026}
+}
+```
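
The overview in the README refers to low-frequency RoPE components whose periods exceed the pre-training context window, and contrasts hard frequency truncation with soft clipping. The sketch below is a rough illustration of that idea and is not taken from the CoPE paper or this repository. It assumes the standard RoPE frequency schedule, Llama-3-8B-style settings (`head_dim=128`, `base=500000.0`, `context_len=8192`), and that "clipping" means scaling a component's frequency toward zero. The helper names and the cosine taper with its `width` parameter are hypothetical choices made for illustration only; the paper's actual clipping function may differ.

```python
import math

def rope_inv_freq(head_dim, base):
    """Standard RoPE inverse frequencies: base^(-2i/d) for i = 0 .. d/2 - 1."""
    return [base ** (-2.0 * i / head_dim) for i in range(head_dim // 2)]

def cutoff_index(inv_freq, context_len):
    """Index of the first component whose period 2*pi/freq exceeds context_len.

    Frequencies decrease with the index, so every later component is also
    beyond the context window.
    """
    for i, f in enumerate(inv_freq):
        if 2.0 * math.pi / f > context_len:
            return i
    return len(inv_freq)

def hard_clip_mask(inv_freq, context_len):
    """Step mask: keep components below the cutoff, zero the rest."""
    k = cutoff_index(inv_freq, context_len)
    return [1.0 if i < k else 0.0 for i in range(len(inv_freq))]

def soft_clip_mask(inv_freq, context_len, width=4):
    """Hypothetical smooth mask: cosine taper over `width` components
    just before the cutoff, zero from the cutoff onward.
    This is an illustrative assumption, not the paper's formula."""
    k = cutoff_index(inv_freq, context_len)
    mask = []
    for i in range(len(inv_freq)):
        if i >= k:
            mask.append(0.0)
        elif i < k - width:
            mask.append(1.0)
        else:
            t = (i - (k - width) + 1) / (width + 1)  # strictly between 0 and 1
            mask.append(0.5 * (1.0 + math.cos(math.pi * t)))
    return mask

# Assumed Llama-3-8B-style settings (not stated in the README itself).
inv_freq = rope_inv_freq(head_dim=128, base=500000.0)
k = cutoff_index(inv_freq, context_len=8192)
hard = hard_clip_mask(inv_freq, context_len=8192)
soft = soft_clip_mask(inv_freq, context_len=8192, width=4)

# Scaling a frequency to zero stops that pair of dimensions from rotating.
soft_clipped_freq = [f * m for f, m in zip(inv_freq, soft)]

print(f"{len(inv_freq) - k} of {len(inv_freq)} components have periods beyond the window")
```

The hard mask drops abruptly from 1 to 0 at the cutoff, which is the kind of truncation the README associates with spectral leakage; the tapered mask changes gradually across neighboring components instead.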