---
language:
- en
base_model:
- meta-llama/Meta-Llama-3-8B
---

## haoranli-ml/Llama-3-8B-HardClip-64k-Base

[![Paper](https://img.shields.io/badge/CoPE_paper-A42C25?style=for-the-badge&logo=arxiv&logoColor=white)](https://arxiv.org/abs/2602.05258) [![GitHub](https://img.shields.io/badge/GitHub-181717?style=for-the-badge&logo=github&logoColor=white)](https://github.com/hrlics/CoPE)

### ✨ Overview

**CoPE** is a plug-and-play enhancement of RoPE that *softly* clips the unstable low-frequency components, delivering consistent gains both **within the training context** and during **long-context extrapolation**.

With a simple yet effective soft clipping strategy, CoPE:

1️⃣ **Eliminates severe OOD outliers**, whose periods exceed the pre-training context window and are the primary cause of OOD behavior during extrapolation.

2️⃣ **Refines long-range semantic signals** by alleviating the hidden *long-term decay of semantic attention* introduced by RoPE.

3️⃣ **Prevents spectral leakage** induced by hard frequency truncation, which otherwise leads to long-range oscillatory ringing in the attention scores across relative token distances and introduces spurious correlations.

For more details on training and evaluation, please refer to the [official GitHub repository](https://github.com/hrlics/CoPE).

### 📖 Citation

```
@article{li2026cope,
  title={CoPE: Clipped RoPE as A Scalable Free Lunch for Long Context LLMs},
  author={Li, Haoran and Ren, Sucheng and Yuille, Alan and Wang, Feng},
  journal={arXiv preprint arXiv:2602.05258},
  year={2026}
}
```
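
### 🧩 Illustrative Sketch

To make the Overview more concrete, here is a minimal sketch of what "clipping low-frequency RoPE components" could look like. It is not the CoPE implementation; see the official repository for that. It assumes the commonly cited Llama-3-8B settings (head dimension 128, RoPE base 500000, 8192-token pre-training context). The function names, the cosine taper, and the `width` parameter are hypothetical and chosen only to contrast a hard cut with a smooth one.

```python
import math


def rope_inv_freqs(head_dim: int = 128, base: float = 500000.0) -> list[float]:
    """Standard RoPE angular frequencies: theta_i = base^(-2i/d), i = 0..d/2-1.

    head_dim and base are assumed Llama-3-8B values, not taken from this card.
    """
    return [base ** (-2.0 * i / head_dim) for i in range(head_dim // 2)]


def hard_clip_mask(freqs: list[float], context_len: int = 8192) -> list[float]:
    """1.0 if a component completes a full period within context_len, else 0.0.

    Components with period 2*pi/theta > context_len never complete a full
    rotation during pre-training, which is what the Overview calls OOD outliers.
    """
    return [1.0 if 2.0 * math.pi / f <= context_len else 0.0 for f in freqs]


def soft_clip_mask(
    freqs: list[float], context_len: int = 8192, width: int = 8
) -> list[float]:
    """Hypothetical smooth alternative: a cosine taper over `width` components.

    The taper falls from 1 to 0 and reaches 0 at the first out-of-distribution
    component, avoiding the abrupt step of hard_clip_mask. This is an
    illustrative assumption, not the formula from the CoPE paper.
    """
    hard = hard_clip_mask(freqs, context_len)
    # Frequencies decrease with index, so the first 0.0 marks the cutoff.
    cutoff = hard.index(0.0) if 0.0 in hard else len(hard)
    mask = []
    for i in range(len(freqs)):
        t = (i - (cutoff - width)) / width  # 0 before the taper, 1 at cutoff
        t = min(1.0, max(0.0, t))
        mask.append(0.5 * (1.0 + math.cos(math.pi * t)))
    return mask


if __name__ == "__main__":
    freqs = rope_inv_freqs()
    hard = hard_clip_mask(freqs)
    soft = soft_clip_mask(freqs)
    for i, (f, h, s) in enumerate(zip(freqs, hard, soft)):
        print(f"i={i:2d}  period={2 * math.pi / f:12.1f}  hard={h:.0f}  soft={s:.3f}")
```

One way such a mask could be used is to multiply each inverse frequency by its weight before building the rotary embedding, so that fully clipped components stop rotating with position. Whether the released checkpoints apply clipping in exactly this way is not stated here.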