---
base_model: bert-base-uncased
datasets:
- AAU-NLP/HiFi-KPI
language:
- en
library_name: transformers
model_name: Cal-BERT-SL1000
pipeline_tag: token-classification
tags:
- financial NLP
- named entity recognition
- sequence labeling
- structured extraction
- hierarchical taxonomy
- XBRL
- iXBRL
- SEC filings
- financial-information-extraction
task_categories:
- token-classification
task_ids:
- named-entity-recognition
- financial-information-extraction
pretty_name: 'Cal-BERT-SL1000: Sequence Labeling for Calculation Taxonomy KPI Extraction'
---

## **Cal-BERT-SL1000**

### **Model Description**

Cal-BERT-SL1000 is a **BERT-based sequence labeling model** fine-tuned on the **[HiFi-KPI dataset](https://huggingface.co/datasets/AAU-NLP/HiFi-KPI)** for extracting **financial key performance indicators (KPIs)** from **SEC earnings filings (10-K & 10-Q)**. It specializes in identifying entities that are one level up in the calculation taxonomy ($n=1$), such as `revenueAbstract`, `earnings`, and `financial ratios`, using **token classification**.

This model was introduced in the paper [HiFi-KPI: A Dataset for Hierarchical KPI Extraction from Earnings Filings](https://huggingface.co/papers/2502.15411) by Rasmus Aavang, Giovanni Rizzi, Rasmus Bøggild, Alexandre Iolov, Mike Zhang (@jjzha), and Johannes Bjerva.

### **Use Cases**

- Extracting **financial KPIs** using the **iXBRL calculation taxonomy**
- **Financial document parsing** with entity recognition

### **Performance**

- Trained on the **1,000 most frequent labels** from the **[HiFi-KPI dataset](https://huggingface.co/datasets/AAU-NLP/HiFi-KPI)** with $n=1$ in the calculation taxonomy.

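### **Usage Sketch**

Since the card declares `library_name: transformers` and `pipeline_tag: token-classification`, the model can presumably be queried through the standard `transformers` pipeline. The following is a minimal sketch, not the authors' official usage: the Hub ID `AAU-NLP/Cal-BERT-SL1000` is an assumption inferred from the model name and the dataset organisation, and `load_tagger`, `extract_kpis`, and the `min_score` threshold are hypothetical helpers introduced here for illustration.

```python
from typing import Callable, Dict, List

# Assumed Hub ID (not confirmed by this card); replace with the actual repository name.
MODEL_ID = "AAU-NLP/Cal-BERT-SL1000"


def load_tagger(model_id: str = MODEL_ID) -> Callable[[str], List[Dict]]:
    """Build a token-classification pipeline (downloads weights on first use)."""
    # Imported lazily so extract_kpis can be used and tested without transformers installed.
    from transformers import pipeline

    # aggregation_strategy="simple" merges word pieces into labeled spans
    # that carry "entity_group", "score", "start", and "end" keys.
    return pipeline(
        "token-classification",
        model=model_id,
        aggregation_strategy="simple",
    )


def extract_kpis(
    text: str,
    tagger: Callable[[str], List[Dict]],
    min_score: float = 0.5,  # hypothetical confidence cut-off, tune as needed
) -> List[Dict]:
    """Run the tagger on text and keep spans scoring at least min_score."""
    spans = []
    for pred in tagger(text):
        if pred["score"] >= min_score:
            spans.append(
                {
                    "label": pred["entity_group"],
                    # Slice the original text so casing is preserved
                    # (assumes the tokenizer returns character offsets).
                    "text": text[pred["start"]:pred["end"]],
                    "score": float(pred["score"]),
                }
            )
    return spans
```

Under these assumptions, `extract_kpis("Total revenue was $ 12.5 million for the quarter.", load_tagger())` would return a list of dictionaries, one per predicted KPI span, each holding the predicted taxonomy label, the matched text, and the confidence score. Passing the tagger as an argument keeps the filtering logic independent of the model, so it can be exercised with a stub.
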
### **Dataset & Code**

- **Dataset**: [HiFi-KPI on Hugging Face](https://huggingface.co/datasets/AAU-NLP/HiFi-KPI)
- **Official Code**: [HiFi-KPI GitHub Repository](https://github.com/aaunlp/HiFi-KPI)

### **Citation**

```bibtex
@article{aavang2025hifikpi,
  title={HiFi-KPI: A Dataset for Hierarchical KPI Extraction from Earnings Filings},
  author={Aavang, Rasmus and Rizzi, Giovanni and B{\o}ggild, Rasmus and Iolov, Alexandre and Zhang, Mike and Bjerva, Johannes},
  journal={arXiv preprint arXiv:2502.15411},
  year={2025}
}
```