How to use telepix/PIXIE-Spell-Reranker-Preview-0.6B with sentence-transformers:

```python
from sentence_transformers import CrossEncoder

# Load the reranker as a cross-encoder
model = CrossEncoder("telepix/PIXIE-Spell-Reranker-Preview-0.6B")

query = "Which planet is known as the Red Planet?"
passages = [
    "Venus is often called Earth's twin because of its similar size and proximity.",
    "Mars, known for its reddish appearance, is often referred to as the Red Planet.",
    "Jupiter, the largest planet in our solar system, has a prominent red spot.",
    "Saturn, famous for its rings, is sometimes mistaken for the Red Planet.",
]

# Score every (query, passage) pair; a higher score means more relevant
scores = model.predict([(query, passage) for passage in passages])
print(scores)
```
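The snippet above prints raw scores only. As a minimal sketch of the usual next step, the helper below sorts passages by score so the most relevant comes first. The function name `rank_passages` and the example scores are hypothetical and chosen for illustration; they are not outputs of the model.

```python
def rank_passages(passages, scores):
    """Return (passage, score) pairs sorted from highest to lowest score."""
    order = sorted(range(len(passages)), key=lambda i: scores[i], reverse=True)
    return [(passages[i], float(scores[i])) for i in order]


example_passages = [
    "Venus is often called Earth's twin because of its similar size and proximity.",
    "Mars, known for its reddish appearance, is often referred to as the Red Planet.",
    "Jupiter, the largest planet in our solar system, has a prominent red spot.",
]
# Made-up scores for illustration; real values come from model.predict(...)
example_scores = [0.02, 0.97, 0.31]

ranked = rank_passages(example_passages, example_scores)
for text, score in ranked:
    print(f"{score:.2f}  {text}")
```

The helper indexes `scores` by position, so it should accept either a plain list or the array returned by `predict`.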
Update README.md
README.md
CHANGED
@@ -71,7 +71,7 @@ Descriptions of the benchmark datasets used for evaluation are as follows:
 > **Note:**
 > While many benchmark datasets are available for evaluation, in this project we chose to use only those that contain clean positive documents for each query. Keep in mind that a benchmark dataset is just that a benchmark. For real-world applications, it is best to construct an evaluation dataset tailored to your specific domain and evaluate embedding models, such as PIXIE, in that environment to determine the most suitable one.
 
-####
+#### 4 Datasets of BEIR (English)
 Our model, **telepix/PIXIE-Spell-Reranker-Preview-0.6B**, achieves strong performance on a wide range of tasks, including fact verification, multi-hop question answering, financial QA, and scientific document retrieval, demonstrating competitive generalization across diverse domains.
 
 | Model Name | # params | Avg. NDCG | NDCG@1 | NDCG@3 | NDCG@5 | NDCG@10 |
@@ -87,16 +87,10 @@ Our model, **telepix/PIXIE-Spell-Reranker-Preview-0.6B**, achieves strong perfor
 Descriptions of the benchmark datasets used for evaluation are as follows:
 - **ArguAna**
 A dataset for argument retrieval based on claim-counterclaim pairs from online debate forums.
-- **FEVER**
-A fact verification dataset using Wikipedia for evidence-based claim validation.
 - **FiQA-2018**
 A retrieval benchmark tailored to the finance domain with real-world questions and answers.
-- **HotpotQA**
-A multi-hop open-domain QA dataset requiring reasoning across multiple documents.
 - **MSMARCO**
 A large-scale benchmark using real Bing search queries and corresponding web documents.
-- **NQ**
-A Google QA dataset where user questions are answered using Wikipedia articles.
 - **SCIDOCS**
 A citation-based document retrieval dataset focused on scientific papers.
 
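The results table in the README reports NDCG@1, NDCG@3, NDCG@5, and NDCG@10. As a minimal sketch of what these columns measure, the code below computes NDCG@k using one common formulation (linear gain with a log2 rank discount). The README does not state which evaluation tool or gain function the authors used, so treat the function names and the exact formula as assumptions.

```python
import math


def dcg_at_k(relevances, k):
    """Discounted cumulative gain over the top-k items (linear gain)."""
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances[:k]))


def ndcg_at_k(relevances, k):
    """NDCG@k for one query.

    `relevances` holds the graded relevance of each candidate, in the order
    the reranker returned them. The ideal ordering is taken from this same
    list, so it should cover every judged candidate for the query.
    """
    ideal_dcg = dcg_at_k(sorted(relevances, reverse=True), k)
    if ideal_dcg == 0:
        return 0.0  # no relevant candidates: define NDCG as 0
    return dcg_at_k(relevances, k) / ideal_dcg


# One relevant passage, ranked second out of four
print(ndcg_at_k([0, 1, 0, 0], k=1))
print(ndcg_at_k([0, 1, 0, 0], k=3))
```

Averaging the per-query values over a dataset gives one number per cutoff, which is presumably how each table cell is produced.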