Token Classification
Transformers
French
German
ocr_qa_assessment
ocr
bloomfilter
unigram
impresso
quality-assessment
v1.0.6
custom_code
Instructions to use impresso-project/ocr-quality-assessor-unigram-light with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use impresso-project/ocr-quality-assessor-unigram-light with Transformers:
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("token-classification", model="impresso-project/ocr-quality-assessor-unigram-light", trust_remote_code=True)

# Load model directly
from transformers import AutoModelForTokenClassification

model = AutoModelForTokenClassification.from_pretrained("impresso-project/ocr-quality-assessor-unigram-light", trust_remote_code=True, dtype="auto")
- Notebooks
- Google Colab
- Kaggle
from transformers import Pipeline


class QAAssessmentPipeline(Pipeline):
    def _sanitize_parameters(self, **kwargs):
        preprocess_kwargs = {}
        if "text" in kwargs:
            preprocess_kwargs["text"] = kwargs["text"]
        return preprocess_kwargs, {}, {}

    def preprocess(self, text, **kwargs):
        # Nothing to preprocess
        return text

    def _forward(self, text, **kwargs):
        predictions, probabilities = self.model(text)
        return predictions, probabilities

    def postprocess(self, outputs, **kwargs):
        predictions, probabilities = outputs
        label = predictions[0][0].replace("__label__", "")  # Remove __label__ prefix
        confidence = float(
            probabilities[0][0]
        )  # Convert to float for JSON serialization
        # Format as JSON-compatible dictionary
        model_output = {"label": label, "confidence": round(confidence * 100, 2)}
        return model_output
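
The `postprocess` step above can be tried without downloading the model. The sketch below reproduces only its formatting logic as a standalone function. The name `format_prediction` and the sample values (`"__label__fr"`, `0.875`) are hypothetical and chosen for illustration; the labels the real model emits may differ. The only assumption about the raw output is the one the code above already makes: nested sequences of labels and probabilities, with the top prediction at index `[0][0]`.

```python
import json


def format_prediction(outputs):
    """Mirror of QAAssessmentPipeline.postprocess, kept standalone for illustration.

    `outputs` is assumed to be a (predictions, probabilities) pair whose
    top-ranked entry sits at index [0][0], as in the pipeline code above.
    """
    predictions, probabilities = outputs
    # Strip the "__label__" prefix from the top-ranked label
    label = predictions[0][0].replace("__label__", "")
    # Plain float so the result can be serialized to JSON
    confidence = float(probabilities[0][0])
    # Confidence is reported as a percentage rounded to two decimals
    return {"label": label, "confidence": round(confidence * 100, 2)}


# Hypothetical raw model output, not taken from the real model
raw_output = ([["__label__fr"]], [[0.875]])

result = format_prediction(raw_output)
serialized = json.dumps(result)
print(serialized)
```

Note that the confidence is returned on a 0 to 100 scale, not 0 to 1, so downstream thresholds should be expressed as percentages.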