metythorn/ocr-stn-cnn-transformer-base

@@ -1,42 +1,27 @@
 ---
 language:
-  - km
 license: apache-2.0
 tags:
-  - ocr
-  - transformer
-  - vision
 pipeline_tag: image-to-text
 ---
-# Khmer OCR CNN + Transformer (ONNX)
-This repository contains a ResNet + Transformer decoder checkpoint for Khmer OCR,
-the exported ONNX graph, the serialized `config.json` (vocab + hyperparameters),
-and the standalone `inference_onnx.py` helper.
-## Files
-- `khmer_ocr.onnx` – ONNX model
-- `config.json` – hyperparameters plus serialized vocabulary
-- `inference_onnx.py` and `model.py` – inference helper and architecture
 ## Usage
 ```python
-from huggingface_hub import hf_hub_download
-import importlib.util
-repo_id = "metythorn/ocr-stn-cnn-transformer-base"
-onnx_path = hf_hub_download(repo_id=repo_id, filename="khmer_ocr.onnx")
-config_path = hf_hub_download(repo_id=repo_id, filename="config.json")
-inference_path = hf_hub_download(repo_id=repo_id, filename="onnx_inference.py")
-spec = importlib.util.spec_from_file_location("khmer_ocr_infer", inference_path)
-module = importlib.util.module_from_spec(spec)
-spec.loader.exec_module(module)
-ONNXPredictor = module.ONNXPredictor
-predictor = ONNXPredictor(model_path=onnx_path, config_path=config_path)
-print(predictor.predict("path/to/image.jpg"))
 ```

 ---
 language:
+- km
 license: apache-2.0
 tags:
+- ocr
+- transformer
+- vision
 pipeline_tag: image-to-text
 ---
+# Khmer OCR CNN + Transformer
+This repository contains a ResNet + Transformer decoder checkpoint for Khmer OCR. I don't have a public paper for this model; it is the result of thousands of experiments across different model architectures and datasets.
+## Installation
+```bash
+pip install mer
+```
 ## Usage
 ```python
+from mer import Mer
+model = Mer(markdown=True, device='cuda')
+result = model.predict("sample_image.png")
+print("Predicted text:", result)
 ```
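
The usage snippet above hard-codes `device='cuda'`, which presumably fails on machines without a GPU. A minimal sketch of a fallback follows. `pick_device` is a hypothetical helper, not part of `mer`, and it is an assumption that `Mer` accepts `device="cpu"` the same way it accepts `device="cuda"`, so the `Mer` calls are left commented out.

```python
import importlib.util


def pick_device() -> str:
    """Return "cuda" when PyTorch reports a usable GPU, otherwise "cpu".

    Hypothetical helper for illustration; it is not part of the mer package.
    """
    # PyTorch is treated as optional: without it, fall back to CPU.
    if importlib.util.find_spec("torch") is None:
        return "cpu"
    import torch

    return "cuda" if torch.cuda.is_available() else "cpu"


device = pick_device()
print("Selected device:", device)

# Assumption: Mer accepts device="cpu" as well as device="cuda".
# from mer import Mer
# model = Mer(markdown=True, device=device)
# print("Predicted text:", model.predict("sample_image.png"))
```

If the `Mer` constructor does not accept `"cpu"`, the commented lines would need adjusting to whatever device values the package actually supports.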