ONNX Model

Converted from: granite-embedding-reranker-english-r2

Files

  • model.onnx - FP32 version
  • model_quantized.onnx - INT8 quantized version
  • *.json - tokenizer and config files

Usage

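from transformers import AutoTokenizer
import onnxruntime as ort

# "granite-onnx" is the local directory that holds the files listed above.
tokenizer = AutoTokenizer.from_pretrained("granite-onnx")
# Load the INT8 model; point to model.onnx instead for the FP32 version.
session = ort.InferenceSession("granite-onnx/model_quantized.onnx")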
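
Loading the session does not yet score anything, so the following is a minimal sketch of how query/passage pairs might be run through it. It rests on assumptions not stated in this card: that the export behaves like a typical cross-encoder reranker, that its first output holds one relevance logit per pair with shape (batch, 1), and that the tokenizer's field names match the graph's input names. The helper names `score_passages` and `rank_passages` and the `max_length=512` default are hypothetical.

```python
def score_passages(tokenizer, session, query, passages, max_length=512):
    """Return one score per passage; higher is assumed to mean more relevant."""
    # Tokenize each (query, passage) pair jointly, as cross-encoders expect.
    encoded = tokenizer(
        [query] * len(passages),
        list(passages),
        padding=True,
        truncation=True,
        max_length=max_length,  # hypothetical limit, adjust to the model
        return_tensors="np",
    )
    # Pass only the inputs the exported graph declares (names vary by export).
    input_names = {inp.name for inp in session.get_inputs()}
    feed = {name: value for name, value in encoded.items() if name in input_names}
    # Assumption: first output is a (batch, 1) array of relevance logits.
    logits = session.run(None, feed)[0]
    return [float(row[0]) for row in logits]


def rank_passages(tokenizer, session, query, passages, max_length=512):
    """Return (passage, score) pairs sorted from highest to lowest score."""
    scores = score_passages(tokenizer, session, query, passages, max_length)
    return sorted(zip(passages, scores), key=lambda pair: pair[1], reverse=True)
```

With the tokenizer and session loaded above, calling rank_passages(tokenizer, session, "what is onnx?", passages) on a list of candidate strings would return them ordered by score. If the output shape differs from the assumption, inspect session.get_outputs() and adjust the indexing.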