---
license: apache-2.0
language: multilingual
library_name: transformers.js
pipeline_tag: text-classification
base_model: huawei-noah/TinyBERT_General_4L_312D
tags:
- autofill
- field-classification
- bert
- tinybert
- onnx
- transformers.js
- browser
---

# TinyBERT Address Autofill

A compact field-type classifier for HTML form autofill developed by the Credentials Management Team on Firefox. Given a string describing a single form field's attributes, it predicts one of 66 autofill field types (`given-name`, `family-name`, `email`, `postal-code`, `address-line1`, `cc-number`, etc.) or `other` when the field should not be filled.

The model is fine-tuned from `huawei-noah/TinyBERT_General_4L_312D` on a corpus of manually annotated shopping and address forms collected by Mozilla, and is intended to run client-side inside Firefox (or any Transformers.js host) as a replacement or augmentation for the existing regex-based heuristic field detector.

## ONNX variants

All variants live under `onnx/` and are loadable through Transformers.js by passing the corresponding `dtype` argument.

| File | Precision | Size | Transformers.js `dtype` |
| --- | --- | ---: | --- |
| `onnx/model.onnx` | fp32 | 57.6 MB | `fp32` |
| `onnx/model_fp16.onnx` | fp16 | 28.9 MB | `fp16` |
| `onnx/model_quantized.onnx` | int8 dynamic (default) | 14.6 MB | `q8` |
| `onnx/model_int8.onnx` | int8 dynamic | 14.6 MB | `int8` |
| `onnx/model_uint8.onnx` | uint8 dynamic | 14.6 MB | `uint8` |
| `onnx/model_q4.onnx` | 4-bit weight-only on MatMul | 42.3 MB | `q4` |
| `onnx/model_q4f16.onnx` | 4-bit on top of fp16 | 22.4 MB | `q4f16` |
| `onnx/model_bnb4.onnx` | bitsandbytes NF4 | 41.9 MB | `bnb4` |

## How to use

### Transformers.js (browser)

```js
import { pipeline } from "@huggingface/transformers";

const classifier = await pipeline(
  "text-classification",
  "vazish/tinybert-address-autofill",
  { dtype: "q8" } // "q8" is the smallest file; try "fp16" or "fp32" for higher fidelity
);

const out = await classifier(
  "a-c-postal-code billing zip code dwfrm billing address fields postal code"
);
// → [{ label: "postal-code", score: 0.99 }]
```

### Python (Optimum + ONNX Runtime)

```python
from optimum.onnxruntime import ORTModelForSequenceClassification
from transformers import AutoTokenizer, pipeline

model = ORTModelForSequenceClassification.from_pretrained(
    "vazish/tinybert-address-autofill",
    file_name="onnx/model.onnx",  # or onnx/model_quantized.onnx, etc.
)
tokenizer = AutoTokenizer.from_pretrained("vazish/tinybert-address-autofill")

clf = pipeline("text-classification", model=model, tokenizer=tokenizer)
clf("email email mail **email")
# → [{"label": "email", "score": 0.99}]
```

## Input format

The model expects a single string per field, built by concatenating that field's HTML attributes after light normalisation:

1. Concatenate (in order): `type` + `autocomplete` + `id` + `name` + `placeholder` + the field's computed `