How to use from the Transformers library
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("feature-extraction", model="naver-clova-ocr/bros-base-uncased")
# Load model directly
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("naver-clova-ocr/bros-base-uncased")
model = AutoModel.from_pretrained("naver-clova-ocr/bros-base-uncased")

BROS

GitHub: https://github.com/clovaai/bros

Introduction

BROS (BERT Relying On Spatiality) is a pre-trained language model focusing on text and layout for better key information extraction from documents.
Given the OCR results of the document image, which are text and bounding box pairs, it can perform various key information extraction tasks, such as extracting an ordered item list from receipts.
For more details, please refer to our paper:

BROS: A Pre-trained Language Model Focusing on Text and Layout for Better Key Information Extraction from Documents
Teakgyu Hong, Donghyun Kim, Mingi Ji, Wonseok Hwang, Daehyun Nam, Sungrae Park
AAAI 2022 - Main Technical Track

[arXiv]
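
The OCR input described in the introduction (text and bounding box pairs) usually needs some preparation before it reaches a layout-aware model. The following is a minimal sketch of that step, not the authors' preprocessing code. It assumes boxes are given as (x0, y0, x1, y1) in pixels and are scaled to the 0 to 1 range by the page size, and that each word's box is repeated for every sub-word token. The helper names, the sample receipt data, and the toy tokenizer are all hypothetical; check the BROS documentation for the exact box format the model expects.

```python
from typing import Callable, List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x0, y0, x1, y1)


def normalize_boxes(boxes: Sequence[Box], page_width: float, page_height: float) -> List[Box]:
    """Scale pixel boxes to the [0, 1] range relative to the page size (assumed convention)."""
    if page_width <= 0 or page_height <= 0:
        raise ValueError("page size must be positive")
    return [
        (x0 / page_width, y0 / page_height, x1 / page_width, y1 / page_height)
        for x0, y0, x1, y1 in boxes
    ]


def expand_to_tokens(
    words: Sequence[str],
    boxes: Sequence[Box],
    tokenize: Callable[[str], List[str]],
) -> Tuple[List[str], List[Box]]:
    """Split each word into tokens and repeat the word's box once per token."""
    tokens: List[str] = []
    token_boxes: List[Box] = []
    for word, box in zip(words, boxes):
        pieces = tokenize(word)
        tokens.extend(pieces)
        token_boxes.extend([box] * len(pieces))
    return tokens, token_boxes


def toy_tokenize(word: str) -> List[str]:
    """Hypothetical stand-in for a real sub-word tokenizer: chunks of three characters."""
    return [word[i:i + 3] for i in range(0, len(word), 3)]


# Hypothetical OCR output for a 1000 x 500 pixel receipt image.
words = ["Coffee", "3.50"]
pixel_boxes = [(100.0, 50.0, 300.0, 100.0), (800.0, 50.0, 900.0, 100.0)]

boxes = normalize_boxes(pixel_boxes, page_width=1000.0, page_height=500.0)
tokens, token_boxes = expand_to_tokens(words, boxes, toy_tokenize)
```

In practice, toy_tokenize would be replaced by the tokenizer loaded with AutoTokenizer, and the token-level boxes would be passed to the model alongside the token ids.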

Pre-trained models

| name | # params | Hugging Face - Models |
|---|---|---|
| bros-base-uncased (this) | < 110M | naver-clova-ocr/bros-base-uncased |
| bros-large-uncased | < 340M | naver-clova-ocr/bros-large-uncased |