Instructions to use PaddlePaddle/PaddleOCR-VL with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- PaddleOCR
How to use PaddlePaddle/PaddleOCR-VL with PaddleOCR:

    # See https://www.paddleocr.ai/latest/version3.x/pipeline_usage/PaddleOCR-VL.html for installation
    from paddleocr import PaddleOCRVL

    pipeline = PaddleOCRVL(pipeline_version="v1")
    output = pipeline.predict("path/to/document_image.png")
    for res in output:
        res.print()
        res.save_to_json(save_path="output")
        res.save_to_markdown(save_path="output")

- Notebooks
- Google Colab
- Kaggle
fix: allow passing image kwargs to the image processor
#90
by bigmoyan - opened
processing_paddleocr_vl.py
CHANGED
@@ -158,7 +158,7 @@ class PaddleOCRVLProcessor(ProcessorMixin):
         )
 
         if images is not None:
-            image_inputs = self.image_processor(images=images,
+            image_inputs = self.image_processor(images=images, **output_kwargs["images_kwargs"])
             image_inputs["pixel_values"] = image_inputs["pixel_values"]
             image_grid_thw = image_inputs["image_grid_thw"]
 
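The effect of the one-line change can be illustrated with a small, self-contained sketch. It does not use the real `PaddleOCRVLProcessor`: `ToyImageProcessor`, `ToyProcessor`, the `forward_kwargs` switch, and the `max_pixels` / `do_resize` options are all hypothetical stand-ins chosen for illustration. The removed line is truncated in the diff, so the "before" branch below simply assumes that no image kwargs reached the image processor. In the real method, `output_kwargs` is built earlier than the hunk shown; here it is a plain dict.

```python
from typing import Any, Dict, List, Optional


class ToyImageProcessor:
    """Hypothetical stand-in for an image processor (not the real class)."""

    def __call__(
        self, images: List[str], do_resize: bool = True, max_pixels: int = 1024
    ) -> Dict[str, Any]:
        # Echo the options back so the caller can see which values were used.
        return {
            "pixel_values": list(images),
            "do_resize": do_resize,
            "max_pixels": max_pixels,
        }


class ToyProcessor:
    """Hypothetical stand-in for the processor touched by the PR."""

    def __init__(self) -> None:
        self.image_processor = ToyImageProcessor()

    def __call__(
        self,
        images: Optional[List[str]] = None,
        forward_kwargs: bool = True,
        images_kwargs: Optional[Dict[str, Any]] = None,
    ) -> Dict[str, Any]:
        # Assumption: a plain dict replaces whatever the real method
        # builds before the hunk shown in the diff.
        output_kwargs = {"images_kwargs": dict(images_kwargs or {})}

        if images is None:
            return {}
        if forward_kwargs:
            # Patched behaviour: caller-supplied image options are forwarded.
            return self.image_processor(
                images=images, **output_kwargs["images_kwargs"]
            )
        # Assumed pre-patch behaviour: image options are silently dropped.
        return self.image_processor(images=images)


processor = ToyProcessor()
before = processor(["page.png"], forward_kwargs=False, images_kwargs={"max_pixels": 256})
after = processor(["page.png"], forward_kwargs=True, images_kwargs={"max_pixels": 256})
print(before["max_pixels"], after["max_pixels"])
```

In this sketch the caller's `max_pixels` only takes effect when the kwargs dict is unpacked into the image processor call, which is the behaviour the PR title describes.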