---
license: apache-2.0
library_name: transformers
tags:
  - multimodal
  - vision-language
  - code-generation
  - tikz
  - geometric-reasoning
  - computer-vision
  - cvpr2026
  - internvl
  - internlm2
  - instruction-tuning
datasets:
  - SJY-1995/GeoTikz-Base
  - SJY-1995/GeoTikz-Instruct
model-index:
  - name: GeoTikzBridge-Instruct-8B
    results:
      - task:
          type: image-to-text
          name: Instruction-Guided Geometric Code Generation
        dataset:
          name: GeoTikz-Instruct
          type: SJY-1995/GeoTikz-Instruct
        metrics:
          - type: CLIP-S
            value: 99.2
            name: CLIP-S
---

# GeoTikzBridge-Instruct-8B

## Model Overview
GeoTikzBridge-Instruct-8B is the instruction-tuned variant of the GeoTikzBridge series, introduced in the paper *GeoTikzBridge: Advancing Multimodal Code Generation for Geometric Perception and Reasoning*, accepted at **CVPR 2026**. It starts from GeoTikzBridge-Base-8B and is further fine-tuned on the GeoTikz-Instruct dataset (approximately 419k instruction-response pairs) so that it can follow natural language instructions for geometric tasks. Beyond basic image-to-TikZ conversion, it supports instruction-guided auxiliary line generation, interactive geometric modification, and step-by-step geometric reasoning, which makes it suited to educational and research scenarios that require interactive geometric manipulation.

## Model Details
### Core Architecture
- Backbone Foundation: Initialized from GeoTikzBridge-Base-8B (InternVL2 8B with InternLM2 7B backbone), inheriting its strong geometric perception and code generation capabilities.
- Parameter Scale: 8 billion parameters, maintaining efficient inference while supporting complex instruction understanding.
- Instruction Tuning: Fine-tuned on a diverse set of geometric instruction-response pairs, enabling the model to understand and execute natural language instructions for geometric figure manipulation.

### Core Capabilities
1.  **Instruction-Guided TikZ Generation**: Generates TikZ code based on natural language instructions (e.g., "Draw a right triangle with a height of 5cm and label the right angle").
2.  **Auxiliary Line Generation**: Adds auxiliary lines (e.g., perpendicular bisectors, angle bisectors, medians) to existing geometric figures as instructed, supporting geometric problem-solving.
3.  **Interactive Geometric Modification**: Modifies existing geometric figures (e.g., resizing, rotating, adding/removing elements) according to user instructions.
4.  **Basic Geometric Reasoning**: Provides step-by-step geometric reasoning processes (in text form) alongside TikZ code generation for simple geometric problems.

## Intended Use & Limitations
### Intended Use Cases
- Core Scenarios: Interactive geometric teaching aids, step-by-step geometry problem-solving assistance, dynamic geometric illustration generation for educational materials, and research on geometric reasoning with multimodal models.
- Research Purposes: Serves as a baseline for instruction-tuned multimodal code generation and geometric reasoning research.
- Downstream Expansion: Can be integrated into educational platforms or geometric drawing tools to provide intelligent, interactive support.

### Out-of-Scope Use Cases
- Non-geometric image manipulation or code generation.
- High-precision engineering drawing generation requiring professional CAD software.
- Solving advanced mathematical proofs or complex geometric problems beyond the scope of plane geometry.

### Model Limitations
- The model primarily understands and executes instructions in English; instructions in other languages may lead to suboptimal results.
- While it can generate reasoning text for simple problems, it is not a substitute for professional mathematical proof systems.
- Complex or ambiguous instructions may require multiple rounds of clarification to achieve the desired result.

## Quick Start
### Environment Setup
Install basic dependencies:
```bash
pip install transformers torch pillow accelerate
```
For full training/inference dependencies, please refer to the official project repository: [GeoTikzBridge GitHub](https://github.com/sjy-1995/GeoTikzBridge/)
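
Before loading the model, it can help to confirm that the basic dependencies are importable. The snippet below is a minimal sketch using only the Python standard library; `REQUIRED_MODULES` and `check_dependencies` are hypothetical names introduced for illustration, and the check only reports whether each module can be found, not whether its version is compatible.

```python
import importlib.util

# Map each pip package name to its import name (pillow is imported as PIL).
REQUIRED_MODULES = {
    "transformers": "transformers",
    "torch": "torch",
    "pillow": "PIL",
    "accelerate": "accelerate",
}


def check_dependencies(required=REQUIRED_MODULES):
    """Return {pip_name: bool} telling whether each module can be located."""
    return {
        pip_name: importlib.util.find_spec(module_name) is not None
        for pip_name, module_name in required.items()
    }


if __name__ == "__main__":
    for pip_name, available in check_dependencies().items():
        print(f"{pip_name}: {'ok' if available else 'MISSING'}")
```

Any package reported as `MISSING` can be installed with the `pip install` command above.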

### Inference Example
Quickly load the model for instruction-guided geometric code generation:
```python
from transformers import AutoProcessor, AutoModelForCausalLM
import torch
from PIL import Image

# Load model and processor
model_name = "SJY-1995/GeoTikzBridge-Instruct-8B"
processor = AutoProcessor.from_pretrained(model_name, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,
    device_map="auto"
)

# Load the input geometric image. An image is required when the instruction
# modifies an existing figure; whether it can be omitted when generating a new
# figure from scratch depends on the model's input requirements.
image = Image.open("existing_geometric_figure.png").convert("RGB")

# Build an instruction-guided prompt (illustrative instruction; replace with your own).
# The exact prompt template expected by the model is defined by its remote code.
prompt = "Add the perpendicular bisector of segment AB to the figure and return the updated TikZ code."
inputs = processor(text=prompt, images=image, return_tensors="pt").to(model.device)

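# Generate TikZ code with greedy decoding (do_sample=False).
# temperature and top_p only take effect when sampling; to sample instead,
# pass do_sample=True, temperature=0.2, top_p=0.95.
with torch.no_grad():
    output = model.generate(
        **inputs,
        max_new_tokens=4096,
        do_sample=False
    )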

# Decode and output the generated result
tikz_code = processor.decode(output[0], skip_special_tokens=True)
print("Generated TikZ Code:\n", tikz_code)
```
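
Because the model may return reasoning text alongside the TikZ code, the figure usually has to be separated from the rest of the output before compiling. The following is a minimal sketch of that post-processing step, assuming the generated text contains a `tikzpicture` environment; `extract_tikz`, `wrap_standalone`, and the sample output string are hypothetical and only illustrate the idea.

```python
import re

# Matches the first tikzpicture environment, including any options after \begin.
TIKZ_PATTERN = re.compile(
    r"\\begin\{tikzpicture\}.*?\\end\{tikzpicture\}", re.DOTALL
)


def extract_tikz(generated_text):
    """Return the first tikzpicture environment in the text, or None if absent."""
    match = TIKZ_PATTERN.search(generated_text)
    return match.group(0) if match else None


def wrap_standalone(tikz_body):
    """Wrap a tikzpicture environment in a minimal compilable LaTeX document."""
    return (
        "\\documentclass[tikz,border=2pt]{standalone}\n"
        "\\begin{document}\n"
        f"{tikz_body}\n"
        "\\end{document}\n"
    )


# Hypothetical model output, used only to exercise the helpers.
sample_output = (
    "The triangle has a right angle at A.\n"
    "\\begin{tikzpicture}\n"
    "\\draw (0,0) -- (4,0) -- (0,3) -- cycle;\n"
    "\\end{tikzpicture}\n"
    "Done."
)

tikz = extract_tikz(sample_output)
if tikz is not None:
    print(wrap_standalone(tikz))
```

If the model already emits a complete LaTeX document, the wrapping step is unnecessary and the output can be compiled directly.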

## Training Details
### Training Dataset
The model is initialized from GeoTikzBridge-Base-8B and further fine-tuned on the **GeoTikz-Instruct** dataset, which contains approximately 419k instruction-response pairs for geometric tasks. The dataset covers diverse instruction types, including figure generation, auxiliary line addition, figure modification, and basic reasoning.

Dataset Links:
- GeoTikz-Base: [SJY-1995/GeoTikz-Base](https://huggingface.co/datasets/SJY-1995/GeoTikz-Base)
- GeoTikz-Instruct: [SJY-1995/GeoTikz-Instruct](https://huggingface.co/datasets/SJY-1995/GeoTikz-Instruct)

### Key Training Hyperparameters
The values below are illustrative; refer to the paper or the official repository for the exact Instruct-version hyperparameters.

| Hyperparameter | Configuration |
|----------------|---------------|
| Global Batch Size | 64 |
| Peak Learning Rate | 2e-7 |
| Training Epochs | 2 |
| Max Sequence Length | 12800 |
| Training Precision | BF16 |
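
To give a sense of the training length these illustrative values imply, the sketch below estimates the number of optimizer steps. It assumes one optimizer step per global batch, no sequence packing, and a dataset size of exactly 419,000 samples as a stand-in for the approximate "419k" figure; `optimizer_steps` is a hypothetical helper, not part of the official training code.

```python
import math


def optimizer_steps(num_samples, global_batch_size, epochs):
    """Return (steps_per_epoch, total_steps), keeping the partial final batch."""
    steps_per_epoch = math.ceil(num_samples / global_batch_size)
    return steps_per_epoch, steps_per_epoch * epochs


# 419_000 approximates the "419k" dataset size; the exact count is not stated here.
steps_per_epoch, total_steps = optimizer_steps(
    num_samples=419_000, global_batch_size=64, epochs=2
)
print(steps_per_epoch, total_steps)  # 6547 13094
```

Under these assumptions, training runs for roughly 6.5k steps per epoch and about 13k steps in total.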

### Training Framework & Scripts
Refer to the official project repository for training details: [GeoTikzBridge GitHub](https://github.com/sjy-1995/GeoTikzBridge/)

## Model Family
| Model Name | Parameter Size | Core Capability | Model Link |
|------------|----------------|-----------------|------------|
| GeoTikzBridge-Base-8B | 8B | Basic geometric image-to-TikZ code generation | [🤗 Hugging Face](https://huggingface.co/SJY-1995/GeoTikzBridge-Base-8B) |
| GeoTikzBridge-Base-38B | 38B | High-precision complex geometric figure TikZ code generation | [🤗 Hugging Face](https://huggingface.co/SJY-1995/GeoTikzBridge-Base-38B) |
| GeoTikzBridge-Instruct-8B | 8B | Instruction following, auxiliary line generation, interactive geometric reasoning | [🤗 Hugging Face](https://huggingface.co/SJY-1995/GeoTikzBridge-Instruct-8B) |

## Citation
If you use this model, the related datasets, or the code in your research or projects, please cite the following paper:
```bibtex
@inproceedings{
  geotikzbridge,
  title={GeoTikzBridge: Advancing Multimodal Code Generation for Geometric Perception and Reasoning},
  author={Jiayin Sun and Caixia Sun and Boyu Yang and Hailin Li and Xiao Chen and Yi Zhang and Errui Ding and Liang Li and Chao Deng and Junlan Feng},
  booktitle={2026 IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  year={2026}
}
```