Image-Text-to-Text
Transformers
Safetensors
qwen2_vl
conversational
text-generation-inference
limuyu011's picture
Update README.md
076ba41 verified
|
Raw
History Blame Contribute Delete
2.06 kB
metadata
library_name: transformers
license: mit
datasets:
  - CraftJarvis/minecraft-text-action-dataset
metrics:
  - accuracy
base_model:
  - Qwen/Qwen2-VL-7B-Instruct
pipeline_tag: image-text-to-text
arxiv: 2509.13347

Minecraft-Textvla-Qwen2vl-7b-2509

💻 Usage

You can download and use this model with:

python examples/rollout_openha.py \
    --output_mode text_action  \
    --vlm_client_mode hf \
    --system_message_tag text_action \
    --model_ips localhost --model_ports 11000 \
    --model_path CraftJarvis/minecraft-textvla-qwen2vl-7b-2509 \
    --model_id minecraft-textvla-qwen2vl-7b-2509 \
    --record_path "~/evaluate" \
    --max_steps_num 200 \
    --num_rollouts 8