---
base_model: google/medgemma-1.5-4b-it
library_name: peft
pipeline_tag: text-generation
tags:
- base_model:adapter:google/medgemma-1.5-4b-it
- lora
- sft
- transformers
- trl
---

# Model Card for Model ID

<!-- Provide a quick summary of what the model is/does. -->
A fine-tuned version of MedGemma-1.5 on surgical tools of cholecystectomy surgical dataset


## Model Details

### Model Description

This is a fine-tuned Medgemma-1.5-4B-it model on image-text pairs of surgical images from the colecystectomy surgical procedure. The model can be prompted with an image with the text prompt "explain in detail the surgical scene inlcuding the position of the surgical tools". A short description of the scene is obtained as an output. These text prompts were used to further fine-tune image-text diffusion models for surgical context grounded image generation.
<!-- Provide a longer summary of what this model is. -->

## Uses

<!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->

The model can be prompted with a surgical image and a corresponding text is obtained which describes the surgical tools present in the scene. Based on the set temperature, the information about anatomies can also be obtained. However, be cautious that certain hallucinations about the scene can be wrong


## Training Details

For details on training this model, please refer to this [repo](https://github.com/danushkv/surgfast/tree/main)