# Model Recommendations for Medical Text Summarization

## Executive Summary

**Recommended Model**: `microsoft/Phi-3-mini-4k-instruct-gguf/Phi-3-mini-4k-instruct-q4.gguf`

This is the **PRIMARY** model configured in `models_config.json` with `"is_active": true`.

---

## ⚠️ Models NOT Recommended for Medical Text

### 1. patrickvonplaten/longformer2roberta-cnn_dailymail-fp16

**Status**: ❌ **DEPRECATED - DO NOT USE**

**Problem**: This model produces **irrelevant summaries** for medical text because:

1. **Training Mismatch**: Trained on news articles (CNN/DailyMail dataset), NOT medical text
2. **Domain Gap**: Cannot understand:
   - Clinical terminology and medical abbreviations
   - Structured visit data and medical codes
   - ICD codes, medications, dosages
   - Clinical narrative style
3. **Not Instruction-Tuned**: Cannot follow medical summarization instructions properly

**What Happens**: The model tries to summarize medical data as if it were a news article, resulting in nonsensical output that misses critical clinical information.

**Solution**: Use Phi-3-mini-4k-instruct-q4.gguf instead.

---

### 2. facebook/bart-large-cnn

**Status**: ⚠️ **NOT RECOMMENDED FOR MEDICAL TEXT**

**Problem**: Similar to Longformer:
- Trained on news articles (CNN/DailyMail)
- Limited medical domain knowledge
- May produce suboptimal results for clinical text

**Better Alternative**: Use Phi-3-mini-4k-instruct-q4.gguf

---

## ✅ Recommended Models

### 1. microsoft/Phi-3-mini-4k-instruct-q4.gguf (PRIMARY - ACTIVE)

**Why This Model?**

✅ **Instruction-tuned**: Understands and follows complex medical summarization prompts
✅ **General domain knowledge**: Trained on diverse data including medical/technical content
✅ **Efficient**: GGUF quantization (Q4) provides excellent performance with lower resource usage
✅ **Reliable**: Produces coherent, relevant medical summaries
✅ **Fast**: CPU-optimized, works well in production

**Configuration**:
```json
{
  "name": "microsoft/Phi-3-mini-4k-instruct-gguf/Phi-3-mini-4k-instruct-q4.gguf",
  "type": "gguf",
  "is_active": true,
  "cached": true,
  "description": "Phi-3 Mini GGUF Q4 quantized - PRIMARY MODEL",
  "use_case": "Fast patient summary generation with CPU/GPU"
}
```

---

### 2. google/flan-t5-large (ALTERNATIVE)

**Status**: ✅ **Good Alternative**

**Advantages**:
- Instruction-tuned (FLAN methodology)
- Can follow summarization instructions
- Smaller than Phi-3, faster inference
- Better than BART/Longformer for structured text

**Use When**:
- Need faster inference than Phi-3
- Memory constraints
- Simple summarization tasks

---

## Technical Background: Why News Models Fail on Medical Text

### Training Data Mismatch

**News Articles (CNN/DailyMail)**:
```
Title: New Study Shows Coffee Benefits
Body: A recent study published in the Journal of Medicine found that...
Summary: Research indicates coffee may have health benefits including...
```

**Medical Records**:
```
Visit 2024-01-15:
Chief Complaint: SOB, DOE
HPI: 65F w/ PMH of HTN, DM2, presents with 3d progressive DOE...
PE: RRR, no m/r/g. Lungs CTAB. +1 bilateral LE edema...
A/P: 1. CHF exacerbation - start Lasix 40mg PO daily...
```

### What News Models Do Wrong

1. **Terminology**: Can't understand medical abbreviations (SOB, DOE, HTN, DM2, CTAB, etc.)
2. **Structure**: Expect narrative news format, not clinical structured data
3. **Priority**: News models prioritize "interesting" content; medical needs prioritize clinical significance
4. **Context**: Medical context requires understanding relationships between symptoms, diagnoses, medications
5. **Instructions**: Cannot follow complex instructions like "generate a comprehensive clinical summary focusing on changes over time"

---

## Migration Guide

### If You're Currently Using Longformer or BART:

**Step 1**: Update your API request to use the recommended model:

```json
{
  "patient_summarizer_model_name": "microsoft/Phi-3-mini-4k-instruct-gguf",
  "patient_summarizer_model_type": "gguf",
  "generation_mode": "gguf"
}
```

**Step 2**: Remove any model-name specification to use the default (Phi-3):

```json
{
  // Just omit model specification - defaults to Phi-3
  "patientid": "12345",
  "token": "your-token",
  "key": "your-key"
}
```

**Step 3**: Test the output quality and adjust parameters if needed:

```json
{
  "max_new_tokens": 2048,  // Adjust output length
  "temperature": 0.1,      // Lower = more focused, Higher = more creative
  "top_p": 0.5            // Lower = more deterministic
}
```

---

## Configuration Reference

### Current Active Configuration (models_config.json)

```json
{
  "patient_summary_models": [
    {
      "name": "microsoft/Phi-3-mini-4k-instruct-gguf/Phi-3-mini-4k-instruct-q4.gguf",
      "type": "gguf",
      "is_active": true,  // ← PRIMARY MODEL
      "cached": true,
      "description": "Phi-3 Mini GGUF Q4 quantized - PRIMARY MODEL",
      "use_case": "Fast patient summary generation with CPU/GPU",
      "repo_id": "microsoft/Phi-3-mini-4k-instruct-gguf",
      "filename": "Phi-3-mini-4k-instruct-q4.gguf"
    }
  ]
}
```

---

## Performance Comparison

| Model | Medical Text Quality | Speed | Memory | Instruction Following |
|-------|---------------------|-------|--------|----------------------|
| **Phi-3 GGUF Q4** | ⭐⭐⭐⭐⭐ Excellent | Fast | Low | ✅ Yes |
| FLAN-T5 Large | ⭐⭐⭐⭐ Good | Very Fast | Low | ✅ Yes |
| Longformer | ⭐ Poor (Irrelevant) | Slow | High | ❌ No |
| BART-CNN | ⭐⭐ Poor | Medium | Medium | ❌ No |

---

## FAQs

**Q: Can I still use Longformer/BART?**
A: Technically yes (they're still cached), but **strongly not recommended**. They will produce irrelevant summaries.

**Q: Why are these models still in the config?**
A: For backward compatibility and documentation. They're marked as `deprecated` and `is_active: false`.

**Q: What if Phi-3 is too slow?**
A: Try `google/flan-t5-large` as an alternative. Still instruction-tuned but smaller/faster.

**Q: Can you fix Longformer to work with medical text?**
A: No. The model's training is fundamentally incompatible. Would require retraining on medical data.

---

## Summary

✅ **DO USE**: Phi-3-mini-4k-instruct-q4.gguf (default/recommended)
✅ **ALTERNATIVE**: google/flan-t5-large  
⚠️ **AVOID**: facebook/bart-large-cnn
❌ **DO NOT USE**: patrickvonplaten/longformer2roberta-cnn_dailymail-fp16

The Longformer model's irrelevant summaries are due to fundamental training mismatch with medical domain, not a bug that can be fixed.