---
title: Spam Classifier — Liquid AI
emoji: 🤖
colorFrom: blue
colorTo: green
sdk: gradio
sdk_version: "5.23.0"
python_version: "3.11"
app_file: app.py
pinned: false
license: mit
tags:
  - spam-detection
  - liquid-ai
  - lora
  - peft
  - gradio
  - nlp
  - text-classification
models:
  - LiquidAI/LFM2.5-1.2B-Instruct
datasets:
  - VoltageVagabond/spam-email-dataset
---

## Senior Project Notice

This repository was created for a senior project in ENGT 375 Applied Machine Learning at Old Dominion University. It is provided for educational and research demonstration purposes only. It is not intended for production use, security filtering, or making real-world spam/phishing decisions. Always use established security tools for operational email protection.

---
library_name: transformers
tags:
  - spam-detection
  - liquid-ai
  - lora
  - peft
  - apple-silicon
  - nlp
  - text-classification
license: mit
base_model: LiquidAI/LFM2.5-1.2B-Instruct
datasets:
  - VoltageVagabond/spam-email-dataset
pipeline_tag: text-generation
---

# Spam Classifier — Liquid AI LFM2.5-1.2B LoRA Fine-Tune

**ENGT 375 — Applied Machine Learning | Spring 2026 | ODU**
Liquid AI's LFM2.5-1.2B-Instruct model fine-tuned with LoRA adapters using HuggingFace Transformers + PEFT for spam email classification.

## Model Details

- **Base model:** LiquidAI/LFM2.5-1.2B-Instruct
- **Fine-tuning:** LoRA (rank 8, alpha 16, dropout 0.1)
- **Framework:** HuggingFace Transformers + PEFT + TRL
- **Hardware:** Apple Silicon (M-series)
- **Task:** Classify emails as spam or ham

## LoRA Target Modules

`w1`, `w2`, `in_proj`, `out_proj`, `v_proj`, `k_proj`, `q_proj`, `w3`

## Training Details

| Hyperparameter | Value |
|----------------|-------|
| Training examples | ~8,000 (fast) / ~16,000 (full) — 3-class Spam/Ham/Phishing |
| Test examples | ~20% holdout from the retrain split |
| Epochs | 3 |
| Batch size | 1 (effective 4 with gradient accumulation steps = 4) |
| Learning rate | 2e-4 |
| Max sequence length | 256 |
| Optimizer | adamw_torch (bitsandbytes 8-bit not supported on MPS) |
| Weight dtype | bfloat16 |
| Device | MPS (Apple Silicon) |
| Gradient checkpointing | Enabled (use_reentrant=False) |
| Max gradient norm | 0.3 |
| LoRA rank | 8 |
| LoRA alpha | 16 |
| LoRA dropout | 0.1 |
| Target modules | 8 (q_proj, k_proj, v_proj, out_proj, w1, w2, w3, in_proj) |
| Training time | ~1–1.5 hours (per fine_tune.py; earlier docs listed ~2–2.5 hours before the v0.4.3 memory optimization) |

### Hardware

- **Device:** Apple Silicon (M-series)
- **Backend:** PyTorch MPS (Metal Performance Shaders)

## Dataset

- [VoltageVagabond/spam-email-dataset](https://huggingface.co/datasets/VoltageVagabond/spam-email-dataset)

## Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_model = AutoModelForCausalLM.from_pretrained("LiquidAI/LFM2.5-1.2B-Instruct")
model = PeftModel.from_pretrained(base_model, "adapters")
tokenizer = AutoTokenizer.from_pretrained("LiquidAI/LFM2.5-1.2B-Instruct")
```

## Gradio Interface

```bash
pip install -r requirements.txt
python app.py
```

## Files

- `adapters/` — LoRA adapter weights + config
- `fine_tune.py` — Training script
- `app.py` — Gradio web interface
- `training_data/` — Training dataset

## Intended Use

This model is an **educational demonstration** of LLM fine-tuning with HuggingFace PEFT, created as part of a university course project. It is suitable for:

- Learning how LoRA fine-tuning works with the HuggingFace ecosystem (Transformers + PEFT + TRL)
- Exploring Liquid AI's novel architecture for text classification
- Comparing different LLM fine-tuning frameworks (MLX vs. HuggingFace)

It is **not** intended for production spam filtering.

## Limitations

- May misclassify legitimate marketing emails as spam
- Trained on **English emails only** — not suitable for other languages
- Training set (~8K fast / ~16K full) is modest compared to production spam filters — generalization may be limited

**Note:** Three-class classification (SPAM / HAM / PHISHING) is supported as of v0.4.0 — earlier versions were binary. The model is deployed as a HuggingFace Space (see Space header above).

## Related Models

| Model | Description | Link |
|-------|-------------|------|
| spam-classifier-mlx | Qwen 3.5 0.8B MLX LoRA fine-tune | [VoltageVagabond/spam-classifier-mlx](https://huggingface.co/VoltageVagabond/spam-classifier-mlx) |
| spam-xai-model | sklearn voting ensemble (RF + LR + SVM) with LIME/SHAP/ELI5 explainability | [VoltageVagabond/spam-xai-model](https://huggingface.co/VoltageVagabond/spam-xai-model) |
| spam-xai-classifier (Space) | Live Gradio web app for the sklearn classifier | [VoltageVagabond/spam-xai-classifier](https://huggingface.co/spaces/VoltageVagabond/spam-xai-classifier) |

## Citation

```bibtex
@misc{voltagevagabond2026spamliquid,
  title={Spam Classifier — Liquid AI LFM2.5-1.2B LoRA Fine-Tune},
  author={VoltageVagabond},
  year={2026},
  howpublished={\url{https://huggingface.co/VoltageVagabond/spam-classifier-liquid}},
  note={ENGT 375 — Applied Machine Learning, Old Dominion University, Spring 2026}
}
```