---
title: Spam Classifier — Liquid AI
emoji: 🤖
colorFrom: blue
colorTo: green
sdk: gradio
sdk_version: "5.23.0"
python_version: "3.11"
app_file: app.py
pinned: false
license: mit
tags:
  - spam-detection
  - liquid-ai
  - lora
  - peft
  - gradio
  - nlp
  - text-classification
models:
  - LiquidAI/LFM2.5-1.2B-Instruct
datasets:
  - VoltageVagabond/spam-email-dataset
---
---
library_name: transformers
tags:
  - spam-detection
  - liquid-ai
  - lora
  - peft
  - apple-silicon
  - nlp
  - text-classification
license: mit
base_model: LiquidAI/LFM2.5-1.2B-Instruct
datasets:
  - VoltageVagabond/spam-email-dataset
pipeline_tag: text-generation
---

# Spam Classifier — Liquid AI LFM2.5-1.2B LoRA Fine-Tune

**ENGT 375 — Applied Machine Learning | Spring 2026 | ODU**

> **Disclaimer:** This model was created as a student project for ENGT 375 (Applied Machine Learning) at Old Dominion University, Spring 2026. It is intended for **educational and research purposes only** and should not be used as a sole spam/phishing filter in production. Classification accuracy may vary, and the model may produce incorrect or misleading results. Always use established email security tools for real-world spam filtering.

Liquid AI's LFM2.5-1.2B-Instruct model fine-tuned with LoRA adapters using HuggingFace Transformers + PEFT for spam email classification.

## Model Details

- **Base model:** LiquidAI/LFM2.5-1.2B-Instruct
- **Fine-tuning:** LoRA (rank 8, alpha 16, dropout 0.1)
- **Framework:** HuggingFace Transformers + PEFT + TRL
- **Hardware:** Apple Silicon (M-series)
- **Task:** Classify emails as spam or ham

## LoRA Target Modules

`w1`, `w2`, `in_proj`, `out_proj`, `v_proj`, `k_proj`, `q_proj`, `w3`

## Training Details

| Hyperparameter | Value |
|----------------|-------|
| Training examples | ~8,000 (fast) / ~16,000 (full) — 3-class Spam/Ham/Phishing |
| Test examples | ~20% holdout from the retrain split |
| Epochs | 3 |
| Batch size | 1 (effective 4 with gradient accumulation steps = 4) |
| Learning rate | 2e-4 |
| Max sequence length | 256 |
| Optimizer | adamw_torch (bitsandbytes 8-bit not supported on MPS) |
| Weight dtype | bfloat16 |
| Device | MPS (Apple Silicon) |
| Gradient checkpointing | Enabled (use_reentrant=False) |
| Max gradient norm | 0.3 |
| LoRA rank | 8 |
| LoRA alpha | 16 |
| LoRA dropout | 0.1 |
| Target modules | 8 (q_proj, k_proj, v_proj, out_proj, w1, w2, w3, in_proj) |
| Training time | ~1–1.5 hours (per fine_tune.py; earlier docs listed ~2–2.5 hours before the v0.4.3 memory optimization) |

### Hardware

- **Device:** Apple Silicon (M-series)
- **Backend:** PyTorch MPS (Metal Performance Shaders)

## Dataset

- [VoltageVagabond/spam-email-dataset](https://huggingface.co/datasets/VoltageVagabond/spam-email-dataset)

## Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_model = AutoModelForCausalLM.from_pretrained("LiquidAI/LFM2.5-1.2B-Instruct")
model = PeftModel.from_pretrained(base_model, "adapters")
tokenizer = AutoTokenizer.from_pretrained("LiquidAI/LFM2.5-1.2B-Instruct")
```

## Gradio Interface

```bash
pip install -r requirements.txt
python app.py
```

## Files

- `adapters/` — LoRA adapter weights + config
- `fine_tune.py` — Training script
- `app.py` — Gradio web interface
- `training_data/` — Training dataset

## Intended Use

This model is an **educational demonstration** of LLM fine-tuning with HuggingFace PEFT, created as part of a university course project. It is suitable for:

- Learning how LoRA fine-tuning works with the HuggingFace ecosystem (Transformers + PEFT + TRL)
- Exploring Liquid AI's novel architecture for text classification
- Comparing different LLM fine-tuning frameworks (MLX vs. HuggingFace)

It is **not** intended for production spam filtering.

## Limitations

- May misclassify legitimate marketing emails as spam
- Trained on **English emails only** — not suitable for other languages
- Training set (~8K fast / ~16K full) is modest compared to production spam filters — generalization may be limited

**Note:** Three-class classification (SPAM / HAM / PHISHING) is supported as of v0.4.0 — earlier versions were binary. The model is deployed as a HuggingFace Space (see Space header above).

## Related Models

| Model | Description | Link |
|-------|-------------|------|
| spam-classifier-mlx | Qwen 3.5 0.8B MLX LoRA fine-tune | [VoltageVagabond/spam-classifier-mlx](https://huggingface.co/VoltageVagabond/spam-classifier-mlx) |
| spam-xai-model | sklearn voting ensemble (RF + LR + SVM) with LIME/SHAP/ELI5 explainability | [VoltageVagabond/spam-xai-model](https://huggingface.co/VoltageVagabond/spam-xai-model) |
| spam-xai-classifier (Space) | Live Gradio web app for the sklearn classifier | [VoltageVagabond/spam-xai-classifier](https://huggingface.co/spaces/VoltageVagabond/spam-xai-classifier) |

## Citation

```bibtex
@misc{voltagevagabond2026spamliquid,
  title={Spam Classifier — Liquid AI LFM2.5-1.2B LoRA Fine-Tune},
  author={VoltageVagabond},
  year={2026},
  howpublished={\url{https://huggingface.co/VoltageVagabond/spam-classifier-liquid}},
  note={ENGT 375 — Applied Machine Learning, Old Dominion University, Spring 2026}
}
```