---
language:
- en
license: mit
tags:
- text-generation
- fine-tuning
- lora
- mlx
- plain-english
- literary
- translation
- smollm2
base_model: HuggingFaceTB/SmolLM2-1.7B-Instruct
---

# PlainSpeak: Dense-to-Plain-English Translator

**I taught a tiny AI to speak human.**

Give it Shakespeare. Give it a legal contract. Give it anything written to impress instead of communicate. It gives you back what it actually means.

## Try it

**[→ Live demo on Hugging Face Spaces](https://huggingface.co/spaces/Brandi-Kinard/plainspeak)**

```bash
pip install mlx-lm
mlx_lm.generate \
  --model Brandi-Kinard/plainspeak-smollm2-1.7b \
  --prompt "### Original: YOUR TEXT HERE ### Plain English:" \
  --max-tokens 200
```

## Examples

**Shakespeare → Plain English**

> **Original:** *"Wherefore art thou Romeo? Deny thy father and refuse thy name."*
>
> **PlainSpeak:** "Why are you Romeo? Don't deny your father and refuse your name."

**Adam Smith → Plain English**

> **Original:** *"The invisible hand of the market, whereby individuals pursuing their own self-interest are led, as if by an invisible hand, to promote ends which were no part of their original intention."*
>
> **PlainSpeak:** "When people try to make money for themselves, they often end up helping society without meaning to."

**KJV Bible → Plain English**

> **Original:** *"The LORD is my shepherd; I shall not want."*
>
> **PlainSpeak:** "The LORD leads me. I don't need anything else."
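The prompt in the command under *Try it* follows a fixed two-marker layout. As a minimal sketch, the helper below builds that prompt in Python. `build_prompt` is a hypothetical name (it is not part of `mlx-lm`), and the single-space separator simply mirrors the one-line command shown in this card; the exact whitespace used during training is an assumption.

```python
def build_prompt(text: str, sep: str = " ") -> str:
    """Wrap a passage in the '### Original:' / '### Plain English:' markers.

    Hypothetical helper: `sep` defaults to a single space to mirror the
    one-line CLI prompt; the separator used in training is an assumption.
    """
    # Collapse runs of whitespace so the passage sits on one line.
    cleaned = " ".join(text.split())
    return f"### Original:{sep}{cleaned}{sep}### Plain English:"


print(build_prompt("The LORD is my shepherd; I shall not want."))
```

The returned string can be passed as the `--prompt` value or as the `prompt` argument in Python.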
## Model Details

| Property | Value |
|---|---|
| Base model | SmolLM2-1.7B-Instruct |
| Fine-tuning method | LoRA (8 layers) |
| Training iterations | 500 |
| Training examples | 1,200 |
| Validation examples | 150 |
| Data source | Project Gutenberg + AI-generated synthetic pairs |
| Hardware | Apple M1, 16 GB unified memory |
| Peak training memory | 10.09 GB |
| Final val loss | 1.771 |
| Inference memory | ~3.6 GB |
| Build time | 1 evening |

## What it's good at

- 19th-century prose (Dickens, James, Eliot, Hardy)
- Shakespeare and Elizabethan English
- King James Bible passages
- Economic and political theory (Smith, Burke, Locke)
- Academic abstracts
- Legal boilerplate

## Known limitations

- Trained on 200-word chunks, so short fragments may produce inconsistent results
- Occasional errors on numerical content (dates, quantities)
- Not optimized for highly technical scientific notation
- May struggle with extremely abstract or experimental writing (e.g. stream-of-consciousness)

## How it was built

```
1. Stream 1,500 prose passages from Project Gutenberg
2. Generate plain English versions using a frontier model as teacher
3. Format as (original → plain) training pairs
4. Fine-tune SmolLM2-1.7B with LoRA on Apple MLX
5. Fuse adapter into final weights
```

The key insight: at this specific task, a small model fine-tuned on 1,200 high-quality examples, drawn from the 1,500 collected passages, can outperform a large model trained on millions of noisy ones.

## Use in Python

```python
from mlx_lm import load, generate

# Download (or load from cache) the fused model and its tokenizer
model, tokenizer = load("Brandi-Kinard/plainspeak-smollm2-1.7b")

prompt = """### Original: It is a truth universally acknowledged, that a single man in possession of a good fortune, must be in want of a wife.
### Plain English:"""

response = generate(model, tokenizer, prompt=prompt, max_tokens=200)
print(response)
```

## Links

- [GitHub repo](https://github.com/Brandi-Kinard/plainspeak)
- [Live demo](https://huggingface.co/spaces/Brandi-Kinard/plainspeak)
- [Built by Brandi Kinard](https://www.linkedin.com/in/brandi-kinard/)
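
## Longer passages

The limitations listed in this card note that the model was trained on 200-word chunks. One way to handle longer input, sketched here under that assumption, is to split the text into pieces of at most 200 words and translate each piece separately. `chunk_words` is a hypothetical helper, not part of `mlx-lm`, and whether matching the training chunk size improves output quality has not been measured here.

```python
def chunk_words(text: str, max_words: int = 200) -> list[str]:
    """Split text into consecutive chunks of at most `max_words` words.

    Hypothetical helper: splits on whitespace only, so a chunk boundary
    may fall in the middle of a sentence.
    """
    if max_words < 1:
        raise ValueError("max_words must be at least 1")
    words = text.split()
    return [
        " ".join(words[i:i + max_words])
        for i in range(0, len(words), max_words)
    ]


sample = "word " * 450
chunks = chunk_words(sample)
print([len(c.split()) for c in chunks])  # [200, 200, 50]
```

Each chunk can then be wrapped in the `### Original:` / `### Plain English:` prompt and passed to `generate` in turn. Splitting on sentence boundaries instead of raw word counts would likely read better, at the cost of uneven chunk sizes.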