Instructions for using vitorallo/securereview-7b-mlx-4bit with libraries and local apps.

Libraries

How to use vitorallo/securereview-7b-mlx-4bit with MLX:

# Make sure mlx-lm is installed
# pip install --upgrade mlx-lm

# Generate text with mlx-lm
from mlx_lm import load, generate

model, tokenizer = load("vitorallo/securereview-7b-mlx-4bit")

prompt = "Write a story about Einstein"
messages = [{"role": "user", "content": prompt}]
prompt = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True
)

text = generate(model, tokenizer, prompt=prompt, verbose=True)

Local Apps

Pi

How to use vitorallo/securereview-7b-mlx-4bit with Pi:

Start the MLX server

# Install MLX LM:
uv tool install mlx-lm
# Start a local OpenAI-compatible server:
mlx_lm.server --model "vitorallo/securereview-7b-mlx-4bit"

Configure the model in Pi

# Install Pi:
npm install -g @mariozechner/pi-coding-agent
# Add to ~/.pi/agent/models.json:
{
  "providers": {
    "mlx-lm": {
      "baseUrl": "http://localhost:8080/v1",
      "api": "openai-completions",
      "apiKey": "none",
      "models": [
        {
          "id": "vitorallo/securereview-7b-mlx-4bit"
        }
      ]
    }
  }
}

Run Pi

# Start Pi in your project directory:
pi

Hermes Agent

How to use vitorallo/securereview-7b-mlx-4bit with Hermes Agent:

Start the MLX server

# Install MLX LM:
uv tool install mlx-lm
# Start a local OpenAI-compatible server:
mlx_lm.server --model "vitorallo/securereview-7b-mlx-4bit"

Configure Hermes

# Install Hermes:
curl -fsSL https://hermes-agent.nousresearch.com/install.sh | bash
hermes setup
# Point Hermes at the local server:
hermes config set model.provider custom
hermes config set model.base_url http://127.0.0.1:8080/v1
hermes config set model.default vitorallo/securereview-7b-mlx-4bit

Run Hermes

hermes

MLX LM

How to use vitorallo/securereview-7b-mlx-4bit with MLX LM:

Generate or start a chat session

# Install MLX LM
uv tool install mlx-lm
# Interactive chat REPL
mlx_lm.chat --model "vitorallo/securereview-7b-mlx-4bit"

Run an OpenAI-compatible server

# Install MLX LM
uv tool install mlx-lm
# Start the server
mlx_lm.server --model "vitorallo/securereview-7b-mlx-4bit"
# Calling the OpenAI-compatible server with curl
curl -X POST "http://localhost:8080/v1/chat/completions" \
   -H "Content-Type: application/json" \
   --data '{
     "model": "vitorallo/securereview-7b-mlx-4bit",
     "messages": [
       {"role": "user", "content": "Hello"}
     ]
   }'
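
The same request can be sent from Python with only the standard library. This is a minimal sketch, assuming the server listens on port 8080 (the port used in the Pi and Hermes configurations) and returns an OpenAI-style response body; `build_chat_request` and `chat` are hypothetical helper names, not part of mlx-lm.

```python
import json
import urllib.request

# Assumption: mlx_lm.server is running locally on port 8080.
BASE_URL = "http://localhost:8080/v1"
MODEL_ID = "vitorallo/securereview-7b-mlx-4bit"


def build_chat_request(content: str, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build a POST request carrying the same JSON payload as the curl call."""
    payload = {
        "model": MODEL_ID,
        "messages": [{"role": "user", "content": content}],
    }
    return urllib.request.Request(
        f"{base_url}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def chat(content: str) -> str:
    """Send the request and return the assistant's reply text."""
    with urllib.request.urlopen(build_chat_request(content)) as resp:
        body = json.load(resp)
    # OpenAI-style responses put the reply under choices[0].message.content.
    return body["choices"][0]["message"]["content"]
```

With the server running, `print(chat("Hello"))` should print the model's reply.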

vitorallo committed on Apr 19

Commit 4927f6f · verified · 1 Parent(s): d281097

Upload folder using huggingface_hub

Files changed (2)

README.md +33 -122
model.safetensors +1 -1

README.md CHANGED

@@ -22,154 +22,65 @@ datasets:
 # securereview-7b-mlx-4bit
-A 4-bit quantised, MLX-native fine-tune of Qwen2.5-Coder-7B-Instruct for
-function-level security code review. Feed it a function, get back JSON
-findings with severity, category, CWE, line number, and a description.
-Runs on Apple Silicon with ~7 GB of memory.
-Trained on 13,484 examples across 9 languages (Python, JavaScript,
-TypeScript, Go, Java, Ruby, Rust, C++, C#) from CVEFixes, synthetic
 generation, real vulnerable applications, and community rule sets.
 All training data is permissively licensed.
 ## Benchmarks
-### Out-of-distribution (real vulnerable apps)
-Tested against 33 unique vulnerable functions from 8 deliberately
-vulnerable applications (DVNA, NodeGoat, pygoat, crAPI, DSVW, WebGoat,
-RailsGoat, Juice Shop):
-**Recall: 33/33 (100%)**
 | Category | Recall |
 |----------|--------|
-| SQL Injection | 6/6 (100%) |
-| Command Injection | 3/3 (100%) |
-| SSRF | 2/2 (100%) |
-| Path Traversal | 2/2 (100%) |
-| Broken Access Control | 5/5 (100%) |
-| IDOR | 7/7 (100%) |
-| Code Injection | 1/1 (100%) |
-| Open Redirect | 1/1 (100%) |
-| Insecure Deserialization | 1/1 (100%) |
-| Broken Authentication | 1/1 (100%) |
-### In-distribution (200-example test split)
-| Metric | Base Qwen | securereview-7b (M6) |
-|--------|----------:|---------------------:|
-| F1 | 14.3% | **44%+** |
-| FPR | 70.3% | **<3%** |
-| Logic bug recall | 29.2% | **37%+** |
-| Pattern bug recall | 28.9% | **59%+** |
-### Auth-context awareness
-M6 understands the `Auth:` context line added by scanners that perform
-auth-coverage analysis. When the prompt includes
-`Auth: NONE -- no auth decorator or middleware protects this endpoint`,
-the model correctly flags IDOR and broken access control on small route
-handlers that previous versions missed.
-## How to use
 ```python
 from mlx_lm import load, generate
 model, tok = load("vitorallo/securereview-7b-mlx-4bit")
-# Belt-and-braces stop-token patch for older mlx-lm versions
 if hasattr(tok, "eos_token_ids") and 151645 not in tok.eos_token_ids:
     tok.eos_token_ids.add(151645)
-SYSTEM = (
-    "You are a JSON API that performs security code review. "
-    "You only output valid JSON. Never output markdown, explanations, "
-    "or text outside JSON."
-)
-USER = """Project: Express.js, 12 route handlers, 0 auth functions, 4 data sinks.
-Auth coverage: 0 of 12 route handlers protected. 12 unprotected.
-Analyze this function for security vulnerabilities.
-Function: get_user
-File: app/api/users.py:1-3
-Role: ROUTE_HANDLER
-Auth: NONE -- no auth decorator or middleware protects this endpoint
-Calls: db.execute
-Code:
-```
-def get_user(user_id):
-    return db.execute(f"SELECT * FROM users WHERE id='{user_id}'")
-```
-Respond with ONLY a valid JSON object:
-{"findings": [{"severity": "HIGH|MEDIUM|LOW", "category": "...", "line": integer, "code": "...", "description": "...", "recommendation": "...", "confidence": 0.0-1.0, "cwe_id": "CWE-xxx"}]}
-If no vulnerabilities: {"findings": []}"""
-prompt = tok.apply_chat_template(
-    [{"role": "system", "content": SYSTEM}, {"role": "user", "content": USER}],
-    add_generation_prompt=True, tokenize=False,
-)
-print(generate(model, tok, prompt=prompt, max_tokens=512))
 ```
-The prompt format matters. The model was trained with `mask_prompt: true`
-on a specific prompt structure. Key fields: `Function`, `File`, `Role`,
-`Auth`, `Called by`, `Calls`, and a triple-backticked `Code` block.
-Deviating from this structure recovers base-model behaviour.
-Full prompt specification: [docs/m3_inference_contract.md](https://github.com/vitorallo/securereview-7b/blob/main/docs/m3_inference_contract.md)
 ## Training
-- **Base**: mlx-community/Qwen2.5-Coder-7B-Instruct-4bit
-- **Method**: QLoRA via mlx-lm lora
-- **LoRA**: rank 8, scale 1.0, 8 transformer layers
-- **Optimizer**: AdamW, lr 1e-4
-- **Batch**: 4 effective (1 x 4 grad accumulation), max_seq_length 4096
-- **Epochs**: 1 (~2,654 iterations on 10,616 train records)
-- **Val loss**: 1.530 -> 0.203
-- **Hardware**: Apple Silicon, ~1 hour, 7.4 GB peak memory
-The adapter is deliberately light (rank 8, 1 epoch) to preserve the base
-model's security reasoning while teaching the output format and
-auth-context awareness.
-## Training data
-13,484 records from 11 sources:
-| Source | Records | Description |
-|--------|--------:|-------------|
-| CVEFixes | 7,846 | Real vulnerable functions from CVE patch commits |
-| Synthetic | 2,728 | LLM-generated pairs from 162 security rules |
-| Tool-use | 2,700 | Multi-turn tool-calling examples |
-| Investigation | 500 | Verdict examples (confirmed/dismissed/uncertain) |
-| Vulnapp | 119 | Real functions from 8 vulnerable applications |
-| IDOR/auth | 55 | Auth-context-aware IDOR and clean route handler examples |
-All records include `Auth:` context and `Project:` preamble lines
-matching the M11 scanner prompt format.
-All permissively licensed. No CC-BY-NC or share-alike data.
-## Iteration history
-| Version | Test F1 | FPR | Real-code recall | Change |
-|---------|--------:|----:|----------------:|--------|
-| base | 14% | 70% | -- | No fine-tuning |
-| M3 | 61% | 0% | 26% | Anti-memorisation, description templates |
-| M4 | 66% | 0% | 0% (prod) | Tool-use format; over-suppressed |
-| M5 | 44% | 2.6% | 88% | Lighter adapter, real vulnapp data |
-| **M6** | **44%+** | **<3%** | **100%** | Auth-context injection, IDOR training examples |
 ## Links
-- Code + pipeline: [github.com/vitorallo/securereview-7b](https://github.com/vitorallo/securereview-7b)
 - License: Apache-2.0
 ## Citation

 # securereview-7b-mlx-4bit
+A 4-bit MLX fine-tune of Qwen2.5-Coder-7B-Instruct for function-level
+security code review. Input: a code function. Output: structured JSON
+findings with severity, category, CWE, line number, and description.
+Runs on Apple Silicon, ~8 GB memory.
+Trained on 13,484 examples across 9 languages from CVEFixes, synthetic
 generation, real vulnerable applications, and community rule sets.
 All training data is permissively licensed.
 ## Benchmarks
+Tested against 33 vulnerable functions from 8 deliberately vulnerable
+applications (DVNA, NodeGoat, pygoat, crAPI, DSVW, WebGoat, RailsGoat,
+Juice Shop):
+| Metric | Base Qwen | securereview-7b |
+|--------|----------:|----------------:|
+| Vulnapp recall | -- | **94% (31/33)** |
+| FPR (clean code) | 70% | **<3%** |
+| F1 (test split) | 14% | **44%** |
+Detection by category:
 | Category | Recall |
 |----------|--------|
+| SQL Injection | 100% |
+| Command Injection | 100% |
+| SSRF | 100% |
+| Path Traversal | 100% |
+| Broken Access Control | 100% |
+| IDOR | 86% |
+| Insecure Deserialization | 100% |
+| Broken Authentication | 100% |
+## Quick start
 ```python
 from mlx_lm import load, generate
 model, tok = load("vitorallo/securereview-7b-mlx-4bit")
 if hasattr(tok, "eos_token_ids") and 151645 not in tok.eos_token_ids:
     tok.eos_token_ids.add(151645)
 ```
+The model expects a structured prompt with `Function`, `File`, `Role`,
+`Auth`, `Code` fields and a JSON format reminder. See
+[docs/m3_inference_contract.md](https://github.com/vitorallo/securereview-7b/blob/main/docs/m3_inference_contract.md)
+for the full prompt specification.
 ## Training
+- **Base**: Qwen2.5-Coder-7B-Instruct-4bit
+- **Method**: QLoRA, rank 8, 8 layers, 1 epoch, lr 1e-4
+- **Data**: 13,484 records, 9 languages, multi-rule prompts (2-8 rules per function)
+- **Hardware**: Apple Silicon, ~1 hour
 ## Links
+- [Code + pipeline](https://github.com/vitorallo/securereview-7b)
 - License: Apache-2.0
 ## Citation

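The new README names the prompt fields (`Function`, `File`, `Role`, `Auth`, `Code`, plus a JSON format reminder) but no longer shows a full prompt. The sketch below rebuilds one from the example in the removed README text. It is an illustration only: `build_user_prompt`, `build_messages`, and `parse_findings` are hypothetical helpers, and the linked inference contract remains the authoritative specification.

```python
import json

# System prompt and JSON reminder copied from the removed README example.
SYSTEM = (
    "You are a JSON API that performs security code review. "
    "You only output valid JSON. Never output markdown, explanations, "
    "or text outside JSON."
)
JSON_REMINDER = (
    '{"findings": [{"severity": "HIGH|MEDIUM|LOW", "category": "...", '
    '"line": integer, "code": "...", "description": "...", '
    '"recommendation": "...", "confidence": 0.0-1.0, "cwe_id": "CWE-xxx"}]}'
)
FENCE = "`" * 3  # triple backticks that wrap the Code block


def build_user_prompt(function, file, role, auth, calls, code, preamble=""):
    """Assemble the user message in the field order of the old README example."""
    lines = [preamble] if preamble else []  # optional "Project: ..." lines
    lines += [
        "Analyze this function for security vulnerabilities.",
        f"Function: {function}",
        f"File: {file}",
        f"Role: {role}",
        f"Auth: {auth}",
        f"Calls: {calls}",
        "Code:",
        FENCE,
        code,
        FENCE,
        "Respond with ONLY a valid JSON object:",
        JSON_REMINDER,
        'If no vulnerabilities: {"findings": []}',
    ]
    return "\n".join(lines)


def build_messages(user_prompt):
    """Chat messages suitable for tokenizer.apply_chat_template."""
    return [
        {"role": "system", "content": SYSTEM},
        {"role": "user", "content": user_prompt},
    ]


def parse_findings(raw):
    """Parse model output; raises ValueError if it is not the expected JSON object."""
    data = json.loads(raw)  # json.JSONDecodeError is a subclass of ValueError
    if not isinstance(data, dict) or "findings" not in data:
        raise ValueError("expected a JSON object with a 'findings' key")
    return data["findings"]
```

Malformed output raises an error here instead of being treated as "no findings", so a truncated generation cannot be mistaken for a clean result.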
model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:2b166500f9fb7c022aaf180d0db66c4f69189cd51d15c55ae4a428d533f04911
 size 4284346187

 version https://git-lfs.github.com/spec/v1
+oid sha256:0f66cdb775a7537c7837f99701d35ef0d7df7a3aba23cdb64c0306a68b0bc03f
 size 4284346187
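
A Git LFS pointer records the SHA-256 digest and byte size of the real file, so a downloaded `model.safetensors` can be checked against the new pointer. This is a minimal sketch; the helper names and the local file path are assumptions for illustration.

```python
import hashlib
import os

# Values copied from the new LFS pointer in this commit.
NEW_OID = "0f66cdb775a7537c7837f99701d35ef0d7df7a3aba23cdb64c0306a68b0bc03f"
NEW_SIZE = 4284346187


def sha256_of_file(path, chunk_size=1 << 20):
    """Hash a file in 1 MiB chunks so a multi-GB file is never held in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_pointer(path, oid=NEW_OID, size=NEW_SIZE):
    """True if the file's size and SHA-256 both match the LFS pointer."""
    # Check the size first: it is cheap and fails fast on a partial download.
    return os.path.getsize(path) == size and sha256_of_file(path) == oid
```

For example, `matches_pointer("model.safetensors")` (hypothetical local path) should return `True` for an intact download of this revision.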