sweelol/finetuned-pruned-gemma3-270m-dolly

sweelol committed on Aug 26, 2025

Commit a2d4bbe (verified) · 1 Parent(s): 0daf645

Update README.md

Files changed (1)

README.md +19 -0


@@ -27,6 +27,25 @@ This model is part of the **Sweelol AI Hub** collection, resulting from experime
 This is a placeholder README. A detailed model card with full results and usage instructions will be added shortly.
 ## Evaluation

 This is a placeholder README. A detailed model card with full results and usage instructions will be added shortly.
+## Evaluation Results
+This table compares the performance of this **Finetuned-Pruned** model against the original, un-tuned `google/gemma-3-270m` base model.
+| Benchmark Task | Sweelol Finetuned-Pruned | Baseline (Gemma-3-270m) | Change |
+| :--- | :--- | :--- | :--- |
+| **Average MMLU (5 tasks)** | 25.18% | 24.88% | **+0.30%** |
+| HellaSwag (Common Sense) | 29.50% | 43.50% | -14.00% |
+| *MMLU Sub-task Breakdown:* | | | |
+| MMLU - Formal Logic | **28.57%** | 25.40% | **+3.17%** |
+| MMLU - High School Computer Science | **25.00%** | 24.00% | **+1.00%** |
+| MMLU - Professional Law | 25.00% | 27.00% | -2.00% |
+| MMLU - Abstract Algebra | 22.00% | 22.00% | 0.00% |
+| MMLU - High School Mathematics | 21.00% | 26.00% | -5.00% |
+#### Summary of Findings
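+Fine-tuning the pruned model left the reported MMLU average nearly unchanged (25.18% vs. 24.88%, +0.30 percentage points). The largest gain was in formal logic (+3.17 points), while high school mathematics (-5.00 points) and professional law (-2.00 points) regressed. Like the pruned-only baseline, it suffered a significant drop in common-sense reasoning: HellaSwag fell 14.00 points, from 43.50% to 29.50%.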
 ## Evaluation
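
The Change column in the Evaluation Results table is a difference in percentage points (fine-tuned score minus baseline score), not a relative change. The sketch below recomputes those differences from the scores in the table. The helper names `change_pp` and `unweighted_mean` are hypothetical and are not part of the model card or any evaluation harness. Taking a plain unweighted mean of the five MMLU rows is an assumption about how the averages were aggregated: it reproduces the baseline figure of 24.88%, but for the fine-tuned rows it gives about 24.31% rather than the reported 25.18%, so the reported average may have been computed differently.

```python
# Scores in percent, copied from the Evaluation Results table.
# Each tuple is (finetuned_pruned, baseline).
mmlu_subtasks = {
    "Formal Logic": (28.57, 25.40),
    "High School Computer Science": (25.00, 24.00),
    "Professional Law": (25.00, 27.00),
    "Abstract Algebra": (22.00, 22.00),
    "High School Mathematics": (21.00, 26.00),
}
hellaswag = (29.50, 43.50)


def change_pp(finetuned, baseline):
    """Difference in percentage points (hypothetical helper)."""
    return round(finetuned - baseline, 2)


def unweighted_mean(values):
    """Plain arithmetic mean; assumes every sub-task counts equally."""
    values = list(values)
    return round(sum(values) / len(values), 2)


# Per-task change, matching the Change column of the table.
changes = {task: change_pp(f, b) for task, (f, b) in mmlu_subtasks.items()}

# Averages over the five listed MMLU sub-tasks.
baseline_mean = unweighted_mean(b for _, b in mmlu_subtasks.values())
finetuned_mean = unweighted_mean(f for f, _ in mmlu_subtasks.values())

for task, delta in changes.items():
    print(f"MMLU - {task}: {delta:+.2f} pp")
print(f"HellaSwag: {change_pp(*hellaswag):+.2f} pp")
print(f"Baseline mean: {baseline_mean:.2f}%")    # 24.88, matches the table
print(f"Finetuned mean: {finetuned_mean:.2f}%")  # 24.31, table reports 25.18
```

If the evaluation used a different aggregation (for example, weighting each sub-task by its number of questions), `unweighted_mean` would need to be replaced accordingly.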