nanochat English vs Greek medium experiment

This repo stores the 353M nanochat comparison runs. It now includes the original English baseline, the original Greek run, and a refreshed cleaned Greek run produced with the same nanochat recipe and builder path as the original Greek comparison.

Three-way Results

| Run | Final validation BPB | Best validation BPB | Train BPB | Test BPB | Best step | Train time |
| --- | --- | --- | --- | --- | --- | --- |
| English baseline | `0.890841` | `0.890841` | `0.852060` | `0.891791` | `2708` | `2.49 h` |
| Original Greek | `0.300441` | `0.300441` | `0.307879` | `0.331570` | `2708` | `0.87 h` |
| Refreshed cleaned Greek | `0.424307` | `0.424307` | `0.401086` | `0.434514` | `2708` | `0.93 h` |
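For readers unfamiliar with the metric, the sketch below assumes the usual definition of bits per byte (BPB): the summed negative log-likelihood in nats, divided by ln 2 times the number of UTF-8 bytes in the evaluated text. The helper names `bits_per_byte` and `utf8_bytes_per_char` are illustrative and not taken from the nanochat code. The second helper shows one thing to keep in mind when reading the English and Greek rows side by side: Greek letters take two bytes each in UTF-8, so the per-byte denominator differs between the two languages.

```python
import math


def bits_per_byte(total_nll_nats: float, total_bytes: int) -> float:
    """Convert a summed negative log-likelihood (nats) into bits per byte.

    Assumed definition: BPB = NLL_nats / (ln 2 * bytes).
    """
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    return total_nll_nats / (math.log(2) * total_bytes)


def utf8_bytes_per_char(text: str) -> float:
    """Average number of UTF-8 bytes per character in `text`."""
    if not text:
        raise ValueError("text must be non-empty")
    return len(text.encode("utf-8")) / len(text)


# ASCII letters use one byte each; basic Greek letters use two.
print(utf8_bytes_per_char("abcd"))  # 1.0
print(utf8_bytes_per_char("αβγδ"))  # 2.0
```

This is only an illustration of the unit. It does not by itself account for the gap between the English and Greek numbers, since the datasets and tokenizers also differ.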

Delta for refreshed cleaned Greek versus original Greek (refreshed minus original, so positive values mean higher BPB):

Final validation BPB: +0.123866
Best validation BPB: +0.123866
Train BPB: +0.093207
Test BPB: +0.102944
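The deltas above can be reproduced directly from the results table. The following is a minimal sketch that hard-codes the two Greek rows; the dictionary keys and the `deltas` helper are illustrative names, not part of the repo. `Decimal` is used so the six-digit values subtract exactly.

```python
from decimal import Decimal

# Values copied from the results table (BPB, six decimal places).
ORIGINAL_GREEK = {
    "final_validation": "0.300441",
    "best_validation": "0.300441",
    "train": "0.307879",
    "test": "0.331570",
}
REFRESHED_GREEK = {
    "final_validation": "0.424307",
    "best_validation": "0.424307",
    "train": "0.401086",
    "test": "0.434514",
}


def deltas(new: dict, old: dict) -> dict:
    """Return new - old for every metric, using exact decimal arithmetic."""
    return {name: Decimal(new[name]) - Decimal(old[name]) for name in old}


for name, delta in deltas(REFRESHED_GREEK, ORIGINAL_GREEK).items():
    print(f"{name}: {delta:+}")
```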

Three-way Plots

Refreshed Greek Data

Dataset repo: fffoivos/glossapi-greek-nanochat-pretraining-dataset
Dataset revision used for this run: e3b8d19d40665551d49af369efb082e82b6a815e
Builder script: /home/ubuntu/experiments/nanochat/2026-05-07_refreshed-dataset-comparable-353m/runs/greek-refreshed-dataset-comparable-d13/build/scripts/prepare_glossapi_greek_experiment_data_v2.py
Train/validation/test char targets: 5658225610 / 251485449 / 251485449
Split namespace and seed: nanochat-en-vs-el-medium, seed 20260322
Quality filters: greek_badness_score < 10.0 and mojibake_badness_score <= 0.1
Dedup replay: drop_intra_and_inter, exact stage strict_and_relaxed, near threshold 0.85, policy share_aware
Markdown chunking and shuffled chunks are enabled, matching the original Greek recipe.
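As an illustration of the quality filters listed above, the sketch below applies the two thresholds to a row. It assumes each row is a dictionary carrying `greek_badness_score` and `mojibake_badness_score` as numeric fields; the row layout and the `passes_quality_filters` helper are assumptions for this example and are not taken from the builder script.

```python
GREEK_BADNESS_LIMIT = 10.0    # strict bound: score must be < 10.0
MOJIBAKE_BADNESS_LIMIT = 0.1  # inclusive bound: score must be <= 0.1


def passes_quality_filters(row: dict) -> bool:
    """Return True if the row satisfies both quality thresholds."""
    return (
        row["greek_badness_score"] < GREEK_BADNESS_LIMIT
        and row["mojibake_badness_score"] <= MOJIBAKE_BADNESS_LIMIT
    )


# Hypothetical rows, for illustration only.
rows = [
    {"id": "a", "greek_badness_score": 2.5, "mojibake_badness_score": 0.0},
    {"id": "b", "greek_badness_score": 10.0, "mojibake_badness_score": 0.0},
    {"id": "c", "greek_badness_score": 1.0, "mojibake_badness_score": 0.1},
    {"id": "d", "greek_badness_score": 1.0, "mojibake_badness_score": 0.2},
]
kept = [row["id"] for row in rows if passes_quality_filters(row)]
print(kept)
```

Note the asymmetry between the two bounds: a `greek_badness_score` of exactly 10.0 is rejected, while a `mojibake_badness_score` of exactly 0.1 is kept.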

Artifacts

models/english/final/
models/greek_threshold_050/final/
models/greek_refreshed_dataset_comparable/final/
tokenizers/english/
tokenizers/greek_threshold_050/
tokenizers/greek_refreshed_dataset_comparable/
analysis/loss_comparison_refreshed_dataset_three_way.json
analysis/final_manifest_refreshed_dataset_comparable.json
docs/RUN_REPORT.md
docs/RUN_REPORT_REFRESHED_DATASET_COMPARABLE.md
