cy0307 committed on
Commit 03b872c · verified · 1 Parent(s): 0ba324f

Publish Ropedia Xperience-10M task baseline cards

ARTIFACT_GUIDE.md CHANGED
@@ -24,7 +24,7 @@ The project intentionally separates four layers:
24
  | [`EVIDENCE_CONTRACT.md`](EVIDENCE_CONTRACT.md) | Defines which claims are verified and which are explicitly not claimed. |
25
  | [`REPRODUCIBILITY.md`](REPRODUCIBILITY.md) | Defines public reproduction commands, expected outputs, and unreproducible boundaries. |
26
  | [`metrics/artifact_index.json`](metrics/artifact_index.json) | Lists reviewer-critical files with existence, size, and stable hashes. |
27
- | [`metrics/mirror_parity.json`](metrics/mirror_parity.json) | Confirms prepared HF Space, artifact, and model mirrors match the repo for critical files. |
28
  | [`metrics/publication_audit.json`](metrics/publication_audit.json) | Confirms public bundles exclude raw data, Python caches, heavy archives, and token strings. |
29
  | [`metrics/scope_claims_audit.json`](metrics/scope_claims_audit.json) | Confirms historical `32ep` smoke-run identifiers are not presented as real 32-episode results. |
30
  | [`metrics/website_integrity.json`](metrics/website_integrity.json) | Confirms local site links, anchors, JSON bundles, and referenced images resolve. |
 
24
  | [`EVIDENCE_CONTRACT.md`](EVIDENCE_CONTRACT.md) | Defines which claims are verified and which are explicitly not claimed. |
25
  | [`REPRODUCIBILITY.md`](REPRODUCIBILITY.md) | Defines public reproduction commands, expected outputs, and unreproducible boundaries. |
26
  | [`metrics/artifact_index.json`](metrics/artifact_index.json) | Lists reviewer-critical files with existence, size, and stable hashes. |
27
+ | [`metrics/mirror_parity.json`](metrics/mirror_parity.json) | Confirms prepared HF Space, artifact, and model mirrors match the repo for critical data, figures, website HTML, and validator scripts. |
28
  | [`metrics/publication_audit.json`](metrics/publication_audit.json) | Confirms public bundles exclude raw data, Python caches, heavy archives, and token strings. |
29
  | [`metrics/scope_claims_audit.json`](metrics/scope_claims_audit.json) | Confirms historical `32ep` smoke-run identifiers are not presented as real 32-episode results. |
30
  | [`metrics/website_integrity.json`](metrics/website_integrity.json) | Confirms local site links, anchors, JSON bundles, and referenced images resolve. |
EVIDENCE_CONTRACT.md CHANGED
@@ -15,7 +15,7 @@ local artifact that a reader can inspect before trusting the dashboard.
15
  | Qwen3-Omni infrastructure has passed technical smoke checks. | Companion GitHub repo: `results/omni_finetune/RUN_REPORT.md`, `results/omni_finetune/dataset_manifest.json`, `results/omni_finetune/metrics_eval.json` | Smoke-only evidence | One episode, 128 train windows; not a 32-episode pilot |
16
  | The real 32-episode LoRA pilot is blocked on gated data access, not on repo presentation. | Companion GitHub repo: `results/omni_finetune/DATA_BLOCKER_REPORT.md`, `results/omni_finetune/A100_HF_RELAY_STATUS.md`, `results/omni_finetune/source_discovery.json` | Blocker documented | No 32-episode metric should be claimed until the gate passes |
17
  | Historical `32ep` path strings are not treated as 32-episode results. | `scripts/validate_scope_claims.py`, `metrics/scope_claims_audit.json` | Verified pass | Classifies old run/path identifiers and fails if public presentation claims real 32-episode metrics |
18
- | Prepared GitHub/Hugging Face mirrors carry matching critical files. | `scripts/validate_mirror_parity.py`, `metrics/mirror_parity.json` | Verified pass | Compares prepared Space, artifact dataset, and model bundles before upload; live URLs are checked after publishing |
19
  | The public GitHub and Hugging Face bundles are publication-clean. | `scripts/validate_publication_package.py`, `metrics/publication_audit.json` | Verified pass | Checks public files and HF bundles, not arbitrary ignored local scratch outputs |
20
  | The public website has checked local references. | `scripts/validate_website_integrity.py`, `metrics/website_integrity.json` | Verified pass | Checks local links, anchors, JSON data, and referenced images; external URLs are not fetched |
21
  | The core proof artifacts are indexed and grouped for fast review. | `ARTIFACT_GUIDE.md`, `scripts/build_artifact_index.py`, `metrics/artifact_index.json` | Verified guide and index | Selective source-of-truth catalog, not a complete inventory of every output file |
@@ -43,7 +43,8 @@ local artifact that a reader can inspect before trusting the dashboard.
43
  8. Inspect `metrics/scope_claims_audit.json` before interpreting historical
44
  `32ep` strings in Qwen3-Omni smoke artifacts.
45
  9. Inspect `metrics/mirror_parity.json` before assuming the GitHub and
46
- Hugging Face mirrors contain the same critical files.
 
47
  10. Inspect the companion GitHub repo's
48
  `results/omni_finetune/DATA_BLOCKER_REPORT.md` before interpreting any
49
  Qwen3-Omni artifact.
 
15
  | Qwen3-Omni infrastructure has passed technical smoke checks. | Companion GitHub repo: `results/omni_finetune/RUN_REPORT.md`, `results/omni_finetune/dataset_manifest.json`, `results/omni_finetune/metrics_eval.json` | Smoke-only evidence | One episode, 128 train windows; not a 32-episode pilot |
16
  | The real 32-episode LoRA pilot is blocked on gated data access, not on repo presentation. | Companion GitHub repo: `results/omni_finetune/DATA_BLOCKER_REPORT.md`, `results/omni_finetune/A100_HF_RELAY_STATUS.md`, `results/omni_finetune/source_discovery.json` | Blocker documented | No 32-episode metric should be claimed until the gate passes |
17
  | Historical `32ep` path strings are not treated as 32-episode results. | `scripts/validate_scope_claims.py`, `metrics/scope_claims_audit.json` | Verified pass | Classifies old run/path identifiers and fails if public presentation claims real 32-episode metrics |
18
+ | Prepared GitHub/Hugging Face mirrors carry matching critical files. | `scripts/validate_mirror_parity.py`, `metrics/mirror_parity.json` | Verified pass | Compares prepared data files, visual assets, website HTML, and validator scripts before upload; live URLs are checked after publishing |
19
  | The public GitHub and Hugging Face bundles are publication-clean. | `scripts/validate_publication_package.py`, `metrics/publication_audit.json` | Verified pass | Checks public files and HF bundles, not arbitrary ignored local scratch outputs |
20
  | The public website has checked local references. | `scripts/validate_website_integrity.py`, `metrics/website_integrity.json` | Verified pass | Checks local links, anchors, JSON data, and referenced images; external URLs are not fetched |
21
  | The core proof artifacts are indexed and grouped for fast review. | `ARTIFACT_GUIDE.md`, `scripts/build_artifact_index.py`, `metrics/artifact_index.json` | Verified guide and index | Selective source-of-truth catalog, not a complete inventory of every output file |
 
43
  8. Inspect `metrics/scope_claims_audit.json` before interpreting historical
44
  `32ep` strings in Qwen3-Omni smoke artifacts.
45
  9. Inspect `metrics/mirror_parity.json` before assuming the GitHub and
46
+ Hugging Face mirrors contain the same critical data, visual, HTML, and
47
+ validator files.
48
  10. Inspect the companion GitHub repo's
49
  `results/omni_finetune/DATA_BLOCKER_REPORT.md` before interpreting any
50
  Qwen3-Omni artifact.
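
Reviewer note: the parity claim above amounts to comparing each local file with its prepared mirror copies by existence, size, and SHA-256. The following is a minimal sketch of that idea, not the contents of `scripts/validate_mirror_parity.py`, which this diff does not show. The function names `describe` and `compare_group` and the failure message strings are hypothetical; the `local` / `mirrors` / `failures` record layout follows the entries visible in `metrics/mirror_parity.json`.

```python
import hashlib
from pathlib import Path


def describe(path: Path) -> dict:
    """Record existence, size, and SHA-256 for one file."""
    if not path.is_file():
        return {"path": str(path), "exists": False}
    data = path.read_bytes()
    return {
        "path": str(path),
        "exists": True,
        "bytes": len(data),
        "sha256": hashlib.sha256(data).hexdigest(),
    }


def compare_group(local: Path, mirrors: dict[str, Path]) -> dict:
    """Compare one local file against its named mirror copies (hypothetical helper)."""
    local_info = describe(local)
    group = {"local": local_info, "mirrors": {}, "failures": []}
    if not local_info["exists"]:
        group["failures"].append("local: missing")
    for name, path in mirrors.items():
        info = describe(path)
        group["mirrors"][name] = info
        if not info["exists"]:
            group["failures"].append(f"{name}: missing")
        elif local_info["exists"] and info["sha256"] != local_info["sha256"]:
            group["failures"].append(f"{name}: sha256 mismatch")
    return group
```

A group passes when its `failures` list is empty. A check like this only covers prepared local bundles; as the table row says, live URLs are checked separately after publishing.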
README.md CHANGED
@@ -62,14 +62,15 @@ and metrics for the 12-task Xperience-10M episode suite, plus four lightweight
62
  direction-extension probes. It is meant to be read like a model audit, not
63
  advertised as a robot foundation model.
64
 
65
- ![12-task suite with sample modalities](assets/task_suite_infographic.png?v=xperience10m-modalities-v9-large-atlas)
66
 
67
  The source Xperience-10M sample spans video, audio, depth, pose, motion
68
  capture, inertial sensing, and language annotation. The committed minimal and
69
  neural task heads use the current 8,378-d feature manifest; audio is documented
70
  in the figures but is not yet extracted into a model input feature block.
71
- The companion dashboard and this model card mirror the responsive modality atlas
72
- metadata in `metrics/modality_atlas.json`, with standalone derived thumbnails in
 
73
  `assets/modalities/`.
74
 
75
  The committed heads are intentionally small:
@@ -110,7 +111,7 @@ Source-of-truth artifact index mirror: `metrics/artifact_index.json`.
110
  | Feature contract | `artifacts/**/feature_manifest.json` | audio documented but not featurized |
111
  | Qwen3-Omni | companion blocker and relay reports | smoke-only until 32 valid episodes are available |
112
  | Scope claims guard | `metrics/scope_claims_audit.json` and `scripts/validate_scope_claims.py` | historical `32ep` path strings are provenance, not 32-episode results |
113
- | Mirror parity | `metrics/mirror_parity.json` and `scripts/validate_mirror_parity.py` | prepared repo/HF mirrors carry matching critical files |
114
  | Publication hygiene | `metrics/publication_audit.json` and validator script mirror | public bundles contain no raw data, generated caches, heavy archives, or token strings |
115
  | Website integrity | `metrics/website_integrity.json` and validator script mirror | local links, anchors, JSON bundles, and referenced images only |
116
  | Artifact index | `metrics/artifact_index.json` and `scripts/build_artifact_index.py` | compact catalog of the reviewer-critical proof artifacts |
@@ -142,10 +143,10 @@ transfers them to H20 for manifest building, training, and evaluation.
142
  | `artifacts/episode_task_suite/research_direction_extensions/` | adds one coded extension probe per research direction |
143
  | `artifacts/episode_task_suite/task_walkthroughs/` | explains every task with case study, input, process modules, output, and limitation |
144
  | `assets/task_architectures.png` | shows the shared pipeline and all 12 heads |
145
- | `assets/task_suite_infographic.png` | presents the 12 heads with public-sample modality thumbnails and verified metrics |
146
  | `assets/modalities/`, `metrics/modality_atlas.json` | responsive modality-card thumbnails and metadata for sample inspection |
147
  | `metrics/artifact_index.json` | indexes proof artifacts with existence, size, and stable-file hashes |
148
- | `metrics/mirror_parity.json` | verifies prepared repo/HF mirrors have matching critical files before upload |
149
  | `metrics/scope_claims_audit.json` | verifies historical `32ep` smoke-run identifiers are not presented as real 32-episode results |
150
  | `metrics/publication_audit.json` | records the latest public-bundle hygiene check |
151
  | `metrics/website_integrity.json` | records the latest local website link, anchor, JSON, and image integrity check |
 
62
  direction-extension probes. It is meant to be read like a model audit, not
63
  advertised as a robot foundation model.
64
 
65
+ ![12-task suite with sample modalities](assets/task_suite_infographic.png?v=xperience10m-taskfirst-v10)
66
 
67
  The source Xperience-10M sample spans video, audio, depth, pose, motion
68
  capture, inertial sensing, and language annotation. The committed minimal and
69
  neural task heads use the current 8,378-d feature manifest; audio is documented
70
  in the figures but is not yet extracted into a model input feature block.
71
+ The companion dashboard and this model card start with the task-first 12-head
72
+ map, then mirror the responsive modality atlas metadata in
73
+ `metrics/modality_atlas.json`, with standalone derived thumbnails in
74
  `assets/modalities/`.
75
 
76
  The committed heads are intentionally small:
 
111
  | Feature contract | `artifacts/**/feature_manifest.json` | audio documented but not featurized |
112
  | Qwen3-Omni | companion blocker and relay reports | smoke-only until 32 valid episodes are available |
113
  | Scope claims guard | `metrics/scope_claims_audit.json` and `scripts/validate_scope_claims.py` | historical `32ep` path strings are provenance, not 32-episode results |
114
+ | Mirror parity | `metrics/mirror_parity.json` and `scripts/validate_mirror_parity.py` | prepared repo/HF mirrors carry matching critical data, figures, website HTML, and validator files |
115
  | Publication hygiene | `metrics/publication_audit.json` and validator script mirror | public bundles contain no raw data, generated caches, heavy archives, or token strings |
116
  | Website integrity | `metrics/website_integrity.json` and validator script mirror | local links, anchors, JSON bundles, and referenced images only |
117
  | Artifact index | `metrics/artifact_index.json` and `scripts/build_artifact_index.py` | compact catalog of the reviewer-critical proof artifacts |
 
143
  | `artifacts/episode_task_suite/research_direction_extensions/` | adds one coded extension probe per research direction |
144
  | `artifacts/episode_task_suite/task_walkthroughs/` | explains every task with case study, input, process modules, output, and limitation |
145
  | `assets/task_architectures.png` | shows the shared pipeline and all 12 heads |
146
+ | `assets/task_suite_infographic.png` | presents the shared processing contract, 12 heads, verified metrics, and public-sample modality thumbnails |
147
  | `assets/modalities/`, `metrics/modality_atlas.json` | responsive modality-card thumbnails and metadata for sample inspection |
148
  | `metrics/artifact_index.json` | indexes proof artifacts with existence, size, and stable-file hashes |
149
+ | `metrics/mirror_parity.json` | verifies prepared repo/HF mirrors have matching critical data, figures, website HTML, and validator files before upload |
150
  | `metrics/scope_claims_audit.json` | verifies historical `32ep` smoke-run identifiers are not presented as real 32-episode results |
151
  | `metrics/publication_audit.json` | records the latest public-bundle hygiene check |
152
  | `metrics/website_integrity.json` | records the latest local website link, anchor, JSON, and image integrity check |
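
Reviewer note: the README rows above describe an index that records existence, size, and stable-file hashes, and the index diff below shows a `volatile` entry with `"hash_policy": "existence_and_size_only"`. A minimal sketch of how one such record could be built follows. It is an assumption for illustration, not the code in `scripts/build_artifact_index.py`; the function name `index_entry` and its parameters are hypothetical, and only the field names are taken from `metrics/artifact_index.json`.

```python
import hashlib
from pathlib import Path


def index_entry(artifact_id: str, path: Path, volatile: bool = False) -> dict:
    """Build one index record; volatile files skip the content hash (hypothetical helper)."""
    entry = {"id": artifact_id, "exists": path.is_file()}
    if not entry["exists"]:
        return entry
    entry["bytes"] = path.stat().st_size
    if volatile:
        # Assumed rationale: regenerated reports carry timestamps, so a pinned hash would churn.
        entry["volatile"] = True
        entry["hash_policy"] = "existence_and_size_only"
    else:
        entry["sha256"] = hashlib.sha256(path.read_bytes()).hexdigest()
    return entry
```

Under this reading, an edited file such as `EVIDENCE_CONTRACT.md` gets a new `bytes` and `sha256` pair on each rebuild, while a volatile report would be tracked by existence and size only.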
metrics/artifact_index.json CHANGED
@@ -1,6 +1,6 @@
1
  {
2
  "title": "Ropedia Xperience-10M Task Suite Artifact Index",
3
- "generated_at_utc": "2026-06-01T04:49:01+00:00",
4
  "status": "pass",
5
  "artifact_count": 29,
6
  "missing": [],
@@ -35,8 +35,8 @@
35
  "surface": "repo",
36
  "proves": "Defines what is verified, what is smoke-only, and what must not be inferred.",
37
  "exists": true,
38
- "bytes": 6440,
39
- "sha256": "a89e2316e19ebacbb1150879c070279f8f6f659030a945fc398eb08280c60cc0"
40
  },
41
  {
42
  "id": "reviewer_packet",
@@ -57,8 +57,8 @@
57
  "surface": "repo_hf",
58
  "proves": "Gives the human-readable map from proof boundary to data, tasks, platform mirrors, and scale-up status.",
59
  "exists": true,
60
- "bytes": 6438,
61
- "sha256": "01d2e37bf25a5884e116ba7de80cc460d69523c563e540a650646a58e365713f"
62
  },
63
  {
64
  "id": "reproducibility_contract",
@@ -90,8 +90,8 @@
90
  "surface": "repo_hf",
91
  "proves": "Generates the selective proof-artifact catalog from local files.",
92
  "exists": true,
93
- "bytes": 11565,
94
- "sha256": "d57875b1e42a58c02aa2f7da481f7b2190b82414113827883dd8b332c33552f3"
95
  },
96
  {
97
  "id": "publication_audit",
@@ -124,7 +124,7 @@
124
  "kind": "mirror_parity",
125
  "surface": "website_hf",
126
  "volatile": true,
127
- "proves": "Confirms prepared GitHub/HF Space/artifact/model mirrors share the same critical data, figure, and validator files.",
128
  "exists": true,
129
  "bytes": 41465,
130
  "hash_policy": "existence_and_size_only"
 
1
  {
2
  "title": "Ropedia Xperience-10M Task Suite Artifact Index",
3
+ "generated_at_utc": "2026-06-01T05:07:04+00:00",
4
  "status": "pass",
5
  "artifact_count": 29,
6
  "missing": [],
 
35
  "surface": "repo",
36
  "proves": "Defines what is verified, what is smoke-only, and what must not be inferred.",
37
  "exists": true,
38
+ "bytes": 6497,
39
+ "sha256": "417835c2f838f1d4c4bca9f07c708ce04611e7212017e58421956818a4ca4b45"
40
  },
41
  {
42
  "id": "reviewer_packet",
 
57
  "surface": "repo_hf",
58
  "proves": "Gives the human-readable map from proof boundary to data, tasks, platform mirrors, and scale-up status.",
59
  "exists": true,
60
+ "bytes": 6483,
61
+ "sha256": "cc211ec1175eed43bc5d8c9ce1ec412982524af1998316b392e77e5d7ddc99ee"
62
  },
63
  {
64
  "id": "reproducibility_contract",
 
90
  "surface": "repo_hf",
91
  "proves": "Generates the selective proof-artifact catalog from local files.",
92
  "exists": true,
93
+ "bytes": 11579,
94
+ "sha256": "874a3813fb3a19d79be9ea4c0177f5922adf9e667760f927dd49163784eb6b48"
95
  },
96
  {
97
  "id": "publication_audit",
 
124
  "kind": "mirror_parity",
125
  "surface": "website_hf",
126
  "volatile": true,
127
+ "proves": "Confirms prepared GitHub/HF Space/artifact/model mirrors share the same critical data, figure, website HTML, and validator files.",
128
  "exists": true,
129
  "bytes": 41465,
130
  "hash_policy": "existence_and_size_only"
metrics/evidence_contract.json CHANGED
@@ -110,7 +110,7 @@
110
  },
111
  {
112
  "id": "mirror_parity",
113
- "claim": "Prepared GitHub and Hugging Face mirrors carry matching critical files.",
114
  "status": "verified",
115
  "evidence": [
116
  "scripts/validate_mirror_parity.py",
 
110
  },
111
  {
112
  "id": "mirror_parity",
113
+ "claim": "Prepared GitHub and Hugging Face mirrors carry matching critical data, visual, HTML, and validator files.",
114
  "status": "verified",
115
  "evidence": [
116
  "scripts/validate_mirror_parity.py",
metrics/mirror_parity.json CHANGED
@@ -1,9 +1,9 @@
1
  {
2
  "status": "pass",
3
- "generated_at_utc": "2026-06-01T04:49:44+00:00",
4
  "hf_root": "/Users/chaoyue/Documents/Codex/2026-05-29/i-am-learning-this-dataset-https/hf_publish",
5
  "summary": {
6
- "group_count": 28,
7
  "failure_count": 0,
8
  "failures_by_surface": {}
9
  },
@@ -19,6 +19,10 @@
19
  {
20
  "name": "repo_hf_validator_script_parity",
21
  "status": "pass"
22
  }
23
  ],
24
  "groups": [
@@ -28,27 +32,27 @@
28
  "local": {
29
  "path": "/Users/chaoyue/Documents/Codex/2026-05-29/i-am-learning-this-dataset-https/working_repo_copy/docs/data/artifact_index.json",
30
  "exists": true,
31
- "bytes": 12902,
32
- "sha256": "0a6fb26c150942a0807fc38a092bc85f8dd63cc96943d6c2fb8a1df2d727b7ed"
33
  },
34
  "mirrors": {
35
  "hf_space": {
36
  "path": "/Users/chaoyue/Documents/Codex/2026-05-29/i-am-learning-this-dataset-https/hf_publish/space/data/artifact_index.json",
37
  "exists": true,
38
- "bytes": 12902,
39
- "sha256": "0a6fb26c150942a0807fc38a092bc85f8dd63cc96943d6c2fb8a1df2d727b7ed"
40
  },
41
  "hf_artifacts": {
42
  "path": "/Users/chaoyue/Documents/Codex/2026-05-29/i-am-learning-this-dataset-https/hf_publish/artifacts/docs/data/artifact_index.json",
43
  "exists": true,
44
- "bytes": 12902,
45
- "sha256": "0a6fb26c150942a0807fc38a092bc85f8dd63cc96943d6c2fb8a1df2d727b7ed"
46
  },
47
  "hf_model": {
48
  "path": "/Users/chaoyue/Documents/Codex/2026-05-29/i-am-learning-this-dataset-https/hf_publish/model/metrics/artifact_index.json",
49
  "exists": true,
50
- "bytes": 12902,
51
- "sha256": "0a6fb26c150942a0807fc38a092bc85f8dd63cc96943d6c2fb8a1df2d727b7ed"
52
  }
53
  },
54
  "failures": []
@@ -59,27 +63,27 @@
59
  "local": {
60
  "path": "/Users/chaoyue/Documents/Codex/2026-05-29/i-am-learning-this-dataset-https/working_repo_copy/docs/data/evidence_contract.json",
61
  "exists": true,
62
- "bytes": 7148,
63
- "sha256": "7be0e996d5acec81b26eba19919ff92f951241c22189086c484b055c7f988bed"
64
  },
65
  "mirrors": {
66
  "hf_space": {
67
  "path": "/Users/chaoyue/Documents/Codex/2026-05-29/i-am-learning-this-dataset-https/hf_publish/space/data/evidence_contract.json",
68
  "exists": true,
69
- "bytes": 7148,
70
- "sha256": "7be0e996d5acec81b26eba19919ff92f951241c22189086c484b055c7f988bed"
71
  },
72
  "hf_artifacts": {
73
  "path": "/Users/chaoyue/Documents/Codex/2026-05-29/i-am-learning-this-dataset-https/hf_publish/artifacts/docs/data/evidence_contract.json",
74
  "exists": true,
75
- "bytes": 7148,
76
- "sha256": "7be0e996d5acec81b26eba19919ff92f951241c22189086c484b055c7f988bed"
77
  },
78
  "hf_model": {
79
  "path": "/Users/chaoyue/Documents/Codex/2026-05-29/i-am-learning-this-dataset-https/hf_publish/model/metrics/evidence_contract.json",
80
  "exists": true,
81
- "bytes": 7148,
82
- "sha256": "7be0e996d5acec81b26eba19919ff92f951241c22189086c484b055c7f988bed"
83
  }
84
  },
85
  "failures": []
@@ -152,27 +156,27 @@
152
  "local": {
153
  "path": "/Users/chaoyue/Documents/Codex/2026-05-29/i-am-learning-this-dataset-https/working_repo_copy/docs/data/publication_audit.json",
154
  "exists": true,
155
- "bytes": 4105,
156
- "sha256": "ce4addc653c34287da1f529f526362fb791ad2a07d0e6610f617c4c8e1cf9597"
157
  },
158
  "mirrors": {
159
  "hf_space": {
160
  "path": "/Users/chaoyue/Documents/Codex/2026-05-29/i-am-learning-this-dataset-https/hf_publish/space/data/publication_audit.json",
161
  "exists": true,
162
- "bytes": 4105,
163
- "sha256": "ce4addc653c34287da1f529f526362fb791ad2a07d0e6610f617c4c8e1cf9597"
164
  },
165
  "hf_artifacts": {
166
  "path": "/Users/chaoyue/Documents/Codex/2026-05-29/i-am-learning-this-dataset-https/hf_publish/artifacts/docs/data/publication_audit.json",
167
  "exists": true,
168
- "bytes": 4105,
169
- "sha256": "ce4addc653c34287da1f529f526362fb791ad2a07d0e6610f617c4c8e1cf9597"
170
  },
171
  "hf_model": {
172
  "path": "/Users/chaoyue/Documents/Codex/2026-05-29/i-am-learning-this-dataset-https/hf_publish/model/metrics/publication_audit.json",
173
  "exists": true,
174
- "bytes": 4105,
175
- "sha256": "ce4addc653c34287da1f529f526362fb791ad2a07d0e6610f617c4c8e1cf9597"
176
  }
177
  },
178
  "failures": []
@@ -308,26 +312,26 @@
308
  "path": "/Users/chaoyue/Documents/Codex/2026-05-29/i-am-learning-this-dataset-https/working_repo_copy/docs/data/scope_claims_audit.json",
309
  "exists": true,
310
  "bytes": 19964,
311
- "sha256": "a1a90b04b8bd11e751a34a9ca27676dbe543b6a3bf2454807abf861a91ce33b4"
312
  },
313
  "mirrors": {
314
  "hf_space": {
315
  "path": "/Users/chaoyue/Documents/Codex/2026-05-29/i-am-learning-this-dataset-https/hf_publish/space/data/scope_claims_audit.json",
316
  "exists": true,
317
  "bytes": 19964,
318
- "sha256": "a1a90b04b8bd11e751a34a9ca27676dbe543b6a3bf2454807abf861a91ce33b4"
319
  },
320
  "hf_artifacts": {
321
  "path": "/Users/chaoyue/Documents/Codex/2026-05-29/i-am-learning-this-dataset-https/hf_publish/artifacts/docs/data/scope_claims_audit.json",
322
  "exists": true,
323
  "bytes": 19964,
324
- "sha256": "a1a90b04b8bd11e751a34a9ca27676dbe543b6a3bf2454807abf861a91ce33b4"
325
  },
326
  "hf_model": {
327
  "path": "/Users/chaoyue/Documents/Codex/2026-05-29/i-am-learning-this-dataset-https/hf_publish/model/metrics/scope_claims_audit.json",
328
  "exists": true,
329
  "bytes": 19964,
330
- "sha256": "a1a90b04b8bd11e751a34a9ca27676dbe543b6a3bf2454807abf861a91ce33b4"
331
  }
332
  },
333
  "failures": []
@@ -401,26 +405,26 @@
401
  "path": "/Users/chaoyue/Documents/Codex/2026-05-29/i-am-learning-this-dataset-https/working_repo_copy/docs/data/website_integrity.json",
402
  "exists": true,
403
  "bytes": 5936,
404
- "sha256": "0ba08b7d03c5513520d2900d57cd383f24e228a5d9d55b6a89e8d3419594c55f"
405
  },
406
  "mirrors": {
407
  "hf_space": {
408
  "path": "/Users/chaoyue/Documents/Codex/2026-05-29/i-am-learning-this-dataset-https/hf_publish/space/data/website_integrity.json",
409
  "exists": true,
410
  "bytes": 5936,
411
- "sha256": "0ba08b7d03c5513520d2900d57cd383f24e228a5d9d55b6a89e8d3419594c55f"
412
  },
413
  "hf_artifacts": {
414
  "path": "/Users/chaoyue/Documents/Codex/2026-05-29/i-am-learning-this-dataset-https/hf_publish/artifacts/docs/data/website_integrity.json",
415
  "exists": true,
416
  "bytes": 5936,
417
- "sha256": "0ba08b7d03c5513520d2900d57cd383f24e228a5d9d55b6a89e8d3419594c55f"
418
  },
419
  "hf_model": {
420
  "path": "/Users/chaoyue/Documents/Codex/2026-05-29/i-am-learning-this-dataset-https/hf_publish/model/metrics/website_integrity.json",
421
  "exists": true,
422
  "bytes": 5936,
423
- "sha256": "0ba08b7d03c5513520d2900d57cd383f24e228a5d9d55b6a89e8d3419594c55f"
424
  }
425
  },
426
  "failures": []
@@ -801,21 +805,21 @@
801
  "local": {
802
  "path": "/Users/chaoyue/Documents/Codex/2026-05-29/i-am-learning-this-dataset-https/working_repo_copy/scripts/build_artifact_index.py",
803
  "exists": true,
804
- "bytes": 11565,
805
- "sha256": "d57875b1e42a58c02aa2f7da481f7b2190b82414113827883dd8b332c33552f3"
806
  },
807
  "mirrors": {
808
  "hf_artifacts": {
809
  "path": "/Users/chaoyue/Documents/Codex/2026-05-29/i-am-learning-this-dataset-https/hf_publish/artifacts/scripts/build_artifact_index.py",
810
  "exists": true,
811
- "bytes": 11565,
812
- "sha256": "d57875b1e42a58c02aa2f7da481f7b2190b82414113827883dd8b332c33552f3"
813
  },
814
  "hf_model": {
815
  "path": "/Users/chaoyue/Documents/Codex/2026-05-29/i-am-learning-this-dataset-https/hf_publish/model/scripts/build_artifact_index.py",
816
  "exists": true,
817
- "bytes": 11565,
818
- "sha256": "d57875b1e42a58c02aa2f7da481f7b2190b82414113827883dd8b332c33552f3"
819
  }
820
  },
821
  "failures": []
@@ -826,21 +830,21 @@
826
  "local": {
827
  "path": "/Users/chaoyue/Documents/Codex/2026-05-29/i-am-learning-this-dataset-https/working_repo_copy/scripts/validate_mirror_parity.py",
828
  "exists": true,
829
- "bytes": 6971,
830
- "sha256": "d0e0a1514a6c8548120f8bcb68827a648252a198197acab9186b72725fe9d39b"
831
  },
832
  "mirrors": {
833
  "hf_artifacts": {
834
  "path": "/Users/chaoyue/Documents/Codex/2026-05-29/i-am-learning-this-dataset-https/hf_publish/artifacts/scripts/validate_mirror_parity.py",
835
  "exists": true,
836
- "bytes": 6971,
837
- "sha256": "d0e0a1514a6c8548120f8bcb68827a648252a198197acab9186b72725fe9d39b"
838
  },
839
  "hf_model": {
840
  "path": "/Users/chaoyue/Documents/Codex/2026-05-29/i-am-learning-this-dataset-https/hf_publish/model/scripts/validate_mirror_parity.py",
841
  "exists": true,
842
- "bytes": 6971,
843
- "sha256": "d0e0a1514a6c8548120f8bcb68827a648252a198197acab9186b72725fe9d39b"
844
  }
845
  },
846
  "failures": []
@@ -851,21 +855,21 @@
851
  "local": {
852
  "path": "/Users/chaoyue/Documents/Codex/2026-05-29/i-am-learning-this-dataset-https/working_repo_copy/scripts/validate_publication_package.py",
853
  "exists": true,
854
- "bytes": 8960,
855
- "sha256": "129e1276a60abe1330de5190622097a0e19198d133d434425317123f0a390c82"
856
  },
857
  "mirrors": {
858
  "hf_artifacts": {
859
  "path": "/Users/chaoyue/Documents/Codex/2026-05-29/i-am-learning-this-dataset-https/hf_publish/artifacts/scripts/validate_publication_package.py",
860
  "exists": true,
861
- "bytes": 8960,
862
- "sha256": "129e1276a60abe1330de5190622097a0e19198d133d434425317123f0a390c82"
863
  },
864
  "hf_model": {
865
  "path": "/Users/chaoyue/Documents/Codex/2026-05-29/i-am-learning-this-dataset-https/hf_publish/model/scripts/validate_publication_package.py",
866
  "exists": true,
867
- "bytes": 8960,
868
- "sha256": "129e1276a60abe1330de5190622097a0e19198d133d434425317123f0a390c82"
869
  }
870
  },
871
  "failures": []
@@ -919,6 +923,31 @@
919
  }
920
  },
921
  "failures": []
922
  }
923
  ],
924
  "failures": []
 
1
  {
2
  "status": "pass",
3
+ "generated_at_utc": "2026-06-01T05:08:43+00:00",
4
  "hf_root": "/Users/chaoyue/Documents/Codex/2026-05-29/i-am-learning-this-dataset-https/hf_publish",
5
  "summary": {
6
+ "group_count": 29,
7
  "failure_count": 0,
8
  "failures_by_surface": {}
9
  },
 
19
  {
20
  "name": "repo_hf_validator_script_parity",
21
  "status": "pass"
22
+ },
23
+ {
24
+ "name": "repo_hf_website_html_parity",
25
+ "status": "pass"
26
  }
27
  ],
28
  "groups": [
 
32
  "local": {
33
  "path": "/Users/chaoyue/Documents/Codex/2026-05-29/i-am-learning-this-dataset-https/working_repo_copy/docs/data/artifact_index.json",
34
  "exists": true,
35
+ "bytes": 12916,
36
+ "sha256": "977e2d8d0ec9e42bee1fb7b43b9460b42a6d8a6d6e9a452389901b8d56d69372"
37
  },
38
  "mirrors": {
39
  "hf_space": {
40
  "path": "/Users/chaoyue/Documents/Codex/2026-05-29/i-am-learning-this-dataset-https/hf_publish/space/data/artifact_index.json",
41
  "exists": true,
42
+ "bytes": 12916,
43
+ "sha256": "977e2d8d0ec9e42bee1fb7b43b9460b42a6d8a6d6e9a452389901b8d56d69372"
44
  },
45
  "hf_artifacts": {
46
  "path": "/Users/chaoyue/Documents/Codex/2026-05-29/i-am-learning-this-dataset-https/hf_publish/artifacts/docs/data/artifact_index.json",
47
  "exists": true,
48
+ "bytes": 12916,
49
+ "sha256": "977e2d8d0ec9e42bee1fb7b43b9460b42a6d8a6d6e9a452389901b8d56d69372"
50
  },
51
  "hf_model": {
52
  "path": "/Users/chaoyue/Documents/Codex/2026-05-29/i-am-learning-this-dataset-https/hf_publish/model/metrics/artifact_index.json",
53
  "exists": true,
54
+ "bytes": 12916,
55
+ "sha256": "977e2d8d0ec9e42bee1fb7b43b9460b42a6d8a6d6e9a452389901b8d56d69372"
56
  }
57
  },
58
  "failures": []
 
63
  "local": {
64
  "path": "/Users/chaoyue/Documents/Codex/2026-05-29/i-am-learning-this-dataset-https/working_repo_copy/docs/data/evidence_contract.json",
65
  "exists": true,
66
+ "bytes": 7182,
67
+ "sha256": "42a75b0f87eec02dd5b5fedffe6eb3d0cdc8d9f12156887680686f1900ac2bfa"
68
  },
69
  "mirrors": {
70
  "hf_space": {
71
  "path": "/Users/chaoyue/Documents/Codex/2026-05-29/i-am-learning-this-dataset-https/hf_publish/space/data/evidence_contract.json",
72
  "exists": true,
73
+ "bytes": 7182,
74
+ "sha256": "42a75b0f87eec02dd5b5fedffe6eb3d0cdc8d9f12156887680686f1900ac2bfa"
75
  },
76
  "hf_artifacts": {
77
  "path": "/Users/chaoyue/Documents/Codex/2026-05-29/i-am-learning-this-dataset-https/hf_publish/artifacts/docs/data/evidence_contract.json",
78
  "exists": true,
79
+ "bytes": 7182,
80
+ "sha256": "42a75b0f87eec02dd5b5fedffe6eb3d0cdc8d9f12156887680686f1900ac2bfa"
81
  },
82
  "hf_model": {
83
  "path": "/Users/chaoyue/Documents/Codex/2026-05-29/i-am-learning-this-dataset-https/hf_publish/model/metrics/evidence_contract.json",
84
  "exists": true,
85
+ "bytes": 7182,
86
+ "sha256": "42a75b0f87eec02dd5b5fedffe6eb3d0cdc8d9f12156887680686f1900ac2bfa"
87
  }
88
  },
89
  "failures": []
 
156
  "local": {
157
  "path": "/Users/chaoyue/Documents/Codex/2026-05-29/i-am-learning-this-dataset-https/working_repo_copy/docs/data/publication_audit.json",
158
  "exists": true,
159
+ "bytes": 4214,
160
+ "sha256": "3d1a2d861c96d445541519494abfcfca1da13cb593094a8c660ad40a036ab218"
161
  },
162
  "mirrors": {
163
  "hf_space": {
164
  "path": "/Users/chaoyue/Documents/Codex/2026-05-29/i-am-learning-this-dataset-https/hf_publish/space/data/publication_audit.json",
165
  "exists": true,
166
+ "bytes": 4214,
167
+ "sha256": "3d1a2d861c96d445541519494abfcfca1da13cb593094a8c660ad40a036ab218"
168
  },
169
  "hf_artifacts": {
170
  "path": "/Users/chaoyue/Documents/Codex/2026-05-29/i-am-learning-this-dataset-https/hf_publish/artifacts/docs/data/publication_audit.json",
171
  "exists": true,
172
+ "bytes": 4214,
173
+ "sha256": "3d1a2d861c96d445541519494abfcfca1da13cb593094a8c660ad40a036ab218"
174
  },
175
  "hf_model": {
176
  "path": "/Users/chaoyue/Documents/Codex/2026-05-29/i-am-learning-this-dataset-https/hf_publish/model/metrics/publication_audit.json",
177
  "exists": true,
178
+ "bytes": 4214,
179
+ "sha256": "3d1a2d861c96d445541519494abfcfca1da13cb593094a8c660ad40a036ab218"
180
  }
181
  },
182
  "failures": []
 
312
  "path": "/Users/chaoyue/Documents/Codex/2026-05-29/i-am-learning-this-dataset-https/working_repo_copy/docs/data/scope_claims_audit.json",
313
  "exists": true,
314
  "bytes": 19964,
315
+ "sha256": "5520aa2b2c41ed9394283e8bf08be0ec1926b2851a952ba8a8a56a1f85a058eb"
316
  },
317
  "mirrors": {
318
  "hf_space": {
319
  "path": "/Users/chaoyue/Documents/Codex/2026-05-29/i-am-learning-this-dataset-https/hf_publish/space/data/scope_claims_audit.json",
320
  "exists": true,
321
  "bytes": 19964,
322
+ "sha256": "5520aa2b2c41ed9394283e8bf08be0ec1926b2851a952ba8a8a56a1f85a058eb"
323
  },
324
  "hf_artifacts": {
325
  "path": "/Users/chaoyue/Documents/Codex/2026-05-29/i-am-learning-this-dataset-https/hf_publish/artifacts/docs/data/scope_claims_audit.json",
326
  "exists": true,
327
  "bytes": 19964,
328
+ "sha256": "5520aa2b2c41ed9394283e8bf08be0ec1926b2851a952ba8a8a56a1f85a058eb"
329
  },
330
  "hf_model": {
331
  "path": "/Users/chaoyue/Documents/Codex/2026-05-29/i-am-learning-this-dataset-https/hf_publish/model/metrics/scope_claims_audit.json",
332
  "exists": true,
333
  "bytes": 19964,
334
+ "sha256": "5520aa2b2c41ed9394283e8bf08be0ec1926b2851a952ba8a8a56a1f85a058eb"
335
  }
336
  },
337
  "failures": []
 
405
  "path": "/Users/chaoyue/Documents/Codex/2026-05-29/i-am-learning-this-dataset-https/working_repo_copy/docs/data/website_integrity.json",
406
  "exists": true,
407
  "bytes": 5936,
408
+ "sha256": "b9c324a59e447a11bc6aeb5130736788981b9a1b529a80c988378e3f05f924b1"
409
  },
410
  "mirrors": {
411
  "hf_space": {
412
  "path": "/Users/chaoyue/Documents/Codex/2026-05-29/i-am-learning-this-dataset-https/hf_publish/space/data/website_integrity.json",
413
  "exists": true,
414
  "bytes": 5936,
415
+ "sha256": "b9c324a59e447a11bc6aeb5130736788981b9a1b529a80c988378e3f05f924b1"
416
  },
417
  "hf_artifacts": {
418
  "path": "/Users/chaoyue/Documents/Codex/2026-05-29/i-am-learning-this-dataset-https/hf_publish/artifacts/docs/data/website_integrity.json",
419
  "exists": true,
420
  "bytes": 5936,
421
+ "sha256": "b9c324a59e447a11bc6aeb5130736788981b9a1b529a80c988378e3f05f924b1"
422
  },
423
  "hf_model": {
424
  "path": "/Users/chaoyue/Documents/Codex/2026-05-29/i-am-learning-this-dataset-https/hf_publish/model/metrics/website_integrity.json",
425
  "exists": true,
426
  "bytes": 5936,
427
+ "sha256": "b9c324a59e447a11bc6aeb5130736788981b9a1b529a80c988378e3f05f924b1"
428
  }
429
  },
430
  "failures": []
 
805
  "local": {
806
  "path": "/Users/chaoyue/Documents/Codex/2026-05-29/i-am-learning-this-dataset-https/working_repo_copy/scripts/build_artifact_index.py",
807
  "exists": true,
808
+ "bytes": 11579,
809
+ "sha256": "874a3813fb3a19d79be9ea4c0177f5922adf9e667760f927dd49163784eb6b48"
810
  },
811
  "mirrors": {
812
  "hf_artifacts": {
813
  "path": "/Users/chaoyue/Documents/Codex/2026-05-29/i-am-learning-this-dataset-https/hf_publish/artifacts/scripts/build_artifact_index.py",
814
  "exists": true,
815
+ "bytes": 11579,
816
+ "sha256": "874a3813fb3a19d79be9ea4c0177f5922adf9e667760f927dd49163784eb6b48"
817
  },
818
  "hf_model": {
819
  "path": "/Users/chaoyue/Documents/Codex/2026-05-29/i-am-learning-this-dataset-https/hf_publish/model/scripts/build_artifact_index.py",
820
  "exists": true,
821
+ "bytes": 11579,
822
+ "sha256": "874a3813fb3a19d79be9ea4c0177f5922adf9e667760f927dd49163784eb6b48"
823
  }
824
  },
825
  "failures": []
 
830
  "local": {
831
  "path": "/Users/chaoyue/Documents/Codex/2026-05-29/i-am-learning-this-dataset-https/working_repo_copy/scripts/validate_mirror_parity.py",
832
  "exists": true,
833
+ "bytes": 7617,
834
+ "sha256": "0a74954e50fbf7bff661c9499244fc9be704764b701431fc2035ab4cc29d43d0"
835
  },
836
  "mirrors": {
837
  "hf_artifacts": {
838
  "path": "/Users/chaoyue/Documents/Codex/2026-05-29/i-am-learning-this-dataset-https/hf_publish/artifacts/scripts/validate_mirror_parity.py",
839
  "exists": true,
840
+ "bytes": 7617,
841
+ "sha256": "0a74954e50fbf7bff661c9499244fc9be704764b701431fc2035ab4cc29d43d0"
842
  },
843
  "hf_model": {
844
  "path": "/Users/chaoyue/Documents/Codex/2026-05-29/i-am-learning-this-dataset-https/hf_publish/model/scripts/validate_mirror_parity.py",
845
  "exists": true,
846
+ "bytes": 7617,
847
+ "sha256": "0a74954e50fbf7bff661c9499244fc9be704764b701431fc2035ab4cc29d43d0"
848
  }
849
  },
850
  "failures": []
 
855
  "local": {
856
  "path": "/Users/chaoyue/Documents/Codex/2026-05-29/i-am-learning-this-dataset-https/working_repo_copy/scripts/validate_publication_package.py",
857
  "exists": true,
858
+ "bytes": 9772,
859
+ "sha256": "1a915bdd68a6c63941339282a8f747e4cafa08c24e5cdb3dbe105bf6ac3ea144"
860
  },
861
  "mirrors": {
862
  "hf_artifacts": {
863
  "path": "/Users/chaoyue/Documents/Codex/2026-05-29/i-am-learning-this-dataset-https/hf_publish/artifacts/scripts/validate_publication_package.py",
864
  "exists": true,
865
+ "bytes": 9772,
866
+ "sha256": "1a915bdd68a6c63941339282a8f747e4cafa08c24e5cdb3dbe105bf6ac3ea144"
867
  },
868
  "hf_model": {
869
  "path": "/Users/chaoyue/Documents/Codex/2026-05-29/i-am-learning-this-dataset-https/hf_publish/model/scripts/validate_publication_package.py",
870
  "exists": true,
871
+ "bytes": 9772,
872
+ "sha256": "1a915bdd68a6c63941339282a8f747e4cafa08c24e5cdb3dbe105bf6ac3ea144"
873
  }
874
  },
875
  "failures": []
 
923
  }
924
  },
925
  "failures": []
926
+ },
927
+ {
928
+ "name": "website/index.html",
929
+ "status": "pass",
930
+ "local": {
931
+ "path": "/Users/chaoyue/Documents/Codex/2026-05-29/i-am-learning-this-dataset-https/working_repo_copy/docs/index.html",
932
+ "exists": true,
933
+ "bytes": 89653,
934
+ "sha256": "f4d2b412d24bb29e977e8b82bb531fdb352cc7a1b81a2141ac63a0328bab654b"
935
+ },
936
+ "mirrors": {
937
+ "hf_space": {
938
+ "path": "/Users/chaoyue/Documents/Codex/2026-05-29/i-am-learning-this-dataset-https/hf_publish/space/index.html",
939
+ "exists": true,
940
+ "bytes": 89653,
941
+ "sha256": "f4d2b412d24bb29e977e8b82bb531fdb352cc7a1b81a2141ac63a0328bab654b"
942
+ },
943
+ "hf_artifacts_docs": {
944
+ "path": "/Users/chaoyue/Documents/Codex/2026-05-29/i-am-learning-this-dataset-https/hf_publish/artifacts/docs/index.html",
945
+ "exists": true,
946
+ "bytes": 89653,
947
+ "sha256": "f4d2b412d24bb29e977e8b82bb531fdb352cc7a1b81a2141ac63a0328bab654b"
948
+ }
949
+ },
950
+ "failures": []
951
  }
952
  ],
953
  "failures": []
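Reviewer note: each group in `metrics/mirror_parity.json` pairs a `local` entry with one or more `mirrors`, and a group passes when every mirror reports the same `bytes` and `sha256` as the local copy. The sketch below re-checks that property for a single group. The helper name `group_mismatches` is hypothetical and not part of the repository; the sample values are copied from the `website/index.html` group above, with the `path` fields omitted for brevity.

```python
def group_mismatches(group: dict) -> list[str]:
    """Return the names of mirrors whose size or hash differs from the local copy."""
    local = group["local"]
    mismatched = []
    for mirror_name, mirror in group["mirrors"].items():
        same = (
            mirror.get("exists") is True
            and mirror.get("bytes") == local.get("bytes")
            and mirror.get("sha256") == local.get("sha256")
        )
        if not same:
            mismatched.append(mirror_name)
    return mismatched


INDEX_SHA = "f4d2b412d24bb29e977e8b82bb531fdb352cc7a1b81a2141ac63a0328bab654b"

# Shaped like the "website/index.html" group in the report (paths omitted).
example_group = {
    "name": "website/index.html",
    "local": {"exists": True, "bytes": 89653, "sha256": INDEX_SHA},
    "mirrors": {
        "hf_space": {"exists": True, "bytes": 89653, "sha256": INDEX_SHA},
        "hf_artifacts_docs": {"exists": True, "bytes": 89653, "sha256": INDEX_SHA},
    },
    "failures": [],
}

print(group_mismatches(example_group))
```

An empty list here agrees with the group's `"failures": []`; any mirror name in the result would indicate drift between the repo and that mirror.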
metrics/publication_audit.json CHANGED
@@ -1,6 +1,6 @@
 {
   "status": "pass",
-  "generated_at_utc": "2026-06-01T04:49:16+00:00",
+  "generated_at_utc": "2026-06-01T05:07:53+00:00",
   "checks": [
     {
       "name": "required_publication_assets_present",
@@ -26,6 +26,11 @@
       "name": "no_hf_tokens_in_public_text",
       "status": "pass",
       "count": 0
+    },
+    {
+      "name": "no_stale_task_suite_presentation_copy",
+      "status": "pass",
+      "count": 0
     }
   ],
   "required_assets": {
metrics/scope_claims_audit.json CHANGED
@@ -1,6 +1,6 @@
 {
   "status": "pass",
-  "generated_at_utc": "2026-06-01T04:43:33+00:00",
+  "generated_at_utc": "2026-06-01T05:06:13+00:00",
   "summary": {
     "qwen3_omni_32_episode_claim": false,
     "dataset_manifest_num_episodes": 1,
metrics/website_integrity.json CHANGED
@@ -1,6 +1,6 @@
 {
   "status": "pass",
-  "generated_at_utc": "2026-06-01T04:48:47+00:00",
+  "generated_at_utc": "2026-06-01T05:06:42+00:00",
   "docs_root": "/Users/chaoyue/Documents/Codex/2026-05-29/i-am-learning-this-dataset-https/working_repo_copy/docs",
   "site_base": "/ropedia-xperience-10m-task-suite/",
   "summary": {
@@ -56,7 +56,7 @@
     },
     {
       "path": "data/evidence_contract.json",
-      "bytes": 7148,
+      "bytes": 7182,
       "top_level_type": "dict"
     },
     {
scripts/build_artifact_index.py CHANGED
@@ -90,7 +90,7 @@ ARTIFACTS = [
         "kind": "mirror_parity",
         "surface": "website_hf",
         "volatile": True,
-        "proves": "Confirms prepared GitHub/HF Space/artifact/model mirrors share the same critical data, figure, and validator files.",
+        "proves": "Confirms prepared GitHub/HF Space/artifact/model mirrors share the same critical data, figure, website HTML, and validator files.",
     },
     {
         "id": "website_integrity",
scripts/validate_mirror_parity.py CHANGED
@@ -56,6 +56,10 @@ SCRIPT_FILES = [
     "validate_website_integrity.py",
 ]
 
+WEBSITE_FILES = [
+    "index.html",
+]
+
 
 def sha256(path: Path) -> str:
     digest = hashlib.sha256()
@@ -150,6 +154,18 @@ def build_report(hf_root: Path) -> dict:
             )
         )
 
+    for filename in WEBSITE_FILES:
+        groups.append(
+            parity_group(
+                f"website/{filename}",
+                ROOT / "docs" / filename,
+                {
+                    "hf_space": hf_root / "space" / filename,
+                    "hf_artifacts_docs": hf_root / "artifacts/docs" / filename,
+                },
+            )
+        )
+
     failures = [
         {"group": group["name"], **failure}
         for group in groups
@@ -187,6 +203,12 @@ def build_report(hf_root: Path) -> dict:
                 if not any(failure["group"].startswith("scripts/") for failure in failures)
                 else "fail",
             },
+            {
+                "name": "repo_hf_website_html_parity",
+                "status": "pass"
+                if not any(failure["group"].startswith("website/") for failure in failures)
+                else "fail",
+            },
         ],
         "groups": groups,
         "failures": failures,
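Reviewer note: the diff calls `parity_group(name, local_path, mirrors)` but does not show its body. The sketch below is one plausible shape for such a helper, reconstructed only from the report fields visible in `metrics/mirror_parity.json` (`path`, `exists`, `bytes`, `sha256`, `status`, `failures`). The names `sha256_of`, `describe`, and `parity_group_sketch`, and the `mirror`/`reason` keys inside each failure, are assumptions for illustration and may not match the repository's implementation.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hash a file in chunks so large artifacts are never loaded whole."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def describe(path: Path) -> dict:
    """Collect the per-file facts that appear in each report entry."""
    if not path.is_file():
        return {"path": str(path), "exists": False}
    return {
        "path": str(path),
        "exists": True,
        "bytes": path.stat().st_size,
        "sha256": sha256_of(path),
    }


def parity_group_sketch(name: str, local: Path, mirrors: dict[str, Path]) -> dict:
    """Compare one local file against its mirrors (hypothetical helper)."""
    local_info = describe(local)
    mirror_info = {key: describe(path) for key, path in mirrors.items()}
    failures = []
    if not local_info["exists"]:
        failures.append({"mirror": "local", "reason": "missing"})
    for key, info in mirror_info.items():
        if not info["exists"]:
            failures.append({"mirror": key, "reason": "missing"})
        elif local_info["exists"] and info["sha256"] != local_info["sha256"]:
            failures.append({"mirror": key, "reason": "sha256_mismatch"})
    return {
        "name": name,
        "status": "pass" if not failures else "fail",
        "local": local_info,
        "mirrors": mirror_info,
        "failures": failures,
    }


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "index.html").write_bytes(b"<html></html>")
        (root / "mirror.html").write_bytes(b"<html></html>")
        report = parity_group_sketch(
            "website/index.html", root / "index.html", {"hf_space": root / "mirror.html"}
        )
        print(report["status"])
```

Comparing hashes rather than sizes alone catches same-length edits, which matters for HTML where a changed cache key can leave the byte count unchanged.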
scripts/validate_publication_package.py CHANGED
@@ -42,6 +42,10 @@ TEXT_SUFFIXES = {
     ".yml",
 }
 TOKEN_PATTERN = re.compile(r"hf_[A-Za-z0-9]{20,}")
+STALE_PRESENTATION_STRINGS = {
+    "xperience10m-" + "modalities-v9-large-atlas": "old task-suite infographic cache key",
+    "Start with the large native " + "modality atlas": "old suite-section hierarchy copy",
+}
 
 
 def rel(path: Path, base: Path) -> str:
@@ -114,6 +118,13 @@ def scan(root: Path, *, paths: list[Path] | None = None) -> dict:
             continue
         if TOKEN_PATTERN.search(text):
             violations.append({"kind": "possible_hf_token", "path": path_rel})
+        for needle, reason in STALE_PRESENTATION_STRINGS.items():
+            if needle in text:
+                violations.append({
+                    "kind": "stale_presentation_copy",
+                    "path": path_rel,
+                    "detail": reason,
+                })
 
     return {
         "root": str(root),
@@ -222,6 +233,11 @@ def build_report(hf_root: Path) -> dict:
             "status": "pass" if not any(v["kind"] == "possible_hf_token" for v in violations) else "fail",
             "count": sum(1 for v in violations if v["kind"] == "possible_hf_token"),
         },
+        {
+            "name": "no_stale_task_suite_presentation_copy",
+            "status": "pass" if not any(v["kind"] == "stale_presentation_copy" for v in violations) else "fail",
+            "count": sum(1 for v in violations if v["kind"] == "stale_presentation_copy"),
+        },
     ]
     status = "pass" if all(check["status"] == "pass" for check in checks) else "fail"
     return {
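Reviewer note: the new check scans public text for known-stale strings and reports a pass only when none are found. The sketch below isolates that logic on in-memory text so it can be exercised without the full `scan` function. `find_stale_copy` and `stale_copy_check` are hypothetical names, and only one of the two needles is reproduced. The needles in the diff are written as concatenated literals; a plausible reading, not confirmed by the commit, is that this keeps the validator's own source from matching itself.

```python
# Needle assembled from two literals, mirroring the style used in the diff.
STALE_STRINGS = {
    "xperience10m-" + "modalities-v9-large-atlas": "old task-suite infographic cache key",
}


def find_stale_copy(texts: dict[str, str]) -> list[dict]:
    """Return one violation per (file, needle) pair found in the given texts."""
    violations = []
    for path_rel, text in texts.items():
        for needle, reason in STALE_STRINGS.items():
            if needle in text:
                violations.append(
                    {"kind": "stale_presentation_copy", "path": path_rel, "detail": reason}
                )
    return violations


def stale_copy_check(violations: list[dict]) -> dict:
    """Summarise violations in the same shape as the report's check entries."""
    count = sum(1 for v in violations if v["kind"] == "stale_presentation_copy")
    return {
        "name": "no_stale_task_suite_presentation_copy",
        "status": "pass" if count == 0 else "fail",
        "count": count,
    }


# Hypothetical page contents keyed by relative path.
pages = {
    "docs/index.html": "<img src='xperience10m-modalities-v9-large-atlas.png'>",
    "docs/about.html": "<p>Current task baseline cards.</p>",
}
print(stale_copy_check(find_stale_copy(pages)))
```

Because the match is a plain substring test, it flags exact leftovers only; reworded stale copy would need its own needle.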