Task Method 20-Result Matrix
Every method has one record for each of the 20 unified task contracts. Numeric scores appear only where a committed runner or verified package produced that task target.
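Legend: score = direct numeric task score; proxy = documented compact substitute target. The current public matrix is complete at 180/180 scored records (9 methods × 20 tasks), 6 of which are proxy scored. In the summary table below, the Scored column includes proxy-scored records. The unsupported and not-evaluated labels are retained only for future regression audits.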
| Method | Records | Scored | Proxy scored | Scoreless | Status counts |
|---|---|---|---|---|---|
| Minimal | 20 | 20 | 0 | 0 | scored 20 |
| Neural MLP | 20 | 20 | 0 | 0 | scored 20 |
| 128ep Aligned Simple | 20 | 20 | 1 | 0 | proxy scored 1, scored 19 |
| 128ep Aligned NN | 20 | 20 | 1 | 0 | proxy scored 1, scored 19 |
| 128ep Raw Simple | 20 | 20 | 2 | 0 | proxy scored 2, scored 18 |
| 128ep Raw NN | 20 | 20 | 2 | 0 | proxy scored 2, scored 18 |
| Qwen3-Omni v6 LoRA | 20 | 20 | 0 | 0 | scored 20 |
| Cosmos3-Super Reasoner | 20 | 20 | 0 | 0 | scored 20 |
| Cosmos3-Nano Future Window | 20 | 20 | 0 | 0 | scored 20 |
Compact Score Matrix
Each cell lists, in order, the raw metric value with its direct/proxy label, the normalized radar value, and the metric key, separated by semicolons. Cite the raw metric; the normalized value is the exact linear 0-1 score retained in JSON. The SVG radar uses sqrt(normalized score) for the visual radius only, so low but real differences remain visible without changing the table values.
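The square-root display mapping described above can be sketched as follows. This is a minimal illustration, not the project's plotting code: the function name `radar_radius` and the clamping to the 0-1 range are assumptions made for the example.

```python
import math


def radar_radius(normalized: float) -> float:
    """Map a linear 0-1 score to a display radius using sqrt.

    Hypothetical helper: only the drawn radius changes; the stored
    normalized score is left untouched.
    """
    # Assumption: clamp defensively so sqrt never sees a negative input.
    clipped = min(max(normalized, 0.0), 1.0)
    return math.sqrt(clipped)


if __name__ == "__main__":
    # Small scores are stretched outward; 0 and 1 stay fixed.
    for norm in (0.003, 0.050, 0.250, 1.000):
        print(f"norm {norm:.3f} -> radius {radar_radius(norm):.3f}")
```

Because sqrt is monotonic on the 0-1 range, the ordering of methods on each radar axis is preserved; only the spacing between low scores is widened.
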
| # | Task | Min | NN | 128-S | 128-NN | 128-RS | 128-RN | Qwen3 | C3-S | C3-N |
|---|---|---|---|---|---|---|---|---|---|---|
| 01 | Action Recognition | 0.0500 direct; norm 0.050; macro_f1 | 0.0148 direct; norm 0.015; macro_f1 | 0.0083 direct; norm 0.008; macro_f1 | 0.0042 direct; norm 0.004; macro_f1 | 0.0029 direct; norm 0.003; macro_f1 | 0.0015 direct; norm 0.001; macro_f1 | 0.0029 direct; norm 0.003; action_macro_f1 | 0.0008 direct; norm 0.001; action_macro_f1 | 0.0079 direct; norm 0.008; action_accuracy_from_retrieved_future |
| 02 | Procedure Step Recognition | 0.0506 direct; norm 0.051; macro_f1 | 0.0281 direct; norm 0.028; macro_f1 | 0.0002 direct; norm 0.000; macro_f1 | 0.0001 direct; norm 0.000; macro_f1 | 0.0000 direct; norm 0.000; macro_f1 | 0.0001 direct; norm 0.000; macro_f1 | 0.0037 direct; norm 0.004; subtask_accuracy | 0.0000 direct; norm 0.000; subtask_accuracy | 0.0000 direct; norm 0.000; timeline_subtask_macro_f1 |
| 03 | Action Boundary Detection | 0.6118 direct; norm 0.612; macro_f1 | 0.5862 direct; norm 0.586; macro_f1 | 0.2965 direct; norm 0.297; macro_f1 | 0.4842 direct; norm 0.484; macro_f1 | 0.4204 direct; norm 0.420; macro_f1 | 0.4902 direct; norm 0.490; macro_f1 | 0.9898 direct; norm 0.990; transition_accuracy | 0.3683 direct; norm 0.368; transition_accuracy | 0.9683 direct; norm 0.968; transition_accuracy |
| 04 | Next-Action Prediction | 0.0593 direct; norm 0.059; macro_f1 | 0.0419 direct; norm 0.042; macro_f1 | 0.0065 direct; norm 0.007; macro_f1 | 0.0049 direct; norm 0.005; macro_f1 | 0.0033 direct; norm 0.003; macro_f1 | 0.0018 direct; norm 0.002; macro_f1 | 0.0431 direct; norm 0.043; next_action_accuracy | 0.0134 direct; norm 0.013; next_action_accuracy | 0.0079 direct; norm 0.008; action_accuracy_from_retrieved_future |
| 05 | Hand Trajectory Forecasting | 0.8647 direct; norm 0.125; mpjpe | 0.1079 direct; norm 1.000; mpjpe | 8.817 direct; norm 0.012; mpjpe | 0.4294 direct; norm 0.251; mpjpe | 0.2729 direct; norm 0.395; mae | 0.1848 direct; norm 0.584; mae | 0.7216 direct; norm 0.149; hand_trajectory_forecast_mrr | 0.8915 direct; norm 0.121; hand_trajectory_forecast_mrr | 0.6913 direct; norm 0.156; hand_trajectory_forecast_mrr |
| 06 | Contact State Prediction | 1.000 direct; norm 1.000; macro_f1 | 1.000 direct; norm 1.000; macro_f1 | 0.4381 direct; norm 0.438; macro_f1 | 0.5683 direct; norm 0.568; macro_f1 | 0.8870 direct; norm 0.887; macro_f1 | 1.000 direct; norm 1.000; macro_f1 | 0.8177 direct; norm 0.818; contact_accuracy | 0.3214 direct; norm 0.321; contact_accuracy | 0.7434 direct; norm 0.743; contact_accuracy |
| 07 | Object Relevance Prediction | 0.1803 direct; norm 0.180; micro_f1 | 0.1679 direct; norm 0.168; micro_f1 | 0.1776 direct; norm 0.178; micro_f1 | 0.1866 direct; norm 0.187; micro_f1 | 0.0655 direct; norm 0.066; micro_f1 | 0.1766 direct; norm 0.177; micro_f1 | 0.3065 direct; norm 0.306; object_micro_f1 | 0.1370 direct; norm 0.137; object_micro_f1 | 0.0005 direct; norm 0.000; object_relevance_micro_f1 |
| 08 | Language Grounding | 0.0160 direct; norm 0.016; mrr | 0.0168 direct; norm 0.017; mrr | 0.0023 direct; norm 0.002; mrr | 0.0082 direct; norm 0.008; mrr | 0.0111 direct; norm 0.011; mrr | 0.0063 direct; norm 0.006; mrr | 0.8764 direct; norm 0.876; caption_grounding_mrr | 0.3064 direct; norm 0.306; caption_grounding_iou | 0.5221 direct; norm 0.522; caption_grounding_mrr |
| 09 | Cross-Modal Retrieval | 0.2693 direct; norm 0.269; mrr | 0.1300 direct; norm 0.130; mrr | 0.0026 direct; norm 0.003; mrr | 0.0026 direct; norm 0.003; mrr | 0.0035 direct; norm 0.003; mrr | 0.0025 direct; norm 0.003; mrr | 0.5080 direct; norm 0.508; cross_modal_retrieval_mrr | 0.6628 direct; norm 0.663; cross_modal_retrieval_mrr | 0.0221 direct; norm 0.022; future_retrieval_mrr |
| 10 | Cross-Modal Reconstruction | -0.0153 direct; norm 0.000; r2 | -0.0102 direct; norm 0.000; r2 | -190.66 direct; norm 0.000; r2 | -0.4348 direct; norm 0.000; r2 | -1.345 direct; norm 0.000; r2 | -1.397 direct; norm 0.000; r2 | 0.9671 direct; norm 0.967; modality_reconstruction_mrr | 0.9939 direct; norm 0.994; modality_reconstruction_mrr | 0.0003 direct; norm 0.000; feature_reconstruction_quality |
| 11 | Temporal Order Verification | 0.5400 direct; norm 0.540; f1 | 0.8520 direct; norm 0.852; f1 | 0.4199 direct; norm 0.420; f1 | 0.8252 direct; norm 0.825; f1 | 0.4982 direct; norm 0.498; macro_f1 | 0.8030 direct; norm 0.803; macro_f1 | 0.4098 direct; norm 0.410; temporal_order_f1 | 0.6286 direct; norm 0.629; temporal_order_f1 | 0.5954 direct; norm 0.595; temporal_order_f1 |
| 12 | Multimodal Synchronization Detection | 0.5052 direct; norm 0.505; f1 | 0.7153 direct; norm 0.715; f1 | 0.4998 direct; norm 0.500; f1 | 0.7774 direct; norm 0.777; f1 | 0.4959 direct; norm 0.496; macro_f1 | 0.8273 direct; norm 0.827; macro_f1 | 0.3345 direct; norm 0.334; misalignment_detection_f1 | 0.3727 direct; norm 0.373; misalignment_detection_f1 | 0.4772 direct; norm 0.477; misalignment_detection_f1 |
| 13 | Long-Horizon Next-Action Forecasting | 0.0750 direct; norm 0.075; macro_f1 | 0.0655 direct; norm 0.065; macro_f1 | 0.0046 direct; norm 0.005; macro_f1 | 0.0030 direct; norm 0.003; macro_f1 | 0.0024 direct; norm 0.002; macro_f1 | 0.0011 direct; norm 0.001; macro_f1 | 0.0023 direct; norm 0.002; long_horizon_next_action_macro_f1 | 0.0088 direct; norm 0.009; long_horizon_next_action_macro_f1 | 0.0025 direct; norm 0.002; long_horizon_next_action_macro_f1 |
| 14 | Long-Horizon Next-Subtask Forecasting | 0.0455 direct; norm 0.045; macro_f1 | 0.0507 direct; norm 0.051; macro_f1 | 0.0001 direct; norm 0.000; macro_f1 | 0.0000 direct; norm 0.000; macro_f1 | 0.0000 direct; norm 0.000; macro_f1 | 0.0000 direct; norm 0.000; macro_f1 | 0.0042 direct; norm 0.004; next_subtask_forecast_macro_f1 | 0.0000 direct; norm 0.000; next_subtask_forecast_macro_f1 | 0.0066 direct; norm 0.007; next_subtask_forecast_macro_f1 |
| 15 | Interaction Text Prediction | 0.0444 direct; norm 0.044; macro_f1 | 0.0381 direct; norm 0.038; macro_f1 | 0.0000 direct; norm 0.000; macro_f1 | 0.0000 direct; norm 0.000; macro_f1 | 0.0126 proxy; norm 0.013; macro_f1 | 0.0098 proxy; norm 0.010; macro_f1 | 0.4319 direct; norm 0.432; macro_f1 | 0.1795 direct; norm 0.179; macro_f1 | 0.1788 direct; norm 0.179; macro_f1 |
| 16 | Action-Object Relation Prediction | 0.0000 direct; norm 0.000; macro_f1 | 0.0000 direct; norm 0.000; macro_f1 | 0.0000 direct; norm 0.000; macro_f1 | 0.0000 direct; norm 0.000; macro_f1 | 0.0000 direct; norm 0.000; macro_f1 | 0.0000 direct; norm 0.000; macro_f1 | 0.0002 direct; norm 0.000; action_object_relation_macro_f1 | 0.0000 direct; norm 0.000; action_object_relation_macro_f1 | 0.0028 direct; norm 0.003; action_object_relation_macro_f1 |
| 17 | Future Object-Set Forecasting | 0.1694 direct; norm 0.169; micro_f1 | 0.1972 direct; norm 0.197; micro_f1 | 0.1766 direct; norm 0.177; micro_f1 | 0.1742 direct; norm 0.174; micro_f1 | 0.0647 direct; norm 0.065; micro_f1 | 0.1752 direct; norm 0.175; micro_f1 | 0.1659 direct; norm 0.166; object_set_forecast_micro_f1 | 0.0009 direct; norm 0.001; object_set_forecast_micro_f1 | 0.0178 direct; norm 0.018; object_set_forecast_micro_f1 |
| 18 | IMU-to-Hand Pose Reconstruction | 0.0420 direct; norm 1.000; mae | 0.0426 direct; norm 0.988; mae | 0.2295 direct; norm 0.183; mae | 0.2556 direct; norm 0.165; mae | 0.2294 direct; norm 0.183; mae | 0.2530 direct; norm 0.166; mae | 0.9642 direct; norm 0.044; imu_to_hand_pose_mrr | 0.9897 direct; norm 0.042; imu_to_hand_pose_mrr | 0.9920 direct; norm 0.042; imu_to_hand_pose_mrr |
| 19 | Camera-View Synchronization Retrieval | 0.4943 direct; norm 0.494; mrr | 0.2409 direct; norm 0.241; mrr | 0.0021 proxy; norm 0.002; mrr | 0.0027 proxy; norm 0.003; mrr | 0.0027 proxy; norm 0.003; mrr | 0.0025 proxy; norm 0.003; mrr | 0.6588 direct; norm 0.659; camera_view_sync_retrieval_mrr | 0.9980 direct; norm 0.998; camera_view_sync_retrieval_mrr | 0.9990 direct; norm 0.999; camera_view_sync_retrieval_mrr |
| 20 | Time-to-Next-Transition Regression | 10.54 direct; norm 1.000; mae | 10.55 direct; norm 0.998; mae | 624.81 direct; norm 0.017; mae | 41.47 direct; norm 0.254; mae | 52.33 direct; norm 0.201; mae | 42.37 direct; norm 0.249; mae | 134.07 direct; norm 0.079; time_to_transition_mae | 52.95 direct; norm 0.199; time_to_transition_mae | 33.81 direct; norm 0.312; time_to_transition_mae |
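For readers who want to process the cells programmatically, the cell layout used in this matrix (raw value, direct/proxy label, normalized value, metric key) can be split with a small parser. This is a sketch only: `Cell`, `CELL_RE`, and `parse_cell` are hypothetical names introduced here, and the pattern is inferred from the cells shown in this table rather than from a published schema.

```python
import re
from typing import NamedTuple


class Cell(NamedTuple):
    """One matrix cell (hypothetical container for this example)."""
    raw: float    # raw metric value, the number to cite
    kind: str     # "direct" or "proxy"
    norm: float   # normalized 0-1 radar value
    metric: str   # metric key, e.g. "macro_f1"


# Assumed layout: "<raw> <direct|proxy>; norm <value>; <metric_key>"
CELL_RE = re.compile(
    r"^\s*(-?\d+(?:\.\d+)?)\s+(direct|proxy);"
    r"\s*norm\s+(\d+(?:\.\d+)?);\s*(\w+)\s*$"
)


def parse_cell(text: str) -> Cell:
    """Parse one cell string; raise ValueError if the layout differs."""
    match = CELL_RE.match(text)
    if match is None:
        raise ValueError(f"unrecognized cell: {text!r}")
    raw, kind, norm, metric = match.groups()
    return Cell(float(raw), kind, float(norm), metric)


if __name__ == "__main__":
    print(parse_cell("0.0126 proxy; norm 0.013; macro_f1"))
    print(parse_cell("-190.66 direct; norm 0.000; r2"))
```

Keeping the metric key alongside each value makes it easy to check that two cells share a metric before comparing their raw numbers.
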
Status Matrix
| # | Task | Min | NN | 128-S | 128-NN | 128-RS | 128-RN | Qwen3 | C3-S | C3-N |
|---|---|---|---|---|---|---|---|---|---|---|
| 01 | Action Recognition | score | score | score | score | score | score | score | score | score |
| 02 | Procedure Step Recognition | score | score | score | score | score | score | score | score | score |
| 03 | Action Boundary Detection | score | score | score | score | score | score | score | score | score |
| 04 | Next-Action Prediction | score | score | score | score | score | score | score | score | score |
| 05 | Hand Trajectory Forecasting | score | score | score | score | score | score | score | score | score |
| 06 | Contact State Prediction | score | score | score | score | score | score | score | score | score |
| 07 | Object Relevance Prediction | score | score | score | score | score | score | score | score | score |
| 08 | Language Grounding | score | score | score | score | score | score | score | score | score |
| 09 | Cross-Modal Retrieval | score | score | score | score | score | score | score | score | score |
| 10 | Cross-Modal Reconstruction | score | score | score | score | score | score | score | score | score |
| 11 | Temporal Order Verification | score | score | score | score | score | score | score | score | score |
| 12 | Multimodal Synchronization Detection | score | score | score | score | score | score | score | score | score |
| 13 | Long-Horizon Next-Action Forecasting | score | score | score | score | score | score | score | score | score |
| 14 | Long-Horizon Next-Subtask Forecasting | score | score | score | score | score | score | score | score | score |
| 15 | Interaction Text Prediction | score | score | score | score | proxy | proxy | score | score | score |
| 16 | Action-Object Relation Prediction | score | score | score | score | score | score | score | score | score |
| 17 | Future Object-Set Forecasting | score | score | score | score | score | score | score | score | score |
| 18 | IMU-to-Hand Pose Reconstruction | score | score | score | score | score | score | score | score | score |
| 19 | Camera-View Synchronization Retrieval | score | score | proxy | proxy | proxy | proxy | score | score | score |
| 20 | Time-to-Next-Transition Regression | score | score | score | score | score | score | score | score | score |
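The per-method status counts in the summary table can be recomputed from status rows like the ones above. The following is a minimal sketch under stated assumptions: `METHODS` copies the column abbreviations from the table header, while `tally` and the two-row sample input are illustrative and not part of the repository.

```python
from collections import Counter

# Column abbreviations copied from the Status Matrix header.
METHODS = ["Min", "NN", "128-S", "128-NN", "128-RS", "128-RN",
           "Qwen3", "C3-S", "C3-N"]


def tally(rows: list[str]) -> dict[str, Counter]:
    """Count status labels per method from pipe-delimited table rows."""
    counts = {method: Counter() for method in METHODS}
    for row in rows:
        cells = [c.strip() for c in row.strip().strip("|").split("|")]
        statuses = cells[2:]  # skip the "#" and "Task" columns
        if len(statuses) != len(METHODS):
            raise ValueError(f"expected {len(METHODS)} status cells: {row!r}")
        for method, status in zip(METHODS, statuses):
            counts[method][status] += 1
    return counts


if __name__ == "__main__":
    # Only the two rows that contain proxy entries, for brevity.
    sample = [
        "| 15 | Interaction Text Prediction | score | score | score | score "
        "| proxy | proxy | score | score | score |",
        "| 19 | Camera-View Synchronization Retrieval | score | score | proxy "
        "| proxy | proxy | proxy | score | score | score |",
    ]
    for method, counter in tally(sample).items():
        print(method, dict(counter))
```

Run over all 20 rows, the same tally should reproduce the Status counts column of the summary table, which makes it a cheap consistency check between the two tables.
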
Sources and raw values are in docs/data/task_method_20_result_matrix.json and docs/data/unified_task_model_radar.json.