ljchang committed on
Commit
db19582
·
verified ·
1 Parent(s): ea34f7e

Upload folder using huggingface_hub

.gitattributes CHANGED
@@ -33,3 +33,7 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
33
  *.zip filter=lfs diff=lfs merge=lfs -text
34
  *.zst filter=lfs diff=lfs merge=lfs -text
35
  *tfevents* filter=lfs diff=lfs merge=lfs -text
36
+ au_compare_overlay.png filter=lfs diff=lfs merge=lfs -text
37
+ au_effect_maps.png filter=lfs diff=lfs merge=lfs -text
38
+ au_solid_neutral_vs_activated.png filter=lfs diff=lfs merge=lfs -text
39
+ au_solid_panels.png filter=lfs diff=lfs merge=lfs -text
README.md ADDED
@@ -0,0 +1,149 @@
1
+ ---
2
+ license: mit
3
+ library_name: py-feat
4
+ tags:
5
+ - face-action-units
6
+ - facs
7
+ - facial-landmarks
8
+ - regression
9
+ - pls
10
+ - mediapipe
11
+ - py-feat
12
+ ---
13
+
14
+ # AU → MP-mesh PLS (visualization)
15
+
16
+ > Predicts the 478-vertex MediaPipe FaceMesh deformation from 20 AU intensities. Trained with pose covariates and pose×AU interactions for cleaner AU coefficients; deploy as 23-d input (AU + pose), pose=0 at inference for the standard canonical-frontal viz.
17
+
18
+ # AU+pose -> MP mesh PLS (v2)
19
+
20
+ Linear PLS regression mapping 20 FACS AU intensities + 3 pose covariates
21
+ (Pitch, Yaw, Roll) to 478×3 = 1434 MediaPipe FaceMesh vertex coordinates in a
22
+ pose-canonical frame. Used to visualize how the MP face mesh deforms as AU
23
+ sliders move.
24
+
25
+ ## Training data
26
+ - 344,418 frames from 9,978 CelebV-HQ celebrity videos
27
+ - Mesh source: MPDetector mp_facemesh_v2 (478-vertex MediaPipe topology)
28
+ - AU source: Detector with img2pose face + xgb AU on the same frames
29
+ - Pose source: MPDetector facial_transformation_matrices (Pitch, Yaw, Roll, radians)
30
+ - Pose-filtered to |yaw| <= 40°, |pitch| <= 30°
31
+ - Per-frame Umeyama similarity Procrustes alignment to a reference template
32
+ using 12 stable upper-face anchors (forehead 10/9/8/151, nose bridge 6/168/197/195,
33
+ outer canthi 33/263, inner canthi 133/362). Removes (R, s, t).
34
+ - Top 1% of frames by max anchor residual dropped (alignment outliers).
35
+ - IMPORTANT: NO per-subject neutral subtraction — predicting absolute aligned
36
+ coords directly is the Cheong / Py-Feat-tutorial-06 recipe. Per-subject
37
+ neutral subtraction was tested and capped R² at 0.083; switching to absolute
38
+ coords raised R² to 0.143 (no-pose) and 0.244 (with pose).
39
+
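The alignment step described above can be sketched in NumPy. This is a minimal sketch of the standard Umeyama closed-form solution, not the training code from this repository: the names `umeyama_similarity`, `align_mesh`, and `ANCHOR_INDICES` are illustrative, and the anchor order simply follows the list above.

```python
import numpy as np

# Anchor vertex indices as listed in the card (order here is an assumption).
ANCHOR_INDICES = np.array([10, 9, 8, 151, 6, 168, 197, 195, 33, 263, 133, 362])

def umeyama_similarity(src, dst):
    """Least-squares (s, R, t) such that dst ~= s * src @ R.T + t."""
    mu_src, mu_dst = src.mean(axis=0), dst.mean(axis=0)
    src_c, dst_c = src - mu_src, dst - mu_dst
    # Cross-covariance between target and source points.
    cov = dst_c.T @ src_c / len(src)
    U, D, Vt = np.linalg.svd(cov)
    # Reflection guard: force det(R) = +1.
    S = np.eye(3)
    if np.linalg.det(U) * np.linalg.det(Vt) < 0:
        S[2, 2] = -1.0
    R = U @ S @ Vt
    var_src = (src_c ** 2).sum() / len(src)
    s = np.trace(np.diag(D) @ S) / var_src
    t = mu_dst - s * R @ mu_src
    return s, R, t

def align_mesh(mesh, reference_anchors, anchor_indices=ANCHOR_INDICES):
    """Align a (478, 3) mesh to the reference frame using only the anchors."""
    s, R, t = umeyama_similarity(mesh[anchor_indices], reference_anchors)
    aligned = s * mesh @ R.T + t
    # Max anchor residual, the quantity used above to drop alignment outliers.
    residual = np.linalg.norm(aligned[anchor_indices] - reference_anchors, axis=1).max()
    return aligned, residual
```

Passing `m["reference_anchors"]` and `m["anchor_indices"]` from the NPZ as the reference should place a new mesh in the same canonical frame, provided the stored anchors follow the same order, which is assumed here rather than verified.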
40
+ ## Method
41
+ - PLSRegression(n_components=83, scale=True), Cheong style
42
+ - Inputs: [20 AU | 3 pose (Pitch, Yaw, Roll)] = 23 features
43
+ - Outputs: 1434-d absolute pose-canonical mesh coords (image-space pixels)
44
+
45
+ ## Out-of-sample performance (3-fold GroupKFold by video_id)
46
+ - **Variance-weighted R² = 0.2443 ± 0.0034** across 1434 dims
47
+ - Per-fold R² = [0.2395, 0.2472, 0.2463]
48
+ - MAE = 0.223 px (canonical-frame image space)
49
+
50
+ R² is modest by absolute standards because AU intensities can't fully describe
51
+ 1434-d mesh deformation (continuous AU intensities from xgb are noisier than
52
+ human FACS coding; many micro-expressions aren't captured by 20 AUs). For
53
+ visualization, qualitative AU-direction correctness matters more than R².
54
+
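For readers who want to reproduce a comparable score, the sketch below shows one common definition of variance-weighted R² across output dimensions, together with a simple group-wise split so that no video contributes frames to both train and test. It is an assumption that the reported number uses this exact definition; `variance_weighted_r2` and `group_folds` are illustrative names, and the split is a simplified stand-in, not scikit-learn's `GroupKFold`.

```python
import numpy as np

def variance_weighted_r2(y_true, y_pred):
    """R² pooled over output dims, each dim weighted by its variance.

    Equivalent to 1 - (total residual sum of squares) / (total sum of squares).
    """
    ss_res = ((y_true - y_pred) ** 2).sum(axis=0)
    ss_tot = ((y_true - y_true.mean(axis=0)) ** 2).sum(axis=0)
    return 1.0 - ss_res.sum() / ss_tot.sum()

def group_folds(groups, n_splits=3, seed=0):
    """Yield (train_mask, test_mask) pairs with whole groups held out together."""
    rng = np.random.default_rng(seed)
    unique = rng.permutation(np.unique(groups))
    for fold_groups in np.array_split(unique, n_splits):
        test_mask = np.isin(groups, fold_groups)
        yield ~test_mask, test_mask
```

Weighting by variance means high-motion vertices (for example around the mouth) dominate the score, while near-static vertices contribute little.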
55
+ ## Inference
56
+
57
+ ```python
58
+ import numpy as np
59
+ m = np.load("au_to_mesh_pls_v2.npz")
60
+ au = np.zeros(20); au[m["au_columns"].tolist().index("AU12")] = 1.0 # smile
61
+ pose = np.zeros(3) # [pitch, yaw, roll]
62
+ x = np.concatenate([au, pose]) # (23,)
63
+ flat = x @ m["coef"] + m["intercept"] # (1434,)
64
+ # IMPORTANT: layout is axis-major [all x | all y | all z], NOT interleaved
65
+ mesh = np.stack([flat[:478], flat[478:956], flat[956:]], axis=1) # (478, 3)
66
+ # Render with mediapipe.solutions.face_mesh.FACEMESH_TESSELATION
67
+ # Optionally apply user-chosen rigid (R, s, t) post-hoc.
68
+ ```
69
+
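Because the axis-major layout is easy to get wrong, a pair of small helpers can make it explicit. The sketch below assumes only the shapes stated in this card (23 inputs, 1434 outputs, 478 vertices); the helper names are illustrative and not part of py-feat.

```python
import numpy as np

N_VERTS = 478

def unflatten_axis_major(flat, n_verts=N_VERTS):
    """[all x | all y | all z] of length 3*n_verts -> (n_verts, 3)."""
    return np.asarray(flat).reshape(3, n_verts).T

def flatten_axis_major(mesh):
    """(n_verts, 3) -> [all x | all y | all z]."""
    return np.asarray(mesh).T.reshape(-1)

def predict_meshes(X, coef, intercept, n_verts=N_VERTS):
    """Batch prediction: X is (n, 23); returns (n, n_verts, 3)."""
    flat = np.atleast_2d(X) @ coef + intercept          # (n, 3*n_verts)
    return flat.reshape(len(flat), 3, n_verts).transpose(0, 2, 1)
```

Note that `flat.reshape(478, 3)` would silently produce a scrambled mesh, since that reshape treats the vector as interleaved `[x0, y0, z0, x1, ...]`.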
70
+ The NPZ also includes:
71
+ - `mean_aligned_mesh` (478, 3) — population mean canonical mesh ("AU=0" face)
72
+ - `mean_low_au_mesh` (478, 3) — mean of low-AU-sum frames (cleaner neutral)
73
+
74
+ ## Citation context
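One plausible use of the stored mean meshes is as a neutral reference for measuring how far each vertex moves under a given AU setting. The sketch below is an illustration under that assumption; `top_moving_vertices` is a hypothetical helper, and whether `mean_low_au_mesh` or `mean_aligned_mesh` is the better baseline depends on the visualization.

```python
import numpy as np

def top_moving_vertices(mesh, neutral, k=10):
    """Return indices and distances of the k vertices that moved most.

    mesh, neutral: (n_verts, 3) arrays in the same canonical frame.
    """
    disp = np.linalg.norm(mesh - neutral, axis=1)   # per-vertex displacement
    order = np.argsort(disp)[::-1][:k]              # largest first
    return order, disp[order]
```

For example, calling it with a predicted AU12 mesh and `m["mean_low_au_mesh"]` should highlight vertices near the mouth corners if the model has learned the expected direction.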
75
+ Adapts Cheong et al. 2023 / Py-Feat tutorial 06 (E. Jolly): affine-aligned 68
76
+ dlib landmarks + pose -> 20 AUs via PLS. Direction inverted (AU+pose -> mesh)
77
+ and scaled to MP's 478-vertex mesh on 10x larger wild-celebrity data.
78
+
79
+ ## File format
80
+ NPZ with:
81
+ - coef (23, 1434) float32 — linear weights, rows match input_columns
82
+ - intercept (1434,) float32 — bias
83
+ - input_columns (23,) str — input feature order: AU01..AU43, Pitch, Yaw, Roll
84
+ - au_columns (20,) str — convenience: just the AU subset
85
+ - pose_columns (3,) str — convenience: just the pose subset
86
+ - mean_aligned_mesh (478, 3) float32 — population mean (viz default)
87
+ - mean_low_au_mesh (478, 3) float32 — low-AU-frame mean (cleaner neutral)
88
+ - reference_anchors (12, 3) float32 — Procrustes reference
89
+ - anchor_indices (12,) int32 — MP indices used as anchors
90
+ - n_components () int32
91
+ - model_card () str
92
+ - training_metadata () str (JSON)
93
+
94
+ Loader: np.load("au_to_mesh_pls_v2.npz") — no extra deps needed.
95
+
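A quick sanity check of a loaded file against the layout listed above can catch version mismatches early. The sketch below takes any mapping of key to array, so it can be tried without the real file; the expected shapes are copied from this card, and `check_npz_layout` is an illustrative name.

```python
import numpy as np

# Array shapes as documented in the file-format list above.
EXPECTED_SHAPES = {
    "coef": (23, 1434),
    "intercept": (1434,),
    "input_columns": (23,),
    "au_columns": (20,),
    "pose_columns": (3,),
    "mean_aligned_mesh": (478, 3),
    "mean_low_au_mesh": (478, 3),
    "reference_anchors": (12, 3),
    "anchor_indices": (12,),
}

def check_npz_layout(arrays):
    """Return a list of problems; an empty list means the layout matches."""
    problems = []
    for key, shape in EXPECTED_SHAPES.items():
        if key not in arrays:
            problems.append(f"missing key: {key}")
        elif np.shape(arrays[key]) != shape:
            problems.append(f"{key}: expected {shape}, got {np.shape(arrays[key])}")
    if not problems:
        # The card states the input order is the AUs followed by Pitch, Yaw, Roll.
        expected_order = list(arrays["au_columns"]) + list(arrays["pose_columns"])
        if list(arrays["input_columns"]) != expected_order:
            problems.append("input_columns is not au_columns followed by pose_columns")
    return problems
```

With the real file this would be called as `check_npz_layout(dict(np.load("au_to_mesh_pls_v2.npz")))`, which should work if the string fields were saved as fixed-width Unicode arrays (an assumption; object arrays would need `allow_pickle=True`).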
96
+
97
+ ## Applying pose post-hoc (optional)
98
+
99
+ The model is trained with pose covariates as input but at inference users
100
+ typically pass `pose=0` to get the canonical (frontal) deformation. To render
101
+ the result at any chosen head pose, apply a rigid transform AFTER prediction:
102
+
103
+ ```python
104
+ import numpy as np
105
+ from scipy.spatial.transform import Rotation
106
+ m = np.load("au_to_mesh_pls_v2.npz")
107
+
108
+ # 1) Predict canonical mesh from AU vector at pose=0:
109
+ au = np.zeros(20); au[m["au_columns"].tolist().index("AU12")] = 1.0
110
+ x = np.concatenate([au, np.zeros(3)]) # (23,)
111
+ flat = x @ m["coef"] + m["intercept"] # (1434,)
112
+ # IMPORTANT: layout is axis-major [all x | all y | all z], NOT interleaved
113
+ canonical_mesh = np.stack([flat[:478], flat[478:956], flat[956:]], axis=1) # (478, 3)
114
+
115
+ pitch, yaw, roll = 0.0, 0.3, 0.0  # 2) user-chosen rigid pose in radians (example values)
116
+ R = Rotation.from_euler("xyz", [pitch, yaw, roll]).as_matrix() # (3, 3)
117
+ posed_mesh = canonical_mesh @ R.T # (478, 3)
118
+ # Or, to re-project onto an image with a chosen scale s and translation t: s * (canonical_mesh @ R.T) + t
119
+ ```
120
+
121
+ **Two ways to control pose at inference:**
122
+ - **Render-time rotation** (recommended for clean separation): set `pose=0` in
123
+ the input vector, then rotate the predicted mesh post-hoc as above.
124
+ - **Pose-conditioned prediction** (if you want the model's residual pose-correlated
125
+ bias): pass non-zero pose values directly into the input vector. The model was
126
+ trained with pose+pose×AU interactions, so non-zero pose changes the prediction
127
+ in addition to the rigid rotation needed at render time.
128
+
129
+ For most use cases, render-time rotation is cleaner — the model's AU-driven
130
+ deformation stays canonical and pose is purely a viewing transform.
131
+
132
+ **Convention notes**:
133
+ - MP canonical: y-axis UP, x-axis to subject's left, z-axis out of face.
134
+ Standard right-handed.
135
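+ - `Rotation.from_euler("xyz", ...)` uses extrinsic (fixed-axis) rotations in SciPy, i.e. R = Rz·Ry·Rx; uppercase `"XYZ"` gives intrinsic R = Rx·Ry·Rz.
136
+ Check which of the two matches the Pitch/Yaw/Roll convention of your pose source (e.g. img2pose) before rendering.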
137
+ - If you have MP's full 4x4 `facial_transformation_matrix` (M), apply via
138
+ `posed = (np.concatenate([canonical, np.ones((478, 1))], axis=1) @ M.T)[:, :3]`.
139
+
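The rotation-order and 4x4 notes above can be checked numerically without SciPy. The sketch below builds the elementary rotations by hand, composes them in the two possible orders, and shows that the homogeneous form is the same operation as `s * (mesh @ R.T) + t`. In SciPy's `Rotation.from_euler`, lowercase axis letters denote extrinsic rotations and uppercase letters intrinsic ones, so the two orders correspond to `"xyz"` and `"XYZ"` respectively. All helper names are illustrative.

```python
import numpy as np

def rot_x(a):
    c, s = np.cos(a), np.sin(a)
    return np.array([[1, 0, 0], [0, c, -s], [0, s, c]])

def rot_y(a):
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, 0, s], [0, 1, 0], [-s, 0, c]])

def rot_z(a):
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, -s, 0], [s, c, 0], [0, 0, 1]])

def euler_extrinsic_xyz(pitch, yaw, roll):
    """Fixed-axis x, then y, then z: R = Rz @ Ry @ Rx (SciPy lowercase "xyz")."""
    return rot_z(roll) @ rot_y(yaw) @ rot_x(pitch)

def euler_intrinsic_xyz(pitch, yaw, roll):
    """Body-axis x, then y, then z: R = Rx @ Ry @ Rz (SciPy uppercase "XYZ")."""
    return rot_x(pitch) @ rot_y(yaw) @ rot_z(roll)

def similarity_to_matrix(R, s=1.0, t=(0.0, 0.0, 0.0)):
    """Pack scale, rotation, translation into a 4x4 homogeneous matrix."""
    M = np.eye(4)
    M[:3, :3] = s * R
    M[:3, 3] = t
    return M

def apply_matrix(mesh, M):
    """Apply a 4x4 transform to an (n, 3) mesh using row vectors."""
    homo = np.concatenate([mesh, np.ones((len(mesh), 1))], axis=1)
    return (homo @ M.T)[:, :3]
```

The two Euler orders agree when only one angle is non-zero, which is why a convention mismatch can go unnoticed until yaw and pitch are combined.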
140
+ ## Figures
141
+
142
+ ![au_solid_panels.png](./au_solid_panels.png)
143
+
144
+ ![au_solid_neutral_vs_activated.png](./au_solid_neutral_vs_activated.png)
145
+
146
+ ![au_compare_overlay.png](./au_compare_overlay.png)
147
+
148
+ ![au_effect_maps.png](./au_effect_maps.png)
149
+
au_compare_overlay.png ADDED

Git LFS Details

  • SHA256: 4c5b336ca1a3e28f3c472ff36c5b47a8598883cef176d86607f7a6344bf1ae6b
  • Pointer size: 131 Bytes
  • Size of remote file: 784 kB
au_effect_maps.png ADDED

Git LFS Details

  • SHA256: 9f950aacdf2365e3d0cc12161804e7d2d43d10a75520f072cc6e5a7faedba8e9
  • Pointer size: 131 Bytes
  • Size of remote file: 903 kB
au_solid_neutral_vs_activated.png ADDED

Git LFS Details

  • SHA256: 10fd2b2d29044f92ea4fa1dd59ad7560ff18356bfa29de5aa9d7349951efaa4c
  • Pointer size: 132 Bytes
  • Size of remote file: 1.22 MB
au_solid_panels.png ADDED

Git LFS Details

  • SHA256: 3500ad55ade43b3027c7b6567c4ad464ba213ebec04a23c6f6065c0ecd21a681
  • Pointer size: 132 Bytes
  • Size of remote file: 1.45 MB
au_to_mesh_pls_v2.npz ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:b1fe7d16576333ded008c18d3c7efc97c23bbdefa3b6ee5e13c64f4b1abe1fe1
3
+ size 148084