CREDENCE checkpoint for hatexplain (toxicity) with backbone roberta-base (15 concept heads).
Paper: https://huggingface.co/papers/2604.24170 · arXiv: 2604.24170
Training run folder: head_ablation_roberta_20251224_105619
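credence_checkpoint.pt: PyTorch checkpoint (model_state_dict, config, metadata, optimizer state)

```python
from huggingface_hub import hf_hub_download
import torch

# Download the checkpoint file from the Hub (cached locally after the first call).
path = hf_hub_download(repo_id="tankiit/credence-ablation-roberta-n-heads-15-hatexplain", filename="credence_checkpoint.pt")
# weights_only=False unpickles arbitrary Python objects (config, metadata, optimizer state),
# so only load checkpoints from sources you trust.
ckpt = torch.load(path, map_location="cpu", weights_only=False)
state = ckpt["model_state_dict"]
config = ckpt["config"]
```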
Load into your CREDENCE model implementation (see project credence.py).
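Before calling `load_state_dict`, it can help to check that the parameter names in the checkpoint line up with those of your model. The following is a minimal sketch of such a check, not part of the CREDENCE project: `diff_state_keys` is a hypothetical helper, the parameter names in the toy example are made up, and the optional `module.` prefix stripping assumes the checkpoint might have been saved from a `DataParallel`-wrapped model, which this card does not state.

```python
def diff_state_keys(checkpoint_keys, model_keys, strip_prefix="module."):
    """Compare parameter names from a checkpoint against a model's names.

    Hypothetical helper for illustration; works on any iterables of strings.
    """
    def normalize(key):
        # Drop a leading wrapper prefix (e.g. "module.") if one is present.
        if strip_prefix and key.startswith(strip_prefix):
            return key[len(strip_prefix):]
        return key

    ckpt_names = {normalize(k) for k in checkpoint_keys}
    model_names = set(model_keys)
    return {
        "missing_in_checkpoint": sorted(model_names - ckpt_names),
        "unexpected_in_checkpoint": sorted(ckpt_names - model_names),
    }


# Toy example with made-up parameter names (not the real CREDENCE layout).
report = diff_state_keys(
    ["module.backbone.weight", "module.heads.0.bias", "optimizer_step"],
    ["backbone.weight", "heads.0.bias", "classifier.weight"],
)
print(report)
```

With the real checkpoint, the call would look like `diff_state_keys(state.keys(), model.state_dict().keys())`, where `model` is your own CREDENCE instance. Two empty lists mean the names match; tensor shapes can still differ, so `model.load_state_dict(state)` remains the final check.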
@article{mukherjee2026credence,
title={Credal Concept Bottleneck Models for Epistemic--Aleatoric Uncertainty Decomposition},
author={Mukherjee, et al.},
journal={arXiv preprint arXiv:2604.24170},
year={2026}
}
| Metric | Value |
|---|---|
| Accuracy | 0.5811 |
| ρ(epistemic, error) | 0.1387 |
| ρ(aleatoric, unknown) | 0.1847 |
Base model: FacebookAI/roberta-base