---
language: en
license: mit
tags:
- text-classification
- argumentation
- fallacy-detection
- argument-scheme
- roberta
- pytorch
datasets:
- EthiX
- Macagno
metrics:
- f1
- accuracy
model-index:
- name: ArgueBot Unified Argument & Fallacy Classifier
  results:
  - task:
      type: text-classification
    metrics:
    - type: loss
      value: 1.8465
      name: Validation loss
---
# ArgueBot — Unified Argument Scheme & Fallacy Classifier
A fine-tuned **RoBERTa-large** model that classifies text into one of **24 categories**:
- **11 argument scheme types** (valid argumentative patterns from Walton's taxonomy)
- **13 logical fallacy types** (common informal fallacies)
The model determines both *whether* an argument is valid or fallacious *and* which specific type it is — in a single inference pass.
---
## Model Details
| Property | Value |
|---|---|
| Base model | `roberta-large` |
| Task | 24-class text classification |
| Scheme classes | 11 |
| Fallacy classes | 13 |
| Datasets | EthiX + Macagno (argument schemes), Fallacy dataset (13 types) |
| Epochs trained | 5 (early stopping) |
| Best validation loss | 1.8465 |
---
## Labels
### ✅ Argument Schemes (valid arguments)
- `argument from alternatives`
- `argument from analogy`
- `argument from cause to effect`
- `argument from commitment`
- `argument from example`
- `argument from expert opinion`
- `argument from negative consequences`
- `argument from positive consequences`
- `argument from practical reasoning`
- `argument from sign`
- `argument from values`
### ⚡ Fallacy Types
- `ad hominem`
- `ad populum`
- `appeal to emotion`
- `circular reasoning`
- `equivocation`
- `fallacy of credibility`
- `fallacy of extension`
- `fallacy of logic`
- `fallacy of relevance`
- `false causality`
- `false dilemma`
- `faulty generalization`
- `intentional`
---
## How to Use
```python
import requests
import torch
from transformers import RobertaTokenizer, RobertaForSequenceClassification

model_id = "your-username/arguebot-argument-fallacy-classifier"
tokenizer = RobertaTokenizer.from_pretrained(model_id)
model = RobertaForSequenceClassification.from_pretrained(model_id)
model.eval()

# Load label metadata: id -> label name, plus which ids are schemes vs. fallacies
meta = requests.get(
    f"https://huggingface.co/{model_id}/resolve/main/metadata.json"
).json()
label_map = {int(k): v for k, v in meta["label_map"].items()}
scheme_ids = set(meta["scheme_ids"])
fallacy_ids = set(meta["fallacy_ids"])


def predict(text):
    # Tokenize one input; texts longer than 128 tokens are truncated
    enc = tokenizer(text, return_tensors="pt",
                    truncation=True, max_length=128)
    with torch.no_grad():
        logits = model(**enc).logits
    # Softmax over the class dimension, then drop the batch dimension
    probs = torch.softmax(logits, dim=1).squeeze(0)
    pred_id = int(probs.argmax())
    label = label_map[pred_id]
    verdict = "Valid Argument" if pred_id in scheme_ids else "Fallacy"
    return {
        "verdict": verdict,
        "label": label,
        "confidence": round(probs[pred_id].item(), 4),
    }


print(predict("According to NASA, global temperatures will rise 2°C by 2050."))
# {'verdict': 'Valid Argument', 'label': 'argument from expert opinion', 'confidence': 0.94}
print(predict("Don't trust him — he was caught lying before."))
# {'verdict': 'Fallacy', 'label': 'ad hominem', 'confidence': 0.88}
```
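
Because some labels are hard to tell apart, it can help to look at the runner-up classes as well as the top prediction. The following is a minimal sketch, not part of the released code: `top_k_labels` is a hypothetical helper that works on a plain list of probabilities (for example `probs.tolist()` inside `predict`), and the four-class label map and probabilities are made up for illustration. The real ids come from `metadata.json`.

```python
def top_k_labels(probs, label_map, scheme_ids, k=3):
    """Return the k most probable labels with their verdict and probability."""
    # Rank class ids from most to least probable
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return [
        {
            "label": label_map[i],
            "verdict": "Valid Argument" if i in scheme_ids else "Fallacy",
            "confidence": round(probs[i], 4),
        }
        for i in ranked[:k]
    ]


# Hypothetical 4-class example; the real ids come from metadata.json.
demo_label_map = {
    0: "argument from expert opinion",
    1: "argument from sign",
    2: "ad hominem",
    3: "fallacy of credibility",
}
demo_scheme_ids = {0, 1}
demo_probs = [0.55, 0.05, 0.10, 0.30]  # made-up probabilities

top2 = top_k_labels(demo_probs, demo_label_map, demo_scheme_ids, k=2)
for row in top2:
    print(row)
```

A close second-place fallacy next to a top-ranked scheme (as in this made-up case) is a hint that the verdict deserves a manual look.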
---
## Training Details
- **Deduplication**: exact + near-duplicate removal, min 5 words per sample
- **Class balancing**: scikit-learn's `compute_class_weight(class_weight="balanced", ...)` + weighted cross-entropy loss
- **Batch strategy**: custom `InterleavedSampler` alternates scheme/fallacy samples per batch
- **Early stopping**: patience=3, min_delta=0.001, monitor=`val_loss`
- **Optimiser**: AdamW, lr=3e-05, weight_decay=0.01
- **Scheduler**: linear warmup (15% of steps)
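
The class-balancing and early-stopping settings above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the training code used for this model: `balanced_class_weights` reproduces the formula behind scikit-learn's "balanced" mode (`n_samples / (n_classes * class_count)`), and `EarlyStopping` is a hypothetical helper that uses the patience and min_delta values listed above. The label list and loss values are made up for the example.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {c: n_samples / (n_classes * k) for c, k in counts.items()}


class EarlyStopping:
    """Hypothetical helper: stop once val_loss has not improved by at
    least min_delta for `patience` consecutive epochs."""

    def __init__(self, patience=3, min_delta=0.001):
        self.patience = patience
        self.min_delta = min_delta
        self.best = float("inf")
        self.bad_epochs = 0

    def step(self, val_loss):
        """Record one epoch's val_loss; return True if training should stop."""
        if val_loss < self.best - self.min_delta:
            self.best = val_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience


# Made-up label list: class "a" is three times as frequent as class "b".
weights = balanced_class_weights(["a", "a", "a", "b"])

# Made-up validation losses: the best epoch is the second one.
stopper = EarlyStopping(patience=3, min_delta=0.001)
stopped_at = None
for epoch, loss in enumerate([1.0, 0.8, 0.81, 0.85, 0.9, 0.95], start=1):
    if stopper.step(loss):
        stopped_at = epoch
        break

print(weights, stopped_at)
```

Weights like these would typically be ordered by class id and passed to `torch.nn.CrossEntropyLoss(weight=...)`. With patience=3, training stops three epochs after the last improvement in validation loss.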
---
## Intended Uses
- Debate analysis and argumentation quality assessment
- Educational tools for teaching critical thinking and informal logic
- AI-assisted fact-checking and media literacy tools
- Research in computational argumentation
## Limitations
- Trained on English text only
- Short texts (fewer than 5 words) may produce unreliable predictions, since training samples contained at least 5 words
- Some fallacy types (e.g. `intentional`, `equivocation`) are harder to distinguish without broader context
- Not suitable for legal or medical decision-making
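
One way to respect the short-text limitation is to check the word count before calling the model. The helper below is a hypothetical sketch, not part of the released code; it assumes whitespace-separated words and reuses the 5-word minimum mentioned above.

```python
def long_enough(text, min_words=5):
    """Return True if the text has at least min_words whitespace-separated words."""
    return len(text.split()) >= min_words


samples = [
    "He lied.",
    "According to NASA, global temperatures will rise 2°C by 2050.",
]
# Keep only the inputs that meet the minimum length
usable = [s for s in samples if long_enough(s)]
print(usable)
```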
---
## Citation
If you use this model in your research, please cite:
```bibtex
@misc{arguebot2025,
title = {ArgueBot: Unified Argument Scheme and Fallacy Classification},
author = {Isabel},
year = {2025},
url = {https://huggingface.co/your-username/arguebot-argument-fallacy-classifier}
}
```
---
*Built with RoBERTa-large · EthiX + Macagno datasets · Walton's Argumentation Schemes*