	---
	language: en
	license: mit
	tags:
	- text-classification
	- argumentation
	- fallacy-detection
	- argument-scheme
	- roberta
	- pytorch
	datasets:
	- EthiX
	- Macagno
	metrics:
	- f1
	- accuracy
	model-index:
	- name: ArgueBot Unified Argument & Fallacy Classifier
	  results:
	  - task:
	      type: text-classification
	    metrics:
	    - type: loss
	      value: 1.8465
	      name: Validation loss
	---

	# ArgueBot — Unified Argument Scheme & Fallacy Classifier

	A fine-tuned RoBERTa-large model that classifies text into one of 24 categories:
	- 11 argument scheme types (valid argumentative patterns from Walton's taxonomy)
	- 13 logical fallacy types (common informal fallacies)

	The model determines both whether an argument is valid or fallacious and which specific type it is — in a single inference pass.

	---

	## Model Details

	| Property | Value |
	|---|---|
	| Base model | `roberta-large` |
	| Task | 24-class text classification |
	| Scheme classes | 11 |
	| Fallacy classes | 13 |
	| Datasets | EthiX + Macagno (argument schemes), Fallacy dataset (13 types) |
	| Epochs trained | 5 (early stopping) |
	| Best validation metric | 1.8465 (`val_loss`) |

	---

	## Labels

	### ✅ Argument Schemes (valid arguments)
	- `argument from alternatives`
	- `argument from analogy`
	- `argument from cause to effect`
	- `argument from commitment`
	- `argument from example`
	- `argument from expert opinion`
	- `argument from negative consequences`
	- `argument from positive consequences`
	- `argument from practical reasoning`
	- `argument from sign`
	- `argument from values`

	### ⚡ Fallacy Types
	- `ad hominem`
	- `ad populum`
	- `appeal to emotion`
	- `circular reasoning`
	- `equivocation`
	- `fallacy of credibility`
	- `fallacy of extension`
	- `fallacy of logic`
	- `fallacy of relevance`
	- `false causality`
	- `false dilemma`
	- `faulty generalization`
	- `intentional`

	---

	## How to Use

	```python
	import requests
	import torch
	from transformers import RobertaTokenizer, RobertaForSequenceClassification

	model_id = "your-username/arguebot-argument-fallacy-classifier"
	tokenizer = RobertaTokenizer.from_pretrained(model_id)
	model = RobertaForSequenceClassification.from_pretrained(model_id)
	model.eval()  # disable dropout for inference

	# Load label metadata (class id -> label name, plus scheme/fallacy id lists)
	meta = requests.get(
	    f"https://huggingface.co/{model_id}/resolve/main/metadata.json",
	    timeout=30,
	).json()
	label_map = {int(k): v for k, v in meta["label_map"].items()}
	scheme_ids = set(meta["scheme_ids"])
	fallacy_ids = set(meta["fallacy_ids"])

	def predict(text):
	    """Classify one text and report verdict, label, and confidence."""
	    enc = tokenizer(text, return_tensors="pt",
	                    truncation=True, max_length=128)
	    with torch.no_grad():
	        logits = model(**enc).logits
	    probs = torch.softmax(logits, dim=1).squeeze(0)
	    pred_id = int(probs.argmax())
	    return {
	        "verdict": "Valid Argument" if pred_id in scheme_ids else "Fallacy",
	        "label": label_map[pred_id],
	        "confidence": round(probs[pred_id].item(), 4),
	    }

	print(predict("According to NASA, global temperatures will rise 2°C by 2050."))
	# {'verdict': 'Valid Argument', 'label': 'argument from expert opinion', 'confidence': 0.94}

	print(predict("Don't trust him — he was caught lying before."))
	# {'verdict': 'Fallacy', 'label': 'ad hominem', 'confidence': 0.88}
	```
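
When the top prediction is uncertain, it can help to inspect several candidates rather than only the argmax. The sketch below is an assumption layered on the snippet above, not part of the released code: `top_k_labels` is a hypothetical helper that takes a plain list of probabilities (for example `probs.tolist()` from inside `predict`) together with `label_map` and `scheme_ids`, and returns the `k` most likely labels. The three-class label map in the usage lines is invented purely for illustration.

```python
def top_k_labels(probs, label_map, scheme_ids, k=3):
    """Return the k most probable labels, highest probability first.

    probs      -- sequence of floats, one per class (e.g. probs.tolist())
    label_map  -- dict mapping class id -> label string
    scheme_ids -- set of class ids that are argument schemes
    """
    # Rank class ids by descending probability.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return [
        {
            "label": label_map[i],
            "verdict": "Valid Argument" if i in scheme_ids else "Fallacy",
            "confidence": round(float(probs[i]), 4),
        }
        for i in ranked[:k]
    ]


# Toy example with an invented three-class label map (not the real metadata).
toy_map = {0: "argument from analogy", 1: "ad hominem", 2: "false dilemma"}
toy_top = top_k_labels([0.2, 0.7, 0.1], toy_map, scheme_ids={0}, k=2)
```

A small gap between the first and second entries is one possible signal that a prediction deserves manual review.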

	---

	## Training Details

	- Deduplication: exact + near-duplicate removal, min 5 words per sample
	- Class balancing: `sklearn.utils.class_weight.compute_class_weight` with `class_weight="balanced"`, fed into a weighted cross-entropy loss
	- Batch strategy: custom `InterleavedSampler` alternates scheme/fallacy samples per batch
	- Early stopping: patience=3, min_delta=0.001, monitor=`val_loss`
	- Optimiser: AdamW, lr=3e-05, weight_decay=0.01
	- Scheduler: linear warmup (15% of steps)
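
To make the class-balancing step concrete, here is a minimal pure-Python sketch of the "balanced" heuristic that scikit-learn documents: each class weight is `n_samples / (n_classes * class_count)`. The helper name `balanced_class_weights` and the toy label list are assumptions for illustration. The actual training script is not shown in this card and may differ.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * n) for cls, n in counts.items()}


# Invented toy distribution: rare classes receive larger weights.
toy_labels = (
    ["ad hominem"] * 6
    + ["equivocation"] * 2
    + ["argument from sign"] * 4
)
toy_weights = balanced_class_weights(toy_labels)
```

With PyTorch, such weights would typically be ordered by class id and passed as the `weight` tensor of `torch.nn.CrossEntropyLoss`.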

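The early-stopping and warmup settings listed above can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the training code: `EarlyStopping` and `linear_warmup_factor` are hypothetical names, and the linear decay to zero after warmup is an assumption, since the card only specifies the 15% warmup phase.

```python
class EarlyStopping:
    """Stop when val_loss has not improved by min_delta for `patience` epochs."""

    def __init__(self, patience=3, min_delta=0.001):
        self.patience = patience
        self.min_delta = min_delta
        self.best = float("inf")
        self.bad_epochs = 0

    def step(self, val_loss):
        """Record one epoch's val_loss; return True when training should stop."""
        if val_loss < self.best - self.min_delta:
            self.best = val_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience


def linear_warmup_factor(step, total_steps, warmup_frac=0.15):
    """Multiplier for the base learning rate at a given optimizer step."""
    warmup_steps = round(total_steps * warmup_frac)
    if step < warmup_steps:
        # Ramp linearly from 0 up to 1 during warmup.
        return step / max(1, warmup_steps)
    # Assumed: linear decay from 1 down to 0 over the remaining steps.
    return max(0.0, (total_steps - step) / max(1, total_steps - warmup_steps))


# Invented loss curve: improvements stall after the second epoch.
stopper = EarlyStopping(patience=3, min_delta=0.001)
toy_decisions = [stopper.step(v) for v in [2.0, 1.9, 1.8995, 1.8999, 1.95]]
```

The effective learning rate at a step would then be the base rate (3e-05 here) multiplied by the returned factor.
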
	---

	## Intended Uses

	- Debate analysis and argumentation quality assessment
	- Educational tools for teaching critical thinking and informal logic
	- AI-assisted fact-checking and media literacy tools
	- Research in computational argumentation

	## Limitations

	- Trained on English text only
	- Short texts (< 5 words) may produce unreliable predictions
	- Some fallacy types (e.g. `intentional`, `equivocation`) are harder to distinguish without broader context
	- Not suitable for legal or medical decision-making
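
One way to surface the short-text limitation to callers is to flag inputs below the five-word threshold rather than silently returning a label. The sketch below is an assumption about how a downstream application might do this. `guarded_predict` is a hypothetical wrapper, and `_stub_predict` stands in for the real `predict` function so the example runs without downloading the model.

```python
def guarded_predict(text, predict_fn, min_words=5):
    """Run predict_fn and attach a flag for inputs shorter than min_words."""
    result = dict(predict_fn(text))
    # Whitespace tokenisation is a rough word count, adequate for a warning flag.
    result["short_input"] = len(text.split()) < min_words
    return result


def _stub_predict(text):
    # Placeholder for the real classifier; returns a fixed, invented result.
    return {"verdict": "Fallacy", "label": "ad hominem", "confidence": 0.5}


flagged = guarded_predict("He lies.", _stub_predict)
not_flagged = guarded_predict(
    "He lied once, so nothing he says is true.", _stub_predict
)
```

Callers can then decide whether to hide, down-weight, or annotate predictions where `short_input` is true.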

	---

	## Citation

	If you use this model in your research, please cite:

	```
	@misc{arguebot2025,
	  title  = {ArgueBot: Unified Argument Scheme and Fallacy Classification},
	  author = {Isabel},
	  year   = {2025},
	  url    = {https://huggingface.co/your-username/arguebot-argument-fallacy-classifier}
	}
	```

	---

	Built with RoBERTa-large · EthiX + Macagno datasets · Walton's Argumentation Schemes