Upload 6 files

Files changed (6)

README.md +167 -0
config.json +80 -0
metadata.json +90 -0
model.safetensors +3 -0
tokenizer.json +0 -0
tokenizer_config.json +16 -0

README.md CHANGED

@@ -1,3 +1,170 @@
 ---
+language: en
 license: mit
+tags:
+  - text-classification
+  - argumentation
+  - fallacy-detection
+  - argument-scheme
+  - roberta
+  - pytorch
+datasets:
+  - EthiX
+  - Macagno
+metrics:
+  - f1
+  - accuracy
+model-index:
+  - name: ArgueBot Unified Argument & Fallacy Classifier
+    results:
+      - task:
+          type: text-classification
+        metrics:
+          - type: loss
+            value: 1.8465
+            name: Validation loss
 ---
+# ArgueBot — Unified Argument Scheme & Fallacy Classifier
+A fine-tuned **RoBERTa-large** model that classifies text into one of **24 categories**:
+- **11 argument scheme types** (valid argumentative patterns from Walton's taxonomy)
+- **13 logical fallacy types** (common informal fallacies)
+The model determines both *whether* an argument is valid or fallacious *and* which specific type it is — in a single inference pass.
+---
+## Model Details
+| Property | Value |
+|---|---|
+| Base model | `roberta-large` |
+| Task | 24-class text classification |
+| Scheme classes | 11 |
+| Fallacy classes | 13 |
+| Datasets | EthiX + Macagno (argument schemes), Fallacy dataset (13 types) |
+| Epochs trained | 5 (early stopping) |
+| Best validation loss | 1.8465 |
+---
+## Labels
+### ✅ Argument Schemes (valid arguments)
+- `argument from alternatives`
+- `argument from analogy`
+- `argument from cause to effect`
+- `argument from commitment`
+- `argument from example`
+- `argument from expert opinion`
+- `argument from negative consequences`
+- `argument from positive consequences`
+- `argument from practical reasoning`
+- `argument from sign`
+- `argument from values`
+### ⚡ Fallacy Types
+- `ad hominem`
+- `ad populum`
+- `appeal to emotion`
+- `circular reasoning`
+- `equivocation`
+- `fallacy of credibility`
+- `fallacy of extension`
+- `fallacy of logic`
+- `fallacy of relevance`
+- `false causality`
+- `false dilemma`
+- `faulty generalization`
+- `intentional`
+---
+## How to Use
+```python
+import requests
+import torch
+from transformers import RobertaTokenizer, RobertaForSequenceClassification
+model_id = "your-username/arguebot-argument-fallacy-classifier"
+tokenizer = RobertaTokenizer.from_pretrained(model_id)
+model = RobertaForSequenceClassification.from_pretrained(model_id)
+model.eval()
+# config.json only carries generic LABEL_n names, so the real label names
+# and the scheme/fallacy id split are loaded from metadata.json.
+meta = requests.get(
+    f"https://huggingface.co/{model_id}/resolve/main/metadata.json",
+    timeout=30,
+).json()
+label_map = {int(k): v for k, v in meta["label_map"].items()}
+scheme_ids = set(meta["scheme_ids"])
+fallacy_ids = set(meta["fallacy_ids"])
+def predict(text):
+    enc = tokenizer(text, return_tensors="pt",
+                    truncation=True, max_length=128)
+    with torch.no_grad():
+        logits = model(**enc).logits
+    probs   = torch.softmax(logits, dim=1).squeeze()
+    pred_id = int(probs.argmax())
+    label   = label_map[pred_id]
+    verdict = "Valid Argument" if pred_id in scheme_ids else "Fallacy"
+    return {
+        "verdict":    verdict,
+        "label":      label,
+        "confidence": float(round(probs[pred_id].item(), 4)),
+    }
+print(predict("According to NASA, global temperatures will rise 2°C by 2050."))
+# Illustrative output; actual confidence values will vary:
+# {'verdict': 'Valid Argument', 'label': 'argument from expert opinion', 'confidence': 0.94}
+print(predict("Don't trust him — he was caught lying before."))
+# {'verdict': 'Fallacy', 'label': 'ad hominem', 'confidence': 0.88}
+```
+---
+## Training Details
+- **Deduplication**: exact + near-duplicate removal, min 5 words per sample
+- **Class balancing**: scikit-learn `compute_class_weight("balanced", ...)` weights fed into a weighted cross-entropy loss
+- **Batch strategy**: custom `InterleavedSampler` alternates scheme/fallacy samples per batch
+- **Early stopping**: patience=3, min_delta=0.001, monitor=`val_loss`
+- **Optimiser**: AdamW, lr=3e-05, weight_decay=0.01
+- **Scheduler**: linear warmup (15% of steps)
+---
+## Intended Uses
+- Debate analysis and argumentation quality assessment
+- Educational tools for teaching critical thinking and informal logic
+- AI-assisted fact-checking and media literacy tools
+- Research in computational argumentation
+## Limitations
+- Trained on English text only
+- Short texts (< 5 words) may produce unreliable predictions
+- Some fallacy types (e.g. `intentional`, `equivocation`) are harder to distinguish without broader context
+- Not suitable for legal or medical decision-making
+---
+## Citation
+If you use this model in your research, please cite:
+```
+@misc{arguebot2025,
+  title   = {ArgueBot: Unified Argument Scheme and Fallacy Classification},
+  author  = {Isabel},
+  year    = {2025},
+  url     = {https://huggingface.co/your-username/arguebot-argument-fallacy-classifier}
+}
+```
+---
+*Built with RoBERTa-large · EthiX + Macagno datasets · Walton's Argumentation Schemes*
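
The class balancing listed under Training Details in the README uses scikit-learn's "balanced" heuristic, which weights each class by `n_samples / (n_classes * class_count)`. The following is a minimal pure-Python sketch of that formula and of a per-sample weighted cross-entropy. It is not the training code from this commit: the helper names and the toy labels are hypothetical, chosen only for illustration.

```python
import math
from collections import Counter


def balanced_class_weights(labels):
    """Return {class: n_samples / (n_classes * count)} (the 'balanced' heuristic)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {c: n_samples / (n_classes * k) for c, k in counts.items()}


def weighted_cross_entropy(logits, target, weights):
    """Weighted negative log-likelihood of `target` for a single sample."""
    m = max(logits)  # subtract the max for numerical stability
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return -weights[target] * (logits[target] - log_z)


# Hypothetical toy label ids: class 0 is four times as frequent as class 1.
toy_labels = [0, 0, 0, 0, 1]
weights = balanced_class_weights(toy_labels)

# With identical logits, the rarer class contributes a larger loss.
loss_common = weighted_cross_entropy([0.0, 0.0], 0, weights)
loss_rare = weighted_cross_entropy([0.0, 0.0], 1, weights)
print(weights, loss_common, loss_rare)
```

With these toy labels the rare class is weighted four times as heavily as the common one, so a mistake on it costs four times as much at equal logits.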

config.json ADDED

	@@ -0,0 +1,80 @@

+{
+  "add_cross_attention": false,
+  "architectures": [
+    "RobertaForSequenceClassification"
+  ],
+  "attention_probs_dropout_prob": 0.1,
+  "bos_token_id": 0,
+  "classifier_dropout": null,
+  "dtype": "float32",
+  "eos_token_id": 2,
+  "hidden_act": "gelu",
+  "hidden_dropout_prob": 0.1,
+  "hidden_size": 1024,
+  "id2label": {
+    "0": "LABEL_0",
+    "1": "LABEL_1",
+    "2": "LABEL_2",
+    "3": "LABEL_3",
+    "4": "LABEL_4",
+    "5": "LABEL_5",
+    "6": "LABEL_6",
+    "7": "LABEL_7",
+    "8": "LABEL_8",
+    "9": "LABEL_9",
+    "10": "LABEL_10",
+    "11": "LABEL_11",
+    "12": "LABEL_12",
+    "13": "LABEL_13",
+    "14": "LABEL_14",
+    "15": "LABEL_15",
+    "16": "LABEL_16",
+    "17": "LABEL_17",
+    "18": "LABEL_18",
+    "19": "LABEL_19",
+    "20": "LABEL_20",
+    "21": "LABEL_21",
+    "22": "LABEL_22",
+    "23": "LABEL_23"
+  },
+  "initializer_range": 0.02,
+  "intermediate_size": 4096,
+  "is_decoder": false,
+  "label2id": {
+    "LABEL_0": 0,
+    "LABEL_1": 1,
+    "LABEL_10": 10,
+    "LABEL_11": 11,
+    "LABEL_12": 12,
+    "LABEL_13": 13,
+    "LABEL_14": 14,
+    "LABEL_15": 15,
+    "LABEL_16": 16,
+    "LABEL_17": 17,
+    "LABEL_18": 18,
+    "LABEL_19": 19,
+    "LABEL_2": 2,
+    "LABEL_20": 20,
+    "LABEL_21": 21,
+    "LABEL_22": 22,
+    "LABEL_23": 23,
+    "LABEL_3": 3,
+    "LABEL_4": 4,
+    "LABEL_5": 5,
+    "LABEL_6": 6,
+    "LABEL_7": 7,
+    "LABEL_8": 8,
+    "LABEL_9": 9
+  },
+  "layer_norm_eps": 1e-05,
+  "max_position_embeddings": 514,
+  "model_type": "roberta",
+  "num_attention_heads": 16,
+  "num_hidden_layers": 24,
+  "pad_token_id": 1,
+  "tie_word_embeddings": true,
+  "transformers_version": "5.0.0",
+  "type_vocab_size": 1,
+  "use_cache": true,
+  "vocab_size": 50265
+}
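
The `id2label` and `label2id` entries above hold only placeholder names (`LABEL_0` to `LABEL_23`), which is why the README loads `metadata.json` separately. One possible way to fold the real names into the config is sketched below. `apply_label_map` is a hypothetical helper, not part of this commit, and the three-entry dictionaries are abbreviated stand-ins for the full files.

```python
import json


def apply_label_map(config, label_map):
    """Return a copy of `config` with id2label/label2id built from `label_map`."""
    # Sort numerically so that "10" does not land before "2".
    ordered = sorted(label_map.items(), key=lambda kv: int(kv[0]))
    updated = dict(config)
    updated["id2label"] = {str(int(k)): name for k, name in ordered}
    updated["label2id"] = {name: int(k) for k, name in ordered}
    return updated


# Abbreviated stand-ins for config.json and metadata.json["label_map"].
config = {
    "model_type": "roberta",
    "id2label": {"0": "LABEL_0", "1": "LABEL_1", "2": "LABEL_2"},
}
label_map = {"0": "ad hominem", "1": "ad populum", "2": "appeal to emotion"}

patched = apply_label_map(config, label_map)
print(json.dumps(patched, indent=2))
```

If the names were stored in the config this way, `model.config.id2label` should return them directly, and the extra HTTP request for the label names would likely become unnecessary.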

metadata.json ADDED

	@@ -0,0 +1,90 @@

+{
+  "label_map": {
+    "0": "ad hominem",
+    "1": "ad populum",
+    "2": "appeal to emotion",
+    "3": "argument from alternatives",
+    "4": "argument from analogy",
+    "5": "argument from cause to effect",
+    "6": "argument from commitment",
+    "7": "argument from example",
+    "8": "argument from expert opinion",
+    "9": "argument from negative consequences",
+    "10": "argument from positive consequences",
+    "11": "argument from practical reasoning",
+    "12": "argument from sign",
+    "13": "argument from values",
+    "14": "circular reasoning",
+    "15": "equivocation",
+    "16": "fallacy of credibility",
+    "17": "fallacy of extension",
+    "18": "fallacy of logic",
+    "19": "fallacy of relevance",
+    "20": "false causality",
+    "21": "false dilemma",
+    "22": "faulty generalization",
+    "23": "intentional"
+  },
+  "scheme_ids": [
+    3,
+    4,
+    5,
+    6,
+    7,
+    8,
+    9,
+    10,
+    11,
+    12,
+    13
+  ],
+  "fallacy_ids": [
+    0,
+    1,
+    2,
+    14,
+    15,
+    16,
+    17,
+    18,
+    19,
+    20,
+    21,
+    22,
+    23
+  ],
+  "scheme_labels": [
+    "argument from cause to effect",
+    "argument from negative consequences",
+    "argument from analogy",
+    "argument from commitment",
+    "argument from values",
+    "argument from example",
+    "argument from alternatives",
+    "argument from sign",
+    "argument from practical reasoning",
+    "argument from expert opinion",
+    "argument from positive consequences"
+  ],
+  "fallacy_labels": [
+    "intentional",
+    "ad hominem",
+    "fallacy of relevance",
+    "false causality",
+    "ad populum",
+    "fallacy of credibility",
+    "faulty generalization",
+    "false dilemma",
+    "fallacy of logic",
+    "appeal to emotion",
+    "circular reasoning",
+    "equivocation",
+    "fallacy of extension"
+  ],
+  "num_classes": 24,
+  "model_name": "roberta-large",
+  "max_len": 128,
+  "best_val_f1": null,
+  "best_val_loss": 1.8465,
+  "epochs_trained": 5
+}
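
The ids in `label_map` look consistent with sorting the combined scheme and fallacy label names alphabetically. That is an observation from the file above, not a documented guarantee. The sketch below rebuilds the mapping under that assumption so that `scheme_ids` and `fallacy_ids` can be cross-checked; `rebuild_label_ids` is a hypothetical helper.

```python
scheme_labels = [
    "argument from cause to effect", "argument from negative consequences",
    "argument from analogy", "argument from commitment", "argument from values",
    "argument from example", "argument from alternatives", "argument from sign",
    "argument from practical reasoning", "argument from expert opinion",
    "argument from positive consequences",
]
fallacy_labels = [
    "intentional", "ad hominem", "fallacy of relevance", "false causality",
    "ad populum", "fallacy of credibility", "faulty generalization",
    "false dilemma", "fallacy of logic", "appeal to emotion",
    "circular reasoning", "equivocation", "fallacy of extension",
]


def rebuild_label_ids(scheme_labels, fallacy_labels):
    """Assign ids by alphabetical order (assumed) and split them by group."""
    all_labels = sorted(scheme_labels + fallacy_labels)
    label_map = dict(enumerate(all_labels))
    schemes = set(scheme_labels)
    scheme_ids = [i for i, name in label_map.items() if name in schemes]
    fallacy_ids = [i for i, name in label_map.items() if name not in schemes]
    return label_map, scheme_ids, fallacy_ids


label_map, scheme_ids, fallacy_ids = rebuild_label_ids(scheme_labels, fallacy_labels)
print(scheme_ids)   # ids of the 11 argument schemes
print(fallacy_ids)  # ids of the 13 fallacy types
```

A check like this could guard against the two id lists drifting out of sync with `label_map` if labels are added or renamed later.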

model.safetensors ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:f045fdf0d80eeb8c7465d78bc6c8004844fdb4041382050761fd79a4f3917758
+size 1421585568
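
As a rough sanity check, the file size above can be compared with the parameter count implied by `config.json`. The sketch below assumes the standard RoBERTa layout (word, position and token-type embeddings with one LayerNorm; per layer four attention projections, a two-layer feed-forward block and two LayerNorms; a dense plus output-projection classification head) and 4 bytes per `float32` weight. `roberta_classifier_params` is a hypothetical helper, and attributing the leftover bytes to the safetensors header is an inference, not something the diff states.

```python
def roberta_classifier_params(vocab, max_pos, type_vocab, hidden, inter, layers, labels):
    """Count weights and biases for a RoBERTa sequence classifier (assumed layout)."""
    embeddings = (vocab + max_pos + type_vocab) * hidden + 2 * hidden
    attention = 4 * (hidden * hidden + hidden)            # query, key, value, output
    feed_forward = hidden * inter + inter + inter * hidden + hidden
    layer_norms = 2 * 2 * hidden                          # two LayerNorms per layer
    encoder = layers * (attention + feed_forward + layer_norms)
    head = (hidden * hidden + hidden) + (hidden * labels + labels)
    return embeddings + encoder + head


# Values taken from config.json and the LFS pointer above.
params = roberta_classifier_params(
    vocab=50265, max_pos=514, type_vocab=1,
    hidden=1024, inter=4096, layers=24, labels=24,
)
file_size = 1421585568
overhead = file_size - 4 * params
print(params, overhead)
```

Under these assumptions the count comes to 355,384,344 parameters, which leaves 48,192 bytes unaccounted for, a plausible size for the file's metadata header.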

tokenizer.json ADDED

The diff for this file is too large to render. See raw diff

tokenizer_config.json ADDED

	@@ -0,0 +1,16 @@

+{
+  "add_prefix_space": false,
+  "backend": "tokenizers",
+  "bos_token": "<s>",
+  "cls_token": "<s>",
+  "eos_token": "</s>",
+  "errors": "replace",
+  "is_local": false,
+  "mask_token": "<mask>",
+  "model_max_length": 512,
+  "pad_token": "<pad>",
+  "sep_token": "</s>",
+  "tokenizer_class": "RobertaTokenizer",
+  "trim_offsets": true,
+  "unk_token": "<unk>"
+}