Isabelbinu committed on
Commit 1d2dd73 · verified · 1 Parent(s): fc09bec

Upload 6 files

Files changed (6)
  1. README.md +167 -0
  2. config.json +80 -0
  3. metadata.json +90 -0
  4. model.safetensors +3 -0
  5. tokenizer.json +0 -0
  6. tokenizer_config.json +16 -0
README.md CHANGED
@@ -1,3 +1,170 @@
  ---
+ language: en
  license: mit
+ tags:
+ - text-classification
+ - argumentation
+ - fallacy-detection
+ - argument-scheme
+ - roberta
+ - pytorch
+ datasets:
+ - EthiX
+ - Macagno
+ metrics:
+ - f1
+ - accuracy
+ model-index:
+ - name: ArgueBot Unified Argument & Fallacy Classifier
+   results:
+   - task:
+       type: text-classification
+     metrics:
+     - type: loss
+       value: 1.8465
+       name: Validation loss
  ---
+
+ # ArgueBot: Unified Argument Scheme & Fallacy Classifier
+
+ A fine-tuned **RoBERTa-large** model that classifies text into one of **24 categories**:
+ - **11 argument scheme types** (valid argumentative patterns from Walton's taxonomy)
+ - **13 logical fallacy types** (common informal fallacies)
+
+ The model determines both *whether* an argument is valid or fallacious *and* which specific type it is, in a single inference pass.
+
+ ---
+
+ ## Model Details
+
+ | Property | Value |
+ |---|---|
+ | Base model | `roberta-large` |
+ | Task | 24-class text classification |
+ | Scheme classes | 11 |
+ | Fallacy classes | 13 |
+ | Datasets | EthiX + Macagno (argument schemes), Fallacy dataset (13 types) |
+ | Epochs trained | 5 (early stopping) |
+ | Best validation loss | 1.8465 |
+
+ ---
+
+ ## Labels
+
+ ### ✅ Argument Schemes (valid arguments)
+ - `argument from alternatives`
+ - `argument from analogy`
+ - `argument from cause to effect`
+ - `argument from commitment`
+ - `argument from example`
+ - `argument from expert opinion`
+ - `argument from negative consequences`
+ - `argument from positive consequences`
+ - `argument from practical reasoning`
+ - `argument from sign`
+ - `argument from values`
+
+ ### ⚡ Fallacy Types
+ - `ad hominem`
+ - `ad populum`
+ - `appeal to emotion`
+ - `circular reasoning`
+ - `equivocation`
+ - `fallacy of credibility`
+ - `fallacy of extension`
+ - `fallacy of logic`
+ - `fallacy of relevance`
+ - `false causality`
+ - `false dilemma`
+ - `faulty generalization`
+ - `intentional`
+
+ ---
+
+ ## How to Use
+
+ ```python
+ import requests
+ import torch
+ from transformers import RobertaTokenizer, RobertaForSequenceClassification
+
+ model_id = "your-username/arguebot-argument-fallacy-classifier"
+ tokenizer = RobertaTokenizer.from_pretrained(model_id)
+ model = RobertaForSequenceClassification.from_pretrained(model_id)
+ model.eval()
+
+ # Load label metadata (config.json only carries generic LABEL_n names)
+ meta = requests.get(
+     f"https://huggingface.co/{model_id}/resolve/main/metadata.json",
+     timeout=30,
+ ).json()
+ label_map = {int(k): v for k, v in meta["label_map"].items()}
+ scheme_ids = set(meta["scheme_ids"])
+
+ def predict(text):
+     # max_length=128 matches "max_len" in metadata.json
+     enc = tokenizer(text, return_tensors="pt",
+                     truncation=True, max_length=128)
+     with torch.no_grad():
+         logits = model(**enc).logits
+     probs = torch.softmax(logits, dim=-1).squeeze(0)
+     pred_id = int(probs.argmax())
+     # IDs listed in scheme_ids are argument schemes; all others are fallacies
+     verdict = "Valid Argument" if pred_id in scheme_ids else "Fallacy"
+     return {
+         "verdict": verdict,
+         "label": label_map[pred_id],
+         "confidence": round(probs[pred_id].item(), 4),
+     }
+
+ print(predict("According to NASA, global temperatures will rise 2°C by 2050."))
+ # {'verdict': 'Valid Argument', 'label': 'argument from expert opinion', 'confidence': 0.94}
+
+ print(predict("Don't trust him — he was caught lying before."))
+ # {'verdict': 'Fallacy', 'label': 'ad hominem', 'confidence': 0.88}
+ ```
+
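The `predict` helper reports only the single most likely class, which hides how close the scheme-versus-fallacy call was. The following dependency-free sketch shows one possible way to post-process a logit vector into a top-k list plus the total probability assigned to each group. The ID sets are copied from `metadata.json` in this commit; the `summarize` helper and the example logits are hypothetical and not part of the uploaded model.

```python
import math

# IDs copied from metadata.json in this commit.
SCHEME_IDS = set(range(3, 14))                 # 3..13: argument schemes
FALLACY_IDS = {0, 1, 2} | set(range(14, 24))   # remaining 13 IDs: fallacies


def softmax(logits):
    """Numerically stable softmax over a plain list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def summarize(logits, top_k=3):
    """Hypothetical helper: top-k classes plus probability mass per group."""
    probs = softmax(logits)
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    scheme_mass = sum(probs[i] for i in SCHEME_IDS)
    return {
        "top_k": [(i, round(probs[i], 4)) for i in ranked[:top_k]],
        "scheme_mass": round(scheme_mass, 4),
        "fallacy_mass": round(1.0 - scheme_mass, 4),
        # Same rule as predict(): the verdict follows the argmax class.
        "verdict": "Valid Argument" if ranked[0] in SCHEME_IDS else "Fallacy",
    }


# Made-up logits for illustration only (24 classes, class 8 dominant).
example_logits = [0.0] * 24
example_logits[8] = 4.0
example_logits[16] = 2.0
result = summarize(example_logits)
```

In practice the logits would come from `model(**enc).logits[0].tolist()`. A `scheme_mass` close to 0.5 suggests the verdict should be treated with caution even when the top class looks confident.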
+ ---
+
+ ## Training Details
+
+ - **Deduplication**: exact + near-duplicate removal, min 5 words per sample
+ - **Class balancing**: `sklearn.utils.class_weight.compute_class_weight` with `class_weight="balanced"` + weighted cross-entropy loss
+ - **Batch strategy**: custom `InterleavedSampler` alternates scheme/fallacy samples per batch
+ - **Early stopping**: patience=3, min_delta=0.001, monitor=`val_loss`
+ - **Optimiser**: AdamW, lr=3e-05, weight_decay=0.01
+ - **Scheduler**: linear warmup (15% of steps)
+
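The class-balancing bullet above combines two ideas that can be sketched without any dependencies. The "balanced" heuristic in scikit-learn assigns each class the weight `n_samples / (n_classes * count)`, and a weighted cross-entropy with mean reduction divides the weighted sum of per-sample losses by the sum of the weights used. The snippet below is a minimal illustration under those assumptions; the function names and toy labels are hypothetical, and the actual training script is not part of this commit.

```python
import math
from collections import Counter


def balanced_class_weights(labels):
    """weight[c] = n_samples / (n_classes * count[c]), the 'balanced' heuristic."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {c: n_samples / (n_classes * k) for c, k in counts.items()}


def weighted_cross_entropy(logits_batch, targets, weights):
    """Weighted mean of per-sample negative log-likelihoods."""
    total, weight_sum = 0.0, 0.0
    for logits, target in zip(logits_batch, targets):
        peak = max(logits)
        # log-sum-exp, shifted by the max logit for numerical stability
        log_z = peak + math.log(sum(math.exp(x - peak) for x in logits))
        nll = log_z - logits[target]
        total += weights[target] * nll
        weight_sum += weights[target]
    return total / weight_sum


# Toy labels, not the real training distribution: class 0 is 3x more common.
toy_labels = [0, 0, 0, 0, 0, 0, 1, 1]
weights = balanced_class_weights(toy_labels)
loss = weighted_cross_entropy([[2.0, 0.0], [0.0, 2.0]], [0, 1], weights)
```

With these toy labels the rare class receives three times the weight of the common one, so its mistakes contribute proportionally more to the loss.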
+ ---
+
+ ## Intended Uses
+
+ - Debate analysis and argumentation quality assessment
+ - Educational tools for teaching critical thinking and informal logic
+ - AI-assisted fact-checking and media literacy tools
+ - Research in computational argumentation
+
+ ## Limitations
+
+ - Trained on English text only
+ - Short texts (< 5 words) may produce unreliable predictions
+ - Some fallacy types (e.g. `intentional`, `equivocation`) are harder to distinguish without broader context
+ - Not suitable for legal or medical decision-making
+
+ ---
+
+ ## Citation
+
+ If you use this model in your research, please cite:
+
+ ```
+ @misc{arguebot2025,
+   title = {ArgueBot: Unified Argument Scheme and Fallacy Classification},
+   author = {Isabel},
+   year = {2025},
+   url = {https://huggingface.co/your-username/arguebot-argument-fallacy-classifier}
+ }
+ ```
+
+ ---
+
+ *Built with RoBERTa-large · EthiX + Macagno datasets · Walton's Argumentation Schemes*
config.json ADDED
@@ -0,0 +1,80 @@
+ {
+   "add_cross_attention": false,
+   "architectures": [
+     "RobertaForSequenceClassification"
+   ],
+   "attention_probs_dropout_prob": 0.1,
+   "bos_token_id": 0,
+   "classifier_dropout": null,
+   "dtype": "float32",
+   "eos_token_id": 2,
+   "hidden_act": "gelu",
+   "hidden_dropout_prob": 0.1,
+   "hidden_size": 1024,
+   "id2label": {
+     "0": "LABEL_0",
+     "1": "LABEL_1",
+     "2": "LABEL_2",
+     "3": "LABEL_3",
+     "4": "LABEL_4",
+     "5": "LABEL_5",
+     "6": "LABEL_6",
+     "7": "LABEL_7",
+     "8": "LABEL_8",
+     "9": "LABEL_9",
+     "10": "LABEL_10",
+     "11": "LABEL_11",
+     "12": "LABEL_12",
+     "13": "LABEL_13",
+     "14": "LABEL_14",
+     "15": "LABEL_15",
+     "16": "LABEL_16",
+     "17": "LABEL_17",
+     "18": "LABEL_18",
+     "19": "LABEL_19",
+     "20": "LABEL_20",
+     "21": "LABEL_21",
+     "22": "LABEL_22",
+     "23": "LABEL_23"
+   },
+   "initializer_range": 0.02,
+   "intermediate_size": 4096,
+   "is_decoder": false,
+   "label2id": {
+     "LABEL_0": 0,
+     "LABEL_1": 1,
+     "LABEL_10": 10,
+     "LABEL_11": 11,
+     "LABEL_12": 12,
+     "LABEL_13": 13,
+     "LABEL_14": 14,
+     "LABEL_15": 15,
+     "LABEL_16": 16,
+     "LABEL_17": 17,
+     "LABEL_18": 18,
+     "LABEL_19": 19,
+     "LABEL_2": 2,
+     "LABEL_20": 20,
+     "LABEL_21": 21,
+     "LABEL_22": 22,
+     "LABEL_23": 23,
+     "LABEL_3": 3,
+     "LABEL_4": 4,
+     "LABEL_5": 5,
+     "LABEL_6": 6,
+     "LABEL_7": 7,
+     "LABEL_8": 8,
+     "LABEL_9": 9
+   },
+   "layer_norm_eps": 1e-05,
+   "max_position_embeddings": 514,
+   "model_type": "roberta",
+   "num_attention_heads": 16,
+   "num_hidden_layers": 24,
+   "pad_token_id": 1,
+   "tie_word_embeddings": true,
+   "transformers_version": "5.0.0",
+   "type_vocab_size": 1,
+   "use_cache": true,
+   "vocab_size": 50265
+ }
metadata.json ADDED
@@ -0,0 +1,90 @@
+ {
+   "label_map": {
+     "0": "ad hominem",
+     "1": "ad populum",
+     "2": "appeal to emotion",
+     "3": "argument from alternatives",
+     "4": "argument from analogy",
+     "5": "argument from cause to effect",
+     "6": "argument from commitment",
+     "7": "argument from example",
+     "8": "argument from expert opinion",
+     "9": "argument from negative consequences",
+     "10": "argument from positive consequences",
+     "11": "argument from practical reasoning",
+     "12": "argument from sign",
+     "13": "argument from values",
+     "14": "circular reasoning",
+     "15": "equivocation",
+     "16": "fallacy of credibility",
+     "17": "fallacy of extension",
+     "18": "fallacy of logic",
+     "19": "fallacy of relevance",
+     "20": "false causality",
+     "21": "false dilemma",
+     "22": "faulty generalization",
+     "23": "intentional"
+   },
+   "scheme_ids": [
+     3,
+     4,
+     5,
+     6,
+     7,
+     8,
+     9,
+     10,
+     11,
+     12,
+     13
+   ],
+   "fallacy_ids": [
+     0,
+     1,
+     2,
+     14,
+     15,
+     16,
+     17,
+     18,
+     19,
+     20,
+     21,
+     22,
+     23
+   ],
+   "scheme_labels": [
+     "argument from cause to effect",
+     "argument from negative consequences",
+     "argument from analogy",
+     "argument from commitment",
+     "argument from values",
+     "argument from example",
+     "argument from alternatives",
+     "argument from sign",
+     "argument from practical reasoning",
+     "argument from expert opinion",
+     "argument from positive consequences"
+   ],
+   "fallacy_labels": [
+     "intentional",
+     "ad hominem",
+     "fallacy of relevance",
+     "false causality",
+     "ad populum",
+     "fallacy of credibility",
+     "faulty generalization",
+     "false dilemma",
+     "fallacy of logic",
+     "appeal to emotion",
+     "circular reasoning",
+     "equivocation",
+     "fallacy of extension"
+   ],
+   "num_classes": 24,
+   "model_name": "roberta-large",
+   "max_len": 128,
+   "best_val_f1": null,
+   "best_val_loss": 1.8465,
+   "epochs_trained": 5
+ }
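Because `config.json` in this commit maps every ID to a generic `LABEL_n` name, the readable names live only in `metadata.json`. The sketch below shows one way the two could be reconciled: it derives `id2label` and `label2id` dictionaries from a `label_map` and checks that `scheme_ids` and `fallacy_ids` partition the IDs. The `build_label_fields` helper is hypothetical, and the inline dictionary is a three-entry stand-in for the full file.

```python
import json

# Excerpt-shaped stand-in for metadata.json; only three entries shown here.
metadata = {
    "label_map": {
        "0": "ad hominem",
        "8": "argument from expert opinion",
        "23": "intentional",
    },
    "scheme_ids": [8],
    "fallacy_ids": [0, 23],
}


def build_label_fields(meta):
    """Return (id2label, label2id) in the shape config.json uses."""
    id2label = {int(k): v for k, v in meta["label_map"].items()}
    schemes, fallacies = set(meta["scheme_ids"]), set(meta["fallacy_ids"])
    if schemes | fallacies != set(id2label):
        raise ValueError("scheme_ids and fallacy_ids must cover label_map exactly")
    if schemes & fallacies:
        raise ValueError("an ID cannot be both a scheme and a fallacy")
    label2id = {name: idx for idx, name in id2label.items()}
    return id2label, label2id


id2label, label2id = build_label_fields(metadata)
# json.dumps turns the integer keys into strings, as config.json expects.
config_patch = json.dumps({"id2label": id2label, "label2id": label2id}, indent=2)
```

The resulting fragment could be merged into `config.json` so that downstream tools report names such as `ad hominem` instead of `LABEL_0`; whether to do so is left to the repository owner.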
model.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:f045fdf0d80eeb8c7465d78bc6c8004844fdb4041382050761fd79a4f3917758
+ size 1421585568
tokenizer.json ADDED
The diff for this file is too large to render.
tokenizer_config.json ADDED
@@ -0,0 +1,16 @@
+ {
+   "add_prefix_space": false,
+   "backend": "tokenizers",
+   "bos_token": "<s>",
+   "cls_token": "<s>",
+   "eos_token": "</s>",
+   "errors": "replace",
+   "is_local": false,
+   "mask_token": "<mask>",
+   "model_max_length": 512,
+   "pad_token": "<pad>",
+   "sep_token": "</s>",
+   "tokenizer_class": "RobertaTokenizer",
+   "trim_offsets": true,
+   "unk_token": "<unk>"
+ }