Arseniy-Sandalov/GeorgianBert-Sent

Text Classification

Arseniy-Sandalov committed on Jan 20, 2025

Commit f197e0a · verified · 1 Parent(s): 52a17c1

Update README.md

Files changed (1)

README.md +61 -1

@@ -13,4 +13,64 @@ base_model:
 pipeline_tag: text-classification
 tags:
 - Sentiment
----
+---
+# Sentiment Analysis with Fine-tuned Multilingual BERT for Georgian 🇬🇪
+## 📄 Model Overview
+This is a **fine-tuned BERT model** for **Georgian sentiment analysis**, based on **`bert-base-multilingual-cased`**. The model was trained using the **Georgian Sentiment Analysis dataset**.
+- **Base Model:** `bert-base-multilingual-cased`
+- **Fine-tuned on:** `Arseniy-Sandalov/Georgian-Sentiment-Analysis`
+- **Task:** Sentiment classification (positive, negative, neutral)
+- **Tokenizer:** BERT multilingual cased tokenizer
+- **License:** [Check dataset source](http://data.europa.eu/89h/9f04066a-8cc0-4669-99b4-f1f0627fdbbf)
+## 👉 Usage Example
+You can load and use this model with Hugging Face Transformers:
+```python
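+from transformers import AutoModelForSequenceClassification, AutoTokenizer
+import torch
+
+# Placeholder: replace with this model's Hub repository ID,
+# e.g. "Arseniy-Sandalov/GeorgianBert-Sent".
+model_name = "your_huggingface_model_name"
+tokenizer = AutoTokenizer.from_pretrained(model_name)
+model = AutoModelForSequenceClassification.from_pretrained(model_name)
+
+def predict_sentiment(text):
+    # Tokenize the input, truncating to BERT's 512-token limit.
+    inputs = tokenizer(text, return_tensors="pt", truncation=True, padding=True, max_length=512)
+    # Inference only, so skip gradient tracking.
+    with torch.no_grad():
+        outputs = model(**inputs)
+    # Index of the highest-scoring class.
+    prediction = torch.argmax(outputs.logits, dim=1).item()
+    # Assumes class indices 0, 1, 2 map to these labels in this order.
+    return ["negative", "neutral", "positive"][prediction]
+
+text = "ახალი მეარი კარგია ერთილა"
+print(predict_sentiment(text))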
+```
+## 📊 Training Details
+**Dataset Preprocessing:**
+- Removed irrelevant columns (e.g., perturbation)
+- Stratified split: 80% train, 10% validation, 10% test
+
+**Evaluation Metric:**
+- ROC AUC score (computed on the validation and test sets)
+## 📖 Citation
+If you use this model, please cite the original dataset:
+```bibtex
+@misc{Stefanovitch2023Sentiment,
+  author = {Stefanovitch, Nicolas and Piskorski, Jakub and Kharazi, Sopho},
+  title = {Sentiment analysis for Georgian},
+  year = {2023},
+  publisher = {European Commission, Joint Research Centre (JRC)},
+  howpublished = {\url{http://data.europa.eu/89h/9f04066a-8cc0-4669-99b4-f1f0627fdbbf}},
+  url = {http://data.europa.eu/89h/9f04066a-8cc0-4669-99b4-f1f0627fdbbf},
+  type = {dataset},
+  note = {PID: http://data.europa.eu/89h/9f04066a-8cc0-4669-99b4-f1f0627fdbbf}
+}
+```
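
The Training Details in the README above mention a stratified 80/10/10 split but do not say how it was produced. The following is a minimal sketch of one way to build such a split using only the Python standard library. The function name `stratified_split`, the seed, and the toy labels are hypothetical and chosen for illustration; the author may well have used a different tool or procedure.

```python
import random
from collections import defaultdict


def stratified_split(labels, train_frac=0.8, val_frac=0.1, seed=42):
    """Return (train, val, test) index lists that preserve per-class proportions."""
    rng = random.Random(seed)

    # Group example indices by their class label.
    by_label = defaultdict(list)
    for idx, label in enumerate(labels):
        by_label[label].append(idx)

    train, val, test = [], [], []
    for label in sorted(by_label):
        indices = by_label[label]
        rng.shuffle(indices)
        n_train = round(len(indices) * train_frac)
        n_val = round(len(indices) * val_frac)
        train.extend(indices[:n_train])
        val.extend(indices[n_train:n_train + n_val])
        # Whatever remains after train and validation goes to test.
        test.extend(indices[n_train + n_val:])
    return train, val, test


# Hypothetical toy labels: 50 negative, 30 neutral, 20 positive.
labels = ["negative"] * 50 + ["neutral"] * 30 + ["positive"] * 20
train_idx, val_idx, test_idx = stratified_split(labels)
print(len(train_idx), len(val_idx), len(test_idx))
```

Splitting each class separately keeps the negative/neutral/positive ratio roughly the same in all three subsets, which matters when one class is much rarer than the others.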