This is a fine-tuned BERT model for Georgian sentiment analysis, based on bert-base-multilingual-cased. The model was trained using the Georgian Sentiment Analysis dataset.
Base model: bert-base-multilingual-cased
Dataset: Arseniy-Sandalov/Georgian-Sentiment-Analysis

You can load and use this model with Hugging Face Transformers:
from transformers import AutoModelForSequenceClassification, AutoTokenizer
import torch

model_name = "Arseniy-Sandalov/GeorgianBert-Sent"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

def predict_sentiment(text):
    inputs = tokenizer(text, return_tensors="pt", truncation=True, padding=True, max_length=512)
    with torch.no_grad():
        outputs = model(**inputs)
    prediction = torch.argmax(outputs.logits, dim=1).item()
    return ["negative", "neutral", "positive"][prediction]

text = "แแฎแแแ แแแแ แ แแแ แแแ แแ แแแแ"
print(predict_sentiment(text))
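
The predict_sentiment function above returns only the most likely label. If per-class confidence is also useful, the raw logits can be passed through a softmax. The following is a minimal sketch, not part of the original model card: the helper name logits_to_scores is hypothetical, the example logits are made up for illustration, and the label order is assumed to match the list used in predict_sentiment.

```python
import math

# Label order copied from predict_sentiment; assumed to match the model's class indices.
LABELS = ["negative", "neutral", "positive"]

def logits_to_scores(logits):
    """Convert a list of raw logits into a {label: probability} dict via softmax."""
    if len(logits) != len(LABELS):
        raise ValueError(f"expected {len(LABELS)} logits, got {len(logits)}")
    # Subtract the largest logit before exponentiating for numerical stability.
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return {label: e / total for label, e in zip(LABELS, exps)}

# Made-up logits for illustration only (not real model output).
scores = logits_to_scores([-1.2, 0.3, 2.5])
print(max(scores, key=scores.get))  # prints "positive"
```

With the real model, the input would be something like outputs.logits[0].tolist() taken inside predict_sentiment.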
Dataset Preprocessing:
Removed irrelevant columns (e.g., perturbation)
Stratified split: 80% train, 10% validation, 10% test
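
The two preprocessing steps above can be sketched in plain Python. This is an illustration under stated assumptions, not the author's actual pipeline: the model card does not say which tooling or random seed was used, the column names text and label are assumed, and the rows are toy data. Only the perturbation column name and the 80/10/10 proportions come from the card.

```python
import random
from collections import defaultdict

def drop_columns(rows, columns):
    """Return copies of the rows without the named columns."""
    return [{k: v for k, v in row.items() if k not in columns} for row in rows]

def stratified_split(rows, label_key="label", fractions=(0.8, 0.1, 0.1), seed=42):
    """Split rows into train/validation/test while preserving label proportions."""
    # Group rows by label so each class is split separately.
    by_label = defaultdict(list)
    for row in rows:
        by_label[row[label_key]].append(row)

    rng = random.Random(seed)  # seed value is an arbitrary assumption
    train, val, test = [], [], []
    for group in by_label.values():
        rng.shuffle(group)
        n_train = round(len(group) * fractions[0])
        n_val = round(len(group) * fractions[1])
        train.extend(group[:n_train])
        val.extend(group[n_train:n_train + n_val])
        test.extend(group[n_train + n_val:])  # remainder goes to test
    return train, val, test

# Toy rows standing in for the real dataset; "text" and "label" are assumed column names.
rows = [
    {"text": f"example {i}", "label": i % 3, "perturbation": "none"}
    for i in range(300)
]
rows = drop_columns(rows, {"perturbation"})
train, val, test = stratified_split(rows)
print(len(train), len(val), len(test))  # prints "240 30 30"
```

Splitting each label group separately keeps the class balance of every subset close to that of the full dataset, which is the point of stratification.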
Evaluation Metric:
If you use this model, please cite the original dataset:
@misc {Stefanovitch2023Sentiment,
author = {Stefanovitch, Nicolas and Piskorski, Jakub and Kharazi, Sopho},
title = {Sentiment analysis for Georgian},
year = {2023},
publisher = {European Commission, Joint Research Centre (JRC)},
howpublished = {\url{http://data.europa.eu/89h/9f04066a-8cc0-4669-99b4-f1f0627fdbbf}},
url = {http://data.europa.eu/89h/9f04066a-8cc0-4669-99b4-f1f0627fdbbf},
type = {dataset},
note = {PID: http://data.europa.eu/89h/9f04066a-8cc0-4669-99b4-f1f0627fdbbf}
}