---
license: mit
datasets:
- Arseniy-Sandalov/Georgian-Sentiment-Analysis
language:
- ka
metrics:
- f1
- roc_auc
- accuracy
base_model:
- google-bert/bert-base-multilingual-cased
pipeline_tag: text-classification
tags:
- Sentiment
---

# Sentiment Analysis with Fine-tuned Multilingual BERT for Georgian 🇬🇪

## 📄 Model Overview

This is a **fine-tuned BERT model** for **Georgian sentiment analysis**, based on **`bert-base-multilingual-cased`**. It was trained on the **Georgian Sentiment Analysis dataset**.

- **Base Model:** `bert-base-multilingual-cased`
- **Fine-tuned on:** `Arseniy-Sandalov/Georgian-Sentiment-Analysis`
- **Task:** Sentiment classification (positive, negative, neutral)
- **Tokenizer:** BERT multilingual cased tokenizer
- **License:** [Check dataset source](http://data.europa.eu/89h/9f04066a-8cc0-4669-99b4-f1f0627fdbbf)

## 👉 Usage Example

You can load and use this model with Hugging Face Transformers:

```python
from transformers import AutoModelForSequenceClassification, AutoTokenizer
import torch

model_name = "Arseniy-Sandalov/GeorgianBert-Sent"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)
model.eval()  # inference mode: disables dropout

def predict_sentiment(text):
    # Tokenize a single string; inputs longer than 512 tokens are truncated.
    inputs = tokenizer(text, return_tensors="pt", truncation=True, padding=True, max_length=512)
    with torch.no_grad():  # no gradients needed for prediction
        outputs = model(**inputs)
    # Index of the highest-scoring class for the single input.
    prediction = torch.argmax(outputs.logits, dim=1).item()
    # Class index to label mapping: 0 = negative, 1 = neutral, 2 = positive.
    return ["negative", "neutral", "positive"][prediction]

text = "ახალი მეარი კარგია ერთილა"
print(predict_sentiment(text))
```

## 📊 Training Details

**Dataset Preprocessing:**

- Removed irrelevant columns (e.g., perturbation)
- Stratified split: 80% train, 10% validation, 10% test

**Evaluation Metric:**

- ROC AUC score (computed on the validation and test sets)

## 📖 Citation

If you use this model, please cite the original dataset:

```
@misc{Stefanovitch2023Sentiment,
  author       = {Stefanovitch, Nicolas and Piskorski, Jakub and Kharazi, Sopho},
  title        = {Sentiment analysis for Georgian},
  year         = {2023},
  publisher    = {European Commission, Joint Research Centre (JRC)},
  howpublished = {\url{http://data.europa.eu/89h/9f04066a-8cc0-4669-99b4-f1f0627fdbbf}},
  url          = {http://data.europa.eu/89h/9f04066a-8cc0-4669-99b4-f1f0627fdbbf},
  type         = {dataset},
  note         = {PID: http://data.europa.eu/89h/9f04066a-8cc0-4669-99b4-f1f0627fdbbf}
}
```
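
## 🧪 Training Details: Illustrative Sketch

The Training Details section mentions a stratified 80/10/10 split and ROC AUC evaluation, but the training code itself is not part of this card. The block below is a minimal, dependency-free sketch of both ideas, not the author's actual pipeline. The function names (`stratified_split`, `binary_roc_auc`, `macro_ovr_roc_auc`), the seed, and the choice of macro-averaged one-vs-rest AUC for the three classes are all assumptions made for illustration; the card does not state how the multi-class ROC AUC was averaged.

```python
import random
from collections import defaultdict


def stratified_split(labels, train_frac=0.8, val_frac=0.1, seed=42):
    """Return (train, val, test) index lists that preserve label proportions.

    Hypothetical helper: the split is done per label so each class keeps
    roughly the same share in all three subsets.
    """
    by_label = defaultdict(list)
    for idx, label in enumerate(labels):
        by_label[label].append(idx)

    rng = random.Random(seed)  # assumed seed, only for reproducibility here
    train, val, test = [], [], []
    for label in sorted(by_label):
        indices = by_label[label]
        rng.shuffle(indices)
        n_train = int(len(indices) * train_frac)
        n_val = int(len(indices) * val_frac)
        train += indices[:n_train]
        val += indices[n_train:n_train + n_val]
        test += indices[n_train + n_val:]  # remainder goes to the test set
    return train, val, test


def binary_roc_auc(y_true, scores):
    """ROC AUC as the probability that a positive example outranks a
    negative one, with ties counted as half a win."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("ROC AUC needs at least one positive and one negative")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def macro_ovr_roc_auc(y_true, probs, n_classes=3):
    """Macro-averaged one-vs-rest ROC AUC (assumed averaging scheme).

    `probs` holds one row of class probabilities per example, ordered
    negative, neutral, positive as in the usage example.
    """
    aucs = []
    for c in range(n_classes):
        binary = [1 if y == c else 0 for y in y_true]
        aucs.append(binary_roc_auc(binary, [row[c] for row in probs]))
    return sum(aucs) / n_classes
```

In practice, `sklearn.model_selection.train_test_split` with its `stratify` argument and `sklearn.metrics.roc_auc_score` with `multi_class="ovr"` cover the same ground; the pure-Python version is shown only so the logic is visible end to end.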