---
language: en
tags:
- text-classification
- gender
- gender-prediction
- transformers
- deberta
license: mit
datasets:
- samzirbo/europarl.en-es.gendered
- czyzi0/luna-speech-dataset
- czyzi0/pwr-azon-speech-dataset
- sagteam/author_profiling
- kaushalgawri/nptel-en-tags-and-gender-v0
metrics:
- accuracy
- f1
- precision
- recall
base_model: microsoft/deberta-v3-large
pipeline_tag: text-classification
model-index:
- name: gender_prediction_model_from_text
  results:
  - task:
      type: text-classification
      name: Text Classification
    metrics:
    - type: f1
      value: 0.69
    - type: accuracy
      value: 0.69
citations:
- "@misc{fc63_gender1_2025,\n title = {Gender Prediction from Text},\n author = {Çoban, Furkan},\n year = {2025},\n howpublished = {\\url{https://doi.org/10.5281/zenodo.15619489}},\n note = {DeBERTa-v3-large model fine-tuned on multi-domain gender-labeled texts}\n}"
---

# Gender Prediction from Text ✍️ → 👩‍🦰👨

This model **predicts** the likely **gender** of an anonymous speaker or writer based solely on the content of an English text. It is built on [DeBERTa-v3-large](https://huggingface.co/microsoft/deberta-v3-large) and fine-tuned on a multi-domain dataset of formal and informal texts, part of which was machine-translated into English from Polish and Russian.

📍 **Space link**: [🔗 Try it out on Hugging Face Spaces](https://huggingface.co/spaces/fc63/Gender_Prediction)

📁 **Model repo**: [🔗 View on Hugging Face Hub](https://huggingface.co/fc63/gender_prediction_model_from_text)

🧠 **Source code**: [GitHub](https://github.com/fc63/gender-classification)

---

## 📊 Model Summary

- **Base model**: `microsoft/deberta-v3-large`
- **Fine-tuned on**: binary gender classification task (`female` vs `male`)
- **Best F1 Score**: `0.69` on a balanced multi-domain test set
- **Max token length**: 128
- **Evaluation Metrics**:
  - F1: 0.69
  - Accuracy: 0.69
  - Precision: 0.69
  - Recall: 0.69

📂 **Evaluation**: [View the notebook](https://github.com/fc63/gender-classification/blob/main/Evaluate/modelv3.ipynb)

---

## 🧾 Datasets Used

| Dataset | Domain | Language |
|--------|--------|------|
| [samzirbo/europarl.en-es.gendered](https://huggingface.co/datasets/samzirbo/europarl.en-es.gendered) | Formal speech (Parliament) | English |
| [czyzi0/luna-speech-dataset](https://huggingface.co/datasets/czyzi0/luna-speech-dataset) | Phone conversations | Polish → Translated |
| [czyzi0/pwr-azon-speech-dataset](https://huggingface.co/datasets/czyzi0/pwr-azon-speech-dataset) | Phone conversations | Polish → Translated |
| [sagteam/author_profiling](https://huggingface.co/datasets/sagteam/author_profiling) | Social posts | Russian → Translated |
| [kaushalgawri/nptel-en-tags-and-gender-v0](https://huggingface.co/datasets/kaushalgawri/nptel-en-tags-and-gender-v0) | Spoken transcripts | English |
| [Blog Authorship Corpus](https://u.cs.biu.ac.il/~koppel/BlogCorpus.htm) | Blog posts | English |

All datasets were normalized, translated if necessary, deduplicated, and **balanced via random undersampling** to ensure equal representation of both genders.
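
The deduplication and balancing steps can be sketched roughly as follows. This is a minimal illustration using only the standard library, not the project's actual preprocessing code: the function name `balance_by_undersampling`, the whitespace-and-case deduplication key, and the toy samples are all assumptions made for the example.

```python
import random
from collections import defaultdict


def balance_by_undersampling(samples, seed=0):
    """Deduplicate (text, label) pairs, then randomly undersample every
    class down to the size of the smallest class."""
    rng = random.Random(seed)
    by_label = defaultdict(list)
    seen = set()
    for text, label in samples:
        # Hypothetical dedup key: collapse whitespace and ignore case.
        key = " ".join(text.split()).lower()
        if key in seen:
            continue
        seen.add(key)
        by_label[label].append(text)

    # Every class is cut down to the size of the smallest one.
    target = min(len(texts) for texts in by_label.values())
    balanced = []
    for label, texts in by_label.items():
        for text in rng.sample(texts, target):
            balanced.append((text, label))
    rng.shuffle(balanced)
    return balanced


# Toy data, invented for illustration only.
toy = [
    ("I think we should vote now.", "male"),
    ("I think we should  vote now.", "male"),  # duplicate once whitespace is collapsed
    ("The committee met yesterday.", "male"),
    ("Results were published today.", "male"),
    ("Thank you, Madam President.", "female"),
    ("We reviewed the report together.", "female"),
]
balanced = balance_by_undersampling(toy)
counts = {label: sum(1 for _, l in balanced if l == label) for label in ("female", "male")}
print(counts)
```

Undersampling discards data from the larger class, which is the price paid for a 50/50 label split without synthetic samples.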

---

## 🛠️ Preprocessing & Training

- **Normalization**: Cleaned quotes, dashes, placeholders, noise, and HTML/code from all datasets.
- **Translation**: Used `Helsinki-NLP/opus-mt-*` models to translate the Polish and Russian data into English.
- **Undersampling**: Applied random undersampling to balance male and female samples.
- **Training Strategy**:
  - Used an LR finder to select the learning rate (`2.66e-6`)
  - Fine-tuned with early stopping on both F1 and loss
  - Evaluated every 250 steps
  - Saved and evaluated the best checkpoint, at step 24,750
- **Second Phase Fine-tuning**:
  - Performed on the full merged dataset for 2 epochs
  - Used a cosine learning rate scheduler with warm-up steps
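
To make the shape of a cosine schedule with warm-up concrete, here is a small standalone sketch. It is not the training code: the step counts are hypothetical, and reusing the `2.66e-6` learning rate as the peak for the second phase is an assumption. In practice, Transformers provides `get_cosine_schedule_with_warmup` for this purpose; the function below only reproduces the general curve in plain Python.

```python
import math


def lr_at_step(step, total_steps, warmup_steps, peak_lr):
    """Linear warm-up from 0 to peak_lr, then cosine decay back to 0."""
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * min(1.0, progress)))


PEAK_LR = 2.66e-6     # value reported for phase one; its reuse here is an assumption
TOTAL_STEPS = 10_000  # hypothetical
WARMUP_STEPS = 500    # hypothetical

for step in (0, 250, 500, 5_250, 10_000):
    print(step, f"{lr_at_step(step, TOTAL_STEPS, WARMUP_STEPS, PEAK_LR):.3e}")
```

The learning rate rises linearly during warm-up, peaks at the end of it, and is halved at the midpoint of the decay phase.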

---

## 📈 Performance (on full merged test set)

| Class | Precision | Recall | F1-Score | Accuracy | Support |
|-----|-----|--------|----------|---------|---------|
| Female | 0.70 | 0.65 | 0.68 | | 591,027 |
| Male | 0.68 | 0.72 | 0.70 | | 591,027 |
| **Macro Avg** | 0.69 | 0.69 | **0.69** | | 1,182,054 |
| **Accuracy** | | | | **0.69** | 1,182,054 |
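
As a rough consistency check, the summary figures can be recomputed from the per-class precision and recall in the table. The sketch below works only from the rounded two-decimal values, so small differences from the table are expected; for example, it yields about 0.67 for the female F1 where the table reports 0.68, presumably because the table was computed from unrounded values. Because both classes have equal support, overall accuracy equals the mean of the two recalls.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# (precision, recall) per class, copied from the rounded table values.
reported = {"Female": (0.70, 0.65), "Male": (0.68, 0.72)}

per_class_f1 = {name: f1_score(p, r) for name, (p, r) in reported.items()}
macro_f1 = sum(per_class_f1.values()) / len(per_class_f1)

# With equal class support, accuracy is the unweighted mean of the recalls.
accuracy = sum(r for _, r in reported.values()) / len(reported)

for name, value in per_class_f1.items():
    print(f"{name} F1: {value:.3f}")
print(f"Macro F1: {macro_f1:.3f}")
print(f"Accuracy: {accuracy:.3f}")
```

Both the macro F1 and the accuracy round to 0.69, in line with the reported scores.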

---

## 📦 Usage Example

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch
import torch.nn.functional as F

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model_name = "fc63/gender_prediction_model_from_text"

# use_fast=False loads the slow tokenizer implementation.
tokenizer = AutoTokenizer.from_pretrained(model_name, use_fast=False)
model = AutoModelForSequenceClassification.from_pretrained(model_name).eval().to(device)

def predict(text):
    # Truncate to 128 tokens, the maximum length listed in the model summary.
    inputs = tokenizer(text, return_tensors="pt", truncation=True, padding=True, max_length=128).to(device)
    with torch.no_grad():
        outputs = model(**inputs)
    probs = F.softmax(outputs.logits, dim=1)
    pred = torch.argmax(probs, dim=1).item()
    confidence = round(probs[0][pred].item() * 100, 1)
    # Class index 0 is female, class index 1 is male.
    gender = "Female" if pred == 0 else "Male"
    return f"{gender} (Confidence: {confidence}%)"
```

```python
sample_text = "I love writing in my journal every night. It helps me reflect on the day and plan for tomorrow."
print(predict(sample_text))
```

Output for this sample:

```
Female (Confidence: 84.1%)
```
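
The confidence value printed by `predict()` is the softmax probability of the winning class. The following standalone sketch shows that calculation without loading the model. The logits are made-up numbers chosen for illustration, not real model output, and the `softmax` and `describe` helpers are hypothetical names that mirror the logic of `predict()`.

```python
import math

LABELS = ("Female", "Male")  # index 0 -> Female, index 1 -> Male, as in predict()


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def describe(logits):
    """Format the most probable class and its probability as a percentage."""
    probs = softmax(logits)
    pred = max(range(len(probs)), key=probs.__getitem__)
    return f"{LABELS[pred]} (Confidence: {round(probs[pred] * 100, 1)}%)"


hypothetical_logits = [0.8, -0.8]  # invented values, not produced by the model
print(describe(hypothetical_logits))
```

With two classes, the confidence can never fall below 50%, so a value only slightly above that indicates a near coin-flip prediction.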

---

## 📌 Future Work & Limitations

I do not intend to leave this model at 0.69 accuracy and F1.

As far as I can tell at this point, the model is biased towards predicting emotional, psychological, and introspective texts as female. Likewise, more direct, results-oriented writing is often predicted as male. Correcting this requires a large, carefully labeled dataset that contains counterexamples to this pattern.

The datasets used to train this model had to be obtained from open-source platforms, which limited the range of accessible data.

To make further progress, I would need to create and label a larger dataset myself, which requires a significant amount of time, effort, and cost.

Before moving to dataset creation, I plan to try a few more approaches using the current dataset. So far, alternative techniques have not improved the scores without causing overfitting. If none of the remaining methods work, the only step left will be building a new dataset, and that will likely be the point where I stop development, as it would be both labor-intensive and costly for me.

---

## 👨‍🔬 Author & License

**Author**: Furkan Çoban

**Project**: CENG-481 Gender Prediction Model

**License**: MIT