---
license: mit
base_model: gpt2-large
library_name: transformers
pipeline_tag: text-generation
tags:
- text-generation
- gpt2
- dialogue
- emotion
- causal-lm
language:
- en
datasets:
- DailyDialog
metrics:
- perplexity
---

# Emotional GPT-2 Large

`emotional-gpt2-large` is a [GPT-2 Large](https://huggingface.co/openai-community/gpt2-large) causal language model fine-tuned for emotion-conditioned dialogue generation with DailyDialog-derived data.

GitHub repository: [`Mario-RC/emotional-gpt`](https://github.com/Mario-RC/emotional-gpt)

## Model Details

- Base model: [`gpt2-large`](https://huggingface.co/openai-community/gpt2-large)
- Architecture: `GPT2LMHeadModel`
- Task: text generation
- Context length: 1024 tokens
- Parameters: 774.0M
- Evaluation perplexity: 7.4115

## Model Comparison

| Model | Base model | Parameters | Evaluation perplexity |
| :--- | :---: | :---: | :---: |
| [Emotional DistilGPT2](https://huggingface.co/mario-rc/emotional-distilgpt2) | [`distilgpt2`](https://huggingface.co/distilgpt2) | 81.9M | 15.3322 |
| [Emotional GPT-2](https://huggingface.co/mario-rc/emotional-gpt2) | [`gpt2`](https://huggingface.co/openai-community/gpt2) | 124.4M | 12.9404 |
| [Emotional GPT-2 Medium](https://huggingface.co/mario-rc/emotional-gpt2-medium) | [`gpt2-medium`](https://huggingface.co/openai-community/gpt2-medium) | 354.8M | 10.0080 |
| [Emotional GPT-2 Large](https://huggingface.co/mario-rc/emotional-gpt2-large) | [`gpt2-large`](https://huggingface.co/openai-community/gpt2-large) | 774.0M | 7.4115 |
| [Emotional DialoGPT Small](https://huggingface.co/mario-rc/emotional-dialogpt-small) | [`microsoft/DialoGPT-small`](https://huggingface.co/microsoft/DialoGPT-small) | 124.4M | 13.0488 |
| [Emotional DialoGPT Medium](https://huggingface.co/mario-rc/emotional-dialogpt-medium) | [`microsoft/DialoGPT-medium`](https://huggingface.co/microsoft/DialoGPT-medium) | 354.8M | 10.5130 |
| [Emotional DialoGPT Large](https://huggingface.co/mario-rc/emotional-dialogpt-large) | [`microsoft/DialoGPT-large`](https://huggingface.co/microsoft/DialoGPT-large) | 774.0M | 8.6719 |

## Training

The fine-tuning run used the following setup:

- Framework: Hugging Face Transformers
- Training data: `data/gpt-dialogues/train.txt`; evaluation data: `data/gpt-dialogues/dev.txt`, built from DailyDialog CSV resources
- Epochs: 4
- Train/eval batch size per GPU: 1 / 1
- Gradient accumulation steps: 6
- Effective training batch size: 6
- Learning rate: `1e-5`
- Max gradient norm: `1.0`
- Objective: line-by-line causal language modeling
- Seed: `42`
- Checkpointing/logging: every 5000 optimizer steps; last checkpoint kept
- Memory optimization: gradient checkpointing enabled

## Training Format

Training examples use adjacent DailyDialog utterance pairs with explicit source and target emotion labels:

```text
source utterancetarget utterance<|endoftext|>
```

## Prompt Format

At generation time, the prompt should include the source utterance and the desired target emotion:

```text
source utterance
```

Prompt and training tags:

- `` marks the beginning of one formatted dialogue example.
- `` is a placeholder for one emotion label describing the input/source utterance, for example ``.
- `source utterance` is the user/input text.
- `` separates the source side from the response side.
- `` is a placeholder for the emotion you want the generated response to follow, for example ``.
- `target utterance` is the response text generated by the model.
- `<|endoftext|>` marks the end of one example. GPT-2 uses this as its native end-of-text/eos token, and generation can stop when this token is produced.

Emotion conditioning: replace `` and `` in the template with one of the model's literal emotion tokens in each position.

Supported emotion labels:

- ``
- ``
- ``
- ``
- ``
- ``
- ``

For example:

```text
I just started a new job and I am a bit nervous.
```

This means: the source utterance expresses `fear`, and the requested response should be conditioned toward `happiness`.
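The prompt template described above can be assembled with a small helper. The following is a minimal sketch, not the repository's own code: `build_prompt` and its parameter names are hypothetical, and the literal tag strings are passed in by the caller rather than hard-coded, so take them from the model's tokenizer vocabulary. It assumes the tags are concatenated in the order listed above with no extra whitespace between them; adjust if the training data used different spacing.

```python
def build_prompt(
    source_utterance: str,
    source_emotion_tag: str,
    target_emotion_tag: str,
    bos_tag: str,
    sep_tag: str,
) -> str:
    """Assemble a generation prompt (hypothetical helper).

    Assumed order: beginning tag, source emotion tag, source utterance,
    separator tag, target emotion tag. Every tag argument must be the
    literal special-token string used by the model's tokenizer.
    """
    tags = {
        "bos_tag": bos_tag,
        "source_emotion_tag": source_emotion_tag,
        "sep_tag": sep_tag,
        "target_emotion_tag": target_emotion_tag,
    }
    # Reject empty tags early: a missing tag silently removes the conditioning.
    for name, tag in tags.items():
        if not tag:
            raise ValueError(f"{name} must be a non-empty token string")

    # Trim surrounding whitespace from the user text before concatenating.
    text = source_utterance.strip()
    return f"{bos_tag}{source_emotion_tag}{text}{sep_tag}{target_emotion_tag}"
```

The returned string ends with the target emotion tag, so the model's continuation is the response text followed by `<|endoftext|>`.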
## How to Use

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "mario-rc/emotional-gpt2-large"

# Load the fine-tuned tokenizer and model from the Hub.
tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(repo_id)
model.config.pad_token_id = tokenizer.pad_token_id

prompt = "I just started a new job and I am a bit nervous."
inputs = tokenizer(prompt, return_tensors="pt")

# Sample a response with nucleus sampling; stop at the end-of-text token.
outputs = model.generate(
    **inputs,
    do_sample=True,
    max_new_tokens=80,
    temperature=0.8,
    top_p=0.95,
    pad_token_id=tokenizer.pad_token_id,
    eos_token_id=tokenizer.eos_token_id,
)

# Keep only the newly generated tokens (drop the prompt tokens).
generated = outputs[0][inputs["input_ids"].shape[-1]:]
response = tokenizer.decode(generated, skip_special_tokens=False)

# Cut the text at the first end-of-text token.
response = response.split(tokenizer.eos_token, 1)[0].strip()

# Remove a leading emotion label if the model emits one.
emotion_labels = [
    "",
    "",
    "",
    "",
    "",
    "",
    "",
]
for label in emotion_labels:
    if response.startswith(label):
        response = response[len(label):].strip()
        break

print(response)
```

## Limitations

The model is intended for experimental dialogue/text generation. Generated text may be inaccurate, biased, repetitive, or emotionally inappropriate, and should be reviewed before user-facing use.