---
license: mit
base_model: gpt2-large
library_name: transformers
pipeline_tag: text-generation
tags:
- text-generation
- gpt2
- dialogue
- emotion
- causal-lm
language:
- en
datasets:
- DailyDialog
metrics:
- perplexity
---

# Emotional GPT-2 Large

`emotional-gpt2-large` is a [GPT-2 Large](https://huggingface.co/openai-community/gpt2-large)
causal language model fine-tuned on DailyDialog-derived data for
emotion-conditioned dialogue generation.

GitHub repository: [`Mario-RC/emotional-gpt`](https://github.com/Mario-RC/emotional-gpt)

## Model Details

- Base model: [`gpt2-large`](https://huggingface.co/openai-community/gpt2-large)
- Architecture: `GPT2LMHeadModel`
- Task: text generation
- Context length: 1024 tokens
- Parameters: 774.0M
- Evaluation perplexity: 7.4115

## Model Comparison

| Model | Base model | Parameters | Evaluation perplexity |
| :--- | :---: | :---: | :---: |
| [Emotional DistilGPT2](https://huggingface.co/mario-rc/emotional-distilgpt2) | [`distilgpt2`](https://huggingface.co/distilgpt2) | 81.9M | 15.3322 |
| [Emotional GPT-2](https://huggingface.co/mario-rc/emotional-gpt2) | [`gpt2`](https://huggingface.co/openai-community/gpt2) | 124.4M | 12.9404 |
| [Emotional GPT-2 Medium](https://huggingface.co/mario-rc/emotional-gpt2-medium) | [`gpt2-medium`](https://huggingface.co/openai-community/gpt2-medium) | 354.8M | 10.0080 |
| [Emotional GPT-2 Large](https://huggingface.co/mario-rc/emotional-gpt2-large) | [`gpt2-large`](https://huggingface.co/openai-community/gpt2-large) | 774.0M | 7.4115 |
| [Emotional DialoGPT Small](https://huggingface.co/mario-rc/emotional-dialogpt-small) | [`microsoft/DialoGPT-small`](https://huggingface.co/microsoft/DialoGPT-small) | 124.4M | 13.0488 |
| [Emotional DialoGPT Medium](https://huggingface.co/mario-rc/emotional-dialogpt-medium) | [`microsoft/DialoGPT-medium`](https://huggingface.co/microsoft/DialoGPT-medium) | 354.8M | 10.5130 |
| [Emotional DialoGPT Large](https://huggingface.co/mario-rc/emotional-dialogpt-large) | [`microsoft/DialoGPT-large`](https://huggingface.co/microsoft/DialoGPT-large) | 774.0M | 8.6719 |
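
To help read the table: if the reported values are the exponential of the mean per-token cross-entropy loss, each perplexity maps back to an evaluation loss in nats. That relation is the usual convention in language-modeling evaluation, but it is an assumption here, since this card does not state how perplexity was computed. A minimal sketch:

```python
import math

# Evaluation perplexities copied from the comparison table above.
perplexities = {
    "emotional-distilgpt2": 15.3322,
    "emotional-gpt2": 12.9404,
    "emotional-gpt2-medium": 10.0080,
    "emotional-gpt2-large": 7.4115,
    "emotional-dialogpt-small": 13.0488,
    "emotional-dialogpt-medium": 10.5130,
    "emotional-dialogpt-large": 8.6719,
}


def implied_loss(perplexity: float) -> float:
    """Assumed relation: perplexity = exp(mean cross-entropy in nats)."""
    return math.log(perplexity)


implied_losses = {name: implied_loss(ppl) for name, ppl in perplexities.items()}

# Print models from lowest to highest implied loss.
for name, loss in sorted(implied_losses.items(), key=lambda item: item[1]):
    print(f"{name:28s} ppl={perplexities[name]:8.4f}  loss={loss:.4f}")
```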

## Training

The fine-tuning run used the following setup:

- Framework: Hugging Face Transformers
- Training data: `data/gpt-dialogues/train.txt`; evaluation data: `data/gpt-dialogues/dev.txt` (both built from DailyDialog CSV resources)
- Epochs: 4
- Train/eval batch size per GPU: 1 / 1
- Gradient accumulation steps: 6
- Effective training batch size: 6 (1 per GPU × 6 accumulation steps)
- Learning rate: `1e-5`
- Max gradient norm: `1.0`
- Objective: line-by-line causal language modeling
- Seed: `42`
- Checkpointing/logging: every 5000 optimizer steps; last checkpoint kept
- Memory optimization: gradient checkpointing enabled
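
The effective batch size follows from the per-GPU batch size and the gradient accumulation steps. The sketch below reproduces that arithmetic under the assumption of a single GPU, which this card does not state explicitly. The `FineTuneConfig` class and the `num_gpus` field are illustrative names, not part of the training code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FineTuneConfig:
    """Hyperparameters listed above; the class itself is hypothetical."""

    epochs: int = 4
    per_gpu_train_batch_size: int = 1
    gradient_accumulation_steps: int = 6
    learning_rate: float = 1e-5
    max_grad_norm: float = 1.0
    seed: int = 42
    num_gpus: int = 1  # assumption: the GPU count is not stated in the card

    @property
    def effective_batch_size(self) -> int:
        # Number of examples that contribute to each optimizer step.
        return (
            self.per_gpu_train_batch_size
            * self.gradient_accumulation_steps
            * self.num_gpus
        )


config = FineTuneConfig()
print(config.effective_batch_size)  # 6 with the single-GPU assumption
```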

## Training Format

Training examples use adjacent DailyDialog utterance pairs with explicit source
and target emotion labels:

```text
<bos><source_emotion>source utterance<sep><target_emotion>target utterance<|endoftext|>
```
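
As a rough illustration of how such lines could be produced from one dialogue, the sketch below emits one formatted line per consecutive utterance pair. It assumes overlapping pairs (utterance 1 with 2, 2 with 3, and so on) and emotion labels stored without angle brackets; neither detail is confirmed by this card. The function name `adjacent_pair_lines` and the sample dialogue are made up for the example.

```python
from typing import Iterator, Sequence


def adjacent_pair_lines(
    utterances: Sequence[str], emotions: Sequence[str]
) -> Iterator[str]:
    """Yield one formatted training line per consecutive utterance pair."""
    if len(utterances) != len(emotions):
        raise ValueError("each utterance needs exactly one emotion label")
    for i in range(len(utterances) - 1):
        yield (
            f"<bos><{emotions[i]}>{utterances[i]}"
            f"<sep><{emotions[i + 1]}>{utterances[i + 1]}<|endoftext|>"
        )


# Made-up dialogue for illustration, not taken from DailyDialog.
dialogue = [
    "I lost my keys again.",
    "Oh no, where did you last see them?",
    "Found them, they were in my bag!",
]
labels = ["sadness", "surprise", "happiness"]

lines = list(adjacent_pair_lines(dialogue, labels))
for line in lines:
    print(line)
```

A dialogue with `n` utterances yields `n - 1` lines under this pairing scheme.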

## Prompt Format

At generation time, the prompt should include the source utterance and the
desired target emotion:

```text
<bos><source_emotion>source utterance<sep><target_emotion>
```

Prompt and training tags:

- `<bos>` marks the beginning of one formatted dialogue example.
- `<source_emotion>` is a placeholder for one emotion label describing the input/source utterance, for example `<fear>`.
- `source utterance` is the user/input text.
- `<sep>` separates the source side from the response side.
- `<target_emotion>` is a placeholder for the emotion you want the generated response to follow, for example `<happiness>`.
- `target utterance` is the response text generated by the model.
- `<|endoftext|>` marks the end of one example. GPT-2 uses this as its native end-of-text/eos token, and generation can stop when this token is produced.
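
The tag layout above can be checked mechanically. The following sketch splits a formatted line back into its four fields with a regular expression; `DialogueExample` and `parse_example` are hypothetical helper names, and the pattern assumes utterances never contain a literal `<sep>`.

```python
import re
from typing import NamedTuple, Optional


class DialogueExample(NamedTuple):
    source_emotion: str
    source_utterance: str
    target_emotion: str
    target_utterance: str


# Emotion tags are matched as "<...>" with no nested angle brackets.
# The trailing <|endoftext|> is optional so that prompts also parse.
_EXAMPLE_RE = re.compile(
    r"^<bos><(?P<src_emo>[^<>]+)>(?P<src>.*?)"
    r"<sep><(?P<tgt_emo>[^<>]+)>(?P<tgt>.*?)(?:<\|endoftext\|>)?$",
    re.DOTALL,
)


def parse_example(line: str) -> Optional[DialogueExample]:
    """Return the parsed fields, or None if the line does not fit the layout."""
    match = _EXAMPLE_RE.match(line.strip())
    if match is None:
        return None
    return DialogueExample(
        match["src_emo"], match["src"], match["tgt_emo"], match["tgt"]
    )


example = parse_example(
    "<bos><fear>I am nervous.<sep><happiness>You will do great!<|endoftext|>"
)
print(example)
```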

Emotion conditioning: replace `<source_emotion>` and `<target_emotion>` in the
template with one of the model's literal emotion tokens in each position.

Supported emotion labels:

- `<no emotion>`
- `<anger>`
- `<disgust>`
- `<fear>`
- `<happiness>`
- `<sadness>`
- `<surprise>`

For example:

```text
<bos><fear>I just started a new job and I am a bit nervous.<sep><happiness>
```

This means: the source utterance expresses `fear`, and the requested response
should be conditioned toward `happiness`.
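
Building prompts by hand makes it easy to mistype a label. A small helper, sketched here with the hypothetical name `build_prompt`, can validate both labels against the supported list and assemble the prompt string. It takes labels without angle brackets, which is a choice made for this example.

```python
# Supported labels from the list above, written without angle brackets.
EMOTION_LABELS = (
    "no emotion",
    "anger",
    "disgust",
    "fear",
    "happiness",
    "sadness",
    "surprise",
)


def build_prompt(
    source_utterance: str, source_emotion: str, target_emotion: str
) -> str:
    """Assemble a generation prompt, rejecting unsupported emotion labels."""
    for name in (source_emotion, target_emotion):
        if name not in EMOTION_LABELS:
            raise ValueError(f"unsupported emotion label: {name!r}")
    return (
        f"<bos><{source_emotion}>{source_utterance.strip()}"
        f"<sep><{target_emotion}>"
    )


prompt = build_prompt(
    "I just started a new job and I am a bit nervous.", "fear", "happiness"
)
print(prompt)
```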

## How to Use

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "mario-rc/emotional-gpt2-large"

tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(repo_id)

# Fall back to the EOS token for padding if the tokenizer defines no pad token.
pad_token_id = tokenizer.pad_token_id
if pad_token_id is None:
    pad_token_id = tokenizer.eos_token_id
model.config.pad_token_id = pad_token_id

# Prompt layout: source emotion, source utterance, then the target emotion.
prompt = "<bos><fear>I just started a new job and I am a bit nervous.<sep><happiness>"
inputs = tokenizer(prompt, return_tensors="pt")

# Sample a response; generation stops at the EOS token or after 80 new tokens.
outputs = model.generate(
    **inputs,
    do_sample=True,
    max_new_tokens=80,
    temperature=0.8,
    top_p=0.95,
    pad_token_id=pad_token_id,
    eos_token_id=tokenizer.eos_token_id,
)

# Keep only the newly generated tokens and cut the text at the first EOS token.
generated = outputs[0][inputs["input_ids"].shape[-1]:]
response = tokenizer.decode(generated, skip_special_tokens=False)
response = response.split(tokenizer.eos_token, 1)[0].strip()

emotion_labels = [
    "<no emotion>",
    "<anger>",
    "<disgust>",
    "<fear>",
    "<happiness>",
    "<sadness>",
    "<surprise>",
]

# Strip a leading emotion tag in case the model emits one before the reply.
for label in emotion_labels:
    if response.startswith(label):
        response = response[len(label):].strip()
        break

print(response)
```

## Limitations

The model is intended for experimental dialogue/text generation. Generated text
may be inaccurate, biased, repetitive, or emotionally inappropriate, and should
be reviewed before user-facing use.