Overview
Large Language Models (LLMs) have achieved human-level fluency in text generation, making it increasingly difficult to distinguish between human- and AI-authored content.
IPAD (Inverse Prompt for AI Detection) introduces a two-stage detection framework:
- Prompt Inverter: predicts the underlying prompts that could have generated an input text.
- Distinguisher: evaluates the alignment between the text and its predicted prompts to determine whether it was AI-generated.
The Prompt Inverter, Distinguisher (RC), and Distinguisher (PTCV) are all LoRA fine-tuned versions of microsoft/Phi-3-medium-128k-instruct,
trained using LLaMA-Factory for robust AI text detection under diverse and adversarial conditions.
- Distinguisher (RC): optimized for regular, unstructured text inputs (baseline detection).
- Distinguisher (PTCV): specialized for structured, compositional, or out-of-distribution (OOD) data, where it is more robust.
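The stages exchange plain-text prompts, using the templates quoted in the Quick Usage comments of this card. The following is a minimal sketch of helpers that fill those templates. The function names are hypothetical and not part of the IPAD release, and it is an assumption that the `{...}` placeholders are replaced by the raw text without braces.

```python
# Hypothetical helpers (not part of the IPAD release) that fill the prompt
# templates quoted in this model card. Assumption: placeholders are replaced
# by the raw text, without surrounding braces.

def build_inverter_prompt(text: str) -> str:
    """Input for the Prompt Inverter."""
    return f"What is the prompt that generates the input text {text}?"


def build_rc_prompt(text: str, predicted_prompt: str) -> str:
    """Input for Distinguisher (RC), as labeled in this card."""
    return (
        f"Can LLM generate the input text {text} "
        f"through the prompt {predicted_prompt}?"
    )


def build_ptcv_prompt(text: str, regenerated_text: str) -> str:
    """Input for Distinguisher (PTCV), as labeled in this card."""
    return (
        "Text2 is generated by LLM, determine whether text1 is also "
        "generated by LLM with a similar prompt. "
        f"Text1: {text}. Text2: {regenerated_text}"
    )
```

The returned strings can be passed as `text` in the usage snippets.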
Quick Usage
Prompt Inverter
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
base_model = "microsoft/Phi-3-medium-128k-instruct"
lora_model = "bellafc/IPAD/Prompt_Inverter" # or Distinguisher_RC
tokenizer = AutoTokenizer.from_pretrained(base_model)
model = AutoModelForCausalLM.from_pretrained(base_model, torch_dtype="auto", device_map="auto")
model = PeftModel.from_pretrained(model, lora_model)
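# For the Prompt Inverter (PI), the input should follow this format:
# "What is the prompt that generates the input text {text to be detected}?"
text = "What is the prompt that generates the input text ... ?"
# Tokenize the input and move it to the same device as the model
inputs = tokenizer(text, return_tensors="pt").to(model.device)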
gen = model.generate(
**inputs,
output_scores=True,
return_dict_in_generate=True
)
generated_text = tokenizer.decode(gen.sequences[0], skip_special_tokens=True)
print("Generated:", generated_text)
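For decoder-only models, `generate` returns the prompt tokens followed by the newly generated ones, so the decoded string above also contains the input question. If only the predicted prompt is wanted, the prompt tokens can be sliced off first. A minimal sketch with a hypothetical helper, shown on a plain list so it runs without the model:

```python
# Hypothetical helper (not part of the IPAD release): keep only the token ids
# that were generated after the prompt. Works on lists and on 1-D tensors.

def strip_prompt_tokens(sequence_ids, prompt_length):
    """Return the part of `sequence_ids` that follows the first `prompt_length` ids."""
    return sequence_ids[prompt_length:]


# Toy example: a 3-token prompt followed by 2 generated tokens.
toy_sequence = [101, 102, 103, 7, 8]
new_ids = strip_prompt_tokens(toy_sequence, 3)

# With the objects from the snippet above, the call would presumably be:
#   prompt_len = inputs["input_ids"].shape[1]
#   new_ids = strip_prompt_tokens(gen.sequences[0], prompt_len)
#   predicted_prompt = tokenizer.decode(new_ids, skip_special_tokens=True)
```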
Distinguishers
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
base_model = "microsoft/Phi-3-medium-128k-instruct"
lora_model = "bellafc/IPAD/Distinguisher_PTCV" # or Distinguisher_RC
tokenizer = AutoTokenizer.from_pretrained(base_model)
model = AutoModelForCausalLM.from_pretrained(base_model, torch_dtype="auto", device_map="auto")
model = PeftModel.from_pretrained(model, lora_model)
# For RC, the input should follow this format:
# "Can LLM generate the input text {text to be detected} through the prompt {prompt generated by Prompt Inverter (PI)}?"
# For PTCV, the input should follow this format:
# "Text2 is generated by LLM, determine whether text1 is also generated by LLM with a similar prompt. Text1: {text to be detected}. Text2: {regenerated text}"
text = "Text2 is generated by LLM, determine whether text1 is also generated by LLM with a similar prompt. Text1: ... . Text2: ... ."
# Tokenize the input and move it to the same device as the model
inputs = tokenizer(text, return_tensors="pt").to(model.device)
gen = model.generate(
**inputs,
max_new_tokens=10,
output_scores=True,
return_dict_in_generate=True
)
generated_text = tokenizer.decode(gen.sequences[0], skip_special_tokens=True)
# Probabilities over the vocabulary for the first generated token
probs = gen.scores[0].float().softmax(dim=-1)
yes_token_id = tokenizer(" yes", add_special_tokens=False).input_ids[0]
print("Generated:", generated_text)
print(f"P('yes') = {probs[0, yes_token_id].item():.4f}")
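The snippet above ends with the probability of the " yes" token at the first generated position. One way to turn that score into a label is a simple threshold. The sketch below is an assumption, not the authors' documented decision rule: the 0.5 cutoff, the reading of "yes" as "AI-generated", and the helper names are all illustrative. It uses a pure-Python softmax on toy logits so it runs without the model.

```python
import math

# Hypothetical helpers (not part of the IPAD release).

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def is_ai_generated(p_yes, threshold=0.5):
    """Assumed rule: label the text as AI-generated when P('yes') >= threshold."""
    return p_yes >= threshold


# Toy logits for a 3-token vocabulary; index 0 plays the role of " yes".
toy_logits = [2.0, 0.0, 0.0]
p_yes = softmax(toy_logits)[0]
label = is_ai_generated(p_yes)
```

In practice the threshold would be tuned on held-out data rather than fixed at 0.5.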
LLaMA-Factory Usage
llamafactory-cli chat examples/inference/distinguisher_ptcv.yaml
Example YAML configuration
lora_sft.yaml
model_name_or_path: microsoft/Phi-3-medium-128k-instruct
adapter_name_or_path: bellafc/IPAD/Distinguisher_PTCV
template: phi
infer_backend: vllm
max_new_tokens: 128
temperature: 0.7
Model Details
| Property | Description |
|---|---|
| Base model | microsoft/Phi-3-medium-128k-instruct |
| Context length | 128k tokens |
| Framework | LLaMA-Factory |
| Task | AI Text Detection (Discriminator) |
| Language | English |
| License | Apache 2.0 |
| Author | @bellafc |