Aspect Extraction Model for Restaurant Reviews using Llama 3.1 8b

This repository contains a fine-tuned version of unsloth/meta-llama-3.1-8b-instruct-bnb-4bit, trained for aspect extraction on the SemEval 2014 Restaurant Dataset. Prompts use the InstructABSA instruction (a task definition followed by six worked examples) wrapped in the Alpaca prompt template, so the model expects restaurant reviews presented in that same format.

Model Overview

Base Model: unsloth/meta-llama-3.1-8b-instruct-bnb-4bit
Fine-tuning Dataset: SemEval 2014 Restaurant Dataset
Task: Aspect Extraction
Prompt Format: InstructABSA within Alpaca prompt format

Performance Metrics

Dataset	F1 Score
Train	93.76%
Test	94.03%
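
The card does not state how these F1 scores were computed. One common convention for aspect extraction is exact-match micro F1, where predicted and gold aspect terms are pooled over all reviews. The sketch below assumes that convention; the function name aspect_f1 and the sample data are hypothetical and are not taken from the author's evaluation code.

```python
from collections import Counter


def aspect_f1(gold, pred):
    """Micro-averaged exact-match F1 over per-review lists of aspect terms."""
    tp = fp = fn = 0
    for gold_terms, pred_terms in zip(gold, pred):
        # Compare case-insensitively and count repeated terms as a multiset
        g = Counter(t.strip().lower() for t in gold_terms)
        p = Counter(t.strip().lower() for t in pred_terms)
        overlap = sum((g & p).values())
        tp += overlap
        fp += sum(p.values()) - overlap
        fn += sum(g.values()) - overlap
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical gold labels and predictions for two reviews
gold = [["food", "menu", "service", "setting"], ["seats"]]
pred = [["food", "menu", "service"], ["seats", "wall"]]
score = aspect_f1(gold, pred)
print(f"{score:.4f}")
```

In this toy case there are 4 true positives, 1 false positive, and 1 false negative, so precision and recall are both 0.8.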

Use Cases

This model is well suited for:

Research: exploring new methods or validating existing ones in aspect-based sentiment analysis (ABSA).
Real-world applications: deriving actionable insights from restaurant reviews for businesses, marketers, and product developers.

Inference Speed

Inference time: roughly 1 second per review (tested on NVIDIA GPUs with 4-bit quantization).
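
The card does not describe how this figure was measured. A minimal sketch of one way to estimate per-review latency on your own hardware is shown below. The helper mean_latency and the stand-in fake_extract are hypothetical; in practice you would pass a function that runs the model on one review and returns the decoded text.

```python
import time


def mean_latency(run_one, reviews):
    """Return the mean wall-clock seconds per review for the callable run_one."""
    start = time.perf_counter()
    for review in reviews:
        run_one(review)
    elapsed = time.perf_counter() - start
    return elapsed / len(reviews)


# Stand-in for a real model call, so the sketch runs without a GPU
def fake_extract(review):
    time.sleep(0.01)
    return "food"


latency = mean_latency(fake_extract, ["Great food.", "Slow service."])
print(f"{latency:.3f} s per review")
```

Timing a few warm-up calls separately is usually worthwhile, since the first generation on a GPU tends to be slower than later ones.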

Installation

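Install the required dependencies with pip. The commands below are written as a notebook cell (Jupyter or Colab), where a leading ! runs a shell command: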

import os
if "COLAB_" not in "".join(os.environ.keys()):
    !pip install unsloth
else:
    # Do this only in Colab notebooks! Otherwise, use pip install unsloth
    !pip install --no-deps bitsandbytes accelerate xformers==0.0.29 peft trl triton
    !pip install --no-deps cut_cross_entropy unsloth_zoo
    !pip install sentencepiece protobuf datasets huggingface_hub hf_transfer
    !pip install --no-deps unsloth

# Replace the installed release with the latest Unsloth source from GitHub
!pip uninstall unsloth -y && pip install --upgrade --no-cache-dir --no-deps git+https://github.com/unslothai/unsloth.git

Example Usage

from unsloth import FastLanguageModel
import torch

model, tokenizer = FastLanguageModel.from_pretrained(
    "RichardLu/Llama3_AE_res",
    load_in_4bit=True,
    max_seq_length=2048,
)

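# Enable Unsloth's inference mode
FastLanguageModel.for_inference(model)

# InstructABSA instruction: a task definition followed by six worked examples.
# The example sentences are kept verbatim, including their original spelling.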
instructabsa_instruction = """Definition: The output will be the aspects (both implicit and explicit) which have an associated opinion that are extracted from the input text. In cases where there are no aspects the output should be noaspectterm.
Positive example 1-
input: With the great variety on the menu, I eat here often and never get bored.
output: menu
Positive example 2-
input: Great food, good size menu, great service and an unpretensious setting.
output: food, menu, service, setting
Negative example 1-
input: They did not have mayonnaise, forgot our toast, left out ingredients (ie cheese in an omelet), below hot temperatures and the bacon was so over cooked it crumbled on the plate when you touched it.
output: toast, mayonnaise, bacon, ingredients, plate
Negative example 2-
input: The seats are uncomfortable if you are sitting against the wall on wooden benches.
output: seats
Neutral example 1-
input: I asked for seltzer with lime, no ice.
output: seltzer with lime
Neutral example 2-
input: They wouldnt even let me finish my glass of wine before offering another.
output: glass of wine
Now complete the following example:"""
alpaca_prompt = """Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.
### Instruction:
{}
### Input:
{}
### Response:
{}"""

prompt = alpaca_prompt.format(instructabsa_instruction, "Great food, good size menu, great service and an unpretensious setting.", "")

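# Tokenize the prompt and move the tensors to the GPU
inputs = tokenizer(prompt, return_tensors="pt").to("cuda")
# Generate up to 128 new tokens
output_ids = model.generate(**inputs, max_new_tokens=128)
# The decoded text contains the prompt followed by the completion
output_text = tokenizer.decode(output_ids[0], skip_special_tokens=True)

# Keep only the text after the final "### Response:" marker
print(output_text.split("### Response:")[-1].strip())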
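
According to the instruction above, the response is a comma-separated list of aspect terms, or the literal noaspectterm when the review has none. A minimal sketch for turning that string into a Python list follows; the helper name parse_aspects is hypothetical, and it assumes no aspect term itself contains a comma.

```python
def parse_aspects(response: str) -> list[str]:
    """Split a model response into aspect terms; return [] for 'noaspectterm'."""
    text = response.strip()
    if not text or text.lower() == "noaspectterm":
        return []
    # Drop empty fragments left by stray or trailing commas
    return [term.strip() for term in text.split(",") if term.strip()]


print(parse_aspects("food, menu, service, setting"))
print(parse_aspects("noaspectterm"))
```

The same function can be applied to the string printed by the example above.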

License

This model is intended for research and educational purposes. Please cite it if you use it in academic or industry research.

Citation

If you use this model in your research, please cite this repository:

@misc{llama3_ae_res_2025,
  author = {Lu Phone Maw},
  title = {Aspect Extraction Model for Restaurant Reviews using Llama 3.1 8b},
  year = {2025},
  publisher = {Lu Phone Maw},
  journal = {Hugging Face repository},
  howpublished = {\url{https://huggingface.co/RichardLu/Llama3_AE_res}}
}

For any questions or feedback, please contact the repository maintainer.
