Not-For-All-Audiences

Instructions for using Hastagaras/Halu-8B-Llama3-Blackroot with libraries, inference providers, notebooks, and local apps.

Libraries

How to use Hastagaras/Halu-8B-Llama3-Blackroot with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="Hastagaras/Halu-8B-Llama3-Blackroot")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("Hastagaras/Halu-8B-Llama3-Blackroot")
model = AutoModelForCausalLM.from_pretrained("Hastagaras/Halu-8B-Llama3-Blackroot")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))
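
To make it clearer what `apply_chat_template` with `add_generation_prompt=True` hands to the model, here is a minimal sketch that renders the same message list by hand. It assumes this merge keeps the stock Llama 3 Instruct chat layout, which the README does not confirm; `render_llama3_prompt` is a hypothetical helper, and the authoritative template is whatever `tokenizer.chat_template` contains.

```python
def render_llama3_prompt(messages, add_generation_prompt=True):
    """Render chat messages in the stock Llama 3 Instruct layout (an assumption here)."""
    parts = ["<|begin_of_text|>"]
    for message in messages:
        # Each turn is a role header, a blank line, the content, then an end-of-turn token.
        parts.append(
            f"<|start_header_id|>{message['role']}<|end_header_id|>\n\n"
            f"{message['content'].strip()}<|eot_id|>"
        )
    if add_generation_prompt:
        # An open assistant header cues the model to write the reply.
        parts.append("<|start_header_id|>assistant<|end_header_id|>\n\n")
    return "".join(parts)


messages = [{"role": "user", "content": "Who are you?"}]
prompt = render_llama3_prompt(messages)
print(prompt)
```

If the printed string differs from `tokenizer.apply_chat_template(messages, add_generation_prompt=True, tokenize=False)`, trust the tokenizer.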

vLLM

How to use Hastagaras/Halu-8B-Llama3-Blackroot with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "Hastagaras/Halu-8B-Llama3-Blackroot"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Hastagaras/Halu-8B-Llama3-Blackroot",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
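
The same request can be sent from Python. The sketch below uses only the standard library and mirrors the curl call above; `build_chat_request`, `ask`, and the `max_tokens` default are illustrative choices rather than part of the original instructions. For the SGLang server, pass `base_url="http://localhost:30000"` instead.

```python
import json
import urllib.request

MODEL_ID = "Hastagaras/Halu-8B-Llama3-Blackroot"


def build_chat_request(prompt, base_url="http://localhost:8000", max_tokens=128):
    """Build a POST request for the OpenAI-compatible chat completions endpoint."""
    payload = {
        "model": MODEL_ID,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,  # illustrative default, not from the original
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(prompt, **kwargs):
    """Send the prompt to a running server and return the assistant's reply text."""
    request = build_chat_request(prompt, **kwargs)
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]


# Usage (requires the server started above to be running):
# print(ask("What is the capital of France?"))
```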

Use Docker

docker model run hf.co/Hastagaras/Halu-8B-Llama3-Blackroot

SGLang

How to use Hastagaras/Halu-8B-Llama3-Blackroot with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "Hastagaras/Halu-8B-Llama3-Blackroot" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Hastagaras/Halu-8B-Llama3-Blackroot",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "Hastagaras/Halu-8B-Llama3-Blackroot" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Hastagaras/Halu-8B-Llama3-Blackroot",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Docker Model Runner
How to use Hastagaras/Halu-8B-Llama3-Blackroot with Docker Model Runner:
```
docker model run hf.co/Hastagaras/Halu-8B-Llama3-Blackroot
```

Halu-8B-Llama3-Blackroot / README.md

Hastagaras

Update README.md

371a8d1 verified about 2 years ago

	---
	license: llama3
	library_name: transformers
	tags:
	- mergekit
	- merge
	- not-for-all-audiences
	base_model:
	- Hastagaras/Halu-8B-Llama3-v0.3
	- Blackroot/Llama-3-LongStory-LORA
	- Hastagaras/Halu-8B-Llama3-v0.3
	- Blackroot/Llama-3-8B-Abomination-LORA
	- Hastagaras/Halu-8B-Llama3-v0.3
	model-index:
	- name: Halu-8B-Llama3-Blackroot
	  results:
	  - task:
	      type: text-generation
	      name: Text Generation
	    dataset:
	      name: AI2 Reasoning Challenge (25-Shot)
	      type: ai2_arc
	      config: ARC-Challenge
	      split: test
	      args:
	        num_few_shot: 25
	    metrics:
	    - type: acc_norm
	      value: 63.82
	      name: normalized accuracy
	    source:
	      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Hastagaras/Halu-8B-Llama3-Blackroot
	      name: Open LLM Leaderboard
	  - task:
	      type: text-generation
	      name: Text Generation
	    dataset:
	      name: HellaSwag (10-Shot)
	      type: hellaswag
	      split: validation
	      args:
	        num_few_shot: 10
	    metrics:
	    - type: acc_norm
	      value: 84.55
	      name: normalized accuracy
	    source:
	      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Hastagaras/Halu-8B-Llama3-Blackroot
	      name: Open LLM Leaderboard
	  - task:
	      type: text-generation
	      name: Text Generation
	    dataset:
	      name: MMLU (5-Shot)
	      type: cais/mmlu
	      config: all
	      split: test
	      args:
	        num_few_shot: 5
	    metrics:
	    - type: acc
	      value: 67.04
	      name: accuracy
	    source:
	      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Hastagaras/Halu-8B-Llama3-Blackroot
	      name: Open LLM Leaderboard
	  - task:
	      type: text-generation
	      name: Text Generation
	    dataset:
	      name: TruthfulQA (0-shot)
	      type: truthful_qa
	      config: multiple_choice
	      split: validation
	      args:
	        num_few_shot: 0
	    metrics:
	    - type: mc2
	      value: 53.28
	    source:
	      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Hastagaras/Halu-8B-Llama3-Blackroot
	      name: Open LLM Leaderboard
	  - task:
	      type: text-generation
	      name: Text Generation
	    dataset:
	      name: Winogrande (5-shot)
	      type: winogrande
	      config: winogrande_xl
	      split: validation
	      args:
	        num_few_shot: 5
	    metrics:
	    - type: acc
	      value: 79.48
	      name: accuracy
	    source:
	      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Hastagaras/Halu-8B-Llama3-Blackroot
	      name: Open LLM Leaderboard
	  - task:
	      type: text-generation
	      name: Text Generation
	    dataset:
	      name: GSM8k (5-shot)
	      type: gsm8k
	      config: main
	      split: test
	      args:
	        num_few_shot: 5
	    metrics:
	    - type: acc
	      value: 70.51
	      name: accuracy
	    source:
	      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=Hastagaras/Halu-8B-Llama3-Blackroot
	      name: Open LLM Leaderboard
	---
	## EXPERIMENTAL MODEL

	VERY IMPORTANT: This model has not been extensively tested or evaluated, and its performance characteristics are currently unknown. It may generate harmful, biased, or inappropriate content. Please exercise caution and use it at your own risk and discretion.

	I just tried [saishf's](https://huggingface.co/saishf) merged model, and it's great. So I decided to try a similar merge method with [Blackroot's](https://huggingface.co/Blackroot) LoRA that I had found earlier.

	I don't know what to say about this model... it is very strange. Maybe because Blackroot's amazing LoRAs used human data and not synthetic data, the model turned out to be very human-like, even in its actions and narration.

	WARNING: This model is very unsafe in certain parts, especially in RP (role-play).

	[IMATRIX GGUF IS HERE](https://huggingface.co/Lewdiculous/Halu-8B-Llama3-Blackroot-GGUF-IQ-Imatrix) made available by [Lewdiculous](https://huggingface.co/Lewdiculous)

	[STATIC GGUF IS HERE](https://huggingface.co/mradermacher/Halu-8B-Llama3-Blackroot-GGUF/tree/main) made available by [mradermacher](https://huggingface.co/mradermacher)

	<div align="left">
	<img src="https://huggingface.co/Hastagaras/Halu-8B-Llama3-Blackroot/resolve/main/Halu (1).png" width="500"/>
	</div>

	### Merge Method

	This model was merged using the [Model Stock](https://arxiv.org/abs/2403.19522) merge method, with [Hastagaras/Halu-8B-Llama3-v0.3](https://huggingface.co/Hastagaras/Halu-8B-Llama3-v0.3) as the base.

	### Models Merged

	The following models were included in the merge:
	* [Hastagaras/Halu-8B-Llama3-v0.3](https://huggingface.co/Hastagaras/Halu-8B-Llama3-v0.3) + [Blackroot/Llama-3-LongStory-LORA](https://huggingface.co/Blackroot/Llama-3-LongStory-LORA)
	* [Hastagaras/Halu-8B-Llama3-v0.3](https://huggingface.co/Hastagaras/Halu-8B-Llama3-v0.3) + [Blackroot/Llama-3-8B-Abomination-LORA](https://huggingface.co/Blackroot/Llama-3-8B-Abomination-LORA)

	### Configuration

	The following YAML configuration was used to produce this model:

	```yaml
	models:
	- model: Hastagaras/Halu-8B-Llama3-v0.3+Blackroot/Llama-3-LongStory-LORA
	- model: Hastagaras/Halu-8B-Llama3-v0.3+Blackroot/Llama-3-8B-Abomination-LORA
	merge_method: model_stock
	base_model: Hastagaras/Halu-8B-Llama3-v0.3
	dtype: bfloat16

	```
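
For intuition about what `merge_method: model_stock` does with the two LoRA-applied models and the base, here is a simplified sketch of the per-layer interpolation described in the Model Stock paper for two fine-tuned models. It is not mergekit's implementation: the function name `model_stock_layer` and the flat-list weight representation are assumptions made for illustration.

```python
import math


def model_stock_layer(base, tuned_a, tuned_b):
    """Merge one layer's weights (flat lists of floats) from two fine-tuned models."""
    # Task vectors: how far each fine-tuned layer moved away from the base.
    delta_a = [a - w0 for a, w0 in zip(tuned_a, base)]
    delta_b = [b - w0 for b, w0 in zip(tuned_b, base)]

    dot = sum(x * y for x, y in zip(delta_a, delta_b))
    norm_a = math.sqrt(sum(x * x for x in delta_a))
    norm_b = math.sqrt(sum(y * y for y in delta_b))
    if norm_a == 0.0 or norm_b == 0.0:
        return list(base)  # one model did not move; nothing to interpolate

    cos_theta = dot / (norm_a * norm_b)
    k = 2  # number of fine-tuned models being merged
    denominator = 1 + (k - 1) * cos_theta
    if denominator == 0.0:
        return list(base)  # opposite task vectors; fall back to the base

    # Interpolation ratio from the paper: t = k*cos(theta) / (1 + (k-1)*cos(theta)).
    t = k * cos_theta / denominator

    average = [(a + b) / 2 for a, b in zip(tuned_a, tuned_b)]
    return [t * avg + (1 - t) * w0 for avg, w0 in zip(average, base)]


# Orthogonal task vectors give t = 0, so the merged layer stays at the base.
print(model_stock_layer([0.0, 0.0], [1.0, 0.0], [0.0, 1.0]))
```

The more the two fine-tuned layers agree in direction, the closer the result sits to their average; the less they agree, the more it is pulled back toward the base.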
	# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
	Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_Hastagaras__Halu-8B-Llama3-Blackroot)

	| Metric                          |Value|
	|---------------------------------|----:|
	|Avg.                             |69.78|
	|AI2 Reasoning Challenge (25-Shot)|63.82|
	|HellaSwag (10-Shot)              |84.55|
	|MMLU (5-Shot)                    |67.04|
	|TruthfulQA (0-shot)              |53.28|
	|Winogrande (5-shot)              |79.48|
	|GSM8k (5-shot)                   |70.51|
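
The "Avg." row is the unweighted mean of the six benchmark scores, which can be checked with a few lines of Python (the dictionary keys are just labels copied from the table):

```python
scores = {
    "AI2 Reasoning Challenge (25-Shot)": 63.82,
    "HellaSwag (10-Shot)": 84.55,
    "MMLU (5-Shot)": 67.04,
    "TruthfulQA (0-shot)": 53.28,
    "Winogrande (5-shot)": 79.48,
    "GSM8k (5-shot)": 70.51,
}

# Unweighted mean across the six benchmarks.
average = sum(scores.values()) / len(scores)
print(f"{average:.2f}")  # prints 69.78
```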