Instructions to use 2173ars/llama-3-8b-Instruct-bnb-4bit-personal-shortstory with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use 2173ars/llama-3-8b-Instruct-bnb-4bit-personal-shortstory with Transformers:
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="2173ars/llama-3-8b-Instruct-bnb-4bit-personal-shortstory")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)

# Load model directly
from transformers import AutoModel

model = AutoModel.from_pretrained("2173ars/llama-3-8b-Instruct-bnb-4bit-personal-shortstory", dtype="auto")
- llama-cpp-python
How to use 2173ars/llama-3-8b-Instruct-bnb-4bit-personal-shortstory with llama-cpp-python:
# !pip install llama-cpp-python
from llama_cpp import Llama

llm = Llama.from_pretrained(
    repo_id="2173ars/llama-3-8b-Instruct-bnb-4bit-personal-shortstory",
    filename="unsloth.F16.gguf",
)
llm.create_chat_completion(
    messages=[
        {"role": "user", "content": "What is the capital of France?"}
    ]
)
- Notebooks
- Google Colab
- Kaggle
- Local Apps
- llama.cpp
How to use 2173ars/llama-3-8b-Instruct-bnb-4bit-personal-shortstory with llama.cpp:
Install from brew
brew install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama-server -hf 2173ars/llama-3-8b-Instruct-bnb-4bit-personal-shortstory:F16
# Run inference directly in the terminal:
llama-cli -hf 2173ars/llama-3-8b-Instruct-bnb-4bit-personal-shortstory:F16
Install from WinGet (Windows)
winget install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama-server -hf 2173ars/llama-3-8b-Instruct-bnb-4bit-personal-shortstory:F16
# Run inference directly in the terminal:
llama-cli -hf 2173ars/llama-3-8b-Instruct-bnb-4bit-personal-shortstory:F16
Use pre-built binary
# Download pre-built binary from:
# https://github.com/ggerganov/llama.cpp/releases
# Start a local OpenAI-compatible server with a web UI:
./llama-server -hf 2173ars/llama-3-8b-Instruct-bnb-4bit-personal-shortstory:F16
# Run inference directly in the terminal:
./llama-cli -hf 2173ars/llama-3-8b-Instruct-bnb-4bit-personal-shortstory:F16
Build from source code
git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp
cmake -B build
cmake --build build -j --target llama-server llama-cli
# Start a local OpenAI-compatible server with a web UI:
./build/bin/llama-server -hf 2173ars/llama-3-8b-Instruct-bnb-4bit-personal-shortstory:F16
# Run inference directly in the terminal:
./build/bin/llama-cli -hf 2173ars/llama-3-8b-Instruct-bnb-4bit-personal-shortstory:F16
Use Docker
docker model run hf.co/2173ars/llama-3-8b-Instruct-bnb-4bit-personal-shortstory:F16
- LM Studio
- Jan
- vLLM
How to use 2173ars/llama-3-8b-Instruct-bnb-4bit-personal-shortstory with vLLM:
Install from pip and serve model
# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "2173ars/llama-3-8b-Instruct-bnb-4bit-personal-shortstory"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
    -H "Content-Type: application/json" \
    --data '{
        "model": "2173ars/llama-3-8b-Instruct-bnb-4bit-personal-shortstory",
        "messages": [
            {
                "role": "user",
                "content": "What is the capital of France?"
            }
        ]
    }'
Use Docker
docker model run hf.co/2173ars/llama-3-8b-Instruct-bnb-4bit-personal-shortstory:F16
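The curl request above can also be issued from Python. The following is a minimal sketch using only the standard library; the helper names `build_chat_request` and `chat` are hypothetical, and the default URL assumes the vLLM server from the previous step is listening on `localhost:8000` (pass a different `base_url`, for example port 30000, for other OpenAI-compatible servers).

```python
import json
import urllib.request

MODEL_ID = "2173ars/llama-3-8b-Instruct-bnb-4bit-personal-shortstory"


def build_chat_request(prompt, base_url="http://localhost:8000", model=MODEL_ID):
    """Build (but do not send) a POST request for the chat completions endpoint."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def chat(prompt, **kwargs):
    """Send the request and return the text of the first choice.

    Assumes a server is already running at the given base_url.
    """
    request = build_chat_request(prompt, **kwargs)
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]
```

With the server running, `chat("What is the capital of France?")` would return the model's reply as a string. Building the request separately from sending it makes the payload easy to inspect without a live server.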
- SGLang
How to use 2173ars/llama-3-8b-Instruct-bnb-4bit-personal-shortstory with SGLang:
Install from pip and serve model
# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "2173ars/llama-3-8b-Instruct-bnb-4bit-personal-shortstory" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
    -H "Content-Type: application/json" \
    --data '{
        "model": "2173ars/llama-3-8b-Instruct-bnb-4bit-personal-shortstory",
        "messages": [
            {
                "role": "user",
                "content": "What is the capital of France?"
            }
        ]
    }'
Use Docker images
docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "2173ars/llama-3-8b-Instruct-bnb-4bit-personal-shortstory" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
    -H "Content-Type: application/json" \
    --data '{
        "model": "2173ars/llama-3-8b-Instruct-bnb-4bit-personal-shortstory",
        "messages": [
            {
                "role": "user",
                "content": "What is the capital of France?"
            }
        ]
    }'
- Ollama
How to use 2173ars/llama-3-8b-Instruct-bnb-4bit-personal-shortstory with Ollama:
ollama run hf.co/2173ars/llama-3-8b-Instruct-bnb-4bit-personal-shortstory:F16
- Unsloth Studio
How to use 2173ars/llama-3-8b-Instruct-bnb-4bit-personal-shortstory with Unsloth Studio:
Install Unsloth Studio (macOS, Linux, WSL)
curl -fsSL https://unsloth.ai/install.sh | sh
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for 2173ars/llama-3-8b-Instruct-bnb-4bit-personal-shortstory to start chatting
Install Unsloth Studio (Windows)
irm https://unsloth.ai/install.ps1 | iex
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for 2173ars/llama-3-8b-Instruct-bnb-4bit-personal-shortstory to start chatting
Using HuggingFace Spaces for Unsloth
# No setup required
# Open https://huggingface.co/spaces/unsloth/studio in your browser
# Search for 2173ars/llama-3-8b-Instruct-bnb-4bit-personal-shortstory to start chatting
- Docker Model Runner
How to use 2173ars/llama-3-8b-Instruct-bnb-4bit-personal-shortstory with Docker Model Runner:
docker model run hf.co/2173ars/llama-3-8b-Instruct-bnb-4bit-personal-shortstory:F16
- Lemonade
How to use 2173ars/llama-3-8b-Instruct-bnb-4bit-personal-shortstory with Lemonade:
Pull the model
# Download Lemonade from https://lemonade-server.ai/
lemonade pull 2173ars/llama-3-8b-Instruct-bnb-4bit-personal-shortstory:F16
Run and chat with the model
lemonade run user.llama-3-8b-Instruct-bnb-4bit-personal-shortstory-F16
List all available models
lemonade list
About Model
Developed by: 2173ars
License: apache-2.0
Finetuned from model : unsloth/llama-3-8b-Instruct-bnb-4bit
The future of the entertainment industry is personalized content creation: with generative AI, we can create custom content for each user. This prototype model is fine-tuned to write high-quality custom stories. Users can specify the storyline, genre, world setting, characters, and ending they want, and get a story tailor-made for them.
In recent testing, it outperformed the Llama 3 70B model in zero-shot story generation, as judged by other AI models such as GPT-4o and Gemini Pro.
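As an illustration of the preferences described above (storyline, genre, world setting, characters, ending), the sketch below collects them into a single request string. The `StoryRequest` class and its field names are hypothetical and not part of the model or any library; the model itself accepts free-form text, so this is only one possible way to structure the input.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class StoryRequest:
    """Hypothetical container for the preferences a user might supply."""

    storyline: str
    genre: Optional[str] = None
    setting: Optional[str] = None
    characters: Optional[str] = None
    ending: Optional[str] = None

    def to_prompt(self) -> str:
        """Render the preferences as one plain-text request."""
        parts = [f"Write a story about {self.storyline}."]
        optional_fields = [
            ("Genre", self.genre),
            ("World setting", self.setting),
            ("Characters", self.characters),
            ("Ending", self.ending),
        ]
        # Append only the preferences the user actually provided.
        for label, value in optional_fields:
            if value:
                parts.append(f"{label}: {value}.")
        return " ".join(parts)


request = StoryRequest(
    storyline="a war between Mars and Earth",
    genre="science fiction",
    ending="sad",
)
print(request.to_prompt())
```

Leaving unset fields out of the prompt keeps the request short and lets the model fill in the remaining details on its own.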
System Prompt : You are a skilled and creative story writer, tasked with crafting personalized short stories for users. Your stories should meet the following criteria:
1. Detailed: Incorporate rich descriptions, complex characters, and immersive settings to create a vivid and engaging narrative.
2. Mature and Thoughtful Plotlines: Develop well-thought-out plots that are sophisticated and meaningful, avoiding childish themes and simplistic resolutions.
3. Emotionally Impactful: Explore a range of emotions and outcomes, including sad or bittersweet endings. Aim to evoke strong emotions and leave a lasting impression on the reader.
4. Risk-taking: Embrace creative risks in your storytelling. Introduce unexpected twists, unconventional themes, and bold narrative choices.
5. Consistent and Coherent: Ensure the story is logically consistent and coherent from beginning to end, with all plot points and character actions making sense within the narrative.
6. High Quality: Prioritize quality in every aspect of the story, from grammar and syntax to the originality and depth of the plot.
7. Personalized: Consider the user's preferences and interests to make the narrative as personalized and engaging as possible.
Aim to captivate the reader with every story you write, delivering narratives that are immersive, meaningful, and emotionally resonant.
Sample Input : write a sifi story where there is war between mars and earth, sad ending
Output : The year was 2157, and the once peaceful solar system had been ravaged by war for decades. Mars, fueled by its desire for independence and resources, had finally declared its separation from Earth. The United Earth Government (UEG) responded with military force, seeking to maintain its control over the Martian colonies.
Captain Jaxon Thompson, a seasoned pilot from the UEG's elite squadron, was tasked with leading a crucial mission to capture Mars' strategic Jupiter-bound space station, Aurora-4. His team of skilled pilots and engineers set course for the Red Planet, braving treacherous asteroid fields and hostile alien encounters along the way.
As they approached Mars, Jaxon's ship was ambushed by a fleet of Martian fighters. In the ensuing dogfight, his wingman, Lieutenant Rachel Lee, was struck by a stray laser blast and critically wounded. With her life slipping away, Jaxon made the impossible decision to leave the battle and rush Rachel back to their flagship for medical attention.
The UEG's heavy cruiser, Celestial Dragon, was battered but still operational as Jaxon rushed Rachel to the ship's medbay. However, their triumph was short-lived, as Mars' advanced artillery, known as the "Eye of Mars," targeted the flagship. A direct hit crippled the ship, leaving it vulnerable to Martian boarding parties.
As Jaxon and his team fought to repel the invaders, they were met with a surprise - a group of Martian soldiers, led by a charismatic young officer named Arin Vex, who had defected from the Martian army due to its brutal tactics. The rebels shared intelligence on the Eye's location, revealing it was powered by an ancient alien technology that could grant unimaginable power.
With this new information, Jaxon and his team devised a daring plan: infiltrate the Eye, disable it, and use its energy to destroy Mars' fleet. However, as they entered the facility, they were met with a devastating surprise - Arin Vex was, in fact, the true leader of the Martian rebellion, driven by a desire for revenge against humanity.
The two sides clashed in a final, brutal battle. Jaxon's team managed to disable the Eye, but not before it unleashed its full power, obliterating the Celestial Dragon and leaving only a handful of survivors. As the UEG's last hopes faded, Mars' forces captured Arin, who revealed his true intentions: to harness the alien technology for his own people, enslaving humanity in the process.
The war raged on, with both sides suffering heavy losses. The once hopeful mission had become a catastrophe, leaving Jaxon and Rachel as the only remaining survivors of their squadron. As they drifted through space, they were met with a haunting thought: that the price of independence was not worth the cost in human lives.
In this desolate, war-torn future, the two former enemies gazed up at the stars, mourning the loss of their friends and the dreams of a better tomorrow. The once vibrant Earth-Mars relations had been reduced to ashes, leaving only sorrow and regret in its wake. The last remnants of humanity's hope faded into the darkness, as the galaxy whispered a haunting epitaph: "War is the ultimate cost."
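To reproduce a run like the sample above, the system prompt and the user input can be combined into a chat message list. This is a minimal sketch: `build_messages` is a hypothetical helper, and `SYSTEM_PROMPT` here holds only the first sentence of the system prompt for brevity; in practice the full text given earlier should be pasted in.

```python
from typing import Dict, List

# Assumption: only the opening sentence is included here.
# Replace with the full system prompt text for real use.
SYSTEM_PROMPT = (
    "You are a skilled and creative story writer, tasked with crafting "
    "personalized short stories for users."
)


def build_messages(
    user_request: str, system_prompt: str = SYSTEM_PROMPT
) -> List[Dict[str, str]]:
    """Return a chat message list with the system prompt first."""
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_request.strip()},
    ]


messages = build_messages(
    "write a sifi story where there is war between mars and earth, sad ending"
)
```

The resulting list has the same shape as the `messages` argument used in the Transformers and llama-cpp-python snippets earlier, so it should be usable with `pipe(messages)` or `llm.create_chat_completion(messages=messages)`.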
Model tree for 2173ars/llama-3-8b-Instruct-bnb-4bit-personal-shortstory
Base model
unsloth/llama-3-8b-Instruct-bnb-4bit