fine-tuning and model capability
I tried the online chat at llama.family, and it replies with English sentences even after I modify the system prompt. Is this because the model is not really aligned for the QA task, or is it because of the architecture?
I am asking because I fine-tuned the model on a QA task with very long contexts, but it does not seem to be learning much: after fine-tuning, it generates nothing when given a long input, although it does respond to short questions. I'm not sure whether I did something wrong during fine-tuning or whether this is a limitation of the model architecture. I would also like to know more about your approach to training it for the QA task for translation. I would appreciate any suggestions.
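One cause worth ruling out for the empty outputs on long inputs is truncation: if the context plus the answer exceeds the training sequence length, the answer tokens may be cut off, leaving the model nothing to learn from on those examples. Below is a minimal sketch of that check. It assumes right-truncation and the 4096-token context window of Llama 2. The helper names (`answer_tokens_kept`, `find_truncated`) are hypothetical, and the whitespace tokenizer is only a stand-in to keep the example self-contained.

```python
from typing import Callable, List, Sequence, Tuple

# Assumption: Llama 2 models use a 4096-token context window.
MAX_TOKENS = 4096


def whitespace_tokenize(text: str) -> List[str]:
    """Stand-in tokenizer for illustration only; real counts need the model's tokenizer."""
    return text.split()


def answer_tokens_kept(
    prompt: str,
    answer: str,
    max_tokens: int = MAX_TOKENS,
    tokenize: Callable[[str], Sequence] = whitespace_tokenize,
) -> int:
    """Count the answer tokens that survive right-truncation of prompt + answer."""
    n_prompt = len(tokenize(prompt))
    n_answer = len(tokenize(answer))
    room = max(0, max_tokens - n_prompt)  # space left after the prompt
    return min(n_answer, room)


def find_truncated(
    examples: Sequence[Tuple[str, str]],
    max_tokens: int = MAX_TOKENS,
    tokenize: Callable[[str], Sequence] = whitespace_tokenize,
) -> List[int]:
    """Return indices of (prompt, answer) pairs whose answer is partly or fully cut off."""
    flagged = []
    for i, (prompt, answer) in enumerate(examples):
        kept = answer_tokens_kept(prompt, answer, max_tokens, tokenize)
        if kept < len(tokenize(answer)):
            flagged.append(i)
    return flagged
```

For a rough real count, `tokenize` could be replaced with something like `lambda s: tokenizer(s)["input_ids"]`, where `tokenizer` comes from `AutoTokenizer.from_pretrained("FlagAlpha/Llama2-Chinese-13b-Chat")`. That count includes special tokens, so it is approximate. If many long examples are flagged, the fine-tuning data itself would be a more likely culprit than the architecture.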