xlnet-large-bahasa-cased
Pretrained XLNet large language model for Malay.
Pretraining Corpus
The xlnet-large-bahasa-cased model was pretrained on ~1.4 billion words. Below is the list of data we trained on:
Pretraining details
- All steps can be reproduced from Malaya/pretrained-model/xlnet.
Load Pretrained Model
To use this model, install torch or tensorflow together with the Hugging Face transformers library. You can then initialize it directly like this:
from transformers import XLNetModel, XLNetTokenizer

model = XLNetModel.from_pretrained('malay-huggingface/xlnet-large-bahasa-cased')
tokenizer = XLNetTokenizer.from_pretrained(
    'malay-huggingface/xlnet-large-bahasa-cased',
    do_lower_case=False,
)
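Once the model and tokenizer are loaded, a common next step is to turn sentences into fixed-size vectors. The sketch below is one possible way to do that, not something prescribed by this card: it assumes the PyTorch backend, uses mean pooling over the last hidden state, and introduces two hypothetical helper names, `mean_pool` and `embed`. Masking matters here because the XLNet tokenizer pads on the left by default, so padded positions sit at the start of each row and should be excluded from the average.

```python
import numpy as np

# Repo id as written in this card; adjust it if the model lives elsewhere.
MODEL_ID = "malay-huggingface/xlnet-large-bahasa-cased"


def mean_pool(hidden_states, attention_mask):
    """Average token vectors per sentence, ignoring padded positions.

    hidden_states: array of shape (batch, seq_len, dim)
    attention_mask: array of shape (batch, seq_len), 1 = real token, 0 = padding
    """
    hidden = np.asarray(hidden_states, dtype=np.float32)
    mask = np.asarray(attention_mask, dtype=np.float32)[..., None]
    summed = (hidden * mask).sum(axis=1)
    # Clip the token count so a fully padded row does not divide by zero.
    counts = np.clip(mask.sum(axis=1), 1.0, None)
    return summed / counts


def embed(sentences, model_id=MODEL_ID):
    """Hypothetical helper: return one mean-pooled vector per input sentence."""
    # Imported lazily so mean_pool stays usable without torch installed.
    import torch
    from transformers import XLNetModel, XLNetTokenizer

    tokenizer = XLNetTokenizer.from_pretrained(model_id, do_lower_case=False)
    model = XLNetModel.from_pretrained(model_id)
    model.eval()

    batch = tokenizer(sentences, padding=True, return_tensors="pt")
    with torch.no_grad():
        output = model(**batch)

    return mean_pool(
        output.last_hidden_state.numpy(),
        batch["attention_mask"].numpy(),
    )
```

Calling `embed(["Saya suka makan nasi lemak."])` (an illustrative sentence, not taken from the training data) would download the weights on first use and return an array with one row per sentence. Mean pooling is only one choice; other pooling strategies may suit a given downstream task better.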