---
language:
- lt
base_model:
- VSSA-SDSA/LT_AI_DLKVM
pipeline_tag: text-generation
library_name: transformers
tags:
- summary
- lithuanian
- llama3
---

# LoRA Adapter Card for DLKVM-Summary-Llama3-1B

## Table of contents

- [Adapter Information](#adapter-information)
- [How to Get Started with the Adapter](#how-to-get-started-with-the-adapter)
- [Uses](#uses)
- [Training Details](#training-details)
- [Evaluation](#evaluation)
- [Citation](#citation)
- [License](#license)

## Adapter Information

**Adapter Name:** DLKVM-Summary-Llama3-LoRA-Adapter

**Base Model:** [VSSA-SDSA/LT_AI_DLKVM](https://huggingface.co/VSSA-SDSA/LT_AI_DLKVM)

**Architecture:** Llama3 CausalLM

**Task:** Abstractive summary generation

## How to Get Started with the Adapter

This adapter can be used to generate Lithuanian abstractive summaries (inference) with the Hugging Face `transformers` and `peft` libraries.

### Environment Setup

Install the required Python libraries from the requirements file.
Used: ***Python 3.12.12***

```
pip install -r requirements.txt
```

### Code Snippet

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel

MODEL_ID = "VSSA-SDSA/LT_AI_DLKVM"
LORA_ADAPTER = "VSSA-SDSA/LT_AI_DLKVM_demo"
MAX_NEW_TOKENS = 200

# Replace with the Lithuanian text to summarize.
tekstas = "Jūsų tekstas santraukos generavimui"

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, use_fast=True)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token
tokenizer.padding_side = "left"

# Load the base model in bfloat16 on GPU 0.
base_model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    torch_dtype=torch.bfloat16,
    device_map={"": 0},
    attn_implementation="sdpa"
)

# Attach the LoRA adapter for inference only.
model = PeftModel.from_pretrained(
    base_model,
    LORA_ADAPTER,
    is_trainable=False
)
model.eval()

# Prompt template; the Lithuanian markers mean "Start of text" and "Start of summary".
prompt = (
    f"<|im_start|>Teksto pradžia:\n{tekstas}<|im_end|>\n"
    f"<|im_start|>Santraukos pradžia:\n"
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
inputs.pop("token_type_ids", None)

# Stop at <|im_end|>; keep only end markers that map to a single token id.
end_tokens = ["<|im_end|>"]
eos_ids = tokenizer(end_tokens, add_special_tokens=False).input_ids
eos_ids = [ids[0] for ids in eos_ids if len(ids) == 1]

with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=MAX_NEW_TOKENS,
        do_sample=False,
        repetition_penalty=2.5,
        eos_token_id=eos_ids,
        pad_token_id=tokenizer.pad_token_id,
        num_beams=2,
        early_stopping=True
    )
# Decode only the newly generated tokens, skipping the prompt.
generated = tokenizer.decode(outputs[0][len(inputs["input_ids"][0]):], skip_special_tokens=True).strip()
print(generated)
```

### `Flash-Attention` Support

The model supports `flash_attention_2`; to use it, install the additional dependencies.

`Python 3.12`

```
pip install flash-attn==2.7.4.post1 --no-build-isolation
```

`Python 3.13`

```
pip install https://github.com/Dao-AILab/flash-attention/releases/download/v2.7.4.post1/flash_attn-2.7.4.post1+cu12torch2.6cxx11abiFALSE-cp313-cp313-linux_x86_64.whl
```

After installing the dependencies, update the base model loading code:

```python
base_model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    torch_dtype=torch.bfloat16,
    device_map={"": 0},
    attn_implementation="flash_attention_2"
)
```

## Uses

- Abstractive summary generation for Lithuanian texts
- Applications: law, healthcare, news media, and information technology topics

## Training Details

The model was validated on the summary corpus being developed in the project “Summary Corpora for Artificial Intelligence” (No. 02-101-K-0001).
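As a reading aid for the hyperparameters listed under Training Configuration, the sketch below works out two derived quantities: the effective batch size per optimizer step and the LoRA scaling factor. It assumes a single GPU (as stated under Hardware) and the scaling rule documented for PEFT, where `use_rslora` changes the factor from `lora_alpha / r` to `lora_alpha / sqrt(r)`. The helper `lora_scaling` and the variable `num_gpus` are illustrative names, not part of the training code.

```python
import math

# Values copied from the training configuration in this card.
r = 64
lora_alpha = 128
per_device_train_batch_size = 4
gradient_accumulation_steps = 16
num_gpus = 1  # assumption: the card lists a single GPU


def lora_scaling(alpha: float, rank: int, use_rslora: bool) -> float:
    """Adapter scaling: alpha / sqrt(rank) with rsLoRA, alpha / rank otherwise."""
    return alpha / math.sqrt(rank) if use_rslora else alpha / rank


# Samples contributing to each optimizer step.
effective_batch_size = per_device_train_batch_size * gradient_accumulation_steps * num_gpus

print(effective_batch_size)                           # 64
print(lora_scaling(lora_alpha, r, use_rslora=True))   # 16.0
print(lora_scaling(lora_alpha, r, use_rslora=False))  # 2.0
```

With rank-stabilized scaling the adapter update is weighted 16.0 rather than 2.0, which is worth keeping in mind when comparing the learning rate here with runs that use standard LoRA scaling.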
### Training Configuration

```yml
lora_settings:
  r: 64
  lora_alpha: 128
  target_modules: ["q_proj", "k_proj", "v_proj", "o_proj", "gate_proj", "up_proj", "down_proj"]
  lora_dropout: 0.05
  task_type: "CAUSAL_LM"
  use_rslora: True
training:
  per_device_train_batch_size: 4
  gradient_accumulation_steps: 16
  bf16: True
  learning_rate: 6e-5
  warmup_ratio: 0.063
  weight_decay: 0.053
  num_train_epochs: 4
  lr_scheduler_type: "cosine"
  optim: "adafactor"
  adam_epsilon: 1e-6
  max_grad_norm: 1.0
```

**Environment:** Hugging Face Transformers (v4.54.1)

**Hardware:** 1× NVIDIA RTX A6000 ADA

## Evaluation

| ROUGE-1 | ROUGE-2 | ROUGE-L | BERTScore Precision | BERTScore Recall | BERTScore F1 | BLEU |
| :------ | :------ | :------ | :------------------ | :--------------- | :----------- | :------ |
| 0.3230 | 0.1377 | 0.2135 | 0.8786 | 0.8683 | 0.8732 | 10.2290 |

## Citation

If you use LT-AI-DLKVM-demo or any part of this repository in your research or deployment, please cite as follows (BibTeX):

```bibtex
@misc{SDSA_LT-AI-DLKVM-demo_2026,
  title        = {{LT-AI-DLKVM-demo}: Lithuanian Llama 3 model for abstracts generation},
  author       = {{State Digital Solutions Agency (SDSA)}},
  year         = {2026},
  howpublished = {\url{https://huggingface.co/VSSA-SDSA/LT_AI_DLKVM_demo}},
  note         = {Developed by Vytautas Magnus University (VMU), UAB Neurotechnology, UAB Tilde informacinės technologijos, MB Krilas}
}
```

## License

Copyright (c) 2026 State Digital Solutions Agency (SDSA)

Developed by Vytautas Magnus University (VMU), UAB Neurotechnology, UAB Tilde informacinės technologijos, MB Krilas

Licensed under NewGenLTU openRAIL-M

Notice: Funded by Economic Recovery and Resilience Facility "New Generation Lithuania" Plan