---
base_model: answerdotai/ModernBERT-base
library_name: peft
tags:
- base_model:adapter:answerdotai/ModernBERT-base
- lora
- transformers
- CyberSecurity
- PEFT
license: apache-2.0
datasets:
- AINovice2005/cicflow-ids-multiclass
language:
- en
pipeline_tag: fill-mask
---

This model fine-tunes ModernBERT-base using LoRA (Low-Rank Adaptation) for parameter-efficient fine-tuning. It is designed for binary classification tasks where high recall and a controlled false positive rate are important.

## Training Configuration

- Seed: 42 (for reproducibility)
- Batch sizes: Train = 128, Eval = 256
- Max sequence length: 256
- Epochs: 1 (baseline run)
- Learning rate: 3e-4
- Weight decay: 0.01
- Warmup ratio: 0.05
- Gradient clipping: 1.0
- Early stopping patience: 3
- Steps: 5,241

## LoRA Setup

- Enabled: Yes
- Rank (r): 4
- Alpha: 8
- Dropout: 0.05
- Target modules: Attention (Wqkv, Wo) and MLP (Wi, Wo) layers
- Max drift ratio: 0.1

LoRA adapters allow efficient fine-tuning by updating only small low-rank matrices while the base weights stay frozen, which reduces memory and compute requirements.

## Loss Function

Training uses Asymmetric Focal Loss, which emphasizes hard negatives while keeping positive weighting mild. This helps balance recall against the false positive rate.

- Gamma_pos: 0.0 (minimal emphasis on positives)
- Gamma_neg: 4.0 (stronger emphasis on negatives)
- Clip: 0.05 (stability for probabilities)

Validation is performed every 5000 steps, with early stopping to prevent overfitting.

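
The card does not spell out the loss formula. The sketch below assumes the commonly used asymmetric loss formulation, in which `clip` acts as a probability margin subtracted from negatives, and shows how the listed values behave for a single example. The function name and the plain-Python scalar form are illustrative assumptions; the actual training code may differ.

```python
import math


def asymmetric_focal_loss(p, y, gamma_pos=0.0, gamma_neg=4.0, clip=0.05, eps=1e-8):
    """Loss for one example (assumed formulation, not taken from the training code).

    p: predicted probability of the positive class, in [0, 1]
    y: ground-truth label, 0 or 1
    """
    if y == 1:
        # Positive term: (1 - p)^gamma_pos * -log(p).
        # With gamma_pos = 0 this reduces to plain cross-entropy.
        return -((1.0 - p) ** gamma_pos) * math.log(max(p, eps))
    # Negative term: shift p down by the margin so that negatives
    # with p <= clip contribute no loss at all.
    p_m = max(p - clip, 0.0)
    # p_m^gamma_neg down-weights easy negatives and keeps hard ones.
    return -(p_m ** gamma_neg) * math.log(max(1.0 - p_m, eps))


# Compare an easy negative, a hard negative, and a positive example.
easy_negative = asymmetric_focal_loss(0.03, 0)
hard_negative = asymmetric_focal_loss(0.90, 0)
positive = asymmetric_focal_loss(0.50, 1)
print(easy_negative, hard_negative, positive)
```

Under this assumption, a negative scored below the clip margin is ignored entirely, while a confidently wrong negative dominates the loss, which matches the stated goal of controlling false positives without penalizing positives more heavily.
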
## Usage

```python
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM, pipeline
from peft import PeftModel

# Base ModernBERT model
base_model_name = "answerdotai/ModernBERT-base"
# LoRA adapter checkpoint
adapter_model_name = "AINovice2005/ModernBERT-base-lora-cicflow-1m-r4"

# Load tokenizer
tokenizer = AutoTokenizer.from_pretrained(base_model_name)

# Load base masked language model
base_model = AutoModelForMaskedLM.from_pretrained(base_model_name)

# Attach LoRA adapter
model = PeftModel.from_pretrained(base_model, adapter_model_name)

# Move to device
device = "cuda" if torch.cuda.is_available() else "cpu"
model = model.to(device)

# Build fill-mask pipeline
fill_mask = pipeline(
    "fill-mask",
    model=model,
    tokenizer=tokenizer,
    device=0 if device == "cuda" else -1
)

# Example usage
text = "The network traffic shows a [MASK] pattern."
outputs = fill_mask(text)
for o in outputs:
    print(f"Token: {o['token_str']}, Score: {o['score']:.4f}")
```

## Intended Use

- Binary classification tasks where recall is critical.
- Efficient fine-tuning scenarios with limited compute resources.
- Research and experimentation with parameter-efficient methods.

## Artifacts

- LoRA adapter
- Training configuration and evaluation logs
- PEFT 0.18.1
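
To make the efficiency claim under LoRA Setup concrete, the sketch below works through the arithmetic for rank 4 and alpha 8 using the standard LoRA scaling of alpha / r. The 768 x 768 layer shape is an assumption chosen for illustration, not a value taken from this card, and the helper name is hypothetical.

```python
def lora_added_params(d_in: int, d_out: int, r: int) -> int:
    """Trainable parameters LoRA adds to one linear layer.

    The update is B @ A, where A has shape (r, d_in) and B has shape (d_out, r).
    """
    return r * (d_in + d_out)


r, alpha = 4, 8          # values from the LoRA Setup section
scaling = alpha / r      # standard LoRA multiplies the update B @ A by alpha / r

d_in, d_out = 768, 768   # assumed example layer shape, for illustration only
added = lora_added_params(d_in, d_out, r)
full = d_in * d_out      # parameters in the frozen dense weight

print(f"scaling={scaling}, added={added}, fraction of dense weight={added / full:.4f}")
```

For a layer of this assumed size, the adapter trains roughly one percent as many parameters as the dense weight it modifies.
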