---
license: apache-2.0
base_model: HuggingFaceTB/SmolLM-360M
tags:
- text-generation
- entity-extraction
- calendar-events
- smollm
- fine-tuned
- nlp
language:
- en
metrics:
- f1
- accuracy
library_name: transformers
pipeline_tag: text-generation
---

# SmolLM-360M Fine-tuned for Calendar Event Entity Extraction

SmolLM-360M fine-tuned to extract structured entities from natural-language descriptions of calendar events.

## Model Details

- **Base Model:** HuggingFaceTB/SmolLM-360M
- **Model Size:** 360M parameters
- **Task:** Calendar event entity extraction
- **Language:** English
- **License:** Apache 2.0

## Supported Entity Fields

This model extracts the following entities from calendar event descriptions:

- **action**: The main activity or event type
- **date**: Event date
- **time**: Event time
- **attendees**: List of people attending
- **location**: Event location
- **duration**: Event duration
- **recurrence**: Recurrence pattern (if any)
- **notes**: Additional notes

## Performance Metrics

### Fine-tuned Model Performance

- **Exact Match Accuracy:** 0.6625
- **Macro F1 Score:** 0.7466
- **Macro Precision:** 0.7476
- **Macro Recall:** 0.7457

### Improvement over Baseline

- **F1 Score Improvement:** +0.7488
- **Accuracy Improvement:** +0.6750

## Usage

### Quick Start

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import json

# Load model and tokenizer
model_name = "smollm-360m-event-extraction"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Example event text
event_text = "Team meeting tomorrow at 2pm with John and Sarah for 1 hour"

# Prompt template; doubled braces are literal braces once .format() is applied
PROMPT_TEMPLATE = '''Extract the following entities from the calendar event description:

Event: {event_text}

Please provide the extracted information in this exact JSON format:
{{
    "action": "extracted action or null",
    "date": "extracted date or null",
    "time": "extracted time or null",
    "attendees": ["list of attendees"] or null,
    "location": "extracted location or null",
    "duration": "extracted duration or null",
    "recurrence": "extracted recurrence or null",
    "notes": "extracted notes or null"
}}

Extracted entities:'''

# Create prompt
prompt = PROMPT_TEMPLATE.format(event_text=event_text)

# Tokenize and generate (temperature only takes effect when do_sample=True)
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=200, do_sample=True, temperature=0.1)

# Decode only the newly generated tokens, then parse the JSON
prompt_length = inputs["input_ids"].shape[1]
generated_json = tokenizer.decode(outputs[0][prompt_length:], skip_special_tokens=True).strip()
entities = json.loads(generated_json)
print(entities)
```

### Batch Processing

```python
import json

def extract_entities_batch(event_texts, model, tokenizer):
    """Run extraction on each event text in turn and return the parsed results."""
    results = []
    for event_text in event_texts:
        # Use the same prompt format as above (PROMPT_TEMPLATE from the Quick Start)
        prompt = PROMPT_TEMPLATE.format(event_text=event_text)
        inputs = tokenizer(prompt, return_tensors="pt")
        outputs = model.generate(**inputs, max_new_tokens=200, do_sample=True, temperature=0.1)
        # Keep only the tokens generated after the prompt
        new_tokens = outputs[0][inputs["input_ids"].shape[1]:]
        entities = json.loads(tokenizer.decode(new_tokens, skip_special_tokens=True).strip())
        results.append(entities)
    return results
```

## Training Details

### Training Data

- **Dataset Size:** 793 examples (based on provided JSONL file)
- **Data Split:** 80.0% train, 10.0% validation, 10.0% test
- **Data Format:** Instruction-following format with JSON output

### Training Configuration

- **Learning Rate:** 2e-05
- **Training Epochs:** 3
- **Batch Size:** 2
- **Max Sequence Length:** 256
- **Optimizer:** AdamW with weight decay
- **Learning Rate Schedule:** Linear warmup

### Training Infrastructure

- **Framework:** Hugging Face Transformers + Accelerate
- **Hardware:** GPU-optimized (also supports CPU)
- **Mixed Precision:** FP16 training enabled

## Example Outputs

### Input

```
"Team meeting tomorrow at 2pm with John and Sarah for 1 hour"
```

### Output

```json
{
    "action": "Team meeting",
    "date": "tomorrow",
    "time": "2pm",
    "attendees": ["John", "Sarah"],
    "location": null,
    "duration": "1 hour",
    "recurrence": null,
    "notes": null
}
```

## Limitations and Biases

- The model is trained on English calendar events only
- Performance may vary on events with complex or unusual phrasing
- Date/time formats should follow common conventions for best results
- The model may struggle with implicit information not stated in the text

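
Unusual phrasing may also produce output that is not a clean JSON object, for example with extra text before or after the braces. That failure mode is an assumption here, not something measured in this card. A minimal sketch of tolerant parsing with the standard library follows; `parse_first_json_object` is a hypothetical helper name and is not shipped with the model:

```python
import json


def parse_first_json_object(text):
    """Return the first JSON object found in `text`, or None if there is none."""
    decoder = json.JSONDecoder()
    start = text.find("{")
    while start != -1:
        try:
            # raw_decode parses one JSON value starting at `start`
            # and ignores whatever follows it.
            obj, _ = decoder.raw_decode(text, start)
        except json.JSONDecodeError:
            obj = None
        if isinstance(obj, dict):
            return obj
        # Not a valid object at this brace; try the next one.
        start = text.find("{", start + 1)
    return None
```

Calling `parse_first_json_object(generated_text)` in place of `json.loads(generated_text)` would return `None` instead of raising when no object can be recovered, which leaves the caller to decide how to handle the failed extraction.
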
## Ethical Considerations

- This model is designed for calendar event processing and should not be used for sensitive personal data without proper privacy safeguards
- Users should validate outputs, especially for critical scheduling applications
- The model may reflect biases present in the training data
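
One way to act on the advice to validate outputs is to check each parsed result against the eight supported fields before it reaches a calendar. The sketch below assumes the schema shown in the example output (each field is a string or null, and `attendees` is a list of strings or null); `EXPECTED_FIELDS` and `validate_entities` are illustrative names, not part of the model's release:

```python
EXPECTED_FIELDS = (
    "action", "date", "time", "attendees",
    "location", "duration", "recurrence", "notes",
)


def validate_entities(entities):
    """Return a list of problems found in a parsed result (empty if none)."""
    if not isinstance(entities, dict):
        return ["output is not a JSON object"]

    problems = []
    # Every supported field should be present, and nothing else.
    for field in EXPECTED_FIELDS:
        if field not in entities:
            problems.append(f"missing field: {field}")
    for field in entities:
        if field not in EXPECTED_FIELDS:
            problems.append(f"unexpected field: {field}")

    # Assumed types: attendees is a list of strings or null ...
    attendees = entities.get("attendees")
    if attendees is not None and not (
        isinstance(attendees, list) and all(isinstance(a, str) for a in attendees)
    ):
        problems.append("attendees must be a list of strings or null")

    # ... and every other field is a string or null.
    for field in EXPECTED_FIELDS:
        if field == "attendees":
            continue
        value = entities.get(field)
        if value is not None and not isinstance(value, str):
            problems.append(f"{field} must be a string or null")
    return problems
```

An empty list means the structure matches the assumed schema. It does not confirm that the extracted values are correct, so critical scheduling decisions still need a human or rule-based check of the content itself.
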