# GGUF Timeout Fix - Progress Update
## Completed Steps:
1. **Increased GGUF timeout**: Raised the default from 120s to 300s on Hugging Face Spaces
2. **Configurable timeout**: Added support for the `GGUF_GENERATION_TIMEOUT` environment variable
3. **Better error handling**: Improved timeout handling and fallback mechanisms in `routes.py`
4. **Fallback pipeline**: Added a fallback path for when the GGUF model fails to load or times out
## Changes Made:
### model_loader_gguf.py:
- Updated `_generate_with_timeout()` to default to 300s on Spaces and 120s locally
- Made the timeout configurable via the `GGUF_GENERATION_TIMEOUT` environment variable
- Updated `generate()` to use the configurable timeout
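
The timeout resolution described above could look roughly like the sketch below. It is not the code in `model_loader_gguf.py`: the names `resolve_generation_timeout` and `generate_with_timeout` are hypothetical, detecting Spaces through the `SPACE_ID` environment variable is an assumption, and running the generation in a worker thread is only one possible way to enforce the limit.

```python
import os
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutureTimeout

SPACES_DEFAULT_TIMEOUT = 300.0  # seconds, default on Hugging Face Spaces
LOCAL_DEFAULT_TIMEOUT = 120.0   # seconds, default for local runs


def resolve_generation_timeout(env=None):
    """Return the generation timeout in seconds (hypothetical helper)."""
    env = os.environ if env is None else env
    raw = env.get("GGUF_GENERATION_TIMEOUT")
    if raw:
        try:
            value = float(raw)
            if value > 0:
                return value
        except ValueError:
            pass  # ignore malformed values and fall back to the defaults
    # Assumption: a non-empty SPACE_ID means the app is running on Spaces.
    on_spaces = bool(env.get("SPACE_ID"))
    return SPACES_DEFAULT_TIMEOUT if on_spaces else LOCAL_DEFAULT_TIMEOUT


def generate_with_timeout(generate_fn, prompt, timeout=None):
    """Run generate_fn(prompt) and raise TimeoutError if it takes too long."""
    timeout = resolve_generation_timeout() if timeout is None else timeout
    executor = ThreadPoolExecutor(max_workers=1)
    future = executor.submit(generate_fn, prompt)
    try:
        return future.result(timeout=timeout)
    except FutureTimeout:
        raise TimeoutError(f"GGUF generation exceeded {timeout:.0f}s") from None
    finally:
        # Do not block on the worker; a timed-out call keeps running
        # in the background until the model returns.
        executor.shutdown(wait=False)
```

Note that a thread-based timeout only stops waiting for the result; it does not interrupt the model call itself, so a real implementation may need a cancellation hook or a separate process.
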
### routes.py:
- Added use of the fallback pipeline when GGUF generation times out
- Added more detailed logging for timeout errors
- Added a fallback for GGUF model loading failures
- Improved error messages and response handling
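
As an illustration of the fallback behavior listed above, here is a minimal sketch. The function names `summarize` and `template_summary`, the response shape, and the template wording are all hypothetical; the actual handler in `routes.py` may be structured differently.

```python
import logging

logger = logging.getLogger("routes")


def template_summary(text, max_chars=200):
    """Hypothetical template-based fallback: a trimmed snippet of the input."""
    snippet = " ".join(text.split())[:max_chars]
    return f"Summary (fallback): {snippet}"


def summarize(text, gguf_generate, fallback=template_summary):
    """Try the GGUF model first; use the fallback on timeout or any failure."""
    try:
        return {"summary": gguf_generate(text), "source": "gguf"}
    except TimeoutError as exc:
        logger.warning("GGUF generation timed out, using fallback: %s", exc)
    except Exception as exc:  # e.g. the model failed to load
        logger.error("GGUF generation failed, using fallback: %s", exc)
    return {"summary": fallback(text), "source": "fallback"}
```

Returning a `source` field is a design choice made for this sketch so that callers and logs can tell which path produced the response.
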
## Next Steps:
- Test the changes with the GGUF model
- Verify that the timeout is sufficient for the Phi-3 model
- Test the fallback mechanisms
- Add progress logging for generation
## Configuration:
- Default timeout: 300s (Spaces) / 120s (local)
- Environment variable: `GGUF_GENERATION_TIMEOUT`
- Fallback: Template-based summary when GGUF fails