===== Application Startup at 2025-10-09 06:25:23 =====

[ENTRYPOINT] Clearing Hugging Face / Torch / tmp cache...
chmod: changing permissions of '/tmp': Operation not permitted
2025-10-09 06:27:43,466 - INFO - Added src_dir to Python path: /app/services/ai-service/src
2025-10-09 06:27:43,466 - INFO - src_dir exists: True
2025-10-09 06:27:43,466 - INFO - Contents of src_dir: ['__main__.py', 'agents', 'ai_med_extract', 'app.py', 'config_settings.py', 'gradio_app.py', 'routes', 'wsgi.py']
2025-10-09 06:27:43,466 - INFO - ai_med_extract_path exists: True
2025-10-09 06:27:43,466 - INFO - Contents of ai_med_extract: ['__init__.py', 'agents', 'api', 'api_endpoints.py', 'api_middleware.py', 'app.py', 'core_exceptions.py', 'core_logger.py', 'core_security.py', 'database_audit.py', 'gradio_app.py', 'health_endpoints.py', 'inference_service.py', 'main.py', 'metrics_adapter.py', 'monitoring_observability.py', 'phi_scrubber_service.py', 'scalable_service_mesh.py', 'utils']
2025-10-09 06:27:43,466 - INFO - Detected Hugging Face Spaces environment
2025-10-09 06:27:43,466 - INFO - Attempting to import from ai_med_extract package...
2025-10-09 06:27:43,466 - INFO - Python path: ['/app/services/ai-service/src', '', '/opt/venv/bin']
2025-10-09 06:27:43,466 - INFO - Current working directory: /app
2025-10-09 06:27:43,467 - INFO - Files in current directory: ['README.md', '__init__.py', 'app.py', 'requirements.txt', 'services']

libgomp: Invalid value for environment variable OMP_NUM_THREADS

libgomp: Invalid value for environment variable OMP_NUM_THREADS
2025-10-09 06:27:48,840 - INFO - Model manager imported successfully
2025-10-09 06:27:49,115 - INFO - Model manager imported successfully in initialize_agents
2025-10-09 06:27:49,139 - INFO - PatientSummarizerAgent created for Falconsai/medical_summarization (summarization) on cuda (loader deferred)
2025-10-09 06:27:49,239 - WARNING - ONNX Runtime not available: /opt/venv/lib/python3.10/site-packages/onnxruntime/capi/onnxruntime_pybind11_state.cpython-310-x86_64-linux-gnu.so: cannot enable executable stack as shared object requires: Invalid argument
[GGUF] Preloading GGUF models as requested by settings...
[GGUF] Preload failed (non-fatal): Model path does not exist: microsoft/Phi-3-mini-4k-instruct-gguf
2025-10-09 06:27:49,285 - INFO - Main router registered with app
2025-10-09 06:27:49,290 - INFO - Redis URL not configured, PHI scrubbing will use fallback mode
2025-10-09 06:27:49,298 - INFO - API router registered with app
2025-10-09 06:27:49,305 - INFO - Model management router registered with app
2025-10-09 06:27:49,305 - INFO - All routes registered successfully
2025-10-09 06:27:49,314 - INFO - Agents initialized and routes registered
2025-10-09 06:27:49,314 - INFO - ============================================================
2025-10-09 06:27:49,314 - INFO - REGISTERED ROUTES:
2025-10-09 06:27:49,314 - INFO -   ['HEAD', 'GET'] /openapi.json
2025-10-09 06:27:49,314 - INFO -   ['HEAD', 'GET'] /docs
2025-10-09 06:27:49,314 - INFO -   ['HEAD', 'GET'] /docs/oauth2-redirect
2025-10-09 06:27:49,314 - INFO -   ['HEAD', 'GET'] /redoc
2025-10-09 06:27:49,314 - INFO -   ['GET'] /live
2025-10-09 06:27:49,314 - INFO -   ['GET'] /ready
2025-10-09 06:27:49,314 - INFO -   ['GET'] /metrics
2025-10-09 06:27:49,314 - INFO -   ['GET'] /
2025-10-09 06:27:49,314 - INFO -   ['GET'] /api/info
2025-10-09 06:27:49,314 - INFO -   ['GET'] /api/performance_metrics
2025-10-09 06:27:49,314 - INFO -   ['POST'] /generate_patient_summary
2025-10-09 06:27:49,315 - INFO -   ['POST'] /summarize
2025-10-09 06:27:49,315 - INFO -   ['POST'] /upload
2025-10-09 06:27:49,315 - INFO -   ['POST'] /phi/scrub
2025-10-09 06:27:49,315 - INFO -   ['POST'] /api/models/load
2025-10-09 06:27:49,315 - INFO -   ['POST'] /api/models/generate
2025-10-09 06:27:49,315 - INFO -   ['GET'] /api/models/info
2025-10-09 06:27:49,315 - INFO -   ['GET'] /api/models/defaults
2025-10-09 06:27:49,315 - INFO -   ['POST'] /api/models/clear_cache
2025-10-09 06:27:49,315 - INFO -   ['POST'] /api/models/switch
2025-10-09 06:27:49,315 - INFO -   ['GET'] /api/models/health
2025-10-09 06:27:49,315 - INFO -   ['GET'] /debug/routes
2025-10-09 06:27:49,315 - INFO -   ['GET'] /health/live
2025-10-09 06:27:49,315 - INFO -   ['GET'] /health/ready
2025-10-09 06:27:49,315 - INFO - ============================================================
2025-10-09 06:27:49,315 - INFO - Agents initialized successfully
2025-10-09 06:27:49,315 - INFO - Model manager imported successfully in initialize_agents
2025-10-09 06:27:49,315 - INFO - PatientSummarizerAgent created for Falconsai/medical_summarization (summarization) on cuda (loader deferred)
[GGUF] Preloading GGUF models as requested by settings...
[GGUF] Preload failed (non-fatal): Model path does not exist: microsoft/Phi-3-mini-4k-instruct-gguf
2025-10-09 06:27:49,324 - INFO - Main router registered with app
2025-10-09 06:27:49,327 - INFO - API router registered with app
2025-10-09 06:27:49,329 - INFO - Model management router registered with app
2025-10-09 06:27:49,329 - INFO - All routes registered successfully
2025-10-09 06:27:49,338 - INFO - Agents initialized and routes registered
2025-10-09 06:27:49,338 - INFO - ============================================================
2025-10-09 06:27:49,338 - INFO - REGISTERED ROUTES:
2025-10-09 06:27:49,338 - INFO -   ['HEAD', 'GET'] /openapi.json
2025-10-09 06:27:49,338 - INFO -   ['HEAD', 'GET'] /docs
2025-10-09 06:27:49,338 - INFO -   ['HEAD', 'GET'] /docs/oauth2-redirect
2025-10-09 06:27:49,338 - INFO -   ['HEAD', 'GET'] /redoc
2025-10-09 06:27:49,338 - INFO -   ['GET'] /live
2025-10-09 06:27:49,338 - INFO -   ['GET'] /ready
2025-10-09 06:27:49,338 - INFO -   ['GET'] /metrics
[GGUF] Preloading GGUF models as requested by settings...
[GGUF] Preload failed (non-fatal): Model path does not exist: microsoft/Phi-3-mini-4k-instruct-gguf
2025-10-09 06:27:49,338 - INFO -   ['GET'] /
2025-10-09 06:27:49,338 - INFO -   ['GET'] /api/info
2025-10-09 06:27:49,338 - INFO -   ['GET'] /api/performance_metrics
2025-10-09 06:27:49,338 - INFO -   ['POST'] /generate_patient_summary
2025-10-09 06:27:49,338 - INFO -   ['POST'] /upload
2025-10-09 06:27:49,338 - INFO -   ['GET'] /get_updated_medical_data
2025-10-09 06:27:49,338 - INFO -   ['PUT'] /update_medical_data
2025-10-09 06:27:49,338 - INFO -   ['POST'] /transcribe
2025-10-09 06:27:49,338 - INFO -   ['POST'] /extract_medical_data
2025-10-09 06:27:49,338 - INFO -   ['POST'] /api/generate_summary
2025-10-09 06:27:49,338 - INFO -   ['POST'] /api/extract_medical_data_from_audio
2025-10-09 06:27:49,338 - INFO -   ['POST'] /api/patient_summary_openvino
2025-10-09 06:27:49,338 - INFO -   ['POST'] /summarize
2025-10-09 06:27:49,338 - INFO -   ['POST'] /upload
2025-10-09 06:27:49,338 - INFO -   ['POST'] /phi/scrub
2025-10-09 06:27:49,338 - INFO -   ['POST'] /api/models/load
2025-10-09 06:27:49,338 - INFO -   ['POST'] /api/models/generate
2025-10-09 06:27:49,338 - INFO -   ['GET'] /api/models/info
2025-10-09 06:27:49,338 - INFO -   ['GET'] /api/models/defaults
2025-10-09 06:27:49,339 - INFO -   ['POST'] /api/models/clear_cache
2025-10-09 06:27:49,339 - INFO -   ['POST'] /api/models/switch
2025-10-09 06:27:49,339 - INFO -   ['GET'] /api/models/health
2025-10-09 06:27:49,339 - INFO -   ['GET'] /debug/routes
2025-10-09 06:27:49,339 - INFO -   ['GET'] /health/live
2025-10-09 06:27:49,339 - INFO -   ['GET'] /health/ready
2025-10-09 06:27:49,339 - INFO - ============================================================
2025-10-09 06:27:49,339 - INFO - Agents initialized successfully
2025-10-09 06:27:49,339 - INFO - Successfully imported create_app and initialize_agents
2025-10-09 06:27:49,339 - INFO - App instance created successfully (without agents)
2025-10-09 06:27:49,339 - INFO - App title: Medical AI Service
2025-10-09 06:27:49,339 - INFO - App version: 1.0.0
2025-10-09 06:27:49,339 - INFO - Initializing agents with preload_small_models=False...
2025-10-09 06:27:49,339 - INFO - Model manager imported successfully in initialize_agents
2025-10-09 06:27:49,339 - INFO - PatientSummarizerAgent created for Falconsai/medical_summarization (summarization) on cuda (loader deferred)
2025-10-09 06:27:49,354 - INFO - Main router registered with app
2025-10-09 06:27:49,358 - INFO - API router registered with app
2025-10-09 06:27:49,360 - INFO - Model management router registered with app
2025-10-09 06:27:49,360 - INFO - All routes registered successfully
2025-10-09 06:27:49,369 - INFO - Agents initialized and routes registered
2025-10-09 06:27:49,369 - INFO - ============================================================
2025-10-09 06:27:49,369 - INFO - REGISTERED ROUTES:
2025-10-09 06:27:49,369 - INFO -   ['HEAD', 'GET'] /openapi.json
2025-10-09 06:27:49,369 - INFO -   ['HEAD', 'GET'] /docs
2025-10-09 06:27:49,369 - INFO -   ['HEAD', 'GET'] /docs/oauth2-redirect
2025-10-09 06:27:49,369 - INFO -   ['HEAD', 'GET'] /redoc
2025-10-09 06:27:49,369 - INFO -   ['GET'] /live
2025-10-09 06:27:49,369 - INFO -   ['GET'] /ready
2025-10-09 06:27:49,369 - INFO -   ['GET'] /metrics
2025-10-09 06:27:49,369 - INFO -   ['GET'] /
2025-10-09 06:27:49,369 - INFO -   ['GET'] /api/info
2025-10-09 06:27:49,369 - INFO -   ['GET'] /api/performance_metrics
2025-10-09 06:27:49,369 - INFO -   ['POST'] /generate_patient_summary
2025-10-09 06:27:49,369 - INFO -   ['POST'] /upload
2025-10-09 06:27:49,369 - INFO -   ['GET'] /get_updated_medical_data
2025-10-09 06:27:49,369 - INFO -   ['PUT'] /update_medical_data
2025-10-09 06:27:49,369 - INFO -   ['POST'] /transcribe
2025-10-09 06:27:49,369 - INFO -   ['POST'] /extract_medical_data
2025-10-09 06:27:49,369 - INFO -   ['POST'] /api/generate_summary
2025-10-09 06:27:49,369 - INFO -   ['POST'] /api/extract_medical_data_from_audio
2025-10-09 06:27:49,370 - INFO -   ['POST'] /api/patient_summary_openvino
2025-10-09 06:27:49,370 - INFO -   ['POST'] /upload
2025-10-09 06:27:49,370 - INFO -   ['GET'] /get_updated_medical_data
2025-10-09 06:27:49,370 - INFO -   ['PUT'] /update_medical_data
2025-10-09 06:27:49,370 - INFO -   ['POST'] /transcribe
2025-10-09 06:27:49,370 - INFO -   ['POST'] /extract_medical_data
2025-10-09 06:27:49,370 - INFO -   ['POST'] /api/generate_summary
2025-10-09 06:27:49,370 - INFO -   ['POST'] /api/extract_medical_data_from_audio
2025-10-09 06:27:49,370 - INFO -   ['POST'] /api/patient_summary_openvino
2025-10-09 06:27:49,370 - INFO -   ['POST'] /summarize
2025-10-09 06:27:49,370 - INFO -   ['POST'] /upload
2025-10-09 06:27:49,370 - INFO -   ['POST'] /phi/scrub
2025-10-09 06:27:49,370 - INFO -   ['POST'] /api/models/load
2025-10-09 06:27:49,370 - INFO -   ['POST'] /api/models/generate
2025-10-09 06:27:49,370 - INFO -   ['GET'] /api/models/info
2025-10-09 06:27:49,370 - INFO -   ['GET'] /api/models/defaults
2025-10-09 06:27:49,370 - INFO -   ['POST'] /api/models/clear_cache
2025-10-09 06:27:49,370 - INFO -   ['POST'] /api/models/switch
2025-10-09 06:27:49,370 - INFO -   ['GET'] /api/models/health
2025-10-09 06:27:49,370 - INFO -   ['GET'] /debug/routes
2025-10-09 06:27:49,370 - INFO -   ['GET'] /health/live
2025-10-09 06:27:49,370 - INFO -   ['GET'] /health/ready
2025-10-09 06:27:49,370 - INFO - ============================================================
2025-10-09 06:27:49,371 - INFO - Agents initialized successfully
2025-10-09 06:27:49,371 - INFO - ============================================================
2025-10-09 06:27:49,371 - INFO - FINAL REGISTERED ROUTES ON HF SPACES:
2025-10-09 06:27:49,371 - INFO -   ['HEAD', 'GET'] /openapi.json
2025-10-09 06:27:49,371 - INFO -   ['HEAD', 'GET'] /docs
2025-10-09 06:27:49,371 - INFO -   ['HEAD', 'GET'] /docs/oauth2-redirect
2025-10-09 06:27:49,371 - INFO -   ['HEAD', 'GET'] /redoc
2025-10-09 06:27:49,371 - INFO -   ['GET'] /live
2025-10-09 06:27:49,371 - INFO -   ['GET'] /ready
2025-10-09 06:27:49,371 - INFO -   ['GET'] /metrics
2025-10-09 06:27:49,371 - INFO -   ['GET'] /
2025-10-09 06:27:49,371 - INFO -   ['GET'] /api/info
2025-10-09 06:27:49,371 - INFO -   ['GET'] /api/performance_metrics
2025-10-09 06:27:49,371 - INFO -   ['POST'] /generate_patient_summary
2025-10-09 06:27:49,371 - INFO -   ['POST'] /upload
2025-10-09 06:27:49,371 - INFO -   ['GET'] /get_updated_medical_data
2025-10-09 06:27:49,371 - INFO -   ['PUT'] /update_medical_data
2025-10-09 06:27:49,371 - INFO -   ['POST'] /transcribe
2025-10-09 06:27:49,371 - INFO -   ['POST'] /extract_medical_data
2025-10-09 06:27:49,371 - INFO -   ['POST'] /api/generate_summary
2025-10-09 06:27:49,372 - INFO -   ['POST'] /api/extract_medical_data_from_audio
2025-10-09 06:27:49,372 - INFO -   ['POST'] /api/patient_summary_openvino
2025-10-09 06:27:49,372 - INFO -   ['POST'] /upload
2025-10-09 06:27:49,372 - INFO -   ['GET'] /get_updated_medical_data
2025-10-09 06:27:49,372 - INFO -   ['PUT'] /update_medical_data
2025-10-09 06:27:49,372 - INFO -   ['POST'] /transcribe
2025-10-09 06:27:49,372 - INFO -   ['POST'] /extract_medical_data
2025-10-09 06:27:49,372 - INFO -   ['POST'] /api/generate_summary
2025-10-09 06:27:49,372 - INFO -   ['POST'] /api/extract_medical_data_from_audio
2025-10-09 06:27:49,372 - INFO -   ['POST'] /api/patient_summary_openvino
2025-10-09 06:27:49,372 - INFO -   ['POST'] /summarize
2025-10-09 06:27:49,372 - INFO -   ['POST'] /upload
2025-10-09 06:27:49,372 - INFO -   ['POST'] /phi/scrub
2025-10-09 06:27:49,372 - INFO -   ['POST'] /api/models/load
2025-10-09 06:27:49,372 - INFO -   ['POST'] /api/models/generate
2025-10-09 06:27:49,372 - INFO -   ['GET'] /api/models/info
2025-10-09 06:27:49,372 - INFO -   ['GET'] /api/models/defaults
2025-10-09 06:27:49,372 - INFO -   ['POST'] /api/models/clear_cache
2025-10-09 06:27:49,372 - INFO -   ['POST'] /api/models/switch
2025-10-09 06:27:49,372 - INFO -   ['GET'] /api/models/health
2025-10-09 06:27:49,372 - INFO -   ['GET'] /debug/routes
2025-10-09 06:27:49,372 - INFO -   ['GET'] /health/live
2025-10-09 06:27:49,372 - INFO -   ['GET'] /health/ready
2025-10-09 06:27:49,373 - INFO - Total routes registered: 40
2025-10-09 06:27:49,373 - INFO - ============================================================
INFO:     Started server process [1]
INFO:     Waiting for application startup.
2025-10-09 06:27:49,373 - INFO - Detected Hugging Face Spaces environment - disabling Redis and Database connections
2025-10-09 06:27:49,373 - INFO - Skipping Redis initialization on HF Spaces
2025-10-09 06:27:49,373 - INFO - Skipping Database initialization on HF Spaces
2025-10-09 06:27:49,373 - INFO - Scalable service mesh not initialized (Redis not available)
2025-10-09 06:27:49,374 - INFO - Monitoring not initialized (Redis not available)
2025-10-09 06:27:49,374 - INFO - Application started without scalable features (Redis not available)
2025-10-09 06:27:49,374 - INFO - Starting FastAPI application with scalable architecture
2025-10-09 06:27:49,374 - INFO - Python version: 3.10.18 (main, Sep 30 2025, 00:42:07) [GCC 14.2.0]
2025-10-09 06:27:49,374 - INFO - PyTorch version: 2.3.0+cu121
2025-10-09 06:27:49,374 - INFO - CUDA available: True
2025-10-09 06:27:49,374 - INFO - CUDA version: 12.1
2025-10-09 06:27:49,374 - INFO - GPU count: 1
2025-10-09 06:27:49,374 - INFO - GPU 0: Tesla T4
INFO:     Application startup complete.
INFO:     Uvicorn running on http://0.0.0.0:7860 (Press CTRL+C to quit)
INFO:     10.16.38.247:4107 - "GET / HTTP/1.1" 200 OK
INFO:     10.16.23.21:26164 - "GET / HTTP/1.1" 200 OK
INFO:     10.16.14.117:47211 - "GET / HTTP/1.1" 200 OK
INFO:     10.16.23.21:14054 - "GET / HTTP/1.1" 200 OK
INFO:     10.16.38.247:57416 - "GET / HTTP/1.1" 200 OK
Background task started for job_id: 9b064426-5331-47a9-9407-224dc2aba308
INFO:     10.16.23.21:18245 - "POST /generate_patient_summary?stream=true HTTP/1.1" 200 OK
2025-10-09 06:28:56,222 - INFO - EHR API response status: 200
Step 1 - EHR fetch took 0.82s
Step 2 - Processing took 0.00s
Step 3 - Computing baseline took 0.00s
🧠 GGUF MODE: Single-prompt generation for microsoft/Phi-3-mini-4k-instruct-gguf/Phi-3-mini-4k-instruct-q4.gguf
📦 Using cache key: repo_id='microsoft/Phi-3-mini-4k-instruct-gguf', filename='Phi-3-mini-4k-instruct-q4.gguf'
🔄 Loading new GGUF pipeline for ('microsoft/Phi-3-mini-4k-instruct-gguf', 'Phi-3-mini-4k-instruct-q4.gguf')
2025-10-09 06:28:56,225 - INFO - Downloading model from microsoft/Phi-3-mini-4k-instruct-gguf/Phi-3-mini-4k-instruct-q4.gguf
/opt/venv/lib/python3.10/site-packages/huggingface_hub/file_download.py:945: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
  warnings.warn(
2025-10-09 06:29:11,970 - INFO - Model downloaded successfully to /tmp/huggingface/models--microsoft--Phi-3-mini-4k-instruct-gguf/snapshots/999f761fe19e26cf1a339a5ec5f9f201301cbb83/Phi-3-mini-4k-instruct-q4.gguf
2025-10-09 06:29:11,970 - INFO - Model file size: 2282.36 MB
2025-10-09 06:29:12,821 - INFO - [GGUF] Model initialized in 0.85s from /tmp/huggingface/models--microsoft--Phi-3-mini-4k-instruct-gguf/snapshots/999f761fe19e26cf1a339a5ec5f9f201301cbb83/Phi-3-mini-4k-instruct-q4.gguf (threads=4, batch=64)
[GGUF] Model loaded successfully in 16.60s: microsoft/Phi-3-mini-4k-instruct-gguf
[ENTRYPOINT] Clearing Hugging Face / Torch / tmp cache...
chmod: changing permissions of '/tmp': Operation not permitted
2025-10-09 06:34:15,501 - INFO - Added src_dir to Python path: /app/services/ai-service/src
2025-10-09 06:34:15,501 - INFO - src_dir exists: True
2025-10-09 06:34:15,501 - INFO - Contents of src_dir: ['__main__.py', 'agents', 'ai_med_extract', 'app.py', 'config_settings.py', 'gradio_app.py', 'routes', 'wsgi.py']
2025-10-09 06:34:15,501 - INFO - ai_med_extract_path exists: True
2025-10-09 06:34:15,502 - INFO - Contents of ai_med_extract: ['__init__.py', 'agents', 'api', 'api_endpoints.py', 'api_middleware.py', 'app.py', 'core_exceptions.py', 'core_logger.py', 'core_security.py', 'database_audit.py', 'gradio_app.py', 'health_endpoints.py', 'inference_service.py', 'main.py', 'metrics_adapter.py', 'monitoring_observability.py', 'phi_scrubber_service.py', 'scalable_service_mesh.py', 'utils']
2025-10-09 06:34:15,502 - INFO - Detected Hugging Face Spaces environment
2025-10-09 06:34:15,502 - INFO - Attempting to import from ai_med_extract package...
2025-10-09 06:34:15,502 - INFO - Python path: ['/app/services/ai-service/src', '', '/opt/venv/bin']
2025-10-09 06:34:15,502 - INFO - Current working directory: /app
2025-10-09 06:34:15,502 - INFO - Files in current directory: ['README.md', '__init__.py', 'app.py', 'requirements.txt', 'services']

libgomp: Invalid value for environment variable OMP_NUM_THREADS

libgomp: Invalid value for environment variable OMP_NUM_THREADS
2025-10-09 06:34:20,990 - INFO - Model manager imported successfully
2025-10-09 06:34:21,303 - INFO - Model manager imported successfully in initialize_agents
2025-10-09 06:34:21,330 - INFO - PatientSummarizerAgent created for Falconsai/medical_summarization (summarization) on cuda (loader deferred)
2025-10-09 06:34:21,430 - WARNING - ONNX Runtime not available: /opt/venv/lib/python3.10/site-packages/onnxruntime/capi/onnxruntime_pybind11_state.cpython-310-x86_64-linux-gnu.so: cannot enable executable stack as shared object requires: Invalid argument
[GGUF] Preloading GGUF models as requested by settings...
[GGUF] Preload failed (non-fatal): Model path does not exist: microsoft/Phi-3-mini-4k-instruct-gguf
2025-10-09 06:34:21,476 - INFO - Main router registered with app
2025-10-09 06:34:21,483 - INFO - Redis URL not configured, PHI scrubbing will use fallback mode
2025-10-09 06:34:21,491 - INFO - API router registered with app
2025-10-09 06:34:21,498 - INFO - Model management router registered with app
2025-10-09 06:34:21,498 - INFO - All routes registered successfully
2025-10-09 06:34:21,507 - INFO - Agents initialized and routes registered
2025-10-09 06:34:21,507 - INFO - ============================================================
2025-10-09 06:34:21,507 - INFO - REGISTERED ROUTES:
2025-10-09 06:34:21,507 - INFO -   ['GET', 'HEAD'] /openapi.json
2025-10-09 06:34:21,507 - INFO -   ['GET', 'HEAD'] /docs
2025-10-09 06:34:21,507 - INFO -   ['GET', 'HEAD'] /docs/oauth2-redirect
2025-10-09 06:34:21,507 - INFO -   ['GET', 'HEAD'] /redoc
2025-10-09 06:34:21,507 - INFO -   ['GET'] /live
2025-10-09 06:34:21,507 - INFO -   ['GET'] /ready
2025-10-09 06:34:21,507 - INFO -   ['GET'] /metrics
2025-10-09 06:34:21,507 - INFO -   ['GET'] /
2025-10-09 06:34:21,507 - INFO -   ['GET'] /api/info
2025-10-09 06:34:21,507 - INFO -   ['GET'] /api/performance_metrics
2025-10-09 06:34:21,507 - INFO -   ['POST'] /generate_patient_summary
2025-10-09 06:34:21,507 - INFO -   ['POST'] /summarize
2025-10-09 06:34:21,507 - INFO -   ['POST'] /upload
[GGUF] Preloading GGUF models as requested by settings...
[GGUF] Preload failed (non-fatal): Model path does not exist: microsoft/Phi-3-mini-4k-instruct-gguf
2025-10-09 06:34:21,507 - INFO -   ['POST'] /phi/scrub
2025-10-09 06:34:21,507 - INFO -   ['POST'] /api/models/load
2025-10-09 06:34:21,508 - INFO -   ['POST'] /api/models/generate
2025-10-09 06:34:21,508 - INFO -   ['GET'] /api/models/info
2025-10-09 06:34:21,508 - INFO -   ['GET'] /api/models/defaults
2025-10-09 06:34:21,508 - INFO -   ['POST'] /api/models/clear_cache
2025-10-09 06:34:21,508 - INFO -   ['POST'] /api/models/switch
2025-10-09 06:34:21,508 - INFO -   ['GET'] /api/models/health
2025-10-09 06:34:21,508 - INFO -   ['GET'] /debug/routes
2025-10-09 06:34:21,508 - INFO -   ['GET'] /health/live
2025-10-09 06:34:21,508 - INFO -   ['GET'] /health/ready
2025-10-09 06:34:21,508 - INFO - ============================================================
2025-10-09 06:34:21,508 - INFO - Agents initialized successfully
2025-10-09 06:34:21,508 - INFO - Model manager imported successfully in initialize_agents
2025-10-09 06:34:21,508 - INFO - PatientSummarizerAgent created for Falconsai/medical_summarization (summarization) on cuda (loader deferred)
2025-10-09 06:34:21,516 - INFO - Main router registered with app
2025-10-09 06:34:21,519 - INFO - API router registered with app
2025-10-09 06:34:21,521 - INFO - Model management router registered with app
2025-10-09 06:34:21,521 - INFO - All routes registered successfully
2025-10-09 06:34:21,529 - INFO - Agents initialized and routes registered
2025-10-09 06:34:21,529 - INFO - ============================================================
2025-10-09 06:34:21,529 - INFO - REGISTERED ROUTES:
2025-10-09 06:34:21,529 - INFO -   ['GET', 'HEAD'] /openapi.json
2025-10-09 06:34:21,529 - INFO -   ['GET', 'HEAD'] /docs
2025-10-09 06:34:21,530 - INFO -   ['GET', 'HEAD'] /docs/oauth2-redirect
2025-10-09 06:34:21,530 - INFO -   ['GET', 'HEAD'] /redoc
2025-10-09 06:34:21,530 - INFO -   ['GET'] /live
2025-10-09 06:34:21,530 - INFO -   ['GET'] /ready
2025-10-09 06:34:21,530 - INFO -   ['GET'] /metrics
2025-10-09 06:34:21,530 - INFO -   ['GET'] /
2025-10-09 06:34:21,530 - INFO -   ['GET'] /api/info
2025-10-09 06:34:21,530 - INFO -   ['GET'] /api/performance_metrics
2025-10-09 06:34:21,530 - INFO -   ['POST'] /generate_patient_summary
2025-10-09 06:34:21,530 - INFO -   ['POST'] /upload
2025-10-09 06:34:21,530 - INFO -   ['GET'] /get_updated_medical_data
2025-10-09 06:34:21,530 - INFO -   ['PUT'] /update_medical_data
2025-10-09 06:34:21,530 - INFO -   ['POST'] /transcribe
2025-10-09 06:34:21,530 - INFO -   ['POST'] /extract_medical_data
2025-10-09 06:34:21,530 - INFO -   ['POST'] /api/generate_summary
[GGUF] Preloading GGUF models as requested by settings...
[GGUF] Preload failed (non-fatal): Model path does not exist: microsoft/Phi-3-mini-4k-instruct-gguf
2025-10-09 06:34:21,530 - INFO -   ['POST'] /api/extract_medical_data_from_audio
2025-10-09 06:34:21,530 - INFO -   ['POST'] /api/patient_summary_openvino
2025-10-09 06:34:21,530 - INFO -   ['POST'] /summarize
2025-10-09 06:34:21,530 - INFO -   ['POST'] /upload
2025-10-09 06:34:21,530 - INFO -   ['POST'] /phi/scrub
2025-10-09 06:34:21,530 - INFO -   ['POST'] /api/models/load
2025-10-09 06:34:21,530 - INFO -   ['POST'] /api/models/generate
2025-10-09 06:34:21,530 - INFO -   ['GET'] /api/models/info
2025-10-09 06:34:21,530 - INFO -   ['GET'] /api/models/defaults
2025-10-09 06:34:21,531 - INFO -   ['POST'] /api/models/clear_cache
2025-10-09 06:34:21,531 - INFO -   ['POST'] /api/models/switch
2025-10-09 06:34:21,531 - INFO -   ['GET'] /api/models/health
2025-10-09 06:34:21,531 - INFO -   ['GET'] /debug/routes
2025-10-09 06:34:21,531 - INFO -   ['GET'] /health/live
2025-10-09 06:34:21,531 - INFO -   ['GET'] /health/ready
2025-10-09 06:34:21,531 - INFO - ============================================================
2025-10-09 06:34:21,531 - INFO - Agents initialized successfully
2025-10-09 06:34:21,531 - INFO - Successfully imported create_app and initialize_agents
2025-10-09 06:34:21,531 - INFO - App instance created successfully (without agents)
2025-10-09 06:34:21,531 - INFO - App title: Medical AI Service
2025-10-09 06:34:21,531 - INFO - App version: 1.0.0
2025-10-09 06:34:21,531 - INFO - Initializing agents with preload_small_models=False...
2025-10-09 06:34:21,531 - INFO - Model manager imported successfully in initialize_agents
2025-10-09 06:34:21,531 - INFO - PatientSummarizerAgent created for Falconsai/medical_summarization (summarization) on cuda (loader deferred)
2025-10-09 06:34:21,546 - INFO - Main router registered with app
2025-10-09 06:34:21,550 - INFO - API router registered with app
2025-10-09 06:34:21,551 - INFO - Model management router registered with app
2025-10-09 06:34:21,551 - INFO - All routes registered successfully
2025-10-09 06:34:21,559 - INFO - Agents initialized and routes registered
2025-10-09 06:34:21,559 - INFO - ============================================================
2025-10-09 06:34:21,559 - INFO - REGISTERED ROUTES:
2025-10-09 06:34:21,559 - INFO -   ['GET', 'HEAD'] /openapi.json
2025-10-09 06:34:21,559 - INFO -   ['GET', 'HEAD'] /docs
2025-10-09 06:34:21,560 - INFO -   ['GET', 'HEAD'] /docs/oauth2-redirect
2025-10-09 06:34:21,560 - INFO -   ['GET', 'HEAD'] /redoc
2025-10-09 06:34:21,560 - INFO -   ['GET'] /live
2025-10-09 06:34:21,560 - INFO -   ['GET'] /ready
2025-10-09 06:34:21,560 - INFO -   ['GET'] /metrics
2025-10-09 06:34:21,560 - INFO -   ['GET'] /
2025-10-09 06:34:21,560 - INFO -   ['GET'] /api/info
2025-10-09 06:34:21,560 - INFO -   ['GET'] /api/performance_metrics
2025-10-09 06:34:21,560 - INFO -   ['POST'] /generate_patient_summary
2025-10-09 06:34:21,560 - INFO -   ['POST'] /upload
2025-10-09 06:34:21,560 - INFO -   ['GET'] /get_updated_medical_data
2025-10-09 06:34:21,560 - INFO -   ['PUT'] /update_medical_data
2025-10-09 06:34:21,560 - INFO -   ['POST'] /transcribe
2025-10-09 06:34:21,560 - INFO -   ['POST'] /extract_medical_data
2025-10-09 06:34:21,560 - INFO -   ['POST'] /api/generate_summary
2025-10-09 06:34:21,560 - INFO -   ['POST'] /api/extract_medical_data_from_audio
2025-10-09 06:34:21,560 - INFO -   ['POST'] /api/patient_summary_openvino
2025-10-09 06:34:21,560 - INFO -   ['POST'] /upload
2025-10-09 06:34:21,560 - INFO -   ['GET'] /get_updated_medical_data
2025-10-09 06:34:21,560 - INFO -   ['PUT'] /update_medical_data
2025-10-09 06:34:21,560 - INFO -   ['POST'] /transcribe
2025-10-09 06:34:21,560 - INFO -   ['POST'] /extract_medical_data
2025-10-09 06:34:21,560 - INFO -   ['POST'] /api/generate_summary
2025-10-09 06:34:21,560 - INFO -   ['POST'] /api/extract_medical_data_from_audio
2025-10-09 06:34:21,560 - INFO -   ['POST'] /api/patient_summary_openvino
2025-10-09 06:34:21,560 - INFO -   ['POST'] /summarize
2025-10-09 06:34:21,560 - INFO -   ['POST'] /upload
2025-10-09 06:34:21,560 - INFO -   ['POST'] /phi/scrub
2025-10-09 06:34:21,560 - INFO -   ['POST'] /api/models/load
2025-10-09 06:34:21,560 - INFO -   ['POST'] /api/models/generate
2025-10-09 06:34:21,560 - INFO -   ['GET'] /api/models/info
2025-10-09 06:34:21,560 - INFO -   ['GET'] /api/models/defaults
2025-10-09 06:34:21,560 - INFO -   ['POST'] /api/models/clear_cache
2025-10-09 06:34:21,560 - INFO -   ['POST'] /api/models/switch
2025-10-09 06:34:21,560 - INFO -   ['GET'] /api/models/health
2025-10-09 06:34:21,560 - INFO -   ['GET'] /debug/routes
2025-10-09 06:34:21,560 - INFO -   ['GET'] /health/live
2025-10-09 06:34:21,560 - INFO -   ['GET'] /health/ready
2025-10-09 06:34:21,561 - INFO - ============================================================
2025-10-09 06:34:21,561 - INFO - Agents initialized successfully
2025-10-09 06:34:21,561 - INFO - ============================================================
2025-10-09 06:34:21,561 - INFO - FINAL REGISTERED ROUTES ON HF SPACES:
2025-10-09 06:34:21,561 - INFO -   ['GET', 'HEAD'] /openapi.json
2025-10-09 06:34:21,561 - INFO -   ['GET', 'HEAD'] /docs
2025-10-09 06:34:21,561 - INFO -   ['GET', 'HEAD'] /docs/oauth2-redirect
2025-10-09 06:34:21,561 - INFO -   ['GET', 'HEAD'] /redoc
2025-10-09 06:34:21,561 - INFO -   ['GET'] /live
2025-10-09 06:34:21,561 - INFO -   ['GET'] /ready
2025-10-09 06:34:21,561 - INFO -   ['GET'] /metrics
2025-10-09 06:34:21,561 - INFO -   ['GET'] /
2025-10-09 06:34:21,561 - INFO -   ['GET'] /api/info
2025-10-09 06:34:21,561 - INFO -   ['GET'] /api/performance_metrics
2025-10-09 06:34:21,561 - INFO -   ['POST'] /generate_patient_summary
2025-10-09 06:34:21,561 - INFO -   ['POST'] /upload
2025-10-09 06:34:21,561 - INFO -   ['GET'] /get_updated_medical_data
2025-10-09 06:34:21,561 - INFO -   ['PUT'] /update_medical_data
2025-10-09 06:34:21,561 - INFO -   ['POST'] /transcribe
2025-10-09 06:34:21,561 - INFO -   ['POST'] /extract_medical_data
2025-10-09 06:34:21,561 - INFO -   ['POST'] /api/generate_summary
2025-10-09 06:34:21,561 - INFO -   ['POST'] /api/extract_medical_data_from_audio
2025-10-09 06:34:21,561 - INFO -   ['POST'] /api/patient_summary_openvino
2025-10-09 06:34:21,561 - INFO -   ['POST'] /upload
2025-10-09 06:34:21,561 - INFO -   ['GET'] /get_updated_medical_data
2025-10-09 06:34:21,561 - INFO -   ['PUT'] /update_medical_data
2025-10-09 06:34:21,561 - INFO -   ['POST'] /transcribe
2025-10-09 06:34:21,561 - INFO -   ['POST'] /extract_medical_data
2025-10-09 06:34:21,561 - INFO -   ['POST'] /api/generate_summary
2025-10-09 06:34:21,561 - INFO -   ['POST'] /api/extract_medical_data_from_audio
2025-10-09 06:34:21,561 - INFO -   ['POST'] /api/patient_summary_openvino
2025-10-09 06:34:21,561 - INFO -   ['POST'] /summarize
2025-10-09 06:34:21,561 - INFO -   ['POST'] /upload
2025-10-09 06:34:21,561 - INFO -   ['POST'] /phi/scrub
2025-10-09 06:34:21,561 - INFO -   ['POST'] /api/models/load
2025-10-09 06:34:21,561 - INFO -   ['POST'] /api/models/generate
2025-10-09 06:34:21,561 - INFO -   ['GET'] /api/models/info
2025-10-09 06:34:21,561 - INFO -   ['GET'] /api/models/defaults
2025-10-09 06:34:21,561 - INFO -   ['POST'] /api/models/clear_cache
2025-10-09 06:34:21,561 - INFO -   ['POST'] /api/models/switch
2025-10-09 06:34:21,561 - INFO -   ['GET'] /api/models/health
2025-10-09 06:34:21,561 - INFO -   ['GET'] /debug/routes
2025-10-09 06:34:21,561 - INFO -   ['GET'] /health/live
2025-10-09 06:34:21,561 - INFO -   ['GET'] /health/ready
2025-10-09 06:34:21,561 - INFO - Total routes registered: 40
2025-10-09 06:34:21,562 - INFO - ============================================================
INFO:     Started server process [1]
INFO:     Waiting for application startup.
2025-10-09 06:34:21,562 - INFO - Detected Hugging Face Spaces environment - disabling Redis and Database connections
2025-10-09 06:34:21,562 - INFO - Skipping Redis initialization on HF Spaces
2025-10-09 06:34:21,562 - INFO - Skipping Database initialization on HF Spaces
2025-10-09 06:34:21,562 - INFO - Scalable service mesh not initialized (Redis not available)
2025-10-09 06:34:21,562 - INFO - Monitoring not initialized (Redis not available)
2025-10-09 06:34:21,562 - INFO - Application started without scalable features (Redis not available)
2025-10-09 06:34:21,562 - INFO - Starting FastAPI application with scalable architecture
2025-10-09 06:34:21,562 - INFO - Python version: 3.10.18 (main, Sep 30 2025, 00:42:07) [GCC 14.2.0]
2025-10-09 06:34:21,562 - INFO - PyTorch version: 2.3.0+cu121
2025-10-09 06:34:21,562 - INFO - CUDA available: True
2025-10-09 06:34:21,562 - INFO - CUDA version: 12.1
2025-10-09 06:34:21,562 - INFO - GPU count: 1
2025-10-09 06:34:21,562 - INFO - GPU 0: Tesla T4
INFO:     Application startup complete.
INFO:     Uvicorn running on http://0.0.0.0:7860 (Press CTRL+C to quit)
INFO:     10.16.23.21:47109 - "GET / HTTP/1.1" 200 OK
INFO:     10.16.23.21:47109 - "GET / HTTP/1.1" 200 OK
Background task started for job_id: 8bf78119-dd65-4e2f-b674-28c0d7976c61
INFO:     10.16.14.52:34296 - "POST /generate_patient_summary?stream=true HTTP/1.1" 200 OK
2025-10-09 06:34:31,881 - INFO - EHR API response status: 200
Step 1 - EHR fetch took 0.82s
Step 2 - Processing took 0.00s
Step 3 - Computing baseline took 0.00s
📝 SUMMARIZATION MODE: google/flan-t5-large
2025-10-09 06:34:31,887 - INFO - Loading Transformers model: google/flan-t5-large (summarization)
2025-10-09 06:34:32,578 - WARNING - AutoModelForSeq2SeqLM failed for google/flan-t5-large: Using a `device_map`, `tp_plan`, `torch.device` context manager or setting `torch.set_default_device(device)` requires `accelerate`. You can install it with `pip install accelerate`
2025-10-09 06:34:59,574 - INFO - Loaded google/flan-t5-large using AutoModel fallback
Device set to use cuda:0
2025-10-09 06:35:00,635 - WARNING - Pipeline creation failed with device 0, trying CPU: 'SummarizationPipeline' object has no attribute 'assistant_model'
Device set to use cuda:0
2025-10-09 06:35:00,636 - ERROR - Failed to load Transformers model: 'SummarizationPipeline' object has no attribute 'assistant_model'
2025-10-09 06:35:00,636 - ERROR - Failed to create model loader for google/flan-t5-large (summarization): Transformers model loading failed: 'SummarizationPipeline' object has no attribute 'assistant_model'
Unified manager load failed for summarization, falling back: Model loading failed for google/flan-t5-large (summarization): Transformers model loading failed: 'SummarizationPipeline' object has no attribute 'assistant_model'
Both `device` and `device_map` are specified. `device` will override `device_map`. You will most likely encounter unexpected behavior. Please remove `device` and keep `device_map`.
Memory cleanup completed. Current usage: 1119.7 MB
2025-10-09 06:35:00,871 - ERROR - Error details: ValueError: Could not load model Falconsai/medical_summarization with any of the following classes: (<class 'transformers.models.auto.modeling_auto.AutoModelForSeq2SeqLM'>, <class 'transformers.models.t5.modeling_t5.T5ForConditionalGeneration'>). See the original errors:

while loading with AutoModelForSeq2SeqLM, an error is thrown:
Traceback (most recent call last):
  File "/opt/venv/lib/python3.10/site-packages/transformers/pipelines/base.py", line 292, in infer_framework_load_model
    model = model_class.from_pretrained(model, **kwargs)
  File "/opt/venv/lib/python3.10/site-packages/transformers/models/auto/auto_factory.py", line 600, in from_pretrained
    return model_class.from_pretrained(
  File "/opt/venv/lib/python3.10/site-packages/transformers/modeling_utils.py", line 311, in _wrapper
    return func(*args, **kwargs)
  File "/opt/venv/lib/python3.10/site-packages/transformers/modeling_utils.py", line 4546, in from_pretrained
    raise ValueError(
ValueError: Using a `device_map`, `tp_plan`, `torch.device` context manager or setting `torch.set_default_device(device)` requires `accelerate`. You can install it with `pip install accelerate`

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/opt/venv/lib/python3.10/site-packages/transformers/pipelines/base.py", line 310, in infer_framework_load_model
    model = model_class.from_pretrained(model, **fp32_kwargs)
  File "/opt/venv/lib/python3.10/site-packages/transformers/models/auto/auto_factory.py", line 600, in from_pretrained
    return model_class.from_pretrained(
  File "/opt/venv/lib/python3.10/site-packages/transformers/modeling_utils.py", line 311, in _wrapper
    return func(*args, **kwargs)
  File "/opt/venv/lib/python3.10/site-packages/transformers/modeling_utils.py", line 4546, in from_pretrained
    raise ValueError(
ValueError: Using a `device_map`, `tp_plan`, `torch.device` context manager or setting `torch.set_default_device(device)` requires `accelerate`. You can install it with `pip install accelerate`

while loading with T5ForConditionalGeneration, an error is thrown:
Traceback (most recent call last):
  File "/opt/venv/lib/python3.10/site-packages/transformers/pipelines/base.py", line 292, in infer_framework_load_model
    model = model_class.from_pretrained(model, **kwargs)
  File "/opt/venv/lib/python3.10/site-packages/transformers/modeling_utils.py", line 311, in _wrapper
    return func(*args, **kwargs)
  File "/opt/venv/lib/python3.10/site-packages/transformers/modeling_utils.py", line 4546, in from_pretrained
    raise ValueError(
ValueError: Using a `device_map`, `tp_plan`, `torch.device` context manager or setting `torch.set_default_device(device)` requires `accelerate`. You can install it with `pip install accelerate`

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/opt/venv/lib/python3.10/site-packages/transformers/pipelines/base.py", line 310, in infer_framework_load_model
    model = model_class.from_pretrained(model, **fp32_kwargs)
  File "/opt/venv/lib/python3.10/site-packages/transformers/modeling_utils.py", line 311, in _wrapper
    return func(*args, **kwargs)
  File "/opt/venv/lib/python3.10/site-packages/transformers/modeling_utils.py", line 4546, in from_pretrained
    raise ValueError(
ValueError: Using a `device_map`, `tp_plan`, `torch.device` context manager or setting `torch.set_default_device(device)` requires `accelerate`. You can install it with `pip install accelerate`


Background task completed successfully for job_id: 8bf78119-dd65-4e2f-b674-28c0d7976c61
Background task started for job_id: cb243477-e435-49df-938d-c47090dfa52e
INFO:     10.16.23.21:7623 - "POST /generate_patient_summary?stream=true HTTP/1.1" 200 OK
2025-10-09 06:35:38,103 - INFO - EHR API response status: 200
Step 1 - EHR fetch took 0.79s
Step 2 - Processing took 0.00s
Step 3 - Computing baseline took 0.00s
🔤 TEXT-GENERATION MODE: microsoft/Phi-3-mini-4k-instruct
2025-10-09 06:35:38,171 - WARNING - Failed to create openvino_telemetry file. Please allow write access to the following directory: /
2025-10-09 06:35:38,171 - WARNING - Failed to create openvino_telemetry file. Please allow write access to the following directory: /
2025-10-09 06:35:38,171 - WARNING - Could not create directory for storing client ID. No data will be sent.
2025-10-09 06:35:38,196 - WARNING - Failed to create openvino_telemetry file. Please allow write access to the following directory: /
2025-10-09 06:35:38,196 - WARNING - Failed to create openvino_telemetry file. Please allow write access to the following directory: /
2025-10-09 06:35:38,196 - WARNING - Could not create directory for storing client ID. No data will be sent.
2025-10-09 06:35:38,627 - WARNING - mkdir -p failed for path /.config/matplotlib: [Errno 13] Permission denied: '/.config'
2025-10-09 06:35:38,628 - WARNING - Matplotlib created a temporary cache directory at /tmp/matplotlib-x1q3_89e because there was an issue with the default path (/.config/matplotlib); it is highly recommended to set the MPLCONFIGDIR environment variable to a writable directory, in particular to speed up the import of Matplotlib and to better support multiprocessing.
No OpenVINO files were found for microsoft/Phi-3-mini-4k-instruct, setting `export=True` to convert the model to the OpenVINO IR. Don't forget to save the resulting model with `.save_pretrained()`
Fetching 2 files: 100%|██████████| 2/2 [01:23<00:00, 41.82s/it]
Loading checkpoint shards: 100%|██████████| 2/2 [00:00<00:00, 14.54it/s]
`loss_type=None` was set in the config but it is unrecognised.Using the default loss: `ForCausalLMLoss`.
/opt/venv/lib/python3.10/site-packages/transformers/cache_utils.py:568: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  or not self.key_cache[layer_idx].numel()  # the layer has no cache
/opt/venv/lib/python3.10/site-packages/transformers/masking_utils.py:187: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  if (padding_length := kv_length + kv_offset - attention_mask.shape[-1]) > 0:
/opt/venv/lib/python3.10/site-packages/optimum/exporters/openvino/model_patcher.py:332: TracerWarning: torch.tensor results are registered as constants in the trace. You can safely ignore this warning if you use this function to create tensors out of constant variables that would be the same every time you call this function. In any other case, this might cause the trace to be incorrect.
  torch.tensor(0.0, device=mask.device, dtype=dtype),
/opt/venv/lib/python3.10/site-packages/optimum/exporters/openvino/model_patcher.py:333: TracerWarning: torch.tensor results are registered as constants in the trace. You can safely ignore this warning if you use this function to create tensors out of constant variables that would be the same every time you call this function. In any other case, this might cause the trace to be incorrect.
  torch.tensor(torch.finfo(torch.float16).min, device=mask.device, dtype=dtype),
/opt/venv/lib/python3.10/site-packages/transformers/cache_utils.py:551: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  elif (
/opt/venv/lib/python3.10/site-packages/transformers/integrations/sdpa_attention.py:59: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  is_causal = query.shape[2] > 1 and attention_mask is None and getattr(module, "is_causal", True)
2025-10-09 06:37:10,477 - WARNING - Failed to create openvino_telemetry file. Please allow write access to the following directory: /
2025-10-09 06:37:10,477 - WARNING - Failed to create openvino_telemetry file. Please allow write access to the following directory: /
2025-10-09 06:37:10,477 - WARNING - Could not create directory for storing client ID. No data will be sent.
INFO:nncf:Statistics of the bitwidth distribution:
┍━━━━━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┑
│ Weight compression mode   │ % all parameters (layers)   │ % ratio-defining parameters (layers)   │
┝━━━━━━━━━━━━━━━━━━━━━━━━━━━┿━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┿━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┥
│ int8_asym                 │ 100% (130 / 130)            │ 100% (130 / 130)                       │
┕━━━━━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┙
Applying Weight Compression ━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100% • 0:00:25 • 0:00:00
[✅ SUCCESS] Text-generation | TIMEOUT_MODE: normal | TOTAL: 395.2s
Background task completed successfully for job_id: cb243477-e435-49df-938d-c47090dfa52e
Background task started for job_id: 9cfb2d70-505a-40d9-b8e9-c0b433b01da4
INFO:     10.16.23.21:27796 - "POST /generate_patient_summary?stream=true HTTP/1.1" 200 OK
2025-10-09 06:43:33,022 - INFO - EHR API response status: 200
Step 1 - EHR fetch took 0.83s
Step 2 - Processing took 0.00s
Step 3 - Computing baseline took 0.00s
🔤 TEXT-GENERATION MODE: microsoft/Phi-3-mini-4k-instruct
No OpenVINO files were found for microsoft/Phi-3-mini-4k-instruct, setting `export=True` to convert the model to the OpenVINO IR. Don't forget to save the resulting model with `.save_pretrained()`
Loading checkpoint shards: 100%|██████████| 2/2 [00:00<00:00, 14.81it/s]
INFO:nncf:Statistics of the bitwidth distribution:
┍━━━━━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┑
│ Weight compression mode   │ % all parameters (layers)   │ % ratio-defining parameters (layers)   │
┝━━━━━━━━━━━━━━━━━━━━━━━━━━━┿━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┿━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┥
│ int8_asym                 │ 100% (130 / 130)            │ 100% (130 / 130)                       │
┕━━━━━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┙
Applying Weight Compression ━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100% • 0:00:26 • 0:00:00
Background task started for job_id: 698e3f44-fd3d-48b3-aee6-db67a64b8361
INFO:     10.16.14.52:59784 - "POST /generate_patient_summary?stream=true HTTP/1.1" 200 OK
2025-10-09 06:46:01,348 - INFO - EHR API response status: 200
Step 1 - EHR fetch took 0.81s
Step 2 - Processing took 0.00s
Step 3 - Computing baseline took 0.00s
📝 SUMMARIZATION MODE: facebook/bart-large-cnn
2025-10-09 06:46:01,352 - INFO - Loading Transformers model: facebook/bart-large-cnn (summarization)
2025-10-09 06:46:02,240 - WARNING - AutoModelForSeq2SeqLM failed for facebook/bart-large-cnn: Using a `device_map`, `tp_plan`, `torch.device` context manager or setting `torch.set_default_device(device)` requires `accelerate`. You can install it with `pip install accelerate`
2025-10-09 06:46:15,746 - INFO - Loaded facebook/bart-large-cnn using AutoModel fallback
Device set to use cuda:0
2025-10-09 06:46:16,589 - WARNING - Pipeline creation failed with device 0, trying CPU: 'SummarizationPipeline' object has no attribute 'assistant_model'
Device set to use cuda:0
2025-10-09 06:46:16,590 - ERROR - Failed to load Transformers model: 'SummarizationPipeline' object has no attribute 'assistant_model'
2025-10-09 06:46:16,590 - ERROR - Failed to create model loader for facebook/bart-large-cnn (summarization): Transformers model loading failed: 'SummarizationPipeline' object has no attribute 'assistant_model'
Unified manager load failed for summarization, falling back: Model loading failed for facebook/bart-large-cnn (summarization): Transformers model loading failed: 'SummarizationPipeline' object has no attribute 'assistant_model'
Both `device` and `device_map` are specified. `device` will override `device_map`. You will most likely encounter unexpected behavior. Please remove `device` and keep `device_map`.
Memory cleanup completed. Current usage: 9320.4 MB
2025-10-09 06:46:16,987 - ERROR - Error details: ValueError: Could not load model Falconsai/medical_summarization with any of the following classes: (<class 'transformers.models.auto.modeling_auto.AutoModelForSeq2SeqLM'>, <class 'transformers.models.t5.modeling_t5.T5ForConditionalGeneration'>). See the original errors:

while loading with AutoModelForSeq2SeqLM, an error is thrown:
Traceback (most recent call last):
  File "/opt/venv/lib/python3.10/site-packages/transformers/pipelines/base.py", line 292, in infer_framework_load_model
    model = model_class.from_pretrained(model, **kwargs)
  File "/opt/venv/lib/python3.10/site-packages/transformers/models/auto/auto_factory.py", line 600, in from_pretrained
    return model_class.from_pretrained(
  File "/opt/venv/lib/python3.10/site-packages/transformers/modeling_utils.py", line 311, in _wrapper
    return func(*args, **kwargs)
  File "/opt/venv/lib/python3.10/site-packages/transformers/modeling_utils.py", line 4546, in from_pretrained
    raise ValueError(
ValueError: Using a `device_map`, `tp_plan`, `torch.device` context manager or setting `torch.set_default_device(device)` requires `accelerate`. You can install it with `pip install accelerate`

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/opt/venv/lib/python3.10/site-packages/transformers/pipelines/base.py", line 310, in infer_framework_load_model
    model = model_class.from_pretrained(model, **fp32_kwargs)
  File "/opt/venv/lib/python3.10/site-packages/transformers/models/auto/auto_factory.py", line 600, in from_pretrained
    return model_class.from_pretrained(
  File "/opt/venv/lib/python3.10/site-packages/transformers/modeling_utils.py", line 311, in _wrapper
    return func(*args, **kwargs)
  File "/opt/venv/lib/python3.10/site-packages/transformers/modeling_utils.py", line 4546, in from_pretrained
    raise ValueError(
ValueError: Using a `device_map`, `tp_plan`, `torch.device` context manager or setting `torch.set_default_device(device)` requires `accelerate`. You can install it with `pip install accelerate`

while loading with T5ForConditionalGeneration, an error is thrown:
Traceback (most recent call last):
  File "/opt/venv/lib/python3.10/site-packages/transformers/pipelines/base.py", line 292, in infer_framework_load_model
    model = model_class.from_pretrained(model, **kwargs)
  File "/opt/venv/lib/python3.10/site-packages/transformers/modeling_utils.py", line 311, in _wrapper
    return func(*args, **kwargs)
  File "/opt/venv/lib/python3.10/site-packages/transformers/modeling_utils.py", line 4546, in from_pretrained
    raise ValueError(
ValueError: Using a `device_map`, `tp_plan`, `torch.device` context manager or setting `torch.set_default_device(device)` requires `accelerate`. You can install it with `pip install accelerate`

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/opt/venv/lib/python3.10/site-packages/transformers/pipelines/base.py", line 310, in infer_framework_load_model
    model = model_class.from_pretrained(model, **fp32_kwargs)
  File "/opt/venv/lib/python3.10/site-packages/transformers/modeling_utils.py", line 311, in _wrapper
    return func(*args, **kwargs)
  File "/opt/venv/lib/python3.10/site-packages/transformers/modeling_utils.py", line 4546, in from_pretrained
    raise ValueError(
ValueError: Using a `device_map`, `tp_plan`, `torch.device` context manager or setting `torch.set_default_device(device)` requires `accelerate`. You can install it with `pip install accelerate`


Background task completed successfully for job_id: 698e3f44-fd3d-48b3-aee6-db67a64b8361
Background task started for job_id: fae0f26d-3e7a-4d00-92da-16a64fecb19e
INFO:     10.16.23.21:50073 - "POST /generate_patient_summary?stream=true HTTP/1.1" 200 OK
2025-10-09 06:46:37,228 - INFO - EHR API response status: 200
Step 1 - EHR fetch took 0.80s
Step 2 - Processing took 0.00s
Step 3 - Computing baseline took 0.00s
📝 SUMMARIZATION MODE: facebook/bart-large-cnn
2025-10-09 06:46:37,234 - INFO - Loading Transformers model: facebook/bart-large-cnn (summarization)
2025-10-09 06:46:37,558 - WARNING - AutoModelForSeq2SeqLM failed for facebook/bart-large-cnn: Using a `device_map`, `tp_plan`, `torch.device` context manager or setting `torch.set_default_device(device)` requires `accelerate`. You can install it with `pip install accelerate`
2025-10-09 06:46:38,088 - INFO - Loaded facebook/bart-large-cnn using AutoModel fallback
Device set to use cuda:0
2025-10-09 06:46:38,775 - WARNING - Pipeline creation failed with device 0, trying CPU: 'SummarizationPipeline' object has no attribute 'assistant_model'
Device set to use cuda:0
2025-10-09 06:46:38,775 - ERROR - Failed to load Transformers model: 'SummarizationPipeline' object has no attribute 'assistant_model'
2025-10-09 06:46:38,775 - ERROR - Failed to create model loader for facebook/bart-large-cnn (summarization): Transformers model loading failed: 'SummarizationPipeline' object has no attribute 'assistant_model'
Unified manager load failed for summarization, falling back: Model loading failed for facebook/bart-large-cnn (summarization): Transformers model loading failed: 'SummarizationPipeline' object has no attribute 'assistant_model'
Both `device` and `device_map` are specified. `device` will override `device_map`. You will most likely encounter unexpected behavior. Please remove `device` and keep `device_map`.
Memory cleanup completed. Current usage: 9365.8 MB
2025-10-09 06:46:39,084 - ERROR - Error details: ValueError: Could not load model Falconsai/medical_summarization with any of the following classes: (<class 'transformers.models.auto.modeling_auto.AutoModelForSeq2SeqLM'>, <class 'transformers.models.t5.modeling_t5.T5ForConditionalGeneration'>). See the original errors:

while loading with AutoModelForSeq2SeqLM, an error is thrown:
Traceback (most recent call last):
  File "/opt/venv/lib/python3.10/site-packages/transformers/pipelines/base.py", line 292, in infer_framework_load_model
    model = model_class.from_pretrained(model, **kwargs)
  File "/opt/venv/lib/python3.10/site-packages/transformers/models/auto/auto_factory.py", line 600, in from_pretrained
    return model_class.from_pretrained(
  File "/opt/venv/lib/python3.10/site-packages/transformers/modeling_utils.py", line 311, in _wrapper
    return func(*args, **kwargs)
  File "/opt/venv/lib/python3.10/site-packages/transformers/modeling_utils.py", line 4546, in from_pretrained
    raise ValueError(
ValueError: Using a `device_map`, `tp_plan`, `torch.device` context manager or setting `torch.set_default_device(device)` requires `accelerate`. You can install it with `pip install accelerate`

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/opt/venv/lib/python3.10/site-packages/transformers/pipelines/base.py", line 310, in infer_framework_load_model
    model = model_class.from_pretrained(model, **fp32_kwargs)
  File "/opt/venv/lib/python3.10/site-packages/transformers/models/auto/auto_factory.py", line 600, in from_pretrained
    return model_class.from_pretrained(
  File "/opt/venv/lib/python3.10/site-packages/transformers/modeling_utils.py", line 311, in _wrapper
    return func(*args, **kwargs)
  File "/opt/venv/lib/python3.10/site-packages/transformers/modeling_utils.py", line 4546, in from_pretrained
    raise ValueError(
ValueError: Using a `device_map`, `tp_plan`, `torch.device` context manager or setting `torch.set_default_device(device)` requires `accelerate`. You can install it with `pip install accelerate`

while loading with T5ForConditionalGeneration, an error is thrown:
Traceback (most recent call last):
  File "/opt/venv/lib/python3.10/site-packages/transformers/pipelines/base.py", line 292, in infer_framework_load_model
    model = model_class.from_pretrained(model, **kwargs)
  File "/opt/venv/lib/python3.10/site-packages/transformers/modeling_utils.py", line 311, in _wrapper
    return func(*args, **kwargs)
  File "/opt/venv/lib/python3.10/site-packages/transformers/modeling_utils.py", line 4546, in from_pretrained
    raise ValueError(
ValueError: Using a `device_map`, `tp_plan`, `torch.device` context manager or setting `torch.set_default_device(device)` requires `accelerate`. You can install it with `pip install accelerate`

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/opt/venv/lib/python3.10/site-packages/transformers/pipelines/base.py", line 310, in infer_framework_load_model
    model = model_class.from_pretrained(model, **fp32_kwargs)
  File "/opt/venv/lib/python3.10/site-packages/transformers/modeling_utils.py", line 311, in _wrapper
    return func(*args, **kwargs)
  File "/opt/venv/lib/python3.10/site-packages/transformers/modeling_utils.py", line 4546, in from_pretrained
    raise ValueError(
ValueError: Using a `device_map`, `tp_plan`, `torch.device` context manager or setting `torch.set_default_device(device)` requires `accelerate`. You can install it with `pip install accelerate`


Background task completed successfully for job_id: fae0f26d-3e7a-4d00-92da-16a64fecb19e
Background task started for job_id: 29410a9f-c9cc-42ea-8188-611bacc0ce86
INFO:     10.16.14.52:39092 - "POST /generate_patient_summary?stream=true HTTP/1.1" 200 OK
2025-10-09 06:47:41,657 - INFO - EHR API response status: 200
Step 1 - EHR fetch took 0.81s
Step 2 - Processing took 0.00s
Step 3 - Computing baseline took 0.00s
📝 SUMMARIZATION MODE: google/flan-t5-large
2025-10-09 06:47:41,661 - INFO - Loading Transformers model: google/flan-t5-large (summarization)
2025-10-09 06:47:41,896 - WARNING - AutoModelForSeq2SeqLM failed for google/flan-t5-large: Using a `device_map`, `tp_plan`, `torch.device` context manager or setting `torch.set_default_device(device)` requires `accelerate`. You can install it with `pip install accelerate`
2025-10-09 06:47:43,725 - INFO - Loaded google/flan-t5-large using AutoModel fallback
Device set to use cuda:0
2025-10-09 06:47:50,211 - WARNING - Pipeline creation failed with device 0, trying CPU: 'SummarizationPipeline' object has no attribute 'assistant_model'
Device set to use cuda:0
2025-10-09 06:47:50,213 - ERROR - Failed to load Transformers model: 'SummarizationPipeline' object has no attribute 'assistant_model'
2025-10-09 06:47:50,213 - ERROR - Failed to create model loader for google/flan-t5-large (summarization): Transformers model loading failed: 'SummarizationPipeline' object has no attribute 'assistant_model'
Unified manager load failed for summarization, falling back: Model loading failed for google/flan-t5-large (summarization): Transformers model loading failed: 'SummarizationPipeline' object has no attribute 'assistant_model'
Both `device` and `device_map` are specified. `device` will override `device_map`. You will most likely encounter unexpected behavior. Please remove `device` and keep `device_map`.
Memory cleanup completed. Current usage: 7539.6 MB
2025-10-09 06:47:50,484 - ERROR - Error details: ValueError: Could not load model Falconsai/medical_summarization with any of the following classes: (<class 'transformers.models.auto.modeling_auto.AutoModelForSeq2SeqLM'>, <class 'transformers.models.t5.modeling_t5.T5ForConditionalGeneration'>). See the original errors:

while loading with AutoModelForSeq2SeqLM, an error is thrown:
Traceback (most recent call last):
  File "/opt/venv/lib/python3.10/site-packages/transformers/pipelines/base.py", line 292, in infer_framework_load_model
    model = model_class.from_pretrained(model, **kwargs)
  File "/opt/venv/lib/python3.10/site-packages/transformers/models/auto/auto_factory.py", line 600, in from_pretrained
    return model_class.from_pretrained(
  File "/opt/venv/lib/python3.10/site-packages/transformers/modeling_utils.py", line 311, in _wrapper
    return func(*args, **kwargs)
  File "/opt/venv/lib/python3.10/site-packages/transformers/modeling_utils.py", line 4546, in from_pretrained
    raise ValueError(
ValueError: Using a `device_map`, `tp_plan`, `torch.device` context manager or setting `torch.set_default_device(device)` requires `accelerate`. You can install it with `pip install accelerate`

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/opt/venv/lib/python3.10/site-packages/transformers/pipelines/base.py", line 310, in infer_framework_load_model
    model = model_class.from_pretrained(model, **fp32_kwargs)
  File "/opt/venv/lib/python3.10/site-packages/transformers/models/auto/auto_factory.py", line 600, in from_pretrained
    return model_class.from_pretrained(
  File "/opt/venv/lib/python3.10/site-packages/transformers/modeling_utils.py", line 311, in _wrapper
    return func(*args, **kwargs)
  File "/opt/venv/lib/python3.10/site-packages/transformers/modeling_utils.py", line 4546, in from_pretrained
    raise ValueError(
ValueError: Using a `device_map`, `tp_plan`, `torch.device` context manager or setting `torch.set_default_device(device)` requires `accelerate`. You can install it with `pip install accelerate`

while loading with T5ForConditionalGeneration, an error is thrown:
Traceback (most recent call last):
  File "/opt/venv/lib/python3.10/site-packages/transformers/pipelines/base.py", line 292, in infer_framework_load_model
    model = model_class.from_pretrained(model, **kwargs)
  File "/opt/venv/lib/python3.10/site-packages/transformers/modeling_utils.py", line 311, in _wrapper
    return func(*args, **kwargs)
  File "/opt/venv/lib/python3.10/site-packages/transformers/modeling_utils.py", line 4546, in from_pretrained
    raise ValueError(
ValueError: Using a `device_map`, `tp_plan`, `torch.device` context manager or setting `torch.set_default_device(device)` requires `accelerate`. You can install it with `pip install accelerate`

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/opt/venv/lib/python3.10/site-packages/transformers/pipelines/base.py", line 310, in infer_framework_load_model
    model = model_class.from_pretrained(model, **fp32_kwargs)
  File "/opt/venv/lib/python3.10/site-packages/transformers/modeling_utils.py", line 311, in _wrapper
    return func(*args, **kwargs)
  File "/opt/venv/lib/python3.10/site-packages/transformers/modeling_utils.py", line 4546, in from_pretrained
    raise ValueError(
ValueError: Using a `device_map`, `tp_plan`, `torch.device` context manager or setting `torch.set_default_device(device)` requires `accelerate`. You can install it with `pip install accelerate`


Background task completed successfully for job_id: 29410a9f-c9cc-42ea-8188-611bacc0ce86
[✅ SUCCESS] Text-generation | TIMEOUT_MODE: normal | TOTAL: 312.7s
Background task completed successfully for job_id: 9cfb2d70-505a-40d9-b8e9-c0b433b01da4
Background task started for job_id: baa38ef8-4af1-4aa3-9590-5fa3640147c4
INFO:     10.16.14.52:20443 - "POST /generate_patient_summary?stream=true HTTP/1.1" 200 OK
2025-10-09 06:49:01,217 - INFO - EHR API response status: 200
Step 1 - EHR fetch took 0.85s
Step 2 - Processing took 0.00s
Step 3 - Computing baseline took 0.00s
🔄 SEQ2SEQ MODE: patrickvonplaten/longformer2roberta-cnn_dailymail-fp16
2025-10-09 06:49:01,221 - INFO - Loading Transformers model: patrickvonplaten/longformer2roberta-cnn_dailymail-fp16 (seq2seq)
2025-10-09 06:49:01,433 - ERROR - Failed to load Transformers model: The checkpoint you are trying to load has model type `encoder_decoder` but Transformers does not recognize this architecture. This could be because of an issue with the checkpoint, or because your version of Transformers is out of date.

You can update Transformers with the command `pip install --upgrade transformers`. If this does not work, and the checkpoint is very new, then there may not be a release version that supports this model yet. In this case, you can get the most up-to-date code by installing Transformers from source with the command `pip install git+https://github.com/huggingface/transformers.git`
2025-10-09 06:49:01,433 - ERROR - Failed to create model loader for patrickvonplaten/longformer2roberta-cnn_dailymail-fp16 (seq2seq): Transformers model loading failed: The checkpoint you are trying to load has model type `encoder_decoder` but Transformers does not recognize this architecture. This could be because of an issue with the checkpoint, or because your version of Transformers is out of date.

You can update Transformers with the command `pip install --upgrade transformers`. If this does not work, and the checkpoint is very new, then there may not be a release version that supports this model yet. In this case, you can get the most up-to-date code by installing Transformers from source with the command `pip install git+https://github.com/huggingface/transformers.git`
Seq2Seq model failed: Model loading failed for patrickvonplaten/longformer2roberta-cnn_dailymail-fp16 (seq2seq): Transformers model loading failed: The checkpoint you are trying to load has model type `encoder_decoder` but Transformers does not recognize this architecture. This could be because of an issue with the checkpoint, or because your version of Transformers is out of date.

You can update Transformers with the command `pip install --upgrade transformers`. If this does not work, and the checkpoint is very new, then there may not be a release version that supports this model yet. In this case, you can get the most up-to-date code by installing Transformers from source with the command `pip install git+https://github.com/huggingface/transformers.git`
Background task completed successfully for job_id: baa38ef8-4af1-4aa3-9590-5fa3640147c4
INFO:     10.16.23.21:18218 - "GET / HTTP/1.1" 200 OK
INFO:     Shutting down
INFO:     Waiting for application shutdown.
2025-10-09 06:50:52,880 - INFO - Shutting down FastAPI application
2025-10-09 06:50:52,880 - INFO - Scalable components shutdown complete
INFO:     Application shutdown complete.
INFO:     Finished server process [1]