Update Dockerfiles to use `asgi:app` as the entry point, resolving deployment issues caused by the removal of `app.py`. This change ensures compatibility with the new structure and improves initialization for production environments.
Remove legacy `app.py` file and streamline startup process for Hugging Face Spaces. Refactor `start_hf_spaces.py` to simplify environment setup and application initialization. Enhance `ai_med_extract.app` with improved logging and error handling during app creation and agent initialization. Update route registration in `routes_fastapi.py` for better organization and clarity.
Revert "Refactor `routes_fastapi.py` to enhance performance and maintainability. Introduced `CacheManager`, `ErrorResponseBuilder`, and `PerformanceTracker` for optimized caching, consistent error handling, and improved performance metrics. Updated logging to use safe methods, eliminated redundant code, and maintained backward compatibility. Overall, these changes streamline the patient summary generation process and improve error visibility."
Revert "Refactor `build_result_dict` function by moving it to `routes_helpers.py` to eliminate duplication and improve code organization. Updated timing calculations for better precision and added prompt information handling. This change enhances maintainability and streamlines the result building process."
Refactor `build_result_dict` function by moving it to `routes_helpers.py` to eliminate duplication and improve code organization. Updated timing calculations for better precision and added prompt information handling. This change enhances maintainability and streamlines the result building process.
Refactor `routes_fastapi.py` to enhance performance and maintainability. Introduced `CacheManager`, `ErrorResponseBuilder`, and `PerformanceTracker` for optimized caching, consistent error handling, and improved performance metrics. Updated logging to use safe methods, eliminated redundant code, and maintained backward compatibility. Overall, these changes streamline the patient summary generation process and improve error visibility.
Refactor patient summary processing to improve job status updates. Remove redundant progress updates and report the visit count accurately after the data parsing and computation steps. Improve error handling and streamline the workflow for better maintainability.
Enhance patient summary generation with improved progress updates and error handling. Update `SSEGenerator` to transmit data frequently, preventing HTTP/2 protocol errors. Refine job status monitoring and heartbeat intervals for better connection stability during long-running tasks. Add detailed progress messages throughout the generation process to improve user feedback.
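The heartbeat behaviour described in this commit can be illustrated with a minimal sketch. It is not the project's `SSEGenerator`; the names `format_sse` and `stream_with_heartbeat`, the queue-based hand-off, and the 5-second default are assumptions made for illustration only.

```python
import asyncio
from typing import AsyncIterator, Optional


def format_sse(event: str, data: str) -> str:
    """Format one Server-Sent Events frame (hypothetical helper)."""
    return f"event: {event}\ndata: {data}\n\n"


async def stream_with_heartbeat(
    updates: "asyncio.Queue[Optional[str]]",
    heartbeat_interval: float = 5.0,  # assumed default, not taken from the project
) -> AsyncIterator[str]:
    """Yield progress frames, sending a comment frame whenever the job is quiet.

    A `None` item on the queue marks the end of the job.
    """
    while True:
        try:
            item = await asyncio.wait_for(updates.get(), timeout=heartbeat_interval)
        except asyncio.TimeoutError:
            # Lines starting with ':' are SSE comments. Clients ignore them,
            # but they keep bytes flowing on an otherwise idle connection.
            yield ": heartbeat\n\n"
            continue
        if item is None:
            yield format_sse("done", "complete")
            return
        yield format_sse("progress", item)
```

In a FastAPI route, a generator like this would typically be wrapped in a `StreamingResponse` with `media_type="text/event-stream"`.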
Implement timeout protection and progress updates for patient summary generation. Enhance error handling for both text generation and summarization, ensuring robust job management and better user feedback during long-running tasks. Update request queue management to handle job IDs more flexibly, allowing better tracking and processing of requests.
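As a rough sketch of the timeout protection this commit describes, a generation or summarization step can be wrapped so that a timeout becomes a structured job result instead of a hung task. The function name `run_with_timeout` and the result dictionary keys are hypothetical, not the project's actual API.

```python
import asyncio
from typing import Any, Awaitable, Dict


async def run_with_timeout(
    step: Awaitable[Any], timeout_s: float, step_name: str
) -> Dict[str, Any]:
    """Await `step`, converting a timeout or exception into a job result dict."""
    try:
        result = await asyncio.wait_for(step, timeout=timeout_s)
    except asyncio.TimeoutError:
        # wait_for cancels the underlying step before raising.
        return {
            "status": "failed",
            "step": step_name,
            "error": f"{step_name} timed out after {timeout_s}s",
        }
    except Exception as exc:
        # Report any other failure rather than letting it crash the job.
        return {"status": "failed", "step": step_name, "error": str(exc)}
    return {"status": "completed", "step": step_name, "result": result}
```

Returning a result dictionary rather than raising keeps the job status update in one place, which is one possible reading of "robust job management" here.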
Enhance patient summary processing with queue management and improved error handling. Introduce a queue manager to handle request slots, ensuring efficient processing and timeout management. Update background task logic to include performance metrics and detailed error responses, improving the reliability and maintainability of the patient summary generation workflow.
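One way to picture a queue manager that hands out request slots with a timeout is a semaphore behind an async context manager. This is a sketch under assumptions: the class name `RequestSlots`, its parameters, and the error message are invented for illustration and do not reflect the project's queue manager.

```python
import asyncio
from contextlib import asynccontextmanager


class RequestSlots:
    """Limit concurrent jobs; waiting for a free slot is bounded by a timeout."""

    def __init__(self, max_concurrent: int, acquire_timeout_s: float) -> None:
        self._sem = asyncio.Semaphore(max_concurrent)
        self._timeout = acquire_timeout_s

    @asynccontextmanager
    async def slot(self, job_id: str):
        try:
            await asyncio.wait_for(self._sem.acquire(), timeout=self._timeout)
        except asyncio.TimeoutError:
            raise TimeoutError(
                f"job {job_id}: no free slot within {self._timeout}s"
            ) from None
        try:
            yield
        finally:
            # Always give the slot back, even if the job body raised.
            self._sem.release()
```

A background task would then run its work inside `async with slots.slot(job_id):`, so the slot is released on success and on failure alike.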
Enhance `SSEGenerator` job monitoring and error handling. Introduce a mechanism to wait for job creation before erroring out, improve timeout handling to send warnings instead of stopping processing, and adjust max wait times for operations. Update heartbeat and progress reporting for more reliable streaming responses.
Refactor streaming response handling in patient summary generation to utilize a centralized SSE generator service. This change simplifies the code by removing custom streaming logic, enhances job status monitoring, and improves error handling. The job management process is also streamlined for better maintainability and performance.
Refactor patient summary generation to enhance performance and reliability. Key improvements include a centralized job management service, standardized error handling, and optimized SSE generation. Introduce new constants for data size thresholds and chunking configurations to improve maintainability and scalability. All changes maintain backward compatibility.
Refactor PyTorch compatibility handling by centralizing the RMSNorm patch into a dedicated utility function. This ensures consistent application across modules and improves maintainability. Update logging to reflect the new approach.
Implement RMSNorm patch for PyTorch in `ai_med_extract` modules to ensure compatibility with models like Phi-3, enhancing tensor normalization functionality and logging.
Remove obsolete `.pyc` files and add RMSNorm compatibility patch for PyTorch in `model_loader_spaces.py`, enhancing error handling and fallback mechanisms for model loading.
Revert "Refactor async_patient_summary to unify model selection and enhance summary generation. Introduce robust fallback mechanisms for model types, including support for summarization, seq2seq, gguf, and causal-openvino. Improve logging and error handling for better diagnostics during summary generation."
Refactor async_patient_summary to unify model selection and enhance summary generation. Introduce robust fallback mechanisms for model types, including support for summarization, seq2seq, gguf, and causal-openvino. Improve logging and error handling for better diagnostics during summary generation.
Refactor text generation in `routes_fastapi.py` to return raw summaries instead of formatted markdown. Remove unnecessary markdown processing functions and streamline summary handling, enhancing performance and clarity in the output structure.
Enhance caching behavior in text generation processes across multiple files. Update `patient_summary_agent.py` and `routes_fastapi.py` to ensure proper dynamic cache handling, preventing stale cache issues during single generations. Modify `model_loader_spaces.py` and `unified_model_manager.py` to explicitly manage cache settings based on model capabilities, improving overall generation reliability. Update binary files in `__pycache__` directories.
Refactor memory management and logging in `routes_fastapi.py` to enhance monitoring and prevent leaks. Introduce helper functions for safe logging and streamline text generation processes. Update the `cleanup_memory` function to provide detailed memory usage metrics and warnings for high usage scenarios, improving overall performance and reliability.
Update requirements to pin the `transformers` version and modify caching behavior for OpenVINO models. Adjust logic in `routes_fastapi.py` to disable the cache for compatibility with newer `transformers`, ensuring stability in model generation processes.
Refactor text generation handling in `OpenVinoPipeline` to prioritize `max_new_tokens` over `max_length`, ensuring proper token management for causal models.
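The precedence rule in this commit can be sketched as a small helper that normalizes generation arguments before they reach the model. This is an illustration only: `normalize_generation_kwargs`, its `prompt_len` parameter, and the default of 256 new tokens are assumptions, not code from `OpenVinoPipeline`. It relies on the fact that, for causal models in `transformers`, `max_length` counts prompt tokens plus generated tokens, whereas `max_new_tokens` counts only generated tokens.

```python
from typing import Any, Dict


def normalize_generation_kwargs(
    kwargs: Dict[str, Any],
    prompt_len: int,
    default_new_tokens: int = 256,  # assumed default for illustration
) -> Dict[str, Any]:
    """Return a copy of `kwargs` that carries only `max_new_tokens`."""
    out = dict(kwargs)
    # Always drop max_length so the two limits can never conflict.
    max_length = out.pop("max_length", None)
    if out.get("max_new_tokens") is None:
        if max_length is not None:
            # Convert the total-length budget into a generated-token budget.
            out["max_new_tokens"] = max(1, max_length - prompt_len)
        else:
            out["max_new_tokens"] = default_new_tokens
    return out
```

For example, with a 400-token prompt, `{"max_length": 512}` becomes `{"max_new_tokens": 112}`, while an explicit `max_new_tokens` is passed through unchanged and any accompanying `max_length` is discarded.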
Refactor caching behavior in model configuration and pipeline to prevent `DynamicCache` errors. Set `use_cache` to `None` for the model's default handling and update related settings in `TransformersModel` generation parameters.