Spaces: salvinjose/HNTAI

Commit History

Remove obsolete .pyc files and add RMSNorm compatibility patch for PyTorch in model_loader_spaces.py, enhancing error handling and fallback mechanisms for model loading.

7fcf280

sachinchandrankallar committed on Nov 6, 2025

Revert "Add RMSNorm implementation to torch if missing for compatibility with HF models"

9c0908e

sachinchandrankallar committed on Nov 6, 2025

Revert "Refactor async_patient_summary to unify model selection and enhance summary generation. Introduce robust fallback mechanisms for model types, including support for summarization, seq2seq, gguf, and causal-openvino. Improve logging and error handling for better diagnostics during summary generation."

03de9e2

sachinchandrankallar committed on Nov 6, 2025

Refactor async_patient_summary to unify model selection and enhance summary generation. Introduce robust fallback mechanisms for model types, including support for summarization, seq2seq, gguf, and causal-openvino. Improve logging and error handling for better diagnostics during summary generation.

d288f98

sachinchandrankallar committed on Nov 6, 2025

Add RMSNorm implementation to torch if missing for compatibility with HF models

a127f51

sachinchandrankallar committed on Nov 6, 2025

Revert "requirements update"

3463c83

sachinchandrankallar committed on Nov 6, 2025

Revert "test"

54d246e

sachinchandrankallar committed on Nov 6, 2025

test

c8449e4

sachinchandrankallar committed on Nov 6, 2025

requirements update

1e7907f

sachinchandrankallar committed on Nov 6, 2025

refactor

5b000dc

sachinchandrankallar committed on Nov 5, 2025

Refactor text generation in routes_fastapi.py to return raw summaries instead of formatted markdown. Remove unnecessary markdown processing functions and streamline summary handling, enhancing performance and clarity in the output structure.

6aa6b6a

sachinchandrankallar committed on Nov 3, 2025

Enhance caching behavior in text generation processes across multiple files. Update patient_summary_agent.py and routes_fastapi.py to ensure proper dynamic cache handling, preventing stale cache issues during single generations. Modify model_loader_spaces.py and unified_model_manager.py to explicitly manage cache settings based on model capabilities, improving overall generation reliability. Update binary files in pycache directories.

6202dd0

sachinchandrankallar committed on Nov 3, 2025

Revert "merge conflicts"

6c585d3

sachinchandrankallar committed on Nov 3, 2025

merge conflicts

3de57f3

sachinchandrankallar committed on Oct 31, 2025

Refactor memory management and logging in routes_fastapi.py to enhance monitoring and prevent leaks. Introduce helper functions for safe logging and streamline text generation processes. Update cleanup_memory function to provide detailed memory usage metrics and warnings for high usage scenarios, improving overall performance and reliability.

83603a0

sachinchandrankallar committed on Oct 31, 2025

Update requirements to pin transformers version and modify caching behavior for OpenVINO models. Adjust logic in routes_fastapi.py to disable cache for compatibility with newer transformers, ensuring stability in model generation processes.

16be7d2

sachinchandrankallar committed on Oct 31, 2025

Revert "change routes_fastapi.py to use unifiedmodel loader for everywhere"

29a03e3

sachinchandrankallar committed on Oct 30, 2025

change routes_fastapi.py to use unifiedmodel loader for everywhere

c6110a9

sachinchandrankallar committed on Oct 30, 2025

fix

b2ba308

sachinchandrankallar committed on Oct 30, 2025

map casual openvino to text generation

ebbec22

sachinchandrankallar committed on Oct 30, 2025

Refactor text generation logic to utilize unified model manager and improve model loading and configuration

bc359db

sachinchandrankallar committed on Oct 30, 2025

Update transformers version to 4.57.1 for improved compatibility and features

bd0be8c

sachinchandrankallar committed on Oct 30, 2025

Update transformers version to 4.57.1 for improved compatibility and features

18d3466

sachinchandrankallar committed on Oct 30, 2025

Revert "Pin transformers version to 4.37.0 for compatibility with existing dependencies"

c729e2e

sachinchandrankallar committed on Oct 30, 2025

Pin transformers version to 4.37.0 for compatibility with existing dependencies

0f74ef3

sachinchandrankallar committed on Oct 30, 2025

Enhance caching and chunking mechanisms in PatientSummarizerAgent for improved performance and reliability

38e2f33

sachinchandrankallar committed on Oct 30, 2025

Refactor caching configuration in OpenVinoPipeline to allow models to manage their own caching behavior, improving compatibility and flexibility.

be33ded

sachinchandrankallar committed on Oct 30, 2025

Revert "Enhance error logging in model generation and pipeline handling to improve debugging capabilities for DynamicCache and GGUF wrapper failures."

1666dba

sachinchandrankallar committed on Oct 30, 2025

Enhance error logging in model generation and pipeline handling to improve debugging capabilities for DynamicCache and GGUF wrapper failures.

7f933a5

sachinchandrankallar committed on Oct 30, 2025

Refactor text generation handling in OpenVinoPipeline to prioritize max_new_tokens over max_length, ensuring proper token management for causal models.

871e862

sachinchandrankallar committed on Oct 30, 2025

Update caching behavior in model configuration to use None for use_cache, allowing the model to manage caching dynamically.

12df82a

sachinchandrankallar committed on Oct 30, 2025

Refactor caching behavior in model configuration and pipeline to prevent DynamicCache errors. Set use_cache to None for model's default handling and update related settings in TransformersModel generation parameters.

2b5dd8c

sachinchandrankallar committed on Oct 30, 2025

revert

bb973cb

sachinchandrankallar committed on Oct 30, 2025

transformers downgraded

195c13e

sachinchandrankallar committed on Oct 30, 2025

Add cache configuration and max length handling in OpenVinoPipeline

b92b395

sachinchandrankallar committed on Oct 30, 2025

Revert "Enhance model configuration and unified model manager to improve performance. Update max_length and max_new_tokens for consistency, and explicitly disable cache to prevent DynamicCache errors. Add logger import in FastAPI routes for better logging capabilities."

6303241

sachinchandrankallar committed on Oct 30, 2025

Enhance model configuration and unified model manager to improve performance. Update max_length and max_new_tokens for consistency, and explicitly disable cache to prevent DynamicCache errors. Add logger import in FastAPI routes for better logging capabilities.

28d1689

sachinchandrankallar committed on Oct 30, 2025

Implement Hugging Face Spaces configuration and memory management utilities. Enhance model loading and cleanup processes, enabling optimized deployment on HF Spaces. Update memory optimization settings and model configurations for improved performance and resource management.

b190ecb

sachinchandrankallar committed on Oct 30, 2025

Refactor patient summary generation to standardize custom prompt formatting. Update logic to ensure consistent structure across different modes, enhancing clarity and usability in generating comprehensive summaries. Adjust context handling to align with expected input formats for summarization models.

bcaa540

sachinchandrankallar committed on Oct 29, 2025

Enhance patient summary generation by introducing support for custom prompts. Modify the processing logic to append visit data when a custom prompt is provided, improving flexibility and user experience in generating patient summaries. Update related sections to ensure consistent handling of prompts across different modes.

7be0e14

sachinchandrankallar committed on Oct 29, 2025

Refactor patient summary generation to support a flexible structure, allowing for comprehensive summaries without enforcing fixed sections. Update related methods and prompts to enhance clarity and usability. Improve error handling and logging for summary generation processes, ensuring better performance and user experience.

2fb6319

sachinchandrankallar committed on Oct 29, 2025

Remove obsolete documentation and test files related to GGUF operations, streaming fixes, and device parameter handling. This cleanup enhances project maintainability by eliminating unused code and files that are no longer relevant to the current implementation.

8012840

sachinchandrankallar committed on Oct 29, 2025

Enhance patient summary generation with optimized parallel processing and intelligent chunking for large datasets. Introduce extended timeout configurations for complex cases, improving error handling and logging. Update API endpoints for large data processing and streaming, ensuring better performance and user experience. Refactor model loading to support OpenVINO and standard transformers with improved fallback strategies.

992b8bf

sachinchandrankallar committed on Oct 28, 2025

Refactor model management by replacing the legacy model manager with a unified model manager across the application. Update imports and method calls to ensure compatibility with the new structure. Enhance error handling and logging for model loading processes, improving overall performance and maintainability.

416c047

sachinchandrankallar committed on Oct 23, 2025

Refactor summarizer pipeline creation and enhance model loading for HF Spaces compatibility. Introduce a unified approach for model management, including new user models endpoint and improved error handling. Update model configurations and logging for better monitoring during model loading processes.

a5e6a2d

sachinchandrankallar committed on Oct 22, 2025

Implement global exception handling and memory-aware logging across the application. Introduce logging enhancements in the AI service to capture memory snapshots during errors and key operations. Update middleware for request/response logging and improve model loading with detailed progress updates. Refactor patient summary generation to include concise logging for each step, ensuring better monitoring and error handling.

117f00b

sachinchandrankallar committed on Oct 22, 2025

Remove 'Connection: keep-alive' header from event-stream response in patient summary generation. Update binary cache files for model configurations and loaders.

618340b

sachinchandrankallar committed on Oct 17, 2025

Revert "requirements chande"

b7ad04a

sachinchandrankallar committed on Oct 17, 2025

requirements chande

544efc0

sachinchandrankallar committed on Oct 17, 2025

Enhance GGUF model loading and generation process with improved progress updates and logging. Updated job status messages to include visual indicators for different stages of model loading and text generation. Streamlined the use of extended streaming for all requests to prevent timeout issues, ensuring a more responsive user experience.

8a71d89

sachinchandrankallar committed on Oct 17, 2025
