sachinchandrankallar committed on
Commit 2aa5eb6 · 1 Parent(s): 4182313

Revert "changes"

This reverts commit 4182313503b52e3b5687b6592adb1dcdd8d6f43d.

Dockerfile CHANGED
@@ -5,7 +5,6 @@ RUN apt-get update && apt-get install -y \
     tesseract-ocr \
     poppler-utils \
     ffmpeg \
-    git \
     && rm -rf /var/lib/apt/lists/*
 
 # Set working directory
@@ -15,8 +14,7 @@ WORKDIR /app
 COPY requirements.txt .
 
 # Install Python dependencies
-RUN pip install --no-cache-dir -r requirements.txt && \
-    pip install --no-cache-dir modelscope==1.9.5 qwen2==0.1.0
+RUN pip install --no-cache-dir -r requirements.txt
 
 # Copy application code
 COPY . .
@@ -33,7 +31,6 @@ ENV XDG_CACHE_HOME=/tmp
 ENV TORCH_HOME=/tmp/torch
 ENV WHISPER_CACHE=/tmp/whisper
 ENV PYTHONPATH=/app
-ENV MODELSCOPE_CACHE=/tmp/huggingface
 
 # Expose port
 EXPOSE 7860
README_SPACES.md CHANGED
@@ -44,138 +44,3 @@ This Hugging Face Space provides an AI-powered medical document processing syste
 ## Privacy
 
 All processing is done securely within the Hugging Face Space environment. No data is stored permanently.
-
-# Medical Data Extraction API - Hugging Face Spaces Deployment
-
-This document provides instructions for deploying the Medical Data Extraction API on Hugging Face Spaces.
-
-## Prerequisites
-
-1. A Hugging Face account
-2. Git installed on your local machine
-3. Docker installed on your local machine (for local testing)
-
-## Local Testing
-
-Before deploying to Spaces, you can test the application locally:
-
-1. Build the Docker image:
-   ```bash
-   docker build -t ai-med-extract .
-   ```
-
-2. Run the container:
-   ```bash
-   docker run -p 7860:7860 ai-med-extract
-   ```
-
-3. Test the API endpoints:
-   ```bash
-   # Health check
-   curl http://localhost:7860/health
-
-   # Extract medical data
-   curl -X POST http://localhost:7860/api/extract_medical_data \
-     -H "Content-Type: application/json" \
-     -d '{"text": "Patient presents with fever and cough..."}'
-
-   # Transcribe audio
-   curl -X POST http://localhost:7860/api/voice_to_text_extraction \
-     -F "audio=@recording.wav"
-   ```
-
-## Deploying to Hugging Face Spaces
-
-1. Create a new Space:
-   - Go to huggingface.co/spaces
-   - Click "Create new Space"
-   - Choose "Docker" as the SDK
-   - Name your space (e.g., "medical-data-extraction")
-   - Set visibility (Public or Private)
-
-2. Connect your repository:
-   - Push your code to a Git repository
-   - In the Space settings, connect to your repository
-   - The Space will automatically build using the Dockerfile
-
-3. Monitor the deployment:
-   - Check the "Logs" tab for build progress
-   - Use the health check endpoint to verify the deployment:
-   ```
-   https://your-space-name.hf.space/health
-   ```
-
-## API Endpoints
-
-1. Health Check:
-   ```
-   GET /health
-   ```
-
-2. Extract Medical Data:
-   ```
-   POST /api/extract_medical_data
-   Content-Type: application/json
-   {
-     "text": "Patient presents with..."
-   }
-   ```
-
-3. Generate Summary:
-   ```
-   POST /api/summary_creation
-   Content-Type: application/json
-   {
-     "text": "Patient presents with..."
-   }
-   ```
-
-4. Transcribe Audio:
-   ```
-   POST /api/voice_to_text_extraction
-   Content-Type: multipart/form-data
-   audio: [audio file]
-   ```
-
-5. Audio to Chart:
-   ```
-   POST /api/audio_to_chart
-   Content-Type: multipart/form-data
-   audio: [audio file]
-   ```
-
-## Environment Variables
-
-The following environment variables are automatically set in the Dockerfile:
-- `PYTHONUNBUFFERED=1`
-- `HF_HOME=/tmp/huggingface`
-- `TRANSFORMERS_CACHE=/tmp/huggingface`
-- `XDG_CACHE_HOME=/tmp`
-- `TORCH_HOME=/tmp/torch`
-- `WHISPER_CACHE=/tmp/whisper`
-- `PYTHONPATH=/app`
-- `MODELSCOPE_CACHE=/tmp/huggingface`
-
-## Troubleshooting
-
-1. If models fail to load:
-   - Check the logs in the Spaces interface
-   - Verify disk space using the health check endpoint
-   - Ensure all dependencies are correctly specified in requirements.txt
-
-2. If audio transcription fails:
-   - Verify the audio file format (supported: wav, mp3, m4a, ogg)
-   - Check file size (max 100MB)
-   - Ensure the Whisper model is loaded (check health endpoint)
-
-3. If CORS errors occur:
-   - Verify the frontend URL is correctly configured
-   - Check the CORS settings in app.py
-   - Ensure proper headers are set in the frontend requests
-
-## Support
-
-For issues or questions:
-1. Check the application logs in the Spaces interface
-2. Use the health check endpoint to diagnose problems
-3. Review the error messages in the API responses
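
For reference, the `POST /api/extract_medical_data` call documented in the removed README section could be issued from Python roughly as follows. This is a minimal sketch using only the standard library: the helper name `build_extract_request` is hypothetical, and the base URL in the usage note simply mirrors the `localhost:7860` curl example. This revert also removes the matching handler from `routes.py`, so whether the endpoint still responds depends on parts of that file not shown here.

```python
import json
import urllib.request


def build_extract_request(base_url: str, text: str) -> urllib.request.Request:
    """Build (but do not send) the JSON POST described in the removed README.

    Hypothetical helper for illustration; it is not part of the repository.
    """
    payload = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/api/extract_medical_data",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Passing the result to `urllib.request.urlopen(req, timeout=30)` would send it, for example with `build_extract_request("http://localhost:7860", "Patient presents with fever and cough...")`.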
 
ai_med_extract/api/routes.py CHANGED
@@ -757,168 +757,3 @@ def register_routes(app, agents):
     @app.route("/")
     def home():
         return "Medical Data Extraction API is running!", 200
-
-    @app.route("/health", methods=["GET"])
-    def health_check():
-        try:
-            # Check if models are loaded
-            models_status = {
-                "medical_data_extractor": MedicalDataExtractorAgent.gen_model_loader._model is not None,
-                "summarizer": SummarizerAgent.summarization_model_loader._model is not None,
-                "whisper": whisper_model._model is not None
-            }
-
-            # Check disk space
-            import shutil
-            total, used, free = shutil.disk_usage("/")
-            disk_space = {
-                "total_gb": round(total / (1024**3), 2),
-                "used_gb": round(used / (1024**3), 2),
-                "free_gb": round(free / (1024**3), 2)
-            }
-
-            return jsonify({
-                "status": "healthy",
-                "models": models_status,
-                "disk_space": disk_space,
-                "version": "1.0.0"
-            })
-        except Exception as e:
-            logging.error(f"Health check failed: {str(e)}", exc_info=True)
-            return jsonify({
-                "status": "unhealthy",
-                "error": str(e)
-            }), 500
-
-    @app.route("/api/extract_medical_data", methods=["POST"])
-    def extract_medical_data():
-        try:
-            data = request.json
-            if not data or "text" not in data:
-                return jsonify({"error": "No text provided"}), 400
-
-            text = data["text"]
-            if not text.strip():
-                return jsonify({"error": "Empty text provided"}), 400
-
-            # Extract medical data
-            result = MedicalDataExtractorAgent.extract_medical_data(text)
-            return jsonify(result)
-
-        except Exception as e:
-            logging.error(f"Error in extract_medical_data: {str(e)}", exc_info=True)
-            return jsonify({"error": str(e)}), 500
-
-    @app.route("/api/summary_creation", methods=["POST"])
-    def generate_summary():
-        try:
-            data = request.json
-            if not data or "text" not in data:
-                return jsonify({"error": "No text provided"}), 400
-
-            text = data["text"]
-            if not text.strip():
-                return jsonify({"error": "Empty text provided"}), 400
-
-            # Generate summary
-            summary = SummarizerAgent.generate_summary(text)
-            return jsonify({"summary": summary})
-
-        except Exception as e:
-            logging.error(f"Error in generate_summary: {str(e)}", exc_info=True)
-            return jsonify({"error": str(e)}), 500
-
-    @app.route("/api/voice_to_text_extraction", methods=["POST"])
-    def transcribe_audio():
-        try:
-            if "audio" not in request.files:
-                return jsonify({"error": "No audio file provided"}), 400
-
-            audio_file = request.files["audio"]
-            if audio_file.filename == "":
-                return jsonify({"error": "No selected audio file"}), 400
-
-            # Validate file extension
-            if not allowed_file(audio_file.filename):
-                return jsonify({
-                    "error": "Unsupported audio format. Allowed formats: wav, mp3, m4a, ogg"
-                }), 400
-
-            # Check file size
-            valid_size, error_message = check_file_size(audio_file)
-            if not valid_size:
-                return jsonify({"error": error_message}), 400
-
-            # Save audio file temporarily
-            temp_dir = os.path.join(tempfile.gettempdir(), 'audio_uploads')
-            os.makedirs(temp_dir, exist_ok=True)
-            temp_filename = f"{uuid.uuid4()}_{secure_filename(audio_file.filename)}"
-            temp_path = os.path.join(temp_dir, temp_filename)
-
-            try:
-                audio_file.save(temp_path)
-                # Transcribe audio
-                result = whisper_model.transcribe(temp_path)
-                return jsonify({"text": result["text"]})
-            finally:
-                # Clean up temporary file
-                if os.path.exists(temp_path):
-                    os.remove(temp_path)
-
-        except Exception as e:
-            logging.error(f"Error in transcribe_audio: {str(e)}", exc_info=True)
-            return jsonify({"error": str(e)}), 500
-
-    @app.route("/api/audio_to_chart", methods=["POST"])
-    def audio_to_chart():
-        try:
-            if "audio" not in request.files:
-                return jsonify({"error": "No audio file provided"}), 400
-
-            audio_file = request.files["audio"]
-            if audio_file.filename == "":
-                return jsonify({"error": "No selected audio file"}), 400
-
-            # Validate file extension
-            if not allowed_file(audio_file.filename):
-                return jsonify({
-                    "error": "Unsupported audio format. Allowed formats: wav, mp3, m4a, ogg"
-                }), 400
-
-            # Check file size
-            valid_size, error_message = check_file_size(audio_file)
-            if not valid_size:
-                return jsonify({"error": error_message}), 400
-
-            # Save audio file temporarily
-            temp_dir = os.path.join(tempfile.gettempdir(), 'audio_uploads')
-            os.makedirs(temp_dir, exist_ok=True)
-            temp_filename = f"{uuid.uuid4()}_{secure_filename(audio_file.filename)}"
-            temp_path = os.path.join(temp_dir, temp_filename)
-
-            try:
-                audio_file.save(temp_path)
-                # Transcribe audio
-                transcription = whisper_model.transcribe(temp_path)
-                transcribed_text = transcription["text"]
-
-                # Extract medical data from transcription
-                medical_data = MedicalDataExtractorAgent.extract_medical_data(transcribed_text)
-
-                # Generate summary
-                summary = SummarizerAgent.generate_summary(transcribed_text)
-
-                return jsonify({
-                    "transcription": transcribed_text,
-                    "medical_data": medical_data,
-                    "summary": summary
-                })
-
-            finally:
-                # Clean up temporary file
-                if os.path.exists(temp_path):
-                    os.remove(temp_path)
-
-        except Exception as e:
-            logging.error(f"Error in audio_to_chart: {str(e)}", exc_info=True)
-            return jsonify({"error": str(e)}), 500
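
The disk-space portion of the removed `/health` handler can be tried in isolation. The sketch below is an illustrative extraction, not code from the repository: the function name `disk_space_report` and its `path` parameter are hypothetical, while the `shutil.disk_usage` call and the rounding to two decimals in units of 1024**3 bytes follow the removed lines.

```python
import shutil


def disk_space_report(path: str = "/") -> dict:
    """Return total/used/free space for `path`, in units of 1024**3 bytes.

    Hypothetical helper mirroring the removed health check's disk section.
    """
    total, used, free = shutil.disk_usage(path)
    gib = 1024 ** 3
    return {
        "total_gb": round(total / gib, 2),
        "used_gb": round(used / gib, 2),
        "free_gb": round(free / gib, 2),
    }
```

Keeping this as a plain function with no Flask dependency would let it be unit-tested without starting the app.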
 
ai_med_extract/app.py CHANGED
@@ -10,7 +10,6 @@ from .agents.phi_scrubber import MedicalTextUtils
 from .agents.summarizer import SummarizerAgent
 from .agents.medical_data_extractor import MedicalDataExtractorAgent
 from .agents.medical_data_extractor import MedicalDocDataExtractorAgent
-import torch
 
 
 # Load environment variables
@@ -27,16 +26,7 @@ logging.basicConfig(
 )
 
 app = Flask(__name__)
-
-# Configure CORS
-CORS(app, resources={
-    r"/api/*": {
-        "origins": ["*"], # Allow all origins in development
-        "methods": ["GET", "POST", "OPTIONS"],
-        "allow_headers": ["Content-Type", "Authorization"],
-        "max_age": 3600
-    }
-})
+CORS(app)
 
 # Configure upload directory
 UPLOAD_DIR = '/data/uploads'
@@ -70,37 +60,13 @@ class LazyModelLoader:
         if self._model is None:
             try:
                 logging.info(f"Loading {self.model_name}...")
-
-                # Special handling for Qwen2 models
-                if "qwen2" in self.model_name.lower():
-                    from modelscope import AutoModelForCausalLM, AutoTokenizer
-                    tokenizer = AutoTokenizer.from_pretrained(
-                        self.model_name,
-                        trust_remote_code=True,
-                        cache_dir=os.environ.get('TRANSFORMERS_CACHE', '/tmp/huggingface')
-                    )
-                    model = AutoModelForCausalLM.from_pretrained(
-                        self.model_name,
-                        trust_remote_code=True,
-                        device_map="auto",
-                        torch_dtype=torch.float16,
-                        cache_dir=os.environ.get('TRANSFORMERS_CACHE', '/tmp/huggingface')
-                    )
-                    self._model = pipeline(
-                        task=self.model_type,
-                        model=model,
-                        tokenizer=tokenizer,
-                        device_map="auto",
-                        torch_dtype=torch.float16
-                    )
-                else:
-                    self._model = pipeline(
-                        task=self.model_type,
-                        model=self.model_name,
-                        trust_remote_code=True,
-                        device_map="auto",
-                        low_cpu_mem_usage=True
-                    )
+                self._model = pipeline(
+                    task=self.model_type,
+                    model=self.model_name,
+                    trust_remote_code=True,
+                    device_map="auto",
+                    low_cpu_mem_usage=True
+                )
                 logging.info(f"Successfully loaded {self.model_name}")
             except Exception as e:
                 if self.fallback_model:
@@ -145,7 +111,7 @@ class WhisperModelLoader:
 try:
     # Use smaller models for Hugging Face Spaces
     medalpaca_model_loader = LazyModelLoader(
-        "Qwen/Qwen2-7B-Chat", # Using Qwen2 model
+        "medalpaca/medalpaca-7b", # Smaller model
         "text-generation",
         fallback_model="facebook/bart-base" # Fallback model
     )
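
The `LazyModelLoader` hunk shows a load-on-first-use pattern with a fallback model, but the body of the `if self.fallback_model:` branch lies outside the diff context. The sketch below illustrates the general pattern under stated assumptions: `LazyLoaderSketch` and its `factory` argument are hypothetical, and retrying the same factory with the fallback name is a guess at the behavior, not the repository's code.

```python
import logging
from typing import Any, Callable, Optional


class LazyLoaderSketch:
    """Hypothetical stand-in for LazyModelLoader; not the repository's class."""

    def __init__(
        self,
        model_name: str,
        factory: Callable[[str], Any],
        fallback_model: Optional[str] = None,
    ) -> None:
        self.model_name = model_name
        self.fallback_model = fallback_model
        self._factory = factory  # assumed: builds a model object from a name
        self._model: Any = None

    def load(self) -> Any:
        # Build the model only on first use; later calls return the cached one.
        if self._model is None:
            try:
                logging.info("Loading %s...", self.model_name)
                self._model = self._factory(self.model_name)
            except Exception:
                if not self.fallback_model:
                    raise
                # Assumed behavior: retry once with the fallback model name.
                logging.warning("Falling back to %s", self.fallback_model)
                self._model = self._factory(self.fallback_model)
        return self._model
```

In the application, the factory would presumably wrap the `pipeline(...)` call shown in the hunk; injecting it here keeps the sketch runnable without downloading any model.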
ai_med_extract/utils/file_utils.py CHANGED
@@ -5,72 +5,63 @@ import logging
 from werkzeug.utils import secure_filename
 from flask import current_app
 
-# Configure logging
-logging.basicConfig(level=logging.INFO)
-logger = logging.getLogger(__name__)
-
-# Allowed file extensions
-ALLOWED_EXTENSIONS = {
-    'pdf': ['pdf'],
-    'image': ['png', 'jpg', 'jpeg', 'gif', 'bmp', 'tiff'],
-    'audio': ['wav', 'mp3', 'm4a', 'ogg'],
-    'document': ['doc', 'docx', 'txt', 'rtf']
-}
-
+ALLOWED_EXTENSIONS = {"pdf", "jpg", "jpeg", "png", "svg", "docx", "doc", "xlsx", "xls", "wav", "mp3", "m4a", "ogg"}
 MAX_SIZE_PDF_DOCS = 1 * 1024 * 1024 * 1024 # 1GB
 MAX_SIZE_IMAGES = 500 * 1024 * 1024 # 500MB
 MAX_SIZE_AUDIO = 100 * 1024 * 1024 # 100MB
 
 
-def allowed_file(filename, file_type='audio'):
-    """Check if the file extension is allowed"""
-    if not filename:
-        return False
-    return '.' in filename and \
-        filename.rsplit('.', 1)[1].lower() in ALLOWED_EXTENSIONS.get(file_type, [])
+def allowed_file(filename):
+    return "." in filename and filename.rsplit(".", 1)[1].lower() in ALLOWED_EXTENSIONS
 
 
-def check_file_size(file, max_size_mb=100):
-    """Check if the file size is within limits"""
+def check_file_size(file):
     try:
+        # Store current position
+        current_pos = file.tell()
+
+        # Check size
         file.seek(0, os.SEEK_END)
         size = file.tell()
-        file.seek(0) # Reset file pointer
 
-        max_size_bytes = max_size_mb * 1024 * 1024
-        if size > max_size_bytes:
-            return False, f"File size exceeds {max_size_mb}MB limit"
+        # Return to original position
+        file.seek(current_pos)
+
+        extension = file.filename.rsplit('.', 1)[-1].lower()
+        if extension in {"pdf", "docx"} and size > MAX_SIZE_PDF_DOCS:
+            return False, f"File {file.filename} exceeds 1GB size limit"
+        elif extension in {"jpg", "jpeg", "png"} and size > MAX_SIZE_IMAGES:
+            return False, f"Image {file.filename} exceeds 500MB size limit"
+        elif extension in {"wav", "mp3", "m4a", "ogg"} and size > MAX_SIZE_AUDIO:
+            return False, f"Audio file {file.filename} exceeds 100MB size limit"
         return True, None
     except Exception as e:
-        logger.error(f"Error checking file size: {str(e)}")
-        return False, "Error checking file size"
+        logging.error(f"Error checking file size: {e}", exc_info=True)
+        return False, f"Error checking file size: {str(e)}"
 
 
 def save_data_to_storage(filename, data):
-    """Save extracted data to storage"""
     try:
-        storage_dir = os.path.join(os.getcwd(), 'storage')
-        os.makedirs(storage_dir, exist_ok=True)
-
-        file_path = os.path.join(storage_dir, f"{secure_filename(filename)}.json")
-        with open(file_path, 'w') as f:
-            json.dump(data, f)
-        return True
+        upload_folder = current_app.config.get("UPLOAD_FOLDER", "uploads")
+        if not os.path.exists(upload_folder):
+            os.makedirs(upload_folder, exist_ok=True)
+        filename = filename.rsplit(".", 1)[0]
+        filepath = os.path.join(upload_folder, f"{filename}.json")
+        with open(filepath, "w") as file:
+            json.dump(data, file)
     except Exception as e:
-        logger.error(f"Error saving data to storage: {str(e)}")
-        return False
+        logging.error(f"Exception during save: {e}")
 
 
 def get_data_from_storage(filename):
-    """Retrieve data from storage"""
     try:
-        storage_dir = os.path.join(os.getcwd(), 'storage')
-        file_path = os.path.join(storage_dir, f"{secure_filename(filename)}.json")
-
-        if os.path.exists(file_path):
-            with open(file_path, 'r') as f:
-                return json.load(f)
-        return None
+        upload_folder = current_app.config.get("UPLOAD_FOLDER", "uploads")
+        filepath = os.path.join(upload_folder, f"{filename}.json")
+        if not os.path.exists(filepath):
+            return None
+        with open(filepath, "r") as file:
+            data = json.load(file)
+        return data
     except Exception as e:
-        logger.error(f"Error retrieving data from storage: {str(e)}")
+        logging.error(f"Error loading data: {e}")
         return None
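
The restored `check_file_size` measures an upload by seeking to the end of the stream and then returning to the saved position, and it picks a limit by file extension. A minimal standalone sketch of both steps follows. The helper names `stream_size` and `size_limit_for` are hypothetical; the limits and extension groups are copied from the diff, and extensions outside those groups are treated as unlimited because the restored code falls through to `return True, None` for them.

```python
import io
import os
from typing import BinaryIO, Optional

MAX_SIZE_PDF_DOCS = 1 * 1024 * 1024 * 1024  # 1GB
MAX_SIZE_IMAGES = 500 * 1024 * 1024  # 500MB
MAX_SIZE_AUDIO = 100 * 1024 * 1024  # 100MB


def stream_size(stream: BinaryIO) -> int:
    """Return the stream's total size without disturbing its read position."""
    current_pos = stream.tell()
    stream.seek(0, os.SEEK_END)
    size = stream.tell()
    stream.seek(current_pos)
    return size


def size_limit_for(filename: str) -> Optional[int]:
    """Map a filename to its byte limit; None means no limit is enforced."""
    extension = filename.rsplit(".", 1)[-1].lower()
    if extension in {"pdf", "docx"}:
        return MAX_SIZE_PDF_DOCS
    if extension in {"jpg", "jpeg", "png"}:
        return MAX_SIZE_IMAGES
    if extension in {"wav", "mp3", "m4a", "ogg"}:
        return MAX_SIZE_AUDIO
    return None
```

One consequence visible in the sketch: `svg`, `doc`, `xlsx`, and `xls` are in `ALLOWED_EXTENSIONS` but have no size limit in the restored code.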
requirements.txt CHANGED
@@ -8,7 +8,7 @@ python-dotenv==1.0.1
 torch==2.1.0
 torchaudio==2.1.0
 torchvision==0.16.0
-transformers==4.37.2
+transformers==4.36.2
 sentence-transformers==2.2.2
 scikit-learn==1.3.2
 numpy==1.24.3
@@ -16,9 +16,7 @@ pandas==2.1.4
 scipy==1.11.4
 accelerate==0.25.0
 
-# Qwen2 dependencies
-modelscope==1.9.5
-qwen2==0.1.0
+
 
 # NLP
 spacy==3.7.2