madhavkarthi/project-1-location-classifier-resnet18

Image Classification

scene-classification

transfer-learning

computer-vision


madhavkarthi committed on Oct 11, 2025

Commit e0febff · verified · 1 Parent(s): 9695f77

Update README.md

Files changed (1)

README.md +8 -4


@@ -35,6 +35,7 @@ The model is part of a larger pipeline that generates contextual music based on
 **Limitations:**
 - Limited to 4 specific scene categories (cafe, gym, library, outdoor)
+- Limited to Carnegie Mellon University (CMU) campus
 - Trained on relatively small dataset extracted from videos
 - May not generalize well to significantly different scene compositions
 - Performance may degrade on low-quality or heavily edited images
@@ -81,13 +82,14 @@ Training was conducted over 3 epochs with consistent loss reduction:
 | Epoch | Training Loss | Status |
 |:-----:|:-------------:|:------:|
-| 1     | 0.4523        | ✓      |
-| 2     | 0.2156        | ✓      |
-| 3     | 0.1089        | ✓      |
+| 1     | 0.3395        | ✓      |
+| 2     | 0.0111        | ✓      |
+| 3     | 0.0041        | ✓      |
 Note: Formal validation metrics were not computed during training. Model was validated qualitatively on held-out images.
 ## Usage
+This can be used to classify any input image into one of four classifiers: Library, Cafe, Gym, Outdoor.
 ### Loading the model
@@ -145,4 +147,6 @@ ResNet-18 Structure:
 This model was developed as part of a course project (24-679) exploring multimodal AI systems.
 It serves as the visual classification component in an image-to-music generation pipeline that combines scene recognition,
-metadata extraction, weather context, and music synthesis.
+metadata extraction, weather context, and music synthesis.
+AI- ChatGPT, Claude were used in the creation of this model and dataset
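
The Usage line added in this commit says the model assigns an input image to one of four scene classes. As a rough illustration of that last step, the sketch below turns four raw output scores (logits) into a label and a confidence value. It is not the repository's code: the function names `softmax` and `predict_scene` are hypothetical, and the index order in `CLASS_NAMES` is an assumption, since the README lists the classes but not the order the model was trained with.

```python
import math

# ASSUMPTION: the index order of the four classes is not stated in the README.
# Replace this with the order used when the model was trained.
CLASS_NAMES = ["cafe", "gym", "library", "outdoor"]


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict_scene(logits, class_names=CLASS_NAMES):
    """Return (label, probability) for the highest-scoring class."""
    if len(logits) != len(class_names):
        raise ValueError(
            f"expected {len(class_names)} logits, got {len(logits)}"
        )
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return class_names[best], probs[best]


if __name__ == "__main__":
    # Made-up logits standing in for the classifier head's output.
    label, confidence = predict_scene([0.2, -1.3, 3.1, 0.5])
    print(f"{label} ({confidence:.2f})")
```

In a real pipeline the logits would come from the ResNet-18 classification head; the softmax confidence is only a relative score, and the README notes that no formal validation metrics were computed.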