	---
	library_name: transformers
	license: apache-2.0
	base_model: facebook/detr-resnet-50
	tags:
	- generated_from_trainer
	model-index:
	- name: detr_finetuned_cppe5
	results: []
	datasets:
	- rishitdagli/cppe-5
	---

	# Model Card for DETR Finetuned on CPPE-5

	## Model Overview

	This model is a fine-tuned version of [facebook/detr-resnet-50](https://huggingface.co/facebook/detr-resnet-50) on the CPPE-5 dataset (`rishitdagli/cppe-5`) for detecting personal protective equipment (PPE). Fine-tuning adapts the model to recognize PPE items such as face shields, masks, gloves, and goggles.

	The model uses the DEtection TRansformer (DETR) architecture with a ResNet-50 backbone for feature extraction. The fine-tuned version keeps DETR's general object detection design but is specialized for the PPE classes in its training data.

	## Model Performance

	The model achieves the following metrics on its evaluation set:

	- Loss: 1.2294
	- mAP (mean Average Precision):
	  - Overall: 0.2366
	  - At IoU threshold 0.50: 0.4852
	  - At IoU threshold 0.75: 0.2032
	  - Small objects: 0.1082
	  - Medium objects: 0.2086
	  - Large objects: 0.3408
	- mAR (mean Average Recall):
	  - At 1 detection: 0.2819
	  - At 10 detections: 0.4463
	  - At 100 detections: 0.4665
	  - Small objects: 0.249
	  - Medium objects: 0.4004
	  - Large objects: 0.5893

	Per-category precision and recall (face shields, gloves, goggles, masks) vary and are not listed here. Small objects such as goggles leave the most room for improvement.
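The IoU thresholds in the metrics above decide when a predicted box counts as a correct detection. The following is a minimal sketch of that matching rule, assuming boxes in `(xmin, ymin, xmax, ymax)` pixel format; the function names and example boxes are hypothetical and not taken from the training code.

```python
def iou(box_a, box_b):
    """Intersection over Union for two (xmin, ymin, xmax, ymax) boxes."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap region (zero if the boxes do not overlap).
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def is_match(pred_box, gt_box, threshold):
    """A prediction matches a ground-truth box when IoU reaches the threshold."""
    return iou(pred_box, gt_box) >= threshold


# Hypothetical example: a prediction shifted 20 px to the right of the ground truth.
gt_box = (0, 0, 100, 100)
pred_box = (20, 0, 120, 100)
print(round(iou(gt_box, pred_box), 4))      # 0.6667
print(is_match(pred_box, gt_box, 0.50))     # True
print(is_match(pred_box, gt_box, 0.75))     # False
```

The same prediction counts toward mAP at the 0.50 threshold but not at 0.75, which is one reason the 0.75 figure (0.2032) sits well below the 0.50 figure (0.4852).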

	## Intended Use and Limitations

	### Intended Use
	- Detecting personal protective equipment (PPE) in images or video streams.
	- Monitoring workplace safety by ensuring proper usage of PPE items such as masks, gloves, face shields, and goggles.
	- Potential fit for settings such as healthcare, manufacturing, and construction where compliance depends on these PPE items. Items outside the trained classes are not detected.
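For the uses listed above, a minimal inference sketch with the `transformers` object-detection pipeline might look like the following. The repository id is a placeholder because this card does not state where the checkpoint is hosted. The label strings are hypothetical too, since they depend on the `id2label` mapping stored in the model config. `load_detector` and `keep_confident` are helper names made up for this example.

```python
def load_detector(model_id):
    """Build an object-detection pipeline for the given checkpoint.

    transformers is imported lazily so the filtering helper below can be
    used (and tested) without the library or a network connection.
    """
    from transformers import pipeline

    return pipeline("object-detection", model=model_id)


def keep_confident(detections, min_score=0.5, labels=None):
    """Filter pipeline output by score and, optionally, by label.

    Each detection is expected to be a dict with "score", "label", and
    "box" keys, which is the format the object-detection pipeline returns.
    """
    kept = [
        d for d in detections
        if d["score"] >= min_score and (labels is None or d["label"] in labels)
    ]
    return sorted(kept, key=lambda d: d["score"], reverse=True)


# Usage (placeholder repo id and image path, both assumptions):
# detector = load_detector("<user>/detr_finetuned_cppe5")
# detections = detector("ward_photo.jpg")
# for det in keep_confident(detections, min_score=0.5):
#     print(det["label"], round(det["score"], 3), det["box"])
```

Given the reported recall, the score threshold is worth tuning per deployment: a lower value catches more items at the cost of more false alarms.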

	### Limitations
	- The model may not generalize well to non-PPE items or general object detection tasks.
	- Performance on small or occluded objects can be limited: small objects score mAP 0.1082 and mAR 0.249, compared with 0.3408 and 0.5893 for large objects.
	- The model was trained on a dataset specific to PPE detection, so its performance on images outside of this domain might be inconsistent.
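The small, medium, and large buckets referenced above are most likely the COCO-style area ranges (under 32², 32² to 96², and above 96² pixels). That is an assumption, since this card does not name the evaluation tool. A sketch of that bucketing, with a hypothetical helper name and simplified boundary handling:

```python
SMALL_MAX_AREA = 32 ** 2   # 1024 px^2, COCO convention (assumed here)
MEDIUM_MAX_AREA = 96 ** 2  # 9216 px^2, COCO convention (assumed here)


def size_bucket(box):
    """Classify an (xmin, ymin, xmax, ymax) box as small, medium, or large."""
    xmin, ymin, xmax, ymax = box
    area = max(0, xmax - xmin) * max(0, ymax - ymin)
    if area < SMALL_MAX_AREA:
        return "small"
    if area < MEDIUM_MAX_AREA:
        return "medium"
    return "large"


# Hypothetical boxes in pixels:
print(size_bucket((0, 0, 20, 20)))     # small  (400 px^2)
print(size_bucket((0, 0, 50, 50)))     # medium (2500 px^2)
print(size_bucket((0, 0, 100, 100)))   # large  (10000 px^2)
```

Checking which bucket the objects in a target deployment fall into gives a rough idea of which of the reported scores applies.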

	## Training and Evaluation Data

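	The model was fine-tuned and evaluated on CPPE-5 (`rishitdagli/cppe-5`, as listed in this card's metadata), a dataset of personal protective equipment such as face shields, masks, goggles, and gloves. The split sizes and preprocessing steps are not documented here.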

	## Training Procedure

	### Hyperparameters
	- Learning rate: 5e-05
	- Train batch size: 8
	- Eval batch size: 8
	- Optimizer: Adam (betas=(0.9, 0.999), epsilon=1e-08)
	- Learning rate scheduler: Cosine decay
	- Number of epochs: 30
	- Seed: 42
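The hyperparameters above can be written out as keyword arguments for `transformers.TrainingArguments`. This is a sketch rather than the original training script. It assumes the batch sizes are per device and that no other defaults were changed. `build_training_arguments` and the output directory are hypothetical.

```python
# Values copied from the hyperparameter list; keys are TrainingArguments
# parameter names. Treating the batch size of 8 as per-device is an assumption.
TRAINING_KWARGS = {
    "learning_rate": 5e-5,
    "per_device_train_batch_size": 8,
    "per_device_eval_batch_size": 8,
    "num_train_epochs": 30,
    "lr_scheduler_type": "cosine",
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "seed": 42,
}


def build_training_arguments(output_dir):
    """Create TrainingArguments from the values above (lazy import)."""
    from transformers import TrainingArguments

    return TrainingArguments(output_dir=output_dir, **TRAINING_KWARGS)


# Usage (hypothetical output directory):
# args = build_training_arguments("detr_finetuned_cppe5")
```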

	Training ran for 30 epochs with the Adam optimizer, a learning rate of 5e-05 with cosine decay, and a batch size of 8 for both training and evaluation.
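To make the cosine decay concrete, the sketch below computes the learning rate at a given step, starting from the stated 5e-05. It assumes a single half-cosine from the base rate down to zero with no warmup, because the card does not mention warmup. The total step count in the example is hypothetical, since the dataset size is not given.

```python
import math


def cosine_lr(step, total_steps, base_lr=5e-5, warmup_steps=0):
    """Learning rate at `step` under linear warmup followed by cosine decay."""
    if warmup_steps > 0 and step < warmup_steps:
        # Linear ramp from 0 to base_lr during warmup.
        return base_lr * step / warmup_steps
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    progress = min(1.0, max(0.0, progress))
    # Half cosine: 1.0 at the start of decay, 0.0 at the end.
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


# Hypothetical run of 1000 optimizer steps:
for step in (0, 250, 500, 750, 1000):
    print(step, f"{cosine_lr(step, 1000):.2e}")
```

The rate falls slowly at first, fastest around the midpoint, and flattens out near zero at the end of training.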

	## Evaluation Results

	The table below shows validation metrics at selected epochs. Training ran for 30 epochs, but the last row shown is epoch 25; the figures under Model Performance (loss 1.2294, mAP 0.2366) differ slightly from that row.

	| Epoch | Validation Loss | mAP | mAP 50 | mAP 75 | mAR | Comments |
	|-------|-----------------|-----|--------|--------|-----|----------|
	| 1 | 2.1073 | 0.0518 | 0.1075 | 0.0423 | 0.2819 | Initial training |
	| 5 | 1.6220 | 0.1223 | 0.2258 | 0.1115 | 0.4463 | Significant improvement |
	| 10 | 1.5033 | 0.155 | 0.3265 | 0.1325 | 0.5032 | Steady improvement |
	| 20 | 1.2649 | 0.2211 | 0.4427 | 0.1952 | 0.5867 | Continued improvement |
	| 25 | 1.2347 | 0.2333 | 0.4831 | 0.1989 | 0.5966 | Best of the listed rows |

	## Limitations and Ethical Considerations

	### Limitations
	- Domain-specific: The model targets PPE detection, where it reaches an overall mAP of 0.2366 on its evaluation set, and may not generalize to other tasks.
	- Bias: If the dataset is skewed or limited, certain PPE items may be under-represented, leading to poorer performance for some categories.
	- Real-time Applications: The model might not meet the latency requirements for real-time detection in high-throughput environments.
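Whether the model meets a real-time budget depends on hardware and image size, and this card reports no timings. A generic way to measure it is sketched below. `measure_latency` is a hypothetical helper that times any zero-argument callable, for example a lambda that runs the detector on one image.

```python
import statistics
import time


def measure_latency(fn, n_warmup=2, n_runs=10):
    """Time repeated calls to `fn` and return summary statistics in seconds."""
    # Warmup calls are excluded so one-time setup costs do not skew the result.
    for _ in range(n_warmup):
        fn()
    samples = []
    for _ in range(n_runs):
        start = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - start)
    median = statistics.median(samples)
    return {
        "median_s": median,
        "max_s": max(samples),
        "fps": 1.0 / median if median > 0 else float("inf"),
    }


# Usage (assumes `detector` and `image` already exist):
# stats = measure_latency(lambda: detector(image))
# print(stats)
```

Comparing `median_s` against the per-frame budget (for example, 1/30 s for 30 fps video) shows whether distillation or a smaller backbone is needed.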

	### Ethical Considerations
	- Privacy: Using this model in surveillance scenarios (e.g., workplaces) may raise concerns about employee privacy, especially if applied without clear consent.
	- Misuse: Acting on the model's output without human review could lead to incorrect enforcement of safety regulations, since missed and false detections are expected at the reported accuracy.

	## Future Work

	- Dataset Improvements: Expanding the dataset to include more diverse PPE items, environments, and object scales could improve model performance, especially for smaller objects.
	- Model Efficiency: Further fine-tuning or model distillation may help make the model more suitable for real-time applications.