---
base_model: Omnifact/conditional-detr-resnet-101-dc5
datasets:
- Voxel51/fisheye8k
library_name: transformers
license: mit
tags:
- generated_from_trainer
pipeline_tag: object-detection
model-index:
- name: fisheye8k_Omnifact_conditional-detr-resnet-101-dc5
  results: []
---

# fisheye8k_Omnifact_conditional-detr-resnet-101-dc5

This model is a fine-tuned version of [Omnifact/conditional-detr-resnet-101-dc5](https://huggingface.co/Omnifact/conditional-detr-resnet-101-dc5) on the [Fisheye8K dataset](https://huggingface.co/datasets/Voxel51/fisheye8k). It is part of the **Mcity Data Engine** project.

This model was presented in the paper [Mcity Data Engine: Iterative Model Improvement Through Open-Vocabulary Data Selection](https://huggingface.co/papers/2504.21614).

## Model description

This fine-tuned object detection model identifies objects in fisheye camera data, with a focus on **Intelligent Transportation Systems (ITS)**. It is a key artifact of the **Mcity Data Engine**, an open-source system that provides a complete data-based development cycle, from data acquisition to model deployment, for continuously improving machine learning models.

The Mcity Data Engine focuses on addressing the challenge of detecting **rare and novel long-tail classes** in large amounts of unlabeled data through an **open-vocabulary data selection process**. This model checkpoint demonstrates the application of this iterative improvement framework to enhance perception capabilities in complex transportation environments.

## Intended uses & limitations

### Intended uses

*   **Object detection** in fisheye camera imagery within Intelligent Transportation Systems (ITS).
*   Identifying both common and **long-tail object classes** such as vehicles (Bus, Bike, Car, Truck) and Vulnerable Road Users (Pedestrian).
*   Integration into **iterative model improvement pipelines** using the Mcity Data Engine framework.
*   Research and development in autonomous driving and roadside perception, particularly for data-centric AI approaches.

### Limitations

*   Performance may vary on datasets significantly different from the training distribution (Fisheye8K), especially for camera types other than fisheye.
*   Although the Mcity Data Engine is designed for open-vocabulary data selection, this checkpoint's generalization to entirely novel or heavily occluded objects may require further iterative data enrichment and fine-tuning.
*   The model performs best when it is integrated into the continuous data improvement loop enabled by the Mcity Data Engine.

## Training and evaluation data

This model was fine-tuned on the [Voxel51/fisheye8k](https://huggingface.co/datasets/Voxel51/fisheye8k) dataset. The Fisheye8K dataset is specifically curated for object detection in fisheye camera images, capturing diverse urban and suburban scenarios relevant to intelligent transportation. The data originates from vehicle fleets and roadside perception systems, providing a rich source for training robust object detection models.

## Usage

You can use this model directly with the Hugging Face `transformers` library for object detection.

```python
from transformers import pipeline
from PIL import Image
import requests
from io import BytesIO

# Load the object detection pipeline
model_id = "mcity-data-engine/fisheye8k_Omnifact_conditional-detr-resnet-101-dc5"
detector = pipeline("object-detection", model=model_id)

# Example image (replace with your fisheye image or a relevant ITS image)
# This example uses a generic image. For best results, use an image from the model's domain.
url = "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/transformers/conditional_detr_image.png"
response = requests.get(url, timeout=30)
response.raise_for_status()  # fail early if the download did not succeed
image = Image.open(BytesIO(response.content)).convert("RGB")

# Perform inference
predictions = detector(image)

# Print detected objects
for pred in predictions:
    print(f"Label: {pred['label']}, Score: {pred['score']:.2f}, Box: {pred['box']}")

# Example output format:
# [{'box': {'xmin': 10, 'ymin': 20, 'xmax': 100, 'ymax': 120}, 'score': 0.98, 'label': 'Car'}]
```
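
The pipeline returns a list of dictionaries with `label`, `score`, and `box` keys, as in the example output format above. The following is a minimal sketch of post-filtering such a list by confidence and class. The helper name `filter_predictions`, the 0.8 threshold, and the sample data are illustrative assumptions and are not part of the model or of `transformers`.

```python
def filter_predictions(predictions, min_score=0.5, labels=None):
    """Keep predictions scoring at least min_score, optionally limited to labels.

    `predictions` is a list of dicts in the object-detection pipeline's
    output format: {"box": {...}, "score": float, "label": str}.
    """
    wanted = set(labels) if labels is not None else None
    kept = [
        p for p in predictions
        if p["score"] >= min_score and (wanted is None or p["label"] in wanted)
    ]
    # Highest-confidence detections first
    return sorted(kept, key=lambda p: p["score"], reverse=True)


# Hand-written sample in the pipeline's output format (not real model output)
sample = [
    {"box": {"xmin": 10, "ymin": 20, "xmax": 100, "ymax": 120}, "score": 0.98, "label": "Car"},
    {"box": {"xmin": 200, "ymin": 40, "xmax": 260, "ymax": 150}, "score": 0.61, "label": "Pedestrian"},
    {"box": {"xmin": 300, "ymin": 80, "xmax": 420, "ymax": 210}, "score": 0.87, "label": "Bus"},
]

confident = filter_predictions(sample, min_score=0.8)
pedestrians = filter_predictions(sample, min_score=0.5, labels=["Pedestrian"])

for pred in confident:
    print(f"{pred['label']}: {pred['score']:.2f}")
```

In practice, `predictions` from the snippet above would be passed in place of `sample`. The object-detection pipeline also accepts a `threshold` argument at call time, which may be sufficient when only a score cutoff is needed.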

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 1
- eval_batch_size: 8
- seed: 0
- optimizer: AdamW (`OptimizerNames.ADAMW_TORCH`) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: cosine
- num_epochs: 36
- mixed_precision_training: Native AMP
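
To give a feel for what the `cosine` scheduler does to the learning rate over the configured run, here is a rough sketch of a half-cosine decay from the listed `learning_rate` down to zero. It assumes no warmup steps (none are listed above) and 5288 optimizer steps per epoch, the step count logged per epoch in the training results table. The function `cosine_lr` is written here for illustration and is not taken from the training code.

```python
import math

BASE_LR = 5e-05          # learning_rate from the list above
STEPS_PER_EPOCH = 5288   # steps logged per epoch in the training results
NUM_EPOCHS = 36          # num_epochs from the list above
TOTAL_STEPS = STEPS_PER_EPOCH * NUM_EPOCHS


def cosine_lr(step, base_lr=BASE_LR, total_steps=TOTAL_STEPS):
    """Half-cosine decay from base_lr to 0 (assumption: zero warmup steps)."""
    progress = min(max(step / total_steps, 0.0), 1.0)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


# Learning rate at the start, after 8 epochs, at the midpoint, and at the end
for epoch in (0, 8, 18, 36):
    lr = cosine_lr(epoch * STEPS_PER_EPOCH)
    print(f"epoch {epoch:2d}: lr = {lr:.2e}")
```

Under these assumptions the learning rate starts at the full value, reaches half of it midway through the schedule, and approaches zero at the final step.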

### Training results

| Training Loss | Epoch | Step  | Validation Loss |
|:-------------:|:-----:|:-----:|:---------------:|
| 1.0147        | 1.0   | 5288  | 1.5035          |
| 0.9144        | 2.0   | 10576 | 1.4618          |
| 0.8685        | 3.0   | 15864 | 1.3823          |
| 0.8375        | 4.0   | 21152 | 1.5128          |
| 0.7715        | 5.0   | 26440 | 1.5045          |
| 0.7664        | 6.0   | 31728 | 1.6914          |
| 0.7073        | 7.0   | 37016 | 1.6101          |
| 0.6966        | 8.0   | 42304 | 1.6175          |
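
As a small worked example of reading the table, the sketch below picks the logged epoch with the lowest validation loss. The values are copied from the table; the variable names are illustrative. It only inspects the logged numbers and makes no claim about which checkpoint was published.

```python
# (epoch, training_loss, validation_loss) copied from the table above
history = [
    (1, 1.0147, 1.5035),
    (2, 0.9144, 1.4618),
    (3, 0.8685, 1.3823),
    (4, 0.8375, 1.5128),
    (5, 0.7715, 1.5045),
    (6, 0.7664, 1.6914),
    (7, 0.7073, 1.6101),
    (8, 0.6966, 1.6175),
]

# Select the row with the smallest validation loss
best_epoch, best_train_loss, best_val_loss = min(history, key=lambda row: row[2])
print(f"Lowest validation loss: {best_val_loss} at epoch {best_epoch}")

# Training loss falls monotonically across the logged epochs
train_losses = [row[1] for row in history]
training_loss_decreasing = all(a > b for a, b in zip(train_losses, train_losses[1:]))
print(f"Training loss strictly decreasing: {training_loss_decreasing}")
```

In the logged epochs, validation loss is lowest at epoch 3 (1.3823) while training loss keeps decreasing through epoch 8.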


### Framework versions

- Transformers 4.48.3
- Pytorch 2.5.1+cu124
- Datasets 3.2.0
- Tokenizers 0.21.0

## Links

*   **Paper**: [Mcity Data Engine: Iterative Model Improvement Through Open-Vocabulary Data Selection](https://huggingface.co/papers/2504.21614)
*   **Project Documentation**: [Mcity Data Engine Docs](https://mcity.github.io/mcity_data_engine/)
*   **GitHub Repository**: [mcity/mcity_data_engine](https://github.com/mcity/mcity_data_engine)
*   **Google Colab Demo**: [Mcity Data Engine Web Demo](https://colab.research.google.com/github/mcity/mcity_data_engine/blob/main/fish_eye_8k_colab.ipynb)

## Acknowledgements

Mcity would like to thank Amazon Web Services (AWS) for their pivotal role in providing the cloud infrastructure on which the Data Engine depends.

## Citation

If you use the Mcity Data Engine in your research, feel free to cite the project:

```bibtex
@article{bogdoll2025mcitydataengine,
  title={Mcity Data Engine},
  author={Bogdoll, Daniel and Anata, Rajanikant Patnaik and Stevens, Gregory},
  journal={GitHub. Note: https://github.com/mcity/mcity_data_engine},
  year={2025}
}
```