Object Detection
Object detection is the task of localizing each object in an image with a bounding box and assigning a class label to every box.
Popular Datasets
- COCO (Common Objects in Context)
- Description: Large-scale object detection, segmentation, and captioning dataset with over 200,000 labeled images and 80 object categories.
- URL: COCO
- PASCAL VOC
- Description: Dataset for object detection with 20 classes, providing images, annotations, and segmentation masks.
- URL: PASCAL VOC
- ImageNet
- Description: Over 14 million images organized into more than 20,000 categories. The widely used ILSVRC classification subset has 1,000 categories, and the ILSVRC detection subset has 200.
- URL: ImageNet
- Open Images Dataset
- Description: Contains ~9 million images annotated with image-level labels, object bounding boxes, and segmentation masks.
- URL: Open Images
- KITTI
- Description: Dataset for autonomous driving with images, 3D point clouds, and annotations for object detection and tracking.
- URL: KITTI
Popular Models
- R-CNN (Region-based Convolutional Neural Networks)
- Variants: R-CNN, Fast R-CNN, Faster R-CNN
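- Description: Two-stage detectors that first propose regions of interest and then classify and refine each region. R-CNN and Fast R-CNN take proposals from an external algorithm (selective search); Faster R-CNN replaces it with a learned Region Proposal Network (RPN).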
- URL: Faster R-CNN
- YOLO (You Only Look Once)
- Variants: YOLOv1, YOLOv2, YOLOv3, YOLOv4, YOLOv5
- Description: Real-time object detection system that predicts bounding boxes and class probabilities directly from full images.
- URL: YOLO
- SSD (Single Shot MultiBox Detector)
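- Description: One-stage detector that predicts class scores and box offsets for a fixed set of default boxes on several feature maps of different resolutions, in a single forward pass and without a separate proposal stage.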
- URL: SSD
- RetinaNet
- Description: One-stage detector that combines a Feature Pyramid Network (FPN) backbone with Focal Loss to handle the extreme imbalance between foreground and background examples.
- URL: RetinaNet
- EfficientDet
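- Description: Scalable and efficient object detector that uses EfficientNet as its backbone, adds a bidirectional feature pyramid (BiFPN), and scales resolution, depth, and width jointly.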
- URL: EfficientDet
Hyperparameters
- Learning Rate
- Description: Controls the step size at each iteration while moving towards a minimum of the loss function.
- Batch Size
- Description: The number of training examples used in one iteration.
- Number of Epochs
- Description: The number of complete passes through the training dataset.
- Anchor Boxes
- Description: Predefined reference boxes of different sizes and aspect ratios, placed over the feature map; the detector predicts a class score and coordinate offsets relative to each anchor.
- IoU Threshold
- Description: Minimum Intersection over Union (IoU) between a prediction and a ground-truth box for the prediction to count as a true positive (0.5 is a common choice).
- Non-Maximum Suppression (NMS) Threshold
- Description: IoU value above which a lower-scoring box that overlaps a higher-scoring box is discarded as a duplicate.
- Backbone Network
- Examples: ResNet, VGG, MobileNet
- Optimizer
- Examples: SGD, Adam
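The anchor-box, IoU-threshold, and NMS-threshold entries above can be made concrete with a short sketch. It is a minimal illustration in plain Python, not the implementation of any particular detector. The function names (`make_anchors`, `iou`, `nms`), the corner format `(x1, y1, x2, y2)`, and the convention that an aspect ratio means width divided by height are assumptions chosen for this example.

```python
import math


def make_anchors(cx, cy, scales=(32, 64), ratios=(0.5, 1.0, 2.0)):
    """Anchors centered at (cx, cy); ratio is assumed to mean width / height."""
    anchors = []
    for s in scales:
        for r in ratios:
            # Width and height are chosen so that the area stays close to s * s.
            w, h = s * math.sqrt(r), s / math.sqrt(r)
            anchors.append((cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2))
    return anchors


def iou(a, b):
    """Intersection over Union of two boxes in (x1, y1, x2, y2) format."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_threshold=0.5):
    """Greedy NMS: visit boxes by descending score and keep a box only if it
    does not overlap an already kept box by more than iou_threshold."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


boxes = [(10, 10, 50, 50), (12, 12, 52, 52), (100, 100, 140, 140)]
scores = [0.9, 0.8, 0.7]
kept = nms(boxes, scores, iou_threshold=0.5)
print(kept)  # the second box overlaps the first heavily and is suppressed
```

Lowering `iou_threshold` makes suppression more aggressive (fewer duplicates, but nearby objects may be merged); raising it keeps more overlapping boxes.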
Popular Loss Functions
- Cross-Entropy Loss
- Description: Penalizes incorrect class predictions for each box or anchor; it is the standard classification term of a detection loss.
- Smooth L1 Loss
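- Description: Used for bounding box regression. It is quadratic (like L2) for small errors and linear (like L1) for large errors, which makes it less sensitive to outliers than L2.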
- Focal Loss
- Description: Addresses class imbalance by down-weighting well-classified (easy) examples so that training focuses on hard ones.
- URL: Focal Loss
- IoU Loss
- Description: Directly optimizes the Intersection over Union metric, typically as 1 - IoU between the predicted and ground-truth boxes.
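The losses listed above can be written out for a single example to show how they differ. This is a hedged sketch in plain Python: the function names, the `beta` transition point for Smooth L1, and the defaults `alpha=0.25` and `gamma=2.0` for Focal Loss are assumptions for illustration, and real frameworks operate on batched tensors rather than Python lists.

```python
import math


def cross_entropy(probs, target):
    """Negative log-probability assigned to the true class index."""
    return -math.log(probs[target])


def smooth_l1(pred, target, beta=1.0):
    """Quadratic below beta, linear above it, summed over box coordinates."""
    total = 0.0
    for p, t in zip(pred, target):
        d = abs(p - t)
        total += 0.5 * d * d / beta if d < beta else d - 0.5 * beta
    return total


def focal_loss(p, y, alpha=0.25, gamma=2.0, eps=1e-12):
    """Binary focal loss; p is the predicted foreground probability, y is 0 or 1."""
    p_t = p if y == 1 else 1.0 - p
    alpha_t = alpha if y == 1 else 1.0 - alpha
    p_t = min(max(p_t, eps), 1.0 - eps)  # guard against log(0)
    # (1 - p_t) ** gamma shrinks the loss of confidently correct examples.
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


def iou_loss(pred, target):
    """1 - IoU for two boxes in (x1, y1, x2, y2) format."""
    iw = max(0.0, min(pred[2], target[2]) - max(pred[0], target[0]))
    ih = max(0.0, min(pred[3], target[3]) - max(pred[1], target[1]))
    inter = iw * ih
    union = ((pred[2] - pred[0]) * (pred[3] - pred[1])
             + (target[2] - target[0]) * (target[3] - target[1]) - inter)
    return 1.0 - (inter / union if union > 0 else 0.0)


easy = focal_loss(0.9, 1)  # confident and correct: tiny loss
hard = focal_loss(0.1, 1)  # confident and wrong: large loss
print(easy < hard)
```

With `gamma=0` the focal term disappears and the function reduces to alpha-weighted cross-entropy, which is a quick sanity check on any implementation.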
Popular Evaluation Metrics
- Mean Average Precision (mAP)
- Description: The mean of the per-class Average Precision (AP) values. COCO-style mAP additionally averages over IoU thresholds from 0.50 to 0.95.
- Intersection over Union (IoU)
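- Description: Measures the overlap between the predicted bounding box and the ground truth: the area of their intersection divided by the area of their union, ranging from 0 (no overlap) to 1 (identical boxes).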
- Precision-Recall Curve
- Description: Plots precision against recall as the confidence-score threshold for accepting a detection is varied.
- F1 Score
- Description: The harmonic mean of precision and recall.
- Average Precision (AP)
- Description: The area under the precision-recall curve for a single class.
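To show how these metrics fit together, the sketch below computes a precision-recall curve, AP, F1, and mAP from a ranked list of detections. It assumes each detection has already been marked as a true or false positive at some IoU threshold, and it uses all-point interpolation with a monotone precision envelope, which is one common convention; benchmarks differ in the details. All function names and the toy data are hypothetical.

```python
def precision_recall(scores, is_tp, num_gt):
    """Precision and recall after each detection, in descending score order."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    tp = fp = 0
    precisions, recalls = [], []
    for i in order:
        if is_tp[i]:
            tp += 1
        else:
            fp += 1
        precisions.append(tp / (tp + fp))
        recalls.append(tp / num_gt)
    return precisions, recalls


def average_precision(precisions, recalls):
    """Area under the PR curve after making precision non-increasing."""
    p = [0.0] + list(precisions) + [0.0]
    r = [0.0] + list(recalls) + [1.0]
    for k in range(len(p) - 2, -1, -1):
        p[k] = max(p[k], p[k + 1])  # precision envelope
    return sum((r[k + 1] - r[k]) * p[k + 1] for k in range(len(r) - 1))


def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def mean_average_precision(ap_per_class):
    """mAP: mean of the AP values, one per class."""
    return sum(ap_per_class.values()) / len(ap_per_class)


# Toy data for one class: four detections, two ground-truth objects.
scores = [0.9, 0.8, 0.7, 0.6]
is_tp = [True, False, True, False]
precisions, recalls = precision_recall(scores, is_tp, num_gt=2)
ap = average_precision(precisions, recalls)
print(round(ap, 4))
```

In the toy data the first detection is correct (precision 1.0 at recall 0.5) and the second correct detection arrives at rank three (precision 2/3 at recall 1.0), so the area is 0.5 * 1.0 + 0.5 * 2/3.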
Other Important Topics
- Data Augmentation
- Description: Techniques to increase the diversity of the training dataset without collecting new data.
- Examples: Scaling, Translation, Rotation, Flipping, Adding Noise
- Transfer Learning
- Description: Using a pre-trained model on a new, related task.
- Example: Fine-tuning a model pre-trained on COCO for a custom object detection task.
- Fine-Tuning
- Description: Adjusting a pre-trained model’s parameters on a new dataset.
- Hyperparameter Tuning
- Techniques: Grid Search, Random Search, Bayesian Optimization
- Model Interpretability
- Techniques: Visualization of feature maps, Activation maximization
- Post-Processing Techniques
- Examples: Non-Maximum Suppression (NMS), Soft-NMS
- Frameworks and Libraries
- Examples: TensorFlow Object Detection API, Detectron2, MMDetection
- Edge and Real-Time Object Detection
- Description: Deploying object detection models on edge devices for real-time applications.
- Examples: TensorFlow Lite, NVIDIA Jetson, OpenVINO
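One detail of the data augmentation entry above deserves an example: in detection, geometric augmentations must transform the bounding boxes together with the pixels. The sketch below shows this for a horizontal flip. It is a minimal illustration that assumes images are lists of pixel rows and boxes are `(x1, y1, x2, y2)` in continuous coordinates where the right image edge is at `x = image_width`; the function names are hypothetical.

```python
def hflip_image(image):
    """Mirror an image, given as a list of pixel rows, left to right."""
    return [row[::-1] for row in image]


def hflip_boxes(boxes, image_width):
    """Mirror boxes to match a flipped image.

    The old right edge x2 becomes the new left edge, so x1 and x2 swap
    roles; y coordinates are unchanged.
    """
    return [(image_width - x2, y1, image_width - x1, y2)
            for (x1, y1, x2, y2) in boxes]


image = [[1, 2, 3],
         [4, 5, 6]]
boxes = [(10, 20, 30, 40)]

flipped_image = hflip_image(image)
flipped_boxes = hflip_boxes(boxes, image_width=100)
print(flipped_boxes)
```

Applying the flip twice returns the original image and boxes, which is a useful check when writing other augmentations such as scaling or translation.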