dt2229/273Kelvin

 
 

273Kelvin

NYU Deep Learning Project

Abstract from paper

We present a semi-supervised model for object detection. The goal is to accurately predict bounding boxes and class labels for the objects in an image. Our dataset comprises 512,000 unlabeled 224×224 images and 50,000 labeled images of various sizes, which we split 60:40 into training and validation sets. The model is trained to recognize 100 object classes. We first train a self-supervised model to learn visual representations from the unlabeled dataset, then use the trained model as the backbone of a downstream detector that outputs bounding boxes and labels. Finally, we fine-tune the whole model in a supervised fashion on the labeled dataset. We use SimCLR (Chen et al., 2020), a contrastive learning framework, to train the ResNet-18 backbone, and Faster R-CNN (Ren et al., 2015) for the downstream object detection task. Overall, we achieved an Average Precision (AP) at Intersection over Union (IoU) = 0.50:0.95 of 0.093 on the validation set after training for ten epochs.
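
To make the reported metric concrete, here is a minimal sketch of how IoU is computed for a pair of boxes and what the 0.50:0.95 range refers to. It is not the project's evaluation code: the function names are hypothetical, and boxes are assumed to be in `(x1, y1, x2, y2)` corner format. The notation matches the COCO convention, where AP is averaged over ten IoU thresholds from 0.50 to 0.95 in steps of 0.05.

```python
def iou(box_a, box_b):
    """Intersection over Union of two boxes in (x1, y1, x2, y2) format (assumed)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Width and height of the overlap; clamp to zero when the boxes are disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h

    # Union = sum of both areas minus the overlap counted twice.
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


# The ten thresholds 0.50, 0.55, ..., 0.95 implied by "IoU=0.50:0.95".
IOU_THRESHOLDS = [round(0.50 + 0.05 * i, 2) for i in range(10)]


def thresholds_passed(pred_box, gt_box):
    """Thresholds at which a prediction would count as a match for this ground truth."""
    score = iou(pred_box, gt_box)
    return [t for t in IOU_THRESHOLDS if score >= t]


if __name__ == "__main__":
    pred = (0, 2, 10, 12)
    truth = (0, 0, 10, 10)
    print(f"IoU = {iou(pred, truth):.3f}")
    print("matches at thresholds:", thresholds_passed(pred, truth))
```

A prediction that overlaps its ground-truth box loosely counts as a hit only at the lower thresholds, which is why an AP averaged over the full range is a stricter number than AP at IoU = 0.50 alone.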
