In recent years, deep learning has developed rapidly and produced strong results across many research areas. Driven by the continuous improvement of convolutional neural networks (CNNs), computer vision has reached a new peak. From AlexNet in 2012 to ZF Net in 2013, and then to VGG Net, ResNet, and others, the architecture of the convolutional neural network has kept improving. The resurgence of the convolutional neural network has also greatly advanced applications of computer vision such as face recognition, object detection, object tracking, and semantic segmentation.
Object detection, one of the important applications of computer vision, has long been a focus of research, and convolutional neural networks have brought great progress to it. Object detection is developing from single-object recognition to multi-object recognition. The former identifies a single object in an image and is essentially a classification problem. The latter must identify all the objects in an image and also give the exact location of each one. Deep learning has produced a mainstream family of object recognition algorithms based on R-CNN, and these algorithms keep setting higher accuracy on a number of well-known datasets.
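The difference between the two tasks shows up in the shape of the output. The following is a minimal sketch, not taken from any library: the type names `Classification` and `Detection`, the helper `summarize`, and all values are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Classification:
    """Single-object recognition: one label for the whole image."""
    label: str
    score: float


@dataclass
class Detection:
    """Multi-object recognition: a label plus a bounding box per object."""
    label: str
    score: float
    box: Tuple[int, int, int, int]  # (xmin, ymin, xmax, ymax) in pixels


def summarize(detections: List[Detection]) -> Dict[str, int]:
    """Count how many objects of each class were located in one image."""
    counts: Dict[str, int] = {}
    for det in detections:
        counts[det.label] = counts.get(det.label, 0) + 1
    return counts


# Hypothetical outputs for the same image (all numbers are made up).
single = Classification(label="dog", score=0.97)
multi = [
    Detection("dog", 0.95, (48, 240, 195, 371)),
    Detection("person", 0.91, (8, 12, 352, 370)),
    Detection("dog", 0.88, (210, 250, 330, 372)),
]
print(summarize(multi))
```

A classifier returns one record per image, while a detector returns a list whose length depends on how many objects it finds.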
DATASET:
For deep learning, the dataset and the neural network are the two important parts. The dataset is the fuel for deep learning, so the quantity and quality of the data affect the accuracy of the network's output. The choice of neural network and its architecture also affects accuracy.
Dataset
The dataset is one of the foundations of deep learning. For many researchers, collecting enough data on their own to carry out an experiment is a major obstacle, so openly available datasets that everyone can use are essential.
The term data set may also be used more loosely to refer to the data in a collection of closely related tables, corresponding to an experiment or event. Less common names for this kind of data set are data corpus and data stock. An example is the data collected by space agencies performing experiments with instruments aboard space probes. Data sets so large that traditional data processing applications cannot handle them are known as big data. In the open data discipline, the data set is the unit used to measure the information released in a public open data repository. The European Open Data portal aggregates more than half a million data sets. Other definitions have been proposed in this field, but there is currently no official one, and issues such as real-time data sources and non-relational data sets make it harder to reach a consensus.
Some commonly used datasets in computer vision are the following.
1) ImageNet
The ImageNet dataset has more than 14 million images covering more than 20,000 categories. More than a million of these images carry explicit class annotations together with annotations of object locations in the image. ImageNet is one of the most widely used datasets in deep learning, and much of the research on image classification, localization, and detection is based on it. The dataset is well documented and easy to use, and it has become the "standard" dataset for testing algorithm performance in deep learning for the image domain. A well-known challenge, the ImageNet Large Scale Visual Recognition Challenge (ILSVRC), is based on the ImageNet dataset. It is worth mentioning that the winners of ILSVRC2016 were Chinese teams in all tasks. The database was first presented as a poster at the 2009 Conference on Computer Vision and Pattern Recognition (CVPR) in Florida by researchers from the Computer Science department at Princeton University. ImageNet's primary researchers and inventors include Stanford University computer science professor and researcher Fei-Fei Li.
2) PASCAL VOC
The PASCAL VOC (Pattern Analysis, Statistical Modelling and Computational Learning Visual Object Classes) project provides standardized image data sets for object class recognition and a common set of tools for accessing the data sets and annotations. The PASCAL VOC dataset includes 20 classes and has a challenge based on it. The PASCAL VOC Challenge has not been held since 2012, but its dataset is of good quality, well annotated, and allows different methods to be evaluated and compared. Because the PASCAL VOC dataset is small compared to the ImageNet dataset, it is well suited to researchers who want to test network programs. Our dataset is also created according to the PASCAL VOC dataset standard.
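To show what following the PASCAL VOC standard looks like in practice, here is a minimal sketch that reads one annotation in the VOC XML layout using only the Python standard library. The annotation contents (file name, sizes, boxes) are made up, and `parse_voc` is a hypothetical helper, not part of the official PASCAL VOC tools.

```python
import xml.etree.ElementTree as ET

# A hand-written annotation in the PASCAL VOC XML layout (contents are made up).
VOC_XML = """
<annotation>
  <filename>example.jpg</filename>
  <size><width>500</width><height>375</height><depth>3</depth></size>
  <object>
    <name>dog</name>
    <bndbox><xmin>48</xmin><ymin>240</ymin><xmax>195</xmax><ymax>371</ymax></bndbox>
  </object>
  <object>
    <name>person</name>
    <bndbox><xmin>8</xmin><ymin>12</ymin><xmax>352</xmax><ymax>370</ymax></bndbox>
  </object>
</annotation>
"""


def parse_voc(xml_text):
    """Return the file name, image size, and labeled boxes of one annotation."""
    root = ET.fromstring(xml_text.strip())
    size = root.find("size")
    width = int(size.findtext("width"))
    height = int(size.findtext("height"))
    objects = []
    for obj in root.iter("object"):
        box = obj.find("bndbox")
        # VOC stores each box as its two corners: (xmin, ymin, xmax, ymax).
        corners = tuple(
            int(box.findtext(tag)) for tag in ("xmin", "ymin", "xmax", "ymax")
        )
        objects.append((obj.findtext("name"), corners))
    return {
        "filename": root.findtext("filename"),
        "size": (width, height),
        "objects": objects,
    }


annotation = parse_voc(VOC_XML)
for name, corners in annotation["objects"]:
    print(name, corners)
```

Each image has its own XML file, so a dataset built to this standard is a folder of images plus a matching folder of annotations like the one above.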
3) COCO
COCO (Common Objects in Context) is an image recognition, segmentation, and captioning dataset sponsored by Microsoft. The COCO dataset has more than 300,000 images covering 80 object categories. The open release of this dataset has driven much of the progress in semantic segmentation in recent years, and it has become a "standard" dataset for measuring the performance of image semantic understanding. COCO also has its own challenge.
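COCO annotations are distributed as JSON rather than per-image XML, and each box is stored as `[x, y, width, height]` instead of two corners. The sketch below, using made-up data, groups the boxes by image and converts them to corner form. The helpers `coco_to_corners` and `boxes_by_image` are hypothetical names, and the conversion is plain arithmetic that ignores any difference in pixel-indexing conventions between datasets.

```python
import json

# A tiny hand-written file in the COCO detection layout (contents are made up).
COCO_JSON = """
{
  "images": [{"id": 1, "file_name": "example.jpg", "width": 500, "height": 375}],
  "categories": [{"id": 1, "name": "person"}, {"id": 18, "name": "dog"}],
  "annotations": [
    {"id": 10, "image_id": 1, "category_id": 18, "bbox": [48, 240, 147, 131]},
    {"id": 11, "image_id": 1, "category_id": 1, "bbox": [8, 12, 344, 358]}
  ]
}
"""


def coco_to_corners(bbox):
    """Convert [x, y, width, height] to (xmin, ymin, xmax, ymax)."""
    x, y, w, h = bbox
    return (x, y, x + w, y + h)


def boxes_by_image(coco):
    """Map each image file name to its list of (class name, corner box)."""
    names = {cat["id"]: cat["name"] for cat in coco["categories"]}
    files = {img["id"]: img["file_name"] for img in coco["images"]}
    grouped = {}
    for ann in coco["annotations"]:
        entry = (names[ann["category_id"]], coco_to_corners(ann["bbox"]))
        grouped.setdefault(files[ann["image_id"]], []).append(entry)
    return grouped


grouped = boxes_by_image(json.loads(COCO_JSON))
print(grouped)
```

Because all annotations live in one file and refer to images and categories by id, the lookup tables built at the top of `boxes_by_image` are what tie a box back to its image and class name.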
Object Detection and Data Sets: Theoretical Overview
Reviewed by Akhil Kumar on April 25, 2019