Wednesday, November 11, 2015

3D Object Recognition by Caffe 〜 20-class classification 〜


Introduction


 On the previous page, a new method for 3D object recognition was proposed. It was applied to a 10-class classification task and achieved a classification accuracy of 90.2%. On this page, the method is evaluated on a 20-class classification task, where it reaches a classification accuracy of 95.3%.

Dataset


 The pre-trained CNN model that Caffe provides (bvlc_reference_caffenet.caffemodel) is fine-tuned on a dataset of 20 categories chosen from the ModelNet40 dataset. The categories are listed below.
  1. airplane
  2. bathtub
  3. bed
  4. bench
  5. bookshelf
  6. bottle
  7. bowl
  8. car
  9. chair
  10. cone
  11. cup
  12. curtain
  13. desk
  14. door
  15. dresser
  16. flower_pot
  17. glass_box
  18. guitar
  19. keyboard
  20. lamp
Each category has training and testing 3D models in Object File Format (OFF). The number of models in each category is as follows:
label  name         training models  test models
0      airplane     626              100
1      bathtub      106              50
2      bed          515              100
3      bench        173              20
4      bookshelf    572              100
5      bottle       335              100
6      bowl         64               20
7      car          197              100
8      chair        889              100
9      cone         167              20
10     cup          79               20
11     curtain      138              20
12     desk         200              86
13     door         109              20
14     dresser      200              86
15     flower_pot   149              20
16     glass_box    171              100
17     guitar       155              100
18     keyboard     145              20
19     lamp         124              20
total               5114             1202
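
For readers unfamiliar with OFF, the following is a minimal Python sketch of reading such a model. It is not the loader used for this page. It assumes the plain ASCII layout of the format: a header `OFF`, a line with the vertex, face, and edge counts, then the vertex coordinates, then the faces as an index count followed by vertex indices. The function name `parse_off` and the sample tetrahedron are hypothetical. Comments and per-face color values, which the format allows, are not handled. The sketch also tolerates a header fused with the first count (for example `OFF4 4 0`), on the assumption that some files are written that way.

```python
def parse_off(text):
    """Parse ASCII OFF text into (vertices, faces).

    vertices: list of (x, y, z) float tuples
    faces:    list of tuples of vertex indices
    Assumes no comments and no color values after the face indices.
    """
    tokens = text.split()
    if not tokens or not tokens[0].startswith("OFF"):
        raise ValueError("not an OFF file")

    # Tolerate a header fused with the vertex count, e.g. "OFF4 4 0".
    fused = tokens[0][3:]
    rest = ([fused] if fused else []) + tokens[1:]

    n_vertices, n_faces = int(rest[0]), int(rest[1])
    pos = 3  # skip the three counts; the edge count is ignored

    vertices = []
    for _ in range(n_vertices):
        x, y, z = (float(t) for t in rest[pos:pos + 3])
        vertices.append((x, y, z))
        pos += 3

    faces = []
    for _ in range(n_faces):
        k = int(rest[pos])  # number of vertices in this face
        faces.append(tuple(int(t) for t in rest[pos + 1:pos + 1 + k]))
        pos += 1 + k

    return vertices, faces


# Hypothetical sample: a tetrahedron with 4 vertices and 4 triangular faces.
TETRAHEDRON = """OFF
4 4 0
0 0 0
1 0 0
0 1 0
0 0 1
3 0 1 2
3 0 1 3
3 0 2 3
3 1 2 3
"""

vertices, faces = parse_off(TETRAHEDRON)
print(len(vertices), "vertices,", len(faces), "faces")
```

Splitting on whitespace rather than on lines keeps the sketch short and makes it indifferent to how the file is broken into lines, at the cost of not supporting comments.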

Results of CNN and Classification


 Fine-tuning yields a high recognition accuracy of 93%, as shown below.
As described on the previous page, each 3D model yields 20 gray images. The fine-tuned CNN is applied to each of them, so 20 labels are obtained per 3D model, and the final label is decided by majority vote. This algorithm is evaluated on the 1202 test models listed above and reaches a classification accuracy of 95.3%.
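
The voting step described above can be sketched in Python as follows. This is a minimal illustration, not the code used for the reported results. The function names `majority_vote` and `model_accuracy` are hypothetical, the per-view labels are made-up toy data, and the tie-breaking rule (the label seen first among the tied ones wins) is an assumption, since the text does not say how ties are resolved.

```python
from collections import Counter


def majority_vote(view_labels):
    """Return the most frequent label among the per-view predictions.

    Ties go to the label that occurs first in view_labels (an assumption;
    Counter.most_common keeps first-encountered order for equal counts).
    """
    if not view_labels:
        raise ValueError("need at least one per-view label")
    return Counter(view_labels).most_common(1)[0][0]


def model_accuracy(per_model_view_labels, true_labels):
    """Fraction of 3D models whose voted label equals the true label."""
    if len(per_model_view_labels) != len(true_labels):
        raise ValueError("one list of view labels is needed per true label")
    predicted = [majority_vote(views) for views in per_model_view_labels]
    correct = sum(p == t for p, t in zip(predicted, true_labels))
    return correct / len(true_labels)


# Toy data: one model, 20 views. Label 8 (chair) wins over
# label 3 (bench) and label 12 (desk).
views = [8] * 12 + [3] * 5 + [12] * 3
print(majority_vote(views))
```

With this scheme a model is classified correctly as long as the right label is the most frequent one among its 20 views, which is consistent with the per-model accuracy (95.3%) being higher than the CNN recognition accuracy (93%).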
