Week 5 – Open the Black Box – Wenhe

Brief

For this assignment, I trained a set of CNN-based models on CIFAR-10. Throughout the training process, I tested the effects of different network architectures, dropout layers, batch sizes, and epoch counts. I also experimented with data augmentation.

Hardware

I am using Google Cloud Platform, with an instance set up with a Tesla T4 GPU and a 4-core CPU.

Architecture

The model is a VGG-like architecture: blocks of two successive convolutional layers, interleaved with dropout layers, followed by fully-connected layers. A rough sketch is below, and the table after it shows how the runs went.
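Here is a minimal sketch of such a VGG-like model, assuming tf.keras; the filter counts, dropout rates, and optimizer are illustrative assumptions rather than the exact configuration used in my runs:

```python
from tensorflow.keras import layers, models

def build_vgg_like(num_classes=10):
    model = models.Sequential([
        layers.Input(shape=(32, 32, 3)),  # CIFAR-10 images are 32x32 RGB
        # Block 1: two successive conv layers, then pooling and dropout
        layers.Conv2D(32, (3, 3), padding='same', activation='relu'),
        layers.Conv2D(32, (3, 3), padding='same', activation='relu'),
        layers.MaxPooling2D((2, 2)),
        layers.Dropout(0.25),
        # Block 2: same pattern with more filters
        layers.Conv2D(64, (3, 3), padding='same', activation='relu'),
        layers.Conv2D(64, (3, 3), padding='same', activation='relu'),
        layers.MaxPooling2D((2, 2)),
        layers.Dropout(0.25),
        # Fully-connected head
        layers.Flatten(),
        layers.Dense(512, activation='relu'),
        layers.Dropout(0.5),
        layers.Dense(num_classes, activation='softmax'),
    ])
    model.compile(optimizer='adam',
                  loss='sparse_categorical_crossentropy',
                  metrics=['accuracy'])
    return model
```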

Epochs           Batch Size   Time     Train Accuracy   Test Accuracy
20               64           40s      69%              68%
100              64           109s     75%              72%
20               5            2200s    80%              75%
20 (augmented)   64           40s      70%              71%

As the table shows, the smaller the batch size, the higher the accuracy; the same holds for the number of epochs. However, because training can stall (for example near saddle points), increasing the number of epochs does not always increase accuracy.
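For reference, runs like those in the table could be reproduced along these lines, reusing the build_vgg_like sketch above; the augmentation settings (random shifts and horizontal flips) are assumptions, not necessarily the exact ones I used:

```python
from tensorflow.keras.datasets import cifar10
from tensorflow.keras.preprocessing.image import ImageDataGenerator

(x_train, y_train), (x_test, y_test) = cifar10.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0  # scale pixels to [0, 1]

model = build_vgg_like()

# Plain run, e.g. 20 epochs at batch size 64
model.fit(x_train, y_train, epochs=20, batch_size=64,
          validation_data=(x_test, y_test))

# Augmented run: random shifts and horizontal flips
datagen = ImageDataGenerator(width_shift_range=0.1,
                             height_shift_range=0.1,
                             horizontal_flip=True)
model.fit(datagen.flow(x_train, y_train, batch_size=64),
          epochs=20, validation_data=(x_test, y_test))
```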

I also tried some other architectures, such as a simpler one with only two convolutional layers, which gave results similar to the VGG-like model; this is likely because CIFAR-10's small images offer a relatively small set of features to learn.
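A minimal sketch of such a two-convolutional-layer baseline, again with assumed filter counts rather than my exact settings:

```python
from tensorflow.keras import layers, models

# Simpler baseline: just two conv layers before the classifier head
simple_model = models.Sequential([
    layers.Input(shape=(32, 32, 3)),
    layers.Conv2D(32, (3, 3), activation='relu'),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, (3, 3), activation='relu'),
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),
    layers.Dense(10, activation='softmax'),
])
```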

In addition, the larger drop from train to test accuracy in the third run is mainly due to the small batch size, which makes it harder for the model to generalize.
