Week 05 Assignment: Train CIFAR-10 CNN – Lishan Qin

Intro

For this week's research, I ran the program eight times in total to experiment with how different machine specs and training setups influence the classification accuracy and the time spent. I examined the influence that variables such as the epochs (number of training iterations over the dataset), the batch size (number of samples per gradient update), and the pool size (the size of the pooling window) each exert on the performance of the program.

The Epoch Variable

First, I changed only the epochs variable and kept the other settings the same. The results are as follows.

I first set epochs to 10. It took almost 18 minutes for the program to finish the 10 iterations, yet the test accuracy was still only 0.4711.


I then changed epochs to 5 and then 3, leaving the other settings unchanged. With epochs set to 5, the test loss was 1.83, the test accuracy was 0.4074, and the program took 9.4 minutes to finish. With epochs set to 3, the program took 5.9 minutes to finish, the test loss was only 1.73, and the test accuracy was 0.3977.
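For reference, the only thing that changed between these runs was the epochs argument to model.fit. A minimal sketch of how each run was measured, assuming the standard Keras CIFAR-10 setup from class (model, x_train/y_train, and x_test/y_test already defined; the batch size shown is just a placeholder for whatever the unchanged value was):

import time

start = time.time()
model.fit(x_train, y_train, epochs=10, batch_size=32)  # only epochs varied: 10, 5, 3
test_loss, test_acc = model.evaluate(x_test, y_test)
print(f"time: {(time.time() - start) / 60:.1f} min, "
      f"loss: {test_loss:.2f}, acc: {test_acc:.4f}")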


Discovery

Comparing the result with epochs set to 10 against the results with epochs set to 5 and 3, it appears that the smaller the number of epochs, the lower the test accuracy and the higher the test loss. There seems to be a positive correlation between the number of epochs and the test accuracy, and a negative correlation between the number of epochs and the test loss. Whether this pattern always holds is questionable. One thing is certain, though: the larger the number of epochs, the more time the program takes to finish, because it has to iterate over the dataset more times.

The Batch Size Variable

I then changed only the batch size variable and kept the other settings the same. The results are as follows.


Discoveries

It seems that the smaller the batch size, the higher the test accuracy. I actually expected the result to be the other way around. My guess is that with a smaller batch size, the program has to process many more batches to get through all the samples, so each epoch involves more gradient updates; that makes the training take longer, but it also seems to push the accuracy higher.
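One way to sanity-check this guess: with CIFAR-10's 50,000 training images, the batch size directly determines how many gradient updates happen in each epoch. A quick calculation (the batch sizes here are just illustrative values):

import math

num_train = 50000  # CIFAR-10 training images
for batch_size in (32, 512, 1024):
    print(batch_size, math.ceil(num_train / batch_size))
# 32   -> 1563 updates per epoch
# 512  ->   98 updates per epoch
# 1024 ->   49 updates per epoch

A smaller batch size means many more weight updates per epoch, which costs time but gives the optimizer more chances to improve the fit.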

The Pool Size Variable

Finally, I decided to investigate how the pool size variable influences the accuracy. I changed only the pool size value and left the other values the same. The results are as follows.


Discoveries

In terms of the pool size variable, it seems that the larger the pool size, the less time the program needs to finish and the lower the accuracy. I guess this is because with a larger pool size, the program processes each sample in fewer, larger pools, so there are fewer comparisons to make. And with fewer comparisons, the estimation becomes rougher, which lowers the accuracy.
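A quick shape check supports this guess. With Keras's default stride (equal to the pool size), a larger pooling window shrinks the feature maps more aggressively, leaving later layers less to compute but also less spatial detail. A sketch, assuming the 32x32 CIFAR-10 input:

import tensorflow as tf
from tensorflow.keras import layers

x = tf.zeros((1, 32, 32, 3))  # one CIFAR-10-sized image
print(layers.MaxPooling2D(pool_size=(2, 2))(x).shape)  # (1, 16, 16, 3)
print(layers.MaxPooling2D(pool_size=(4, 4))(x).shape)  # (1, 8, 8, 3)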

Thoughts and Questions

In conclusion, this experiment showed me that there are three correlations between the performance of the program and its training setup (epochs, batch size, pool size). It seems that the larger the number of epochs, the more time is spent and the higher the accuracy. The larger the batch size, the faster the program finishes and the lower the accuracy. The same goes for the pool size: the larger it is, the less time is required and the lower the accuracy. But given the limited data collected in my experiment, I'm still not sure to what extent these variables influence the program's performance, and I'm not even sure these correlations hold all the time. In class I encountered one situation where the program got higher accuracy with epochs set to 2 than with epochs set to 3. So I believe that in order to better understand the deeper relationship between these variables and the performance, I still need to do more research on the structure of this kind of program and its purpose.

Week 05 Assignment: Train CIFAR-10 CNN – Crystal Liu

Experiment

This week's assignment is to train the model. The first step for me is to understand the meaning and function of the key concepts, and then to validate that understanding in practice.

  1. Epoch:

According to Sagar Sharma's article on Medium, an epoch is when the entire dataset is passed forward and backward through the neural network exactly once. According to my trials, the more epochs, the more accurate the result.

When the number of epochs is 5, with the other variables held constant, the accuracy is 0.34.

When the number of epochs is 8, the accuracy is 0.3964.

  2. Batch size:

The batch size is the number of samples that will be propagated through the network at once. Since one epoch is too big to feed to the computer all at once, we divide it into several smaller batches. For example, if the batch size is 50, the algorithm will split the samples into groups of 50 and use these groups of data to train the model, as sketched below. The larger the batch size, the faster the training.
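Conceptually, one epoch with a batch size of 50 looks like this (an illustration of the idea, not Keras's actual implementation; x_train and y_train are the loaded CIFAR-10 arrays):

batch_size = 50
for start in range(0, len(x_train), batch_size):
    x_batch = x_train[start:start + batch_size]  # one group of 50 samples
    y_batch = y_train[start:start + batch_size]
    # ...one forward/backward pass and one weight update on this batch...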

In my experiment, the smaller the batch size, the more accurate the result.

When the batch size is 1024, the accuracy is 0.4188 (epochs = 5), and the run took 409 s.

When the batch size is 512, the accuracy is 0.4427 (epochs = 5), and the run took 399 s.

  3. Dropout:

Dropout is a technique where randomly selected neurons are ignored during training. Because different random neurons are dropped on each pass, the network cannot rely on any single neuron, so it becomes less sensitive to the specific weights of individual neurons, generalizes better, and to some extent avoids overfitting the training data.
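A small demo of what the dropout layer actually does (a sketch; the 0.5 rate matches the value tried below). In training mode, Keras zeroes out roughly that fraction of activations and scales the survivors up to compensate:

import tensorflow as tf
from tensorflow.keras import layers

drop = layers.Dropout(0.5)
out = drop(tf.ones((1, 10)), training=True)
print(out.numpy())
# About half the values become 0; the rest are scaled by 1/(1-0.5) = 2.0,
# so the expected total activation stays the same.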

I changed the dropout rate from 0.25 to 0.50, but the accuracy decreased from 0.34 to 0.3308. I'm still not sure whether there is a specific, close relationship between dropout and accuracy, since I didn't find a clear explanation on the internet.

  4. Size:

# First convolutional block of the model (a fragment of the full CIFAR-10 CNN):
layers.Conv2D(32, (3, 3), padding='same', activation=tf.nn.relu,
              input_shape=x_train.shape[1:]),
layers.Conv2D(32, (3, 3), activation=tf.nn.relu),
layers.MaxPooling2D(pool_size=(2, 2)),
layers.Dropout(0.1),
 
This time I chose to change the highlighted value, the filter count of 32, to 64; the accuracy changed from 0.34 to 0.3918. The accuracy increased.
 
Then I changed the pool size from (2, 2) to (3, 2), and the accuracy changed from 0.34 to 0.3803. So the accuracy increased when the pool size increased.
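For reference, these are the two modified lines, each applied separately against the same baseline (exactly which values were bolded in the original post is my reading of the text):

# Change 1: filter count 32 -> 64 (accuracy 0.34 -> 0.3918)
layers.Conv2D(64, (3, 3), padding='same', activation=tf.nn.relu,
              input_shape=x_train.shape[1:]),

# Change 2: pool size (2, 2) -> (3, 2) (accuracy 0.34 -> 0.3803)
layers.MaxPooling2D(pool_size=(3, 2)),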
 

Confusion & Thoughts

I was confused about the relationship between the batch size and the accuracy. I had read about batch size before, and the articles said that compared with using all the samples at once, mini-batches decrease the accuracy of the gradient estimate. Based on that idea, I thought that the larger the batch size, the more accurate the result. However, the real case was the total opposite. I then did some deeper research and found that a smaller batch size can help the model train better by adding a little noise into the search, while a bigger batch runs faster but sacrifices some accuracy. So it is important to set a proper batch size, based on the total number of samples, to balance speed and accuracy.

Week 05 Assignment: Train CIFAR-10 CNN

Train your own CIFAR-10 CNN and write an experiment report / blog post:

  • Try CIFAR-10 CNN with your own computers
  • It can be CPU or GPU
  • If it’s GPU, you might need to manage the environment setup
  • Include your machine specs, training setup, and total time spent
  • Include your thoughts and discoveries
  • Post it on the IMA blog before Friday midnight (the 11th) with the tag: aiarts05