For this week’s assignment, I trained the CIFAR-10 CNN model for 100 epochs. It took about 30 minutes because I ran it on the Intel server. Since compute on the server is so fast, moving small batches in and out of memory too often would actually slow training down, so I increased the batch size to 2048. Other than that, I didn’t tune any other parameters. Training on the cluster is extremely fast compared to my MacBook: about 17 seconds per epoch versus the hour or so it would take on my laptop. I considered switching the optimizer to Adam, but figured RMSprop would work better for this situation. I also added model checkpointing so that I can stop the program and resume training later. Overall, this was a good exercise in setting up Jupyter Notebook and checkpointing so that I can reuse models in the future. The Intel machine learning server was also very powerful and great to use.
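Below is a minimal sketch of how this training run could be set up in Keras. The architecture shown is the stock Keras CIFAR-10 CNN example as a stand-in (the real model is in the linked notebook), and the learning rate and checkpoint filename are placeholders; only the batch size of 2048, 100 epochs, RMSprop, and checkpointing come from the write-up above.

```python
# Sketch of the training setup described above (assumptions noted in comments).
from keras.datasets import cifar10
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Dropout, Flatten, Dense
from keras.optimizers import RMSprop
from keras.callbacks import ModelCheckpoint
from keras.utils import to_categorical

(x_train, y_train), (x_test, y_test) = cifar10.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0
y_train, y_test = to_categorical(y_train, 10), to_categorical(y_test, 10)

# A shallow CNN in the spirit of the Keras CIFAR-10 example (assumed architecture).
model = Sequential([
    Conv2D(32, (3, 3), padding='same', activation='relu', input_shape=(32, 32, 3)),
    Conv2D(32, (3, 3), activation='relu'),
    MaxPooling2D(pool_size=(2, 2)),
    Dropout(0.25),
    Conv2D(64, (3, 3), padding='same', activation='relu'),
    Conv2D(64, (3, 3), activation='relu'),
    MaxPooling2D(pool_size=(2, 2)),
    Dropout(0.25),
    Flatten(),
    Dense(512, activation='relu'),
    Dropout(0.5),
    Dense(10, activation='softmax'),
])

model.compile(optimizer=RMSprop(lr=1e-4),  # learning rate is a placeholder
              loss='categorical_crossentropy',
              metrics=['accuracy'])

# Save the best weights (by validation accuracy) so training can be
# stopped and resumed later.
checkpoint = ModelCheckpoint('cifar10_weights.h5',  # placeholder filename
                             monitor='val_acc',
                             save_best_only=True,
                             verbose=1)

model.fit(x_train, y_train,
          batch_size=2048,   # large batches to keep the fast server busy
          epochs=100,
          validation_data=(x_test, y_test),
          callbacks=[checkpoint])
```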
Overall, I got a test accuracy of 65.4%, which I am pretty happy with for such a shallow CNN.
Epoch 00100: val_acc did not improve from 0.65740
10000/10000 [==============================] - 1s 136us/step
Test loss: 0.9971296675682068
Test accuracy: 0.654
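For reference, the numbers above can be reproduced from the saved checkpoint along the lines of the snippet below (continuing from the training sketch; 'cifar10_weights.h5' is the placeholder filename assumed there).

```python
# Load the best checkpointed weights and re-run evaluation on the test set.
model.load_weights('cifar10_weights.h5')
score = model.evaluate(x_test, y_test, verbose=1)
print('Test loss:', score[0])
print('Test accuracy:', score[1])
```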
Notebook and weights: https://drive.google.com/open?id=1iml5bOarTovsm0hlL71hlhOKdAOMMSn_