I thought it was interesting that in the experiment we did in class, a higher epoch number didn’t always mean a higher accuracy, even though the general trend was upward. I wondered whether this would still hold if I changed the epoch numbers on my own laptop.
Machine specs:
When I opened the week05-02-trainCNN Python code, I was surprised to see such large numbers for the epochs, especially because in class we kept the epochs to single and double digits. I decided to run the code just as it was as a control/starting point (100 epochs + 2048 batch size). However, as soon as I ran it, I regretted it: I was 8 minutes in and only on the 4th epoch when I decided to terminate the program.
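A rough back-of-the-envelope estimate shows why sitting through the full run was not realistic. The sketch below assumes that being on the 4th epoch at 8 minutes means about 3 epochs had finished, and that every epoch takes about the same amount of time. Both are assumptions, and the function name is only for illustration.

```python
def estimate_total_minutes(elapsed_minutes, completed_epochs, total_epochs):
    """Extrapolate total training time, assuming every epoch takes equally long."""
    minutes_per_epoch = elapsed_minutes / completed_epochs
    return minutes_per_epoch * total_epochs

# Assumption: about 3 epochs had finished after 8 minutes (the run was on its 4th).
total = estimate_total_minutes(elapsed_minutes=8, completed_epochs=3, total_epochs=100)
print(f"Estimated total: {total:.0f} minutes (~{total / 60:.1f} hours)")
```

Under those assumptions, the original settings would have needed somewhere around four and a half hours on this laptop.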
Instead, I consulted this Stack Overflow thread to find a good starting point for the batch size. One person mentioned that a batch size of 32 is pretty standard, so I decided to keep the batch size fixed at 32 as a control while testing out different epoch numbers.
Test 1
- Epochs: 1
- Batch size: 32
- Time spent: 4 minutes
- Accuracy: 0.4506
That was honestly a lot more accurate than I thought it was going to be. For the next test, I increased the number of epochs to 5 under the assumption that the accuracy would increase.
Test 2
- Epochs: 5
- Batch size: 32
- Time spent: 22 minutes
- Accuracy: 0.6278
This was a significantly higher accuracy than I was expecting, since in my mind 5 seems to be a pretty low epoch number compared to the original 100 that was written in the code. Although the overall accuracy increased, it was interesting to note that the accuracy after the first epoch was only 0.4413, which was lower than the accuracy in test 1. I assumed it would be the same or at least higher, considering I was using the same computer and the same numbers except for the number of epochs.
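One likely explanation for the first-epoch mismatch is that each training run starts from different random initial weights and shuffles the training data in a different order, so two separate runs will not match exactly unless the random seed is fixed. This assumes the class script does not set a seed, which is not confirmed here. The sketch below uses only Python's built-in `random` module and made-up "weights" to illustrate the idea; it is not the actual CNN.

```python
import random

def init_weights(n, seed=None):
    """Return n pseudo-random starting weights; a fixed seed makes them repeatable."""
    rng = random.Random(seed)
    return [rng.uniform(-1.0, 1.0) for _ in range(n)]

# Two unseeded runs almost certainly start from different weights...
run_a = init_weights(3)
run_b = init_weights(3)
print("unseeded runs match:", run_a == run_b)

# ...while two runs with the same seed start from identical weights.
seeded_a = init_weights(3, seed=42)
seeded_b = init_weights(3, seed=42)
print("seeded runs match:", seeded_a == seeded_b)
```

If the training script allows setting a seed, fixing it would make the epoch-by-epoch numbers easier to compare between runs.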
Now I was curious about how changing the batch size would affect the accuracy. I was also curious as to how it would affect the time, because when I initially ran the 100 epoch + 2048 batch size code, each epoch was running faster than in my first two tests (even though I was still too impatient to sit through it). I decided to keep the number of epochs at 5 for these tests as a control, so I could compare the results to test 2.
Test 3
- Epochs: 5
- Batch size: 100
- Time spent: 18 minutes
- Accuracy: 0.546
As suspected, this test took less time. However, what surprised me was that the accuracy was lower in comparison to test 2. For some reason I assumed that if the batch size was higher, then the accuracy would also be higher.
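One possible factor behind this result is that a larger batch size means fewer weight updates per epoch, so with the same 5 epochs the model gets fewer chances to adjust. The sketch below counts the updates for the batch sizes used in this post. The training set size of 50,000 is a placeholder assumption, since the actual dataset size is not stated here, and this is only one plausible piece of the explanation.

```python
import math

def updates_per_epoch(num_samples, batch_size):
    """Number of weight updates in one epoch (the last batch may be smaller)."""
    return math.ceil(num_samples / batch_size)

NUM_SAMPLES = 50_000  # hypothetical training set size, not taken from the class code
EPOCHS = 5

for batch_size in (32, 100, 2048):
    per_epoch = updates_per_epoch(NUM_SAMPLES, batch_size)
    print(f"batch size {batch_size}: {per_epoch} updates per epoch, "
          f"{per_epoch * EPOCHS} over {EPOCHS} epochs")
```

With these placeholder numbers, a batch size of 32 gives roughly three times as many updates as a batch size of 100 over the same number of epochs.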
The biggest takeaway from this experiment is that training takes a lot of time! At least in these three tests, the one that took the most time gave the highest accuracy, which made the time seem worth it. I also didn’t experiment nearly enough to find the ideal settings for the best accuracy. It definitely seems that it takes a very specific combination of factors and a lot of testing to get the desired results.
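To make the three tests easier to compare, the short script below works out the time per epoch for each one. The numbers are copied from the results above; the helper name is just for illustration.

```python
# Results copied from the three tests: (epochs, batch size, minutes, accuracy)
results = {
    "Test 1": (1, 32, 4, 0.4506),
    "Test 2": (5, 32, 22, 0.6278),
    "Test 3": (5, 100, 18, 0.546),
}

def minutes_per_epoch(epochs, minutes):
    """Average wall-clock minutes spent on each epoch."""
    return minutes / epochs

for name, (epochs, batch_size, minutes, accuracy) in results.items():
    rate = minutes_per_epoch(epochs, minutes)
    print(f"{name}: batch size {batch_size}, {rate:.1f} min/epoch, accuracy {accuracy}")
```

With only three runs, this is a way to organize the numbers, not proof of a trend.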